DocumentCode :
813596
Title :
Minimum spanning tree partitioning algorithm for microaggregation
Author :
Laszlo, Michael ; Mukherjee, Sumitra
Author_Institution :
Graduate Sch. of Comput. & Inf. Sci., Nova Southeastern Univ., Fort Lauderdale, FL, USA
Volume :
17
Issue :
7
fYear :
2005
fDate :
7/1/2005 12:00:00 AM
Firstpage :
902
Lastpage :
911
Abstract :
This paper presents a clustering algorithm for partitioning a minimum spanning tree with a constraint on minimum group size. The problem is motivated by microaggregation, a disclosure limitation technique in which similar records are aggregated into groups containing a minimum of k records. Heuristic clustering methods are needed since the minimum information loss microaggregation problem is NP-hard. Our MST partitioning algorithm for microaggregation is sufficiently efficient to be practical for large data sets and yields results that are comparable to the best available heuristic methods for microaggregation. For data that contain pronounced clustering effects, our method results in significantly lower information loss. Our algorithm is general enough to accommodate different measures of information loss and can be used for other clustering applications that have a constraint on minimum group size.
Keywords :
computational complexity; data handling; data mining; pattern clustering; security of data; statistical databases; tree searching; very large databases; NP-hard problem; heuristic clustering methods; large data sets; microaggregation disclosure limitation technique; microdata protection; minimum information loss microaggregation problem; minimum spanning tree partitioning algorithm; pattern clustering algorithm; Clustering algorithms; Clustering methods; Euclidean distance; Loss measurement; Partitioning algorithms; Polynomials; Protection; Size measurement; Tree graphs; Index Terms- Clustering; disclosure control.; microdata protection; minimum spanning tree; partitioning;
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
ieee
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2005.112
Filename :
1432700
Link To Document :
بازگشت