DocumentCode :
3321792
Title :
Mining Approximate Order Preserving Clusters in the Presence of Noise
Author :
Zhang, Mengsheng ; Wang, Wei ; Liu, Jinze
Author_Institution :
Dept. of Comput. Sci., Univ. of North Carolina, Chapel Hill, NC
fYear :
2008
fDate :
7-12 April 2008
Firstpage :
160
Lastpage :
168
Abstract :
Subspace clustering has attracted great attention because it can find salient patterns in high-dimensional data. Order preserving subspace clusters have proven important in high-throughput gene expression analysis, since functionally related genes are often co-expressed under a set of experimental conditions. Such co-expression patterns can be represented by consistent orderings of attributes. Existing order preserving cluster models require that all objects in a cluster have an identical attribute order, without deviation. However, real data are noisy because of limitations in measurement technology and experimental variability, which prevents these strict models from revealing true clusters corrupted by noise. In this paper, we study the problem of revealing order preserving clusters in the presence of noise. We propose a noise-tolerant model called the approximate order preserving cluster (AOPC). Instead of requiring that all objects in a cluster have an identical attribute order, we require that (1) at least a certain fraction of the objects have an identical attribute order and (2) the other objects in the cluster deviate from the consensus order by at most a certain fraction of attributes. We also propose an algorithm to mine AOPCs. Experiments on gene expression data demonstrate the efficiency and effectiveness of our algorithm.
Keywords :
data mining; pattern clustering; approximate order preserving cluster; approximate order preserving clusters mining; high dimensional data; high throughput gene expression analysis; salient patterns; subspace clustering; Clustering algorithms; Computer science; Data analysis; Databases; Gene expression; Genetics; Noise measurement; Throughput
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on
Conference_Location :
Cancun
Print_ISBN :
978-1-4244-1836-7
Electronic_ISBN :
978-1-4244-1837-4
Type :
conf
DOI :
10.1109/ICDE.2008.4497424
Filename :
4497424
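Illustrative sketch :
The two AOPC conditions stated in the abstract can be illustrated with a small membership check. This is only a sketch under assumptions that the abstract does not confirm: the rows are already restricted to the cluster's attributes, the consensus order is taken to be the most frequent attribute order, and an object's deviation is measured as the number of attributes that must be dropped before its order agrees with the consensus. The function and parameter names (attribute_order, deviation, is_aopc, min_core_fraction, max_dev_fraction) are hypothetical, and the paper's own definitions and mining algorithm may differ.

```python
from bisect import bisect_left
from collections import Counter


def attribute_order(values):
    """Return attribute indices sorted by ascending value (ties broken by index)."""
    return tuple(sorted(range(len(values)), key=lambda i: (values[i], i)))


def deviation(order, consensus):
    """Assumed deviation measure: attributes to drop so `order` matches `consensus`.

    Both arguments are permutations of the same attributes, so this equals
    k minus the length of the longest increasing subsequence of consensus ranks.
    """
    rank = {attr: r for r, attr in enumerate(consensus)}
    tails = []  # tails[j] = smallest tail of an increasing subsequence of length j + 1
    for x in (rank[attr] for attr in order):
        j = bisect_left(tails, x)
        if j == len(tails):
            tails.append(x)
        else:
            tails[j] = x
    return len(order) - len(tails)


def is_aopc(rows, min_core_fraction, max_dev_fraction):
    """Check the two conditions from the abstract under the assumptions above."""
    orders = [attribute_order(r) for r in rows]
    # Assumption: the consensus is the most frequent attribute order.
    consensus, core_size = Counter(orders).most_common(1)[0]
    # (1) At least a certain fraction of objects share the identical order.
    if core_size < min_core_fraction * len(rows):
        return False
    # (2) Every object deviates by at most a certain fraction of attributes.
    k = len(consensus)
    return all(deviation(o, consensus) <= max_dev_fraction * k for o in orders)


# Hypothetical expression values: three rows share one order, one row swaps two attributes.
rows = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0],
    [0.5, 0.7, 0.9, 1.5],
    [1.0, 3.0, 2.0, 4.0],
]
print(is_aopc(rows, min_core_fraction=0.75, max_dev_fraction=0.25))
print(is_aopc(rows, min_core_fraction=0.75, max_dev_fraction=0.0))
```

With a zero deviation tolerance the check reduces to the strict model that the abstract contrasts against, in which every object must follow the identical order.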