DocumentCode :
2191910
Title :
High-Dimensional Multimodal Distribution Embedding
Author :
Szekely, Eniko ; Bruno, Eric ; Marchand-Maillet, Stephane
Author_Institution :
Comput. Sci. Dept., Univ. of Geneva, Geneva, Switzerland
fYear :
2010
fDate :
13-13 Dec. 2010
Firstpage :
434
Lastpage :
441
Abstract :
High-dimensional data is emerging in an increasingly wide range of domains, but its analysis has proven difficult because of the curse of dimensionality. Dimension reduction has emerged as a powerful tool for overcoming problems related to high dimensionality, yet the curse of dimensionality continues to affect many existing methods. This paper concentrates on low-dimensional distance-based embeddings for high-dimensional multimodal distributions, i.e. clustered data. Pairwise distances are particularly influenced by high dimensionality. Their analysis forms the basis of the embedding method presented here, called HDME. To avoid the problems of high dimensionality, HDME performs a distance transformation based on interpoint relationships. The positive influence of the transformation in preserving and emphasizing clusters is first demonstrated using label information. The distance transformation is driven by an estimate of the neighbourhood information. The transformed distances are embedded in a low-dimensional space using a classical embedding method. Experiments on real-world data show that distance transformations can be used effectively in conjunction with distance-based embedding methods to obtain representation spaces that discriminate clusters well.
Keywords :
data analysis; pattern clustering; statistical distributions; distance transformation; high dimensional multimodal distribution embedding method; interpoint relationship; low dimensional distance based embedding; neighbourhood information estimation; clustering; dimension reduction; high-dimensional data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining Workshops (ICDMW), 2010 IEEE International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-1-4244-9244-2
Electronic_ISBN :
978-0-7695-4257-7
Type :
conf
DOI :
10.1109/ICDMW.2010.194
Filename :
5693330