Selecting the variables that train a self-organizing map (SOM) which best separates predefined clusters

Author

Laine, Sampsa

Author_Institution

Lab. of Comput. & Inf. Sci., Helsinki Univ. of Technol., Espoo, Finland

Volume

4

fYear

2002

fDate

18-22 Nov. 2002

Firstpage

1961

Abstract

The paper presents how to find the variables that best illustrate a problem of interest when visualizing with the self-organizing map (SOM). The user defines what is interesting by labeling data points, e.g. with letters. These labels assign the data points to clusters. An optimization algorithm searches for the set of variables that best separates the clusters. These variables reflect the knowledge the user applied when labeling the data points. The paper measures separability not in the variable space but on a SOM trained in that space. The variables found contain interesting information and are well suited to the SOM. The trained SOM can comprehensively visualize the problem of interest, which supports discussion and learning from data. The approach is illustrated using the case of the Hitura mine and compared with a standard statistical visualization algorithm, Fisher discriminant analysis.
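The abstract names neither the separability measure nor the optimization algorithm, so the following is only a minimal sketch of the general idea under stated assumptions: separability is approximated by the label purity of the SOM units (an assumed stand-in), and the search is a greedy forward selection (also an assumption). The function names, the toy data, and all training parameters are hypothetical and are not taken from the paper.

```python
import math
import random
from collections import Counter, defaultdict


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean)."""
    return min(range(len(weights)),
               key=lambda u: sum((w - xi) ** 2 for w, xi in zip(weights[u], x)))


def train_som(data, rows=3, cols=3, epochs=30, seed=0):
    """Train a small rectangular SOM online; returns one weight vector per unit."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # Initialise each unit from a randomly drawn data point.
    weights = [list(rng.choice(data)) for _ in grid]
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            frac = step / total
            lr = 0.5 * (0.02 / 0.5) ** frac      # learning rate decays 0.5 -> 0.02
            sigma = 1.5 * (0.3 / 1.5) ** frac    # neighbourhood radius decays 1.5 -> 0.3
            x = data[i]
            br, bc = grid[best_matching_unit(weights, x)]
            for u, (r, c) in enumerate(grid):
                # Gaussian neighbourhood around the best-matching unit on the grid.
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2 * sigma ** 2))
                weights[u] = [w + lr * h * (xi - w) for w, xi in zip(weights[u], x)]
            step += 1
    return weights


def som_separability(data, labels, variables, seed=0):
    """Assumed measure: fraction of points whose label matches the majority label of their unit."""
    sub = [[row[v] for v in variables] for row in data]
    weights = train_som(sub, seed=seed)
    hits = defaultdict(Counter)
    for x, label in zip(sub, labels):
        hits[best_matching_unit(weights, x)][label] += 1
    return sum(max(counts.values()) for counts in hits.values()) / len(data)


def select_variables(data, labels):
    """Greedy forward selection (assumed optimizer): add variables while the score improves."""
    remaining = list(range(len(data[0])))
    selected, best = [], 0.0
    while remaining:
        score, var = max((som_separability(data, labels, selected + [v]), v)
                         for v in remaining)
        if score <= best:
            break
        selected.append(var)
        remaining.remove(var)
        best = score
    return selected, best


def make_toy_data(n_per_cluster=20, seed=1):
    """Hypothetical data: variable 0 separates labels 'a' and 'b'; variables 1 and 2 are noise."""
    rng = random.Random(seed)
    data, labels = [], []
    for label, centre in (("a", 0.0), ("b", 1.0)):
        for _ in range(n_per_cluster):
            data.append([centre + rng.uniform(-0.05, 0.05), rng.random(), rng.random()])
            labels.append(label)
    return data, labels


if __name__ == "__main__":
    toy_data, toy_labels = make_toy_data()
    chosen, purity = select_variables(toy_data, toy_labels)
    print("selected variables:", chosen, "unit purity:", purity)
```

Scoring each candidate subset on a freshly trained map, rather than in the raw variable space, mirrors the abstract's point that the chosen variables should suit the SOM itself. The price is one SOM training run per candidate subset, which is why the sketch keeps the map and the data small.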

Keywords

data mining; data visualisation; search problems; self-organising feature maps; statistical analysis; unsupervised learning; Fisher discriminant analysis; Hitura mine; SOM; optimization algorithm; predefined clusters separation; problem visualization; self-organizing map; separability; variables selection; Algorithm design and analysis; Clustering algorithms; Data mining; Data visualization; Extraterrestrial measurements; Information science; Labeling; Laboratories; Principal component analysis; Visual databases

fLanguage

English

Publisher

IEEE

Conference_Titel

Neural Information Processing, 2002. ICONIP '02. Proceedings of the 9th International Conference on

Print_ISBN

981-04-7524-1

Type

conf

DOI

10.1109/ICONIP.2002.1199016

Filename

1199016