Title :
Assessment of dimensionality reduction based on communication channel model; application to immersive information visualization
Author :
Babaee, Mohammadreza ; Datcu, Mihai ; Rigoll, Gerhard
Author_Institution :
Inst. for Human-Machine Commun., Tech. Univ. Munchen, Munich, Germany
Abstract :
We address large-scale, high-dimensional image data sets, which require new data mining approaches in which visualization plays a central role. Dimensionality reduction (DR) techniques are widely used to visualize high-dimensional data; their drawback is the information lost when the number of dimensions is reduced. In this paper, we introduce a novel metric to assess the quality of DR techniques in terms of how well they preserve the structure of the data. We model the dimensionality reduction process as a communication channel that transfers data points from a high-dimensional space (input) to a lower-dimensional one (output). In this model, a co-ranking matrix measures the degree of similarity between the input and the output. Mutual information (MI) and entropy defined over the co-ranking matrix measure the quality of the applied DR technique. We validate our method by reducing the dimension of SIFT and Weber descriptors extracted from Earth Observation (EO) optical images. In our experiments, Laplacian Eigenmaps (LE) and Stochastic Neighbor Embedding (SNE) serve as the DR techniques. The experimental results show that the DR technique with the largest MI and entropy preserves the structure of the data better than the others.
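The abstract does not spell out how MI and entropy are defined over the co-ranking matrix, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes Euclidean distances, the standard co-ranking construction (entry (k, l) counts point pairs whose neighbor rank is k in the input space and l in the output space), and a simple normalization of that matrix into a joint distribution over rank pairs. The function names `rank_matrix`, `coranking_matrix`, and `entropy_and_mi` are hypothetical.

```python
import numpy as np


def rank_matrix(X):
    """Return R where R[i, j] is the neighbor rank of point j as seen from i.

    Rank 0 is reserved for the point itself; ranks 1..n-1 are its neighbors
    ordered by increasing Euclidean distance (an assumed distance measure).
    """
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Force each point to sort first in its own row, even with duplicates.
    dist[np.diag_indices(n)] = -1.0
    order = np.argsort(dist, axis=1, kind="stable")
    ranks = np.empty_like(order)
    ranks[np.arange(n)[:, None], order] = np.arange(n)[None, :]
    return ranks


def coranking_matrix(X_high, X_low):
    """Count pairs (i, j) with rank k in the input and rank l in the output."""
    n = X_high.shape[0]
    r_high = rank_matrix(X_high)
    r_low = rank_matrix(X_low)
    off_diag = ~np.eye(n, dtype=bool)  # skip each point's pairing with itself
    Q = np.zeros((n - 1, n - 1), dtype=np.int64)
    np.add.at(Q, (r_high[off_diag] - 1, r_low[off_diag] - 1), 1)
    return Q


def entropy_and_mi(Q):
    """Joint entropy and mutual information (in bits) of the normalized Q.

    Assumption: Q / Q.sum() is read as a joint distribution over
    (input rank, output rank); the paper may define these differently.
    """
    P = Q / Q.sum()
    p_in = P.sum(axis=1)   # marginal over input-space ranks
    p_out = P.sum(axis=0)  # marginal over output-space ranks
    nz = P > 0             # avoid log(0) on empty cells
    joint_entropy = -(P[nz] * np.log2(P[nz])).sum()
    mi = (P[nz] * np.log2(P[nz] / np.outer(p_in, p_out)[nz])).sum()
    return joint_entropy, mi
```

Under these assumptions, an embedding that preserves every neighbor rank yields a diagonal co-ranking matrix, and the MI reaches its maximum of log2(n - 1) bits for n points. Calling `entropy_and_mi(coranking_matrix(X_high, X_low))` on the outputs of two DR techniques would then give comparable scores for each.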
Keywords :
data mining; data structures; data visualisation; matrix algebra; very large databases; visual databases; DR techniques; EO optical images; Laplacian Eigenmaps; SIFT; SNE; Weber descriptors; co-ranking matrix; communication channel model; data points; data structure; degree of similarity; dimension reduction techniques; dimensionality reduction process; earth observation optical images; entropy; immersive information visualization; information loss; large-scale high-dimensional image data sets; mutual information; stochastic neighbor embedding; Communication channels; Data mining; Data models; Data visualization; Entropy; Feature extraction; Mutual information; Communication channel; Dimensionality Reduction; Immersive information Visualization; Quality Assessment
Conference_Title :
Big Data, 2013 IEEE International Conference on
Conference_Location :
Silicon Valley, CA
DOI :
10.1109/BigData.2013.6691726