Title :
Visualization of high dimensional data using Similarity-Dissimilarity plot
Author_Institution :
Dept. of Comput. Sci. & Eng., Air Univ., Islamabad, Pakistan
Abstract :
The quality of features in discriminating between classes plays an important role in pattern classification problems. In real-life applications, pattern classification may require a high dimensional feature space in which the class distribution is impossible to visualize. In this paper, we propose a Similarity-Dissimilarity plot, which projects a high dimensional space onto a two dimensional space while retaining the characteristics needed to assess the discrimination quality of the features. The Similarity-Dissimilarity plot reveals the amount of overlap between the features of different classes. Data points of different classes that are separable, and can therefore be classified correctly by an appropriate classifier, are also visible on the plot. Hence, approximate classification accuracy can be predicted. Moreover, it is possible to determine with which class the classifier will confuse the misclassified data points. Outlier data points can also be located on the plot. Various examples of synthetic data are used to highlight important characteristics of the proposed plot. Some real-life examples from biomedical data are also used for the analysis. The proposed plot is independent of the number of dimensions of the feature space.
Keywords :
data visualisation; pattern classification; similarity-dissimilarity plot; high dimensional data; quality of features; Accuracy; Artificial neural networks; Biomembranes; Clustering algorithms; Indexes
Conference_Title :
Intelligent and Advanced Systems (ICIAS), 2010 International Conference on
Conference_Location :
Kuala Lumpur, Malaysia
Print_ISBN :
978-1-4244-6623-8
DOI :
10.1109/ICIAS.2010.5716185