Title :
A shrinking-based dimension reduction approach for multi-dimensional analysis
Author :
Shi, Yong ; Zhang, Aidong
Author_Institution :
Dept. of Comput. Sci. & Eng., State Univ. of New York, USA
Abstract :
In this paper, we continue our research on data analysis, building on our previous work on the shrinking approach. Shrinking is a novel data preprocessing technique, inspired by Newton's Universal Law of Gravitation, that optimizes the inner structure of data. It can be applied in many data mining fields. Following our previous work on the shrinking method for multidimensional data analysis in full data space, we propose a shrinking-based dimension reduction approach that addresses the dimension reduction problem from a new perspective. In this approach, data are moved along the direction of the density gradient, making the inner structure of the data more prominent. The process is conducted on a sequence of grids with different cell sizes. Dimension reduction is performed based on the difference between the data distribution projected on each dimension before and after the data-shrinking process. Dimensions whose data distribution varies dramatically through the data-shrinking process are selected as good candidates for further data analysis. This approach can help improve the performance of existing data analysis approaches. We demonstrate how this shrinking-based dimension reduction approach affects the clustering results of well-known algorithms.
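The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract gives no formulas, so the movement rule (each point jumps to the centroid of the points in its own and adjacent grid cells), the L1 histogram difference used as the per-dimension score, and all function names and parameters (`shrink_once`, `shrink`, `dimension_scores`, `bins`, `lo`, `hi`) are assumptions made for illustration. Data are assumed to lie in the range [lo, hi] on every dimension.

```python
from collections import defaultdict
from itertools import product


def shrink_once(points, cell_size):
    """One simplified shrinking step on a grid with the given cell size.

    Assumption: each point moves to the centroid of all points in its own
    cell and the adjacent cells, a crude stand-in for moving data along
    the density gradient.
    """
    dims = len(points[0])
    cells = defaultdict(list)
    for p in points:
        cells[tuple(int(x // cell_size) for x in p)].append(p)

    offsets = list(product((-1, 0, 1), repeat=dims))
    moved = []
    for p in points:
        key = tuple(int(x // cell_size) for x in p)
        neighbours = []
        for off in offsets:
            neighbours.extend(cells.get(tuple(k + o for k, o in zip(key, off)), []))
        # The point itself is always among the neighbours, so the list is non-empty.
        moved.append(tuple(sum(q[d] for q in neighbours) / len(neighbours)
                           for d in range(dims)))
    return moved


def shrink(points, cell_sizes):
    """Apply the shrinking step on a sequence of grids with different cell sizes."""
    for size in cell_sizes:
        points = shrink_once(points, size)
    return points


def histogram(values, bins, lo, hi):
    """Equal-width histogram of values assumed to lie in [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return counts


def dimension_scores(before, after, bins=10, lo=0.0, hi=1.0):
    """Score each dimension by how much its projected histogram changed.

    Assumption: the change is measured as the L1 distance between the
    histograms before and after shrinking, divided by the number of points.
    """
    n = len(before)
    scores = []
    for d in range(len(before[0])):
        hb = histogram([p[d] for p in before], bins, lo, hi)
        ha = histogram([p[d] for p in after], bins, lo, hi)
        scores.append(sum(abs(a - b) for a, b in zip(ha, hb)) / n)
    return scores
```

A possible usage, under the same assumptions: call `after = shrink(data, [0.2, 0.1])`, then `scores = dimension_scores(data, after)`, and keep the dimensions with the highest scores as candidates for clustering. Note that the neighbourhood enumeration grows as 3^d with the number of dimensions d, so this sketch is only practical for low-dimensional toy data.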
Keywords :
data analysis; data mining; data reduction; data structures; database management systems; pattern clustering; Newton Universal Law of Gravitation; data clustering; data distribution; data preprocessing technique; data space; data structure optimization; data-shrinking process; density gradient; grid sequence; multidimensional analysis; shrinking-based dimension reduction; Clustering algorithms; Computer science; Data analysis; Data engineering; Data mining; Data preprocessing; Databases; Explosives; Histograms; Multidimensional systems
Conference_Title :
Scientific and Statistical Database Management, 2004. Proceedings. 16th International Conference on
Print_ISBN :
0-7695-2146-0
DOI :
10.1109/SSDM.2004.1311243