From similarity retrieval to cluster analysis: The case of R*-trees

Author

Pi, Jiaxiong ; Shi, Yong ; Chen, Zhengxin

Author_Institution

Nebraska Univ., Omaha, NE

fYear

2007

fDate

March 1 2007-April 5 2007

Firstpage

524

Lastpage

529

Abstract

Data mining draws on both database techniques and AI/machine learning mechanisms, and it provides an excellent opportunity to explore the relationship between retrieval and inference/reasoning, a fundamental issue concerning the nature of data mining. In the data mining context, this relationship can be restated as the connection and differences between data retrieval and data mining. In this paper we explore this relationship by examining time series data indexed through R*-trees, and we study two issues: (1) retrieval of data similar to a given query (a plain data retrieval task), and (2) clustering of the data based on similarity (a data mining task). In examining this central theme, we also report new algorithms and new results related to these two issues. We have developed a software package consisting of a similarity analysis tool and two implemented clustering algorithms: KMeans-R and Hierarchy-R. A sketch of experimental results is also provided.

Keywords

data mining; inference mechanisms; learning (artificial intelligence); pattern clustering; tree data structures; Hierarchy-R; KMeans-R; R-trees; artificial intelligence; cluster analysis; data retrieval; database techniques; machine learning; reasoning; similarity retrieval; Clustering algorithms; Computational intelligence; Indexes; Information retrieval; Learning systems; Multidimensional systems; Software packages; Spatial databases

fLanguage

English

Publisher

IEEE

Conference_Titel

Computational Intelligence and Data Mining, 2007. CIDM 2007. IEEE Symposium on

Conference_Location

Honolulu, HI

Print_ISBN

1-4244-0705-2

Type

conf

DOI

10.1109/CIDM.2007.368919

Filename

4221343