Title :
Clustering and Validation for Very Large Databases (VLDB)
Author :
Momin, Bashirahamed F.
Author_Institution :
Walchand Coll. of Eng., Sangli
Abstract :
The digital revolution has made digitized information easy to capture and fairly inexpensive to store in databases at an exponential rate. Unfortunately, these very large databases (VLDB) cannot be analyzed in a reasonable amount of time or at a reasonable cost. Clustering is one analysis operation that groups similar objects based on their distance, connectivity, relative density, or some specific characteristics. Predicting the correct number of clusters and evaluating their quality are the key issues. This paper reviews various techniques for clustering and cluster validation. It presents a framework for cluster validation using the most commonly used validity indices. Experiments on a microarray gene expression dataset demonstrate the clustering and its validation. The results indicate how to choose good-quality clusters that help in data mining.
Keywords :
data mining; genetics; pattern clustering; very large databases; cluster validation; microarray gene expression dataset; quality evaluation; Clustering algorithms; Computer Society; Computer science; Data engineering; Databases; Gene expression; Iterative algorithms; Machine learning algorithms; Partitioning algorithms; VLDB; clustering; gene expression data
Conference_Title :
Information and Automation, 2006. ICIA 2006. International Conference on
Conference_Location :
Shandong
Print_ISBN :
1-4244-0555-6
Electronic_ISBN :
1-4244-0555-6
DOI :
10.1109/ICINFA.2006.374125