Title :
Patterns Discovery on Complex Diagnosis and Biological Data Using Fuzzy Latent Variables
Author :
Zong-Xian Yin ; Jung-Hsien Chiang
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., National Cheng Kung Univ., Taiwan
Abstract :
This paper proposes a new clustering algorithm, the possibilistic latent variables (PLV) clustering algorithm. The algorithm is a powerful tool for analyzing complex data, such as clinical diagnosis and biological expression data, because it is robust to various data distributions and accurate in establishing appropriate groups from the data. It combines a distribution model with the concept of fuzzy degrees. Compared with the expectation-maximization (EM) algorithm, a well-known distribution estimation algorithm, the PLV algorithm has the considerable advantage that it can be applied to various data types, i.e., it is not restricted to Gaussian data distributions. Additionally, the proposed algorithm performs better than the well-known fuzzy c-means (FCM) clustering algorithm in that it can identify compact regions rather than simply dividing objects into several equal populations. The performance of the proposed algorithm is verified by clustering several medical diagnosis and biological expression datasets.
Keywords :
biology computing; data mining; fuzzy set theory; pattern clustering; PLV clustering algorithm; biological data; complex diagnosis; distribution model; fuzzy degrees concept; fuzzy latent variables; patterns discovery; possibilistic latent variables; Algorithm design and analysis; Biology; Clinical diagnosis; Clustering algorithms; Computer science; Data analysis; Data engineering; Medical diagnostic imaging; Partitioning algorithms; Power engineering and energy
Conference_Title :
Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on
Conference_Location :
Istanbul
Print_ISBN :
1-4244-0802-4
DOI :
10.1109/ICDE.2007.367903