DocumentCode :
1924080
Title :
Protein sequence motif patterns using adaptive Fuzzy C-Means granular computing model
Author :
Chitralegha, M. ; Thangavel, K.
Author_Institution :
Dept. of Comput. Sci., Periyar Univ., Salem, India
fYear :
2013
fDate :
21-22 Feb. 2013
Firstpage :
96
Lastpage :
103
Abstract :
Data mining is the process of extracting hidden predictive information from large databases. In bioinformatics, data mining enables researchers to meet the challenge of mining large amounts of biomolecular data to discover real knowledge. Major research efforts in bioinformatics involve sequence analysis, protein structure prediction, and gene finding. Proteins are prominent molecules in our cells and are involved in virtually all cell functions. The activities and functions of proteins can be determined from protein sequence motifs, which are identified from segments of protein sequences. Not all segments are important for producing good motif patterns, and the generated sequence segments have no classes or labels. Hence, an unsupervised segment selection technique, the Singular Value Decomposition (SVD) entropy method, is adopted to select significant sequence segments. In the proposed work, Weighted K-Means and Adaptive Fuzzy C-Means are applied to the selected segments to generate granules, since such a large number of segments cannot be grouped or clustered directly. Each granule generated by the Weighted K-Means algorithm is further clustered using the K-Means algorithm, and the granules generated by the Adaptive Fuzzy C-Means algorithm are clustered using Weighted K-Means. The two proposed models are compared with the K-Means granular computing model. The experimental results show that Adaptive Fuzzy C-Means with Weighted K-Means produces better results than the K-Means and Weighted K-Means granular computing methods.
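The granule-generation step named in the abstract relies on fuzzy clustering, in which every segment receives a degree of membership in each granule instead of a single hard label. The sketch below shows standard fuzzy C-means only; the abstract does not describe the adaptive variant, the SVD entropy selection, or the segment encoding, so none of those are reproduced here. The function name `fuzzy_c_means`, its parameters (`m`, `max_iter`, `tol`, `seed`), and the toy 2-D points are illustrative assumptions, not taken from the paper.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means (not the paper's adaptive variant).

    points: list of equal-length numeric tuples.
    Returns (centers, memberships); memberships[i][k] is the degree to
    which point i belongs to cluster k, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )

        # Membership update from distances to the new centers.
        shift = 0.0
        for i in range(n):
            dist = [math.dist(points[i], c) for c in centers]
            if min(dist) == 0.0:
                # Point coincides with a center: assign crisp membership.
                hits = sum(1 for d in dist if d == 0.0)
                new_row = [1.0 / hits if d == 0.0 else 0.0 for d in dist]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
                total = sum(inv)
                new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once no membership changed by more than tol.
        if shift < tol:
            break

    return centers, u


# Toy data standing in for encoded sequence segments (hypothetical values).
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_centers, toy_memberships = fuzzy_c_means(toy_points, n_clusters=2)
# Hard granule label per point: the cluster with the highest membership.
toy_labels = [row.index(max(row)) for row in toy_memberships]
```

Taking the highest-membership cluster per point, as in the last line, is one simple way to turn fuzzy memberships into granules; assigning a point to every cluster whose membership exceeds a threshold is another, and the abstract does not say which the authors used.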
Keywords :
bioinformatics; data mining; entropy; learning (artificial intelligence); molecular biophysics; pattern classification; proteins; singular value decomposition; SVD entropy method; bioinformatics; biomolecular data; cell function; data mining; fuzzy C-means granular computing model; gene finding; knowledge discovery; protein sequence motif pattern; protein structure prediction; sequence analysis; singular value decomposition; weighted K-means technique; Amino acids; Bioinformatics; Clustering algorithms; Computational modeling; Partitioning algorithms; Protein sequence; Clustering; Motif; Proteins; SVD; Weighted K-Means;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Pattern Recognition, Informatics and Mobile Engineering (PRIME), 2013 International Conference on
Conference_Location :
Salem
Print_ISBN :
978-1-4673-5843-9
Type :
conf
DOI :
10.1109/ICPRIME.2013.6496454
Filename :
6496454