DocumentCode :
1925430
Title :
Outliers detection on protein localization sites by partitional clustering methods
Author :
Ashok, P. ; Kadhar Nawaz, G.M. ; Thangavel, K. ; Elayaraja, E.
Author_Institution :
R&D Centre, Bharathiar Univ., Coimbatore, India
fYear :
2013
fDate :
21-22 Feb. 2013
Firstpage :
447
Lastpage :
453
Abstract :
A protein is a large molecule composed of one or more chains of amino acids in a specific order; the order is determined by the base sequence of nucleotides in the gene that codes for the protein. Proteins are required for the structure, function, and regulation of the body's cells, tissues, and organs, and each protein has unique functions. A cellular mechanism identifies each protein's localization site and moves the protein to its corresponding organelle. In this paper, we introduce clustering and two of its types, K-Means and K-Medoids. The clustering algorithms are improved by using two proposed initial centroid selection methods instead of selecting centroids randomly. The resulting K-Means variants are compared using the Davies-Bouldin index; proposed algorithm 1 overcomes the drawbacks of initial cluster center selection better than the other methods. In the yeast dataset, the defective proteins (objects) are considered outliers, which are identified by the clustering methods with the ADOC (Average Distance between Object and Centroid) function. The outlier detection and performance analysis methods are studied and compared; the experimental results show that the K-Medoids method performs well when compared with K-Means clustering.
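The ADOC idea named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact outlier rule, so the threshold `factor`, the function names, and the toy data below are assumptions for illustration only, not the authors' method: an object is flagged when its distance to its cluster centroid exceeds `factor` times that cluster's average object-to-centroid distance.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def adoc(points, center):
    """Average Euclidean distance between each object and the centroid."""
    return sum(math.dist(p, center) for p in points) / len(points)

def flag_outliers(clusters, factor=1.5):
    """Return objects farther from their centroid than factor * ADOC.

    `factor` is an assumed threshold; the abstract does not specify one.
    """
    outliers = []
    for points in clusters:
        center = centroid(points)
        limit = factor * adoc(points, center)
        outliers.extend(p for p in points if math.dist(p, center) > limit)
    return outliers

# Toy clusters (hypothetical data, not the yeast dataset).
clusters = [
    [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)],
    [(20.0, 20.0), (20.0, 21.0), (21.0, 20.0), (21.0, 21.0)],
]
print(flag_outliers(clusters))
```

In this toy input only the point (10.0, 10.0) lies well beyond its cluster's average distance, so it is the one flagged; the second cluster is compact and yields no outliers. In practice the cluster assignments would come from K-Means or K-Medoids rather than being given by hand.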
Keywords :
biology computing; pattern clustering; proteins; ADOC function; Davie Bouldin index measure; base sequence; centroid; defective proteins; gene; k means clustering algorithm; k medoids method; nucleotides; organelles; outliers detection; partitional clustering method; performance analysis; protein localization sites; yeast dataset; Algorithm design and analysis; Biological systems; Clustering algorithms; Convergence; Indexes; Partitioning algorithms; Proteins; ADOC; Initial centroid; K-Means; K-Medoids; Outliers; stopping criteria;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Pattern Recognition, Informatics and Mobile Engineering (PRIME), 2013 International Conference on
Conference_Location :
Salem
Print_ISBN :
978-1-4673-5843-9
Type :
conf
DOI :
10.1109/ICPRIME.2013.6496519
Filename :
6496519