Title : 
Centre-based clustering for Y-Short Tandem Repeats (Y-STR) as numerical and categorical data
         
        
            Author : 
Seman, Ali ; Bakar, Z.A. ; Sapawi, Azizian Mohd.
         
        
            Author_Institution : 
Centre for Comput. Sci. Studies, Univ. Teknol. MARA (UiTM), Shah Alam, Malaysia
         
        
        
        
        
        
            Abstract : 
Centre-based clustering is among the most widely applicable methods for partitioning objects into homogeneous groups. This paper applies two centre-based clustering algorithms, K-Means and K-Modes, to Y-STR data and evaluates the clustering results. The main goal is to compare clustering accuracy when the Y-STR data is treated as two different data types: numerical and categorical. The results show that Y-STR data is better suited to treatment as categorical data. Treated as categorical data, the Y-STR data is clustered with an accuracy of 49%, whereas treated as numerical data it has only a 26% chance of producing a good clustering result. However, clustering the numerical data takes much less time than clustering the categorical data.
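
The numerical/categorical distinction in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the usual formulations: K-Means with Euclidean distance and mean centres, and K-Modes with simple-matching dissimilarity and mode centres. The function names and the haplotype values (repeat counts at four markers) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def euclidean_distance(a, b):
    """Numerical view (K-Means style): alleles are treated as numbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def matching_dissimilarity(a, b):
    """Categorical view (K-Modes style): count the markers whose alleles differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def mean_centre(cluster):
    """K-Means style centre: per-marker arithmetic mean."""
    return [sum(col) / len(col) for col in zip(*cluster)]

def mode_centre(cluster):
    """K-Modes style centre: per-marker most frequent allele."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*cluster)]

# Hypothetical haplotypes (illustrative values, not data from the paper).
h1 = [13, 24, 14, 11]
h2 = [13, 25, 14, 11]
h3 = [13, 30, 14, 11]

# Numerically, h3 is much farther from h1 than h2 is.
print(euclidean_distance(h1, h2))      # 1.0
print(euclidean_distance(h1, h3))      # 6.0

# Categorically, h2 and h3 each differ from h1 at exactly one marker.
print(matching_dissimilarity(h1, h2))  # 1
print(matching_dissimilarity(h1, h3))  # 1

cluster = [h1, h2, h2]
print(mode_centre(cluster))            # [13, 25, 14, 11]
```

Under the numerical view the size of the allele difference matters, while under the categorical view only whether the alleles match matters; this is the distinction the two algorithms are compared on.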
         
        
            Keywords : 
pattern clustering; Y-short tandem repeats; categorical data; centre-based clustering; k-means clustering; k-modes clustering; numerical data; Bioinformatics; Clustering algorithms; Clustering methods; Computer science; DNA; Helium; Partitioning algorithms; Sequences; Y-STR
         
        
        
        
            Conference_Title : 
Information Retrieval & Knowledge Management, (CAMP), 2010 International Conference on
         
        
            Conference_Location : 
Shah Alam, Selangor
         
        
            Print_ISBN : 
978-1-4244-5650-5
         
        
        
            DOI : 
10.1109/INFRKM.2010.5466953