DocumentCode :
3184533
Title :
Efficient Markov clustering algorithm for protein sequence grouping
Author :
Szilagyi, L. ; Szilagyi, Sandor M.
Author_Institution :
Dept. of Electr. Eng., Sapientia Univ., Tîrgu Mureş, Romania
fYear :
2013
fDate :
3-7 July 2013
Firstpage :
639
Lastpage :
642
Abstract :
In this paper we propose an efficient reformulation of the Markov clustering algorithm, suitable for fast and accurate grouping of protein sequences based on pairwise similarity information. The proposed modification consists of optimal reordering of rows and columns in the similarity matrix after every iteration, transforming it into a matrix with several compact blocks along the diagonal and zero similarities outside the blocks. These blocks are treated separately in later iterations, thus reducing the computational burden of the algorithm. The proposed algorithm was tested on protein sequence databases such as SCOP95. In terms of efficiency, the proposed solution achieves a speed-up factor in the range 15-50 compared to conventional Markov clustering, depending on input data size and parameter settings. This improvement in computation time is reached without any loss of partition accuracy. Convergence is usually reached in 40-50 iterations. Combining the proposed method with sparse matrix representation and parallel execution will certainly lead to a significantly more efficient solution in the future.
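The block-splitting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the inflation value, the pruning threshold `eps`, and the use of connected components to find the diagonal blocks (in place of the paper's optimal row and column reordering) are all assumptions made for illustration.

```python
def normalize_columns(m):
    """Scale every column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]

def mcl_step(m, inflation, eps):
    """One Markov clustering iteration: expansion, inflation, pruning."""
    n = len(m)
    # Expansion: square the matrix.
    sq = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    # Inflation: element-wise power, then column renormalization.
    inf = normalize_columns([[x ** inflation for x in row] for row in sq])
    # Pruning (assumed threshold): zero out near-vanishing entries.
    pruned = [[x if x >= eps else 0.0 for x in row] for row in inf]
    return normalize_columns(pruned)

def components(m):
    """Connected components of the graph induced by nonzero entries."""
    n = len(m)
    seen = [False] * n
    comps = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in range(n):
                if not seen[v] and (m[u][v] > 0 or m[v][u] > 0):
                    seen[v] = True
                    stack.append(v)
        comps.append(sorted(comp))
    return comps

def block_mcl(sim, inflation=2.0, eps=1e-4, max_iter=50):
    """Markov clustering in which independent blocks are iterated separately."""
    n = len(sim)
    # Each block is (global indices, local column-stochastic matrix).
    blocks = [(list(range(n)), normalize_columns(sim))]
    for _ in range(max_iter):
        changed = False
        new_blocks = []
        for idx, m in blocks:
            if len(idx) == 1:          # a singleton block is already final
                new_blocks.append((idx, m))
                continue
            nxt = mcl_step(m, inflation, eps)
            size = len(idx)
            if any(abs(nxt[i][j] - m[i][j]) > 1e-9
                   for i in range(size) for j in range(size)):
                changed = True
            # Split the block wherever it has become disconnected.
            for comp in components(nxt):
                sub = [[nxt[i][j] for j in comp] for i in comp]
                new_blocks.append(([idx[i] for i in comp], sub))
        blocks = new_blocks
        if not changed:
            break
    return sorted(idx for idx, _ in blocks)

# Toy similarity matrix: two groups of three items joined by one weak link.
sim = [
    [1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
]
clusters = block_mcl(sim)
```

Once the weak link between the two groups is pruned, each 3x3 block is expanded on its own, so a squaring step costs two small products instead of one 6x6 product. This is the source of the savings the abstract describes, shown here on a toy scale only.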
Keywords :
Markov processes; pattern clustering; proteins; proteomics; sparse matrices; SCOP95 database; columns reordering; computation time; convergence; efficiency; efficient Markov clustering algorithm; optimal reordering; pairwise similarity information; partition accuracy; protein sequence grouping; rows reordering; similarity matrix; sparse matrix representation; Accuracy; Clustering algorithms; Databases; Markov processes; Partitioning algorithms; Proteins; Symmetric matrices;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Engineering in Medicine and Biology Society (EMBC), 2013 35th Annual International Conference of the IEEE
Conference_Location :
Osaka
ISSN :
1557-170X
Type :
conf
DOI :
10.1109/EMBC.2013.6609581
Filename :
6609581