Title :
Spectral clustering of protein sequences
Author :
Paccanaro, Alberto ; Chennubhotla, C. ; Casbon, James A. ; Saqi, Mansoor A S
Author_Institution :
Dept. of Med. Microbiol., Queen Mary Univ. of London, UK
Abstract :
A major challenge in bioinformatics is grouping protein sequences into functionally similar families. Large-scale clustering of protein sequences may help to identify novel relationships and may also be useful in structural genomics. This paper explores the use of graph-theoretic spectral methods for clustering protein sequences. Using the leading eigenvectors of a matrix derived from similarity information between protein sequences, we obtained meaningful clusters on quite diverse sets of proteins. The results show that this method is often able to correctly identify the superfamilies to which the sequences belong.
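The idea of clustering from the leading eigenvectors of a similarity-derived matrix can be illustrated with a minimal sketch. The abstract does not say which matrix or which clustering step the paper uses, so this sketch assumes a generic normalized spectral clustering: a symmetrically normalized similarity matrix, row-normalized eigenvectors, and a small k-means. The function names (spectral_embedding, kmeans) and the toy similarity values are hypothetical, chosen only for illustration.

```python
import numpy as np

def spectral_embedding(S, k):
    """Embed items using the k leading eigenvectors of D^{-1/2} S D^{-1/2}.

    Assumption: S is a symmetric, non-negative similarity matrix with
    strictly positive row sums. This normalization is one common choice,
    not necessarily the one used in the paper.
    """
    S = np.asarray(S, dtype=float)
    d_inv_sqrt = 1.0 / np.sqrt(S.sum(axis=1))
    M = S * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order, so the last k columns
    # are the eigenvectors with the k largest eigenvalues.
    _, eigvecs = np.linalg.eigh(M)
    X = eigvecs[:, -k:]
    # Normalize each row to unit length before clustering.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.where(norms > 0, norms, 1.0)

def kmeans(X, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Toy similarity matrix (invented values): two groups of three sequences,
# high similarity within a group and low similarity across groups.
S_toy = np.full((6, 6), 0.05)
S_toy[:3, :3] = 0.9
S_toy[3:, 3:] = 0.9
np.fill_diagonal(S_toy, 1.0)

labels = kmeans(spectral_embedding(S_toy, k=2), k=2)
```

In practice the similarity matrix would be built from pairwise sequence comparison scores; how those scores are obtained and transformed is not stated in the abstract and is left open here.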
Keywords :
biology computing; eigenvalues and eigenfunctions; graph theory; pattern clustering; proteins; bioinformatics; graph-theoretic spectral methods; matrix eigenvectors; protein sequence clustering; protein superfamilies; spectral clustering; structural genomics; Bioinformatics; Biotechnology; Clustering algorithms; Computer science; Couplings; Genomics; Large-scale systems; Protein engineering; Protein sequence; Throughput
Conference_Title :
Neural Networks, 2003. Proceedings of the International Joint Conference on
Print_ISBN :
0-7803-7898-9
DOI :
10.1109/IJCNN.2003.1224064