Title :
ClassX: a browsing tool for protein sequence megaclassifications
Author :
Harris, Nomi L. ; States, David J. ; Hunter, Lawrence
Author_Institution :
Dept. of Pharm. Chem., California Univ., San Francisco, CA, USA
Abstract :
The authors have developed an algorithm, HHS, for efficient clustering of very large sequence databases into groups based on sequence similarity. When applied to protein sequence databases, the induced groups objectively define both protein superfamilies and recurring sequence motifs. HHS has been successfully applied to large sequence databases; however, the output is too large and complex to be readily interpreted. The authors report on a software tool called ClassX that was developed to help explore the results of megaclassification runs. They also discuss the problem of automatic annotations, i.e., how to use the classifications produced by the program to label regions of interest in an anonymous protein sequence.
Keywords :
biology computing; classification; macromolecular configurations; pattern recognition; proteins; ClassX; HHS; automatic annotations; browsing tool; clustering; induced groups; large sequence databases; protein sequence megaclassifications; protein superfamilies; recurring sequence motifs; region labelling; sequence similarity; Amino acids; Biology computing; Clustering algorithms; Databases; Genetic mutations; Insulin; Libraries; Modems; Protein sequence; Software tools;
Conference_Title :
System Sciences, 1993, Proceedings of the Twenty-Sixth Hawaii International Conference on
Print_ISBN :
0-8186-3230-5
DOI :
10.1109/HICSS.1993.270685