DocumentCode
3228027
Title
A simple clustering approach for pathogenic strain identification based on local and global amino acid compositional signatures from genomic sequences: the Escherichia genus case
Author
Promponas, Vasilis J.
Author_Institution
Dept. of Biol. Sci., Univ. of Cyprus, Nicosia, Cyprus
fYear
2009
fDate
4-7 Nov. 2009
Firstpage
1
Lastpage
4
Abstract
Cluster analysis offers a suite of powerful unsupervised methods, commonly used as exploratory data analysis tools. Such tools can prove especially useful when analyzing large data sets to gain intuitive insight into subtle correlations between instances of the data. In this work, we demonstrate that simple hierarchical clustering approaches (based on compositional features extracted from the amino acid sequences encoded in the complete genomic sequences of 25 species/strains belonging to the proteobacterial genus Escherichia) can be used to accurately discriminate between pathogenic and nonpathogenic strains of those bacteria.
Keywords
biology computing; cellular biophysics; feature extraction; genomics; microorganisms; molecular biophysics; pattern clustering; statistical analysis; Escherichia genus; amino acid compositional signature; amino acid sequences; cluster analysis; compositional feature extraction; exploratory data analysis tool; genomic sequences; hierarchical clustering; pathogenic strain identification; proteobacteria; Amino acids; Bioinformatics; Capacitive sensors; Data analysis; Genomics; Intestines; Microorganisms; Organisms; Pathogens; Proteins; Compositional signatures; bacterial pathogenicity; clustering; genome;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Technology and Applications in Biomedicine, 2009. ITAB 2009. 9th International Conference on
Conference_Location
Larnaca
Print_ISBN
978-1-4244-5379-5
Electronic_ISBN
978-1-4244-5379-5
Type
conf
DOI
10.1109/ITAB.2009.5394396
Filename
5394396