DocumentCode :
2460709
Title :
A Comparison Study of Virus Classification by Genome Sequences
Author :
Wang, Jing-doo
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Asia Univ., Taichung, Taiwan
fYear :
2011
fDate :
24-26 Oct. 2011
Firstpage :
270
Lastpage :
273
Abstract :
In this study, instead of traditional approaches to virus classification, we proposed a novel approach based on the vector space model that classifies viruses using two types of genome sequences, DNA and CDS. For DNA sequences, the k-mer approach was adopted for pattern extraction, and the entropy of the pattern frequency distribution among classes was used for pattern weighting. For CDS sequences, pattern extraction was based on identifying distinctive protein functions formed by CDS clustering, and a weighting method similar to the tf * idf approach was proposed for these protein functions. The experimental data were downloaded from NCBI and comprised 1,877 viruses in 35 classes (virus families). The highest classification accuracies obtained with an SVM classifier were 94.7% and 91.3% for DNA and CDS sequences, respectively. This study not only proposed a novel approach to virus classification but also provided a new methodology for comparative genomic analysis.
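The DNA branch described in the abstract (k-mer extraction followed by entropy-based weighting) can be illustrated with a minimal Python sketch. The abstract does not give the exact weighting formula, so the normalization `1 - H / log(C)` used here is an assumption, and the function names `kmer_counts` and `entropy_weights` and the toy sequences are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter

def kmer_counts(seq, k):
    """Count overlapping k-mers in a DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def entropy_weights(class_counts):
    """Weight each k-mer by 1 - H / log(C), where H is the entropy of the
    k-mer's frequency distribution across the C classes.
    This normalization is an assumption, not the paper's stated formula."""
    classes = list(class_counts)
    n_classes = len(classes)
    patterns = set().union(*(class_counts[c].keys() for c in classes))
    weights = {}
    for p in patterns:
        freqs = [class_counts[c][p] for c in classes]
        total = sum(freqs)
        probs = [f / total for f in freqs if f > 0]
        h = -sum(q * math.log(q) for q in probs)
        weights[p] = 1.0 - h / math.log(n_classes) if n_classes > 1 else 1.0
    return weights

# Toy sequences, invented for illustration only (not data from the study).
toy = {"familyA": ["ACGTACGT", "ACGTTT"], "familyB": ["TTTTGGGG"]}
k = 3
class_counts = {}
for name, seqs in toy.items():
    counts = Counter()
    for s in seqs:
        counts.update(kmer_counts(s, k))
    class_counts[name] = counts

weights = entropy_weights(class_counts)
```

Under this assumed scheme, a k-mer that occurs in only one class receives weight 1, while a k-mer spread evenly across all classes receives weight 0, so class-specific patterns dominate the feature vectors passed to a classifier such as an SVM.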
Keywords :
DNA; biology computing; cellular biophysics; genomics; microorganisms; molecular biophysics; physiological models; proteins; support vector machines; CDS clustering; DNA sequence; SVM classifier; classification accuracy; comparative genomic analysis; genome sequences; k-mer approach; pattern extraction; pattern frequency distribution; pattern weighting; protein functions; vector space model; virus classification; Accuracy; Bioinformatics; DNA; Encoding; Genomics; Vectors; Viruses (medical); Comparative genomics; genome sequence; virus classification;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Bioinformatics and Bioengineering (BIBE), 2011 IEEE 11th International Conference on
Conference_Location :
Taichung
Print_ISBN :
978-1-61284-975-1
Type :
conf
DOI :
10.1109/BIBE.2011.47
Filename :
6089838