DocumentCode :
2700809
Title :
Linguistic analysis of the nucleoprotein gene of influenza A virus
Author :
Skourikhine, Alexei N. ; Burr, Tom
Author_Institution :
Safeguards Syst. Group, Los Alamos Nat. Lab., NM, USA
fYear :
2000
fDate :
2000
Firstpage :
193
Lastpage :
199
Abstract :
Applies a linguistic analysis method (N-grams) to classify nucleotide and amino acid sequences of the nucleoprotein (NP) gene of the influenza A virus isolated from three hosts and several geographic regions. We considered letter frequency (1-grams), letter-pair frequency (2-grams) and triplet frequency (3-grams). Nearest-neighbor classifiers and decision-tree classifiers based on 1-, 2- and 3-grams were constructed for NP nucleotide and amino acid strains, and their classification efficiencies were compared with the groupings obtained using phylogenetic analysis. Our results show that disregarding positional information for NP can provide almost the same high level of classification accuracy as alternative, more complex classification techniques that use positional information.
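The position-free approach summarized in the abstract can be illustrated with a minimal sketch: each sequence is reduced to a table of n-gram frequencies, and a query is assigned the label of the training sequence with the closest table. This is not the authors' implementation. The Euclidean distance, the function names, and the toy sequences and host labels below are all assumptions made for illustration; the abstract does not state which distance measure or data were used.

```python
from collections import Counter
from math import sqrt


def ngram_profile(seq, n):
    """Relative frequency of each overlapping n-gram in seq, ignoring position."""
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    if total == 0:  # sequence shorter than n: no n-grams to count
        return {}
    return {gram: c / total for gram, c in counts.items()}


def distance(p, q):
    """Euclidean distance between two sparse frequency profiles (an assumed metric)."""
    return sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in set(p) | set(q)))


def nearest_neighbor(query, labeled, n=2):
    """Return the label of the training sequence whose n-gram profile is closest."""
    qp = ngram_profile(query, n)
    return min(labeled, key=lambda item: distance(qp, ngram_profile(item[0], n)))[1]


# Toy sequences and host labels, invented for illustration only.
training = [
    ("ATGGCGTCTCAAGGCACC", "avian"),
    ("ATGGCGTCCCAAGGCACC", "avian"),
    ("TTTTAAAATTTTAAAATT", "human"),
]

print(nearest_neighbor("ATGGCGTCTCAGGGCACC", training))
```

Setting `n` to 1, 2 or 3 gives the letter, letter-pair and triplet profiles named in the abstract. The same functions work on amino acid strings, since nothing in them depends on the alphabet.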
Keywords :
biocybernetics; computational linguistics; decision trees; diseases; genetics; microorganisms; nomograms; pattern classification; proteins; sequences; N-grams; amino acid sequence classification; classification accuracy; classification efficiency; decision-tree classifiers; geographic regions; influenza A virus; letter frequency; letter-pair frequency; linguistic analysis; nearest-neighbor classifiers; nucleoprotein gene; nucleotide sequence classification; phylogenetic analysis; positional information; triplet frequency; Aggregates; Amino acids; Capacitive sensors; Classification tree analysis; Frequency; Genetics; Influenza; Laboratories; Sequences; Strain measurement;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Bio-Informatics and Biomedical Engineering, 2000. Proceedings. IEEE International Symposium on
Conference_Location :
Arlington, VA
Print_ISBN :
0-7695-0862-6
Type :
conf
DOI :
10.1109/BIBE.2000.889607
Filename :
889607