Title :
Principal component based method for whole genome phylogenetic analysis without alignment: Application to HEV genotype
Author :
Sahana, Subrata ; Das, Sanjoy ; Sarkar, Bimal Kumar
Author_Institution :
Dept. of Comput. Sci. & Eng., Galgotias Univ., Greater Noida, India
Abstract :
We describe a principal component method for DNA sequence analysis using digital filters. With the large volume of sequence data available in the public domain, digital filters are a useful tool for DNA sequence processing. In this technique, the occurrence frequency of a q-gram genetic word of interest is determined from the DNA sequence. The sequence is then processed with a finite impulse response (FIR) filter to determine the q-gram word density along the sequence. The word density distribution is then used in principal component analysis (PCA) to determine the similarity or dissimilarity between sequences. The technique is verified using 48 HEV genotypes. The results are in good agreement with those of other methods.
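The three steps named in the abstract (q-gram occurrence, FIR filtering to a word density, PCA on the densities) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the function names, the uniform moving-average taps used as the FIR filter, the window length, the resampling of each density profile to a fixed number of points so that sequences of different length can be compared, and the toy sequences.

```python
import numpy as np

def word_indicator(seq, word):
    """Return 1.0 at every position where `word` starts in `seq`, else 0.0."""
    q = len(word)
    n = len(seq) - q + 1
    return np.array([1.0 if seq[i:i + q] == word else 0.0 for i in range(n)])

def word_density(seq, word, window=8, bins=16):
    """Smooth the occurrence indicator with an FIR filter and resample it.

    Assumptions (not from the abstract): uniform taps of length `window`
    (a moving average) and linear resampling to `bins` points.
    """
    ind = word_indicator(seq, word)
    taps = np.ones(window) / window            # assumed FIR coefficients
    dens = np.convolve(ind, taps, mode="same") # local word density
    grid = np.linspace(0, len(dens) - 1, bins)
    return np.interp(grid, np.arange(len(dens)), dens)

def pca_scores(X, k=2):
    """Project the rows of X onto their first k principal components."""
    Xc = X - X.mean(axis=0)                    # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * S[:k]

# Toy sequences, made up for illustration only.
seqs = {
    "s1": "ATGCATATATGCGCATATGCATGCATATAT",
    "s2": "ATGCATATATGCGCATATGCATGCATATTT",
    "s3": "GCGCGCGGCCGCGATGCGCGCCGGCGCGCG",
}
X = np.vstack([word_density(s, "AT") for s in seqs.values()])
scores = pca_scores(X, k=2)
for name, row in zip(seqs, scores):
    print(name, row)
```

In this sketch, sequences whose rows of `scores` lie close together have similar word-density profiles for the chosen q-gram; how the paper turns such distances into a phylogeny is not stated in the abstract.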
Keywords :
DNA; FIR filters; bioinformatics; digital filters; genetics; principal component analysis; DNA sequence analysis; DNA sequence processing; FIR type filter; HEV genotype; PCA; data accessibility; finite impulse response type filter; occurrence frequency; principal component based method; public domain; q-gram genetic word; q-gram word density; sequence dissimilarity; sequence similarity; whole-genome phylogenetic analysis; Genomics; Phylogeny; Strain; DNA sequence; HEV; digital filter
Conference_Title :
Computing, Communication & Automation (ICCCA), 2015 International Conference on
Conference_Location :
Noida
Print_ISBN :
978-1-4799-8889-1
DOI :
10.1109/CCAA.2015.7148518