Title :
Conditional LZ Complexity of DNA Sequences Analysis and its Application in Phylogenetic Tree Reconstruction
Author :
Liu, Jingjun ; Li, Dachao
Author_Institution :
Dept. of Math., Hainan Normal Univ., Hainan
Abstract :
A DNA sequence can be identified with a word over the alphabet N = {A, C, G, T}. Characteristic sequences of a DNA sequence are defined in terms of classifications of the nucleic acid bases. Here we propose a new measure for the similarity analysis of DNA sequences. It is based on conditional LZ complexity and the (0,1) characteristic sequences of DNA primary sequences. This measure enables biologists to extract similarity information from biological sequences according to their requirements. For example, with this measure one can obtain either the full similarity information or a similarity analysis from a given biological aspect. Moreover, the length of the DNA primary sequence is not problematic. The new measure has been applied to phylogenetic tree construction based on a conditional LZ complexity distance matrix. Its application to the phylogenetic tree construction of 22 species shows its flexibility.
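The abstract does not give the definitions, so the following Python sketch only illustrates the general idea under stated assumptions. It assumes that a (0,1) characteristic sequence marks membership of each base in one class (for example purines, A and G), that LZ complexity is the number of components in the Lempel-Ziv (1976) exhaustive history, that the conditional complexity of Q given S is c(SQ) - c(S), and that the distance is normalized by the larger of the two complexities. The function names and the normalization are hypothetical choices made for illustration and may differ from the measure in the paper.

```python
def characteristic(seq: str, base_class: str) -> str:
    """Map a DNA word to a (0,1) sequence: 1 if the base is in base_class.

    Assumed classification, e.g. "AG" for purines vs. pyrimidines.
    """
    return "".join("1" if base in base_class else "0" for base in seq)


def lz_complexity(s: str) -> int:
    """Count the components of the Lempel-Ziv (1976) exhaustive history of s."""
    i, count, n = 0, 0, len(s)
    while i < n:
        k = 1
        # Extend the candidate while it can be copied from the text seen so far
        # (the source may overlap the candidate, hence the prefix s[:i+k-1]).
        while i + k <= n and s[i:i + k] in s[:i + k - 1]:
            k += 1
        count += 1  # one new component; the last one may be incomplete
        i += k
    return count


def conditional_lz(q: str, s: str) -> int:
    """Assumed conditional complexity c(Q|S) = c(SQ) - c(S)."""
    return lz_complexity(s + q) - lz_complexity(s)


def lz_distance(s: str, q: str) -> float:
    """Assumed normalized distance built from the two conditional complexities."""
    if not s or not q:
        raise ValueError("both sequences must be non-empty")
    numerator = max(conditional_lz(q, s), conditional_lz(s, q))
    return numerator / max(lz_complexity(s), lz_complexity(q))


if __name__ == "__main__":
    # Toy sequences, invented for illustration only (not data from the paper).
    a = characteristic("ATGGTGCACCTGACTCCTGA", "AG")
    b = characteristic("ATGGTGCATCTGTCCAGTGA", "AG")
    print(a, b, lz_distance(a, b))
```

Computing such a distance for every pair of species would give a distance matrix, which could then be passed to a distance-based tree method such as neighbor joining or UPGMA. The abstract does not say which tree method was used.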
Keywords :
DNA; biology computing; genetics; molecular biophysics; pattern classification; DNA sequence analysis; DNA sequence similarity analysis measure; Lempel-Ziv complexity; conditional LZ complexity; full similarity information; nucleic acid base classification; phylogenetic tree reconstruction; Bioinformatics; Biomedical engineering; Biomedical measurements; Data compression; Data mining; Genomics; Mathematics; Phylogeny; Sequences
Conference_Title :
BioMedical Engineering and Informatics, 2008. BMEI 2008. International Conference on
Conference_Location :
Sanya
Print_ISBN :
978-0-7695-3118-2
DOI :
10.1109/BMEI.2008.203