Title :
Distinguish Coding And Noncoding Sequences In A Complete Genome Using Fourier Transform
Author :
Zhou, Yu ; Zhou, Li-Qian ; Yu, Zu-Guo ; Anh, V.
Author_Institution :
Xiangtan Univ., Xiangtan
Abstract :
A Fourier transform method is proposed to distinguish coding and non-coding sequences in a complete genome. It is based on a number sequence representation of the DNA sequence proposed in our previous paper (Zhou et al., J. Theor. Biol. 2005) and on the imperfect period-3 property of protein coding sequences. Three parameters, $P_{\bar{x}}(1)$, $P_{\bar{x}}(1/3)$ and $P_{\bar{x}}(1/36)$, taken from the Fourier transform of the number sequence representation of a DNA sequence, are selected to form a three-dimensional parameter space. Each DNA sequence is then represented by a point in this space. The points corresponding to coding and non-coding sequences in the complete genomes of prokaryotes are seen to fall into different regions. If the point ($P_{\bar{x}}(1)$, $P_{\bar{x}}(1/3)$, $P_{\bar{x}}(1/36)$) for a DNA sequence lies in the region corresponding to coding sequences, the sequence is classified as coding; otherwise, it is classified as non-coding. Fisher's discriminant algorithm is used to assess the discriminant accuracy. The average discriminant accuracies $p_c$, $p_{nc}$, $q_c$ and $q_{nc}$ over all 51 prokaryotes obtained by the present method reach 81.02%, 92.27%, 80.77% and 92.24%, respectively.
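The pipeline described in the abstract (number sequence, Fourier power at three frequencies, Fisher's discriminant) can be illustrated with a minimal sketch. Several parts are assumptions, not the authors' definitions: the nucleotide-to-number mapping `ASSUMED_MAP` is hypothetical (the representation from Zhou et al., 2005 is not given in this record), the power normalisation `|X(f)|^2 / N` is one common choice, and frequencies are treated as cycles per base, so 1/3 corresponds to period 3. Under that convention the frequency 1 coincides with the zero-frequency term, so the paper's own definition may well differ. All function names are illustrative.

```python
import numpy as np

# Hypothetical mapping from bases to numbers; NOT the representation
# used in the paper, which is not reproduced in this record.
ASSUMED_MAP = {"A": -2.0, "C": -1.0, "G": 1.0, "T": 2.0}


def to_numbers(seq):
    """Convert a DNA string into a numeric sequence (assumed mapping)."""
    return np.array([ASSUMED_MAP[b] for b in seq.upper()], dtype=float)


def power_at(x, f):
    """Power |X(f)|^2 / N at frequency f in cycles per base (assumed normalisation)."""
    k = np.arange(len(x))
    X = np.sum(x * np.exp(-2j * np.pi * f * k))
    return np.abs(X) ** 2 / len(x)


def features(seq, freqs=(1.0, 1 / 3, 1 / 36)):
    """Map a sequence to a point in the three-dimensional parameter space."""
    x = to_numbers(seq)
    return np.array([power_at(x, f) for f in freqs])


def fisher_fit(X0, X1):
    """Two-class Fisher linear discriminant: direction w and threshold c."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter matrix, summed over both classes.
    Sw = (np.cov(X0, rowvar=False) * (len(X0) - 1)
          + np.cov(X1, rowvar=False) * (len(X1) - 1))
    w = np.linalg.solve(Sw, m1 - m0)
    # Threshold at the midpoint of the projected class means (a simple choice).
    c = 0.5 * (w @ m0 + w @ m1)
    return w, c


def fisher_predict(X, w, c):
    """Return 1 where a point projects onto the class-1 side, else 0."""
    return (X @ w > c).astype(int)
```

In use, one would stack `features(seq)` for known non-coding sequences into `X0` and for known coding sequences into `X1`, call `fisher_fit(X0, X1)`, and classify new points with `fisher_predict`. The midpoint threshold is only one possible decision rule; the abstract does not say which rule the authors used.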
Keywords :
Fourier transforms; biocomputing; DNA sequence; Fisher discriminant algorithm; Fourier transform; genome coding-noncoding sequences; number sequence representation; prokaryotes; protein coding sequences; three-dimensional parameter space; Bioinformatics; Biology computing; DNA; Fourier transforms; Fractals; Genomics; Organisms; Performance analysis; Proteins; Sequences;
Conference_Title :
Natural Computation, 2007. ICNC 2007. Third International Conference on
Conference_Location :
Haikou
Print_ISBN :
978-0-7695-2875-5
DOI :
10.1109/ICNC.2007.333