Title :
Comparison of Autoregressive Measures for DNA Sequence Similarity
Author_Institution :
Drexel Univ., Philadelphia
Abstract :
It has been shown that DNA sequences can be modeled with autoregressive (AR) processes and that the Euclidean distance between model parameters is useful for detecting sequence similarity. However, the measure's robustness to inexact, approximate matches has not been explored. We go one step further and examine not only exact gene searching but also how the AR distance measures are perturbed by errors and mutations. To achieve higher accuracy in similarity searching, we compare the performance of the Euclidean distance measure with that of the Itakura distance measure using different nucleotide mappings. The numerical mappings and distance measures have comparable performance but, in general, the Euclidean distance with the binary SW mapping distinguishes perfect matches best. Finally, we show that AR measures can detect mutation-prone approximate matches when the AR model order is increased.
Keywords :
DNA; autoregressive processes; genetics; molecular biophysics; molecular configurations; DNA sequence similarity; Euclidean distance; Itakura distance; autoregressive measures; gene searching; nucleotide mappings; Autoregressive processes; DNA computing; Electric variables measurement; Euclidean distance; Filters; Genetic mutations; Nuclear measurements; Predictive models; Robustness; Sequences;
Conference_Title :
Genomic Signal Processing and Statistics, 2007. GENSIPS 2007. IEEE International Workshop on
Conference_Location :
Tuusula
Print_ISBN :
978-1-4244-0998-3
Electronic_ISBN :
978-1-4244-0999-0
DOI :
10.1109/GENSIPS.2007.4365814
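Illustrative_Sketch :
The comparison described in the abstract can be sketched roughly as follows. This is not the authors' implementation: the record does not give the mapping values, the AR estimation method, or the model order, so everything below is an assumption made for illustration. The SW mapping is taken to be C, G -> +1 and A, T -> -1, the AR coefficients are estimated with the Yule-Walker equations, and the Itakura distance uses the common log-ratio of prediction error energies. All function names (map_sw, fit_ar, euclidean_distance, itakura_distance) are hypothetical.

```python
import numpy as np


def map_sw(seq):
    """Assumed binary strong/weak mapping: C, G -> +1 and A, T -> -1."""
    table = {"C": 1.0, "G": 1.0, "A": -1.0, "T": -1.0}
    return np.array([table[base] for base in seq.upper()])


def autocorr(x, order):
    """Biased autocorrelation estimates r[0..order] of the mean-removed signal."""
    x = x - x.mean()
    n = len(x)
    return np.array([np.dot(x[: n - k], x[k:]) / n for k in range(order + 1)])


def toeplitz(r):
    """Symmetric Toeplitz matrix whose (i, j) entry is r[|i - j|]."""
    lags = np.arange(len(r))
    return r[np.abs(np.subtract.outer(lags, lags))]


def fit_ar(x, order):
    """Yule-Walker fit. Returns (AR coefficients, autocorrelation r[0..order])."""
    r = autocorr(x, order)
    coeffs = np.linalg.solve(toeplitz(r[:order]), r[1:])
    return coeffs, r


def euclidean_distance(a, b):
    """Euclidean distance between two AR coefficient vectors."""
    return float(np.linalg.norm(a - b))


def itakura_distance(x_coeffs, x_r, y_coeffs):
    """Log-ratio of the error energy of y's model on x to that of x's own model."""
    R = toeplitz(x_r)
    fx = np.concatenate(([1.0], -x_coeffs))  # prediction error filter of x
    fy = np.concatenate(([1.0], -y_coeffs))  # prediction error filter of y
    return float(np.log((fy @ R @ fy) / (fx @ R @ fx)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    query = "".join(rng.choice(list("ACGT"), size=400))
    target = "".join(rng.choice(list("ACGT"), size=400))
    order = 4  # arbitrary choice; the abstract suggests raising it for approximate matches
    qa, qr = fit_ar(map_sw(query), order)
    ta, _ = fit_ar(map_sw(target), order)
    print("Euclidean:", euclidean_distance(qa, ta))
    print("Itakura:  ", itakura_distance(qa, qr, ta))
```

Under these assumptions both distances are zero for an exact match, and the Itakura distance is asymmetric because it weights the coefficient difference by the autocorrelation of the first sequence only.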