Title :
Back translated peptide K-mer search and local alignment in large DNA sequence databases using BoND-SD-tree indexing
Author :
A K M Tauhidul Islam;Sakti Pramanik;Xinge Ji;James R. Cole;Qiang Zhu
Author_Institution :
Michigan State University, East Lansing, MI 48824, USA
Abstract :
In the past, genome sequence databases used main-memory indexing, such as suffix trees, for fast sequence searches. With next-generation sequencing technologies, the amount of sequence data being generated is huge, and main-memory indexing is limited by the amount of memory available. K-mer based techniques are increasingly used for genome sequence database applications such as local alignment. K-mers can also provide an excellent basis for efficient disk-based indexing. In this paper, we propose a k-mer based database searching and local alignment tool that uses box queries on BoND-SD-tree indexing. The BoND-tree is quite efficient for indexing and searching in a Non-Ordered Discrete Data Space (NDDS). We have conducted experiments on searching DNA sequence databases using back-translated protein query sequences and have compared the results with existing methods. We have also implemented local alignment of back-translated protein query sequences against large DNA sequence databases using this index-based k-mer search. The performance of this local alignment approach has been compared with that of NCBI's tblastn. The results are promising and support the significance of the proposed approach.
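The idea of a box query built from a back-translated peptide k-mer can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`back_translate_box`, `in_box`, `translate`, `search`) are hypothetical, the standard genetic code is assumed, and a linear scan stands in for the BoND-SD-tree index. The sketch assumes that a box query allows a set of nucleotides at each position, so the box can admit k-mers that do not actually encode the peptide, and a translation check is used to filter those out.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def back_translate_box(peptide):
    """Return one set of allowed bases per nucleotide position (the 'box')."""
    box = []
    for aa in peptide:
        codons = [c for c, a in CODON_TABLE.items() if a == aa]
        if not codons:
            raise ValueError(f"unknown amino acid: {aa!r}")
        # Union of the bases seen at each of the three codon positions.
        for pos in range(3):
            box.append({c[pos] for c in codons})
    return box


def in_box(dna_kmer, box):
    """True if every base of the k-mer lies in the allowed set for its position."""
    return len(dna_kmer) == len(box) and all(b in s for b, s in zip(dna_kmer, box))


def translate(dna):
    """Translate a DNA string whose length is a multiple of 3."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def search(db_seq, peptide):
    """Linear-scan stand-in for an index lookup: offsets of exact matches."""
    box = back_translate_box(peptide)
    k = len(box)
    hits = []
    for i in range(len(db_seq) - k + 1):
        window = db_seq[i:i + k]
        # The box is a superset of the true codon combinations, so verify.
        if in_box(window, box) and translate(window) == peptide:
            hits.append(i)
    return hits
```

For leucine, the box is `{C,T} x {T} x {A,C,G,T}`, which also admits `TTC` (phenylalanine); this is why the sketch verifies each candidate by translation before reporting a hit.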
Keywords :
"DNA","Proteins","Amino acids","Indexing","Genomics"
Conference_Title :
Bioinformatics and Bioengineering (BIBE), 2015 IEEE 15th International Conference on
DOI :
10.1109/BIBE.2015.7367638