DocumentCode :
3714468
Title :
Large scale multi-species palindrome study using distributed in-memory computing
Author :
Devin Petersohn;Matthew Spencer;Alex Fratila;Chi-Ren Shyu
Author_Institution :
Department of Computer Science, University of Missouri, Columbia, 65211, USA
Year :
2015
Firstpage :
685
Lastpage :
690
Abstract :
Palindromic DNA has many interesting and functional properties, including the ability to form non-canonical DNA structures such as hairpins, cruciforms, and slipped-strand structures. Palindromes also serve important roles in binding sites and enzyme activity, and have a strong effect on mutation rates. Palindromes are abundant in most genomes and often occur within coding sequences, though in many instances it is still not clear how their presence affects genomic functions. Identifying and studying palindromic DNA is essential to advancing our understanding of the genome. To address this need, we present a novel method using an in-memory computing environment for identifying, extracting, and indexing palindromes in a searchable database for all mammals in Ensembl release 80. We discuss the preliminary results of a multi-species study on palindromic DNA, focusing on the size, frequency, and distribution of palindromes. Using a Big Data ecosystem enables us to generate the largest palindrome database to date, comprising 42 genomes. Our study offers new insight into the dynamics of palindromes and facilitates future investigation.
Keywords :
"Genomics","DNA","Indexing","Bioinformatics"
Publisher :
IEEE
Conference_Title :
2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Type :
Conference paper
DOI :
10.1109/BIBM.2015.7359769
Filename :
7359769