DocumentCode
3495459
Title
Storage and Matching Technique in Genomic Sequences for Approximate Motif Searching
Author
Srinivasa, K.G. ; Shashidhara, H.S.
Author_Institution
Data Min. Lab., M S Ramaiah Inst. of Technol., Bangalore
Volume
1
fYear
2008
fDate
11-13 Nov. 2008
Firstpage
968
Lastpage
973
Abstract
Sequence retrieval serves as a preprocessing step for a number of other processes, including motif discovery, in which retrieved sequences are scored against a consensus before being recognized as a motif. Retrieval depends on how the sequences are stored beforehand. Using two bits to represent each genomic character is optimal in terms of storage, but it provides no information about the length of runs of repeated characters or other details of positional significance. This paper presents an alternative storage technique for sequences and its corresponding retrieval technique. We present the technique using integers for clarity. Using the bit equivalents of these integers in the actual representation could reduce storage complexity significantly. We first outline the requirements of a storage technique from a motif discovery perspective and then present our proposal.
Keywords
bioinformatics; genetics; information retrieval; molecular biophysics; molecular configurations; approximate motif searching; genomic sequences; matching; sequence retrieval; storage; Arithmetic; Bioinformatics; Data mining; Genomics; Information retrieval; Information technology; Laboratories; Proposals; Redundancy; Sequences; DNA; biological data mining; sequence alignment; storage techniques
fLanguage
English
Publisher
IEEE
Conference_Titel
Convergence and Hybrid Information Technology, 2008. ICCIT '08. Third International Conference on
Conference_Location
Busan
Print_ISBN
978-0-7695-3407-7
Type
conf
DOI
10.1109/ICCIT.2008.201
Filename
4682157