DocumentCode :
3495459
Title :
Storage and Matching Technique in Genomic Sequences for Approximate Motif Searching
Author :
Srinivasa, K.G. ; Shashidhara, H.S.
Author_Institution :
Data Min. Lab., M S Ramaiah Inst. of Technol., Bangalore
Volume :
1
fYear :
2008
fDate :
11-13 Nov. 2008
Firstpage :
968
Lastpage :
973
Abstract :
Sequence retrieval serves as a preprocessing step for a number of other processes, including motif discovery, in which retrieved sequences are scored against a consensus before being recognized as a motif. Its effectiveness depends on the way sequences are stored prior to retrieval. Using two bits to represent each genomic character is optimal in terms of storage, but it provides no information about the length of runs of repeated characters or other details of positional significance. This paper presents an alternative storage technique for sequences and a corresponding retrieval technique. We describe the technique using integers for clarity; using the bit equivalents of these integers in the actual representation could reduce storage complexity significantly. We first set out the requirements of a storage technique from a motif discovery perspective before presenting our proposal.
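The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' technique, which the abstract does not specify. The bit assignment in CODES and the helper names pack_two_bit and run_lengths are assumptions made for illustration only. The sketch packs a sequence at two bits per base, then shows a run-length view that exposes the repeat lengths the packed form leaves implicit.

```python
from itertools import groupby

# Assumed 2-bit codes; this mapping is illustrative, not taken from the paper.
CODES = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def pack_two_bit(seq):
    """Pack a DNA string into a single integer, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODES[base]
    return value


def run_lengths(seq):
    """Return (base, run length) pairs, making repeated characters explicit."""
    return [(base, len(list(group))) for base, group in groupby(seq)]


packed = pack_two_bit("AAAACGTT")  # compact, but repeat lengths are implicit
runs = run_lengths("AAAACGTT")     # repeat lengths are stored directly
```

For "AAAACGTT" the packed form occupies 16 bits, while the run-length view records the run of four A's as a single pair, which is the kind of positional detail the abstract says plain two-bit storage does not expose.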
Keywords :
bioinformatics; genetics; information retrieval; molecular biophysics; molecular configurations; approximate motif searching; genomic sequences; matching; sequence retrieval; storage; Arithmetic; Bioinformatics; Data mining; Genomics; Information retrieval; Information technology; Laboratories; Proposals; Redundancy; Sequences; Bioinformatics; DNA; biological data mining; sequence alignment; storage techniques;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Convergence and Hybrid Information Technology, 2008. ICCIT '08. Third International Conference on
Conference_Location :
Busan
Print_ISBN :
978-0-7695-3407-7
Type :
conf
DOI :
10.1109/ICCIT.2008.201
Filename :
4682157