DocumentCode :
2857507
Title :
An efficient mining algorithm for key segment from DNA sequences
Author :
Guojun Mao
Author_Institution :
Inf. Sch., Central Univ. of Finance & Econ., Beijing, China
fYear :
2015
fDate :
3-6 May 2015
Firstpage :
396
Lastpage :
399
Abstract :
Unlike transaction sequences in business, DNA sequences typically have a small alphabet and a long length, so mining them poses different challenges from other applications. This paper addresses the problem of mining key segments from long DNA sequences. We design a compact data structure, called the Association Matrix, that maintains in memory the statistical information gathered while scanning DNA sequences. Based on this structure, we present an algorithm for mining key segments from a very long DNA sequence. We also evaluate the approach on synthetic and real-life data sets; the experiments confirm its good time and space performance.
Keywords :
DNA; biology computing; data mining; statistical analysis; DNA sequences; association matrix; data structure; key segment; mining algorithm; real life data sets; statistical information; synthetic data sets; Algorithm design and analysis; Bioinformatics; Data mining; Databases; Knowledge discovery
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Electrical and Computer Engineering (CCECE), 2015 IEEE 28th Canadian Conference on
Conference_Location :
Halifax, NS
ISSN :
0840-7789
Print_ISBN :
978-1-4799-5827-6
Type :
conf
DOI :
10.1109/CCECE.2015.7129310
Filename :
7129310