Title :
Quick-motif: An efficient and scalable framework for exact motif discovery
Author :
Yuhong Li ; Leong Hou U ; Man Lung Yiu ; Zhiguo Gong
Author_Institution :
Dept. of Comput. & Inf. Sci., Univ. of Macau, Macau, China
Abstract :
Discovering motifs in sequence databases has received considerable attention from both the database and data mining communities, where a motif is the most correlated pair of subsequences in a sequence object. Motif discovery is expensive for emerging applications that have very long sequences (e.g., a million observations per sequence) or rapidly arriving queries (e.g., one every 10 seconds). Prior work cannot offer fast correlation computation and pruning of subsequence pairs at the same time, as these two techniques require different orderings for examining subsequence pairs. In this work, we propose a novel framework named Quick-Motif, which adopts a two-level approach to enable batch pruning at the outer level and fast correlation calculation at the inner level. We further propose two optimization techniques for the outer and inner levels. In our experimental study, our method is up to 3 orders of magnitude faster than the state-of-the-art methods.
Keywords :
data mining; optimisation; query processing; Quick-Motif framework; batch pruning; correlation computations; data mining communities; database communities; exact motif discovery; inner level; optimization techniques; outer level; sequence databases; sequence object; subsequence pairs; two-level approach; Force; Silicon
Conference_Title :
Data Engineering (ICDE), 2015 IEEE 31st International Conference on
Conference_Location :
Seoul
DOI :
10.1109/ICDE.2015.7113316