DocumentCode :
610382
Title :
On shortest unique substring queries
Author :
Jian Pei ; Wu, W.C.-H. ; Mi-Yen Yeh
Author_Institution :
Sch. of Comput. Sci., Simon Fraser Univ., Burnaby, BC, Canada
fYear :
2013
fDate :
8-12 April 2013
Firstpage :
937
Lastpage :
948
Abstract :
In this paper, we tackle a novel type of interesting queries - shortest unique substring queries. Given a (long) string S and a query point q in the string, can we find a shortest substring containing q that is unique in S? We illustrate that shortest unique substring queries have many potential applications, such as information retrieval, bioinformatics, and event context analysis. We develop efficient algorithms for online query answering. First, we present an algorithm to answer a shortest unique substring query in O(n) time using a suffix tree index, where n is the length of string S. Second, we show that, using O(n·h) time and O(n) space, we can compute a shortest unique substring for every position in a given string, where h is variable theoretically in O(n) but on real data sets often much smaller than n and can be treated as a constant. Once the shortest unique substrings are pre-computed, shortest unique substring queries can be answered online in constant time. In addition to the solid algorithmic results, we empirically demonstrate the effectiveness and efficiency of shortest unique substring queries on real data sets.
Keywords :
query processing; bioinformatics; event context analysis; information retrieval; online query answering; shortest unique substring queries; suffix tree index; Algorithm design and analysis; Bioinformatics; Context; Indexes; Organisms; Upper bound;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Data Engineering (ICDE), 2013 IEEE 29th International Conference on
Conference_Location :
Brisbane, QLD
ISSN :
1063-6382
Print_ISBN :
978-1-4673-4909-3
Electronic_ISBN :
1063-6382
Type :
conf
DOI :
10.1109/ICDE.2013.6544887
Filename :
6544887
Link To Document :
بازگشت