Title :
Efficient instant-fuzzy search with proximity ranking
Author :
Cetindil, Inci ; Esmaelnezhad, Jamshid ; Taewoo Kim ; Chen Li
Author_Institution :
Dept. of Comput. Sci., Univ. of California, Irvine, Irvine, CA, USA
fDate :
March 31 2014-April 4 2014
Abstract :
Instant search is an emerging information-retrieval paradigm in which a system finds answers to a query instantly while a user types in keywords character-by-character. Fuzzy search further improves user search experiences by finding relevant answers with keywords similar to query keywords. A main computational challenge in this paradigm is the high-speed requirement, i.e., each query needs to be answered within milliseconds to achieve an instant response and a high query throughput. At the same time, we also need good ranking functions that consider the proximity of keywords to compute relevance scores. In this paper, we study how to integrate proximity information into ranking in instant-fuzzy search while achieving efficient time and space complexities. We adapt existing solutions on proximity ranking to instant-fuzzy search. A naïve solution is computing all answers then ranking them, but it cannot meet this high-speed requirement on large data sets when there are too many answers, so there are studies of early-termination techniques to efficiently compute relevant answers. To overcome the space and time limitations of these solutions, we propose an approach that focuses on common phrases in the data and queries, assuming records with these phrases are ranked higher. We study how to index these phrases and develop an incremental-computation algorithm for efficiently segmenting a query into phrases and computing relevant answers. We conducted a thorough experimental study on real data sets to show the tradeoffs between time, space, and quality of these solutions.
Keywords :
computational complexity; fuzzy set theory; query formulation; query processing; early-termination techniques; incremental-computation algorithm; information-retrieval paradigm; instant-fuzzy search; proximity ranking; query answering; query segmentation; space complexity; time complexity; Dictionaries; Heart; Indexing; Servers; Surgery; Tumors;
Conference_Titel :
Data Engineering (ICDE), 2014 IEEE 30th International Conference on
Conference_Location :
Chicago, IL
DOI :
10.1109/ICDE.2014.6816662