DocumentCode :
2831396
Title :
Efficient maintenance scheme of inverted index for large-scale full-text retrieval
Author :
Liu, Xiaozhu
Author_Institution :
State Key Lab. of Software Eng., Wuhan Univ., Wuhan, China
Volume :
1
fYear :
2010
fDate :
21-24 May 2010
Abstract :
The inverted index is the mainstay of modern full-text retrieval systems, and an appropriate maintenance scheme for inverted files is a promising way to improve time and space efficiency when managing and retrieving huge amounts of information. To improve the retrieval performance of inverted indexes in large-scale full-text systems, this paper proposes a time- and space-efficient random access blocked inverted index (RABI) and an efficient dynamic maintenance scheme (DMS). RABI divides each inverted list into blocks and compresses the different parts of each block with a corresponding compression method to reduce space consumption. Building on RABI, DMS distinguishes between long and short posting lists: short posting lists are updated with a remerge strategy, and long posting lists are updated with a hybrid in-place and remerge strategy. Experimental results show that, compared with existing schemes, the proposed scheme substantially reduces, on average, space cost, conjunctive Boolean query time, and the cost of online index construction.
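The general ideas named in the abstract (a blocked, compressed posting list with random access, and an update policy that treats short and long lists differently) can be illustrated with a minimal sketch. This is not the paper's RABI or DMS: the abstract does not give the block layout, the compression methods, the long/short cutoff, or the details of the hybrid strategy. The sketch assumes variable-byte coding of doc-ID gaps, a block header holding the last doc ID, and a simplified tail-block append standing in for the in-place part of the update. All names and constants (`BLOCK_SIZE`, `LONG_LIST_THRESHOLD`, `build_blocks`, `update`, and so on) are hypothetical.

```python
from bisect import bisect_left

BLOCK_SIZE = 4           # postings per block (hypothetical, tiny for illustration)
LONG_LIST_THRESHOLD = 8  # hypothetical cutoff between short and long lists


def vbyte_encode(numbers):
    """Variable-byte encode non-negative integers, 7 data bits per byte."""
    out = bytearray()
    for n in numbers:
        while n >= 128:
            out.append(n & 0x7F)
            n >>= 7
        out.append(n | 0x80)  # high bit marks the final byte of a number
    return bytes(out)


def vbyte_decode(data):
    """Inverse of vbyte_encode."""
    numbers, n, shift = [], 0, 0
    for byte in data:
        if byte & 0x80:
            numbers.append(n | ((byte & 0x7F) << shift))
            n, shift = 0, 0
        else:
            n |= byte << shift
            shift += 7
    return numbers


def build_blocks(doc_ids):
    """Split sorted doc IDs into blocks of (last_doc_id, compressed gaps)."""
    blocks = []
    for i in range(0, len(doc_ids), BLOCK_SIZE):
        chunk = doc_ids[i:i + BLOCK_SIZE]
        # The first entry is absolute, so each block decodes independently.
        gaps = [chunk[0]] + [b - a for a, b in zip(chunk, chunk[1:])]
        blocks.append((chunk[-1], vbyte_encode(gaps)))
    return blocks


def decode_block(block):
    """Recover the doc IDs of one block from its gap encoding."""
    doc_ids, total = [], 0
    for gap in vbyte_decode(block[1]):
        total += gap
        doc_ids.append(total)
    return doc_ids


def contains(blocks, doc_id):
    """Random access: binary-search the headers, then decode one block only."""
    lasts = [last for last, _ in blocks]
    i = bisect_left(lasts, doc_id)
    return i < len(blocks) and doc_id in decode_block(blocks[i])


def intersect(blocks_a, blocks_b):
    """Conjunctive Boolean query: doc IDs present in both lists."""
    return [d for b in blocks_a for d in decode_block(b) if contains(blocks_b, d)]


def update(blocks, new_doc_ids):
    """Append new, larger doc IDs; pick a strategy by current list length."""
    old_count = sum(len(decode_block(b)) for b in blocks)
    if old_count < LONG_LIST_THRESHOLD:
        # Short list: remerge, i.e. rebuild the whole list from scratch.
        merged = [d for b in blocks for d in decode_block(b)] + new_doc_ids
        return build_blocks(merged), "remerge"
    # Long list (simplified): keep full blocks untouched, rebuild only the tail.
    if len(decode_block(blocks[-1])) < BLOCK_SIZE:
        head, tail = blocks[:-1], decode_block(blocks[-1])
    else:
        head, tail = blocks, []
    return head + build_blocks(tail + new_doc_ids), "in-place"
```

Storing the last doc ID in each block header is what allows a lookup to skip straight to a single block instead of decompressing the whole list, which is the kind of saving a conjunctive Boolean query benefits from.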
Keywords :
data compression; information retrieval systems; compression method; dynamic maintenance scheme; in-place strategy; inverted index maintenance scheme; large-scale full-text retrieval systems; random access blocked inverted index; remerge strategy; short posting lists; Appropriate technology; Automation; Costs; Information management; Information retrieval; Large-scale systems; Query processing; Software engineering; Space technology; Technology management; index maintenance; information retrieval; inverted index;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Future Computer and Communication (ICFCC), 2010 2nd International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-5821-9
Type :
conf
DOI :
10.1109/ICFCC.2010.5497725
Filename :
5497725