Title :
Compressing an inverted file with LCS
Author :
Leu, Fang-Yie ; Fan, Yao-Chung
Author_Institution :
Comput. Sci. & Inf. Eng. Dept., Univ. of Tung-Hai, Taichung, Taiwan
Abstract :
Document index construction is one of the most important concerns in designing an information retrieval system. The most common index structure used in document retrieval is the inverted file, which consists of inverted lists, each holding pointers to all the locations of a given term in the collected documents. The size of an inverted file can be reduced with compression techniques. We exploit a randomized minimum spanning tree (MST) algorithm, which uses spanning tree verification and randomized sampling.
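To make the inverted-file structure described in the abstract concrete, here is a minimal Python sketch. It is not the authors' method: the function names (build_inverted_file, to_gaps, vbyte_encode, vbyte_decode) are hypothetical, and the compression shown is generic gap encoding with variable-byte codes, assumed here only to illustrate how an inverted list can shrink.

```python
from collections import defaultdict


def build_inverted_file(docs):
    """Map each term to a list of (doc_id, position) pointers."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for pos, term in enumerate(text.lower().split()):
            index[term].append((doc_id, pos))
    return dict(index)


def to_gaps(sorted_ids):
    """Replace ascending ids with differences between neighbours."""
    gaps, prev = [], 0
    for i in sorted_ids:
        gaps.append(i - prev)
        prev = i
    return gaps


def from_gaps(gaps):
    """Invert to_gaps by taking running sums."""
    ids, total = [], 0
    for g in gaps:
        total += g
        ids.append(total)
    return ids


def vbyte_encode(numbers):
    """Variable-byte code: 7 payload bits per byte, low bits first.

    The high bit is set only on the last byte of each number.
    """
    out = bytearray()
    for n in numbers:
        while n >= 128:
            out.append(n & 0x7F)
            n >>= 7
        out.append(n | 0x80)
    return bytes(out)


def vbyte_decode(data):
    """Decode a byte string produced by vbyte_encode."""
    numbers, n, shift = [], 0, 0
    for b in data:
        if b & 0x80:  # terminator byte: finish the current number
            numbers.append(n | ((b & 0x7F) << shift))
            n, shift = 0, 0
        else:
            n |= b << shift
            shift += 7
    return numbers


if __name__ == "__main__":
    index = build_inverted_file(["a b a", "b c"])
    print(index["b"])                    # pointers for the term "b"
    packed = vbyte_encode(to_gaps([3, 7, 300]))
    print(len(packed), from_gaps(vbyte_decode(packed)))
```

Small gaps fit in a single byte, so lists whose document ids are close together compress well. That is the general reason inverted-file compression schemes pay attention to the differences between neighbouring ids in a list.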
Keywords :
computational complexity; data compression; document handling; indexing; tree data structures; tree searching; document index; inverted file compression; randomized minimum spanning tree algorithm; randomized sampling; spanning tree verification; Clustering algorithms; Computer science; Data mining; Indexing; Information retrieval; Information systems; Partitioning algorithms; Search engines; Software algorithms; Tree graphs;
Conference_Title :
Computer Software and Applications Conference, 2004. COMPSAC 2004. Proceedings of the 28th Annual International
Print_ISBN :
0-7695-2209-2
DOI :
10.1109/CMPSAC.2004.1342676