DocumentCode :
3146417
Title :
Models for compression in full-text retrieval systems
Author :
Witten, Ian H. ; Bell, Timothy C. ; Nevill, Craig G.
Author_Institution :
Comput. Sci., Calgary Univ., Alta., Canada
fYear :
1991
fDate :
8-11 Apr 1991
Firstpage :
23
Lastpage :
32
Abstract :
This paper explores the application of arithmetic coding to systems involving the storage of a large body of text, along with a lexicon that lists the words and a concordance that indicates the exact locations at which each word can be found. A typical query might seek all sentences that contain a particular word or combination of words. The random-access requirement means that many current compression techniques are not directly applicable, particularly those using adaptive modelling. However, the static nature of the text and the existence of a lexicon give help that is not available in other compression scenarios. A number of different kinds of model developed for different parts of a full-text retrieval system are presented and evaluated.
Keywords :
data compression; encoding; information retrieval systems; arithmetic coding; compression techniques; concordance; full-text retrieval systems; lexicon; model; query; Arithmetic; Data structures; Databases; Delay; Huffman coding; Information retrieval; Mechanical factors; Power system modeling; Probability distribution; Scalability;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Compression Conference, 1991. DCC '91.
Conference_Location :
Snowbird, UT
Print_ISBN :
0-8186-9202-2
Type :
conf
DOI :
10.1109/DCC.1991.213370
Filename :
213370