DocumentCode :
793295
Title :
Efficient signature file methods for text retrieval
Author :
Lee, Dik Lun ; Kim, Young Man ; Patel, Gaurav
Author_Institution :
Dept. of Comput. & Inf. Sci., Ohio State Univ., Columbus, OH, USA
Volume :
7
Issue :
3
fYear :
1995
fDate :
6/1/1995
Firstpage :
423
Lastpage :
435
Abstract :
Signature files have been studied extensively as an access method for textual databases. Many approaches have been proposed for searching signature files efficiently. However, different methods make different assumptions and use different performance measures, making it difficult to compare their performance. In this paper, we study three basic methods proposed in the literature, namely, the indexed descriptor file, the two-level superimposed coding scheme, and the partitioned signature file approach. The contribution of this paper is twofold. First, we present a uniform analytical performance model so that the methods can be compared fairly and consistently. The analysis shows that the two-level superimposed coding scheme, if stored in a transposed file, has the best performance. Second, we extend the two-level superimposed coding method into a multilevel superimposed coding method. We obtain the optimal number of levels for the multilevel method and show that, for databases of reasonable size, the optimal value is much larger than 2, the value assumed in the two-level method. The accuracy of the analytical formula is demonstrated by simulation.
Keywords :
information retrieval; access method; indexed descriptor file; partitioned signature file approach; performance measures; signature file methods; simulation; text retrieval; textual databases; two-level superimposed coding scheme; Analytical models; Chemicals; Cities and towns; Computer Society; DNA; Hardware; Multimedia databases; Performance analysis; Performance evaluation; Search methods
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/69.390248
Filename :
390248