Regional Information Center for Science and Technology - Finding the most similar documents across multiple text databases

DocumentCode :

2958867

Title :

Finding the most similar documents across multiple text databases

Author :

Yu, Clement ; Liu, King-Lup ; Wu, Wensheng ; Meng, Weiyi ; Rishe, Naphtali

Author_Institution :

Dept. of Electr. Eng. & Comput. Sci., Illinois Univ., Chicago, IL, USA

fYear :

1999

fDate :

1999

Firstpage :

150

Lastpage :

162

Abstract :

We present a methodology for finding the n most similar documents across multiple text databases for any given query and for any positive integer n. This methodology consists of two steps. First, databases are ranked in a certain order. Next, documents are retrieved from the databases according to the order and in a particular way. If the databases containing the n most similar documents for a given query can be ranked ahead of other databases, the methodology will guarantee the retrieval of the n most similar documents for the query. A statistical method is provided to identify databases, each of which is estimated to contain at least one of the n most similar documents. Then, a number of strategies are presented to retrieve documents from the identified databases. Experimental results are given to illustrate the relative performance of different strategies.
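
The two-step idea in the abstract (rank the databases, then retrieve from them in that order) can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the ranking function or the retrieval strategies. The sketch assumes that each database is scored by the similarity of its best document (computed exactly here, whereas the paper estimates it statistically), and that retrieval stops once the n-th result is at least as similar as the best document in the next database. The function name `top_n_across_databases` and the data layout are hypothetical.

```python
from typing import Dict, List, Tuple

Doc = Tuple[str, float]  # (document id, similarity to the query)


def top_n_across_databases(databases: Dict[str, List[Doc]], n: int) -> List[Doc]:
    """Return the n most similar documents, visiting databases in ranked order."""
    # Step 1: rank databases. Assumption: the score of a database is the
    # similarity of its single most similar document (exact here, not estimated).
    ranked = sorted(
        databases.items(),
        key=lambda item: max((sim for _, sim in item[1]), default=0.0),
        reverse=True,
    )

    results: List[Doc] = []
    for position, (_name, docs) in enumerate(ranked):
        # Step 2: merge this database's documents and keep the best n so far.
        results.extend(docs)
        results.sort(key=lambda doc: doc[1], reverse=True)
        results = results[:n]

        # Assumed stopping rule: if we already hold n documents and the weakest
        # of them is at least as similar as the best document in the next
        # database, no later database can improve the result.
        if len(results) == n and position + 1 < len(ranked):
            next_best = max((sim for _, sim in ranked[position + 1][1]), default=0.0)
            if results[-1][1] >= next_best:
                break
    return results


example = {
    "a": [("a1", 0.9), ("a2", 0.2)],
    "b": [("b1", 0.8), ("b2", 0.7)],
    "c": [("c1", 0.1)],
}
top_two = top_n_across_databases(example, 2)
```

In this example the databases are visited in the order a, b, c, and the loop stops after b because the second-best document found so far already outranks everything in c. With exact scores, the stopping rule never discards a true top-n document; with estimated scores, as in the paper, that guarantee depends on the databases being ranked correctly.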

Keywords :

database management systems; information retrieval; search engines; text analysis; database ranking; document retrieval; most similar documents; multiple text databases; relative performance; statistical method; Australia; Computer networks; Database systems; ISDN; Indexing; Information retrieval; Information systems; Internet; Machine learning; Transaction databases;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Research and Technology Advances in Digital Libraries, 1999. Proceedings. IEEE Forum on

Conference_Location :

Baltimore, MD

ISSN :

1092-9959

Print_ISBN :

0-7695-0219-9

Type :

conf

DOI :

10.1109/ADL.1999.777710

Filename :

777710

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2958867