DocumentCode
2640211
Title
Tree indexing for efficient search of similar documents
Author
Chen, Chung-Min ; Liu, Duen-Ren
Author_Institution
Telcordia Technol., Morristown, NJ, USA
fYear
2000
fDate
2000
Firstpage
210
Lastpage
211
Abstract
Linear algebra-based techniques have long been used to correlate similar documents. They map the documents to a multidimensional vector space, in which each document is represented by a vector. Searching related documents then translates into searching nearest neighbors in the vector space. We propose an indexing structure, called cosine R-tree, which indexes multidimensional vector space and provides efficient nearest neighbor search. Our preliminary results show that it gives better performance than a brute-force linear scan strategy.
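
For illustration only, the brute-force linear scan that the abstract names as the baseline can be sketched as below. This is not the cosine R-tree, whose construction the record does not describe. It assumes documents are already mapped to dense vectors and that similarity is measured by the cosine of the angle between vectors; the names `cosine_similarity`, `linear_scan_nearest`, and the sample vectors are hypothetical and do not come from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: a zero vector is treated as similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def linear_scan_nearest(query, documents, k=1):
    """Return the ids of the k documents most similar to the query.

    Every stored vector is compared with the query, so the cost grows
    linearly with the number of documents. A tree index aims to avoid
    exactly this full pass.
    """
    scored = [
        (cosine_similarity(query, vec), doc_id)
        for doc_id, vec in documents.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical three-document collection in a 3-dimensional term space.
documents = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.0, 1.0, 0.0],
    "c": [1.0, 1.0, 0.0],
}
nearest = linear_scan_nearest([2.0, 0.1, 0.0], documents, k=2)
```

With these sample vectors the query points almost along the first axis, so document "a" ranks first and "c" second.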
Keywords
database theory; indexing; information retrieval; search problems; cosine R-tree; efficient search; linear algebra-based techniques; linear scan; multi-dimensional vector space; multidimensional vector space; nearest neighbor search; nearest neighbors; related documents; similar documents; tree indexing; vector space approach; Euclidean distance; Indexing; Information management; Information retrieval; Multidimensional systems; Nearest neighbor searches; Space technology; Vectors
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Software and Applications Conference, 2000. COMPSAC 2000. The 24th Annual International
Conference_Location
Taipei
ISSN
0730-3157
Print_ISBN
0-7695-0792-1
Type
conf
DOI
10.1109/CMPSAC.2000.884720
Filename
884720