• DocumentCode
    2640211
  • Title
    Tree indexing for efficient search of similar documents
  • Author
    Chen, Chung-Min; Liu, Duen-Ren
  • Author_Institution
    Telcordia Technol., Morristown, NJ, USA
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    210
  • Lastpage
    211
  • Abstract
    Linear algebra-based techniques have long been used to correlate similar documents. They map the documents to a multidimensional vector space, in which each document is represented by a vector. Searching related documents then translates into searching nearest neighbors in the vector space. We propose an indexing structure, called cosine R-tree, which indexes multidimensional vector space and provides efficient nearest neighbor search. Our preliminary results show that it gives better performance than a brute-force linear scan strategy.
  • Keywords
    database theory; indexing; information retrieval; search problems; cosine R-tree; efficient search; linear algebra-based techniques; linear scan; multi-dimensional vector space; multidimensional vector space; nearest neighbor search; nearest neighbors; related documents; similar documents; tree indexing; vector space approach; Euclidean distance; Information management; Multidimensional systems; Nearest neighbor searches; Space technology; Vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Software and Applications Conference, 2000. COMPSAC 2000. The 24th Annual International
  • Conference_Location
    Taipei
  • ISSN
    0730-3157
  • Print_ISBN
    0-7695-0792-1
  • Type
    conf
  • DOI
    10.1109/CMPSAC.2000.884720
  • Filename
    884720
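
The abstract compares the proposed index against a brute-force linear scan over document vectors. As a rough illustration of that baseline only (not the cosine R-tree itself, whose construction is not described in this record), the sketch below ranks documents by cosine similarity to a query vector. The function names, the dictionary layout, and the sample vectors are hypothetical and chosen purely for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def linear_scan_nearest(query, documents, k=1):
    """Brute-force baseline: score every document, then keep the k most similar.

    `documents` maps a document id to its vector (an assumed layout).
    Cost is proportional to the number of documents times the vector dimension.
    """
    scored = [(cosine_similarity(query, vec), doc_id)
              for doc_id, vec in documents.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical toy collection: three documents in a 3-dimensional term space.
docs = {
    "doc_a": [1.0, 0.0, 2.0],
    "doc_b": [0.0, 3.0, 0.0],
    "doc_c": [2.0, 0.0, 4.1],
}

nearest = linear_scan_nearest([1.0, 0.0, 2.0], docs, k=2)
```

Every query touches every vector in this baseline, which is the cost a tree index of the kind the abstract proposes aims to avoid.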