• DocumentCode
    3190122
  • Title

    Document comparison with a weighted topic hierarchy

  • Author

    Gelbukh, A. ; Sidorov, G. ; Guzmán-Arenas, A.

  • Author_Institution
    Nat. Language Lab., Nat. Polytech. Inst., Mexico City, Mexico
  • fYear
    1999
  • fDate
    1999
  • Firstpage
    566
  • Lastpage
    570
  • Abstract
    A method of document comparison based on a hierarchical dictionary of topics (concepts) is described. The hierarchical links in the dictionary carry weights that are used to detect the main topics of a document and to determine the similarity between two documents. The method allows comparison of documents that share no words literally but do share concepts, including documents written in different languages. It also allows comparison with respect to a specific “aspect”, i.e., a specific topic of interest together with its subtopics. A system classifier that applies the method to document classification and information retrieval is also presented.
  • Keywords
    information retrieval; visual databases; document classification; document comparison; hierarchical dictionary; system classifier; weighted topic hierarchy; Cities and towns; Dictionaries; Europe; Histograms; Laboratories; Natural languages; Read only memory; Statistical analysis; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Database and Expert Systems Applications, 1999. Proceedings. Tenth International Workshop on
  • Conference_Location
    Florence
  • Print_ISBN
    0-7695-0281-4
  • Type

    conf

  • DOI
    10.1109/DEXA.1999.795247
  • Filename
    795247
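
The abstract's idea of comparing documents through weighted hierarchy links rather than shared words can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy hierarchy `LINKS`, its weights, the rule of multiplying weights along the path to the root, the use of cosine similarity, and the function names `topic_weights`, `in_aspect`, and `similarity`. The paper's actual weighting and similarity formulas may differ.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy hierarchy: child -> (parent, link weight).
# Words in two languages hang under the same topic. Not from the paper.
LINKS = {
    "guitar": ("music", 0.9),
    "guitarra": ("music", 0.9),
    "violin": ("music", 0.8),
    "music": ("arts", 0.7),
    "election": ("politics", 0.9),
    "senate": ("politics", 0.8),
}


def topic_weights(words):
    """Propagate word counts upward, multiplying by each link weight (assumed rule)."""
    weights = Counter()
    for word, count in Counter(words).items():
        node, score = word, float(count)
        while node in LINKS:
            parent, link = LINKS[node]
            score *= link
            weights[parent] += score
            node = parent
    return weights


def in_aspect(topic, aspect):
    """True if `topic` is the aspect itself or lies below it in the hierarchy."""
    node = topic
    while node != aspect and node in LINKS:
        node = LINKS[node][0]
    return node == aspect


def similarity(doc_a, doc_b, aspect=None):
    """Cosine similarity of topic-weight vectors, optionally restricted to an aspect."""
    a, b = topic_weights(doc_a), topic_weights(doc_b)
    if aspect is not None:
        a = Counter({t: w for t, w in a.items() if in_aspect(t, aspect)})
        b = Counter({t: w for t, w in b.items() if in_aspect(t, aspect)})
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(w * w for w in a.values())) * sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0
```

In this sketch, `similarity(["guitar"], ["guitarra"])` compares two documents with no word in common that still map to the same topics, while passing `aspect="music"` restricts the comparison to that topic and its subtopics.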