DocumentCode
3190122
Title
Document comparison with a weighted topic hierarchy
Author
Gelbukh, A. ; Sidorov, G. ; Guzmán-Arenas, A.
Author_Institution
Nat. Language Lab., Nat. Polytech. Inst., Mexico City, Mexico
fYear
1999
fDate
1999
Firstpage
566
Lastpage
570
Abstract
A method of document comparison based on a hierarchical dictionary of topics (concepts) is described. The hierarchical links in the dictionary carry weights that are used to detect the main topics of a document and to determine the similarity between two documents. The method allows comparison of documents that share no words literally but do share concepts, including documents in different languages. It also allows comparison with respect to a specific “aspect”, i.e., a specific topic of interest (with its respective subtopics). A system classifier that applies the method to document classification and information retrieval is also discussed.
Keywords
information retrieval; visual databases; document classification; document comparison; hierarchical dictionary; system classifier; weighted topic hierarchy; Cities and towns; Dictionaries; Europe; Histograms; Laboratories; Natural languages; Read only memory; Statistical analysis; Vocabulary
fLanguage
English
Publisher
IEEE
Conference_Titel
Database and Expert Systems Applications, 1999. Proceedings. Tenth International Workshop on
Conference_Location
Florence
Print_ISBN
0-7695-0281-4
Type
conf
DOI
10.1109/DEXA.1999.795247
Filename
795247