DocumentCode :
1629869
Title :
A Sense Based Similarity Measure for Cross-Lingual Documents
Author :
Huang, Hsun-Hui ; Yang, Horng-Chang ; Kuo, Yau-Hwang
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan
Volume :
1
fYear :
2008
Firstpage :
9
Lastpage :
13
Abstract :
As cross-lingual information retrieval attracts increasing attention, tools that measure cross-lingual document similarity are becoming desirable. Because people convey thoughts at the abstract concept level in much the same way regardless of the language they use, semantic similarity between documents in different languages can be measured from the concepts the documents convey. In this paper, we represent documents by senses to lower the language barrier and adopt fuzzy set functions to cope with the inherent fuzziness among senses. We propose two document similarity measures: one based on Tversky's notion of similarity and the other on the widely used information retrieval criterion. Their performance is compared experimentally. We focus only on documents in English and Chinese, but the proposed approach can be easily extended to documents in other languages.
Keywords :
fuzzy set theory; information retrieval; natural languages; text analysis; abstract concept level; cross-lingual document similarity; cross-lingual information retrieval; document representation; fuzzy set function; semantic similarity; sense based similarity measure; Application software; Computer science; Design engineering; Fuzzy sets; Information retrieval; Intelligent systems; Internet; Natural language processing; Natural languages; Web pages; cross-lingual; semantic similarity; sense;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Intelligent Systems Design and Applications, 2008. ISDA '08. Eighth International Conference on
Conference_Location :
Kaohsiung
Print_ISBN :
978-0-7695-3382-7
Type :
conf
DOI :
10.1109/ISDA.2008.284
Filename :
4696168