Valuing Semantic Relatedness

Author

Boubacar, Abdoulahi

Author_Institution

Sch. of Comput. Sci., Beijing Inst. of Technol., Beijing, China

fYear

2014

Firstpage

1

Lastpage

5

Abstract

Semantic relatedness is widely used in domains such as DNA sequence analysis, knowledge representation, natural language processing, data mining, information retrieval, and information flow. Computing semantic similarity between two entities is a non-trivial task, and there are many ways to define it. Some proposed measures combine statistical information with lexical similarity. A measure that performs well in one domain is difficult to apply accurately in another, and a similarity measure may perform better with one language than with another. A word should be considered similar not only to itself but also to some of its synonyms in a given context and to some words that share its root. Our approach is designed to perform query matching and to compute semantic relatedness using word occurrences. It performs better than classical measures such as TF-IDF and cosine similarity. Although it is not a metric, the proposed similarity measure can be used for a wide range of content analysis tasks based on semantic distance, and its efficacy has been demonstrated. The measure is not corpus dependent, so it can directly establish the semantic relatedness of two entities.
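
For orientation, the classical baseline named in the abstract (TF-IDF weighting compared by cosine similarity) can be sketched as follows. This is not the measure proposed in the paper, which this record does not describe. The function names, the whitespace tokenizer, and the smoothed IDF formula are illustrative assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def tf_idf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    # Assumption: a common smoothed IDF variant, log((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Term frequency is normalized by document length.
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    vecs = tf_idf_vectors(["the car is fast", "the car is fast", "automobile quick"])
    print(cosine(vecs[0], vecs[1]))  # identical token sequences
    print(cosine(vecs[0], vecs[2]))  # no shared tokens
```

Because this baseline relies only on exact token overlap, two texts that share no tokens score zero even when they use synonyms ("car" versus "automobile"), which is the limitation the abstract raises when it says a word should also be similar to its synonyms and to words with a common root.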

Keywords

pattern matching; query processing; statistical analysis; text analysis; content analysis tasks; lexical similarity; query matching; semantic distance; semantic relatedness; semantic similarity; similarity measure; statistical information; synonyms; word occurrences; Atmospheric measurements; Biomedical measurement; Frequency measurement; Information retrieval; Particle measurements; Semantics; Vectors

fLanguage

English

Publisher

IEEE

Conference_Title

2014 IEEE 7th Joint International Information Technology and Artificial Intelligence Conference (ITAIC)

Print_ISBN

978-1-4799-4420-0

Type

conf

DOI

10.1109/ITAIC.2014.7064994

Filename

7064994