DocumentCode
2745832
Title
Automatic cognate identification based on a fuzzy combination of string similarity measures
Author
Montalvo, Soto ; Pardo, Eduardo G. ; Martínez, Raquel ; Fresno, Víctor
Author_Institution
Univ. Rey Juan Carlos, Madrid, Spain
fYear
2012
fDate
10-15 June 2012
Firstpage
1
Lastpage
8
Abstract
Cognates are words in different languages that have similar spelling and meaning. Identifying cognates is useful for many Natural Language Processing tasks and for second-language learning. This paper presents a new approach that classifies pairs of words as cognates/false friends or as unrelated. The approach uses a fuzzy system to combine complementary string similarity measures in order to improve cognate identification. The underlying hypothesis is that combining different string measures by applying heuristic knowledge can outperform the same measures working separately. The results obtained by the proposed system confirm this hypothesis; the system also outperforms other systems that combine string measures using a supervised approach. As an additional contribution, we have created a bilingual test data set that includes pairs of cognates, false friends, and unrelated words in Spanish and English, and that is freely available for research purposes.
Keywords
fuzzy systems; natural language processing; pattern classification; string matching; word processing; English words; Spanish words; automatic cognate identification; bilingual test data set; false friends; fuzzy combination; fuzzy system; heuristic knowledge; second language learning; string similarity measures; supervised approach; Context; Fuzzy sets; Fuzzy systems; Length measurement; Pragmatics; Training
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems (FUZZ-IEEE), 2012 IEEE International Conference on
Conference_Location
Brisbane, QLD
ISSN
1098-7584
Print_ISBN
978-1-4673-1507-4
Electronic_ISBN
1098-7584
Type
conf
DOI
10.1109/FUZZ-IEEE.2012.6250802
Filename
6250802