DocumentCode :
3302919
Title :
Identification of Multiword Expressions in Technical Domains: Investigating Statistical and Alignment-Based Approaches
Author :
Villavicencio, Aline ; de Medeiros Caseli, Helena ; Machado, Andre
fYear :
2009
fDate :
8-11 Sept. 2009
Firstpage :
27
Lastpage :
35
Abstract :
Multiword Expressions (MWEs) are one of the stumbling blocks for more precise Natural Language Processing (NLP) systems. The lack of coverage of MWEs in resources can negatively impact the performance of tasks and applications, and can lead to loss of information or communication errors, especially in technical domains where MWEs are frequent. This paper investigates some approaches to the identification of MWEs in technical corpora based on association measures, part-of-speech information, and lexical alignment information. We examine the influence of some factors on their performance, such as the sources of information used for identification and evaluation. While the association measures emphasize recall, the alignment method focuses on precision.
Keywords :
Application software; Computer science; Global warming; Humans; Informatics; Information resources; Natural language processing; Natural languages; Performance loss; Vocabulary; Lexical Acquisition; Multiword Expressions; Natural Language Processing
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Information and Human Language Technology (STIL), 2009 Seventh Brazilian Symposium in
Conference_Location :
Sao Carlos, TBD, Brazil
Print_ISBN :
978-1-4244-6008-3
Type :
conf
DOI :
10.1109/STIL.2009.33
Filename :
5532435