Regional Information Center for Science and Technology - An automatic noun compound extraction from Arabic corpus

DocumentCode :

2922125

Title :

An automatic noun compound extraction from Arabic corpus

Author :

Saif, Abdulgabbar Mohammed ; Aziz, Mohd Juzaiddin Ab

Author_Institution :

Dept. of Comput. Sci., Nat. Univ. of Malaysia, Bangi, Malaysia

fYear :

2011

fDate :

28-29 June 2011

Firstpage :

224

Lastpage :

230

Abstract :

Identifying noun compounds as multi-word lexical units is an important task in natural language processing applications that require some degree of semantic interpretation, such as machine translation, information retrieval, and text summarization. In this paper, we use a hybrid method, based on linguistic knowledge and statistical measures, to extract noun compounds from an Arabic corpus. For candidate identification, we use linguistic analysis tools such as lemmatization and part-of-speech (POS) tagging to filter the candidates and determine their variations. Association measures are then computed for each candidate and used to rank the candidates. We evaluate the association measures with the n-best evaluation method and report the precision of each association measure in each n-best list. The experimental results show that the log-likelihood ratio is the best association measure, achieving the highest precision.
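
The ranking and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It assumes candidates are adjacent word pairs that have already passed the linguistic filter, scores them with Dunning's log-likelihood ratio computed from a 2x2 contingency table, and measures precision over the top n of the ranked list. The function names (`log_likelihood_ratio`, `rank_candidates`, `precision_at_n`) and the gold-standard set are hypothetical and are not taken from the paper, which may compute its measures differently.

```python
import math
from collections import Counter


def log_likelihood_ratio(k11, k12, k21, k22):
    """Dunning's G^2 statistic for a 2x2 contingency table of bigram counts.

    k11: count of (w1, w2); k12: (w1, not w2); k21: (not w1, w2); k22: the rest.
    """
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing (0 * ln 0 is taken as 0)
                expected = rows[i] * cols[j] / n
                g2 += o * math.log(o / expected)
    return 2.0 * g2


def rank_candidates(bigrams):
    """Rank candidate (w1, w2) pairs by log-likelihood ratio, highest first.

    `bigrams` is a list of already-filtered candidate pairs (an assumption:
    lemmatization and POS filtering are not reproduced here).
    """
    pair_counts = Counter(bigrams)
    first_counts = Counter(w1 for w1, _ in bigrams)
    second_counts = Counter(w2 for _, w2 in bigrams)
    n = len(bigrams)
    scored = []
    for (w1, w2), k11 in pair_counts.items():
        k12 = first_counts[w1] - k11
        k21 = second_counts[w2] - k11
        k22 = n - k11 - k12 - k21
        scored.append(((w1, w2), log_likelihood_ratio(k11, k12, k21, k22)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


def precision_at_n(ranked, gold, n):
    """Fraction of the top-n ranked candidates that appear in the gold set."""
    top = [pair for pair, _ in ranked[:n]]
    return sum(1 for pair in top if pair in gold) / len(top) if top else 0.0
```

Calling `precision_at_n` for several values of n on the same ranked list gives one precision value per n-best list, which is the kind of comparison the abstract reports for each association measure.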

Keywords :

information filtering; linguistics; natural language processing; statistical analysis; word processing; Arabic corpus; automatic noun compound extraction; hybrid method; linguistic knowledge; log-likelihood ratio; multiword lexical units; n-best evaluation method; natural language processing; noun compound identification; semantic interpretation; statistical measures; Compounds; Magnetic heads; Mutual information; Pragmatics; Semantics; Syntactics; Tagging; Arabic noun compound; Association measures; hybrid method; lemmatization; morphological variations; n-best evaluation method

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Semantic Technology and Information Retrieval (STAIR), 2011 International Conference on

Conference_Location :

Putrajaya

Print_ISBN :

978-1-61284-354-4

Electronic_ISBN :

978-1-61284-353-7

Type :

conf

DOI :

10.1109/STAIR.2011.5995793

Filename :

5995793

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2922125