DocumentCode :
1076620
Title :
Dealing With Complex Linguistic Annotations Within a Language Processing Framework
Author :
Artola, Xabier ; De Ilarraza, Arantza Díaz ; Soroa, Aitor ; Sologaistoa, Aitor
Volume :
17
Issue :
5
fYear :
2009
fDate :
7/1/2009
Firstpage :
904
Lastpage :
915
Abstract :
In this paper, we present AWA, a general-purpose Annotation Web Architecture for representing, storing, and accessing the information produced by different linguistic processors. The objective of AWA is to establish a coherent and flexible representation scheme that will be the basis for the exchange and use of linguistic information. In morphologically rich languages such as Basque, it is necessary to represent and provide easy access to complex phenomena such as intraword structure, declension, derivation and composition features, constituent discontinuousness (in multiword expressions), and so on. AWA provides a well-suited schema to deal with these phenomena. The annotation model relies on XML technologies for data representation, storage, and retrieval. Typed feature structures are used as a representation schema for linguistic analyses. A consistent underlying data model, which captures the structure and relations contained in the information to be manipulated, has been identified and implemented. AWA is integrated into LPAF, a multilayered Language Processing and Annotation Framework, whose goal is the management and integration of diverse NLP components and resources. Moreover, we introduce EULIA, an annotation tool which exploits and manipulates the data created by the linguistic processors. Two real corpora have been processed and annotated within this framework.
Keywords :
Data models; Information retrieval; Libraries; Local government; Natural languages; Object oriented modeling; Proposals; Resource management; Service oriented architecture; XML; Language processing; Language resources; Linguistic annotation
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2009.2018565
Filename :
5075767