Title :
Multilingual Extension of Temporal Expression Recognition Using Parallel Corpora
Author :
Puchol-Blasco, M. ; Saquete, E. ; Martínez-Barco, P.
Author_Institution :
Univ. de Alicante, Alicante
Abstract :
This paper presents the automatic extension to other languages of TERSEO, a knowledge-based system for the recognition and normalization of temporal expressions that was originally developed for Spanish. In previous work (see Saquete et al.), TERSEO was extended to English and Italian through the automatic translation of temporal expressions; a new methodology has now been designed with the aim of obtaining better results. This new methodology uses parallel corpora to extend the TERSEO temporal model to other languages. Two different methods have been tested: (1) automatic translation of TERSEO patterns to other languages and (2) automatic corpus annotation on the target side of parallel corpora. The main idea is to annotate the Spanish side of a parallel corpus, project the analysis onto the second language, and then obtain new TERSEO patterns (1) and a new annotated corpus (2). The set of new patterns will be used to improve the current TERSEO language-independent modules, whereas the new annotated corpus will be used to train a machine learning (ML) system that will annotate new temporal expressions in the new language.
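The projection step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that word alignments between the Spanish and target sentences are already available as index pairs; the function name `project_span`, the toy sentences, and the alignment are hypothetical and are not taken from TERSEO or from the paper.

```python
from typing import List, Optional, Tuple

Span = Tuple[int, int]          # (start, end) token indices, end exclusive
Alignment = List[Tuple[int, int]]  # (source_index, target_index) pairs


def project_span(span: Span, alignment: Alignment) -> Optional[Span]:
    """Project a source-side span onto the target side.

    Hypothetical helper: returns the smallest contiguous target span that
    covers every target token aligned to a token inside the source span,
    or None when no token of the span is aligned.
    """
    start, end = span
    targets = [t for s, t in alignment if start <= s < end]
    if not targets:
        return None
    return (min(targets), max(targets) + 1)


# Toy parallel sentence pair (illustrative only).
spanish = ["Llegó", "el", "3", "de", "mayo", "de", "2007"]
english = ["He", "arrived", "on", "May", "3", ",", "2007"]

# Assumed word alignment, e.g. as produced by an external aligner.
alignment = [(0, 1), (2, 4), (4, 3), (6, 6)]

# Temporal expression annotated on the Spanish side: "3 de mayo de 2007".
source_span = (2, 7)
target_span = project_span(source_span, alignment)

if target_span is not None:
    print(" ".join(english[target_span[0]:target_span[1]]))
```

Taking the smallest covering span is one simple design choice; it tolerates unaligned function words and reordering inside the expression, but it can over-extend the span when an alignment link is wrong.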
Keywords :
knowledge based systems; natural language processing; TERSEO; automatic corpora annotation; automatic extension; automatic translation; knowledge-based system; multilingual extension; parallel corpora; temporal expression recognition; Automatic testing; Design methodology; Humans; Information retrieval; Knowledge based systems; Machine learning; Natural language processing; Natural languages; Pattern analysis; Proposals;
Conference_Title :
Temporal Representation and Reasoning, 14th International Symposium on
Conference_Location :
Alicante
Print_ISBN :
978-0-7695-2836-6
DOI :
10.1109/TIME.2007.54