DocumentCode :
556733
Title :
Parallel treebank from word-aligned bilingual corpus. Language engineering for phrasal alignments
Author :
Colhon, Mihaela
Author_Institution :
Dept. of Comput. Sci., Univ. of Craiova, Craiova, Romania
fYear :
2011
fDate :
14-16 Oct. 2011
Firstpage :
1
Lastpage :
6
Abstract :
In this paper we describe a mechanism for generating a parallel treebank between an intensively studied language (i.e., English) and a less studied language such as Romanian. The Romanian constituents of the treebank are induced from the corresponding constituents of the English part, taking into account the word alignments of the corpus. The proposed mechanism reuses and adjusts existing tools and algorithms for automatic part-of-speech annotation and syntactic tree alignment.
Keywords :
natural language processing; Romanian; language engineering; parallel treebank; part-of-speech annotation; phrasal alignments; syntactic trees alignment; word-aligned bilingual corpus; Europe; Natural language processing; Pragmatics; Proposals; Syntactics; Tagging; Training;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
System Theory, Control, and Computing (ICSTCC), 2011 15th International Conference on
Conference_Location :
Sinaia
Print_ISBN :
978-1-4577-1173-2
Type :
conf
Filename :
6085680