Title :
Chinese-English quasi-equivalent noun phrase definition and automatic identification
Author :
Ma, Yanjun ; Liu, Ying
Author_Institution :
Dept. of Chinese Language & Literature, Tsinghua Univ., Beijing, China
Date :
Oct. 30 2005-Nov. 1 2005
Abstract :
After an examination of a Chinese-English bilingual corpus of 2239 sentence pairs, a new definition of the Chinese noun phrase (NP), the quasi-equivalent noun phrase (equNP), is proposed with the goal of translating Chinese NPs into English NPs. First, all equNPs in the corpus are tagged manually according to the definition in this paper, and a set of part-of-speech (POS) templates for equNPs is automatically acquired. Second, all possible equNPs in a sentence are identified using the templates; these serve as the candidates for equNP identification. Finally, a classification process and a chunking process are carried out. In the classification process, the correct equNPs are chosen from the candidate set using a maximum entropy classifier that combines POS, syntactic, and semantic information. In the chunking process, the final equNPs in the sentence are chosen. On the open test set, precision is 83.75% and recall is 86.50%.
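The template acquisition and candidate identification steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the half-open span representation, the toy sentences, and the POS tags ("a", "n", "v", "d") are all assumptions made for illustration, and the maximum entropy classification and chunking stages are not shown.

```python
from typing import List, Set, Tuple

Token = Tuple[str, str]      # (word, POS tag); the tag set here is hypothetical
Span = Tuple[int, int]       # half-open [start, end) token indices
Template = Tuple[str, ...]   # a POS sequence


def acquire_templates(tagged: List[Tuple[List[Token], List[Span]]]) -> Set[Template]:
    """Collect the POS sequence of every manually tagged equNP span."""
    templates: Set[Template] = set()
    for tokens, spans in tagged:
        for start, end in spans:
            templates.add(tuple(pos for _, pos in tokens[start:end]))
    return templates


def find_candidates(tokens: List[Token], templates: Set[Template]) -> List[Span]:
    """Return every span whose POS sequence matches an acquired template."""
    if not templates:
        return []
    max_len = max(len(t) for t in templates)
    pos = [p for _, p in tokens]
    candidates: List[Span] = []
    for start in range(len(pos)):
        last_end = min(len(pos), start + max_len)
        for end in range(start + 1, last_end + 1):
            if tuple(pos[start:end]) in templates:
                candidates.append((start, end))
    return candidates


# Toy "manually tagged" data (assumed, not taken from the paper's corpus).
training = [
    ([("新", "a"), ("方法", "n"), ("有效", "v")], [(0, 2)]),
    ([("方法", "n"), ("很", "d"), ("好", "a")], [(0, 1)]),
]
templates = acquire_templates(training)

sentence = [("大", "a"), ("狗", "n"), ("吃", "v"), ("肉", "n")]
candidates = find_candidates(sentence, templates)
```

In this toy run the candidates overlap (the span covering "大 狗" and the span covering "狗" alone both match a template), which suggests why a later classification and chunking stage is needed to select the final equNPs.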
Keywords :
grammars; learning (artificial intelligence); natural languages; Chinese noun phrase; Chinese-English quasi-equivalent noun phrase definition; POS; automatic identification; maximum entropy classifier; part of speech; semantic information; Computational linguistics; Data mining; Entropy; Information retrieval; Machine learning; Natural language processing; Natural languages; Speech; Statistical analysis; Testing;
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2005. IEEE NLP-KE '05. Proceedings of 2005 IEEE International Conference on
Conference_Location :
Wuhan, China
Print_ISBN :
0-7803-9361-9
DOI :
10.1109/NLPKE.2005.1598776