Title :
A practical framework for formalizing and extracting Chinese collocations
Author :
Qu, Weiguang ; Tang, Xuri ; Zhou, Junsheng ; Gu, Yanhui ; Li, Bin
Author_Institution :
Sch. of Comput. Sci., Nanjing Normal Univ., Nanjing, China
Abstract :
In this paper we argue for a word-sense-based formalization of collocation and propose a seed-based approach to collocation extraction for specific purposes. The approach uses the RFR_SUM model to iteratively classify the senses of polysemous words in the corpus. Collocation strength is also obtained by RFR. To capture the syntactic relations inside collocations, this paper presents a frame-based collocation extraction method, which uses word-related frames to automatically obtain collocations with structural information from a large-scale corpus, with an average accuracy rate of 89.69%.
Keywords :
iterative methods; natural language processing; pattern classification; text analysis; Chinese collocation extraction; Chinese collocation formalization; RFR_SUM model; collocation strength; frame-based collocation extraction method; iterative polysemous word sense classification; seed-based approach; structural information; syntactic relation; word-related frame; word-sense based formalization; collocation extraction; collocation formalization; frame-based collocation extraction
Conference_Title :
Natural Language Processing and Knowledge Engineering (NLP-KE), 2011 7th International Conference on
Conference_Location :
Tokushima
Print_ISBN :
978-1-61284-729-0
DOI :
10.1109/NLPKE.2011.6138230