Title :
Extracting coordinate word pairs for dependency parsing
Author :
Junjie Yu; Wenliang Chen
Author_Institution :
School of Computer Science and Technology, Soochow University, Collaborative Innovation Center of Novel Software Technology and Industrialization, Suzhou, China
Abstract :
Identifying coordinate structures is a challenging subtask of Chinese dependency parsing, and the accuracy of coordinate word recognition remains below average. To address this problem, we propose a method that automatically identifies coordinate word pairs in a large-scale unlabeled corpus. We then integrate a set of new features, derived from the collected word pairs, into the dependency parser. The method relies on the presence of easy-to-identify coordinate fragments and proceeds in two steps. In the first step, we apply two hand-crafted rules to extract highly accurate coordinate word pairs as seeds. In the second step, we use the seeds to extract coordinate structures from the corpus, from which further coordinate word pairs are then extracted. Experimental results show that the extracted coordinate word pairs significantly improve dependency parsing accuracy on coordinate structures.
Keywords :
"Feature extraction","Indexes"
Conference_Title :
Asian Language Processing (IALP), 2015 International Conference on
Print_ISBN :
978-1-4673-9595-3
DOI :
10.1109/IALP.2015.7451536