Title :
CCRFs: Cascaded conditional random fields for Chinese POS tagging
Author :
Shen, Guoyang ; Qiu, Likun ; Hu, Changjian ; Zhao, Kai
Author_Institution :
NEC Labs., Beijing, China
Abstract :
We present CCRFs (cascaded conditional random fields), a cascaded approach to scaling conditional random fields (CRFs) for Chinese POS tagging (labeling). General CRFs work well on POS tagging but run into difficulty with a large training dataset and tag set because of the high computational cost of training. CCRFs organize all tags in a hierarchy and run CRFs on each node of the hierarchy. In the CCRFs framework, similar tags are treated as a single tag in an upper layer and are distinguished in a lower layer using additional information from the upper layer. The analysis of computational complexity shows that CCRFs can greatly reduce both the space complexity and the time complexity of training. The experiments show that CCRFs can handle the large data sets from SIGHAN 2008 on common PCs, whereas general CRFs cannot handle these data sets under the same computational conditions. Furthermore, we try to preserve the good performance of CCRFs for POS tagging by selecting proper features for hierarchical POS tag clustering and by introducing richer and more accurate features in the training and tagging phases. The final results show that CCRFs outperform the maximum entropy model on all three test datasets and outperform the enhanced maximum entropy model on two of the three datasets.
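The cascading idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two-level grouping in `FINE_TO_COARSE`, the cluster names, and the helper functions are all hypothetical, and the cost model only counts label-pair transitions (the term that grows quadratically with the tag set in standard linear-chain CRF inference). It assumes each lower-layer CRF only has to distinguish the tags inside its own cluster.

```python
from collections import defaultdict

# Hypothetical two-level hierarchy: fine-grained tag -> coarse cluster.
# The tag names follow Penn Chinese Treebank conventions; the grouping
# itself is an illustrative assumption, not the clustering from the paper.
FINE_TO_COARSE = {
    "NN": "NOUN", "NR": "NOUN", "NT": "NOUN",
    "VV": "VERB", "VA": "VERB", "VC": "VERB", "VE": "VERB",
    "AD": "OTHER", "P": "OTHER", "DEG": "OTHER",
}


def coarsen(tags):
    """Map a fine-grained tag sequence to its upper-layer (coarse) tags."""
    return [FINE_TO_COARSE[tag] for tag in tags]


def cluster_sizes(mapping):
    """Count how many fine-grained tags fall under each coarse cluster."""
    sizes = defaultdict(int)
    for coarse in mapping.values():
        sizes[coarse] += 1
    return dict(sizes)


def transition_cost(num_labels):
    """Number of label-pair transitions a linear-chain CRF scores per position."""
    return num_labels ** 2


def flat_cost(mapping):
    """Transition count for one CRF over the full fine-grained tag set."""
    return transition_cost(len(mapping))


def cascaded_cost(mapping):
    """Transition count for one upper-layer CRF plus one CRF per cluster."""
    sizes = cluster_sizes(mapping)
    upper = transition_cost(len(sizes))
    lower = sum(transition_cost(size) for size in sizes.values())
    return upper + lower


if __name__ == "__main__":
    print(coarsen(["NN", "VV", "P"]))
    print("flat:", flat_cost(FINE_TO_COARSE))
    print("cascaded:", cascaded_cost(FINE_TO_COARSE))
```

Under these assumptions, the single flat model scores 100 label pairs per position for the 10 tags, while the cascade scores 43 in total across the upper layer and the three cluster-level models. The gap widens as the tag set grows, which is the intuition behind the reduced training cost claimed in the abstract.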
Keywords :
computational complexity; natural language processing; random processes; CCRF; Chinese POS tag clustering; NLP processing; cascaded conditional random field; computation complexity; maximum entropy model; part-of-speech tagging; space complexity; time complexity; Computational efficiency; Entropy; Hidden Markov models; Labeling; Laboratories; National electric code; Natural languages; Tagging; Technological innovation; Cascaded Conditional Random Fields; Chinese POS tagging
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2009. NLP-KE 2009. International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-1-4244-4538-7
Electronic_ISBN :
978-1-4244-4540-0
DOI :
10.1109/NLPKE.2009.5313733