Title : 
Integrate statistical model and lexical knowledge for Chinese multiword chunking
         
        
            Author : 
Zhou, Qiang ; Yu, Hang
         
        
            Author_Institution : 
Centre for Speech & Language Technol., Tsinghua Univ., Beijing
         
        
            Abstract : 
Multiword chunking (MWC) is a shallow parsing technique that recognizes the external constituent tag and the internal relation tag of each chunk in a sentence. In this paper, we propose a new solution to this problem. We design a new relation tagging scheme to represent different intra-chunk relations, and run several feature-engineering experiments to select the best baseline statistical model. We also apply outside knowledge from a large-scale lexical relationship knowledge base to improve parsing performance. By integrating all of the above techniques, we develop a new Chinese MWC parser. Experimental results show that its parsing performance greatly exceeds that of a rule-based parser trained and tested on the same data set.
         
        
            Keywords : 
knowledge based systems; natural language processing; Chinese multiword chunking; intrachunk relations; large-scale lexical relationship knowledge; relation tagging scheme; rule-based parser; shallow parsing technique; Design engineering; Information science; Labeling; Laboratories; Large-scale systems; Natural languages; Speech; Tagging; Technological innovation; Testing; Multiword chunking; Outside lexical knowledge base; Partial parsing
         
        
            Conference_Title : 
Natural Language Processing and Knowledge Engineering, 2008. NLP-KE '08. International Conference on
         
        
            Conference_Location : 
Beijing
         
        
            Print_ISBN : 
978-1-4244-4515-8
         
        
            Electronic_ISBN : 
978-1-4244-2780-2
         
        
        
            DOI : 
10.1109/NLPKE.2008.4906765