Title :
Chinese Text Chunking Using Divide-Conquer Model
Author :
Liang, Yinghong ; Cao, Jun ; Zhang, Chunxiang
Author_Institution :
Sch. of Comput. Eng., Vocational Univ. of Suzhou, Suzhou
Abstract :
The traditional approach to Chinese text chunking identifies all phrases using a single model and the same features. It has been shown that using only one model has two limitations: the same types of features are not suitable for all phrases, and data sparseness may result. In this paper, a divide-conquer model is proposed and applied to the identification of Chinese phrases. The model divides the chunking task into several sub-tasks according to the sensitive features of each phrase and identifies the different phrases in parallel. A two-stage decreasing conflict strategy is then used to synthesize the answers of the sub-tasks. In tests on the Chinese Penn Treebank, the F score of Chinese chunking using the multi-agent strategy reaches 95.82%, which is higher than the best previously reported result.
Keywords :
divide and conquer methods; multi-agent systems; text analysis; Chinese Penn Treebank; Chinese text chunking; data sparseness; divide-conquer model; multiagent strategy; two-stage decreasing conflict strategy; Cities and towns; Data mining; Electronic mail; Forestry; Fuzzy systems; Information analysis; Knowledge engineering; Natural language processing; Testing; Text processing; Chinese chunking; Multi-agent; sensitive features;
Conference_Title :
Fuzzy Systems and Knowledge Discovery, 2008. FSKD '08. Fifth International Conference on
Conference_Location :
Jinan, Shandong
Print_ISBN :
978-0-7695-3305-6
DOI :
10.1109/FSKD.2008.300