DocumentCode :
3424707
Title :
Probabilistic unsupervised Chinese sentence compression
Author :
Chen, Jinguang ; He, Tingting ; Gui, Zhuoming ; Li, Fang
Author_Institution :
Eng. & Res. Center for Inf. Technol. on Educ., Huazhong Normal Univ., Wuhan, China
fYear :
2009
fDate :
17-19 Aug. 2009
Firstpage :
61
Lastpage :
66
Abstract :
Research on sentence compression has been under way for many years in other languages, especially English, but research on Chinese sentence compression is rare. In this paper, we describe an efficient probabilistic and syntactic approach to Chinese sentence compression. We introduce the classical noisy-channel approach into Chinese sentence compression and improve it in many ways. Because no parallel training corpus exists for Chinese, we use an unsupervised learning method. This paper also presents a novel bottom-up optimization algorithm that considers both bigram and syntactic probabilities when generating candidate compressed sentences. We evaluate the results against manual compressions and a simple baseline. The experiments show the effectiveness of the proposed approach.
Keywords :
data compression; natural language processing; statistical distributions; unsupervised learning; Chinese sentence compression; classical noisy-channel approach; probabilistic approach; syntactic approach; unsupervised learning method; Computer science; Computer science education; Context modeling; Data mining; Educational institutions; Educational technology; Information technology; Natural languages; Optimization methods; Unsupervised learning;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Granular Computing, 2009, GRC '09. IEEE International Conference on
Conference_Location :
Nanchang
Print_ISBN :
978-1-4244-4830-2
Type :
conf
DOI :
10.1109/GRC.2009.5255158
Filename :
5255158