DocumentCode :
2260590
Title :
A study on cross-language text summarization using supervised methods
Author :
Yu, Lei ; Ren, Fuji
Author_Institution :
Grad. Sch. of Adv. Sci. Technol. Educ., Univ. of Tokushima, Tokushima, Japan
fYear :
2009
fDate :
24-27 Sept. 2009
Firstpage :
1
Lastpage :
7
Abstract :
In this work, we apply Hidden Markov Models (HMM), Conditional Random Fields (CRF), Gaussian Mixture Models (GMM) and Mathematical Methods of Statistics (MMS) to Chinese and Japanese text summarization. The purpose of this work is to study the applicability of the three trainable models (HMM, CRF and GMM) to cross-language text summarization. For model training, we use several features, such as sentence position, sentence centrality and the number of named entities. For model testing, we use Chinese online news and Japanese news extracted from web pages. We evaluate each model by measuring precision at compression rates of 10%, 20% and 30%, with MMS as the baseline method. The results show that, using the same training features, HMM, CRF and GMM all improve markedly on MMS for both Chinese and Japanese text summarization. In particular, GMM achieves the best performance in all tests.
Keywords :
Gaussian processes; hidden Markov models; text analysis; Chinese text summarization; Gaussian mixture models; Japanese text summarization; conditional random field; cross-language text summarization; hidden Markov models; mathematical methods of statistics; name entity; sentence centrality; sentence position; supervised methods; training features; Artificial intelligence; Data mining; Educational technology; Hidden Markov models; Machine learning; Mathematical model; Natural languages; Statistics; Testing; Web pages; Machine Learning; NLP; Text Summarization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Natural Language Processing and Knowledge Engineering, 2009. NLP-KE 2009. International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-1-4244-4538-7
Electronic_ISBN :
978-1-4244-4540-0
Type :
conf
DOI :
10.1109/NLPKE.2009.5313809
Filename :
5313809