A New Method of the Automatically Marked Chinese Part of Speech Based on Gaussian Prior Smoothing Maximum Entropy Model

Author

Zhao, Wei ; Zhao, Faxing ; Li, Wenhui

Author_Institution

Jilin Univ., Changchun

Volume

3

fYear

2007

fDate

24-27 Aug. 2007

Firstpage

447

Lastpage

453

Abstract

Owing to its many strengths, the maximum entropy (ME) model is widely used in natural language processing. Because training data are limited, parameter sparsity is a serious problem in Chinese part-of-speech tagging, and the model is prone to overfitting the training data; some smoothing method should therefore be applied to the maximum entropy model. Among the several smoothing methods proposed for maximum entropy models, Gaussian prior smoothing performs particularly well. Based on this smoothed maximum entropy model and the characteristics of Chinese, a new Chinese part-of-speech tagging system is presented. Experimental results show that it works well.
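
As a rough illustration of what Gaussian prior smoothing means for a maximum entropy model, the sketch below fits a toy tagger by gradient ascent on the conditional log-likelihood minus a penalty of lambda^2 / (2 * sigma^2) per weight. It is not the system described in the paper: the function names (train_maxent, predict), the single (word, tag) indicator feature, the learning rate, and the romanized toy data are all assumptions made for the example.

```python
import math
from collections import defaultdict

def predict(weights, word, tags):
    """Return p(tag | word) under the maximum entropy (softmax) model."""
    scores = [weights.get((word, t), 0.0) for t in tags]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return {t: e / z for t, e in zip(tags, exps)}

def train_maxent(data, tags, sigma2=1.0, lr=0.1, epochs=500):
    """Gradient ascent on the Gaussian-prior penalized log-likelihood.

    sigma2 is the prior variance (an assumed hyperparameter): smaller
    values pull the weights harder toward zero, i.e. more smoothing.
    """
    weights = defaultdict(float)  # one weight per (word, tag) indicator
    for _ in range(epochs):
        grad = defaultdict(float)
        for word, gold in data:
            probs = predict(weights, word, tags)
            for tag in tags:
                # observed feature count minus expected count under the model
                grad[(word, tag)] += (1.0 if tag == gold else 0.0) - probs[tag]
        for feat in set(grad) | set(weights):
            # the Gaussian prior contributes -lambda / sigma^2 to the gradient
            weights[feat] += lr * (grad[feat] - weights[feat] / sigma2)
    return weights

# Hypothetical toy corpus: romanized words with noun (N) / verb (V) tags.
data = [("shu", "N"), ("shu", "N"), ("du", "V"), ("du", "V"), ("du", "N")]
tags = ["N", "V"]
weights = train_maxent(data, tags, sigma2=1.0)
print(predict(weights, "shu", tags))
```

Lowering sigma2 shrinks every weight toward zero, so sparsely observed features cannot dominate the prediction; a word never seen in training falls back to a uniform distribution over tags.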

Keywords

Gaussian processes; maximum entropy methods; natural language processing; smoothing methods; speech processing; Gaussian prior smoothing maximum entropy model; Gaussian prior smoothing method; over fit training data; parameters sparse phenomenon; Computer science; Educational institutions; Entropy; Hidden Markov models; Natural languages; Probability distribution; Training data

fLanguage

English

Publisher

IEEE

Conference_Titel

Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on

Conference_Location

Haikou

Print_ISBN

978-0-7695-2874-8

Type

conf

DOI

10.1109/FSKD.2007.86

Filename

4406278