  • DocumentCode
    336825
  • Title
    Efficient sampling and feature selection in whole sentence maximum entropy language models
  • Author
    Chen, Stanley F.; Rosenfeld, Ronald
  • Author_Institution
    Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA
  • Volume
    1
  • fYear
    1999
  • fDate
    15-19 Mar 1999
  • Firstpage
    549
  • Abstract
    Conditional maximum entropy models have been successfully applied to estimating language model probabilities of the form P(w|h), but are often too demanding computationally. Furthermore, the conditional framework does not lend itself to expressing global sentential phenomena. We have previously introduced a non-conditional maximum entropy language model which directly models the probability of an entire sentence or utterance. The model treats each utterance as a “bag of features”, where features are arbitrary computable properties of the sentence. Using the model is computationally straightforward since it does not require normalization. Training the model requires efficient sampling of sentences from an exponential distribution. In this paper, we further develop the model and demonstrate its feasibility and power. We compare the efficiency of several sampling techniques, implement smoothing to accommodate rare features, and suggest an efficient algorithm for improving the convergence rate. We then present a novel procedure for feature selection, which exploits discrepancies between the existing model and the training corpus. We demonstrate our ideas by constructing and analyzing competitive models in the Switchboard domain.
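    To make the abstract's model concrete, below is a minimal, self-contained Python sketch of the whole-sentence maximum entropy form p(s) ∝ p0(s) · exp(Σ_i λ_i f_i(s)), together with one sampler from the family the paper compares (an independence Metropolis-Hastings chain driven by a baseline model p0). The feature functions, weights, vocabulary, and toy p0 here are hypothetical illustrations, not the authors' features or code.

```python
# Sketch of a whole-sentence maximum entropy language model and of one
# sampling scheme from the family compared in the paper.  All features,
# weights, the vocabulary, and the baseline model p0 are hypothetical
# stand-ins, not the authors' implementation.
import math
import random

# Hypothetical "bag of features": arbitrary computable properties of a sentence.
def f_long_sentence(words):
    return 1.0 if len(words) > 10 else 0.0

def f_contains_uh(words):
    return 1.0 if "uh" in words else 0.0

FEATURES = [f_long_sentence, f_contains_uh]   # f_i(s)
LAMBDAS = [0.3, -0.7]                         # lambda_i (illustrative weights)

def log_p0(words):
    """Baseline model p0(s); in practice an n-gram LM.  Toy stand-in here."""
    return -2.0 * len(words)

def log_unnormalized_p(words):
    """log p(s) up to a constant: log p0(s) + sum_i lambda_i * f_i(s).
    Using the model needs no normalization, which keeps scoring cheap."""
    return log_p0(words) + sum(l * f(words) for l, f in zip(LAMBDAS, FEATURES))

def sample_from_p0():
    """Draw a sentence from the baseline model (toy random generator here)."""
    vocab = ["yeah", "uh", "i", "think", "so", "right"]
    return [random.choice(vocab) for _ in range(random.randint(1, 15))]

def independence_mh(num_samples):
    """Independence Metropolis-Hastings: propose from p0, accept with
    probability min(1, w(s') / w(s)), where w(s) = exp(sum_i lambda_i f_i(s))
    is the weight of the MaxEnt model relative to the proposal p0."""
    current = sample_from_p0()
    current_logw = log_unnormalized_p(current) - log_p0(current)
    samples = []
    for _ in range(num_samples):
        proposal = sample_from_p0()
        proposal_logw = log_unnormalized_p(proposal) - log_p0(proposal)
        if random.random() < math.exp(min(0.0, proposal_logw - current_logw)):
            current, current_logw = proposal, proposal_logw
        samples.append(current)
    return samples

if __name__ == "__main__":
    chain = independence_mh(10000)
    # Monte Carlo estimate of a feature expectation under the model, the
    # quantity needed when training the weights (e.g. by iterative scaling).
    print(sum(f_contains_uh(s) for s in chain) / len(chain))
```

    The sampled feature expectations are what training compares against the corresponding empirical counts from the training corpus; the paper's feature selection likewise exploits discrepancies between the model and the corpus.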
  • Keywords
    convergence of numerical methods; exponential distribution; feature extraction; maximum entropy methods; natural languages; signal sampling; speech processing; Switchboard domain; conditional maximum entropy models; convergence rate; efficient algorithm; efficient sampling; feature selection; language model probabilities; model training; nonconditional maximum entropy language model; training corpus; utterance; whole sentence maximum entropy language models; Computational modeling; Computer science; Entropy; History; Monte Carlo methods; Probability; Sampling methods; Smoothing methods
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1999)
  • Conference_Location
    Phoenix, AZ
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-5041-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.1999.758184
  • Filename
    758184