DocumentCode
49744
Title
Hierarchical Pitman–Yor–Dirichlet Language Model
Author
Jen-Tzung Chien
Author_Institution
Dept. of Electr. & Comput. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan
Volume
23
Issue
8
fYear
2015
fDate
Aug. 2015
Firstpage
1259
Lastpage
1272
Abstract
Probabilistic models are often viewed as insufficiently expressive because of the strong assumptions they place on the probability distribution and their fixed model complexity. Bayesian nonparametric learning pursues an expressive probabilistic representation based on nonparametric prior and posterior distributions, with a less assumption-laden approach to inference. This paper presents the hierarchical Pitman-Yor-Dirichlet (HPYD) process as a nonparametric prior for inferring the predictive probabilities of smoothed n-grams with integrated topic information. A metaphor of the hierarchical Chinese restaurant process is proposed to infer the HPYD language model (HPYD-LM) via Gibbs sampling. This process is equivalent to implementing the hierarchical Dirichlet process-latent Dirichlet allocation (HDP-LDA) with the twisted hierarchical Pitman-Yor LM (HPY-LM) as base measures. Accordingly, we produce the power-law distributions and extract the semantic topics that reflect the properties of natural language in the estimated HPYD-LM. The superiority of the HPYD-LM over the HPY-LM and other language models is demonstrated by experiments on model perplexity and speech recognition.
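The Pitman-Yor Chinese restaurant process underlying the abstract can be sketched with a minimal seating simulation; this is a generic illustration of the standard two-parameter (discount, concentration) predictive rule, not the paper's hierarchical HPYD construction, and all names here are illustrative.

```python
import random

def pitman_yor_crp(n_customers, discount=0.5, concentration=1.0, seed=0):
    """Simulate seating in a two-parameter Chinese restaurant process.

    Returns a list of table occupancy counts. A positive discount
    (0 <= discount < 1) yields the heavy-tailed, power-law table-size
    distribution that motivates Pitman-Yor priors for n-gram smoothing.
    """
    rng = random.Random(seed)
    tables = []   # tables[k] = number of customers seated at table k
    total = 0     # customers seated so far
    for _ in range(n_customers):
        # New table with prob (concentration + discount * K) / (concentration + n)
        p_new = (concentration + discount * len(tables)) / (concentration + total)
        if total == 0 or rng.random() < p_new:
            tables.append(1)
        else:
            # Existing table k chosen with prob proportional to (count_k - discount)
            r = rng.random() * (total - discount * len(tables))
            for k, c in enumerate(tables):
                r -= c - discount
                if r <= 0:
                    tables[k] = c + 1
                    break
            else:
                tables[-1] += 1  # guard against floating-point underrun
        total += 1
    return tables
```

With a larger discount, the simulation opens more tables and the occupancy counts decay more slowly, which is the power-law behavior the abstract refers to.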
Keywords
Bayes methods; learning (artificial intelligence); nonparametric statistics; speech recognition; statistical distributions; Bayesian nonparametric learning; Gibbs sampling; HDP-LDA; HPYD language model; HPYD-LM; base measures; expressive probabilistic representation; fixed model complexity; hierarchical Chinese restaurant process; hierarchical Dirichlet process-latent Dirichlet allocation; hierarchical Pitman-Yor-Dirichlet language model; integrated topic information; model perplexity; natural language; nonparametric prior distribution; posterior distributions; power-law distributions; predictive probabilities; probabilistic distribution; probabilistic models; semantic topics; smoothed n-grams; speech recognition; twisted hierarchical Pitman-Yor LM; Adaptation models; Bayes methods; Context; Data models; Semantics; Speech; Speech processing; Bayesian nonparametrics; language model; speech recognition; topic model; unsupervised learning;
fLanguage
English
Journal_Title
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publisher
ieee
ISSN
2329-9290
Type
jour
DOI
10.1109/TASLP.2015.2428632
Filename
7098357
Link To Document