Title :
Tri-factorization learning of sub-word units with application to vocabulary acquisition
Author :
Sun, Meng ; Van hamme, Hugo
Author_Institution :
Dept. of Electr. Eng.-ESAT, Katholieke Univ. Leuven, Leuven, Belgium
Abstract :
In prior work, we proposed a method for vocabulary acquisition based on a co-occurrence model and non-negative matrix factorization. In that method, the vocabulary is described in terms of co-occurrence statistics of frame-level acoustic descriptions, and the approach scales poorly to larger vocabularies. Much like whole-word HMMs, it does not reuse sub-word units such as phone models. In this paper, we apply the co-occurrence framework to learn a set of sub-word units without supervision using a matrix tri-factorization, propose a method for computing their posteriorgram, and finally show vocabulary acquisition from the posteriorgram. The method outperforms our prior work in that it can learn from a smaller set of labeled data and achieves better recognition accuracy.
Keywords :
hidden Markov models; learning (artificial intelligence); matrix decomposition; speech recognition; statistical analysis; cooccurrence statistic model; nonnegative matrix factorization; posteriorgram; recognition accuracy; subword units; trifactorization learning; vocabulary acquisition; whole-word HMM models; Acoustics; Probabilistic logic; Training; Training data; Vectors; Vocabulary; pattern discovery; semi-supervised learning; spectral embedding
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2012.6289086