Title :
EM-based phoneme confusion matrix generation for low-resource spoken term detection
Author :
Di Xu ; Yun Wang ; Metze, Florian
Author_Institution :
Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
The idea of using a data-driven phoneme confusion matrix (PCM) to enhance speech recognition and retrieval performance is not new to the speech community. Although empirical results show varying degrees of improvement from introducing a PCM, the underlying data-driven processes described in most papers are rather ad hoc and lack rigorous statistical justification. In this paper, we focus on the statistical aspects of PCM generation, and we propose and justify a novel expectation-maximization-based algorithm for data-driven PCM generation. We evaluate the performance of the generated PCMs in the context of low-resource spoken term detection, with a primary focus on out-of-vocabulary keywords.
Keywords :
expectation-maximisation algorithm; information retrieval; matrix algebra; speech recognition; statistical analysis; EM-based phoneme confusion matrix generation; data-driven PCM generation; data-driven phoneme confusion matrix; data-driven processes; expectation-maximization based algorithm; low-resource spoken term detection; out-of-vocabulary keywords; speech community; speech retrieval performance; statistical aspects; Estimation; Optimization; Phase change materials; Probabilistic logic; Speech; Viterbi algorithm; Expectation-maximization algorithm; machine learning; out-of-vocabulary words; spoken term detection
Conference_Title :
Spoken Language Technology Workshop (SLT), 2014 IEEE
DOI :
10.1109/SLT.2014.7078612