Discriminative articulatory models for spoken term detection in low-resource conversational settings

Author

Prabhavalkar, Rohit ; Livescu, Karen ; Fosler-Lussier, Eric ; Keshet, Joseph

Author_Institution

Ohio State Univ., Columbus, OH, USA

fYear

2013

Firstpage

8287

Lastpage

8291

Abstract

We study spoken term detection (STD), the task of determining whether and where a given word or phrase appears in a given segment of speech, using articulatory feature-based pronunciation models. The models are motivated by the requirements of STD in low-resource settings, in which it may not be feasible to train a large-vocabulary continuous speech recognition system, as well as by the need to address pronunciation variation in conversational speech. Our STD system is trained to maximize the expected area under the receiver operating characteristic curve (AUC), a measure often used to evaluate STD performance. In experimental evaluations on the Switchboard corpus, we find that our approach outperforms a baseline HMM-based system across a number of training set sizes, and also outperforms a discriminative phone-based model in some settings.
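As a rough illustration of the evaluation measure named in the abstract, the area under the ROC curve can be computed as the fraction of (positive, negative) example pairs in which the positive example receives the higher detection score, with ties counted as half. The sketch below shows only that quantity; it is not the authors' training procedure, and the function name `pairwise_auc` and the example scores are hypothetical.

```python
from itertools import product


def pairwise_auc(pos_scores, neg_scores):
    """Return AUC as the fraction of correctly ranked (positive, negative) pairs.

    pos_scores: detection scores for segments that contain the term.
    neg_scores: detection scores for segments that do not contain it.
    A tie between a positive and a negative score counts as half a correct pair.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")

    correct = 0.0
    for p, n in product(pos_scores, neg_scores):
        if p > n:
            correct += 1.0
        elif p == n:
            correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    # Hypothetical scores: one negative segment outranks one positive segment.
    print(pairwise_auc([0.9, 0.8], [0.1, 0.85]))
```

Because this measure depends only on how positive and negative examples are ranked relative to each other, it does not require choosing a single detection threshold.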

Keywords

learning (artificial intelligence); speech recognition; HMM-based system; STD system; Switchboard corpus; articulatory feature-based pronunciation models; discriminative articulatory models; discriminative phone-based model; large-vocabulary continuous speech recognition system; low-resource conversational settings; spoken term detection; Acoustics; Context modeling; Hidden Markov models; Speech; Switches; Training; AUC; articulatory features; discriminative training; structural SVM

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on

Conference_Location

Vancouver, BC

ISSN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2013.6639281

Filename

6639281