Regional Information Center for Science and Technology - Template-Based Continuous Speech Recognition

DocumentCode :

779889

Title :

Template-Based Continuous Speech Recognition

Author :

De Wachter, Mathias ; Matton, Mike ; Demuynck, Kris ; Wambacq, Patrick ; Cools, Ronald ; Van Compernolle, Dirk

Author_Institution :

Katholieke Univ., Leuven

Volume :

Issue :

Year :

2007

Date :

5/1/2007

Firstpage :

1377

Lastpage :

1390

Abstract :

Despite their known weaknesses, hidden Markov models (HMMs) have been the dominant technique for acoustic modeling in speech recognition for over two decades. Still, the advances in the HMM framework have not solved its key problems: it discards information about time dependencies and is prone to overgeneralization. In this paper, we attempt to overcome these problems by relying on straightforward template matching. The basis for the recognizer is the well-known dynamic time warping (DTW) algorithm. However, classical DTW continuous speech recognition results in an explosion of the search space. The traditional top-down search is therefore complemented with a data-driven selection of candidates for DTW alignment. We also extend the DTW framework with a flexible subword unit mechanism and a class-sensitive distance measure, two components suggested by state-of-the-art HMM systems. The added flexibility of the unit selection in the template-based framework leads to new approaches to speaker and environment adaptation. The template matching system reaches a performance somewhat worse than the best published HMM results for the Resource Management benchmark, but thanks to the complementarity of errors between the HMM and DTW systems, the combination of both leads to a 17% decrease in word error rate compared to the HMM results.
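As a rough illustration of the DTW alignment that the abstract names as the basis of the recognizer, the sketch below scores an utterance against stored templates and picks the closest label. It is a minimal textbook version, not the authors' system: the Euclidean frame distance, the whole-utterance (isolated) matching, and the names `dtw_distance` and `classify` are assumptions made for this example. It omits the data-driven candidate selection, subword units, and class-sensitive distance measure described above.

```python
import math
from typing import Dict, List, Sequence

Frame = Sequence[float]  # one feature vector per time step (assumed representation)


def euclidean(a: Frame, b: Frame) -> float:
    """Frame-level distance; a plain Euclidean metric is assumed here."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(template: Sequence[Frame], utterance: Sequence[Frame]) -> float:
    """Accumulated cost of the best monotonic alignment of two frame sequences."""
    n, m = len(template), len(utterance)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")
    # cost[i][j] = cheapest alignment of the first i template frames
    # with the first j utterance frames.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(template[i - 1], utterance[j - 1])
            # Allowed steps: advance in the template, in the utterance, or in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(utterance: Sequence[Frame],
             templates: Dict[str, List[Sequence[Frame]]]) -> str:
    """Return the label whose nearest template has the lowest DTW cost."""
    return min(
        templates,
        key=lambda label: min(dtw_distance(t, utterance) for t in templates[label]),
    )


if __name__ == "__main__":
    templates = {
        "yes": [[[0.0], [1.0], [2.0]]],
        "no": [[[5.0], [5.0], [5.0]]],
    }
    # The utterance is a time-stretched copy of the "yes" template.
    print(classify([[0.0], [1.0], [1.0], [2.0]], templates))
```

The full cost matrix is quadratic in the sequence lengths for every template, which hints at why an exhaustive search over all templates becomes expensive for continuous speech and why the abstract describes narrowing the candidates before alignment.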

Keywords :

hidden Markov models; speech recognition; acoustic modeling; resource management benchmark; subword unit mechanism; template matching; template-based continuous speech recognition; Context modeling; Error analysis; Explosions; Power system modeling; Resource management; Speech processing; Speech synthesis; Switches; Dynamic time warping (DTW); episodic modeling; example-based recognition

Language :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1558-7916

Type :

Journal article

DOI :

10.1109/TASL.2007.894524

Filename :

4156191

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=779889