Title :
Transition features for CRF-based speech recognition and boundary detection
Author :
Dimopoulos, Spiros ; Fosler-Lussier, Eric ; Lee, Chin-Hui ; Potamianos, Alexandros
Author_Institution :
Dept. of Electron. & Comput. Eng., Tech. Univ. of Crete, Chania, Greece
fDate :
Dec. 13 2009-Dec. 17 2009
Abstract :
In this paper, we investigate a variety of spectral and time-domain features for explicitly modeling phonetic transitions in speech recognition. Specifically, we employ spectral and energy distance metrics, as well as time derivatives of phonological descriptors and MFCCs. The features are integrated into an extended Conditional Random Fields (CRF) statistical modeling framework that supports general-purpose transition models. For evaluation, we measure both phonetic recognition accuracy and the precision/recall of boundary detection. Results show that when transition features are used in a CRF-based recognition framework, recognition performance improves significantly due to a reduction in phone deletions. Boundary detection performance also improves, mainly for transitions among the silence, stop, and fricative phonetic classes.
Keywords :
hidden Markov models; random processes; speech recognition; CRF-based speech recognition; boundary detection; energy distance metrics; extended conditional random fields statistical modeling framework; hidden Markov model; phone deletions reduction; phonetic recognition task accuracy; phonetic transitions; spectral distance metrics; Computer science; Data mining; Drives; Energy measurement; Frequency selective surfaces; Hidden Markov models; Power engineering and energy; Probability; Speech recognition; Speech synthesis;
Conference_Title :
Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
Conference_Location :
Merano
Print_ISBN :
978-1-4244-5478-5
Electronic_ISBN :
978-1-4244-5479-2
DOI :
10.1109/ASRU.2009.5373287