• DocumentCode
    2971448
  • Title

    Transition features for CRF-based speech recognition and boundary detection

  • Author

    Dimopoulos, Spiros ; Fosler-Lussier, Eric ; Lee, Chin-Hui ; Potamianos, Alexandros

  • Author_Institution
    Dept. of Electron. & Comput. Eng., Tech. Univ. of Crete, Chania, Greece
  • fYear
    2009
  • fDate
    Dec. 13 2009-Dec. 17 2009
  • Firstpage
    99
  • Lastpage
    102
  • Abstract
    In this paper, we investigate a variety of spectral and time-domain features for explicitly modeling phonetic transitions in speech recognition. Specifically, spectral and energy distance metrics, as well as time derivatives of phonological descriptors and MFCCs, are employed. The features are integrated in an extended Conditional Random Fields (CRF) statistical modeling framework that supports general-purpose transition models. For evaluation, we measure both phonetic recognition accuracy and the precision/recall of boundary detection. Results show that when transition features are used in a CRF-based recognition framework, recognition performance improves significantly due to a reduction in phone deletions. Boundary detection performance also improves, mainly for transitions among the silence, stop, and fricative phonetic classes.
  • Keywords
    hidden Markov models; random processes; speech recognition; CRF-based speech recognition; boundary detection; energy distance metrics; extended conditional random fields statistical modeling framework; hidden Markov model; phone deletions reduction; phonetic recognition task accuracy; phonetic transitions; spectral distance metrics; Computer science; Data mining; Drives; Energy measurement; Frequency selective surfaces; Hidden Markov models; Power engineering and energy; Probability; Speech recognition; Speech synthesis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
  • Conference_Location
    Merano
  • Print_ISBN
    978-1-4244-5478-5
  • Electronic_ISBN
    978-1-4244-5479-2
  • Type

    conf

  • DOI
    10.1109/ASRU.2009.5373287
  • Filename
    5373287