• DocumentCode
    2768728
  • Title
    Robust speech recognition with on-line unsupervised acoustic feature compensation
  • Author
    Buera, Luis ; Miguel, Antonio ; Lleida, Eduardo ; Saz, Óscar ; Ortega, Alfonso
  • Author_Institution
    Zaragoza Univ., Zaragoza
  • fYear
    2007
  • fDate
    9-13 Dec. 2007
  • Firstpage
    105
  • Lastpage
    110
  • Abstract
    An on-line unsupervised hybrid compensation technique is proposed to reduce the mismatch between training and testing conditions. It combines multi-environment model-based linear normalization with a GMM-based cross-probability model (MEMLIN CPM) and a novel acoustic model adaptation method based on rotation transformations. A set of rotation transformations is estimated by linear regression, in an unsupervised process, from clean and MEMLIN CPM-normalized training data. In testing, each MEMLIN CPM-normalized frame is decoded using a modified Viterbi algorithm and expanded acoustic models, which are obtained from the reference models and the set of rotation transformations. The proposed solution was evaluated on the Spanish SpeechDat Car database: MEMLIN CPM over standard ETSI front-end parameters reaches an average WER improvement of 83.89%, while the hybrid solution reaches 92.07%. The hybrid technique was also tested on the Aurora 2 database, obtaining an average improvement of 68.88% with clean training.
  • Keywords
    Gaussian processes; audio acoustics; compensation; decoding; estimation theory; feature extraction; matrix algebra; probability; regression analysis; speech coding; speech recognition; unsupervised learning; vectors; GMM; MEMLIN CPM-normalized training data; Viterbi algorithm; acoustic model adaptation method; cross-probability model; feature vector normalization; linear regression; multienvironment model linear normalization; normalized frame decoding; online unsupervised acoustic feature compensation; online unsupervised hybrid compensation technique; rotation matrix estimation process; rotation transformations; speech recognition; testing conditions; training conditions; Acoustic testing; Adaptation model; Databases; Decoding; Linear regression; Robustness; Speech recognition; Telecommunication standards; Training data; Viterbi algorithm; acoustic model adaptation; feature vector normalization; robust speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on
  • Conference_Location
    Kyoto
  • Print_ISBN
    978-1-4244-1746-9
  • Electronic_ISBN
    978-1-4244-1746-9
  • Type
    conf
  • DOI
    10.1109/ASRU.2007.4430092
  • Filename
    4430092