• DocumentCode
    312148
  • Title

    A unified spectral transformation adaptation approach for robust speech recognition

  • Author

    Yao, Lei ; Yu, Dong ; Huang, Taiyi

  • Author_Institution
    Nat. Lab. of Pattern Recognition, Acad. Sinica, Beijing, China
  • Volume
    2
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    981
  • Abstract
    Canonical correlation-based compensation (CCBC) is proposed as a unified approach to cope with the mismatch between training and test conditions. This mismatch can be grouped into three classes: differences between speakers, changes in the recording channel and the effects of a noisy environment. In previous work (1995), we successfully used the CCBC approach, with some modifications, to make our speech recognizer robust to a noisy environment. Recently, the same approach has been extended to speaker and channel adaptation. The results of our experiments show that the CCBC approach compensates well for all three kinds of distortion source between training and test conditions. To compare the performance of CCBC with that of some conventional adaptation approaches, we also test cepstral mean normalization, RASTA and Lin-Log RASTA. We find that CCBC performs better than all of them. The selection of appropriate reference speech data, a very important problem in the CCBC approach, is also discussed in this paper.
  • Keywords
    audio recording; cepstral analysis; compensation; correlation methods; software performance evaluation; speech recognition; Lin-Log RASTA; canonical correlation-based compensation; cepstral mean normalization; channel adaptation; clustering; distortion source; noisy environment; performance; recording channel changes; reference speech data selection; robust speech recognition; speaker adaptation; speaker differences; training set/test set mismatch; unified spectral transformation adaptation approach; Acoustic distortion; Acoustic noise; Cepstral analysis; Loudspeakers; Robustness; Speech enhancement; Speech recognition; Testing; Vectors; Working environment noise;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type

    conf

  • DOI
    10.1109/ICSLP.1996.607767
  • Filename
    607767