  • DocumentCode
    3165052
  • Title
    ASR-driven top-down binary mask estimation using spectral priors
  • Author
    Hartmann, William; Fosler-Lussier, Eric

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
  • fYear
    2012
  • fDate
    25-30 March 2012
  • Firstpage
    4685
  • Lastpage
    4688
  • Abstract
    Typical mask estimation algorithms use low-level features to estimate the interfering noise or instantaneous SNR. We propose a simple top-down approach to mask estimation. The estimated mask is based on a specific hypothesis of the underlying speech without using information about the interference or the instantaneous SNR. In this pilot study, we observe a 9% reduction in word error over a baseline recognition system on the Aurora4 corpus, though much greater gains could theoretically be achieved through improvements to the model selection process. We also present SNR improvement results showing our method performs as well as a standard MMSE-based method, demonstrating that speech recognition can aid speech enhancement. Thus, the relationship between recognition and enhancement need not be one way: linguistic information can play a significant role in speech enhancement.
  • Keywords
    least mean squares methods; speech enhancement; speech recognition; ASR-driven top-down binary mask estimation; Aurora4 corpus; MMSE-based method; SNR; automatic speech recognition; baseline recognition system; interfering noise estimation; linguistic information; low-level features; spectral priors; Estimation; Hidden Markov models; Signal to noise ratio; Speech; Speech enhancement; Speech recognition; ideal binary mask; mask estimation; robust automatic speech recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • Conference_Location
    Kyoto
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4673-0045-2
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2012.6288964
  • Filename
    6288964