• DocumentCode
    32785
  • Title

    The Potential for Speech Intelligibility Improvement Using the Ideal Binary Mask and the Ideal Wiener Filter in Single Channel Noise Reduction Systems: Application to Auditory Prostheses

  • Author

    Madhu, Nilesh ; Spriet, Ann ; Jansen, Sofie ; Koning, Raphael ; Wouters, Jan

  • Author_Institution
    Dept. of Neurosciences, Katholieke Univ. Leuven, Leuven, Belgium
  • Volume
    21
  • Issue
    1
  • fYear
    2013
  • fDate
    Jan. 2013
  • Firstpage
    63
  • Lastpage
    72
  • Abstract
    Whereas state-of-the-art single-channel noise reduction algorithms for auditory prostheses demonstrate an appreciable suppression of the noise and improved speech quality, they are unable, thus far, to improve the intelligibility of noise-degraded speech signals. Alternative approaches to speech enhancement using a binary time-frequency mask have demonstrated substantial intelligibility improvements in low signal-to-noise-ratio (SNR) conditions under ideal settings, making this a promising research direction for auditory prostheses. These approaches exploit the sparsity and disjointness of speech spectra in their short-time-frequency representation to preserve only the target-dominant time-frequency regions in the processed output. State-of-the-art noise reduction algorithms, in contrast, are soft-decision approaches that weight each time-frequency region in proportion to the prevailing SNR. However, the potential for intelligibility improvement using these approaches has not been examined systematically vis-à-vis the binary mask alternative. This contribution compares the performance of an ideal soft-decision system, exemplified by the ideal Wiener filter (IWF), and the ideal binary mask (IBM) for single-channel speech enhancement for auditory prostheses. To obtain results relevant to this application area, a (relatively) low spectral resolution, modelled using the Bark-spectrum scale, is used for both the IWF and the IBM. This spectral resolution is comparable to that being used in commercial hearing instruments. The comparison is in terms of potential for intelligibility improvement and resulting signal quality. Intelligibility tests carried out under various noise conditions and SNRs show that the IWF leads to higher intelligibility scores than the IBM in low SNR conditions. Under non-ideal parameter estimates, it is demonstrated that the IWF approach is also much less sensitive to estimation errors. Quality-wise, a preference for the IWF exists, as evaluated using a two-stage, pair-wise preference-rating test.
  • Keywords
    Wiener filters; speech enhancement; speech intelligibility; Bark-spectrum scale; auditory prostheses; auditory prosthesis; binary time frequency mask; ideal Wiener filter; ideal binary mask; intelligibility score; intelligibility test; noise degraded speech signals; noise reduction algorithm; signal to noise ratio; single channel noise reduction system; single channel speech enhancement; soft decision system; speech intelligibility improvement; speech quality; speech spectra; time frequency region; Estimation; Materials; Signal to noise ratio; Speech; Speech enhancement; Time frequency analysis; Binary masks; hearing-aids; soft-decision; speech enhancement; speech intelligibility in noise;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type

    jour

  • DOI
    10.1109/TASL.2012.2213248
  • Filename
    6269058