• DocumentCode
    19437
  • Title
    Sparse Hidden Markov Models for Speech Enhancement in Non-Stationary Noise Environments
  • Author
    Feng Deng; Changchun Bao; W. Bastiaan Kleijn
  • Author_Institution
    Sch. of Electron. Inf. & Control Eng., Beijing Univ. of Technol., Beijing, China
  • Volume
    23
  • Issue
    11
  • fYear
    2015
  • fDate
    Nov. 2015
  • Firstpage
    1973
  • Lastpage
    1987
  • Abstract
    We propose a sparse hidden Markov model (HMM)-based single-channel speech enhancement method that models the speech and noise gains accurately in non-stationary noise environments. Autoregressive models are employed to describe the speech and noise in a unified framework, and the speech and noise gains are modeled as random processes with memory. The likelihood criterion for finding the model parameters is augmented with an ℓp regularization term, resulting in a sparse autoregressive HMM (SARHMM) system that encourages sparsity in the speech and noise modeling. In the SARHMM, only a small number of HMM states contribute significantly to the model of each particular observed speech segment. As it eliminates ambiguity between noise and speech spectra, the sparsity of speech and noise modeling helps to improve the tracking of changes in both the spectral shapes and the power levels of non-stationary noise. Using the modeled speech and noise SARHMMs, we first construct a noise estimator to estimate the noise power spectrum. Then, a Bayesian speech estimator is derived to obtain the enhanced speech signal. The subjective and objective test results indicate that the proposed speech enhancement scheme can achieve a larger segmental SNR improvement, a lower log-spectral distortion, and better speech quality in stationary noise conditions than state-of-the-art reference methods. The advantage of the new method is largest for non-stationary noise conditions.
  • Keywords
    autoregressive processes; belief networks; hidden Markov models; speech enhancement; Bayesian speech estimator; HMM-based single-channel speech enhancement method; SARHMM system; autoregressive models; lower log-spectral distortion; noise estimator; nonstationary noise environments; random processes; sparse hidden Markov models; speech quality; Hidden Markov models; Mathematical model; Noise; Noise measurement; Speech; Speech enhancement; Gain modeling; non-stationary noise; sparse autoregressive hidden Markov model (ARHMM); speech enhancement
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    2329-9290
  • Type
    jour
  • DOI
    10.1109/TASLP.2015.2458585
  • Filename
    7163326