  • DocumentCode
    2801145
  • Title
    Noisy speech enhancement based on prior knowledge about spectral envelope and harmonic structure
  • Author
    Yoshioka, Takuya; Nakatani, Tomohiro; Okuno, Hiroshi G.
  • Author_Institution
    NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan
  • fYear
    2010
  • fDate
    14-19 March 2010
  • Firstpage
    4270
  • Lastpage
    4273
  • Abstract
    This paper considers the enhancement of noisy speech. Earlier studies have revealed that an approach that enhances spectral envelopes by using prior knowledge about the all-pole (AP) model parameters of clean speech learnt from speech corpora is advantageous in terms of the amount of musical noise and speech distortion. This paper proposes a new speech enhancement method, in which harmonic structure enhancement is incorporated in learning-based spectral envelope enhancement to further improve performance. The harmonic structure is represented by using a harmonic Gaussian mixture model (GMM), which is parameterized by a voicing indicator and a fundamental frequency. The parameters of the AP model and the harmonic GMM are jointly estimated by maximum a posteriori estimation, thus enabling the enhancement of spectral envelopes and harmonic structures in a unified framework. The proposed method outperforms the spectral envelope enhancement approach by 0.85 dB in cepstral distance.
  • Keywords
    Gaussian processes; learning (artificial intelligence); maximum likelihood estimation; speech enhancement; all-pole model parameter; harmonic Gaussian mixture model; harmonic structure enhancement; learning-based spectral envelope enhancement; maximum a posteriori estimation; musical noise; noisy speech enhancement; spectral envelope enhancement approach; speech distortion; voicing indicator; Acoustic noise; Dictionaries; Frequency; Laboratories; Linear predictive coding; Parameter estimation; Power harmonic filters; Power system harmonics; Speech enhancement; Wiener filter; harmonic structure; learning; spectral envelope
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
  • Conference_Location
    Dallas, TX
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-4295-9
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2010.5495681
  • Filename
    5495681