• DocumentCode
    1076048
  • Title
    Combining Auditory Preprocessing and Bayesian Estimation for Robust Formant Tracking
  • Author
    Gläser, Claudius ; Heckmann, Martin ; Joublin, Frank ; Goerick, Christian
  • Author_Institution
    Honda Res. Inst. Eur., Offenbach, Germany
  • Volume
    18
  • Issue
    2
  • fYear
    2010
  • Firstpage
    224
  • Lastpage
    236
  • Abstract
    We present a framework for estimating formant trajectories. Its focus is to achieve high robustness in noisy environments. Our approach combines a preprocessing stage based on functional principles of the human auditory system and a probabilistic tracking scheme. For enhancing the formant structure in spectrograms, we use a Gammatone filterbank, a spectral preemphasis, as well as a spectral filtering using difference-of-Gaussians (DoG) operators. Finally, a contrast enhancement mimicking a competition between filter responses is applied. The probabilistic tracking scheme adopts the mixture modeling technique for estimating the joint distribution of formants. In conjunction with an algorithm for adaptive frequency range segmentation as well as Bayesian smoothing, an efficient framework for estimating formant trajectories is derived. Comprehensive evaluations of our method on the VTR-formant database emphasize its high precision and robustness. We obtained superior performance compared to existing approaches for clean as well as echoic noisy speech. Finally, an implementation of the framework within the scope of an online system using instantaneous feature-based resynthesis demonstrates its applicability to real-world scenarios.
  • Keywords
    Bayes methods; Gaussian processes; adaptive estimation; channel bank filters; smoothing methods; spectral analysis; speech processing; speech synthesis; statistical distributions; tracking filters; Bayesian estimation; Bayesian smoothing; Gammatone filterbank; VTR-formant database; adaptive frequency range segmentation; auditory preprocessing; contrast enhancement; difference-of-Gaussian operator; echoic noisy speech; formant trajectory estimation; human auditory system; instantaneous feature-based resynthesis; joint formant distribution; mixture modeling technique; noisy environment; online system; probabilistic tracking scheme; robust formant tracking; spectral filtering; spectral preemphasis; spectrogram; Bayes procedures; dynamic programming; speech analysis; tracking
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2009.2025536
  • Filename
    5075655