• DocumentCode
    394287
  • Title
    An expectation maximization approach for formant tracking using a parameter-free non-linear predictor
  • Author
    Bazzi, Issam; Acero, Alex; Deng, Li
  • Author_Institution
    Microsoft Research, Redmond, WA, USA
  • Volume
    1
  • fYear
    2003
  • fDate
    6-10 April 2003
  • Abstract
    This paper presents a new approach for formant tracking using a parameter-free non-linear predictor that maps formant frequencies and bandwidths into the acoustic feature space. The approach relies on decomposing the speech signal into two components: the first component captures the mapping between formants and acoustic observations, while the second component is intended to capture the residual in the signal. We build the mapping by quantizing the formant space and creating a predictor codebook. Formant tracking is achieved by (1) EM training of the parameters of the residual component, and (2) searching the predictor codebook for the best formant values. We explore both MAP and MMSE methods for performing formant tracking with the proposed approach. Furthermore, we impose first-order continuity constraints on formant trajectories, and use Viterbi search to perform formant tracking. We present formant tracking results on data from the Switchboard corpus.
  • Keywords
    least mean squares methods; maximum likelihood estimation; nonlinear estimation; optimisation; search problems; speech processing; speech recognition; table lookup; EM training; MAP method; MMSE; Switchboard corpus; Viterbi search; acoustic feature space; acoustic observations; expectation maximization approach; first order continuity constraints; formant frequencies; formant tracking; formant trajectories; mapping; parameter-free nonlinear predictor; predictor codebook; quantization; residual capture; speech signal decomposition; Bandwidth; Linear predictive coding; Mel frequency cepstral coefficient; Nonlinear acoustics; Predictive models; Signal mapping; Speech analysis; Speech recognition; Trajectory; Video recording
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7663-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.2003.1198818
  • Filename
    1198818
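
The abstract describes searching a quantized formant codebook with first-order continuity constraints via Viterbi search. The following is only a minimal sketch of that general idea, not the authors' implementation: it assumes a scalar formant value and a scalar acoustic feature per frame, a squared-error observation cost, and a squared-difference continuity penalty. The names `viterbi_track`, `predict`, and `smooth_weight` are hypothetical, and the EM-trained residual component mentioned in the abstract is omitted entirely.

```python
def viterbi_track(observations, codebook, predict, smooth_weight=1.0):
    """Return the minimum-cost sequence of codebook entries, one per frame.

    observations  : list of scalar acoustic features (toy assumption)
    codebook      : list of quantized formant values, e.g. in Hz
    predict       : hypothetical mapping from a formant value to a feature
    smooth_weight : weight of the first-order continuity penalty
    """
    n = len(codebook)
    predicted = [predict(f) for f in codebook]

    # cost[j]: best cumulative cost of any path ending in codebook entry j
    cost = [(observations[0] - predicted[j]) ** 2 for j in range(n)]
    backpointers = []

    for obs in observations[1:]:
        new_cost, pointers = [], []
        for j in range(n):
            # Cost of arriving at entry j from each possible predecessor i,
            # penalizing large jumps between consecutive formant values.
            arrivals = [
                cost[i] + smooth_weight * (codebook[j] - codebook[i]) ** 2
                for i in range(n)
            ]
            best_i = min(range(n), key=lambda i: arrivals[i])
            new_cost.append(arrivals[best_i] + (obs - predicted[j]) ** 2)
            pointers.append(best_i)
        cost = new_cost
        backpointers.append(pointers)

    # Backtrack from the cheapest final state.
    j = min(range(n), key=lambda k: cost[k])
    path = [j]
    for pointers in reversed(backpointers):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [codebook[k] for k in path]


# Toy data: three codebook entries and a made-up linear predictor.
codebook = [300.0, 500.0, 700.0]
toy_predict = lambda f: f / 100.0
observations = [3.1, 3.0, 5.2, 6.9]

free_track = viterbi_track(observations, codebook, toy_predict, smooth_weight=0.0)
stiff_track = viterbi_track(observations, codebook, toy_predict, smooth_weight=1e6)
```

With `smooth_weight=0.0` each frame simply takes its nearest codebook entry, while a very large weight forces a single constant value across all frames; intermediate weights trade observation fit against trajectory smoothness.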