• DocumentCode
    2174944
  • Title
    Phoneme selective speech enhancement using the generalized parametric spectral subtraction estimator
  • Author
    Das, Amit; Hansen, John H L
  • Author_Institution
    Center for Robust Speech Syst. (CRSS), Univ. of Texas at Dallas, Richardson, TX, USA
  • fYear
    2011
  • fDate
    22-27 May 2011
  • Firstpage
    4648
  • Lastpage
    4651
  • Abstract
    In this study, the generalized parametric spectral subtraction estimator is employed within a ROVER speech enhancement framework to develop a robust phoneme class selective enhancement algorithm. The parametric estimator is derived by (a) optimizing the weighted Euclidean distortion cost function and (b) modeling the clean speech spectral magnitudes with Rayleigh distributed priors. A set of enhanced utterances is generated from a single noisy utterance by tuning the parameters of the parametric estimator for different phoneme classes. Speech and non-speech segments are separated using a voice activity detector. The mixture maximum model is then used to make soft decisions on these segments to determine their phoneme class weights. The segments from the enhanced utterances are weighted by these decisions and combined to form the final composite utterance. Evaluated with segmental SNR and Itakura-Saito metrics over two noise types and four SNR levels, the composite utterance exhibited better phoneme class improvement than the individual utterances enhanced by the parametric estimator.
  • Keywords
    speech enhancement; Euclidean distortion cost function; Itakura-Saito metrics; SNR levels; generalized parametric spectral subtraction estimator; phoneme selective ROVER speech enhancement; voice activity detector; Hidden Markov models; Mel frequency cepstral coefficient; Noise measurement; Signal to noise ratio; Speech; Speech enhancement; generalized spectral subtraction; phoneme selective speech enhancement
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
  • Conference_Location
    Prague
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4577-0538-0
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2011.5947391
  • Filename
    5947391