• DocumentCode
    148812
  • Title
    Effect of MPEG audio compression on vocoders used in statistical parametric speech synthesis
  • Author
    Bollepalli, Bajibabu ; Raitio, Tuomo
  • Author_Institution
    Dept. of Speech, Music & Hearing, KTH, Stockholm, Sweden
  • fYear
    2014
  • fDate
    1-5 Sept. 2014
  • Firstpage
    1237
  • Lastpage
    1241
  • Abstract
    This paper investigates the effect of MPEG audio compression on HMM-based speech synthesis using two state-of-the-art vocoders. Speech signals are first encoded at various compression rates and then analyzed using the GlottHMM and STRAIGHT vocoders. Objective evaluation results show that the parameters of both vocoders degrade gradually as the compression rate increases, with a clear increase in degradation at bit-rates of 32 kbit/s or less. Experiments with HMM-based synthesis using the two vocoders show that the degradation in quality is already perceptible at bit-rates of 32 kbit/s, and that both vocoders show a similar trend in degradation with respect to compression ratio. The most perceptible artefacts induced by the compression are spectral distortion and reduced bandwidth, while prosody is better preserved.
  • Keywords
    audio coding; data compression; hidden Markov models; multimedia communication; speech coding; speech synthesis; vocoders; GlottHMM vocoders; HMM-based speech synthesis; MPEG audio compression effect; STRAIGHT vocoders; bandwidth reduction; compression rates; compression ratio; hidden Markov model; spectral distortion; statistical parametric speech synthesis; Abstracts; Phase change materials; Speech; Transform coding; Vocoders; GlottHMM; HMM; MP3; MPEG; STRAIGHT; Statistical parametric speech synthesis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Conference (EUSIPCO), 2014 Proceedings of the 22nd European
  • Conference_Location
    Lisbon
  • Type
    conf
  • Filename
    6952427