DocumentCode
148812
Title
Effect of MPEG audio compression on vocoders used in statistical parametric speech synthesis
Author
Bollepalli, Bajibabu ; Raitio, Tuomo
Author_Institution
Dept. of Speech, Music & Hearing, KTH, Stockholm, Sweden
fYear
2014
fDate
1-5 Sept. 2014
Firstpage
1237
Lastpage
1241
Abstract
This paper investigates the effect of MPEG audio compression on HMM-based speech synthesis using two state-of-the-art vocoders. Speech signals are first encoded at various compression rates and then analyzed using the GlottHMM and STRAIGHT vocoders. Objective evaluation results show that the parameters of both vocoders degrade gradually with increasing compression rate, with a clear increase in degradation at bit-rates of 32 kbit/s or less. Experiments with HMM-based synthesis using the two vocoders show that the degradation in quality is already perceptible at bit-rates of 32 kbit/s, and that both vocoders show a similar trend in degradation with respect to compression ratio. The most perceptible artefacts induced by the compression are spectral distortion and reduced bandwidth, while prosody is better preserved.
Keywords
audio coding; data compression; hidden Markov models; multimedia communication; speech coding; speech synthesis; vocoders; GlottHMM vocoders; HMM-based speech synthesis; MPEG audio compression effect; STRAIGHT vocoders; bandwidth reduction; compression rates; compression ratio; hidden Markov model; spectral distortion; statistical parametric speech synthesis; Abstracts; Phase change materials; Speech; Transform coding; Vocoders; GlottHMM; HMM; MP3; MPEG; STRAIGHT; Statistical parametric speech synthesis;
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing Conference (EUSIPCO), 2014 Proceedings of the 22nd European
Conference_Location
Lisbon
Type
conf
Filename
6952427