Title :
Parametric emotional singing voice synthesis
Author :
Park, Younsung ; Yun, Sungrack ; Yoo, Chang D.
Author_Institution :
Dept. of Electr. Eng., Korea Adv. Inst. of Sci. & Technol., Daejeon, South Korea
Abstract :
This paper describes an algorithm to control the expressed emotion of a synthesized song. Based on a database of various melodies sung neutrally with a restricted set of words, hidden semi-Markov models (HSMMs) of notes ranging from E3 to G5 are constructed for synthesizing the singing voice. Three steps are taken in the synthesis: (1) pitch and duration are determined according to the notes indicated by the musical score; (2) features are sampled from the appropriate HSMMs, with durations set to their maximum-probability values; (3) the singing voice is synthesized by the mel-log spectrum approximation (MLSA) filter, using the sampled features as the filter parameters. The emotion of a synthesized song is controlled by varying the duration and vibrato parameters according to Thayer's mood model. A perception test is performed to evaluate the synthesized songs. The results show that the algorithm can control the expressed emotion of a singing voice given only a neutral singing voice database.
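The mood-dependent control described in the abstract can be illustrated with a minimal sketch: a point in Thayer's valence-arousal plane is mapped to a duration scale and vibrato parameters, and the vibrato is superimposed on a note's F0 contour. The mapping constants, function names, and frame rate below are hypothetical illustrations, not the authors' actual parameter values.

```python
import numpy as np

# Hypothetical sketch (not the paper's code): map a Thayer-style
# (valence, arousal) mood point to duration and vibrato controls,
# then apply the vibrato to a note's F0 contour.

def mood_to_controls(valence, arousal):
    """valence, arousal in [-1, 1] -> (duration scale, vibrato extent in cents, vibrato rate in Hz).
    All mapping constants are illustrative assumptions."""
    duration_scale = 1.0 - 0.3 * arousal      # higher arousal -> shorter notes
    vibrato_extent = 50.0 + 30.0 * valence    # cents of peak deviation
    vibrato_rate = 5.5 + 1.0 * arousal        # Hz
    return duration_scale, vibrato_extent, vibrato_rate

def apply_vibrato(f0_hz, frame_rate, extent_cents, rate_hz):
    """Superimpose sinusoidal vibrato on a per-frame F0 contour (Hz)."""
    t = np.arange(len(f0_hz)) / frame_rate
    cents = extent_cents * np.sin(2.0 * np.pi * rate_hz * t)
    return f0_hz * 2.0 ** (cents / 1200.0)   # cents -> frequency ratio

# Example: a flat 440 Hz (A4) note at 100 frames/s, "excited" mood.
scale, extent, rate = mood_to_controls(valence=0.8, arousal=0.9)
f0 = np.full(int(100 * scale), 440.0)        # note shortened by the mood
f0_vib = apply_vibrato(f0, frame_rate=100, extent_cents=extent, rate_hz=rate)
```

The cents-to-ratio conversion (a factor of 2^(c/1200)) keeps the vibrato symmetric on a log-frequency scale, which matches how pitch deviation is usually specified in singing synthesis.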
Keywords :
emotion recognition; feature extraction; hearing; hidden Markov models; music; probability; speech synthesis; Thayer mood model; emotional singing voice synthesis; hidden semi-Markov model; mel-log spectrum approximation; neutral singing voice database; perception test; synthesized song; vibrato parameter; Computer interfaces; Databases; Filters; Intelligent robots; Man machine systems; Mood; Performance evaluation; Speech synthesis; Testing; Training data; Emotion expression; Statistical singing voice synthesis; Vibrato model;
Conference_Titel :
2010 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
Dallas, TX, USA
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495137