Title :
Pitch adaptive training for HMM-based singing voice synthesis
Author :
Oura, Keiichiro ; Mase, Ayami ; Nankaku, Yoshihiko ; Tokuda, Keiichi
Author_Institution :
Dept. of Comput. Sci., Nagoya Inst. of Technol., Nagoya, Japan
Abstract :
A statistical parametric approach to singing voice synthesis based on hidden Markov models (HMMs) has been growing in popularity over the last few years. In this approach, the spectrum, excitation, vibrato, and duration of singing voices are modeled simultaneously with context-dependent HMMs, and waveforms are generated from the HMMs themselves. Because HMM-based singing voice synthesis systems are “corpus-based,” their performance depends heavily on the training data. Therefore, HMMs corresponding to contextual factors that rarely appear in the training data cannot be trained well. Pitch in particular should be covered adequately, since generated F0 trajectories have a great impact on the subjective quality of synthesized singing voices. This paper discusses “pitch adaptive training,” which applies the method of “speaker adaptive training” (SAT) to pitch. This technique makes it possible to normalize pitch based on musical notes in the training process. The experimental results demonstrated that the proposed technique could alleviate the data sparseness problem.
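The abstract does not give the normalization formula, so the following is only a minimal sketch of what normalizing pitch by musical notes could look like. It assumes that notes are given as MIDI note numbers, that A4 is tuned to 440 Hz, and that normalization is a simple subtraction of the note's log F0 from the observed log F0. The function names are hypothetical, and the sketch does not reproduce the SAT-style estimation used in the paper.

```python
import math


def midi_to_log_f0(note, a4_hz=440.0):
    """Log F0 of a MIDI note number (assumption: equal temperament, A4 = 69)."""
    return math.log(a4_hz * 2.0 ** ((note - 69) / 12.0))


def normalize_log_f0(f0_hz, notes):
    """Express each voiced frame's log F0 relative to its score note.

    f0_hz: per-frame F0 in Hz, with 0.0 marking unvoiced frames.
    notes: per-frame MIDI note numbers, with None marking rests.
    Unvoiced frames and rests are returned as None.
    """
    normalized = []
    for f0, note in zip(f0_hz, notes):
        if f0 <= 0.0 or note is None:
            normalized.append(None)
        else:
            normalized.append(math.log(f0) - midi_to_log_f0(note))
    return normalized


def denormalize_log_f0(residuals, notes):
    """Map note-relative log F0 residuals back to F0 in Hz."""
    f0_hz = []
    for residual, note in zip(residuals, notes):
        if residual is None or note is None:
            f0_hz.append(0.0)
        else:
            f0_hz.append(math.exp(residual + midi_to_log_f0(note)))
    return f0_hz
```

Under these assumptions, frames sung exactly on the score pitch map to a residual of zero whatever the note, so a model trained on the residuals can share statistics across notes that are rare in the training data.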
Keywords :
hidden Markov models; learning (artificial intelligence); speech synthesis; HMM-based singing voice synthesis; SAT; context-dependent HMM; corpus-based; hidden Markov Models; pitch adaptive training; speaker adaptive training; statistical parametric approach; training data; Computational modeling; Databases; Hidden Markov models; Speech; Speech synthesis; Training; Training data; hidden Markov model; pitch adaptive training; singing voice synthesis;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2012.6289136