Regional Information Center for Science and Technology - Pitch adaptive training for HMM-based singing voice synthesis

DocumentCode :

3168649

Title :

Pitch adaptive training for HMM-based singing voice synthesis

Author :

Oura, Keiichiro ; Mase, Ayami ; Nankaku, Yoshihiko ; Tokuda, Keiichi

Author_Institution :

Dept. of Comput. Sci., Nagoya Inst. of Technol., Nagoya, Japan

fYear :

2012

fDate :

25-30 March 2012

Firstpage :

5377

Lastpage :

5380

Abstract :

A statistical parametric approach to singing voice synthesis based on hidden Markov models (HMMs) has been growing in popularity over the last few years. In this approach, the spectrum, excitation, vibrato, and duration of singing voices are modeled simultaneously with context-dependent HMMs, and waveforms are generated from the HMMs themselves. Because these systems are “corpus-based,” their performance depends heavily on the training data: HMMs corresponding to contextual factors that rarely appear in the training data cannot be well trained. Pitch in particular should be covered adequately, since generated F₀ trajectories have a great impact on the subjective quality of synthesized singing voices. We adapted the method of “speaker adaptive training” (SAT) into “pitch adaptive training,” which is discussed in this paper. This technique makes it possible to normalize pitch based on musical notes in the training process. The experimental results demonstrated that the proposed technique could alleviate the data sparseness problem.
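The idea of normalizing pitch based on musical notes can be illustrated with a minimal sketch. This is not the method of the paper, which works inside HMM training in the manner of SAT; it only shows the simpler notion of expressing each frame's log F₀ relative to the pitch of the note being sung, so that the same residual pattern can be reused at any note height. The function names, the 440 Hz reference tuning, and the use of MIDI note numbers are all assumptions made for this illustration.

```python
import math

A4_HZ = 440.0   # assumed reference tuning (not stated in the abstract)
A4_MIDI = 69    # MIDI note number of A4

def note_to_log_f0(midi_note):
    """Natural-log frequency of an equal-tempered MIDI note (hypothetical helper)."""
    return math.log(A4_HZ) + (midi_note - A4_MIDI) * math.log(2.0) / 12.0

def normalize_log_f0(f0_hz, midi_notes):
    """Express each voiced frame's log F0 relative to its note's pitch.

    Unvoiced frames (F0 <= 0) are returned as None.
    """
    residuals = []
    for f0, note in zip(f0_hz, midi_notes):
        if f0 <= 0:
            residuals.append(None)
        else:
            residuals.append(math.log(f0) - note_to_log_f0(note))
    return residuals

def denormalize_log_f0(residuals, midi_notes):
    """Map note-relative residuals back to F0 in Hz (0.0 for unvoiced frames)."""
    f0_hz = []
    for res, note in zip(residuals, midi_notes):
        if res is None:
            f0_hz.append(0.0)
        else:
            f0_hz.append(math.exp(res + note_to_log_f0(note)))
    return f0_hz
```

In this sketch, a frame sung exactly on pitch has a residual of zero regardless of the note, so residuals observed at frequently sung notes could in principle inform notes that rarely occur in the training data.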

Keywords :

hidden Markov models; learning (artificial intelligence); speech synthesis; HMM-based singing voice synthesis; SAT; context-dependent HMM; corpus-based; hidden Markov Models; pitch adaptive training; speaker adaptive training; statistical parametric approach; training data; Computational modeling; Databases; Hidden Markov models; Speech; Speech synthesis; Training; Training data; hidden Markov model; pitch adaptive training; singing voice synthesis;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on

Conference_Location :

Kyoto

ISSN :

1520-6149

Print_ISBN :

978-1-4673-0045-2

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2012.6289136

Filename :

6289136

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3168649