Title :
Flexible voice morphing based on linear combination of multi-speakers' vocal tract area functions
Author :
Nambu, Yoshiki ; Mikawa, Masahiko ; Tanaka, Kazuyo
Author_Institution :
Grad. Sch. of Libr., Inf. & Media Studies, Univ. of Tsukuba, Tsukuba, Japan
Abstract :
This paper presents a flexible voice morphing method based on conversion using a linear combination of multi-speakers' vocal tract area functions, in which phonological identity is maintained in terms of the overall interpolated area. In this system, the characteristics of vocal tract resonances are separated from those of glottal source waves using AR-HMM analysis of speech. The vocal tract resonances and glottal source wave characteristics are morphed independently. For the morphing of vocal tract resonances, log vocal tract area functions, which are derived from AR coefficients, are normalized and then processed by a statistical mapping technique. For glottal source waves, statistical mapping is conducted in the cepstrum domain. Morphed speech is re-synthesized by applying an AR filter to the converted glottal source waves, which are themselves re-synthesized using cepstrum-domain conversion. With the proposed morphing system, the continuity of formants and the perceptual differences between a conventional method and the proposed method are confirmed.
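The core idea of combining several speakers' vocal tract shapes linearly can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a lossless-tube model in which AR coefficients are converted to reflection coefficients by the step-down recursion, and each reflection coefficient k is mapped to a log area ratio log((1 - k) / (1 + k)); this sign convention is an assumption. The AR-HMM analysis, the normalization step, and the statistical mapping described in the abstract are omitted, and all function names are hypothetical.

```python
import numpy as np

def ar_to_reflection(a):
    """Step-down recursion: AR polynomial [1, a1..ap] -> reflection coeffs k1..kp."""
    a = np.asarray(a, dtype=float)
    cur = a[1:] / a[0]
    p = len(cur)
    k = np.zeros(p)
    for i in range(p, 0, -1):
        k[i - 1] = cur[i - 1]
        if abs(k[i - 1]) >= 1.0:
            raise ValueError("AR filter is not stable")
        if i > 1:
            # a^(i-1)_j = (a^(i)_j - k_i * a^(i)_{i-j}) / (1 - k_i^2)
            cur = (cur[:i - 1] - k[i - 1] * cur[i - 2::-1]) / (1.0 - k[i - 1] ** 2)
    return k

def reflection_to_ar(k):
    """Step-up recursion: reflection coeffs k1..kp -> AR polynomial [1, a1..ap]."""
    a = np.array([1.0])
    for ki in k:
        ext = np.concatenate([a, [0.0]])
        a = ext + ki * ext[::-1]
    return a

def reflection_to_log_area(k):
    # Log ratio of adjacent tube section areas (assumed sign convention).
    return np.log((1.0 - k) / (1.0 + k))

def log_area_to_reflection(g):
    # Inverse of reflection_to_log_area; |k| < 1 is guaranteed by tanh.
    return -np.tanh(g / 2.0)

def morph_ar(ar_list, weights):
    """Linearly combine several speakers' log area ratios and return AR coeffs."""
    w = np.asarray(weights, dtype=float)
    g = np.stack([reflection_to_log_area(ar_to_reflection(a)) for a in ar_list])
    return reflection_to_ar(log_area_to_reflection(w @ g))

# Two hypothetical speakers, each described by a 3rd-order AR filter for one frame.
speaker_a = reflection_to_ar(np.array([0.5, -0.3, 0.2]))
speaker_b = reflection_to_ar(np.array([-0.4, 0.1, 0.6]))
morphed = morph_ar([speaker_a, speaker_b], [0.5, 0.5])
```

Because the interpolation happens in the log area domain and the result is mapped back through tanh, every morphed reflection coefficient has magnitude below one, so the morphed AR filter stays stable for any weights. Interpolating AR coefficients directly gives no such guarantee.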
Keywords :
cepstral analysis; hidden Markov models; speaker recognition; speech processing; AR filter; AR-HMM speech analysis; cepstrum domain conversion; flexible voice morphing; glottal source wave characteristics; glottal source waves; linear combination; multispeaker vocal tract area functions; phonological identity; statistical mapping; vocal tract resonances; Analytical models; Cepstrum; Estimation; Hidden Markov models; Interpolation; Speech; Training;
Conference_Title :
Signal Processing Conference, 2010 18th European
Conference_Location :
Aalborg