DocumentCode :
2212228
Title :
Speech style conversion based on the statistics of vowel spectrograms and nonlinear frequency mapping
Author :
Takahashi, Toru ; Banno, Hideki ; Irino, Toshio ; Kawahara, Hideki
Author_Institution :
Fac. of Syst. Eng., Wakayama Univ., Wakayama, Japan
fYear :
2006
fDate :
4-8 Sept. 2006
Firstpage :
1
Lastpage :
5
Abstract :
A simple, efficient, and high-quality speech style conversion algorithm based on STRAIGHT is proposed. STRAIGHT, a very high-quality vocoder, consists of an instantaneous-frequency-based F0 and source information extraction part and an F0-adaptive time-frequency smoothing part that eliminates periodicity interference. The proposed method uses only vowel information to design the desired conversion functions and parameters, which reduces the amount of training data required for conversion. The proposed method proceeds as follows: 1) produce abstract spectra, represented on the perceptual frequency axis and derived as the average spectrum for each vowel and each style; 2) decompose the original spectrum into the abstract spectrum and the residual fine structure; 3) replace the abstract spectrum of the original style with that of the target style; 4) map the fine structure with nonlinear frequency warping to adapt it to the fine structure of the target style; 5) add the two together to produce the target speech. An efficient algorithm for this conversion was developed using an orthogonal transformation referred to as warped-DCT. An informal listening test indicated that the proposed method yields more natural and higher-quality speech style conversion than previous methods.
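The five steps listed in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes log-magnitude spectra on a linear frequency grid rather than the perceptual frequency axis, it omits the warped-DCT entirely, and the function names (`abstract_spectrum`, `warp_frequency`, `convert_frame`) and the example warping function are hypothetical choices made only for illustration.

```python
import numpy as np

def abstract_spectrum(log_spectra):
    """Step 1 (assumed form): average log-spectrum over all frames
    of one vowel in one style. Input shape: (frames, bins)."""
    return np.mean(log_spectra, axis=0)

def warp_frequency(values, warp):
    """Resample `values` along a warped frequency axis.
    `warp` maps normalized frequency in [0, 1] to [0, 1] (hypothetical)."""
    grid = np.linspace(0.0, 1.0, len(values))
    return np.interp(warp(grid), grid, values)

def convert_frame(log_spec, src_abstract, tgt_abstract, warp):
    """Convert one frame from the source style to the target style."""
    fine = log_spec - src_abstract            # step 2: residual fine structure
    fine_warped = warp_frequency(fine, warp)  # step 4: nonlinear frequency warping
    return tgt_abstract + fine_warped         # steps 3 and 5: swap abstract spectrum, add

# Toy data standing in for vowel spectrograms of two styles.
rng = np.random.default_rng(0)
src_frames = rng.normal(size=(10, 64))
tgt_frames = rng.normal(size=(10, 64))
src_abs = abstract_spectrum(src_frames)
tgt_abs = abstract_spectrum(tgt_frames)

# Example warp (an arbitrary monotonic map, not taken from the paper).
converted = convert_frame(src_frames[0], src_abs, tgt_abs, lambda f: f ** 1.2)
```

With an identity warp and identical source and target abstract spectra, `convert_frame` returns the input frame unchanged, which is a quick sanity check on the decomposition.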
Keywords :
discrete cosine transforms; speech coding; vocoders; VOCODER STRAIGHT; frequency based F0; high-quality speech style conversion; informal listening test; nonlinear frequency mapping; orthogonal transformation; perceptual frequency axis; periodicity interference; source information extraction; speech style conversion; speech style conversion algorithm; vowel spectrograms statistics; warped-DCT; Abstracts; Databases; Single photon emission computed tomography; Speech; Switches;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Signal Processing Conference, 2006 14th European
Conference_Location :
Florence
ISSN :
2219-5491
Type :
conf
Filename :
7071076