Title :
A polynomial segment model based statistical parametric speech synthesis system
Author :
Sun, Jingwei ; Ding, Feng ; Wu, Yahui
Author_Institution :
Nokia Res., Beijing
Abstract :
In this paper, we present a statistical parametric speech synthesis system based on the polynomial segment model (PSM). As one of the segmental models for speech signals, the PSM explicitly describes the trajectory of the features in a speech segment and preserves the internal dynamics of the segment. In this work, spectral and excitation parameters are modeled by PSMs simultaneously, while the duration of each segment is modeled by a single Gaussian distribution. A top-down K-means clustering technique is applied for model tying. Mean trajectories acquired from the PSMs are used directly to generate speech parameters according to the estimated segment duration. An English speech synthesizer back-end is implemented on the CMU Arctic corpus, and the performance of the new approach is compared with that of the classical HMM-based one. Experimental results show that PSM modeling can achieve naturalness and intelligibility of the synthetic speech similar to those of HMM modeling. The system is at an early stage of its development.
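The generation step described in the abstract (mean trajectories evaluated over the estimated segment duration) can be illustrated with a minimal sketch. It assumes the common PSM formulation in which each feature dimension follows a polynomial in normalized time t in [0, 1]; the abstract does not state the time normalization, the polynomial order, or how the Gaussian duration mean is converted to frames, so those choices, the function names, and all numeric values below are hypothetical.

```python
def mean_trajectory(coeffs, num_frames):
    """Evaluate a polynomial mean trajectory at num_frames points.

    coeffs[r] multiplies t**r, with t running over normalized time [0, 1]
    (an assumed convention, not confirmed by the abstract).
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    trajectory = []
    for n in range(num_frames):
        # A single-frame segment is placed at t = 0 to avoid dividing by zero.
        t = n / (num_frames - 1) if num_frames > 1 else 0.0
        # Horner's rule, starting from the highest-order coefficient.
        value = 0.0
        for b in reversed(coeffs):
            value = value * t + b
        trajectory.append(value)
    return trajectory


def frames_from_duration(mean_duration, min_frames=1):
    """Round a Gaussian duration mean (in frames) to a whole frame count."""
    return max(min_frames, round(mean_duration))


# Hypothetical values for illustration only; none come from the paper.
coeffs = [1.0, 2.0, -1.0]  # b0 + b1*t + b2*t**2 for one feature dimension
num_frames = frames_from_duration(5.2)
trajectory = mean_trajectory(coeffs, num_frames)
```

In this sketch the same coefficients yield a longer or shorter parameter sequence depending only on the frame count, which is one way to read "according to the estimated segment duration".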
Keywords :
Gaussian distribution; polynomials; spectral analysis; speech synthesis; statistical analysis; excitation parameter; polynomial segment model; spectral parameter; statistical parametric speech synthesis system; top-down K-means clustering technique; Acoustical engineering; Acoustics; Hidden Markov models; High temperature superconductors; Laboratories; Speech recognition; Synthesizers; mean trajectory; statistical parametric speech synthesis
Conference_Title :
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location :
Taipei
Print_ISBN :
978-1-4244-2353-8
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2009.4960510