DocumentCode :
177492
Title :
Effectiveness of PLP-based phonetic segmentation for speech synthesis
Author :
Shah, N.J. ; Vachhani, Bhavik B. ; Sailor, Hardik B. ; Patil, Hemant A.
Author_Institution :
Dhirubhai Ambani Inst. of Inf. & Commun. Technol. (DA-IICT), Gandhinagar, India
fYear :
2014
fDate :
4-9 May 2014
Firstpage :
270
Lastpage :
274
Abstract :
This paper applies a Viterbi-based algorithm and a spectral transition measure (STM)-based algorithm to the task of speech data labeling. Within the STM framework, we propose using several spectral features for phonetic segmentation, namely the recently proposed cochlear filter cepstral coefficients (CFCC), perceptual linear prediction cepstral coefficients (PLPCC), and RelAtive SpecTrAl (RASTA)-based PLPCC, in addition to Mel frequency cepstral coefficients (MFCC). Evaluating these segmentation algorithms directly requires accurate, manually labeled phoneme-level data, which is not available for low-resource languages such as Gujarati (one of the official languages of India). To measure the effectiveness of the segmentation algorithms instead, an HMM-based speech synthesis system (HTS) for Gujarati has been built. Subjective and objective evaluations show that the Viterbi-based algorithm and the STM-based algorithm with PLPCC features work better than the other algorithms.
Keywords :
cepstral analysis; hidden Markov models; maximum likelihood estimation; speech synthesis; CFCC; Gujarati; HMM-based speech synthesis system; HTS; MFCC; RASTA-based PLPCC; STM-based algorithm; Viterbi-based algorithm; cochlear filter cepstral coefficients; mel frequency cepstral coefficients; perceptual linear prediction cepstral coefficients; phonetic segmentation task; relative spectral-based PLPCC; spectral transition measure-based algorithm; speech data labeling; Hidden Markov models; High-temperature superconductors; Mel frequency cepstral coefficient; Signal processing algorithms; Speech; Speech synthesis; Hidden Markov Model (HMM); PLPCC; Spectral Transition Measure (STM);
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
Type :
conf
DOI :
10.1109/ICASSP.2014.6853600
Filename :
6853600