DocumentCode :
1908921
Title :
Use of PLP Cepstral Features for Phonetic Segmentation
Author :
Vachhani, Bhavik B. ; Patil, Hemant A.
Author_Institution :
Dhirubhai Ambani Inst. of Inf. & Commun. Technol. (DA-IICT), Gandhinagar, India
fYear :
2013
fDate :
17-19 Aug. 2013
Firstpage :
143
Lastpage :
146
Abstract :
Phonetic segmentation has potential applications in Text-to-Speech (TTS) synthesis and Automatic Speech Recognition (ASR) systems. In this paper, we propose the use of Perceptual Linear Prediction Cepstral Coefficients (PLPCC) as features for the phonetic segmentation task. To detect phonetic boundaries, we use the spectral transition measure (STM). With the proposed approach, we achieve 85% accuracy within a 20 ms agreement duration (3% better than state-of-the-art Mel-Frequency Cepstral Coefficients (MFCC)) and a 15% over-segmentation rate (8% less than MFCC) for automatic boundary detection of 234,925 phone boundaries from the 630 speakers of the entire TIMIT database.
Keywords :
natural language processing; speech synthesis; ASR; MFCC; PLP cepstral features; PLPCC; STM; TIMIT database; TTS; automatic boundary detection; automatic speech recognition systems; mel-frequency cepstral coefficients; perceptual linear prediction cepstral coefficients feature; phone boundaries; phonetic boundaries; phonetic segmentation task; spectral transition measure; text-to-speech synthesis; Accuracy; Databases; Feature extraction; Mel frequency cepstral coefficient; Speech; Training; Phonetic segmentation; mel cepstrum; perceptual linear prediction cepstrum; spectral transition measure; unsupervised approach;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Asian Language Processing (IALP), 2013 International Conference on
Conference_Location :
Urumqi
Type :
conf
DOI :
10.1109/IALP.2013.47
Filename :
6646023