DocumentCode
2998753
Title
Automatic labeling system using speaker-dependent phonetic unit references
Author
Makino, Shozo ; Wakita, Hisashi
Author_Institution
Speech Technology Laboratory, Santa Barbara, CA, USA
Volume
11
fYear
1986
fDate
Apr 1986
Firstpage
2783
Lastpage
2786
Abstract
This paper describes a new automatic labeling system that uses speaker-dependent reference patterns for 73 phonetic units in American English. The system segments arbitrary utterances into phonetic units and automatically adapts to a new speaker using a small set of training words. Labeling of the training words begins with the words that can be easily segmented into the necessary phonetic units; reference patterns for each unit are then computed by vector quantization clustering. Using the training reference patterns together with vocalic-consonant information, the speech input is aligned with the transcription by dynamic programming with duration constraints for each phonetic unit. More accurate phonetic boundaries are then obtained using new reference patterns derived from the input speech. The system was evaluated on 15 repetitions of 104 words uttered by two males and one female. The standard deviation of the differences between manually labeled and automatically obtained boundaries ranged from 21 ms to 27 ms. Most of the discrepancies occurred at the boundaries between vowels, nasals and liquids.
Keywords
Cepstral analysis; Data mining; Dynamic programming; Labeling; Laboratories; Liquids; Reproducibility of results; Speech recognition; Speech synthesis; Vector quantization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '86.
Type
conf
DOI
10.1109/ICASSP.1986.1168617
Filename
1168617