• DocumentCode
    2998753
  • Title
    Automatic labeling system using speaker-dependent phonetic unit references
  • Author
    Makino, Shozo ; Wakita, Hisashi
  • Author_Institution
    Speech Technology Laboratory, Santa Barbara, CA, USA
  • Volume
    11
  • fYear
    1986
  • fDate
    31503
  • Firstpage
    2783
  • Lastpage
    2786
  • Abstract
    This paper describes a new automatic labeling system using speaker-dependent reference patterns for 73 phonetic units in American English. The system segments arbitrary utterances into phonetic units and automatically adapts to a new speaker using a small set of training words. Labeling of the training words begins with the words that can be easily segmented into the necessary phonetic units; reference patterns for each unit are then computed by vector quantization clustering. Using the training reference patterns together with vocalic-consonant information, the speech input is aligned with the transcription by dynamic programming with duration constraints for each phonetic unit. More accurate phonetic boundaries are then obtained using new reference patterns derived from the input speech. The system was evaluated on 15 repetitions of 104 words uttered by two males and one female. The standard deviation of the differences between manually labeled and automatically obtained boundaries ranged from 21 ms to 27 ms. Most of the discrepancies occurred at the boundaries between vowels, nasals, and liquids.
  • Keywords
    Cepstral analysis; Data mining; Dynamic programming; Labeling; Laboratories; Liquids; Reproducibility of results; Speech recognition; Speech synthesis; Vector quantization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '86.
  • Type
    conf
  • DOI
    10.1109/ICASSP.1986.1168617
  • Filename
    1168617