• DocumentCode
    178406
  • Title
    Automatic phonetic segmentation in Mandarin Chinese: Boundary models, glottal features and tone
  • Author
    Yuan, Jiahong; Ryant, Neville; Liberman, Mark
  • Author_Institution
    Linguistic Data Consortium, Univ. of Pennsylvania, Philadelphia, PA, USA
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    2539
  • Lastpage
    2543
  • Abstract
    We conducted experiments on forced alignment in Mandarin Chinese. A corpus of 7,849 utterances was created for the study. Systems differing in their use of explicit phone boundary models, glottal features, and tone information were trained and evaluated on the corpus. Results showed that employing special one-state phone boundary HMMs significantly improved forced alignment accuracy, even when no manual phonetic segmentation was available for training. Spectral features extracted from glottal waveforms (obtained by glottal inverse filtering of the speech waveforms) also improved forced alignment accuracy. Tone-dependent models only slightly outperformed tone-independent models. Without boundary correction, the best system achieved 93.1% agreement with manual segmentation (of phone boundaries) within 20 ms.
  • Keywords
    feature extraction; hidden Markov models; speech recognition; Mandarin Chinese; automatic phonetic segmentation; boundary model; forced alignment accuracy; glottal feature; glottal inverse filtering; glottal waveform; one-state phone boundary HMM model; spectral features extraction; speech waveform; tone information; Accuracy; Acoustics; Manuals; Speech; Training; Forced alignment; glottal features; tone
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6854058
  • Filename
    6854058
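
The abstract reports agreement of phone boundaries within 20 ms against manual segmentation. Below is a minimal sketch of how such a figure is commonly computed. It assumes predicted and manual boundaries are paired one-to-one (as in forced alignment, where the phone sequence is fixed) and that times are given in milliseconds. The function name `boundary_agreement` and the sample values are illustrative and are not taken from the paper.

```python
def boundary_agreement(predicted_ms, reference_ms, tolerance_ms=20):
    """Return the fraction of boundaries whose predicted time lies
    within `tolerance_ms` of the corresponding reference time.

    Assumption: the two sequences are aligned one-to-one, i.e. the
    i-th predicted boundary corresponds to the i-th manual boundary.
    """
    if len(predicted_ms) != len(reference_ms):
        raise ValueError("boundary lists must have the same length")
    if not reference_ms:
        raise ValueError("at least one boundary is required")

    # Count boundaries whose absolute time difference is within tolerance.
    hits = sum(
        1
        for pred, ref in zip(predicted_ms, reference_ms)
        if abs(pred - ref) <= tolerance_ms
    )
    return hits / len(reference_ms)


# Hypothetical boundary times in milliseconds (illustrative values only).
predicted = [100, 255, 410, 600]
manual = [105, 250, 445, 590]

# Differences are 5, 5, 35 and 10 ms, so three of four boundaries agree.
agreement = boundary_agreement(predicted, manual)
print(f"{agreement:.1%}")
```

Integer millisecond times are used here so that the comparison at exactly 20 ms is not affected by floating-point rounding.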