• DocumentCode
    1092489
  • Title

    An approach to segmenting speech into vowel- and nonvowel-like intervals

  • Author

    Kasuya, Hideki ; Wakita, Hisashi

  • Author_Institution
    Utsunomiya University, Utsunomiya, Japan
  • Volume
    27
  • Issue
    4
  • fYear
    1979
  • fDate
    8/1/1979 12:00:00 AM
  • Firstpage
    319
  • Lastpage
    327
  • Abstract
    A speaker-independent algorithm is given for segmenting continuous speech in English into vowel-like (V) and nonvowel-like (NV) intervals. The algorithm has three stages: measurements (parameter extraction), phonetic feature detection, and V/NV decision. In measurements, the broad-band rms energy, the back-to-total cavity volume ratio (BTR), the signed front-to-back maximum area ratio (SFBR), and the normalized high-to-low frequency energy ratio (HLR) are computed. The BTR and SFBR are new parameters derived from linear prediction area functions and are interpreted in terms of the speech spectrum. The BTR is useful for distinguishing nasal segments from V segments, while the SFBR is effective for detecting the bursts of voiced plosives. In phonetic feature detection, three independent types of intervals are detected on the basis of the parameters: silence, preliminary V/NV, and turbulence noise. The V/NV decision stage accomplishes the final V/NV interval decision. Interspeaker differences are handled by normalizing the frequency scale on the basis of an estimated average vocal-tract length. Ten sentences spoken by each of two males and two females resulted in 93.3 percent correct V/NV segment-detection decisions (92.9 percent for design speakers, and 93.7 percent for test speakers).
  • Keywords
    Area measurement; Computer vision; Energy measurement; Feature extraction; Frequency estimation; Frequency measurement; Parameter extraction; Speech; Testing; Volume measurement;
  • fLanguage
    English
  • Journal_Title
    Acoustics, Speech and Signal Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0096-3518
  • Type

    jour

  • DOI
    10.1109/TASSP.1979.1163251
  • Filename
    1163251