DocumentCode :
1092489
Title :
An approach to segmenting speech into vowel- and nonvowel-like intervals
Author :
Kasuya, Hideki ; Wakita, Hisashi
Author_Institution :
Utsunomiya University, Utsunomiya, Japan
Volume :
27
Issue :
4
fYear :
1979
fDate :
8/1/1979
Firstpage :
319
Lastpage :
327
Abstract :
A speaker-independent algorithm is given for segmenting continuous speech in English into vowel-like (V) and nonvowel-like (NV) intervals. The algorithm has three stages: measurements (parameter extraction), phonetic feature detection, and V/NV decision. In the measurement stage, the broad-band rms energy, the back-to-total cavity volume ratio (BTR), the signed front-to-back maximum area ratio (SFBR), and the normalized high- to low-frequency energy ratio (HLR) are computed. The BTR and SFBR are new parameters derived from linear prediction area functions and are interpreted in terms of the speech spectrum. The BTR is useful for distinguishing nasal segments from V segments, while the SFBR is effective for detecting the bursts of voiced plosives. In phonetic feature detection, three independent types of intervals are detected on the basis of the parameters: silence, preliminary V/NV, and turbulence noise. The V/NV decision stage makes the final V/NV interval decision. Interspeaker differences are handled by normalizing the frequency scale on the basis of an estimated average vocal-tract length. Ten sentences spoken by each of two males and two females resulted in 93.3 percent correct V/NV segment-detection decisions (92.9 percent for design speakers, and 93.7 percent for test speakers).
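As a rough illustration of two of the measurements named in the abstract, the sketch below computes a frame-wise broad-band rms energy and a plain high- to low-frequency energy ratio. It is not the authors' implementation: the function names, the Hann window, the framing scheme, and the `split_hz` band boundary are all assumptions made here for illustration, and the ratio is left unnormalized because the abstract does not describe the paper's normalization.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def rms_energy(frames):
    """Broad-band rms energy of each frame."""
    return np.sqrt(np.mean(frames ** 2, axis=1))


def high_low_ratio(frames, sample_rate, split_hz):
    """Ratio of spectral energy at or above split_hz to energy below it.

    split_hz is a hypothetical band boundary; the abstract gives no value.
    """
    window = np.hanning(frames.shape[1])
    power = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sample_rate)
    high = power[:, freqs >= split_hz].sum(axis=1)
    low = power[:, freqs < split_hz].sum(axis=1)
    eps = 1e-12  # guards against division by zero in silent frames
    return high / (low + eps)
```

On a frame sequence, a low rms value would suggest silence, and a large ratio would suggest energy concentrated at high frequencies, as in turbulence noise; the thresholds used for such decisions are not given in the abstract and are not guessed at here.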
Keywords :
Area measurement; Computer vision; Energy measurement; Feature extraction; Frequency estimation; Frequency measurement; Parameter extraction; Speech; Testing; Volume measurement;
fLanguage :
English
Journal_Title :
Acoustics, Speech and Signal Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
0096-3518
Type :
jour
DOI :
10.1109/TASSP.1979.1163251
Filename :
1163251