A statistical approach to the segmentation and broad classification of continuous speech into phrase-sized information units

Author

Huber, Daniel

Author_Institution

Dept. of Inf. Theory, Chalmers Univ. of Technol., Gothenburg

fYear

1989

fDate

23-26 May 1989

Firstpage

600

Abstract

An algorithm is presented which uses the F₀ tracings of a connected-speech utterance as input and performs speaker-independent segmentation into prosodically defined information units. Two global declination lines are computed by the linear regression method, which approximate the trends in time of the peaks (topline) and valleys (baseline) of F₀ across the utterance. Computation is reiterated every time the Pearson product moment correlation coefficient for these declination lines drops below the preset level of acceptability. Segmentation is thus performed without prior knowledge of higher level linguistic information, with the termination of one unit being determined by the general resetting of the intonation contour wherever in the utterance it may occur. The structure of the algorithm is described and its performance evaluated on three medium-sized Swedish texts read by four native speakers of standard Swedish.
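
The regression-and-restart idea in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names (fit_line, segment_by_declination), the threshold value 0.8, the minimum of three points per fit, and the use of the absolute value of r are all assumptions made for illustration. The sketch handles one sequence of F₀ turning points at a time, so it would be applied separately to the peaks (topline) and to the valleys (baseline); how the record's algorithm combines the two lines is not stated in the abstract.

```python
import math


def fit_line(xs, ys):
    """Least-squares line through (xs, ys).

    Returns (slope, intercept, r), where r is the Pearson
    product-moment correlation coefficient.
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("time values must not all be identical")
    slope = sxy / sxx
    intercept = my - slope * mx
    # A perfectly flat contour has no defined correlation; treat it as 0.
    r = 0.0 if syy == 0 else sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


def segment_by_declination(times, f0_points, min_abs_r=0.8, min_points=3):
    """Return indices at which a new unit is assumed to start.

    times and f0_points describe one sequence of F0 turning points
    (peaks or valleys). The line is refitted each time a point is
    added; when |r| falls below min_abs_r (an assumed threshold),
    the current point is taken as a reset and opens a new unit.
    """
    boundaries = []
    start = 0
    for end in range(len(times)):
        if end - start + 1 < min_points:
            continue  # too few points for a meaningful fit
        _, _, r = fit_line(times[start:end + 1], f0_points[start:end + 1])
        if abs(r) < min_abs_r:
            boundaries.append(end)
            start = end
    return boundaries


if __name__ == "__main__":
    # Hypothetical peak values (Hz): a steady fall, then a reset upward.
    t = [0, 1, 2, 3, 4, 5, 6, 7]
    peaks = [200, 190, 180, 170, 210, 200, 190, 180]
    print(segment_by_declination(t, peaks))
```

In this made-up example the peaks fall steadily until the fifth point jumps back up, at which the fitted line stops describing the data and a boundary is reported at that index. Real F₀ tracings would first need peak and valley extraction, which the sketch leaves out.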

Keywords

speech recognition; F₀ tracings; Swedish; broad classification; continuous speech; correlation coefficient; global declination lines; linear regression method; phrase-sized information units; segmentation; statistical approach; Amorphous materials; Automatic speech recognition; Humans; Modems; Natural languages; Performance evaluation; Signal processing; Speech processing; Speech recognition; Uninterruptible power systems

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1989. ICASSP-89., 1989 International Conference on

Conference_Location

Glasgow

ISSN

1520-6149

Type

conf

DOI

10.1109/ICASSP.1989.266498

Filename

266498