Continuous speech recognition via diphone spotting: a preliminary implementation

Author

Scagliola, Carlo ; Marmi, Luciano

Author_Institution

ELSAG S.p.A., Genova, Italy

Volume

7

fYear

1982

fDate

May 1982

Firstpage

2008

Lastpage

2011

Abstract

The paper describes a preliminary implementation of a continuous speech recognition system based on the concept of diphone spotting. Diphone spotting consists of continuously measuring the similarity between the current portion of the signal pattern and a complete set of selected phonetic events, called diphones because the most significant of them are transitions between pairs of phonemes. The entire set of similarity measures feeds a linguistic decoder that operates on a state-space representation of the language, whose states are the diphones that compose the words of the lexicon. Durational constraints and optional phonological rules are included in the language representation. The linguistic decoder recognizes the sentence as the path through the network that attains the highest cumulative similarity score. The decoder is also responsible for detecting the end of the sentence. A preliminary test on 50 sequences of 3 to 7 connected digits gave recognition rates as high as 99.6% on digits, or 98% on sequences, thus confirming the validity of this approach.
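
The decoding step described in the abstract (choosing the network path with the highest cumulative similarity, subject to durational constraints) can be illustrated with a generic dynamic-programming search. The sketch below is not the authors' decoder: the function name `decode`, the per-frame score dictionaries, and the `min_dur`/`max_dur` tables are all hypothetical, and it assumes the utterance ends at the last frame, whereas the system in the paper detects the end of the sentence itself.

```python
import math

def decode(scores, successors, start_states, final_states, min_dur, max_dur):
    """Find the state sequence with the highest cumulative similarity.

    scores[t][s]  : similarity of frame t to the diphone of state s (assumed input)
    successors[s] : states that may follow s in the language network
    min_dur/max_dur[s] : allowed number of consecutive frames spent in s
    Returns (total_score, [(state, first_frame, end_frame_exclusive), ...]).
    """
    n_frames = len(scores)
    states = list(min_dur)

    # Invert the successor lists so each state knows its predecessors.
    preds = {s: [] for s in states}
    for p, nexts in successors.items():
        for s in nexts:
            preds[s].append(p)

    neg = -math.inf
    # best[t][s]: best score of a path whose last state s ends just before frame t.
    best = [{s: neg for s in states} for _ in range(n_frames + 1)]
    back = [{s: None for s in states} for _ in range(n_frames + 1)]

    for t in range(1, n_frames + 1):
        for s in states:
            seg = 0.0  # similarity accumulated over the last d frames in state s
            for d in range(1, min(max_dur[s], t) + 1):
                seg += scores[t - d][s]
                if d < min_dur[s]:
                    continue  # segment too short for this state
                start = t - d
                if start == 0:
                    # A path may only begin in a designated start state.
                    if s in start_states and seg > best[t][s]:
                        best[t][s], back[t][s] = seg, (0, None)
                else:
                    for p in preds[s]:
                        cand = best[start][p] + seg
                        if cand > best[t][s]:
                            best[t][s], back[t][s] = cand, (start, p)

    # Pick the best final state at the last frame and trace the path back.
    end = max(final_states, key=lambda s: best[n_frames][s])
    if best[n_frames][end] == neg:
        return neg, []
    path, t, s = [], n_frames, end
    while s is not None:
        start, p = back[t][s]
        path.append((s, start, t))
        t, s = start, p
    path.reverse()
    return best[n_frames][end], path
```

With a two-state network in which state "A" may be followed by state "B", calling `decode` on a short list of per-frame score dictionaries returns the total score together with the frame span assigned to each state. Raising `min_dur` for a state forces the search to keep that state active for more frames, which is one simple way to express a durational constraint.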

Keywords

Current measurement; Decoding; Error correction; Feeds; Natural languages; Speech recognition; State-space methods; Testing; Time measurement; Tin;

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '82.

Type

conf

DOI

10.1109/ICASSP.1982.1171844

Filename

1171844