Regional Information Center for Science and Technology

DocumentCode :

1467745

Title :

High-performance alphabet recognition

Author :

Loizou, Philipos C. ; Spanias, Andreas S.

Author_Institution :

Dept. of Appl. Sci., Arkansas Univ., Little Rock, AR, USA

Volume :

Issue :

fYear :

1996

fDate :

11/1/1996

Firstpage :

430

Lastpage :

445

Abstract :

Alphabet recognition is needed in many applications for retrieving information associated with the spelling of a name, such as telephone numbers, addresses, etc. This is a difficult recognition task due to the acoustic similarities existing between letters in the alphabet (e.g., the E-set letters). This paper presents the development of a high-performance alphabet recognizer that has been evaluated on studio-quality as well as on telephone-bandwidth speech. Unlike previously proposed systems, the alphabet recognizer presented is based on context-dependent phoneme hidden Markov models (HMMs), which have been found to outperform whole-word models by as much as 8%. The proposed recognizer incorporates a series of new approaches to tackle the problems associated with the confusions occurring between the stop consonants in the E-set and the confusions between the nasals (i.e., letters M and N). First, a new feature representation is proposed for improved stop consonant discrimination, and second, two subspace approaches are proposed for improved nasal discrimination. The subspace approach was found to yield a 45% error-rate reduction in nasal discrimination. Various other techniques are also proposed, yielding a 97.3% speaker-independent performance on alphabet recognition and 95% speaker-independent performance on E-set recognition. A telephone alphabet recognizer was also developed using context-dependent HMMs. When tested on the recognition of 300 last names (which are contained in a list of 50,000 common last names) spelled by 300 speakers, the recognizer achieved 91.7% correct letter recognition with 1.1% letter insertions.

Keywords :

acoustic signal processing; hidden Markov models; speech processing; speech recognition; telephony; E-set letters; HMM; acoustic similarities; addresses; alphabet recognition; context-dependent phoneme; correct letter recognition; error-rate reduction; information retrieval; letter insertions; nasals; speaker-independent performance; spelling; stop consonant discrimination; stop consonants; studio quality speech; subspace approaches; telephone alphabet recognizer; telephone bandwidth speech; telephone numbers; whole-word models; Context modeling; Costs; Hidden Markov models; Humans; Information retrieval; Speech analysis; Speech recognition; Telephony; Testing; Vocabulary

fLanguage :

English

Journal_Title :

Speech and Audio Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1063-6676

Type :

jour

DOI :

10.1109/89.544528

Filename :

544528

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1467745