Title :
New challenges in automatic speech recognition and speech understanding
Author_Institution :
South Australia Univ., The Levels, SA, Australia
Abstract :
Summary form only given. Outstanding work in speech science, signal processing, statistical pattern recognition and computing has produced commercial speech recognition systems for voice-driven computing and word-processing and systems offering spoken access to information via the telephone. Good though these systems are, they achieve their performance by severely restricting the task in one or more of the task dimensions, including user characteristics, vocabulary size and application domain, and the operating environment. Even with these restrictions, real system performance is far from perfect; the "general" speech recognition problem remains a distant goal. This paper reviews the current dominant hidden Markov modelling (HMM) technique used for speech recognition and discusses its main deficiencies, particularly with respect to feature extraction, frame-synchronous stochastic modelling, and the limited language modelling most often employed. The author takes the view that improvements in performance will continue to be based on both technological and linguistic knowledge. To illustrate this, the paper reviews ongoing work aimed at overcoming the deficiencies of the current paradigm and elaborates research on speaker adaptation and accented speech recognition. This work offers considerable potential to improve future systems' performance.
Keywords :
feature extraction; hidden Markov models; linguistics; speech recognition; stochastic processes; HMM; accented speech recognition; automatic speech recognition; frame-synchronous stochastic modelling; hidden Markov modelling; language modelling; linguistic knowledge; research; signal processing; speaker adaptation; speech recognition systems; speech science; speech understanding; spoken access systems; statistical pattern recognition; system performance; telephone; user characteristics; vocabulary size; voice-driven computing; word-processing; Automatic speech recognition; Feature extraction; Hidden Markov models; Pattern recognition; Signal processing; Speech processing; Speech recognition; System performance; Telephony; Vocabulary
Conference_Title :
TENCON '97. IEEE Region 10 Annual Conference. Speech and Image Technologies for Computing and Telecommunications, Proceedings of IEEE
Conference_Location :
Brisbane, Qld., Australia
Print_ISBN :
0-7803-4365-4
DOI :
10.1109/TENCON.1997.647313