DocumentCode :
1401354
Title :
Automatic recognition and understanding of spoken language - a first step toward natural human-machine communication
Author :
Juang, Biing-hwang ; Furui, Sadaoki
Author_Institution :
AT&T Bell Labs., Murray Hill, NJ, USA
Volume :
88
Issue :
8
Year :
2000
Firstpage :
1142
Lastpage :
1165
Abstract :
The promise of a powerful computing device to help people in productivity as well as in recreation can only be realized with proper human-machine communication. Automatic recognition and understanding of spoken language is the first step toward natural human-machine interaction. Research in this field has produced remarkable results, leading to many exciting expectations and new challenges. We summarize the development of the spoken language technology from both a vertical (chronology) and a horizontal (spectrum of technical approaches) perspective. We highlight the introduction of statistical methods in dealing with language-related problems, as this represents a paradigm shift in the research field of spoken language processing. Statistical methods are designed to allow the machine to learn structural regularities in the speech signal, directly from data, for the purpose of automatic speech recognition and understanding. Research results in spoken language processing have led to a number of successful applications, ranging from dictation software for personal computers and telephone-call processing systems for automatic call routing, to automatic sub-captioning for television broadcasts. We analyze the technical successes that support these applications. Along with an assessment of the state of the art in this broad technical field, we also discuss the limitations of the current technology, and point out the challenges that are ahead. This paper presents an accurate overview of spoken language technology as a basis to inspire future advances.
Keywords :
natural language interfaces; natural languages; reviews; speech recognition; speech-based user interfaces; statistics; Bayes risk; acoustic modelling; acoustic phonetics; articulation; automatic call routing; automatic speech recognition; automatic spoken language recognition; automatic spoken language understanding; automatic sub-captioning; cepstral distance; chronology; continuous speech recognition; detection-based approach; dialogue systems; dictation software; discriminative training; dynamic programming; finite state machine; forward-backward algorithm; generalized phone models; grammar; hidden Markov models; human-machine communication; human-machine interaction; isolated word recognition; language modelling; language structure; linear prediction; maximum a posteriori method; maximum likelihood estimation; natural human-machine communication; noise; overview; paradigm shift; perplexity; personal computers; probability distribution; productivity; pronunciation modelling; recreation; robustness; search algorithms; short-time spectral analysis; signal analysis; speech dictation; speech distortion; speech representation; speech signal structural regularity learning; spoken language processing technology; statistical language processing; statistical methods; statistical pattern recognition; technical approaches; technology limitations; telephone-call processing systems; television broadcasts; Application software; Automatic speech recognition; Design methodology; Man machine systems; Microcomputers; Natural languages; Productivity; Routing; Signal design; Statistical analysis;
Language :
English
Journal_Title :
Proceedings of the IEEE
Publisher :
IEEE
ISSN :
0018-9219
Type :
jour
DOI :
10.1109/5.880077
Filename :
880077