Regional Information Center for Science and Technology - Audio-visual large vocabulary continuous speech recognition in the broadcast domain

DocumentCode :

3169382

Title :

Audio-visual large vocabulary continuous speech recognition in the broadcast domain

Author :

Basu, S. ; Neti, C. ; Rajput, N. ; Senior, A. ; Subramaniam, L. ; Verma, A.

Author_Institution :

IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA

fYear :

1999

fDate :

1999

Firstpage :

475

Lastpage :

481

Abstract :

This paper considers the problem of combining visual cues with audio signals to improve automatic machine recognition of speech. Although significant progress has been made in large-vocabulary continuous speech recognition (LVCSR) over the last few years, the technology to date is effective mainly under controlled conditions, such as low noise, speaker-dependent recognition, and read (as opposed to conversational) speech. Augmenting speech recognition with visual cues has also attracted researchers' attention over the last couple of years, but most efforts in this domain remain preliminary: unlike LVCSR efforts, tasks have been limited to small vocabularies (e.g. commands, digits) and often to speaker-dependent training or isolated-word speech, where word boundaries are artificially well defined.

Keywords :

audio-visual systems; broadcasting; speech recognition; vocabulary; audio signals; audio-visual large-vocabulary continuous speech recognition; broadcast domain; controlled conditions; machine transcription; noise; read speech; speaker-dependent recognition; visual cues; word boundaries; Automatic speech recognition; Broadcasting; Face detection; Facial features; Feature extraction; Mouth; Psychology; Speech enhancement; Speech recognition; Vocabulary;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Multimedia Signal Processing, 1999 IEEE 3rd Workshop on

Conference_Location :

Copenhagen

Print_ISBN :

0-7803-5610-1

Type :

conf

DOI :

10.1109/MMSP.1999.793893

Filename :

793893

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3169382