DocumentCode :
2970927
Title :
From speech to letters - using a novel neural network architecture for grapheme based ASR
Author :
Eyben, Florian ; Wöllmer, Martin ; Schuller, Björn ; Graves, Alex
Author_Institution :
Inst. for Human-Machine Commun., Tech. Univ. München, Munich, Germany
fYear :
2009
fDate :
Dec. 13 2009-Dec. 17 2009
Firstpage :
376
Lastpage :
380
Abstract :
Mainstream automatic speech recognition systems are based on modelling acoustic sub-word units such as phonemes. Phonemisation dictionaries and language-model-based decoding techniques are applied to transform the phoneme hypotheses into orthographic transcriptions. Direct modelling of graphemes as sub-word units using HMMs has not been successful. We investigate a novel ASR approach using Bidirectional Long Short-Term Memory Recurrent Neural Networks and Connectionist Temporal Classification, which is capable of transcribing graphemes directly and yields results highly competitive with phoneme transcription. In the design of such a grapheme-based speech recognition system, phonemisation dictionaries are no longer required. All that is needed is text transcribed at the sentence level, which greatly simplifies the training procedure. The novel approach is evaluated extensively on the Wall Street Journal 1 corpus.
Keywords :
recurrent neural nets; speech coding; speech recognition; bidirectional long short-term memory recurrent neural networks; connectionist temporal classification; decoding techniques; grapheme based speech recognition system phonemisation dictionaries; language model; main-stream automatic speech recognition systems; orthographic transcriptions; Automatic speech recognition; Computer science; Context modeling; Dictionaries; Hidden Markov models; Man machine systems; Natural languages; Neural networks; Recurrent neural networks; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
Conference_Location :
Merano
Print_ISBN :
978-1-4244-5478-5
Electronic_ISBN :
978-1-4244-5479-2
Type :
conf
DOI :
10.1109/ASRU.2009.5373257
Filename :
5373257