DocumentCode :
3634541
Title :
Integrating morphology into automatic speech recognition
Author :
Haşim Sak;Murat Saraçlar;Tunga Güngör
Author_Institution :
Department of Computer Engineering, Bogazici University, TR-34342, Bebek, Istanbul, Turkey
fYear :
2009
Firstpage :
354
Lastpage :
358
Abstract :
This paper proposes a novel approach to integrating morphology as a model into an automatic speech recognition (ASR) system for morphologically rich languages. High out-of-vocabulary (OOV) word rates have been a major challenge for ASR in morphologically productive languages. The standard approach to this problem has been to shift from words to sub-word units in language modeling, so that the only change to the system is the language model estimated over these units. In contrast, we propose to integrate the morphology directly into the search network, like any other knowledge source such as the lexicon and the language model. The morphological parser for a language, implemented as a finite-state lexical transducer, can be considered a computational lexicon. The computational lexicon represents a dynamic vocabulary, in contrast to the static vocabulary generally used for ASR. We compose the transducer for this computational lexicon with a statistical language model over lexical morphemes to obtain a morphology-integrated search network. The resulting search network generates only grammatical word forms and improves recognition accuracy due to the reduced OOV rate. We give experimental results for Turkish broadcast news transcription, and show that the proposed model outperforms the 50 K and 100 K vocabulary word models, while the 200 K vocabulary word model is slightly better.
Keywords :
"Morphology","Automatic speech recognition","Vocabulary","Transducers","Hidden Markov models","Natural languages","Power system modeling","Computer networks","Broadcasting","Error analysis"
Publisher :
IEEE
Conference_Titel :
Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
Print_ISBN :
978-1-4244-5478-5
Type :
conf
DOI :
10.1109/ASRU.2009.5373386
Filename :
5373386