Regional Information Center for Science and Technology - Morpheme-Based Language Modeling for Arabic LVCSR

DocumentCode :

454713

Title :

Morpheme-Based Language Modeling for Arabic LVCSR

Author :

Choueiter, Ghinwa ; Povey, Daniel ; Chen, Stanley F. ; Zweig, Geoffrey

Author_Institution :

MIT CS, Cambridge, MA

Volume :

fYear :

2006

fDate :

14-19 May 2006

Abstract :

In this paper, we concentrate on Arabic speech recognition. Taking advantage of the rich morphological structure of the language, we use morpheme-based language modeling to improve the word error rate. We propose a simple constraining method to rid the decoding output of illegal morpheme sequences. We report the results obtained for word and morpheme language models using medium (64 kw) and large (~800 kw) vocabularies, the morpheme LM obtaining an absolute improvement of 2.4% for the former and only 0.2% for the latter. The 2.4% gain surpasses previous gains for morpheme-based LMs for Arabic, and the large vocabulary runs represent the first comparative results for vocabularies of this size for any language. Finally, we analyze the performance of the morpheme LM on word OOVs.

Keywords :

decoding; natural languages; speech coding; speech recognition; Arabic LVCSR; Arabic speech recognition; constraining method; decoding output; morpheme sequences; morpheme-based language modeling; word error rate; Artificial intelligence; Decoding; Error analysis; Laboratories; Natural languages; Paints; Performance analysis; Speech recognition; Testing; Vocabulary;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on

Conference_Location :

Toulouse

ISSN :

1520-6149

Print_ISBN :

1-4244-0469-X

Type :

conf

DOI :

10.1109/ICASSP.2006.1660205

Filename :

1660205

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=454713