DocumentCode
2717880
Title
Generation of Arabic phonetic dictionaries for speech recognition
Author
Ali, Mohamed ; Elshafei, Moustafa ; Al-Ghamdi, Mansour ; Al-Muhtaseb, Husni ; Al-Najjar, Atef
Author_Institution
King Fahd Univ. of Pet. & Miner., Dhahran
fYear
2008
fDate
16-18 Dec. 2008
Firstpage
59
Lastpage
63
Abstract
Phonetic dictionaries are essential components of large-vocabulary, speaker-independent, natural-language speech recognition systems. This paper presents a rule-based technique for generating Arabic phonetic dictionaries for a large-vocabulary speech recognition system. The technique uses classic Arabic pronunciation rules, common pronunciation rules of Modern Standard Arabic, and morphologically driven rules. The paper explains these rules in detail and gives their formal mathematical presentation. The rules were used to generate a dictionary for a 5.4-hour corpus of broadcast news. The phonetic dictionary contains 23,841 definitions corresponding to about 14,232 words. The generated dictionary was evaluated on an actual Arabic speech recognition system, and the pronunciation rules and the phone set were validated by test cases. The Arabic speech recognition system achieves a word error rate of 11.71% on a fully diacritized transcription of about 1.1 hours of Arabic broadcast news.
Keywords
speech processing; speech recognition; Arabic phonetic dictionaries; rule-based technique; vocabulary speech recognition system; Application software; Automatic speech recognition; Broadcasting; Dictionaries; Hidden Markov models; Natural languages; Telephony; Testing; Vocabulary
fLanguage
English
Publisher
IEEE
Conference_Titel
Innovations in Information Technology, 2008. IIT 2008. International Conference on
Conference_Location
Al Ain
Print_ISBN
978-1-4244-3396-4
Electronic_ISBN
978-1-4244-3397-1
Type
conf
DOI
10.1109/INNOVATIONS.2008.4781716
Filename
4781716