• DocumentCode
    2717880
  • Title

    Generation of Arabic phonetic dictionaries for speech recognition

  • Author

    Ali, Mohamed ; Elshafei, Moustafa ; Al-Ghamdi, Mansour ; Al-Muhtaseb, Husni ; Al-Najjar, Atef

  • Author_Institution
    King Fahd Univ. of Pet. & Miner., Dhahran
  • fYear
    2008
  • fDate
    16-18 Dec. 2008
  • Firstpage
    59
  • Lastpage
    63
  • Abstract
    Phonetic dictionaries are essential components of large-vocabulary, speaker-independent natural language speech recognition systems. This paper presents a rule-based technique for generating Arabic phonetic dictionaries for a large-vocabulary speech recognition system. The system uses classic Arabic pronunciation rules, common pronunciation rules of Modern Standard Arabic, and morphologically driven rules. The paper explains these rules in detail and gives their formal mathematical presentation. The rules were used to generate a dictionary for a 5.4-hour corpus of broadcast news. The phonetic dictionary contains 23,841 definitions corresponding to about 14,232 words. The generated dictionary was evaluated on an actual Arabic speech recognition system. The pronunciation rules and the phone set were validated by test cases. The Arabic speech recognition system achieves a word error rate of 11.71% on fully diacritized transcription of about 1.1 hours of Arabic broadcast news.
  • Keywords
    speech processing; speech recognition; Arabic phonetic dictionaries; rule-based technique; vocabulary speech recognition system; Application software; Automatic speech recognition; Broadcasting; Dictionaries; Hidden Markov models; Natural languages; Speech recognition; Telephony; Testing; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Innovations in Information Technology, 2008. IIT 2008. International Conference on
  • Conference_Location
    Al Ain
  • Print_ISBN
    978-1-4244-3396-4
  • Electronic_ISBN
    978-1-4244-3397-1
  • Type

    conf

  • DOI
    10.1109/INNOVATIONS.2008.4781716
  • Filename
    4781716