• DocumentCode
    2976657
  • Title

    The development of isolated words corpus of Pashto for the automatic speech recognition research

  • Author

    Ahmed, Ishtiaq ; Ahmad, Nafees ; Ali, Hamza ; Ahmad, G.

  • Author_Institution
    Dept. of Electr. Eng., Univ. of Eng. & Technol., Peshawar, Pakistan
  • fYear
    2012
  • fDate
    22-23 Oct. 2012
  • Firstpage
    139
  • Lastpage
    143
  • Abstract
    A standard speech database is of paramount importance in automatic speech recognition (ASR) research because it provides a baseline for comparing the performance of ASR approaches. This paper presents the development of a medium-vocabulary speech corpus for the Pashto language. The vocabulary comprises 161 isolated words: the most frequently used Pashto words, the names of the days of the week, and the numbers from 0 to 25. The words were uttered by 30 speakers of different ages and genders, including both native and non-native speakers of Pashto. The corpus was recorded in a noise-free office environment. The corpus is then used to develop an automatic speech recognition system for Pashto.
  • Keywords
    natural language processing; speaker recognition; vocabulary; ASR; Pashto language; automatic speech recognition; isolated word corpus development; speaker recognition; standard speech database; vocabulary encompass; vocabulary speech corpus; Automatic speech recognition; Databases; Educational institutions; Feature extraction; MONOS devices; Speech; Automatic Speech Recognition; Human Computer Interaction; Pashto Speech Corpus;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Robotics and Artificial Intelligence (ICRAI), 2012 International Conference on
  • Conference_Location
    Rawalpindi
  • Print_ISBN
    978-1-4673-4884-3
  • Type

    conf

  • DOI
    10.1109/ICRAI.2012.6413380
  • Filename
    6413380