Title :
The development of isolated words corpus of Pashto for the automatic speech recognition research
Author :
Ahmed, Ishtiaq ; Ahmad, Nafees ; Ali, Hamza ; Ahmad, G.
Author_Institution :
Dept. of Electr. Eng., Univ. of Eng. & Technol., Peshawar, Pakistan
Abstract :
The availability of a standard speech database is of paramount importance in automatic speech recognition (ASR) research, because it provides a baseline for comparing the performance of ASR approaches. This paper presents the development of a medium-vocabulary speech corpus for the Pashto language. The vocabulary comprises 161 isolated Pashto words: the most frequently used words of the language, the names of the days of the week, and the numbers 0 to 25. The words were uttered by 30 speakers of different ages and genders, including both native and non-native speakers of Pashto. The corpus was recorded in a noise-free office environment. The developed corpus is then used to develop an automatic speech recognition system for Pashto.
Keywords :
natural language processing; speaker recognition; vocabulary; ASR; Pashto language; automatic speech recognition; isolated word corpus development; standard speech database; vocabulary encompass; vocabulary speech corpus; Databases; Educational institutions; Feature extraction; MONOS devices; Speech; Human Computer Interaction; Pashto Speech Corpus
Conference_Title :
Robotics and Artificial Intelligence (ICRAI), 2012 International Conference on
Conference_Location :
Rawalpindi
Print_ISBN :
978-1-4673-4884-3
DOI :
10.1109/ICRAI.2012.6413380