Title :
Bangla Speech-to-Text conversion using SAPI
Author :
Sultana, Shaheena ; Akhand, M.A.H. ; Das, Prodip Kumer ; Rahman, M. M Hafizur
Author_Institution :
Dept. of Comput. Sci. & Eng., Khulna Univ. of Eng. & Technol., Khulna, Bangladesh
Abstract :
Speech is the most natural form of communication and interaction between humans, whereas text and symbols are the most common form of transaction in computer systems. Interest in conversion between speech and text is therefore growing steadily for speech-oriented human-computer interaction. Microsoft Corporation developed the Speech Application Program Interface (SAPI) for speech-related work in its Windows operating systems, but it includes features for only eight languages, including English. The aim of this study is to investigate Speech-to-Text (STT) conversion for the Bangla language using SAPI. Bangla is an important language with a rich heritage; UNESCO has declared 21 February International Mother Language Day in honor of those martyred for the language in Bangladesh in 1952. We configured SAPI to match pronunciations from continuous Bangla speech against a precompiled SAPI grammar file; when a match occurs, SAPI returns the Bangla word in English characters. These words are then used to fetch the corresponding Bangla words from a database, which are returned in true Bangla characters and used to complete the sentences. Providing several English-character spellings for a given Bangla word in the SAPI grammar file was found to overcome tone variation among speakers as well as pronunciation variation across language communities, and was shown to improve the overall performance of the system. An experimental study of the technique was carried out on a newspaper article, and the recognition rate averaged approximately 78%. Although the achieved performance is promising for STT-related studies, we identified several elements whose improvement might yield better accuracy. The theme of this study should also be helpful for Speech-to-Text conversion and similar tasks in other languages.
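The lookup step described in the abstract, in which several English-character spellings resolve to one Bangla word, can be illustrated with a minimal sketch. This is not the authors' implementation: it makes no SAPI calls, an in-memory dictionary stands in for the database, and the table contents, the name `VARIANTS`, and the function `to_bangla` are all hypothetical.

```python
# Hypothetical stand-in for the database: several romanized spellings
# (as a recognizer might return them) map to the same Bangla word.
VARIANTS = {
    "ami": "আমি",
    "aami": "আমি",
    "bangla": "বাংলা",
    "bangala": "বাংলা",
    "bhalobashi": "ভালোবাসি",
    "valobasi": "ভালোবাসি",
}


def to_bangla(recognized_tokens):
    """Replace each romanized token with its Bangla form.

    Tokens that are not in the table are kept unchanged, so an
    unrecognized word stays visible in the output.
    """
    return " ".join(VARIANTS.get(tok.lower(), tok) for tok in recognized_tokens)


if __name__ == "__main__":
    # Two differently spelled inputs resolve to the same Bangla sentence.
    print(to_bangla(["ami", "bangla", "bhalobashi"]))
    print(to_bangla(["aami", "bangala", "valobasi"]))
```

Mapping many spellings to one target keeps the recognizer tolerant of pronunciation differences while the output text stays consistent; in a real system the table would presumably live in the database mentioned in the abstract.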
Keywords :
application program interfaces; grammars; human computer interaction; operating systems (computers); speech recognition; speech synthesis; speech-based user interfaces; Bangla language; English character; International Mother Language day; Microsoft Corporation; SAPI; STT conversion; UNESCO; Windows operating system; grammar file precompilation; speech application program interface; speech oriented human-computer interaction; speech-to-text conversion; Engines; Grammar; Operating systems; Speech; Speech recognition; XML; Human-Computer Interaction; Speech; Text;
Conference_Title :
Computer and Communication Engineering (ICCCE), 2012 International Conference on
Conference_Location :
Kuala Lumpur
Print_ISBN :
978-1-4673-0478-8
DOI :
10.1109/ICCCE.2012.6271216