Turkish Speech Recognition Software with Adaptable Language Model

Author

Osman Buyuk;Ali Haznedaroglu;Levent M. Arslan

Author_Institution

Elektrik ve Elektronik Mühendisliği Bölümü, Boğaziçi Üniversitesi, 34342, Bebek, İstanbul. osman.buyuk@sestek.com.tr

fYear

2007

fDate

6/1/2007 12:00:00 AM

Firstpage

1

Lastpage

4

Abstract

Research on Turkish speech recognition has accelerated recently. As a result, both the speech and text corpora available for recognition experiments and the number of methods proposed to improve accuracy have grown. The agglutinative nature of Turkish causes an out-of-vocabulary (OOV) problem in Large Vocabulary Continuous Speech Recognition (LVCSR) tasks. To overcome the OOV problem, the use of sub-word units has been proposed. In addition to LVCSR experiments, there have been efforts to implement speech recognizers for limited domains such as radiology. In this paper, we present Turkish speech recognition software developed by drawing on these recent studies. We summarize both the software's interface and its recognition accuracy on two test sets: a radiology set and a large-vocabulary set. As a practical solution to the OOV problem, we propose adapting language models using frequent words or sentences. In recognition experiments, word accuracies of 90% and 44% were achieved on the radiology and large-vocabulary test sets, respectively.

Keywords

"Speech recognition","Natural languages","Vocabulary","Radiology","Performance evaluation","Software testing","Acceleration","Text recognition","Software performance"

Publisher

ieee

Conference_Titel

Signal Processing and Communications Applications, 2007. SIU 2007. IEEE 15th

ISSN

2165-0608

Print_ISBN

1-4244-0719-2

Type

conf

DOI

10.1109/SIU.2007.4298561

Filename

4298561