Abstract:
More than 6000 living languages are spoken in the world today, and the majority of them are concentrated in Asia. Every language has its own acoustic and linguistic characteristics that require special modeling techniques. This talk presents our recent experience in building automatic speech recognition (ASR) systems for the Indonesian, Thai and Chinese languages. For Indonesian, we are building a spoken-query information retrieval (IR) system. To handle the large variation in the pronunciation of proper nouns and English words, we have applied proper noun-specific adaptation in acoustic modeling and rule-based English-to-Indonesian phoneme mapping. For Thai, since the written form has no word boundaries, we have proposed a new method for automatically creating word-like units from a text corpus; to recognize spoken-style utterances, we have applied topic and speaking-style adaptation to the language model. In spoken Chinese, long organization names are often abbreviated, and abbreviated utterances cannot be recognized if the abbreviations are not included in the dictionary. We have proposed a new method for automatically generating Chinese abbreviations, and by expanding the vocabulary with the generated abbreviations, we have significantly improved the performance of voice search. This talk also covers several recent research activities for the Japanese language.