Title :
Japanese dictation system using character source modeling
Author :
Yamada, Tomokazu ; Matsunaga, Shouichi ; Shikano, Kiyohiro
Author_Institution :
NTT Human Interface Labs., Tokyo, Japan
Abstract :
The authors describe a Japanese dictation system that uses a stochastic language model based on sequences of Japanese characters. Trigram probabilities, obtained from a text database consisting of Kanji and Kana, are used to construct a source model. A Japanese dictation system generally requires Kana-to-Kanji conversion if it uses a phoneme-based unit for acoustic processing. However, a system that uses a Kanji-and-Kana character source model can generate an output Kanji-and-Kana sequence directly from input speech, without Kana-to-Kanji conversion. The system is tested on 274 phrases uttered by one male speaker and achieves a phrase transcription rate of 58.4%. When the system uses a pronunciation dictionary and eliminates candidates whose Kanji readings are contextually inappropriate, the phrase transcription rate increases to 63.9%. It is confirmed that a Japanese character source model is efficient for a Japanese dictation system.
Keywords :
dictation; speech recognition; Japanese dictation system; Kana; Kanji; character source modeling; phrase transcription rate; pronunciation dictionary; stochastic language model; test-set perplexity; text database; trigram probabilities; Acoustic testing; Context modeling; Databases; Hidden Markov models; Humans; Laboratories; Natural languages; Stochastic systems; System testing
Conference_Title :
Acoustics, Speech, and Signal Processing, 1992. ICASSP-92., 1992 IEEE International Conference on
Conference_Location :
San Francisco, CA
Print_ISBN :
0-7803-0532-9
DOI :
10.1109/ICASSP.1992.225978
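
Illustrative sketch (not part of the original record):
The abstract describes a source model built from character trigram probabilities over mixed Kanji-and-Kana text, used to rank output character sequences. A minimal sketch of that idea follows. The class name `CharTrigramModel`, the padding symbols, the add-one smoothing, and the toy corpus are all assumptions made for illustration; the record does not say how the authors estimated or smoothed their probabilities, and no acoustic scoring is modeled here.

```python
import math
from collections import Counter


class CharTrigramModel:
    """Character trigram model with add-one smoothing.

    Hypothetical illustration only: the smoothing scheme and padding
    symbols are assumptions, not details taken from the paper.
    """

    BOS = "\x02"  # assumed start-of-phrase padding symbol
    EOS = "\x03"  # assumed end-of-phrase symbol

    def __init__(self, texts):
        self.tri = Counter()  # counts of (c1, c2, c3)
        self.bi = Counter()   # counts of the context (c1, c2)
        vocab = {self.EOS}
        for text in texts:
            vocab.update(text)
            chars = self._pad(text)
            for i in range(2, len(chars)):
                self.tri[tuple(chars[i - 2:i + 1])] += 1
                self.bi[tuple(chars[i - 2:i])] += 1
        self.vocab_size = len(vocab)

    def _pad(self, text):
        return [self.BOS, self.BOS] + list(text) + [self.EOS]

    def log_prob(self, text):
        """Smoothed log probability of a Kanji-and-Kana character string."""
        chars = self._pad(text)
        total = 0.0
        for i in range(2, len(chars)):
            num = self.tri[tuple(chars[i - 2:i + 1])] + 1
            den = self.bi[tuple(chars[i - 2:i])] + self.vocab_size
            total += math.log(num / den)
        return total


def best_candidate(model, candidates):
    """Pick the candidate transcription the source model scores highest."""
    return max(candidates, key=model.log_prob)


# Toy corpus standing in for the text database (assumed data).
corpus = ["音声認識", "音声合成", "文字認識"]
model = CharTrigramModel(corpus)
choice = best_candidate(model, ["音性認識", "音声認識"])
```

In a full dictation system the language-model score would be combined with acoustic scores for each candidate; here the candidates are ranked by the character source model alone.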