Title :
Automatic online text selection for constructing text corpus with custom phonetic distribution
Author :
Vorapatratorn, Surapol ; Suchato, Atiwong ; Punyabukkana, Proadpran
Author_Institution :
Dept. of Comput. Eng., Chulalongkorn Univ., Bangkok, Thailand
Date :
May 30 2012-June 1 2012
Abstract :
The performance of Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) systems depends on an appropriate text corpus. This article describes an automated method for generating a text corpus with a custom phonetic distribution. The distribution is defined by the phoneme types, the corpus size, the minimum required number of phonemes, and the target phonetic distribution. The system obtains text from the Internet by continuously downloading it with a web crawler. A greedy algorithm then selects the sentences that best fit the target phonetic distribution until an appropriate text corpus is established. In the experiment, text from the Large Vocabulary Continuous Speech Recognition (LVCSR) corpus for the Thai language [1] is used to generate the target phonetic distribution. The results show that, as more data is drawn from the Internet, the selected corpus reaches the target phonetic distribution and achieves 99.13% diphone coverage. The resulting text corpus can then be used to build a speech corpus efficiently.
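The greedy selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`distance`, `greedy_select`), the use of an L1 distance as the fitting criterion, and the representation of each candidate as a (sentence, phoneme list) pair are all assumptions made for illustration. The web crawler, sentence segmentation, and the minimum phoneme criterion are not modeled here.

```python
from collections import Counter


def distance(counts, target):
    """L1 distance between normalized phoneme counts and the target distribution.

    The L1 distance is an assumed criterion; the paper may use a different one.
    """
    total = sum(counts.values())
    if total == 0:
        return sum(target.values())
    keys = set(counts) | set(target)
    return sum(abs(counts.get(k, 0) / total - target.get(k, 0.0)) for k in keys)


def greedy_select(candidates, target, corpus_size):
    """Greedily pick up to corpus_size sentences that fit the target distribution.

    candidates: list of (sentence, phoneme_list) pairs (hypothetical format).
    target: dict mapping phoneme -> target relative frequency.
    """
    selected = []
    counts = Counter()
    pool = list(candidates)
    while pool and len(selected) < corpus_size:
        # Choose the sentence whose addition leaves the corpus closest to the target.
        best = min(pool, key=lambda c: distance(counts + Counter(c[1]), target))
        pool.remove(best)
        selected.append(best[0])
        counts.update(best[1])
    return selected, counts


if __name__ == "__main__":
    # Toy phoneme sequences, invented for illustration only.
    candidates = [
        ("s1", ["a", "a"]),
        ("s2", ["a", "b"]),
        ("s3", ["b", "b"]),
    ]
    target = {"a": 0.5, "b": 0.5}
    sentences, counts = greedy_select(candidates, target, corpus_size=1)
    print(sentences, dict(counts))
```

In this toy run the balanced sentence `s2` is chosen first because it matches the target exactly. Each step rescans the remaining pool, so the sketch is quadratic in the number of candidates; a continuously growing pool of crawled text would likely need a more incremental scoring scheme.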
Keywords :
Internet; greedy algorithms; information retrieval; natural languages; speech recognition; text analysis; ASR; LVCSR corpus; TTS; Thai language; Web crawler; automated text corpus generation method; automatic online text selection; automatic speech recognition; corpus size; custom phonetic distribution; data downloading; diphone coverage generation; greedy algorithm; large-vocabulary continuous speech recognition corpus; phoneme minimum criterion number; phoneme types; proper sentence extraction; speech corpus generation; target phonetic distribution; text-to-speech systems; Databases; Equations; Mathematical model; Speech; Vocabulary; online corpus; phonetic; phonetically balanced; sentence segmentation; text selection
Conference_Title :
Computer Science and Software Engineering (JCSSE), 2012 International Joint Conference on
Conference_Location :
Bangkok
Print_ISBN :
978-1-4673-1920-1
DOI :
10.1109/JCSSE.2012.6261916