Title :
Hybrid baseform builder for phonetic languages
Author :
Kumar, Mohit ; Rajput, Nitendra ; Verma, Ashish
Author_Institution :
IBM India Res. Lab., New Delhi, India
Abstract :
We present a novel technique for automatically building baseforms from spellings in languages that are phonetic. For such languages, rule-based techniques give fairly accurate baseforms, but they leave some language-dependent ambiguities. To handle these, we apply a statistical method that improves the correctness of phonetic spelling builders. The rule-based baseforms are used as a training corpus for improving the system. We also present an alternative method that builds decision trees over the phone context to modify the rule-based baseforms. The novel framework, which generates baseforms by applying spelling-to-sound rules and statistics one after the other, requires only a very small amount of training data. Correction results are presented for the Hindi language baseform builder, and recognition results are presented using the generated baseforms in a Hindi speech recognition task.
Keywords :
decision trees; natural languages; speech processing; speech recognition; statistical analysis; Hindi language baseform builder; Hindi speech recognition task; hybrid baseform builder; phonetic languages; phonetic spelling builders; rule-based baseforms; spelling-to-sound rules; statistical method; training corpus; Acoustics; Decision trees; Loudspeakers; Natural languages; Speech recognition; Speech synthesis; Statistical analysis; Tiles; Tires; Vocabulary
Conference_Title :
Proceedings of 2005 International Conference on Intelligent Sensing and Information Processing, 2005
Print_ISBN :
0-7803-8840-2
DOI :
10.1109/ICISIP.2005.1529481