Title :
Improved language modeling for conversational applications using sentence quality
Author :
Epstein, Mark ; Ramabhadran, Bhuvana ; Balchandran, Rajesh
Author_Institution :
Human Languages Technol. Dept., IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
In this paper, we propose a new approach to building language models for conversational systems using a corpus of text as opposed to a live or Wizard-of-Oz collection. Each sentence in the corpus is assigned a “quality” that reflects the developer's intuition for how likely that sentence is to be spoken by a real user to the live system. Language models (LMs) are built for each sentence quality, and these are subsequently interpolated to produce the final model. We have also built a classifier that assigns sentence qualities to the data, and whose subsequent language models achieve similar improvements in word and turn error rate.
Keywords :
natural language interfaces; pattern classification; Wizard-of-Oz collection; classifier; conversational applications; language modeling; sentence quality; text corpus; Assembly; Automatic speech recognition; Error analysis; Filtering; Filters; Humans; Natural languages; Prototypes; Speech recognition; Statistics
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2010.5494938