DocumentCode :
2329747
Title :
Query language modeling for voice search
Author :
Chelba, C. ; Schalkwyk, J. ; Brants, T. ; Ha, V. ; Harb, B. ; Neveitt, W. ; Parada, C. ; Xu, P.
Author_Institution :
Google, Inc., Mountain View, CA, USA
fYear :
2010
fDate :
12-15 Dec. 2010
Firstpage :
127
Lastpage :
132
Abstract :
The paper presents an empirical exploration of google.com query stream language modeling. We describe the normalization of the typed query stream, resulting in out-of-vocabulary (OoV) rates below 1% for a one-million-word vocabulary. We present a comprehensive set of experiments that guided the design decisions for a voice search service. In the process we rediscovered a lesser-known interaction between Kneser-Ney smoothing and entropy pruning, and found empirical evidence that hints at non-stationarity of the query stream, as well as strong dependence on various English locales: USA, Britain, and Australia.
Keywords :
natural language processing; query languages; query processing; search engines; speech recognition; vocabulary; Australia; Britain; English locales; Kneser-Ney smoothing; USA; design decisions; entropy pruning; google.com; out-of-vocabulary rates; query stream language modeling; voice search; language modeling; query stream
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Spoken Language Technology Workshop (SLT), 2010 IEEE
Conference_Location :
Berkeley, CA
Print_ISBN :
978-1-4244-7904-7
Electronic_ISBN :
978-1-4244-7902-3
Type :
conf
DOI :
10.1109/SLT.2010.5700834
Filename :
5700834