DocumentCode :
2180680
Title :
Subsequence similarity language models
Author :
Huerta, Juan M.
Author_Institution :
IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA
fYear :
2011
fDate :
22-27 May 2011
Firstpage :
5580
Lastpage :
5583
Abstract :
In this work we present the Subsequence Similarity Language Model (S2-LM), a new approach to language modeling based on string similarity. As a language model, S2-LM generates scores based on the closest matching string in a very large corpus. We describe the properties and advantages of our approach, as well as efficient methods for carrying out its computation. We also describe an n-best rescoring experiment intended to show that S2-LM can be adjusted to behave as an n-gram SLM.
Keywords :
formal languages; string matching; S2-LM; n-best rescoring experiment; n-gram SLM model; string similarity; subsequence similarity language models; language models; longest common subsequence
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
ISSN :
1520-6149
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2011.5947624
Filename :
5947624