DocumentCode :
3245952
Title :
Effectiveness of the backoff hierarchical class n-gram language models to model unseen events in speech recognition
Author :
Zitouni, Imed ; Kuo, Hong-Kwang Jeff
Author_Institution :
Lucent Technol. Bell Labs., Murray Hill, NJ, USA
fYear :
2003
fDate :
30 Nov.-3 Dec. 2003
Firstpage :
560
Lastpage :
565
Abstract :
Backoff hierarchical class n-gram language models use a class hierarchy to define an appropriate context. Each node in the hierarchy is a class containing all the words of its descendant nodes (classes). The closer a node is to the root, the more general the corresponding class and, consequently, the context. We demonstrate experimentally the effectiveness of the backoff hierarchical class n-gram language modeling approach for modeling unseen events in speech recognition: it improves over regular backoff n-gram models. We also study the performance of this approach on vocabularies of different sizes and investigate the impact of the hierarchy depth on model performance. Performance is reported on several databases, including Switchboard, CallHome, and Wall Street Journal (WSJ). Experiments on the Switchboard and CallHome databases, whose test sets contain few unseen events, show up to 6% improvement in unseen-event perplexity with a vocabulary of 16,800 words. With a relatively large number of unseen events on the WSJ test corpus, and using two vocabulary sets of 5,000 and 20,000 words, we obtain up to 26% improvement in unseen-event perplexity and up to 12% improvement in WER when a backoff hierarchical class trigram language model is used on an ASR test set. These results confirm that the improvement grows as the number of unseen events increases.
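The backoff idea the abstract describes can be sketched in a few lines: when an n-gram context is unseen, climb the class hierarchy toward the root, replacing the context word with ever more general classes until observed counts are found. The toy hierarchy, bigram order, and floor probability below are illustrative assumptions for clarity; the paper's actual estimator uses trigrams and proper discounting.

```python
# Illustrative sketch of backoff through a word-class hierarchy.
# The hierarchy, corpus, and floor value are made-up examples,
# not the authors' exact model.
from collections import defaultdict

# Each word maps to a chain of increasingly general classes,
# ending at the root class "<ROOT>".
HIERARCHY = {
    "cat":  ["feline", "animal", "<ROOT>"],
    "dog":  ["canine", "animal", "<ROOT>"],
    "runs": ["verb", "<ROOT>"],
    "eats": ["verb", "<ROOT>"],
}

def train_bigram_counts(corpus):
    """Count bigrams under the exact word context and under every
    generalization (class) of that context."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in corpus:
        for prev, word in zip(sentence, sentence[1:]):
            for ctx in [prev] + HIERARCHY.get(prev, []):
                counts[ctx][word] += 1
    return counts

def backoff_prob(counts, prev, word):
    """P(word | prev): use the most specific context with an observed
    count, backing off up the class hierarchy for unseen events."""
    for ctx in [prev] + HIERARCHY.get(prev, ["<ROOT>"]):
        total = sum(counts[ctx].values())
        if total > 0 and counts[ctx][word] > 0:
            return counts[ctx][word] / total
    return 1e-6  # floor for events unseen even at the root class

corpus = [["cat", "runs"], ["cat", "eats"], ["dog", "runs"]]
counts = train_bigram_counts(corpus)
# "dog eats" is unseen as a word bigram, but the "animal" class
# context generalizes from "cat eats", so it still gets mass.
```

The key point, matching the abstract, is that an unseen event is never assigned probability from a uniform fallback alone: the class hierarchy supplies progressively more general, but still informative, contexts first.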
Keywords :
error statistics; hierarchical systems; natural languages; pattern classification; speech recognition; ASR; WER; Wall Street Journal database; backoff hierarchical class n-gram language models; call-home database; hierarchy depth; speech recognition; switchboard database; trigram language model; unseen events perplexity; word error rate; Appropriate technology; Automatic speech recognition; Context modeling; Databases; Engines; Frequency estimation; Natural languages; Speech recognition; Testing; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
2003 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU '03)
Print_ISBN :
0-7803-7980-2
Type :
conf
DOI :
10.1109/ASRU.2003.1318501
Filename :
1318501