Title : 
Vocabulary and language model adaptation using just one speech file
         
        
            Author : 
Meng, S. ; Thambiratnam, K. ; Lin, Y. ; Wang, L. ; Li, G. ; Seide, F.
         
        
            Author_Institution : 
5F Beijing Sigma Center, Microsoft Research Asia, Beijing, China
         
        
            Abstract : 
This paper investigates unsupervised vocabulary and language model self-adaptation (VLA) from just one speech file, using the web as a knowledge source and without prior knowledge of topic or domain beyond optional file metadata. Single-file self-adaptation is regularly used for acoustic adaptation but, to date, has rarely been used for VLA. The method investigated here uses a first-pass transcript or file metadata to generate web search queries for retrieving adaptation texts. Various strategies for building queries, retrieving web texts, and maximizing out-of-vocabulary (OOV) recovery while constraining vocabulary growth are examined. Significant improvements are demonstrated for transcribing and searching recorded lectures and telephone calls. The proposed method is orthogonal to acoustic adaptation and system combination and integrates well into multi-pass recognition architectures.
         
        
            Keywords : 
speech recognition; unsupervised learning; vocabulary; acoustic adaptation; knowledge source; multi pass recognition architecture; optional file metadata; out of vocabulary recovery; searching recorded lecture; single file self adaptation; speech file; telephone call; transcribing recorded lecture; vocabulary and language model self adaptation; web search query; Acoustical engineering; Adaptation model; Asia; Humans; Knowledge engineering; Natural languages; Search engines; Speech recognition; Vocabulary; Web search; Language Model Adaptation; Out-Of-Vocabulary (OOV); Spoken Document Retrieval; Unsupervised Adaptation; Vocabulary Adaptation;
         
        
            Conference_Title : 
2010 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
         
        
            Conference_Location : 
Dallas, TX
         
        
            Print_ISBN : 
978-1-4244-4295-9
         
        
            ISSN : 
1520-6149
         
        
            DOI : 
10.1109/ICASSP.2010.5494929