Regional Information Center for Science and Technology - Training data selection for improving discriminative training of acoustic models

DocumentCode :

2769317

Title :

Training data selection for improving discriminative training of acoustic models

Author :

Liu, Shih-Hung ; Chu, Fang-Hui ; Lin, Shih-Hsiang ; Lee, Hung-Shin ; Chen, Berlin

Author_Institution :

Nat. Taiwan Normal Univ., Taipei

fYear :

2007

fDate :

9-13 Dec. 2007

Firstpage :

284

Lastpage :

289

Abstract :

This paper considers training data selection for discriminative training of acoustic models for broadcast news speech recognition. Three novel data selection approaches are proposed. First, the average phone accuracy over all hypothesized word sequences in the word lattice of a training utterance is used for utterance-level data selection. Second, phone-level data selection based on the difference between the expected accuracy of a phone arc and the average phone accuracy of the word lattice is investigated. Finally, frame-level data selection based on the normalized frame-level entropy of Gaussian posterior probabilities obtained from the word lattice is explored. The underlying characteristics of the presented approaches are extensively investigated, and their performance is verified by comparison with standard discriminative training approaches. Experiments conducted on Mandarin broadcast news collected in Taiwan showed that both phone- and frame-level data selection could achieve slight but consistent improvements over the baseline systems at lower training iterations.
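
As a rough illustration of the frame-level criterion mentioned in the abstract, the sketch below computes a normalized entropy over one frame's Gaussian posterior probabilities and keeps frames whose value reaches a threshold. The abstract does not give the exact formula or selection rule, so the normalization by log(N), the direction of the comparison, the threshold value, and the names `normalized_entropy` and `select_frames` are all assumptions made for illustration, not the authors' method.

```python
import math


def normalized_entropy(posteriors):
    """Entropy of a posterior distribution divided by log(N), giving a value in [0, 1].

    Assumption: normalizing by the log of the number of Gaussians; the
    paper's exact normalization is not stated in the abstract.
    """
    total = sum(posteriors)
    if total <= 0 or len(posteriors) < 2:
        return 0.0
    probs = [p / total for p in posteriors]
    # Zero-probability entries contribute nothing to the entropy.
    entropy = sum(-p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(len(probs))


def select_frames(frame_posteriors, threshold=0.5):
    """Return indices of frames whose normalized entropy is >= threshold.

    Hypothetical rule: high-entropy (more ambiguous) frames are kept.
    """
    return [
        t
        for t, posteriors in enumerate(frame_posteriors)
        if normalized_entropy(posteriors) >= threshold
    ]


if __name__ == "__main__":
    # Three toy frames, each with posteriors over two Gaussians.
    frames = [[0.5, 0.5], [1.0, 0.0], [0.9, 0.1]]
    print(select_frames(frames, threshold=0.5))
```

Under this assumed rule, a frame whose posterior mass is spread evenly across Gaussians scores close to 1 and is kept, while a frame dominated by a single Gaussian scores close to 0 and is dropped.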

Keywords :

Gaussian processes; entropy; probability; speech recognition; Gaussian posterior probability; acoustic model; broadcast news speech recognition; discriminative training; frame-level data selection; hypothesized word sequence; normalized frame-level entropy; phone-level data selection; utterance-level data selection; word lattice; Acoustical engineering; Broadcasting; Computer science; Entropy; Hidden Markov models; Lattices; Speech recognition; Support vector machine classification; Support vector machines; Training data; acoustic models; data selection; discriminative training; entropy; speech recognition;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on

Conference_Location :

Kyoto

Print_ISBN :

978-1-4244-1746-9

Electronic_ISBN :

978-1-4244-1746-9

Type :

conf

DOI :

10.1109/ASRU.2007.4430125

Filename :

4430125

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2769317