Regional Information Center for Science and Technology - Discriminative Product-of-Expert acoustic mapping for cross-lingual phone recognition

DocumentCode :

2964361

Title :

Discriminative Product-of-Expert acoustic mapping for cross-lingual phone recognition

Author :

Sim, Khe Chai

Author_Institution :

Nat. Univ. of Singapore, Singapore, Singapore

fYear :

2009

fDate :

Nov. 13 2009-Dec. 17 2009

Firstpage :

546

Lastpage :

551

Abstract :

This paper presents a product-of-expert framework to perform probabilistic acoustic mapping for cross-lingual phone recognition. Under this framework, the posterior probabilities of the target HMM states are modelled as the weighted product of experts, where the experts or their weights are modelled as functions of the posterior probabilities of the source HMM states generated by a foreign phone recogniser. Careful choice of these functions leads to the product-of-posterior and posterior weighted product-of-expert models, which can be conveniently represented as 2-layer and 3-layer feed-forward neural networks respectively. Therefore, the commonly used error back-propagation method can be used to discriminatively train the model parameters. Experimental results are presented on the NTIMIT database using the Czech, Hungarian and Russian hybrid NN/HMM recognisers as the foreign phone recognisers to recognise English phones. With only about 15.6 minutes of training data, the best acoustic mapping model achieved 46.00% phone error rate, which is not far behind the 43.55% performance of the NN/HMM system trained directly on the full 3.31 hours of data.
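
As an illustration of the product-of-posterior idea described in the abstract, the following is a minimal sketch and not the author's implementation. It assumes one plausible reading: each target-state posterior is proportional to a product of source-state posteriors raised to learned exponents, which is equivalent to a softmax layer applied to log source posteriors (a 2-layer feed-forward network). The function name `product_of_posteriors`, the bias term, the toy dimensions, and the random parameters are all hypothetical; the exact parametrisation used in the paper is not given in this record.

```python
import numpy as np

def product_of_posteriors(source_post, weights, bias):
    """Map source-state posteriors to target-state posteriors.

    Assumed form (not taken from the paper):
        P(t | o)  is proportional to  exp(b_t) * prod_s P(s | o) ** w_ts
    source_post: shape (S,), posteriors of the source HMM states
    weights:     shape (T, S), exponents w_ts
    bias:        shape (T,), log-domain offsets b_t
    """
    # Work in the log domain so the product becomes a weighted sum.
    log_p = np.log(np.clip(source_post, 1e-10, 1.0))
    logits = weights @ log_p + bias
    # Softmax with max-subtraction for numerical stability.
    logits = logits - logits.max()
    unnorm = np.exp(logits)
    return unnorm / unnorm.sum()

# Toy example with hypothetical sizes: 5 source states, 3 target states.
rng = np.random.default_rng(0)
S, T = 5, 3
source_post = rng.dirichlet(np.ones(S))       # a valid posterior vector
weights = rng.normal(scale=0.5, size=(T, S))  # would be learned in practice
bias = np.zeros(T)

target_post = product_of_posteriors(source_post, weights, bias)

# The same quantity computed directly as a normalised weighted product.
direct = np.exp(bias) * np.prod(source_post ** weights, axis=1)
direct = direct / direct.sum()
```

Because the mapping is a softmax over a linear function of the log posteriors, its parameters could in principle be trained with error back-propagation against target-state labels, which is consistent with the discriminative training the abstract mentions.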

Keywords :

feedforward neural nets; hidden Markov models; linguistics; natural language processing; probability; speech recognition; HMM states; NTIMIT database; cross-lingual phone recognition; discriminative product-of-expert acoustic mapping; error backpropagation; feed-forward neural network; hidden Markov model; posterior probability; posterior weighted product-of-expert model; probabilistic acoustic mapping; product-of-posterior; Backpropagation; Drives; Feedforward neural networks; Feedforward systems; Hidden Markov models; Natural languages; Neural networks; Speech recognition; Target recognition; Training data;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on

Conference_Location :

Merano

Print_ISBN :

978-1-4244-5478-5

Electronic_ISBN :

978-1-4244-5479-2

Type :

conf

DOI :

10.1109/ASRU.2009.5372910

Filename :

5372910

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2964361