Conference record number :
1730
Article title :
Feature Bandwidth Extension for Persian Conversational Telephone Speech Recognition
Title in another language :
Feature Bandwidth Extension for Persian Conversational Telephone Speech Recognition
Authors :
Goodarzi Mohammad Mohsen (author) , Almasganj Farshad (author) , Kabudian Jahanshah (author) , Shekofteh Yasser (author) , Sarraf Rezaei Iman (author)
Number of pages :
4
Keywords :
feature bandwidth extension , Estimation theory , Neural network , conversational telephony speech recognition , Gaussian processes , Speaker Recognition , Gaussian Mixture Model , Neural nets
Publication year :
2012
Conference title :
20th Iranian Conference on Electrical Engineering
Document language :
Persian
Abstract (English) :
The goal of this paper is to configure a complete setup for continuous conversational telephony speech recognition in Persian. For this purpose, two common methods, Gaussian Mixture Model (GMM) and Neural Network (NN), and a proposed hybrid GMM-NN method are considered for estimating full-bandwidth features from band-limited features. The performance of these methods is evaluated with two feature types, the spectral-based LFBE and the cepstral-based MFCC. The effect of speaker gender on the estimation process is also investigated. The results show that the best phoneme recognition accuracy is obtained when MFCC features are reconstructed using two gender-dependent neural networks. In this configuration, phoneme accuracy was about 1.6% higher than the baseline. The tests were run on the TFarsDat corpus.
Conference document number :
4460809
From page :
1
To page :
4