  • DocumentCode
    134331
  • Title
    A new Neural Network based logistic regression classifier for improving mispronunciation detection of L2 language learners
  • Author
    Wenping Hu; Yao Qian; Frank K. Soong
  • Author_Institution
    Univ. of Sci. & Technol. of China, Hefei, China
  • fYear
    2014
  • fDate
    12-14 Sept. 2014
  • Firstpage
    245
  • Lastpage
    249
  • Abstract
    In this paper, we propose a Neural Network (NN)-based Logistic Regression (LR) classifier for improving the phone mispronunciation detection rate in a Computer-Aided Language Learning (CALL) system. A general neural network with multiple hidden layers for extracting useful speech features is first trained on pooled training data; phone-dependent, 2-class logistic regression classifiers are then trained as individual, phoneme-specific nodes at the output layer. This NN-based classifier streamlines the time-consuming work of training multiple individual classifiers separately, i.e., one per phoneme, and learns a common feature representation via the shared hidden layers. Its improved performance over independently trained, phoneme-specific classifiers is verified on a test database of isolated English words recorded by non-native English learners. Compared with the conventional Goodness of Pronunciation (GOP)-based approach, the NN-based LR classifier improves precision and recall by 37.1% and 11.7% (absolute), respectively. On the same test data, it also outperforms a Support Vector Machine (SVM)-based classifier, which is widely used for mispronunciation detection: at a slightly better precision, recall improves by 10.6% (absolute), a relative improvement of 21.6%.
  • Keywords
    feature extraction; neural nets; regression analysis; signal classification; speech recognition; 2-class logistic regression classifiers; CALL system; English words; GOP-based approach; L2 language learners; NN-based LR classifier; SVM-based classifier; computer-aided language learning; feature representation; goodness-of-pronunciation; mispronunciation detection; neural network based logistic regression classifier; speech feature extraction; support vector machine; Acoustics; Artificial neural networks; Hidden Markov models; Logistics; Support vector machines; Training; CALL; Deep Neural Network; Logistic Regression; Mispronunciation Detection;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
  • Conference_Location
    Singapore
  • Type
    conf
  • DOI
    10.1109/ISCSLP.2014.6936712
  • Filename
    6936712
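
The architecture described in the abstract (hidden layers shared across all phones, with one 2-class logistic regression node per phone at the output) can be sketched roughly as follows. This is only an illustrative forward pass, not the authors' implementation: the class name, layer sizes, sigmoid hidden units, random initialization, phone labels, and the convention that the output is the probability of "mispronounced" are all assumptions made for the example, and training is not shown.

```python
import math
import random


def sigmoid(z):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


class SharedHiddenLRClassifier:
    """Hypothetical sketch: shared hidden layers plus one LR node per phone."""

    def __init__(self, input_dim, hidden_dims, phones, seed=0):
        rng = random.Random(seed)
        # Shared hidden layers: one (weights, biases) pair per layer.
        self.hidden = []
        prev = input_dim
        for width in hidden_dims:
            weights = [[rng.uniform(-0.1, 0.1) for _ in range(prev)]
                       for _ in range(width)]
            self.hidden.append((weights, [0.0] * width))
            prev = width
        # Phone-dependent output nodes: one weight vector and bias per phone.
        self.heads = {
            phone: ([rng.uniform(-0.1, 0.1) for _ in range(prev)], 0.0)
            for phone in phones
        }

    def shared_features(self, x):
        """Pass the input through the hidden layers shared by all phones."""
        h = x
        for weights, biases in self.hidden:
            h = [sigmoid(sum(w * v for w, v in zip(row, h)) + b)
                 for row, b in zip(weights, biases)]
        return h

    def mispronunciation_probability(self, phone, x):
        """Apply the 2-class LR node of the given phone to the shared features."""
        w, b = self.heads[phone]
        h = self.shared_features(x)
        return sigmoid(sum(wi * hi for wi, hi in zip(w, h)) + b)


# Toy usage with made-up phone labels and a made-up 4-dimensional feature vector.
model = SharedHiddenLRClassifier(input_dim=4, hidden_dims=[8, 8], phones=["AE", "TH"])
features = [0.2, -0.1, 0.5, 0.0]
p = model.mispronunciation_probability("TH", features)
```

Because only the small per-phone output node differs between phones, adding a phone in this sketch adds one weight vector rather than a whole separate classifier, which is the saving the abstract attributes to the shared hidden layers.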