• DocumentCode
    3397486
  • Title
    Modified estimation of between-class covariance matrix in linear discriminant analysis of speech
  • Author
    Viszlay, Peter ; Juhar, Jozef ; Pleva, Matus
  • Author_Institution
    Dept. of Electron. & Multimedia Commun., Tech. Univ. of Kosice, Kosice, Slovakia
  • fYear
    2013
  • fDate
    7-9 July 2013
  • Firstpage
    167
  • Lastpage
    170
  • Abstract
    Linear discriminant analysis (LDA) is a popular supervised feature transformation applied in current automatic speech recognition (ASR). Generally, the parameters of LDA are computed from the training data partitioned into classes. If the number of classes is smaller than the dimension of the supervectors (typically in phoneme-based LDA), then the between-class covariance matrix can become singular or close to singular (the singularity problem in classical LDA). In this paper, we present a modification of the standard between-class covariance matrix estimation, which represents one of the possible approaches to solving the singularity problem. Our method works directly with the supervectors instead of the class mean vectors. The number of estimation cycles is much larger because more data are used during the computation. Thus, the matrix structure can be significantly refined. This implies that longer contexts can be used while the singularity problem is efficiently eliminated. The effectiveness of the proposed estimation is evaluated in Slovak phoneme-based and triphone-based large vocabulary continuous speech recognition (LVCSR) tasks. The method is compared to the state-of-the-art MFCCs and to LDA trained in the standard way. The experimental results confirm that the modified LDA considerably outperforms the MFCCs and consistently improves on the conventional LDA.
  • Keywords
    covariance matrices; speech recognition; ASR; LDA; LVCSR; MFCC; Slovak phoneme-based large vocabulary continuous speech recognition task; automatic speech recognition; class mean vectors; linear discriminant speech analysis; modified between-class covariance matrix estimation; singularity problem; supervectors; supervised feature transformation; triphone-based large vocabulary continuous speech recognition task; Bismuth; Context; Covariance matrices; Estimation; Hidden Markov models; Training; Vectors;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Signals and Image Processing (IWSSIP), 2013 20th International Conference on
  • Conference_Location
    Bucharest
  • ISSN
    2157-8672
  • Print_ISBN
    978-1-4799-0941-4
  • Type
    conf
  • DOI
    10.1109/IWSSIP.2013.6623480
  • Filename
    6623480
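
The singularity problem mentioned in the abstract can be illustrated with a minimal sketch. It computes the classical between-class scatter matrix from class mean vectors, S_B = sum_k N_k (mu_k - mu)(mu_k - mu)^T, and checks its rank when the number of classes is smaller than the feature dimension. The data are synthetic, and the function name, sizes, and tolerance are assumptions chosen for illustration. This is not the authors' modified estimator, whose exact formula is not given in this record.

```python
import numpy as np


def between_class_scatter(X, y):
    """Classical between-class scatter: sum_k N_k (mu_k - mu)(mu_k - mu)^T."""
    mu = X.mean(axis=0)                      # global mean of all vectors
    dim = X.shape[1]
    S_b = np.zeros((dim, dim))
    for k in np.unique(y):
        X_k = X[y == k]                      # all vectors assigned to class k
        diff = (X_k.mean(axis=0) - mu)[:, None]
        S_b += X_k.shape[0] * (diff @ diff.T)
    return S_b


# Synthetic stand-in for supervectors: 5 classes, 20-dimensional vectors.
# These sizes are illustrative assumptions, not values from the paper.
rng = np.random.default_rng(0)
n_classes, dim, n_per_class = 5, 20, 200
y = np.repeat(np.arange(n_classes), n_per_class)
class_means = rng.normal(size=(n_classes, dim))
X = class_means[y] + rng.normal(size=(y.size, dim))

S_b = between_class_scatter(X, y)

# Rank with an explicit relative tolerance to ignore rounding noise.
tol = 1e-8 * np.linalg.norm(S_b, 2)
rank = np.linalg.matrix_rank(S_b, tol=tol)
print(f"dimension = {dim}, classes = {n_classes}, rank(S_b) = {rank}")
```

Because S_B is a sum of one rank-one term per class and the weighted mean deviations sum to zero, its rank cannot exceed the number of classes minus one. With fewer classes than dimensions the matrix is therefore singular, which is the situation the abstract says the modified estimation is meant to address.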