• DocumentCode
    64483
  • Title
    A Family of Discriminative Manifold Learning Algorithms and Their Application to Speech Recognition
  • Author
    Tomar, Vikrant Singh; Rose, Richard C.
  • Author_Institution
    Dept. of Electr. & Comput. Eng., McGill Univ., Montreal, QC, Canada
  • Volume
    22
  • Issue
    1
  • fYear
    2014
  • fDate
    Jan. 2014
  • Firstpage
    161
  • Lastpage
    171
  • Abstract
    This paper presents a family of discriminative manifold learning approaches to feature space dimensionality reduction in noise robust automatic speech recognition (ASR). The specific goal of these techniques is to preserve local manifold structure in feature space while at the same time maximizing the separability between classes of feature vectors. In the manifold space, the relationships among the feature vectors are defined using nonlinear kernels. Two separate distance measures are used to characterize the kernels, namely the conventional Euclidean distance and a cosine-correlation based distance. The performance of the proposed techniques is evaluated on two task domains involving noise corrupted utterances of connected digits and read newspaper text. Performance is compared to existing approaches used for feature space transformations, including linear discriminant analysis (LDA) and locality preserving linear projections (LPP). The proposed approaches are found to provide a significant reduction in word error rate (WER) with respect to the more well-known techniques for a variety of noise conditions. Another contribution of the paper is to quantify the interaction between acoustic noise conditions and the shape and size of local neighborhoods which are used in manifold learning to define local relationships among feature vectors. Based on this analysis, a procedure for reducing the impact of varying acoustic conditions on manifold learning is proposed.
  • Keywords
    computational geometry; feature extraction; graph theory; learning (artificial intelligence); speech recognition; vectors; Euclidean distance; WER; acoustic noise conditions; connected digits; cosine-correlation based distance; discriminative manifold learning algorithms; distance measures; feature space dimensionality reduction; feature vectors; graph embedding; local manifold structure preservation; manifold space; noise corrupted utterances; noise robust automatic speech recognition; nonlinear kernels; read newspaper text; separability maximization; task domains; word error rate; Algorithm design and analysis; Covariance matrices; Kernel; Manifolds; Noise; Speech; Cosine distances; dimensionality reduction; discriminative manifold learning
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    2329-9290
  • Type
    jour
  • DOI
    10.1109/TASLP.2013.2286906
  • Filename
    6645416