• DocumentCode
    9267
  • Title
    On the Projection of PLLRs for Unbounded Feature Distributions in Spoken Language Recognition
  • Author
    Diez, Mireia ; Varona, Amparo ; Penagarikano, Mike ; Rodriguez-Fuentes, Luis Javier ; Bordel, German
  • Author_Institution
    Dept. of Electr. & Electron., Univ. of the Basque Country, Leioa, Spain
  • Volume
    21
  • Issue
    9
  • fYear
    2014
  • fDate
    Sept. 2014
  • Firstpage
    1073
  • Lastpage
    1077
  • Abstract
    The so-called Phone Log-Likelihood Ratio (PLLR) features have recently been introduced as a novel and effective way of retrieving acoustic-phonetic information in spoken language and speaker recognition systems. In this letter, the PLLR feature space is examined in depth and the multidimensional distribution of these features is analyzed in a language recognition system. The study reveals that PLLR features are confined to a subspace that strongly bounds PLLR distributions. To enhance the information retrieved by the system, PLLR features are projected onto a hyperplane that provides a more suitable representation of the subspace where the features lie. After the projection, PCA is used to decorrelate the features. The gains attained at each step of the proposed approach are outlined and compared to those of a simple PCA projection. Experiments carried out on the NIST 2007, 2009 and 2011 LRE datasets demonstrate the effectiveness of the proposed method, which yields up to a 27% relative improvement over the system based on the original features.
  • Keywords
    feature extraction; maximum likelihood estimation; multidimensional systems; speaker recognition; NIST LRE datasets; PLLR projection; acoustic-phonetic information; language recognition system; multidimensional distribution; phone log-likelihood ratio; spoken language recognition; unbounded feature distributions; Decoding; Mel frequency cepstral coefficient; NIST; Principal component analysis; Vectors; Feature projection; i-vectors; phone log-likelihood ratios
  • fLanguage
    English
  • Journal_Title
    Signal Processing Letters, IEEE
  • Publisher
    IEEE
  • ISSN
    1070-9908
  • Type
    jour
  • DOI
    10.1109/LSP.2014.2324819
  • Filename
    6817523