• DocumentCode
    118235
  • Title

    Intrinsic variation robust speaker verification based on sparse representation

  • Author

    Yi Nie ; Mingxing Xu ; Haishu Xianyu

  • Author_Institution
    Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
  • fYear
    2014
  • fDate
    9-12 Dec. 2014
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    Intrinsic variation is one of the major factors that dramatically degrade the performance of speaker verification systems. In this paper, we focus on alleviating the influence of intrinsic variation using sparse representation. Because an over-complete dictionary increases flexibility and the ability to adapt to variable data in signal representation, we expect that the redundancy of the dictionary can help capture the implicit properties of intrinsic variation within each speaker. Both an exemplar dictionary and a learned dictionary are evaluated on an intrinsic variation corpus and compared with GMM-UBM, Joint Factor Analysis (JFA), and i-vector systems. To learn the dictionary, we choose the K-SVD algorithm, a generalization of the K-means algorithm that uses Singular Value Decomposition (SVD). The experimental results show that the two sparse representation systems consistently achieve higher accuracy than the GMM-UBM, JFA, and i-vector systems; in particular, they outperform GMM-UBM by 37.17% and 41.55%, respectively. We also find that the K-SVD based sparse representation system delivers nearly the best performance, achieving an average Equal Error Rate (EER) of 14.23%.
  • Keywords
    Gaussian processes; mixture models; redundancy; signal representation; singular value decomposition; speaker recognition; EER; GMM-UBM; JFA; K-SVD based sparse representation system; K-means algorithm generalization; average equal error rate; dictionary redundancy; i-vector systems; intrinsic variation corpus; intrinsic variation robust speaker verification system; joint factor analysis; Databases; Dictionaries; Mel frequency cepstral coefficient; Robustness; Speech; Training; Vectors; K-SVD; intrinsic variation; sparse representation; speaker verification; speaking style
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Asia-Pacific Signal and Information Processing Association, 2014 Annual Summit and Conference (APSIPA)
  • Conference_Location
    Siem Reap
  • Type

    conf

  • DOI
    10.1109/APSIPA.2014.7041692
  • Filename
    7041692
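
The exemplar-dictionary idea mentioned in the abstract can be illustrated with a minimal sketch. The record does not say which sparse solver, features, or scoring rule the authors used, so everything below is an assumption made for illustration: a plain greedy Orthogonal Matching Pursuit (`omp`), a dictionary `D` whose columns are training vectors tagged by speaker in `labels`, and a hypothetical `verification_score` that measures how much of the sparse code's L1 mass falls on the claimed speaker's atoms.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Greedy Orthogonal Matching Pursuit (illustrative, not the paper's solver).

    Approximates y as a combination of at most n_nonzero columns (atoms) of D
    and returns the coefficient vector, zero everywhere outside the support.
    """
    residual = y.astype(float).copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break  # no new atom improves the fit
        support.append(k)
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    if support:
        x[support] = coef
    return x


def verification_score(D, labels, y, claimed, n_nonzero=5):
    """Hypothetical score: share of the sparse code's L1 mass on the claimed speaker.

    D       : (dim, n_atoms) exemplar dictionary, one training vector per column
    labels  : (n_atoms,) speaker id of each atom
    y       : (dim,) test vector
    claimed : speaker id being verified
    """
    x = omp(D, y, n_nonzero)
    total = np.abs(x).sum()
    if total == 0.0:
        return 0.0
    return float(np.abs(x[np.asarray(labels) == claimed]).sum() / total)
```

A trial would be accepted when the score exceeds a threshold tuned on development data. A learned dictionary (for example, one trained with K-SVD) could be substituted for `D` without changing the scoring function, provided each atom still carries a speaker label.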