• DocumentCode
    2072422
  • Title

    Combining MAP and MLLR Approaches for SVM Based Speaker Recognition with a Multi-class MLLR Technique

  • Author

    Wang, Haipeng ; Zhang, Xiang ; Xiao, Xiang ; Zhang, Jianping ; Yan, Yonghong

  • Author_Institution
    Inst. of Acoust., Chinese Acad. of Sci., Beijing, China
  • fYear
    2009
  • fDate
    26-28 Dec. 2009
  • Firstpage
    447
  • Lastpage
    450
  • Abstract
    Gaussian mixture models with a universal background model (UBM) have been the standard method for speaker recognition. Typically, maximum a posteriori (MAP) or maximum likelihood linear regression (MLLR) is used to adapt the means of the UBM. Together with the SVM modeling technique, these approaches can achieve excellent performance. MLLR is quite efficient when the amount of adaptation data is limited, but has poor asymptotic properties as the amount of data increases. MAP estimation has nice asymptotic properties, but provides only a moderate improvement when the amount of adaptation data is small. In this paper, in order to take advantage of both approaches to improve the recognition performance, a new approach for speaker adaptation consisting of MAP adaptation followed by MLLR adaptation is presented. This work is enriched by a multi-class MLLR technique, which clusters the Gaussian components into regression classes and applies a different transform to each class. Experiments on the NIST 2006 SRE corpus show that the proposed approach improves on both MLLR and MAP adaptation systems.
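The adaptation order described in the abstract (MAP on the UBM means, then a per-class linear transform) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the common relevance-factor form of mean-only MAP, and the names `map_adapt_means`, `apply_multiclass_mllr`, `relevance`, `class_of` and `transforms` are hypothetical. Estimating the MLLR transforms themselves is omitted; the transforms are taken as given.

```python
import numpy as np

def map_adapt_means(ubm_means, posteriors, frames, relevance=16.0):
    """Mean-only MAP adaptation (relevance-factor form, assumed here).

    ubm_means:  (K, D) UBM component means
    posteriors: (T, K) per-frame component responsibilities
    frames:     (T, D) adaptation feature vectors
    """
    counts = posteriors.sum(axis=0)                  # soft count per Gaussian
    first_order = posteriors.T @ frames              # responsibility-weighted frame sums
    data_means = first_order / np.maximum(counts, 1e-10)[:, None]
    # alpha -> 0 with little data (stay near the UBM), -> 1 with much data
    alpha = counts / (counts + relevance)
    return alpha[:, None] * data_means + (1.0 - alpha)[:, None] * ubm_means

def apply_multiclass_mllr(means, class_of, transforms):
    """Apply mean' = A_c @ mean + b_c, where c is the Gaussian's regression class.

    class_of:   length-K sequence mapping each Gaussian to a class id
    transforms: dict {class id: (A, b)}; placeholder values, not estimated here
    """
    out = np.empty_like(means)
    for k, c in enumerate(class_of):
        A, b = transforms[c]
        out[k] = A @ means[k] + b
    return out
```

The `alpha` weight reflects the trade-off stated in the abstract: with few frames assigned to a Gaussian its mean barely moves under MAP, whereas a shared transform per regression class moves every Gaussian in that class at once.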
  • Keywords
    Gaussian processes; maximum likelihood estimation; speaker recognition; support vector machines; Gaussian mixture models; MLLR; MLLR approaches; SVM based speaker recognition; UBM; combining MAP; maximum a posteriori; maximum likelihood linear regression; multiclass MLLR technique; universal background model; Acoustical engineering; Acoustics; Information science; Kernel; NIST; Speech; Support vector machine classification; support vector machine
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Information Science and Engineering (ISISE), 2009 Second International Symposium on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4244-6325-1
  • Electronic_ISBN
    978-1-4244-6326-8
  • Type

    conf

  • DOI
    10.1109/ISISE.2009.103
  • Filename
    5447271