• DocumentCode
    2702205
  • Title
    Physiological Feature Extraction for Text Independent Speaker Identification using Non-Uniform Subband Processing
  • Author
    Xugang Lu ; Jianwu Dang
  • Author_Institution
    Sch. of Inf. Sci., Japan Sci. & Technol. Adv. Inst., Ishikawa, Japan
  • Volume
    4
  • fYear
    2007
  • fDate
    15-20 April 2007
  • Abstract
    The features used for speech recognition should emphasize linguistic information while suppressing speaker differences. For speaker recognition, features should carry more speaker-individual information while attenuating the linguistic information. In most studies, however, the same acoustic features are used for the different tasks of speaker and speech recognition. In this paper, we propose a new physiological feature extraction method which emphasizes individual information for speaker identification. For this purpose, physiological features of speakers were analyzed from the point of view of speech production. It is found that the speaker-individual information is encoded in different frequency regions of the speech sound. The speaker-discriminative information was quantified using Fisher's F-ratio in each frequency region. Based on the F-ratio, we proposed a non-uniform sub-band processing strategy to extract a new feature which can emphasize or refine the physiological aspects involved in speech production. We combined the new feature with GMM for the speaker identification task and applied it to the NTT-VR speaker recognition database. Compared with the MFCC feature, the proposed feature reduced the identification error rate by 20.1%.
  • Keywords
    Gaussian processes; feature extraction; speaker recognition; Fisher F-ratio; GMM; NTT-VR speaker recognition database; identification error rate; nonuniform subband processing; physiological feature extraction; speech recognition; text independent speaker identification; Data mining; Feature extraction; Frequency; Loudspeakers; Refining; Spatial databases; Speaker recognition; Speech analysis; Speech processing; Speech recognition; Speaker identification; non-uniform subband; physiological feature;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
  • Conference_Location
    Honolulu, HI
  • ISSN
    1520-6149
  • Print_ISBN
    1-4244-0727-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.2007.366949
  • Filename
    4218137