Title :
Detecting child speaker based on auditory feature vectors for VTL estimation
Author :
Nisimura, Ryuichi ; Miyamori, S. ; Okamoto, Eiji ; Kawahara, H. ; Irino, Toshio
Author_Institution :
Wakayama Univ., Wakayama, Japan
Abstract :
We introduce novel auditory features into a hidden Markov model (HMM) system for detecting child speakers. Features derived from the gammachirp auditory filterbank (GCFB) have been shown, both theoretically and experimentally, to be suitable for vocal tract length (VTL) estimation. We performed numerical experiments to distinguish between child and adult speakers using HMMs trained on 2,360 speech samples collected through a web-based query interface, and we compared the performance of the common mel-frequency cepstral coefficients (MFCC) with that of the GCFB-based feature vectors. We also introduced modulation features as a substitute for delta parameters. The experiments demonstrated that the GCFB-based features reduce the error rate in distinguishing a child from an adult. To make our method usable as a web application, we applied our original voice-enabled web framework to the front-end interface of the proposed system.
Keywords :
Internet; audio user interfaces; cepstral analysis; channel bank filters; error statistics; feature extraction; hidden Markov models; query processing; speaker recognition; GCFB-based feature vector; HMM system; MFCC; VTL estimation; Web application; Web-based query interface; adult speaker; auditory feature vector; child speaker detection; delta parameter; error rate; front-end interface; gammachirp auditory filterbank; hidden Markov model; mel-frequency cepstral coefficient; modulation feature; vocal tract length estimation; voice-enabled Web framework; Estimation; Hidden Markov models; Mel frequency cepstral coefficient; Modulation; Speech
Conference_Title :
Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific
Conference_Location :
Hollywood, CA
Print_ISBN :
978-1-4673-4863-8