Title :
Speaker classification using composite hypothesis testing and list decoding
Author :
Roberts, William J.J. ; Ephraim, Yariv ; Sabrin, Howard W.
Author_Institution :
Atlantic Coast Technol. Inc., Silver Spring, MD, USA
fDate :
3/1/2005
Abstract :
Speaker classification is formulated as a hypothesis testing problem with J simple hypotheses and one composite hypothesis. The simple hypotheses represent target speakers, while the composite hypothesis represents nontarget speakers. The simple hypotheses have well-defined distributions that are estimated from training signals. The distribution of the signal under the composite hypothesis is assumed to belong to a given family. The parameter of that distribution is assumed to be random, with a prior distribution that is estimated from a large set of speakers. This formulation converts the problem to that of testing J+1 simple hypotheses. Signals corresponding to target and nontarget speakers are assumed to be Gaussian mixture processes. Once the system has been trained, list decoding is applied, in which a test signal is associated with a list of possible speakers. The probability that the correct speaker is on the list is maximized for a given average number of incorrect speakers on the list. Results from speaker identification and speaker verification experiments are reported. In speaker identification using a National Institute of Standards and Technology (NIST) database with 174 target speakers, over 77% correct identification was achieved for an average of fewer than two erroneous speakers on the list. Speaker verification experiments on a similar database yielded equal error rates of 6.7% and 10.1% using two decision rules.
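The list-decoding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scalar (1-D) features, equal prior probabilities over the J+1 hypotheses, and a simple posterior-threshold rule for building the list. The function names, the toy models, and the threshold values are all hypothetical. The "nontarget" entry stands in for the composite hypothesis after it has been reduced to a simple one.

```python
import math

def gmm_log_likelihood(frames, weights, means, variances):
    """Total log-likelihood of scalar frames under a 1-D Gaussian mixture."""
    total = 0.0
    for x in frames:
        # Log of each weighted component density at x.
        comps = [
            math.log(w) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for w, m, v in zip(weights, means, variances)
        ]
        # Log-sum-exp over components for numerical stability.
        top = max(comps)
        total += top + math.log(sum(math.exp(c - top) for c in comps))
    return total

def list_decode(frames, models, threshold):
    """Return (sorted list of hypotheses with posterior >= threshold, posteriors).

    Assumes equal priors over all hypotheses (an assumption of this sketch).
    """
    scores = {name: gmm_log_likelihood(frames, *p) for name, p in models.items()}
    top = max(scores.values())
    log_norm = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    posteriors = {name: math.exp(s - log_norm) for name, s in scores.items()}
    chosen = sorted(n for n, p in posteriors.items() if p >= threshold)
    return chosen, posteriors

# Hypothetical toy models: (weights, means, variances) per hypothesis.
models = {
    "spk_a": ([1.0], [0.0], [1.0]),
    "spk_b": ([1.0], [0.5], [1.0]),
    "nontarget": ([0.5, 0.5], [-5.0, 5.0], [4.0, 4.0]),
}
frames = [0.2, 0.3, 0.1]

wide_list, posteriors = list_decode(frames, models, threshold=0.2)
narrow_list, _ = list_decode(frames, models, threshold=0.5)
```

Lowering the threshold lengthens the list, which raises the chance that the correct speaker is on it at the cost of more incorrect entries. That is the trade-off the abstract describes, here controlled by a single hypothetical parameter.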
Keywords :
Bayes methods; Gaussian processes; database management systems; decoding; error statistics; maximum likelihood estimation; speaker recognition; Gaussian mixtures process; J simple hypotheses; a prior distribution; composite hypothesis testing; hypothesis testing; equal-error-rate; list decoding; nontarget speaker; speaker classification; speaker verification; Databases; Hidden Markov models; NIST; Parameter estimation; Random variables; Signal processing; System testing; Composite hypothesis
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
DOI :
10.1109/TSA.2004.838536