Regional Information Center for Science and Technology (RICEST)

DocumentCode :

2666493

Title :

Recognizing a voice from its model

Author :

Damiano, Brian ; Kercel, Stephen W. ; Tucker, Raymond W., Jr. ; Brown-VanHoozer, S. Alenka

Author_Institution :

Oak Ridge Nat. Lab., TN, USA

Volume :

fYear :

2000

fDate :

2000

Firstpage :

2216

Abstract :

Investigates a potential solution to the “large-population” speaker identification problem by characterizing a voice by the entailments in two different kinds of models. These entailments are found in the representational models of neuro-linguistic programming (NLP) and in the model of the mechanics of the voice as revealed by the continuous wavelet transform (CWT). Results to date have been obtained from examining samples in the TIMIT database and human subjects. Local features correlated with individual speakers for selected vowel sounds have been found in the CWT space. Features of NLP representation systems have also been found and are compared with voice features for speakers whose NLP representation systems are known a priori. Gaussian mixture models are used to calculate probability density functions from the local feature distributions. This speaker identification strategy combines three elements of novelty. First, it exploits the fact that the 2D CWT of a 1D signal can be interpreted as an image, and can thus use feature extraction techniques first developed for image processing. Second, voice waveforms are systematically studied to identify features that are attributed to the speaker's mental representation. Third, the reliability of the identification is strengthened by combining entailments from these two completely different aspects of the speaker's identity: the mechanical aspects of the speaker's vocal tract and the pattern of representation
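
The abstract's first point, that the CWT of a 1D signal is a 2D array that can be treated as an image, can be illustrated with a minimal sketch. This is not the authors' implementation: the Morlet mother wavelet, the `w0` parameter, the chosen scales, the synthetic test tone, and the function names `morlet`, `cwt`, and `scalogram` are all assumptions made for illustration.

```python
import cmath
import math


def morlet(t, w0=6.0):
    """Complex Morlet mother wavelet (assumed here; small correction term omitted)."""
    return math.pi ** -0.25 * cmath.exp(1j * w0 * t) * math.exp(-0.5 * t * t)


def cwt(signal, scales, dt=1.0, w0=6.0):
    """Direct-sum CWT: W(s, b) = (1/sqrt(s)) * sum_t x(t) * conj(psi((t - b) / s)) * dt.

    Returns a list of rows; each row corresponds to one scale and each
    column to one time shift b, so the result has a 2D, image-like layout.
    """
    n = len(signal)
    rows = []
    for s in scales:
        # Truncate the wavelet at +/- 4 scale widths, where the Gaussian is negligible.
        half = int(math.ceil(4.0 * s / dt))
        offsets = range(-half, half + 1)
        kernel = [morlet(k * dt / s, w0).conjugate() / math.sqrt(s) for k in offsets]
        row = []
        for b in range(n):
            acc = 0j
            for j, k in enumerate(offsets):
                idx = b + k
                if 0 <= idx < n:  # samples outside the signal are treated as zero
                    acc += signal[idx] * kernel[j]
            row.append(acc * dt)
        rows.append(row)
    return rows


def scalogram(signal, scales, dt=1.0, w0=6.0):
    """Magnitude of the CWT: a real-valued 2D array (scale x time)."""
    return [[abs(c) for c in row] for row in cwt(signal, scales, dt, w0)]


# Synthetic 1D test signal (hypothetical): a tone at 0.05 cycles per sample.
n = 256
tone = [math.sin(2 * math.pi * 0.05 * k) for k in range(n)]

# For a Morlet wavelet, scale s responds most strongly near frequency w0 / (2*pi*s),
# so a scale near 6 / (2*pi*0.05), roughly 19, should dominate.
scales = [5.0, 10.0, 19.0, 40.0]
image = scalogram(tone, scales)

mid = n // 2
peak_row = max(range(len(scales)), key=lambda i: image[i][mid])
print(len(image), len(image[0]), scales[peak_row])
```

Because `image` is an ordinary scale-by-time array, standard image-processing feature extractors can be applied to it directly. The direct summation is chosen for readability; an FFT-based convolution would be the usual choice for long recordings.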

Keywords :

feature extraction; linguistics; probability; psychology; speaker recognition; speech; wavelet transforms; 1D signal; 2D continuous wavelet transform; Gaussian mixture models; TIMIT database; expectation maximization algorithm; feature extraction techniques; image interpretation; image processing; large-population speaker identification; local feature correlation; local feature distributions; model entailments; neurolinguistic programming; probability density functions; reliability; representational models; speaker mental representation; vocal mechanics; vocal tract; voice features; voice recognition; voice waveforms; vowel sounds; Continuous wavelet transforms; Feature extraction; Humans; Image processing; Loudspeakers; Probability density function; Signal processing; Spatial databases; Speech recognition; Wavelet transforms;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Systems, Man, and Cybernetics, 2000 IEEE International Conference on

Conference_Location :

Nashville, TN

ISSN :

1062-922X

Print_ISBN :

0-7803-6583-6

Type :

conf

DOI :

10.1109/ICSMC.2000.886445

Filename :

886445

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2666493