DocumentCode :
323498
Title :
Unsupervised speaker normalization using canonical correlation analysis
Author :
Ariki, Yasuo ; Sakuragi, Miharu
Author_Institution :
Dept. of Electron. & Inf., Ryukoku Univ., Ohtsu, Japan
Volume :
1
fYear :
1998
fDate :
12-15 May 1998
Firstpage :
93
Abstract :
Conventional speaker-independent HMMs ignore speaker differences and collect speech data in an observation space. As a result, the output probability distributions of the HMMs become vague, which degrades recognition accuracy. To solve this problem, we construct a speaker subspace for each individual speaker and correlate the subspaces of the standard speaker and the input speaker by O-space canonical correlation analysis. To remove the constraint in supervised normalization that input speakers must speak the same sentences as the standard speaker, we propose an unsupervised speaker normalization method that automatically segments the speech data into phoneme data with the Viterbi decoding algorithm and then associates the mean feature vectors of the phoneme data by O-space canonical correlation analysis. We show that the phoneme recognition rate of this unsupervised method is equivalent to that of the supervised normalization method we previously proposed.
Keywords :
Viterbi decoding; correlation methods; feature extraction; hidden Markov models; probability; speech processing; speech recognition; unsupervised learning; O-space canonical correlation analysis; Viterbi decoding algorithm; automatic speech data segmentation; input speaker; large vocabulary continuous speech recognition; mean feature vectors; observation space; output probability distribution; phoneme data; phoneme recognition rate; recognition accuracy; sentences; speaker differences; speaker subspace; speaker-independent HMM; standard speaker; supervised normalization method; unsupervised speaker normalization; Algorithm design and analysis; Decoding; Informatics; Information analysis; Probability distribution; Speech analysis; Speech recognition; Viterbi algorithm;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location :
Seattle, WA
ISSN :
1520-6149
Print_ISBN :
0-7803-4428-6
Type :
conf
DOI :
10.1109/ICASSP.1998.674375
Filename :
674375