Unsupervised speaker normalization using canonical correlation analysis

Author

Ariki, Yasuo ; Sakuragi, Miharu

Author_Institution

Dept. of Electron. & Inf., Ryukoku Univ., Ohtsu, Japan

Volume

1

fYear

1998

fDate

12-15 May 1998

Firstpage

93

Abstract

Conventional speaker-independent HMMs ignore speaker differences and collect speech data in an observation space. As a result, the output probability distribution of the HMMs becomes vague, which degrades recognition accuracy. To solve this problem, we construct a speaker subspace for each individual speaker and correlate the subspaces of the standard speaker and the input speaker by O-space canonical correlation analysis. Supervised normalization requires input speakers to speak the same sentences as the standard speaker. To remove this constraint, we propose an unsupervised speaker normalization method that automatically segments the speech data into phoneme data with the Viterbi decoding algorithm and then associates the mean feature vectors of the phoneme data by O-space canonical correlation analysis. We show that the phoneme recognition rate of this unsupervised method is equivalent to that of the supervised normalization method we have already proposed.
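
The data-preparation step described in the abstract (averaging feature vectors per phoneme after segmentation, then pairing the means of the two speakers) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names `phoneme_means` and `paired_means` are hypothetical, the per-frame phoneme labels are assumed to come from a Viterbi alignment that is not shown, and the canonical correlation analysis itself is omitted.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def phoneme_means(frames: Sequence[Sequence[float]],
                  labels: Sequence[str]) -> Dict[str, Vector]:
    """Average the feature vectors of all frames aligned to each phoneme.

    `labels[i]` is the phoneme assigned to `frames[i]`; in the paper's
    setting this alignment would come from Viterbi decoding (assumed here).
    """
    if len(frames) != len(labels):
        raise ValueError("frames and labels must have the same length")
    sums: Dict[str, Vector] = {}
    counts: Dict[str, int] = defaultdict(int)
    for vec, lab in zip(frames, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        for i, x in enumerate(vec):
            sums[lab][i] += x
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total]
            for lab, total in sums.items()}


def paired_means(standard: Dict[str, Vector],
                 speaker: Dict[str, Vector]) -> List[Tuple[str, Vector, Vector]]:
    """Pair mean vectors for the phonemes observed for both speakers.

    The resulting pairs are the kind of input a canonical correlation
    analysis between standard and input speaker would take.
    """
    shared = sorted(set(standard) & set(speaker))
    return [(p, standard[p], speaker[p]) for p in shared]
```

Because the pairing is done by phoneme label rather than by sentence, the two speakers do not need to have spoken the same sentences, which is the constraint the unsupervised method is meant to remove.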

Keywords

Viterbi decoding; correlation methods; feature extraction; hidden Markov models; probability; speech processing; speech recognition; unsupervised learning; O-space canonical correlation analysis; Viterbi decoding algorithm; automatic speech data segmentation; input speaker; large vocabulary continuous speech recognition; mean feature vectors; observation space; output probability distribution; phoneme data; phoneme recognition rate; recognition accuracy; sentences; speaker differences; speaker subspace; speaker-independent HMM; standard speaker; supervised normalization method; unsupervised speaker normalization; Algorithm design and analysis; Decoding; Informatics; Information analysis; Probability distribution; Speech analysis; Speech recognition; Viterbi algorithm;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on

Conference_Location

Seattle, WA

ISSN

1520-6149

Print_ISBN

0-7803-4428-6

Type

conf

DOI

10.1109/ICASSP.1998.674375

Filename

674375