Title : 
Unsupervised speaker normalization using canonical correlation analysis
         
        
            Author : 
Ariki, Yasuo ; Sakuragi, Miharu
         
        
            Author_Institution : 
Dept. of Electron. & Inf., Ryukoku Univ., Ohtsu, Japan
         
        
        
        
        
        
            Abstract : 
Conventional speaker-independent HMMs ignore speaker differences and collect all speech data in a single observation space. As a result, the output probability distributions of the HMMs become vague, which degrades recognition accuracy. To solve this problem, we construct a speaker subspace for each individual speaker and relate the subspaces of the standard speaker and the input speaker by O-space canonical correlation analysis. Supervised normalization requires input speakers to speak the same sentences as the standard speaker. To remove this constraint, we propose an unsupervised speaker normalization method that automatically segments the speech data into phoneme data with the Viterbi decoding algorithm and then associates the mean feature vectors of the phoneme data by O-space canonical correlation analysis. We show that the phoneme recognition rate of this unsupervised method is equivalent to that of the supervised normalization method we previously proposed.
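
The pairing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain canonical correlation analysis in NumPy rather than the paper's O-space variant, assumes that frame-level phoneme labels are already available (in the paper they come from Viterbi decoding), and runs on synthetic data. The names `phoneme_means`, `cca`, `centers`, and `W` are hypothetical and chosen only for illustration.

```python
import numpy as np


def phoneme_means(frames, labels, phoneme_ids):
    """Average the feature vectors of all frames aligned to each phoneme."""
    return np.stack([frames[labels == p].mean(axis=0) for p in phoneme_ids])


def cca(X, Y, reg=1e-6):
    """Plain CCA between paired rows of X and Y.

    Returns projection matrices A (for X), B (for Y) and the canonical
    correlations in descending order. `reg` is a small ridge term
    (an assumption of this sketch) that keeps the covariances invertible.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten both sides with Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # M = Lx^{-1} Sxy Ly^{-T}; its singular values are the canonical correlations.
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    A = np.linalg.solve(Lx.T, U)
    B = np.linalg.solve(Ly.T, Vt.T)
    return A, B, s


# Synthetic stand-in for two speakers (all values here are made up).
rng = np.random.default_rng(0)
n_phonemes, dim = 40, 3
phoneme_ids = np.arange(n_phonemes)
centers = rng.normal(size=(n_phonemes, dim))          # per-phoneme prototypes
W = np.array([[1.0, 0.3, 0.0],
              [0.0, 0.8, 0.2],
              [0.1, 0.0, 1.2]])                       # assumed speaker mismatch

# The two speakers utter different frame sequences; only the labels are shared.
std_labels = np.repeat(phoneme_ids, 20)
inp_labels = rng.permutation(np.repeat(phoneme_ids, 15))
std_frames = centers[std_labels] + 0.05 * rng.normal(size=(std_labels.size, dim))
inp_frames = centers[inp_labels] @ W + 0.05 * rng.normal(size=(inp_labels.size, dim))

# One mean vector per phoneme and speaker, paired by phoneme identity.
std_means = phoneme_means(std_frames, std_labels, phoneme_ids)
inp_means = phoneme_means(inp_frames, inp_labels, phoneme_ids)

A, B, corrs = cca(inp_means, std_means)
print("canonical correlations:", corrs)
```

Because the mean vectors are paired by phoneme label rather than by utterance, the two speakers do not need to have spoken the same sentences, which is the constraint the unsupervised method is meant to remove.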
         
        
            Keywords : 
Viterbi decoding; correlation methods; feature extraction; hidden Markov models; probability; speech processing; speech recognition; unsupervised learning; O-space canonical correlation analysis; Viterbi decoding algorithm; automatic speech data segmentation; input speaker; large vocabulary continuous speech recognition; mean feature vectors; observation space; output probability distribution; phoneme data; phoneme recognition rate; recognition accuracy; sentences; speaker differences; speaker subspace; speaker-independent HMM; standard speaker; supervised normalization method; unsupervised speaker normalization; Algorithm design and analysis; Decoding; Informatics; Information analysis; Probability distribution; Speech analysis; Speech recognition; Viterbi algorithm;
         
        
        
        
            Conference_Title : 
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
         
        
            Conference_Location : 
Seattle, WA
         
        
        
            Print_ISBN : 
0-7803-4428-6
         
        
        
            DOI : 
10.1109/ICASSP.1998.674375