Artificial stereo data generation for speech feature mapping

Author

Han, Chang Woo ; Kang, Tae Gyoon ; Kang, Shin Jae ; Sung, June Sig ; Kim, Nam Soo

Author_Institution

School of Electrical Engineering, Seoul National University, Seoul, South Korea

fYear

2012

fDate

25-30 March 2012

Firstpage

4897

Lastpage

4900

Abstract

The feature mapping technique is widely used to eliminate the mismatch between the training and test conditions in speech recognition. In feature mapping, a target (mismatched) feature vector stream is mapped closer to the corresponding reference (matched) feature vector stream. The mapping system is usually trained on a set of stereo data, which consists of simultaneous recordings obtained in both the reference and target conditions. In this paper, we propose a novel approach to blind parameter estimation that does not require the reference feature vectors. The proposed approach is motivated by the hidden Markov model (HMM)-based speech synthesis algorithm.

Keywords

hidden Markov models; speech synthesis; HMM-based speech synthesis algorithm; artificial stereo data generation; feature mapping technique; hidden Markov model; reference feature vector stream; speech feature mapping; stereo data; target conditions; Estimation; Speech; Speech processing; Speech recognition; Superluminescent diodes; Vectors; Robust speech recognition; blind estimation; feature mapping

fLanguage

English

Publisher

IEEE

Conference_Title

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Conference_Location

Kyoto

ISSN

1520-6149

Print_ISBN

978-1-4673-0045-2

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2012.6289017

Filename

6289017