Regional Information Center for Science and Technology (RICeST) - Mapping frames with DNN-HMM recognizer for non-parallel voice conversion

DocumentCode :

3752072

Title :

Mapping frames with DNN-HMM recognizer for non-parallel voice conversion

Author :

Minghui Dong;Chenyu Yang;Yanfeng Lu;Jochen Walter Ehnes;Dongyan Huang;Huaiping Ming;Rong Tong;Siu Wa Lee;Haizhou Li

Author_Institution :

Human Language Technology Department, Institute for Infocomm Research, A-Star, Singapore

Year :

2015

Firstpage :

488

Lastpage :

494

Abstract :

To convert one speaker's voice to another's, a mapping between corresponding speech segments of the source and target speakers must first be obtained. In parallel voice conversion, dynamic time warping (DTW) is normally used to align the source and target signals. For non-parallel speech data, however, DTW-based mapping does not work. In this paper, we propose using a DNN-HMM recognizer to recognize each frame of both the source and target speech signals. Each frame is then represented by its vector of pseudo-likelihoods, and the similarity between two frames is measured by the distance between their vectors. A clustering method is used to group both source and target frames, and the frame mapping from source to target is established from the clustering result. Experiments show that the proposed method produces conversion results similar to those of parallel voice conversion.
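The frame-mapping idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering method, the distance measure, or the rule for choosing a target frame within a cluster, so everything below is an assumption: plain k-means with Euclidean distance, nearest-neighbour selection inside each cluster, and a fallback to all target frames when a cluster contains none. The function names `kmeans` and `map_frames` are hypothetical, and random Dirichlet vectors stand in for the pseudo-likelihood vectors that a DNN-HMM recognizer would produce.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain k-means with Euclidean distance (an assumed choice).

    Returns (labels, centroids) for the rows of x.
    """
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    labels = np.zeros(len(x), dtype=int)
    for _ in range(iters):
        # Squared distance from every frame to every centroid.
        d = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            members = x[labels == c]
            if len(members) > 0:  # leave an empty cluster's centroid as-is
                centroids[c] = members.mean(axis=0)
    return labels, centroids


def map_frames(src, tgt, k=8):
    """Map each source frame to the index of a target frame.

    Source and target vectors are pooled and clustered together; each
    source frame is then paired with the closest target frame that fell
    into the same cluster (assumed selection rule).
    """
    pooled = np.vstack([src, tgt])
    labels, _ = kmeans(pooled, k)
    src_labels, tgt_labels = labels[:len(src)], labels[len(src):]
    mapping = np.empty(len(src), dtype=int)
    for i, (vec, c) in enumerate(zip(src, src_labels)):
        candidates = np.flatnonzero(tgt_labels == c)
        if len(candidates) == 0:
            # Assumed fallback: search all target frames.
            candidates = np.arange(len(tgt))
        dist = np.linalg.norm(tgt[candidates] - vec, axis=1)
        mapping[i] = candidates[dist.argmin()]
    return mapping


# Placeholder data: 10-dimensional vectors that sum to 1, standing in for
# per-frame recognizer outputs (not real speech features).
rng = np.random.default_rng(1)
src = rng.dirichlet(np.ones(10), size=50)
tgt = rng.dirichlet(np.ones(10), size=60)
mapping = map_frames(src, tgt, k=4)
print(mapping.shape)
```

Because both speakers' frames are described in the same recognizer output space, they can be clustered jointly without time alignment, which is what makes the approach applicable to non-parallel data.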

Keywords :

Decision support systems

Publisher :

IEEE

Conference_Title :

2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

Type :

conf

DOI :

10.1109/APSIPA.2015.7415320

Filename :

7415320

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3752072