Alaryngeal Speech Enhancement Based on One-to-Many Eigenvoice Conversion

Author

Doi, Hironori ; Toda, Tomoki ; Nakamura, Keigo ; Saruwatari, Hiroshi ; Shikano, Kiyohiro

Author_Institution

Grad. Sch. of Inf. Sci., Nara Inst. of Sci. & Technol., Ikoma, Japan

Volume

22

Issue

1

fYear

2014

fDate

Jan. 2014

Firstpage

172

Lastpage

183

Abstract

In this paper, we present novel speaking-aid systems based on one-to-many eigenvoice conversion (EVC) to enhance three types of alaryngeal speech: esophageal speech, electrolaryngeal speech, and body-conducted silent electrolaryngeal speech. Although alaryngeal speech allows laryngectomees to produce speech sounds, it suffers from degraded speech quality and a lack of speaker individuality. To improve the speech quality of alaryngeal speech, alaryngeal-speech-to-speech (AL-to-Speech) methods based on statistical voice conversion have been proposed. In this paper, one-to-many EVC, which can flexibly control the converted voice quality by adapting the conversion model to given target natural voices, is further applied to the AL-to-Speech methods to effectively recover the speaker individuality of each type of alaryngeal speech. The proposed systems are compared with each other from various perspectives. The experimental results demonstrate that the proposed systems effectively address the issues of alaryngeal speech, e.g., by yielding significant improvements in the speech quality of each type of alaryngeal speech.

Keywords

eigenvalues and eigenfunctions; speech enhancement; AL-to-speech methods; EVC; alaryngeal speech enhancement; alaryngeal-speech-to-speech methods; body-conducted silent electrolaryngeal speech; electrolaryngeal speech; esophageal speech; laryngectomees; one-to-many eigenvoice conversion; speaker individuality; speaking-aid systems; speech quality; speech sounds; statistical voice conversion; voice quality; Acoustics; Larynx; Speech; Training; Vectors; Alaryngeal speech; eigenvoice conversion; voice conversion

fLanguage

English

Journal_Title

Audio, Speech, and Language Processing, IEEE/ACM Transactions on

Publisher

IEEE

ISSN

2329-9290

Type

jour

DOI

10.1109/TASLP.2013.2286917

Filename

6645394