DocumentCode :
3697408
Title :
Audio super-resolution using concatenative resynthesis
Author :
Michael I Mandel;Young Suk Cho
Author_Institution :
Brooklyn College, CUNY, Computer &
fYear :
2015
Firstpage :
1
Lastpage :
5
Abstract :
This paper applies a recently introduced non-linear dictionary-based denoising system to another voice mapping task: transforming low-bandwidth, low-bitrate speech into high-bandwidth, high-quality speech. The system uses a deep neural network as a learned non-linear comparison function to drive unit selection in a concatenative synthesizer built from clean recordings. The network is trained to predict whether a given clean audio segment from the dictionary could be transformed into a given segment of the degraded observation. Speaker-dependent experiments on the small-vocabulary CHiME2-GRID corpus show that the model can resynthesize high-quality clean speech from degraded observations. Preliminary listening tests show that the system improves subjective speech quality ratings by up to 50 percentage points, whereas a similar system based on non-negative matrix factorization and trained on the same data produces no significant improvement.
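To make the selection mechanism described above concrete, here is a minimal, hypothetical Python sketch of the core idea: a learned comparison function scores each clean dictionary segment against a degraded observation segment, and the highest-scoring clean unit is chosen for resynthesis. The tiny MLP, the feature sizes, and the function names (`comparison_score`, `select_units`) are illustrative assumptions, not the authors' implementation, which uses a deep network and a full concatenative synthesizer.

```python
import numpy as np

def comparison_score(clean_seg, degraded_seg, W1, b1, w2, b2):
    """Hypothetical stand-in for the learned comparison function:
    a one-hidden-layer MLP mapping a (clean, degraded) feature pair
    to a match probability. The paper's system uses a deep neural
    network trained on clean/degraded segment pairs."""
    x = np.concatenate([clean_seg, degraded_seg])
    h = np.maximum(0.0, W1 @ x + b1)              # ReLU hidden layer
    return 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))   # sigmoid match score

def select_units(degraded_segments, clean_dictionary, params):
    """Greedy unit selection: for each degraded segment, pick the clean
    dictionary segment the comparison network rates as most likely to
    explain it. A real concatenative system would also account for
    concatenation costs between adjacent selected units."""
    W1, b1, w2, b2 = params
    selected = []
    for d in degraded_segments:
        scores = [comparison_score(c, d, W1, b1, w2, b2)
                  for c in clean_dictionary]
        selected.append(clean_dictionary[int(np.argmax(scores))])
    return selected

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat_dim, hidden = 40, 64  # assumed feature and hidden-layer sizes
    params = (rng.standard_normal((hidden, 2 * feat_dim)) * 0.1,
              np.zeros(hidden),
              rng.standard_normal(hidden) * 0.1,
              0.0)
    clean_dict = [rng.standard_normal(feat_dim) for _ in range(100)]
    degraded = [rng.standard_normal(feat_dim) for _ in range(10)]
    units = select_units(degraded, clean_dict, params)
    print(f"selected {len(units)} clean units for resynthesis")
```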
Keywords :
"Speech","Dictionaries","Neural networks","Speech processing","Bandwidth","Packet loss"
Publisher :
IEEE
Conference_Title :
2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
Type :
conf
DOI :
10.1109/WASPAA.2015.7336890
Filename :
7336890