Title :
Noise robust estimation of the voice source using a deep neural network
Author :
Airaksinen, Manu ; Raitio, Tuomo ; Alku, Paavo
Author_Institution :
Department of Signal Processing and Acoustics, Aalto University, Espoo, Finland
Abstract :
In the analysis of speech production, information about the voice source can be obtained non-invasively with glottal inverse filtering (GIF) methods. Current state-of-the-art GIF methods can produce high-quality estimates in suitable conditions (e.g. low noise and reverberation), but their performance deteriorates in non-ideal conditions because they require noise-sensitive parameter estimation. This study proposes a method for noise-robust estimation of the voice source that uses a deep neural network (DNN) to learn a mapping between robust low-level speech features and the desired reference, a time-domain glottal flow computed by a GIF method. The method was evaluated with two GIF methods, of which one (quasi closed phase analysis, QCP) requires additional parameter estimation and the other (iterative adaptive inverse filtering, IAIF) does not. The results show that the proposed method outperforms the QCP method at SNRs below 50-20 dB, but outperforms the simple IAIF method only at very low SNRs.
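The mapping described in the abstract, from per-frame speech features to a time-domain glottal flow waveform, can be illustrated with a minimal regression sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic features and targets, the single tanh hidden layer, the layer sizes, and plain full-batch gradient descent on mean squared error. It only shows the general shape of training a network to predict a fixed-length waveform from a feature vector.

```python
import numpy as np

def init_mlp(n_in, n_hidden, n_out, rng):
    """Random initial weights for a one-hidden-layer network (sizes are illustrative)."""
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(n_hidden), (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(params, x):
    """Return hidden activations and the predicted waveform for each frame."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    y = h @ params["W2"] + params["b2"]
    return h, y

def mse(params, x, t):
    """Mean squared error between predicted and reference waveforms."""
    _, y = forward(params, x)
    return float(np.mean((y - t) ** 2))

def train_step(params, x, t, lr):
    """One full-batch gradient descent step on the mean squared error."""
    h, y = forward(params, x)
    dy = 2.0 * (y - t) / y.size          # gradient of the mean over all elements
    dz = (dy @ params["W2"].T) * (1.0 - h ** 2)  # backprop through tanh
    params["W2"] -= lr * (h.T @ dy)
    params["b2"] -= lr * dy.sum(axis=0)
    params["W1"] -= lr * (x.T @ dz)
    params["b1"] -= lr * dz.sum(axis=0)

def demo(seed=0, steps=300, lr=0.2):
    """Train on synthetic data standing in for (features, glottal flow) pairs."""
    rng = np.random.default_rng(seed)
    n_frames, n_feat, n_samples = 200, 10, 32   # hypothetical dimensions
    x = rng.normal(size=(n_frames, n_feat))     # stand-in for speech features
    mix = rng.normal(size=(n_feat, n_samples)) / np.sqrt(n_feat)
    t = np.tanh(x @ mix)                        # stand-in for reference waveforms
    params = init_mlp(n_feat, 16, n_samples, rng)
    initial = mse(params, x, t)
    for _ in range(steps):
        train_step(params, x, t, lr)
    return initial, mse(params, x, t)

if __name__ == "__main__":
    initial, final = demo()
    print(f"MSE before training: {initial:.4f}, after: {final:.4f}")
```

In a real setting the synthetic arrays would be replaced by features extracted from speech and by reference glottal flow estimates from a GIF method; the sketch makes no claim about which features or network configuration the authors used.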
Keywords :
acoustic noise; neural nets; parameter estimation; speech processing; GIF methods; deep neural network; glottal inverse filtering; low level speech features; noise robust estimation; noise sensitive parameter estimation; speech production; voice source; Databases; Estimation; Neural networks; Shape; Signal to noise ratio; Speech; Training; Voice source estimation; noise robustness
Conference_Title :
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
South Brisbane, QLD
DOI :
10.1109/ICASSP.2015.7178950