Title :
An iterative bilinear frequency warping approach to robust speaker-independent time synchronization
Author :
Soens, Pieter ; Verhelst, Werner
Author_Institution :
Dept. ETRO-DSSP, Vrije Univ. Brussel, Brussels, Belgium
Abstract :
Vocal Tract Length Normalization is a widely deployed speaker normalization technique that compensates for vocal tract length differences among speakers by appropriately warping the frequency axis of the speech signal. In this work, we study the use of this technique within the time synchronization paradigm. An efficient bilinear frequency warping procedure is proposed, in which the amount of warping is iteratively optimized according to a criterion that is directly related to the output of the standard Dynamic Time Warping algorithm. Subjective listening tests performed on mixed-gender time-aligned results obtained with a subset of data from the English EUROM1 Many Talker Set have shown that the proposed procedure significantly improves the overall speech quality and the time synchronization accuracy with 85% and 91%, respectively.
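The abstract does not give the warping formula or the exact optimization criterion, so the following is only a minimal sketch of the general idea under stated assumptions: the warp is taken to be the standard first-order all-pass (bilinear) frequency map, the criterion is assumed to be the length-normalized accumulated DTW cost, and the iteration is a simple shrinking grid search. All function names and parameters (`bilinear_warp`, `warp_spectra`, `dtw_cost`, `optimize_alpha`, the search range) are hypothetical and not taken from the paper.

```python
import numpy as np

def bilinear_warp(omega, alpha):
    """First-order all-pass (bilinear) map of frequencies in [0, pi], |alpha| < 1."""
    return omega + 2.0 * np.arctan(alpha * np.sin(omega) / (1.0 - alpha * np.cos(omega)))

def warp_spectra(frames, alpha):
    """Resample each spectrum (one row per frame) on the warped frequency axis."""
    omega = np.linspace(0.0, np.pi, frames.shape[1])
    warped = bilinear_warp(omega, alpha)
    # Output bin k takes the input value found at the warped frequency of bin k.
    return np.array([np.interp(warped, omega, f) for f in frames])

def dtw_cost(a, b):
    """Accumulated cost of standard DTW (Euclidean local distance), length-normalized."""
    n, m = len(a), len(b)
    local = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = local[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return acc[n, m] / (n + m)

def optimize_alpha(ref, test, lo=-0.2, hi=0.2, n_grid=9, n_iter=4):
    """Assumed search strategy: repeatedly shrink a grid around the lowest-cost alpha."""
    best = 0.0
    for _ in range(n_iter):
        grid = np.linspace(lo, hi, n_grid)
        costs = [dtw_cost(ref, warp_spectra(test, a)) for a in grid]
        best = float(grid[int(np.argmin(costs))])
        step = (hi - lo) / (n_grid - 1)
        lo, hi = best - step, best + step
    return best
```

Given two sequences of spectral frames, `optimize_alpha(ref, test)` returns the warping factor that makes the warped `test` frames align with `ref` at the lowest DTW cost; a real system would likely operate on cepstral features and use a more refined search.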
Keywords :
iterative methods; optimisation; speaker recognition; synchronisation; English EUROM1 Many Talker Set; bilinear frequency warping procedure; dynamic time warping algorithm; iterative bilinear frequency warping approach; iterative optimization; robust speaker-independent time synchronization; speaker normalization technique; speech quality; speech signal frequency axis warping; subjective listening tests; time synchronization accuracy; time synchronization paradigm; vocal tract length normalization; Accuracy; Robustness; Speech; Speech processing; Synchronization; Vectors; Dynamic Time Warping; Time Synchronization; Vocal Tract Length Normalization;
Conference_Title :
2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)
Conference_Location :
Bucharest
Print_ISBN :
978-1-4673-1068-0