Adaptation of a speech recognizer for singing voice

Author

Mesaros, Annamaria ; Virtanen, Tuomas

Author_Institution

Dept. of Signal Process., Tampere Univ. of Technol., Tampere, Finland

fYear

2009

fDate

24-28 Aug. 2009

Firstpage

1779

Lastpage

1783

Abstract

This paper studies speaker adaptation techniques for adapting a speech recognizer to singing voice. Maximum likelihood linear regression (MLLR) techniques are studied, with particular attention to the choice of the number and types of transforms. The performance of the adapted recognizers is measured in terms of phoneme recognition rate and singing-to-lyrics alignment error. The different methods improve the correct recognition rate by up to 10 percentage points compared to the non-adapted system. In singing-to-lyrics alignment, the best mean absolute alignment error obtained is 0.94 seconds, compared to 1.26 seconds for the non-adapted system. Global adaptation was found to provide most of the improvement in performance, with a small further improvement obtained from regression tree adaptation.

Keywords

maximum likelihood estimation; regression analysis; speech recognition; trees (mathematics); maximum likelihood linear regression; phoneme recognition rate; regression tree; singing voice; singing-to-lyrics alignment error; speaker adaptation technique; speech recognizer; Abstracts; Adaptation models; Arctic; Hidden Markov models; Speech; Speech synthesis

fLanguage

English

Publisher

IEEE

Conference_Titel

Signal Processing Conference, 2009 17th European

Conference_Location

Glasgow

Print_ISBN

978-161-7388-76-7

Type

conf

Filename

7077626