DocumentCode :
653717
Title :
Singing voice identification and lyrics transcription for music information retrieval (invited paper)
Author :
Mesaros, Annamaria
Author_Institution :
Dept. of Signal Process. & Acoust., Aalto Univ., Espoo, Finland
fYear :
2013
fDate :
16-19 Oct. 2013
Firstpage :
1
Lastpage :
10
Abstract :
This paper presents an overview of methods and applications dealing with analysis of singing voice audio signals, related to singer identity and lyrics content of the singing. Singer identification in polyphonic music is based on general audio classification methods. The presence of instruments is detrimental to voice identification performance, and eliminating the effect of instrumental accompaniment is an important aspect of the problem. The results show that classification of singing voices can be done robustly in polyphonic music when using source separation. Lyrics transcription is approached as a speech recognition problem, with specific elements for dealing with singing voice. The variability of phonation in singing poses a significant challenge to the speech recognition approach. The word recognition accuracy of the lyrics transcription from singing is quite low, but it is shown to be useful in a query-by-singing application, for performing a textual search based on the words recognized from the query. A system for automatic alignment of lyrics and audio is also presented, with sufficient performance for facilitating applications such as automatic karaoke annotation or song browsing.
Keywords :
audio signal processing; music; query processing; signal classification; source separation; speech recognition; automatic karaoke annotation; general audio classification methods; instrumental accompaniment; lyrics automatic alignment; lyrics transcription; music information retrieval; phonation variability; polyphonic music; query-by-singing application; singer identity; singing voice audio signal analysis; singing voice classification; singing voice identification; song browsing; source separation; speech recognition problem; textual search; word recognition accuracy; Databases; Hidden Markov models; Instruments; Multiple signal classification; Music; Speech; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Speech Technology and Human-Computer Dialogue (SpeD), 2013 7th Conference on
Conference_Location :
Cluj-Napoca
Type :
conf
DOI :
10.1109/SpeD.2013.6682644
Filename :
6682644