Title :
Learning and clean-up in a large scale music database
Author :
Hansen, L.K. ; Lehn-Schioler, T. ; Petersen, K.B. ; Arenas-Garcia, J. ; Larsen, J. ; Jensen, S.H.
Author_Institution :
Tech. Univ. of Denmark, Lyngby, Denmark
Abstract :
We have collected a database of musical features from radio broadcasts and CD collections (N > 10^5). The database poses a number of hard modelling challenges, including segmentation problems and missing or wrong meta-data. We describe our efforts towards cleaning the data using probability density estimation. We train conditional densities for checking the relation between meta-data and music features, and unconditional densities for spotting unlikely music features. We show that the rejected samples indeed represent various types of problems in the music data. The models may in some cases assist reconstruction of meta-data.
Keywords :
data handling; database management systems; estimation theory; meta data; music; probability; radio broadcasting; CD collections; data cleaning; large scale music database; metadata reconstruction; missing meta-data; musical feature database; probability density estimation; radio broadcasts; segmentation problem; wrong meta-data; Cleaning; Data models; Databases; Europe; Mel frequency cepstral coefficient; Rocks; Signal processing
Conference_Title :
Signal Processing Conference, 2007 15th European
Conference_Location :
Poznan
Print_ISBN :
978-839-2134-04-6