DocumentCode :
427058
Title :
Multimodal music retrieval for large databases
Author :
Schuller, Björn ; Rigoll, Gerhard ; Lang, Manfred
Author_Institution :
Inst. for Human-Machine Commun., Technische Univ. München, Germany
Volume :
2
fYear :
2004
fDate :
27-30 June 2004
Firstpage :
755
Abstract :
We present a novel multimodal access to large MP3 music databases. Retrieval can be performed either in a content-based manner or by keywords. As input modalities, speech by natural language utterances or singing, and manual interaction by handwriting, typing or hardkeys are used. In order to achieve especially robust retrieval results and automatically suggest music to the user, contextual knowledge of the time, date, season, user emotion, and listening habits is integrated in the retrieval process. The system communicates with the user by speech or visual reactions. The concepts shown are especially designed for home and mobile access on tablet-PCs, PDAs, and similar PC solutions. The paper discusses the concept and a working prototype called Shangrila. An evaluation by a user study gives an impression of the capabilities of the suggested approach to multimodal music retrieval.
Keywords :
audio databases; content-based retrieval; graphical user interfaces; music; natural language interfaces; GUI; MP3 music databases; PDA; content-based retrieval; contextual knowledge; graphical user interface; handwriting; hardkeys; keywords; large music databases; listening habits; multimodal music retrieval; natural language utterances; singing; tablet-PC; typing; user emotion; Audio databases; Content based retrieval; Context; Digital audio players; Music information retrieval; Natural languages; Personal digital assistants; Prototypes; Robustness; Speech;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Multimedia and Expo, 2004. ICME '04. 2004 IEEE International Conference on
Print_ISBN :
0-7803-8603-5
Type :
conf
DOI :
10.1109/ICME.2004.1394310
Filename :
1394310