Title :
Multimodal Music and Lyrics Fusion Classifier for Artist Identification
Author :
Aryafar, Kamelia; Shokoufandeh, Ali
Author_Institution :
Comput. Sci. Dept., Drexel Univ., Philadelphia, PA, USA
Abstract :
Humans interact with each other through multiple communication modalities, including speech, gestures, and written documents. When one modality is absent or noisy, the remaining modalities can improve the precision of a recognition system. Human-computer interaction (HCI) systems can likewise exploit such multimodal communication models across a range of machine learning tasks. The use of multiple modalities is motivated by usability, by noise in any single modality, and by the non-universality of any one modality. Combining multimodal information, however, introduces new challenges for machine learning, such as the design of fusion classifiers. In this paper we explore the multimodal fusion of audio and lyrics for music artist identification. We compare our results against a single-modality artist classifier and propose new directions for designing fusion classifiers.
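Example :
To make the fusion task concrete, the sketch below fuses the two modalities by simple feature concatenation (early fusion) and trains a linear classifier on the joint representation. It is a minimal illustration assuming scikit-learn and NumPy; the toy MFCC vectors, lyrics, and labels are hypothetical placeholders, and it does not reproduce the paper's own sparse-method classifiers.

# Minimal early-fusion sketch for artist identification, assuming
# precomputed per-track MFCC statistics (audio) and raw lyrics text.
# Illustrative only; not the authors' method.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Toy data: 4 tracks by 2 artists. Real MFCC vectors would come from
# an audio front end (e.g., 20 coefficients averaged over frames).
mfcc_features = rng.normal(size=(4, 20))           # audio modality
lyrics = [                                         # lyrics modality
    "love song about the summer night",
    "dancing under the summer moon",
    "rain and thunder on the highway",
    "storm clouds over the highway home",
]
artists = np.array([0, 0, 1, 1])                   # artist labels

# Lyrics features: sparse TF-IDF term vectors.
tfidf = TfidfVectorizer()
lyric_features = tfidf.fit_transform(lyrics).toarray()

# Early fusion: concatenate the two modality feature vectors,
# then train a single linear classifier on the joint representation.
fused = np.hstack([mfcc_features, lyric_features])
clf = LinearSVC().fit(fused, artists)
print(clf.predict(fused))                          # per-track artist ids

A natural alternative, also discussed in the fusion literature, is late fusion: train one classifier per modality and combine their decisions, which degrades more gracefully when one modality is missing or noisy.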
Keywords :
audio signal processing; human computer interaction; learning (artificial intelligence); music; pattern classification; HCI systems; artist identification; communication modalities; human-computer interaction; machine learning tasks; multimodal communication models; multimodal music lyrics fusion classifier; noisy modality; single modality artist classifier; Accuracy; Kernel; Mel frequency cepstral coefficient; Music; Music information retrieval; Semantics; Sparse matrices; audio; classification; multimodal; sparse methods
Conference_Title :
2014 13th International Conference on Machine Learning and Applications (ICMLA)
Conference_Location :
Detroit, MI, USA
DOI :
10.1109/ICMLA.2014.88