DocumentCode :
245094
Title :
Mutual Information Based Output Dimensionality Reduction
Author :
Pandey, Shishir ; Vaze, Rahul
Author_Institution :
Sch. of Technol. & Comput. Sci., Tata Inst. of Fundamental Res., Mumbai, India
fYear :
2014
fDate :
14-17 Dec. 2014
Firstpage :
935
Lastpage :
940
Abstract :
When both the input and the output spaces are high-dimensional, even simple regression is prohibitively costly. Dimensionality reduction in the output space is important for efficient learning and prediction, because modern paradigms such as topic modelling and image classification have extremely large output spaces. In contrast to input dimensionality reduction, dimension reduction on the output side is complicated. We propose mutual information based output dimensionality reduction, which takes into account the relationship between the input and the output that is essential for regression and classification problems. Our method forms the compressed label space by selecting those labels that typically have the maximum mutual information with the input. Selecting the best subset is computationally hard, but we provide a polynomial-time algorithm with a provable approximation guarantee. We conduct experiments on seven multi-label classification datasets. Results show that our method performs better than existing methods on some datasets.
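The label-selection idea in the abstract can be illustrated with a short sketch. It is not the authors' algorithm: the abstract does not say how mutual information is estimated or how the subset is built. The keywords ("Greedy algorithms", "submodular function") suggest a greedy procedure, so the sketch assumes one, and it further assumes a jointly Gaussian model so that I(X; Y_S) has a closed form. The names gaussian_mi and greedy_select, the ridge parameter, and the toy data are all hypothetical.

```python
import numpy as np


def gaussian_mi(X, Y, subset, ridge=1e-6):
    """Estimate I(X; Y_S) in nats, ASSUMING (X, Y_S) is jointly Gaussian.

    Under that assumption I(X; Y_S) = 0.5 * (logdet Cov(Y_S) - logdet Cov(Y_S | X)).
    The ridge term only keeps the covariance matrices well conditioned.
    """
    if not subset:
        return 0.0
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Ys = Y[:, subset]
    Yc = Ys - Ys.mean(axis=0)
    cov_xx = Xc.T @ Xc / n + ridge * np.eye(X.shape[1])
    cov_yy = Yc.T @ Yc / n + ridge * np.eye(len(subset))
    cov_xy = Xc.T @ Yc / n
    # Conditional covariance of Y_S given X (Schur complement).
    cov_cond = cov_yy - cov_xy.T @ np.linalg.solve(cov_xx, cov_xy)
    _, logdet_yy = np.linalg.slogdet(cov_yy)
    _, logdet_cond = np.linalg.slogdet(cov_cond)
    return 0.5 * (logdet_yy - logdet_cond)


def greedy_select(X, Y, k):
    """Greedily pick k output columns, each step adding the column that
    gives the largest estimated mutual information with the input."""
    selected = []
    remaining = list(range(Y.shape[1]))
    for _ in range(min(k, len(remaining))):
        best = max(remaining, key=lambda j: gaussian_mi(X, Y, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): outputs 0 and 2 depend on the input, 1 and 3 are noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
Y = np.column_stack([
    X[:, 0] + 0.1 * rng.normal(size=500),
    rng.normal(size=500),
    X[:, 1] + 0.1 * rng.normal(size=500),
    rng.normal(size=500),
])
chosen = greedy_select(X, Y, k=2)
print(chosen)
```

Each greedy step re-evaluates the score for every remaining label, so the sketch makes on the order of k times the number of labels score evaluations. The Gaussian estimator is a convenience for illustration only, since real multi-label outputs are binary, and no approximation guarantee is claimed for this sketch.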
Keywords :
approximation theory; computational complexity; learning (artificial intelligence); pattern classification; regression analysis; approximation guarantee; classification problems; compressed label space; dimensional input space; dimensional output space; learning; multilabel classification datasets; mutual information based output dimensionality reduction; polynomial time algorithm; prediction; regression problems; subset selection; Approximation methods; Compressed sensing; Decoding; Educational institutions; Greedy algorithms; Mutual information; Vectors; dimension reduction; multi-label; mutual information; output dimension reduction; submodular function;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining (ICDM), 2014 IEEE International Conference on
Conference_Location :
Shenzhen
ISSN :
1550-4786
Print_ISBN :
978-1-4799-4303-6
Type :
conf
DOI :
10.1109/ICDM.2014.110
Filename :
7023426