Title :
Deep Learning with MCA-based Instance Selection and Bootstrapping for Imbalanced Data Classification
Author :
Sheng Guan;Min Chen;Hsin-Yu Ha;Shu-Ching Chen;Mei-Ling Shyu;Chengde Zhang
Author_Institution :
Sch. of Comput. &
Abstract :
In this paper, we propose an extended deep learning approach that incorporates instance selection and bootstrapping techniques for imbalanced data classification. In supervised learning, classification performance often deteriorates when the training set is imbalanced, that is, when at least one class has substantially fewer instances than the others. We propose to use the adaptive synthetic sampling approach (ADASYN) to generate synthetic instances for the minority class. A data pruning process based on multiple correspondence analysis (MCA) is then performed to identify the subset of synthetic instances most suitable to supplement the existing minority instances. This results in a relatively more balanced training dataset, which is then bootstrapped and fed into convolutional neural networks (CNNs) for classification. Furthermore, we propose to use low-level features pre-processed by principal component analysis (PCA), instead of the commonly used raw signal data, as the input to the CNNs to reduce computation time. Experimental results on 54 TRECVID concepts with different levels of imbalance show the effectiveness of our framework compared with other state-of-the-art methods.
Keywords :
"Training","Feature extraction","Machine learning","Neurons","Multimedia communication","Biological neural networks","Principal component analysis"
Conference_Title :
Collaboration and Internet Computing (CIC), 2015 IEEE Conference on
DOI :
10.1109/CIC.2015.40
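
The oversampling step named in the abstract can be illustrated with a minimal sketch. The code below is a simplified, standard-library-only rendering of the general ADASYN idea (more synthetic points for minority instances surrounded by majority neighbours, created by interpolating toward a nearby minority instance). It is not the authors' implementation: the function name `adasyn_sketch`, its parameters, and the toy data are assumptions made for illustration, and the MCA-based pruning, bootstrapping, PCA, and CNN stages are not covered.

```python
import math
import random


def adasyn_sketch(minority, majority, k=3, beta=1.0, seed=0):
    """Generate synthetic minority points in the spirit of ADASYN (simplified).

    Hypothetical helper for illustration; points are tuples of floats.
    """
    rng = random.Random(seed)
    # Pool all instances with a flag: True = majority, False = minority.
    pool = [(p, False) for p in minority] + [(p, True) for p in majority]
    # Target number of synthetic points; beta=1.0 aims at full balance.
    # Per-point rounding below means the final count can differ slightly.
    total = round((len(majority) - len(minority)) * beta)

    # For each minority point, the share of majority points among its
    # k nearest neighbours indicates how hard that point is to learn.
    ratios = []
    for i, x in enumerate(minority):
        others = [q for j, q in enumerate(pool) if j != i]
        nearest = sorted(others, key=lambda q: math.dist(x, q[0]))[:k]
        ratios.append(sum(1 for _, is_major in nearest if is_major) / k)

    norm = sum(ratios)
    if norm == 0 or total <= 0:
        return []  # nothing to generate

    synthetic = []
    for i, (x, r) in enumerate(zip(minority, ratios)):
        count = round(r / norm * total)
        # Interpolation partners: the nearest other minority points.
        peers = [p for j, p in enumerate(minority) if j != i]
        peers = sorted(peers, key=lambda p: math.dist(x, p))[:k]
        if not peers:
            continue
        for _ in range(count):
            z = rng.choice(peers)
            lam = rng.random()  # position on the segment from x to z
            synthetic.append(tuple(a + lam * (b - a) for a, b in zip(x, z)))
    return synthetic


# Toy data (made up): three minority points, seven majority points.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
majority = [(0.5, 0.5), (1.2, 0.1), (0.1, 1.2), (3.0, 3.0),
            (3.5, 3.0), (3.0, 3.5), (4.0, 4.0)]
for point in adasyn_sketch(minority, majority):
    print(point)
```

In the pipeline the abstract describes, the synthetic points produced at this stage would then be filtered by the MCA-based pruning step before being added to the minority class.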