Title :
Imbalanced educational data classification: An effective approach with resampling and random forest
Author :
Vo Thi Ngoc Chau ; Nguyen Hua Phung
Author_Institution :
Fac. of Comput. Sci. & Eng., Ho Chi Minh City Univ. of Technol., Ho Chi Minh City, Vietnam
Abstract :
Educational data mining is an emerging area of data mining research. Although it is an applied field that draws on existing data mining techniques and methods, it poses many challenges that have not been completely resolved. In particular, data classification in an academic credit system is a difficult task: on the technical side it must deal with class imbalance and missing data, and on the practical side it must handle the flexibility of the education system, which leads to heterogeneous data. In this paper, we present an approach that combines a hybrid resampling scheme with random forest for the imbalanced, multi-class educational data classification task based on students' performance. The proposed approach has not previously been available in educational data mining. In addition, our extensive empirical study shows it to be effective for predicting students' final study status and usable in a knowledge-driven educational decision support system.
Keywords :
data mining; decision support systems; educational administrative data processing; academic credit system; educational data mining; hybrid resampling scheme; imbalanced educational data classification task; knowledge-driven educational decision support system; random forest; Accuracy; Cities and towns; Classification algorithms; Educational institutions; Support vector machines; imbalanced data classification; resampling
Conference_Title :
Computing and Communication Technologies, Research, Innovation, and Vision for the Future (RIVF), 2013 IEEE RIVF International Conference on
Conference_Location :
Hanoi
Print_ISBN :
978-1-4799-1349-7
DOI :
10.1109/RIVF.2013.6719882