Title :
Customer churn prediction for telecommunication: Employing various features selection techniques and tree based ensemble classifiers
Author :
Idris, Anuar ; Khan, Ajmal
Author_Institution :
Dept. of Comput. & Inf. Sci., Pakistan Inst. of Eng. & Appl. Sci., Islamabad, Pakistan
Abstract :
Ensemble classifiers have received increasing attention in recent years for attaining higher classification performance. In this paper, we compare the performance of various tree-based ensemble classifiers combined with maximum relevancy and minimum redundancy (mRMR), Fisher's ratio, and F-score based feature selection schemes on the challenging problem of churn prediction in telecommunication. The large size of telecommunication datasets has been the main hurdle to achieving the desired classification performance in recently proposed churn prediction models. Although tree-based ensemble classifiers are considered suitable for larger datasets, we have found rotation forest and RotBoost to be effective techniques compared to random forest; they employ boosting through feature selection and increase diversity by incorporating a linear feature extraction method such as Principal Component Analysis. In addition to the feature selection performed by the ensembles themselves, we have also incorporated mRMR, Fisher's ratio, and F-score techniques for feature selection. Compared to Fisher's ratio and F-score, mRMR returns a coherent and well-discriminating feature set, which significantly reduces computation and helps the classifier attain improved performance. Performance is evaluated using area under the curve, sensitivity, and specificity; RotBoost, an ensemble of rotation forest and AdaBoost, combined with mRMR has shown competitive results for churn prediction in telecommunication compared to other ensemble methods.
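As a rough illustration of two ingredients named in the abstract, the sketch below ranks features by a common two-class form of Fisher's ratio, (mean difference)^2 / (sum of class variances), and computes sensitivity and specificity from predictions. The function names and the toy data are hypothetical and are not taken from the paper, whose exact formulation may differ.

```python
from statistics import mean, pvariance

def fisher_ratio(values, labels):
    """Two-class Fisher's ratio of one feature (label 1 = churner, 0 = non-churner)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    gap = (mean(pos) - mean(neg)) ** 2
    spread = pvariance(pos) + pvariance(neg)
    if spread == 0:
        # Assumption: zero within-class variance means perfect separation
        # when the class means differ, and no information otherwise.
        return float("inf") if gap > 0 else 0.0
    return gap / spread

def rank_features(columns, labels):
    """Return feature names ordered from most to least discriminative."""
    scores = {name: fisher_ratio(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Toy data, invented for illustration only.
labels = [1, 1, 1, 0, 0, 0]
columns = {
    "calls": [9, 10, 11, 1, 2, 3],  # class means differ strongly
    "noise": [5, 1, 3, 4, 2, 3],    # class means are equal
}
ranking = rank_features(columns, labels)
sens, spec = sensitivity_specificity(labels, [1, 1, 0, 0, 0, 1])
```

A ranking like this scores each feature independently; mRMR differs in that it also penalizes redundancy among the selected features, which this sketch does not attempt.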
Keywords :
customer relationship management; learning (artificial intelligence); pattern classification; principal component analysis; telecommunication industry; trees (mathematics); Adaboost; F-score based features selection schemes; Fisher ratio; classification performance; customer churn prediction; linear feature extraction method; mRMR; maximum relevancy and minimum redundancy; principal component analysis; rotation forest; rotboost; telecommunication dataset; tree based ensemble classifiers; RotBoost; Rotation Forest; churn prediction; mRMR; telecommunication;
Conference_Title :
Multitopic Conference (INMIC), 2012 15th International
Conference_Location :
Islamabad
Print_ISBN :
978-1-4673-2249-2
DOI :
10.1109/INMIC.2012.6511498