DocumentCode
2108780
Title
Customer churn prediction for telecommunication: Employing various features selection techniques and tree based ensemble classifiers
Author
Idris, Anuar ; Khan, Ajmal
Author_Institution
Dept. of Comput. & Inf. Sci., Pakistan Inst. of Eng. & Appl. Sci., Islamabad, Pakistan
fYear
2012
fDate
13-15 Dec. 2012
Firstpage
23
Lastpage
27
Abstract
Ensemble classifiers have recently received increasing attention as a means of attaining higher classification performance. In this paper, we compare the performance of various tree-based ensemble classifiers combined with maximum relevancy and minimum redundancy (mRMR), Fisher's ratio and F-score based feature selection schemes on the challenging problem of churn prediction in telecommunication. The large size of telecommunication datasets has been the main hurdle to achieving the desired classification performance in recently proposed churn prediction models. Although tree-based ensemble classifiers are considered suitable for larger datasets, we have found rotation forest and RotBoost to be effective techniques compared to random forest; they employ boosting through feature selection and increase diversity by incorporating a linear feature extraction method such as Principal Component Analysis. In addition to the feature selection performed by the ensembles themselves, we have also incorporated mRMR, Fisher's ratio and F-score techniques for feature selection. Compared to Fisher's ratio and F-score, mRMR returns a coherent and well-discriminating feature set, which significantly reduces computation and helps the classifier attain improved performance. Performance is evaluated using area under the curve, sensitivity and specificity; RotBoost, an ensemble of rotation forest and AdaBoost, combined with mRMR shows competitive results for churn prediction in telecommunication compared to other ensemble methods.
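The filter-style feature ranking and two of the evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes the common two-class form of Fisher's ratio, (mean1 - mean0)^2 / (var1 + var0), which the abstract does not spell out, and the function names (`fisher_ratio`, `rank_features`, `sensitivity_specificity`) are hypothetical, not taken from the paper.

```python
from statistics import mean, pvariance


def fisher_ratio(values, labels):
    """Fisher's ratio of a single feature for binary labels (0 = stay, 1 = churn).

    Assumed form: squared difference of class means over the sum of class variances.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    gap = (mean(pos) - mean(neg)) ** 2
    spread = pvariance(pos) + pvariance(neg)
    if spread == 0:
        # Convention chosen for this sketch: no within-class spread means the
        # feature is either perfectly separating or completely uninformative.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def rank_features(rows, labels):
    """Return feature indices ordered from highest to lowest Fisher's ratio."""
    n_features = len(rows[0])
    scores = [fisher_ratio([r[j] for r in rows], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)
```

A ranking like this scores each feature independently; mRMR differs in that it also penalizes redundancy among the selected features, which this sketch does not attempt.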
Keywords
customer relationship management; learning (artificial intelligence); pattern classification; principal component analysis; telecommunication industry; trees (mathematics); Adaboost; F-score based features selection schemes; Fisher ratio; classification performance; customer churn prediction; linear feature extraction method; mRMR; maximum relevancy and minimum redundancy; principal component analysis; rotation forest; rotboost; telecommunication dataset; tree based ensemble classifiers; RotBoost; Rotation Forest; churn prediction; mRMR; telecommunication;
fLanguage
English
Publisher
IEEE
Conference_Titel
Multitopic Conference (INMIC), 2012 15th International
Conference_Location
Islamabad
Print_ISBN
978-1-4673-2249-2
Type
conf
DOI
10.1109/INMIC.2012.6511498
Filename
6511498