• DocumentCode
    2108780
  • Title

    Customer churn prediction for telecommunication: Employing various features selection techniques and tree based ensemble classifiers

  • Author

    Idris, Anuar ; Khan, Ajmal

  • Author_Institution
    Dept. of Comput. & Inf. Sci., Pakistan Inst. of Eng. & Appl. Sci., Islamabad, Pakistan
  • fYear
    2012
  • fDate
    13-15 Dec. 2012
  • Firstpage
    23
  • Lastpage
    27
  • Abstract
    Ensemble classifiers have received increasing attention in recent times for attaining higher classification performance. In this paper, we present the comparative performance of various tree based ensemble classifiers in combination with maximum relevancy and minimum redundancy (mRMR), Fisher's ratio and F-score based feature selection schemes for the challenging problem of churn prediction in telecommunication. The large size of telecommunication datasets has been the main hurdle to achieving the desired classification performance in contemporary churn prediction models. Although tree based ensemble classifiers are considered suitable for larger datasets, we have found rotation forest and RotBoost, which employ boosting through feature selection and increase diversity by incorporating a linear feature extraction method such as Principal Component Analysis, to be effective techniques compared to random forest. In addition to the feature selection performed by the ensembles themselves, we have also incorporated mRMR, Fisher's ratio and F-score techniques for feature selection. Compared to Fisher's ratio and F-score, mRMR returns a coherent and well-discriminating feature set, which significantly reduces computation and helps the classifier attain improved performance. Performance is evaluated using area under the curve, sensitivity and specificity; RotBoost, an ensemble of rotation forest and AdaBoost, in combination with mRMR has shown competitive results for churn prediction in telecommunication compared to other ensemble methods.
  • Keywords
    customer relationship management; learning (artificial intelligence); pattern classification; principal component analysis; telecommunication industry; trees (mathematics); Adaboost; F-score based features selection schemes; Fisher ratio; classification performance; customer churn prediction; linear feature extraction method; mRMR; maximum relevancy and minimum redundancy; principal component analysis; rotation forest; rotboost; telecommunication dataset; tree based ensemble classifiers; RotBoost; Rotation Forest; churn prediction; mRMR; telecommunication;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Multitopic Conference (INMIC), 2012 15th International
  • Conference_Location
    Islamabad
  • Print_ISBN
    978-1-4673-2249-2
  • Type

    conf

  • DOI
    10.1109/INMIC.2012.6511498
  • Filename
    6511498