Title : 
Using Coding-Based Ensemble Learning to Improve Software Defect Prediction
         
        
            Author : 
Sun, Zhongbin ; Song, Qinbao ; Zhu, Xiaoyan
         
        
            Author_Institution : 
Dept. of Comput. Sci. & Technol., Xi'an Jiaotong Univ., Xi'an, China
         
        
        
        
        
        
        
            Abstract : 
Using classification methods to predict software defect proneness with static code attributes has attracted a great deal of attention. The class-imbalance characteristic of software defect data makes prediction much more difficult; thus, a number of methods have been employed to address this problem. However, these conventional methods, such as sampling, cost-sensitive learning, Bagging, and Boosting, can suffer from the loss of important information, unexpected mistakes, and overfitting because they alter the original data distribution. This paper presents a novel method that first converts the imbalanced binary-class data into balanced multiclass data and then builds a defect predictor on the multiclass data with a specific coding scheme. A thorough experiment with four different types of classification algorithms, three data coding schemes, and six conventional methods for handling imbalanced data was conducted on 14 NASA datasets. The experimental results show that the proposed method with a one-against-one coding scheme is, on average, superior to the conventional methods.
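
The abstract does not say how the binary data are converted or which base learners are used, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes the majority (non-defective) class is split at random into chunks of roughly minority size, that each chunk becomes its own class, and that a simple nearest-centroid rule stands in for the pairwise learners of a one-against-one scheme. All function names are hypothetical and are not taken from the paper.

```python
import random
from itertools import combinations


def to_balanced_multiclass(y, seed=0):
    """Relabel binary labels (1 = defective minority, 0 = majority).

    Assumption: the majority class is split at random into k chunks of
    roughly minority size. New labels: 0 = defective, 1..k = majority chunks.
    """
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == 1]
    majority = [i for i, label in enumerate(y) if label == 0]
    rng.shuffle(majority)
    k = max(1, round(len(majority) / len(minority)))
    new_y = [0] * len(y)
    for pos, i in enumerate(majority):
        new_y[i] = 1 + pos % k  # spread majority rows evenly over k classes
    return new_y, k


def centroid(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def fit_one_vs_one(X, labels):
    """Build one pairwise 'classifier' (a pair of centroids) per class pair."""
    classes = sorted(set(labels))
    cents = {
        c: centroid([x for x, l in zip(X, labels) if l == c]) for c in classes
    }
    return [(a, b, cents[a], cents[b]) for a, b in combinations(classes, 2)]


def predict_defective(model, x):
    """Majority vote over the pairwise classifiers, mapped back to binary."""
    votes = {}
    for a, b, ca, cb in model:
        winner = a if sq_dist(x, ca) <= sq_dist(x, cb) else b
        votes[winner] = votes.get(winner, 0) + 1
    best = max(votes, key=votes.get)
    return 1 if best == 0 else 0  # class 0 is the defective class


# Toy data: 12 non-defective modules near the origin, 3 defective ones far away.
X = [[(i % 3) * 0.1, (i % 4) * 0.1] for i in range(12)]
X += [[10.0, 10.0], [10.2, 9.8], [9.9, 10.1]]
y = [0] * 12 + [1] * 3

new_y, k = to_balanced_multiclass(y)
model = fit_one_vs_one(X, new_y)
```

In this toy setup the relabelling yields five classes of three instances each, so every pairwise learner sees balanced data, and no instance is discarded or duplicated. The final vote is mapped back to a binary defective/non-defective decision.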
         
        
            Keywords : 
learning (artificial intelligence); program compilers; program debugging; classification methods; coding based ensemble learning; data distribution; software defect prediction; software defect proneness; specific coding scheme; Boosting; Classification; Encoding; Prediction algorithms; Predictive models; Software algorithms; Software defects; Class-imbalance data; meta learning; multiclassifier
         
        
        
            Journal_Title : 
Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on
         
        
        
        
        
            DOI : 
10.1109/TSMCC.2012.2226152