Title :
Software defect prediction using Two level data pre-processing
Author :
Verma, Rajesh ; Gupta, Arpan
Author_Institution :
Comput. Sci. Eng., PDPM IIITDM, Jabalpur, India
Abstract :
Defect prediction can help streamline testing efforts and reduce the development cost of software. Defects are usually predicted using data mining and machine learning techniques. A prediction model is considered effective if it classifies defective and non-defective modules accurately. In this paper we investigate the effect of data pre-processing on the performance of four different K-NN classifiers and compare the results with those of a random forest classifier. The pre-processing consists of attribute selection and instance filtering. We observe that two-level data pre-processing improves defect prediction results. We also report how each of the two filters influences performance independently. The observed improvement can be attributed to the removal of irrelevant attributes through dimension (attribute) reduction and the mitigation of the class imbalance problem through resampling, which together improve the performance of the classifiers.
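
The two pre-processing levels described in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the attribute selection method, the resampling scheme, or the four K-NN variants, so everything below is an assumption chosen for illustration: attributes are ranked by absolute Pearson correlation with the class label, the minority class is balanced by random oversampling with replacement, and a plain majority-vote K-NN is used. The function names and the toy module metrics are hypothetical and are not taken from the paper.

```python
import math
import random
from collections import Counter


def select_attributes(X, y, k):
    """Level 1 (assumed method): keep the k attributes with the highest
    absolute Pearson correlation with the class label."""
    n = len(X)
    mean_y = sum(y) / n
    var_y = sum((t - mean_y) ** 2 for t in y)
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean_c = sum(col) / n
        cov = sum((c - mean_c) * (t - mean_y) for c, t in zip(col, y))
        var_c = sum((c - mean_c) ** 2 for c in col)
        denom = math.sqrt(var_c * var_y)
        scores.append((abs(cov / denom) if denom else 0.0, j))
    # Return the indices of the k best-scoring attributes in column order.
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


def resample_balanced(X, y, rng):
    """Level 2 (assumed scheme): random oversampling with replacement
    until every class has as many instances as the largest class."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


def knn_predict(X_train, y_train, x, k=3):
    """Plain K-NN: majority vote among the k nearest training instances
    under Euclidean distance."""
    nearest = sorted(zip(X_train, y_train),
                     key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical toy data, one row per module:
# [lines_of_code, cyclomatic_complexity, unrelated_noise]; label 1 = defective.
X = [[10, 1, 5], [12, 2, 9], [11, 1, 2], [14, 2, 7],
     [13, 1, 4], [15, 2, 8], [90, 9, 6], [95, 10, 3]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

rng = random.Random(0)                       # fixed seed for repeatability
keep = select_attributes(X, y, k=2)          # level 1: attribute selection
X_sel = [[row[j] for j in keep] for row in X]
X_bal, y_bal = resample_balanced(X_sel, y, rng)  # level 2: instance filtering

new_module = [88, 8, 5]                      # hypothetical unseen module
prediction = knn_predict(X_bal, y_bal, [new_module[j] for j in keep])
```

On this toy data the noise attribute is dropped in the first level and both classes have equal size after the second, so the K-NN vote is no longer dominated by the non-defective majority. Applying only one of the two functions before `knn_predict` gives a simple way to examine each filter's influence independently, in the spirit of the comparison the abstract mentions.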
Keywords :
data mining; learning (artificial intelligence); pattern classification; software reliability; K-NN classifiers; attribute selection; class imbalance problem; dimension reduction; instance filtering; machine learning techniques; nondefective module classification; random forest classifier; software defect prediction model; software development cost reduction; streamline testing efforts; two level data preprocessing; Accuracy; Filtering; Handheld computers; Predictive models; Radio frequency; Software; Training
Conference_Title :
Recent Advances in Computing and Software Systems (RACSS), 2012 International Conference on
Conference_Location :
Chennai
Print_ISBN :
978-1-4673-0252-4
DOI :
10.1109/RACSS.2012.6212686