Title :
Software quality classification modeling using the SPRINT decision tree algorithm
Author :
Khoshgoftaar, Taghi M. ; Seliya, Naeem
Author_Institution :
Dept. of Comput. Sci. & Eng., Florida Atlantic Univ., Boca Raton, FL, USA
Abstract :
Predicting the quality of system modules prior to software testing and operations can benefit the software development team. Such a timely reliability estimate can be used to direct cost-effective quality improvement efforts to the high-risk modules. Tree-based software quality classification models based on software metrics are used to predict whether a software module is fault-prone or not fault-prone. They are white-box quality estimation models with good accuracy, and are simple and easy to interpret. This paper presents an in-depth study of calibrating classification trees for software quality estimation using the SPRINT decision tree algorithm. Many classification algorithms have memory limitations, including the requirement that data sets be memory resident. SPRINT removes all of these limitations and provides a fast and scalable analysis. It is an extension of a commonly used decision tree algorithm, CART, and provides a unique tree-pruning technique based on the minimum description length (MDL) principle. Combining the MDL pruning technique and the modified classification algorithm, SPRINT yields classification trees with useful prediction accuracy. The case study comprises software metrics and fault data collected over four releases of a very large telecommunications system. It is observed that classification trees built by SPRINT are more balanced and demonstrate better stability than those built by CART.
Keywords :
decision trees; pattern classification; software metrics; software quality; software reliability; CART; SPRINT decision tree algorithm; calibration; cost-effective quality improvement efforts; fault-prone software; high-risk modules; memory limitations; minimum description length principle; prediction accuracy; reliability estimation; software development; software metrics; software quality classification modeling; software testing; stability; system modules; tree pruning technique; very large telecommunications system; white box quality estimation models; Accuracy; Classification algorithms; Classification tree analysis; Decision trees; Predictive models; Programming; Software algorithms; Software metrics; Software quality; Software testing;
Conference_Title :
Tools with Artificial Intelligence, 2002. (ICTAI 2002). Proceedings. 14th IEEE International Conference on
Print_ISBN :
0-7695-1849-4
DOI :
10.1109/TAI.2002.1180826