Title :
Building software quality classification trees: approach, experimentation, evaluation
Author :
Takahashi, Ryo ; Muraoka, Yoichi ; Nakamura, Yukihiro
Author_Institution :
NTT Inf. & Commun. Syst. Labs., Kanagawa, Japan
Abstract :
A methodology is proposed for generating an optimum software quality classification tree that uses software complexity metrics to discriminate between high-quality and low-quality modules. Tree generation applies AIC (Akaike Information Criterion) procedures to the binomial distribution. AIC procedures are based on maximum likelihood estimation and on using the least number of complexity metrics. The methodology improves on the software quality classification tree generation method proposed by Porter and Selby (1990) in that the number of complexity metrics is minimized. Their method has two problems: the software quality prediction model is unstable because it reflects observational errors in real data too strongly, and there is no objective criterion for determining whether the discrimination is appropriate at a deep nesting level of the classification tree, where the number of sample modules becomes small. To solve these problems, a new metric is introduced and its validity is verified theoretically and experimentally. In our examples, complexity metrics for software written in the C language are investigated, such as lines of source code, Halstead's (1977) software science, McCabe's (1976) cyclomatic number, Henry and Kafura's (1981) fan-in/fan-out, and Howatt and Baker's (1989) scope number. Our experiments with a medium-sized piece of software (85 thousand lines of source code; 562 samples) show that the software quality classification tree generated with our new metric identifies the target class of the observed modules more efficiently than the conventional classification tree, using the minimum number of complexity metrics without any significant decrease in the correct classification ratio (76% to 72%).
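As an illustration of how an AIC comparison on binomial counts can serve as an objective splitting criterion, the following minimal sketch compares a pooled node (one proportion parameter) against a split into two child nodes (two parameters). It is not the metric defined in the paper, which the abstract does not spell out; the function names and the (low-quality count, module count) input format are assumptions made for this example.

```python
import math


def bernoulli_log_likelihood(k, n):
    """Maximized log-likelihood of observing k low-quality modules among n,
    using the maximum likelihood estimate p = k / n (hypothetical helper)."""
    if n == 0 or k == 0 or k == n:
        return 0.0  # the likelihood is 1 when p is 0 or 1
    p = k / n
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def aic(log_likelihood, num_params):
    """Akaike Information Criterion: -2 * log-likelihood + 2 * parameters."""
    return -2.0 * log_likelihood + 2.0 * num_params


def split_is_justified(left, right):
    """Return True when splitting lowers the AIC.

    left and right are (k, n) pairs: k low-quality modules out of n modules
    on each side of a candidate metric threshold (assumed input format).
    """
    (k1, n1), (k2, n2) = left, right
    aic_pooled = aic(bernoulli_log_likelihood(k1 + k2, n1 + n2), 1)
    aic_split = aic(
        bernoulli_log_likelihood(k1, n1) + bernoulli_log_likelihood(k2, n2), 2
    )
    return aic_split < aic_pooled
```

With many modules and clearly different proportions, for example `split_is_justified((40, 50), (10, 50))`, the split lowers the AIC. With very few modules, as at a deep nesting level, for example `split_is_justified((2, 3), (1, 3))`, the penalty for the extra parameter outweighs the gain in likelihood and the split is rejected.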
Keywords :
C language; binomial distribution; information theory; maximum likelihood estimation; software metrics; software quality; software reliability; trees (mathematics); AIC; Akaike Information Criterion; cyclomatic number; errors; experiments; fan-in fan-out; methodology; scope number; software complexity metrics; software quality classification trees; software quality prediction model; software science; source code; tree generation; Application software; Classification tree analysis; Communication system software; Communication systems; Entropy; Information analysis; Laboratories; Mathematical model
Conference_Title :
Software Reliability Engineering, 1997. Proceedings., The Eighth International Symposium on
Conference_Location :
Albuquerque, NM
Print_ISBN :
0-8186-8120-9
DOI :
10.1109/ISSRE.1997.630869