Building software quality classification trees: approach, experimentation, evaluation

Author

Takahashi, Ryo ; Muraoka, Yoichi ; Nakamura, Yukihiro

Author_Institution

NTT Inf. & Commun. Syst. Labs., Kanagawa, Japan

fYear

1997

fDate

2-5 Nov 1997

Firstpage

222

Lastpage

233

Abstract

A methodology is proposed for generating an optimum software quality classification tree that uses software complexity metrics to discriminate between high-quality and low-quality modules. The tree generation process applies AIC (Akaike Information Criterion) procedures to the binomial distribution. AIC procedures combine maximum likelihood estimation with a preference for the smallest number of complexity metrics. The methodology improves on the software quality classification tree generation method proposed by Porter and Selby (1990) in that the number of complexity metrics is minimized. Their method has two problems: the software quality prediction model is unstable because it reflects observational errors in real data too strongly, and there is no objective criterion for deciding whether a discrimination is appropriate at deep nesting levels of the classification tree, where the number of sample modules becomes small. To solve these problems, a new metric is introduced and its validity is verified theoretically and experimentally. In our examples, complexity metrics for programs written in the C language are investigated, including lines of source code, Halstead's (1977) software science, McCabe's (1976) cyclomatic number, Henry and Kafura's (1981) fan-in/fan-out, and Howatt and Baker's (1989) scope number. Our experiments with a medium-sized piece of software (85 thousand lines of source code; 562 samples) show that, compared with the conventional classification tree, the tree generated with our new metric identifies the target class of the observed modules more efficiently, using the minimum number of complexity metrics, without any significant decrease in the correct classification ratio (from 76% to 72%).
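
The idea of applying AIC to the binomial distribution when deciding whether a tree node should be split can be illustrated with a minimal sketch. This is not the authors' metric, which the abstract does not define; it is only the textbook comparison AIC = -2 log L + 2k between a one-parameter model (no split) and a two-parameter model (split into two child nodes). The function names and the (low-quality count, module count) tuple layout are assumptions made for illustration.

```python
import math


def binomial_log_likelihood(positives, total):
    """Maximized log-likelihood of `total` Bernoulli trials with `positives`
    successes, evaluated at the maximum likelihood estimate p = positives / total."""
    if total == 0 or positives == 0 or positives == total:
        return 0.0  # degenerate cases: the likelihood at the MLE is 1
    p = positives / total
    return positives * math.log(p) + (total - positives) * math.log(1.0 - p)


def aic(log_likelihood, num_params):
    """Akaike Information Criterion: lower values indicate a better trade-off."""
    return -2.0 * log_likelihood + 2.0 * num_params


def split_is_justified(left, right):
    """Compare keeping a node whole against splitting it into two children.

    `left` and `right` are hypothetical (low_quality_count, module_count)
    tuples for the two candidate child nodes.
    Returns (decision, aic_without_split, aic_with_split).
    """
    pooled_pos = left[0] + right[0]
    pooled_n = left[1] + right[1]
    # One shared proportion for the whole node: 1 free parameter.
    aic_pooled = aic(binomial_log_likelihood(pooled_pos, pooled_n), 1)
    # One proportion per child: 2 free parameters.
    ll_split = binomial_log_likelihood(*left) + binomial_log_likelihood(*right)
    aic_split = aic(ll_split, 2)
    return aic_split < aic_pooled, aic_pooled, aic_split


if __name__ == "__main__":
    # Same kind of imbalance between children, at two sample sizes.
    print(split_is_justified((40, 50), (10, 50)))
    print(split_is_justified((2, 3), (1, 3)))
```

In this sketch the split of 100 modules is accepted, while the split of 6 modules is rejected because the gain in likelihood does not pay for the extra parameter. That is the kind of objective stopping criterion the abstract says is missing at deep nesting levels, where few sample modules remain.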

Keywords

C language; binomial distribution; information theory; maximum likelihood estimation; software metrics; software quality; software reliability; trees (mathematics); AIC; Akaike Information Criterion; cyclomatic number; errors; experiments; fan-in fan-out; methodology; scope number; software complexity metrics; software quality classification trees; software quality prediction model; software science; source code; tree generation; Application software; Classification tree analysis; Communication system software; Communication systems; Entropy; Information analysis; Laboratories; Mathematical model

fLanguage

English

Publisher

IEEE

Conference_Titel

Software Reliability Engineering, 1997. Proceedings., The Eighth International Symposium on

Conference_Location

Albuquerque, NM

Print_ISBN

0-8186-8120-9

Type

conf

DOI

10.1109/ISSRE.1997.630869

Filename

630869