Replacing code metrics in software fault prediction with early life cycle metrics

Author

Jiang, Yue ; Lin, Jie ; Cukic, Bojan ; Lin, Shuye ; Hu, Zhijian

Author_Institution

Faculty of Software, the Fujian Normal University, Fuzhou, China

fYear

2013

fDate

23-25 March 2013

Firstpage

516

Lastpage

523

Abstract

Fault prediction models are typically built using software metrics collected throughout the software lifecycle. When no previous release of the software product is available, the earlier the software metrics are collected, the earlier prediction models can be built to guide software verification and validation activities. In this experiment, we investigate the following problem in software fault prediction modeling: is it possible to replace later code metrics with earlier design metrics? Using Canonical Correlation Analysis (CCA), a multivariate statistical analysis method, we find that 11 code metrics can be replaced by 6 design metrics. When these 11 replaceable code metrics are removed before building fault prediction models, the resulting models typically perform statistically the same as models built using all code metrics. This study shows that design metrics available earlier can replace late-lifecycle code metrics, which would make it possible to identify faults earlier in the software lifecycle, before code is implemented. Furthermore, because metric collection is expensive, using fewer metrics while maintaining the same predictive power offers potentially high cost savings in IV&V activities.

Keywords

Bagging; Boosting; Correlation; Logistics; Measurement; Predictive models; Software;

fLanguage

English

Publisher

IEEE

Conference_Titel

Information Science and Technology (ICIST), 2013 International Conference on

Conference_Location

Yangzhou

Print_ISBN

978-1-4673-5137-9

Type

conf

DOI

10.1109/ICIST.2013.6747602

Filename

6747602