DocumentCode :
2220242
Title :
Semi-supervised learning for software quality estimation
Author :
Seliya, Naeem ; Khoshgoftaar, Taghi M. ; Zhong, Shi
Author_Institution :
Dept. of Comput. Sci. & Eng., Florida Atlantic Univ., Boca Raton, FL, USA
fYear :
2004
fDate :
15-17 Nov. 2004
Firstpage :
183
Lastpage :
190
Abstract :
A software quality estimation model is often built using known software metrics and fault data obtained from program modules of previously developed releases or similar projects. Such a supervised learning approach to software quality estimation assumes that fault data is available for all the previously developed modules. Considering the various practical issues in software project development, fault data may not be available for all the software modules in the training data. More specifically, the available labeled training data is such that a supervised learning approach may not yield good software quality prediction. In contrast, a supervised classification scheme aided by unlabeled data, i.e., semisupervised learning, may yield better results. This work investigates semisupervised learning with the expectation maximization (EM) algorithm for the software quality classification problem. Case studies of software measurement data obtained from two NASA software projects, JM1 and KC2, are used in our empirical investigation. A small portion of the JM1 dataset is randomly extracted and used as the labeled data, while the remaining JM1 instances are used as unlabeled data. The performance of the semisupervised classification models built using the EM algorithm is evaluated by using the KC2 project as a test dataset. It is shown that the EM-based semisupervised learning scheme improves the predictive accuracy of the software quality classification models.
Keywords :
learning (artificial intelligence); software fault tolerance; software management; software metrics; software quality; NASA software projects; expectation maximization algorithm; labeled training data; semisupervised learning; software project development; software quality estimation; supervised learning approach; Data mining; NASA; Semisupervised learning; Software algorithms; Software measurement; Software metrics; Software quality; Supervised learning; Testing; Training data
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Tools with Artificial Intelligence, 2004. ICTAI 2004. 16th IEEE International Conference on
ISSN :
1082-3409
Print_ISBN :
0-7695-2236-X
Type :
conf
DOI :
10.1109/ICTAI.2004.108
Filename :
1374185