• DocumentCode
    3674832
  • Title
    Improving Cross-Project Defect Prediction Methods with Data Simplification
  • Author
    Sousuke Amasaki;Kazuya Kawata;Tomoyuki Yokogawa
  • Author_Institution
    Dept. of Syst. Eng., Okayama Prefectural Univ., Soja, Japan
  • fYear
    2015
  • Firstpage
    96
  • Lastpage
    103
  • Abstract
    Context: Cross-project defect prediction (CPDP) has been a popular research topic, and many CPDP methods have been proposed. These methods use cross-project data as is, yet useless or noisy information in that data can degrade both predictive and computational performance. Removing such information simplifies the cross-project data and is expected to affect the performance of CPDP methods. Objective: To identify and quantify the effects of data simplification on CPDP methods. Method: We conducted experiments comparing the predictive performance of CPDP with and without data simplification. We adopted a data simplification method based on an active learning method proposed for software effort estimation. The experiments used 44 versions of OSS projects, four prediction models, and two CPDP methods, namely Burak-filter and cross-project selection. Results: Data simplification significantly improved predictive performance for cross-project selection. It did not improve Burak-filter. Conclusion: Data simplification can help cross-project selection in terms of both predictive performance and size reduction of the cross-project data.
  • Keywords
    "Predictive models","Measurement","Logistics","Learning systems","Support vector machines","Data models","Software"
  • Publisher
    IEEE
  • Conference_Titel
    Software Engineering and Advanced Applications (SEAA), 2015 41st Euromicro Conference on
  • ISSN
    1089-6503
  • Electronic_ISBN
    2376-9505
  • Type
    conf
  • DOI
    10.1109/SEAA.2015.25
  • Filename
    7302438