Software Fault Prediction at Design Phase
Singh, Pradeep; Verma, Shrish; Vyas, O.P.;
Prediction of fault-prone modules continues to attract researchers' interest because of its significant impact on software development cost. The most important goal of such techniques is to correctly identify the modules where faults are most likely to be present in the early phases of the software development lifecycle. Various software metrics related to module-level fault data have been used successfully for prediction of fault-prone modules. The goal of this research is to predict faulty modules at the design phase using design metrics of modules and the faults related to those modules. We analyzed the effect of preprocessing and different machine learning schemes on eleven projects from the NASA Metrics Data Program, which offers design metrics and their related faults. Using seven machine learning and four preprocessing techniques, we confirmed that models built from design metrics are surprisingly good at fault-proneness prediction. The results show that Naïve Bayes or Voting Feature Intervals with discretization should be chosen for different data sets, as they outperformed the other schemes among the 28 evaluated. Naïve Bayes and Voting Feature Intervals achieved AUC > 0.7 on average across the eleven projects. Our proposed framework is effective and can predict an acceptable level of faults at the design phase.
Software metrics; Machine learning; Design metric; Fault prediction
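The evaluation setup described in the abstract (seven learners crossed with four preprocessing techniques, scored by AUC averaged over projects) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract names only Naïve Bayes, Voting Feature Intervals, and discretization, so the remaining learner and preprocessor names are placeholders, the helper names `enumerate_schemes`, `auc`, and `mean_auc` are hypothetical, and the toy data is made up rather than taken from the NASA Metrics Data Program.

```python
from itertools import product

# Only "NaiveBayes", "VFI", and "discretization" come from the abstract;
# every other entry is a placeholder for a technique the abstract does not name.
LEARNERS = ["NaiveBayes", "VFI"] + [f"learner_{i}" for i in range(3, 8)]
PREPROCESSORS = ["discretization"] + [f"preprocessor_{i}" for i in range(2, 5)]


def enumerate_schemes(learners, preprocessors):
    """Pair every preprocessing technique with every learner."""
    return list(product(preprocessors, learners))


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: 1 for faulty modules, 0 for fault-free ones.
    scores: predicted fault-proneness, higher means more likely faulty.
    A tie between a faulty and a fault-free module counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one faulty and one fault-free module")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(per_project):
    """Average AUC over projects; per_project is a list of (labels, scores)."""
    values = [auc(labels, scores) for labels, scores in per_project]
    return sum(values) / len(values)


if __name__ == "__main__":
    schemes = enumerate_schemes(LEARNERS, PREPROCESSORS)
    print(len(schemes))  # number of (preprocessor, learner) pairs

    # Toy, made-up scores for two hypothetical projects (not real results).
    toy = [
        ([1, 0, 1, 0, 0], [0.9, 0.2, 0.6, 0.7, 0.1]),
        ([0, 1, 0, 1], [0.3, 0.8, 0.4, 0.4]),
    ]
    print(mean_auc(toy) > 0.7)  # compare against the 0.7 level cited above
```

In a real study each scheme would be trained and scored with cross-validation on every project before averaging; the pairwise AUC here is quadratic in the number of modules and is meant for clarity rather than speed.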
 Cited by
Fault Detection and Classification with Optimization Techniques for a Three-Phase Single-Inverter Circuit, Journal of Power Electronics, 2016, 16, 3, 1097