Effectiveness of the Random Forest Algorithm for Software Quality Classification

Author	Kehan Gao
Co-Author(s)	Taghi M. Khoshgoftaar; Amri Napolitano
Abstract	Software defect prediction uses software metrics and fault data, together with classification algorithms, to assign program modules to quality-based classes (i.e., fault-prone or not-fault-prone). Such predictions help practitioners allocate limited project resources effectively by focusing on program modules that are of poor quality or likely to contain a high number of faults. Many factors can affect the prediction results, including the dimensionality of the training datasets and the type of learner used. In this paper, we investigate a classifier ensemble, Random Forest, and evaluate its performance in the context of software defect prediction. The experiments were carried out on three groups of software datasets. The results demonstrate that the Random Forest algorithm performed consistently better than some other frequently used classifiers.
Keywords	software defect prediction, quality classification, learners, Random Forest

		Article #: 22181

Proceedings of the 22nd ISSAT International Conference on Reliability and Quality in Design
August 4-6, 2016 - Los Angeles, California, U.S.A.

	International Society of Science and Applied Technologies