Effectiveness of the Random Forest Algorithm for Software Quality Classification  
Author Kehan Gao


Co-Author(s) Taghi M. Khoshgoftaar; Amri Napolitano


Abstract Software defect prediction is the process of using software metrics and fault data, together with classification algorithms, to classify program modules into quality-based classes (i.e., fault-prone or not-fault-prone). Such predictions can help practitioners allocate limited project resources effectively or focus on program modules that are of poor quality or likely to have a high number of faults. Many factors may affect the prediction results, such as the dimensionality of the training datasets and the type of learner used. In this paper, we investigate a classifier ensemble, Random Forest, and evaluate its performance in the context of software defect prediction. The experiments were carried out on three groups of software datasets. The results demonstrate that the Random Forest algorithm consistently performed better than some other frequently used classifiers.


Keywords software defect prediction, quality classification, learners, Random Forest
Article #: 22181
Proceedings of the 22nd ISSAT International Conference on Reliability and Quality in Design
August 4-6, 2016 - Los Angeles, California, U.S.A.