Abstract
A power law describes a common pattern in which a few factors play a decisive role in an outcome; in software, most defects occur in very few instances. In this study, we propose a novel approach that adopts the characteristics of the power law function for software defect prediction. The first step is to establish the power law function for the majority of metrics in a software system. Next, the maximal curvature value of each power law function is applied as the threshold for determining higher metric values. The total number of higher metric values is then counted for each instance. Finally, the resulting counts are clustered into two categories: defect-free and defect-prone instances. Case studies and a comparison were conducted on twelve public datasets from Promise, SoftLab, and ReLink using five different algorithms. The results indicate that the precision, recall, and F-measure values obtained by the proposed approach are the best among the five tested algorithms; the average recall and F-measure were improved by 14.3% and 6.0%, respectively. Furthermore, the complexity of the proposed approach based on the power law function is O(2n), which is the lowest among the five tested algorithms. The proposed approach is thus demonstrated to be feasible and highly efficient for software defect prediction with unlabeled datasets.
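
The thresholding and counting steps described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions not stated in the abstract: each metric is taken to follow y = a * x**b with already-fitted parameters a and b, the threshold is taken to be the x-coordinate of the point of maximal curvature, and the final clustering step is replaced by a simple fixed cutoff. The names `max_curvature_x`, `count_high_metrics`, `label_instance`, and `min_high` are hypothetical.

```python
import math


def max_curvature_x(a: float, b: float) -> float:
    """Return the x > 0 at which y = a * x**b has maximal curvature.

    Setting the derivative of the curvature to zero gives
    x**(2b - 2) = (b - 2) / (a^2 * b^2 * (2b - 1)),
    which has a positive solution only for b < 0.5 or b > 2.
    """
    if a == 0 or b == 0 or 0.5 <= b <= 2:
        raise ValueError("no interior point of maximal curvature")
    ratio = (b - 2.0) / (a * a * b * b * (2.0 * b - 1.0))
    return ratio ** (1.0 / (2.0 * b - 2.0))


def count_high_metrics(instance: dict, thresholds: dict) -> int:
    """Count the metrics of one instance that exceed their threshold."""
    return sum(
        1 for name, limit in thresholds.items()
        if instance.get(name, 0.0) > limit
    )


def label_instance(instance: dict, thresholds: dict, min_high: int) -> str:
    """Assumed stand-in for the clustering step: a fixed cutoff on the count."""
    high = count_high_metrics(instance, thresholds)
    return "defect-prone" if high >= min_high else "defect-free"


# Hypothetical fitted parameters (a, b) for two illustrative metrics.
fitted = {"loc": (2.0, -1.0), "complexity": (1.0, -1.0)}
thresholds = {name: max_curvature_x(a, b) for name, (a, b) in fitted.items()}

module = {"loc": 3.0, "complexity": 0.5}
print(label_instance(module, thresholds, min_high=1))
```

Under these assumptions each instance is visited once to count its higher metric values and once to assign a label, which is consistent with a cost that grows linearly in the number of instances.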