Abstract
Risk identification and risk management are the two most important parts of construction project management. Better risk management helps in anticipating future consequences, while the identification of possible risk factors affects the risk management process both directly and indirectly. In this paper, a risk prediction system based on a cross analytical-machine learning model was developed for construction megaprojects. A total of 63 risk factors pertaining to the cost, time, quality, and scope of the megaproject were considered, and primary data were collected from industry experts on a five-point Likert scale. The obtained sample was then processed statistically to generate a significantly large set of features for K-means-clustering-based identification of high-risk factors and allied sub-risk components. Descriptive analysis, followed by the synthetic minority over-sampling technique (SMOTE) and the Wilcoxon rank-sum test, was performed to retain the most significant features pertaining to cost, time, quality, and scope. Finally, in place of classical K-means clustering, a genetic-algorithm-based K-means clustering algorithm (GA-K-means) with dual objective functions was applied to segment high-risk factors and allied sub-risk components. The proposed model identified distinct high-risk factors and sub-risk factors that can cumulatively affect overall performance. Identifying these high-risk factors and their corresponding sub-risk components can therefore help stakeholders in achieving project success.
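To illustrate the clustering step, the following is a minimal sketch of a genetic-algorithm-driven K-means search. It is not the implementation used in this study. It assumes a single fitness function (within-cluster sum of squares) as a stand-in for the dual objective functions, which are not specified here. The function names, the parameters `pop_size`, `generations`, and `mutation_rate`, and the input format (a list of numeric tuples) are all hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def wcss(points, centroids):
    """Within-cluster sum of squares (assumed fitness; lower is better)."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

def kmeans_step(points, centroids):
    """One Lloyd update: move each centroid to the mean of its members."""
    labels = assign(points, centroids)
    updated = []
    for j, c in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
        else:
            updated.append(c)  # an empty cluster keeps its old centroid
    return updated

def ga_kmeans(points, k, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Evolve a population of centroid sets; return (centroids, labels)."""
    rng = random.Random(seed)
    # Each chromosome is a list of k centroids sampled from the data.
    population = [rng.sample(points, k) for _ in range(pop_size)]
    for _ in range(generations):
        # Local refinement: one K-means step per chromosome.
        population = [kmeans_step(points, c) for c in population]
        population.sort(key=lambda c: wcss(points, c))
        elite = population[: pop_size // 2]  # truncation selection
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randint(1, k - 1) if k > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover on centroid lists
            if rng.random() < mutation_rate:
                child[rng.randrange(k)] = rng.choice(points)  # random reset
            children.append(child)
        population = elite + children
    best = min(population, key=lambda c: wcss(points, c))
    return best, assign(points, best)
```

In this sketch the genetic operators explore different centroid configurations, while the K-means step refines each candidate locally. This combination is one common way to reduce the sensitivity of classical K-means to its initial centroids.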