Resumen
Modern machine learning methods, such as tree ensembles, have recently become extremely popular due to their versatility and scalability in handling heterogeneous data and have been successfully applied across a wide range of domains. In this study, two widely applied tree ensemble methods, i.e., random forest (parallel ensemble) and gradient boosting (sequential ensemble), were investigated to predict resilient modulus, using routinely collected soil properties. Laboratory test data on sandy soils from nine borrow pits in Georgia were used for model training and testing. For comparison purposes, the two tree ensemble methods were evaluated against a regression tree model and a multiple linear regression model, demonstrating their superior performance. The results revealed that a single tree model generally suffers from high variance, while providing a similar performance to the traditional multiple linear regression model. By leveraging a collection of trees, both tree ensemble methods, Random Forest and eXtreme Gradient Boosting, significantly reduced variance and improved prediction accuracy, with the eXtreme Gradient Boosting being the best model, with an R2 of 0.95 on the test dataset.