Abstract
Accurate prediction of soil contamination in abandoned mining areas is necessary to address the environmental risks they pose. This study combined machine learning and geostatistics to predict the spatial distribution of soil contamination from heavy metal data collected at an abandoned metal mine. Exploratory data analysis was used to identify patterns in the collected data, the root mean squared error (RMSE) and coefficient of determination (R²) were used to evaluate the predicted values, and the models were validated using K-fold cross-validation. The hyperparameters of the Random Forest (RF) and Ordinary Kriging (OK) models were tuned with GridSearchCV, and prediction maps were produced using the selected optimal parameters. Furthermore, the prediction residuals of the RF model were calculated, and the RF prediction map was summed with the OK interpolation of those residuals to construct an RF-OK prediction map. The RMSE values for the RF, OK, and RF-OK models were 66.214, 65.101, and 52.884 mg/kg, and the corresponding R² values were 0.867, 0.871, and 0.915. These results, with the minimum RMSE and maximum R² for each model, were obtained through hyperparameter tuning. The proposed RF-OK hybrid model achieved the lowest RMSE and the highest R² of the three models, demonstrating superior prediction performance compared to the individual models.
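The hybrid scheme described above (an RF prediction plus an OK interpolation of the RF residuals) can be illustrated with a minimal Python sketch. This is not the study's implementation. The synthetic data, the covariate, the parameter grid, the fixed exponential variogram (`sill`, `corr_range`), and the helper names `exp_variogram` and `ordinary_kriging` are all assumptions made for illustration. The abstract does not say how the residuals were computed, so the sketch uses residuals on the training points.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold


def rmse(y_true, y_pred):
    """Root mean squared error."""
    diff = np.asarray(y_true, float) - np.asarray(y_pred, float)
    return float(np.sqrt(np.mean(diff ** 2)))


def r2(y_true, y_pred):
    """Coefficient of determination."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def exp_variogram(h, sill, corr_range):
    """Exponential semivariogram without nugget (assumed model, not fitted)."""
    return sill * (1.0 - np.exp(-3.0 * h / corr_range))


def ordinary_kriging(xy_obs, z_obs, xy_new, sill=1.0, corr_range=30.0):
    """Ordinary kriging of z_obs at xy_new using a fixed variogram."""
    n = len(xy_obs)
    d_obs = np.linalg.norm(xy_obs[:, None, :] - xy_obs[None, :, :], axis=2)
    d_new = np.linalg.norm(xy_new[:, None, :] - xy_obs[None, :, :], axis=2)
    # Kriging system: semivariances bordered by the unbiasedness constraint.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_variogram(d_obs, sill, corr_range)
    A[n, n] = 0.0
    b = np.ones((n + 1, len(xy_new)))
    b[:n, :] = exp_variogram(d_new, sill, corr_range).T
    weights = np.linalg.solve(A, b)[:n, :]  # drop the Lagrange multiplier row
    return weights.T @ z_obs


# Synthetic stand-in for the soil samples (hypothetical data, not the study's).
rng = np.random.default_rng(0)
xy = rng.uniform(0.0, 100.0, size=(120, 2))   # sample coordinates
covariate = rng.normal(size=120)              # hypothetical auxiliary variable
X = np.column_stack([xy, covariate])
y = (50.0 + 20.0 * covariate + 30.0 * np.sin(xy[:, 0] / 15.0)
     + rng.normal(scale=2.0, size=120))
train, test = np.arange(90), np.arange(90, 120)

# Step 1: tune RF with GridSearchCV and K-fold cross-validation.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 5]},
    scoring="neg_root_mean_squared_error",
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X[train], y[train])
rf = search.best_estimator_
rf_pred = rf.predict(X[test])

# Step 2: krige the RF residuals observed at the training locations.
residuals = y[train] - rf.predict(X[train])
ok_resid = ordinary_kriging(xy[train], residuals, xy[test])

# Step 3: hybrid prediction = RF prediction + kriged residual.
hybrid_pred = rf_pred + ok_resid

print("RF     RMSE:", rmse(y[test], rf_pred), "R2:", r2(y[test], rf_pred))
print("RF-OK  RMSE:", rmse(y[test], hybrid_pred), "R2:", r2(y[test], hybrid_pred))
```

In practice the variogram would be fitted to the empirical semivariances of the residuals rather than fixed, and out-of-fold residuals are a common alternative to training-set residuals. Whether the hybrid improves on RF alone depends on how much spatial structure remains in the residuals.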