ARTICLE
TITLE

Comparing Machine Learning Models and Hybrid Geostatistical Methods Using Environmental and Soil Covariates for Soil pH Prediction

Panagiotis Tziachris    
Vassilis Aschonitis    
Theocharis Chatzistathis    
Maria Papadopoulou and Ioannis (John) D. Doukas    

Abstract

This paper assesses machine learning (ML) models and hybrid geostatistical methods for predicting soil pH from digital elevation model derivatives (environmental covariates) and co-located soil parameters (soil covariates). The study area was Grevena, Greece, where 266 disturbed soil samples were collected from randomly selected locations and analyzed in the laboratory of the Soil and Water Resources Institute. The models assessed were random forests (RF), random forests kriging (RFK), gradient boosting (GB), gradient boosting kriging (GBK), neural networks (NN), and neural networks kriging (NNK). Multiple linear regression (MLR), ordinary kriging (OK), and regression kriging (RK) are not ML models, but they were included for comparison. The GB and RF models gave the best results in the study, with NN a close second. Applying OK to the ML models' residuals did not have a major impact. Classical geostatistical and hybrid geostatistical methods without ML (OK, MLR, and RK) showed worse prediction accuracy than the models that included ML. Different implementations (methods and packages) of the same ML models were also assessed. For RF and GB, the implementations applied (ranger-ranger, randomForest-rf, xgboost-xgbTree, xgboost-xgbDART) led to similar results, whereas for NN the differences between the implementations used (nnet-nnet and nnet-avNNet) were more distinct. Finally, ML models tuned through a random search optimization method were compared with the same ML models at their default values. Optimization improved the predictions only where the ML algorithm had a large number of hyperparameters to tune and the optimized values differed significantly from the defaults, as was the case for GB and NN but not for RF.
Overall, the study concluded that although RF and GB had approximately the same prediction accuracy, RF gave more consistent results across different packages, different hyperparameter selection methods, and even the application of OK to the ML models' residuals.
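
The hybrid methods named in the abstract (RK, RFK, GBK, NNK) share one idea: fit a trend model on the covariates, krige its residuals with OK, and add the two predictions. The sketch below illustrates that idea for RK only (MLR trend plus OK of the residuals) in Python with NumPy; it is not the authors' code, which the abstract indicates used R packages. All data are synthetic, the two covariates are hypothetical stand-ins, and the exponential variogram with zero nugget uses assumed parameters (`sill`, `corr_range`) instead of a fitted variogram.

```python
import numpy as np


def exp_variogram(h, sill, corr_range):
    """Exponential semivariogram with zero nugget (parameters assumed, not fitted)."""
    return sill * (1.0 - np.exp(-3.0 * h / corr_range))


def pairwise_dist(a, b):
    """Euclidean distances between every row of a and every row of b."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)


def ordinary_kriging(xy_obs, z_obs, xy_new, sill, corr_range):
    """Ordinary kriging predictions of z at the locations xy_new."""
    n = len(xy_obs)
    # Kriging system: [[Gamma, 1], [1^T, 0]] [w; mu] = [gamma_0; 1]
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_variogram(pairwise_dist(xy_obs, xy_obs), sill, corr_range)
    A[n, n] = 0.0
    b = np.ones((n + 1, len(xy_new)))
    b[:n, :] = exp_variogram(pairwise_dist(xy_obs, xy_new), sill, corr_range)
    weights = np.linalg.solve(A, b)[:n, :]  # drop the Lagrange multiplier row
    return weights.T @ z_obs


def fit_linear_trend(X, y):
    """Least-squares coefficients for an intercept plus the covariates (MLR)."""
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def predict_linear_trend(beta, X):
    return np.column_stack([np.ones(len(X)), X]) @ beta


def regression_kriging(xy_obs, X_obs, y_obs, xy_new, X_new, sill, corr_range):
    """MLR trend plus ordinary kriging of the MLR residuals."""
    beta = fit_linear_trend(X_obs, y_obs)
    residuals = y_obs - predict_linear_trend(beta, X_obs)
    kriged = ordinary_kriging(xy_obs, residuals, xy_new, sill, corr_range)
    return predict_linear_trend(beta, X_new) + kriged


# Synthetic demonstration data (NOT the Grevena samples).
rng = np.random.default_rng(0)
n = 120
xy = rng.uniform(0.0, 1.0, size=(n, 2))   # sample coordinates
X = rng.normal(size=(n, 2))               # two hypothetical covariates
cov = 0.25 * np.exp(-3.0 * pairwise_dist(xy, xy) / 0.4)
field = np.linalg.cholesky(cov + 1e-10 * np.eye(n)) @ rng.normal(size=n)
y = 6.5 + 0.4 * X[:, 0] - 0.3 * X[:, 1] + field  # synthetic "pH"

train, test = np.arange(90), np.arange(90, n)
beta = fit_linear_trend(X[train], y[train])
pred_mlr = predict_linear_trend(beta, X[test])
pred_rk = regression_kriging(xy[train], X[train], y[train],
                             xy[test], X[test], sill=0.25, corr_range=0.4)


def rmse(pred):
    return float(np.sqrt(np.mean((pred - y[test]) ** 2)))


print("RMSE, MLR only:", rmse(pred_mlr))
print("RMSE, RK      :", rmse(pred_rk))
```

Replacing `fit_linear_trend` and `predict_linear_trend` with an RF, GB, or NN regressor would give the corresponding RFK, GBK, or NNK variant under the same assumptions. Whether the kriging step helps depends on how much spatial structure remains in the residuals, which is consistent with the abstract's finding that adding OK to the ML residuals had little impact.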
