Abstract
Sea surface temperature (SST) is a critical physical parameter at the sea–air interface and plays a crucial role in sea–air interaction. Among the variations that SST exhibits across time scales, the diurnal cycle is one of the most important. Accurate simulation and prediction of the SST diurnal cycle amplitude nevertheless remain challenging. Meanwhile, the application of machine learning to marine environment research, simulation, and prediction has received increasing attention. In this study, a regression model for predicting the SST diurnal cycle amplitude was constructed from TOGA/COARE buoy observations using the extreme gradient boosting algorithm (XGBoost). Because diurnal cycle amplitudes are unevenly distributed across their range, the XGBoost model was further optimized with label distribution smoothing (LDS). The results showed that the resulting LDS-XGB model outperformed various empirical models and other machine learning models in terms of both prediction error and prediction accuracy. It also effectively mitigated the data imbalance problem without sacrificing model accuracy, yielding accurate and efficient predictions of the SST diurnal cycle amplitude. This work demonstrates the integration of marine science and machine learning and indicates that machine learning can play an important role in model parameterization and in understanding the underlying mechanisms.
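To make the label distribution smoothing step more concrete, the following is a minimal sketch of one common way to turn LDS into per-sample weights: bin the target values, smooth the histogram with a symmetric Gaussian kernel, and weight each sample by the inverse of its smoothed density. The function name `lds_weights` and the parameters `bin_width`, `kernel_size`, and `sigma` are assumptions chosen for illustration; the abstract does not state the kernel or bin settings used in the study, so this should not be read as the authors' implementation.

```python
import math


def lds_weights(labels, bin_width=0.1, kernel_size=5, sigma=2.0):
    """Return one weight per label, inversely proportional to the
    kernel-smoothed label density (illustrative sketch, assumed settings)."""
    lo = min(labels)
    # Assign each label to a fixed-width bin and count samples per bin.
    idx = [int((y - lo) / bin_width) for y in labels]
    counts = [0.0] * (max(idx) + 1)
    for i in idx:
        counts[i] += 1.0

    # Symmetric Gaussian kernel, normalized to sum to 1.
    half = kernel_size // 2
    kernel = [math.exp(-(k * k) / (2.0 * sigma * sigma))
              for k in range(-half, half + 1)]
    total = sum(kernel)
    kernel = [k / total for k in kernel]

    # Convolve the histogram with the kernel (zero padding at the edges).
    smoothed = []
    for b in range(len(counts)):
        s = 0.0
        for j, w in enumerate(kernel):
            n = b + j - half
            if 0 <= n < len(counts):
                s += w * counts[n]
        smoothed.append(s)

    # Inverse-density weights, rescaled so that their mean is 1.
    raw = [1.0 / smoothed[i] for i in idx]
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]


# Hypothetical amplitudes: many small values and one rare large value.
amplitudes = [0.05] * 9 + [1.55]
weights = lds_weights(amplitudes)
```

With weights of this kind, rare large amplitudes contribute more to the training loss than the abundant small ones. In the XGBoost scikit-learn interface, such weights could be supplied through the `sample_weight` argument of `XGBRegressor.fit`; whether the study applied LDS in this way is an assumption.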