Abstract
Groundwater chemistry data are normally scarce in remote inland areas, so effective statistical approaches are needed to extract information about hydrochemical processes from limited data. This study applied a clustering approach based on the Gaussian Mixture Model (GMM) to a hydrochemical dataset of groundwater collected in the middle Heihe River Basin (HRB) of northwestern China. Independent hydrological data were introduced to examine whether the clustering results led to an appropriate interpretation of the hydrochemical processes. The main findings are as follows. First, although groundwater chemistry in the middle HRB primarily reflects a natural salinization process, there is evidence of significant anthropogenic influence, such as irrigation and fertilization. Second, the regional hydrological cycle, particularly surface water-groundwater interaction, has a profound and spatially variable impact on groundwater chemistry. Third, the interaction between regional agricultural development and groundwater quality is complicated. Overall, this study demonstrates that GMM clustering can effectively analyze hydrochemical datasets and that the clustering results can provide insights into hydrochemical processes, even with a limited number of observations. The clustering approach introduced in this study represents a cost-effective way to investigate groundwater chemistry in remote inland areas where groundwater monitoring is difficult and costly.
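To make the idea of GMM-based clustering concrete, the following minimal sketch fits a two-component Gaussian mixture by expectation-maximization (EM) to a single synthetic variable and assigns each sample to its most probable component. It is not the implementation used in this study: the real dataset is multivariate, and the function names (`fit_gmm_1d`, `assign_clusters`), the synthetic "log10 TDS" values, and the choice of two components are all assumptions made purely for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(values, n_components=2, n_iter=200, var_floor=1e-6):
    """Fit a 1-D Gaussian mixture by EM (hypothetical helper, illustration only)."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic initialization: place the means at evenly spaced quantiles.
    means = [ordered[int((k + 0.5) * n / n_components)] for k in range(n_components)]
    overall_mean = sum(values) / n
    overall_var = sum((v - overall_mean) ** 2 for v in values) / n
    variances = [max(overall_var, var_floor)] * n_components
    weights = [1.0 / n_components] * n_components

    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for v in values:
            dens = [w * normal_pdf(v, m, s) for w, m, s in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and variances from the responsibilities.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            spread = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(spread, var_floor)
    return weights, means, variances


def assign_clusters(values, weights, means, variances):
    """Label each sample with the index of its most probable component."""
    labels = []
    for v in values:
        dens = [w * normal_pdf(v, m, s) for w, m, s in zip(weights, means, variances)]
        labels.append(max(range(len(dens)), key=dens.__getitem__))
    return labels


# Synthetic stand-in for one hydrochemical variable (assumed: log10 of TDS),
# drawn from a "fresher" and a "more saline" population. Not real data.
random.seed(0)
fresh = [random.gauss(-0.5, 0.2) for _ in range(30)]
saline = [random.gauss(0.5, 0.2) for _ in range(30)]
values = fresh + saline

weights, means, variances = fit_gmm_1d(values, n_components=2)
labels = assign_clusters(values, weights, means, variances)

for k, (w, m, s) in enumerate(zip(weights, means, variances)):
    print(f"component {k}: weight={w:.2f}, mean={m:.2f}, variance={s:.3f}")
```

In practice a multivariate mixture would be fitted to several standardized ion concentrations at once, and the number of components would be chosen with a criterion such as BIC rather than fixed in advance.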