Abstract
Depression is becoming one of the most prevalent mental disorders. This study compared five classification techniques for predicting students' risk of depression from their socio-demographics, internet addiction, alcohol use disorder, and stress levels. We propose a combined sampling technique to improve classification performance on imbalanced university student depression data. In addition, three feature selection methods (Correlation, Gain ratio, and Relief) were used to extract the most relevant features from the dataset. Our experimental results show that combining bootstrapping and undersampling with Relief feature selection produced a relatively well-balanced depression dataset without significant loss of information. The overall accuracy of depression risk prediction was 93.16%, outperforming each sampling technique applied individually. Other evaluation metrics, including precision, recall, and area under the curve (AUC), were also calculated for the various models to determine the most effective model for predicting risk of depression.
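As a rough illustration of what a combined sampling step can look like, the following Python sketch undersamples the larger class without replacement and bootstraps the smaller class with replacement until both reach a common size. It is a minimal sketch under stated assumptions, not the procedure used in this study: the function name `combined_resample`, the default target size (the mean class size), and the toy labels are all hypothetical.

```python
import random
from collections import Counter


def combined_resample(rows, labels, target_per_class=None, seed=0):
    """Balance classes by undersampling large classes and bootstrapping small ones.

    Hypothetical helper for illustration; not the study's implementation.
    """
    rng = random.Random(seed)

    # Group the rows by class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    if target_per_class is None:
        # Assumption: aim for the mean class size.
        target_per_class = round(len(rows) / len(by_class))

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target_per_class:
            # Undersampling: draw without replacement.
            chosen = rng.sample(members, target_per_class)
        else:
            # Bootstrapping: keep all originals, add draws with replacement.
            extra = rng.choices(members, k=target_per_class - len(members))
            chosen = members + extra
        out_rows.extend(chosen)
        out_labels.extend([label] * target_per_class)
    return out_rows, out_labels


# Toy data: 90 majority rows and 10 minority rows (made-up labels).
rows = [[i] for i in range(100)]
labels = ["not_at_risk"] * 90 + ["at_risk"] * 10
balanced_rows, balanced_labels = combined_resample(rows, labels)
print(Counter(balanced_labels))
```

With the toy data, each class ends up with 50 rows. In practice, resampling of this kind is normally applied to the training split only, so that duplicated minority rows do not leak into the evaluation data.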