Resumen
For a lot of beginners, learning to program is challenging; similarly, for teachers, it is difficult to draw on students? prior knowledge to help the process because it is not quite obvious which abilities are significant for developing programming skills. This paper seeks to shed some light on the subject by identifying which previously recorded variables have the strongest correlation with passing an introductory programming course. To do this, a data set was collected including data from four cohorts of students who attended an introductory programming course, common to all Engineering programmes at a Chilean university. With this data set, several classifiers were built, using different Machine Learning methods, to determine whether students pass or fail the course. In addition, models were trained on subsets of students by programme duration and engineering specialisation. An accuracy of 68% was achieved, but the analysis by specialisation shows that both accuracy and the significant variables vary depending on the programme. The fact that classification methods select different predictors depending on the specialisation suggests that there is a variety of factors that affect a student?s ability to succeed in a programming course, such as overall academic performance, language proficiency, and mathematical and scientific skills.