A Data Science Approach To Predictive Analytic Research On Preventing Student Dropout In Higher Education

Student dropout in higher education is a critical challenge with significant social and economic implications. This study, conducted under the Design Science Research (DSR) paradigm, develops and validates a predictive artefact for the early identification of students at risk of dropping out. Using a public dataset with 37 socioeconomic and academic variables from 4,424 students, we implemented a Machine Learning (ML) pipeline that includes feature engineering and class imbalance correction using SMOTE-ENN, as well as a Stacking Ensemble model combining Random Forest, XGBoost, and SVM. The final artefact, demonstrated and evaluated, achieved an accuracy of 96.4% and an F1-Score of 0.964. Model interpretability was ensured through SHAP (SHapley Additive exPlanations), which enabled a transparent understanding of the prediction results by assigning a relevance value to each feature. This work proposes an effective and interpretable system to be used by higher education institutions to design targeted and personalised interventions for preventing student dropout.
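To make concrete the idea of assigning a relevance value to each feature, the following minimal sketch computes exact Shapley values for a single prediction of a toy scoring function. It is an illustration only and not the authors' implementation: the function `toy_risk`, the feature names (`failed_units`, `tuition_overdue`, `age_over_25`), and the all-zero baseline are hypothetical, and exact enumeration like this is only feasible for a handful of features, so a model with 37 variables would in practice need the estimation methods of a library such as SHAP.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one prediction.

    A feature that is absent from a coalition takes its baseline value;
    a feature that is present takes the value from `instance`.
    """
    names = list(instance)
    n = len(names)
    values = {}
    for target in names:
        others = [f for f in names if f != target]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without = {
                    f: instance[f] if f in coalition else baseline[f]
                    for f in names
                }
                with_target = dict(without)
                with_target[target] = instance[target]
                # Marginal contribution of `target` to this coalition
                total += weight * (predict(with_target) - predict(without))
        values[target] = total
    return values


def toy_risk(x):
    """Hypothetical dropout risk score with one interaction term."""
    return (
        0.5 * x["failed_units"]
        + 0.3 * x["tuition_overdue"]
        + 0.2 * x["failed_units"] * x["tuition_overdue"]
    )


# Hypothetical student and reference point (all features "off")
student = {"failed_units": 1.0, "tuition_overdue": 1.0, "age_over_25": 1.0}
reference = {"failed_units": 0.0, "tuition_overdue": 0.0, "age_over_25": 0.0}

phi = shapley_values(toy_risk, student, reference)
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

In this toy case the interaction term is split evenly between the two features involved, the unused feature receives zero, and the values sum to the difference between the student's score and the reference score, which is the additivity property that makes such attributions easy to read.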

Jorge Duque
ISLA - Instituto Politécnico de Gestão e Tecnologia
Portugal

José Vasconcelos
Universidade Lusófona
Portugal

Vítor Filipe
Universidade de Trás-os-Montes e Alto Douro (UTAD)
Portugal