Predicting students’ performance in mathematics subjects at Kolej MARA Banting using machine learning methods

Ahmad Akif, Ibrahim and Nor Azuana, Ramli and Sahimel Azwal, Sulaiman (2025) Predicting students’ performance in mathematics subjects at Kolej MARA Banting using machine learning methods. Jurnal Pendidikan Sains Dan Matematik Malaysia, 15 (1). pp. 19-31. ISSN 2600-9307. (Published)

Abstract

Predicting students’ performance is crucial for personalised education and for individual students’ success. However, no standard procedure or method that considers external factors exists for predicting students’ performance in mathematics at Kolej MARA Banting (KMB). This research aims to address this problem by exploring the potential of machine learning methods for predicting students’ performance in mathematics at KMB. The study follows a machine learning process: data collection, attribute selection, pre-processing, model training, and evaluation. A sample of 703 data points on students’ demographics, academic records, and mathematics performance was collected and pre-processed. Machine learning models such as support vector machine, decision tree, k-nearest neighbours, Naïve Bayes, Random Forest, AdaBoost, and a stacking model were applied in this study. The accuracy and performance of these models were assessed to determine which model outperformed the others and how effective it was in predicting students’ mathematics performance. The findings show that the stacking model outperformed the other models in accuracy (71.43%), precision (68.73%), recall (71.43%), and F1-score (69.80%). Nevertheless, the stacking model achieved only moderate accuracy. This could be attributed to the inherent difficulties in constructing a precise predictive model for student performance, such as the models failing to sufficiently reflect the complexities within the dataset, resulting in underfitting. Additionally, the target attribute, International Baccalaureate (IB) grade, is imbalanced, with more high performers than low performers, causing the models to be biased towards the majority class and impacting overall accuracy. The performance of the models in this study could be improved by adding more features related to students’ performance, such as anxiety, depression, and well-being, to capture more of the complexity in the data. It is also suggested that samples be obtained from other colleges with a more balanced grade distribution than that of students at KMB.
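
The reported accuracy and recall are identical (71.43%). That is what support-weighted averaging of per-class recall always produces, although the abstract does not state which averaging scheme the authors used. The following is a minimal sketch of how the four reported metrics could be computed under that assumption. The function name `weighted_metrics` and the toy labels are hypothetical and are not taken from the KMB dataset.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall and F1-score."""
    n = len(y_true)
    support = Counter(y_true)  # number of true samples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n

    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0  # per-class precision
        r_c = tp / count                            # per-class recall
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = count / n                          # class share of the sample
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return accuracy, precision, recall, f1


# Hypothetical, imbalanced toy labels (assumption, not the study's data):
# 7 high performers and 3 low performers, with a model biased towards "high".
y_true = ["high"] * 7 + ["low"] * 3
y_pred = ["high"] * 7 + ["high", "high", "low"]

acc, prec, rec, f1 = weighted_metrics(y_true, y_pred)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")
```

In this toy example the per-class recall for the minority class is only one third, yet the weighted recall matches the overall accuracy. This illustrates how a bias towards the majority class can be hidden by aggregate figures.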

Item Type: Article
Uncontrolled Keywords: Machine Learning; Students’ Performance; Mathematics Subjects; International Baccalaureate; Predictive Modelling
Subjects: L Education > L Education (General)
L Education > LB Theory and practice of education
Q Science > QA Mathematics
Faculty/Division: Center for Mathematical Science
Depositing User: Dr. Nor Azuana Ramli
Date Deposited: 28 Aug 2025 07:12
Last Modified: 28 Aug 2025 07:12
URI: https://umpir.ump.edu.my/id/eprint/45505
