Nurul Habibah, Abdul Rahman (2024) Predictive modelling of student academic performance using machine learning approaches: a case study in Universiti Islam Pahang Sultan Ahmad Shah. Masters thesis, Universiti Malaysia Pahang Al-Sultan Abdullah (Contributors, Thesis advisor: Sahimel Azwal, Sulaiman).
Pdf: Predictive modelling of student academic performance using machine learning approaches a case study in universiti islam pahang sultan ahmad shah.pdf - Accepted Version (2MB)
Abstract
Predictive analytics research has recently grown in popularity in higher education because it gives educators useful information and can help them improve student achievement. Based on the literature review, studies that apply machine learning and predictive analytics to improve student performance are still scarce in Malaysian higher education. In addition, rising dropout rates among students are a crucial issue for higher education institutions. When a large number of students drop out, an institution's reputation may suffer, and the country may incur a significant loss of human capital. The main goal of the study was to develop the most accurate model for predicting students' performance levels using machine learning techniques, namely multinomial logistic regression, decision tree, Random Forest, k-nearest neighbor, Naïve Bayes, and support vector machine. The study used Cramér's V and Spearman's rank correlation coefficient to identify the factor most strongly correlated with students' performance level. The evaluation metrics were precision, recall, accuracy, F1-score, and area under the receiver operating characteristic curve. Drawing on a dataset of students enrolled in the Business Statistics course at Universiti Islam Pahang Sultan Ahmad Shah from 2013 to 2022, the study identifies students' carry marks as the factor most strongly correlated with performance level. The decision tree is identified as the most accurate predictive model, with an accuracy of 0.60. It also has the highest recall and F1-score among the models compared. Finally, four models, namely multinomial logistic regression, decision tree, Random Forest, and Naïve Bayes, achieve a perfect area under the receiver operating characteristic curve of 1.00 in distinguishing students with a fail grade.
It is recommended that future research reassess the model by considering additional variables or techniques that may improve predictive accuracy. The predictive algorithm could also be integrated into the Learning Management System together with a dashboard to make future analyses easier.
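The abstract names Cramér's V as one of the two correlation measures used to rank factors against performance level. As a rough illustration only, and not the thesis's own code, the statistic can be sketched in plain Python from a contingency table as V = sqrt(chi2 / (n * min(r - 1, c - 1))). The function name `cramers_v` and the counts below are hypothetical and are not taken from the study's dataset. The sketch applies no bias correction and assumes every row and column contains at least one observation.

```python
import math


def cramers_v(table):
    """Cramér's V for an r x c contingency table given as a list of rows of counts.

    Assumes every row total and column total is non-zero (no bias correction).
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Normalise by the sample size and the smaller table dimension minus one.
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Hypothetical counts (illustration only, not data from the thesis):
# rows = carry-mark band (low, medium, high),
# columns = performance level (fail, pass, excellent).
table = [
    [30, 10, 0],
    [10, 20, 10],
    [0, 10, 30],
]
print(round(cramers_v(table), 3))
```

A value of 0 indicates no association between the two categorical variables and a value of 1 indicates perfect association, which is why the measure can be used to rank candidate factors.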
Item Type: | Thesis (Masters) |
---|---|
Additional Information: | Thesis (Master of Science (Mathematics)) -- Universiti Malaysia Pahang – 2024, SV: Sahimel Azwal bin Sulaiman, No CD: 13661 |
Uncontrolled Keywords: | multinomial logistic regression |
Subjects: | Q Science > Q Science (General); Q Science > QA Mathematics |
Faculty/Division: | Institute of Postgraduate Studies; Center for Mathematical Science |
Depositing User: | Mr. Mohd Fakhrurrazi Adnan |
Date Deposited: | 07 May 2025 07:09 |
Last Modified: | 07 May 2025 07:09 |
URI: | http://umpir.ump.edu.my/id/eprint/44026 |