UMP Institutional Repository

Evaluating the Effect of Dataset Size on Predictive Model Using Supervised Learning Technique

Raheem, Ajiboye Adeleke and Ruzaini, Abdullah Arshah and Hongwu, Qin and Kebbe, H. Isah (2015) Evaluating the Effect of Dataset Size on Predictive Model Using Supervised Learning Technique. International Journal of Software Engineering & Computer Sciences (IJSECS), 1. pp. 74-84. ISSN 2289-8522

Abstract

Learning models used for prediction are mostly developed without much attention to the size of dataset that can produce models of high accuracy and better generalization, although the general belief is that a large dataset is needed to construct a predictive learning model. Whether a dataset counts as large is perhaps circumstance dependent; thus, what makes a dataset big or small is vague. In this paper, the ability of a predictive model to generalize with respect to a particular size of data, when simulated with new, previously unseen input, is examined. The study experiments on three different sizes of data, using a Matlab program to create predictive models, with a view to establishing whether the size of the data has any effect on the accuracy of a model. The simulated output of each model is measured using the Mean Absolute Error (MAE), and comparisons are made. Findings from this study reveal that the quantity of data partitioned for training must be a good representation of the entire set and sufficient to span the input space. The results of simulating the three network models also show that the learning model with the largest training set appears to be the most accurate and consistently delivers better and more stable results.
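
The abstract names the Mean Absolute Error as the comparison metric but does not spell it out. In its standard form, MAE is the average of the absolute differences between target and predicted values, (1/n) Σ |y_i − ŷ_i|. The following is a minimal Python sketch of that standard definition, not the authors' Matlab code; the function name and the sample values in the usage note are hypothetical and are not results from the paper.

```python
def mean_absolute_error(actual, predicted):
    """Return the average absolute difference between two equal-length sequences.

    Illustrative helper (hypothetical name); it implements the standard MAE
    definition, not the code used in the study.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    # Sum the absolute errors pairwise, then divide by the number of observations.
    total_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return total_error / len(actual)
```

As a usage example with made-up numbers, `mean_absolute_error([1, 2, 3], [2, 2, 5])` averages the errors 1, 0, and 2. A lower MAE on inputs the model has not been trained on indicates better generalization, which is the basis on which the three models are compared.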

Item Type: Article
Uncontrolled Keywords: Prediction, Neural Network, Supervised Learning, Data mining, Data size.
Subjects: Q Science > QA Mathematics > QA76 Computer software
Faculty/Division: Faculty of Computer System And Software Engineering
Depositing User: Mrs. Neng Sury Sulaiman
Date Deposited: 28 Apr 2015 07:18
Last Modified: 18 May 2018 02:49
URI: http://umpir.ump.edu.my/id/eprint/6085