Systematic review of using machine learning in imputing missing values

Alabadla, Mustafa and Fatimah, Sidi and Iskandar, Ishak and Hamidah D., Ibrahim and Lilly Suriani, Affendey and Zafienas, Che Ani and A. Jabar, Marzanah Ab and Bukar, Umar Ali and Devaraj, Navin Kumar and Ahmad Sobri, Muda and Tharek, Anas and Noritah, Omar and Mohd Izham, Mohd Jaya (2022) Systematic review of using machine learning in imputing missing values. IEEE Access, 10. pp. 44483-44502. ISSN 2169-3536. (Published)

Abstract

Missing data are a universal data quality problem in many domains, leading to misleading analysis and inaccurate decisions. Much research has investigated the different mechanisms of missing data and the proper techniques for handling various data types. In the last decade, machine learning has been used in place of conventional methods to address the problem of missing values more efficiently. By studying and analyzing recently proposed machine learning methods, important improvements in accuracy, performance, and time consumed can be highlighted. This study aimed to help data analysts and researchers address the limitations of machine learning imputation methods by conducting a systematic literature review that provides a comprehensive overview of using such methods to impute missing values. Newly proposed machine learning approaches to data imputation are analyzed and summarized to assist researchers in selecting a suitable method based on several factors and settings. The review was performed on research studies published between 2016 and 2021 on adopting machine learning to impute missing values, focusing on their strengths and limitations. A total of 684 research articles retrieved from various scientific databases using search engines were analyzed, and 94 of them were selected as primary studies. Finally, several recommendations were given to guide future researchers in applying machine learning to impute missing values.
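
As a rough illustration of the contrast the abstract draws between conventional and machine learning imputation, the sketch below compares column-mean imputation with a simple k-nearest-neighbour imputer. It is not taken from the reviewed paper: the function names `mean_impute` and `knn_impute`, the toy `data` table, and the choice of kNN as the example technique are all assumptions made for illustration, and the code assumes every column has at least one observed value in another row.

```python
import math


def mean_impute(rows):
    """Conventional baseline: replace each None with its column mean."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def knn_impute(rows, k=2):
    """Replace each None with the mean of that column over the k nearest rows.

    Distance is the root mean squared difference over the columns that are
    observed in both rows (an assumed, simplified distance measure).
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf  # no common columns: treat as maximally far
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    result = [list(r) for r in rows]  # copy so the input is not modified
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows in which column j is observed.
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            result[i][j] = sum(nearest) / len(nearest)
    return result


# Hypothetical toy table: the fourth row is missing its second value.
data = [
    [1.0, 2.0],
    [1.1, 2.2],
    [5.0, 10.0],
    [5.2, None],
    [0.9, 1.8],
]

mean_filled = mean_impute(data)
knn_filled = knn_impute(data, k=1)
```

In this toy table the mean imputer fills the gap with the overall column average, while the nearest-neighbour imputer borrows the value from the most similar row, which shows in miniature why neighbour-based and other learned methods can outperform a global statistic when rows form distinct groups.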

Item Type: Article
Additional Information: Indexed by Scopus
Uncontrolled Keywords: Data imputation; Data mining; Data preprocessing; Data quality; Missingness; Systematic literature review
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
T Technology > T Technology (General)
T Technology > TA Engineering (General). Civil engineering (General)
Faculty/Division: College of Engineering
Faculty of Computing
Depositing User: Mr Muhamad Firdaus Janih@Jaini
Date Deposited: 08 Nov 2023 02:50
Last Modified: 08 Nov 2023 02:50
URI: http://umpir.ump.edu.my/id/eprint/38852