Feature selection using law of total variance with fast correlation-based filter

Nur Atiqah, Mustapa and Azlyna, Senawi and Liang, Chuanzun (2023) Feature selection using law of total variance with fast correlation-based filter. In: 8th International Conference on Software Engineering and Computer Systems, ICSECS 2023, 25-27 August 2023, Penang. pp. 35-40. (192961). ISBN 979-835031093-1

Abstract

The increased dimensionality of data poses a formidable obstacle to completing data mining tasks. Because of the extraneous features associated with high-dimensional data, processing and analysis take longer and are less precise. As a pre-processing phase in data mining, feature selection is effective at reducing dimensionality, removing irrelevant features, increasing accuracy, and enhancing the readability of the results. This research proposes the law of total variance with fast correlation-based filter (LTVFCBF) as a new feature selection method. LTVFCBF selects the significant features by identifying the relevant features and then removing the redundant features among them. The proposed LTVFCBF was evaluated on ten datasets of varied dimensionality and validated using four classifiers: K-nearest neighbours, Naïve Bayes, support vector machine, and bagging. The LTVFCBF and LTV methods were compared in terms of the number of selected features, classification accuracy, and execution time. Overall, the proposed LTVFCBF has the potential to reduce the dimensionality of data by selecting fewer significant features with better accuracy, although it requires a slightly higher execution time than LTV. In addition, LTVFCBF can achieve comparable accuracy with faster execution time when fewer than half of the original features are retained. The proposed method produces promising results and may be regarded as an effective filter approach for feature selection.
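
The two stages described in the abstract (score relevance, then drop redundant features among the relevant ones) can be illustrated with a minimal sketch. The abstract does not give the scoring formula or the redundancy rule, so everything below is an assumption rather than the authors' implementation: relevance is taken as the share of a feature's variance explained by the class label, Var(E[X|C]) / Var(X), which follows from the law of total variance Var(X) = E[Var(X|C)] + Var(E[X|C]); redundancy follows the general pattern of a fast correlation-based filter, with the squared Pearson correlation (suggested by the keywords) standing in for the feature-to-feature measure. The function names and the `relevance_threshold` parameter are hypothetical.

```python
import numpy as np


def ltv_relevance(X, y):
    """Share of each feature's variance explained by the class label.

    Law of total variance: Var(X) = E[Var(X|C)] + Var(E[X|C]).
    The score is Var(E[X|C]) / Var(X), in [0, 1] (assumed scoring rule).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    total_var = X.var(axis=0)          # population variance per feature
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])     # Var(E[X|C]) per feature
    for label in np.unique(y):
        rows = X[y == label]
        weight = rows.shape[0] / X.shape[0]
        between += weight * (rows.mean(axis=0) - overall_mean) ** 2
    # Constant features carry no information; give them a score of 0.
    safe_var = np.where(total_var > 0, total_var, 1.0)
    return np.where(total_var > 0, between / safe_var, 0.0)


def ltv_fcbf_select(X, y, relevance_threshold=0.1):
    """Return indices of selected features, most relevant first.

    Stage 1: keep features whose relevance reaches the threshold.
    Stage 2: walk the ranked list; each kept feature removes every
    lower-ranked feature whose squared Pearson correlation with it is
    at least that feature's own relevance (assumed redundancy rule).
    """
    X = np.asarray(X, dtype=float)
    scores = ltv_relevance(X, y)
    ranked = np.argsort(-scores)
    candidates = [int(j) for j in ranked if scores[j] >= relevance_threshold]
    selected = []
    while candidates:
        best = candidates.pop(0)
        selected.append(best)
        survivors = []
        for j in candidates:
            r = np.corrcoef(X[:, best], X[:, j])[0, 1]
            if r ** 2 < scores[j]:     # not redundant with `best`
                survivors.append(j)
        candidates = survivors
    return selected
```

Calling `ltv_fcbf_select(X, y)` on a samples-by-features array `X` and a label vector `y` returns the retained column indices. Comparing the squared correlation with the relevance score keeps both quantities on the same [0, 1] scale, which is one convenient choice among several; the threshold value of 0.1 is arbitrary and would need tuning per dataset.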

Item Type: Conference or Workshop Item (Lecture)
Additional Information: Indexed by Scopus
Uncontrolled Keywords: Dimensionality reduction; Feature selection; Filter feature selection; Law of total variance; Pearson correlation coefficient
Subjects: Q Science > Q Science (General)
Q Science > QA Mathematics
Faculty/Division: Institute of Postgraduate Studies
Center for Mathematical Science
Depositing User: Mr Muhamad Firdaus Janih@Jaini
Date Deposited: 16 Apr 2024 04:18
Last Modified: 16 Apr 2024 04:18
URI: http://umpir.ump.edu.my/id/eprint/40376