A review of feature selection on text classification

Nur Syafiqah, Mohd Nafis and Suryanti, Awang (2018) A review of feature selection on text classification. In: Proceedings Book: National Conference for Postgraduate Research (NCON-PGR 2018), 28-29 August 2018, Universiti Malaysia Pahang, Gambang, Pahang. pp. 8-14. ISBN 978-967-22260-5-5


Textual data is high-dimensional. In high-dimensional data, the number of features exceeds the number of samples, which also increases the amount of noise and the number of irrelevant features. Dimensionality reduction is therefore necessary. Feature selection is one dimensionality reduction technique, and it has become an indispensable component of classification. In this paper, we present three feature selection approaches: filter, wrapper, and embedded. Their aims, advantages, and disadvantages are briefly explained. The paper also reviews several significant studies of each feature selection approach for text classification. Based on these studies, we found that the wrapper approach is used less often by researchers, since it is prone to over-fitting and to local optima in text classification, while the filter and embedded approaches have attracted more research. However, the filter approach cannot guarantee classification accuracy, because it does not incorporate any learning algorithm. We therefore conclude that embedded feature selection can offer promising classification performance in terms of both classification accuracy and computational time.
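
To make the filter approach concrete, the following is a minimal sketch, not taken from the paper. It assumes chi-square on term presence as the scoring criterion (one common filter statistic) and uses a made-up four-document corpus. The names `chi2_scores`, `select_top_k`, `docs`, and `labels` are hypothetical. Note that terms are ranked without consulting any classifier, which is the property the abstract attributes to filter methods.

```python
from collections import Counter

def chi2_scores(docs, labels):
    """Chi-square score of each term against the class labels.

    Each term is treated as a binary feature (present/absent in a document).
    This is an illustrative filter criterion, assumed for this sketch.
    """
    n = len(docs)
    doc_terms = [set(d.lower().split()) for d in docs]
    vocab = sorted(set().union(*doc_terms))
    class_counts = Counter(labels)
    scores = {}
    for term in vocab:
        present = sum(1 for terms in doc_terms if term in terms)
        score = 0.0
        for cls, cls_total in class_counts.items():
            # Observed number of documents of this class containing the term.
            with_term = sum(
                1 for terms, y in zip(doc_terms, labels)
                if term in terms and y == cls
            )
            cells = (
                (with_term, present),                   # term present
                (cls_total - with_term, n - present),   # term absent
            )
            for observed, row_total in cells:
                expected = row_total * cls_total / n
                if expected > 0:
                    score += (observed - expected) ** 2 / expected
        scores[term] = score
    return scores

def select_top_k(docs, labels, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    scores = chi2_scores(docs, labels)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]

# Toy corpus, invented for illustration only.
docs = [
    "cheap pills buy now",
    "buy cheap watches now",
    "meeting agenda for monday",
    "project meeting notes for review",
]
labels = ["spam", "spam", "ham", "ham"]

selected = select_top_k(docs, labels, k=5)
print(selected)
```

A wrapper method would instead evaluate candidate feature subsets by training a classifier on each one, and an embedded method would perform the selection inside the training of the classifier itself; neither is shown here.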

Item Type: Conference or Workshop Item (Lecture)
Uncontrolled Keywords: feature selection; text classification; high-dimensional
Subjects: Q Science > QA Mathematics > QA76 Computer software
Faculty/Division: Faculty of Computer System And Software Engineering
Depositing User: Pn. Hazlinda Abd Rahman
Date Deposited: 10 Dec 2018 02:36
Last Modified: 24 Jul 2019 01:17
URI: http://umpir.ump.edu.my/id/eprint/23030
