A Convolutional Neural Network (CNN) Classification Model for Web Page: A Tool for Improving Web Page Category Detection Accuracy

Siti Hawa, Apandi and Jamaludin, Sallim and Rozlina, Mohamed (2023) A Convolutional Neural Network (CNN) Classification Model for Web Page: A Tool for Improving Web Page Category Detection Accuracy. Jurnal Ilmiah Teknologi Sistem Informasi, 4 (3). pp. 110-121. ISSN 2722-4619 (print); 2722-4600 (online). (Published)

A Convolutional Neural Network (CNN) Classification Model.pdf
Available under License Creative Commons Attribution Share Alike.

Abstract

Game and Online Video Streaming are the most viewed web pages. Users who spend too much time on these types of web pages may suffer from internet addiction, so access to Game and Online Video Streaming web pages should be restricted. This requires a tool that recognises the category of a web page from its text content. Because no matrix representation is available that can handle long web page text content, this study employs a document representation known as a word cloud image to visualise the words extracted from the web page's text content after data pre-processing. The most frequent words, which describe what the web page is about, are shown in a large size and appear in the centre of the word cloud image. The Convolutional Neural Network (CNN) recognises the pattern of words presented in the core portion of the word cloud image to determine the category to which the web page belongs. The proposed model was compared with other web page classification models and achieved an accuracy of 85.6%. It can be used as a tool to make identifying the category of web pages more accurate.
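
The weighting step behind a word cloud image, as described in the abstract, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the stop-word list, the minimum word length, the linear scaling, the font size range, and the function names `word_frequencies` and `font_sizes` are all hypothetical, and rendering the image and training the CNN are left out.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not specify the
# pre-processing actually used in the paper.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "on",
              "for", "is", "are", "with", "this", "that"}


def word_frequencies(text, top_n=50):
    """Lowercase, tokenise, drop stop words and short tokens, then count."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens
                     if t not in STOP_WORDS and len(t) > 2)
    return counts.most_common(top_n)


def font_sizes(freqs, min_size=10, max_size=72):
    """Scale counts linearly to font sizes (an assumed scheme).

    The most frequent word gets max_size, the least frequent min_size.
    """
    if not freqs:
        return {}
    top = freqs[0][1]
    low = freqs[-1][1]
    span = top - low
    sizes = {}
    for word, count in freqs:
        scale = (count - low) / span if span else 1.0
        sizes[word] = round(min_size + scale * (max_size - min_size))
    return sizes


sample = ("Play the game online. Game servers host game players. "
          "Players stream video.")
freqs = word_frequencies(sample)
sizes = font_sizes(freqs)
print(sizes)
```

In this sketch the most frequent word receives the largest font size; a renderer would then place the largest words nearest the centre of the image before it is passed to an image classifier.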

Item Type: Article
Uncontrolled Keywords: Web page classification; document representation; word cloud image; deep learning; Convolutional Neural Network
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
Depositing User: Miss Amelia Binti Hasan
Date Deposited: 16 Jan 2024 06:50
Last Modified: 07 Feb 2024 07:30
URI: http://umpir.ump.edu.my/id/eprint/40032