KDA: An unsupervised approach for analyzing keyphrases distance from news articles as a feature of keyphrase extraction

Alam Miah, Mohammad Badrul and Suryanti, Awang (2022) KDA: An unsupervised approach for analyzing keyphrases distance from news articles as a feature of keyphrase extraction. In: The 6th National Conference for Postgraduate Research (NCON-PGR 2022), 15 November 2022, Virtual Conference, Universiti Malaysia Pahang, Malaysia. p. 83.

Abstract

Automatic keyphrase extraction remains a significant and difficult problem because of the exponential growth of information and internet sources. Many natural language processing and information retrieval tasks benefit greatly from keyphrases. Extracting the best keyphrases, and summarizing documents to the highest standard, depends on the features extracted for those keyphrases. This paper proposes an unsupervised, region-based KDA technique for analyzing the distance of keyphrases from news articles as a feature of keyphrase extraction. The proposed technique is divided into eight phases: data collection, data pre-processing, data processing, keyphrase searching, distance calculating, distance averaging, curve plotting, and curve fitting. First, two different datasets of news articles are collected and passed to the data pre-processing step, which applies several pre-processing algorithms. The pre-processed data then enters the data processing stage, from which it passes to the keyphrase searching, distance calculation, and distance averaging steps. Curve plotting analysis is then applied, and finally the curve fitting technique is used. The performance of the proposed technique is evaluated on two of the most accessible benchmark datasets and compared with other available methods to demonstrate its efficiency, advantages, and importance. The experimental results show that the proposed approach efficiently analyzed the keyphrase distance from news articles, produced an F1-score of 96.91%, and presented keyphrases of 94.55%, and that it greatly improved the effectiveness of current keyphrase extraction methods.
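
The abstract names the phases but does not define the distance measure, the regions, or the curve model, so the following is only a minimal Python sketch of how the searching, distance, averaging, and fitting phases might fit together. It assumes that "distance" means the relative position of a keyphrase's first occurrence in the article (0 at the start, 1 at the end), that regions are equal-width slices of the article, and that a straight-line least-squares fit stands in for the unspecified curve fitting technique. All function names are hypothetical and are not taken from the paper.

```python
import re


def tokenize(text):
    """Stand-in pre-processing: lowercase and keep alphanumeric tokens only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyphrase_positions(text, keyphrases):
    """Keyphrase searching + distance calculation (assumed definition):
    relative position of each keyphrase's first occurrence, in [0, 1]."""
    tokens = tokenize(text)
    positions = {}
    for phrase in keyphrases:
        words = tokenize(phrase)
        n = len(words)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == words:
                positions[phrase] = i / max(len(tokens) - 1, 1)
                break  # keep only the first occurrence
    return positions


def average_distance(positions):
    """Distance averaging: mean relative position of the keyphrases found."""
    values = list(positions.values())
    return sum(values) / len(values) if values else None


def region_histogram(all_positions, regions=4):
    """Count how many positions fall into each equal-width region."""
    counts = [0] * regions
    for p in all_positions:
        counts[min(int(p * regions), regions - 1)] += 1
    return counts


def fit_line(xs, ys):
    """Curve fitting stand-in: ordinary least-squares line y = a*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


if __name__ == "__main__":
    # Toy article and keyphrases, invented purely for illustration.
    article = "Flood warning issued. Heavy rain is expected, and the flood warning stays."
    found = keyphrase_positions(article, ["flood warning", "heavy rain"])
    counts = region_histogram(found.values(), regions=4)
    slope, intercept = fit_line(list(range(len(counts))), counts)
    print(found, average_distance(found), counts, slope, intercept)
```

Under these assumptions, a negative fitted slope over the region counts would indicate that keyphrases cluster toward the beginning of an article; whether the paper reports such a trend cannot be determined from the abstract alone.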

Item Type: Conference or Workshop Item (Lecture)
Uncontrolled Keywords: Curve fitting technique; Data pre-processing; Data processing; Feature extraction; KDA technique; Keyphrase extraction.
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
T Technology > T Technology (General)
T Technology > TA Engineering (General). Civil engineering (General)
Faculty/Division: Institute of Postgraduate Studies
Faculty of Computing
Depositing User: Mr Muhamad Firdaus Janih@Jaini
Date Deposited: 20 Feb 2023 07:13
Last Modified: 04 Jan 2024 01:25
URI: http://umpir.ump.edu.my/id/eprint/36844