Region-Based Distance Analysis of Keyphrases: A New Unsupervised Method for Extracting Keyphrases Feature from Articles

Miah, Mohammad Badrul Alam and Suryanti, Awang and Md.Saiful, Azad (2021) Region-Based Distance Analysis of Keyphrases: A New Unsupervised Method for Extracting Keyphrases Feature from Articles. In: International Conference on Software Engineering & Computer Systems and 4th International Conference on Computational Science and Information Management (ICSECS-ICOCSIM 2021), 24-26 August 2021, Pekan, Pahang, Malaysia. pp. 124-129. ISBN 978-1-6654-1407-4

Abstract

Owing to the exponential growth of information and web sources, automatic keyphrase extraction remains a challenging research problem. Keyphrases are very helpful for several tasks in natural language processing (NLP) and information retrieval (IR) systems. Feature extraction for keyphrases plays a vital role in extracting top-quality keyphrases and in summarising documents to a high standard. This paper proposes a new unsupervised technique, region-based distance analysis of keyphrases (RDAK), for extracting keyphrase features from articles. The proposed method comprises six phases: data acquisition and preprocessing, data processing, distance calculation, average distance, curve plotting, and curve fitting. First, the collected datasets are passed to the preprocessing step, which applies several text preprocessing techniques. The preprocessed data then enters the data processing phase. After distance calculation, the data is passed to the region-based average calculation process, followed by curve plotting analysis and, finally, curve fitting. The proposed system was tested and its performance evaluated on benchmark datasets. The proposed system will significantly improve the performance of existing keyphrase extraction techniques.
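
The abstract names the phases but does not define them, so the following is only a rough illustrative sketch of how such a pipeline might look, not the authors' RDAK implementation. It assumes (these are assumptions, not details from the paper) that a "region" is an equal-width slice of the token sequence, that a "distance" is the token gap between consecutive keyphrase occurrences, and that a straight-line least-squares fit can stand in for the curve fitting step. All function names are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (stand-in for preprocessing)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_positions(tokens, phrase):
    """Return the starting token index of every occurrence of `phrase`."""
    words = tokenize(phrase)
    n = len(words)
    if n == 0:
        return []
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words]


def region_average_distances(tokens, phrases, num_regions=4):
    """Average gap between consecutive keyphrase occurrences, per region.

    Assumption: regions are equal-width slices of the document, and each gap
    is credited to the region in which it ends. Regions without any gap
    yield None.
    """
    positions = sorted(p for ph in phrases for p in phrase_positions(tokens, ph))
    sums = [0.0] * num_regions
    counts = [0] * num_regions
    for prev, cur in zip(positions, positions[1:]):
        region = min(cur * num_regions // len(tokens), num_regions - 1)
        sums[region] += cur - prev
        counts[region] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


def fit_line(averages):
    """Least-squares line through (region index, average distance) points.

    Assumption: a straight line replaces the paper's unspecified curve model.
    Returns (slope, intercept).
    """
    points = [(x, y) for x, y in enumerate(averages) if y is not None]
    if len(points) < 2:
        raise ValueError("need at least two regions with data to fit a line")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy usage: "key" plays the role of a gold keyphrase.
text = "alpha key beta key gamma delta key epsilon zeta eta theta key"
tokens = tokenize(text)
averages = region_average_distances(tokens, ["key"])
slope, intercept = fit_line(averages)
```

In this toy document the gaps between keyphrase occurrences widen towards the end, so the fitted slope is positive. Whether real articles show a comparable regional pattern is a question for the paper's evaluation, which the abstract does not report in detail.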

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: Distance analysis; Region-based distance analysis; Data processing; Feature extraction; Keyphrase extraction technique; Goldkey
Subjects: Q Science > QA Mathematics > QA76 Computer software
Faculty/Division: Faculty of Computing
Depositing User: Noorul Farina Arifin
Date Deposited: 10 Jan 2022 04:15
Last Modified: 05 Jan 2024 07:38
URI: http://umpir.ump.edu.my/id/eprint/33128