Keyphrases Frequency Analysis from Research Articles: A Region-Based Unsupervised Novel Approach

Miah, Mohammad Badrul Alam and Suryanti, Awang and Rahman, Md Mustafizur and A. S. M., Sanwar Hosen and Ra, In-Ho (2022) Keyphrases Frequency Analysis from Research Articles: A Region-Based Unsupervised Novel Approach. IEEE Access. pp. 1-12. ISSN 2169-3536. (Published)

Keyphrases Frequency Analysis.pdf - Accepted Version
Available under License Creative Commons Attribution.

Abstract

Due to the advancement of technology and the exponential growth of digital sources and textual data, extracting high-quality keyphrases and summarizing content to a high standard has become increasingly difficult. Both tasks call for keyphrase frequency as a feature for keyword extraction, an approach that is becoming more popular. This article proposes a novel unsupervised keyphrase frequency analysis (KFA) technique for keyphrase feature extraction that is corpus-independent, domain-independent, language-agnostic, and independent of document length, and that can be used by both supervised and unsupervised algorithms. The proposed technique has five essential phases: data acquisition, data pre-processing, statistical methodologies, curve plotting analysis, and curve fitting. First, five different datasets are collected from various sources and passed to the data pre-processing phase, which applies text pre-processing techniques. The pre-processed data is then passed to the region-based statistical process, followed by the curve plotting phase and, finally, the curve fitting phase. The proposed technique is tested and assessed on five standard datasets and then compared with our recommended systems to demonstrate its efficacy, benefits, and significance. The experimental findings indicate that the proposed technique effectively analyses keyphrase frequency in articles: 70.63% of the total present keyphrase frequency falls in the 1st region and 10.74% in the 2nd region.
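
The region-based statistic described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name region_frequency_share, the tokenizer, and the definition of a "region" as one of n_regions equal-width slices of the token sequence are all assumptions made here for illustration, and the paper may define regions differently.

```python
import re


def region_frequency_share(text, keyphrases, n_regions=5):
    """Return the percentage of keyphrase occurrences found in each region.

    Assumption (not from the paper): a region is one of n_regions
    equal-width slices of the document's token sequence.
    """
    # Hypothetical tokenizer: lowercase alphanumeric runs only.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = [0] * n_regions
    if not tokens:
        return [0.0] * n_regions

    for phrase in keyphrases:
        words = re.findall(r"[a-z0-9]+", phrase.lower())
        if not words:
            continue
        k = len(words)
        for i in range(len(tokens) - k + 1):
            if tokens[i:i + k] == words:
                # Assign the occurrence to the region of its first token.
                counts[i * n_regions // len(tokens)] += 1

    total = sum(counts)
    if total == 0:
        return [0.0] * n_regions
    return [100.0 * c / total for c in counts]
```

Calling region_frequency_share(article_text, gold_keyphrases, n_regions=5) returns one percentage per region. Averaging those percentages over a dataset would give figures of the same kind as those quoted in the abstract, under the region definition assumed here.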

Item Type: Article
Uncontrolled Keywords: Curve fitting technique, Data pre-processing, Feature extraction, Keyphrase extraction, Keyphrase frequency analysis, KFA technique
Subjects: Q Science > QA Mathematics > QA76 Computer software; T Technology > TA Engineering (General). Civil engineering (General)
Faculty/Division: Faculty of Engineering Technology; Institute of Postgraduate Studies; Faculty of Computing
Depositing User: Noorul Farina Arifin
Date Deposited: 06 Sep 2022 08:27
Last Modified: 06 Sep 2022 08:27
URI: http://umpir.ump.edu.my/id/eprint/35106