Miah, M. Saef Ullah and Junaida, Sulaiman and Azad, Saiful and Kamal Z., Zamli and Rajan, Jose (2021) Comparison of document similarity algorithms in extracting document keywords from an academic paper. In: IEEE 2021 International Conference on Software Engineering & Computer Systems and 4th International Conference on Computational Science and Information Management (ICSECS-ICOCSIM), 24-26 August 2021, Pekan, Pahang, Malaysia. pp. 631-636.
Abstract
This study validates a list of keywords, derived from a scientific article by a domain expert drawing on years of knowledge, against prominent document similarity algorithms. A list of handcrafted keywords generated by Electric Double Layer Capacitor (EDLC) experts is chosen, and documents relevant to EDLC are considered for the comparison. Different similarity calculation algorithms are then applied to the documents in different settings: using the whole text of each document, selecting only the positive sentences, and computing similarity scores from keywords automatically extracted from the documents. The results show that the machine-generated keywords are mostly similar to the list curated by the domain experts. The study also suggests preferable algorithms for similarity calculation and automated key-phrase extraction in the EDLC domain.
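The abstract does not name the similarity algorithms that were compared, so the following is only a minimal sketch of one common choice, cosine similarity over term-frequency vectors, applied to a keyword list and a document. The function names, the tokenization rule, and all sample texts are assumptions made for illustration and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the term-frequency vectors of two texts."""
    tf_a, tf_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    # Dot product over the terms the two texts share.
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical inputs, invented for illustration only.
expert_keywords = ["electric double layer capacitor", "activated carbon electrode"]
document = "We study an activated carbon electrode for an electric double layer capacitor."
unrelated = "The recipe calls for flour, sugar, and butter."

score_related = cosine_similarity(" ".join(expert_keywords), document)
score_unrelated = cosine_similarity(" ".join(expert_keywords), unrelated)
print(score_related, score_unrelated)
```

The same function could be called with whole documents, with a subset of sentences, or with automatically extracted keywords in place of `document`, which roughly mirrors the three settings the abstract describes.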
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Uncontrolled Keywords: | Document similarity calculation, Relevant Document Selection, keyword extraction comparison, keyword validation, Electric Double Layer Capacitor, EDLC, Keyword Based Recommendation System |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Faculty/Division: | Faculty of Computer System And Software Engineering; Institute of Postgraduate Studies |
Depositing User: | Dr. Junaida Sulaiman |
Date Deposited: | 29 Jul 2022 03:56 |
Last Modified: | 29 Jul 2022 03:58 |
URI: | http://umpir.ump.edu.my/id/eprint/34309 |