Online media as a price monitor: Text analysis using text extraction technique and Jaro-Winkler similarity algorithm

Nurcahyawati, Vivine and Mustaffa, Zuriani (2020) Online media as a price monitor: Text analysis using text extraction technique and Jaro-Winkler similarity algorithm. In: 2020 International Conference on Emerging Technology in Computing, Communication and Electronics, ETCCE 2020, 21-22 December 2020, Virtual, Dhaka. pp. 1-6. (167272). ISBN 978-166541962-8

Abstract

Online media has become an essential part of everyday life in modern society. Any individual or organization is free to share opinions and feelings about any topic there, including information or news about commodity price fluctuations. Commodity price data from the National Strategic Price Information Center (NSPIC) website are not real-time, so they are not a sufficient basis for monitoring commodity price fluctuations. Meanwhile, the government needs to collect data and information about these fluctuations quickly so that strategic decisions and policies to stabilize prices can be made immediately. This study explores the potential of online media by extracting its text and analyzing that text to surface the commodity price data sought. The commodities used as search keywords were those with the highest consumption level in Indonesia in 2016. The texts analyzed were taken from three online media: Twitter, Liputan6.com, and Detik.com. They were analyzed using text extraction techniques and the Jaro-Winkler algorithm to find commodity prices in the text collection. The results of the text analysis were then compared with commodity prices from the NSPIC website. The experimental data set comprised 99,007 items collected over three months. Only 122 items matched the keywords; these consisted of 100 training items and 22 testing items. The text analysis shows that text from the Detik.com website gives the commodity prices closest to the NSPIC price data, while Twitter gives the farthest. Accuracy, evaluated with a confusion matrix, was 75%. Based on this research, online media texts are a viable source for monitoring commodity price fluctuations.
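
The abstract names the Jaro-Winkler algorithm but gives no implementation details, so the following is only a minimal sketch of the standard similarity measure and of how it might be used to match a token from an online text against commodity keywords. The keyword list, the 0.9 threshold, and the helper name `best_keyword` are hypothetical illustrations, not taken from the paper; the prefix scale of 0.1 and the prefix cap of 4 are the values conventionally used with the Winkler modification.

```python
from typing import Optional, Sequence


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order, counted in pairs.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str,
                 prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings that share a common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def best_keyword(token: str, keywords: Sequence[str],
                 threshold: float = 0.9) -> Optional[str]:
    """Hypothetical helper: closest keyword at or above the threshold, else None."""
    best, best_score = None, threshold
    for kw in keywords:
        score = jaro_winkler(token.lower(), kw.lower())
        if score >= best_score:
            best, best_score = kw, score
    return best


if __name__ == "__main__":
    keywords = ["beras", "cabai"]  # hypothetical commodity keywords
    print(round(jaro_winkler("MARTHA", "MARHTA"), 4))
    print(best_keyword("berass", keywords))
```

With this sketch, the classic pair "MARTHA" and "MARHTA" scores about 0.961, and a misspelled token such as "berass" still resolves to the keyword "beras". A tolerance of this kind is useful for informal text such as tweets.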

Item Type: Conference or Workshop Item (Lecture)
Additional Information: Indexed by Scopus
Uncontrolled Keywords: Classification; Online Media; Similarity; Text Extraction; Text Mining; Text-Preprocessing; Visualization
Subjects: Q Science > QA Mathematics > QA76 Computer software
T Technology > T Technology (General)
Faculty/Division: Institute of Postgraduate Studies
Faculty of Computing
Depositing User: Mrs Norsaini Abdul Samat
Date Deposited: 25 Jan 2023 02:15
Last Modified: 25 Jan 2023 02:15
URI: http://umpir.ump.edu.my/id/eprint/36781