Sentiment Analysis of COVID-19 Vaccines in Indonesia on Twitter Using Pre-Trained and Self-Training Word Embeddings

Kartikasari Kusuma Agustiningsih; Ema Utami; Muhammad Altoumi Alsyaibani

doi:10.21609/jiki.v15i1.1044

Authors

Kartikasari Kusuma Agustiningsih Universitas Amikom Yogyakarta
Ema Utami Universitas Amikom Yogyakarta
Muhammad Altoumi Alsyaibani Universitas Amikom Yogyakarta

Keywords:

Sentiment Analysis, Twitter, Bidirectional LSTM, Word Embedding, fastText, GloVe

Abstract

Public sentiment toward the COVID-19 vaccine can be gauged from social media, where users commonly express their opinions. Twitter is one of the platforms Indonesians use most often for this purpose. This research uses a Bidirectional LSTM combined with word embeddings, testing fastText and GloVe. We created 8 test scenarios to examine the performance of the word embeddings, using both pre-trained and self-trained embedding vectors. The dataset gathered from Twitter was prepared in two versions, stemmed and unstemmed. In the GloVe scenario group, the highest accuracy, 92.5%, was achieved by the model that used self-trained GloVe and was trained on the unstemmed dataset. In the fastText scenario group, the highest accuracy, 92.3%, was achieved by the model that used self-trained fastText and was trained on the stemmed dataset. Scenarios that used pre-trained embedding vectors had lower accuracy than those that used self-trained vectors, because the pre-trained embeddings were trained on the Wikipedia corpus, which contains standard, well-structured language, whereas the dataset in this study came from Twitter, which contains non-standard sentences. Even though the dataset was processed with stemming and a slang-word dictionary, the pre-trained embeddings still could not recognize several words from our dataset.
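The vocabulary gap described in the abstract can be illustrated with a small coverage check. The sketch below is not the authors' code. The function name `oov_report`, the toy tweets, and the toy vocabulary are all hypothetical, chosen only to show how one might measure how many tweet tokens a pre-trained embedding vocabulary fails to recognize.

```python
from collections import Counter


def oov_report(tokenized_tweets, embedding_vocab):
    """Return (coverage, oov_counts) for a tokenized corpus.

    coverage   : fraction of token occurrences found in embedding_vocab
    oov_counts : Counter of out-of-vocabulary tokens and their frequencies
    """
    # Count every token occurrence across all tweets.
    counts = Counter(tok for tweet in tokenized_tweets for tok in tweet)
    total = sum(counts.values())

    # Keep only the tokens the embedding vocabulary does not contain.
    oov_counts = Counter(
        {tok: n for tok, n in counts.items() if tok not in embedding_vocab}
    )

    covered = total - sum(oov_counts.values())
    coverage = covered / total if total else 0.0
    return coverage, oov_counts


# Hypothetical toy data: two tokenized tweets with informal spellings,
# and a small stand-in for a vocabulary learned from formal text.
tweets = [["vaksin", "gratis", "mantul"], ["vaksin", "aman", "bgt"]]
pretrained_vocab = {"vaksin", "gratis", "aman"}

coverage, oov_counts = oov_report(tweets, pretrained_vocab)
print(f"coverage: {coverage:.2f}")
print("most common OOV tokens:", oov_counts.most_common(5))
```

In practice the vocabulary would presumably be the key set of a loaded embedding file. Low coverage on a Twitter corpus would be consistent with the abstract's explanation of why the pre-trained vectors scored lower than the self-trained ones.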

Published

2022-02-27

How to Cite

Agustiningsih, K. K., Utami, E., & Alsyaibani, M. A. (2022). Sentiment Analysis of COVID-19 Vaccines in Indonesia on Twitter Using Pre-Trained and Self-Training Word Embeddings. Jurnal Ilmu Komputer Dan Informasi, 15(1), 39–46. https://doi.org/10.21609/jiki.v15i1.1044

Issue

Vol. 15 No. 1 (2022): Jurnal Ilmu Komputer dan Informasi (Journal of Computer Science and Information)

Section

Articles

License

Authors who publish with this journal agree to the following terms:

Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).

Our journal is implementing Double Blind Review for each submitted article.
