Classification of Economic Activities in Indonesia Using IndoBERT Language Model

Authors

DOI:

https://doi.org/10.21609/jiki.v18i2.1446

Abstract

Classification of economic activities plays a vital role in understanding, analyzing, and managing complex economic processes in a society or country. It facilitates economic analysis, data collection, policy formulation, and informed decision-making. In Indonesia, economic activities are classified according to the Indonesian Standard Industrial Classification (KBLI). This classification requires in-depth knowledge of KBLI and is still performed manually, which makes it time-consuming. To address this challenge, this paper proposes using IndoBERT, a transformer-based language model pretrained on a large Indonesian corpus, to better capture the contextual meaning of text and thereby improve the accuracy of automatic economic activity classification. Our results show that the fine-tuned IndoBERT-large model achieves the best results, with an F1 score of 96.82% and a balanced accuracy of 96.10%, outperforming other recent methods used for similar tasks, namely CatBoost and DistilBERT models.
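For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how an F1 score and a balanced accuracy can be computed for a multi-class task. It is not the authors' evaluation code. The abstract does not say how F1 is averaged across classes, so macro averaging is an assumption here, and the labels are hypothetical stand-ins for KBLI categories.

```python
from collections import Counter


def per_class_counts(y_true, y_pred):
    """Return true-positive, false-positive, and false-negative counts per class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    return tp, fp, fn


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes that occur in y_true."""
    tp, _, fn = per_class_counts(y_true, y_pred)
    classes = sorted(set(y_true))
    recalls = [tp[c] / (tp[c] + fn[c]) for c in classes]
    return sum(recalls) / len(recalls)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    tp, fp, fn = per_class_counts(y_true, y_pred)
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels: letters stand in for KBLI categories.
y_true = ["A", "A", "A", "A", "C", "G"]
y_pred = ["A", "A", "A", "C", "C", "G"]

print(f"balanced accuracy: {balanced_accuracy(y_true, y_pred):.4f}")
print(f"macro F1:          {macro_f1(y_true, y_pred):.4f}")
```

Balanced accuracy averages recall per class, so rare categories count as much as frequent ones. That is presumably why it is reported alongside F1 for a label set such as KBLI, where class frequencies are likely uneven.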

Author Biography

Evi Yulianti, Universitas Indonesia

Evi Yulianti is a lecturer and researcher at the Faculty of Computer Science, Universitas Indonesia. She received the B.Comp.Sc. degree from Universitas Indonesia in 2010, the dual M.Comp.Sc. degree from Universitas Indonesia and Royal Melbourne Institute of Technology University in 2013, and the Ph.D. degree from Royal Melbourne Institute of Technology University in 2018. Her research interests include information retrieval and natural language processing. She can be contacted by email at evi.y@cs.ui.ac.id.

Published

2025-06-26

How to Cite

Syazali, M. R., & Yulianti, E. (2025). Classification of Economic Activities in Indonesia Using IndoBERT Language Model. Jurnal Ilmu Komputer Dan Informasi, 18(2), 155–165. https://doi.org/10.21609/jiki.v18i2.1446