A Hybrid Vision Transformer Model for Efficient Waste Classification

Authors

Amir Mahmud Husein, Baren Baruna Harahap, Tio Fulalo Simatupang, Karunia Syukur Baeha, Bintang Keitaro Sinambela

DOI:

https://doi.org/10.21609/jiki.v18i2.1545

Abstract

The rapid and accurate sorting of municipal waste is essential for efficient recycling and sustainable resource recovery. Most existing AI solutions focus only on four common materials (plastic, paper, metal, and glass), overlooking many other routinely encountered waste types and losing accuracy when applied to the mixed waste compositions seen in operational environments. We introduce HR-ViT, a hybrid network that combines ResNet50 residual blocks, which capture fine-grained local cues, with Vision Transformer global self-attention. Trained on a balanced six-class benchmark of about 775 images per class (plastic, paper, organic, metal, glass, batteries), HR-ViT attains 98.27 % accuracy and a macro-averaged F1-score of 0.98, outperforming a pure ViT, VT-MLH-CNN, and Garbage FusionNet by up to five percentage points in both metrics. Gains arise from selective fine-tuning of the last ten ResNet layers, lightweight ViT hyper-parameter optimisation, and targeted data augmentation that mitigates cluttered backgrounds, uneven lighting, and object deformation. These results show that hybrid attention-residual architectures provide reliable predictions under complex imaging conditions. Future work will extend the method to multi-object scenes and domain-adaptive deployment in smart-city recycling systems.
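
For illustration, the sketch below shows one way such a hybrid residual-attention classifier could be assembled in PyTorch: a truncated ResNet50 backbone supplies a 7x7 grid of local feature tokens, a small Transformer encoder applies global self-attention over them, and a linear head predicts the six waste classes. The token dimension, encoder depth, and the parameter-freezing rule are assumptions made for illustration; this is not the authors' released implementation of HR-ViT.

```python
# Minimal sketch of a hybrid ResNet50 + Vision Transformer classifier.
# Hyper-parameters (embed_dim, depth, num_heads) and the freezing rule
# are illustrative assumptions, not values reported in the paper.
import torch
import torch.nn as nn
from torchvision import models

class HybridResNetViT(nn.Module):
    def __init__(self, num_classes=6, embed_dim=768, depth=4, num_heads=8):
        super().__init__()
        # ResNet50 backbone up to the last residual stage -> (B, 2048, 7, 7).
        backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
        self.backbone = nn.Sequential(*list(backbone.children())[:-2])

        # Freeze all but the final parameter tensors (a rough stand-in for
        # "selective fine-tuning of the last ten ResNet layers").
        params = list(self.backbone.parameters())
        for p in params[:-10]:
            p.requires_grad = False

        # Project each 2048-d spatial location to a transformer token.
        self.proj = nn.Conv2d(2048, embed_dim, kernel_size=1)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, 1 + 7 * 7, embed_dim))

        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, dim_feedforward=embed_dim * 4,
            batch_first=True, norm_first=True)  # pre-norm, as in standard ViT
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=depth)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x):                               # x: (B, 3, 224, 224)
        feats = self.proj(self.backbone(x))             # (B, embed_dim, 7, 7)
        tokens = feats.flatten(2).transpose(1, 2)       # (B, 49, embed_dim)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        tokens = torch.cat([cls, tokens], dim=1) + self.pos_embed
        out = self.encoder(tokens)                      # global self-attention
        return self.head(out[:, 0])                     # classify from [CLS]

model = HybridResNetViT()
logits = model(torch.randn(2, 3, 224, 224))             # -> shape (2, 6)
```

Feeding the ResNet feature map into the encoder, rather than raw image patches, is what lets the residual blocks capture fine-grained local cues while the self-attention layers model relationships across the whole scene.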

Published

2025-06-26

How to Cite

Amir Mahmud Husein, Baren Baruna Harahap, Tio Fulalo Simatupang, Karunia Syukur Baeha, & Bintang Keitaro Sinambela. (2025). A Hybrid Vision Transformer Model for Efficient Waste Classification. Jurnal Ilmu Komputer Dan Informasi, 18(2), 261–275. https://doi.org/10.21609/jiki.v18i2.1545