Improving Classification Performance on Imbalanced Medical Data using Generative Adversarial Network

  • Siska Rahmadani Universitas Nusa Mandiri
  • Agus Subekti Nusa Mandiri University
  • Muhammad Haris

Abstract

In many real-world applications, data imbalance is a common challenge that significantly degrades the performance of machine learning algorithms. Data imbalance means that the target classes are not equally represented. This problem often appears in medical data, where positive cases of a disease or condition are far fewer than negative cases. In this paper, we explore an oversampling method based on Generative Adversarial Networks (GAN) to improve the performance of classification algorithms on imbalanced medical datasets. We expect the GAN to learn the actual data distribution and generate synthetic samples that are similar to the original ones. We evaluate the proposed method on several metrics: Recall, Precision, F1 score, AUC score, and FP rate. These metrics measure the ability of the classifier to correctly identify the minority class while limiting false positives and false negatives. Our experimental results show that GAN-based oversampling performs better than other methods on several metrics across datasets and can be used as an alternative method to improve the performance of classification models on imbalanced medical data.
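
To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch of how Recall, Precision, F1 score, FP rate, and AUC can be computed for a binary problem. It is not the authors' evaluation code. It assumes labels are 0/1 with 1 as the minority (positive) class, and the function names `binary_metrics` and `auc_score` as well as the toy data are hypothetical, chosen only for illustration.

```python
def binary_metrics(y_true, y_pred):
    """Recall, precision, F1, and FP rate from hard 0/1 predictions.

    Assumption: label 1 is the minority (positive) class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    # Guard against empty denominators (e.g. no positive predictions).
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    fp_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return {"recall": recall, "precision": precision,
            "f1": f1, "fp_rate": fp_rate}


def auc_score(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy imbalanced example (made-up data): 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.8, 0.3, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1]

metrics = binary_metrics(y_true, y_pred)
auc = auc_score(y_true, scores)
print(metrics, auc)
```

On this toy data the classifier is right on 8 of 10 samples, yet it recovers only half of the minority class (recall 0.5), which is why minority-focused metrics are reported for imbalanced data.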

Published
2024-02-25
How to Cite
Rahmadani, S., Subekti, A., & Haris, M. (2024). Improving Classification Performance on Imbalanced Medical Data using Generative Adversarial Network. Jurnal Ilmu Komputer Dan Informasi, 17(1), 9-17. https://doi.org/10.21609/jiki.v17i1.1177