Abstract
This study analyzes risk factors for diabetes mellitus and builds a predictive model using the C4.5 algorithm, implemented in the Orange application. The dataset is the Pima Indians Diabetes Dataset from the UCI Machine Learning Repository, which contains 768 samples with eight predictor variables and one target variable. The analysis comprises data preprocessing, an 80:20 split into training and testing data, construction of a decision tree model, and performance evaluation using accuracy, precision, recall, F1-score, and AUC. The C4.5 model achieved an accuracy of 71.9% and an AUC of 0.674, indicating fairly good classification ability. The most influential factors for diabetes risk are blood glucose level, body mass index (BMI), and age. This study contributes a data mining-based disease prediction model that is easy to implement and can support decision-making in the early detection of diabetes mellitus.
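The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is independent of Orange and is not the study's own code. The function names (`confusion_counts`, `classification_metrics`, `auc`), the 0.5 decision threshold, and the ten toy labels and scores are assumptions made only for illustration; they are not taken from the Pima Indians data or from the reported results.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), treating label 1 as the positive (diabetic) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the positive
    sample receives the higher score; tied pairs count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, NOT the study's test set): true labels and the
# model's predicted probability of the positive class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.7, 0.2, 0.1, 0.35, 0.6, 0.05]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
metrics["auc"] = auc(y_true, scores)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUC is computed from the ranking of scores alone. This is one reason the two reported figures (71.9% accuracy, 0.674 AUC) need not move together.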

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Copyright (c) 2024 Mutiara Akbar Nasution, Zaid Ahlun Ulumuddin, Anisa Fitri