Data-Driven Portfolio Optimization using K-Means and Markowitz Model: Evidence from LQ45 Stocks
Abstract
This study optimizes stock portfolio allocation with a data-driven approach that integrates the K-Means clustering algorithm with the Markowitz model. The dataset comprises technical and fundamental indicators of LQ45 index stocks from 2019 to 2024. The process begins with data normalization and feature extraction, followed by clustering of the stocks with K-Means. From each of the four resulting clusters, the stock with the highest average return is selected. Portfolio weights are then optimized with the Markowitz model under a mean-variance framework with no short selling. The optimization assigns the largest weights to ARTO, BRPT, and ISAT. In a backtest simulation for 2024, the portfolio declined by 8.02%, a smaller loss than the 15.60% drop of the LQ45 index. These findings underscore the potential of integrating data mining and quantitative optimization methods to improve diversification efficiency and strengthen portfolio resilience during market downturns.
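The pipeline summarized in the abstract (normalize, cluster, pick one stock per cluster, optimize long-only weights) can be sketched roughly as follows. This is a minimal illustration and not the study's implementation. The returns are synthetic, the two features (mean return and volatility) are hypothetical stand-ins for the technical and fundamental indicators, and the `risk_aversion` value is an arbitrary assumption. To keep the sketch dependent on NumPy alone, a plain Lloyd's-iteration K-Means and projected gradient descent are used in place of library clustering and quadratic-programming routines.

```python
import numpy as np

def zscore(X):
    """Standardize each feature column to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every center, then nearest center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute centers; an empty cluster keeps its previous center.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def pick_top_per_cluster(labels, mean_returns):
    """Index of the highest-mean-return stock in each non-empty cluster."""
    picks = []
    for j in np.unique(labels):
        members = np.flatnonzero(labels == j)
        picks.append(members[mean_returns[members].argmax()])
    return np.array(picks)

def project_to_simplex(v):
    """Euclidean projection onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def long_only_mean_variance(mu, cov, risk_aversion=5.0, n_iter=5000):
    """Maximize w'mu - (risk_aversion / 2) w'cov w with no short selling."""
    w = np.full(len(mu), 1.0 / len(mu))
    # Step size 1/L, where L bounds the curvature of the objective.
    step = 1.0 / (risk_aversion * np.linalg.eigvalsh(cov)[-1])
    for _ in range(n_iter):
        grad = risk_aversion * cov @ w - mu
        w = project_to_simplex(w - step * grad)
    return w

# Synthetic daily returns for 20 hypothetical stocks (not LQ45 data).
rng = np.random.default_rng(42)
returns = rng.normal(0.0005, 0.02, size=(250, 20))
mean_returns = returns.mean(axis=0)

# Hypothetical feature matrix: mean return and volatility per stock.
features = np.column_stack([mean_returns, returns.std(axis=0)])

labels = kmeans(zscore(features), k=4)
picks = pick_top_per_cluster(labels, mean_returns)

mu = returns[:, picks].mean(axis=0)
cov = np.atleast_2d(np.cov(returns[:, picks], rowvar=False))
weights = long_only_mean_variance(mu, cov)

print("selected stock indices:", picks)
print("portfolio weights:", np.round(weights, 4))
```

Selecting a single stock per cluster keeps the candidate set small and spreads it across groups with different characteristics, while the simplex projection enforces both the no-short-selling constraint and full investment in one step. A real application would replace the synthetic inputs with actual indicator and price data and would likely use an established clustering and optimization library.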
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.