A Systematic Review of Machine Learning Models for Predicting Type 2 Diabetes Mellitus Using Electronic Health Records

Authors

  • Oduware C. Odigie, Department of Computer Science, Babcock University, Nigeria
  • Folasade Y. Ayankoya, Department of Computer Science, Babcock University, Nigeria
  • Shade O. Kuyoro, Department of Computer Science, Babcock University, Nigeria
  • Ayodeji G. Abiodun, Department of Computer Science, Babcock University, Nigeria

DOI:

https://doi.org/10.70112/ajes-2025.14.2.4292

Keywords:

Machine Learning, Type 2 Diabetes Mellitus, Electronic Health Records, Predictive Modeling, Ensemble Models, Deep Learning

Abstract

This systematic review evaluates the use of machine learning models for predicting type 2 diabetes mellitus using electronic health record data. The global increase in the prevalence of type 2 diabetes underscores the need for reliable early prediction methods that can identify individuals at risk before disease onset. Machine learning provides an opportunity to improve predictive performance by uncovering complex relationships in clinical data that traditional statistical approaches may not capture. To assess progress in this area, a comprehensive search of the Scopus and PubMed databases was conducted to identify relevant studies published between January 2020 and October 2025. A total of 329 records were retrieved, and 13 studies met the inclusion criteria following a structured screening and quality assessment process. Data were extracted on model type, dataset characteristics, and reported outcomes. The reviewed studies showed that ensemble models and deep learning architectures generally achieved stronger predictive performance than single classifiers. Common predictors identified across studies included fasting plasma glucose, HbA1c, triglycerides, body mass index, age, and lipid measures. Although most models demonstrated high discrimination, key methodological limitations persisted, including insufficient external validation, inconsistent performance reporting, and limited transparency in data processing. The findings suggest that machine learning applied to electronic health record data offers significant potential for the early detection of type 2 diabetes; however, clinical adoption will require standardized evaluation frameworks, robust validation across diverse populations, and improved model interpretability to ensure trustworthy and equitable implementation in healthcare settings.

Published

20-10-2025

How to Cite

Odigie, O. C., Ayankoya, F. Y., Kuyoro, S. O., & Abiodun, A. G. (2025). A Systematic Review of Machine Learning Models for Predicting Type 2 Diabetes Mellitus Using Electronic Health Records. Asian Journal of Electrical Sciences, 14(2), 28–34. https://doi.org/10.70112/ajes-2025.14.2.4292
