Hybrid Feature Optimization and Attention-Enhanced Siamese Deep Learning for Large-Scale Diabetes Prediction
DOI:
https://doi.org/10.31838/NJAP/08.02.21
Keywords:
Diabetes prediction; Feature selection; Siamese network; 1D-CNN; Deep learning; Electronic health records; Medical decision support.
Abstract
Diabetes mellitus is a growing global health problem, and accurate early risk prediction is needed to support timely clinical intervention. This paper proposes a compact diabetes prediction model that combines a hybrid IGF–DMO–RFO feature-selection approach with a Siamese one-dimensional convolutional neural network (1D-CNN) enhanced by Global Spatial-Channel Attention (GSCA). The model is evaluated on a large-scale public diabetes dataset of 100,000 patient records containing demographic, lifestyle, comorbidity, and metabolic features. The hybrid feature-selection pipeline identifies a concise and highly informative subset of predictors, while the Siamese architecture uses contrastive learning to generate discriminative embeddings from structured clinical data. Experimental results show that the proposed framework achieves an overall classification accuracy of 97% and a ROC-AUC of 0.9726 with reliable probability calibration, outperforming conventional machine-learning and single-branch deep-learning baselines. The lightweight architecture enables fast inference and is robust to class imbalance, making it suitable for large-scale diabetes screening and clinical decision-support applications.
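The abstract does not state which contrastive objective trains the Siamese branches. As a minimal sketch only, the snippet below assumes the classic margin-based pairwise contrastive loss: embeddings of two records from the same class are pulled together, and embeddings from different classes are pushed apart until they are at least a margin away. The function name `contrastive_loss`, the `margin` default, and the label convention (1 = same class, 0 = different class) are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Margin-based pairwise contrastive loss (assumed form, not the paper's).

    emb_a, emb_b : arrays of shape (n_pairs, dim), one embedding per branch.
    same_class   : array of shape (n_pairs,), 1 if the pair shares a label, else 0.
    margin       : minimum distance wanted between different-class pairs.
    """
    emb_a = np.asarray(emb_a, dtype=float)
    emb_b = np.asarray(emb_b, dtype=float)
    same = np.asarray(same_class, dtype=float)

    # Euclidean distance between the two branch embeddings of each pair.
    dist = np.linalg.norm(emb_a - emb_b, axis=1)

    # Same-class pairs are penalised by their squared distance.
    pull = same * dist ** 2
    # Different-class pairs are penalised only while closer than the margin.
    push = (1.0 - same) * np.maximum(0.0, margin - dist) ** 2

    return float(np.mean(pull + push))


if __name__ == "__main__":
    a = [[0.0, 0.0], [0.0, 0.0]]
    b = [[0.0, 0.0], [3.0, 4.0]]
    # Pair 1 is identical and same-class; pair 2 is far apart and different-class.
    print(contrastive_loss(a, b, same_class=[1, 0]))
```

In a setup like this, the two inputs of each pair would pass through the same shared-weight encoder, so only the pairing and labels differ between the branches; the paper's actual pairing strategy and loss may be different.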