Diabetes Metab J : Diabetes & Metabolism Journal

Response
Construction of Risk Prediction Model of Type 2 Diabetic Kidney Disease Based on Deep Learning (Diabetes Metab J 2024;48:771-9)
Chuan Yun1*, Fangli Tang2*, Qingqing Lou3
Diabetes & Metabolism Journal 2024;48(5):1008-1011.
DOI: https://doi.org/10.4093/dmj.2024.0490
Published online: September 12, 2024

1Department of Endocrinology, The First Affiliated Hospital of Hainan Medical University, Haikou, China

2International School of Nursing, Hainan Medical University, Haikou, China

3The First Affiliated Hospital of Hainan Medical University, Hainan Clinical Research Center for Metabolic Disease, Haikou, China

Corresponding author: Qingqing Lou, The First Affiliated Hospital of Hainan Medical University, Hainan Clinical Research Center for Metabolic Disease, No. 31, Longhua Road, Haikou 570102, China. E-mail: 2444890144@qq.com
*Chuan Yun and Fangli Tang contributed equally to this study as first authors.

Copyright © 2024 Korean Diabetes Association

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

See the letter "Construction of Risk Prediction Model of Type 2 Diabetic Kidney Disease Based on Deep Learning (Diabetes Metab J 2024;48:771-9)" on page 1003.
We sincerely appreciate the insightful suggestions and comments provided by Dr. Bo Mi Seo and Dr. Jong Wook Choi on our manuscript [1]. We have reviewed Dr. Choi’s publications on diabetic kidney disease (DKD) and are impressed by his extensive expertise in this field. We are grateful to Diabetes & Metabolism Journal for facilitating this valuable discussion of our paper, which is an excellent opportunity for learning and communication.
Deep learning enables machines to autonomously identify and rank features based on their significance. Although some factors may be associated with DKD, their informational value might be subsumed by other, more salient features. During model training, factors with insufficient predictive power may not be selected as significant by the model because they compete with more critical factors. We selected only the 20 features with the highest weights for DKD prediction. This approach is grounded in the principle that predictive tools benefit from data simplicity while still ensuring optimal predictive performance.
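The selection step described above can be illustrated with a minimal sketch. It assumes that each feature already has a single numeric importance weight; the function name `top_k_features` and the example weights are hypothetical and are not taken from our model.

```python
from typing import Dict, List, Tuple


def top_k_features(weights: Dict[str, float], k: int = 20) -> List[Tuple[str, float]]:
    """Return the k features with the largest importance weights, highest first."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical weights, for illustration only (not values from the study).
example_weights = {
    "feature_a": 0.42,
    "feature_b": 0.07,
    "feature_c": 0.31,
    "feature_d": 0.01,
}

selected = top_k_features(example_weights, k=2)
print(selected)
```

A feature that is associated with the outcome but carries a low weight, such as `feature_b` here, falls outside the cut-off without being judged irrelevant.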
To ensure that our data collection included as many predictors of DKD as possible, we conducted a systematic review of DKD risk/predictive factors. Nonetheless, it is inherent in such reviews that some factors may be omitted. Despite this, our deep learning model was trained with a diverse set of data, encompassing not only the predictors derived from the systematic review, but also additional variables such as patient demographics (e.g., gender, educational attainment, and income), and clinical measurements (e.g., white blood cell count, platelet count, hemoglobin levels, comprehensive liver function tests, and lipid profiles). The model assessed the importance of various predictors based on their computed weights, with gender not emerging as a significant feature.
The model was specifically developed for application to Chinese or Asian diabetes patients. Given the predominance of East Asian individuals in China and the challenges associated with obtaining extensive data from other ethnic groups, such as Black or White populations, these groups were not included in the initial model development. However, we appreciate the feedback from Dr. Choi and Dr. Seo. We plan to conduct external validation and application of the model in regions with diverse ethnic minorities within China.
Large-scale cross-sectional studies have identified associations of the aspartate transaminase (AST)/alanine transaminase (ALT) ratio [2] and platelet count [3] with DKD. Although our model included AST, ALT, and platelet data, these factors were not selected by the machine learning algorithm as significant predictors. Similarly, the use of antidiabetic, antihypertensive, and lipid-lowering medications was incorporated into the training dataset, but none of these medications was identified as an important feature by the model. While individual medications may indeed influence the development of DKD, their weights in the model are lower than those of more critical factors such as serum creatinine (SCr), high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, disease duration, and age. This does not imply that these medications lack relevance to DKD; rather, it reflects their comparatively lower predictive significance within the context of our model.
In our study, we evaluated only the long short-term memory (LSTM) model and the support vector machine (SVM) because deep learning models have shown superior predictive capabilities in disease diagnosis [4,5] and prognosis [6,7] compared with traditional machine learning approaches (such as random forests, decision trees, SVM, and Bayesian methods). The consensus within the field is that deep learning models generally outperform conventional machine learning models. As pointed out by Seo et al., studies have reported that the CatBoost Classifier and Gradient Boosted Decision Tree (GBDT) demonstrate strong accuracy and area under the curve (AUC) performance [8,9]. In 2023, in collaboration with an information technology (IT) firm, we developed models using Light Gradient Boosting Machine (LightGBM) and CatBoost Classifier, both of which are enhancements of GBDT. We also excluded patients with less than 7 years of follow-up from model training and testing. Finally, the LightGBM model exhibited the best predictive performance. Consequently, we have developed the LightGBM model into a Mini Program, a clinical application tool, as illustrated in Fig. 1.
Due to the structure of the healthcare system in mainland China, many patients with diabetes do not attend regular follow-up or monitor their glycosylated hemoglobin (HbA1c), estimated glomerular filtration rate (eGFR), or urine albumin-to-creatinine ratio (UACR). Patients often seek medical attention only when experiencing significant symptoms or when their medication is depleted, and they may not consistently visit the same healthcare facility because they are free to choose among various hospitals. This situation complicates the collection of continuous data. A study from Chongqing revealed that the annual proportions of HbA1c testing, blood lipid testing, and screenings for nephropathy and eye conditions were 8%, 54%, 45%, and 44%, respectively [10]. Another national cross-sectional survey conducted in China indicated that the rates of UACR, eGFR, and HbA1c testing among Chinese patients with type 2 diabetes mellitus (T2DM) were notably low, with only 21.12% of patients having HbA1c measured semi-annually and 13.11% and 9.34% undergoing UACR and eGFR tests annually, respectively [11]. Consequently, collecting continuous 7-year datasets encompassing a range of DKD risk factors proves challenging. To ensure the robustness of our model, we utilized data from the Li’s United Clinics in Taiwan for model training. This choice is a limitation of our study; however, it can be mitigated through independent external validation. Independent external validation involves assessing the model using data from different sources or from researchers not involved in the original model development [12], which is crucial for evaluating the model’s reliability. Such validation is necessary to confirm the model’s stability and generalizability, thereby enhancing its credibility for clinical use [12,13].
To assess the stability and generalizability of the model, we initially intended to conduct independent external validation using data from 800 cases across five hospitals in mainland China. Unfortunately, three of these hospitals had updated their electronic medical systems within the past 7 years, preventing us from acquiring continuous longitudinal data. As a result, we obtained data from the remaining two hospitals, located in Zhejiang Province and Hainan Province, and collected a dataset comprising 488 cases with continuous records spanning 7 years. We used the initial 2 years of these data to predict the incidence of DKD over the following 5 years. The LightGBM model was then validated using this dataset, achieving an accuracy of 0.94, precision of 0.88, recall of 0.87, F1 score of 0.91, and an AUC of 0.85. These results indicate strong performance of the model among patients with T2DM in mainland China. Furthermore, the top 20 predictive features identified by the LightGBM model differed significantly from those of the original LSTM model. Detailed information on the key features and their weights used in the LightGBM model for DKD prediction is provided in Table 1.
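For readers less familiar with the validation metrics named above, the following sketch shows their standard definitions for a binary outcome. It is illustrative only: the labels and scores are hypothetical, the function names are assumptions, and this is not the code used in our validation.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels (1 = DKD, 0 = no DKD)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of dividing by zero.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        # F1 is the harmonic mean of precision and recall.
        "f1": ratio(2 * precision * recall, precision + recall),
    }


def auc_score(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels, predictions, and risk scores (not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1, 0.4, 0.05]

metrics = classification_metrics(y_true, y_pred)
auc = auc_score(y_true, scores)
print(metrics, auc)
```

The pairwise AUC shown here is quadratic in the number of cases, which is acceptable for a validation set of a few hundred records.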
We extend our sincere gratitude to Dr. Choi and Dr. Seo for their valuable input regarding the variability of SCr, an aspect we had not previously considered. We have now computed SCr variability, incorporated it into the model, and found that it is an important feature in predicting DKD.
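Visit-to-visit variability can be quantified in several ways. The sketch below uses the coefficient of variation (sample standard deviation divided by the mean), which is one common choice; it is an assumption for illustration and should not be read as the definition used in our model. The SCr values are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Sample standard deviation divided by the mean of repeated measurements."""
    if len(values) < 2:
        raise ValueError("at least two measurements are required")
    average = mean(values)
    if average == 0:
        raise ValueError("mean of the measurements must be non-zero")
    return stdev(values) / average


# Hypothetical repeated SCr measurements for one patient, in umol/L.
scr_values = [70.0, 80.0, 90.0]
scr_cv = coefficient_of_variation(scr_values)
print(round(scr_cv, 3))
```

Because the coefficient of variation is unitless, it can be compared across patients whose baseline SCr levels differ.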
To enhance the robustness of our model, we plan to increase the sample size by incorporating data from two to three additional hospitals in mainland China that offer continuous 7-year follow-up records for patients with T2DM. This will allow us to conduct further independent external validation and apply the model to diverse patient populations, including ethnic minorities.
The development and refinement of predictive models is a dynamic and iterative process. As deep learning technologies advance, new and superior neural networks will continuously emerge. It is crucial for clinicians to engage in ongoing collaboration with IT specialists to optimize model performance. Additionally, successful model development is only the initial step; to facilitate clinical application, further efforts are required, including the creation of user-friendly platforms such as mobile applications or software tools.

CONFLICTS OF INTEREST

No potential conflict of interest relevant to this article was reported.

Fig. 1.
Mini program of diabetic kidney disease (DKD) risk prediction model. BMI, body mass index; HDL-C, high-density lipoprotein cholesterol; LDL-C, low-density lipoprotein cholesterol.
Table 1.
The top 20 important features of the LightGBM model
No.  Feature                Importance
1    SCr 2nd yr             2,022.563598
2    Age                    839.928943
3    UA 2nd yr              566.367859
4    Hemoglobin 1st yr      560.346788
5    Duration of diabetes   512.597209
6    SBP 2nd yr             504.068467
7    HDL 2nd yr             463.918871
8    SCr 1st yr             417.860179
9    Hemoglobin 2nd yr      389.865619
10   LDL 2nd yr             313.112968
11   BMI 1st yr             306.705871
12   UA 1st yr              294.715961
13   BMI 2nd yr             276.222002
14   HbA1c 2nd yr           259.921019
15   HbA1c 2nd yr           225.473221
16   TG 2nd yr              224.271811
17   SBP 1st yr             220.521612
18   HDL-C 1st yr           217.250630
19   DBP 2nd yr             203.563461
20   LDL-C 1st yr           193.283288

LightGBM, Light Gradient Boosting Machine; SCr, serum creatinine; UA, uric acid; SBP, systolic blood pressure; HDL, high-density lipoprotein; LDL, low-density lipoprotein; BMI, body mass index; HbA1c, glycosylated hemoglobin; TG, triglyceride; HDL-C, high-density lipoprotein cholesterol; DBP, diastolic blood pressure; LDL-C, low-density lipoprotein cholesterol.
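The raw importance values in Table 1 are easier to compare when expressed as shares of their total. The sketch below copies the 20 rows of Table 1 and normalizes them; the shares are relative to these 20 listed features only, not to the full model, and the function name `importance_shares` is hypothetical. Rows are kept as a list rather than a dictionary because two rows carry the same label.

```python
from typing import List, Tuple

# Rows copied from Table 1 as (feature, importance).
TABLE_1: List[Tuple[str, float]] = [
    ("SCr 2nd yr", 2022.563598), ("Age", 839.928943),
    ("UA 2nd yr", 566.367859), ("Hemoglobin 1st yr", 560.346788),
    ("Duration of diabetes", 512.597209), ("SBP 2nd yr", 504.068467),
    ("HDL 2nd yr", 463.918871), ("SCr 1st yr", 417.860179),
    ("Hemoglobin 2nd yr", 389.865619), ("LDL 2nd yr", 313.112968),
    ("BMI 1st yr", 306.705871), ("UA 1st yr", 294.715961),
    ("BMI 2nd yr", 276.222002), ("HbA1c 2nd yr", 259.921019),
    ("HbA1c 2nd yr", 225.473221), ("TG 2nd yr", 224.271811),
    ("SBP 1st yr", 220.521612), ("HDL-C 1st yr", 217.250630),
    ("DBP 2nd yr", 203.563461), ("LDL-C 1st yr", 193.283288),
]


def importance_shares(rows: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Divide each importance by the sum of all listed importances."""
    total = sum(value for _, value in rows)
    return [(name, value / total) for name, value in rows]


shares = importance_shares(TABLE_1)
for name, share in shares[:5]:
    print(f"{name}: {share:.1%}")
```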

REFERENCES

  • 1. Yun C, Tang F, Gao Z, Wang W, Bai F, Miller JD, et al. Construction of risk prediction model of type 2 diabetic kidney disease based on deep learning. Diabetes Metab J 2024;48:771-9.
  • 2. Choi JW, Lee CH, Park JS. Comparison of laboratory indices of non-alcoholic fatty liver disease for the detection of incipient kidney dysfunction. PeerJ 2019;7:e6524.
  • 3. Choi JW, Kim TH, Park JS, Lee CH. Association between relative thrombocytosis and microalbuminuria in adults with mild fasting hyperglycemia. J Pers Med 2024;14:89.
  • 4. Albrecht T, Rossberg A, Albrecht JD, Nicolay JP, Straub BK, Gerber TS, et al. Deep learning-enabled diagnosis of liver adenocarcinoma. Gastroenterology 2023;165:1262-75.
  • 5. Levy J, Alvarez D, Del Campo F, Behar JA. Deep learning for obstructive sleep apnea diagnosis based on single channel oximetry. Nat Commun 2023;14:4881.
  • 6. Placido D, Yuan B, Hjaltelin JX, Zheng C, Haue AD, Chmura PJ, et al. A deep learning algorithm to predict risk of pancreatic cancer from disease trajectories. Nat Med 2023;29:1113-22.
  • 7. Tian F, Liu D, Wei N, Fu Q, Sun L, Liu W, et al. Prediction of tumor origin in cancers of unknown primary origin with cytology-based deep learning. Nat Med 2024;30:1309-19.
  • 8. Liu XZ, Duan M, Huang HD, Zhang Y, Xiang TY, Niu WC, et al. Predicting diabetic kidney disease for type 2 diabetes mellitus by machine learning in the real world: a multicenter retrospective study. Front Endocrinol (Lausanne) 2023;14:1184190.
  • 9. Allen A, Iqbal Z, Green-Saxena A, Hurtado M, Hoffman J, Mao Q, et al. Prediction of diabetic kidney disease with machine learning algorithms, upon the initial diagnosis of type 2 diabetes mellitus. BMJ Open Diabetes Res Care 2022;10:e002560.
  • 10. He M, Gao J, Liu W, Tang X, Tang S, Long Q. Case management of patients with type 2 diabetes mellitus: a cross-sectional survey in Chongqing, China. BMC Health Serv Res 2017;17:129.
  • 11. Xia Z, Luo X, Wang Y, Xu T, Dong J, Jiang W, et al. Diabetic kidney disease screening status and related factors: a cross-sectional study of patients with type 2 diabetes in six provinces in China. BMC Health Serv Res 2024;24:489.
  • 12. McLernon DJ, Giardiello D, Van Calster B, Wynants L, van Geloven N, van Smeden M, et al. Assessing performance and clinical usefulness in prediction models with survival outcomes: practical guidance for Cox proportional hazards models. Ann Intern Med 2023;176:105-14.
  • 13. Slieker RC, van der Heijden AAWA, Siddiqui MK, Langendoen-Gort M, Nijpels G, Herings R, et al. Performance of prediction models for nephropathy in people with type 2 diabetes: systematic review and external validation study. BMJ 2021;374:n2134.
