Comparison of Operational Definition of Type 2 Diabetes Mellitus Based on Data from Korean National Health Insurance Service and Korea National Health and Nutrition Examination Survey
Abstract
Background
We evaluated the validity and reliability of the operational definition of type 2 diabetes mellitus (T2DM) based on the Korean National Health Insurance Service (NHIS) database.
Methods
Adult subjects (≥40 years old) included in the Korea National Health and Nutrition Examination Survey (KNHANES) from 2008 to 2017 were merged with those from the NHIS health check-up database, producing a cross-sectional dataset. We evaluated the sensitivity, specificity, accuracy, and agreement of the NHIS criteria for defining T2DM by comparing them with the KNHANES criteria as a standard reference.
Results
In the study population (n=13,006), two algorithms were devised to determine from the NHIS dataset whether the diagnostic claim codes for T2DM were accompanied by prescription codes for anti-diabetic drugs (algorithm 1) or not (algorithm 2). Using these algorithms, the prevalence of T2DM was 14.9% (n=1,942; algorithm 1) and 20.8% (n=2,707; algorithm 2). Good reliability in defining T2DM was observed for both algorithms (Kappa index, 0.73 [algorithm 1], 0.63 [algorithm 2]). However, the accuracy (0.93 vs. 0.89) and specificity (0.96 vs. 0.90) tended to be higher for algorithm 1 than for algorithm 2. The validity (accuracy, ranging from 0.91 to 0.95) and reliability (Kappa index, ranging from 0.68 to 0.78) of defining T2DM by NHIS criteria were independent of age, sex, socioeconomic status, and accompanied hypertension or dyslipidemia.
Conclusion
The operational definition of T2DM based on population-based NHIS claims data, including diagnostic codes and prescription codes, could be a valid tool to identify individuals with T2DM in the Korean population.
INTRODUCTION
The prevalence of type 2 diabetes mellitus (T2DM) has increased worldwide, and diabetes itself is closely related to an increased risk of atherosclerotic cardiovascular diseases such as myocardial infarction and ischemic stroke, as well as mortality. As a result, population-based data have been widely used in epidemiologic studies [1-3] to identify individuals with diabetes and evaluate diabetes-related comorbidities and risk factors. The population-level classification of T2DM can also provide informative data to guide and prioritize populations at the greatest risk and those most likely to benefit from interventions and treatment. However, there is a limitation in the population-based claim database (DB) because accurate diagnoses cannot be made due to limited clinical and laboratory information, despite the advantage of the vast amount of data.
In Korea, two representative population-based DBs have been used, the Korea National Health and Nutrition Examination Survey (KNHANES) DB, with a cross-sectional design, and the National Health Insurance Service (NHIS) DB, with a national claims DB cohort design [4]. The Korean NHIS, a single-payer system for all residents, covers 97.1% of Koreans (approximately 50 million individuals), and this DB could be an efficient resource for diabetes research based on the entire population [5]. These big DBs have different advantages and disadvantages, depending on their characteristics.
Clinical measures, including glycosylated hemoglobin (HbA1c) and the oral glucose tolerance test (OGTT), are the gold standards for diagnosing diabetes [6]. However, it is difficult to routinely conduct an HbA1c test or OGTT in a study involving an entire population, especially for subjects with mild hyperglycemia. Instead, an operational definition was adopted to define diabetes using claims-based data and national health examination data in the NHIS DB. Generally, T2DM can be defined as the assignment of an International Classification of Diseases, 10th Revision (ICD-10) code corresponding to T2DM (E11-14), with or without accompanying prescription codes for anti-diabetic drugs, or a high fasting glucose level (≥126 mg/dL) in the health check-up DB [7]. However, previous studies adopted different operational definitions of diabetes, depending on whether the diagnosis was based only on the corresponding ICD-10 codes [8,9], whether concomitant drug prescriptions were included [10-15], or whether fasting glucose results were included [16,17].
It is unknown whether diabetes defined from claims data using diagnostic codes (ICD-10), with or without prescription codes (anti-diabetic drug use), is consistent with actual diabetes in the real world. The quality of data must first be evaluated for fitness for use. Previous validation studies were performed based on comparisons with self-reports, telephone-based surveys, or medical chart reviews [18]. These methods may include biases, such as recall bias and selection bias, that affect accuracy and concordance. Our study aimed to evaluate the validity and reliability of the NHIS data-based definition of T2DM by comparing it with population-based KNHANES data as a standard reference. The overall sensitivity, specificity, positive and negative predictive value, accuracy, and agreement were analyzed. We also compared the prevalence and concordance of T2DM when the two algorithms were applied, depending on whether prescription codes were required in addition to diagnostic codes. To the best of our knowledge, this was the first study to validate the operational definition of T2DM using two big, linked Korean national DBs.
METHODS
The Institutional Review Board of The Catholic University of Korea (IRB No.: VC18FESI0240) approved this study. The study was conducted in compliance with the Declaration of Helsinki. Written informed consent by the subjects was waived due to the retrospective nature of our study, and anonymous, de-identified information was used for analysis.
Data sources
The Korean NHIS program is a computerized DB containing all claims data, including patient demographics, drug prescriptions, diagnostic codes for the disease coding system (ICD), insurers’ payment coverage, patients’ deductions, and claimed treatment details [7]. Among the total datasets in the NHIS DB, qualifications, claims, health check-up DB, and death information were used. We investigated whether fasting glucose levels were recorded in the health check-up DB and whether ICD-10 codes corresponding to T2DM and claimed prescription data for anti-diabetic drugs were present in the Korean Health Insurance Review and Assessment. All Korean citizens are encouraged to receive regular biennial or pre-employment health evaluations provided by NHIS. This regular health examination included assessments of anthropometric measures, blood pressure, social history, physical activity levels, and laboratory tests after overnight fasting, including serum glucose, total cholesterol, creatinine, liver function, and urinalysis.
KNHANES is a population-based cross-sectional survey designed to assess Koreans’ health-related behavior, health conditions, and nutritional status [19]. A representative sample of non-institutionalized civilians was obtained from all geographic regions in the country. In the KNHANES data, we analyzed the laboratory test results (fasting glucose and HbA1c levels) and collected responses to a questionnaire on whether the people included took anti-diabetic drugs or were diagnosed with T2DM. Among the eight phases of the KNHANES, data from phases IV to VII (2008 to 2017) were analyzed, and adults aged 40 years or older were included in the study. The subjects surveyed by the KNHANES each year were matched to the first claims data in the NHIS health check-up DB.
We identified a cohort of 39,701 subjects in the KNHANES from 2008 to 2017. Subjects who had no data on glucose levels in the medical check-up DB or did not undergo blood tests in a fasting state (for more than 8 hours) were excluded (n=1,598). Among them, 14,294 subjects in the NHIS health check-up DB matched those in the KNHANES. Finally, 13,006 subjects were included in the study, excluding those missing values for age, sex, body mass index, household income, alcohol or smoking status, regular exercise, or the presence of dyslipidemia, hypertension, or chronic kidney disease (CKD) in the KNHANES data (Fig. 1).
Definition of T2DM
According to the KNHANES, T2DM was defined as present if any of the following applied: (1) fasting glucose level of ≥126 mg/dL; (2) current use of any anti-diabetic medications; (3) a previous T2DM diagnosis; or (4) an HbA1c level of ≥6.5%. The use of medications and information on medical conditions were collected through the health interview questionnaire, using the face-to-face interview method [19]. According to the NHIS, T2DM was identified by the presence of at least one of these criteria: (1) fasting glucose level of ≥126 mg/dL in the health check-up DB or (2) the presence of ICD-10 codes corresponding to T2DM (E11-14) with or without accompanying prescription codes for any anti-diabetic drugs in the claims data. To define T2DM from the NHIS dataset, two algorithms based on claims data were applied: one that required diagnostic codes accompanied by prescription codes (algorithm 1) and one that required only diagnostic codes (algorithm 2).
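The two sets of criteria above can be sketched as simple classification rules. This is only an illustrative sketch, not the authors' SAS implementation: the record types, field names, and function names are hypothetical, and it assumes the fasting glucose criterion applies under both NHIS algorithms, as the description suggests.

```python
from dataclasses import dataclass
from typing import Optional

FASTING_GLUCOSE_CUTOFF = 126.0  # mg/dL, from the definitions above
HBA1C_CUTOFF = 6.5              # %, KNHANES criterion only


@dataclass(frozen=True)
class NhisRecord:
    """Hypothetical per-subject summary of NHIS data."""
    fasting_glucose: Optional[float]  # mg/dL from the health check-up DB
    has_t2dm_code: bool               # ICD-10 E11-14 present in claims
    has_antidiabetic_rx: bool         # anti-diabetic prescription code present


@dataclass(frozen=True)
class KnhanesRecord:
    """Hypothetical per-subject summary of KNHANES data."""
    fasting_glucose: Optional[float]  # mg/dL
    hba1c: Optional[float]            # %
    on_antidiabetic_medication: bool  # self-reported current use
    previous_t2dm_diagnosis: bool     # self-reported diagnosis


def nhis_t2dm(record: NhisRecord, algorithm: int) -> bool:
    """Algorithm 1: diagnostic code AND prescription code.
    Algorithm 2: diagnostic code alone.
    High fasting glucose qualifies under either (assumption)."""
    if algorithm not in (1, 2):
        raise ValueError("algorithm must be 1 or 2")
    high_glucose = (
        record.fasting_glucose is not None
        and record.fasting_glucose >= FASTING_GLUCOSE_CUTOFF
    )
    if algorithm == 1:
        claims_positive = record.has_t2dm_code and record.has_antidiabetic_rx
    else:
        claims_positive = record.has_t2dm_code
    return high_glucose or claims_positive


def knhanes_t2dm(record: KnhanesRecord) -> bool:
    """Reference standard: any one of the four KNHANES criteria."""
    high_glucose = (
        record.fasting_glucose is not None
        and record.fasting_glucose >= FASTING_GLUCOSE_CUTOFF
    )
    high_hba1c = record.hba1c is not None and record.hba1c >= HBA1C_CUTOFF
    return (
        high_glucose
        or high_hba1c
        or record.on_antidiabetic_medication
        or record.previous_t2dm_diagnosis
    )
```

A subject with a T2DM diagnostic code, no prescription code, and normal fasting glucose is therefore negative under algorithm 1 but positive under algorithm 2, which is the group that separates the two prevalence estimates.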
Definition of hypertension, dyslipidemia, and socioeconomic variables
Variables were defined based on the KNHANES data. Hypertension was defined as a systolic blood pressure of ≥140 mm Hg or diastolic blood pressure of ≥90 mm Hg or taking anti-hypertensive drugs [20]. Dyslipidemia was defined as a total cholesterol level of ≥240 mg/dL or taking lipid-lowering drugs [21]. CKD was defined as an estimated glomerular filtration rate of <60 mL/min/1.73 m² [22]. Information on household income was obtained through a questionnaire and dichotomized at the higher 25th percentile or divided into quartiles. Household income was calculated as an equivalent income by dividing monthly income by the square root of the family size. Alcohol intake was classified into three categories: never drinker, mild drinker (0 to 30 g/day), and heavy drinker (>30 g/day) [23]. The final education level was classified as elementary school graduation (education duration ≤6 years), middle school graduation (≤9 years), high school graduation (≤12 years), and university or higher (>12 years). When the education level was classified into two groups, subjects were classified as those who graduated from middle school or lower (education duration ≤9 years) and those who graduated from high school or higher (>9 years). Regular walking was defined as walking for at least 30 minutes per day at least five times a week [24].
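As a rough illustration of these covariate definitions, the thresholds can be written as small predicates. The function and parameter names below are hypothetical and are not taken from the KNHANES data dictionary.

```python
import math


def equivalized_income(monthly_household_income: float, family_size: int) -> float:
    """Monthly household income divided by the square root of family size."""
    if family_size < 1:
        raise ValueError("family_size must be at least 1")
    return monthly_household_income / math.sqrt(family_size)


def has_hypertension(systolic_mm_hg: float, diastolic_mm_hg: float,
                     on_antihypertensive_drugs: bool) -> bool:
    """Systolic >=140 mm Hg, diastolic >=90 mm Hg, or on treatment."""
    return (systolic_mm_hg >= 140
            or diastolic_mm_hg >= 90
            or on_antihypertensive_drugs)


def has_dyslipidemia(total_cholesterol_mg_dl: float,
                     on_lipid_lowering_drugs: bool) -> bool:
    """Total cholesterol >=240 mg/dL or on lipid-lowering treatment."""
    return total_cholesterol_mg_dl >= 240 or on_lipid_lowering_drugs


def has_ckd(egfr_ml_min_1_73m2: float) -> bool:
    """Estimated glomerular filtration rate below 60 mL/min/1.73 m^2."""
    return egfr_ml_min_1_73m2 < 60


# Example: a four-person household with a monthly income of 4000 (any currency unit)
print(equivalized_income(4000.0, 4))  # 2000.0
```

Dividing by the square root of household size, rather than by the size itself, reflects that shared living costs do not scale linearly with the number of household members.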
Statistical methods
T2DM was classified based on whether it satisfied the diagnostic criteria of the NHIS and KNHANES, respectively. Accordingly, the subjects were divided into four subgroups (NHIS-/KNHANES-, NHIS+/KNHANES-, NHIS-/KNHANES+, and NHIS+/KNHANES+, where positivity indicated a case corresponding to T2DM according to the criteria used). We summarized the characteristics of the participants by the presence or absence of T2DM across the four groups. An independent t-test was conducted on the continuous variables, and a chi-squared test was conducted on the categorical variables. The validity of the NHIS definition of T2DM was measured by estimating the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy using the KNHANES criteria as the standard. Accuracy was expressed as the proportion of correctly classified subjects (true positive and true negative) among all subjects [25]. The Kappa coefficient with corresponding 95% confidence intervals (CI) was also calculated to assess the reliability of the two diagnostic criteria for T2DM. In general, a Kappa coefficient larger than 0.8 indicated excellent consistency, and a Kappa value between 0.6 and 0.8 indicated good consistency [26]. Additionally, we evaluated whether there were differences in the agreement between the two T2DM criteria according to age, sex, household income, educational level, and the presence of hypertension or dyslipidemia. Data analysis was performed using SAS version 9.4 (SAS Institute, Cary, NC, USA).
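For reference, the validity and agreement measures can be written in terms of the four subgroup counts. The notation is introduced here for illustration: TP denotes NHIS+/KNHANES+, FP denotes NHIS+/KNHANES-, FN denotes NHIS-/KNHANES+, TN denotes NHIS-/KNHANES-, and N is their sum. The Kappa expression assumes the standard Cohen's kappa for two classifications, which the text does not state explicitly.

```latex
\begin{align*}
\text{Sensitivity} &= \frac{TP}{TP + FN}, &
\text{Specificity} &= \frac{TN}{TN + FP}, \\[4pt]
\text{PPV} &= \frac{TP}{TP + FP}, &
\text{NPV} &= \frac{TN}{TN + FN}, \\[4pt]
\text{Accuracy} &= p_o = \frac{TP + TN}{N}, &
\kappa &= \frac{p_o - p_e}{1 - p_e}, \\[4pt]
p_e &= \frac{(TP + FP)(TP + FN) + (FN + TN)(FP + TN)}{N^{2}}.
\end{align*}
```

Here $p_o$ is the observed agreement and $p_e$ is the agreement expected by chance from the marginal totals of the two classifications.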
RESULTS
The prevalence of T2DM according to operational definitions by the NHIS and KNHANES
The overall prevalence of T2DM satisfying KNHANES criteria was 14.2% (n=1,843). The prevalence of T2DM in the NHIS using algorithm 1 was 14.9% (n=1,942), and using algorithm 2, it was 20.8% (n=2,707) (Table 1). When classifying T2DM using the diagnostic criteria of the NHIS (algorithm 1) or KNHANES data, the proportion of subjects who met neither the NHIS nor the KNHANES diagnostic criteria (true negative) was 82.1% (n=10,683); 381 subjects (2.9%) met only the KNHANES diagnostic criteria (false negative), 480 subjects (3.7%) met only the NHIS criteria (false positive), and 1,462 (11.2%) met both (true positive) (Table 2). When the condition of using an anti-diabetic drug was excluded from the NHIS criteria (algorithm 2), 10,025 (77.1%) subjects did not meet either set of criteria, 274 subjects (2.1%) met only the KNHANES diagnostic criteria, 1,138 (8.7%) met only the NHIS criteria, and 1,569 (12.1%) met both criteria (Supplementary Table 1). According to algorithm 1, the subgroup that satisfied both criteria (NHIS+/KNHANES+) was older; had a higher proportion of men, hypertension, and CKD; and had higher HbA1c levels and lower income and education levels than the subgroups that satisfied only one set of criteria (NHIS+/KNHANES-, NHIS-/KNHANES+) or the non-diabetic group (NHIS-/KNHANES-) (Table 2).
Concordance measures
The overall sensitivity, specificity, PPV, NPV, accuracy, and Kappa coefficient of the NHIS diagnostic criteria (algorithm 1) compared to the KNHANES criteria were 79% (95% CI, 77 to 81), 96% (95% CI, 95 to 96), 75% (95% CI, 73 to 77), 97% (95% CI, 96 to 97), 93% (95% CI, 93 to 94), and 0.73 (95% CI, 0.72 to 0.75), respectively. When algorithm 2 was adopted in the NHIS criteria, sensitivity, specificity, PPV, NPV, accuracy, and the Kappa coefficient were 85% (95% CI, 84 to 87), 90% (95% CI, 89 to 90), 58% (95% CI, 56 to 60), 97% (95% CI, 97 to 98), 89% (95% CI, 89 to 90), and 0.63 (95% CI, 0.61 to 0.64), respectively (Fig. 2). The mean sensitivity (ranging from 73% to 83%), specificity (ranging from 93% to 97%), PPV (ranging from 67% to 82%), NPV (ranging from 94% to 98%), accuracy (ranging from 91% to 95%), and agreement (Kappa index, ranging from 0.68 to 0.78) of the NHIS definition criteria (algorithm 1) did not differ by age, sex, income level, education status, or accompanied hypertension or dyslipidemia (Table 3).
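The point estimates above can be reproduced from the four subgroup counts reported in the Results. The following is a minimal sketch rather than the authors' SAS code: the class and function names are hypothetical, the Kappa coefficient is assumed to be Cohen's kappa, and confidence intervals are omitted because the interval method is not stated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cross-classification of NHIS criteria against the KNHANES reference."""
    tp: int  # NHIS+/KNHANES+
    fp: int  # NHIS+/KNHANES-
    fn: int  # NHIS-/KNHANES+
    tn: int  # NHIS-/KNHANES-

    @property
    def n(self) -> int:
        return self.tp + self.fp + self.fn + self.tn


def concordance(t: TwoByTwo) -> dict:
    """Validity measures and Cohen's kappa from a 2x2 table."""
    observed = (t.tp + t.tn) / t.n
    # Chance agreement from the marginal totals of both classifications
    expected = (
        (t.tp + t.fp) * (t.tp + t.fn) + (t.fn + t.tn) * (t.fp + t.tn)
    ) / t.n ** 2
    return {
        "sensitivity": t.tp / (t.tp + t.fn),
        "specificity": t.tn / (t.tn + t.fp),
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
        "accuracy": observed,
        "kappa": (observed - expected) / (1 - expected),
    }


# Counts as reported in the Results (Table 2 and Supplementary Table 1)
ALGORITHM_1 = TwoByTwo(tp=1462, fp=480, fn=381, tn=10683)
ALGORITHM_2 = TwoByTwo(tp=1569, fp=1138, fn=274, tn=10025)

for name, table in (("algorithm 1", ALGORITHM_1), ("algorithm 2", ALGORITHM_2)):
    rounded = {key: round(value, 2) for key, value in concordance(table).items()}
    print(name, table.n, rounded)
```

Rounded to two decimals, the output matches the reported values for both algorithms (for example, Kappa of 0.73 and 0.63), which also confirms that the subgroup counts sum to the study population of 13,006.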
DISCUSSION
Overall good validity and consistency of the diagnostic criteria using NHIS data were observed, which did not differ by age, sex, socioeconomic factors, or accompanied hypertension or dyslipidemia. When two diagnostic algorithms were applied to NHIS data according to whether the diagnostic codes were accompanied by prescription codes (algorithm 1) or not (algorithm 2), the prevalence of T2DM by algorithm 1 was lower than that by algorithm 2 and was similar to the prevalence using the KNHANES data. In addition, although good reliability was observed for both algorithms, specificity and accuracy tended to be higher for the algorithm that included both diagnostic and prescription codes (algorithm 1).
The prevalence of T2DM in the NHIS data using algorithm 1 (adopting both diagnostic and prescription codes) was 5.9 percentage points lower than when algorithm 2 (adopting only diagnostic codes) was applied. False positives (cases identified in NHIS claims data as having T2DM that were not diagnosed with T2DM by KNHANES criteria) increased when T2DM was defined only by diagnostic codes (8.7% in algorithm 2, 3.7% in algorithm 1). The overall prevalence of T2DM identified using algorithm 1 in this study was similar to the overall prevalence published in the 2021 Korea Diabetes Fact Sheet using KNHANES data (16.7%, approximately 6.05 million people) [15]. The mean HbA1c level in the NHIS+/KNHANES- group (false positives) was 5.9% using algorithm 1 and 5.7% using algorithm 2 in the study. There may be cases in which claims were issued for a T2DM diagnosis in subjects with prediabetes or early T2DM who did not require medications. Also, even though both algorithms (whether or not prescription claims data were included) provided good agreement based on the Kappa index, higher specificity and accuracy for defining T2DM based on the NHIS were observed when claims for diagnostic codes were present along with prescription codes. Including both diagnostic codes and prescription codes in the criteria for defining T2DM in the NHIS dataset helped to distinguish patients in a prediabetic or early diabetic state from those with overt diabetes requiring treatment.
Concordance and the consistency of the diagnostic value based on NHIS criteria (algorithm 1) did not differ according to age, sex, socioeconomic factors, or accompanied hypertension or dyslipidemia. The accuracy and specificity were over 90%, and the mean Kappa index showed good reliability (ranging from 0.68 to 0.78). These trends were consistent when algorithm 2 was applied (data not shown). A previous validation study compared accuracy and consistency using self-reports or telephone surveys as a reference standard [18]. Self-reports and telephone surveys are prone to recall bias, social desirability bias, poor understanding of the survey questions, and respondents' incomplete knowledge of their own diagnosis. A literature review demonstrated that participants’ sociodemographic characteristics, such as age, gender, race, setting, and socioeconomic and health status, were associated with incomplete data linkage and the potential for systematic bias in reported outcomes [27]. In contrast, our study used KNHANES data, a population-based surveillance system, as a reference standard. The KNHANES data has the advantage of minimizing selection bias compared to a diagnosis based on an electronic medical chart review or interviews because the target population of the KNHANES comprises nationally representative non-institutionalized civilians in Korea. In addition, including clinical measures (HbA1c) as one of the diagnostic criteria in the reference standard can help overcome the potential limitation of systematic bias. Also, data linkage between the KNHANES and NHIS compensated for a shortcoming of the claims data, namely the lack of clinical information such as disease duration or glycemic control status, by adding information from self-reported surveys and urine or blood sample measurements in the KNHANES.
The validity of national administrative claims data has also been evaluated in other countries such as Japan [28], Canada [29], and the USA [18]. Based on the Japanese national claim DB, the algorithm that contained both diagnosis-related codes for diabetes and medication codes had higher specificity (mean, 99.4% vs. 91.6%) and agreement (mean Kappa index, 0.80 vs. 0.49) than the algorithm that contained only diagnosis-related codes [28]. According to healthcare administrative data from Canada, compared with electronic medical records, the algorithm with the best specificity and PPV while maintaining sensitivity above 80% was either one hospitalization or physician claim and either one prescription for a drug or one diabetes-specific fee code at any time [29]. A validation of physician claims data based on ICD-9 codes in the USA demonstrated that the sensitivity ranged from 26.9% to 97.0%, specificity ranged from 94.5% to 99.4%, and the Kappa index ranged from 0.8 to 0.9 [18]. Compared with the sensitivity, specificity, and Kappa agreement reported in other countries, the algorithm based on Korean NHIS data also demonstrated good validity and reliability.
Several limitations to this study should be considered. First, selection bias may have occurred because two-thirds of the subjects were excluded due to missing data on fasting glucose levels in the medical check-up DB or covariates in the KNHANES data, as well as cases where the person refused to provide personal information. In addition, only subjects aged 40 years or older were included in this study because the national health check-up was conducted for those aged 40 years or older. Second, among the diagnostic criteria in the KNHANES data used as a standard reference, questionnaires were also used to classify patients with T2DM through a self-reported survey. Other laboratory tests and data, such as the OGTT or hyperglycemia-accompanied symptoms, were not present in the data used to diagnose T2DM. As a result, the KNHANES data also did not fully reflect all patients with T2DM in real-world settings. Third, defining T2DM according to claims-based data can overlook patients with untreated diabetes or those who did not require treatment. Clinical factors such as disease duration, diabetes management status, or accompanied hypertension or dyslipidemia were not assessed through the NHIS data. Despite these limitations, validating the operational diagnosis of T2DM by linking these two big national DBs, including clinical measures (HbA1c), represents a very important and timely investigation approach for future diabetes research in Korea.
In conclusion, population-based NHIS claims data can be useful in identifying subjects with T2DM by using diagnostic and prescription codes as diagnostic criteria in epidemiologic studies. The validity and accuracy of the population-based claims data for identifying T2DM were well documented and independent of sociodemographic and metabolic risk factors.
SUPPLEMENTARY MATERIALS
Supplementary materials related to this article can be found online at https://doi.org/10.4093/dmj.2022.0375.
Notes
CONFLICTS OF INTEREST
Seung-Hyun Ko has been executive editor of the Diabetes & Metabolism Journal since 2022. Yong-Moon Park has been statistical advisor of the Diabetes & Metabolism Journal since 2021. They were not involved in the review process of this article. Otherwise, there was no conflict of interest.
AUTHOR CONTRIBUTIONS
Conception or design: J.H.B., K.D.H., S.H.K.
Acquisition, analysis, or interpretation of data: Y.M.P., K.D.H.
Drafting or revising the work: J.H.B., M.K.M., J.H.C.
Final approval of the manuscript: K.D.H., S.H.K.
FUNDING
This research was supported by a grant from the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health and Welfare, Republic of Korea (grant number: HI18-C0275).
Acknowledgements
None