TY - JOUR
T1 - Clustering-based risk stratification of prediabetes populations
T2 - Insights from the Taiwan and UK Biobanks
AU - Onthoni, Djeane Debora
AU - Chen, Ying Erh
AU - Lai, Yi Hsuan
AU - Li, Guo Hung
AU - Zhuang, Yong Sheng
AU - Lin, Hong Ming
AU - Hsiao, Yu Ping
AU - Onthoni, Ade Indra
AU - Chiou, Hung Yi
AU - Chung, Ren Hua
N1 - Publisher Copyright: © 2024 The Author(s). Journal of Diabetes Investigation published by Asian Association for the Study of Diabetes (AASD) and John Wiley & Sons Australia, Ltd.
PY - 2024
Y1 - 2024
N2 - Aims/Introduction: This study aimed to identify low- and high-risk diabetes groups within prediabetes populations using data from the Taiwan Biobank (TWB) and UK Biobank (UKB) through a clustering-based Unsupervised Learning (UL) approach, to inform targeted type 2 diabetes (T2D) interventions. Materials and Methods: Data from TWB and UKB, comprising clinical and genetic information, were analyzed. Prediabetes was defined by glucose thresholds, and incident T2D was identified through follow-up data. K-means clustering was performed on prediabetes participants using significant features determined through logistic regression and LASSO. Cluster stability was assessed using mean Jaccard similarity, silhouette score, and the elbow method. Results: We identified two stable clusters representing high- and low-risk diabetes groups in both biobanks. The high-risk clusters showed higher diabetes incidence, with 15.7% in TWB and 13.0% in UKB, compared to 7.3% and 9.1% in the low-risk clusters, respectively. Notably, males were predominant in the high-risk groups, constituting 76.6% in TWB and 52.7% in UKB. In TWB, the high-risk group also exhibited significantly higher BMI, fasting glucose, and triglycerides, while UKB showed marginal significance in BMI and other metabolic indicators. Current smoking was significantly associated with increased diabetes risk in the TWB high-risk group (P < 0.001). Kaplan–Meier curves indicated significant differences in diabetes complication incidences between clusters. Conclusions: UL effectively identified risk-specific groups within prediabetes populations, with high-risk groups strongly associated with male gender, higher BMI, smoking, and metabolic markers. Tailored preventive strategies, particularly for young males in Taiwan, are crucial to reducing T2D risk.
AB - Aims/Introduction: This study aimed to identify low- and high-risk diabetes groups within prediabetes populations using data from the Taiwan Biobank (TWB) and UK Biobank (UKB) through a clustering-based Unsupervised Learning (UL) approach, to inform targeted type 2 diabetes (T2D) interventions. Materials and Methods: Data from TWB and UKB, comprising clinical and genetic information, were analyzed. Prediabetes was defined by glucose thresholds, and incident T2D was identified through follow-up data. K-means clustering was performed on prediabetes participants using significant features determined through logistic regression and LASSO. Cluster stability was assessed using mean Jaccard similarity, silhouette score, and the elbow method. Results: We identified two stable clusters representing high- and low-risk diabetes groups in both biobanks. The high-risk clusters showed higher diabetes incidence, with 15.7% in TWB and 13.0% in UKB, compared to 7.3% and 9.1% in the low-risk clusters, respectively. Notably, males were predominant in the high-risk groups, constituting 76.6% in TWB and 52.7% in UKB. In TWB, the high-risk group also exhibited significantly higher BMI, fasting glucose, and triglycerides, while UKB showed marginal significance in BMI and other metabolic indicators. Current smoking was significantly associated with increased diabetes risk in the TWB high-risk group (P < 0.001). Kaplan–Meier curves indicated significant differences in diabetes complication incidences between clusters. Conclusions: UL effectively identified risk-specific groups within prediabetes populations, with high-risk groups strongly associated with male gender, higher BMI, smoking, and metabolic markers. Tailored preventive strategies, particularly for young males in Taiwan, are crucial to reducing T2D risk.
KW - Machine learning
KW - Prediabetes
KW - Risk stratification
UR - http://www.scopus.com/inward/record.url?scp=85205878784&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85205878784&partnerID=8YFLogxK
U2 - 10.1111/jdi.14328
DO - 10.1111/jdi.14328
M3 - Article
AN - SCOPUS:85205878784
SN - 2040-1116
JO - Journal of Diabetes Investigation
JF - Journal of Diabetes Investigation
ER -