TY - GEN
T1 - Early Diabetes Prediction
T2 - A Comparative Study Using Machine Learning Techniques
AU - Poly, Tahmina Nasrin
AU - Islam, Md Mohaimenul
AU - Li, Yu Chuan Jack
N1 - Publisher Copyright:
© 2022 The authors and IOS Press.
PY - 2022
Y1 - 2022
N2 - Most screening tests for Diabetes Mellitus (DM) in use today were developed using data collected electronically from Electronic Health Records (EHRs). However, developing and underdeveloped countries are still struggling to build EHR systems in their hospitals. Due to the lack of EHR data, early screening tools are not available in those countries. This study develops a prediction model for early DM using direct questionnaires at a tertiary hospital in Bangladesh. An information gain technique was used to remove irrelevant features. Using the selected variables, we developed logistic regression, support vector machine, K-nearest neighbor, Naïve Bayes, random forest (RF), and neural network models to predict diabetes at an early stage. RF outperformed the other machine learning algorithms, achieving 100% accuracy. These findings suggest that a combination of simple questionnaires and a machine learning algorithm can be a powerful tool to identify undiagnosed DM patients.
AB - Most screening tests for Diabetes Mellitus (DM) in use today were developed using data collected electronically from Electronic Health Records (EHRs). However, developing and underdeveloped countries are still struggling to build EHR systems in their hospitals. Due to the lack of EHR data, early screening tools are not available in those countries. This study develops a prediction model for early DM using direct questionnaires at a tertiary hospital in Bangladesh. An information gain technique was used to remove irrelevant features. Using the selected variables, we developed logistic regression, support vector machine, K-nearest neighbor, Naïve Bayes, random forest (RF), and neural network models to predict diabetes at an early stage. RF outperformed the other machine learning algorithms, achieving 100% accuracy. These findings suggest that a combination of simple questionnaires and a machine learning algorithm can be a powerful tool to identify undiagnosed DM patients.
KW - Diabetes
KW - early-stage prediction
KW - machine learning
KW - random forest
UR - http://www.scopus.com/inward/record.url?scp=85133244028&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85133244028&partnerID=8YFLogxK
U2 - 10.3233/SHTI220752
DO - 10.3233/SHTI220752
M3 - Conference contribution
C2 - 35773898
AN - SCOPUS:85133244028
T3 - Studies in Health Technology and Informatics
SP - 409
EP - 413
BT - Advances in Informatics, Management and Technology in Healthcare
A2 - Mantas, John
A2 - Gallos, Parisis
A2 - Zoulias, Emmanouil
A2 - Hasman, Arie
A2 - Househ, Mowafa S.
A2 - Diomidous, Marianna
A2 - Liaskos, Joseph
A2 - Charalampidou, Martha
PB - IOS Press BV
ER -