Most screening tests for Diabetes Mellitus (DM) in use today were developed using electronically collected data from Electronic Health Records (EHRs). However, developing and underdeveloped countries are still struggling to build EHR systems in their hospitals. Because EHR data are lacking, early screening tools are not available in those countries. This study develops a prediction model for early-stage DM based on direct questionnaires, for a tertiary hospital in Bangladesh. The information gain technique was used to remove irrelevant features. Using the selected variables, we developed logistic regression, support vector machine, K-nearest neighbor, Naïve Bayes, random forest (RF), and neural network models to predict diabetes at an early stage. RF outperformed the other machine learning algorithms, achieving 100% accuracy. These findings suggest that a combination of simple questionnaires and a machine learning algorithm can be a powerful tool for identifying undiagnosed DM patients.
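
The feature-selection step mentioned in the abstract can be illustrated with a minimal sketch of information gain on yes/no questionnaire answers. This is not the authors' implementation: the feature names, the toy answers, and the `threshold` cut-off below are all hypothetical and chosen only to show how the score separates informative questions from uninformative ones.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the weighted label entropy after splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder


# Made-up questionnaire answers (1 = yes, 0 = no); NOT data from the study.
answers = {
    "polyuria":   [1, 1, 1, 1, 0, 0, 0, 0],
    "polydipsia": [1, 1, 1, 0, 0, 0, 0, 1],
    "itching":    [1, 0, 1, 0, 1, 0, 1, 0],
}
diabetic = [1, 1, 1, 1, 0, 0, 0, 0]  # made-up outcome labels

gains = {name: information_gain(values, diabetic) for name, values in answers.items()}

threshold = 0.05  # hypothetical cut-off; the abstract does not state one
ranked = sorted(gains.items(), key=lambda item: item[1], reverse=True)
selected = [name for name, gain in ranked if gain > threshold]
```

In this toy data, a question whose answers split evenly across both outcome classes scores zero and is dropped, while questions that track the outcome are kept and would be passed on to the classifiers.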

Original language: English
Title of host publication: Advances in Informatics, Management and Technology in Healthcare
Editors: John Mantas, Parisis Gallos, Emmanouil Zoulias, Arie Hasman, Mowafa S. Househ, Marianna Diomidous, Joseph Liaskos, Martha Charalampidou
Publisher: IOS Press BV
Number of pages: 5
ISBN (Electronic): 9781643682907
Publication status: Published - 2022

Publication series

Name: Studies in Health Technology and Informatics
ISSN (Print): 0926-9630
ISSN (Electronic): 1879-8365


  • Diabetes
  • early-stage prediction
  • machine learning
  • random forest

ASJC Scopus subject areas

  • Biomedical Engineering
  • Health Informatics
  • Health Information Management


Title: Early Diabetes Prediction: A Comparative Study Using Machine Learning Techniques
