Deep Learning Algorithms for Detection of Diabetic Retinopathy in Retinal Fundus Photographs: A Systematic Review and Meta-Analysis

Md Mohaimenul Islam, Hsuan-Chia Yang, Tahmina Nasrin Poly, Wen-Shan Jian, Yu-Chuan (Jack) Li

Research output: Contribution to journal › Article › peer-review

75 Citations (Scopus)


Background: Diabetic retinopathy (DR) is one of the leading causes of blindness globally. Earlier detection and timely treatment of DR are desirable to reduce the incidence and progression of vision loss. Deep learning (DL) approaches have recently offered better performance in detecting DR from retinal fundus images. We therefore performed a systematic review with a meta-analysis of relevant studies to quantify the performance of DL algorithms for detecting DR.

Methods: A systematic literature search of EMBASE, PubMed, Google Scholar, and Scopus was performed for the period from January 1, 2000, to March 31, 2019. The search strategy was based on the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) reporting guidelines, and a DL-based study design was mandatory for article inclusion. Two authors independently screened titles and abstracts against the inclusion and exclusion criteria. Data were extracted by two authors independently using a standard form, and the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool was used to assess risk of bias and applicability.

Results: Twenty-three studies were included in the systematic review; 20 studies met the inclusion criteria for the meta-analysis. The pooled area under the receiver operating characteristic curve (AUROC) for DR was 0.97 (95% CI: 0.95-0.98), sensitivity was 0.83 (95% CI: 0.83-0.83), and specificity was 0.92 (95% CI: 0.92-0.92). The positive and negative likelihood ratios were 14.11 (95% CI: 9.91-20.07) and 0.10 (95% CI: 0.07-0.16), respectively. Moreover, the diagnostic odds ratio for DL models was 136.83 (95% CI: 79.03-236.93). All the studies provided a DR-grading scale and used a human grader (e.g., trained caregivers, ophthalmologists) as the reference standard.

Conclusion: The findings of our study showed that DL algorithms had high sensitivity and specificity for detecting referable DR from retinal fundus photographs. Applying a DL-based automated tool for assessing DR from color fundus images could provide an alternative solution to reduce misdiagnosis and improve workflow. Such a tool offers substantial benefits: reducing screening costs, improving accessibility to healthcare, and facilitating earlier treatment.
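For readers less familiar with the accuracy measures reported in the Results, the following is a minimal sketch of how sensitivity, specificity, the likelihood ratios, and the diagnostic odds ratio are defined for a single 2x2 table of algorithm output against a reference standard. The class and function names are illustrative, and the counts are hypothetical; they are not taken from any included study. Pooled meta-analytic estimates come from a separate statistical model across studies, so applying these formulas to the pooled sensitivity and specificity need not reproduce the pooled likelihood ratios or odds ratio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from one study's 2x2 table (algorithm vs. reference standard)."""
    tp: int  # DR present, algorithm positive
    fp: int  # DR absent, algorithm positive
    fn: int  # DR present, algorithm negative
    tn: int  # DR absent, algorithm negative


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return standard diagnostic accuracy measures for a single 2x2 table."""
    # The ratios below are undefined when any of these cells is zero.
    if 0 in (c.fp, c.fn, c.tn):
        raise ValueError("fp, fn and tn must all be non-zero")

    sensitivity = c.tp / (c.tp + c.fn)       # true positive rate
    specificity = c.tn / (c.tn + c.fp)       # true negative rate
    lr_pos = sensitivity / (1 - specificity)  # positive likelihood ratio
    lr_neg = (1 - sensitivity) / specificity  # negative likelihood ratio
    dor = lr_pos / lr_neg                     # diagnostic odds ratio

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": dor,
    }


# Hypothetical counts, chosen only to illustrate the calculation.
example = ConfusionCounts(tp=90, fp=40, fn=10, tn=360)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The diagnostic odds ratio computed this way equals (TP x TN) / (FP x FN), which gives a quick check on the result for any single table.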
Original language: English
Article number: 105320
Pages (from-to): 105320
Journal: Computer Methods and Programs in Biomedicine
Publication status: Published - Jul 2020


Keywords

  • Deep learning
  • Diabetic
  • Diabetic retinopathy
  • Fundus photograph
  • Retinopathy

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Health Informatics
