Orchestrating an optimized next-generation sequencing-based cloud workflow for robust viral identification during pandemics

Hendrick Gao Min Lim, Shih Hsin Hsiao, Yuan Chii Gladys Lee

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)


Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has become the latest pandemic, following the 2009 swine flu pandemic caused by the influenza A virus (H1N1 subtype). Accurately identifying the very large number of samples collected during a pandemic remains a challenge. In this study, we integrate two technologies, next-generation sequencing and cloud computing, into an optimized workflow that runs a specific identification algorithm on a designated cloud platform. We use 182 samples (92 for COVID-19 and 90 for swine flu) with short-read sequencing data from two open-access datasets, one representing each pandemic, and evaluate workflow performance using an index created specifically for SARS-CoV-2 or H1N1. The results show that our workflow can differentiate cases of the two pandemics, with accuracy that varied with the index used and was higher when an index exclusively representing each dataset was used. By retaining the essential tools internally, our workflow substantially outperforms the original complete identification workflow available on the same platform in terms of time and cost. Our workflow can serve as a powerful tool for the robust identification of cases and thus aid in controlling current and future pandemics.

Original language: English
Article number: 1023
Issue number: 10
Publication status: Published - Oct 2021


  • Cloud computing
  • Cloud workflow
  • COVID-19
  • H1N1
  • Next-generation sequencing
  • Pandemics
  • SARS-CoV-2
  • Swine flu

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)
  • Immunology and Microbiology (all)
  • Agricultural and Biological Sciences (all)

