Abstract
Search data have been found to be useful variables for predicting COVID-19 trends. In this study, we investigated the performance of online search data in state space models (SSMs), linear regression (LR) models, and generalized linear models (GLMs) using South Korean data from January 20, 2020, to July 31, 2021. Principal component analysis (PCA) was used to construct composite features, which were then used in model development. Root mean squared error (RMSE), peak day error (PDE), and peak magnitude error (PME) were defined as loss functions. The results showed that integrating search data into the models for short- and long-term prediction yielded low RMSE values, particularly for SSMs. The findings indicated that the type of model used strongly affects both prediction performance and model interpretability. Furthermore, including PDE and PME could be beneficial when evaluating how well peaks are predicted.
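The three loss functions named above can be illustrated with a minimal sketch. The abstract does not give formulas for PDE or PME, so the definitions here are assumptions: PDE is taken as the absolute difference, in days, between the predicted and observed peak positions, and PME as the absolute difference between the predicted and observed peak values. The function names and the toy series are hypothetical and are not taken from the study.

```python
import numpy as np


def rmse(observed, predicted):
    """Root mean squared error between two equal-length series."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))


def peak_day_error(observed, predicted):
    """Assumed definition: absolute gap, in time steps (days),
    between the index of the predicted peak and the observed peak."""
    return abs(int(np.argmax(predicted)) - int(np.argmax(observed)))


def peak_magnitude_error(observed, predicted):
    """Assumed definition: absolute gap between the predicted
    and observed peak values."""
    return float(abs(np.max(predicted) - np.max(observed)))


# Toy daily case counts (illustrative only, not study data).
observed = [10, 20, 50, 30, 15]
predicted = [12, 25, 35, 45, 20]

print("RMSE:", rmse(observed, predicted))
print("PDE:", peak_day_error(observed, predicted))
print("PME:", peak_magnitude_error(observed, predicted))
```

In this toy series the forecast peaks one day late and slightly low, which PDE and PME report directly, while RMSE summarizes the overall fit without saying where the error occurs. That is one plausible reading of why peak-specific measures might complement RMSE.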
Original language | English |
---|---|
Pages (from-to) | 855-859 |
Number of pages | 5 |
Journal | Studies in Health Technology and Informatics |
Volume | 310 |
Publication status | Published - Jan 25 2024 |
Keywords
- COVID-19
- digital epidemiology
- internet search
- prediction
- time series
ASJC Scopus subject areas
- Biomedical Engineering
- Health Informatics
- Health Information Management