Comparing social media and Google to detect and predict severe epidemics
Identifiers
Permanent link (URI): http://hdl.handle.net/10017/60424
DOI: 10.1038/s41598-020-61686-9
ISSN: 2045-2322
Publisher
Springer Nature
Date
2020-03-16
Bibliographic citation
Samaras, L., García Barriocanal, M.E. & Sicilia Urbán, M.A. 2020, "Comparing social media and Google to detect and predict severe epidemics", Scientific Reports, vol. 10, art. no. 4747, pp. 1-11.
Document type
info:eu-repo/semantics/article
Version
info:eu-repo/semantics/publishedVersion
Publisher's version
https://doi.org/10.1038/s41598-020-61686-9
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
© 2020 The authors
Access rights
info:eu-repo/semantics/openAccess
Abstract
Internet technologies have demonstrated their value for the early detection and prediction of
epidemics. In diverse cases, electronic surveillance systems can be created by obtaining and analyzing
on-line data, complementing other existing monitoring resources. This paper reports the feasibility
of building such a system with search engine and social network data. Concretely, this study aims at
gathering evidence on which kind of data source leads to better results. Data have been acquired from
the Internet by means of a system which gathered real-time data for 23 weeks. Data on influenza in
Greece have been collected from Google and Twitter and they have been compared to influenza data
from the official authority of Europe. The data were analyzed by using two models: the ARIMA model
computed estimations based on weekly sums and a customized approximate model which uses daily
sums. Results indicate that influenza was successfully monitored during the test period. Google data
show a high Pearson correlation and a relatively low Mean Absolute Percentage Error (R=0.933,
MAPE=21.358). Twitter results are slightly better (R=0.943, MAPE=18.742). The alternative model is
slightly worse than the ARIMA(X) (R=0.863, MAPE=22.614), and it has a higher absolute mean
deviation (5.99% vs 4.74%).
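
The abstract reports each data source by its Pearson correlation (R) and Mean Absolute Percentage Error (MAPE). As a minimal sketch of how these two metrics are conventionally computed, and not the authors' own code, the following assumes that MAPE is expressed in percent and that the official and estimated series are aligned lists of equal length. The function names and the sample numbers are hypothetical and purely illustrative; they are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (assumed convention)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Made-up weekly counts, for illustration only (not from the paper).
    official = [120.0, 150.0, 200.0, 180.0, 90.0]
    estimated = [110.0, 160.0, 190.0, 200.0, 100.0]
    print(f"R = {pearson_r(official, estimated):.3f}")
    print(f"MAPE = {mape(official, estimated):.3f}")
```

A high R with a moderate MAPE, as reported for both Google and Twitter, indicates that the estimates track the shape of the official curve closely even where the individual weekly values differ.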
Files in this item
Files | Size | Format
---|---|---
Comparing_Samaras_Sci_Rep_2020.pdf | 1.094Mb | PDF
Collections
- CCOMPUT - Artículos [86]