Bar Chart (descending order) and Data Table
Case numbers are taken from (JHU 2020b). To compensate for daily fluctuations, the mean number of cases over the past seven days (Mean_Daily_*) is added.
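A minimal sketch of how such a seven-day mean can be derived from cumulative counts in base R. The helper name `mean_daily_7` and the input data are illustrative assumptions, not the dashboard's actual code.

```r
# Hypothetical helper: trailing 7-day mean of daily cases,
# computed from a vector of cumulative case counts.
mean_daily_7 <- function(cum_cases) {
  daily <- diff(c(0, cum_cases))  # daily new cases from cumulative counts
  # sides = 1 averages the current day and the six days before it;
  # the first six entries are NA because no full week is available yet
  as.numeric(stats::filter(daily, rep(1 / 7, 7), sides = 1))
}

# Made-up daily counts, accumulated to mimic a cumulative time series
cum_cases  <- cumsum(c(10, 12, 8, 3, 2, 15, 20, 11, 13, 9))
mean_daily <- mean_daily_7(cum_cases)
```

A trailing window is used here so that the value for a given day depends only on data available up to that day.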
Bar Chart - Cumulative Cases per 100,000 Inhabitants
Population numbers are taken from (UNO 2020). To compensate for daily fluctuations, the mean number of cases over the past seven days (Mean_Daily_*) is added.
Bar Chart - Mean daily cases over the past seven days per 100,000 Inhabitants
Population numbers are taken from (UNO 2020). To compensate for daily fluctuations, the mean number of cases over the past seven days (Mean_Daily_*) is added.
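The normalization per 100,000 inhabitants can be sketched as follows. The country names, case numbers and population figures are made up for illustration, and the column names are assumptions rather than the dashboard's own.

```r
# Hypothetical example values, not taken from the data sources
cases <- data.frame(
  Country    = c("A", "B"),
  Mean_Daily = c(500, 1200),   # mean daily cases over the past seven days
  Population = c(9e6, 83e6)
)

# Cases per 100,000 inhabitants
cases$Mean_Daily_per_100k <- cases$Mean_Daily / cases$Population * 1e5

# Sort in descending order, as in the bar charts
cases <- cases[order(-cases$Mean_Daily_per_100k), ]
```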
Bar Chart - Case Fatality Rate - CFR of mean daily (over past 7 days) and CFR_total (cumulated) assuming a time lag of 12 days between Confirmed => Death
The number of confirmed cases is an early predictor of the number of deaths. Today’s deaths are already determined by the infections of \(\sim19\) days ago, or equivalently by the cases confirmed \(\sim11\) days ago (see RWI 2020).
An average duration of Confirmed Infection to Death of \(12\) (lag-)days is assumed (country-independent for the sake of simplicity) for the calculations, since many tests are carried out in the meantime before symptoms appear.
However, this varies considerably with the country-specific test rate and health system. In the worst health systems there may be only one day between confirmation of the infection and death, so the “Confirmed” cases must be “lagged” by \(\sim1\) day. In the best case, the time from the end of the incubation period (on average \(\sim5-6\) days) to death averages \(\sim14\) days. The average Confirmed-to-Death period is then \(\sim14\) days, and the “Confirmed” cases must be “lagged” by \(\sim14\) days. For the assumed time periods see (RKI 2020c), (RKI 2020b); for Case Fatality Rate and Incubation Period in general see (Wikipedia contributors 2020a), (Wikipedia contributors 2020c).
The simple calculation, cumulative deaths divided by unlagged cumulative confirmed cases, significantly underestimates the CFR in health systems with early disease detection. Once the number of cumulative cases is large compared to the number of active cases (~ cases from the past two weeks), the “lagged” and “unlagged” values converge.
The Infection Fatality Rate (IFR) is the fatality rate across all infections, i.e. detected (confirmed) cases plus undetected cases (the asymptomatic and untested group). This lethality is assumed to be country-independent, and only rough estimates exist (RKI: lower end of existing estimates \(\sim0.56\%\)).
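The effect of lagging can be illustrated with a small base R sketch. The helper `cfr_lagged` and the synthetic series are assumptions made for illustration; the series assumes that 2% of confirmed cases die exactly 12 days after confirmation while cases double every 6 days.

```r
# Hypothetical helper: CFR in percent, with confirmed cases lagged by `lag_days`
cfr_lagged <- function(confirmed, deaths, lag_days = 12) {
  n <- length(confirmed)
  stopifnot(length(deaths) == n, lag_days >= 0, lag_days < n)
  # confirmed cases `lag_days` earlier; NA where no earlier value exists
  confirmed_lagged <- c(rep(NA_real_, lag_days), confirmed[seq_len(n - lag_days)])
  100 * deaths / confirmed_lagged
}

# Made-up cumulative series (assumption: 2% fatality, 12 days delay)
day       <- 0:59
confirmed <- 100 * 2^(day / 6)
deaths    <- 0.02 * c(rep(0, 12), confirmed[1:48])

cfr_total    <- cfr_lagged(confirmed, deaths, lag_days = 12)
cfr_unlagged <- cfr_lagged(confirmed, deaths, lag_days = 0)
```

On the last day of this synthetic series the lagged value recovers the assumed 2%, while the unlagged value is only 0.5%, because the confirmed cases have quadrupled during the 12 lag days.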
Country | Date | CFR_mean_daily | CFR_total | CFR_unlagged |
---|---|---|---|---|
Austria | 2023-03-09 | 0.2 | 0.4 | 0.4 |
EU | 2023-03-09 | 0.6 | 0.7 | 0.7 |
France | 2023-03-09 | 0.5 | 0.4 | 0.4 |
Germany | 2023-03-09 | 0.6 | 0.4 | 0.4 |
India | 2023-03-09 | 0.6 | 1.2 | 1.2 |
Italy | 2023-03-09 | 0.8 | 0.7 | 0.7 |
Japan | 2023-03-09 | 0.4 | 0.2 | 0.2 |
Spain | 2023-03-09 | 1.3 | 0.9 | 0.9 |
United Kingdom | 2023-03-09 | 0.0 | 0.9 | 0.9 |
United States of America | 2023-03-09 | 0.9 | 1.1 | 1.1 |
Countries - Table overview
Population numbers are taken from (UNO 2020). In order to compensate for daily fluctuations, the mean number of cases for the past seven days (Mean_Daily_*) is used instead of the daily cases.
European Union - Table overview
Population numbers are taken from (UNO 2020). In order to compensate for daily fluctuations, the mean number of cases for the past seven days (Mean_Daily_*) is used instead of the daily cases.
Cumulative and daily Cases over Time
Selected Countries
Germany - Rolling Mean and Reproduction Number
The 7-days Rolling Mean/Moving Average of the Daily Confirmed and Death Cases smooths out the short-term weekly fluctuations (weekend).
The daily confirmed cases refer to the left y-axis, the daily deaths to the right y-axis. This clearly shows the 12-day delay between daily confirmed cases and deaths, and also the ratio of roughly ~1/25 (~4%) between them.
Note: Age group dependent 7-Day Incidence plots for Germany are largely provided weekly.
The age group specific incidence data RKI - COVID-19-Fälle nach Altersgruppe und Meldewoche - is provided weekly, every Tuesday evening.
The calculation of the reproduction number \(R(t)\) uses an R function provided by (Thomas Hotz 2020b) on GitHub.
The (effective) reproduction number \(R(t)\) at day \(t\) is the average number of people someone infected at time \(t\) would infect if conditions remained the same.
For further German federal states figures (based on the data provided by Robert Koch Institut) see (Thomas Hotz 2020a) and for worldwide figures (based on the data provided by Johns Hopkins University) see (Thomas Hotz 2020c).
For the calculation, the assumption of a 7-day reporting delay (a confirmed case is reported 7 days after the ‘real’ infection) is unchanged and the same modelled infectivity profile w is used. The lower and upper confidence interval lines provide the (approximate, pointwise) 95% confidence interval (based only on statistical numbers; possible changes in e.g. counting measures cannot be taken into account).
This is the reason why (Thomas Hotz 2020b) “do not compute an average over a sliding window of seven days so the viewer immediately recognizes the size of such artefacts, warning her to be overly confident in the results. In fact, these artefacts are much larger than the statistical uncertainty due to the stochastic nature of the epidemic which is reflected in the confidence intervals.”
Nevertheless, here the calculation is based on the 7-day rolling mean and therefore the figure smooths over the weekly rhythm.
China and South Korea slowed exponential growth significantly at an early stage. Their lines on the log10 scale no longer show a significant slope.
In early phases, countries show more or less unchecked exponential growth. If countermeasures are effective, the reduced exponential growth is reflected in a reduced slope of the cumulative cases.
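To make the idea concrete, the following is a minimal sketch of a renewal-equation style estimate, \(R(t) = I(t) / \sum_s w(s)\, I(t-s)\). It is not the function provided by (Thomas Hotz 2020b); the helper name `estimate_r`, the infectivity profile `w` and the input series are illustrative assumptions, and no reporting delay or confidence interval is modelled.

```r
# Hypothetical sketch: R(t) = I(t) / sum_s w[s] * I(t - s)
estimate_r <- function(incidence, w) {
  k <- length(w)
  n <- length(incidence)
  stopifnot(isTRUE(all.equal(sum(w), 1)), n > k)
  r <- rep(NA_real_, n)
  for (t in (k + 1):n) {
    # weighted incidence of the previous k days (most recent day first)
    infectious <- sum(w * incidence[t - seq_len(k)])
    r[t] <- incidence[t] / infectious
  }
  r
}

w        <- c(0, 0, 1, 2, 2, 1) / 6   # illustrative profile, an assumption
constant <- rep(100, 20)              # made-up constant daily incidence
r_const  <- estimate_r(constant, w)
```

For a constant daily incidence the estimate equals 1 from day 7 onwards, matching the interpretation of a steady state. The input could equally be the 7-day rolling mean of the daily cases.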
In early phases countries have a more or less unchecked exponential growth resulting in a significant curve slope for the daily cases.
Successful countermeasures are reflected in a reduced exponential growth, the slope of the curve decreases.
In the steady state, the slope disappears and if the daily cases decrease, the slope becomes negative.
The plot shows the forecast increase of daily cases under unchecked exponential growth. The dark shaded regions show the 80% and 95% prediction intervals. These prediction intervals display the uncertainty in forecasts based on the linear regression of the logarithmic data over the past 14 days.
The charts compare the forecasts of an exponential and a linear growth model. Due to the large fluctuations of the daily cases, a regression over three weeks is required; otherwise the prediction intervals become much too wide.
The dark shaded regions indicate the \(80\%\) and \(95\%\) prediction intervals. These prediction intervals display the “pure” statistical uncertainty in forecasts based on the regression models.
For doubling periods on the order of the period of infectivity (RKI assumption: \(\sim9-10\) days, with great uncertainty, see (RKI 2020b)), we no longer have exponential growth. By the end of the doubling period, the “old” infected cases are no longer infectious (active). This results in a constant infection rate with effective reproduction number \(R_t \sim 1\) or even \(<1\).
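A forecast of this kind can be sketched in base R with `lm()` and `predict()`. The data below are simulated and the variable names are assumptions; the dashboard's own implementation may differ.

```r
set.seed(1)
day   <- 1:14
# Made-up daily cases: exponential trend with multiplicative noise
cases <- 1000 * exp(0.05 * day) * exp(rnorm(14, sd = 0.05))

# Linear regression of the logarithmic data over the past 14 days
fit <- lm(log(cases) ~ day)

# Forecast the next 14 days; back-transform from the log scale with exp()
new_days <- data.frame(day = 15:28)
pi80 <- exp(predict(fit, new_days, interval = "prediction", level = 0.80))
pi95 <- exp(predict(fit, new_days, interval = "prediction", level = 0.95))
```

Each result is a matrix with the columns `fit`, `lwr` and `upr`. The 95% interval encloses the 80% interval, and both widen with the forecast horizon.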
Note: for case numbers of German federal states see (RKI 2020a).
Doubling Time and Forecast
The forecast cases for the next 14 days are calculated ‘only’ from the linear regression of the logarithmic data and do not consider any effects of measures in place. In addition, data inaccuracies are not taken into account, which is especially relevant for the confirmed cases.
Therefore the 14-day forecast is only an indication of the direction of unchecked exponential growth.
The doubling time of the cumulative cases is only a good indicator at the beginning of the pandemic. If the number of confirmed cases from the past two weeks is already small compared to the total number of confirmed cases, the number of infectious people is also small compared to the total cases.
Therefore the table below provides the doubling time for the daily rolling mean cases. The forecast is based on the linear regression of the logarithmic data of the past 14 days.
Country | Case_Type | T_doubling | Reg_last_day | FC_7days | FC_14days |
---|---|---|---|---|---|
Austria | Confirmed | -212.7 | 5’391 | 5’270 | 5’151 |
EU | Confirmed | -33.7 | 31’408 | 27’192 | 23’542 |
France | Confirmed | 158.0 | 3’916 | 4’038 | 4’164 |
Germany | Confirmed | -10.3 | 8’415 | 5’255 | 3’282 |
India | Confirmed | 10.7 | 344 | 541 | 850 |
Italy | Confirmed | -64.9 | 3’733 | 3’465 | 3’215 |
Japan | Confirmed | -23.8 | 9’654 | 7’877 | 6’426 |
Spain | Confirmed | -162.3 | 1’005 | 976 | 947 |
United Kingdom | Confirmed | -125.4 | 3’792 | 3’648 | 3’510 |
United States of America | Confirmed | -151.3 | 33’959 | 32’887 | 31’849 |
World | Confirmed | -47.5 | 124’412 | 112’332 | 101’425 |
Austria | Deaths | 62.0 | 8 | 9 | 9 |
EU | Deaths | 96.4 | 244 | 256 | 269 |
France | Deaths | -33.3 | 19 | 17 | 14 |
Germany | Deaths | 58.3 | 87 | 95 | 103 |
India | Deaths | -10.8 | 1 | 0 | 0 |
Italy | Deaths | -94.4 | 32 | 31 | 29 |
Japan | Deaths | -20.5 | 60 | 48 | 38 |
Spain | Deaths | -9.5 | 12 | 7 | 4 |
United States of America | Deaths | 32.2 | 366 | 425 | 495 |
World | Deaths | -74.6 | 865 | 811 | 760 |
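The relation between the columns can be sketched as follows, assuming that the forecast columns continue the fitted exponential from Reg_last_day. The helper names are illustrative and not taken from pkgTS.

```r
# Hypothetical helpers (names are illustrative assumptions).
# Doubling time from a fit of log(cases) ~ day: T = log(2) / slope
doubling_time <- function(fit) log(2) / unname(coef(fit)[2])

# Value h days ahead if growth continues with doubling time t_doubling
forecast_from_doubling <- function(last_value, t_doubling, h) {
  last_value * 2^(h / t_doubling)
}

# Synthetic series that doubles every 5 days
day   <- 1:14
cases <- 100 * 2^(day / 5)
t_dbl <- doubling_time(lm(log(cases) ~ day))

# Germany, Confirmed row of the table: T_doubling = -10.3, Reg_last_day = 8415
fc7  <- forecast_from_doubling(8415, -10.3, 7)
fc14 <- forecast_from_doubling(8415, -10.3, 14)
```

Under this assumption `fc7` and `fc14` come out within a few cases of the tabulated 5’255 and 3’282; the small gap is consistent with T_doubling being rounded to one decimal in the table. A negative doubling time denotes a halving time.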
The forecast accuracy is checked by applying the forecast method to the three weeks before the past week (training data). Forecasting the past week then enables comparison with the real data of these days (test data).
The comparison is also an early indicator of whether the exponential growth is declining. However, possible changes in underreporting (in particular the proportion confirmed / actually infected) require careful interpretation.
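Such a check can be sketched as a simple train/test split. The data are simulated, and the error measure (mean absolute percentage error) is an assumption chosen for illustration, not necessarily the one used for the dashboard.

```r
set.seed(42)
day   <- 1:28
# Made-up daily cases: exponential trend with multiplicative noise
cases <- 500 * exp(0.04 * day) * exp(rnorm(28, sd = 0.03))

train <- data.frame(day = day[1:21],  cases = cases[1:21])   # three weeks
test  <- data.frame(day = day[22:28], cases = cases[22:28])  # past week

# Fit on the training weeks, then forecast the held-out past week
fit  <- lm(log(cases) ~ day, data = train)
pred <- exp(predict(fit, newdata = test))

# Mean absolute percentage error on the held-out week
mape <- mean(abs(pred - test$cases) / test$cases) * 100
```

If the real values of the past week fall systematically below the forecast, this hints at declining exponential growth.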
For doubling periods of the total cumulative cases on the order of the period of infectivity (RKI assumption: \(\sim9-10\) days, with great uncertainty, see (RKI 2020b)), we have no exponential growth of the total cumulative cases, since the “old” infected cases are no longer infectious after these periods and we then have a constant infection rate with effective reproduction number \(R_t \sim 1\).
Instead, we have “only” linear growth of the cumulative Confirmed Cases and the Daily Confirmed Cases remain more or less constant if \(R_t \sim 1\).
However, the basic reproduction number \(R_0 (\approx 3.3 - 3.8)\) (RKI 2020d) is a product of the average number of contacts of an infectious person per day, the probability of transmission upon contact and the average number of days infected people are infectious. Given the current uncertainty about the average duration of infectivity, \(R_0\) can therefore be estimated from the doubling time only to a very limited extent. See also (CMMID 2020).
The number of confirmed cases is an early predictor of the number of deaths. Today’s deaths are already determined by the infections of \(\sim19\) days ago, or equivalently by the cases confirmed \(\sim12\) days ago; see Bar Chart - CFR.
The country-specific case fatality rate (CFR, proportion of deaths from confirmed cases) and changes over time can be an indicator of different
Overall, a rough conclusion on the country-specific underreporting rate (lack of diagnostic confirmation; ratio of all infected to confirmed cases) is feasible if the infection fatality rate (IFR, based on confirmed cases plus all asymptomatic and undiagnosed infections) is assumed to be country-independent and the IFR is known (lower end of existing estimates \(\sim0.56\%\), assumption by RKI, see (RKI 2020b)).
In this case an estimated CFR of \(0.06\) \((6\%)\) indicates underreporting by a factor of \(\sim10\). A CFR of \(0.20\) \((20\%)\) indicates underreporting by a factor of \(\sim30\). This corresponds to the RKI assumption of underreporting by a factor of \(11-20\) (RKI 2020c). Unfortunately, the IFR or lethality is still far too imprecise for concrete conclusions.
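The arithmetic behind these factors is the ratio CFR / IFR. A minimal sketch, assuming the lower-end IFR estimate of 0.56% (the function name is hypothetical):

```r
# Hypothetical helper: underreporting factor = CFR / IFR
# (assumption: country-independent IFR of 0.56 %)
underreporting_factor <- function(cfr, ifr = 0.0056) {
  cfr / ifr
}

factor_cfr_06 <- underreporting_factor(0.06)  # CFR of 6 %
factor_cfr_20 <- underreporting_factor(0.20)  # CFR of 20 %
```

With this IFR the ratios come out near 11 and 36, the same order of magnitude as the rounded factors quoted above; a slightly higher assumed IFR moves them closer to the quoted values.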
In the model paper RKI assumes for the
Depending on the country-specific test frequency (late or early tests), the
*lag_days - time from receipt of the confirmed test result to death, Confirmed to Death, is about \(11-13\) days.
Note: these methods are also used for example for advertising campaigns. The campaign impact on sales will be some time beyond the end of the campaign, and sales in one month will depend on the advertising expenditure in each of the past few months (see Hyndman and Athanasopoulos 2020).
Forecast residuals indicate quality of Arima model fit:
Data Source
Data files are provided by Johns Hopkins University on GitHub
https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series
The data are visualized on the Johns Hopkins University Dashboard
https://coronavirus.jhu.edu/map.html
Code Source
GitHub repository link for the following files:
Corona_Virus_TS_Dashboard.Rmd - R Markdown file for dashboard creation
Page_world_map.Rmd
Page_bar_chart.Rmd
Page_cumulative_and_daily_trend.Rmd
Page_exp_linear_growth.Rmd
Corona_raw_data.R - provides function to read and process the time series raw data of the Johns Hopkins University
world_population_un.RDS - R object file providing UN world population data
References_Corona.bib - Bibtex file providing the references with the Bibtexkeys
https://github.com/WoVollmer/R-TimesSeriesAnalysis/tree/master/Corona-Virus
GitHub link for the repository of pkgTS, providing functions for the Time Series analysis:
https://github.com/WoVollmer/pkgTS
R installation by calling
install.packages("devtools")
devtools::install_github("WoVollmer/pkgTS")
The required R files are
ggts_corona.R - providing the functions to create the plots
uts_corona.R - providing utility functions