
Worldwide analysis and prediction of COVID-19

Mohamed Douksieh

Zhejiang Gongshang University


Mathematics and statistics

Supervisor: Gu Wentao
2020/2021

Mohamed Douksieh Thesis proposal 2020-2021 / 18


Content

I. Introduction
1. Research question
2. Objective

II. Literature review


III. Methodology
1. Data source
2. Description of data
3. Analysis method
a. Exploratory and comparative data analysis
b. SIR model
c. Modeling COVID-19 with the Prophet model
IV. Conclusion

References
I. Introduction

Coronaviruses are a large family of viruses that can cause severe illness in
humans. The first known severe epidemic, Severe Acute Respiratory Syndrome
(SARS), occurred in 2003, whereas the second outbreak of severe illness began
in 2012 in Saudi Arabia with Middle East Respiratory Syndrome (MERS).

On March 11, 2020, as the number of COVID-19 cases outside China had increased
thirteenfold, with more than 118,000 cases in 114 countries and over 4,000
deaths, the WHO declared the outbreak a pandemic.

Now that the COVID-19 outbreak has become a worldwide pandemic, real-time
analyses of epidemiological data are needed to prepare society with better
action plans against the disease.


As of April 1, 2020, according to the live data shared globally through the
Johns Hopkins dashboard, there were 932,605 confirmed cases worldwide, of which
193,177 had recovered and 46,809 had died.

In such grave circumstances, it is very important to predict COVID-19 cases,
deaths and recoveries through predictive modelling.

In the last few months, studies have been carried out to understand the spread
of the disease. For example, the Susceptible-Exposed-Infectious-Removed (SEIR)
model has been used to model the outbreak in the city of Wuhan, China.



1. Research question

Which predictive models can we use to forecast the COVID-19 pandemic?



2. Objective

• The aim of this study is to model data from the ongoing COVID-19 pandemic
in order to make predictions. It is also of interest to compare the
evolution of COVID-19 across several countries and to predict the end of
the pandemic. There are therefore two specific objectives:

- Compare the evolution of COVID-19 in different countries to determine
their trends.

- Develop a predictive model, which is our main purpose.

II. Literature review

• Albertine Weber et al. presented a trend analysis of the COVID-19 pandemic in
China using the widely accepted SIR model.
• Lucia Russo et al. presented a method to estimate the first day of infection
and to forecast COVID-19 in Italy.
• Benvenuto et al. proposed an autoregressive integrated moving average
(ARIMA) model to predict the spread of COVID-2019.
• Deb et al. [6] proposed a time series method to analyze the incidence pattern
and the estimated reproduction number of the COVID-19 outbreak.



III. Methodology

1. Data source

We used the COVID-19 open dataset provided by Johns Hopkins University, which
built an exceptional dashboard from the case data reported to date. In
addition, they give analysts and researchers access to the underlying data in
Google Sheets format.

Daily information about those infected can therefore yield interesting
insights when it is made available to the broader data science community.
This dataset contains daily counts of confirmed cases, deaths and recoveries
for COVID-19.

The data are chronological, and the number of cases reported for a given day
is the cumulative number. Data are available from January 22, 2020.
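
Because the series is cumulative, daily new cases have to be recovered by differencing consecutive days. The following is a minimal sketch of that step, assuming the counts are already loaded into a plain Python list; the function name and the sample values are hypothetical and are not taken from the dataset.

```python
def daily_from_cumulative(cumulative):
    """Convert a cumulative series into per-day increments.

    The first increment equals the first cumulative value, since nothing
    is known about the days before the series starts.
    """
    daily = []
    previous = 0
    for total in cumulative:
        daily.append(total - previous)
        previous = total
    return daily


# Hypothetical cumulative confirmed cases over five consecutive days.
cumulative_confirmed = [10, 15, 15, 27, 40]
daily_confirmed = daily_from_cumulative(cumulative_confirmed)
print(daily_confirmed)
```

A day with no change in the cumulative count yields an increment of zero, which is worth checking for when the source is updated irregularly.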



2. Description of Data

Modeling pandemic data helps us understand the magnitude, progression and
impacts of the virus, and contributes to decision-making.
The dataset most used today by health authorities, researchers, data
scientists and journalists is the one maintained by the Center for Systems
Science and Engineering (CSSE) at Johns Hopkins University. At the end of
January 2020, the CSSE published an interactive dashboard based on several
sources, in particular the Chinese community platform DXY.

This dashboard makes it possible to follow confirmed, recovered and fatal
COVID-19 cases worldwide in near real time.

In addition, I prepared a table that gives the analyst a complete description
of each column of the dataset used.

[Table: description of the dataset columns]


3. Analysis method

a. Exploratory and comparative data analysis

• To gain a clear understanding of the data, the first method will be a
comparative and exploratory analysis to determine the trend in different
countries.

• In this part we examine the evolution and the numbers of deaths,
recoveries and confirmed cases of COVID-19. We also visualize and compare
the recovery rate and mortality rate across countries.
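
The rates mentioned above can be computed directly from the three cumulative counts. The sketch below assumes that both rates are defined relative to confirmed cases (recovered / confirmed and deaths / confirmed), which is one common convention and not necessarily the definition used in the final analysis. The function name is hypothetical; the input figures are the worldwide totals for April 1, 2020 quoted in the introduction.

```python
def outcome_rates(confirmed, recovered, deaths):
    """Return (recovery_rate, mortality_rate) as fractions of confirmed cases."""
    if confirmed <= 0:
        raise ValueError("confirmed must be positive")
    return recovered / confirmed, deaths / confirmed


# Worldwide totals as of April 1, 2020 (from the introduction).
recovery_rate, mortality_rate = outcome_rates(932_605, 193_177, 46_809)
print(f"recovery rate: {recovery_rate:.1%}")
print(f"mortality rate: {mortality_rate:.1%}")
```

Applying the same function to each country's totals gives values that can be compared side by side, keeping in mind that both rates depend on how widely each country tests.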



b. SIR model
Since our main purpose is to develop a predictive model, we use SIR, a
simple epidemiological model in which each member of the population
belongs to one of the following compartments:

- Susceptible (S): the individual has not contracted the disease but can
be infected through transmission from infected people.

- Infected (I): the individual has contracted the disease.

- Removed (R): the individual has either recovered from the infection and
is immune to reinfection, or has died.


The rate of transfer from the susceptible population to the infected
population is βSI, where β is the per-capita effective contact rate (Ce/N).
The effective contact rate (Ce) is the number of effective contacts made by
a given individual per unit time, where an effective contact is defined as a
contact sufficient to lead to infection if it were to occur between a
susceptible and an infectious individual.

The rate at which Infected individuals move into the Removed population
is I/r, where r is the recovery delay. The recovery delay represents the
length of time an individual remains infectious. The independent variable
of the model is the time t, and the rates of change of the compartments
are expressed as a set of differential equations:

\[ \frac{dS}{dt} = -\beta S I, \qquad \frac{dI}{dt} = \beta S I - \frac{I}{r}, \qquad \frac{dR}{dt} = \frac{I}{r} \]


The basic reproduction number (R0) indicates the transmissibility of a virus
within a particular population. It represents the average number of new
infections generated by one infected person in an entirely susceptible
population. In this scheme R0 is given by:

\[ R_0 = \beta N r = C_e \, r \]

where N is the total population:

\[ N = S + I + R \]

This simple model predicts behavior similar to that observed in real-world
epidemics.
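
To illustrate how the SIR dynamics play out, here is a minimal sketch that integrates the three compartments with a forward-Euler scheme, using the transfer rates βSI and I/r described earlier. The integration scheme, the function name and all parameter values (population size, effective contact rate, recovery delay) are assumptions chosen for illustration and are not fitted to COVID-19 data. R0 is computed as Ce times the recovery delay, which follows from β = Ce/N.

```python
def simulate_sir(population, initial_infected, contact_rate,
                 recovery_delay, days, dt=0.1):
    """Integrate the SIR equations with forward Euler.

    Returns one (S, I, R) tuple per day, starting with day 0.
    """
    beta = contact_rate / population      # per-capita rate, beta = Ce / N
    susceptible = float(population - initial_infected)
    infected = float(initial_infected)
    removed = 0.0
    steps_per_day = round(1 / dt)
    history = [(susceptible, infected, removed)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * susceptible * infected * dt
            new_removals = infected / recovery_delay * dt
            susceptible -= new_infections
            infected += new_infections - new_removals
            removed += new_removals
        history.append((susceptible, infected, removed))
    return history


# Hypothetical parameters, for illustration only.
N = 1_000_000          # total population
Ce = 0.4               # effective contacts per person per day
delay = 10.0           # recovery delay in days
R0 = Ce * delay        # equivalently beta * N * delay

history = simulate_sir(N, initial_infected=10, contact_rate=Ce,
                       recovery_delay=delay, days=200)
peak_day = max(range(len(history)), key=lambda d: history[d][1])
print(f"R0 = {R0:.1f}, infections peak on day {peak_day}")
```

Each Euler step moves the same quantity out of one compartment and into the next, so S + I + R stays equal to N throughout the run. A smaller `dt` or a dedicated ODE solver would give a more accurate trajectory.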



c. Modeling with the Prophet model

The Prophet model can capture daily, weekly and yearly seasonality, along
with holiday effects, by fitting an additive regression model.

The equation behind the Prophet model is:

\[ y(t) = g(t) + s(t) + h(t) + e(t) \]

• g(t) represents the trend. Prophet uses a piecewise linear model
for trend forecasting.
• s(t) represents periodic changes (weekly, monthly, yearly).
• h(t) represents the effects of holidays (recall that holidays impact
businesses).
• e(t) is the error term.
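
The additive structure can be made concrete with a small hand-written example. The sketch below is not the Prophet library or its API: it only evaluates the noise-free part of an additive model, using a piecewise linear trend with a single changepoint, one sine term with a 7-day period standing in for seasonality, and a fixed shift on listed holidays. All function names and parameter values are hypothetical.

```python
import math


def trend(t, offset=100.0, rate_before=2.0, changepoint=30, rate_after=0.5):
    """Piecewise linear trend g(t): the slope changes at one changepoint."""
    if t <= changepoint:
        return offset + rate_before * t
    return offset + rate_before * changepoint + rate_after * (t - changepoint)


def weekly_seasonality(t, amplitude=5.0):
    """Periodic component s(t), here a single sine wave with a 7-day period."""
    return amplitude * math.sin(2 * math.pi * t / 7)


def holiday_effect(t, holidays=frozenset({10, 45}), effect=-8.0):
    """Holiday component h(t): a fixed shift on listed days, zero otherwise."""
    return effect if t in holidays else 0.0


def additive_forecast(t):
    """Sum of the components; the error term e(t) is left out."""
    return trend(t) + weekly_seasonality(t) + holiday_effect(t)


for day in (0, 10, 30, 40):
    print(day, round(additive_forecast(day), 2))
```

Because the components are simply added, each one can be inspected or plotted separately, which is what makes this family of models easy to interpret.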


IV. Conclusion

Because of the COVID-19 pandemic, all countries are looking for mitigation
plans to control the spread with the help of modeling techniques.

Predictive analysis can play an important role in epidemic analysis and
forecasting.

This proposal presents an analysis of several predictive methods available
in the literature.



References

Q. Lin, S. Zhao, D. Gao, Y. Lou, S. Yang, S. S. Musa, M. H. Wang, Y. Cai,
W. Wang, L. Yang, D. He. A conceptual model for the coronavirus disease 2019
(COVID-19) outbreak in Wuhan, China with individual reaction and governmental
action. International Journal of Infectious Diseases, vol. 93, pp. 211-216,
2020.

Weber, A., Ianelli, F., & Goncalves, S. (2020). Trend analysis of the COVID-19
pandemic in China and the rest of the world. arXiv preprint arXiv:2003.09032.

Russo, L., Anastassopoulou, C., Tsakris, A., Bifulco, G. N., Campana, E. F.,
Toraldo, G., & Siettos, C. (2020). Tracing DAY-ZERO and Forecasting the Fade
out of the COVID-19 Outbreak in Lombardy, Italy: A Compartmental Modelling and
Numerical Optimization Approach. medRxiv.

[5] Benvenuto, Domenico, Marta Giovanetti, Lazzaro Vassallo, Silvia Angeletti,
and Massimo Ciccozzi. "Application of the ARIMA model on the COVID-2019
epidemic dataset." Data in brief (2020): 105340.



Thank you
