
Using Statistics and Stochastic Simulation to Control and Forecast Diseases

By
XXXX XXXX
xxxxxxxx

Major: Industrial and Systems Engineering

For
xxxxxxxxx
xxxxxxxxxxxxxxxxxxxx
xxxxxxxx

Abstract

This report discusses the most common challenges facing statistical and stochastic
modelling in forecasting and controlling disease spread, and the models that
can be used to improve forecasting accuracy.

6th July 2023

TABLE OF CONTENTS

INTRODUCTION

I. COMMON CHALLENGES IN DISEASE FORECASTING

A. LACK OF ACCURATE AND TIMELY DATA
B. UNCERTAINTY AND VARIABILITY
C. MODEL ASSUMPTIONS AND LIMITATIONS

II. STATISTICAL AND STOCHASTIC APPROACHES NECESSARY FOR ACCURATE FORECASTING OF DISEASE DATA

A. PREDICTIVE MONITORING
B. GOMPERTZ MODEL
C. EPIDEMIOLOGY MODELING
D. TIME SERIES MODELLING
E. SEIR MODELING

III. METHODS TO IMPROVE STOCHASTIC MODELS

A. IMPROVED CROSS-MODEL CROSS-MODE (ICMCM)
B. STOCHASTIC HYBRID PERTURBATION-GALERKIN METHOD

CONCLUSION
RECOMMENDATIONS
IV. REFERENCES

LIST OF ILLUSTRATIONS

FIGURES

FIGURE 1
FIGURE 2

Using Statistics and Stochastic Simulation to Control and Forecast Diseases

INTRODUCTION

Over the past decades, several infectious diseases have been discovered, posing a

significant threat to human populations. Many infectious diseases, particularly those that cause

epidemics, are highly sensitive to short-term and long-term changes in the environment. For

this reason, statistics and stochastic simulation are increasingly needed to control and

forecast diseases, since they allow researchers and healthcare organizations to model the

spread of diseases and evaluate the effectiveness of various control measures. For example,

during the COVID-19 pandemic, which emerged in late 2019, simulation models helped

predict the future trends of the pandemic and explore the effectiveness of measures

such as wearing masks, social distancing, and vaccination. With simulations, researchers

have identified potential hotspots for disease transmission and formulated targeted

interventions that help prevent further spread. Simulation has also been used to

help public health officials make informed decisions about the control and prevention of

diseases. In addition, forecasting supports hospital preparedness during

epidemics by predicting hospitalizations and ICU bed demand, which allows effective

allocation of resources, reduces the burden on the healthcare system, and improves

patient outcomes.

This report focuses only on the challenges experienced in the use of statistics and

stochastic simulation in controlling and forecasting diseases and on the simulation approaches

that can be employed to improve the accuracy of forecast models. The report does not

provide step-by-step procedures for implementing these approaches; instead, it provides a

general understanding of how they can be employed.

I. COMMON CHALLENGES IN DISEASE FORECASTING

a. Lack of accurate and timely data

A major challenge when forecasting disease data is the lack of timely and accurate data.

Disease forecasting relies on accurate data that are acquired in a timely manner

(Yadav & Akhter, 2021). However, in many cases, data on disease incidence are

incomplete or delayed, which makes it difficult to develop accurate

models. For instance, delays in reporting cases or deaths lead to inaccurate data because there may be a

lag between the time a case or death occurs and the time it is reported to public health

authorities. This delay reduces the accuracy of disease forecasting models, and inaccurate

forecasts can in turn lead to detrimental health outcomes. Delayed or incomplete data also limit

the usefulness of real-time decisions and planning, since incomplete information may lead to

biased decisions. This matters because real-time forecasting with time series models is essential

for creating a statistically validated conjecture in a health crisis (Bhattacharyya et al., 2022).

b. Uncertainty and variability

Typically, disease dynamics are complex and can be affected by various factors, such

as social, environmental, and behavioral factors. As a result, there can be uncertainty and

variability in disease forecasts, making it challenging to predict the future course of an

outbreak. On many occasions, uncertainty and variability stem from the unpredictability of

disease control measures, such as the severity and length of social distancing measures,

which can shift the peak date by months and, in other instances, create several peaks

(Bertozzi et al., 2020). Human behavior also has a significant impact on disease transmission,

and forecasting models ought to take its effects into account in order to produce accurate

forecasts. However, human behavior is typically uncertain and variable, and it can be challenging

to predict, particularly how individuals behave in response to a particular epidemic or outbreak.

c. Model Assumptions and Limitations

Each disease forecasting model has its own set of assumptions and limitations that

ought to be considered when interpreting its results. This holds for every statistical

modeling technique for forecasting infectious diseases, including time series modeling,

distribution fitting, and epidemiological modeling. For

instance, distribution fitting assumes that the data follow a particular distribution,

yet this assumption is not always true, and, in some instances, it may be challenging to ascertain

which distribution best fits the data. As such, the assumptions of a forecasting model are not only a

methodological and bookkeeping issue; they can also create a serious bottleneck that

limits simulation approaches. Tolles and Luong (2020) give the example of the SIR model, which

assumes homogeneous mixing of the population, meaning that all individuals in the

population are assumed to have an equal probability of coming into contact with one another. This

does not reflect human social structures, where the majority of contact occurs within limited

networks.
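
The homogeneous-mixing assumption can be seen in the standard textbook form of the SIR equations, given here as an illustrative sketch rather than as the formulation used by Tolles and Luong (2020). The symbols are assumptions introduced for this example: S, I, and R are the numbers of susceptible, infectious, and recovered individuals, N = S + I + R is the population size, \beta is the transmission rate, and \gamma is the recovery rate.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{S I}{N}, \\
\frac{dI}{dt} &= \beta \frac{S I}{N} - \gamma I, \\
\frac{dR}{dt} &= \gamma I.
\end{aligned}
```

The single term \beta S I / N treats every susceptible individual as equally likely to meet every infectious individual, which is the assumption that breaks down when most contact occurs within limited networks.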

II. STATISTICAL AND STOCHASTIC APPROACHES NECESSARY FOR

ACCURATE FORECASTING OF DISEASE DATA

a. Predictive monitoring

Monitoring approaches can be applied to time series models to allow

more precise estimation of their parameters. These approaches comprise monitoring the residuals or errors

of the time series models and modifying the parameters accordingly. Several

predictive monitoring approaches can be used to improve model accuracy, such as Kalman filtering,

Bayesian filtering, and particle filtering. Kalman filtering is

used to estimate the state of a system on the basis of noisy measurements. It is often employed

in control systems and signal processing. Bayesian filtering, on the other hand, is an approach for

estimating the probability distribution of a system's state on the basis of noisy

measurements. It is commonly employed in machine learning and artificial intelligence, and in

disease forecasting it can estimate the probability distribution of the number of infections on the

basis of the available epidemiological data.
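
As an illustration of how Kalman filtering estimates a state from noisy measurements, the following is a minimal sketch of a one-dimensional filter. It assumes a simple random-walk (local level) model for the underlying case level; the function name, the variance values, and the synthetic data are hypothetical and are not taken from the cited sources.

```python
import numpy as np

def kalman_local_level(observations, process_var=1.0, obs_var=4.0,
                       init_mean=0.0, init_var=1e6):
    """Filter a noisy series with a 1-D random-walk (local level) Kalman filter."""
    mean, var = init_mean, init_var
    filtered = []
    for y in observations:
        # Predict: the level follows a random walk, so only its uncertainty grows.
        var = var + process_var
        # Update: blend the prediction and the observation using the Kalman gain.
        gain = var / (var + obs_var)
        mean = mean + gain * (y - mean)
        var = (1.0 - gain) * var
        filtered.append(mean)
    return np.array(filtered)

# Hypothetical example: a constant true level of 100 daily cases observed with noise.
rng = np.random.default_rng(0)
true_level = np.full(50, 100.0)
noisy_counts = true_level + rng.normal(0.0, 2.0, size=50)
smoothed = kalman_local_level(noisy_counts, process_var=0.01, obs_var=4.0)
```

The residual `y - mean` inside the loop is the quantity that residual monitoring tracks: a large residual pulls the estimate toward the new observation in proportion to the gain.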

b. Gompertz Model

Considering that adequate and timely data are necessary for forecasting disease spread, the

Gompertz model can mitigate some of the challenges associated with data acquisition. This is

because the Gompertz model can be used to model the patient arrival process in a hospital

resource planning tool, especially during an epidemic. The model is a sigmoidal growth

model that is employed in epidemiology and biology to model the growth of diseases and

populations. It is useful in capturing the exponential growth of diseases during the initial stages

of an epidemic, followed by the period of maximum growth and then the declining phase as the

outbreak subsides (Vicuña et al., 2021). It is reported to be a better fit for pandemic-related data, such as

hospitalization numbers and positive cases, and to have a superior prediction capacity compared to

other sigmoid models.

Figure 1

Representation of COVID-19 patient flows in the health system in Gompertz Model.

Note. From Vicuña et al. (2021).
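
The sigmoidal shape of the Gompertz curve can be sketched in a few lines. The parameterization below, K * exp(-b * exp(-c * t)), is one common textbook form and is an assumption here; it is not necessarily the form used by Vicuña et al. (2021), and the parameter values are hypothetical.

```python
import math

def gompertz(t, capacity, displacement, rate):
    """Cumulative count at time t under a Gompertz growth curve."""
    return capacity * math.exp(-displacement * math.exp(-rate * t))

def daily_new_cases(days, capacity, displacement, rate):
    """Day-to-day increments of the cumulative Gompertz curve."""
    cumulative = [gompertz(t, capacity, displacement, rate) for t in range(days + 1)]
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]

# Hypothetical parameters: final size K, displacement b, and growth rate c.
K, b, c = 10_000.0, 8.0, 0.1
peak_day = math.log(b) / c            # inflection point, where growth is fastest
new_cases = daily_new_cases(120, K, b, c)
```

In this form, daily new cases rise quickly, peak near `peak_day`, and then decline more slowly, which corresponds to the three phases described above.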

c. Epidemiology Modeling

Epidemiology modeling is a statistical approach to studying the spread and control of

infectious diseases in a population. It involves incorporating

biological parameters, such as mode of transmission, infectious period, latent period, infectious

agent, resistance, and susceptibility, together with socio-cultural, demographic, and geographic

factors, into the models to allow forecasting of disease spread. The approach forecasts the

spread of diseases and evaluates the effectiveness of control measures such as quarantine,

vaccination, and social distancing. The biological factors are integrated into the model because

infectious diseases are caused by biological agents that interact with the host and the

environment in various ways. Including them allows a more accurate representation of the disease

dynamics, which helps predict disease spread by identifying high-risk

populations and evaluating the effectiveness of control mechanisms.

d. Time Series Modelling

When an infectious disease develops over time and data are available on a single variable, such as

the number of infections that occurred, time series models are fitted, and predictions are

made based on the best-fitting model. According to Yadav and Akhter (2021), time series models

are primarily used to gather past information and then analyze and predict the spread of

infectious diseases over time. They can also be used to estimate components of the series, such as

seasonality, trend, or autocorrelation, which are essential in making better policies. There

are several types of time series models that can be used in forecasting disease spread. One

common time series model is the autoregressive (AR) model, in which a variable is

regressed on its own prior, or lagged, values. Another is the moving average (MA) model, which

relates each observation to the residual errors of

lagged observations. The Autoregressive Integrated Moving Average (ARIMA) model

combines the autoregressive and moving average models with differencing to

make the time series stationary. Lastly, the Seasonal ARIMA (SARIMA) model is used for

time series data that exhibit seasonal patterns, and it forecasts

disease spread through the Box-Jenkins approach, which involves identification, estimation, and

diagnostic checking.
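
To make the idea of regressing a variable on its lagged values concrete, the following is a minimal sketch of a first-order autoregressive model fitted by ordinary least squares. The function names and the weekly case counts are hypothetical; a full Box-Jenkins analysis would normally rely on a dedicated time series library rather than this hand-rolled fit.

```python
import numpy as np

def fit_ar1(series):
    """Estimate y[t] = intercept + phi * y[t-1] by ordinary least squares."""
    y = np.asarray(series, dtype=float)
    # Design matrix: a column of ones (intercept) and the lagged values.
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    (intercept, phi), *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    return intercept, phi

def forecast_ar1(last_value, intercept, phi, steps):
    """Iterate the fitted recursion forward from the last observed value."""
    preds = []
    value = last_value
    for _ in range(steps):
        value = intercept + phi * value
        preds.append(value)
    return preds

# Hypothetical weekly case counts, used only to show the calls.
weekly_cases = [120, 135, 150, 149, 162, 170, 168, 181, 190, 187]
intercept, phi = fit_ar1(weekly_cases)
next_three_weeks = forecast_ar1(weekly_cases[-1], intercept, phi, steps=3)
```

Differencing the series before fitting, as ARIMA does, would amount to applying the same fit to `np.diff(weekly_cases)` when the raw series is not stationary.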

e. SEIR modeling

The SEIR (Susceptible-Exposed-Infected-Recovered) model is a compartmental modeling

technique used in modeling the spread of disease, and it can be formulated deterministically or

stochastically. The SEIR model divides the population into

four categories: susceptible, exposed, infected, and recovered. Through differential equations,

the model describes the flow of individuals between these categories over time. According to

Khan (2021), individuals in the susceptible category are those that can catch the

infection, and after contact with the disease they may become exposed. Those in the exposed

category already have the infection but are asymptomatic. The infected

category contains individuals that show signs of infection and can transmit the infectious

disease or virus. Lastly, those in the recovered category were previously infected

but cannot transmit the virus as they are already immune to it.

Figure 2

SEIR model

Note. From Khan (2021).
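
The flow between the four categories can be sketched as a simple deterministic simulation that steps the differential equations forward in time with Euler's method. This is an illustrative sketch, not the implementation used by Khan (2021); the rate names `beta` (transmission), `sigma` (progression from exposed to infected), and `gamma` (recovery), and all numeric values, are assumptions.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, r0, days, dt=0.1):
    """Integrate the SEIR equations with Euler steps; return one state per day."""
    n = s0 + e0 + i0 + r0
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    history = [(s, e, i, r)]
    steps_per_day = int(round(1.0 / dt))
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / n * dt    # susceptible -> exposed
            new_infected = sigma * e * dt          # exposed -> infected
            new_recovered = gamma * i * dt         # infected -> recovered
            s -= new_exposed
            e += new_exposed - new_infected
            i += new_infected - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history

# Hypothetical scenario: 1,000 people, 10 initially infected, 100 days.
history = simulate_seir(beta=0.5, sigma=0.2, gamma=0.1,
                        s0=990, e0=0, i0=10, r0=0, days=100)
peak_infected = max(state[2] for state in history)
```

A stochastic variant would replace each deterministic flow with a random draw, for example a binomial number of transitions per step, while keeping the same compartment structure.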

III. METHODS TO IMPROVE STOCHASTIC MODELS

a. Improved Cross-Model Cross-Mode (ICMCM)

The Improved Cross-Model Cross-Mode technique is used in updating structural models

on the basis of limited measurement data and uncertain measurement errors. The approach takes

into consideration the uncertain measured modal data in creating a new stochastic model

updating equation with the use of an updated coefficient vector. In some instances, the hybrid

perturbation-Galerkin method can be combined with the ICMCM technique to solve the updating

equation and achieve more accurate updating outcomes, particularly when

rank deficiency is taken into consideration (Chen et al., 2021).

b. Stochastic hybrid perturbation-Galerkin Method

The stochastic hybrid perturbation-Galerkin (HPG) approach is a numerical technique

that improves a stochastic model by solving its updating equation. The approach

combines the perturbation approach and the Galerkin method to obtain a random updated

coefficient vector. It solves the stochastic updating

equation while taking into consideration uncertain measured modal data and limited measurement

data. Typically, this technique is computationally efficient and can handle large

uncertainty in measurement data. In some instances, the method can be integrated with the ICMCM

method in updating structural models, but the primary use of the HPG method is to improve

stochastic models (Chen et al., 2021).

CONCLUSION

In conclusion, this report provided insights into the significance of statistics and

stochastic simulation in controlling and forecasting diseases. The approaches described in the

report have proven valuable in understanding the spread of diseases, predicting their future

trends, and evaluating the effectiveness of control mechanisms. Although challenges exist, such

as the need for accurate data and the complexity of simulation models, implementing the

measures recommended below can improve the utility of simulation models.

RECOMMENDATIONS

To overcome some of the challenges experienced during the simulation and modeling of

diseases, certain measures can be implemented to improve their utility. The following

recommendations can address these challenges:

• Establish robust data collection systems, such as automated systems for collecting

real-time data on disease incidence, occurrence, and transmission, to improve data timeliness and

accuracy.

• Incorporate uncertainty analysis by using probabilistic distributions to enhance

model parameter estimation.

• Validate and calibrate models with real-world data to evaluate the

accuracy of models and improve their performance.

• Integrate multiple models or forecasting approaches to create a more comprehensive

and accurate forecast of disease spread or dynamics.

Word Count: 1994

IV. References

Abo-Elreesh, I. M. A. E. (2021). Analysis of chronic diseases progression using stochastic

modeling. arXiv preprint arXiv:2111.06892.

Bertozzi, A. L., Franco, E., Mohler, G., Short, M. B., & Sledge, D. (2020). The challenges of

modeling and forecasting the spread of COVID-19. Proceedings of the National

Academy of Sciences, 117(29), 16732-16738. https://doi.org/10.1073/pnas.2006520117

Bhattacharyya, A., Chakraborty, T., & Rai, S. N. (2022). Stochastic forecasting of COVID-19

daily new cases across countries with a novel hybrid time series model. Nonlinear

Dynamics, 1-16.

Chen, H., Huang, B., Tee, K. F., & Lu, B. (2021). A New Stochastic Model Updating Method

Based on Improved Cross-Model Cross-Mode Technique. Sensors, 21(9), 3290.

https://doi.org/10.3390/s21093290

Khan, S. (2021). Visual Data Analysis and Simulation Prediction for COVID-19 in Saudi Arabia

Using SEIR Prediction Model. International Journal of Online & Biomedical

Takele, R. (2020). Stochastic modelling for predicting COVID-19 prevalence in East Africa

Countries. Infectious Disease Modelling, 5, 598-607.


Tolles, J., & Luong, T. (2020). Modeling epidemics with compartmental models. JAMA, 323(24),

2515-2516. https://doi.org/10.1001/jama.2020.8420

Vicuña, D. G., Esparza, L., & Mallor, F. (2021). Hospital preparedness during epidemics

using Simulation: the case of COVID-19. Central European Journal of Operations

Research, 30.

Yadav, S. K., & Akhter, Y. (2021). Statistical Modeling for the Prediction of Infectious Disease

Dissemination with Special Reference to COVID-19 Spread. Frontiers in Public Health,

9, 645405. https://doi.org/10.3389/fpubh.2021.645405

(Yadav & Akhter, 2021, p.5)

Report p.4 III a (paraphrased)

(Khan, 2021, p.158)

Report p.8 III e (paraphrased)

(Vicuña et al., 2021, p.).

Report p.6 III b (paraphrased)

(Bertozzi et al., 2020, p. 16737)

Report p.4 II b (paraphrased)

(Tolles & Luong, 2020, p.2515)

Report p.5 II c (paraphrased)

(Bhattacharyya et al., 2022 p.3026)

Report p.4 II a (paraphrased)

(Chen et al., 2021 p.16)

Report p.11 III a (paraphrased)

