You are on page 1of 22

Using EpiCurve

jp.decorps@epiconcept.fr
2019-09-09

Epiconcept is made up of a team of doctors, epidemiologists, data scientists and digital specialists. For more
than 20 years, Epiconcept has contributed to the improvement of public health by writing software, carrying
out epidemiological studies, research, evaluation and training to better detect, monitor and prevent disease
and to improve treatment.
Epiconcept provides software, services and studies in the following areas:
• Software for managing public health programs,
• Secure cloud solutions for health data collection, reporting and processing,
• Research projects on vaccine preventable diseases, including measuring the effectiveness and impact of
vaccines,
• Services in the field of epidemiology (protocols, analysis, training, etc.),
• Expertise in data analysis,
• Coaching and assistance to professionals in public health,
• Training (in software use and epidemiology: short and longer introductory modules, advanced courses,
training through long-term practice).
To achieve such goals Epiconcept :
• Recognized research organization,
• Certified datacenter for hosting personal health data,
• Training organisation.
Epiconcept relies on :
• Its expertise in epidemiology
• Its IT expertise,
• Ethical values rooted in practice (responsibility and quality of services, data security and confidentiality,
scientific independence, etc.),
• Capabilities to answer and anticipate tomorrow’s challenges (Research - evaluation, e-health, Big Data,
IoT, etc.),
• A desire to build long-term relationships with its clients and partners.
Its current customers and partners include some of the greatest names in the world such as: Santé Publique
France (and many public health organizations around the world), WHO, ECDC, AFD, MSF, World Bank,
etc.

1
Package EpiCurve

Description

EpiCurve allows the user to create epidemic curves from case-based and aggregated data.

Details

The EpiCurve function creates a graph of number of cases by time of illness (for example date of onset).
Each case is represented by a square. EpiCurve allows the time unit for the x-axis to have hourly, daily,
weekly or monthly intervals. The hourly interval can be split into 1, 2, 3, 4, 6, 8 or 12 hour time units.
EpiCurve works on both case-based (one case per line) or aggregated data (where there is a count of cases
for each date). With aggregated data, you need to specify the variable for the count of cases in the “freq”
parameter.
With case-based (non-aggregated data), the date format for EpiCurve can be:
• hourly: YYYY-MM-DD HH:MM or YYYY-mm-DD HH:MM:SS
• daily: YYYY-MM-DD
• monthly: YYYY-MM
If the date format is daily or hourly, you can change and force the period for aggregation on the graph with
the parameter “period” setted with “day”, “week” or “month”.
For aggregated data, the date formats can be as above, but they can also be weekly: YYYY-Wnn. Here, we
need to specify how the data are aggregated in the parameter “period”. If we want to further aggregate the
aggregated data for the epidemic curve (e.g. move from daily aggregated cases to weekly aggregated cases),
we can specify the parameter “to.period”.
When the date format is hourly, the dataset is considered case-based, whether the “freq” parameter of the
EpiCurve function is supplied or not.

The EpiCurve function

EpiCurve (
x,
date = NULL,
freq = NULL,
cutvar = NULL,
period = NULL,
to.period = NULL,
split = 1,
cutorder = NULL,
colors = NULL,
title = NULL,
xlabel = NULL,
ylabel=NULL,
note=NULL

2
Arguments

Parameter Description
x data.frame with at least one column with dates
date character, name of date column
freq character, name of a column with a value to display
cutvar character, name of a column with factors
period character, c(“hour”, “day”,“week”, “month”)
to.period character, Convert date period to another period only for aggregated data. If period is
“day”, to.period can be “week” or “month”. If period is “week”, to.period can be
“month”.
split integer, c(1,2,3,4,6,8,12) value for hourly split
cutorder character vector of factors
colors character, vector of colors
title character, title of the plot
xlabel character, label for x axis
ylabel character, label for y axis
note character, add a note under the graph

Depends

ggplot2, dplyr, ISOweek, scales, timeDate

3
Plot non-aggregated cases

Daily - non-aggregated cases

DF <- read.csv("daily_unaggregated_cases.csv", stringsAsFactors=FALSE)


kable(head(DF, 12))

UTS V1 V2
2016-10-26 7.20 188
2016-11-02 7.03 95
2016-11-03 5.14 160
2016-11-05 9.89 165
2016-11-05 9.69 109
2016-11-05 4.15 154
2016-11-05 4.97 144
2016-11-06 8.97 187
2016-11-06 4.45 120
2016-11-06 6.60 116
2016-11-07 7.68 141
2016-11-07 10.08 126

EpiCurve(DF,
date = "UTS",
period = "day",
color ="#9900ef",
xlabel=sprintf("From %s to %s", min(DF$UTS), max(DF$UTS)))

4
0
2
4
6
8
10
12
2016−10−26
2016−10−27
2016−10−28
2016−10−29
2016−10−30
2016−10−31
2016−11−01
2016−11−02
2016−11−03
2016−11−04
2016−11−05
2016−11−06
2016−11−07
2016−11−08
2016−11−09
2016−11−10
2016−11−11
2016−11−12
2016−11−13
2016−11−14
2016−11−15
2016−11−16
2016−11−17
2016−11−18
2016−11−19
2016−11−20
2016−11−21
2016−11−22
2016−11−23

5
2016−11−24
2016−11−25
2016−11−26
2016−11−27
2016−11−28
2016−11−29
2016−11−30
2016−12−01
2016−12−02
2016−12−03
From 2016−10−26 to 2016−12−21

2016−12−04
2016−12−05
2016−12−06
2016−12−07
2016−12−08
2016−12−09
2016−12−10
2016−12−11
2016−12−12
2016−12−13
2016−12−14
2016−12−15
2016−12−16
2016−12−17
2016−12−18
2016−12−19
2016−12−20
2016−12−21
Number of cases

0
2
4
6
8
10
12
14
16
18
12 00:00−00:59
12 01:00−01:59
12 02:00−02:59

EpiCurve(DF,
12 03:00−03:59
12 04:00−04:59
12 05:00−05:59
12 06:00−06:59
12 07:00−07:59

split = 1,
kable(head(DF, 12))

12 08:00−08:59
12 09:00−09:59

date = "UTS",
12 10:00−10:59
12 11:00−11:59

period = "hour",
12 12:00−12:59

color ="#339933",
12 13:00−13:59
12 14:00−14:59
12 15:00−15:59
12 16:00−16:59
12 17:00−17:59

ylabel="Number of cases",
Hourly - non-aggregated cases

12 18:00−18:59
12 19:00−19:59
UTS

12 20:00−20:59
12 21:00−21:59
12 22:00−22:59

2017-04-12
2017-04-12
2017-04-12
2017-04-12
2017-04-12
2017-04-12
2017-04-12
2017-04-12
2017-04-12
2017-04-12
2017-04-12
2017-04-12

12 23:00−23:59
13 00:00−00:59
13 01:00−01:59

6
13 02:00−02:59
19:41
19:37
19:24
19:11
19:05
18:43
18:38
18:36
18:06
17:38
17:35
16:31

13 03:00−03:59
13 04:00−04:59
13 05:00−05:59
13 06:00−06:59
6.18
7.80
6.36
4.61
6.39
8.03
7.02
10.92
4.95
6.81
8.69
5.17
X1

13 07:00−07:59
13 08:00−08:59
13 09:00−09:59
13 10:00−10:59
123
112
188
126
102
175
185
189
120
140
101
166
X2

13 11:00−11:59
13 12:00−12:59
13 13:00−13:59

From 2017−04−12 16:31 to 2017−04−13 03:23


13 14:00−14:59
13 15:00−15:59
13 16:00−16:59
13 17:00−17:59
xlabel=sprintf("From %s to %s", min(DF$UTS), max(DF$UTS)))

13 18:00−18:59
13 19:00−19:59
DF <- read.csv("hourly_unaggregated_cases.csv", stringsAsFactors=FALSE)

13 20:00−20:59
13 21:00−21:59
13 22:00−22:59
13 23:00−23:59
Hourly - non-aggregated cases with factors

DF <- read.csv("hourly_unaggregated_cases_factors.csv", stringsAsFactors=FALSE)


kable(head(DF, 12))

UTS X1 X2 Confirmed
2017-04-12 16:31 5.17 166 YES
2017-04-12 17:35 8.69 101 YES
2017-04-12 17:38 6.81 140 NO
2017-04-12 18:06 4.95 120 NO
2017-04-12 18:36 10.92 189 NO
2017-04-12 18:38 7.02 185 YES
2017-04-12 18:43 8.03 175 NO
2017-04-12 19:05 6.39 102 NO
2017-04-12 19:11 4.61 126 NO
2017-04-12 19:24 6.36 188 YES
2017-04-12 19:37 7.80 112 NO
2017-04-12 19:41 6.18 123 NO

EpiCurve(DF,
date = "UTS",
period = "hour",
split = 1,
cutvar = "Confirmed",
color = c("#339933","#eebb00"),
xlabel=sprintf("From %s to %s", min(DF$UTS), max(DF$UTS)))

18
16
14
12 Confirmed
10 YES
8
NO
6
4
2
0
12 00:00−00:59
12 01:00−01:59
12 02:00−02:59
12 03:00−03:59
12 04:00−04:59
12 05:00−05:59
12 06:00−06:59
12 07:00−07:59
12 08:00−08:59
12 09:00−09:59
12 10:00−10:59
12 11:00−11:59
12 12:00−12:59
12 13:00−13:59
12 14:00−14:59
12 15:00−15:59
12 16:00−16:59
12 17:00−17:59
12 18:00−18:59
12 19:00−19:59
12 20:00−20:59
12 21:00−21:59
12 22:00−22:59
12 23:00−23:59
13 00:00−00:59
13 01:00−01:59
13 02:00−02:59
13 03:00−03:59
13 04:00−04:59
13 05:00−05:59
13 06:00−06:59
13 07:00−07:59
13 08:00−08:59
13 09:00−09:59
13 10:00−10:59
13 11:00−11:59
13 12:00−12:59
13 13:00−13:59
13 14:00−14:59
13 15:00−15:59
13 16:00−16:59
13 17:00−17:59
13 18:00−18:59
13 19:00−19:59
13 20:00−20:59
13 21:00−21:59
13 22:00−22:59
13 23:00−23:59

From 2017−04−12 16:31 to 2017−04−13 03:23

7
Plot aggregated data

Daily

Without factors

date value
2016-03-01 5
2016-03-03 5
2016-03-05 5
2016-03-06 1
2016-03-07 3
2016-03-14 4
2016-03-15 1
2016-03-16 10
2016-03-27 3
2016-03-28 2
2016-03-30 4
2016-03-31 2
2016-04-01 3
2016-04-03 1
2016-04-04 6
2016-04-07 6
2016-04-08 9
2016-04-09 15
2016-04-13 2
2016-04-14 1
2016-04-15 3
2016-04-16 6
2016-04-17 4
2016-04-18 17
2016-04-27 13
2016-04-29 3
2016-04-30 6

8
Number of cases

0
2
4
6
8
10
12
14
16
2016−03−01
2016−03−02
2016−03−03
EpiCurve(DF,

2016−03−04
2016−03−05
2016−03−06
2016−03−07
2016−03−08
2016−03−09
2016−03−10
2016−03−11
date = "date",

2016−03−12

Epidemic Curve
period = "day",
freq = "value",

2016−03−13
2016−03−14
2016−03−15
2016−03−16
2016−03−17
2016−03−18
2016−03−19
title = "Epidemic Curve",
ylabel="Number of cases",

2016−03−20
2016−03−21
2016−03−22
2016−03−23
note = "Daily epidemic curve")

2016−03−24
2016−03−25
2016−03−26
2016−03−27
2016−03−28

9
2016−03−29
2016−03−30
2016−03−31
2016−04−01
2016−04−02
2016−04−03
2016−04−04
2016−04−05
2016−04−06

Daily epidemic curve


2016−04−07
2016−04−08
2016−04−09
2016−04−10

From 2016−03−01 to 2016−04−30


2016−04−11
2016−04−12
2016−04−13
2016−04−14
xlabel=sprintf("From %s to %s", min(DF$date), max(DF$date)),

2016−04−15
2016−04−16
2016−04−17
2016−04−18
2016−04−19
2016−04−20
2016−04−21
2016−04−22
2016−04−23
2016−04−24
2016−04−25
2016−04−26
2016−04−27
2016−04−28
2016−04−29
2016−04−30
With factors

date value factor


2016-03-01 3 Validated Case
2016-03-01 2 Unvalidated Case
2016-03-03 4 Validated Case
2016-03-03 1 Unvalidated Case
2016-03-05 5 Validated Case
2016-03-06 1 Unvalidated Case
2016-03-07 2 Validated Case
2016-03-07 1 Unvalidated Case
2016-03-14 4 Validated Case
2016-03-15 1 Validated Case
2016-03-16 7 Validated Case
2016-03-16 3 Unvalidated Case
2016-03-27 3 Unvalidated Case
2016-03-28 1 Validated Case
2016-03-28 1 Unvalidated Case
2016-03-30 4 Validated Case
2016-03-31 2 Validated Case
2016-04-01 3 Validated Case
2016-04-03 1 Validated Case
2016-04-04 2 Validated Case
2016-04-04 4 Unvalidated Case
2016-04-07 6 Unvalidated Case
2016-04-08 9 Validated Case
2016-04-09 7 Validated Case
2016-04-09 8 Unvalidated Case
2016-04-13 2 Validated Case
2016-04-14 1 Validated Case
2016-04-15 3 Validated Case
2016-04-16 6 Validated Case
2016-04-17 4 Validated Case
2016-04-18 8 Validated Case
2016-04-18 9 Unvalidated Case
2016-04-27 11 Validated Case
2016-04-27 2 Unvalidated Case
2016-04-29 3 Validated Case
2016-04-30 6 Validated Case

10
Number of cases

0
2
4
6
8
10
12
14
16
2016−03−01
2016−03−02
2016−03−03
2016−03−04
EpiCurve(DF,

2016−03−05
2016−03−06
2016−03−07
2016−03−08
2016−03−09
2016−03−10
2016−03−11
2016−03−12
2016−03−13
2016−03−14
2016−03−15
date = "date",

2016−03−16

Epidemic Curve
period = "day",
freq = "value",

2016−03−17
2016−03−18
2016−03−19
cutvar = "factor",

2016−03−20
2016−03−21
2016−03−22
2016−03−23
2016−03−24
2016−03−25
2016−03−26
title = "Epidemic Curve",
ylabel="Number of cases",

2016−03−27
2016−03−28
2016−03−29
2016−03−30
2016−03−31
2016−04−01
note = "Daily epidemic curve")

2016−04−02
2016−04−03
2016−04−04
2016−04−05
2016−04−06
2016−04−07

11
2016−04−08

Daily epidemic curve


2016−04−09
2016−04−10
2016−04−11
2016−04−12
2016−04−13
2016−04−14

From 2016−03−01 to 2016−04−30


2016−04−15
2016−04−16
2016−04−17
2016−04−18
2016−04−19
2016−04−20
2016−04−21
2016−04−22
2016−04−23
2016−04−24
2016−04−25
2016−04−26
2016−04−27
2016−04−28
2016−04−29
2016−04−30
xlabel=sprintf("From %s to %s", min(DF$date), max(DF$date)),

factor
Validated Case
Unvalidated Case
Weekly

Without factors

date value
2016-W10 3
2016-W11 4
2016-W12 1
2016-W13 5
2016-W14 2
2016-W15 7
2016-W16 1
2016-W17 4
2016-W18 3

EpiCurve(DF,
date = "date",
freq = "value",
period = "week",
color=c("#990000"),
ylabel="Number of cases",
xlabel=sprintf("Du %s au %s", min(DF$date), max(DF$date)),
title = "Epidemic Curve\n")

Epidemic Curve
7

5
Number of cases

0
2016−W10

2016−W11

2016−W12

2016−W13

2016−W14

2016−W15

2016−W16

2016−W17

2016−W18

Du 2016−W10 au 2016−W18

12
With factors

date value factor


2016-W10 3 Valid
2016-W10 2 Invalid
2016-W11 4 Valid
2016-W12 1 Invalid
2016-W13 5 Valid
2016-W14 2 Valid
2016-W14 1 Invalid
2016-W15 7 Valid
2016-W15 3 Invalid
2016-W17 1 Valid
2016-W17 1 Invalid
2016-W18 4 Valid
2016-W18 2 Invalid
2016-W20 3 Valid

EpiCurve(DF,
date = "date",
freq = "value",
period = "week",
cutvar = "factor",
color=c("Blue", "Red"),
ylabel="Cases",
xlabel=sprintf("From %s to %s", min(DF$date), max(DF$date)),
title = "Epidemic Curve\n")

Epidemic Curve
10

6 factor
Cases

Valid
4 Invalid

0
2016−W10
2016−W11
2016−W12
2016−W13
2016−W14
2016−W15
2016−W16
2016−W17
2016−W18
2016−W19
2016−W20

From 2016−W10 to 2016−W20

13
Monthly

Without factors

date value
2016-02 3
2016-03 4
2016-04 1
2016-05 5
2016-07 2
2016-08 7
2016-10 1
2016-11 4
2016-12 3

EpiCurve(DF,
date = "date",
freq = "value",
period = "month",
ylabel="Number of cases",
xlabel=sprintf("From %s to %s", min(DF$date), max(DF$date)),
title = "Epidemic Curve\n")

Epidemic Curve
7

5
Number of cases

0
2016−02

2016−03

2016−04

2016−05

2016−06

2016−07

2016−08

2016−09

2016−10

2016−11

2016−12

From 2016−02 to 2016−12

14
With factors

date value factor


2016-02 3 Valid
2016-02 2 Invalid
2016-03 4 Valid
2016-04 1 Invalid
2016-05 5 Valid
2016-06 2 Valid
2016-06 1 Invalid
2016-07 7 Valid
2016-07 3 Invalid
2016-09 1 Valid
2016-09 1 Invalid
2016-11 4 Valid
2016-11 2 Invalid
2016-12 3 Valid

EpiCurve(DF,
date = "date",
freq = "value",
cutvar = "factor",
period = "month",
ylabel="Number of cases",
xlabel=sprintf("From %s to %s", min(DF$date), max(DF$date)),
title = "Epidemic Curve\n")

Epidemic Curve
10

8
Number of cases

6 factor
Valid
4 Invalid

0
2016−02
2016−03
2016−04
2016−05
2016−06
2016−07
2016−08
2016−09
2016−10
2016−11
2016−12

From 2016−02 to 2016−12

15
Converted period (aggragated cases)

“day” to “week”

date value
2016-03-01 5
2016-03-03 5
2016-03-05 5
2016-03-06 1
2016-03-07 3
2016-03-14 4
2016-03-15 1
2016-03-16 10
2016-03-27 3
2016-03-28 2
2016-03-30 4
2016-03-31 2
2016-04-01 3
2016-04-03 1
2016-04-04 6
2016-04-07 6
2016-04-08 9
2016-04-09 15
2016-04-13 2
2016-04-14 1
2016-04-15 3
2016-04-16 6
2016-04-17 4
2016-04-18 17
2016-04-27 13
2016-04-29 3
2016-04-30 6

EpiCurve(DF,
date = "date",
freq = "value",
period = "day",
to.period = "week",
ylabel="Number of cases",
xlabel=sprintf("From %s to %s", min(DF$date), max(DF$date)),
title = "Epidemic Curve",
note = "Daily epidemic curve")

16
Number of cases

0
5
10
15
20
25
30
35

2016−W09
2016−W10
2016−W11

17
2016−W12
2016−W13
2016−W14
2016−W15
Epidemic Curve

2016−W16

Daily epidemic curve


2016−W17

From 2016−03−01 to 2016−04−30


“day” to “month”

EpiCurve(DF,
date = "date",
freq = "value",
period = "day",
to.period = "month",
ylabel="Number of cases",
xlabel=sprintf("From %s o %s", min(DF$date), max(DF$date)),
title = "Epidemic Curve",
note = "Daily epidemic curve")

18
Epidemic Curve
95

90

85

80

75

70

65

60

Number of cases
55

50

45

40

35

30

25

20

15

10

0
2016−03
2016−04

From 2016−03−01 o 2016−04−30

Daily epidemic curve

19
“week” to “month”

date value
2016-W10 3
2016-W11 4
2016-W12 1
2016-W13 5
2016-W14 2
2016-W15 7
2016-W16 1
2016-W17 4
2016-W18 3

EpiCurve(DF,
date = "date",
freq = "value",
period = "week",
to.period = "month",
color=c("#990000"),
ylabel="Number of cases",
xlabel=sprintf("Du %s au %s", min(DF$date), max(DF$date)),
title = "Epidemic Curve\n")

Epidemic Curve
14

12

10
Number of cases

0
2016−03
2016−04
2016−05

Du 2016−W10 au 2016−W18

20
“week” to “month” with factors

date value factor


2016-W10 3 Valid
2016-W10 2 Invalid
2016-W11 4 Valid
2016-W12 1 Invalid
2016-W13 5 Valid
2016-W14 2 Valid
2016-W14 1 Invalid
2016-W15 7 Valid
2016-W15 3 Invalid
2016-W17 1 Valid
2016-W17 1 Invalid
2016-W18 4 Valid
2016-W18 2 Invalid
2016-W20 3 Valid

EpiCurve(DF,
date = "date",
freq = "value",
period = "week",
to.period = "month",
cutvar = "factor",
color=c("Blue", "Red"),
ylabel="Cases",
xlabel=sprintf("From %s to %s", min(DF$date), max(DF$date)),
title = "Epidemic Curve\n")

21
Epidemic Curve

14

12

10
factor
Cases

8
Valid
6 Invalid

0
2016−03
2016−04
2016−05

From 2016−W10 to 2016−W20

22

You might also like