U.S. COVID-19 Infection and Mortality Dashboard

Caroline E. White

University of San Diego, Hahn School of Nursing and Health Science and

the Betty and Bob Beyster Institute for Nursing Research, Advanced Practice, and Simulation

Health Care Informatics Master’s Capstone

Barbara Berkovich, PhD, MA

May 20, 2022

Abstract Word Count: 250

Paper Word Count: 3812

Approved by:

_________________________

Barbara Berkovich, PhD, MA

Capstone Advisor

_________________________

Date

Abstract

Introduction. During the COVID-19 pandemic, numerous data sources were created to track

disparate aspects of the pandemic. New insights can be discovered by combining multiple data

sources into a longitudinal, multidimensional dashboard. Because of the immense impact of

COVID-19 on the U.S. population, it is crucial to study the factors that influence COVID-19

infections and mortality.

Methods. Three COVID-19 datasets were analyzed using Tableau, and a dashboard was created

to show the geographic impact of the pandemic from January 2020 to March 2022, along with

factors influencing COVID-19 infections and mortality.

Results. Six visualizations were created for the dashboard, including an animated heat map

showing COVID-19 infections over time, annotated line graphs on cases and deaths over time,

stacked side-by-side bar graphs on cases and deaths over time by age group and vaccination

status, and a stacked bar graph on comorbid COVID-19 deaths by health condition and age.

Discussion. The animated heat map shows the spread of COVID-19 over time, showing how

geographic location and time affected the number of U.S. infections. The annotated line graphs

and stacked side-by-side bar graphs (on cases and deaths by age group and vaccination status)

show that while COVID-19 vaccination is not a perfect public health tool, mass vaccination is

associated with a significant reduction in COVID-19 mortality. The dashboard also demonstrates

that while age is a risk factor for COVID-19 mortality, young unvaccinated people and

individuals with chronic diseases are also at risk.

Keywords: COVID-19, infections, mortality, vaccination, comorbidity, Tableau, data

visualization

U.S. COVID-19 Infection and Mortality Dashboard

Background

As of May 18, 2022, there have been over 82 million COVID-19 cases and over 997,000

COVID-19 deaths in the United States (Centers for Disease Control and Prevention [CDC],

2022b). One in 330 Americans has now died from COVID-19, and this number is likely an

undercount (CDC, 2022a; Smith-Schoenwalder, 2022). Despite the massive death toll, only

66.5% of Americans – and 70.7% of those five and older – have been fully vaccinated (CDC,

2022b).

Because of the colossal impact of COVID-19 on the U.S. population, it is essential to

study the factors influencing the number of COVID-19 cases and deaths. For example, factors

associated with an uptick in cases could be evaluated to determine how to avoid future

infections. In addition, factors influencing COVID-19 mortality must be identified to ascertain

who is at risk and what can be done to minimize the threat.

Statement of Problem

There is an abundance of dashboards that cover COVID-19 infections and mortality.

However, few existing dashboards use animation to show dynamically how COVID-19 has spread

geographically over time. Additionally, existing graphics that depict COVID-19

cases or deaths over time are not usually overlaid with supplemental information on the factors

that influence the trajectory of the pandemic, such as vaccine rollout events, lockdown orders,

and the introduction of new variants to the U.S. The featured COVID-19 dashboard includes this

supporting information as context so the viewer can visualize the relationships between

infection, mortality, vaccination, geographic location, comorbidity, and age.



Project Purpose

This project aims to deliver a quality visualization tool for analyzing the spread of SARS-

CoV-2 and COVID-19 mortality over time.

PICOT Question

Population. The population of the COVID-19 dashboard is United States residents.

Intervention. The intervention to be examined is COVID-19 immunization.

Comparison. The comparison group is non-vaccinated individuals.

Outcome. The outcomes of interest are the COVID-19 infection rate and mortality rate in

the United States.

Time. The data span from late January 2020 to early March 2022.

Literature Review

A literature search was conducted in PubMed using the following search terms: (COVID-

19) AND (vaccin*) AND (factor*) AND (influenc* OR impact* OR affect*) AND (case* OR

infection* OR death* OR mortality) AND (“United States” OR “U.S.” OR US). The initial

search results consisted of 46 articles. Articles were limited to free full-text reviews and multi-

site studies from 2022. Studies not involving COVID-19 in humans were excluded. Excluded

studies also consisted of those taking place outside of the U.S. and those that focused only on

specific populations such as prisoners, patients with specific health conditions, pregnant females,

and individuals in specific occupations. Some of the excluded article topics were COVID-19

microbiology, comparisons of vaccine manufacturers, acute COVID-19 complications, long-term

post-acute symptoms, allergic reactions to vaccines, and vaccine immune responses. Ten articles

were included in the literature review. Each of the selected articles discussed factors that

influence the COVID-19 infection rate or mortality, including vaccine efficacy, demographic

factors, politics and vaccine hesitancy, and public health factors such as lockdowns and social

distancing.

Two of the selected articles discuss vaccine efficacy, specifically breakthrough

infections. The first, Kugeler et al. (2022), sought to estimate the number of symptomatic

breakthrough infections in the U.S. from January 2021 to July 2021. Using a tool in Microsoft

Excel, the researchers found that approximately 199,000 symptomatic breakthrough infections

had occurred during that time (Kugeler et al., 2022). The second article, by Lipsitch et al. (2022),

reviews the approaches being used to measure vaccine efficacy and breakthrough rates, and it

describes the causes of breakthrough infections. Lipsitch et al. (2022) explain, “Whether a

breakthrough infection occurs when a vaccinated host is exposed to an infectious person depends

on whether the immune response present in that person at the moment of exposure is sufficient to

abort or rapidly control the infection” (p. 59). Because immune responses peak and then wane

over time, the protection provided by vaccines also decreases over time, and breakthrough

infections become more likely to occur. In addition, according to Lipsitch et al. (2022), older

adults are at greater risk of breakthrough infections because their antibody response is lower.

Two articles from the search discussed demographic factors that influence COVID-19

infection and mortality. Scott et al. (2022) described the impact of COVID-19 on an

economically disadvantaged Latinx community in Southern California. The researchers found

that “Low-income Latinos in Southern California were generally hesitant to get a COVID-19

vaccine. Culturally sensitive vaccine promotion campaigns need to address the concerns of

minority populations who experience increased morbidity and mortality from COVID-19” (Scott

et al., 2022, p. 1). Like Scott et al. (2022), Upchurch et al. (2022) also discuss racial and ethnic

differences in COVID-19 infection rates. However, Upchurch et al. (2022) also examine the

intersection of gender with infection. Upchurch et al. (2022) found roughly equal numbers of

men and women tested positive for COVID-19 in their sample. They also found that Native

American, Black, and Hispanic women were more likely to become infected than non-Hispanic

White women (Upchurch et al., 2022). This trend was also found in men. Women over 40 and

men over 55 were less likely to become infected with COVID-19 (Upchurch et al., 2022). For

men only, being employed increased the likelihood of infection, and men with one or more

chronic illnesses were less likely to get infected (Upchurch et al., 2022).

Racial, ethnic, and economic demographic factors coincide with another factor

influencing COVID-19 infection and mortality: vaccine hesitancy. Rather than studying the

effect of race on infection, Lin et al. (2022) examined racial disparities in survey participants’

trust in vaccination and intent to get vaccinated. Lin et al. (2022) found:

White support for the vaccines began at 65% before the public realized the enormity of

the outbreak, peaked at 74% in April 2020 after the near-nationwide lockdown, dipped

into the low-50s in September and October, and then gradually reclimbed to where it

started. Conversely, minority responses were more erratic. Blacks started at 58%,

dropped more than 20% in two months with persistent fluctuations, skidded to its lowest

at 27% by late October, and rebounded to 47% the following February. Hispanics began

similarly to Whites at 67%, slipped to its lowest point in October (the same time as

Blacks) at 41%, and finished at 60% (p. 6).

As a possible explanation for the difference in hesitancy among races, Lin et al. (2022) cite the

Tuskegee Syphilis Study and Nazi concentration camp experiments as historical reasons for

minority vaccine hesitancy.



Four other studies from the literature review also examine vaccine hesitancy. Akpoji et

al. (2022) argue that pharmacists, the most trusted type of healthcare professional, may be

well positioned to persuade people to get vaccinated. Additionally, Albrecht (2022) reviews the impact

of political ideology on vaccination rates. Albrecht (2022) states, “In counties with a high

percentage of Republican voters, vaccination rates were significantly lower and COVID-19 cases

and deaths per 100,000 residents were much higher” (p. 1). He also says that to restore

confidence in science and medicine, people must overcome political divisiveness. Similar to the

article by Albrecht (2022), Roberts et al. (2022) examined demographic, political, and behavioral

factors and their association with vaccine hesitancy. Roberts et al. (2022) state:

We found that younger age, non-White race, lower income, less education, more

conservative and less liberal social attitudes, and less adherence to COVID-19 safety

behaviors and lower approval of government restrictions were common correlates of anti-

vax attitudes in general and COVID-19 vaccine hesitancy specifically (pp. 12-13).

Perhaps younger people are less likely to perceive COVID-19 as a threat, and less educated

individuals with certain political ideologies are more receptive to misinformation and conspiracy

theories. Relatedly, the review by Farhart et al. (2022) discusses how the spread of

misinformation leads to vaccine hesitancy in Americans. They found that “anti-intellectualism,

conspiratorial predispositions, and COVID-19 conspiracy theory belief are the strongest and

most consistent predictors of COVID-19 vaccine hesitancy” (Farhart et al., 2022, p. 136).

However, contrary to other studies, Farhart et al. (2022) also found that political ideology and

partisanship were not consistent predictors of hesitancy after accounting for conspiracy theory

beliefs and anti-intellectualism.



The final article included in the literature search concerned social distancing. Lau et al.

(2022) studied Georgia residents’ adherence to social distancing measures before and after the state’s

statewide lockdown. Using a machine learning model, Lau et al. (2022) discovered:

… Overall population-level transmissibility was reduced to 41.2% … of the pre-

lockdown level in about a week of the announcement of the shelter-in-place order.

Although it subsequently increased after the lockdown was lifted, it only bounced back to

62% … of the pre-lockdown level after about a month (p. 1).

This suggests that while statewide lockdowns are not flawless, they can substantially slow

transmission.

In addition to the PubMed literature search, publicly available datasets on government

websites such as the CDC and the National Institutes of Health (NIH) were also reviewed. The

objective was to create a dashboard using national data, so state websites and the World Health

Organization (WHO) were excluded from the research. Unfortunately, the CDC and NIH did not

have a publicly available dataset with cases and deaths by county for every day of the pandemic,

so the search was expanded to academic and news organizations reporting on the pandemic. The

New York Times COVID-19 data repository on GitHub was identified as a valuable data source

for the dashboard. Two additional data sources were later selected from the CDC in line with the

dashboard’s theme of infections and mortality: one on comorbidities and one on cases and deaths

by vaccination status and age.

A review of existing COVID-19 visualizations uncovered many engaging CDC dashboards

related to the capstone project’s infections and mortality theme. For

example, one relevant CDC dashboard is “Trends in COVID-19 Cases and Deaths in the United

States, by County-level Population Factors” (CDC, 2022c). This dashboard shows U.S. COVID-

19 cases or deaths per 100,000 people by metropolitan status (Metropolitan, Non-Metropolitan,

or All Counties). In both the cases graph and the deaths graph, rates start much higher in

metropolitan counties; around August 2020, the two categories swap (CDC, 2022c).

Another pertinent CDC dashboard is “County Vaccination Coverage and Other Outcomes”

(CDC, 2022d). The primary graphic is a heat map with a two-color gradient grid as its legend.

The heat map shows, for each county, the percent of the population that is fully vaccinated and

the cases per 100,000 people (CDC, 2022d). Although slightly unintuitive, the heat map is

fascinating and highly informative.

Methods

This project utilizes three publicly available COVID-19 datasets. All three are static

extracts from continuously updating live data sources.

New York Times’ U.S. Counties Dataset

The first data source is an ongoing New York Times COVID-19 data repository on

GitHub (The New York Times, 2020b). This repository’s U.S. Counties dataset (The New York Times, 2020a) contains the

cumulative total of COVID-19 cases and deaths by state and county for every day of the

pandemic. For this dataset, “cases” is the sum of confirmed and probable cases, and “deaths” is

the sum of confirmed and probable deaths. Confirmed cases and deaths are those verified with a

positive specimen test, and probable cases and deaths are those diagnosed by a healthcare

provider without a specimen test result. Retrieved March 8, 2022, the extracted dataset spans

from January 21, 2020, to March 7, 2022.

During the initial Tableau data analysis, a problem was encountered with the data. It

became apparent that the data reflected cumulative cases and deaths. To be helpful for the

dashboard, the data needed to be transformed into new daily cases and deaths. Unfortunately, a

subsequent literature search was unsuccessful in finding another publicly available dataset

containing the number of cases and deaths for each U.S. county during each day of the

pandemic. With over 2 million rows, the New York Times dataset exceeded Excel’s limit of 1,048,576

rows per worksheet. Even if Excel could handle the file size, there would be no quick, simple way to transform

the cumulative totals into new cases and deaths using Excel. Because of this, Python was used to

transform the data. The Python code used for this project was based on the work of David West

(West, 2021a; West, 2021b).

Using Google Colaboratory, a hosted Jupyter notebook environment that integrates with Google Drive,

the New York Times U.S. Counties .csv file was retrieved. Then, part of West’s code was used

to cleanse and transform the data (West, 2021b). All data from the U.S. territories were excluded,

and two new columns were added to the dataset, representing new cases and new deaths. The

number of new cases was found for each county by subtracting the previous day’s total cases from

a given day’s total cases, then filling null values with zeroes. Python performed this calculation

almost instantly for all counties for each day of the pandemic. Once this was done, the Pandas

library tail() function was used to examine the last five rows of the dataset to verify the number

of rows and cross-check those last values for accuracy. This function revealed that the dataset

was 2.28 million rows long, which was accurate, and the last five rows of the dataset had the

correct number of new cases, which was promising.
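
The cumulative-to-daily transformation described above can be sketched in Pandas as follows. This is a minimal illustration under stated assumptions, not the code used for the project (which was based on West, 2021b): the sample rows are invented, and the territory list and the function name add_daily_columns are hypothetical. Only the column names follow the layout of the New York Times file.

```python
import io

import pandas as pd

# Invented sample in the layout of us-counties.csv (cumulative totals).
SAMPLE_CSV = """date,county,state,fips,cases,deaths
2020-03-01,Alpha,Ohio,39001,1,0
2020-03-02,Alpha,Ohio,39001,3,0
2020-03-03,Alpha,Ohio,39001,6,1
2020-03-02,Beta,Ohio,39003,2,0
2020-03-03,Beta,Ohio,39003,2,0
2020-03-03,Gamma,Guam,66010,5,0
"""

# Assumed list of territories to exclude; adjust to match the actual file.
TERRITORIES = ["American Samoa", "Guam", "Northern Mariana Islands",
               "Puerto Rico", "Virgin Islands"]


def add_daily_columns(df):
    """Add new_cases and new_deaths columns derived from cumulative totals."""
    df = df[~df["state"].isin(TERRITORIES)].copy()
    df = df.sort_values(["state", "county", "date"]).reset_index(drop=True)
    # Day-over-day difference within each county. The first day of a county
    # has no previous value, so its null result is filled with zero.
    diffs = df.groupby(["state", "county"])[["cases", "deaths"]].diff()
    diffs = diffs.fillna(0).astype(int)
    df["new_cases"] = diffs["cases"]
    df["new_deaths"] = diffs["deaths"]
    return df


daily = add_daily_columns(pd.read_csv(io.StringIO(SAMPLE_CSV)))
print(daily.tail())  # inspect the last five rows, as described in the text
```

In this sketch, because null values are filled with zeroes, the first reported day for each county shows zero new cases even when its cumulative total is already above zero.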

The Pandas library to_csv() function was used to export the transformed dataset. Next,

the entire text file was opened in Visual Studio. The values of new cases for various U.S.

counties and dates were cross-checked at random. The dataset then accurately displayed new

cases and deaths and was ready for analysis.
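
The random cross-check described above could be complemented by an exhaustive one. The sketch below is an assumption about how such a check might look, not a step performed in this project: it reads an exported file with Python's standard csv module and flags every row whose new-case value differs from the day-over-day change in the cumulative total. The sample rows, the new_cases column name, and the function name find_mismatches are hypothetical.

```python
import csv
import io

# Invented excerpt of an exported file: cumulative "cases" plus derived "new_cases".
EXPORTED_CSV = """date,county,state,cases,new_cases
2020-03-01,Alpha,Ohio,1,0
2020-03-02,Alpha,Ohio,3,2
2020-03-03,Alpha,Ohio,6,3
2020-03-02,Beta,Ohio,2,0
2020-03-03,Beta,Ohio,2,0
"""


def find_mismatches(csv_text):
    """Return rows whose new_cases differs from the change in cumulative cases."""
    rows = sorted(csv.DictReader(io.StringIO(csv_text)),
                  key=lambda r: (r["state"], r["county"], r["date"]))
    previous_total = {}  # (state, county) -> cumulative cases on the prior row
    mismatches = []
    for row in rows:
        key = (row["state"], row["county"])
        total, new = int(row["cases"]), int(row["new_cases"])
        # The first row of a county has no prior day, so zero is expected.
        expected = total - previous_total[key] if key in previous_total else 0
        if new != expected:
            mismatches.append((row["date"], row["county"], row["state"], new, expected))
        previous_total[key] = total
    return mismatches


print(find_mismatches(EXPORTED_CSV))  # an empty list means every row is consistent
```

The check compares each row with the previous reported row for the same county, so it assumes the file has one row per county per reported day.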

CDC Dataset: Rates of COVID-19 Cases or Deaths by Age Group and Vaccination Status

The second dataset, “Rates of COVID-19 Cases or Deaths by Age Group and

Vaccination Status,” is from the CDC website (CDC, 2021a). This dataset contains weekly

COVID-19 infection and mortality data by age group and vaccination status. For this dataset,

“cases” and “deaths” refer to all COVID-19 cases and deaths confirmed with a positive specimen

test. The age groups are 5-11, 12-17, 18-29, 30-49, 50-64, 65-79, and 80 or older. The dataset

only contains data for the 5-11 age group beginning December 2021, about a month after the age

group became eligible for the vaccine on October 29, 2021 (U.S. Food and Drug Administration

[FDA], 2021b). For this dataset, “vaccinated” individuals are those who were fully

vaccinated at the time of their positive COVID-19 test. Partially vaccinated individuals are

excluded from the dataset. The extract was taken on March 20, 2022, and the data span April 4,

2021, to February 19, 2022.

This dataset was cleansed using Excel. Data on cases and deaths by individual vaccine

manufacturers were excluded, as only information on vaccination status and age was needed for

the visualizations. Additionally, in the raw dataset, the first column is called “outcome,” and the

content is either “case” or “death.” For simplicity, the dataset was divided into two separate

tables – one for each outcome.



Before dividing the dataset into two tables, the dataset’s date format needed to be fixed.

The raw dataset has a column for the month (“Apr-21” for April 2021) but does not contain date

ranges for the weeks. Instead, it has a column entitled “MMWR week,” in which MMWR stands

for Morbidity and Mortality Weekly Report. MMWR week is the CDC’s official

epidemiological week, numbered from 1 to 52 (or 53 in some years). In the dataset, the MMWR week is formatted like

“202114,” representing the fourteenth week of 2021. Because of this, the corresponding date

ranges needed to be researched. Once these date ranges were identified, the first day of each

corresponding week was added to a new column for “week.”
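
The first day of each MMWR week can also be computed rather than looked up. The sketch below is illustrative only and was not part of the project workflow. It assumes the CDC convention that MMWR weeks begin on Sunday and that week 1 is the first week with at least four days in the calendar year, which means week 1 always contains January 4. The function name mmwr_week_start is hypothetical.

```python
from datetime import date, timedelta


def mmwr_week_start(code):
    """Return the Sunday that begins an MMWR week given as, e.g., '202114'."""
    code = str(code)
    year, week = int(code[:4]), int(code[4:])
    jan4 = date(year, 1, 4)
    # date.weekday() counts Monday as 0, so Sunday is 6.
    days_since_sunday = (jan4.weekday() + 1) % 7
    week1_start = jan4 - timedelta(days=days_since_sunday)
    return week1_start + timedelta(weeks=week - 1)


print(mmwr_week_start("202114"))  # 2021-04-04
print(mmwr_week_start("202207"))  # 2022-02-13
```

Under these assumptions, “202114” maps to April 4, 2021, which matches the first date covered by the extract, and the week beginning February 13, 2022, ends on February 19, 2022, the last date covered.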

CDC Dataset: Conditions Contributing to COVID-19 Deaths, by State and Age,

Provisional 2020-2022

The third dataset, “Conditions Contributing to COVID-19 Deaths, by State and Age,

Provisional 2020-2022,” is also from the CDC website (CDC, 2020). This dataset contains

comorbid COVID-19 death data by age group and by month, year, or total time. The age groups

included in the dataset are 0-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75-84, and 85 or older. For

individuals to be included in this dataset, COVID-19 and the comorbid health condition(s) must

be listed as causes of death, not just mentioned on the death certificate. Additionally, some

individuals may have more than one comorbidity listed as a cause of death and thus may be

counted more than once. The extracted dataset was retrieved on April 1, 2022, and spans January

1, 2020, to March 26, 2022.

This dataset was also cleansed using Excel. The spreadsheet contained data by month,

year, and total. Only the information by total was of interest, so monthly and annual data were

excluded. One of the health conditions included in the dataset was COVID-19, but there was no

explanation about what that meant (i.e., whether it referred to individuals without any

comorbidities). Some of the conditions were renamed for conciseness and clarity.

Results

The dashboard for this capstone project contains six visualizations. The first visualization

is an animated heat map displaying the number of weekly COVID-19 cases by county over time

from January 21, 2020, to March 5, 2022 (see Figure 1). The heat map shows that the pandemic

started in the U.S. in three counties before moving into metropolitan areas and ultimately

spreading into rural America. The heat map also shows the peak of COVID-19 infections in the

middle of January 2022, followed by a reduction in cases.
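
The weekly totals behind a map like this one can be derived from daily new cases. In this project the aggregation was handled in Tableau; the sketch below only illustrates the idea in Python. It assumes Sunday-to-Saturday weeks, which is consistent with the map ending on March 5, 2022 (a Saturday), and it uses invented rows and a hypothetical helper named week_ending.

```python
from collections import Counter
from datetime import date, timedelta


def week_ending(day):
    """Return the Saturday that closes the Sunday-to-Saturday week containing day."""
    # date.weekday() counts Monday as 0, so Saturday is 5.
    return day + timedelta(days=(5 - day.weekday()) % 7)


# Invented daily rows: (date, county, state, new_cases)
daily_rows = [
    (date(2022, 2, 27), "Alpha", "Ohio", 5),
    (date(2022, 3, 1), "Alpha", "Ohio", 3),
    (date(2022, 3, 5), "Alpha", "Ohio", 2),
    (date(2022, 3, 6), "Alpha", "Ohio", 7),
]

# Sum new cases per county per week.
weekly = Counter()
for day, county, state, new_cases in daily_rows:
    weekly[(state, county, week_ending(day))] += new_cases

for (state, county, end), total in sorted(weekly.items()):
    print(state, county, end, total)
```

A partial final week, such as March 6 to March 7, 2022, in the extract, would understate that week's total, which may be one reason to end a weekly display at the last complete week.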

The second and third visualizations are line graphs showing U.S. COVID-19 cases and

deaths, respectively, from January 21, 2020, to March 5, 2022 (see Figures 2-3). While these are

two separate visualizations, a Tableau parameter was used so that the dashboard displays only

one of the visualizations at one time. The user can select which measure they want to see by

clicking cases or deaths from a drop-down menu. This simplifies and modernizes the dashboard.

Also, the line graphs are annotated with labels indicating various pandemic milestone events that

have influenced the number of U.S. cases and deaths. These events are related to vaccine rollout,

statewide lockdowns, the emergence of new variants, and the point at which each new variant became the

primary COVID-19 strain circulating in the United States. Overall, these graphs illustrate that the

various pandemic events are associated with changes in the rate of COVID-19 cases and deaths.

For example, the arrival of new variants was associated with massive upticks in cases and deaths.

The fourth and fifth visualizations are stacked side-by-side bar graphs that show the

number of monthly COVID-19 cases and deaths, respectively, by age group and vaccination

status from April 2021 to January 2022 (see Figures 4-5). A stacked bar chart illustrates multiple

data points on top of each other to show how each component contributes to the total value. A

stacked side-by-side bar chart does this for two x-variable categories, and the stacked bar

segments for each x-variable are placed next to one another. The fourth visualization examines

monthly unvaccinated and vaccinated cases by age group from April 2021 to January 2022 (see

Figure 4). In this graph, the different age groups are stacked on top of each other, and

vaccination statuses are placed side-by-side. Likewise, the fifth visualization demonstrates

monthly deaths by vaccination status and age group (see Figure 5). Both visualizations can be

filtered by age group (by selecting from a multiple-choice menu) or month (by double-clicking

on the data). The two graphs are placed next to each other with an age filter in the center. The

age filter applies to both visualizations simultaneously, making the dashboard more streamlined

and sophisticated. These visualizations reveal more cases and deaths among unvaccinated people

than vaccinated people, and this difference is far more pronounced for mortality. The dashboard

shows, without implying causation, that mass vaccination is associated with a reduced COVID-

19 mortality rate.

The final visualization is a stacked bar chart demonstrating the number of comorbid

COVID-19 deaths from January 1, 2020, to March 26, 2022, by age group and comorbidity

category (see Figure 6). This visualization utilizes a different dataset from the previous two

graphs and the age group bins are different. Therefore, the age filter for this visualization could

not be synced with the one for the previous two graphs. As a result, a new color scheme is used

for the age groups on this graph. A drop-down filter is also included for jurisdiction –

specifically, the United States and each state. In the graph, the comorbidities are sorted from

highest to lowest death toll, and deaths are stacked by age group. It is important to note that not

all of the comorbidity categories represent chronic preexisting conditions. For example, the first-

and third-highest death tolls are “influenza and pneumonia” and “respiratory failure.” These

health conditions are likely caused by COVID-19 and are not preexisting conditions. Among the chronic

preexisting conditions, diabetes and hypertension are the most strongly associated with

mortality.
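
The ordering and stacking used in this graph can be illustrated with a short Python sketch. Tableau performed these steps in the project; the code below is only an assumed equivalent, with invented death counts and simplified field names (condition, age group, deaths) that do not necessarily match the CDC column headers.

```python
from collections import defaultdict

# Invented rows: (condition, age_group, deaths)
rows = [
    ("Diabetes", "65-74", 40),
    ("Diabetes", "75-84", 50),
    ("Hypertension", "65-74", 60),
    ("Hypertension", "75-84", 70),
    ("Influenza and pneumonia", "65-74", 90),
    ("Influenza and pneumonia", "75-84", 110),
]

# Build one stack per condition, with one segment per age group.
stacks = defaultdict(dict)
for condition, age_group, deaths in rows:
    stacks[condition][age_group] = stacks[condition].get(age_group, 0) + deaths

# Sort conditions from highest to lowest total deaths, as in the graph.
ordered = sorted(stacks, key=lambda c: sum(stacks[c].values()), reverse=True)

for condition in ordered:
    print(condition, sum(stacks[condition].values()), stacks[condition])
```

Because a death can list more than one comorbidity, the bar totals count deaths per condition; summing them across conditions would exceed the number of unique deaths.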

The completed dashboard was published to Tableau Public to be accessed and shared on

the internet. Unfortunately, the animated heat map plays much more slowly on the internet than

on the Tableau Desktop application, and the font sizes are a bit smaller and harder to read. It can

also be difficult to trigger the animation and may take a few tries before it starts working.

Otherwise, the web version looks and functions the same as the desktop version.

Discussion

This is an unprecedented time – accurate, transparent public health information is more

necessary than ever. It is also more available than ever. It is truly remarkable that all of this

information is so easily accessible online. Anyone with Tableau experience can analyze COVID-

19 data, create a dashboard, and share the finished product with others using a simple internet

URL.

In the future, a couple of improvements could be made to the dashboard. First, a live data

connection would benefit the dashboard by keeping it updated. The dashboard currently utilizes

three dataset extracts, which cannot be updated automatically. The dashboard can only be

updated manually, which is time-consuming and impractical. Additionally, if the dashboard were

to be published in a primarily web-based format, the speed of the heat map animation would

need to be quickened because Tableau Public’s default animation settings are too slow. A

straightforward solution could be to use a video format to avoid rendering issues.

There are a few key takeaways from this project. First, the overall dashboard shows that

while vaccination is not perfect, it is associated with substantially fewer deaths. Along with this,

the dashboard demonstrates that age is strongly associated with mortality. However, young

unvaccinated people, especially those with chronic health conditions like diabetes and

hypertension, are also at risk. More research is needed into the biological basis for these risk

factors. Future studies could also examine why some vaccinated people are still dying from

COVID-19 and if this is primarily due to immunodeficiency. Further investigation should also

explore the potential reasons for vaccine hesitancy and how to improve overall vaccination rates.

References

A Timeline of COVID-19 Vaccine Developments in 2021. (2021, June 3). American Journal of

Managed Care (AJMC). https://www.ajmc.com/view/a-timeline-of-covid-19-vaccine-

developments-in-2021

Akpoji, U., Amos, M. E., McMillan, K., Sims, S., & Rife, K. (2022). Exercising empathy:

Pharmacists possess skills to increase coronavirus vaccine confidence. Journal of the

American Pharmacists Association: JAPhA, 62(1), 296–301.

https://doi.org/10.1016/j.japh.2021.07.016

Albrecht, D. (2022). Vaccination, politics and COVID-19 impacts. BMC public health, 22(1), 96.

https://doi.org/10.1186/s12889-021-12432-x

Browne, E. (2021, August 10). When Were the First U.S. COVID Delta Variant Cases, and How

Did It Mutate? Newsweek. https://www.newsweek.com/first-us-covid-delta-variant-

cases-how-did-it-mutate-1617871

Centers for Disease Control and Prevention (CDC). (2020, April 27). Conditions Contributing to

COVID-19 Deaths, by State and Age, Provisional 2020–2022 | Data | Centers for

Disease Control and Prevention [Dataset]. Centers for Disease Control and Prevention

(CDC). Retrieved April 1, 2022, from https://data.cdc.gov/NCHS/Conditions-

Contributing-to-COVID-19-Deaths-by-Stat/hk9y-quqm

Centers for Disease Control and Prevention (CDC). (2021a, October 19). Rates of COVID-19

Cases or Deaths by Age Group and Vaccination Status | Data | Centers for Disease

Control and Prevention [Dataset]. Centers for Disease Control and Prevention (CDC).

Retrieved March 9, 2022, from https://data.cdc.gov/Public-Health-Surveillance/Rates-of-

COVID-19-Cases-or-Deaths-by-Age-Group-and/3rge-nu2a

Centers for Disease Control and Prevention (CDC). (2021b, November 19). CDC Expands

Eligibility for COVID-19 Booster Shots to All Adults.

https://www.cdc.gov/media/releases/2021/s1119-booster-shots.html

Centers for Disease Control and Prevention (CDC). (2021c, December 1). First Confirmed Case

of Omicron Variant Detected in the United States.

https://www.cdc.gov/media/releases/2021/s1201-omicron-variant.html#:%7E:text=First

%20Confirmed%20Case%20of%20Omicron%20Variant%20Detected%20in%20the

%20United%20States,-Media%20Statement&text=The%20California%20and%20San

%20Francisco,529

Centers for Disease Control and Prevention (CDC). (2022a, May 4). Excess Deaths Associated

with COVID-19. https://www.cdc.gov/nchs/nvss/vsrr/covid19/excess_deaths.htm

Centers for Disease Control and Prevention (CDC). (2022b, May 18). COVID Data Tracker.

https://covid.cdc.gov/covid-data-tracker/#datatracker-home

Centers for Disease Control and Prevention (CDC). (2022c, May 19). COVID Data Tracker.

https://covid.cdc.gov/covid-data-tracker/#pop-factors_7daynewcases

Centers for Disease Control and Prevention (CDC). (2022d, May 20). COVID Data Tracker.

https://covid.cdc.gov/covid-data-tracker/#vaccination-case-rate

Farhart, C. E., Douglas-Durham, E., Lunz Trujillo, K., & Vitriol, J. A. (2022). Vax attacks: How

conspiracy theory belief undermines vaccine support. Progress in molecular biology and

translational science, 188(1), 135–169. https://doi.org/10.1016/bs.pmbts.2021.11.001

Giattino, C., Ritchie, H., Roser, M., Ortiz-Ospina, E., & Hasell, J. (2022, May 2). Excess

mortality during the Coronavirus pandemic (COVID-19). Our World in Data.

https://ourworldindata.org/excess-mortality-covid

Greenhalgh, J., & Stein, R. (2021, July 6). Delta Is Now The Dominant Coronavirus Variant In

The U.S. NPR. https://choice.npr.org/index.html?origin=https://www.npr.org/sections/

coronavirus-live-updates/2021/07/06/1013582342/delta-is-now-the-dominant-

coronavirus-variant-in-the-u-s

Hubbard, K. (2021, December 28). CDC: Omicron Overtook Delta as Dominant Variant. U.S.

News & World Report.

https://www.usnews.com/news/health-news/articles/2021-12-28/cdc-omicron-overtook-

delta-as-dominant-variant#:%7E:text=The%20delta%20variant%2C%20which

%20has,while%20omicron%20rose%20to%2058.6%25

Kugeler, K. J., Williamson, J., Curns, A. T., Healy, J. M., Nolen, L. D., Clark, T. A., Martin, S.

W., & Fischer, M. (2022). Estimating the number of symptomatic SARS-CoV-2

infections among vaccinated individuals in the United States-January-July, 2021. PloS

one, 17(3), e0264179. https://doi.org/10.1371/journal.pone.0264179

Lau, M., Liu, C., Siegler, A. J., Sullivan, P. S., Waller, L. A., Shioda, K., & Lopman, B. A.

(2022). Post-lockdown changes of age-specific susceptibility and its correlation with

adherence to social distancing measures. Scientific reports, 12(1), 4637.

https://doi.org/10.1038/s41598-022-08566-6

Lin, C., Tu, P., & Terry, T. C. (2022). Moving the needle on racial disparity: COVID-19 vaccine

trust and hesitancy. Vaccine, 40(1), 5–8. https://doi.org/10.1016/j.vaccine.2021.11.010

Lipsitch, M., Krammer, F., Regev-Yochay, G., Lustig, Y., & Balicer, R. D. (2022). SARS-CoV-

2 breakthrough infections in vaccinated individuals: measurement, causes and

impact. Nature Reviews Immunology, 22(1), 57–65.

https://doi.org/10.1038/s41577-021-00662-4

Roberts, H. A., Clark, D. A., Kalina, C., Sherman, C., Brislin, S., Heitzeg, M. M., & Hicks, B.

M. (2022). To vax or not to vax: Predictors of anti-vax attitudes and COVID-19 vaccine

hesitancy prior to widespread vaccine availability. PLOS ONE, 17(2), e0264019.

https://doi.org/10.1371/journal.pone.0264019

Scott, V. P., Hiller-Venegas, S., Edra, K., Prickitt, J., Esquivel, Y., Melendrez, B., & Rhee, K. E.

(2022). Factors associated with COVID-19 vaccine intent among Latino SNAP

participants in Southern California. BMC Public Health, 22(1), 653.

https://doi.org/10.1186/s12889-022-13027-w

Smith-Schoenwalder, C. (2022, May 16). U.S. Coronavirus Death Toll Reaches 1 Million as

Country Grapples With How to Move Forward. U.S. News & World Report.

https://www.usnews.com/news/national-news/articles/2022-05-16/u-s-coronavirus-death-

toll-reaches-1-million-as-country-grapples-with-how-to-move-forward

States that issued lockdown and stay-at-home orders in response to the coronavirus (COVID-19)

pandemic, 2020. (2021, January 5). Ballotpedia.

https://ballotpedia.org/States_that_issued_lockdown_and_stay-at-

home_orders_in_response_to_the_coronavirus_(COVID-19)_pandemic,_2020

The New York Times. (2020a, June 27). GitHub - nytimes/covid-19-data/us-counties.csv

[Dataset]. GitHub. Retrieved March 8, 2022, from https://github.com/nytimes/covid-19-

data/blob/master/us-counties.csv

The New York Times. (2020b, June 27). GitHub - nytimes/covid-19-data: An ongoing repository

of data on coronavirus cases and deaths in the U.S. [Repository]. GitHub.

https://github.com/nytimes/covid-19-data

Upchurch, D. M., Wong, M. S., Yuan, A. H., Haderlein, T. P., McClendon, J., Christy, A., &

Washington, D. L. (2022). COVID-19 Infection in the Veterans Health Administration:

Gender-specific Racial and Ethnic Differences. Women’s Health Issues, 32(1), 41–50.

https://doi.org/10.1016/j.whi.2021.09.006

U.S. Food and Drug Administration (FDA). (2021a, May 10). Coronavirus (COVID-19) Update:

FDA Authorizes Pfizer-BioNTech COVID-19 Vaccine for Emergency Use in Adolescents

in Another Important Action in Fight Against Pandemic. https://www.fda.gov/news-

events/press-announcements/coronavirus-covid-19-update-fda-authorizes-pfizer-

biontech-covid-19-vaccine-emergency-use

U.S. Food and Drug Administration (FDA). (2021b, August 23). FDA Approves First COVID-

19 Vaccine. https://www.fda.gov/news-events/press-announcements/fda-approves-first-

covid-19-vaccine#:%7E:text=The%20first%20EUA%2C%20issued%20Dec,trial%20of

%20thousands%20of%20individuals.

U.S. Food and Drug Administration (FDA). (2021c, September 22). FDA Authorizes Booster

Dose of Pfizer-BioNTech COVID-19 Vaccine for Certain Populations.

https://www.fda.gov/news-events/press-announcements/fda-authorizes-booster-dose-

pfizer-biontech-covid-19-vaccine-certain-populations

U.S. Food and Drug Administration (FDA). (2021d, October 29). FDA Authorizes Pfizer-

BioNTech COVID-19 Vaccine for Emergency Use in Children 5 through 11 Years of Age.

https://www.fda.gov/news-events/press-announcements/fda-authorizes-pfizer-biontech-

covid-19-vaccine-emergency-use-children-5-through-11-years-age

West, D. A. [Fickle-Scene-4773]. (2021a, October 8). [O.C.] The Pandemic in the U.S. in 60

Seconds [Online forum post]. Reddit.

https://www.reddit.com/r/dataisbeautiful/comments/q4dc8b/oc_the_pandemic_in_the_us

_in_60_seconds/

West, D. A. (2021b, October 15). Covid-Videos/CountyCaseVideo_For_Git.ipynb [Source

Code]. GitHub.

https://github.com/DavidAWest/Covid-Videos/blob/main/CountyCaseVideo_For_Git.ipynb

Appendix A

Figure 1

Weekly U.S. COVID-19 Cases by County Over Time

Note. This image is a single snapshot from an animation. The data from the full animation spans

January 21, 2020, to March 5, 2022. The above figure depicts COVID-19 cases during the week

of January 23-29, 2022, near the pandemic’s peak.



Figure 2

U.S. COVID-19 Cases Over Time

Note. The visualization depicts COVID-19 cases from January 21, 2020, to March 5, 2022.

Various pandemic milestone events are labeled on the graph to show how each event may have

influenced the number of cases.

Figure 3

U.S. COVID-19 Deaths Over Time

Note. The visualization depicts COVID-19 deaths from January 21, 2020, to March 5, 2022.

Various pandemic milestone events are labeled on the graph to show how each event may have

influenced the number of deaths.



Figure 4

U.S. COVID-19 Cases by Vaccination Status and Age

Note. This stacked side-by-side chart depicts U.S. COVID-19 cases by vaccination status and age

group (5-11, 12-17, 18-29, 30-49, 50-64, 65-79, 80+) from April 2021 to January 2022. For this

dataset, data collection for the 5-11 age group began in December 2021, about a month after this

age group became eligible for the vaccine on October 29, 2021.

Figure 5

U.S. COVID-19 Deaths by Vaccination Status and Age

Note. This stacked side-by-side chart depicts U.S. COVID-19 deaths by vaccination status and

age group (5-11, 12-17, 18-29, 30-49, 50-64, 65-79, 80+) from April 2021 to January 2022. For

this dataset, data collection for the 5-11 age group began in December 2021, about a month after

this age group became eligible for the vaccine on October 29, 2021.

Figure 6

Comorbid COVID-19 Deaths by Condition and Age

Note. This stacked graph shows the number of comorbid COVID-19 deaths by health condition

and age group. To be included in this visualization, COVID-19 and the comorbid condition

needed to be listed as causes of death, not just mentioned on the death certificate. Deaths may be

attributed to more than one comorbidity category, and therefore, deaths may be counted more

than once. The data spans from January 1, 2020, to March 26, 2022.

Appendix B

Tableau Data Analysis Methods

Visualization #1: Animated Heat Map

The animated heat map was created using the New York Times dataset (see Figure 1).

First, Longitude was dragged to Columns and Latitude was dragged to Rows. After this, bins

were designed for new weekly COVID-19 cases. A calculated field was created entitled “Sum of

Cases (Bins).” The code entered was as follows:

IF SUM([Cases Increase]) < 10 THEN '1-9'
ELSEIF SUM([Cases Increase]) < 50 THEN '10-49'
ELSEIF SUM([Cases Increase]) < 100 THEN '50-99'
ELSEIF SUM([Cases Increase]) < 250 THEN '100-249'
ELSEIF SUM([Cases Increase]) < 500 THEN '250-499'
ELSEIF SUM([Cases Increase]) < 1000 THEN '500-999'
ELSEIF SUM([Cases Increase]) < 2500 THEN '1,000-2,499'
ELSEIF SUM([Cases Increase]) < 5000 THEN '2,500-4,999'
ELSEIF SUM([Cases Increase]) < 10000 THEN '5,000-9,999'
ELSEIF SUM([Cases Increase]) < 20000 THEN '10,000-19,999'
ELSEIF SUM([Cases Increase]) < 50000 THEN '20,000-49,999'
ELSEIF SUM([Cases Increase]) < 100000 THEN '50,000-99,999'
ELSEIF SUM([Cases Increase]) >= 100000 THEN '100,000+'
END
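
For readers who want to check the bin boundaries outside Tableau, the same thresholds can be sketched in Python. This is an illustrative equivalent and not part of the original workflow; the names BOUNDS, LABELS, and case_bin are hypothetical. Like the Tableau expression, it places any weekly total below 10, including zero, in the '1-9' bin.

```python
from bisect import bisect_right

# Exclusive upper bounds and labels copied from the "Sum of Cases (Bins)" field.
BOUNDS = [10, 50, 100, 250, 500, 1000, 2500, 5000, 10000, 20000, 50000, 100000]
LABELS = [
    '1-9', '10-49', '50-99', '100-249', '250-499', '500-999',
    '1,000-2,499', '2,500-4,999', '5,000-9,999', '10,000-19,999',
    '20,000-49,999', '50,000-99,999', '100,000+',
]


def case_bin(weekly_cases: int) -> str:
    """Return the bin label for one county's weekly case total."""
    # bisect_right counts how many upper bounds are <= the value, which is
    # the index of the first bin whose upper bound exceeds the value.
    return LABELS[bisect_right(BOUNDS, weekly_cases)]


if __name__ == "__main__":
    for n in (0, 9, 10, 2499, 2500, 100000):
        print(n, case_bin(n))
```

A lookup table keeps the thresholds and labels side by side, which makes an off-by-one boundary easier to spot than in a long chain of comparisons.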

Next, this calculated field was dragged to Color on the Marks card, where it appears as

AGG(Sum of Cases (Bins)) because the calculation is already aggregated. Then, the State and

County dimensions were dragged to Marks to break the data out by U.S. county and state.

Afterward, another calculated field called “Week with Date Ranges” was created so that Tableau

would display the date range for each week of the pandemic. The code for “Week with Date

Ranges” is as follows:

STR(DATE(DATETRUNC('week', [Date]))) + " to " +
STR(DATE(DATETRUNC('week', [Date]) + 6))
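
The week-label logic can be sketched in Python as a rough equivalent. Two assumptions are built in: weeks start on Sunday, which matches the January 23-29, 2022 week shown in Figure 1, and dates are printed in ISO format, whereas Tableau's STR output depends on locale settings. The function name week_label is hypothetical.

```python
from datetime import date, timedelta


def week_label(d: date) -> str:
    """Return 'start to end' for the Sunday-to-Saturday week containing d."""
    # Python's weekday() is Monday=0 ... Sunday=6; shift so Sunday=0.
    days_since_sunday = (d.weekday() + 1) % 7
    start = d - timedelta(days=days_since_sunday)
    end = start + timedelta(days=6)
    return f"{start.isoformat()} to {end.isoformat()}"


if __name__ == "__main__":
    print(week_label(date(2022, 1, 26)))
```

Every date in the same week maps to the same label, which is what lets the label serve as both a filter value and a page in the animation.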

To filter the heat map by date, the calculated field “Week with Date Ranges” was

dragged to Filters, and the filter was displayed in the “single value (slider)” format. At this

point, the heat map was functional and filtered by week but not animated. To animate the heat

map, “Week with Date Ranges” was dragged to Pages. Finally, Tableau’s ToolTip window was

configured to display the county and state names, weekly cases, and weekly deaths.
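
The calculated field above reads a [Cases Increase] field, while the New York Times us-counties.csv file reports cumulative cases for each county and date. This appendix does not describe how [Cases Increase] was derived. One plausible approach, sketched here in plain Python as an assumption and not as the author's documented method, is to subtract each county's previous cumulative count from the current one. The function name daily_increase and the sample rows are hypothetical.

```python
from collections import defaultdict


def daily_increase(rows):
    """Convert cumulative counts to per-day increases.

    rows: iterable of (iso_date, fips, cumulative_cases) in any order.
    Returns a list of (iso_date, fips, cases_increase), sorted by county, then date.
    """
    by_county = defaultdict(list)
    for day, fips, cumulative in rows:
        by_county[fips].append((day, cumulative))

    result = []
    for fips in sorted(by_county):
        previous = 0  # a county's first reported day counts in full
        # ISO date strings sort chronologically, so a plain sort is enough.
        for day, cumulative in sorted(by_county[fips]):
            result.append((day, fips, cumulative - previous))
            previous = cumulative
    return result


if __name__ == "__main__":
    sample = [  # made-up values for illustration only
        ("2020-03-02", "06073", 3),
        ("2020-03-01", "06073", 1),
        ("2020-03-03", "06073", 3),
    ]
    for row in daily_increase(sample):
        print(row)
```

Downward corrections in a cumulative series would produce negative increases under this approach, and the Tableau bins would place those in the lowest category.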

Visualizations #2-3: Annotated Line Graphs

The New York Times dataset was used to create the next two visualizations, which

depict COVID-19 cases and deaths, respectively (see Figures 2-3). While these are two

separate visualizations, a parameter was used so that the dashboard displays only one of the

visualizations at one time. The user can select which measure they want to see by clicking cases

or deaths from a drop-down menu. Also, the line graphs are annotated with labels indicating

various milestone events that may have influenced the number of cases and deaths nationwide.

Examples of pandemic milestone events are lockdown orders, vaccine rollout events, and the

emergence of new variants. Additionally, the ToolTip is configured to display the “week of” date

and number of weekly cases for each week when the user hovers over the data.

Visualizations #4-5: Stacked Side-by-Side Bar Graphs

The next two visualizations in the dashboard are both stacked side-by-side bar graphs (see

Figures 4-5). Each bar is stacked by age group, and the bars for each vaccination status are

placed side-by-side. Both visualizations can be filtered by age group (by selecting from a

multiple-choice menu) or month (by double-clicking on the data). The age filter applies to both

visualizations simultaneously, so the two graphs always show the same age groups. Lastly,

the ToolTip shows the month, age group, and number of vaccinated or unvaccinated cases or

deaths when the viewer hovers over a specific data element.



Visualization #6: Stacked Bar Graph

The dashboard’s final visualization shows comorbid COVID-19 deaths by health

condition from January 1, 2020, to March 26, 2022 (see Figure 6). The comorbidity categories

are sorted from the highest to the lowest number of deaths. Like the previous two graphs, deaths

are stacked by age group. However, this visualization utilizes a different dataset from the

previous two graphs and the age group bins are different. Therefore, the age filter for this

visualization could not be synced with the age filter for the previous two graphs. As a result, a

new color scheme is utilized for the age groups on this graph. A drop-down filter is also included

for jurisdiction – specifically, the United States and each state. Finally, the ToolTip displays the

age group, health condition, and number of associated deaths when the user hovers over the data.
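
Because a single death can list more than one comorbid condition, the stacked bars in this visualization cannot be summed to obtain a count of unique deaths. The short Python sketch below illustrates the effect with made-up records; the condition names, age-group labels, and the tally function are hypothetical and are not drawn from the dataset.

```python
from collections import Counter

# Hypothetical records: one age group and one or more comorbid conditions each.
deaths = [
    {"age_group": "65-74", "conditions": ["Diabetes", "Hypertensive diseases"]},
    {"age_group": "85+", "conditions": ["Influenza and pneumonia"]},
    {"age_group": "65-74", "conditions": ["Diabetes"]},
]


def tally(records):
    """Count deaths per (condition, age group), once for each listed condition."""
    counts = Counter()
    for record in records:
        for condition in record["conditions"]:
            counts[(condition, record["age_group"])] += 1
    return counts


counts = tally(deaths)

if __name__ == "__main__":
    print("unique deaths:", len(deaths))
    print("sum of bar segments:", sum(counts.values()))
```

In this example the bar segments sum to more than the number of unique deaths, because the first record contributes to two condition categories.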

Appendix C

Program Competencies

Health Science Knowledge and Skills Competency

The capstone project demonstrates this competency because it focuses on a relevant

healthcare topic, COVID-19. The project delves into population health and reveals how

public health tools like vaccination influence COVID-19 infections and mortality.

Leadership and Systems Management Outcome

This competency was demonstrated during the capstone poster presentation to fellow

Health Care Informatics students and faculty.

Systems Design and Management Outcome

The dashboard was designed and iteratively improved using human factors techniques,

making the dashboard informative, interactive, intuitive, and user-friendly.

Data and Knowledge Management Competency

This was the main competency highlighted in the capstone project. The datasets were

cleansed, transformed, and analyzed using Excel, Python, and Tableau. The dashboard

was also published to Tableau Public to be accessed on the web.

Quality and Regulatory Competency

This project relates to the Quality and Regulatory Competency because COVID-19

infections, deaths, and vaccinations are tracked by public health agencies so that data can

be analyzed and interpreted for quality improvement and regulatory purposes.

Social Justice and Community Activism

The COVID-19 dashboard helps the user visualize how age, geographic location, and

vaccination status may affect the number of cases and deaths.
