
Public 33 Reporting

COVID-19 Case Surveillance Public Use Data Utility Summary


Users should consider the level of completeness, including suppression levels, when planning their analyses and use of public datasets. Privacy protections suppress field values to reduce reidentification risks. Completeness varies by jurisdiction (i.e., state, local, and territorial) and time period. Variables are consistently coded to the value “Unknown” when jurisdictions specify in the case data submitted to CDC that the value is unknown, to the value “Missing” when jurisdictions do not provide a value, and to the value “NA” when the value is suppressed as part of privacy protections.
Dataset version: 10/5/2023
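
The coding convention described above can be turned into a simple tally. The following is a minimal sketch, not the method used to produce this summary: it assumes the three sentinel values appear as the literal strings "NA", "Missing", and "Unknown", and the helper names (`classify`, `utility_counts`) and the sample records are hypothetical.

```python
from collections import Counter

# Sentinel strings as described in the summary text (assumed to be literal).
SUPPRESSED = "NA"
MISSING = "Missing"
UNKNOWN = "Unknown"


def classify(value):
    """Map one cell value to a utility category."""
    if value == SUPPRESSED:
        return "suppressed"
    if value == MISSING:
        return "missing"
    if value == UNKNOWN:
        return "unknown"
    return "non_blank"


def utility_counts(rows, fields):
    """Count cells per category across the given fields of all rows."""
    counts = Counter()
    for row in rows:
        for field in fields:
            counts[classify(row[field])] += 1
    return counts


# Hypothetical records, invented for illustration only.
sample_rows = [
    {"sex": "Female", "race": "NA", "hc_work_yn": "Missing"},
    {"sex": "Unknown", "race": "White", "hc_work_yn": "Missing"},
    {"sex": "Male", "race": "Missing", "hc_work_yn": "No"},
]
counts = utility_counts(sample_rows, ["sex", "race", "hc_work_yn"])
for category in ("suppressed", "missing", "unknown", "non_blank"):
    print(category, counts[category])
```

If the data is loaded with pandas instead, note that `read_csv` by default parses the string "NA" as a missing value; passing `keep_default_na=False` keeps it as literal text so that suppressed cells stay distinguishable from "Missing" ones.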

Quick Summary
   summary            all_fields_counts  all_fields_pct  quasi_fields_counts  quasi_fields_pct
   String             Double             Double          Double               Double
1  total_rows         101,224,479        NaN%            101,224,479          NaN%
2  total_columns      33                 NaN%            7                    NaN%
3  total_cells        3,340,407,807      100.0%          708,571,353          100.0%
4  suppressed_fields  2,907,787          0.1%            2,903,500            0.4%
5  missing_fields     1,704,416,015      51.0%           89,959,070           12.7%
6  unknown_fields     289,900,486        8.7%            52,782,088           7.4%
7  non_blank_fields   1,343,183,519      40.2%           562,926,695          79.4%
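
The figures in the Quick Summary are related by simple arithmetic: total_cells is total_rows multiplied by total_columns, the four category counts add up to total_cells, and each percentage is the category count divided by total_cells. The sketch below reproduces those relationships from the published numbers; the variable and function names are illustrative, and rounding to one decimal place is an assumption based on how the percentages are displayed.

```python
# Values copied from the Quick Summary table.
total_rows = 101_224_479
all_columns = 33
quasi_columns = 7

all_fields = {
    "suppressed_fields": 2_907_787,
    "missing_fields": 1_704_416_015,
    "unknown_fields": 289_900_486,
    "non_blank_fields": 1_343_183_519,
}
quasi_fields = {
    "suppressed_fields": 2_903_500,
    "missing_fields": 89_959_070,
    "unknown_fields": 52_782_088,
    "non_blank_fields": 562_926_695,
}


def pct(count, total):
    """Share of total as a percentage, rounded to one decimal (assumed)."""
    return round(100 * count / total, 1)


total_cells_all = total_rows * all_columns
total_cells_quasi = total_rows * quasi_columns

# The four categories partition the cells, so their counts sum to the total.
assert sum(all_fields.values()) == total_cells_all
assert sum(quasi_fields.values()) == total_cells_quasi

all_pct = {name: pct(n, total_cells_all) for name, n in all_fields.items()}
quasi_pct = {name: pct(n, total_cells_quasi) for name, n in quasi_fields.items()}
print(total_cells_all, all_pct)
print(total_cells_quasi, quasi_pct)
```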

Field Level Utility Summary


   variable                           suppressed  suppressed_pct  missing     missing_pct  unknown     unknown_pct
   String                             Long        String          Long        String       Long        String
1  res_state                          3,560       0.0%            0           0.0%         0           0.0%
2  res_county                         3,560       0.0%            0           0.0%         0           0.0%
3  sex                                134,778     0.1%            477,035     0.5%         992,363     1.0%
4  age_group                          47,834      0.0%            1,015,894   1.0%         0           0.0%
5  ethnicity                          346,897     0.3%            7,858,902   7.8%         22,955,834  22.7%
6  race                               1,080,266   1.1%            8,370,620   8.3%         14,360,990  14.2%
7  hc_work_yn                         1,286,605   1.3%            72,236,619  71.4%        14,472,901  14.3%
8  records_with_any_quasi_identifier  1,286,605   1.3%            74,512,725  73.6%        37,953,958  37.5%
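
In the field-level table, the percentages appear to use total_rows (101,224,479) as the denominator rather than total_cells, and the counts for the seven individual fields (rows 1 to 7) sum to the quasi_fields_counts values in the Quick Summary. The following sketch checks both observations against the published numbers; the names are illustrative and the one-decimal rounding is an assumption.

```python
TOTAL_ROWS = 101_224_479

# (suppressed, missing, unknown) counts for rows 1 to 7 of the field-level table.
field_counts = {
    "res_state": (3_560, 0, 0),
    "res_county": (3_560, 0, 0),
    "sex": (134_778, 477_035, 992_363),
    "age_group": (47_834, 1_015_894, 0),
    "ethnicity": (346_897, 7_858_902, 22_955_834),
    "race": (1_080_266, 8_370_620, 14_360_990),
    "hc_work_yn": (1_286_605, 72_236_619, 14_472_901),
}


def pct_of_rows(count):
    """Percentage of records affected, rounded to one decimal (assumed)."""
    return round(100 * count / TOTAL_ROWS, 1)


# Column-wise sums across the seven quasi-identifier fields.
suppressed_total = sum(c[0] for c in field_counts.values())
missing_total = sum(c[1] for c in field_counts.values())
unknown_total = sum(c[2] for c in field_counts.values())
print(suppressed_total, missing_total, unknown_total)

# Per-field percentages, e.g. the share of records with hc_work_yn missing.
field_pct = {
    name: tuple(pct_of_rows(n) for n in counts)
    for name, counts in field_counts.items()
}
print(field_pct["hc_work_yn"])
```

Row 8 (records_with_any_quasi_identifier) is left out of the sums because it counts records rather than cells, so a record with several affected fields is counted once there.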
