Professional Documents
Culture Documents
2008 3-Hillaker
2008 3-Hillaker
Harry Hillaker
Karen Andsager
2008
Journal of Service Climatology
Volume 2, Number 3, Pages 1-19
A Refereed Journal of the American Association of State Climatologists
Daily Climate Data Quality Control Procedures
Of the Iowa State Climatologist
Harry Hillaker
State Climatologist
Iowa Department of Agriculture and Land Stewardship
Karen Andsager
Midwestern Regional Climate Center
Illinois State Water Survey
Corresponding author: Harry Hillaker, State Climatologist, Iowa Dept. of Agriculture and Land
Stewardship, Wallace State Office Building, Des Moines, IA, 50319, USA. Tel. 1-515-281-8981,
Harry.Hillaker@IowaAgriculture.gov
Abstract
The State Climatologist Office of the Iowa Department of Agriculture and Land Stewardship has
been performing data entry and data quality control of National Oceanic and Atmospheric
Administration daily climate data in Iowa since July 1, 1987. This process uses comprehensive,
automated quality control tests based on standard instrumentation and observing practices and on
standard climatological consistency. Inconsistencies flagged by these tests are manually
resolved using a standard procedure based on information available from other sources and
surrounding stations. The process then uses a manual spatial test to flag additional suspect
values, which are also manually resolved using a standard procedure based on information
available from other sources and surrounding stations. For example, for suspect values spotted
in snowfall and snow depth spatial plots, visible satellite imagery may be consulted along with
snowfall at neighboring stations to produce reasonable snowfall and snow depth estimates. This
manually intensive process has produced a unique resource for comparison of manual quality
control with automated processes, as well as analysis of Iowa climate.
Figure 2. National Weather Service B-91 form, filled in by the observer for Cedar Rapids No. 1, Iowa, for January 2005.
Figure 3 – First page. First page of the National Weather Service F-6 form, with data from the observer for Des Moines, Iowa, for
January 2005. This is the presentation of the form available on the National Weather Service’s web site:
http://www.weather.gov/climate/index.php?wfo=dmx.
[REMARKS]
Figure 3 – Second page. Second page of the National Weather Service F-6 form, with data from the observer for Des Moines, Iowa,
for January 2005. This is the presentation of the form available on the National Weather Service’s web site:
http://www.weather.gov/climate/index.php?wfo=dmx.
Figure 4. Monthly spreadsheet for Cedar Rapids No. 1 for January 2005.
18 18 Cedar Rapids #1
YR MO DA MAX MIN ATOB PREC SF SOG F I G T H W
1319 6 2005 1 1 34 22 31 0.01 0.1 T 1
1319 6 2005 1 2 35 29 29 0.31
1319 6 2005 1 3 29 26 27 0.07 0.4 T 1
1319 6 2005 1 4 28 25 26 0.02 0.2 T 1
1319 6 2005 1 5 26 16 17 0.51 5.5 6
1319 6 2005 1 6 19 9 9 0.41 6.2 11
1319 6 2005 1 7 20 3 19 T T 11
1319 6 2005 1 8 23 19 21 0.02 0.2 11
1319 6 2005 1 9 32 21 32 10
1319 6 2005 1 10 33 14 24 10
1319 6 2005 1 11 30 24 30 0.03 0.1 7 1 1
1319 6 2005 1 12 35 30 32 0.34 6
1319 6 2005 1 13 32 6 6 T 6
1319 6 2005 1 14 6 -9 -3 5
1319 6 2005 1 15 6 -6 0 5
1319 6 2005 1 16 12 -3 6 5
1319 6 2005 1 17 9 -8 3 5
1319 6 2005 1 18 23 -1 23 5
1319 6 2005 1 19 38 22 22 0.02 0.2 5
1319 6 2005 1 20 27 21 25 T T 5
1319 6 2005 1 21 26 22 22 5
1319 6 2005 1 22 25 4 19 0.19 2.2 7
1319 6 2005 1 23 21 4 19 7
1319 6 2005 1 24 38 18 32 6
1319 6 2005 1 25 47 18 40 5
1319 6 2005 1 26 40 27 27 0.01 0.1 4 1
1319 6 2005 1 27 27 7 21 4
1319 6 2005 1 28 26 11 24 4
1319 6 2005 1 29 35 21 30 4
1319 6 2005 1 30 33 27 32 4
1319 6 2005 1 31 35 28 34 0.02 0.2 4 1
If match
If unresolved
If unresolved
3: Generate and
examine daily spatial
maps for each
element, identify
suspect values
Re-key
daily value
If mis-keyed
3.1: Compare to B-91
If observer
If unresolved inconsistency
among forms Select best
3.2: Compare to F-11, RR# value
If unresolved
If unresolved If evidence of
systematic problems
3.3: Compare to hourly and other with the instruments Report to the
network data, generate estimated appropriate NWS
value, with flag in spreadsheet Forecast Office
Table 1. Data set volume and estimated value rates for the period January 1, 1991, through December 31,
2006.
20000
TMAX
TMIN
16000
ATOB
Number of Daily Values
12000
8000
4000
0
-50 -25 0 25 50 75 100 125
Degrees F
Figure 7. Distribution of temperature data in the enhanced Iowa data set for the period January 1991 through December
2006.
12
TMAX
10 TMIN
ATOB
Percent Estimated
0
-50 -25 0 25 50 75 100 125
Degrees F
Figure 8. Distribution of the proportion of estimated temperature data within the enhanced Iowa data set for the period January 1991
through December 2006.
50
40
Percent Estimated
30
20
10
0
0 100 200 300 400
Precipitation in Hundredths of Inch
Figure 9. Distribution of the proportion of estimated values in precipitation data within the enhanced Iowa data set for January 1991
through December 2006 and 21-points running mean.
three temperature elements, the proportion of The snowfall data are serially complete,
estimated values becomes somewhat noisy at the including zero values for all summer months.
ends of the distribution, due to the small number Over 90% of the snowfall values are zero, and
of values in the tails. almost 4% are traces. The traces are treated as
There are over one million precipitation, 0.1 inch in this analysis. While snowfall is
snowfall, and snow depth values in the data set generally measured to the nearest tenth of an inch,
for the period January 1, 1991, through December there is a strong tendency for the observers to
31, 2006. Precipitation is measured to the nearest record snowfall values that end in either 0 or 5
hundredth of an inch, and is highly skewed tenths of an inch (Figure 10). There is also some
toward low values. Only about 2.2% are 1.00 tendency for the estimated values to end in either
inch or more, and only 0.03% are 4.00 inches or 0 or 5 tenths of an inch, particularly for values
more. About 65% are zero, and about 7% are less than 2.0 inches. This tendency affects the
traces. The traces are treated here as 0.01 inch. distribution of the proportion of snowfall data
All estimated values are less than 4.00 inches estimated, shown in Figure 11, with the smoothed
(Figure 9). The overall proportion of estimated values lower than most, held down by the larger
precipitation values is 2.6%. There are 94 numbers at the values ending in either 0 or 5
estimates at or above 2.00 inches, or 2.4%, tenths. Much more so than with precipitation, the
essentially the same as the overall proportion. great number of zero snowfall values skews the
The proportion of estimated values is relatively overall average proportion of estimated snowfall
large for small values, between 0.02 and about a values: 1.6% including zeros, 16.2% for all non-
third of an inch. This is probably due to the zero values, and 7.2% for values above 2 inches.
observers who include small storm amounts from As with precipitation, the proportion of estimated
one day with the total for the following day. This values is relatively high for small values, of about
is a common observer error which results in an inch or less. This is likewise probably due to
estimated values. the observers including small snowfall amounts in
10000
all SNOW
8000 est SNOW
6000
Count
4000
2000
0
0 20 40 60 80 100 120 140 160 180 200
Snow in Tenths of Inch
Figure 10. Distribution of snowfall data in the enhanced Iowa data set for the period January 1991 through December
2006.
100
90
80
70
Percent Estimated
60
50
40
30
20
10
0
0 20 40 60 80 100 120 140 160 180 200
Snow fall in Tenths of Inch
Figure 11. Distribution of the proportion of estimated snowfall data within the enhanced Iowa data set for the period
January 1991 through December 2006 and 21-points running mean.
70
60
50
Percent Estimated
40
30
20
10
0
0 5 10 15 20 25 30 35 40
Snow Depth in Inches
Figure 12. Distribution of the proportion of estimated snow depth data within the enhanced Iowa data set for the period January
1991 through December 2006 and seven points running mean.
the next day’s total, or perhaps not including a 5. Consistency and Accuracy in Subjective
snowfall measurement at all, for which an Quality Control
estimate is produced. In all quality control processing, one strives to
As with snowfall, the snow depth is serially limit both Type 1 and Type 2 errors. In the
complete, including zero values for all summer quality control of the Iowa Cooperative Observer
months. 80% of the snow depth values are zero, Network and other daily Iowa data described
and 4% are traces. The traces are treated as 1 here, at what rate are estimated values produced
inch in this analysis. Snow depth is measured to for observed values that were actually correct
the nearest inch. The greatest snow depth representations of the daily weather? At what rate
observed in Iowa in this data set is 36 inches. The are observed values in error not corrected? To
overall proportion of estimated snow depth is test these questions, the standard procedure would
5.7%; for snow depth of 2 inches or more, it is be to seed the data with known errors and report
24.1%. The relatively high proportion of on the percent caught by the quality control
estimated snow depth, compared to the other process, as well as the rate of non-seeded values
observed elements, may be due to the relative corrected. To seed this process from the very
ease of noting that the snow depth was not beginning would require introducing known
properly recorded as changing upward when snow errors on the original observers’ weather reports,
fell (available snowfall data utilized), or not which is impractical. Which portion of the
properly recorded as changing downward when process may generate the greatest potential for
the temperature was above freezing (available errors? The internal consistency checks are
temperature data utilized). Another difference objective, as they depend on climatological
with snow depth compared to precipitation and consistency and the accurate application of
snowfall, is that the proportion of estimated standard observational practices by the observers,
values increases with depth, rather than remaining features are not tested; this becomes a question of
generally constant. whether or not the set of internal consistency