
Climate Change: Impacts and Adaptation
Methods for Analyzing Historical and Future
Climate Change/Variability

Week 2
Environment and Climate Change Management Masters Program
11/10/23
Session outline
Introduction
Measures of central tendencies
Measures of variability
Probability distributions
Time Series Analysis
Drought Indices
Estimating Data
Introduction
Climate is a paradigm of a complex system.
It has many variables, which act nonlinearly on a wide
range of space-time scales.
Climatology is the study of climate, its variations and
extremes, and its influences on a variety of activities
including (but far from limited to) human health,
safety and welfare.
Mathematical models simulate the climate and its
impacts.
The statistical inference should not only report the best
guess (estimate) but also its uncertainty.
Climate can be described in terms of statistical
descriptions of the central tendencies and variability of
relevant elements such as temperature, precipitation,
atmospheric pressure, humidity, and winds, or through
combinations of elements, such as weather types and
phenomena, that are typical to a location, region or the
world for any time period.

This session is intended to describe basic concepts
rather than to provide detailed specifics of complex
subjects.
Measures of central tendencies
Measures of Central Tendency provide a summary
measure that attempts to describe a whole set of data
with a single value that represents the middle or centre
of its distribution.
There are three main measures of central tendency: the
mean, the median and the mode.
When data is normally distributed, the mean, median
and mode should be identical, and are all effective in
showing the most typical value of a data set.
It is important to look at the dispersion of a data set when
interpreting the measures of central tendency.
Mean
The arithmetic mean or, as commonly termed,
average, is one of the most frequently used
statistics in climatology.
It is calculated by simply dividing the sum of the
values by the number of values.
Disadvantages of the mean as a measure of central
tendency are that it is highly susceptible to
outliers, and that it is not appropriate when
the data are skewed rather than normally
distributed.
Median
The median of a data set is the value that is at the
middle of a data set arranged from smallest to largest.
If there are an odd number of values, the median is the
middle value. For an even number of values the median
is located between the two middle values, generally as
the mean of the two.
The influence of extreme variations is less on the
median than the mean because the median is a measure
of position.
The median is appropriate to use with ordinal variables,
and with interval variables with a skewed distribution.
Mode
The mode is the most common observation of a
data set, or the value in the data set that occurs
most frequently.
Like the median, it is a positional measure.
It is affected neither by the value (as is the mean)
nor by the position of other observations (as is the
median).
The mode is an appropriate measure to use with
categorical data.
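The three measures above can be compared on a small sample. This is a minimal sketch using Python's standard `statistics` module; the temperature values are hypothetical and chosen so that one outlier pulls the mean away from the median and the mode.

```python
import statistics

# Hypothetical daily maximum temperatures (deg C); the last value is an outlier.
temps = [21, 22, 22, 23, 24, 25, 38]

mean_t = statistics.mean(temps)      # sum of the values / number of values
median_t = statistics.median(temps)  # middle value of the sorted data
mode_t = statistics.mode(temps)      # most frequently occurring value

print("mean:", mean_t, "median:", median_t, "mode:", mode_t)
```

Here the single high value raises the mean above the median, which illustrates why the median is often preferred for skewed data.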
Measures of variability
The measurement of variation and its explanation is of
fundamental importance.
However, a record of only a few observations
generally gives a poor basis for judging the variability.
The deviation of each individual observation from the
central tendency can be reduced to a value that
represents and describes the entire data set.
There are several ways to measure variability,
including the range, the standard deviation and the
variance.
Range
The simplest measure of absolute variability is the
range of the observations.
The range is the difference between the highest and
lowest values.
Although easy to calculate, the range has many
limitations.
If the extreme values are very rare or they fall well
beyond the bulk of observations, then the range will be
misleading.
Standard Deviation
It is the typical (standard) difference
(deviation) of an observation from the mean.
The standard deviation is the square root of the mean of
the square of all the individual deviations from the mean.
Deviations are taken from the mean instead of the median
or mode because the sum of squared deviations is smallest
when taken about the mean.
It indicates the typical distance between an observation
and the mean of a data set.
Squaring deviations gives greater weight to extreme
variations.
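The range, variance and standard deviation described above can be sketched as follows. The rainfall values are hypothetical, and the variance is computed in the population form (dividing by the number of values), matching the "mean of the squares" wording; a sample estimate would divide by n - 1 instead.

```python
import math

# Hypothetical monthly rainfall totals (mm).
rain = [40, 55, 60, 45, 50]

n = len(rain)
mean_r = sum(rain) / n

# Range: difference between the highest and lowest values.
data_range = max(rain) - min(rain)

# Variance: mean of the squared deviations from the mean (population form).
variance = sum((x - mean_r) ** 2 for x in rain) / n

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(data_range, variance, round(std_dev, 2))
```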
Probability distributions
Probability distribution functions may be fitted to empirical
distributions.
While many climatological variables may be assumed to
have prescribed distributions – normal distribution for air
temperature, gamma distribution for precipitation, Weibull
distribution for wind speed – it often is worthwhile to
examine the characteristics and fit of a number of
distributions.
Extreme value distributions are particularly important in
determining “return periods” for extreme events, such as the
50- or 100-year precipitation amount.
The interactions between changing parameters, such as
mean and variance, and their resulting influence on the
probability of extreme events are also of critical importance
in climatic change research.
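To make the idea of a return period concrete: a T-year event has an annual exceedance probability of 1/T. The sketch below, which assumes that years are independent and the climate is stationary (an assumption that climate change itself may violate), computes the chance of seeing at least one such event over a planning horizon. The function names are illustrative only.

```python
def annual_exceedance_probability(return_period_years):
    """Probability that the event is exceeded in any single year."""
    return 1.0 / return_period_years


def prob_at_least_one(return_period_years, horizon_years):
    """Chance of at least one exceedance in the horizon.

    Assumes independent years and an unchanging (stationary) climate.
    """
    p = annual_exceedance_probability(return_period_years)
    return 1.0 - (1.0 - p) ** horizon_years


# Chance of at least one 100-year event during a 30-year period.
risk = prob_at_least_one(100, 30)
print(round(risk, 2))
```

Under these assumptions a "100-year" event has roughly a one-in-four chance of occurring at least once in 30 years.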
Time Series Analysis
“Climate change” refers to time, and the analysis of
modelled or observed time series, such as global surface-
air temperature over the past millennium, is an important
field for climate analysis.
The detection, estimation and prediction of trends and
associated statistical and physical significance are
important aspects of climate research.
Given a time series of (say) temperatures, the trend is the
rate at which temperature changes over a time period.
The trend may be linear or non-linear.
In practice, however, the term usually refers to the
slope of a straight line fitted to the time series.
Regression: The estimation of bivariate and multivariate
relationships via regression is one of the most common
and powerful statistical tools in all of science.
Regression analysis develops an equation that relates a
dependent variable to a set of independent variables.
As a result, regression can be used: (a) to evaluate the
strength and sensitivity of statistical relationships
between variables and (b) to estimate “missing” values
of the dependent variable, which would include both
interpolation and extrapolation (including forecasting).
Diagnostic tools associated with regression, such as the
coefficient of determination (R²), are used to evaluate
the amount of variability that can be accounted for in
one variable by knowing the values of other variables.
Simple linear regression is most commonly used to
estimate the linear trend (slope) and statistical
significance (via a t-test).
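A linear trend and its coefficient of determination can be estimated by ordinary least squares. The following is a minimal sketch with hypothetical annual temperatures; `linear_trend` is an illustrative helper, and the t-test for significance is not included.

```python
def linear_trend(times, values):
    """Ordinary least-squares fit of values = intercept + slope * time."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n

    # Sums of squares and cross-products about the means.
    s_tt = sum((t - t_mean) ** 2 for t in times)
    s_tv = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))

    slope = s_tv / s_tt
    intercept = v_mean - slope * t_mean

    # R^2: share of the variance in values accounted for by the fitted line.
    ss_res = sum((v - (intercept + slope * t)) ** 2
                 for t, v in zip(times, values))
    ss_tot = sum((v - v_mean) ** 2 for v in values)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical annual mean temperatures (deg C).
years = [2000, 2001, 2002, 2003, 2004]
temps = [14.0, 14.2, 14.1, 14.4, 14.5]

slope, intercept, r2 = linear_trend(years, temps)
print(round(slope, 3), "deg C per year; R^2 =", round(r2, 2))
```

The slope is the trend in degrees per year; multiplying by 10 gives the trend per decade, a unit often used when reporting climate trends.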
Correlation: A fundamental concept for analysing
bivariate data sets (two variables) is correlation analysis,
that is, a quantitative measure of how strongly the two
variables co-vary.
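One common measure of co-variation is the Pearson correlation coefficient, which ranges from -1 to +1. The slide does not name a specific coefficient, so the choice of Pearson's r here is an assumption; the sketch uses only the standard library and hypothetical data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical example: a perfectly linear relationship gives r = 1.
r_pos = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
# A perfectly inverse relationship gives r = -1.
r_neg = pearson_r([1, 2, 3], [3, 2, 1])
print(r_pos, r_neg)
```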
The non-parametric (i.e., distribution-free) Mann-Kendall
(M-K) test can also be used to assess the significance of a
monotonic trend (linear or non-linear).
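The Mann-Kendall test can be sketched in a few lines. This simplified version assumes no tied values and no serial correlation, and uses the normal approximation, which is usually considered adequate only for series of about ten or more values. The function name is illustrative.

```python
import math


def mann_kendall(values):
    """Simplified Mann-Kendall trend test (no tie or autocorrelation correction).

    Returns the S statistic, the standardized Z score and a two-sided p-value.
    """
    n = len(values)

    # S counts later-minus-earlier pairs: +1 for an increase, -1 for a decrease.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S under the null hypothesis of no trend (no ties assumed).
    var_s = n * (n - 1) * (2 * n + 5) / 18.0

    # Continuity-corrected Z score.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-sided p-value from the standard normal distribution.
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return s, z, p


# Hypothetical steadily increasing series of ten values.
s_stat, z_score, p_value = mann_kendall([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(s_stat, round(z_score, 2), p_value < 0.05)
```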
Smoothing and filtering: Moving averages, also
known as running means, are frequently used to reveal
variability in a time series that is masked by either a
prominent periodicity (such as a daily cycle) or high
levels of variability ("noise").
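A running mean can be sketched as below. This version returns one value per complete window, so the smoothed series is shorter than the input by `window - 1` values; how the ends of the series are treated is a design choice, and other conventions exist.

```python
def running_mean(values, window):
    """Moving average over consecutive windows of the given length."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical series smoothed with a 3-value window.
smoothed = running_mean([1, 2, 3, 4, 5], 3)
print(smoothed)
```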
Drought Indices
Indices are typically computed numerical
representations of drought severity, assessed using
climatic or hydro-meteorological inputs including the
indicators.

Indicators are variables or parameters used to describe
drought conditions.

Examples include precipitation, temperature,
streamflow, groundwater and reservoir levels, soil
moisture and snowpack.
| Meteorology | Ease of use | Input parameters | Additional information |
|---|---|---|---|
| Standardized Precipitation Index (SPI) | Green | P | Highlighted by the World Meteorological Organization as a starting point for meteorological drought monitoring |
| Standardized Precipitation Evapotranspiration Index (SPEI) | Yellow | P, T | Serially complete data required; output similar to SPI but with a temperature component |
| Rainfall Anomaly Index (RAI) | Yellow | P | Serially complete data required |
| Standardized Anomaly Index (SAI) | Yellow | P | Point data used to describe regional conditions |
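As a rough illustration of a standardized index, the sketch below expresses each precipitation value as a departure from the long-term mean in units of standard deviation (a z-score). This assumes the Standardized Anomaly Index takes that general form; the operational index also averages across stations in a region, which is not shown, and the rainfall values are hypothetical.

```python
import statistics


def standardized_anomalies(values):
    """Departure of each value from the mean, in standard deviations.

    Assumption: the population standard deviation is used; a sample
    estimate (statistics.stdev) is an equally plausible choice.
    """
    mean_v = statistics.mean(values)
    sd_v = statistics.pstdev(values)
    return [(v - mean_v) / sd_v for v in values]


# Hypothetical seasonal rainfall totals (mm) at a single station.
z = standardized_anomalies([40, 55, 60, 45, 50])
print([round(v, 2) for v in z])
```

Negative values indicate drier-than-average periods and positive values wetter-than-average periods.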
| Hydrology | Ease of use | Input parameters | Additional information |
|---|---|---|---|
| Palmer Hydrological Drought Severity Index (PHDI) | Yellow | P, T, AWC | Serially complete data required |
| Standardized Reservoir Supply Index (SRSI) | Yellow | RD | Similar calculations to SPI using reservoir data |
| Standardized Streamflow Index (SSFI) | Yellow | SF | Uses the SPI program along with streamflow data |
| Standardized Water-level Index (SWI) | Yellow | GW | Similar calculations to SPI, but using groundwater or well-level data instead of precipitation |
| Streamflow Drought Index (SDI) | Yellow | SF | Similar calculations to SPI, but using streamflow data instead of precipitation |
| Remote sensing | Ease of use | Input parameters | Additional information |
|---|---|---|---|
| Enhanced Vegetation Index (EVI) | Green | Sat | Does not separate drought stress from other stress |
| Evaporative Stress Index (ESI) | Green | Sat, PET | Does not have a long history as an operational product |
| Normalized Difference Vegetation Index (NDVI) | Green | Sat | Calculated for most locations |
| Temperature Condition Index (TCI) | Green | Sat | Usually found along with NDVI calculations |
| Vegetation Condition Index (VCI) | Green | Sat | Usually found along with NDVI calculations |
| Vegetation Drought Response Index (VegDRI) | Green | Sat, P, T, AWC, LC, ER | Takes into account many variables to separate drought stress from other vegetation stress |
| Vegetation Health Index (VHI) | Green | | One of the first attempts to monitor drought using remotely sensed data |
| Normalized Difference Water Index (NDWI) and Land Surface Water Index | Green | Sat | Produced operationally using Moderate Resolution Imaging Spectroradiometer data |
Estimating Data
Interpolation uses data which are available both
before and after a missing value (time interpolation),
or surrounding the missing value (space interpolation),
to estimate the missing value.

Extrapolation extends the range of available data
values. There are more possibilities for error of
extrapolated values because relations are used
outside the domain of the values from which the
relationships were derived.
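Time interpolation can be sketched with a simple linear fill between the known values on either side of a gap. Linear interpolation is only one possible method and is an assumption here. In line with the caution about extrapolation, this sketch leaves missing values at the start or end of the series unfilled. Missing values are represented by `None`, and the function name is illustrative.

```python
def fill_gaps_linear(series):
    """Fill interior None values by linear interpolation in time.

    Leading and trailing gaps are left as None, because filling them
    would require extrapolation beyond the observed values.
    """
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]

    # Walk through consecutive pairs of known observations.
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled


# Hypothetical record with interior gaps and gaps at both ends.
result = fill_gaps_linear([None, 2.0, None, None, 5.0, None])
print(result)
```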
Questions?
