TIME SERIES ANALYSIS KYU 2019-Chapter1

TIME SERIES ANALYSIS NOTES; JAN – MAY 2019
PAPER NAME : TIME SERIES ANALYSIS

CODE : SMT 1206
HOURS TAUGHT : 4 Hours per Week
PRE-REQUISITES : SMT 1101 CALCULUS
BSTAT DAY AND EVENING
BY MUKALAZI HERBERT
COURSE DESCRIPTION
The Time Series and Index Numbers I course is designed to introduce students to the
major concepts and tools for analyzing and drawing conclusions from data. Data and
information are integral to operations and planning, and as statisticians grow and
develop there is an increasing need for formalized statistical methodology to
answer statistics-related questions.
Teaching Objectives
On completion of this course, students should:
• Have achieved a sound understanding of the theoretical and practical
knowledge covered during the course
• Have developed a sound understanding of the value of using computer
technology for statistical purposes, and have gained skills, experience and
confidence in using such tools
• Be able to apply independent learning skills to further their statistical
knowledge and skills throughout their future studies and careers
• Have developed a sound vocabulary in the area of time series and index
numbers, so as to communicate statistical information to others and to
understand statistical reports
COURSE CONTENT
Session One: Decomposition of Economic Time Series
• Trend
• Seasonal variations
• Cyclical variations
• Irregular / Random variations
Session Two: Estimation of Trend
• Free-hand method
• Semi-average method
• Moving-average method
• Least-squares method
• Semi-logarithmic / Geometric straight line trend
• Non-linear trends
• Elimination of trend
• Third degree curve
• Asymptotic growth curves
MUKALAZI HERBERT, KYAMBOGO UNIVERSITY MATH DEPARTMENT
Session Three: Estimation of Seasonal variations
• Simple average method
• Simple average corrected to trend
• Link-relative method
• Moving average method
• Ratio to trend and ratio to moving average method
Session Four: Estimation of Cyclical fluctuations / variations
• Index numbers
• Price index numbers
• Quantity index numbers
• Value index numbers
• Deflation of index numbers
• Cost of living index / consumer price index
• Construction of index numbers
• Marshall-Edgeworth method
• Fisher’s ideal index
• Laspeyres method
• Paasche’s method
READING MATERIALS/LIST
1. Anderson, The Statistical Analysis of Time Series
2. Neil R. Ullman, Elementary Statistics: An Applied Approach
Chapter One
Introduction to Time Series
1.0 Goals of the Introduction to Time Series
1. Define and explain the meaning of each of the components of a time series
2. Describe the uses of time series and their application to different situations
and fields of study
3. Be introduced to the importance of forecasting the future with some degree
of accuracy
4. Be introduced to how those forecasts can be used to make informed
decisions
5. Define and explain the Time Series Models – the Additive and the
Multiplicative models
6. Be introduced to the use of the time series models and their application in
different situations
1.1 Definition of Time Series
A Time Series is a collection of data for some variable or a set of variables

recorded over a period of time – usually hourly, daily, weekly, monthly,
quarterly, half-yearly or yearly. The data set is called a time series because
it contains observations for some variable over time. Examples of time series
are: (1) Sales by quarter at Uchumi Supermarket; (2) the annual production
of coffee in Uganda since independence (1962); (3) the weekly interest rates
in financial institutions; (4) the hourly wind speed recorded by the
Meteorology Department of the Ministry of Natural Resources; (5) the
annual birth and death rates from the Social Science Surveys.
1.2 Reasons for analyzing Time Series
There are a number of reasons for analyzing time series, which include but are
not limited to the following:
1. The main purpose of time-series analysis is to predict or forecast future
values of the variable from past observations.
Time series can be used by management to make current decisions and
for long-term forecasting and planning. Long-term forecasts usually
extend more than one year into the future; 5-, 10-, 15-, and 20-year
projections are common. Long-range predictions are considered essential
in order to allow sufficient time for the procurement, manufacturing,
sales, finance, and other departments of a company to develop plans for
possible new products and new methods of assembly.
As a result, the ability to forecast and predict future events and trends
greatly enhances the likelihood of success. It is therefore no wonder that
businesses and governments spend a good deal of time and effort in the
pursuit of accurate forecasts of future trends and developments.
2. In time series analysis, we try to capture the underlying patterns of

variation of the variable of interest over time. This means identifying
what factors affect the patterns of variation by examining the time series
data. Such data come from repeated observations of the same variable of
interest over equal intervals of time.
1.3 Components of a Time Series
There are four components to a time series: the trend, the cyclical variation,
the seasonal variation, and the irregular variation.
1.3.1 Trend (Secular Trend)
Trend (Secular Trend) is the steady increase or decrease over a long period of
time, reflecting long-term growth or decline of the variable of interest.
1.3.1.1 Causes of Trend
Trend can be a consequence of long-term gradual changes in population, gross

national product, technological advances, or consumer preferences, among
other causes.
Disposable income, money supply, bank deposits have generally increased over
time, together with sales of durable goods such as cars, mobile phones, usually
accompanied by steadily rising prices. The per capita death rates data exhibit
long-term downward trends attributable to advances in medicine and the rising
standards of living.
1.3.2 Cyclical Variations
Cyclical variations are medium-term variations lasting a few years but

exhibiting no regular periodicity or pattern. These fluctuations are also referred
to as business cycles. They cover much longer time periods than do seasonal
variations, often encompassing three or more years in duration.
1.3.2.1 Phases of a cycle
A cycle contains four phases, namely:
1. Upswing or expansion, during which the level of business activity is
accelerated, unemployment is low, and production is brisk.
2. Peak, at which the rate of economic activity has “topped out”.
3. Downturn, or contraction, when unemployment rises and activity wanes.
4. Trough, when activity is at its lowest point.
1.3.2.2 A typical business cycle
A typical business cycle consists of a period of prosperity followed by periods of

recession, depression, and recovery.
1.3.2.3 Reasons for identifying cycles in time series
There are two main reasons why we may wish to identify cycles in time series.
In the first place, we may want to know where we are in the cycle to anticipate
what may happen in the near future.
Second, as with trend, when a cycle is identified and isolated, the other factors
affecting the time series data are more easily seen and can be explained
accordingly.
1.3.3 Seasonal Variations
Seasonal variations are characterized by predictable swings occurring within
one calendar year or less with clockwork regularity. Such intra-year variations
often reflect natural phenomena, such as changing weather associated with the
four seasons, as well as institutional, man-made factors such as work holidays
and school calendars.
Consider the expanded consumption of electricity and gas in winter, the

increased sales of warm clothing in cold weather, and the growth in
summertime sales of ice cream, sunglasses, and air-conditioners. Climatic
conditions also affect production in such industries as agriculture and outdoor
construction. Finally, social factors cause increases in the purchase of toys just
before Christmas, of flowers around Valentine’s Day, and in recreational travel
during school vacations.
These are movements in the time series that reoccur each year about the same
time.
1.3.3.1 Periodic variations
Periodic variations are shorter versions of seasonal variations. They manifest
themselves within a month, a week, or even a day. Notice the increase in
banking activity at the end of the month, while resorts and amusement parks
are busiest on weekends.
The unit of time may be quarterly, monthly, weekly or even daily.
Practically, all business and economic series have recurring seasonal patterns.
For example, sales for clothes are high prior to Christmas. Prices of produce
are low at harvest time.
1.3.3.2 Reasons for analyzing seasonal variations
An analysis of seasonal variation is important in planning production

schedules and anticipating sales.
Furthermore, once we know the seasonal variations, we may want to iron out
the intra-year variations by promoting during the off season.
1.3.4 Irregular Variations
Time series contain irregular or random fluctuations caused by unusual

occurrences producing movements that have no discernible pattern. These
movements are unique and unlikely to reoccur in similar fashion. They can be
caused by events such as wars, natural disasters – floods, earthquakes,
political upheavals, economic embargoes, etc.
In time series analysis, irregular activity in the variable consists of whatever
variation is left over after we have accounted for the effects of trend, cycles, and
seasonal variation.
1.3.4.1 Types of irregular variations
Time series analysts prefer to subdivide the irregular variation into episodic
and residual variations.
Episodic variation
Episodic fluctuations are unpredictable, but they can be identified. For

example, the impact on the economy of a major strike or a war can be
identified, but a strike or a war cannot be predicted.
Residual variation
After the episodic fluctuations have been removed, the remaining variation is
the residual variation, often called chance fluctuations. These are
unpredictable and they cannot be identified.
Note: Neither the episodic nor the residual variations can be predicted into the
future. They are merely treated as the residual influence after the other three
components of the time series data have been taken into account.
1.4 Time Series Models
1.4.1 Definition
A time series model can be expressed as some combination of the four

components: trend, cyclical, seasonal and irregular.
The model is simply a mathematical statement of the relationship among the

four components. Two types of models are commonly associated with time
series, namely, the additive and the multiplicative models.
1.4.2 The Additive Model
The additive model refers to time series 𝑌𝑡 as an algebraic sum of the four
components, symbolically expressed as: 𝑌𝑡 = 𝑇𝑡 + 𝑆𝑡 + 𝐶𝑡 + 𝐼𝑡
Where 𝑌𝑡 is the value of the time series for the time period t, and the right-hand
side values are the trend, the seasonal variation, the cyclical variation, and the
random or irregular variation respectively, for the same time period.
All the values are expressed in their original units, and S, C and I are
deviations around T.
The additive model assumes that the four components do not depend on one
another. That is, each component is independent of the other three.
1.4.2.1 Application of the Additive Model
In order to break down the time series data and measure the effect of the
individual components, we proceed in four steps as follows:
1. Isolate the seasonal variation, S, and then deseasonalize the data:

𝑌𝑡 - 𝑆𝑡 = 𝑇𝑡 + 𝐶𝑡 + 𝐼𝑡
2. Compute the trend, T, then remove its influence:

𝑌𝑡 - 𝑆𝑡 - 𝑇𝑡 = 𝐶𝑡 + 𝐼𝑡
3. Identify the cyclical fluctuation, C, then remove its influence:

𝑌𝑡 - 𝑆𝑡 - 𝑇𝑡 - 𝐶𝑡 = 𝐼𝑡
4. Recognize that the residual (what remains) is the effect of unpredictable

irregular events, I.
Example: If we were to develop a time-series model for sales for a local retail
store, we might find that T = $500, S = $100, C = -$25, and I = -$10. Sales
would be: Y = $500 + $100 -$25 -$10 = $565
Notice that the positive value for S indicates that existing seasonal influences
have a positive impact on sales. The negative cyclical value suggests that the
business cycle is currently in a downswing. There was apparently some
random event that had a negative impact on sales.
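The arithmetic of the additive model can be checked with a few lines of Python. This is only a sketch: the component values are the ones from the retail-store example, and the variable names are chosen here for illustration. It builds Y from the four components and then strips them out again in the order of the four steps above.

```python
# Component values from the retail-store example (in dollars).
trend, seasonal, cyclical, irregular = 500, 100, -25, -10

# Additive model: Y = T + S + C + I
sales = trend + seasonal + cyclical + irregular
print("Y =", sales)

# Steps 1 to 3 in reverse: remove one component at a time.
deseasonalized = sales - seasonal      # T + C + I
detrended = deseasonalized - trend     # C + I
residual = detrended - cyclical        # I, the irregular component
print(deseasonalized, detrended, residual)
```

In practice the components are not known in advance and have to be estimated from the data; the sketch only shows how the pieces add up once they are known.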
1.4.2.2 Defects of the Additive Model
The additive model suffers from the somewhat unrealistic assumption that the
components are independent of each other. This is seldom the case in the real
world. In most instances, movements in one component will have an impact on
the other components, thereby negating the assumption of independence. Or,
perhaps even more commonly, we often find that certain forces at work in the
economy simultaneously affect two or more components. Again, the
assumption of independence is violated.
1.4.3 The Multiplicative Model
The multiplicative model assumes that the four components interact with each
other and do not move independently.
It is expressed as follows: 𝑌𝑡 = 𝑇𝑡 x 𝑆𝑡 x 𝐶𝑡 x 𝐼𝑡
This model is often preferred for the reason that the components affect one
another.
1.4.3.1 Application of the Multiplicative Model
In order to break down the time series data and measure the effect of the
individual components, we proceed in four steps as follows:
1. Isolate the seasonal variation, S, and then deseasonalise the data:
$$\frac{T_t \times S_t \times C_t \times I_t}{S_t} = T_t \times C_t \times I_t$$
2. Compute the trend, T, then remove its influence:
$$\frac{T_t \times C_t \times I_t}{T_t} = C_t \times I_t$$
3. Identify the cyclical fluctuation, C, then remove its influence:
$$\frac{C_t \times I_t}{C_t} = I_t$$
4. Recognize that the residual (what remains) is the effect of unpredictable
irregular events, I.
Example: Values for bad debts at a commercial bank might be recorded as T =

$10 million, S = 1.7, C = 0.91 and I = 0.87. Bad debts could then be computed
as:
Y = (10) (1.7) (0.91) (0.87) = $13.46 million
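The same check can be made for the multiplicative model. In this sketch (variable names are illustrative) only the trend carries units; S, C and I are ratios around 1, so removing a component means dividing by it rather than subtracting it.

```python
# Values from the bad-debts example: trend in $ million, the rest are ratios.
trend = 10.0
seasonal, cyclical, irregular = 1.7, 0.91, 0.87

# Multiplicative model: Y = T x S x C x I
bad_debts = trend * seasonal * cyclical * irregular
print("Y =", round(bad_debts, 2), "million")

# Remove the components one at a time by division.
deseasonalized = bad_debts / seasonal   # T x C x I
detrended = deseasonalized / trend      # C x I
residual = detrended / cyclical         # I
print(round(deseasonalized, 3), round(detrended, 4), round(residual, 2))
```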
R TIME SERIES
1.5 CHAPTER EXERCISES
Exercise 1:
Plot a graph using the time series data in the table below:
Year Shares (%) Year Shares (%)

2004 31.3 2009 32.5
2005 31.1 2010 35.2
2006 29.3 2011 38.0
2007 28.9 2012 39.7
2008 29.5
What is the direction of the trend?
Exercise 2:
Plot a graph using the time series data in the table below:
Year Sales (Billion Shillings) Year Sales (Billion Shillings)

1997 3.6 2005 6.3
1998 4.2 2006 5.8
1999 5.5 2007 4.7
2000 6.2 2008 5.9
2001 5.6 2009 6.1
2002 4.3 2010 7.5
2003 6.0 2011 8.1
2004 7.2 2012 8.5
Draw the trend line.
Exercise 3:
The table below shows weekly fuel purchases for the Ministry of Finance
Headquarters:
Week Fuel (liters) Week Fuel (liters)

1 409 11 318
2 289 12 598
3 509 13 418
4 364 14 359
5 404 15 432
6 445 16 252
7 310 17 446
8 372 18 473
9 440 19 337
10 414 20 478
a. Plot the fuel purchases by the Ministry against time.

b. Do you expect a trend, a seasonal, and a cyclical variation or not?
SOLUTIONS TO EXERCISES
Chapter Two
Stationary (No-Trend) Time Series
2.0 Goals to the chapter
1. Understand the causes of stationarity in Time series

2. Understand and conduct various tests for stationarity in Time series
3. Conduct test of hypotheses for the various tests
4. Compute and interpret the test results
5. Derive conclusions from the tested hypotheses
2.1 Introduction
A time series is said to be stationary if it appears about the same on average
no matter when it is observed.
That is, $Y_t = \beta_0 + \varepsilon_t$, $t = 1, 2, 3, \ldots$
Where $Y_t$ is the actual value of the time series
$\beta_0$ is the average level of the time series
$\varepsilon_t$ is the random (irregular) component
Note: the $\varepsilon_t$ are assumed to be independent, with mean value equal to
zero.
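A short simulation may help to picture the no-trend model. The sketch below is an illustration under stated assumptions, not part of the course material: the level `beta0`, the noise standard deviation `sigma` and the normal distribution of the errors are all hypothetical choices. It generates a series from the model and compares the averages of its two halves, which should both sit close to the average level.

```python
import random
import statistics

random.seed(1)   # fixed seed so the sketch is reproducible

beta0 = 50.0     # assumed average level (hypothetical value)
sigma = 2.0      # assumed standard deviation of the errors (hypothetical)
n = 200

# Y_t = beta0 + e_t, with the e_t independent and with mean zero
series = [beta0 + random.gauss(0.0, sigma) for _ in range(n)]

# A stationary series looks about the same on average whenever it is observed.
first_half = statistics.mean(series[: n // 2])
second_half = statistics.mean(series[n // 2:])
print(round(first_half, 2), round(second_half, 2))
```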
2.2 Causes of Stationarity
There are numerous activities that cause time series to be stationary. These
include:
1. Stable environment – the forces that generate the time series would have
stabilized and the environment in the time series is relatively unchanged.
For example, the mature stage of the life cycle of a good or a service.
2. Easily correctable trend – which stabilization may be obtained by making
simple corrections for factors such as population growth and inflation.
For example, the Gross Domestic Product (GDP) per Capita.
3. Short forecasting horizon – where the time series may have a trend but
the period over which the forecasts are needed is relatively short so that
the amount due to trend is negligible.
4. Transformable series – whereby the series may be mathematically altered
into a stable one by taking logarithms, square roots or differences.
5. Analysis of residual series – where the series comprises residuals that
show a horizontal pattern.
6. Preliminary stages of model development – when a simple model may be
required for easy explanation.
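Cause 4 can be illustrated with a small sketch. The series below is hypothetical: it grows by 10% per period, so it has a clear trend, and so do its first differences. Taking logarithms first and then differencing gives a constant series, which is stationary.

```python
import math

# Hypothetical series that grows by 10% every period (not stationary).
series = [100 * 1.10 ** t for t in range(8)]

# First differences of the raw series still grow over time.
diffs = [b - a for a, b in zip(series, series[1:])]

# Differences of the logarithms are constant, equal to log(1.10).
log_diffs = [math.log(b) - math.log(a) for a, b in zip(series, series[1:])]

print([round(d, 2) for d in diffs])
print([round(d, 4) for d in log_diffs])
```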
2.3 Tests for Stationarity in Time Series
2.3.1 Decision to test for stationarity in time series
The decision to use a no-trend model depends on cost, availability of data and
the desired level of accuracy.
The general practices in use are the following:
1. Familiarization of the series – that is becoming knowledgeable about the

series you are attempting to predict. This helps to avoid serious mistakes
in model selection
2. Organization of the data into an early examination form
3. Graphical representation of the data that easily show that the series is
stationary
2.4 Nonparametric Tests
Definition:
Nonparametric tests are statistical procedures that can be used to test
hypotheses when no assumptions regarding parameters or population
distribution are possible.
Coverage:
Seven distribution-free tests will be considered in this chapter: the runs
test, the turning point test, the signs test, Daniel’s test, Pearson’s test,
Wilcoxon Signed-Rank Test, Wilcoxon Rank-Sum Test.
2.4.1 The Runs Test

Procedure:
The following procedure is followed in using the runs test:
1. Compute the median of the series
2. Assign a plus sign (+) to observations above the median and a minus
sign (-) to observations below
3. List the pluses +s and minuses –s in chronological order and count
the number of runs or blocks of pluses and minuses (R)
4. If n (the number of observations in the data) is odd, the median is itself
an observation, which is ignored, and we get (n-1)/2 pluses and (n-1)/2
minuses
5. Let m be the number of pluses. The test statistic R is the number of runs
in a sequence of m pluses and m minuses. In a random sequence, R has
mean $\mu_R = m + 1$ (the expected number of runs) and standard deviation
$$S_R = \sqrt{\frac{m(m-1)}{2m-1}}$$
6. Calculate the value of $Z_c$ as follows:
$$Z_c = \frac{R - \mu_R}{S_R}$$
7. Test the hypothesis
$H_0$: The series is stationary
$H_A$: The series is not stationary (has a trend)
Reject $H_0$ if $|Z_c|$ is greater than $Z_{\alpha/2}$
8. Conclusion
If 𝐻0 is rejected, conclude with (1 – α) x 100% confidence that a trend
is present in the series.
If 𝑍𝑐 is positive, the trend is upward, while if 𝑍𝑐 is negative, the trend
is downward.
If 𝐻0 is not rejected, conclude that the series is stationary.
Example:
Using the Runs test, test for stationarity in the time series below:
t     𝑌𝑡     Sign (+/-)   Run
1     2.0    -            1
2     4.3    -
3     2.4    -
4     4.5    +            2
5     2.8    -            3
6     4.1    -
7     5.6    +            4
8     4.8    +
9     3.6    -            5
10    2.4    -
11    5.5    +            6
12    5.8    +
13    3.3    -            7
14    5.2    +            8
15    4.1    -            9
16    4.9    +            10
17    2.9    -            11
18    5.6    +            12
19    5.8    +
20    6.2    +
Step 1: Arrange the 𝑌𝑡 values in ascending order and compute the median of the
series.
2.0  2.4  2.4  2.8  2.9  3.3  3.6  4.1  4.1  4.3
4.5  4.8  4.9  5.2  5.5  5.6  5.6  5.8  5.8  6.2
From 2.0 to 4.1 there are 9 values and from 4.8 to 6.2 there are 9 values. Since
the number of values is even the median is the average of the two values in the
middle. That is, (4.3 + 4.5)/2 = 8.8/2 = 4.4
Step 2: Assign a plus sign (+) to observations above the median and a minus
sign (-) to observations below
Step 3: List the pluses +s and minuses –s in chronological order and count the
number of runs or blocks of pluses and minuses (R). R =12 the number of
Runs
Step 4: Since the number of observations is even this step is not applicable
Step 5: Let m be the number of pluses. In a random sequence of m pluses and
m minuses, the number of runs R has mean $\mu_R = m + 1$, the expected number
of runs.
m = n/2 = 20/2 = 10
$\mu_R = m + 1 = 10 + 1 = 11$
The standard deviation of the runs is:
$$S_R = \sqrt{\frac{m(m-1)}{2m-1}} = \sqrt{\frac{10(10-1)}{20-1}} = 2.176$$
Step 6: Calculate the value of $Z_c$ as follows:
$$Z_c = \frac{R - \mu_R}{S_R} = \frac{12 - 11}{2.176} = 0.4595$$
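Step 7: Test the hypothesis
$H_0$: The series is stationary
$H_A$: The series is not stationary (has a trend)
Reject $H_0$ if $|Z_c|$ is greater than $Z_{\alpha/2}$
At 95% confidence, α is equal to 0.05 and $Z_{\alpha/2} = 1.96$
If $|Z_c| > 1.96$ reject $H_0$; otherwise do not reject $H_0$
Step 8: Conclusion
If $H_0$ is rejected, conclude with (1 – α) x 100% confidence that a trend
is present in the series
If $Z_c$ is positive, the trend is upward, while if $Z_c$ is negative, the trend
is downward
If $H_0$ is not rejected, conclude that the series is stationary
Since $Z_c = 0.4595$ is less than $Z_{\alpha/2} = 1.96$ we fail to reject $H_0$ and conclude at
95% confidence that the series is stationary or has no trend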
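The steps of the runs test can be sketched in Python as follows. The function name `runs_test` and its return layout are choices made for this illustration. The standard deviation formula assumes equal numbers of pluses and minuses, as in the procedure above, and the critical value 1.96 is the two-sided 5% value used in the example.

```python
import math
import statistics

def runs_test(series):
    """Runs test about the median: returns (R, m, mu_R, S_R, Z_c)."""
    median = statistics.median(series)
    # True for a plus, False for a minus; values equal to the median are ignored.
    signs = [y > median for y in series if y != median]
    # A new run starts every time the sign changes.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    m = sum(signs)                               # number of pluses
    mu_r = m + 1                                 # expected number of runs
    s_r = math.sqrt(m * (m - 1) / (2 * m - 1))   # assumes m pluses and m minuses
    z_c = (runs - mu_r) / s_r
    return runs, m, mu_r, s_r, z_c

# Series from the worked example.
y = [2.0, 4.3, 2.4, 4.5, 2.8, 4.1, 5.6, 4.8, 3.6, 2.4,
     5.5, 5.8, 3.3, 5.2, 4.1, 4.9, 2.9, 5.6, 5.8, 6.2]

runs, m, mu_r, s_r, z_c = runs_test(y)
print(runs, m, mu_r, round(s_r, 3), round(z_c, 4))
print("reject H0" if abs(z_c) > 1.96 else "fail to reject H0")
```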
2.4.2 The Turning Points Test
Definition
A turning point in a time series is a point where the series changes direction;
each such point represents either a local peak or a local trough in the series.
Equivalently, a turning point is a time period whose sign (of the first
difference) is different from that of the next period.
Steps for the turning point test
To determine a turning point the following steps are followed:
Step 1: Assign a plus (+) or a minus (-) to a period depending on whether its
first difference (𝑌𝑡 -𝑌𝑡−1 ) is positive or negative. A positive indicates that the
series went up in the period and a negative implies it went down
Step 2: Determine the number of turning points (U). This is the test statistic
that is equal to the turning points in a series of n observations
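Step 3: Determine the values $\mu_U$ and $\sigma_U$, calculated as follows:
$$\mu_U = \frac{2(n-2)}{3} \quad \text{and} \quad \sigma_U = \sqrt{\frac{16n-29}{90}}$$
Step 4: Calculate the value of $Z_c$ as follows:
$$Z_c = \frac{U - \mu_U}{\sigma_U}$$
Step 5: Test the hypothesis
$H_0$: The series is stationary
$H_A$: The series is not stationary (has a trend)
Reject $H_0$ if $|Z_c|$ is greater than $Z_{\alpha/2}$
Step 6: Conclusion
If $H_0$ is rejected, conclude with (1 – α) x 100% confidence that a trend
is present in the series.
If $Z_c$ is positive, the trend is upward, while if $Z_c$ is negative, the trend
is downward.
If $H_0$ is not rejected, conclude that the series is stationary.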
Example:
Using the turning point test, test for stationarity in the following time series:
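t     𝑌𝑡     Sign of 𝑌𝑡 - 𝑌𝑡−1   Turning point
1     1.6
2     3.1    +                  1
3     2.0    -                  2
4     3.5    +                  3
5     2.5    -                  4
6     3.2    +
7     4.4    +                  5
8     3.8    -
9     3.6    -
10    2.0    -                  6
11    4.3    +
12    4.7    +                  7
13    3.3    -                  8
14    3.8    +
15    3.9    +
16    4.2    +                  9
17    3.1    -                  10
18    4.7    +
19    5.3    +                  11
20    4.5    -                  12
21    4.9    +
Step 1: Assign a plus (+) or a minus (-) to a period depending on whether its
first difference (𝑌𝑡 - 𝑌𝑡−1) is positive or negative
Step 2: Determine the number of turning points (U). A period is a turning
point when its sign differs from that of the next period, so the last period,
which has no successor, cannot be one.
U = 12
Step 3: Determine the values $\mu_U$ and $\sigma_U$, calculated as follows:
$$\mu_U = \frac{2(n-2)}{3} = \frac{2(21-2)}{3} = \frac{38}{3} = 12.67$$
$$\sigma_U = \sqrt{\frac{16n-29}{90}} = \sqrt{\frac{16 \times 21 - 29}{90}} = \sqrt{\frac{307}{90}} = \sqrt{3.411} = 1.847$$
Step 4: Calculate the value of $Z_c$ as follows:
$$Z_c = \frac{U - \mu_U}{\sigma_U} = \frac{12 - 12.67}{1.847} = \frac{-0.67}{1.847} \approx -0.36$$
Step 5: Test the hypothesis
$H_0$: The series is stationary
$H_A$: The series is not stationary (has a trend)
At 95% confidence, α is equal to 0.05 and $Z_{\alpha/2} = 1.96$
If $|Z_c| > 1.96$ reject $H_0$; otherwise do not reject $H_0$
Step 6: Conclusion
Since $|Z_c| = 0.36$ is less than $Z_{\alpha/2} = 1.96$ we fail to reject $H_0$ and conclude at 95%
confidence that the series is stationary or has no trend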
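A sketch of the turning point test in Python is given below. The function name is illustrative. A turning point is counted wherever two consecutive first differences have opposite signs, which is the definition given at the start of this section; the series is the one from the worked example.

```python
import math

def turning_point_test(series):
    """Turning point test: returns (U, mu_U, sigma_U, Z_c)."""
    n = len(series)
    diffs = [b - a for a, b in zip(series, series[1:])]
    # A period is a turning point when its first difference and the next
    # one have opposite signs (a local peak or a local trough).
    u = sum(1 for d1, d2 in zip(diffs, diffs[1:]) if d1 * d2 < 0)
    mu_u = 2 * (n - 2) / 3
    sigma_u = math.sqrt((16 * n - 29) / 90)
    z_c = (u - mu_u) / sigma_u
    return u, mu_u, sigma_u, z_c

y = [1.6, 3.1, 2.0, 3.5, 2.5, 3.2, 4.4, 3.8, 3.6, 2.0, 4.3,
     4.7, 3.3, 3.8, 3.9, 4.2, 3.1, 4.7, 5.3, 4.5, 4.9]

u, mu_u, sigma_u, z_c = turning_point_test(y)
print(u, round(mu_u, 2), round(sigma_u, 3), round(z_c, 3))
print("reject H0" if abs(z_c) > 1.96 else "fail to reject H0")
```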
2.4.3 The Sign Test
Definition
When the signs of the first difference have been determined for the turning
point test, a sign test may be used. The sign test is based on the sign of a
difference between two related observations. We designate a plus sign for a
positive difference and a minus sign for a negative difference
Steps for the sign test
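Step 1: Determine the signs of the first differences, as for the turning point test
Step 2: Determine the test statistic, V, the number of positive first differences
in the series, and $n'$, the number of non-zero first differences
Step 3: Determine the values $\mu_V$ and $\sigma_V$, calculated as follows:
$$\mu_V = \frac{n'}{2} \quad \text{and} \quad \sigma_V = \sqrt{\frac{n'}{2}}$$
Step 4: Calculate the value of $Z_c$ as follows:
$$Z_c = \frac{V - \mu_V}{\sigma_V}$$
Step 5: Test the hypothesis
$H_0$: The series is stationary
$H_A$: The series is not stationary (has a trend)
Reject $H_0$ if $|Z_c|$ is greater than $Z_{\alpha/2}$
Step 6: Conclusion
If $H_0$ is rejected, conclude that a trend is present in the series; otherwise
conclude that the series is stationary.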
Example:
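Test for stationarity in the time series below using the sign test at 95%
confidence level:
t     𝑌𝑡     Sign of 𝑌𝑡 - 𝑌𝑡−1   V     n'
1     1.4
2     3.0    +                  1     1
3     1.9    -                        2
4     3.1    +                  2     3
5     2.1    -                        4
6     2.5    +                  3     5
7     4.1    +                  4     6
8     3.6    -                        7
9     2.9    -                        8
10    1.9    -                        9
11    4.0    +                  5     10
12    4.2    +                  6     11
13    2.7    -                        12
14    3.4    +                  7     13
15    3.0    -                        14
16    3.5    +                  8     15
17    2.7    -                        16
18    4.1    +                  9     17
19    4.3    +                  10    18
The V column is a running count of the positive first differences and the n'
column is a running count of the non-zero first differences.
Step 1: Determine the signs of the first differences, as for the turning point test.
Step 2: Determine the test statistic, V, the number of positive first differences
in the series. V = 10
$n'$, the number of non-zero first differences = 18
Step 3: Determine the values $\mu_V$ and $\sigma_V$, calculated as follows:
$$\mu_V = \frac{n'}{2} = \frac{18}{2} = 9 \quad \text{and} \quad \sigma_V = \sqrt{\frac{n'}{2}} = \sqrt{\frac{18}{2}} = \sqrt{9} = 3$$
Step 4: Calculate the value of $Z_c$ as follows:
$$Z_c = \frac{V - \mu_V}{\sigma_V} = \frac{10 - 9}{3} = 0.333$$
Step 5: Test the hypothesis
$H_0$: The series is stationary
$H_A$: The series is not stationary (has a trend)
At 95% confidence, α is equal to 0.05 and $Z_{\alpha/2} = 1.96$
If $|Z_c| > 1.96$ reject $H_0$; otherwise do not reject $H_0$
Step 6: Conclusion
Since $Z_c = 0.333$ is less than $Z_{\alpha/2} = 1.96$ we fail to reject $H_0$ and conclude at 95%
confidence that the series is stationary or has no trend.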
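The sign test can be sketched in the same way. The function name is illustrative, and the mean and standard deviation are the formulas given in Step 3 of these notes. V counts every positive first difference and n' counts every non-zero first difference, exactly as Step 2 defines them.

```python
import math

def sign_test(series):
    """Sign test on first differences: returns (V, n_prime, mu_V, sigma_V, Z_c)."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    v = sum(1 for d in diffs if d > 0)          # positive first differences
    n_prime = sum(1 for d in diffs if d != 0)   # non-zero first differences
    mu_v = n_prime / 2
    sigma_v = math.sqrt(n_prime / 2)            # formula as given in Step 3
    z_c = (v - mu_v) / sigma_v
    return v, n_prime, mu_v, sigma_v, z_c

y = [1.4, 3.0, 1.9, 3.1, 2.1, 2.5, 4.1, 3.6, 2.9, 1.9,
     4.0, 4.2, 2.7, 3.4, 3.0, 3.5, 2.7, 4.1, 4.3]

v, n_prime, mu_v, sigma_v, z_c = sign_test(y)
print(v, n_prime, mu_v, sigma_v, round(z_c, 3))
print("reject H0" if abs(z_c) > 1.96 else "fail to reject H0")
```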
2.4.4 Daniel’s Test
Definition
Daniel’s test is based on the Spearman’s rank correlation coefficient.
Steps for Daniel’s test
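Step 1: State the Null and the Alternative Hypotheses
$H_0$: The series is stationary
$H_A$: The series is not stationary (has a trend)
Step 2: Select a Level of Significance; Choose a level, say 0.1, 0.05 etc.
Step 3: Decide on the test statistic
$$r = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$
Where $d_i$ is the difference between the rankings for each observation (the
rank of the time period minus the rank of the value), and n is the number of
observations.
$$\mu_r = 0 \quad \text{and} \quad \sigma_r = \frac{1}{\sqrt{n-1}}$$
$$Z_c = \frac{r - \mu_r}{\sigma_r}$$
Step 4: Formulate a decision Rule; Reject $H_0$ if $|Z_c|$ is greater than $Z_{\alpha/2}$
Step 5: Conclusion
If $H_0$ is rejected, conclude that the series is not stationary, that is, has a
trend; otherwise conclude that the series is stationary.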
Example
Using the data in the table below, test for stationarity using Daniel’s test:
t     𝑌𝑡      𝑅𝑡    𝑅𝑌     d       d²
1     19.1    1     1      0       0
2     40.5    2     3.5    -1.5    2.25
3     40.5    3     3.5    -0.5    0.25
4     62.3    4     5      -1      1
5     37.6    5     2      3       9
6     84.3    6     7      -1      1
7     123.9   7     9      -2      4
8     74.5    8     6      2       4
9     200.5   9     11     -2      4
10    177.4   10    10     0       0
11    114.6   11    8      3       9
Total                              34.5
Here 𝑅𝑡 is the rank of the time period, 𝑅𝑌 is the rank of the value (tied values
share the average of their ranks) and d = 𝑅𝑡 - 𝑅𝑌.
$$r = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} = 1 - \frac{6 \times 34.5}{11(11^2 - 1)} = 1 - \frac{207}{11 \times 120} = 1 - \frac{207}{1320} = 1 - 0.157 = 0.843$$
$$\mu_r = 0 \quad \text{and} \quad \sigma_r = \frac{1}{\sqrt{n-1}} = \frac{1}{\sqrt{11-1}} = \frac{1}{\sqrt{10}} = \frac{1}{3.162} = 0.316$$
$$Z_c = \frac{r - \mu_r}{\sigma_r} = \frac{0.843 - 0}{0.316} = 2.668$$
Step 4: Formulate a decision Rule; Reject $H_0$ if $|Z_c|$ is greater than $Z_{\alpha/2}$
Step 5: Conclusion
Since $|Z_c| = 2.668$ is greater than $Z_{\alpha/2} = 1.96$, reject $H_0$ and conclude that the time
series is not stationary, that is, has a trend
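Daniel's test can be sketched in Python without any external library. The helper `average_ranks` and the function `daniels_test` are names chosen for this illustration; tied values receive the average of the ranks they occupy, as in the table above. Because the code does not round r and the standard deviation along the way, its value of Z may differ from the hand calculation in the third decimal place.

```python
import math

def average_ranks(values):
    """Rank from smallest to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1          # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def daniels_test(series):
    """Daniel's test: returns (sum of d squared, r, sigma_r, Z_c)."""
    n = len(series)
    time_ranks = range(1, n + 1)
    value_ranks = average_ranks(series)
    sum_d2 = sum((rt - ry) ** 2 for rt, ry in zip(time_ranks, value_ranks))
    r = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
    sigma_r = 1 / math.sqrt(n - 1)
    z_c = (r - 0) / sigma_r                # mu_r = 0
    return sum_d2, r, sigma_r, z_c

y = [19.1, 40.5, 40.5, 62.3, 37.6, 84.3, 123.9, 74.5, 200.5, 177.4, 114.6]

sum_d2, r, sigma_r, z_c = daniels_test(y)
print(sum_d2, round(r, 3), round(sigma_r, 3), round(z_c, 3))
print("reject H0" if abs(z_c) > 1.96 else "fail to reject H0")
```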
2.4.5 Pearson’s Test
Definition
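Pearson’s test is based on the product-moment correlation coefficient between
the series and time.
Steps for the Pearson’s test for stationary time series
Step 1: State the Null and the Alternative Hypotheses
Step 2: Select a Level of Significance
Step 3: Decide on the test statistic
$$r = \frac{S_{tY}}{\sqrt{S_{tt} S_{YY}}}$$
Where
$$S_{tt} = \sum (t - \bar{t})^2 = \sum t^2 - \frac{(\sum t)^2}{n}$$
$$S_{YY} = \sum (Y - \bar{Y})^2 = \sum Y^2 - \frac{(\sum Y)^2}{n}$$
$$S_{tY} = \sum (t - \bar{t})(Y - \bar{Y}) = \sum tY - \frac{\sum t \sum Y}{n}$$
$$t_c = \frac{r \sqrt{n-2}}{\sqrt{1 - r^2}}$$
Step 4: Formulate a decision Rule
Reject $H_0$ if $|t_c|$ is greater than $t_{\alpha/2}$ with (n-2) degrees of freedom.
Step 5: Conclusion
If $H_0$ is rejected, conclude that the series is not stationary, that is, has a
trend; otherwise conclude that the series is stationary.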
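The notes give no worked example for Pearson's test, so the sketch below is an illustration only. It reuses the series from the Daniel's test example, takes t = 1, 2, ..., n as the time index, and assumes a 5% two-sided test, for which the critical value of Student's t with 9 degrees of freedom is 2.262. The function name is hypothetical.

```python
import math

def pearson_trend_test(series):
    """Pearson's test against time: returns (r, t_c)."""
    n = len(series)
    t = range(1, n + 1)
    sum_t, sum_y = sum(t), sum(series)
    s_tt = sum(ti ** 2 for ti in t) - sum_t ** 2 / n
    s_yy = sum(y ** 2 for y in series) - sum_y ** 2 / n
    s_ty = sum(ti * y for ti, y in zip(t, series)) - sum_t * sum_y / n
    r = s_ty / math.sqrt(s_tt * s_yy)
    t_c = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    return r, t_c

# Series borrowed from the Daniel's test example (n = 11, so 9 degrees of freedom).
y = [19.1, 40.5, 40.5, 62.3, 37.6, 84.3, 123.9, 74.5, 200.5, 177.4, 114.6]
t_critical = 2.262   # t value for alpha/2 = 0.025 with 9 degrees of freedom

r, t_c = pearson_trend_test(y)
print(round(r, 3), round(t_c, 2))
print("reject H0" if abs(t_c) > t_critical else "fail to reject H0")
```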

2.4.6 Wilcoxon Signed-Rank Test
Definition
In 1945, Frank Wilcoxon developed a nonparametric test, based on the
differences in dependent samples, where the normality assumption is not
required.
Example:
Thomas’s is a family restaurant in town, offering a full dinner menu, but their
specialty is chicken. Recently, Jones Thomas, the owner and founder,
developed a new spicy flavor for the batter in which the chicken is cooked.
Before replacing the current flavor, he wants to conduct some tests to be sure
that patrons will like the spicy flavor better
To begin with, Jones selects a random sample of 15 people. He is careful that

this sample is representative of his customers. Each member of the sample is
given a small sample of the current chicken and asked to rate its overall taste
on a scale of 1 to 20. A value near 20 indicates the participant liked the flavor,
whereas a score near 0 indicates they did not like the flavor. Next, the same 15
participants are given a sample of the new chicken with the spicier flavor and
again asked to rate its taste on a scale of 1 to 20. The results are reported
below. Is it reasonable to conclude that the spicy flavor is preferred? Use the
0.05 significance level
Participant   Spicy Flavor Score   Current Flavor Score
Annette       14                   12
Susan         8                    16
George        6                    2
William       18                   4
Jonah         20                   12
Jacob         16                   16
Liz           14                   5
Garrett       6                    16
John          19                   10
Joseph        18                   10
Peter         16                   13
Paul          18                   2
Godwin        4                    13
James         7                    14
Deon          16                   4
Solution:
The samples are dependent or related. That is, the participants are asked to
rate both flavors of chicken. Thus, if we compute the difference between the
rating for the spicy flavor and the current flavor, the resulting value shows the
amount the participants favor one flavor over the other. If we choose to
subtract the current flavor score from the spicy flavor score, a positive result is
the “amount” the participant favors the spicy flavor. Negative difference scores
indicate the participant favored the current flavor. Because of the somewhat
subjective nature of the scores, we are not sure the distribution of the
differences follows the normal distribution. We decide to use the nonparametric
Wilcoxon signed-rank test.
As usual, we will use the five-step hypothesis testing procedure. The null
hypothesis is that there is no difference in the rating of the chicken flavors by
the participants. That is, as many participants in the study rated the spicy
flavor higher as rated the regular flavor higher. The alternative hypothesis is
that the ratings are higher for the spicy flavor. That is,
𝐻𝑂 : There is no difference in the ratings of the two flavors
𝐻𝐴 : The spicy ratings are higher
This is a one-tailed test. Why? Because Jones, the owner of Thomas’s, will want
to change his chicken flavor only if the sample participants show that the
population of customers like the new flavor better.
The significance level is 0.05.
Steps to conduct the Wilcoxon signed-rank test are as follows:
Step 1: Compute the difference between the spicy flavor score and the current
flavor score for each participant. These differences are shown in column 4 of
the table below:
(1)           (2)      (3)       (4)          (5)          (6)    (7) Signed Rank
Participant   Spicy    Current   Difference   Absolute     Rank   𝑅+      𝑅−
              Flavor   Flavor    in Score     difference
              Score    Score
Annette       14       12        2            2            1      1
Susan         8        16        -8           8            6              6
George        6        2         4            4            3      3
William       18       4         14           14           13     13
Jonah         20       12        8            8            6      6
Jacob         16       16        *            *            *      *       *
Liz           14       5         9            9            9      9
Garrett       6        16        -10          10           11             11
John          19       10        9            9            9      9
Joseph        18       10        8            8            6      6
Peter         16       13        3            3            2      2
Paul          18       2         16           16           14     14
Godwin        4        13        -9           9            9              9
James         7        14        -7           7            4              4
Deon          16       4         12           12           12     12
Total                                                             75      30
Step 2: Only the positive and negative differences are considered further. That
is, if the difference in flavor scores is 0, that participant is dropped from the
analysis and the number in the sample reduced. Hence Jacob is dropped from
the study and the sample size reduced from 15 to 14.
Step 3: Determine the absolute differences for the values computed in column
4. Recall that in an absolute difference we ignore the sign of the difference.
Step 4: Rank the absolute differences from smallest to largest. There are three
participants who rated the difference in flavors as 8. To resolve the problem, we
average the rankings involved and report the average rank for each. This
situation involves the ranks 5, 6, and 7, so all three participants are assigned
the rank of 6. The same situation occurs for those participants with a
difference of 9. The ranks involved are 8, 9, and 10, so three participants are
assigned a rank of 9.
Step 5: Each assigned rank in column 6 is then given the same sign as the
original difference, and the results are reported in column 7.
Step 6: The 𝑅 + and 𝑅 − columns are totaled.
Step 7: Obtain the critical values for the Wilcoxon signed-rank test from table.
An extract of the table is below:
2α     0.15    0.10    0.05    0.04    0.03    0.02    0.01
α      0.075   0.050   0.025   0.020   0.015   0.010   0.005
n
4      0
5      1       0
6      2       2       0       0
7      4       3       2       1       0       0
8      7       5       3       3       2       1       0
9      9       8       5       5       4       3       1
10     12      10      8       7       6       5       3
11     16      13      10      9       8       7       5
12     19      17      13      12      11      9       7
13     24      21      17      16      14      12      9
14     28      25      21      19      18      15      12
15     33      30      25      23      21      19      15
16     39      35      29      28      26      23      19
17     45      41      34      33      30      27      23
The 𝛼 row is used for one-tailed tests and the 2 𝛼 row for two-tailed tests. In
this case we want to show that customers like the spicy taste better, which is a
one-tailed test, so we select the 𝛼 row. We choose the 0.05 significance level, so
move to the right to the column headed 0.050. Go down that column to the row
where n is 14. The value at the intersection is 25, so the critical value is 25.
The decision rule is to reject the null hypothesis if the smaller of the rank sums
is 25 or less. This is the largest value in the rejection region.
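The table lookup and decision rule can be sketched in Python. This is only an illustration: the dictionary holds three rows copied from the extract above, the function names are hypothetical, and the smaller rank sum of 30 is the value reported in Step 8.

```python
# One-tailed significance levels, in the column order of the table extract.
ALPHAS = [0.075, 0.050, 0.025, 0.020, 0.015, 0.010, 0.005]

# Rows for n = 13, 14, 15 copied from the extract (other rows omitted here).
CRITICAL = {
    13: [24, 21, 17, 16, 14, 12, 9],
    14: [28, 25, 21, 19, 18, 15, 12],
    15: [33, 30, 25, 23, 21, 19, 15],
}

def critical_value(n, alpha):
    """Look up the one-tailed critical value for n usable pairs."""
    return CRITICAL[n][ALPHAS.index(alpha)]

def reject_null(smaller_rank_sum, n, alpha):
    """Reject H0 when the smaller rank sum is at or below the critical value."""
    return smaller_rank_sum <= critical_value(n, alpha)

crit = critical_value(14, 0.050)        # row n = 14, column alpha = 0.050
decision = reject_null(30, 14, 0.050)   # 30 is the smaller rank sum in Step 8
print(crit, decision)
```

With n = 14 and α = 0.050 the lookup returns 25, and a smaller rank sum of 30 does not fall in the rejection region.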
Step 8: Compare the smaller rank sum from column 7 with the critical value and
decide whether to reject 𝐻0.
In this case the smaller rank sum is 30, which is greater than the critical
value of 25, so the decision is not to reject the null hypothesis.
Step 9: State the conclusion
We cannot conclude that there is a difference in the flavor ratings between the
current and the spicy flavors.
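Steps 2 to 6 of the procedure (dropping zero differences, ranking absolute differences with averaged ties, and totaling the signed ranks) can be sketched as follows. The scores are hypothetical and are not the flavor study data; the function names are illustrative only.

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + 1 + j + 1) / 2  # average of rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def signed_rank_sums(first, second):
    """Return (R_plus, R_minus, n) for paired scores, dropping zero differences."""
    diffs = [b - a for a, b in zip(first, second) if b != a]
    ranks = average_ranks([abs(d) for d in diffs])
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return r_plus, r_minus, len(diffs)

# Hypothetical paired scores (assumption: not taken from the notes).
first = [10, 12, 9, 14, 11, 13]
second = [12, 12, 8, 17, 13, 10]
r_plus, r_minus, n = signed_rank_sums(first, second)
print(r_plus, r_minus, n)
```

A useful check is that R+ and R− together always add up to n(n + 1)/2, where n is the number of pairs left after the zero differences are dropped.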
2.4.7 Wilcoxon Rank-Sum Test
One test specifically designed to determine whether two independent samples
came from equal populations is the Wilcoxon rank-sum test.
Definition
The Wilcoxon rank-sum test is based on the average of ranks. The data are
ranked as if the observations were part of a single sample. If the null
hypothesis is true, then the ranks will be about evenly distributed between the
two samples, and the average of the ranks for the two samples will be about
the same. That is, the low, medium, and high ranks should be about equally
divided between the two samples. If the alternative hypothesis is true, one of
the samples will have more of the lower ranks and, thus, a smaller rank total.
If each of the samples contains at least eight observations, the distribution of
W is approximately normal, so the standard normal statistic Z is used as the
test statistic. The formula is:
$$Z = \frac{W - \dfrac{n_1(n_1 + n_2 + 1)}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$$
where $n_1$ is the number of observations from the first population,
$n_2$ is the number of observations from the second population, and
$W$ is the sum of ranks from the first population.
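As a minimal sketch, the formula can be written as a small Python function. The function name and the input values are hypothetical, chosen only to exercise the formula, and no correction for tied ranks is applied because the formula above does not include one.

```python
import math

def rank_sum_z(w, n1, n2):
    """Normal approximation for the Wilcoxon rank-sum statistic W."""
    mean_w = n1 * (n1 + n2 + 1) / 2                  # expected value of W under H0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # standard deviation of W under H0
    return (w - mean_w) / sd_w

# Hypothetical inputs (assumption: not from the notes).
z = rank_sum_z(w=130, n1=10, n2=10)
print(round(z, 2))
```

When W equals its expected value n1(n1 + n2 + 1)/2 the function returns 0, which matches the idea that the ranks are evenly spread between the two samples under the null hypothesis.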
Example:
Mr. Dan Thompson, the president of Ethiopian Airlines, recently noted an
increase in the number of no-shows for flights out of Addis Ababa. He is
particularly interested in determining whether there are more no-shows for
flights that originate from Nairobi compared with flights leaving Entebbe. A
sample of nine flights from Nairobi and eight from Entebbe are reported in the
table below:
Nairobi   Entebbe
11        13
15        14
10        10
18         8
11        16
20         9
24        17
22        21
25
At the 0.05 significance level, can we conclude that there are more no-shows
for flights originating from Nairobi?
Solution:
If the number of no-shows is the same for Nairobi and Entebbe, then we expect
the mean ranks of the two samples to be about the same. If the number of
no-shows is not the same, we expect the two mean ranks to be quite different.
Mr. Thompson believes there are more no-shows for Nairobi flights. Thus, a
one-tailed test is appropriate, with the rejection region located in the upper tail.
Steps for the Wilcoxon Rank-Sum Test
Step 1: State the null and alternate hypotheses
𝐻0 : The distribution of no-shows is the same for Nairobi and Entebbe

𝐻𝐴 : The distribution of no-shows is larger for Nairobi than for Entebbe
Step 2: Determine the critical region
The test statistic follows the standard normal distribution. At the 0.05
significance level, we find from the standard normal tables, the critical value of
Z is 1.65. The null hypothesis is rejected if the computed value of Z is greater
than 1.65
Step 3: Calculate the value of W as shown in the table below:
      Nairobi               Entebbe
No-Shows    Rank      No-Shows    Rank
   11        5.5         13         7
   15        9           14         8
   10        3.5         10         3.5
   18       12            8         1
   11        5.5         16        10
   20       13            9         2
   24       16           17        11
   22       15           21        14
   25       17
Rank sum    96.5      Rank sum     56.5
The value of W is calculated for the Nairobi group and is found to be 96.5
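The ranking in the table can be reproduced with a short Python sketch. The data are the no-show counts given in the example; the helper name average_rank is illustrative. The two samples are pooled, and each value receives the average of the rank positions occupied by values equal to it.

```python
nairobi = [11, 15, 10, 18, 11, 20, 24, 22, 25]
entebbe = [13, 14, 10, 8, 16, 9, 17, 21]

# Rank the observations as if they were part of a single sample.
pooled = nairobi + entebbe

def average_rank(v, data):
    """Rank of v in data, with tied values sharing their average rank."""
    less = sum(1 for x in data if x < v)
    equal = sum(1 for x in data if x == v)
    return less + (equal + 1) / 2

nairobi_ranks = [average_rank(v, pooled) for v in nairobi]
entebbe_ranks = [average_rank(v, pooled) for v in entebbe]

w_nairobi = sum(nairobi_ranks)
w_entebbe = sum(entebbe_ranks)
print(w_nairobi, w_entebbe)
```

The two rank sums add up to 17(18)/2 = 153, the total of the ranks 1 to 17, which is a quick check that no rank has been lost or duplicated.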
Step 4: Compute the Z value

$$Z = \frac{W - \dfrac{n_1(n_1 + n_2 + 1)}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}} = \frac{96.5 - \dfrac{9(9 + 8 + 1)}{2}}{\sqrt{\dfrac{9(8)(9 + 8 + 1)}{12}}} = 1.49$$
Step 5: Make decision on the Null hypothesis
Because the computed z value (1.49) is less than 1.65, the null hypothesis is
not rejected.
Step 6: Make a conclusion
The evidence does not show that there are more no-shows for Nairobi flights
than for Entebbe flights. That is, the data are consistent with the number of
no-shows being the same in Nairobi as in Entebbe.
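The Z value and the decision in Steps 4 and 5 can be checked with Python's standard library. This is a sketch under the same normal approximation used above; the p-value is an extra quantity that the notes do not report, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

n1, n2, w = 9, 8, 96.5   # sample sizes and Nairobi rank sum from the example
alpha = 0.05

mean_w = n1 * (n1 + n2 + 1) / 2                  # 81.0
sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # square root of 108
z = (w - mean_w) / sd_w

# Upper-tail critical value; printed tables usually round it to 1.65.
z_crit = NormalDist().inv_cdf(1 - alpha)

# One-tailed p-value for the upper tail (not given in the notes).
p_value = 1 - NormalDist().cdf(z)

reject = z > z_crit
print(round(z, 2), round(z_crit, 3), reject)
```

The computed Z rounds to 1.49, which is below the critical value, so the sketch agrees with the decision not to reject the null hypothesis.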
2.5 CHAPTER EXERCISES
Exercise 1
Test for stationarity in the time series using a runs test on the following data:
1.7 6.1 3.8 2.4 2.1 3.8 4.9 3.2 2.8
1.6 3.4 3.0 2.9 1.7 3.1 2.8 3.0 1.9
2.4 2.7 1.6 1.9 3.3 2.6 2.2 2.6
Exercise 2
Use the turning point test to test for stationarity in the data in Exercise 1.
Exercise 3
Use the sign test to test for stationarity in the data in Exercise 1.
Exercise 4
Given the following performances in two subjects, STA2101 and STA2106,
determine Spearman's rank correlation coefficient and use it to test for
stationarity in the time series.
STA2101 STA2106 STA2101 STA2106
40 39 28 96
43 45 80 47
47 60 44 81
42 54 59 94
56 77 64 96
70 80 68 81
85 81 74 44
52 49 55 50
Exercise 5
Use Pearson's test on the data below to test for stationarity in the time
series:
t Y_t t² Y_t² tY_t
1 1.82 1 3.31 1.82
2 2.6 4 6.76 5.2
3 1.7 9 2.89 5.1
4 2.8 16 7.84 11.2
5 3.4 25 11.56 17.0
6 4.3 36 18.49 25.8
7 3.1 49 9.61 21.7
8 4.5 64 20.25 36.0
9 5.0 81 25.0 45.0
10 5.7 100 32.49 57.0
11 4.12 121 16.97 45.32
12 3.6 144 12.96 43.2
13 6.3 169 39.69 81.9
14 7.0 196 49.0 98.0
Exercise 6
A record of the production for each machine operator was kept over a period of
time. Certain changes in the production procedure were suggested, and 11
operators were picked as an experimental test group to determine whether the
new procedures were worthwhile. Their production rates before and after the
new procedures were established and recorded as follows:
Operator Production Before Production After

S.M. 17 18
D.J. 21 23
M.D. 25 22
B.B. 15 25
M.F. 10 28
A.A. 16 16
U.Z. 10 22
Y.U. 20 19
U.T. 17 20
Y.H. 24 30
Y.Y. 23 26
a. How many usable pairs are there? That is, what is n?
b. Using the Wilcoxon signed-rank test, determine whether the new procedures
actually increased production. Use the 0.05 level and a one-tailed test.
Exercise 7
One of the major car manufacturers is studying the effect of regular versus
super gasoline in its economy cars. Ten executives are selected and asked to
maintain records on the number of kilometers per liter of gas. The results are:
Kilometers per Liter

Executive Regular Super
B.W. 25 28
D.M. 33 31
G.S. 31 35
D.T. 45 44
K.L. 42 47
R.U. 38 40
G.O. 29 29
B.N. 42 37
S.W 41 44
L.W. 30 44
At the 0.05 significance level, is there a difference in the number of kilometers
per liter between regular and super gasoline?
Exercise 8
It has been suggested that daily production of an assembly line for batteries at
Uganda Batteries Limited would be increased if better portable lighting were
installed and background music and free coffee and doughnuts were provided
during the day. Management agreed to try the scheme for a limited time. The
numbers of batteries produced per week by a small test group of employees are
as follows:
Employee  Past Production  Current Production  Employee  Past Production  Current Production
J.D. 23 33 J.B. 30 29
S.B. 26 26 W.W 21 25
M.D. 24 30 O.P. 25 22
R.C. 17 25 C.D. 21 23
M.F. 20 19 P.A. 16 17
U.H. 24 22 R.T. 20 15
A.T. 17 9 O.O 23 30
Using the Wilcoxon signed-rank test, determine whether the suggested changes
are worthwhile.
a. State the null hypothesis
b. Decide on the alternative hypothesis
c. Decide on the level of significance
d. State the decision rule
e. Compute T and arrive at a decision
Exercise 9
The following observations were selected from populations that were not
necessarily normally distributed. Use the 0.05 significance level, a two-tailed
test, and the Wilcoxon rank-sum test to determine whether there is a
difference between the two populations:
Population A 38 45 56 57 61 69 70 79
Population B 26 31 35 42 51 52 57 62
Exercise 10
One group was taught an assembly procedure using a standard sequence of
steps and another group was taught a new experimental technique. The
time to complete the assembly, in seconds, for a sample of workers is shown
below:
Current Method: 41 36 42 39 36 48 49 38
Experimental: 21 27 36 20 19 21 39 24 22
At the 0.05 significance level, can we conclude the experimental method is
faster? Assume that the distribution of assembly times is not normal.
Exercise 11
Six truck models are rated on a scale of 1 to 10 by two companies that
purchase entire fleets of trucks for industrial use. Calculate the Spearman
rank coefficient to determine at the 1% level whether the rankings are
independent.
Model   Rating by First Company   Rating by Second Company
1       8                         9
2       7                         6
3       5                         8
4       7                         5
5       3                         7
6       2                         8
Exercise 12
Four methods of treating steel rods are analyzed to determine whether there
is any difference in the pressure the rods can bear before breaking.
The results of the tests measuring the pressure in pounds before the rods
bent are shown. Conduct the test, complete with the hypotheses, decision
rule, and conclusion. Set α = 1 percent.
Method 1 Method 2 Method 3 Method 4

50 10 72 54
62 12 63 59
73 10 73 64
48 14 82 82
63 10 79 79
Exercise 13
The quality control manager for a large plant in Kampala Industrial Area gives
two operations manuals to two groups of employees. Each group is then tested
on operational procedures. The scores are shown in the table below. The
manager has always felt that manual 1 provides a better base of knowledge for
new employees. Compute the mean test scores of the employees and report
your conclusion. State the hypotheses. Set 𝛂 = 0.05.
Manual 1 87 97 82 97 92 90 81 89 90 88 87 89 93
Manual 2 92 79 80 73 84 93 86 88 91 82 81 84 72 74
Exercise 14
At the 10 percent level, is there a relationship between study time in hours and
grades on a test, according to the data in the table below?
Time 21 18 15 17 18 25 18 4 6 5
Grade 67 58 59 54 58 80 14 15 19 21