
Statistics 5550: Homework #5

Due Monday, March 2, 2020

The data in Figure 1 are the monthly average retail price of electricity in Arizona from
January 2001 through April 2014 (the data are available on Carmen). We will consider the
additive model $x_t = m_t + s_t + y_t$, where $m_t$ is a global trend, $s_t$ is a seasonal component, and
$y_t$ is a mean-zero stationary time series. In this question we will model the global trend as
a polynomial function of time:

$$m_t = \sum_{j=0}^{q} \beta_j t^j,$$

for specific values of q, and we will consider two different approaches for modeling the seasonal
component. Throughout the problem, assume that time is labeled as:
$$t = 2001,\ 2001 + \tfrac{1}{12},\ 2001 + \tfrac{2}{12},\ \ldots,\ 2014 + \tfrac{3}{12}.$$
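
As a quick check of this labeling, a monthly `ts` object that starts in January 2001 reports the same time values through `time()`. The sketch below uses placeholder zeros rather than the actual data, and the object names `x` and `tt` are hypothetical.

```r
# Placeholder series (zeros, NOT the Arizona data):
# 160 months, January 2001 through April 2014
x  <- ts(rep(0, 160), start = c(2001, 1), frequency = 12)
tt <- as.numeric(time(x))

head(tt, 3)   # first three time labels
tail(tt, 1)   # last time label
```

If the first labels are 2001, 2001 + 1/12, 2001 + 2/12 and the last is 2014 + 3/12, the time variable matches the definition above.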

[Figure: time series plot titled "Arizona Average Retail Price of Electricity"; vertical axis "cents per kWh" (about 7 to 12), horizontal axis "Time" (2002 to 2014).]

Figure 1: Arizona electricity price data.

1. For this part of the problem we will use a harmonic regression model to describe the
seasonal component:
$$s_t = \alpha_1 \sin(2\pi t/d) + \alpha_2 \cos(2\pi t/d),$$
where $d$ is the period of the seasonal component, so that the overall model is:
$$x_t = \sum_{j=0}^{q} \beta_j t^j + \alpha_1 \sin(2\pi t/d) + \alpha_2 \cos(2\pi t/d) + y_t$$
for a given value of q. Fit the model described above using the method of least squares
for q = 1 and q = 2. Report the parameter estimates for each case:

q | β̂0 | β̂1 | β̂2 | α̂1 | α̂2
1 |     |     | X   |     |
2 |     |     |     |     |

Make sure your parameter estimates correspond to time as defined above.
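
A minimal sketch of a harmonic regression fit by least squares, for the case q = 1. It runs on simulated data, not the Arizona series, and all variable names are hypothetical. The choice `d <- 1` is an assumption made for the simulation (a one-year period when time is measured in years); it is not stated in the problem.

```r
set.seed(1)

# Simulated monthly series (NOT the Arizona data); all names are hypothetical
tt <- 2001 + (0:159) / 12        # time labeled as in the problem statement
d  <- 1                          # ASSUMED period: one year, since t is in years
s1 <- sin(2 * pi * tt / d)       # sine regressor
c1 <- cos(2 * pi * tt / d)       # cosine regressor
y  <- 5 + 0.3 * (tt - 2001) + 0.8 * s1 - 0.5 * c1 + rnorm(160, sd = 0.1)

# Least squares fit for q = 1: intercept, linear trend, one harmonic pair
fit1 <- lm(y ~ tt + s1 + c1)
est  <- coef(fit1)
round(est, 3)
```

Because `tt` is on the original time scale, the reported intercept and slope correspond to time as labeled above. Under the same assumptions, a q = 2 fit would add a quadratic term such as `I(tt^2)` to the formula.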

2. What is the estimated difference in average retail price of electricity between April and
May for any given year under the models in question 1?

3. After you have fit both models, use the R code below to make the plots shown in
Figure 2. You will need to fill in the ??? symbols with appropriate R code based on
how you fit the model. Do this for both models (q = 1, 2); the example in Figure 2 is
for q = 1. Check that your axes are scaled correctly!

az <- ts(read.csv("arizona_electricty.csv", header=F,
                  col.names="az"), start=c(2001,1), end=c(2014,4), freq=12)

# To save the plots to a pdf file, uncomment the next line
# (and uncomment the dev.off() command at the end) and
# then execute all of the code
# pdf("fittedModel.pdf", height=8.25, width=6.375)

layout(matrix(c(1,1,2,3,4,4,5,6), ncol=2, nrow=4, byrow=T))
par(mar=c(5,4,2,1))

plot(az, ylab="cents per kWh",
     main="Arizona Average Retail Price of Electricity", type="o")
lines(as.numeric(time(az)), ???, col="pink2", lwd=2)

plot(time(az), ???, type="l", col="pink2", lwd=2, xlab="Time",
     ylab="", main="Estimated Trend")

plot(time(az), ???, type="l", col="pink2", lwd=2, ylim=c(-2,2),
     xlab="Time", ylab="", main="Estimated Seasonal Component")
abline(h=0, lty=3)

# Make a time series object with the detrended, deseasonalized data:
detrended = ???

plot(detrended, ylab="", main="Detrended, Deseasonalized Data")
abline(h=0, lty=3)

monthplot(detrended, ylab="", main="Detrended, Deseasonalized Data")
abline(h=0, lty=3)

acf(detrended, lag.max=48, main="")
title("Detrended, Deseasonalized Data")

# dev.off()
# Uncomment the line above (and the pdf() line at the top) and run all
# the code to save the plots
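
One possible way to split a fitted additive model into the pieces the plots need is to rebuild each component from the estimated coefficients. The sketch below does this on simulated data (not the Arizona series) for a harmonic model with q = 1. Every name in it is hypothetical, and the period `d <- 1` is an assumption; the right expressions for your own fit depend on how you set up the regression.

```r
set.seed(2)

# Simulated monthly series (NOT the Arizona data); all names are hypothetical
tt <- 2001 + (0:159) / 12
d  <- 1                                   # ASSUMED period of one year
s1 <- sin(2 * pi * tt / d)
c1 <- cos(2 * pi * tt / d)
x  <- ts(8 + 0.2 * (tt - 2001) + 0.7 * s1 + 0.4 * c1 + rnorm(160, sd = 0.2),
         start = c(2001, 1), frequency = 12)

fit <- lm(as.numeric(x) ~ tt + s1 + c1)
b   <- coef(fit)

# Rebuild each component from the coefficients that belong to it
trend    <- b["(Intercept)"] + b["tt"] * tt        # polynomial part
seasonal <- b["s1"] * s1 + b["c1"] * c1            # harmonic part

# Detrended, deseasonalized series, kept as a ts so that
# monthplot() and acf() see the monthly frequency
remainder <- ts(as.numeric(x) - trend - seasonal,
                start = start(x), frequency = frequency(x))
```

In this sketch `trend + seasonal` reproduces `fitted(fit)`, and `remainder` matches the regression residuals, which is a useful consistency check before plotting.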

4. Based on your plots from the previous part (question 3), which of the two models
appears to fit the data better (the model with q = 1 or q = 2)? Does either fit the data
particularly well? What aspects of the data are not well fit by the models?

5. For this part of the problem we will use a seasonal means model to describe the seasonal
component, so that
$$x_t = \sum_{j=1}^{12} \alpha_j M_{tj} + \sum_{j=0}^{q} \beta_j t^j + y_t,$$

for a given value of $q$, where $y_t$ is a zero-mean stationary time series. Each $M_{tj} = 1$ if
time $t$ corresponds to month $j$ and is 0 otherwise, so that only one $\alpha_j$ term appears
for any given time t. Fit the model described above using the method of least squares
for q = 1 and q = 2. Report the parameter estimates for each case:

q | β̂0 | β̂1 | β̂2 | α̂1 | α̂2 | α̂3 | α̂4 | α̂5 | α̂6 | α̂7 | α̂8 | α̂9 | α̂10 | α̂11 | α̂12
1 | X   |     | X   |     |     |     |     |     |     |     |     |     |      |      |
2 | X   |     |     |     |     |     |     |     |     |     |     |     |      |      |

Make sure your parameter estimates correspond to time as defined above. Hint: You
can create the $M_{tj}$ predictor variables indirectly in R by creating a factor variable:

M = factor(rep(month.abb, length.out=length(az)), levels=month.abb)

This factor variable can then be used as a term in the regression model, e.g.

lm( az ~ time(az) + M + 0 )

Adding + 0 at the end will exclude a common intercept term as discussed in class.
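
To illustrate what the hint produces, the sketch below fits a seasonal means model with q = 1 to simulated data (not the Arizona series). Apart from `M`, which follows the hint, all names and the simulated month effects are hypothetical.

```r
set.seed(3)

# Simulated monthly series (NOT the Arizona data); names are hypothetical
tt           <- 2001 + (0:159) / 12
month_effect <- seq(-1, 1, length.out = 12)     # made-up monthly means
x <- ts(0.3 * (tt - 2001) + rep(month_effect, length.out = 160) +
          rnorm(160, sd = 0.05),
        start = c(2001, 1), frequency = 12)

# Month factor as in the hint; levels kept in calendar order
M <- factor(rep(month.abb, length.out = length(x)), levels = month.abb)

# "+ 0" drops the common intercept, so each month gets its own coefficient
fit <- lm(as.numeric(x) ~ tt + M + 0)
names(coef(fit))
```

The coefficient vector holds one slope for `tt` and twelve month terms (`MJan` through `MDec`), with no `(Intercept)` entry. This is why no separate estimate of $\beta_0$ is reported for this model.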

6. What is the estimated difference in average retail price of electricity between April and
May for any given year under the models in question 5?

7. Repeat question 3 using the models you fit in question 5.

8. Based on your plots from the previous part (question 7), which of the two models
appears to fit the data better (the model with q = 1 or q = 2)? Does either fit the data
particularly well? What aspects of the data are not well fit by the models?

9. Which of the four models seems to be the best representation of the data? Do any
aspects of this model not fit the data well? Does the resulting detrended, deseasonalized
series under this model appear stationary?

[Figure: six-panel plot. Panels: "Arizona Average Retail Price of Electricity" (cents per kWh against Time, with the fitted model overlaid); "Estimated Trend" (against Time); "Estimated Seasonal Component" (against Time); "Detrended, Deseasonalized Data" (time series plot against Time); "Detrended, Deseasonalized Data" (month plot, J through D); "Detrended, Deseasonalized Data" (ACF against Lag, 0 to 4).]

Figure 2: Aspects of fitted model (q = 1).
