
DS 432 – Assignment I

Program: B.Tech. Discipline: CSE/ECE


Semester: August–December 2020 Time: 3 Hours
Course Code: DS 432 Total Pages: 7
Course Name: Predictive Modeling for Data Science Max. Marks: 20

Instructions:
1. Read the instructions carefully.
2. Attempt all questions.
3. Use of calculator and/or R/Python is allowed. (Code output not required.)
4. Use of lecture notes, books, and other internet resources is allowed.
5. Any discussion or otherwise inappropriate communication between examinees will be dealt with
severely.
6. Each question is worth 1 point. The exam has 20 questions.
7. Write your answers on plain A4 paper, scan, and upload the merged PDF in Moodle
(https://exam.niituniversity.in). Answer five questions per page – total 4 pages.
Separate each answer with a line as per the discussed format.
8. For each question, 0.5 marks will be awarded for the correct option (A, B, C, D, or E).
The explanation that follows your choice earns 0.5 marks if correct, 0.25 marks if
partially correct, and 0 marks if incorrect. If your option (A, B, C, D, or E) is incorrect,
0 marks will be awarded for the question regardless of the explanation that follows.
9. Do not forget to write your name and enrollment number on each page of your submission.
Write the answers in sequence, and make sure your handwriting is legible. No email
submissions, please, and do not wait until the last minute to submit.
10. By uploading the answer (in Moodle) you acknowledge that you did not discuss
any aspect of this exam with anyone other than the instructor, that you neither
gave nor received any unauthorized assistance on this exam, and that the work
submitted is entirely your own.

Answer questions 1 and 2 using the R output given below.

Call:
lm(formula = Y ~ X, data = Regr)

Residuals:
1 2 3 4 5 6 7
0.55769 -0.65385 -0.86538 0.34615 1.13462 -0.07692 -0.44231

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 9.4423 0.4757 19.851 5.99e-06 ***
X -1.7885 0.2887 -6.194 0.0016 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.7869 on 5 degrees of freedom


Multiple R-squared: 0.8847, Adjusted R-squared: 0.8617
F-statistic: 38.37 on 1 and 5 DF, p-value: 0.0016
1. The estimated regression equation for the full model is
(A) ŷ = 9.4423X − 1.7885 + 0.2887
(B) ŷ = 9.4423 − 0.8847X
(C) ŷ = −9.4423 − 1.7885X
(D) ŷ = 9.4423X + 1.7885
(E) None of the above.
2. The predicted value ŷ(1) is equal to
(A) -1.7885
(B) 11.2308
(C) 7.6538
(D) 9.4423
(E) None of the above.
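Since the instructions permit R/Python, here is a minimal Python sketch of the computation these two questions involve: reading the fitted line off the lm() output and evaluating it at a given X. The only inputs assumed are the intercept and slope estimates printed above.

```python
# Sketch: turning the printed lm() coefficients into a fitted line and a
# prediction.  The two estimates below are copied from the R output above;
# nothing else about the data set "Regr" is assumed.

b0 = 9.4423   # (Intercept) estimate
b1 = -1.7885  # slope estimate for X

def y_hat(x):
    """Fitted value under the estimated line y-hat = b0 + b1 * x."""
    return b0 + b1 * x

print(round(y_hat(1), 4))  # prediction at X = 1
```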
3. Suppose that {Zt} is a white noise process with a sample size of n = 100. If we performed a
simulation to study the sampling variation of r1, the lag-one sample autocorrelation, about
95% of our estimates r1 would fall between
(A) -0.025 and 0.025
(B) -0.05 and 0.05
(C) -0.1 and 0.1
(D) -0.2 and 0.2

(E) None of the above.
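A small simulation of the kind the question describes can be sketched in Python. For white noise, r1 is approximately N(0, 1/n), so about 95% of estimates should fall within ±2/√n; the simulation below just checks that empirically (the sample size and replication count are illustrative choices).

```python
import numpy as np

# Sketch: sampling variation of the lag-1 autocorrelation r1 for white
# noise with n = 100.  For white noise, r1 ~ approx N(0, 1/n), so about
# 95% of estimates fall within +/- 2/sqrt(n).

rng = np.random.default_rng(0)
n, reps = 100, 2000

def r1(z):
    """Lag-1 sample autocorrelation of the series z."""
    d = z - z.mean()
    return np.sum(d[:-1] * d[1:]) / np.sum(d * d)

estimates = np.array([r1(rng.standard_normal(n)) for _ in range(reps)])
band = 2 / np.sqrt(n)                      # half-width of the ~95% band
coverage = np.mean(np.abs(estimates) <= band)
print(band, round(coverage, 2))            # coverage should be near 0.95
```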
4. What is the difference between strict stationarity and weak stationarity?
(A) Strict stationarity requires that the mean function and autocovariance function be free of
time t. Weak stationarity does not.
(B) Strict stationarity is required to guarantee that MMSE forecasts are unbiased (in ARIMA
models). These forecasts may not be unbiased under weak stationarity.
(C) Strict stationarity is a stronger form of stationarity that does not rely on large-sample
theory.
(D) Strict stationarity refers to characteristics involving joint probability distributions. Weak
stationarity refers only to conditions placed on the mean and autocovariance function.
(E) None of the above.
5. In an analysis, we have determined that
• The Dickey-Fuller unit root test for the series {Xt } does not reject a unit root.
• The ACF for the series {Xt } has a very, very slow decay.
• The PACF for the differences {∇Xt } has significant spikes at lags 1 and 2 (and is
negligible at higher lags).
Which model is most consistent with these observations?
(A) IMA(1,1)
(B) ARI(2,1)
(C) ARIMA(2,2,2)
(D) IMA(2,2)
(E) None of the above.
6. Which of the following processes is stationary?
(A) MA(1) process with θ = −1.4
(B) Xt = 12.3 + 1.1Xt−1 + Zt
(C) IMA(1,1)
(D) Xt = β0 + β1 t + Zt
(E) None of the above.
7. Consider the time series model Xt = 0.8Xt−1 + 0.09Xt−2 + Zt − 0.01Zt−2 . Determine whether
the model is stationary and/or invertible.
(A) Stationary but not invertible.
(B) Not stationary but invertible.
(C) Stationary and invertible.
(D) Neither stationary nor invertible.

(E) None of the above.
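The standard check for this kind of question is to locate the roots of the AR and MA characteristic polynomials: the process is stationary if all roots of φ(z) lie outside the unit circle, and invertible if the same holds for θ(z). A hedged Python sketch using numpy.roots:

```python
import numpy as np

# Sketch: stationarity/invertibility check for
#   X_t = 0.8 X_{t-1} + 0.09 X_{t-2} + Z_t - 0.01 Z_{t-2}
# via the roots of
#   phi(z)   = 1 - 0.8 z - 0.09 z^2   (AR side)
#   theta(z) = 1 - 0.01 z^2           (MA side)

ar_roots = np.roots([-0.09, -0.8, 1.0])  # coefficients, highest power first
ma_roots = np.roots([-0.01, 0.0, 1.0])

stationary = np.all(np.abs(ar_roots) > 1)  # all AR roots outside unit circle?
invertible = np.all(np.abs(ma_roots) > 1)  # all MA roots outside unit circle?
print(ar_roots, ma_roots, stationary, invertible)
```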
8. Here is the R output from fitting an ARIMA(1,0,0) model to a data set ts.sim11.
> arima(ts.sim11, order=c(1,0,0))

Call:
arima(x = ts.sim11, order = c(1, 0, 0))

Coefficients:
ar1 intercept
0.5782 0.0146
s.e. 0.0810 0.2167

sigma^2 estimated as 0.8582: log likelihood = -134.45, aic = 272.91


The last observed value of the data set is X100 = −1.6387. Using the fitted AR(1) model, the
(estimated) MMSE forecast for X101 is approximately equal to
(A) -0.941
(B) -0.841
(C) -0.741
(D) -0.641
(E) None of the above.
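One detail worth remembering here is that R's arima() reports the process *mean* under the label "intercept", so the one-step MMSE forecast is X̂t(1) = μ̂ + φ̂(Xt − μ̂). A Python sketch using only the estimates printed above:

```python
# Sketch: one-step MMSE forecast from the fitted AR(1).  R's arima()
# labels the process mean "intercept", so the forecast is
#   X-hat_t(1) = mu + phi * (X_t - mu).
# The three numbers below are taken from the question; nothing else about
# ts.sim11 is assumed.

phi = 0.5782      # ar1 estimate from the output above
mu = 0.0146       # "intercept" (process mean) from the output above
x_last = -1.6387  # last observed value X_100

forecast = mu + phi * (x_last - mu)
print(round(forecast, 3))
```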
9. For polynomial regression, which one of these structural assumptions most affects the
trade-off between underfitting and overfitting?
(A) The polynomial degree.
(B) Whether we learn the weights by matrix inversion or gradient descent.
(C) The assumed variance of the Gaussian noise.
(D) The use of a constant-term unit input.
(E) None of the above.
10. The relationship between number of beers consumed (x) and blood alcohol content (y) was
studied in 16 male college students by using least squares regression. The following regression
equation was obtained from this study: ŷ = −0.0127 + 0.0180x. Another guy, named Buddy,
has the regression equation written on a scrap of paper in his pocket. Buddy goes out drinking
and has 4 beers. He calculates that he is under the legal limit (say, 0.08), so he decides to drive
to another bar. Unfortunately, Buddy gets pulled over and confidently submits to a road-side
blood alcohol test. He scores a blood alcohol of 0.085 and gets himself arrested. Obviously,
Buddy did not know about residual variation. Buddy’s residual is:
(A) +0.0257
(B) -0.0257
(C) +0.005

(D) -0.005
(E) None of the above.
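Buddy's situation boils down to one line of arithmetic: the residual is the observed value minus the fitted value at x = 4. A minimal Python sketch, using only the numbers given in the question:

```python
# Sketch: Buddy's residual is observed minus fitted, where the fitted
# value comes from y-hat = -0.0127 + 0.0180 x at x = 4 beers.

fitted = -0.0127 + 0.0180 * 4   # predicted blood alcohol content
observed = 0.085                # road-side test result
residual = observed - fitted
print(round(fitted, 4), round(residual, 4))
```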
11. What did we discover about the method of moments procedure when estimating parameters
in ARIMA models?
(A) The procedure gives reliable results when the sample size n > 100.
(B) The procedure gives unbiased estimates.
(C) The procedure should not be used when models include AR components.
(D) The procedure should not be used when models include MA components.
(E) None of the above.
12. Which statement about MMSE forecasts in stationary ARMA models is true?
(A) If X̂t (l) is the MMSE forecast of ln(Xt+l ), then eX̂t (l) is the MMSE forecast of Xt+l .
(B) As the lead time l increases, X̂t (l) will approach the process mean E(Xt ) = µ.
(C) As the lead time l increases, V(X̂t (l)) will approach the process variance V(Xt ) = γ0 .
(D) All of the above are true.
(E) None of the above are true.
13. An observed time series displays a clear upward linear trend. We fit a straight line regression
model to remove this trend, and we notice that the residuals from the straight line fit are
stationary in the mean level. What should we do next?
(A) Search for a stationary ARMA process to model the residuals.
(B) Perform a Shapiro-Wilk test.
(C) Calculate the first differences of the residuals and then consider fitting another regression
model to them.
(D) Perform a t-test for the straight line slope estimate.
(E) None of the above.
14. Let V(X) = 1, V(Y ) = 2, and Cov(X, Y ) = 3, then the value of α that minimizes V(αX +
(1 − α)Y ) is
(A) 0.1
(B) 0.2
(C) 0.4
(D) 0.5
(E) None of the above.
15. Consider the time series data given below.
Time 1 2 3 4 5 6
Xt 2 3 7 4 8 11

Now suppose you would like to fit an AR(2) model to the above data set. Then the MoM
estimates of φ̂1 and φ̂2 are given by
(A) φ̂1 = 0.2 and φ̂2 = 1.7
(B) φ̂1 = 1.7 and φ̂2 = 0.2
(C) φ̂1 = −0.2 and φ̂2 = 1.7
(D) φ̂1 = 0.2 and φ̂2 = −1.7
(E) None of the above.
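The mechanics of the method-of-moments fit can be sketched in Python: compute the sample autocorrelations r1, r2 and solve the Yule-Walker equations r1 = φ1 + φ2·r1 and r2 = φ1·r1 + φ2 for (φ1, φ2). This sketch only illustrates the procedure; it does not assert which option is intended.

```python
import numpy as np

# Sketch: Yule-Walker (method-of-moments) AR(2) estimates for the six
# observations in the question.

x = np.array([2.0, 3.0, 7.0, 4.0, 8.0, 11.0])
d = x - x.mean()

def r(k):
    """Lag-k sample autocorrelation."""
    return np.sum(d[:-k] * d[k:]) / np.sum(d * d)

r1, r2 = r(1), r(2)
# Solve [1 r1; r1 1] [phi1; phi2] = [r1; r2].
phi1, phi2 = np.linalg.solve([[1.0, r1], [r1, 1.0]], [r1, r2])
print(round(phi1, 3), round(phi2, 3))
```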
16. Traditionally, when we have a real-valued input attribute during decision-tree training we
consider a binary split according to whether the attribute is above or below some threshold.
Suppose someone suggested that instead we should have a multiway split with one branch
for each of the distinct values of the attribute. Which of the following is the single biggest
problem with that suggestion?
(A) It is computationally expensive.
(B) It would probably result in a decision tree that scores badly on the training set and a
test set.
(C) It would probably result in a decision tree that scores well on the training set but badly
on a test set.
(D) It would probably result in a decision tree that scores badly on the training set but well
on a test set.
(E) None of the above.
17. A hypothesis test that can be used for model comparison in linear regression is
(A) F -test.
(B) t-test.
(C) χ2 -test.
(D) partial F -test.
(E) None of the above.
18. A statistic that measures the change in the fitted regression coefficients when an observation
is dropped from the regression analysis is
(A) Cook’s distance.
(B) Hat matrix.
(C) Influential point.
(D) Leverage point.
(E) None of the above.
19. In linear regression, the tendency in data sets for several unusual observations to cluster
together so that attempts to identify one observation at a time fail is called

(A) Masking effect.
(B) Collinearity effect.
(C) Interaction effect.
(D) Linear restriction.
(E) None of the above.
20. Averaging the output of multiple decision trees helps ______
(A) increase bias.
(B) increase variance.
(C) decrease bias.
(D) decrease variance.
(E) None of the above.
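The idea behind this question can be illustrated with a toy simulation: when B (approximately uncorrelated) noisy predictors of the same target are averaged, the variance of the average is roughly σ²/B. The sketch below uses Gaussian noise as a stand-in for the trees' predictions; the sizes are illustrative choices.

```python
import numpy as np

# Sketch: averaging many approximately independent predictors reduces
# variance.  Each "tree" is modeled as the true value (0) plus noise;
# the variance of the average of B such predictors is about sigma^2 / B.

rng = np.random.default_rng(1)
sigma, B, reps = 1.0, 25, 5000

single = rng.normal(0.0, sigma, size=reps)               # one noisy predictor
averaged = rng.normal(0.0, sigma, size=(reps, B)).mean(axis=1)

print(round(single.var(), 2), round(averaged.var(), 2))
```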
