Institute of Rural Management Anand
Course Name: Business Statistics and Analytics PRM40 (2019-21)
Term -I
End-Term Examination
Duration of Exam: 2 Hrs 30 mins. Total Marks: 50 Weightage: 25%
Instructions —
1. All questions are compulsory
2. Clarifications will not be addressed during the exam. Please make suitable assumptions if
necessary
3. Open Book, Open Notes, Open Laptop and Calculator
4. Enjoy solving
5. Open the excel file that was sent to you for endterm examination. (Also, refer to question 3
and question 4.) The excel file contains three worksheets: Sheet 1, Sheet 2 and CPI.
a.) From Sheet 1, delete the data in columns C and D of the worksheet.
b.) From Sheet 2, delete the data in columns F, G, H and I. Label V1 stands for Marks, V2 stands
for Section, V3 stands for Hardwork, V4 stands for Interaction.
c.) Sheet named CPI is for Question-4. Because we have given you the regression output, you
might not have to use this sheet to answer any question.
Question -1 [10 marks]
A Government scheme titled “Banish Poverty” promises to give a grant of Rs 200 crores to districts
which have more than thirty percent families categorized as Below Poverty Line (BPL). Kalshera, a
district somewhere in Western India, does not qualify for this grant. An NGO, “opportunities in poverty”,
operating in Kalshera, is extremely disappointed at losing an opportunity for tapping into government
funds. It claims that Kalshera has more than 30% BPL families (alternative hypothesis) and hence should
be considered for the grant. It does a quick dipstick survey wherein it draws a random sample of 500
rural households from Kalshera and finds that 165 households belong to the BPL category. It therefore
stakes a claim for the “Banish Poverty” funding.
Does its claim have a statistical justification? Reason out systematically at three different levels of
significance: 1%, 5% and 10%. (Seven marks)
Now, consider that 5000 houses were sampled and 1650 households belonged to the BPL category. Do
you think that the claim now has greater statistical justification? Justify appropriately. (Three marks)
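The test described above can be set up as a one-sided z-test for a single proportion. The following is a minimal sketch, assuming the normal approximation to the binomial is acceptable at these sample sizes; the function name `one_sided_proportion_z_test` is hypothetical and not part of the exam.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_proportion_z_test(successes, n, p0):
    """Return (z, p_value) for H0: p <= p0 against H1: p > p0."""
    p_hat = successes / n
    # The standard error is computed under the null value p0.
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # Upper-tail probability, because the claim is "more than 30%".
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# The two samples given in the question, tested at the three stated levels.
for successes, n in [(165, 500), (1650, 5000)]:
    z, p_value = one_sided_proportion_z_test(successes, n, 0.30)
    for alpha in (0.01, 0.05, 0.10):
        verdict = "reject H0" if p_value < alpha else "fail to reject H0"
        print(f"n={n}, z={z:.3f}, p-value={p_value:.4f}, alpha={alpha}: {verdict}")
```

Both samples have the same sample proportion, so any difference in the verdicts comes only from the smaller standard error of the larger sample.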
Question -2 [9 marks]
Consider the following demand function fitted by regression analysis for time series data of Khadi
apparel in India.
Log D = 2.5 - 0.8*Log P1 + 1.15*Log P2 - 0.2*Log Income - 1.2*time + ε
Where ε is the error term and:
Log D = logarithm (with base e) of demand for Khadi apparel (in million units)
Log P1 = logarithm (with base e) of price of Khadi apparel (in Rs)
Log P2 = logarithm (with base e) of price of non-Khadi apparel (in Rs)
Log Income = logarithm (with base e) of per capita income of India (in Rs)
time= time or trend variable having values as 1 for time period 1, 2 for time period 2 and so forth.
Interpret each of the slope coefficients. (Six marks)
Assuming that the model and all the coefficients pass the required F and t-tests (p-values are very
low), is there evidence that preference for Khadi is diminishing over the years? On what basis can you
say so? (Three marks)
(Hint 1 and Hint 2: formulas not legible in the scan. Hint 3: Log x is an increasing function in x, i.e., if x
increases, Log x increases.)
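Because demand and the three economic variables enter in logarithms, each of those coefficients acts as an elasticity, while the coefficient on time gives a proportional change per period. The sketch below turns the printed coefficients into percentage changes in D, holding the other variables fixed; the helper names are hypothetical.

```python
from math import exp

# Coefficients as printed in the fitted demand function.
coefficients = {"log_P1": -0.8, "log_P2": 1.15, "log_income": -0.2, "time": -1.2}


def pct_change_from_elasticity(elasticity, pct_change_x):
    """Exact % change in D when x changes by pct_change_x %, others held fixed."""
    # Log D = ... + b*Log x implies D is proportional to x**b.
    return ((1 + pct_change_x / 100) ** elasticity - 1) * 100


def pct_change_per_period(time_coefficient):
    """Exact % change in D when the trend variable increases by one period."""
    # Log D = ... + c*time implies D is multiplied by exp(c) each period.
    return (exp(time_coefficient) - 1) * 100


for name in ("log_P1", "log_P2", "log_income"):
    change = pct_change_from_elasticity(coefficients[name], 1)
    print(f"{name}: a 1% increase changes D by {change:.3f}%")
print(f"time: one more period changes D by {pct_change_per_period(coefficients['time']):.2f}%")
```

For small changes, the percentage change in D is approximately the coefficient itself, which is the usual reading of a log-log slope.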
Question -3 [21 marks]
Our old friends, Rancho Thukral (RT) and Abhishek Kejriwal (AK), after doing PRM from IRMA, went
ahead and also did FPRM from IRMA. They have now joined as faculty in IRMA and are sharing an
elective course called “Advanced BSA” (ABSA) which is being taught in 2 sections (A and B). They were
surprised to see that section A (taught by RT) outperformed section B (taught by AK) in the midterm.
Hence, RT set out to prove to the world that he is a better teacher than AK. His claim would go in the
alternative hypothesis. (For the raw data, please refer to the excel file that was sent to you.)
1.) Clearly formulate the null and alternative hypothesis. (2 marks)
2.) Do a hypothesis test making suitable assumptions and give a verdict on whether RT's claim can
be accepted at 5% significance level. (5 marks)
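One possible set of "suitable assumptions" is that the two sections are independent samples with possibly unequal variances, which leads to Welch's two-sample t-test. The sketch below computes only the test statistic and the approximate degrees of freedom; the marks shown are hypothetical placeholders, not the data from the excel file, and the function name `welch_t` is an assumption.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for H0: mu_A <= mu_B against H1: mu_A > mu_B."""
    n_a, n_b = len(sample_a), len(sample_b)
    # variance() is the sample variance (n - 1 in the denominator).
    v_a, v_b = variance(sample_a) / n_a, variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(v_a + v_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical marks, for illustration only (not the data in the excel file).
marks_a = [2, 3, 4, 5, 6]
marks_b = [1, 2, 3, 4, 5]
t_stat, dof = welch_t(marks_a, marks_b)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

RT's claim would be supported at the 5% level only if t exceeds the upper 5% critical value of the t distribution with the computed degrees of freedom, read from a t table.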
AK was not in a good mood when an enlightened student came to his office and said: “Sir, students in
section A are clearly more hard-working on an average than students of section B. If you don’t trust me,
you can ask the students themselves.”
Hence, students in both the sections were asked the number of hours they study per week. Poor
students had to fill the form, otherwise 0.1 GPA would be deducted. Thus, we were now able to obtain
the data about the hard work that students are putting in.
AK decided to build a regression model taking Marks as the dependent variable, and the following three
as the independent variables:
a.) Hardwork, measured in hours/week
b.) Section - This is a dummy variable, which takes the value 1 if Section = “A” and the value 0 if
Section = “B”
c.) Interaction = Hardwork*Section
AK decided to include the interaction term in the model because he also wanted to confirm whether,
because of the different instructors, one section can get more marks on average by
putting in one extra hour of hardwork.
The multiple linear regression model is thus given as:
Marks = α + β1*Hardwork + β2*Section + β3*Interaction + ε
Where ε is the error term.
(For the raw data, please refer to sheet 2 of the excel file that was sent to you)
3.) Find the values of α, β1, β2 and β3 by using the regression feature in the Data Analysis ToolPak of
Microsoft Excel. (Four marks)
4.) By looking at the value of the “significance F” in ANOVA table, can we accept the proposed
multiple regression model at 5% significance level? (One mark)
5.) Out of the three independent variables used to build model, which of the variables seem to be
significant? What is the basis of your judgement? (Three marks)
6.) Give the interpretation of β1. (Two marks)
7.) Using the output of the above regression model, two different linear equations, one for
each section, can be formed, which explain marks as a function of hardwork. Write the
equations for section A and section B separately. (2+2 = Four marks)
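With a 0/1 dummy and an interaction term, the single fitted model splits into one straight line per section: substituting Section = 1 or Section = 0 changes both the intercept and the slope on Hardwork. The sketch below shows that substitution; the coefficient values are hypothetical placeholders, not estimates from Sheet 2, and the function name `section_equations` is an assumption.

```python
def section_equations(alpha, beta1, beta2, beta3):
    """Return {section: (intercept, slope)} for Marks as a function of Hardwork.

    alpha is the intercept; beta1, beta2 and beta3 are the coefficients on
    Hardwork, Section and Interaction (= Hardwork * Section) respectively.
    """
    return {
        "A": (alpha + beta2, beta1 + beta3),  # Section = 1
        "B": (alpha, beta1),                  # Section = 0
    }


# Hypothetical coefficients, for illustration only (not estimated from Sheet 2).
example = section_equations(alpha=10.0, beta1=2.0, beta2=3.0, beta3=0.5)
for section, (intercept, slope) in example.items():
    print(f"Section {section}: Marks = {intercept} + {slope}*Hardwork")
```

The coefficient on Interaction is therefore the difference between the two sections in marks gained per extra hour of hardwork.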
Question -4 [10 marks]
Some IRMA Alumni at Niti Aayog believe that per capita income influences the corruption level in a
country. They would want to test their hypothesis using a regression model with the following
independent variables. (Refer to worksheet named CPI in the excel file that was sent to you)
a) Numeric Variable - Per Capita Income (PCI), measured in dollars
b) Categorical Variable - Democracy, which takes the value 1 if the country is a democracy, 0 if the
country is not a democracy
To know the absolute value of corruption, they looked at CPI (Corruption Perception Index) as the
dependent variable for various countries around the world.
The Corruption Perception Index scores countries on a scale from 0 (highly corrupt) to 100 (very clean).
The regression output obtained by Excel is shown below:
SUMMARY OUTPUT
Regression Statistics
Multiple R            0.908784634
R Square              0.825889511
Adjusted R Square     0.822635109
Standard Error        7.70811052
Observations          110
ANOVA
              df     SS          MS          F          Significance F
Regression    2      30124.91    15062.46    253.761    2.42E-41
Residual      107    6350.85     59.3532
Total         109    36475.72
                      Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept             27.78685741    1.073749         25.87835   7.27E-48   25.65827    29.91544    25.65827      29.91544
Per Capita Income$    0.000826162    6.14E-05         13.45504   9.47E-25   0.000704    0.000948    0.000704      0.000948
Democracy Y/N         14.47493796    2.475658         5.846905   5.5E-08    9.567235    19.38264    9.567235      19.38264
1.) From the excel output, write the estimated multiple linear regression model. (Two marks)
2.) By looking at the value of the “significance F” in ANOVA table, can we accept the proposed
multiple regression model at 5% significance level? (One mark)
3.) Out of the two independent variables used to build model, which of the variables seem to be
significant? What is the basis of your judgement? (One mark)
4.) Give the interpretation of the coefficient of Per Capita Income$. (Two marks)
5.) Give the interpretation of the coefficient of Democracy. (Two marks)
6.) According to the model, what is the expected value of the Corruption Perception Index of a
country which is not a democracy and has a per capita income of $5000? (Two marks)
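The expected value asked for above comes from plugging the country's values into the fitted equation. The following is a minimal sketch using the coefficients printed in the regression output; the constant and function names are hypothetical.

```python
# Coefficients copied from the Excel regression output.
INTERCEPT = 27.78685741
COEF_PCI = 0.000826162        # change in CPI per extra dollar of per capita income
COEF_DEMOCRACY = 14.47493796  # shift in CPI for a democracy (dummy = 1)


def predicted_cpi(per_capita_income, is_democracy):
    """Predicted Corruption Perception Index from the fitted model."""
    dummy = 1 if is_democracy else 0
    return INTERCEPT + COEF_PCI * per_capita_income + COEF_DEMOCRACY * dummy


# A non-democracy with a per capita income of $5000.
print(round(predicted_cpi(5000, False), 4))
```

Calling `predicted_cpi(5000, True)` instead adds the Democracy coefficient to the same prediction, which is how the dummy variable's coefficient is read.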