
Internal Assignment_June 2022 Examination: Decision Science

Answer to Question 1:

INTRODUCTION:
In the present case study, three airlines operate from Srinagar airport. To calculate
the probability that a plane belongs to a given airline, the classical method of assigning
probabilities shall be used.
“When probabilities are assigned based on laws and rules, the method is referred to as
the classical method of assigning probabilities.”
In the classical method, the probabilities can be determined prior to the occurrence of the
event by considering the past record.
For example, if an airline has a record of being on time 80% of the time, then its next trip
can be assigned a probability of 80% of being on time. The probability of two
mutually exclusive events occurring at the same time is zero. The plane being on time and
the plane being late are two complementary events. Complementary events together cover
all the outcomes of an experiment, and hence the sum of the probabilities of complementary
events is always one. This can be used to calculate the probability of a plane not being on
time if we know the probability of it being on time.
Probability of the Complement of X
P(X') = 1 - P(X)
A tree diagram displays all the possible outcomes of an event and can be used to describe
independent probabilities and conditional probabilities. Each node on the tree diagram
shows an event, and the probability of that event is marked on the line connecting the nodes.
The tree diagram contains all possibilities, including both being on time and being late.
Part-1) Tree Diagram:

Srinagar Airport
    Airlines          Punctuality Record (Leaving on time or late)
    Amira  (0.5)      On time: 0.8     Late: 0.2
    Biyas  (0.3)      On time: 0.65    Late: 0.35
    Chinar (0.2)      On time: 0.4     Late: 0.6

Figure-1: Probability Tree Diagram
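
As a quick check on the tree, the sketch below encodes its branches in Python and confirms that the probabilities are consistent. The dictionary layout and the function names are illustrative assumptions, not part of the original answer; only the numbers come from Figure-1.

```python
import math

# Hypothetical encoding of Figure-1: airline -> (share of departures,
# probability of leaving on time). Only the numbers come from the figure.
TREE = {
    "Amira": (0.5, 0.8),
    "Biyas": (0.3, 0.65),
    "Chinar": (0.2, 0.4),
}

def late_probability(p_on_time):
    """Complement rule: P(X') = 1 - P(X)."""
    return 1 - p_on_time

def leaf_probabilities(tree):
    """Multiply along each path to get the probability of every leaf."""
    leaves = {}
    for airline, (share, p_on_time) in tree.items():
        leaves[(airline, "on time")] = share * p_on_time
        leaves[(airline, "late")] = share * late_probability(p_on_time)
    return leaves

leaves = leaf_probabilities(TREE)

# A valid tree has first-level branches and leaves that each sum to one.
first_level_total = sum(share for share, _ in TREE.values())
leaf_total = sum(leaves.values())
tree_is_consistent = (
    math.isclose(first_level_total, 1.0) and math.isclose(leaf_total, 1.0)
)
```

If either total differed from one, a branch probability would have been misread from the diagram.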

Part-2) A plane has left on time; the probability of it being an Amira Airline plane is
calculated as below:
This is treated as a case of joint probability: we calculate the probability that a plane
both left the airport on time and belongs to Amira Airline.
To calculate the joint probability, the tree diagram above is used. The numbers
on the branches are the conditional probabilities for the next step. To obtain the probability
of an outcome, we multiply along the path leading to that outcome.
Hence, P[Plane leaving on time and being of Amira Airline] = 0.5 * 0.8 = 0.4
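
The multiplication along a path can be sketched in a few lines of Python. The variable names are illustrative assumptions. The last step is an additional reading of the question rather than part of the answer above: if the wording is taken as "given that the plane left on time, what is the probability that it is an Amira plane", the joint values would feed into Bayes' rule as shown.

```python
# Values read from the tree diagram: P(airline) and P(on time | airline).
p_airline = {"Amira": 0.5, "Biyas": 0.3, "Chinar": 0.2}
p_on_time_given = {"Amira": 0.8, "Biyas": 0.65, "Chinar": 0.4}

# Joint probability: multiply along the path airline -> on time.
joint_on_time = {a: p_airline[a] * p_on_time_given[a] for a in p_airline}

# Total probability that a departing plane is on time (sum over airlines).
p_on_time = sum(joint_on_time.values())

# Conditional reading (an assumption about the question, not the answer above):
# P(Amira | on time) = P(Amira and on time) / P(on time).
p_amira_given_on_time = joint_on_time["Amira"] / p_on_time
```

Under these figures the joint probability for Amira is 0.4, as computed in the text, while the conditional reading divides that value by the overall on-time probability.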
CONCLUSION:
As explained above, the branches of the probability tree diagram carry the conditional
probabilities for the next step. To obtain the probability of an outcome, we multiply along
the path leading to that outcome; in this case, the probability of a plane leaving on time and
being of Amira is 0.5 * 0.8 = 0.4.
Answer to Question 2:

Part 1)
“Regression analysis is the process of constructing a mathematical model or function that
can be used to predict or determine one variable by another variable or other variables.”
Simple regression, or bivariate regression, involves two variables, in which one
variable is predicted by another variable.
In simple regression, the variable to be predicted is called the dependent variable and is
designated as y. The predictor is called the independent variable, or explanatory variable,
and is designated as x.
In the current question, the dependent and independent variables are as given below:
Dependent Variable: Sale of Kashmiri Kahwa.
Independent Variables:
First Variable: Advertise Expenses,
Second Variable: Number of sales representatives and
Third Variable: Customer satisfaction rating.

Part 2)
Let us consider the following notation:
y = Sales of Kahwa (in INR)
x1 = Advertise Expenses (in INR)
x2 = number of sales representatives (person)
x3 = customer-satisfaction ratings (1 = highly dissatisfied to 5 = highly satisfied)
The regression model equation can be written as:

y = A0 + A1x1 + A2x2 + A3x3 + ε

Where:
A0 = the regression constant
A1 = the partial regression coefficient for the first independent variable, i.e., Advertise
Expenses.
A2 = the partial regression coefficient for the second independent variable, i.e., number of
sales representatives.
A3 = the partial regression coefficient for the third independent variable, i.e., customer
satisfaction rating.
ε = the error of prediction
The predicted value ŷ is obtained from the same equation without the error term.
Part 3)
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.981088188
R Square             0.962534032
Adjusted R Square    0.943801048
Standard Error       3721.75587
Observations         10

ANOVA
              df   SS            MS            F             Significance F
Regression    3    2135139002    711713000.8   51.38177878   0.000113414
Residual      6    83108800.55   13851466.76
Total         9    2218247803

                   Coefficients   Standard Error  t Stat        P-value      Lower 95%     Upper 95%    Lower 95.0%   Upper 95.0%
Intercept          32355.26089    8538.400022     3.78938218    0.009079953  11462.54869   53247.97309  11462.54869   53247.97309
X First Variable   1.686835512    1.500232214     1.124382943   0.303811184  -1.984100472  5.357771497  -1.984100472  5.357771497
X Second Variable  -155.5294991   1061.613404     -0.146502953  0.888322803  -2753.203919  2442.144921  -2753.203919  2442.144921
X Third Variable   11309.33441    2589.262959     4.36778133    0.004730396  4973.63619    17645.03263  4973.63619    17645.03263
Part 4)
The interpretation of the Regression Statistics table is as follows:
• From the Excel output table above, the correlation coefficient
(Multiple R) is 0.981088188. This indicates a strong positive multiple correlation between
sales of Kahwa and advertisement expenses, number of sales representatives and
customer-satisfaction ratings.
• The R Square, or coefficient of determination, is 0.962534032. Because this case study has
more than one independent variable, the adjusted R Square is used for interpretation.
• The value of Adjusted R Square is 0.943801048, which means that 94.38% of the variation
in sales can be explained by advertisement expenses, number of sales representatives
and customer-satisfaction ratings, after adjusting for the number of predictors.
• Standard Error gives the standard deviation of the errors (residuals). In this
case the standard error is 3721.75587, i.e., observed sales typically deviate from the
values predicted by the model by about INR 3,722.
• The total sample size of the data set is 10.
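
The regression statistics can be reproduced from the sums of squares in the ANOVA table. The sketch below does this with standard textbook formulas; the variable names are illustrative, and small differences in the last digits are expected because the table values are rounded.

```python
import math

# Figures copied from the ANOVA table in the Excel output.
ss_regression = 2135139002
ss_residual = 83108800.55
ss_total = 2218247803
n_obs = 10   # observations
k = 3        # independent variables

# R Square: share of total variation explained by the regression.
r_square = ss_regression / ss_total

# Multiple R is the square root of R Square.
multiple_r = math.sqrt(r_square)

# Adjusted R Square penalises R Square for the number of predictors.
adjusted_r_square = 1 - (1 - r_square) * (n_obs - 1) / (n_obs - k - 1)

# Standard error of the regression: square root of the residual mean square.
standard_error = math.sqrt(ss_residual / (n_obs - k - 1))
```

With only 10 observations and 3 predictors, the adjustment factor (n - 1) / (n - k - 1) equals 1.5, which is why the adjusted value is noticeably below R Square.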

The interpretation of the ANOVA table is as follows:

The F column gives the overall F-test of the null hypothesis. The null hypothesis is that
all regression coefficients are equal to zero, and the alternative hypothesis is that at least
one of the coefficients is not equal to zero. The p-value for the F-test is given in the
Significance F column. As it is 0.00011, which is lower than the significance level of 0.05
(a confidence level of 95%), the null hypothesis (that all coefficients are equal to zero) can
be rejected. This means that advertisement expenses, number of sales representatives
and customer-satisfaction ratings combined have a statistically significant association
with the sales of Kahwa.
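
The F statistic itself is the ratio of the two mean squares, so it can be recomputed from the ANOVA table as a check. This is a minimal sketch with illustrative variable names; the p-value is taken as reported by Excel rather than recomputed.

```python
# Values copied from the ANOVA table.
ss_regression, df_regression = 2135139002, 3
ss_residual, df_residual = 83108800.55, 6
significance_f = 0.000113414   # p-value of the F-test as reported by Excel
alpha = 0.05                   # significance level used in the answer

# Mean square = sum of squares divided by its degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# Overall F statistic of the regression.
f_statistic = ms_regression / ms_residual

# Decision rule: reject the null hypothesis when the p-value is below alpha.
reject_null = significance_f < alpha
```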

Coefficients interpretation:
The coefficient of each independent variable represents the average expected change in
the dependent variable for a one-unit change in that variable, holding the other independent
variables constant.
In this case, every additional INR spent on advertisement is expected to increase sales
by 1.69 INR, holding the other two variables constant; each additional sales
representative is expected to reduce sales by 155.53 INR; and each additional point on the
customer rating is expected to increase sales by 11309.33 INR, in each case with the
remaining variables held constant.
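
To make the "other variables held constant" reading concrete, the fitted equation can be written as a small function. The function name and the input values are hypothetical and chosen only for illustration; they are not taken from the case data.

```python
# Coefficients copied from the Excel output.
A0 = 32355.26089      # intercept
A1 = 1.686835512      # per INR of advertisement expenses
A2 = -155.5294991     # per sales representative
A3 = 11309.33441      # per point of customer-satisfaction rating

def predict_sales(advertising_inr, sales_reps, satisfaction_rating):
    """Predicted sales (INR) from the fitted regression equation."""
    return A0 + A1 * advertising_inr + A2 * sales_reps + A3 * satisfaction_rating

# Hypothetical inputs, used only to illustrate; not from the case data.
base = predict_sales(10_000, 5, 4)
one_more_rating_point = predict_sales(10_000, 5, 5)

# Raising only the rating by one point changes the prediction by exactly A3.
effect_of_rating = one_more_rating_point - base
```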

t-stat / P-values interpretation:

The individual p-values show whether each independent variable is statistically
significant or not. Each is a test of the null hypothesis that the coefficient of that variable is
zero. The p-value of each coefficient is compared with the
significance level of 0.05. If the p-value is less than the significance level, the
independent variable is statistically significant for the model.
From the table, the following can be concluded:

• x1 = Advertisement Expenses (in INR) has a p-value of 0.303811184, which is
more than 0.05, and hence this variable is statistically insignificant.
• x2 = number of sales representatives (person) has a p-value of 0.888322803,
which is more than 0.05, and hence this variable is also statistically insignificant.
• x3 = customer-satisfaction ratings (1 = highly dissatisfied to 5 = highly satisfied)
has a p-value of 0.004730396, which is less than 0.05, and hence customer
satisfaction rating is statistically significant for the model.
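
The comparison in the list above can be sketched as follows. Each t statistic is the coefficient divided by its standard error, and the significance decision uses the p-values reported by Excel. The dictionary keys and function name are illustrative assumptions.

```python
# (coefficient, standard error, p-value) copied from the Excel output.
coefficients = {
    "x1 advertisement expenses": (1.686835512, 1.500232214, 0.303811184),
    "x2 sales representatives": (-155.5294991, 1061.613404, 0.888322803),
    "x3 customer satisfaction": (11309.33441, 2589.262959, 0.004730396),
}
ALPHA = 0.05  # significance level used in the answer

def t_stat(coef, std_err):
    """t statistic for the null hypothesis that the coefficient is zero."""
    return coef / std_err

# For each variable: recomputed t statistic and the p-value decision.
results = {
    name: {"t": t_stat(coef, se), "significant": p < ALPHA}
    for name, (coef, se, p) in coefficients.items()
}

significant_variables = [n for n, r in results.items() if r["significant"]]
```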

Since the null hypothesis (that the coefficient is equal to zero) cannot be rejected for x1
and x2, these statistically insignificant variables, i.e., Advertisement Expenses (in INR) and
number of sales representatives (person), can be eliminated from the model.
Answer to Question 3(a):
The following binomial formula can be used to solve this problem. It gives the
probability of exactly x successes in n independent trials.

$P(x) = {}^{n}C_{x} \, p^{x} \, q^{n-x}$


Where:

n= the number of trials (or the number being sampled)

x= the number of successes desired

p= the probability of getting success in one trial

q= 1-p = the probability of getting failure in one trial

In the current case study:

n=25

p=0.75

q=1-0.75=0.25

a) Probability that exactly 15 of them would agree with the claim, i.e., x = 15
$P(15) = {}^{25}C_{15} \, p^{15} \, q^{25-15}$
$= {}^{25}C_{15} \, (0.75)^{15} \, (0.25)^{10}$
$= 3268760 \times 0.013363461 \times 9.53674 \times 10^{-7}$
$= 0.041658351$

So, the probability that exactly 15 of the 25 Instagram users approached would agree
with the claim that they love Insta-REELS is 4.17%.

b) Probability that exactly 20 of them would agree with the claim, i.e., x = 20
$P(20) = {}^{25}C_{20} \, p^{20} \, q^{25-20}$
$= {}^{25}C_{20} \, (0.75)^{20} \, (0.25)^{5}$
$= 53130 \times 0.003171212 \times 0.000976563$
$= 0.164537588$

Therefore, the probability that exactly 20 of the 25 Instagram users approached
would agree with the claim that they love Insta-REELS is 16.45%.
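
Both binomial results can be checked with a short Python sketch. The function name is an illustrative assumption; `math.comb` from the standard library supplies the number of combinations.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(exactly x successes in n independent trials, success probability p)."""
    q = 1 - p                      # probability of failure in one trial
    return comb(n, x) * p**x * q**(n - x)

n, p = 25, 0.75                    # values given in the case study

p_15 = binomial_pmf(15, n, p)      # part a)
p_20 = binomial_pmf(20, n, p)      # part b)
```

Summing `binomial_pmf(x, n, p)` over x from 0 to 25 should give one, which is a convenient sanity check on the formula.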
Answer to Question 3(b):
In the present case study, ‘Bhartdarshan’ is an Internet-based travel agency on whose site
customers can see videos of cities to plan their visit. The number of hits daily is a
normally distributed random variable with a mean of 10,000 and a standard deviation of
2,400.

Probabilities for intervals of values of a normal distribution can be
determined by using the mean, the standard deviation, the z formula, and the z distribution
table.

a. What is the probability of getting more than 12,000 hits?

The problem can be stated as below:

P(x > 12,000 | µ = 10,000 and σ = 2,400) = ?

Let’s find the z value:

$z = \frac{x - \mu}{\sigma}$

$z = \frac{12000 - 10000}{2400} = \frac{2000}{2400} = 0.833$

Therefore, x (12,000) is 0.833 standard deviations above the mean (10,000). Looking up
z = 0.83 in the z distribution table yields a probability of 0.2967.

The probability given in the z table is for a value between the mean of the z
distribution (z = 0) and the given value of z. Thus, there is an area (probability) of 0.2967
between the mean of the z distribution (z = 0) and the z value of interest (0.83). However, in our
case we want the tail of the distribution, i.e., the area above z = 0.83. The normal
curve has an area of 1 and is symmetrical, so the area under the upper half of the curve
is 0.5000. Subtracting 0.2967 from 0.5000 gives 0.2033 for the area of the
upper tail of the distribution. The probability of getting more than 12,000 hits is 0.2033 or
20.33%.
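
The table lookup can be cross-checked with `statistics.NormalDist` from the Python standard library. This is a sketch with illustrative variable names. It computes the tail both with the unrounded z value and with z rounded to two decimals, since a printed z table forces that rounding.

```python
from statistics import NormalDist

# Daily hits: mean and standard deviation as stated in the case.
hits = NormalDist(mu=10_000, sigma=2_400)

# Upper tail without rounding: P(X > 12,000) = 1 - P(X <= 12,000).
p_more_than_12000 = 1 - hits.cdf(12_000)

# Same quantity after rounding z to two decimals, as a printed z table requires.
z_rounded = round((12_000 - 10_000) / 2_400, 2)
p_table_style = 1 - NormalDist().cdf(z_rounded)
```

The two results differ slightly in the third decimal place because 0.8333 is rounded down to 0.83 before the table lookup; the table-style value matches the 0.2033 obtained above.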
b. What is the probability of getting less than 9,000 hits?

The problem can be stated as below:

P(x < 9,000 | µ = 10,000 and σ = 2,400) = ?

For the z value:

$z = \frac{x - \mu}{\sigma}$

$z = \frac{9000 - 10000}{2400} = \frac{-1000}{2400} = -0.42$

Therefore, x (9,000) is 0.42 standard deviations below the mean (10,000). Since the normal
distribution is symmetrical, the probability associated with z = -0.42 is the same as the
probability associated with z = 0.42. Looking this z value up in the z distribution table
yields a probability of 0.1628.

The probability given in the z table is for a value between the mean of the z
distribution (z = 0) and the given value of z. Thus, there is an area (probability) of 0.1628
between the mean of the z distribution (z = 0) and the z value of interest (-0.42). However, in our
case we want the tail of the distribution, i.e., the area below z = -0.42. The normal
curve has an area of 1 and is symmetrical, so the area under the lower half of the curve
is 0.5000. Subtracting 0.1628 from 0.5000 gives 0.3372 for the area of the lower
tail of the distribution. The probability of getting less than 9,000 hits is 0.3372 or 33.72%.
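
The z table procedure for the lower tail can be mirrored step by step in Python. The helper name `area_mean_to_z` is a hypothetical label for the quantity that the z table lists; `statistics.NormalDist` is part of the standard library.

```python
from statistics import NormalDist

MU, SIGMA = 10_000, 2_400          # mean and standard deviation from the case
STD_NORMAL = NormalDist()          # standard normal: mean 0, std deviation 1

def area_mean_to_z(z):
    """Area between z = 0 and |z|, which is what the z table lists."""
    return STD_NORMAL.cdf(abs(z)) - 0.5

# Standardise and round to two decimals, as for a table lookup.
z = round((9_000 - MU) / SIGMA, 2)

table_area = area_mean_to_z(z)          # area between the mean and z
p_less_than_9000 = 0.5 - table_area     # lower tail of the distribution
```

Because the curve is symmetrical, the helper uses the absolute value of z, which is the same step as looking up 0.42 instead of -0.42 in the table.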
