
SUBMITTED BY

SOORAJ S
Registration number: 11809348

MASTER OF BUSINESS ADMINISTRATION (MBA)


Mittal School of Business
LOVELY PROFESSIONAL UNIVERSITY
Phagwara, Punjab June 2022
Q1. What do you understand by the term Normal Distribution? Discuss the features of Normal Distribution.
Normal distribution, also known as the Gaussian distribution, is a probability distribution that is
symmetric about the mean, showing that data near the mean are more frequent in occurrence
than data far from the mean.
Features of Normal Distribution-

• It is symmetric
• The mean, median and mode are equal
• Empirical rule
• Skewness & kurtosis

It is symmetric - A normal distribution is perfectly symmetrical about its mean. In other words, the distribution curve can be split into two mirror-image halves, with 50% of the data falling on either side of the centre.

The Mean, Median and Mode are equal - The highest frequency, or the point with the most observations of the variable, lies at the centre of a normal distribution. All three measures of central tendency fall at this same central location, so in a perfectly normal distribution the mean, median and mode are equal.
Empirical rule - When data are normally distributed, the area under the curve between the mean and a given number of standard deviations from the mean is fixed. For instance, about 68.27 percent of all cases lie within +/- one standard deviation of the mean, about 95.45 percent lie within two standard deviations, and about 99.73 percent lie within three standard deviations of the mean.
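
The percentages above can be checked directly from the standard normal distribution. Below is a minimal sketch (assuming SciPy is available) that integrates the normal density within 1, 2 and 3 standard deviations of the mean.

```python
# Minimal sketch: verify the empirical rule with the standard normal CDF.
from scipy.stats import norm

for k in (1, 2, 3):
    # Probability mass within +/- k standard deviations of the mean
    coverage = norm.cdf(k) - norm.cdf(-k)
    print(f"Within +/- {k} SD: {coverage:.2%}")

# Expected output: roughly 68.27%, 95.45% and 99.73%
```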

Skewness and Kurtosis - The coefficients of skewness and kurtosis indicate how far a distribution departs from a normal distribution. Skewness measures the symmetry of the distribution, whereas kurtosis measures the thickness of its tails relative to the tails of a normal distribution.
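
As an illustration, the following sketch (assuming NumPy and SciPy, with a hypothetical simulated sample) shows that a normal sample has skewness and excess kurtosis both close to zero.

```python
# Minimal sketch: skewness and excess kurtosis of a simulated normal sample.
import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(0)
sample = rng.normal(loc=50, scale=10, size=10_000)  # hypothetical data

print("Skewness:", round(skew(sample), 3))             # ~0 for a normal sample
print("Excess kurtosis:", round(kurtosis(sample), 3))  # ~0 (Fisher definition)
```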
Q2. Discuss any two non-parametric tests in detail.
What is a Non-parametric test?
Non-parametric tests, often referred to as distribution-free tests, make fewer assumptions about the data set and rely less on the underlying distribution; for this reason they are generally considered less powerful than their parametric counterparts.

When is a Non-parametric test used?


•When the study is better represented by the median

•When the data does not have a normal distribution

•When there is ordinal data, ranked data, or outliers can’t be removed

•When the sample size is very small

•When the measurement scale is nominal or ordinal


Types of Non-parametric tests-

1. Kruskal-Wallis test
2. Spearman's rank correlation
3. Wilcoxon signed-rank test
4. Wilcoxon rank-sum test

Spearman's rank correlation- The strength and direction of the association between two ranked variables are measured by Spearman's rank correlation. It provides a measure of how well the relationship between the two variables can be described by a monotonic function.

Formula for Spearman's rank correlation-

ρ = 1 − (6 Σdᵢ²) / (n(n² − 1))

where,
ρ = Spearman's rank correlation coefficient
dᵢ = Difference between the two ranks of each observation
n = Number of observations
The Spearman Rank Correlation can take a value from +1 to -1 where,

•A value of +1 means a perfect association of rank


•A value of 0 means that there is no association between ranks
•A value of -1 means a perfect negative association of rank

Example-

Subject   Maths   English
A         35      24
B         20      35
C         49      39
D         44      48
E         30      45

Subject   Maths   Rank   English   Rank   d   d²
A         35      3      24        5      2   4
B         20      5      35        4      1   1
C         49      1      39        3      2   4
D         44      2      48        1      1   1
E         30      4      45        2      2   4

ρ = 1 − (6 × 14) / (5(25 − 1)) = 1 − 84/120 = 0.3

The Spearman’s Rank Correlation for the given data is 0.3. The value is near 0,
which means that there is a weak correlation between the two ranks.
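
The same result can be reproduced in a few lines. Below is a minimal sketch (assuming SciPy) that applies scipy.stats.spearmanr to the Maths and English marks from the example above.

```python
# Minimal sketch: Spearman's rank correlation for the worked example.
from scipy.stats import spearmanr

maths   = [35, 20, 49, 44, 30]
english = [24, 35, 39, 48, 45]

rho, p_value = spearmanr(maths, english)
print(f"Spearman's rho = {rho:.1f}")  # 0.3, matching the hand calculation
```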
Kruskal-Wallis Test- The Kruskal-Wallis test is the non-parametric alternative to the one-way ANOVA. Non-parametric means the test makes no assumption about the distribution of your data. The H test is applied when the ANOVA assumptions are not met (such as the assumption of normality). Because the test uses the ranks of the data values rather than the actual data points, it is often referred to as the one-way ANOVA on ranks.

The test examines whether there is a difference between the medians of two or more groups. As with most statistical tests, you compute a test statistic and compare it to a distribution cut-off point. The test statistic used in this test is the H statistic. The test's hypotheses are as follows:

•H0: population medians are equal.

•H1: population medians are not equal.


Assumptions for the Kruskal-Wallis Test-
• One independent variable with two or more levels (independent groups). The test is more commonly used when there are three or more levels; for two levels, consider using the Mann-Whitney U Test instead.

• The dependent variable should be on an ordinal, interval or ratio scale.

• Observations must be independent. In other words, there should be no relationship between the members within each group or between the groups. Refer to the Assumption of Independence for further details on this issue.

• The distributions of all groups should have similar shapes. Most statistical packages, including SPSS and Minitab, will check this condition.
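
To make the procedure concrete, here is a minimal sketch (assuming SciPy, with hypothetical data for three independent groups) of how the H test is run in practice.

```python
# Minimal sketch: Kruskal-Wallis H test on three hypothetical groups.
from scipy.stats import kruskal

group_a = [27, 2, 4, 18, 7, 9]     # hypothetical data
group_b = [20, 8, 14, 36, 21, 22]
group_c = [34, 31, 3, 23, 30, 6]

h_stat, p_value = kruskal(group_a, group_b, group_c)
print(f"H = {h_stat:.3f}, p = {p_value:.3f}")

# If p < 0.05, reject H0 and conclude that at least one group's median differs.
```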
Q3. Explain Multiple Regression Analysis. Also, discuss the assumptions.

What is Multiple Regression Analysis?

Multiple regression is a statistical technique that can be used to analyze the relationship
between a single dependent variable and several independent variables.
The objective of multiple regression analysis is to use the independent variables whose values are known to predict the value of the single dependent variable.

Example-

A researcher decides to study students' performance at a school over a period of time. He observed that as lectures moved online, students' performance began to decline. The dependent variable "decrease in performance" is explained by various independent variables such as "lack of attention, greater internet addiction, neglecting studies" and many more.
Advantages of Multiple regression

Only independent variables with non-zero regression coefficients are included in the regression equation.

The changes in the multiple standard errors of estimate and the coefficient of determination are shown.

The stepwise multiple regression is efficient in finding the regression equation with only significant
regression coefficients.

The steps involved in developing the regression equation are clear.

Assumptions of Multiple Regression Analysis –

• The variables considered for the model should be relevant and the model should be reliable.
• The relationship between the dependent and independent variables should be linear, not non-linear.
• The variables must be normally distributed.
• The variance of the residuals should be constant at all levels of the predicted variable (homoscedasticity).
• There should be a low degree of correlation between the independent variables.
• The model should be specified in a methodical manner: it should contain only the important variables and be accurate.
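
As an illustration of the technique, here is a minimal sketch (assuming pandas and statsmodels, with entirely hypothetical data and made-up variable names hours_online and attention_score) that fits a regression of a performance score on two independent variables.

```python
# Minimal sketch: multiple regression with two hypothetical predictors.
import pandas as pd
import statsmodels.api as sm

data = pd.DataFrame({
    "performance":     [78, 74, 69, 65, 60, 55, 52, 48],  # dependent variable (hypothetical)
    "hours_online":    [ 2,  3,  4,  5,  6,  7,  8,  9],  # independent variable 1
    "attention_score": [ 9,  8,  8,  7,  6,  5,  5,  4],  # independent variable 2
})

X = sm.add_constant(data[["hours_online", "attention_score"]])  # add intercept term
model = sm.OLS(data["performance"], X).fit()

print(model.params)    # estimated regression coefficients
print(model.rsquared)  # coefficient of determination (R squared)
```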

--------------------------------------------**-----------------------------------------------
