Question 1

Differentiate between the two general classes of significance tests. Which statistical
technique is appropriate when the testing involves two samples, the samples are
independent, and the data are interval? Why?

Prepared by Drashti Jasani


Question 2
List the various univariate and multivariate data analysis methods.

Question 3
What are the four levels of measurement? Explain each one with a suitable example.

Nominal level of measurement
• The first level of measurement is the nominal level of measurement. In this level, the values of the variable are used only to classify the data; words, letters, and alphanumeric symbols can be used. Suppose there are data about people belonging to three different gender categories. A person belonging to the female gender could be classified as F, a person belonging to the male gender as M, and a transgender person as T. This type of classification is the nominal level of measurement.

Ordinal level of measurement

• The second level of measurement is the ordinal level of measurement. This level depicts an ordered relationship among the variable’s observations. Suppose a student scores the highest grade in the class, 100. He would be assigned the first rank. Another classmate scores the second highest grade, a 92; she would be assigned the second rank. A third student scores an 81 and would be assigned the third rank, and so on. The ordinal level of measurement indicates the ordering of the measurements, not the size of the gaps between them.

Interval level of measurement
• The third level of measurement is the interval level of measurement. The interval level not only classifies and orders the measurements, but also specifies that the distances between adjacent points on the scale are equal along the whole scale, from low to high. For example, on an interval-level anxiety scale, the difference between scores of 10 and 11 is the same as the difference between scores of 40 and 41. A popular example of this level of measurement is temperature in degrees Celsius (centigrade), where the distance between 94°C and 96°C is the same as the distance between 100°C and 102°C.

Ratio level of measurement
• The fourth level of measurement is the ratio level of measurement. In this level, the observations have equal intervals and, in addition, the scale has a true zero point at which the quantity is absent. This true zero sets the ratio level apart from the other levels, although its other properties are the same as those of the interval level: the divisions between points on the scale are an equal distance apart. Weight is an example: 0 kg means no weight, so 20 kg is twice as heavy as 10 kg.
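As a rough illustration of the four levels, the Python sketch below applies to each level only the kind of summary that level supports. The data values and variable names are invented for illustration and are not part of the original notes.

```python
from statistics import mean, median, mode

# Nominal: labels only classify, so the mode is the only meaningful "average".
gender = ["F", "M", "F", "T", "F", "M"]  # hypothetical codes
nominal_mode = mode(gender)

# Ordinal: ranks carry order, so the median is meaningful,
# but the gaps between ranks are not assumed to be equal.
ranks = [1, 2, 3, 4, 5]
ordinal_median = median(ranks)

# Interval: equal distances, so differences and means are meaningful.
# 0 degrees Celsius does not mean "no temperature", so ratios are not.
celsius = [94.0, 96.0, 100.0, 102.0]
same_gap = (celsius[1] - celsius[0]) == (celsius[3] - celsius[2])
interval_mean = mean(celsius)

# Ratio: a true zero makes ratios meaningful (20 kg is twice 10 kg).
weights_kg = [10.0, 20.0]
weight_ratio = weights_kg[1] / weights_kg[0]

print(nominal_mode, ordinal_median, same_gap, interval_mean, weight_ratio)
```

Each level keeps the operations of the level below it and adds one more: classification, then ordering, then equal differences, then meaningful ratios.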
Question 4
Describe your understanding of univariate and bivariate data analysis methods.

Data analysis is the methodical application of statistical measures to describe, analyze,
and evaluate data. Researchers use it to find patterns and relationships among variables.
Univariate, bivariate, and multivariate analysis are the major families of statistical techniques.

• Univariate Analysis

• Univariate analysis is the simplest method of quantitative data analysis. As the name suggests (“uni” means “one”), univariate analysis deals with only one variable at a time. It can be used to test hypotheses and draw inferences. The objective is to gather the data, describe and summarize it, and analyze the patterns in it.

• In a set of data, univariate analysis explores each variable separately. It analyzes the range and central tendency of the values and describes the pattern of responses for the variable.

• A variable is a condition or category that the data fall under. For instance, a demographic analysis may look at the variable “age” or the variable “weight.” It considers one variable at a time, i.e., either “age” or “weight.”

• The univariate method is commonly used when there is a single variable for each element in a data sample, or when a data set has multiple variables and each one is examined on its own.
• The patterns identified by univariate analysis can be described in the following ways:
• Central tendency (mean, median, and mode)
• Dispersion (range, variance)
• Quartiles (interquartile range)
• Standard deviation
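The measures listed above can be computed with Python's standard `statistics` module. This is a minimal sketch: the `ages` sample is hypothetical, and the variance and standard deviation shown are the sample (n - 1) versions.

```python
from statistics import mean, median, mode, quantiles, stdev, variance

ages = [21, 22, 22, 25, 30, 35, 41]  # hypothetical values of one variable

summary = {
    "mean": mean(ages),
    "median": median(ages),
    "mode": mode(ages),
    "range": max(ages) - min(ages),
}

# quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
q1, q2, q3 = quantiles(ages, n=4)
summary["iqr"] = q3 - q1

# Sample variance and standard deviation (divide by n - 1).
summary["variance"] = variance(ages)
summary["stdev"] = stdev(ages)

for name, value in summary.items():
    print(f"{name}: {value}")
```

Note that `quantiles` uses the "exclusive" method by default; other software may use a different quartile convention and report slightly different cut points.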

• Univariate data can be presented through graphs and tables:
• Bar Charts
• Pie Charts
• Histograms
• Frequency Distribution Tables
• Frequency Polygons

• Bivariate Analysis
• Bivariate analysis involves two variables and examines the relationship between them, including whether one may influence the other. For example, each Super Bowl year can be paired with the points scored by the winning team to see whether scores have changed over time.
• Types of Bivariate Analysis
• Scatter plots: a scatter plot shows paired observations as points and gives a visual impression of how one variable changes with the other.
• Regression analysis: it models how one variable changes as the other changes.
• Correlation coefficients: a correlation coefficient measures whether, and how strongly, the variables are linearly related. It ranges from -1 to +1: “0” suggests no linear relationship, “+1” a perfect positive correlation, and “-1” a perfect negative correlation.
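To make the correlation coefficient and regression ideas concrete, here is a small sketch that computes Pearson's r and a least-squares line by hand. The function names `pearson_r` and `fit_line` and the `hours`/`scores` data are illustrative assumptions, not part of the original notes.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


hours = [1, 2, 3, 4, 5]         # hypothetical study hours
scores = [52, 55, 61, 64, 68]   # hypothetical exam scores

r = pearson_r(hours, scores)
slope, intercept = fit_line(hours, scores)
print(f"r = {r:.3f}, score = {intercept:.1f} + {slope:.1f} * hours")
```

A value of r close to +1 here indicates a strong positive linear relationship; by itself it does not show that one variable causes the other.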
Question 5
What are the differences between parametric and non-parametric tests?

• Parametric Test Definition
• In statistics, a parametric test is a hypothesis test that makes generalizations about a parameter of the population, typically its mean. A common example is the t-test, which is based on Student's t-statistic. The t-test rests on the underlying assumption that the variable is normally distributed. The population mean is known, or is assumed to be known, and the population variance is estimated from the sample. The variables of concern are assumed to be measured on an interval scale.

Advantages of Parametric Tests
• Advantage 1: Parametric tests can provide trustworthy results with distributions that are skewed and nonnormal, provided the sample size is large enough
• Advantage 2: Parametric tests can provide trustworthy results when the groups have different amounts of variability, for example with versions of the t-test that do not assume equal variances
• Advantage 3: Parametric tests have greater statistical power when their assumptions are met

• Non-Parametric Test Definition
• A non-parametric test does not assume that the population follows a particular distribution described by specific parameters. It is also a kind of hypothesis test, but it does not rest on such distributional assumptions. A non-parametric test is typically based on differences in medians. For this reason it is also known as a distribution-free test. The test variables are measured at the ordinal or nominal level. If the independent variables are non-metric, a non-parametric test is usually performed.

Advantages of Nonparametric Tests
• Advantage 1: Nonparametric tests assess the median, which can be a better measure for some study areas
• Advantage 2: Nonparametric tests are valid when the sample size is small and the data are potentially nonnormal
• Advantage 3: Nonparametric tests can analyze ordinal data, ranked data, and data with outliers

Key Differences Between Parametric and Non-parametric Tests

Properties                   | Parametric                        | Non-parametric
Meaning                      | A statistical test in which specific assumptions are made about the population parameter is known as a parametric test. | A statistical test used in the case of non-metric independent variables is called a non-parametric test.
Assumptions                  | Yes                               | No
Value for central tendency   | Mean value                        | Median value
Probabilistic distribution   | Normal                            | Arbitrary
Population knowledge         | Requires                          | Does not require
Applicability                | Variables                         | Attributes & Variables
Measurement level            | Interval or ratio                 | Nominal or ordinal
Information about population | Completely known                  | Unavailable
Correlation test             | Pearson                           | Spearman
Examples                     | t-test, z-test, etc.              | Kruskal-Wallis, Mann-Whitney
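To contrast the two families on the same data, the sketch below computes a parametric statistic (the independent-samples t statistic with pooled variance) and a non-parametric one (the Mann-Whitney U statistic) for two independent groups. The function names and the two samples are hypothetical; only the test statistics are computed, since p-values require the t distribution or U tables, which are not shown here.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Independent-samples t statistic, assuming equal population variances."""
    na, nb = len(a), len(b)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))


def mann_whitney_u(a, b):
    """U statistic for sample a: pairs where a exceeds b (ties count 0.5)."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


group_a = [12, 15, 14, 16, 13]  # hypothetical interval-scale scores
group_b = [10, 11, 9, 12, 8]

t_stat = pooled_t(group_a, group_b)        # compares means
u_stat = mann_whitney_u(group_a, group_b)  # compares rank order
print(f"t = {t_stat:.2f} (df = {len(group_a) + len(group_b) - 2}), U = {u_stat}")
```

The t statistic uses the actual distances between values, which is why it needs interval or ratio data; U depends only on which value in each pair is larger, so it also works for ordinal data.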

Question 6
Explain: Type 1 and Type 2 errors

Question 7
Explain various steps of hypothesis testing

• Inferences about population characteristics (parameters) are often made on the basis of sample observations, especially when the population is large and it is not possible to enumerate all of its sampling units.
• In doing so, one starts from certain assumptions (hypothetical values) about the characteristics of the population, if such information is available. Such an assumption about the population is termed a statistical hypothesis, and it is tested on the basis of sample values.
• The procedure enables one to state a hypothesis and test its significance.
• “A claim or hypothesis about the population parameters is known as the null hypothesis and is written as H0.”
• This hypothesis is then tested against the available evidence, and a decision is made whether or not to reject it. If the null hypothesis is rejected, we accept the alternative hypothesis, which is written as H1.
• For hypothesis testing (tests of significance) we use both parametric tests and non-parametric, or distribution-free, tests.
• Parametric tests assume certain properties of the population from which we draw samples. Such assumptions may concern population parameters, sample size, etc.
• Non-parametric tests do not make such assumptions; they assume only nominal or ordinal data.
• Important parametric tests used for hypothesis testing are:
• (i) z-test
• (ii) t-test
• (iii) χ² test; and
• (iv) F-test
• When the χ² test is used as a test of goodness of fit or as a test of independence, it is a non-parametric test.
• As stated earlier, parametric tests used for hypothesis testing are based on the assumption of normality, i.e., the population is considered to be normally distributed.
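The procedure described above can be sketched with a one-sample, two-sided z-test: state H0 and H1, choose a significance level, compute the test statistic, find the p-value, and decide. The function name `z_test` and all numbers are assumptions chosen for illustration; the sketch also assumes the population standard deviation is known and the population is normal.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided one-sample z-test of H0: mu = mu0 against H1: mu != mu0."""
    # Step 3: test statistic, in standard-error units.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Step 4: two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 5: reject H0 when the p-value falls below alpha.
    return z, p_value, p_value < alpha


# Steps 1 and 2: H0: mu = 50, H1: mu != 50, alpha = 0.05 (hypothetical).
z, p, reject = z_test(sample_mean=52.0, mu0=50.0, sigma=5.0, n=25)
print(f"z = {z:.2f}, p = {p:.4f}, reject H0: {reject}")
```

With these numbers the statistic is z = 2.0, which lies just beyond the usual two-sided critical value of 1.96, so H0 is rejected at the 5% level.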

Question 8
Explain the applications of various univariate data tests with an example

Question 9
Explain the difference between ANOVA and Chi-square

• The chi-square test and Analysis of Variance (ANOVA) are both inferential statistical tests. Inferential statistics are used to determine whether the data we observe in a sample (i.e., the data we collect) differ from what one would expect by chance alone. Put more simply, we want to determine whether the relationships among variables or differences between groups that we see in our sample data also hold in the entire population.

• That said, chi-square is used when we have two categorical variables (e.g., gender and alive/dead) and want to determine whether one variable is related to the other. In ANOVA, we have two or more group means (averages) that we want to compare. In an ANOVA, one variable must be categorical and the other must be continuous. For example, we may want to examine whether marijuana use (0 to 25 times) differs by grade level (9th grade, 10th grade, 11th grade).
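As a minimal sketch of the chi-square test of independence for two categorical variables, the code below computes the statistic from a contingency table of observed counts. The function name and the counts are hypothetical.

```python
def chi_square_statistic(table):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows = two gender groups, columns = alive / dead.
observed = [[30, 10],
            [20, 40]]

chi2 = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
print(f"chi-square = {chi2:.2f}, df = {df}")
# For df = 1 the 5% critical value is 3.841; a larger statistic
# suggests the two variables are related.
```

The statistic grows as the observed counts move away from the counts expected under independence, so a table with identical rows gives a value of zero.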

ANOVA (Analysis of Variance)

• ANOVA is a statistical technique that assesses potential differences in a scale-level dependent variable by a nominal-level variable having 2 or more categories. For example, an ANOVA can examine potential differences in IQ scores by Country (US vs. Canada vs. Italy vs. Spain). Developed by Ronald Fisher in 1918, this test extends the t-test and the z-test, which allow the nominal-level variable to have only two categories. This test is also called the Fisher analysis of variance.
• The use of ANOVA depends on the research design. Commonly, ANOVAs are used in three ways: one-way ANOVA, two-way ANOVA, and N-way ANOVA.

• One-Way ANOVA
• A one-way ANOVA has just one independent variable. For example, differences in IQ can be assessed by Country, and Country can have 2, 20, or more different categories to compare.

• Two-Way ANOVA
• A two-way ANOVA (also called a factorial ANOVA) is an ANOVA that uses two independent variables. Expanding the example above, a two-way ANOVA can examine differences in IQ scores (the dependent variable) by Country (independent variable 1) and Gender (independent variable 2). A two-way ANOVA can also be used to examine the interaction between the two independent variables. Interactions indicate that differences are not uniform across all categories of the independent variables. For example, females may have higher IQ scores overall compared to males, but this difference could be greater (or smaller) in European countries than in North American countries.

• N-Way ANOVA
• A researcher can also use more than two independent variables; this is an n-way ANOVA, with n being the number of independent variables. For example, potential differences in IQ scores can be examined by Country, Gender, Age group, Ethnicity, etc., simultaneously.
• General Purpose and Procedure
• Omnibus ANOVA test:
• The null hypothesis for an ANOVA is that there is no significant difference among the group means. The alternative hypothesis is that there is at least one significant difference among the groups. After cleaning the data, the researcher must test the assumptions of ANOVA.
• They must then calculate the F-ratio and the associated probability value (p-value). In general, if the p-value associated with the F is smaller than .05, the null hypothesis is rejected and the alternative hypothesis is supported.
• If the null hypothesis is rejected, one concludes that not all of the group means are equal. Post-hoc tests tell the researcher which groups differ from each other.
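The F-ratio mentioned above compares the variation between group means with the variation within groups. The following sketch computes it for a one-way ANOVA; the function name and the three small groups are hypothetical, and the p-value is omitted because it requires the F distribution, which the standard library does not provide.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = mean(all_values)

    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores for three groups.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_ratio, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

A large F means the group means differ by more than the within-group scatter would explain; the p-value is then read from the F distribution with the two degrees of freedom returned here.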
