By:
Ramandeep Kaur
Assistant Professor
Economics
Introduction
• Minitab was developed at the Pennsylvania State University by a group of
researchers in 1972.
• Minitab is a statistical software package for data analysis. It is widely
used in a variety of industries, including healthcare, manufacturing, and
education. Minitab provides users with tools to perform statistical
analysis, including hypothesis testing, regression analysis, and ANOVA.
• Minitab Statistical Software Version 21.1.0 is available
How to Download it?
• Go to minitab.com
• An email with a link to download the software arrives in about 15 minutes
Types of Statistics
1) Descriptive:
• Descriptive statistics is the part of statistics used to describe data.
It summarizes the attributes of a sample in such a way that a pattern can
be drawn from the group.
• Descriptive statistics uses two tools to organize and describe data.
These are given as follows:
Measures of Central Tendency - mean, median, and mode.
Measures of Dispersion - range, standard deviation, variance, quartiles,
and absolute deviation.
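The measures listed above can be sketched in a few lines of Python using only the standard library. The data values are hypothetical and are not taken from any file used in these slides; the quartile method is the `statistics` module's default, which may differ slightly from the method Minitab uses.

```python
import statistics

# Hypothetical sample: number of online courses completed by 10 students.
courses = [2, 3, 3, 4, 5, 5, 5, 6, 8, 9]

# Measures of central tendency
mean = statistics.mean(courses)
median = statistics.median(courses)
mode = statistics.mode(courses)

# Measures of dispersion
data_range = max(courses) - min(courses)
variance = statistics.variance(courses)          # sample variance (n - 1 denominator)
std_dev = statistics.stdev(courses)              # sample standard deviation
quartiles = statistics.quantiles(courses, n=4)   # three cut points: Q1, Q2, Q3

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={data_range}, variance={variance:.2f}, std dev={std_dev:.2f}")
print(f"quartiles={quartiles}")
```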
Types of Statistics
2) Inferential
• Inferential statistics is a branch of statistics that is used to make
inferences about the population by analyzing a sample. When the
population is very large, it becomes impractical to measure all of it. In
such cases, samples that are representative of the entire population are
taken. Inferential statistics draws conclusions regarding the
population using these samples.
• Some methodologies used in inferential statistics are as follows:
➢ Hypothesis Testing - z-test, F-test, t-test, etc.
➢ Regression Analysis
Overview of Minitab
Session Window
Data Input methods
• Write directly in Minitab worksheet
• Copy and paste from Excel: press Ctrl+A and Ctrl+C in the Excel file, then Ctrl+V in the Minitab worksheet
Pie Chart
• To create a pie chart using summarized data:
1. Enter the data into a blank Minitab worksheet with one column containing
the Campus names and a second column containing the Count for each campus
(open file campus count pie chart)
2. From the tool bar, select Graph > Pie Chart...
3. Select Summarized Data in a Table
4. Click OK
5. Double click Campus in the box on the left to insert it into the Categorical
variable box on the right
6. Double click Count in the box on the left to insert it into the Summary
variables box on the right
7. Click OK
Bar Graph
(used for categorical data)
To create a bar graph of the primary campus variable in Minitab:
1. Open the data file in Minitab
2. From the tool bar, select Graph > Bar Chart > Counts of Unique
Values...
3. Select Simple
4. Click OK
5. Double click the variable Primary Campus in the box on the left to
insert it into the Categorical variable box on the right
6. Click OK
• To create a bar chart using summarized data:
1. Enter the data into a blank Minitab worksheet with one column containing
the Campus names and a second column containing the Count for each
campus
2. From the tool bar, select Graph > Bar Chart > Values from a Table...
3. Under One Column of Values, select Simple
4. Click OK
5. Double click Count in the box on the left to insert it into the Graph
variables box on the right
6. Double click Campus in the box on the left to insert it into the Categorical
variable box on the right
7. Click OK
• To create a clustered bar chart of the Work Status and Primary
Campus variables in Minitab:
1. Open the data file in Minitab
2. From the tool bar, select Graph > Bar Chart > Counts of Unique
Values
3. Select Cluster (Select Stack for stacking)
4. Click OK
5. Double click the variables Work Status and Primary Campus to insert
them both into the Categorical variables box on the right
6. Click OK
Histograms
(used for quantitative data)
• To create a histogram in Minitab of the number of online courses
completed:
1. Open the data set in Minitab
2. From the tool bar, select Graph > Histogram...
3. Under One Y Variable, select Simple
4. Click OK
5. Double click the variable Online Courses Completed in the box on the
left to insert it into the Y-variable box on the right
6. Click OK
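What a histogram does with a quantitative variable can be sketched outside Minitab as well: the values are grouped into equal-width bins and the count in each bin is drawn as a bar. The following Python sketch is only an illustration; the data and the bin width of 3 are hypothetical, and Minitab chooses its own bins by default.

```python
from collections import Counter

# Hypothetical data: online courses completed by each student.
courses_completed = [0, 1, 1, 2, 2, 2, 3, 4, 4, 5, 7, 9, 12]
bin_width = 3  # assumed bin width, chosen only for this illustration

# Assign each value to the bin that starts at the nearest lower multiple of bin_width.
bin_counts = Counter((value // bin_width) * bin_width for value in courses_completed)

# Print a text histogram, one row per bin, including empty bins.
for start in range(0, max(courses_completed) + 1, bin_width):
    count = bin_counts.get(start, 0)
    print(f"{start:>2}-{start + bin_width - 1:<2} | {'#' * count} ({count})")
```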
Symmetry/Skewness
• Quantitative variables are often
discussed in terms of their
shape. Both dotplots and
histograms can be used to
interpret a distribution's shape.
• Symmetrical Distribution
A distribution that is similar on
both sides of the center.
Normal Distribution
One specific type of symmetrical distribution. This is also known as
a bell-shaped distribution.
Skewed
A distribution in which values are
more spread out on one side of
the center than on the other.
Right Skewed
A distribution in which the higher
values (towards the right on a
number line) are more spread
out than the lower values. This is
also known as positively skewed.
Left Skewed
A distribution in which the lower values (towards the left on a number line) are more spread
out than the higher values. This is also known as negatively skewed.
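The shapes described above can also be checked numerically. The sketch below uses one common definition of skewness (the third central moment divided by the 1.5 power of the second); the skewness value reported by Minitab may use a slightly different adjustment, so treat this as an illustration only. All three samples are hypothetical.

```python
import statistics

def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5, using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical samples chosen to show each shape.
symmetric = [1, 2, 3, 4, 5, 6, 7]
right_skewed = [1, 1, 2, 2, 3, 4, 15]
left_skewed = [1, 12, 13, 14, 14, 15, 15]

for name, data in [("symmetric", symmetric),
                   ("right skewed", right_skewed),
                   ("left skewed", left_skewed)]:
    print(f"{name}: mean={statistics.mean(data):.2f}, "
          f"median={statistics.median(data)}, skewness={skewness(data):.2f}")
```

In these samples the mean is pulled toward the long tail: it lies above the median in the right skewed sample and below the median in the left skewed sample.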
❑Measures of Central Tendency
• Mean, Median, Mode
❑Measures of Spread
• The standard deviation is the most commonly used measure of
variability
• It is denoted σ (sigma) for a population and s for a sample
• When computing the standard deviation by hand, it is necessary to
first compute the variance. The variance is equal to the standard
deviation squared, so the standard deviation is the square root of the
variance.
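The by-hand calculation can be followed step by step in a short Python sketch. The five measurements are hypothetical, and the sample formula (dividing by n - 1) is assumed.

```python
import math

# Hypothetical sample of five measurements.
sample = [4, 8, 6, 5, 7]

n = len(sample)
mean = sum(sample) / n                              # step 1: sample mean
squared_devs = [(x - mean) ** 2 for x in sample]    # step 2: squared deviations from the mean
variance = sum(squared_devs) / (n - 1)              # step 3: sample variance
std_dev = math.sqrt(variance)                       # step 4: standard deviation

print(f"mean={mean}, variance={variance}, std dev={std_dev:.3f}")
```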
To obtain measures of central tendency and variability in Minitab:
• Explanatory variable
Variable that is used to explain variability in the response variable,
also known as an independent variable or predictor variable.
• Response variable
The outcome variable, also known as a dependent variable.
• Relationship: There is a positive
linear relationship between
height and shoe size in this
sample.
• There is a negative linear
relationship between the
maximum daily temperature and
coffee sales.
The file below contains data concerning students' quiz averages and final
exam scores. Let's construct a scatterplot with the quiz averages on the x-
axis and final exam scores on the y-axis.
• Grades.csv
1. Open the data file in Minitab
2. From the tool bar, select Graph > Scatterplot > Simple
3. Double click the variable Final on the left to move it to the Y variable box
on the right
4. Double click the variable Quiz_Average on the left to move it to the X
variable box on the right
5. Click OK
Correlation
A measure of the direction and strength of the relationship between
two variables.
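As an illustration of how direction and strength are captured in a single number, Pearson's r can be computed directly from its definition. The sketch below is in Python; all data values are hypothetical (they are not taken from Grades.csv) and only mimic the positive and negative relationships described earlier.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for illustration only.
quiz_average = [60, 70, 75, 80, 90]
final = [62, 68, 80, 79, 95]
max_temp = [10, 15, 20, 25, 30]
coffee_sales = [200, 180, 150, 120, 90]

r_grades = pearson_r(quiz_average, final)
r_coffee = pearson_r(max_temp, coffee_sales)
print(f"quiz average vs final: r = {r_grades:.3f}")
print(f"temperature vs coffee sales: r = {r_coffee:.3f}")
```

A value of r near +1 indicates a strong positive linear relationship, a value near -1 a strong negative one, and a value near 0 little or no linear relationship.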
• Properties of Pearson’s r:
Correlation: Relationships
• p-values for the variables: both are significant (less than 0.05).
• One-Sample t-test
In a one-sample t-test, we compare the mean of one group against a
set value. This set value can be any theoretical value (or it can be
the population mean).
• Independent Two-Sample t-test
The two-sample t-test is used to compare the means of two
different samples.
• Paired Sample t-test
Here, we measure one group at two different times.
T-Test
• When Should We Perform a T-test?
• Example: Consider a telecom company that has two stores in the city. The
company wants to find out whether the average time required to service a customer is the
same in both stores. The company measures the time taken by 50 random
customers in each store. Store A averages 22 minutes, while Store B averages 25 minutes.
Can we say that Store A is more efficient than Store B in terms of customer service?
• It does seem that way, doesn’t it? However, we have only looked at 50 random customers
out of the many people who visit the stores. The sample averages alone
might not be representative of all the customers who visit both stores.
• This is where the t-test comes into play. It helps us understand if the
difference between two sample means is actually real or simply due to chance.
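The store example can be turned into a two-sample t statistic with a short Python sketch. The means (22 and 25 minutes) and the sample sizes (50 each) come from the example; the standard deviations of 6 and 7 minutes are hypothetical, because the example does not give them. The sketch uses Welch's form of the test, which does not assume equal variances; this is an assumption of the sketch, not something stated in the example.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t statistic, approximate degrees of freedom) for Welch's two-sample t-test."""
    v1 = sd1 ** 2 / n1   # squared standard error of the first sample mean
    v2 = sd2 ** 2 / n2   # squared standard error of the second sample mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Means and sample sizes from the example; standard deviations are hypothetical.
t_stat, df = welch_t(22, 6, 50, 25, 7, 50)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

The t statistic is then compared with a critical value from the t distribution (or converted to a p-value) at the chosen significance level. With different assumed standard deviations the conclusion could change, which is why the sample means alone are not enough.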
Assumptions for Performing a T-test
• From the given data, it may be concluded that there is no statistically significant change in driving distance due
to the new coating on the golf balls.
• However, it is recommended that the test be carried out with a larger sample size covering a number of golf courses
(at least five different ones) to improve the accuracy of the test results and to negate any effect of one type of ground.
Also, the results need to be interpreted, and future actions planned, with an understanding of other characteristics
such as size, shape, and weight.
Paired t test
Z-Test
• Z-statistic – Z-Test
• The z-statistic is used when the sample follows a normal distribution
and the population standard deviation is known. It is calculated from
population parameters such as the mean and standard deviation.
The one-sample z-test is used when we want to compare a sample
mean with a population mean.
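A minimal Python sketch of a one-sample z-test follows, assuming the population standard deviation is known. All numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math
from statistics import NormalDist

def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """Return (z statistic, two-sided p-value) for a one-sample z-test."""
    z = (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical numbers: a sample of 36 with mean 52, tested against a
# population mean of 50 with known population standard deviation 6.
z, p = one_sample_z(52, 50, 6, 36)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Here the standard error is 6 / √36 = 1, so z = (52 - 50) / 1 = 2. The two-sided p-value is just under 0.05, so at the 5% level the sample mean would be judged different from the population mean.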