
Raja Daniyal

0000242740

8614

Educational Statistics

Spring-2023

Assignment No. 1
Question#1: How descriptive and inferential statistics help a
teacher? Explain.

Answer: Difference between Descriptive and Inferential Statistics


Descriptive and inferential statistics are two broad categories in the field of statistics. This answer
shows how both types of statistics serve different purposes. Interestingly, some of the statistical
measures are similar, but the goals and methodologies are very different.

Descriptive Statistics

Both descriptive and inferential statistics help make sense out of row after row of data!

Use descriptive statistics to summarize and graph the data for a group that you choose. This process
allows you to understand that specific set of observations.

Descriptive statistics describe a sample. That’s pretty straightforward. You simply take a group
that you’re interested in, record data about the group members, and then use summary statistics
and graphs to present the group properties. With descriptive statistics, there is no uncertainty
because you are describing only the people or items that you actually measure. You’re not trying
to infer properties about a larger population.

The process involves taking a potentially large number of data points in the sample and reducing
them down to a few meaningful summary values and graphs. This procedure allows us to gain
more insight and visualize the data, rather than simply poring over row upon row of raw numbers!

Common tools of descriptive statistics

Descriptive statistics frequently use the following statistical measures to describe groups:

Central tendency: Use the mean or the median to locate the center of the dataset. This measure
tells you where most values fall.

Dispersion: How far out from the center do the data extend? You can use the range or standard
deviation to measure the dispersion. A low dispersion indicates that the values cluster more tightly
around the center. Higher dispersion signifies that data points fall further away from the center.
We can also graph the frequency distribution.
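As a concrete illustration, here is a minimal sketch, using only Python's standard library and made-up scores, of how these common descriptive measures could be computed:

```python
# A minimal sketch (not from the original text): computing the common
# descriptive measures with Python's standard library. The scores below
# are invented illustration data.
import statistics

scores = [72, 85, 91, 68, 77, 88, 95, 70, 82, 79]

print("Mean:", statistics.mean(scores))        # center of the data
print("Median:", statistics.median(scores))    # robust center
print("Range:", max(scores) - min(scores))     # simplest dispersion measure
print("Std. dev.:", statistics.stdev(scores))  # spread around the mean
```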

Example of descriptive statistics

Suppose we want to describe the test scores in a specific class of 30 students. We record all of
the test scores, calculate the summary statistics, and produce graphs.
Statistic           Class value
Mean                79.18
Range               66.21 – 96.53
Proportion >= 70    86.7%

These results indicate that the mean score of this class is 79.18. The scores range from 66.21 to
96.53 and the distribution is symmetrically centered on the mean. A score of at least 70 on the
test is acceptable. The data show that 86.7% of the students have acceptable scores.

Collectively, this information gives us a pretty good picture of this specific class. There is no
uncertainty surrounding these statistics because we gathered the scores for everyone in the class.
However, we can’t take these results and extrapolate to a larger population of students.
We’ll do that later.

A good exploratory tool for descriptive statistics is the five-number summary, which presents a set
of distributional properties for your sample.
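As a sketch of how such a summary could be produced, here is one way with the standard library (hypothetical data; statistics.quantiles requires Python 3.8+):

```python
# Hedged sketch: a five-number summary (min, Q1, median, Q3, max) for
# made-up scores, using only the standard library.
import statistics

scores = [72, 85, 91, 68, 77, 88, 95, 70, 82, 79]
q1, q2, q3 = statistics.quantiles(scores, n=4)  # quartile cut points
print(min(scores), q1, q2, q3, max(scores))
```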

Inferential Statistics

Inferential statistics takes data from a sample and makes inferences about the larger population
from which the sample was drawn. Because the goal of inferential statistics is to draw conclusions
from a sample and generalize them to a population, we need to have confidence that our sample
accurately reflects the population. This requirement affects our process. At a broad level, we must
do the following:

1. Define the population we are studying.


2. Draw a representative sample from that population.
3. Use analyses that incorporate the sampling error.

We don’t get to pick a convenient group. Instead, random sampling allows us to have confidence
that the sample represents the population. This process is a primary method for obtaining samples
that mirror the population on average. Random sampling produces statistics, such as the mean,
that do not tend to be too high or too low. Using a random sample, we can generalize from the
sample to the broader population. Unfortunately, gathering a truly random sample can be a
complicated process.
You can use the following methods to collect a representative sample; a brief code sketch of two
of them follows the list:

• Simple random sampling
• Stratified sampling
• Cluster sampling
• Systematic sampling
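As promised above, here is an illustrative sketch of simple random and systematic sampling applied to a hypothetical population list:

```python
# Illustrative sketch only: two of the sampling schemes above, applied to
# a hypothetical population of 1,000 student ID numbers.
import random

population = list(range(1, 1001))

# Simple random sampling: every member has an equal chance of selection.
simple_sample = random.sample(population, k=100)

# Systematic sampling: pick every k-th member after a random start.
step = len(population) // 100
start = random.randrange(step)
systematic_sample = population[start::step][:100]
```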
In contrast, convenience sampling doesn’t tend to obtain representative samples. These samples
are easier to collect but the results are minimally useful.

Pros and cons of working with samples


You gain tremendous benefits by working with a random sample drawn from a population. In most
cases, it is simply impossible to measure the entire population to understand its properties. The
alternative is to gather a random sample and then use the methodologies of inferential statistics to
analyze the sample data.

While samples are much more practical and less expensive to work with, there are tradeoffs.
Typically, we learn about the population by drawing a relatively small sample from it. We are a
very long way off from measuring all people or objects in that population. Consequently, when
you estimate the properties of a population from a sample, the sample statistics are unlikely to
equal the actual population value exactly.

For instance, your sample mean is unlikely to equal the population mean exactly. The difference
between the sample statistic and the population value is the sampling error. Inferential statistics
incorporate estimates of this error into the statistical results.

In contrast, summary values in descriptive statistics are straightforward. The average score in a
specific class is a known value because we measured all individuals in that class. There is no
uncertainty.

Standard analysis tools of inferential statistics

The most common methodologies in inferential statistics are hypothesis tests, confidence intervals,
and regression analysis. Interestingly, these inferential methods can produce similar summary
values as descriptive statistics, such as the mean and standard deviation. However, as I’ll show
you, we use them very differently when making inferences.

Hypothesis tests

Hypothesis tests use sample data to answer questions like the following:

• Is the population mean greater than or less than a particular value?
• Are the means of two or more populations different from each other?
For example, if we study the effectiveness of a new medication by comparing the outcomes in a
treatment and control group, hypothesis tests can tell us whether the drug’s effect that we
observe in the sample is likely to exist in the population. After all, we don’t want to use the
medication if it is effective only in our specific sample. Instead, we need evidence that it’ll be
useful in the entire population of patients. Hypothesis tests allow us to draw these types of
conclusions about entire populations.
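As a sketch of how such a test might look in practice, assuming SciPy is available and using fabricated treatment/control outcomes:

```python
# Hedged sketch of a two-sample hypothesis test; the outcome data below
# are fabricated purely for illustration.
from scipy import stats

treatment = [14.1, 15.3, 13.8, 16.0, 15.1, 14.7, 15.8, 14.4]
control   = [12.9, 13.5, 12.4, 13.8, 13.1, 12.7, 13.9, 13.2]

t_stat, p_value = stats.ttest_ind(treatment, control)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A small p-value suggests the observed effect is unlikely to be a
# sample-only fluke, supporting an inference about the population.
```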

Confidence intervals (CIs)

In inferential statistics, a primary goal is to estimate population parameters. These parameters are
the unknown values for the entire population, such as the population mean and standard deviation.
These parameter values are not only unknown but almost always unknowable. Typically, it’s
impossible to measure an entire population. The sampling error I mentioned earlier produces
uncertainty, or a margin of error, around our estimates.

Suppose we define our population as all high school basketball players. Then, we draw a random
sample from this population and calculate a mean height of 181 cm. This sample estimate of 181
cm is the best estimate of the population's mean height. However, it's virtually guaranteed
that our estimate of the population parameter is not exactly correct.

Confidence intervals incorporate the uncertainty and sampling error to create a range of values
within which the actual population value is likely to fall. For example, a confidence interval of
[176, 186] indicates that we can be confident that the real population mean falls within this range.
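A t-based interval for this basketball example could be computed along these lines (a sketch with invented heights, assuming SciPy is available and the data are roughly normal):

```python
# Minimal sketch: a t-based 95% confidence interval for a population mean.
# The heights (in cm) are made up for illustration.
import statistics
from scipy import stats

heights = [178, 183, 176, 185, 180, 182, 179, 184, 181, 177]
n = len(heights)
mean = statistics.mean(heights)
sem = statistics.stdev(heights) / n ** 0.5  # standard error of the mean
t_crit = stats.t.ppf(0.975, df=n - 1)       # two-sided 95% critical value
print(f"95% CI: [{mean - t_crit * sem:.1f}, {mean + t_crit * sem:.1f}]")
```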

Regression analysis

Regression analysis describes the relationship between a set of independent variables and a
dependent variable. This analysis incorporates hypothesis tests that help determine whether the
relationships observed in the sample data actually exist in the population.
For example, consider a fitted regression model between height and weight in adolescent girls.
If the relationship is statistically significant, we have sufficient evidence to conclude that this
relationship exists in the population rather than just our sample.
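A minimal sketch of such a regression, assuming SciPy and using invented height/weight pairs as stand-ins:

```python
# Hedged sketch of simple linear regression; the data are invented
# stand-ins for the height/weight example.
from scipy import stats

height_cm = [150, 155, 160, 165, 170, 175]
weight_kg = [45, 49, 52, 57, 61, 66]

result = stats.linregress(height_cm, weight_kg)
print(f"slope = {result.slope:.2f} kg/cm, p = {result.pvalue:.4f}")
# A significant p-value for the slope supports inferring that the
# relationship holds in the population, not just in this sample.
```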

Example of inferential statistics


For this example, suppose we conducted our study on test scores for a specific class as I detailed
in the descriptive statistics section. Now we want to perform an inferential statistics study for that
same test. Let’s assume it is a standardized statewide test. By using the same test, but now with
the goal of drawing inferences about a population, I can show you how that changes the way we
conduct the study and the results that we present.
In descriptive statistics, we picked the specific class that we wanted to describe and recorded all
of the test scores for that class. Nice and simple. For inferential statistics, we need to define the
population and then draw a random sample from that population.

Let’s define our population as 8th-grade students in public schools in the State of Pennsylvania in
the United States. We need to devise a random sampling plan to help ensure a representative
sample. This process can actually be arduous. For the sake of this example, assume that we are
provided a list of names for the entire population and draw a random sample of 100 students from
it and obtain their test scores. Note that these students will not be in one class, but from many
different classes in different schools across the state.

Inferential statistics results

For inferential statistics, we can calculate the point estimate for the mean, standard deviation, and
proportion for our random sample. However, it is staggeringly improbable that any of these point
estimates are exactly correct, and there is no way to know for sure anyway. Because we can't
measure all subjects in this population, there is a margin of error around these statistics.
Consequently, I'll report the confidence intervals for the mean, standard deviation, and the
proportion of satisfactory scores (>= 70).

Statistic                  Population Parameter Estimate (95% CI)
Mean                       77.4 – 80.9
Standard deviation         7.7 – 10.1
Proportion scores >= 70    77% – 92%

Given the uncertainty associated with these estimates, we can be 95% confident that the population
mean is between 77.4 and 80.9. The population standard deviation (a measure of dispersion) is
likely to fall between 7.7 and 10.1. And, the population proportion of satisfactory scores is
expected to be between 77% and 92%.

Differences between Descriptive and Inferential Statistics

As you can see, the difference between descriptive and inferential statistics lies in the process as
much as it does the statistics that you report.

For descriptive statistics, we choose a group that we want to describe and then measure all subjects
in that group. The statistical summary describes this group with complete certainty (outside of
measurement error).

For inferential statistics, we need to define the population and then devise a sampling plan that
produces a representative sample. The statistical results incorporate the uncertainty that is inherent
in using a sample to understand an entire population. The sample size becomes a vital
characteristic. The law of large numbers states that as the sample size grows, the sample statistics
(e.g., the sample mean) will converge on the population values.
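A small simulation sketch of this convergence, with a made-up population mean of 100:

```python
# Sketch of the law of large numbers using simulated data: the sample
# mean drifts toward the population mean (set to 100 here) as n grows.
import random

random.seed(1)
for n in (10, 100, 1_000, 10_000):
    sample = [random.gauss(100, 15) for _ in range(n)]
    print(n, round(sum(sample) / n, 2))  # approaches 100 as n grows
```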

A study using descriptive statistics is simpler to perform. However, if you need evidence that an
effect or relationship between variables exists in an entire population rather than only your sample,
you need to use inferential statistics.

Question#2: Explain non-probability sampling techniques used in
educational research.

Answer: Non-Probability Sampling

When we are going to carry out an investigation and need to collect data, we have to know which
techniques we are going to use in order to be prepared. For this reason, there are two types of
sampling: random (probability) sampling and non-probability sampling. Here, we will talk
in depth about non-probability sampling.

Definition

Non-probability sampling is defined as a sampling technique in which the researcher selects
samples based on subjective judgment rather than random selection. It is a less stringent method.
This sampling method depends heavily on the expertise of the researchers. It is carried out by
observation, and researchers use it widely for qualitative research.

Non-probability sampling is a method in which not all population members have an equal chance
of participating in the study, unlike probability sampling, where each member of the population
has a known chance of being selected. Non-probability sampling is most useful for exploratory
studies like a pilot survey (deploying a survey to a smaller sample compared to the pre-determined
sample size). Researchers use this method in studies where it is impossible to draw a random
probability sample due to time or cost considerations.

Types of non-probability sampling

Here are the types of non-probability sampling methods:


1. Convenience sampling

Convenience sampling is a non-probability sampling technique where samples are selected
from the population only because they are conveniently available to the researcher.
Researchers choose these samples just because they are easy to recruit, without considering
whether the sample represents the entire population. Ideally, in research, it is good to test a
sample that represents the population. But in some research, the population is too large to
examine in its entirety. This is one of the reasons why researchers rely on convenience
sampling, the most common non-probability sampling method, because of its speed,
cost-effectiveness, and the easy availability of the sample.

2. Consecutive sampling

This non-probability sampling method is very similar to convenience sampling, with a slight
variation. Here, the researcher picks a single subject or a group of subjects, conducts
research over a period, analyzes the results, and then moves on to another subject or group
if needed. The consecutive sampling technique gives the researcher a chance to work with
many subjects and fine-tune the research by collecting results that offer vital insights.
3. Quota sampling

Hypothetically, consider that a researcher wants to study the career goals of male and female
employees in an organization. There are 500 employees in the organization, also known as the
population. To understand the population better, the researcher needs only a sample, not the
entire population. Further, the researcher is interested in particular strata within the population.
This is where quota sampling helps, by dividing the population into strata or groups and drawing
a fixed quota from each (a brief code sketch of this procedure follows).
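Here is an illustrative sketch of that procedure; the employee records and quotas are hypothetical, and the random within-stratum picks are a simplification (real quota sampling often fills quotas non-randomly):

```python
# Illustrative quota-sampling sketch with hypothetical employee records:
# split the 500 employees into strata by gender, then fill a fixed quota
# from each stratum.
import random

employees = [{"id": i, "gender": "male" if i % 2 else "female"}
             for i in range(500)]

quotas = {"male": 25, "female": 25}
sample = []
for gender, quota in quotas.items():
    stratum = [e for e in employees if e["gender"] == gender]
    sample.extend(random.sample(stratum, quota))

print(len(sample))  # 50 subjects, matching the quotas
```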

4. Judgmental or Purposive sampling

In the judgmental sampling method, researchers select the samples based purely on their own
knowledge and credibility. In other words, researchers choose only those people who they deem
fit to participate in the research study. Judgmental or purposive sampling is not a scientific
method of sampling, and the downside of this technique is that the researcher's preconceived
notions can influence the results. Thus, this research technique involves a high degree of
ambiguity.

5. Snowball sampling

Snowball sampling helps researchers find a sample when subjects are difficult to locate.
Researchers use this technique when the potential sample is small and not easily accessible. This
sampling system works like a referral program: once the researchers find suitable subjects, they
ask those subjects for assistance in locating similar subjects, to build a sample of a reasonably
good size.

Non-probability sampling examples

Here are three simple examples of non-probability sampling to understand the subject better.

1. An example of convenience sampling would be using student volunteers known to the
researcher. Researchers can send the survey to students belonging to a particular school,
college, or university, who then act as the sample.
2. In an organization, for studying the career goals of 500 employees, technically, the
sample selected should have proportionate numbers of males and females, which means
there should be 250 males and 250 females. Since this is unlikely, the researcher
selects the groups or strata using quota sampling.
3. Researchers also use snowball sampling to conduct research involving patients with a
particular illness or a rare disease. Researchers can ask subjects to refer other subjects
suffering from the same ailment to form a subjective sample for the study.

Uses of Non-probability Sampling

• Use this type of sampling to indicate whether a particular trait or characteristic exists in a
population.
• Researchers widely use the non-probability sampling method when they aim to conduct
qualitative research, pilot studies, or exploratory research.
• Researchers use it when they have limited time to conduct research or have budget
constraints.
• When researchers need to observe whether a particular issue requires in-depth analysis,
they apply this method.
• Use it when you do not intend to generate results that generalize to the entire population.

Advantages of Non-probability Sampling

Here are the advantages of using the non-probability technique:

• Non-probability sampling techniques are a more convenient and practical method for
researchers deploying surveys in the real world. Although statisticians prefer probability
sampling because it yields data in the form of numbers, non-probability sampling, if done
correctly, can produce results of similar if not the same quality.
• Getting responses using non-probability sampling is faster and more cost-effective than
probability sampling because the sample is known to the researcher. The respondents
respond quickly compared to people selected randomly, as they have a high motivation
level to participate.
Question#3: Give examples to describe variables commonly used in
educational research.

Answer: Variables in Research

In the context of a research study, a variable is some feature with the potential to change,
typically one that may influence or reflect a relationship or outcome. For example, potential
variables might be the time it takes for something to occur, whether or not an object is used
within a study, or the presence of a feature among members of the sample.

Within research, independent and dependent variables are key, forming the basis on which a
study is performed. However, other types of variables may come into play within a study, such as
confounding, controlled, extraneous, and moderator variables.

Dependent Variables in Research

A dependent variable is one being measured in an experiment, reflecting an outcome.
Researchers do not directly control this variable. Instead, they hope to learn something about the
relationship between different variables by observing how the dependent variable reacts under
different circumstances.

Although "dependent variable" is the most commonly used term, they may also be referred to
as response variables, outcome variable, or left-hand-side variable. These alternate names help to
further illustrate their purpose: a dependent variable shows a response to changes in other
variables, displaying the outcome.

The meaning of "left-hand-side" is less immediately transparent, but becomes more obvious when
considering the format of a basic algebraic equation. Typically, the dependent variable in these is
referred to as "Y" and placed on the left-hand-side of the equation. Because of this standard,
dependent variables may also be called the Y variable as well, and the dependent variable is usually
seen on the y-axis in graphs.
One example of a dependent variable would be a student's test scores. Several factors would
influence these scores, such as the amount of time spent studying, amount of sleep, or the stress
levels of the student. Ultimately, the dependent variable is not static or controlled directly, but is
subject to change depending on the independent variables involved.
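A toy simulation can make this concrete; the numbers below are invented purely for illustration:

```python
# Made-up simulation sketch: a test score (dependent variable) responding
# to hours studied (independent variable) plus random noise.
import random

random.seed(0)
for hours in [1, 3, 5, 8]:
    score = 50 + 5 * hours + random.gauss(0, 3)  # score depends on hours
    print(f"{hours} h studied -> score {score:.1f}")
```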

Independent Variables in Research

An independent variable is one that the researcher controls or otherwise manipulates within a
study. In order to determine the relationship between dependent and independent variables, a
researcher will purposefully change an independent variable, watching to see if and how the
dependent variable changes in response.

The independent variable can alternately be called the explanatory, predictor, right-hand-side, or
X variable. As with dependent variables, these alternate names reflect the uses of independent
variables: they are intended to explain or predict changes in the dependent variables. Likewise,
independent variables are often referred to as "X" in basic algebraic equations and plotted on the
x-axis. In research, the experimenters will generally control independent variables as much as
possible, so that they can understand their true relationship with the dependent variables.

For example, a research study might use age as an independent variable, since it influences some
potential dependent variables. Obviously, a researcher cannot randomly assign ages to
participants, but they can restrict a study to participants of certain ages or sort a sample into
desired age groups.

Comparing Dependent and Independent Variables

Research Topic                                        Independent Variable            Dependent Variable
All Research Topics                                   Manipulated by the researcher   Measured by the researcher
All Research Topics                                   What is being changed?          What is changing in response?
Plants grow faster in warmer temperatures.            Temperature                     Plant Growth
To what extent does traffic affect a person's mood?   Traffic                         Mood
People walk slower after drinking coffee.             Drinking Coffee                 Walking Speed

Examples of Independent and Dependent Variables in Research Studies

Many research studies have independent and dependent variables, since understanding
cause-and-effect relationships between them is a key end goal. Some examples of research
questions involving these variables include:

• How does sleep the night before an exam affect scores in students? The independent
variable is the amount of time slept (in hours), and the dependent variable is the test score.
• How does caffeine affect hunger? The amount of caffeine consumed would be the
independent variable, and hunger would be the dependent variable.
• Is quality of sleep affected by phone use before bedtime? The length of time spent on the
phone prior to sleeping would be the independent variable, and the quality of sleep would
be the dependent variable.
• Does listening to classical music help young children develop their reading abilities? The
frequency and level of classical music exposure would be the independent variables, and
reading scores would be the dependent variable.

Other Types of Variables in Research

While the independent and dependent variables are the most commonly discussed variables in
research, other variables can influence outcomes. These include confounding, extraneous, control,
and moderator variables.

Confounding Variables

A confounding variable, also known as a "third variable," changes the dependent variable despite
not being the independent variable under study. This can cause issues within a study. After all,
since variation in a confounding variable causes a response in the dependent variable, that
response may be misattributed to the independent variable. In order to ensure that the observed
outcome is due only to changes in the independent variables, it is crucial to determine what
confounding variables might sway the experimental results.
Question#4: Describe histogram as data interpretation technique.

Answer: A frequency distribution shows how often each different value in a set of data occurs. A
histogram is the most commonly used graph to show frequency distributions. It looks very much like
a bar chart, but there are important differences between them. This helpful data collection and
analysis tool is considered one of the seven basic quality tools.

USE A HISTOGRAM

Use a histogram when:

• The data are numerical.
• You want to see the shape of the data's distribution, especially when determining whether the
output of a process is distributed approximately normally.
• You are analyzing whether a process can meet the customer's requirements.
• You are analyzing what the output from a supplier's process looks like.
• You are seeing whether a process change has occurred from one time period to another.
• You are determining whether the outputs of two or more processes are different.
• You wish to communicate the distribution of data quickly and easily to others.

HOW TO CREATE A HISTOGRAM

1. Collect at least 50 consecutive data points from a process.
2. Use a histogram worksheet to set up the histogram. It will help you determine the number of
bars, the range of numbers that go into each bar, and the labels for the bar edges. After
calculating W (the bar width) in Step 2 of the worksheet, use your judgment to adjust it to a
convenient number. For example, you might decide to round 0.9 to an even 1.0. The value for
W must not have more decimal places than the numbers you will be graphing.
3. Draw x- and y-axes on graph paper. Mark and label the y-axis for counting data values. Mark
and label the x-axis with the L values (the bar-edge labels) from the worksheet. The spaces
between these numbers will be the bars of the histogram. Do not allow for spaces between bars.
4. For each data point, mark off one count above the appropriate bar with an X or by shading
that portion of the bar.
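As a rough sketch of these steps in code, assuming matplotlib is available and simulating the measurements rather than collecting them:

```python
# Hedged sketch of steps 1-4: build a histogram from 50+ process
# measurements. The data are simulated, and bins=8 stands in for the
# bar count a histogram worksheet would give you.
import random
import matplotlib.pyplot as plt

random.seed(0)
data = [random.gauss(10.0, 0.5) for _ in range(60)]  # step 1: 50+ points

plt.hist(data, bins=8, edgecolor="black")  # steps 2-4: bars with no gaps
plt.xlabel("Measured value")
plt.ylabel("Frequency")
plt.title("Process output")
plt.show()
```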

HISTOGRAM ANALYSIS

• Before drawing any conclusions from your histogram, be sure that the process was operating
normally during the time period being studied. If any unusual events affected the process during
the time period of the histogram, your analysis of the histogram shape likely cannot be
generalized to all time periods.
• Analyze the meaning of your histogram's shape. Typical histogram shapes and what they mean
are covered below.


TYPICAL HISTOGRAM SHAPES AND WHAT THEY MEAN

Normal Distribution

A common pattern is the bell-shaped curve known as the "normal distribution." In a normal or
"typical" distribution, points are as likely to occur on one side of the average as on the other.
Note that other distributions look similar to the normal distribution. Statistical calculations must be
used to prove a normal distribution.
It's important to note that "normal" refers to the typical distribution for a particular process. For
example, many processes have a natural limit on one side and will produce skewed distributions. This
is normal—meaning typical—for those processes, even if the distribution isn’t considered "normal."

Skewed Distribution

The skewed distribution is asymmetrical because a natural limit prevents outcomes on one side. The
distribution’s peak is off center toward the limit and a tail stretches away from it. For example, a
distribution of analyses of a very pure product would be skewed, because the product cannot be more
than 100 percent pure. Other examples of natural limits are holes that cannot be smaller than the
diameter of the drill bit or call-handling times that cannot be less than zero. These distributions are
called right- or left-skewed according to the direction of the tail.
Double-Peaked or Bimodal

The bimodal distribution looks like the back of a two-humped camel. The outcomes of two processes
with different distributions are combined in one set of data. For example, a distribution of production
data from a two-shift operation might be bimodal, if each shift produces a different distribution of
results. Stratification often reveals this problem.

Plateau or Multimodal Distribution

The plateau might be called a "multimodal distribution." Several processes with normal distributions
are combined. Because there are many peaks close together, the top of the distribution resembles a
plateau.

Edge Peak Distribution

The edge peak distribution looks like the normal distribution except that it has a large peak at one tail.
Usually this is caused by faulty construction of the histogram, with data lumped together into a group
labeled "greater than."

Comb Distribution

In a comb distribution, the bars are alternately tall and short. This distribution often results from
rounded-off data and/or an incorrectly constructed histogram. For example, temperature data rounded
off to the nearest 0.2 degree would show a comb shape if the bar width for the histogram were 0.1
degree.
Truncated or Heart-Cut Distribution

The truncated distribution looks like a normal distribution with the tails cut off. The supplier might be
producing a normal distribution of material and then relying on inspection to separate what is within
specification limits from what is out of spec. The resulting shipments to the customer from inside the
specifications are the heart cut.

Dog Food Distribution

The dog food distribution is missing something: results near the average. If a customer receives this
kind of distribution, someone else is receiving a heart cut, and the customer is left with the "dog
food," the odds and ends left over after the master's meal. Even though what the customer receives is
within specifications, the product falls into two clusters: one near the upper specification limit and one
near the lower specification limit. This variation often causes problems in the customer's process.
Question#5: Explain different measures of dispersion used in
educational research.
Answer: A measure of dispersion explains the extent of variability. Dispersion helps to
understand the disparity or spread within a dataset and gives us an idea of the variation around
its central value. Range, interquartile range, standard deviation, and mean deviation are the
commonly used measures of dispersion, and dispersion can be calculated and measured using
these methods.

Dispersion is the state of getting dispersed or spread. Statistical dispersion means the extent to
which numerical data is likely to vary about an average value. In other words, dispersion helps to
understand the distribution of the data.

Measures of Dispersion

In statistics, the measures of dispersion help to interpret the variability of data i.e. to know how
much homogenous or heterogeneous the data is. In simple terms, it shows how squeezed or
scattered the variable is.

Types of Measures of Dispersion

There are two main types of dispersion methods in statistics which are:

• Absolute Measure of Dispersion

• Relative Measure of Dispersion


Absolute Measure of Dispersion

An absolute measure of dispersion contains the same unit as the original data set. The absolute
dispersion method expresses the variations in terms of the average of deviations of observations
like standard or means deviations. It includes range, standard deviation, quartile deviation, etc.

The types of absolute measures of dispersion are:

1. Range: It is simply the difference between the maximum value and the minimum value
given in a data set. Example: for 1, 3, 5, 6, 7, Range = 7 − 1 = 6.

2. Variance: Subtract the mean from each value in the data set, square each of these
deviations, add the squares, and finally divide by the total number of values in the data set
to get the variance: σ² = Σ(X − μ)² / N

3. Standard Deviation: The square root of the variance is known as the standard deviation,
i.e., S.D. (σ) = √(σ²).

4. Quartiles and Quartile Deviation: The quartiles are values that divide a list of numbers
into quarters. The quartile deviation is half of the distance between the third and the first
quartile.

5. Mean and Mean Deviation: The average of the numbers is known as the mean, and the
arithmetic mean of the absolute deviations of the observations from a measure of central
tendency is known as the mean deviation (also called the mean absolute deviation).
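As a quick illustration, here is a minimal sketch computing each of these absolute measures for one small made-up dataset with Python's standard library:

```python
# Minimal sketch: the absolute measures of dispersion above, computed
# for a made-up dataset (statistics.quantiles requires Python 3.8+).
import statistics

data = [1, 3, 5, 6, 7]
mean = statistics.mean(data)

rng = max(data) - min(data)                        # 1. range
var = statistics.pvariance(data)                   # 2. population variance
sd = statistics.pstdev(data)                       # 3. standard deviation
q1, _, q3 = statistics.quantiles(data, n=4)
qd = (q3 - q1) / 2                                 # 4. quartile deviation
md = sum(abs(x - mean) for x in data) / len(data)  # 5. mean deviation

print(rng, var, sd, qd, md)
```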

Relative Measure of Dispersion

The relative measures of dispersion are used to compare the distribution of two or more data sets.
This measure compares values without units. Common relative dispersion methods include:

1. Co-efficient of Range

2. Co-efficient of Variation

3. Co-efficient of Standard Deviation

4. Co-efficient of Quartile Deviation

5. Co-efficient of Mean Deviation


Co-efficient of Dispersion

The coefficients of dispersion are calculated (along with the measure of dispersion) when two series
are compared, that differ widely in their averages. The dispersion coefficient is also used when two
series with different measurement units are compared. It is denoted as C.D.

The common coefficients of dispersion are:

Measure                      Coefficient of Dispersion (C.D.)
Range                        C.D. = (Xmax – Xmin) / (Xmax + Xmin)
Quartile Deviation           C.D. = (Q3 – Q1) / (Q3 + Q1)
Standard Deviation (S.D.)    C.D. = S.D. / Mean
Mean Deviation               C.D. = Mean Deviation / Average
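These formulas translate directly into code; here is a sketch reusing the data from Example 2 below:

```python
# Sketch of the coefficient-of-dispersion formulas in the table above,
# applied to the Example 2 data.
import statistics

data = [45, 55, 63, 76, 67, 84, 75, 48, 62, 65]
mean = statistics.mean(data)
q1, _, q3 = statistics.quantiles(data, n=4)

cd_range = (max(data) - min(data)) / (max(data) + min(data))  # ≈ 0.302
cd_quartile = (q3 - q1) / (q3 + q1)
cd_sd = statistics.pstdev(data) / mean
print(round(cd_range, 3), round(cd_quartile, 3), round(cd_sd, 3))
```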


Solved Examples

Example 1: Find the variance and standard deviation of the following numbers: 1, 3, 5, 5, 6, 7, 9, 10.
Solution:

The mean = (1 + 3 + 5 + 5 + 6 + 7 + 9 + 10)/8 = 46/8 = 5.75

Step 1: Subtract the mean value from individual value

(1 – 5.75), (3 – 5.75), (5 – 5.75), (5 – 5.75), (6 – 5.75), (7 – 5.75), (9 – 5.75), (10 – 5.75)

= -4.75, -2.75, -0.75, -0.75, 0.25, 1.25, 3.25, 4.25

Step 2: Squaring the above values, we get 22.5625, 7.5625, 0.5625, 0.5625, 0.0625, 1.5625,
10.5625, 18.0625

Step 3: 22.5625 + 7.5625 + 0.5625 + 0.5625 + 0.0625 + 1.5625 + 10.5625 + 18.0625 = 61.5

Step 4: n = 8, therefore variance (σ²) = 61.5/8 = 7.6875 ≈ 7.69

Now, standard deviation (σ) = √7.6875 ≈ 2.77
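As a quick check of this example, Python's statistics module gives the same values (pvariance and pstdev use the population formulas, dividing by N, which matches the worked steps above):

```python
# Verifying Example 1 with the standard library.
import statistics

nums = [1, 3, 5, 5, 6, 7, 9, 10]
print(statistics.pvariance(nums))         # 7.6875 ≈ 7.69
print(round(statistics.pstdev(nums), 2))  # 2.77
```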

Example 2: Calculate the range and coefficient of range for the following data
values.

45, 55, 63, 76, 67, 84, 75, 48, 62, 65

Solution:

Let Xi values be: 45, 55, 63, 76, 67, 84, 75, 48, 62, 65

Here,

Maximum value (Xmax) = 84

Minimum or Least value (Xmin) = 45

Range = Maximum value − Minimum value

= 84 – 45
= 39

Coefficient of range = (Xmax – Xmin)/ (Xmax + Xmin)

= (84 – 45)/ (84 + 45)

= 39/129

= 0.302 (approx.)
