Class-01:
Date: 28/08/2023
1. What is a population?
2. What is a sample?
For example, a sample of 100 students from the school population could
be denoted by s = {student1, student2, ..., student100}.
The choice of sampling method depends on the specific situation and the
goals of the study.
Parameters are typically denoted by Greek letters, such as μ (mu) for the mean and
σ² (sigma squared) for the variance.
It is important to note that parameters are typically unknown, and they must be
estimated from samples. The sample mean, sample median, sample mode, sample
variance, and sample standard deviation are all estimates of the corresponding
population parameters.
The accuracy of these estimates depends on the sample size and on how the sample is
drawn. In general, larger random samples produce more accurate estimates.
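The relationship between parameters and their estimates can be sketched in Python. This is only an illustration: the population of 1,000 student heights is generated artificially, and the names `population` and `estimates` are hypothetical.

```python
import random
from statistics import mean, pvariance, variance

# Hypothetical population: heights (cm) of 1,000 students, generated for illustration only.
rng = random.Random(0)
population = [round(rng.gauss(165, 10), 1) for _ in range(1000)]

mu = mean(population)            # population mean (a parameter)
sigma2 = pvariance(population)   # population variance (a parameter, divides by N)

def estimates(n):
    """Draw a simple random sample of size n and return (sample mean, sample variance)."""
    sample = rng.sample(population, n)
    return mean(sample), variance(sample)   # variance() divides by n - 1

for n in (10, 100, 1000):
    x_bar, s2 = estimates(n)
    print(f"n={n:4d}  mean={x_bar:7.2f} (mu={mu:.2f})  variance={s2:7.2f} (sigma2={sigma2:.2f})")
```

When the sample contains the whole population (n = 1000 here), the sample mean equals μ exactly; for smaller samples it only approximates μ.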
Data can be classified in many different ways, depending on the specific needs of the study.
Some common ways to classify data include:
Primary data: This is data that is collected directly from the source. For example, data
collected from a survey or experiment is primary data.
Secondary data: This is data that has already been collected by someone else. For
example, data collected from a government census or a company's financial records is
secondary data.
Discrete data: This is data that can only take on distinct, countable values. For example,
the number of siblings a person has is discrete data.
Continuous data: This is data that can take on any value within a given range. For
example, a person's height is continuous data.
Qualitative data: This is data that describes a characteristic or attribute of a person or
thing. For example, a person's gender is qualitative data.
Quantitative data: This is data that can be measured or counted. For example, a person's
height is quantitative data.
The type of data that is collected will depend on the specific research question that is being
asked. For example, if a researcher is interested in the average height of adults in a particular
country, they would collect quantitative data. If a researcher is interested in the different types of
cars that people drive, they would collect qualitative data.
Data can be analyzed using a variety of statistical methods. These methods can be used to
describe the data, to test hypotheses, and to make predictions. The choice of statistical method
will depend on the specific type of data that is being analyzed and the research question that is
being asked.
Data is an essential part of statistics. It is the foundation on which all statistical analyses are
based. By understanding the different types of data and how to collect and analyze it, statisticians
can make valuable contributions to a wide range of fields, such as business, economics,
healthcare, and education.
5. What is a variable?
Quantitative variables: These variables are numbers that can be measured or counted.
Examples of quantitative variables include height, weight, age, and income.
Qualitative variables: These variables are categories or groups. Examples of qualitative
variables include gender, marital status, and eye color.
Discrete variables: These variables can only take on distinct, countable values. For
example, the number of siblings a person has is a discrete variable.
Continuous variables: These variables can take on any value within a given range. For
example, a person's height is a continuous variable.
The type of variable that is used will depend on the specific research question that is being asked.
For example, if a researcher is interested in the average height of adults in a particular country,
they would use a quantitative variable. If a researcher is interested in the different types of cars
that people drive, they would use a qualitative variable.
Variables are an essential part of statistics. They are used to collect data, to analyze the data, and
to make inferences about the population. By understanding the different types of variables and
how to use them, statisticians can make valuable contributions to a wide range of fields, such as
business, economics, healthcare, and education.
6. Scale of measurement:
In statistics, a scale of measurement is a classification system for variables. It describes the type
of information that is being measured and the mathematical operations that can be performed on
the data.
Nominal scale: This is the weakest scale of measurement. It only classifies data into
categories that have no intrinsic order. For example, gender, eye color, and blood type are
nominal variables.
Ordinal scale: This scale has order, but the intervals between the categories are not
necessarily equal. For example, academic levels (freshman, sophomore, junior, senior) and
Likert-scale responses (strongly agree, agree, neutral, disagree, strongly disagree) are
ordinal variables.
Interval scale: This scale has order and equal intervals between the categories. However,
it does not have a true zero point. For example, temperature in degrees Celsius or
Fahrenheit is an interval variable.
Ratio scale: This scale has order, equal intervals, and a true zero point. This means that
the zero point on the scale represents the absence of the quantity being measured. For
example, height, weight, and time are ratio variables.
The scale of measurement of a variable determines the types of statistical analysis that can be
performed on the data. For example, nominal variables are summarized with frequency counts, the
mode, and cross-tabulations, and can be tested with methods for categorical data such as the
chi-square test. Ordinal variables also support the median, percentiles, and rank-based methods
such as the Mann-Whitney U test and the Kruskal-Wallis test. Interval variables support means
and standard deviations, and therefore t-tests, analysis of variance (ANOVA), and regression
analysis. Ratio variables support all of these and, because they have a true zero point, also
allow meaningful ratios such as "twice as heavy".
It is important to choose the correct scale of measurement for a variable. This will ensure that the
data is analyzed correctly and that the results are meaningful.
Scale of measurement | Characteristics | Mathematical operations
Nominal scale | Data is classified into categories | Grouping, counting
Ordinal scale | Data has order, but the intervals are not equal | Ranking, comparing
Interval scale | Data has order and equal intervals | Adding, subtracting
Ratio scale | Data has order, equal intervals, and a true zero point | Adding, subtracting, multiplying, dividing
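The operations listed for each scale can be illustrated with a short Python sketch. All values below are made up for illustration, and the variable names are hypothetical; the point is only which summary is meaningful at which scale.

```python
from statistics import mean, median_low, mode

# All values below are made-up illustrations, not data from the notes.
blood_types = ["A", "O", "B", "O", "AB", "O"]                  # nominal
LIKERT = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]
responses = ["agree", "neutral", "strongly agree", "agree", "disagree"]   # ordinal
temps_c = [20.0, 25.0, 30.0]                                   # interval (no true zero)
weights_kg = [40.0, 80.0]                                      # ratio (true zero)

# Nominal: grouping and counting only, so the mode is the natural summary.
most_common_type = mode(blood_types)

# Ordinal: ranking is allowed, so the median response is meaningful.
ranks = [LIKERT.index(r) for r in responses]
median_response = LIKERT[median_low(ranks)]

# Interval: adding and subtracting are allowed, so means and differences make sense.
mean_temp = mean(temps_c)
temp_diff = temps_c[2] - temps_c[0]

# Ratio: a true zero makes division meaningful ("twice as heavy").
weight_ratio = weights_kg[1] / weights_kg[0]

print(most_common_type, median_response, mean_temp, temp_diff, weight_ratio)
```

Dividing two Celsius temperatures, by contrast, would not give a meaningful "times as hot" figure, because 0 °C is not the absence of temperature.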
Ans: In statistics, variables can be classified into two main categories, quantitative and
qualitative, and further by the values they take and by their scale of measurement:
Quantitative variables: These variables are numbers that can be measured or counted.
Examples of quantitative variables include height, weight, age, and income.
Qualitative variables: These variables are categories or groups. Examples of qualitative
variables include gender, marital status, and eye color.
Discrete variables: These variables can only take on distinct, countable values. For
example, the number of siblings a person has is a discrete variable.
Continuous variables: These variables can take on any value within a given range. For
example, a person's height is a continuous variable.
Nominal variables: These variables are categories that have no intrinsic order. For
example, gender, eye color, and blood type are nominal variables.
Ordinal variables: These variables have order, but the intervals between the categories
are not necessarily equal. For example, academic levels (freshman, sophomore, junior, senior)
and Likert-scale responses (strongly agree, agree, neutral, disagree, strongly disagree) are
ordinal variables.
Here is a table summarizing the different types of data variables and their characteristics:
Qualitative | Quantitative
Class-02:
Date: 30/08/2023
Data collection:
Data collection in statistics is the process of gathering information from different sources to
answer research questions or test hypotheses. It is an essential step in the statistical analysis
process, and the quality and accuracy of the data collected directly impact the validity and
reliability of the findings.
There are two main types of data collection methods in statistics: primary data collection and
secondary data collection.
Primary data collection involves collecting new data specifically for the research study.
This can be done through surveys, interviews, experiments, or observations.
Secondary data collection involves using data that has already been collected by
someone else. This data can be found in government databases, academic journals, or
commercial data sources.
The choice of data collection method will depend on the research questions being asked, the
resources available, and the time constraints.
Surveys: Surveys are a popular method of collecting data from a large number of people.
They can be used to collect information about people's opinions, behaviors, or
demographics.
Interviews: Interviews are a more in-depth way to collect data from people. They can be
used to get people's perspectives on a particular topic or to gather detailed information
about their experiences.
Experiments: Experiments are used to test cause-and-effect relationships. They involve
manipulating one variable (the independent variable) and observing how it affects
another variable (the dependent variable).
Observations: Observations are used to collect data about people's behaviors or the
environment. They can be conducted in a natural setting or in a laboratory setting.
Reviews of existing records: This method involves collecting data from existing records,
such as government databases, academic journals, or commercial data sources.
The data collection process should be carefully planned and executed to ensure that the data is
accurate and reliable. The following are some important considerations when planning a data
collection study:
Define the research questions or hypotheses. What do you want to learn from the data?
Identify the target population. Who are you going to collect data from?
Choose the appropriate data collection method. What method will best answer your
research questions?
Develop a data collection plan. This should include the sampling strategy, the data
collection instruments, and the data collection procedures.
Pilot test the data collection plan. This will help you identify any potential problems
and make necessary adjustments.
Collect the data. This should be done carefully and according to the data collection plan.
Clean and prepare the data. This involves checking for errors and inconsistencies, and
formatting the data for analysis.
Data collection is an important part of the statistical analysis process. By carefully planning and
executing the data collection process, you can ensure that you collect the data you need to
answer your research questions or test your hypotheses.
Data presentation:
Data presentation in statistics is the process of organizing and displaying data in a way that is
clear, concise, and easy to understand. It is an important step in the statistical analysis process, as
it allows the researcher to communicate the findings of the study to others.
Textual presentation: This is the simplest form of data presentation, and involves
describing the data in words. It is often used to present small amounts of data or to
provide a brief overview of the data.
Tabular presentation: This involves organizing the data in a table, which makes it
easier to see patterns and relationships. Tables are often used to present large amounts of
data or to compare different groups of data.
Graphical presentation: This involves using graphs or charts to represent the data
visually. Graphs and charts can be used to communicate complex data in a way that is
easy to understand.
The choice of data presentation method will depend on the type of data, the purpose of the
presentation, and the audience. For example, if the data is complex or if the audience is not
familiar with statistics, then a graphical presentation may be the best option.
Keep it simple. The presentation should be easy to understand and should not overwhelm
the audience with too much information.
Use clear and concise language. Avoid jargon and technical terms that the audience may
not understand.
Use appropriate visuals. The visuals should be clear and easy to read, and they should be
used to enhance the presentation, not to distract from it.
Label all axes and categories. This will help the audience understand the data and make
comparisons.
Use consistent formatting. This will make the presentation look neat and professional.
Proofread carefully. This will help to avoid errors and ensure that the presentation is
accurate.
Data presentation is an important part of the statistical analysis process. By carefully presenting
the data, the researcher can ensure that the findings of the study are communicated effectively to
others.
Inclusive method is a method of grouping data in which the upper limit of a class
interval is included in the class itself. For example, if the class interval is 10-19, then the
upper limit of 19 is included in the class. This means that a value of 19 is included in the
class 10-19.
Exclusive method is a method of grouping data in which the upper limit of a class
interval is excluded from the class. For example, if the class interval is 10-20, then the
upper limit of 20 is excluded from the class. This means that a value of 20 is not included
in the class 10-20; it falls in the next class, 20-30.
Inclusive method
The inclusive method is the simplest way to group data into classes or intervals. It is easy to
understand and use, and it is often the default method for grouping discrete data.
To use the inclusive method, simply define the lower and upper limits of each class interval. The
upper limit of each class interval is included in the class itself. For example, if you are grouping
data on the number of children in a family, you might use the following class intervals:
0-1
2-3
4-5
6-7
8+
In this example, the upper limit of each class interval is included in the class itself. This means
that a family with 3 children would be included in the class interval 2-3.
Exclusive method
The exclusive method is a bit more complex than the inclusive method, but it can be useful for
grouping continuous data.
To use the exclusive method, define the lower and upper limits of each class interval so that
the upper limit of one class is the lower limit of the next. The upper limit of each class
interval is excluded from the class. For example, if you are grouping data on the height of
people (in inches), you might use the following class intervals:
50-55
55-60
60-65
65-70
70+
In this example, the upper limit of each class interval is excluded from the class. This means that
a person who is 54.9 inches tall would be included in the class interval 50-55, but a person who is
exactly 55 inches tall would be included in the class interval 55-60.
The choice of whether to use the inclusive or exclusive method depends on the specific data set
and the purpose of the analysis.
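The difference between the two methods can be sketched as two small Python functions. This is a minimal sketch, assuming inclusive classes hold whole numbers (10-19, 20-29, ...) and exclusive classes share their boundaries (10-20, 20-30, ...); the function names are hypothetical.

```python
def inclusive_class(value, start, width):
    """Class label for whole-number data: start=10, width=10 gives 10-19, 20-29, ...

    Both limits belong to the class.
    """
    k = (value - start) // width
    lower = start + k * width
    return f"{lower}-{lower + width - 1}"


def exclusive_class(value, start, width):
    """Class label for continuous data: start=10, width=10 gives 10-20, 20-30, ...

    The lower limit belongs to the class; the upper limit belongs to the next class.
    """
    k = int((value - start) // width)
    lower = start + k * width
    return f"{lower}-{lower + width}"


print(inclusive_class(19, 10, 10))    # upper limit 19 stays in its own class
print(exclusive_class(20, 10, 10))    # upper limit 20 moves to the next class
print(exclusive_class(19.9, 10, 10))  # values just below 20 stay in the lower class
```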
The inclusive method is often used when the data is discrete, meaning that it can only take on
distinct, countable values. For example, the number of children in a family is discrete data.
Qualitative data, such as the color of a person's hair, is not numerical; it is grouped by
category rather than by class interval, so neither method applies to it.
The exclusive method is often used when the data is continuous, meaning that it can take on any
value within a range. For example, the height of people is continuous data. Because exclusive
classes leave no gaps between one class and the next, the method is also convenient when the
grouped data is used to calculate measures of central tendency, such as the mean and median.
Here are some general guidelines for choosing between the inclusive and exclusive methods:
Example:
1. Prepare a frequency distribution by the inclusive method, taking a class interval of 7,
from the following data.
28, 17, 15, 22, 29, 21, 23, 27, 18, 12, 7, 2, 9, 4, 6, 1, 8, 3, 10,
5, 20, 16, 12, 8, 4, 33, 27, 21, 15, 9, 3, 36, 27, 18, 9, 2, 4, 6,
32, 31, 29, 18, 14, 13, 15, 11, 9, 7, 1, 5, 37, 32, 28, 26, 24,
20, 19, 25, 19, 20
Solution
Frequency Distribution
(inclusive Method)
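One way to work this example is sketched below in Python. It assumes the first class starts at the smallest observation (1), so the classes are 1-7, 8-14, and so on; the notes do not state the starting point, so a different choice would shift the classes and the counts.

```python
data = [
    28, 17, 15, 22, 29, 21, 23, 27, 18, 12, 7, 2, 9, 4, 6, 1, 8, 3, 10,
    5, 20, 16, 12, 8, 4, 33, 27, 21, 15, 9, 3, 36, 27, 18, 9, 2, 4, 6,
    32, 31, 29, 18, 14, 13, 15, 11, 9, 7, 1, 5, 37, 32, 28, 26, 24,
    20, 19, 25, 19, 20,
]

WIDTH = 7           # class interval given in the question
start = min(data)   # assumption: the first class starts at the minimum value

# Inclusive method: both the lower and the upper limit belong to the class.
freq = {}
lower = start
while lower <= max(data):
    upper = lower + WIDTH - 1
    freq[(lower, upper)] = sum(lower <= x <= upper for x in data)
    lower += WIDTH

for (low, high), count in freq.items():
    print(f"{low:2d}-{high:2d}  {count}")
print("Total ", sum(freq.values()))
```

The total of the frequencies should equal the number of observations (60), which is a quick check that no value was missed or counted twice.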
2. To create a frequency table with class intervals and tally marks from
a set of 50 data points using the inclusive method, you'll need to follow
these steps:
Step 2: Determine the range (the difference between the maximum and
minimum values) and the number of classes (intervals) you want. For
this example, let's choose 5 classes.
Step 3: Calculate the class width by dividing the range by the number of
classes and round up to the nearest whole number.
Step 4: Determine the starting point of the first class. Choose the minimum value, or
a convenient whole number just below it, so that the classes cover every
data point.
Data: 23, 29, 34, 38, 40, 42, 45, 46, 48, 51, 53, 56, 57, 59, 60, 61, 62, 63,
65, 67, 68, 69, 70, 73, 74, 75, 77, 79, 80, 81, 82, 83, 84, 85, 87, 88, 89,
91, 93, 95, 96, 97, 99, 100, 102, 105, 108, 110, 115
Step 6: Count the frequency of data points in each class interval using
tally marks.
This frequency table represents the data using the inclusive method
with 5 class intervals and tally marks for each interval.
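The steps can be sketched in Python as follows. This is a sketch under stated assumptions: the first class is taken to start at the minimum value, tally marks are drawn in plain groups of five, and the data list is used exactly as given (it contains 49 values).

```python
import math

data = [
    23, 29, 34, 38, 40, 42, 45, 46, 48, 51, 53, 56, 57, 59, 60, 61, 62, 63,
    65, 67, 68, 69, 70, 73, 74, 75, 77, 79, 80, 81, 82, 83, 84, 85, 87, 88, 89,
    91, 93, 95, 96, 97, 99, 100, 102, 105, 108, 110, 115,
]

NUM_CLASSES = 5
data_range = max(data) - min(data)            # 115 - 23 = 92
width = math.ceil(data_range / NUM_CLASSES)   # 92 / 5 = 18.4, rounded up to 19
start = min(data)                             # assumption: first class starts at the minimum


def tally(count):
    """Render a count as tally marks in groups of five, e.g. 7 -> '||||| ||'."""
    groups, rest = divmod(count, 5)
    return " ".join(["|||||"] * groups + (["|" * rest] if rest else []))


# Inclusive method: each class covers lower..upper with both limits included.
table = []
for k in range(NUM_CLASSES):
    lower = start + k * width
    upper = lower + width - 1
    count = sum(lower <= x <= upper for x in data)
    table.append((lower, upper, count))

# The last class must reach the maximum, otherwise the width has to be increased.
assert table[-1][1] >= max(data)

for lower, upper, count in table:
    print(f"{lower:3d}-{upper:3d}  {tally(count):<20} {count}")
```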
Class:03
Date:04/09/2023
---- THE TITLE ----
---- Prefatory Notes ----
---- Box Head ----
---- Row Captions ---- ---- Column Captions ----
Foot Notes...
Source Notes...
The title is the main heading, written in capitals and shown at the top of the table. It must
explain the contents of the table and throw light on the table as a whole. Different parts of
the heading can be separated by commas. There are no full stops in the title.
The vertical headings and subheadings of the columns are called column captions. The space
where these column headings are written is called the box head. Only the first letter of the box
head is capitalized and the remaining words must be written in lowercase.
The horizontal headings and subheadings of the rows are called row captions, and the space where
these row headings are written is called the stub.
The body is the main part of the table; it contains the numerical information classified with
respect to row and column captions.
Footnotes appear immediately below the body of the table, providing additional explanation.
The source notes are given at the end of the table, indicating the source the information has been
taken from. They include information about the compiling agency, publication, etc.
A table should be simple and attractive. There should be no need of further explanation
(details).
Proper and clear headings for columns and rows are necessary.
Suitable approximation may be adopted and figures may be rounded off.
The unit of measurement should be well defined.
If the observations are large in number, they can be split into two or three tables.
Thick lines should be used to separate the data under big classes and thin lines to separate
the sub classes of data.
To represent data meaningfully in a short form.
To represent complicated data in a simple and meaningful form.
To detect errors and omissions in the data.
To facilitate statistical analysis.
To help reference.
A graphical presentation of data is a visual representation of data using graphs, charts, and
diagrams. It is often a more effective way of understanding and comparing data than a tabular
form. Graphical presentation helps to organize, sort, and present data in a way that is simple to
understand for a larger audience. Graphs help in studying the relationship between two
variables, for both time series and frequency distributions.
There are many different types of graphical presentations of data, each with its own strengths and
weaknesses. Some of the most common types of graphical presentations of data in statistics
include:
Bar graph: A bar graph is a chart that uses bars to represent the frequencies of
different data values. It is a good way to compare the frequencies of different categories
of data.
Bar graphs are the pictorial representation of data (generally grouped) in the form of vertical or
horizontal rectangular bars, where the lengths of the bars are proportional to the measure of data.
They are also known as bar charts. Bar graphs are one of the means of data handling in statistics.
The collection, presentation, analysis, organization, and interpretation of observations of data are
known as statistics. Statistical data can be represented by various methods such as tables, bar
graphs, pie charts, histograms, frequency polygons, etc.
The bars drawn are of uniform width, and the variable quantity is represented on one of the axes.
The measure of the variable is depicted on the other axis. The heights or the lengths of the
bars denote the value of the variable, and these graphs are also used to compare certain
quantities. Frequency distribution tables can be easily represented using bar charts, which
simplify the calculations and understanding of data.
The bar graph helps to compare the different sets of data among different groups easily.
It shows the relationship using two axes, in which the categories are on one axis and the discrete
values are on the other axis.
The graph shows the major changes in data over time.
Vertical axis
Horizontal axis
The bar graph’s title informs the reader of its purpose.
The title of the horizontal axis indicates the information that is shown there.
The title of the vertical axis indicates the data it is used to display.
The categories on the particular axis indicate what each bar represents.
The bar graph’s scale demonstrates how numbers are used in the data. It is a system of markings
spaced at specific intervals that aid in object measurement. For instance, the scale of a graph
may be stated as 1 unit = 10 fruits
Even though the graph can be plotted either horizontally or vertically, the most usual type of bar
graph is the vertical bar graph. The roles of the x-axis and y-axis are swapped between the
vertical and the horizontal bar chart. Apart from the vertical and horizontal bar graph, the two
other types of bar charts are the grouped bar graph and the stacked bar graph.
When the grouped data are represented vertically in a graph or chart with the help of bars, where
the bars denote the measure of data, such graphs are called vertical bar graphs. The data is
represented along the y-axis of the graph, and the height of the bars shows the values.
When the grouped data are represented horizontally in a chart with the help of bars, then such
graphs are called horizontal bar graphs, where the bars show the measure of data. The data is
depicted here along the x-axis of the graph, and the length of the bars denote the values.
The grouped bar graph is also called the clustered bar graph, which is used to represent the
discrete values for more than one object that shares the same category. In this type of bar chart,
the bars for the different objects are placed side by side within each category. In other words,
a grouped bar graph is a type of bar graph in which different sets of data items are compared.
Here, a single colour is used to represent each specific series across the set. The grouped bar
graph can be represented using both vertical and horizontal bar charts.
Bar charts possess a discrete domain of divisions and are normally scaled so that all the data can
fit on the graph. When there is no regular order of the divisions being matched, bars on the chart
may be organized in any order. Bar charts organized from the highest to the lowest number are
called Pareto charts.
Bar graphs are a visual representation of data. They are used to show the relationship between
two or more sets of data. They are mostly used in business and finance, but they can also be
found in other contexts. Bar graphs are used in many real-life situations. For example, a bar
graph can be used to show the distribution of different types of food in a restaurant. The height
of each rectangle would represent how many orders were placed for that type of food.
Bar graphs are also often used to represent data grouped into categories, such as how many
people have voted for each candidate in an election or how much money was spent by each
department. The bars on this type of graph represent the number or percentage of people or
money spent and are placed side by side on a common baseline so that they can be easily
compared to one another.
Disadvantages:
Sometimes, the bar graph fails to reveal the patterns, causes, effects, etc.
It can be easily manipulated to give a misleading impression.
The bar graph represents the data using rectangular bars, and the height of each bar
represents the value shown in the data, whereas a line graph shows the information as a
series of data points connected by a line.
A line graph can become confusing when too many lines are plotted on the same graph, whereas
a bar graph helps to show the relationship between the data quickly.
Important Notes:
Some of the important notes related to the bar graph are as follows:
In the bar graph, there should be an equal spacing between the bars.
It is advisable to use the bar graph if the frequency of the data is very large.
Understand the data that should be presented on the x-axis and y-axis and the relation between
the two.
How to Draw a Bar Graph?
Let us consider an example, we have four different types of pets, such as cat, dog, rabbit, and
hamster and the corresponding numbers are 22, 39, 5 and 9 respectively.
In order to visually represent the data using the bar graph, we need to follow the steps given
below.
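As a rough illustration, the pet data can be drawn as a text bar chart in Python. This is a minimal sketch: the function name `bar_chart`, the title, and the scale of one `#` per pet are assumptions made for the example.

```python
# Data from the example: number of pets of each type.
pets = {"cat": 22, "dog": 39, "rabbit": 5, "hamster": 9}


def bar_chart(data, unit=1, title="Number of pets by type"):
    """Return a horizontal text bar chart; each '#' stands for `unit` items."""
    label_width = max(len(category) for category in data)
    lines = [title]
    for category, value in data.items():
        bar = "#" * (value // unit)
        lines.append(f"{category:<{label_width}} | {bar} {value}")
    return "\n".join(lines)


print(bar_chart(pets))
```

Each bar has the same thickness (one text line), the categories sit on one axis, and the bar length is proportional to the count, which mirrors the properties of a bar graph described earlier.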
Example 1:
In a firm of 400 employees, the percentage of monthly salary saved by each employee is given in
the following table. Represent it through a bar graph.
Savings (in percentage) | Number of employees
20 | 105
30 | 199
40 | 29
50 | 73
Total | 400
Solution:
Example 2:
A cosmetic company manufactures 4 different shades of lipstick. The sales for 6 months are shown
in the table. Represent them using bar charts.
Solution:
Example 3:
The variation of temperature in a region during a year is given as follows. Depict it through a
bar graph.
Month Temperature
January -6°C
February -3.5°C
March -2.7°C
April 4°C
May 6°C
June 12°C
July 15°C
August 8°C
September 7.9°C
October 6.4°C
November 3.1°C
December -2.5°C
Solution:
As the temperature in the given table has negative values, it is more convenient to represent such
data through a horizontal bar graph.
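A horizontal bar graph with negative values can be sketched in text form: bars for negative temperatures extend to the left of a zero axis and the others to the right. The scale of 2 characters per degree and the helper name `signed_bar` are assumptions for this illustration.

```python
# Monthly temperatures in °C, copied from the table in the example.
temps = {
    "January": -6, "February": -3.5, "March": -2.7, "April": 4,
    "May": 6, "June": 12, "July": 15, "August": 8,
    "September": 7.9, "October": 6.4, "November": 3.1, "December": -2.5,
}
SCALE = 2  # assumed scale: 2 characters per degree Celsius


def signed_bar(value, neg_width, scale):
    """Draw a bar left of the '|' axis for negative values, right of it otherwise."""
    length = round(abs(value) * scale)
    if value < 0:
        return ("#" * length).rjust(neg_width) + "|"
    return " " * neg_width + "|" + "#" * length


# Space reserved on the left for the longest negative bar.
neg_width = round(max(0, -min(temps.values())) * SCALE)

for month, t in temps.items():
    print(f"{month:<9} {signed_bar(t, neg_width, SCALE)} {t}°C")
```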
Frequently Asked Questions on Bar Graph
Q1
Bar graph (bar chart) is a graph that represents the categorical data using rectangular bars. The
bar graph shows the comparison between discrete categories.
Q3
When is a bar graph used?
The bar graph is used to compare the items between different groups over time. Bar graphs are
used to measure the changes over a period of time. When the changes are larger, a bar graph is
the best option to represent the data.
Q4
The horizontal bar graph is the best choice while graphing the nominal variables.
Q5
The vertical bar graph is the most commonly used bar chart, and it is best to use it while
graphing the ordinal variables.
Pie Chart
A pie chart is a type of graph that represents data in a circular graph. The slices of the pie
show the relative sizes of the data, and it is a type of pictorial representation of data. A pie chart
requires a list of categorical variables and numerical variables. Here, the term “pie” represents
the whole, and the “slices” represent the parts of the whole.
Formula
The pie chart is an important type of data representation. It is divided into sectors, and each
sector forms a specific portion (percentage) of the total. The total of all the data corresponds
to the full circle of 360°.
To work out the percentages for a pie chart, follow the steps given below:
Note: It is not mandatory to convert the given data into percentages until it is specified. We can
directly calculate the degrees for given data values and draw the pie chart accordingly.
How to Create a Pie Chart?
Imagine a teacher surveys her class about the students' favourite sports:
10 5 5 10 10
The data above can be represented by a pie chart as follows, using the circle graph
formula, i.e. the pie chart formula given below. It makes the size of each portion easy to
understand.
10 5 5 10 10
Step 2: Add all the values in the table to get the total.
Step 3: Next, divide each value by the total and multiply by 100 to get a per cent:
(10/40) × 100 = 25%   (5/40) × 100 = 12.5%   (5/40) × 100 = 12.5%   (10/40) × 100 = 25%   (10/40) × 100 = 25%
Step 4: Next, to know how many degrees each “pie sector” needs, we take a full circle
of 360° and follow the calculations below:
The central angle of each component = (Value of each component / sum of values of all the
components) × 360°
(10/40) × 360° = 90°   (5/40) × 360° = 45°   (5/40) × 360° = 45°   (10/40) × 360° = 90°   (10/40) × 360° = 90°
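Steps 2 to 4 can be checked with a few lines of Python. The sport names are not given in these notes, so the sketch uses placeholder labels ("Sport 1" and so on), which are purely hypothetical.

```python
# Survey counts from the example; the labels are placeholders, not the original sport names.
values = [10, 5, 5, 10, 10]
labels = [f"Sport {i}" for i in range(1, len(values) + 1)]

total = sum(values)                               # Step 2: total of all values
percentages = [v / total * 100 for v in values]   # Step 3: share of each value in per cent
angles = [v / total * 360 for v in values]        # Step 4: central angle of each sector

for label, v, p, a in zip(labels, values, percentages, angles):
    print(f"{label}: value={v}  {p:.1f}%  {a:.0f}°")

print("Sum of percentages:", sum(percentages))
print("Sum of angles:", sum(angles))
```

The percentages add up to 100 and the central angles to 360°, which is a useful check before drawing the chart.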
Let us take an example for a pie chart with an explanation here to understand the concept in a
better way.
Question: The percentages of various crops cultivated in a village of a particular district are
given in the following table.
Solution:
Steps to construct:
Step 3: Choose the largest central angle. Construct a sector of a central angle, whose one radius
coincides with the radius drawn in step 2, and the other radius is in the clockwise direction to the
vertical radius.
Step 4: Construct other sectors representing other values in the clockwise direction in
descending order of magnitudes of their central angles.
Step 5: Shade the sectors obtained by different colours and label them as shown in the figure
below.
Question:
The pie-chart shows the marks obtained by a student in an examination. If the student
secures 440 marks in all, calculate the marks in each of the given subjects.
Solution:
The given pie chart shows the marks obtained in the form of degrees.
Marks secured in science = (central angle of science / 360°) × Total score secured
Marks secured in English = (central angle of English/ 360°) × Total score secured
Marks secured in Hindi = (central angle of Hindi / 360°) × Total score secured
Marks secured in social science = (central angle of social science / 360°) × Total score secured
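The formula can be applied as in the Python sketch below. The figure with the actual central angles is not reproduced in these notes, so the angles used here are hypothetical values chosen only so that they add up to 360°; the resulting marks are therefore illustrative, not the answer to the original question.

```python
TOTAL_MARKS = 440  # total score given in the question

# Hypothetical central angles in degrees (the original figure is not available).
angles = {"Science": 90, "English": 81, "Hindi": 99, "Social science": 90}
assert sum(angles.values()) == 360  # the sectors must make up the full circle

# Marks in a subject = (central angle / 360°) × total score
marks = {subject: angle * TOTAL_MARKS / 360 for subject, angle in angles.items()}

for subject, m in marks.items():
    print(f"{subject}: {m:.0f} marks")
print("Total:", sum(marks.values()))
```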
Examples
A pie chart can be used to represent the relative size of a variety of data such as:
Advantages
The picture is simple and easy to understand
Data can be represented visually as a fractional part of a whole
It provides an effective communication tool, even for an uninformed audience
Provides a data comparison for the audience at a glance to give an immediate analysis or to
quickly understand information
Readers do not need to examine or measure the underlying numbers themselves
To emphasize a few points you want to make, you can manipulate pieces of data in the pie chart
Disadvantages
It becomes less effective if there are too many pieces of data to use
If there are too many pieces of data, even adding data labels and numbers may not help, as
they themselves may become crowded and hard to read
As this chart only represents one data set, you need a series of charts to compare multiple sets
This may make it more difficult for readers to analyze and assimilate information
quickly
You can practice another pie chart question for Class 8, given below:
A pie chart is a pictorial representation of data. The slices of the pie show the relative sizes of the data.
The same data is represented in different sizes with the help of pie charts.
Q2
Q3
Measure the angle of each slice of the pie chart and divide by 360 degrees. Now multiply the value by
100. The percentage of particular data will be calculated.
Q4
How to find the total number of pieces of data in a slice of a pie chart?
To find the total number of pieces of data in a slice of a pie chart, multiply the slice percentage with the
total number of data set and then divide by 100.
For example, a slice of the pie chart is equal to 60% and the pie chart contains a total data set of 150.
Then, the value of 60% of pie slice is: (60×150)/100 = 90.
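The two conversions described in the answers above (angle to percentage, and percentage to value) can be sketched as follows. The function names are illustrative and not from the original notes.

```python
def slice_percentage(angle_deg):
    """Percentage represented by a slice with the given central angle."""
    return angle_deg / 360 * 100

def slice_value(percentage, total):
    """Number of data items in a slice that covers `percentage` of `total`."""
    return percentage * total / 100

print(slice_percentage(90))    # a quarter circle
print(slice_value(60, 150))    # the worked example from the text
```

The second call reproduces the worked example: a 60% slice of a data set of 150 items.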
Q5
Assignment:01
Date:04/09/2023
1.Construct a Bar Diagram.
2.Construct a Pie Chart.
Histogram: A histogram is a graph that uses bars to represent the frequencies of different
data values in a continuous range. It is a good way to show the distribution of data.
Line graph: A line graph is a chart that uses lines to connect points that represent the
values of data over time. It is a good way to show trends in data over time.
Scatter plot: A scatter plot is a graph that uses points to represent the values of two
variables. It is a good way to show the relationship between two variables.
Scatter plot of height vs. weight of students
The best type of graphical presentation of data for a particular set of data will depend on the
purpose of the presentation and the type of data being presented.
Here are some tips for creating effective graphical presentations of data in statistics:
By following these tips, you can create graphical presentations of data that are clear, informative,
and easy to understand.
Class:04
Date:11/09/2023
1.Historigram
2.Frequency curve
3.Frequency polygon
4.Ogive curve
1.Historigram
Histogram
Table of Contents:
Definition
How to Make Histogram
When to Use Histogram?
Difference between Histogram and Bar Graph
Types of Histogram
o Uniform Histogram
o Bimodal Histogram
o Symmetric Histogram
o Probability Histogram
Applications
Example
What is Histogram?
A histogram is a graphical representation of a grouped frequency distribution with continuous
classes. It is an area diagram and can be defined as a set of rectangles with bases along with the
intervals between class boundaries and with areas proportional to frequencies in the
corresponding classes. In such representations, all the rectangles are adjacent since the base
covers the intervals between class boundaries. The heights of rectangles are proportional to
corresponding frequencies of similar classes and for different classes, the heights will be
proportional to corresponding frequency densities.
In other words, a histogram is a diagram involving rectangles whose area is proportional to the
frequency of a variable and width is equal to the class interval.
How to Plot Histogram?
You need to follow the below steps to construct a histogram.
1. Begin by marking the class intervals on the X-axis and frequencies on the Y-axis.
2. The scales for both the axes have to be the same.
3. Class intervals need to be exclusive.
4. Draw rectangles with bases as class intervals and corresponding frequencies as heights.
5. A rectangle is built on each class interval since the class limits are marked on the horizontal axis,
and the frequencies are indicated on the vertical axis.
6. The height of each rectangle is proportional to the corresponding class frequency if the intervals
are equal.
7. The area of every individual rectangle is proportional to the corresponding class frequency if the
intervals are unequal.
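Steps 6 and 7 can be illustrated with a short sketch: with equal class widths the bar height is the frequency itself, while with unequal widths the height is the frequency density (frequency divided by class width), so that the area stays proportional to the frequency. The class data below is hypothetical and only serves as an illustration.

```python
def bar_heights(classes):
    """classes: list of (lower, upper, frequency) tuples.

    Equal widths   -> height = frequency.
    Unequal widths -> height = frequency density = frequency / width.
    """
    widths = {upper - lower for lower, upper, _ in classes}
    if len(widths) == 1:
        return [freq for _, _, freq in classes]
    return [freq / (upper - lower) for lower, upper, freq in classes]

# Hypothetical grouped data with unequal class widths.
unequal = [(0, 10, 5), (10, 20, 12), (20, 40, 16), (40, 70, 9)]
heights = bar_heights(unequal)

for (lower, upper, freq), h in zip(unequal, heights):
    # area = height * width recovers the class frequency
    print(f"{lower}-{upper}: height {h:.2f}, area {h * (upper - lower):.1f}")
```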
Although histograms look similar to bar graphs, there is a slight difference between them. A
histogram does not have any gaps between two successive bars.
Histogram: The frequency is shown by the area of each rectangle.
Bar graph: The height shows the frequency and the width has no significance.
Types of Histogram
The histogram can be classified into different types based on the frequency distribution of the
data. There are different types of distributions, such as normal distribution, skewed distribution,
bimodal distribution, multimodal distribution, comb distribution, edge peak distribution, dog
food distribution, heart cut distribution, and so on. The histogram can be used to represent these
different types of distributions. The different types of a histogram are:
Uniform histogram
Symmetric histogram
Bimodal histogram
Probability histogram
Uniform Histogram
A uniform distribution reveals that the number of classes is too small, and each class has the
same number of elements. It may involve distribution that has several peaks.
Bimodal Histogram
If a histogram has two peaks, it is said to be bimodal. Bimodality occurs when the data set has
observations on two different kinds of individuals or combined groups, provided the centers of the two
separate histograms are far enough apart relative to the variability in both data sets.
Symmetric Histogram
A symmetric histogram is also called a bell-shaped histogram. When you draw the vertical line
down the center of the histogram, and the two sides are identical in size and shape, the histogram
is said to be symmetric. The diagram is perfectly symmetric if the right half portion of the image
is similar to the left half. The histograms that are not symmetric are known as skewed.
Probability Histogram
Applications of Histogram
The applications of histograms can be seen when we learn about different distributions.
Normal Distribution
The common bell-shaped pattern is termed the normal distribution. In a normal
distribution, data points are as likely to appear on one side of the average as on the other. Note
that other distributions can look similar to the normal distribution, so statistical calculations
are needed to confirm that a distribution is normal. Note also that the
term “normal” here describes the typical distribution for a particular process. For instance, many processes
have a natural limit on one side and will produce skewed distributions. This is
normal, meaning typical, for those processes, even though the distribution isn’t considered “normal”.
Skewed Distribution
A skewed distribution is asymmetrical because a natural limit prevents outcomes on
one side. The peak of the distribution is off-center in the direction of the limit, and a tail
extends away from it. For instance, a distribution of analyses of a very pure product
would be skewed, as the product cannot be more than 100 per cent pure. Other
instances of natural limits are holes that cannot be smaller than the diameter of the drill, or
call-receiving times that cannot be less than zero. Such distributions are termed right-skewed
or left-skewed based on the direction of the tail.
Multimodal Distribution
The alternate name for the multimodal distribution is the plateau distribution. It arises when several
processes with normal distributions are combined. Since there are many peaks close together, the top of
the distribution is shaped like a plateau.
Edge Peak Distribution
This distribution resembles the normal distribution except that it has a large peak at one
tail. Generally, it is caused by faulty construction of the histogram, with data lumped together
into a group labelled “greater than”.
Comb Distribution
In this distribution, there exist bars that are tall and short alternatively. It mostly results from the
data that is rounded off and/or an incorrectly drawn histogram. For instance, temperature data
rounded off to the nearest 0.2° would show a comb shape if
the bar width of the histogram were 0.1°.
Heart Cut Distribution
This distribution resembles a normal distribution with the tails cut off. The producer
might be manufacturing a normal distribution of product and then relying on inspection to
separate what lies within the specification limits from what lies outside. The resulting shipments to the
end-user from within the specifications are the heart cut.
Dog Food Distribution
This distribution is missing something: the results near the average. If an end-user gets this
distribution, someone else is receiving a heart cut distribution and the end-user who is left gets
dog food, the odds and ends left behind after the master's meal. Even though what the
end-user receives is within the specification limits, the product falls into two clusters:
one close to the upper specification limit and another close to the lower specification limit. This
variation causes problems in the end-user's process.
Question: The following table gives the lifetime of 400 neon lamps. Draw the histogram for the
below data.
Lifetime       Number of lamps
300 – 400      14
400 – 500      56
500 – 600      60
600 – 700      86
700 – 800      74
800 – 900      62
900 – 1000     48
Solution:
The histogram for the given data is:
No, histograms and bar charts are different. In the bar chart, each column represents the group
which is defined by a categorical variable, whereas in the histogram each column is defined by
the continuous and quantitative variable.
Q2
The uniform shaped histogram shows consistent data. In the uniform histogram, the frequency of
each class is similar to one other. In most cases, the data values in the uniform shaped histogram
may be multimodal.
Q3
Yes, the histogram can be drawn for the normal distribution of the data. A normal distribution
should be perfectly symmetrical around its center. It means that the right should be the mirror
image of the left side about its center and vice versa.
Q4
A histogram is skewed to the right, if most of the data values are on the left side of the histogram
and a histogram tail is skewed to right. When the data are skewed to the right, the mean value is
larger than the median of the data set.
Q5
A histogram is skewed to the left, if most of the data values fall on the right side of the histogram
and a histogram tail is skewed to left. In this case, the mean value is smaller than the median of
the data set.
The mode of a dataset represents the value that occurs most often. To estimate it from a histogram:
1. Identify the tallest bar.
2. Draw a line from the left corner of the tallest bar to the left corner of the bar immediately after
it.
3. Draw a line from the right corner of the tallest bar to the right corner of the bar immediately
before it.
4. Identify the point where the two lines intersect. Then draw a line straight down to the x-axis.
The point where the line hits the x-axis is our best estimate for the mode.
The following step-by-step example shows how to find the mode of the following histogram:
In this example, our best estimate for the mode is roughly 17.
Note: Since the data in a histogram is grouped into bins, it’s not possible to know the exact value
of the mode but the method that we used here allows us to make our best estimate.
The major difference between a bar chart and a histogram is that the bars of a bar chart are not
adjacent to each other, whereas in a histogram the bars are adjacent to each other. In statistics, bar
charts and histograms are important for presenting large amounts of data. The similarity
between a bar chart and a histogram is that both are pictorial representations of grouped data. Here, we
will learn histogram vs bar graph with examples.
What are the Differences Between Bar Chart and Histogram?
A histogram is also a pictorial representation of data using rectangular bars that are adjacent to
each other. It is used to represent a grouped frequency distribution with continuous classes.
Frequency polygon
A frequency polygon is almost identical to a histogram, which is used to compare sets of data or
to display a cumulative frequency distribution. It uses a line graph to represent quantitative data.
Statistics deals with the collection of data and information for a particular purpose. The
tabulation of each run for each ball in cricket gives the statistics of the game. Tables, graphs, pie-
charts, bar graphs, histograms, polygons etc. are used to represent statistical data pictorially.
Frequency polygons are a visually substantial method of representing quantitative data and its
frequencies. Let us discuss how to represent a frequency polygon.
Step 1- Choose the class intervals and mark the values on the horizontal axis.
Step 2- Mark the mid value of each interval on the horizontal axis.
Step 3- Mark the frequency of the class on the vertical axis.
Step 4- Corresponding to the frequency of each class interval, mark a point at that height
above the middle of the class interval.
Step 5- Connect these points using line segments.
Step 6- The obtained representation is a frequency polygon.
Example
Example 1: In a batch of 400 students, the height of students is given in the following table.
Represent it through a frequency polygon.
Solution: Following steps are to be followed to construct a histogram from the given data:
The heights are represented on the horizontal axis on a suitable scale as shown.
The number of students is represented on the vertical axis on a suitable scale as shown.
Now rectangular bars are drawn with widths equal to the class size and lengths
corresponding to the frequencies of the class intervals.
ABCDEF represents the given data graphically in form of frequency polygon as:
Frequency polygons can also be drawn independently without drawing histograms. For this, the
midpoints of the class intervals known as class marks are used to plot the points.
Question 1: Construct a frequency polygon using the data given below:
Class interval   Frequency
49.5-59.5        5
59.5-69.5        10
69.5-79.5        30
79.5-89.5        40
89.5-99.5        15
Answer: We first need to calculate the cumulative frequency from the frequency given.
Class interval   Frequency   Cumulative frequency
49.5-59.5        5           5
59.5-69.5        10          15
69.5-79.5        30          45
79.5-89.5        40          85
89.5-99.5        15          100
We now start by plotting the class marks such as 54.5, 64.5, 74.5 and so on till 94.5. Note that
we will also plot the previous and next class marks to start and end the polygon, i.e. we plot 44.5
and 104.5 as well.
Then, the frequencies corresponding to the class marks are plotted against each class mark. The
frequencies for class marks 44.5 and 104.5 are zero, so these two points touch the x-axis;
they are used only to give a closed shape to the polygon. The
polygon looks like this:
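As the figure is not reproduced in these notes, the points of the polygon can be listed with a short sketch. It assumes equal class widths and adds one zero-frequency class mark on each side, as described in the answer above.

```python
def frequency_polygon_points(classes):
    """classes: list of (lower, upper, frequency) with equal widths.

    Returns (class mark, frequency) points, closed with a zero-frequency
    point one class width before the first and after the last class mark.
    """
    width = classes[0][1] - classes[0][0]
    marks = [(lower + upper) / 2 for lower, upper, _ in classes]
    points = [(marks[0] - width, 0)]
    points += [(mark, freq) for mark, (_, _, freq) in zip(marks, classes)]
    points.append((marks[-1] + width, 0))
    return points

scores = [(49.5, 59.5, 5), (59.5, 69.5, 10), (69.5, 79.5, 30),
          (79.5, 89.5, 40), (89.5, 99.5, 15)]
polygon = frequency_polygon_points(scores)
print(polygon)
```

Joining these points in order with straight line segments gives the frequency polygon for Question 1.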
Ogive
The word Ogive is a term used in architecture to describe curves or curved shapes. Ogives are
graphs that are used to estimate how many numbers lie below or above a particular variable or
value in data. To construct an Ogive, firstly, the cumulative frequency of the variables is
calculated using a frequency table. It is done by adding the frequencies of all the previous
variables in the given data set. The result or the last number in the cumulative frequency table is
always equal to the total frequencies of the variables. The most commonly used graphs of the
frequency distribution are histogram, frequency polygon, frequency curve, and Ogives
(cumulative frequency curves). Let us discuss one of the graphs called “Ogive” in detail. Here,
we are going to have a look at what is an Ogive, graph, chart and an example in detail.
Table of Contents:
Definition
Graph
o Less Than Ogive
o More Than Ogive
Chart
Uses
Examples
Practice Questions
FAQs
Ogive Definition
The Ogive is defined as the frequency distribution graph of a series. The Ogive is a graph of a
cumulative distribution, which explains data values on the horizontal plane axis and either the
cumulative relative frequencies, the cumulative frequencies or cumulative per cent frequencies
on the vertical axis.
Cumulative frequency is defined as the sum of all the previous frequencies up to the current
point. To find the popularity of the given data or the likelihood of the data that fall within the
certain frequency range, Ogive curve helps in finding those details accurately.
Create the Ogive by plotting the point corresponding to the cumulative frequency of each class
interval. Most of the Statisticians use Ogive curve, to illustrate the data in the pictorial
representation. It helps in estimating the number of observations which are less than or equal to
the particular value.
Ogive Graph
The graphs of the frequency distribution are frequency graphs that are used to exhibit the
characteristics of discrete and continuous data. Such figures are more appealing to the eye than
the tabulated data. It helps us to facilitate the comparative study of two or more frequency
distributions. We can relate the shape and pattern of the two frequency distributions.
The graph given above represents less than and the greater than Ogive curve. The rising curve
(Rose Curve) represents the less than Ogive, and the falling curve (Purple Curve) represents the
greater than Ogive.
Less than Ogive
The frequencies of all preceding classes are added to the frequency of a class. This series is
called the less than cumulative series. It is constructed by adding the first-class frequency to the
second-class frequency and then to the third class frequency and so on. The downward
cumulation results in the less than cumulative series.
Greater than or More than Ogive
The frequencies of the succeeding classes are added to the frequency of a class. This series is
called the more than or greater than cumulative series. It is constructed by subtracting the first
class frequency from the total, the second class frequency from that result, the third class
frequency from that, and so on. The upward cumulation results in the greater than or more than
cumulative series.
Ogive Chart
An Ogive Chart is a curve of the cumulative frequency distribution or cumulative relative
frequency distribution. For drawing such a curve, the frequencies must be expressed as a
percentage of the total frequency. Then, such percentages are cumulated and plotted, as in the
case of an Ogive. Below are the steps to construct the less than and greater than Ogive.
Ogive Example
Question 1:
Construct the more than cumulative frequency table and draw the Ogive for the below-given
data.
Class      1-10  11-20  21-30  31-40  41-50  51-60  61-70  71-80
Frequency  3     8      12     14     10     6      5      2
Solution:
Class          Frequency   More than cumulative frequency
More than 1    3           60
More than 11   8           57
More than 21   12          49
More than 31   14          37
More than 41   10          23
More than 51   6           13
More than 61   5           7
More than 71   2           2
Plotting an Ogive:
Plot the points with coordinates such as (70.5, 2), (60.5, 7), (50.5, 13), (40.5, 23), (30.5, 37),
(20.5, 49), (10.5, 57), (0.5, 60).
An Ogive is connected to a point on the x-axis, that represents the actual upper limit of the last
class, i.e.,( 80.5, 0)
Scale: Y-axis: 1 cm = 10 c.f.
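The more than cumulative frequencies and the plotting points for this question can be reproduced with a short sketch. The lower class boundaries 0.5, 10.5, ..., 70.5 are taken from the plotted points above; the function name is illustrative.

```python
def more_than_cumulative(frequencies):
    """More-than c.f.: each entry is the total of this class and all later ones."""
    remaining = sum(frequencies)
    result = []
    for freq in frequencies:
        result.append(remaining)
        remaining -= freq      # subtract this class before moving to the next
    return result

frequencies = [3, 8, 12, 14, 10, 6, 5, 2]
lower_boundaries = [0.5, 10.5, 20.5, 30.5, 40.5, 50.5, 60.5, 70.5]

more_than = more_than_cumulative(frequencies)
points = list(zip(lower_boundaries, more_than))
points.append((80.5, 0))       # upper boundary of the last class closes the curve
print(points)
```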
Ogive (Cumulative frequency curve)
In a continuous series, if the upper limit (or the lower limit) of each class
interval is taken as x-coordinate and its corresponding cumulative frequency
(c.f.) as y-coordinate and the points are plotted in the graph, we obtain a
curve by joining the points with freehand. Such a curve is known as
the ogive or cumulative frequency curve. It is also called a free hand
curve.
Since less than c.f. and more than c.f. are the two types of cumulative
frequencies, there are two types of ogives as well. They are the less than
ogive and the more than ogive.
When the upper limit of each class interval is taken as x-coordinate and its
corresponding less than cumulative frequency as y-coordinate, the ogive so
obtained is known as less than ogive (or less than cumulative frequency curve).
Obviously, less than ogive is an increasing curve, sloping upwards from left
to right and has the shape of an elongated S.
More than ogive is a decreasing curve sloping downward from left to right
and has the shape of an elongated S, upside down.
Examples:
Example 1: The table given below shows the marks obtained by 80 students in
science. Construct (i) less than ogive (ii) more than ogive.
Solution: Here,
Here, we have the coordinates to draw less than ogive: (10, 3), (20, 11), (30, 28),
(40, 57), (50, 72), (60, 78) and (70, 80).
Plotting these points on a graph, we have the following less than ogive.
More than cumulative frequency distribution table:
Here, we have the coordinates to draw more than ogive: (0, 80), (10, 77), (20,
69), (30, 52), (40, 23), (50, 8) and (60, 2).
Plotting these points on a graph, we have the following more than ogive.
Less than and more than (Combined) Ogive
If we draw both less than and more than ogive of a distribution on the same graph
paper, it is called combined ogive. The point of intersection of two ogives in a
combined ogive is called the vital-point.
The foot of the perpendicular drawn from vital-point to x-axis gives the value of the
median of the distribution. And, the corresponding class is the proper class for the
median.
Uses of ogive
Following are the uses of cumulative frequency curve (ogive):
Examples:
Example 2: Construct the combined ogive from the data given in the table of
Example 1, and find the median.
Solution: Here,
Here, we have the coordinates to draw more than ogive: (0, 80), (10, 77), (20,
69), (30, 52), (40, 23), (50, 8) and (60, 2).
Plotting these points on a graph, we have the following less than and more than
combined ogive.
A perpendicular is drawn from the point of intersection of two ogives which meets
the x-axis at 34.14 (approx.) units from the origin. So, the required median of the
given distribution is 34.14.
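The reading of 34.14 can be checked numerically. The sketch below treats both ogives as straight line segments between the coordinates listed in Example 1 and finds where they cross. Two end points are assumptions added here to close the curves: (0, 0) for the less than ogive and (70, 0) for the more than ogive.

```python
# Coordinates from Example 1; the first less-than point and the last
# more-than point are assumed closing points, not given in the text.
less_than = [(0, 0), (10, 3), (20, 11), (30, 28), (40, 57),
             (50, 72), (60, 78), (70, 80)]
more_than = [(0, 80), (10, 77), (20, 69), (30, 52), (40, 23),
             (50, 8), (60, 2), (70, 0)]

def ogive_intersection(less_than, more_than):
    """x-coordinate where two piecewise-linear ogives cross.

    Assumes both lists use the same x values in the same order.
    """
    diffs = [(x, y_less - y_more)
             for (x, y_less), (_, y_more) in zip(less_than, more_than)]
    for (x0, d0), (x1, d1) in zip(diffs, diffs[1:]):
        if d0 < 0 <= d1:       # the sign change marks the crossing segment
            return x0 + (x1 - x0) * (-d0) / (d1 - d0)
    raise ValueError("the ogives do not cross")

median = ogive_intersection(less_than, more_than)
print(round(median, 2))
```

The crossing lies in the class 30-40, which agrees with the statement that the corresponding class is the median class.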
Example 3: The table given below shows the marks obtained by 60 students in
mathematics. Construct a less than ogive and compute median, first quartile (Q1)
and third quartile (Q3).
Solution: Here,
Here, we have the coordinates to draw less than ogive: (10, 4), (20, 14), (30, 34),
(40, 49), (50, 55) and (60, 60).
Plotting these points on a graph, we have the following less than ogive.
The median lies at 50% of the total data, i.e. at a cumulative frequency of 30 out of 60.
Therefore,
Median = 28 (approx.)
Q1 = 20.5 (approx.)
Q3 = 37.33 (approx.)
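These three readings can be reproduced by linear interpolation along the less than ogive. The sketch below uses the coordinates given in Example 3, with (0, 0) added as an assumed starting point, and reads off the x-value at 25%, 50% and 75% of the total frequency.

```python
def ogive_quantile(points, fraction):
    """Read a quantile off a less-than ogive by linear interpolation.

    points: (upper limit, less-than cumulative frequency) pairs in order.
    fraction: 0.25 for Q1, 0.5 for the median, 0.75 for Q3.
    """
    total = points[-1][1]
    target = fraction * total
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target <= c1 and c1 > c0:
            return x0 + (x1 - x0) * (target - c0) / (c1 - c0)
    raise ValueError("target cumulative frequency not covered by the ogive")

# Coordinates from Example 3; (0, 0) is an assumed starting point.
ogive = [(0, 0), (10, 4), (20, 14), (30, 34), (40, 49), (50, 55), (60, 60)]

q1 = ogive_quantile(ogive, 0.25)
q2 = ogive_quantile(ogive, 0.50)
q3 = ogive_quantile(ogive, 0.75)
print(round(q1, 2), round(q2, 2), round(q3, 2))
```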
Example 4: Draw the less than ogive from the following data and the questions.
Solution: Here,
Plotting these points on a graph, we have the following less than ogive.
Example 5: From the given cumulative frequency curve (ogive), determine the
lower quartile (Q1) class, median (Q2) class and upper quartile (Q3) class.
∴ Q1 class = 10-20
∴ Q2 class = 30-40
∴ Q3 class = 40-50
What is an Ogive?
An ogive is a freehand curve drawn to show the cumulative frequency distribution. It is
also known as a cumulative frequency polygon.
Q2
The two types of ogives are less than ogive and greater than or more than ogive. In a less than
ogive, the frequencies of all preceding classes are added to the frequency of a class. In a more
than ogive, the frequencies of the succeeding classes are added to the frequency of a class.
Q3
Q4
Quartiles are the values that divide a list of numerical data into four quarters. The middle two
quarters measure the central part of the distribution and show the data which are near
the central point. The lower quartile is the median of the half of the data set which falls
below the median, and the upper quartile is the median of the remaining half, which falls above the
median. In all, the quartiles depict the distribution or dispersion of the data set.
Quartiles Definition
Quartiles divide the entire set into four equal parts. So, there are three quartiles, first, second and
third represented by Q1, Q2 and Q3, respectively. Q2 is nothing but the median, since it indicates
the position of the item in the list and thus, is a positional average. To find quartiles of a group of
data, we have to arrange the data in ascending order.
In the median, we can measure the distribution with the help of lesser and higher quartile. Apart
from mean and median, there are other measures in statistics, which can divide the data into
specific equal parts. A median divides a series into two equal parts. We can partition values of a
data set mainly into three different ways:
1. Quartiles
2. Deciles
3. Percentiles
Quartiles Formula
Q3, the upper quartile, is the median of the upper half of the data set, whereas Q1,
the lower quartile, is the median of the lower half of the data set. Q2 is the median. Suppose we
have n items in a data set. Then the quartiles are given by;
Q1 = [(n+1)/4]th item
Q2 = [(n+1)/2]th item
Q3 = [3(n+1)/4]th item
f is the frequency
Quartiles in Statistics
Similar to the median, which divides the data in half so that 50% of the observations lie below
the median and 50% lie above it, the quartiles split the data into quarters so that 25% of the
observations are less than the lower quartile, 50% are less than the median, and 75% are
less than the upper quartile. Usually, the data is ordered from smallest to largest.
Quartile Deviation
You have learned about standard deviation in statistics. Quartile deviation is defined as half of
the distance between the third and the first quartile. It is also called the semi-interquartile range. If
Q1 is the first quartile and Q3 is the third quartile, then the formula for quartile deviation is given by;
Quartile Deviation = (Q3 – Q1)/2
Interquartile Range
The interquartile range (IQR) is the difference between the upper and lower quartile of a given
data set and is also called a midspread. It is a measure of statistical distribution, which is equal
to the difference between the upper and lower quartiles. Also, it is a calculation of variation
while dividing a data set into quartiles. If Q1 is the first quartile and Q3 is the third quartile, then
the IQR formula is given by;
IQR = Q3 – Q1
Quartiles Examples
Question 1: Find the quartiles of the following data: 4, 6, 7, 8, 10, 23, 34.
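Solution: Here the numbers are arranged in ascending order and the number of items, n = 7
Q1 = [(7+1)/4]th item = 2nd item = 6
Q2 = [(7+1)/2]th item = 4th item = 8
Q3 = [3(7+1)/4]th item = 6th item = 23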
Solution:
Number of items, n = 8
Similarly,
Q2= 26+0.5(26-26) = 26
And,
Q3 = 35+0.75(35-35) = 35
Example 1: Calculate the median, lower quartile, upper quartile, and interquartile range of the
following data set of values: 20, 19, 21, 22, 23, 24, 25, 27, 26
Solution:
Arranging the values in ascending order: 19, 20, 21, 22, 23, 24, 25, 26, 27
Median (Q2) = 5th term = 23
Lower Quartile (Q1) = Mean of 2nd and 3rd term = (20 + 21)/2 = 20.5
Upper Quartile (Q3) = Mean of 7th and 8th term = (25 + 26)/2 = 25.5
IQR = Q3 – Q1 = 25.5 – 20.5 = 5
Answer: Median = 23, Q1 = 20.5, Q3 = 25.5, IQR = 5
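The method used in this example (take the median, then take the median of the values below it and of the values above it) can be sketched as follows. This is one of several quartile conventions and can give slightly different answers from the (n+1)/4 position formula on other data sets; the function names are illustrative.

```python
def median(sorted_values):
    """Median of an already sorted, non-empty list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def quartiles_by_halves(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    For an odd number of values the middle value is left out of both halves.
    """
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    upper = values[half + 1:] if n % 2 else values[half:]
    return median(lower), median(values), median(upper)

q1, q2, q3 = quartiles_by_halves([20, 19, 21, 22, 23, 24, 25, 27, 26])
iqr = q3 - q1                  # interquartile range
quartile_deviation = iqr / 2   # semi-interquartile range
print(q1, q2, q3, iqr, quartile_deviation)
```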
Example 2: What will be the upper quartile for the following set of numbers?
26, 19, 5, 7, 6, 9, 16, 12, 18, 2, 1.
Solution:
The formula for the upper quartile is Q3 = ¾(n + 1)th term.
Instead of giving the value of the upper quartile, the formula gives us its place, for example, 8th
place, 10th place, etc.
So first we put the numbers in ascending order: 1, 2, 5, 6, 7, 9, 12, 16, 18, 19, 26. There are a
total of 11 numbers, so:
Q3 = ¾(11 + 1)th term = 9th term
Answer: The upper quartile (18) is the 9th term, that is, in the 9th place from the left.
Example 3: Find the 3rd quartile in the following data set: 4, 5, 8, 7, 11, 9, 9
Solution:
Arranging the values in ascending order: 4, 5, 7, 8, 9, 9, 11. The median is the 4th term, 8.
In order to find the 3rd quartile, we have to find the median of the data points that are greater
than the median, that is 9, 9, 11. Hence, the 3rd quartile is 9.
Decile
Decile is a method that is used to divide a distribution into ten equal parts. When data is divided
into deciles a decile rank is assigned to each data point in order to sort the data into ascending or
descending order. A decile has 10 categorical buckets while a quartile has 4 and a percentile has
100.
The concept of a decile is used widely in the field of finance and economics to perform the
analysis of data. It can be used to check the performance of a portfolio in the field of finance. In
this article, we learn more about a decile, its definition, rank, and see associated examples on
calculating the decile value.
What is Decile?
Decile, percentile, quartile, and quintile are different types of quantiles in statistics. A quantile
refers to a value that divides the observations in a sample into equal subsections. There will
always be 1 lesser quantile than the number of subsections created.
Decile Definition
Decile is a type of quantile that divides the dataset into 10 equal subsections with the help of 9
data points. Each section of the sorted data represents 1/10 of the original sample or population.
Decile helps to order large amounts of data in the increasing or decreasing order. This ordering is
done by using a scale from 1 to 10 where each successive value represents an increase by 10
percentage points.
Decile Formula
The decile formulas can be used to calculate the deciles for grouped and ungrouped data. When
data is in its raw form it is known as ungrouped data. When this data is sorted and organized then
it forms grouped data. These are given as follows:
Decile Formula for ungrouped data: D(x) = Value of the [x(n+1)/10]th term in the data set
x is the value of the decile that needs to be calculated and ranges from 1 to 9. n is the total
number of observations in that data set.
Decile Formula for grouped data: D(x) = l + (w/f)(Nx/10 − C)
l is the lower boundary of the class containing the decile, which is located at (x × cf) / 10, cf is the
cumulative frequency of the entire data set, w is the size of the class, f is the frequency of the
class containing the decile, N is the total frequency, C
is the cumulative frequency of the preceding class.
The next section will cover the steps for calculating a particular decile.
Decile Example
Suppose a data set consists of the following numbers: 24, 32, 27, 32, 23, 62, 45, 80, 59, 63, 36,
54, 57, 36, 72, 55, 51, 32, 56, 33, 42, 55, 30. The value of the first two deciles has to be
calculated. The steps required are as follows:
Step 1: Arrange the data in increasing order. This gives 23, 24, 27, 30, 32, 32, 32, 33, 36, 36, 42,
45, 51, 54, 55, 55, 56, 57, 59, 62, 63, 72, 80.
Step 2: Identify the total number of points. Here, n = 23
Step 3: Apply the decile formula to calculate the position of the required data point. D(1) =
1(n+1)/10 = 24/10 = 2.4. This implies the value of the 2.4th data point has to be determined. This will
lie between the scores in the 2nd and 3rd positions. In other words, the 2.4th data point is 0.4 of the
way between the scores 24 and 27.
Step 4: The value of the decile can be determined as [lower score + (distance)(higher score - lower
score)]. This is given as 24 + 0.4 * (27 – 24) = 25.2
Step 5: Apply steps 3 and 4 to determine the rest of the deciles. D(2) = 2(n+1)/10
= 4.8th data point, which lies between the 4th and 5th values. Thus, D(2) = 30 + 0.8 * (32 – 30) = 31.6
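Steps 1 to 5 can be collected into a small function. This is a minimal sketch of the x(n+1)/10 position rule with linear interpolation between neighbouring values, as used in the steps above; the function name and the handling of positions outside the data are assumptions.

```python
def decile(data, x):
    """x-th decile (x = 1..9) of ungrouped data via the x(n+1)/10 position."""
    values = sorted(data)                  # Step 1: arrange in increasing order
    n = len(values)                        # Step 2: number of data points
    position = x * (n + 1) / 10            # Step 3: 1-based position
    k = int(position)                      # whole part of the position
    fraction = position - k                # distance towards the next value
    if k < 1:                              # position before the first value
        return values[0]
    if k >= n:                             # position at or beyond the last value
        return values[-1]
    # Step 4: lower score + distance * (higher score - lower score)
    return values[k - 1] + fraction * (values[k] - values[k - 1])

scores = [24, 32, 27, 32, 23, 62, 45, 80, 59, 63, 36, 54,
          57, 36, 72, 55, 51, 32, 56, 33, 42, 55, 30]
print(round(decile(scores, 1), 2), round(decile(scores, 2), 2))
```

Calling the same function with x = 5 gives the median, since the 5th decile splits the data in half.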
A decile is a quantile that is used to divide a data set into 10 equal subsections.
The 5th decile will be the median for the dataset.
The decile formula for ungrouped data is given as D(x) = [x(n+1)/10]th term.
Decile Worksheet
Examples on Decile
Example 1: Find the 6th and the 9th decile for the data in the above-mentioned example.
Solution: The arranged data is 23, 24, 27, 30, 32, 32, 32, 33, 36, 36, 42, 45, 51, 54, 55,
55, 56, 57, 59, 62, 63, 72, 80
n = 23
D(6) = 6(n+1)/10
= 14.4th data. This lies between 54 and 55.
D(6) = 54 + 0.4 * (55 – 54) = 54.4
D(9) = 9(n+1)/10
= 21.6th data. This lies between 63 and 72.
D(9) = 63 + 0.6 * (72 – 63) = 68.4
Example 2: Find the median of the following data set using the concept of deciles.
55, 58, 61, 67, 68, 70, 74, 81, 82, 93, 20, 28, 29, 30, 36, 37, 39, 42, 53, 54
Solution: Arranging the data in increasing order 20, 28, 29, 30, 36, 37, 39, 42, 53, 54, 55, 58, 61, 67, 68,
70, 74, 81, 82, 93
The fifth decile is the median of the data set, thus,
n = 20
D(5) = 5(n+1)/10
= 10.5th data. This lies between 54 and 55.
D(5) = 54 + 0.5 * (55 – 54) = 54.5
Example 3: Find the 7th decile for the following frequency distribution table.
Class Frequency
10 - 20 15
20 - 30 10
30 - 40 12
40 - 50 8
50 - 60 7
60 - 70 18
70 - 80 5
80 - 90 25
Solution:
From the given frequency distribution table, we can have,
Class      Frequency   Cumulative Frequency (cf)
10 - 20    15          15
20 - 30    10          25
30 - 40    12          37
40 - 50    8           45
50 - 60    7           52
60 - 70    18          70
70 - 80    5           75
80 - 90    25          100
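Position of D(7) = (7 × 100)/10 = 70, so the 7th decile lies in the class 60 - 70.
Here l = 60, w = 10, f = 18, N = 100 and C = 52.
D(7) = l + (w/f)(Nx/10 − C)
= 60 + (10/18)((7 × 100)/10 − 52)
= 60 + (10/18)(18)
= 70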
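The grouped-data formula D(x) = l + (w/f)(Nx/10 − C) can be sketched as a function that walks through the classes until the cumulative frequency reaches Nx/10. The function name is illustrative, and the choice of the first class whose cumulative frequency reaches the target is an assumption that matches the worked example.

```python
def grouped_decile(classes, x):
    """x-th decile of grouped data: D(x) = l + (w / f) * (N * x / 10 - C).

    classes: list of (lower boundary, upper boundary, frequency) in order.
    """
    total = sum(freq for _, _, freq in classes)   # N
    target = total * x / 10                       # position N * x / 10
    cumulative = 0                                # C, c.f. of preceding classes
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= target:
            width = upper - lower                 # w
            return lower + width / freq * (target - cumulative)
        cumulative += freq
    raise ValueError("decile position not reached")

table = [(10, 20, 15), (20, 30, 10), (30, 40, 12), (40, 50, 8),
         (50, 60, 7), (60, 70, 18), (70, 80, 5), (80, 90, 25)]
d7 = grouped_decile(table, 7)
print(round(d7, 2))
```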
A decile in statistics is a method to divide the distribution into 10 equal parts by using 9 data
points and assigning decile ranks to each point.
Once the data set is sorted into deciles then a decile class rank is assigned to each point so as to
arrange these deciles into increasing order.
The decile formula for ungrouped data is determined by the value of the [x(n+1)/10]th
term. The formula for grouped data is l + (w/f)(Nx/10 − C)
How to Find the Value of the Median Using the Decile Formula?
The value of the 5th decile represents the median. For ungrouped data the median will be given
by D(5) = [5(n+1)/10]th term.
The first decile is a point such that 90% of the data lies above it and 10% of the data lies below
it. Similarly, the 2nd decile is a point with 20% of data lying below it and 80% lying above it.
The steps to calculate the decile for ungrouped data are as follows: