
CLASS 5.2 B
BUSINESS STATISTICS
FREQUENCY
DISTRIBUTION
RESEARCH SCHOLAR
PRIYA CHUGH
Frequency Distribution
A frequency distribution shows the frequency of repeated items in a graphical form or tabular form. It gives a visual display of
the frequency of items or shows the number of times they occurred.

What is Frequency Distribution?


Frequency distribution is used to organize the collected data in table form. The data could be marks scored by students,
temperatures of different towns, points scored in a volleyball match, etc. After data collection, we have to show data in a
meaningful manner for better understanding. Organize the data in such a way that all its features are summarized in a table.
Let's consider an example to understand this better. The following are the scores of 10 students in the G.K. quiz released by
Mr. Chris: 15, 17, 20, 15, 20, 17, 17, 14, 14, 20. Let's represent this data in tabular form and find the number of students
who got the same marks.
Quiz Marks    No. of Students
15            2
17            3
20            3
14            2
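The same count can be reproduced programmatically. The following is a minimal sketch in Python; the variable names `scores` and `frequency` are chosen here for illustration and are not part of the original notes.

```python
from collections import Counter

# Scores of the 10 students in the G.K. quiz (from the example above)
scores = [15, 17, 20, 15, 20, 17, 17, 14, 14, 20]

# Counter maps each distinct mark to the number of times it occurs
frequency = Counter(scores)

print("Quiz Marks  No. of Students")
for mark in sorted(frequency):
    print(f"{mark:<11} {frequency[mark]}")
```

Sorting the marks before printing is a presentation choice; the frequencies themselves do not depend on the order.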
Let us consider the following distribution of marks of 200 students in an examination, arranged serially in order of their roll
numbers. TABLE 3·5.

MARKS OF 200 STUDENTS


70 45 33 64 50 25 65 75 30 20 41 53 48 21 28 55 60 65 58 52 36 45 42 35 40 30 33 37 35 29 51 47 39 61 53 59 49 41 15 53 43 32
24 38 38 42 63 78 65 45 63 54 52 48 46 46 50 26 15 23 57 53 55 42 45 39 64 35 26 18 41 38 40 37 40 49 42 36 41 29 46 40 32 34
44 54 35 39 31 48 37 38 40 32 49 48 50 43 55 43 39 41 48 53 34 22 41 50 17 46 32 31 42 34 34 32 33 24 43 39 42 25 52 38 46 40
50 27 47 34 44 34 33 47 42 48 45 30 28 31 17 42 57 35 38 17 33 46 36 23 42 21 51 37 42 37 38 42 49 52 38 53 57 47 59 61 33 17
71 39 44 42 39 16 17 27 19 54 51 39 43 42 16 37 67 62 39 51 53 41 53 59 37 27 29 33 34 42 22 31

• Data in the above form are called raw or disorganised data.
• In the raw form the data are unwieldy.
• The above presentation of the data in its raw form does not give us any useful information and is rather confusing to the mind.
• Our objective is to express the huge mass of data in a suitable condensed form which highlights the significant facts
and comparisons and furnishes more useful information without sacrificing any information of interest about the important
characteristics of the distribution.
Construction of Frequency Distribution
The following steps are involved in the construction of a frequency distribution.

Arrange data into an array.


The first step in organizing the data is to arrange them in an array so that we can observe the data in a more
meaningful and systematic manner. Notice that data can be arranged from lowest to highest values (ascending
order) or from highest to lowest values (descending order).

Find the range of the data: The range is the difference between the largest and the smallest values.

Decide the approximate number of classes in which the data are to be grouped. There are no hard and fast rules for the
number of classes.
H.A. Sturges provides a formula for determining the approximate number of classes.
K = 1 + 3.322 log N
where K = number of classes,
N = total number of observations,
and log N = logarithm (base 10) of the total number of observations.
Example: If the total number of observations is 50, the number of classes would be
K = 1 + 3.322 log N
K = 1 + 3.322 log 50
K = 1 + 3.322 (1.69897)
K = 1 + 5.644
K = 6.644
or 7 classes, approximately.
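Sturges' formula is easy to evaluate in code. The sketch below is an illustration only: the function name `sturges_classes` is hypothetical, and rounding the result up to the next whole number is an assumption (the notes only say "approximately").

```python
import math

def sturges_classes(n):
    """Return (exact K, K rounded up) from K = 1 + 3.322 * log10(N)."""
    k = 1 + 3.322 * math.log10(n)
    # Assumption: round up so that no part of the range is left uncovered
    return k, math.ceil(k)

k, classes = sturges_classes(50)
print(round(k, 3), classes)
```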
(3) Determine the approximate class interval size: The size of the class interval is obtained by dividing the range of the data by
the number of classes and is denoted by h:
h = Range / Number of Classes
In the case of fractional results, the next higher whole number is taken as the size of the class interval.

(4) Decide the starting point: The lower class limit or class boundary of the first class should cover the smallest value in the
raw data. It is usually taken as a multiple of the class interval size.
Example: 0, 5, 10, 15, 20, etc. are commonly used.

(5) Determine the remaining class limits (boundaries): Once the lowest class boundary has been decided, the upper class
boundary is computed by adding the class interval size to the lower class boundary. The remaining lower and upper class
limits are determined by adding the class interval size repeatedly until the largest value of the data is covered by a class.

(6) Distribute the data into respective classes: All the observations are divided into respective classes by using the 
tally bar (tally mark) method, which is suitable for tabulating the observations into respective classes.
The number of tally bars is counted to get the frequency against each class.
The frequency of all the classes is noted to get the grouped data or frequency distribution of the data.
The total of the frequency columns must be equal to the number of observations.
Example Construction of Frequency Distribution
Construct a frequency distribution with the suitable class interval size of marks obtained by 50 students of a
class, which are given below:
23,50,38,42,63,75,12,33,26,39,35,47,43,52,56,59,64,77,15,21,51,54,72,68,36,65,52,60,27,34,47,48,
55,58,59,62,51,48,50,41,57,65,54,43,56,44,30,46,67,53

Solution:
Arrange the marks in ascending order as

12, 15, 21, 23, 26, 27, 30, 33, 34, 35, 36, 38, 39, 41, 42, 43, 43, 44, 46, 47, 47, 48, 48, 50, 50, 51, 51, 52, 52, 53, 54,
54, 55, 56, 56, 57, 58, 59, 59, 60, 62, 63, 64, 65, 65, 67, 68, 72, 75, 77
Minimum Value = 12   Maximum = 77
Range = Maximum Value – Minimum Value = 77 – 12 = 65
Number of Classes = 1+3.322logN
Number of Classes =  1+3.322log50
Number of Classes = 1+3.322(1.69897)
Number of Classes = 1+5.64 = 6.64 or 7, approximately.
Class Interval Size (h) = Range/No. of Classes 
= 65/7 = 9.3 or 10
Note: To find the class boundaries, we take half of the difference between the lower class limit of the 2nd class and
the upper class limit of the 1st class: (20 – 19)/2 = 1/2 = 0.5. This value is subtracted from the lower class limit
and added to the upper class limit to get the required class boundaries.

Marks (Class Limits)   Number of Students   Class Boundary   Class Marks
C.L                    f                    C.B              x
10–19                  2                    9.5–19.5         (10+19)/2 = 14.5
20–29                  4                    19.5–29.5        (20+29)/2 = 24.5
30–39                  7                    29.5–39.5        (30+39)/2 = 34.5
40–49                  10                   39.5–49.5        (40+49)/2 = 44.5
50–59                  16                   49.5–59.5        (50+59)/2 = 54.5
60–69                  8                    59.5–69.5        (60+69)/2 = 64.5
70–79                  3                    69.5–79.5        (70+79)/2 = 74.5
Total                  50
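The whole construction can be checked with a short script. This is a sketch under the assumptions used in the solution above: Sturges' formula rounded up, the class width rounded up to the next whole number, and a starting point taken as the multiple of the width at or just below the minimum. The variable names are illustrative.

```python
import math

marks = [23, 50, 38, 42, 63, 75, 12, 33, 26, 39, 35, 47, 43, 52, 56, 59, 64,
         77, 15, 21, 51, 54, 72, 68, 36, 65, 52, 60, 27, 34, 47, 48, 55, 58,
         59, 62, 51, 48, 50, 41, 57, 65, 54, 43, 56, 44, 30, 46, 67, 53]

data_range = max(marks) - min(marks)                          # 77 - 12
num_classes = math.ceil(1 + 3.322 * math.log10(len(marks)))   # Sturges, rounded up
width = math.ceil(data_range / num_classes)                   # next higher whole number
start = (min(marks) // width) * width                         # multiple of the width

rows = []
for i in range(num_classes):
    lower = start + i * width
    upper = lower + width - 1                 # inclusive class limits, e.g. 10-19
    f = sum(lower <= m <= upper for m in marks)
    boundary = (lower - 0.5, upper + 0.5)     # class boundaries
    class_mark = (lower + upper) / 2          # midpoint of the class
    rows.append((lower, upper, f, boundary, class_mark))

for lower, upper, f, boundary, class_mark in rows:
    print(f"{lower}-{upper}  f={f}  C.B={boundary[0]}-{boundary[1]}  x={class_mark}")
print("Total", sum(r[2] for r in rows))
```

The half-unit adjustment of 0.5 is hard-coded here because the marks are whole numbers; for data recorded to other precisions the gap between consecutive classes would differ.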
Types of frequency distribution
There are different types of frequency distributions.

1. Ungrouped or Discrete frequency distribution
2. Grouped or Continuous frequency distribution
3. Cumulative frequency distribution
4. Relative frequency distribution
5. Relative cumulative frequency distribution
6. Bivariate frequency distribution
1. Discrete or Ungrouped Frequency Distribution
There are no class boundaries because discrete data do not take fractional values.
For example, the following figures represent the number of HOUSEHELP to 50 women in a certain locality up to the age of 40 years.

The following Table 5 shows the frequency distribution table for the discrete data, taking the class interval size of 1.
Number of HOUSEHELP   Tally Marks    Number of Women
0                     //             2
1                     //             2
2                     ////           4
3                     /////          5
4                     ///// ///      8
5                     ///// /////    10
6                     ///// /        6
7                     ////           4
8                     ///            3
9                     /////          5
10                    /              1
Total                                50
2. Grouped or Continuous Frequency Distribution Table
To arrange a large number of observations, we use a grouped frequency distribution table. In this, we form class
intervals, each with a lower and an upper class limit, and tally the frequency of the data that belong to each class interval.
For example, the marks obtained by 20 students in a test are as follows: 5, 10, 20, 15, 5, 20, 20, 15, 15, 15, 10, 10, 10, 20, 15,
5, 18, 18, 18, 18. To arrange the data in a grouped table we have to make class intervals. Thus, we will make class intervals of
marks like 0 – 5, 6 – 10, and so on.
The table below has two columns: one for the class intervals (marks obtained in the test) and the other for the frequency (no. of
students). Here we have not used tally marks; the marks were counted directly.

Marks obtained in Test (class intervals)    No. of Students (Frequency)
0 – 5                                       3
6 – 10                                      4
11 – 15                                     5
16 – 20                                     8
Total                                       20
• In an exclusive series, the upper limit of one class interval is the lower limit of the next class interval, and the value of
the upper limit of a class is not included in that class.
• In an inclusive series, the value of the upper limit of a class is included in that same class interval.
• Counting: in an exclusive series, counting can be done directly; in an inclusive series, counting is not possible without
first converting it into an exclusive series.
• Generally, the inclusive method is used for discrete variables (number of workers, obtained marks, etc.).
• The exclusive method is used for continuous variables (income, age, weight, etc.).
• For ease of use, an inclusive series should therefore be converted into an exclusive series.
• For this, the difference between the upper limit of one class and the lower limit of the next class is halved and the result is
subtracted from the lower limits (l1) of the class and added to the upper limits (l2).
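The conversion rule in the last bullet can be sketched as follows. The function name `to_exclusive` and the sample class limits are illustrative assumptions, and the sketch assumes the gap between consecutive classes is the same throughout.

```python
def to_exclusive(class_limits):
    """Convert inclusive class limits (l1, l2) into exclusive class boundaries."""
    # Gap between the upper limit of the first class and the lower limit of the second
    gap = class_limits[1][0] - class_limits[0][1]
    half = gap / 2
    # Subtract half the gap from every lower limit and add it to every upper limit
    return [(l1 - half, l2 + half) for l1, l2 in class_limits]

inclusive = [(10, 19), (20, 29), (30, 39)]
exclusive = to_exclusive(inclusive)
print(exclusive)
```

After the conversion, the upper boundary of each class coincides with the lower boundary of the next, which is the defining property of an exclusive series.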
3. Cumulative Frequency Distribution
A cumulative frequency distribution represents the sum of all previous (or all
succeeding) frequencies up to a certain class.
The table showing the cumulative frequency is called a cumulative frequency
distribution or cumulative frequency distribution table or simply cumulative
frequency.
For example, referring to Table 1, the cumulative frequency for class 120-129 is 1 +
4 = 5.
Similarly, the cumulative frequency of the class 130-139 is 1 + 4 + 17 = 22.
It is interpreted as: there are 22 children who have weights less than 139.5
pounds.
The cumulative frequency is shown in the following Table 6.
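Both kinds of cumulative frequency can be computed with a running sum. The sketch below uses the class frequencies of the weight distribution as they appear in the relative frequency table of these notes (1, 4, 17, 28, 25, 18, 13, 6, 5, 2, 1); the variable names are illustrative.

```python
from itertools import accumulate

classes = ["110-119", "120-129", "130-139", "140-149", "150-159", "160-169",
           "170-179", "180-189", "190-199", "200-209", "210-219"]
frequencies = [1, 4, 17, 28, 25, 18, 13, 6, 5, 2, 1]

# "Less than" type: running total of this class and all previous classes
less_than = list(accumulate(frequencies))

# "More than" type: running total of this class and all succeeding classes
more_than = list(accumulate(reversed(frequencies)))[::-1]

for c, f, lt, mt in zip(classes, frequencies, less_than, more_than):
    print(f"{c}  f={f:<3} less-than cf={lt:<4} more-than cf={mt}")
```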
4. Relative Frequency Distribution
The frequency of a class divided by the total frequency is called the
relative frequency of that particular class.
The frequency distribution table showing the relative frequencies is
called relative frequency distribution or relative
frequency or percentage table.
Relative frequencies are generally expressed as a percentage. The sum
of the relative frequencies of all the classes is 1 or 100%. For example,
referring to Table 1, the relative frequency of the class 160-169 is 18/120 x
100 = 15%.
The following Table 7 gives the relative frequency distribution for the
weight distribution of Table 1.
Weight (lb) Relative Frequency
110-119 1/120 = 0.0083 or 0.83%
120-129 4/120 = 0.0333 or 3.33%
130-139 17/120 = 0.1417 or 14.17%
140-149 28/120 = 0.2333 or 23.33%
150-159 25/120 = 0.2083 or 20.83%
160-169 18/120 = 0.15 or 15%
170-179 13/120 = 0.1083 or 10.83%
180-189 6/120 = 0.05 or 5%
190-199 5/120 = 0.0417 or 4.17%
200-209 2/120 = 0.0167 or 1.67%
210-219 1/120 = 0.0083 or 0.83%
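The relative frequencies in the table can be recomputed as a check. This is a small sketch using the class frequencies listed above; the variable names are illustrative.

```python
classes = ["110-119", "120-129", "130-139", "140-149", "150-159", "160-169",
           "170-179", "180-189", "190-199", "200-209", "210-219"]
frequencies = [1, 4, 17, 28, 25, 18, 13, 6, 5, 2, 1]

total = sum(frequencies)                                # total frequency
percentages = [f * 100 / total for f in frequencies]   # relative frequency in %

for c, f, p in zip(classes, frequencies, percentages):
    print(f"{c}  {f}/{total} = {p:.2f}%")
```

Because each percentage is rounded to two decimals when printed, the printed values may add up to slightly less or more than 100%, while the unrounded values sum to 100%.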
5. Relative Cumulative Frequency Distribution
The cumulative frequency of a class divided by the total frequency is called the relative cumulative frequency. It is also
called the percentage cumulative frequency since it is expressed as a percentage. The table showing relative cumulative
frequencies is called the relative cumulative frequency distribution or percentage cumulative frequency
distribution. For example, referring to Table 6, the relative cumulative frequency of weight less than 159.5 is 75/120 x
100 = 62.5%. It means that 62.5% of the students have a weight less than 159.5 pounds. The following Table 8 gives the
relative cumulative frequency distribution for Table 6.
Weight (lb) Relative Cumulative Frequency
Less than 109.5 0%
Less than 119.5 1/120 = 0.0083 or 0.83%
Less than 129.5 5/120 = 0.0417 or 4.17%
Less than 139.5 22/120 = 0.1833 or 18.33%
Less than 149.5 50/120 = 0.4167 or 41.67%
Less than 159.5 75/120 = 0.6250 or 62.5%
Less than 169.5 93/120 = 0.7750 or 77.5%
Less than 179.5 106/120 = 0.8833 or 88.33%
Less than 189.5 112/120 = 0.9333 or 93.33%
Less than 199.5 117/120 = 0.9750 or 97.5%
Less than 209.5 119/120 = 0.9917 or 99.17%
Less than 219.5 120/120 = 1 or 100%
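As a check on Table 8, the percentage cumulative frequencies can be derived from the class frequencies with a running sum. This is a sketch only; `upper_boundaries` and the other names are illustrative.

```python
from itertools import accumulate

upper_boundaries = [119.5, 129.5, 139.5, 149.5, 159.5, 169.5,
                    179.5, 189.5, 199.5, 209.5, 219.5]
frequencies = [1, 4, 17, 28, 25, 18, 13, 6, 5, 2, 1]

total = sum(frequencies)
# Cumulative frequency of each class divided by the total, as a percentage
relative_cumulative = [cf * 100 / total for cf in accumulate(frequencies)]

for boundary, pct in zip(upper_boundaries, relative_cumulative):
    print(f"Less than {boundary}: {pct:.2f}%")
```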
6. Bi-variate Frequency Distribution
• So far we have considered frequency distributions which involved
only one variable.
• Such frequency distributions are called uni-variate frequency
distribution because they involve only one variable.
• We can also construct a distribution taking two variables at a time.
• The frequency distribution involving two variables is called bivariate
frequency distribution or bivariate frequency table or
simply bivariate distribution or bivariate table.
• For example, in the data provided below we have the heights in inches
and weights in pounds of 50 students at a certain college.
Height (inches) 60 62 61 70 64 60 65 65 73 71
Weight (lb) 100 105 104 115 110 102 110 108 119 118
Height (inches) 61 60 63 64 67 68 69 64 66 62
Weight (lb) 109 108 107 112 115 117 117 111 113 104
Height (inches) 63 67 71 70 68 68 71 64 63 68
Weight (lb) 108 108 116 110 114 116 119 107 108 105
Height (inches) 73 69 64 67 67 64 62 67 62 64
Weight (lb) 119 107 115 111 114 108 105 117 105 107
Height (inches) 65 66 67 68 61 64 65 67 66 69
Weight (lb) 108 116 118 115 104 108 109 113 113 115
From the above data we will prepare frequency distribution, taking the class interval of size 3 for heights and a class interval of
size 5 pounds for weights.
We will arrange the class limits for heights in columns and those of weights in rows as provided in Table 9 below.
The data are classified by taking each pair of values of the two variables and placing a tally mark in the cell lying at
the intersection of the appropriate classes of the two variables.
For example, the tally mark for the height of 60 inches and weight of 100 pounds will be marked at the intersection of the classes
60-62 for heights and 100-104 for weights.
The following Table 9 shows bivariate frequency distribution by tally marks and can also show bivariate frequency distribution
by listing of actual values.
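The cell counts of such a bivariate table can be produced by classifying each (height, weight) pair. The sketch below uses the 50 pairs listed above with class widths of 3 inches and 5 pounds; the helper `class_of` and the starting points 60 and 100 are assumptions chosen to match the classes 60-62 and 100-104 mentioned in the text.

```python
from collections import Counter

heights = [60, 62, 61, 70, 64, 60, 65, 65, 73, 71,
           61, 60, 63, 64, 67, 68, 69, 64, 66, 62,
           63, 67, 71, 70, 68, 68, 71, 64, 63, 68,
           73, 69, 64, 67, 67, 64, 62, 67, 62, 64,
           65, 66, 67, 68, 61, 64, 65, 67, 66, 69]
weights = [100, 105, 104, 115, 110, 102, 110, 108, 119, 118,
           109, 108, 107, 112, 115, 117, 117, 111, 113, 104,
           108, 108, 116, 110, 114, 116, 119, 107, 108, 105,
           119, 107, 115, 111, 114, 108, 105, 117, 105, 107,
           108, 116, 118, 115, 104, 108, 109, 113, 113, 115]

def class_of(value, start, width):
    """Return the inclusive class limits (lower, upper) that contain value."""
    lower = start + ((value - start) // width) * width
    return (lower, lower + width - 1)

# Each key is a (height class, weight class) cell; each value is its frequency
table = Counter(
    (class_of(h, 60, 3), class_of(w, 100, 5)) for h, w in zip(heights, weights)
)

height_classes = sorted({cell[0] for cell in table})
weight_classes = sorted({cell[1] for cell in table})
for wc in weight_classes:                      # weights in rows
    counts = [table[(hc, wc)] for hc in height_classes]   # heights in columns
    print(f"{wc[0]}-{wc[1]}", counts)
```

A `Counter` returns 0 for cells that never occur, so empty cells print as zero without special handling.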
THANK YOU
CLASS 5.2 B
BUSINESS STATISTICS
TYPES AND CONSTRUCTION
OF DIAGRAMS AND GRAPHS
RESEARCH SCHOLAR
PRIYA CHUGH
Difference Between Diagrams And Graphs
There is no clear-cut line of demarcation between a diagram and a graph yet:

1. A graph needs a graph paper but a diagram can be drawn on a plain paper. In the technical way we can say
that a graph is a mathematical relation between two variables. This however is not the case of a diagram.
2. As diagrams are attractive to look at, they are used for publicity and propaganda. Graphs on the other hand
are more useful to statisticians and research workers for the purpose of further analysis.
3. For representing frequency distribution, diagrams are rarely used when compared with graphs. For example,
for the time series graphs are more appropriate than diagrams.

Uses of Diagrams and Graphs:


Diagrams and graphs are extremely useful due to the following reasons:
1. Information presented through diagrams and graphs can be understood easily, at a glance.
2. They are appealing to the eye, so scholars take greater interest in presenting data through
these devices.
3. Diagrams and graphs produce a more lasting impression on the mind of the reader than the figures
presented in a table.
4. They facilitate ready comparison of data over time and space. Graphs also help in studying the relationship between
two variables.
However, graphic and diagrammatic presentations have some limitations. For example, unlike a table, a diagram
or a graph does not show the exact value of a variable. Further, only a limited set of facts can be presented through
devices such as diagrams and graphs.
                                       
General Rules for Drawing Graphs and Diagrams
Following points must be kept in mind while constructing a diagram or graph.

1. Serial number: Every diagram or graph must have a serial number. It is necessary to distinguish one from the
other.
2. Title: Title must be given to every diagram or graph. From the title one can know the idea contained in it. The
title should be brief and self-explanatory. It is usually placed at the top.
3. Proper size and scale: A diagram or graph should be of normal size and drawn with proper scale. The scale in a
graph specifies the size of the unit.
4. Cleanliness: Diagrams must be as simple as possible. Further they must be quite neat and clean. They should
also be decent to look at.
5. Index: Every diagram or graph must be accompanied by an index. This illustrates different types of lines,
shades or colors used in the diagram.
6. Footnote: Foot notes may be given at the bottom of a diagram if necessary. It clarifies certain points in the
diagram.
Types of Graphs
Graphical representation can be advantageous to bring out the statistical nature of the frequency distribution of quantitative variable, which may be discrete or
continuous.
The most commonly used graphs are
 
1. Histogram
2. Frequency Polygon
3. Frequency Curve
4. Cumulative Frequency Curves (Ogives)
 
1. Histogram

A histogram is an attached bar chart or graph displaying a frequency distribution in visual form.
Take classes along the X-axis and the frequencies along the Y-axis.
Corresponding to each class interval, a vertical bar is drawn whose height is proportional to the class frequency.
Limitations:
We cannot construct a histogram for a distribution with open-ended classes. The histogram is also quite misleading if the distribution has
unequal class intervals and the bar heights are not adjusted for the class width.
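When class intervals are unequal, a common remedy is to plot the frequency density (frequency divided by class width) instead of the raw frequency, so that the area of each bar stays proportional to its frequency. The sketch below illustrates the calculation; the classes and frequencies are hypothetical and do not come from these notes.

```python
# Hypothetical distribution with unequal class widths: (lower, upper, frequency)
classes = [(0, 10, 5), (10, 20, 12), (20, 40, 16), (40, 70, 9)]

# Bar height = frequency density = frequency / class width
densities = [f / (upper - lower) for lower, upper, f in classes]

for (lower, upper, f), d in zip(classes, densities):
    area = d * (upper - lower)        # bar area recovers the class frequency
    print(f"{lower}-{upper}: f={f}, height={d:.2f}, area={area:.0f}")
```

In this hypothetical example the class 20-40 has the largest frequency but not the tallest bar, which is exactly the distortion that plotting raw frequencies would hide.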

Example 4.9
Draw the histogram for the 50 students in a class whose heights (in cms) are given below.

Find the height range that contains the maximum number of students.


Solution: 
Since we are displaying the distribution of height and number of students in visual form, the histogram is drawn.
Step 1 : Heights are marked along the X-axis and labeled as “Height (in cms)”.
Step 2 : Number of students is marked along the Y-axis and labeled as “No. of students”.
Step 3 : Corresponding to each height class, a vertical attached bar is drawn whose height is proportional to the number of students.
The Histogram is presented in Fig 4.7.
For drawing a histogram, the frequency distribution should be continuous. If it is not continuous, make it continuous by converting the class limits into class boundaries.
The tallest bar shows that the maximum number of students have heights in the range 130.5 to 140.5 cm.
 
2. Frequency Polygon
 
Frequency polygon is drawn after drawing histogram for a given frequency distribution.
The area covered under the polygon is equal to the area of the histogram. Vertices of the polygon represent the class frequencies.
Frequency polygon helps to determine the classes with higher frequencies. It displays the tendency of the data.
The following procedure can be followed to draw frequency polygon: 
i. Mark the midpoints at the top of each vertical bar in the histogram representing the classes. 
ii. Connect the midpoints by line segments.
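The vertices of a frequency polygon can be listed without drawing anything. The sketch below reuses the marks distribution of the 50 students from the construction example (classes 10–19 to 70–79). Closing the polygon with a zero-frequency class at each end is an assumption added here; it is the usual convention that makes the area under the polygon equal to the area of the histogram.

```python
class_limits = [(10, 19), (20, 29), (30, 39), (40, 49), (50, 59), (60, 69), (70, 79)]
frequencies = [2, 4, 7, 10, 16, 8, 3]
width = 10   # class interval size

# Step i: the midpoint of each class, paired with its frequency
midpoints = [(lower + upper) / 2 for lower, upper in class_limits]
points = list(zip(midpoints, frequencies))

# Assumption: close the polygon on the X-axis one class width beyond each end
closed = [(midpoints[0] - width, 0)] + points + [(midpoints[-1] + width, 0)]

# Step ii: the line segments join consecutive vertices; the area under them
# is computed with the trapezium rule
polygon_area = sum((x2 - x1) * (y1 + y2) / 2
                   for (x1, y1), (x2, y2) in zip(closed, closed[1:]))
histogram_area = width * sum(frequencies)

print(closed)
print(polygon_area, histogram_area)
```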

Example 4.12
A firm reported that its Net Worth in the years 2011-2016 was as follows:

Draw the frequency polygon for the above data 


Solution:
Since we are displaying the distribution of net worth in the years 2011-2016, the frequency polygon is drawn to determine
the classes with higher frequencies. It displays the tendency of the data.
The following procedure can be followed to draw the frequency polygon:
Step 1 : Years are marked along the X-axis and labeled as ‘Year’.
Step 2 : Net worth is marked along the Y-axis and labeled as ‘Net Worth (in lakhs of ₹)’.
Step 3 : Mark the midpoints at the top of each vertical bar in the histogram representing the year.
Step 4 : Connect the midpoints by line segments.
The Frequency polygon is presented in Fig 4.10.
3. Frequency Curve
 
Frequency curve is a smooth and free-hand curve drawn to represent a frequency distribution.
Frequency curve is drawn by smoothing the vertices of the frequency polygon.
Frequency curve provides better understanding about the properties of the data than frequency polygon and
histogram.

Example 4.13
The ages of a group of pensioners are given in the table below. Draw the Frequency curve for the following data.
Solution:
Since we are displaying the distribution of age and number of pensioners, the frequency curve is drawn to provide a better
understanding of the age and number of pensioners than the frequency polygon.
The following procedure can be followed to draw the frequency curve:
Step 1 : Ages are marked along the X-axis and labeled as ‘Age’.
Step 2 : Number of pensioners is marked along the Y-axis and labeled as ‘No. of Pensioners’.
Step 3 : Mark the midpoints at the top of each vertical bar in the histogram representing the age.
Step 4 : Connect the midpoints by a smooth free-hand curve, smoothing the vertices of the frequency polygon.
The Frequency curve is presented in Fig 4.11.
4. Cumulative frequency curve ( Ogive )

Cumulative frequency curve (Ogive) is drawn to represent the cumulative frequency distribution.
There are two types of Ogives: the ‘less than Ogive curve’ and the ‘more than Ogive curve’.
To draw these curves, we have to calculate the ‘less than’ cumulative frequencies and ‘more than’ cumulative frequencies.
The following procedure can be followed to draw the ogive curves: 
Less than Ogive: Less than cumulative frequency of each class is marked against the corresponding upper limit of the respective class.
All the points are joined by a free-hand curve to draw the less than ogive curve. 
More than Ogive: More than cumulative frequency of each class is marked against the corresponding lower limit of the respective class.
All the points are joined by a free-hand curve to draw the more than ogive curve. 
Both the curves can be drawn separately or in the same graph.
If both the curves are drawn in the same graph, then the value of abscissa (x-coordinate) in the point of intersection is the median.
If the curves are drawn separately, median can be calculated as follows:
Draw a line perpendicular to Y-axis at y=N/2. Let it meet the Ogive at C.
Then, draw a perpendicular line to X-axis from the point C. Let it meet the X-axis at M. The abscissa of M is the median of the data.
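The graphical reading of the median can be imitated numerically. The sketch below assumes the ogive is approximated by straight segments between the plotted points (the text joins them with a free-hand curve), and it uses the less-than cumulative frequencies of the weight distribution that appear as numerators in Table 8. The function name `value_at` is illustrative.

```python
# Upper class boundaries and the corresponding less-than cumulative frequencies
boundaries = [109.5, 119.5, 129.5, 139.5, 149.5, 159.5,
              169.5, 179.5, 189.5, 199.5, 209.5, 219.5]
less_than_cf = [0, 1, 5, 22, 50, 75, 93, 106, 112, 117, 119, 120]

def value_at(target, xs, cfs):
    """Abscissa at which a piecewise-linear less-than ogive reaches target."""
    points = list(zip(xs, cfs))
    for (x1, c1), (x2, c2) in zip(points, points[1:]):
        if c1 <= target <= c2 and c2 > c1:
            # Linear interpolation inside the segment that contains the target
            return x1 + (target - c1) / (c2 - c1) * (x2 - x1)
    raise ValueError("target lies outside the cumulative frequency range")

n = less_than_cf[-1]                                   # total frequency N
median = value_at(n / 2, boundaries, less_than_cf)     # horizontal line y = N/2
print(median)
```

A free-hand curve drawn through the same points may give a slightly different reading, so the interpolated value should be taken as an approximation of the graphical median.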
 
Example 4.14
Draw the less than Ogive curve for the following data:

Also, find 
i. The Median
ii. The number of workers whose daily wages are less
than ₹ 125.
Solution: 
Since we are displaying the distribution of daily wages and number of workers, the Ogive curve is drawn to provide a better understanding of the
wages and number of workers.
The following procedure can be followed to draw the Less than Ogive curve:
Step 1 : Daily wages are marked along the X-axis and labeled as “Wages (in ₹)”.
Step 2 : Number of workers is marked along the Y-axis and labeled as “No. of workers”.
Step 3 : Find the less than cumulative frequency by taking the upper class limit of daily wages. The cumulative frequency corresponding to any
upper class limit of daily wages is the sum of the frequencies of all classes below that limit.
Step 4 : The less than cumulative frequencies of the number of workers are plotted as points against the daily wages (upper limit). These points are
joined to form the less than Ogive curve.
The Less than Ogive curve is presented in Fig 4.12.
i. Median = ₹ 120
ii. 183 workers get daily wages less than ₹ 125
Example 4.16
The yields of mangoes (in kg) that were recorded are given below:
Graphically,
i. find the number of trees which yield less than 55 kg of mangoes.
ii. find the number of trees which yield more than 75 kg of mangoes.
iii. find the median.
Draw the Less than and More than Ogive curves. Also, find the median using the Ogive curves.
Solution: 
Since we are displaying the distribution of yield and number of trees, the Ogive curves are drawn to provide a better understanding of
the yield and number of trees.
The following procedure can be followed to draw the Ogive curves:
Step 1 : Yield of mangoes is marked along the X-axis and labeled as ‘Yield (in Kg.)’.
Step 2 : Number of trees is marked along the Y-axis and labeled as ‘No. of trees’.
Step 3 : Find the less than cumulative frequency by taking the upper class limit of the yield of mangoes. The cumulative frequency
corresponding to any upper class limit is the sum of the frequencies of all classes below that limit.
Step 4 : Find the more than cumulative frequency by taking the lower class limit of the yield of mangoes. The cumulative frequency
corresponding to any lower class limit is the sum of the frequencies of all classes above that limit.
Step 5 : The less than cumulative frequencies of the number of trees are plotted as points against the yield of mangoes (upper limit).
These points are joined to form the less than Ogive curve.
Step 6 : The more than cumulative frequencies of the number of trees are plotted as points against the yield of mangoes (lower limit).
These points are joined to form the more than Ogive curve.
i. 16 trees yield less than 55 kg
ii. 20 trees yield more than 75 kg
iii. Median = 66 kg
Types of Charts:
1. Simple Bar Chart
2.  Pie Chart
3. Line Chart
4. Area Chart
5. Scatterplot
6. Pictogram
Line Charts
A line chart graphically displays data that changes continuously over time. Each line graph
consists of points that connect data to show a trend (continuous change). Line charts have an x-axis
and a y-axis. In most cases, time is shown on the horizontal axis.

Uses of line charts:


•When you want to show trends. For example, how house prices have increased over time.
•When you want to make predictions based on a data history over time.
•When comparing two or more different variables, situations, and information over a given
period of time.

Example: 
The following line graph shows annual sales of a particular business company for the period of
six consecutive years:
Bar Charts
Bar charts represent categorical data with rectangular bars.
Bar graphs are among the most popular types of graphs and charts in economics, statistics, marketing, and visualization in
digital customer experience. They are commonly used to compare several categories of data.
Each rectangular bar has a length or height proportional to the value that it represents.
One axis of the bar chart presents the categories being compared. The other axis shows a measured value.

Bar Charts Uses:


•When you want to display data that are grouped into nominal or ordinal categories.
•To compare data among different categories.
•Bar charts can also show large data changes over time.
•Bar charts are ideal for visualizing the distribution of data when we have more than three categories.

Example:
The bar chart below represents the total sum of sales for Product A and Product B over three years.
Bars are of 2 types: vertical or horizontal. Either kind can be used.
The above one is a vertical type.
Pie Charts
When it comes to statistical types of graphs and charts, the pie chart (or the circle chart) has a
crucial place and meaning. It displays data and statistics in an easy-to-understand ‘pie-slice’
format and illustrates numerical proportion.
Each pie slice is relative to the size of a particular category in a given group as a whole. To say
it in another way, the pie chart breaks down a group into smaller pieces. It shows part-whole
relationships.
To make a pie chart, you need a list of categorical variables and numerical variables.
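The size of each slice follows from its share of the total: the central angle is the category value divided by the total, multiplied by 360 degrees. The sketch below illustrates this with hypothetical figures for 1000 students; the categories and counts are invented for the illustration and are not taken from the chart in these notes.

```python
# Hypothetical counts of students by mode of transport (total 1000)
transport = {"Bus": 400, "Bicycle": 250, "Walking": 200, "Car": 150}

total = sum(transport.values())

# Central angle of each slice in degrees, and its share in percent
angles = {name: count * 360 / total for name, count in transport.items()}
shares = {name: count * 100 / total for name, count in transport.items()}

for name in transport:
    print(f"{name}: {shares[name]:.1f}% of the pie, {angles[name]:.1f} degrees")
```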
Pie Chart Uses:
•When you want to create and represent the composition of something.
•It is very useful for displaying nominal or ordinal categories of data.
•To show percentage or proportional data.
•When comparing areas of growth within a business such as profit.
•Pie charts work best for displaying data for 3 to 7 categories.
Example:
The pie chart below represents the proportion of types of transportation used by 1000 students to
go to their school.
Pie charts are widely used by data-driven marketers for displaying marketing
data.
Scatter Plots
The scatter plot is an X-Y diagram that shows a relationship between two variables. It is used to
plot data points on a vertical and a horizontal axis. The purpose is to show how much one
variable affects another.
Usually, when there is a relationship between 2 variables, the first one is called independent. The
second variable is called dependent because its values depend on the first variable.
Scatter plots also help you predict the behavior of one variable (dependent) based on the
measure of the other variable (independent).
Scatter plot uses:
•When trying to find out whether there is a relationship between 2 variables.
•To predict the behavior of dependent variable based on the measure of the independent
variable.
•When having paired numerical data.
•When working with root cause analysis tools to identify the potential for problems.
•When you just want to visualize the correlation between 2 large datasets without regard to
time.
Example:
The scatter plot below presents data for 7 online stores: their monthly e-commerce sales and online advertising costs for the last
year.
The orange line in the plot is called the “line of best fit” or “trend line”. This line is used to help us make predictions that
are based on past data.
Scatter plots are used widely in data science and statistics. They are a great tool for visualizing linear regression models.
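One common way to obtain a line of best fit is the method of least squares. The notes do not say how the trend line in the plot was computed, so the sketch below is an assumption about the technique, and the five data points are hypothetical.

```python
def best_fit_line(xs, ys):
    """Least-squares slope and intercept of the line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical paired data: independent variable x, dependent variable y
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

slope, intercept = best_fit_line(x_values, y_values)
predicted = intercept + slope * 6      # prediction for a new x value
print(slope, intercept, predicted)
```

The fitted line always passes through the point of means, which is why the intercept is computed from the two averages once the slope is known.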
Pictographs
The pictograph, or pictogram, is one of the more visually appealing types of graphs and charts.
It displays numerical information with the use of icons or picture symbols to represent data sets.
Pictographs are a very easy-to-read form of data visualization. A pictogram shows the frequency
of data as images or symbols. Each image/symbol may represent one or more units of a given
dataset.
Pictograph Uses:
•When your audience prefers and understands better displays that include icons and illustrations.
Fun can promote learning.
•In infographics, where the use of pictograms is common.
•When you want to compare two points in an emotionally powerful way.
Example: 
The following pictograph represents the number of computers sold by a business company for
the period from January to March.
The pictograph above shows that 20 computers were sold in January (4×5 = 20),
30 computers in February (6×5 = 30) and 15 computers in March (3×5 = 15).
Area Charts 
Area charts show the change in one or several quantities over time. They are very similar to the line chart.
However, the area between the axis and the line is usually filled with colors.

Although line and area charts support the same type of analysis, they cannot always be used interchangeably. Line
charts are often used to represent multiple data sets. Area charts cannot show multiple data sets as clearly because
the filled areas below the lines can overlap.

Area Chart Uses:


•When you want to show trends, rather than express specific values.
•To show a simple comparison of the trend of data sets over the period of time.
•To display the magnitude of a change.
•To compare a small number of categories.
The area chart has 2 variants: a variant with data plots overlapping each other and a variant with data plots
stacked on top of each other (known as a stacked area chart, as shown in the following example).

Example:
The area chart below shows quarterly sales for product categories A and B for the last year.
This area chart shows you a quick comparison of the trend in
the quarterly sales of Product A and Product B over the period
of the last year.
THANK YOU
