Pristine www.edupristine.com
2.c. Case: Summarizing Data
Romanov, an analytics consultant, works with Credit One bank. His manager gave him some
data on credit cards: the number of credit cards issued to a set of customers and
the credit limit of the cards. He has been tasked with summarizing the data in a
presentable form and preparing a report. Romanov, who has just started his professional
career, has never worked with this kind of data, so he is unfamiliar with the different
summarizing techniques.
Now, suppose he approaches you and asks for your help in preparing the report. Help Romanov
summarize the data and prepare the report.
2.c. Comments: Summarizing Data
2.c. Summarizing Data - Frequency distribution
A technique to summarize discrete data
A simple process which involves counting of distinct discrete values
The representation can be either tabular or graphical
Example: Number of credit cards owned in a sample of 3000 individuals
# Cards    # Customers
7          240
8          150
9          120
10         90
[Chart: # Customers by # Cards (1 to 10)]
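The counting step described above can be sketched in Python. This is only an illustration under stated assumptions: the deck itself works in MS Excel, the `cards` list is hypothetical sample data (not the 3000-individual sample), and `frequency_distribution` is a name introduced here.

```python
from collections import Counter

def frequency_distribution(values):
    """Count how often each distinct value occurs, ordered by value."""
    counts = Counter(values)
    return dict(sorted(counts.items()))

# Hypothetical sample: number of credit cards held by 12 customers.
cards = [3, 2, 4, 5, 1, 7, 3, 4, 4, 2, 3, 4]

freq = frequency_distribution(cards)

# Tabular representation: one row per distinct value.
print("# Cards  # Customers")
for n_cards, n_customers in freq.items():
    print(f"{n_cards:>7}  {n_customers:>11}")
```

The frequencies always add up to the number of observations, which is a quick check on the table.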
2.c. Summarizing Data - Frequency distribution (Using MS Excel)
[Screenshot: Excel worksheet with a "Number of Credit Cards" column (values shown: 3, 2, 4, 5, 1, 7, 9, 10, 6, 8)]
4. Press Ctrl+Shift+Enter
[Chart: # Customers (0 to 700) by number of credit cards (1 to 10)]
2.c. Summarizing Data - Grouped Frequency distribution
A technique to summarize continuous data, or discrete data having a large number of observations
and an extended range
A simple process which involves counting the values falling within each interval (group)
Example and illustration 2.2: Number of customers falling under different Salary groups
[Chart: # Customers (20 to 100) by Salary Band]
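Counting values per interval can be sketched as follows. This is a minimal illustration, not the deck's Excel procedure: the salaries are hypothetical, the band edges are assumed (a first band of 0-75000 followed by bands 25000 wide, in the style of the salary bands used in this deck), and `grouped_frequency` is a name introduced here.

```python
from bisect import bisect_left

def grouped_frequency(values, upper_bounds):
    """Count values per band; band i covers (upper_bounds[i-1], upper_bounds[i]].

    Assumes non-negative values and upper_bounds sorted in ascending order.
    """
    counts = [0] * len(upper_bounds)
    for v in values:
        i = bisect_left(upper_bounds, v)  # first band whose upper bound is >= v
        if i == len(upper_bounds):
            raise ValueError(f"{v} exceeds the last band")
        counts[i] += 1
    return counts

# Assumed bands: 0-75000, 75001-100000, 100001-125000, 125001-150000
upper_bounds = [75000, 100000, 125000, 150000]
# Hypothetical salaries
salaries = [42000, 75000, 75001, 98000, 110000, 125000, 149999, 60000]

counts = grouped_frequency(salaries, upper_bounds)

lower = 0
for upper, n in zip(upper_bounds, counts):
    print(f"{lower}-{upper}: {n}")
    lower = upper + 1
```

A value equal to a band's upper bound (for example 75000) is counted in that band, which matches labels such as 0-75000 and 75001-100000.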
2.c. Summarizing Data - Grouped Frequency distribution (Using MS Excel)
1. Press Ctrl+Shift+Enter
4. From Edit, select the salary bands as the horizontal axis
5. Observe the difference between the horizontal axes of the two charts
[Chart: # Customers (0 to 120) by salary band, from 0-75000 to 950001-975000]
2.c. Summarizing Data - Cumulative Frequency distribution
Cumulative frequencies are obtained by accumulating the frequencies to give the total number of
observations up to and including the value or group in question.
Example and illustration 2.3: Cumulative number of cards in the sample of 3000 individuals
# Cards    Cumulative # Customers
2          450
3          900
4          1560
5          2100
6          2400
7          2640
8          2790
9          2910
10         3000
[Chart: Cumulative # Customers (0 to 2500) by # Cards (0 to 10)]
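The accumulation can be sketched in Python as a running sum. The frequencies for 7 to 10 cards (240, 150, 120, 90) are the ones listed in the frequency distribution example; the others are derived here by differencing the cumulative values above, and 1 and 2 cards are combined into one group because the cumulative values start at 2. The variable names are introduced for this sketch.

```python
from itertools import accumulate

# Group labels and frequencies; "<=2" combines customers with 1 or 2 cards.
labels = ["<=2", "3", "4", "5", "6", "7", "8", "9", "10"]
frequencies = [450, 450, 660, 540, 300, 240, 150, 120, 90]

# Running total: each entry counts observations up to and including that group.
cumulative = list(accumulate(frequencies))

for label, cum in zip(labels, cumulative):
    print(f"{label:>4}  {cum:>5}")
```

The last cumulative entry equals the total number of observations (3000 here).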
2.c. Summarizing Data - Cumulative Frequency distribution (Using MS Excel)
[Chart: Cumulative # Customers (0 to 3500) by # Cards (0 to 12)]
3. Observe the last entry. It is equal to the total number of observations
2.c. Summarizing Data - Stem-leaf diagram
Stem-leaf diagram
Not suitable for large data sets; hence, not extensively used in industry.
Illustration: Given the ages of 20 individuals in years, represent them using a stem-leaf diagram
Sl #   Age   Age (Sorted)
1      23    21
2      33    23
3      23    24
4      33    27
5      34    30
6      21    31
7      54    33
8      52    34
9      34    35
10     36    36
11     52    39
12     51    40
13     48    43
14     35    48
15     40    49
16     43    51
17     49    52
18     54    53
19     27    54
20     39    57

Stem   Leaf
20     1 3 4 7
30     0 1 3 4 5 6 9
40     0 3 8 9
50     1 2 3 4 7
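Building the diagram can be sketched in Python: split each value into a stem (the tens) and a leaf (the units), then group the leaves by stem. The input is the "Age (Sorted)" column of the illustration; the stem is written as 20, 30, ... to match the slide, and `stem_and_leaf` is a name introduced for this sketch.

```python
from collections import defaultdict

def stem_and_leaf(values, stem_unit=10):
    """Group the leaves (units digit) of each value under its stem (tens)."""
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, stem_unit)
        groups[stem * stem_unit].append(leaf)
    return dict(groups)

# "Age (Sorted)" column from the illustration.
ages = [21, 23, 24, 27, 30, 31, 33, 34, 35, 36,
        39, 40, 43, 48, 49, 51, 52, 53, 54, 57]

diagram = stem_and_leaf(ages)
for stem, leaves in diagram.items():
    print(f"{stem:>4} | {' '.join(map(str, leaves))}")
```

Each observation contributes exactly one leaf, so the leaf count equals the number of observations (20 here); the sorted value 30 appears as leaf 0 under stem 30.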
2.c. Summarizing Data - Line Plots
Line plot diagram
Not suitable for large data sets; hence, not extensively used in industry.
Illustration: Given the test scores of 20 students, represent them using a line plot diagram
Sl # Score Score (Sorted)
1 50 20
2 20 20
3 50 20
4 50 30
5 50 30
6 30 30
7 30 30
8 40 30
9 30 40
10 40 40
11 30 40
12 20 40
13 50 40
14 40 50
15 20 50
16 30 50
17 40 50
18 40 50
19 50 50
20 50 50
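A line plot stacks one mark above a number line for each occurrence of a value. The sketch below renders it as text from the Score column of the illustration; the text layout and the `line_plot` helper are assumptions made for this example, not the deck's own rendering.

```python
from collections import Counter

def line_plot(values, mark="x"):
    """Return text rows: stacked marks per value, then the axis row."""
    counts = Counter(values)
    points = sorted(counts)
    height = max(counts.values())
    rows = []
    for level in range(height, 0, -1):
        # Draw a mark in a column only if that value occurs at least `level` times.
        cells = [f"{(mark if counts[p] >= level else ''):^4}" for p in points]
        rows.append(" ".join(cells).rstrip())
    rows.append(" ".join(f"{p:^4}" for p in points).rstrip())
    return rows

# Score column from the illustration.
scores = [50, 20, 50, 50, 50, 30, 30, 40, 30, 40,
          30, 20, 50, 40, 20, 30, 40, 40, 50, 50]

counts = Counter(scores)
rows = line_plot(scores)
print("\n".join(rows))
```

The height of each stack is the frequency of that score, so the plot is a frequency distribution drawn with marks instead of bars.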
Thank you!
Pristine
702, Raaj Chambers, Old Nagardas Road, Andheri (E), Mumbai-400 069, INDIA
www.edupristine.com
Ph. +91 22 3215 6191