
Business Analytics

Introduction to Analytics and Data

Pristine www.edupristine.com
2.c. Case: Summarizing Data

Romanov, an analytics consultant, works with Credit One bank. His manager gave him data on
the number of credit cards issued to a set of customers and the credit limits of those cards.
He has been tasked with summarizing the data in a presentable form and preparing a report.
Romanov has just started his professional career and has never worked with this kind of
data, so he is unfamiliar with the different summarizing techniques.

Suppose he approaches you for help with the report. Help Romanov summarize the data and
prepare the report.

2.c. Comments: Summarizing Data

There are various ways to summarize data. Some of them are:

1. Frequency distribution
2. Grouped frequency distribution
3. Cumulative frequency distribution
4. Stem-leaf diagram
5. Line plots

2.c. Summarizing Data - Frequency distribution
A technique for summarizing discrete data.
A simple process: count how many times each distinct value occurs.
The result can be presented as a table or as a graph.
Example: Number of credit cards owned in a sample of 3000 individuals

Tabular representation

Number of Credit Cards    # Customers
1                         150
2                         300
3                         450
4                         660
5                         540
6                         300
7                         240
8                         150
9                         120
10                        90

Graphical representation - Bar Chart

[Figure: bar chart "Freq Distribution - #Cards vs. # Customers"; x-axis: # Cards (1 to 10), y-axis: # Customers (0 to 700)]
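
As a rough illustration of the counting step, the following Python sketch builds a frequency distribution from raw observations. The slide reports only the final counts, so the `cards_per_customer` list and the `frequency_distribution` helper are hypothetical and stand in for the real data set.

```python
from collections import Counter

# Hypothetical raw observations: number of credit cards held by each customer.
# (The slide gives only the resulting counts for 3000 customers, not the raw data.)
cards_per_customer = [3, 2, 4, 5, 1, 7, 9, 10, 6, 8, 4, 4, 5, 3, 2]


def frequency_distribution(values):
    """Return {distinct value: number of occurrences}, ordered by value."""
    counts = Counter(values)
    return dict(sorted(counts.items()))


freq = frequency_distribution(cards_per_customer)
for n_cards, n_customers in freq.items():
    print(f"{n_cards:>2} cards: {n_customers} customers")
```

The same counts could then feed a bar chart, with the distinct values on the horizontal axis and the counts on the vertical axis.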
2.c. Summarizing Data - Frequency distribution (Using MS Excel)
[Screenshots: MS Excel steps 1 to 7, starting from a "Number of Credit Cards" data column]

4. Press Ctrl+Alt+Enter

[Figure: resulting bar chart of # Customers for 1 to 10 cards; y-axis 0 to 700]

2.c. Summarizing Data - Grouped Frequency distribution
A technique for summarizing continuous data, or discrete data with a large number of observations
and an extended range.
A simple process: count the values that fall within each interval (group).
Example and illustration 2.2: Number of customers falling under different salary groups

Graphical representation - Bar Chart

[Figure: bar chart "Freq Distribution - Salary Band vs. # Customers"; x-axis: Salary Band, y-axis: # Customers (20 to 120)]
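
The grouping step can be sketched in Python as follows. This is only an illustration under stated assumptions: the `salaries` list is hypothetical, and the band layout (a first band of 0-75000 followed by bands 25,000 wide, such as 100001-125000) is inferred from the axis labels of the Excel chart rather than stated in the slides.

```python
from collections import Counter

BAND_WIDTH = 25_000        # assumed width of every band after the first
FIRST_BAND_UPPER = 75_000  # assumed: the first band covers 0-75000


def salary_band(salary):
    """Return the label of the band an integer salary falls in."""
    if salary <= FIRST_BAND_UPPER:
        return f"0-{FIRST_BAND_UPPER}"
    # Later bands start one above the previous band's upper limit.
    steps = (salary - FIRST_BAND_UPPER - 1) // BAND_WIDTH
    lower = FIRST_BAND_UPPER + steps * BAND_WIDTH + 1
    return f"{lower}-{lower + BAND_WIDTH - 1}"


# Hypothetical salaries, for illustration only.
salaries = [52_000, 75_000, 75_001, 98_500, 100_001, 124_999, 310_000]

grouped = Counter(salary_band(s) for s in salaries)
for band, n_customers in grouped.items():
    print(f"{band}: {n_customers} customers")
```

Unlike the plain frequency distribution, each observation is first mapped to an interval, and the intervals rather than the raw values are counted.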

2.c. Summarizing Data - Grouped Frequency distribution (Using MS Excel)
[Screenshots: MS Excel steps 1 to 5]

1. Press Ctrl+Alt+Enter
4. From Edit select the salary bands as horizontal axis
5. Observe the difference between horizontal axes of two charts

[Figure: bar chart of # Customers by salary band; y-axis 0 to 120; x-axis bands labelled from 0-75000 to 950001-975000]

2.c. Summarizing Data - Cumulative Frequency distribution
Cumulative frequencies are obtained by accumulating the frequencies to give the total number of
observations up to and including the value or group in question.
Example and illustration 2.3: Cumulative number of customers, by number of cards held, in the sample of 3000 individuals

Tabular representation

Number of Credit Cards (up to)    Cumulative # Customers
1                                 150
2                                 450
3                                 900
4                                 1560
5                                 2100
6                                 2400
7                                 2640
8                                 2790
9                                 2910
10                                3000

Graphical representation

[Figure: chart "Cumulative # Customers"; x-axis: # Cards (0 to 10), y-axis: Cumulative # Customers (0 to 3000)]
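
The accumulation can be checked with a short Python sketch. The `customers` list holds the per-card frequencies from the earlier credit card example; the variable names are illustrative only.

```python
from itertools import accumulate

cards = list(range(1, 11))
# Frequencies from the credit card example: customers holding exactly 1..10 cards.
customers = [150, 300, 450, 660, 540, 300, 240, 150, 120, 90]

# Running total: customers holding up to and including each number of cards.
cumulative = list(accumulate(customers))

for n_cards, total in zip(cards, cumulative):
    print(f"up to {n_cards:>2} cards: {total}")

# The last cumulative entry equals the total number of observations.
print(cumulative[-1] == sum(customers))
```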

2.c. Summarizing Data - Cumulative Frequency distribution (Using MS Excel)
[Screenshots: MS Excel steps 1 to 5]

3. Observe the last entry. It is equal to the total number of observations.

[Figure: chart of Cumulative # Customers; x-axis 0 to 12, y-axis 0 to 3500]
2.c. Summarizing Data - Stem-leaf diagram
Stem-leaf diagram
Not suitable for large data sets, and hence not extensively used in industry.
Illustration: Given the ages (in years) of 20 individuals, represent them using a stem-leaf diagram.
Sl #    Age    Age (Sorted)
1       23     21
2       33     23
3       23     24
4       33     27
5       34     30
6       21     31
7       54     33
8       52     34
9       34     35
10      36     36
11      52     39
12      51     40
13      48     43
14      35     48
15      40     49
16      43     51
17      49     52
18      54     53
19      27     54
20      39     57

Stem    Leaf
20      1 3 4 7
30      0 1 3 4 5 6 9
40      0 3 8 9
50      1 2 3 4 7
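
A stem-leaf diagram can be built by splitting each value into its tens part (the stem) and its units digit (the leaf). The sketch below applies this to the values in the "Age (Sorted)" column; the `stem_leaf` helper is a hypothetical name, and the stems are written as 20, 30, and so on to follow the slide's layout.

```python
from collections import defaultdict

# Values taken from the "Age (Sorted)" column of the illustration.
ages = [21, 23, 24, 27, 30, 31, 33, 34, 35, 36,
        39, 40, 43, 48, 49, 51, 52, 53, 54, 57]


def stem_leaf(values):
    """Group the units digits (leaves) under their tens value (stems)."""
    groups = defaultdict(list)
    for value in sorted(values):
        tens, units = divmod(value, 10)
        groups[tens * 10].append(units)
    return dict(groups)


diagram = stem_leaf(ages)
for stem, leaves in diagram.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Every observation contributes exactly one leaf, so the number of leaves should equal the number of individuals.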
2.c. Summarizing Data - Line Plots
Line plot diagram
Not suitable for large data sets, and hence not extensively used in industry.
Illustration: Given the test scores of 20 students, represent them using a line plot diagram.
Sl # Score Score (Sorted)
1 50 20
2 20 20
3 50 20
4 50 30
5 50 30
6 30 30
7 30 30
8 40 30
9 30 40
10 40 40
11 30 40
12 20 40
13 50 40
14 40 50
15 20 50
16 30 50
17 40 50
18 40 50
19 50 50
20 50 50
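
A line plot places one mark above a number line for every occurrence of a value. The following Python sketch draws a text version for the 20 scores above; `line_plot` is a hypothetical helper, and the character layout is just one possible rendering.

```python
from collections import Counter

# Scores from the "Score" column of the illustration.
scores = [50, 20, 50, 50, 50, 30, 30, 40, 30, 40,
          30, 20, 50, 40, 20, 30, 40, 40, 50, 50]


def line_plot(values, mark="x"):
    """Return a text line plot: one mark stacked per occurrence of each value."""
    counts = Counter(values)
    ticks = sorted(counts)
    width = max(len(str(t)) for t in ticks)
    rows = []
    # Build the rows from the tallest stack down to the first mark.
    for level in range(max(counts.values()), 0, -1):
        cells = [(mark if counts[t] >= level else " ").center(width) for t in ticks]
        rows.append(" ".join(cells).rstrip())
    rows.append("-" * (len(ticks) * (width + 1) - 1))  # the number line
    rows.append(" ".join(str(t).center(width) for t in ticks))
    return "\n".join(rows)


plot = line_plot(scores)
print(plot)
```

The height of each stack is the frequency of that score, so the plot doubles as a quick frequency distribution for small data sets.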
Thank you!

Pristine
702, Raaj Chambers, Old Nagardas Road, Andheri (E), Mumbai-400 069, INDIA
www.edupristine.com
Ph. +91 22 3215 6191