Data: Categorical, Continuous
Data Preparation: Editing / Cleaning, Coding, Classification, Tabulation, Graphical Representation
Presentation: Table / Crosstab, Graph / Figure
Statistical Methods: Frequency, %age, Ratio, Mean, Median, Standard Deviation (Variance)
Advanced Statistical Methods / Analysis:
Comparison (t/z-test)
Association (chi square)
Correlation (r)
Regression (y = ax+b)
DATA ANALYSIS
1. DATA PREPARATION / INITIAL OPERATIONS
Editing / Data Cleaning
Examining the collected raw data to detect errors
and correct or omit them where possible
Coding
Assigning numerals to answers so that responses can
be put into a limited number of categories
Classification
Grouping of data on some basis (a large volume of raw
data is reduced into homogeneous groups)
I. Attribute: on a demographic basis,
e.g. gender, rural/urban, day scholar/hosteller
II. Class Interval: on the basis of some numeric
range, e.g. 0-10, 10-20, etc.
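The coding and class-interval steps above can be sketched in a few lines of Python. This is only an illustration: the code book values (1 for M, 2 for F) and the sample answers and scores are assumptions made for the example, and `code_response` and `class_interval` are hypothetical helper names, not part of any library.

```python
# Hypothetical code book: the numeric codes are assumed for illustration.
GENDER_CODES = {"M": 1, "F": 2}


def code_response(answer, code_book):
    """Coding: replace a text answer with its numeral from the code book."""
    return code_book[answer]


def class_interval(value, width=10):
    """Classification by class interval: return the (lower, upper) range
    that contains value, lower bound inclusive and upper bound exclusive."""
    lower = (value // width) * width
    return (lower, lower + width)


responses = ["M", "F", "M"]  # made-up raw answers
scores = [4, 10, 17]         # made-up numeric values

coded = [code_response(r, GENDER_CODES) for r in responses]
intervals = [class_interval(s) for s in scores]

print(coded)
print(intervals)
```

Treating the upper bound as exclusive is one common way to resolve shared boundaries such as 0-10 and 10-20, so a value of 10 lands in 10-20 here; other conventions are possible.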
2. DATA SUMMARIZATION / DATA ANALYSIS OPERATIONS
I. Tabulation
The process of displaying raw data in tabular
form and summarising it for further analysis
Orderly arrangement of data in columns and rows
Raw record:
Akbar  M  Muslim  65  8004896712  HS  16  14  Mod-2  20

Coded data:
7  1  1  60  9450366367  -1   0  16  0  -4
2  1  2  65  8004896712   1  16  14  2  20
5  2  1  35  9934876545   2  19   0  0  15
4  2  1  90  2542543598   1   8  16  0   0
3  2  3  38  9458098734   3  21  13  3   0
6  1  1  48  9412890112   4  23  20  2  -1
1  1  1  45  8796654398   0  12  10  2  30
Nominal and ordinal scales are called qualitative. Interval and ratio scales are called quantitative.
Single / Multi Variable Table: one or more variables (no interaction)

Roll No.   Age (yr)
1          22
2          24
3          23
4          26
5          19
6          22
.          .
60         22

Single Variable Freq. Table
Age Group (years)   Freq.
Below 20            2
20-22               28
22-24               16
24-26               10
Above 26            4
Total               60

**Multiple Variable Table: as presented in the slide above
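As a rough sketch of how a single-variable frequency table is built from raw data, the snippet below counts how often each age occurs. It uses only the ages legible on the slide (roll numbers 1 to 6 and 60); the rows in between are elided in the source, so the counts are illustrative and do not reproduce the 60-student table.

```python
from collections import Counter

# Ages visible on the slide for roll numbers 1-6 and 60;
# the remaining rows are elided in the source.
ages = [22, 24, 23, 26, 19, 22, 22]

freq = Counter(ages)  # maps each distinct age to its frequency

print("Age  Freq.")
for age in sorted(freq):
    print(f"{age:>3}  {freq[age]}")
print("Total", sum(freq.values()))
```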
Crosstabs: interaction of two or more variables
Two Variable Interaction (Crosstab)
[Table: crosstab with Gender as one of the variables]
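A two-variable crosstab can be sketched by counting each combination of levels. The records below are made up for illustration (the slide's own crosstab values are not recoverable); gender and rural/urban are borrowed from the attribute examples given earlier in the deck.

```python
from collections import Counter

# Made-up records of (gender, residence), for illustration only.
records = [
    ("M", "rural"),
    ("F", "urban"),
    ("M", "urban"),
    ("F", "urban"),
    ("M", "rural"),
]

cells = Counter(records)                  # count per (gender, residence) pair
rows = sorted({g for g, _ in records})    # levels of the row variable
cols = sorted({a for _, a in records})    # levels of the column variable

print("Gender", *cols)
for g in rows:
    # A Counter returns 0 for combinations that never occur.
    print(g, *(cells[(g, c)] for c in cols))
```

Each cell holds the number of records sharing that pair of levels, which is what distinguishes a crosstab from two separate single-variable tables.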
Pie Charts
Used to represent percentages, i.e. the distribution of
one variable across its levels
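The percentages that a pie chart displays are each level's share of the total. As a small sketch, the snippet below converts the age-group frequencies from the tabulation slide into percentage shares; the variable names are chosen for this example only.

```python
# Age-group frequencies taken from the single-variable frequency table.
age_group_freq = {
    "Below 20": 2,
    "20-22": 28,
    "22-24": 16,
    "24-26": 10,
    "Above 26": 4,
}

total = sum(age_group_freq.values())
# Share of each level as a percentage of the total.
shares = {group: 100 * f / total for group, f in age_group_freq.items()}

for group, pct in shares.items():
    print(f"{group:<9} {pct:5.1f}%")
```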
Bar Chart
Represents one variable at various levels
Levels can be years, groups, etc.
[Figure: bar chart of Sales by year, 2018 to 2021]
Bar Chart
Clustered Bar
[Figure: clustered bar chart with four series (1st, 2nd, 3rd, 4th) for each year, 2018 to 2020]
Histogram
Shows the distribution of a quantitative
variable
[Figure: histogram with Frequency on the y-axis and values 10 to 50 on the x-axis]
Line Diagram
Shows the change in a variable over a particular
time period, reference range, or set of points
[Figure: line diagram of Stock Price (₹) over the last 10 days]
Line Diagram
May also be used to compare two or more variables
along the range
[Figure: line diagram comparing Adani, Tata and Reliance over points 1 to 8]
Scatter Plot
Expresses the relationship between two variables
[Figure: scatter plot of Sales (in crore) against Adv Budget (in 10 lacs)]
Scatter Plot
Trend Lines - Correlation
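To make the link between a trend line and correlation concrete, here is a minimal sketch that computes Pearson's r and a least-squares line y = ax + b for a handful of points. The budget and sales figures are made up for illustration, and `pearson_r` and `trend_line` are hypothetical helper names written for this example.

```python
from math import sqrt


def _deviation_sums(xs, ys):
    """Return (sxy, sxx, syy): sums of products of deviations from the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy, sxx, syy


def pearson_r(xs, ys):
    """Correlation coefficient r, between -1 and +1."""
    sxy, sxx, syy = _deviation_sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def trend_line(xs, ys):
    """Least-squares slope a and intercept b of y = ax + b."""
    sxy, sxx, _ = _deviation_sums(xs, ys)
    a = sxy / sxx
    b = sum(ys) / len(ys) - a * sum(xs) / len(xs)
    return a, b


budget = [1.0, 2.0, 3.0, 4.0]  # made-up advertising budgets
sales = [1.5, 2.5, 3.0, 5.0]   # made-up sales figures

a, b = trend_line(budget, sales)
r = pearson_r(budget, sales)
print(f"trend line: y = {a:.2f}x + {b:.2f}, r = {r:.3f}")
```

An r close to +1 or -1 means the points lie close to the trend line; an r near 0 means the line explains little of the scatter.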
FREQUENCY DISTRIBUTION

Income       No. of families
0-500        20
500-1000     30
1000-1500    50
1500-2000    70
2000-2500    30
2500-3000    10
.            .

[Figure: chart of No. of families (y-axis) against Income (x-axis)]
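A frequency distribution like the one above can be summarised numerically. The sketch below works only with the six income classes legible on the slide and, as an assumption, represents every family in a class by the class midpoint; it derives cumulative frequencies, the modal class, and an approximate mean.

```python
# (lower, upper) income class and number of families, as read from the slide.
income_classes = [
    ((0, 500), 20),
    ((500, 1000), 30),
    ((1000, 1500), 50),
    ((1500, 2000), 70),
    ((2000, 2500), 30),
    ((2500, 3000), 10),
]

total = sum(f for _, f in income_classes)

# Cumulative frequency: running total of families up to each class.
cumulative = []
running = 0
for _, f in income_classes:
    running += f
    cumulative.append(running)

# Modal class: the class with the highest frequency.
modal_class = max(income_classes, key=lambda c: c[1])[0]

# Approximate mean, assuming each family sits at its class midpoint.
grouped_mean = sum(((lo + hi) / 2) * f for (lo, hi), f in income_classes) / total

print("total families:", total)
print("cumulative:", cumulative)
print("modal class:", modal_class)
print(f"approximate mean income: {grouped_mean:.2f}")
```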