Learning Objectives
• Identifying and handling missing values
• Data Formatting
• Data Normalization (centering/scaling)
• Data Binning
• Turning categorical values into numeric variables
Data Preprocessing
• Also known as
– Data Cleaning or Data Wrangling
12/7/2019
Data Mining: Concepts and Techniques
How to Handle Missing Data?
• What is a missing value?
• A missing value occurs when no data is stored for a variable (feature) in an observation.
• It may be represented as “?”, “N/A”, 0, or just a blank cell
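A minimal sketch of one common way to handle such values, assuming a hypothetical column read in as strings where “?”, “N/A”, and blank cells mark missing entries. It converts the placeholders to None and fills them with the mean of the observed values; dropping the affected rows is another option. The names MISSING_MARKERS, to_number, and fill_with_mean are illustrative only.

```python
from statistics import mean

# Placeholders assumed to mean "missing" in this hypothetical data set.
# 0 is left out because it can also be a legitimate measurement.
MISSING_MARKERS = {"?", "N/A", ""}

def to_number(cell):
    """Return the cell as a float, or None if it marks a missing value."""
    if cell is None:
        return None
    text = str(cell).strip()
    if text in MISSING_MARKERS:
        return None
    return float(text)

def fill_with_mean(cells):
    """Replace missing entries with the mean of the observed entries."""
    values = [to_number(c) for c in cells]
    observed = [v for v in values if v is not None]
    column_mean = mean(observed)
    return [column_mean if v is None else v for v in values]

raw_column = ["10", "?", "14", "N/A", "", "12"]
cleaned = fill_with_mean(raw_column)
print(cleaned)
```

Whether 0 should count as missing depends on the variable, so it is worth checking each column before adding it to the marker set.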
Data Formatting
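Data formatting generally means bringing values into one consistent representation. The sketch below assumes three typical cases: inconsistent labels for the same category, numbers stored as strings, and a unit conversion. The city labels, the alias table, and the function names are hypothetical examples.

```python
# Hypothetical mapping of inconsistent spellings to one canonical label.
CITY_ALIASES = {"n.y.": "New York", "ny": "New York", "new york": "New York"}

def normalize_city(label):
    """Map differently written labels to a single canonical form."""
    cleaned = label.strip()
    return CITY_ALIASES.get(cleaned.lower(), cleaned)

def mpg_to_l_per_100km(mpg):
    """Convert miles per US gallon to litres per 100 km."""
    return 235.215 / mpg

cities = [normalize_city(c) for c in ["N.Y.", "NY ", "New York"]]

# Values read from a file are often strings; convert them to numbers first.
mpg_values = [float(v) for v in ["21", "27.5"]]
fuel_l_per_100km = [round(mpg_to_l_per_100km(v), 2) for v in mpg_values]

print(cities)
print(fuel_l_per_100km)
```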
Data Normalization
Min-Max
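Min-max scaling maps each value to the range 0 to 1 using x_new = (x - x_min) / (x_max - x_min). A minimal sketch in plain Python, with a hypothetical list of lengths; the guard for a constant column is an added assumption.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum becomes 0 and the maximum 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; return zeros to avoid dividing by 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

lengths = [150.0, 175.0, 200.0]
scaled = min_max_scale(lengths)
print(scaled)
```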
Z-Score
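Z-score normalization centers each value on the mean and divides by the standard deviation: z = (x - mean) / std. The sketch below assumes the population standard deviation (statistics.pstdev); some libraries default to the sample standard deviation, which gives slightly different results.

```python
from statistics import mean, pstdev

def z_score(values):
    """Center values on their mean and scale by the population std. deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column has no spread; return zeros to avoid dividing by 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
standardized = z_score(data)
print(standardized)
```

After the transformation the values have mean 0 and standard deviation 1, and typically fall roughly between -3 and 3.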
Binning
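Binning groups numeric values into a small number of labeled ranges. A minimal sketch of equal-width binning, assuming a hypothetical list of prices and three labels; other schemes, such as equal-frequency bins, are also possible.

```python
def equal_width_bins(values, labels):
    """Assign each value to one of len(labels) equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / len(labels)
    if width == 0:
        # All values are identical, so they share the first bin.
        return [labels[0] for _ in values]
    binned = []
    for v in values:
        # The maximum value would land one past the last bin, so clamp it.
        index = min(int((v - lo) / width), len(labels) - 1)
        binned.append(labels[index])
    return binned

prices = [5000, 12000, 18000, 26000, 35000]
price_groups = equal_width_bins(prices, ["low", "medium", "high"])
print(price_groups)
```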
Turning categorical variables into quantitative variables in Python
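One widely used approach is one-hot encoding, which creates one 0/1 indicator variable per category. The sketch below uses plain Python and a hypothetical fuel-type column; in pandas, pandas.get_dummies performs a similar transformation.

```python
def one_hot_encode(values):
    """Return one dict per value with a 0/1 indicator for every category."""
    categories = sorted(set(values))
    return [{cat: int(v == cat) for cat in categories} for v in values]

fuel = ["gas", "diesel", "gas"]
encoded = one_hot_encode(fuel)
for row in encoded:
    print(row)
```

Each row has exactly one indicator set to 1, so the encoded columns can be fed to models that expect numeric input.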