
Data Analysis with Python

Learning Objectives
• Identify and handle missing values
• Data Formatting
• Data Normalization (centering/scaling)
• Data Binning
• Turning categorical values into numeric variables

Data Preprocessing

The process of mapping or converting data from its initial “raw” form into another format, in order to prepare the data for further analysis.

• Also known as
– Data Cleaning or Data Wrangling

12/7/2019
Data Mining: Concepts and Techniques
How to Handle Missing Data?
• What is a missing value?
• A missing value occurs when no data is stored for a variable (feature) in an observation.
• It could be represented as “?”, “N/A”, 0, or just a blank cell.
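
The points above can be sketched in pandas. This is a minimal illustration, not the slides' own code: the column names and values are hypothetical, and treating 0 as missing is left out on purpose because 0 can also be a legitimate measurement.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; "?" and "N/A" stand in for missing entries.
df = pd.DataFrame({
    "price": ["13495", "?", "16500", "N/A"],
    "doors": ["two", "four", None, "four"],
})

# Map the placeholder strings to NaN so pandas treats them as missing.
df = df.replace(["?", "N/A"], np.nan)

# Count missing values per column.
missing_counts = df.isna().sum()

# Option 1: drop the rows that have no price.
dropped = df.dropna(subset=["price"])

# Option 2: convert to numbers and fill the gaps with the column mean.
df["price"] = df["price"].astype(float)
filled = df["price"].fillna(df["price"].mean())

print(missing_counts)
print(filled.tolist())
```

Dropping rows is the simpler choice when few observations are affected; filling with the mean keeps every row but assumes the missing values are close to typical.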

Data Formatting
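
As a hedged illustration of what formatting can involve, the sketch below brings inconsistent spellings to one canonical value and converts numbers stored as text to a numeric type. The column names, values, and the spelling map are assumptions made for the example, not taken from the slides.

```python
import pandas as pd

# Hypothetical data: one city written four ways, prices stored as text.
df = pd.DataFrame({
    "city": ["N.Y.", "Ny", "NY", "New York"],
    "price": ["13495", "16500", "18920", "17450"],
})

# Bring different spellings of the same value to one canonical form.
canonical = {"N.Y.": "New York", "Ny": "New York", "NY": "New York"}
df["city"] = df["city"].replace(canonical)

# Numbers stored as text cannot be averaged; convert them to a numeric type.
df["price"] = pd.to_numeric(df["price"])

print(df["city"].unique())
print(df["price"].mean())
```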

Data Normalization

Min-Max
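
Min-max scaling is usually defined as (x - min) / (max - min), which maps a feature onto the range [0, 1]. A minimal sketch with made-up values, assuming the feature is held in a pandas Series:

```python
import pandas as pd

# Hypothetical feature values.
length = pd.Series([10.0, 20.0, 30.0, 50.0])

# Subtract the minimum, then divide by the range.
min_max = (length - length.min()) / (length.max() - length.min())

print(min_max.tolist())  # [0.0, 0.25, 0.5, 1.0]
```

The formula divides by zero when every value is identical, so a constant column needs separate handling.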

Z-Score
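
The z-score is usually defined as (x - mean) / standard deviation, giving a feature with mean 0 and standard deviation 1. A minimal sketch with made-up values; note that `Series.std()` in pandas uses the sample standard deviation (ddof=1) by default, which is an assumption of this example rather than something the slides specify.

```python
import pandas as pd

# Hypothetical feature values.
width = pd.Series([2.0, 4.0, 6.0])

# Center on the mean, then scale by the (sample) standard deviation.
z_score = (width - width.mean()) / width.std()

print(z_score.tolist())  # [-1.0, 0.0, 1.0]
```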

Binning
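
Binning groups numeric values into a small number of labeled ranges. One possible sketch uses `pandas.cut` with three equal-width bins; the prices, the number of bins, and the labels are all hypothetical choices for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical prices.
price = pd.Series([5000, 9000, 16000, 24000, 31000, 45000])

# Four evenly spaced edges give three equal-width bins.
bins = np.linspace(price.min(), price.max(), 4)
labels = ["low", "medium", "high"]

# include_lowest=True keeps the minimum value inside the first bin.
price_binned = pd.cut(price, bins=bins, labels=labels, include_lowest=True)

print(price_binned.tolist())
```

Equal-width bins are easy to explain but can end up unevenly populated when the data is skewed, as the single "high" entry here shows.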

Turning categorical variables into quantitative variables in Python
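
A common way to do this is one-hot encoding: each category becomes its own 0/1 column. The sketch below uses `pandas.get_dummies`; the column name and its values are hypothetical.

```python
import pandas as pd

# Hypothetical categorical feature.
df = pd.DataFrame({"fuel": ["gas", "diesel", "gas", "gas"]})

# One indicator column per category; dtype=int yields 0/1 instead of booleans.
dummies = pd.get_dummies(df["fuel"], prefix="fuel", dtype=int)

# Attach the indicator columns to the original frame.
df = pd.concat([df, dummies], axis=1)

print(df)
```

Each row has exactly one indicator set to 1, so the original category can always be recovered from the new columns.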
