Welcome to Scribd!

Data Mining

Uploaded by

0% found this document useful (0 votes)

2 views11 pages

Data transformation techniques like normalization are used to prepare raw data for processing. Normalization maps data values to a smaller range through techniques like min-max normalization, z-score normalization, and decimal scaling. Min-max normalization scales data linearly between a specified minimum and maximum. Z-score normalization standardizes values based on an attribute's mean and standard deviation. Decimal scaling moves the decimal point of values based on the maximum absolute value. Normalization pre-processes data to minimize duplicates and aid further data mining tasks.

Original Description:

Original Title

Data Mining Ppt

Copyright

Available Formats

PPTX, PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as PPTX, PDF, TXT or read online from Scribd

Flag for inappropriate content

0% found this document useful (0 votes)

2 views11 pages

Data Mining

Uploaded by

Muneera Rafique

Copyright:

Available Formats

Download as PPTX, PDF, TXT or read online from Scribd

Flag for inappropriate content

Jump to Page

You are on page 1of 11

Search inside document

Presentation topic:

Data
Transformation by
Normalization
Data Transformation:

 A function that maps the entire set of values • Normalization: Scaled to fall within
of a given attribute to a new set of i.e. a smaller, specified range.
replacement values each old value can be
identified with one of the new values min-max normalization
 Methods z-score normalization
normalization by decimal scaling
• Smoothing: Remove noise from data
• Attribute/feature construction • Discretization: Concept hierarchy
climbing
• New attributes constructed from the
given ones
• Aggregation: Summarization, data cube
construction
“
Data normalization
The data normalization (also
referred to as data pre-
processing) is a basic element of
data mining. It means
transforming the data, namely
converting the source data in to
another format that allows
processing data effectively. The
main purpose of data
normalization is to minimize or
even exclude duplicated data.
Methods for data normalization:

Min-max normalization
Min-max normalization performs a linear transformation on the original data.
Suppose that minA and maxA are the minimum and maximum values of an
attribute, A. Min-max normalization maps a value, vi of A to vi’ in the range
[new_minA, new_maxA] by computing

v  minA
v'  (new _ maxA  new _ minA)  new _ minA
maxA  minA
Min-max normalization preserves the relationships among the original data values. It
will encounter an “out-of-bounds” error if a future input case for normalization falls
outside of the original data range for A.
Methods for data normalization:
Example Min-max normalization
Suppose that the minimum and maximum values for the
attribute income are $12,000 and $98,000, respectively. We
would like to map income to the range [0.0,1.0]. By min-max
normalization, a value of $73,600 for income is transformed
to:

73,600  12,000
(1.0  0)  0  0.716
98,000  12,000
Methods for data normalization:

z-score normalization (or zero-mean normalization)

In z-score normalization (or zero-mean normalization), the values for an attribute, A, are
normalized based on the mean (i.e., average) and standard deviation of A. A value, vi, of A is
normalized to vi’ by computing

v  A
v' 
 A
Methods for data normalization:

Example Z-score normalization

z-score normalization. Suppose that the mean and standard
deviation of the values for the attribute income are $54,000
and $16,000, respectively. With z-score normalization, a value
of $73,600 for income is transformed to
Ex. Let μ = 54,000, σ = 16,000. Then

73,600  54,000
 1.225
16,000
Methods for data normalization:
Normalization by decimal scaling
Normalization by decimal scaling normalizes by moving the decimal
point of values of attribute A. The number of decimal points moved
depends on the maximum absolute value of A. A value, vi, of A is
normalized to vi’ by computing

v
v' 
10 j
Methods for data normalization:

Example of normalization by decimal scaling

Suppose that the recorded values of A range from −986 to 917.
The maximum absolute value of A is 986. To normalize by
decimal scaling, we therefore divide each value by 1000 (i.e., j
= 3) so that −986 normalizes to −0.986 and 917 normalizes to
0.917. Note that normalization can change the original data
quite a bit, especially when using z-score normalization or
decimal scaling. It is also necessary to save the normalization
parameters (e.g., the mean and standard deviation if using z-
score normalization) so that future data can be normalized in a
uniform manner.
Thank
you

The Ultimate Guide To Iphone Resolutions
Document3 pages
The Ultimate Guide To Iphone Resolutions
khaled alahmad
No ratings yet
Laser Cutter Business Guide Ebook PDF
Document25 pages
Laser Cutter Business Guide Ebook PDF
dieter
No ratings yet
Bio Medical Signal Processing Tompkins
Document304 pages
Bio Medical Signal Processing Tompkins
sachin58
83% (6)
PEGASUS Spyware: Detecting and Protecting Your Smartphone From
Document8 pages
PEGASUS Spyware: Detecting and Protecting Your Smartphone From
If I
No ratings yet
Sample One Pager For Documentary Production Proposal
Document1 page
Sample One Pager For Documentary Production Proposal
Deidre Carney
100% (11)
Supervised Learning 1 PDF
Document162 pages
Supervised Learning 1 PDF
Alexander
No ratings yet
Profound Python Data Science
From Everand
Profound Python Data Science
Onder Teker
No ratings yet
Number Series Arranged by Leonalyn - All With Tutorial Videos
Document12 pages
Number Series Arranged by Leonalyn - All With Tutorial Videos
Niela Alcelyn Dee
No ratings yet
Data Transformation
Document16 pages
Data Transformation
Avudaiappan S
No ratings yet
Data Normalization
Document7 pages
Data Normalization
Ruchira Saha
No ratings yet
DM 02 04 Data Transformation
Document52 pages
DM 02 04 Data Transformation
maneesh s
No ratings yet
3point5point2 Normalization
Document3 pages
3point5point2 Normalization
Sally Sameh
No ratings yet
Data Cleaning: Missing Values: - For Example in Attribute Income If
Document30 pages
Data Cleaning: Missing Values: - For Example in Attribute Income If
Ashyou Youash
No ratings yet
Normalization 05032024 010758pm
Document17 pages
Normalization 05032024 010758pm
Muneeba Hussain
No ratings yet
Data Preprocessing Part 2
Document14 pages
Data Preprocessing Part 2
new acc jeet
No ratings yet
Integration and Normalization
Document19 pages
Integration and Normalization
Mahmoud Elnahas
No ratings yet
Chapter 3: Data Preprocessing
Document15 pages
Chapter 3: Data Preprocessing
Asma Batool Naqvi
No ratings yet
Data Mining CSE-443: Ayesha Aziz Prova Lecturer, Dept. of CSE CWU
Document21 pages
Data Mining CSE-443: Ayesha Aziz Prova Lecturer, Dept. of CSE CWU
Dipty Sarker
No ratings yet
DMDW 5
Document25 pages
DMDW 5
Anu agarwal
No ratings yet
Lec 4 PDF
Document66 pages
Lec 4 PDF
pranab sarker
No ratings yet
ML Practical File
Document43 pages
ML Practical File
Pankaj Singh
100% (1)
Data Mining - Classification & Prediction
Document5 pages
Data Mining - Classification & Prediction
Tdx mentor
No ratings yet
Data Analysis
Document8 pages
Data Analysis
Tayar Elie
No ratings yet
Ch2 Data Preprocessing Part3: Amit KR Upadhyay Sharda University
Document24 pages
Ch2 Data Preprocessing Part3: Amit KR Upadhyay Sharda University
yachana_talk
No ratings yet
Exp2 - Data Visualization and Cleaning and Feature Selection
Document13 pages
Exp2 - Data Visualization and Cleaning and Feature Selection
mnbatrawi
No ratings yet
Preprocessing - M2
Document53 pages
Preprocessing - M2
RIZKA FIDYA PERMATASARI 06211940005004
No ratings yet
Normalization A Preprocessing Stage
Document5 pages
Normalization A Preprocessing Stage
Tanzeel Hassan
No ratings yet
Unit 4
Document33 pages
Unit 4
PaiEducation
No ratings yet
Lecture 09 DM
Document14 pages
Lecture 09 DM
Shakeel Ahmed Kashmiri
No ratings yet
Data Mining Unit-1 Lect-4
Document49 pages
Data Mining Unit-1 Lect-4
Pooja Reddy
No ratings yet
DataFrame Statistics
Document41 pages
DataFrame Statistics
rohan jha
No ratings yet
5 Data Pre Processing II
Document26 pages
5 Data Pre Processing II
Kanika Chanana
No ratings yet
Major Issues in Data Mining
Document9 pages
Major Issues in Data Mining
Gaurav Jaiswal
No ratings yet
Data Integration & Transformation
Document14 pages
Data Integration & Transformation
Rupesh V
No ratings yet
Unit 4 4407 Data Mining Discussion
Document2 pages
Unit 4 4407 Data Mining Discussion
Zodwa Mngometulu
No ratings yet
Utf 8''week4
Document15 pages
Utf 8''week4
devendra416
No ratings yet
FDP Day1
Document35 pages
FDP Day1
yadavsticky5108
No ratings yet
Dimensional Reduction in R
Document24 pages
Dimensional Reduction in R
Shil Shambharkar
No ratings yet
ICS 2408 - Lecture 2 - Data Preprocessing
Document29 pages
ICS 2408 - Lecture 2 - Data Preprocessing
petergitagia9781
No ratings yet
JAVA Advanced 3
Document19 pages
JAVA Advanced 3
Lucky Mahanto
No ratings yet
Week 4 - 5 - Data Preprocessing
Document67 pages
Week 4 - 5 - Data Preprocessing
Hussain ASL
No ratings yet
Linear Regression
Document4 pages
Linear Regression
Muhammad Ismail
No ratings yet
Feature Scaling in Machine Learning
Document4 pages
Feature Scaling in Machine Learning
Varun Bhayana
No ratings yet
Chandigarh Group of Colleges College of Engineering Landran, Mohali
Document47 pages
Chandigarh Group of Colleges College of Engineering Landran, Mohali
tanvi wadhwa
No ratings yet
Data Warehousing and Mining: Ii Unit: Data Preprocessing, Language Architecture Concept Description
Document7 pages
Data Warehousing and Mining: Ii Unit: Data Preprocessing, Language Architecture Concept Description
ravi3754
No ratings yet
Jalali@mshdiua - Ac.ir Jalali - Mshdiau.ac - Ir: Machine Learning
Document35 pages
Jalali@mshdiua - Ac.ir Jalali - Mshdiau.ac - Ir: Machine Learning
Mostafa Heidary
No ratings yet
Bayesian Regression and Bitcoin
Document17 pages
Bayesian Regression and Bitcoin
vy
No ratings yet
CS231n Convolutional Neural Networks For Visual Recognition 6
Document17 pages
CS231n Convolutional Neural Networks For Visual Recognition 6
Ali Rahimi
No ratings yet
4 - Finding and Fixing Data Quality Issues
Document48 pages
4 - Finding and Fixing Data Quality Issues
mkz01041
No ratings yet
Data Preprocessing
Document28 pages
Data Preprocessing
Rahul Sharma
No ratings yet
REPORT - Gradient Based Optimization
Document2 pages
REPORT - Gradient Based Optimization
abhay4meggi
No ratings yet
Untitled Document
Document1 page
Untitled Document
nafev89296
No ratings yet
K Nearest Neighbours (KNN) : Short Intro To KNN
Document13 pages
K Nearest Neighbours (KNN) : Short Intro To KNN
Luka Filipovic
No ratings yet
A COMPLETE GUIDE TO PRINCIPAL COMPONENT ANALYSIS in ML 1598272724
Document16 pages
A COMPLETE GUIDE TO PRINCIPAL COMPONENT ANALYSIS in ML 1598272724
「瞳」你分享
No ratings yet
Scikit Learn
Document17 pages
Scikit Learn
RR
No ratings yet
Data Cleaning
Document26 pages
Data Cleaning
Prashant deshmukh
No ratings yet
2.4 Dimensionality Reduction
Document6 pages
2.4 Dimensionality Reduction
Raja
No ratings yet
Normalization
Document35 pages
Normalization
Hrithik Reigns
No ratings yet
Nonlinear Model
Document3 pages
Nonlinear Model
Divya B
No ratings yet
Pca&kmean
Document6 pages
Pca&kmean
goelkartik23
No ratings yet
Week10 KNN Practical
Document4 pages
Week10 KNN Practical
seerungen jordi
No ratings yet
Linear Regression
Document36 pages
Linear Regression
Siddharth Doshi
No ratings yet
PCA by Vikram Kumar
Document19 pages
PCA by Vikram Kumar
Vikram Kumar
No ratings yet
Pattern Recognition - Unit 2
Document31 pages
Pattern Recognition - Unit 2
Priyansh Kumar
No ratings yet
How To Transform Features Into Normal Gaussian Distribution
Document9 pages
How To Transform Features Into Normal Gaussian Distribution
Ramesh Mudhiraj
No ratings yet
Data Wrangling and Preprocessing
Document41 pages
Data Wrangling and Preprocessing
Archana Balikram
No ratings yet
Pert Math Study Guide 1
Document18 pages
Pert Math Study Guide 1
vagablond
100% (1)
Hardware Interrupts - Arduino
Document7 pages
Hardware Interrupts - Arduino
17maq307
No ratings yet
Obelisk: MIDI Harmoniser Plugin
Document3 pages
Obelisk: MIDI Harmoniser Plugin
Daniel Barboza
No ratings yet
11 Modbus Protocol
Document22 pages
11 Modbus Protocol
Yohan Krisnandi Putra
No ratings yet
Structure and Functions of CPU
Document18 pages
Structure and Functions of CPU
Eswin Angel
No ratings yet
Oracle Live Experience Cloud Solutions Engineer Consultant Assessment
Document15 pages
Oracle Live Experience Cloud Solutions Engineer Consultant Assessment
The_vet
No ratings yet
Cisco Ucertify 300-710 Actual Test 2023-Jun-04 by Mick 101q Vce
Document5 pages
Cisco Ucertify 300-710 Actual Test 2023-Jun-04 by Mick 101q Vce
khalid anjum
No ratings yet
Artikel 2 (Pendukung) Preparing-For-The-Next-Normal-Via-Digital-Manufacturings-Scaling-Potential
Document10 pages
Artikel 2 (Pendukung) Preparing-For-The-Next-Normal-Via-Digital-Manufacturings-Scaling-Potential
Erik Gunawan
No ratings yet
Dell Inspiron 531 Specs - CNET
Document7 pages
Dell Inspiron 531 Specs - CNET
Arturo Nieto
No ratings yet
4.2 1 Maintain Computer Systems 2
Document15 pages
4.2 1 Maintain Computer Systems 2
Alexis Rhianne
No ratings yet
Icdl Workforce Word Processing. Syllabus 6.0. Learning Material (Ms Word 2013) Provided by - E-Tech Complete Solutions Limited
Document178 pages
Icdl Workforce Word Processing. Syllabus 6.0. Learning Material (Ms Word 2013) Provided by - E-Tech Complete Solutions Limited
janithmi perera
No ratings yet
Microphone Power Decoupling: Released Under The Creative Commons Attribution Share-Alike 4.0 License
Document1 page
Microphone Power Decoupling: Released Under The Creative Commons Attribution Share-Alike 4.0 License
isubha
No ratings yet
Shree Ramchandra College of Engineering: Lab Manual
Document64 pages
Shree Ramchandra College of Engineering: Lab Manual
Trang Nguyễn
No ratings yet
MS PPT Short Keys
Document7 pages
MS PPT Short Keys
Hari Krishnan
No ratings yet
785 Books Doubtnut Question Bank
Document49 pages
785 Books Doubtnut Question Bank
Hitesh Kumar
No ratings yet
Varieties of Threaded Code For Language Implementation
Document13 pages
Varieties of Threaded Code For Language Implementation
Jens Sierens
No ratings yet
Script Analyzer:: A Tool For Quantitative Paleography
Document25 pages
Script Analyzer:: A Tool For Quantitative Paleography
vinodh.vinodh
No ratings yet
LectureZero CAP202
Document34 pages
LectureZero CAP202
tash meet
No ratings yet
IOT New
Document25 pages
IOT New
Arjun
No ratings yet
Naji Mussa
Document5 pages
Naji Mussa
Phani Prakash
No ratings yet
Moot Problem ANMCC2020 PDF
Document16 pages
Moot Problem ANMCC2020 PDF
adventure of Jay Khatri
No ratings yet
ZOLLER Brochure Overview
Document11 pages
ZOLLER Brochure Overview
junjie 422
No ratings yet
E - Manual 2024
Document2 pages
E - Manual 2024
for life
No ratings yet
bq34z100-G1 Wide Range Fuel Gauge With Impedance Track™ Technology
Document65 pages
bq34z100-G1 Wide Range Fuel Gauge With Impedance Track™ Technology
Pcross
No ratings yet