You are on page 1 of 3

In [1]: import pandas as pd

In [2]: import numpy as np

# Load the MBA dataset from disk; the raw string keeps the Windows path
# readable and denotes exactly the same path as the original
# double-backslash literal.
In [7]: mba = pd.read_csv(r"D:\Course\Python\Datasets\mba.csv")


mba

...

# Remove the 'Datasrno' column (appears to be a row serial number) before
# scaling. drop(columns=...) has the same effect as the original
# `del mba['Datasrno']`, including a KeyError if the column is absent.
In [8]: mba = mba.drop(columns=['Datasrno'])


In [9]: mba
...

# Save the remaining column labels (workex, gmat — see Out[10]) so the
# scaled NumPy arrays can later be rebuilt into DataFrames with the
# original headers.
In [10]: names = mba.columns


names

Out[10]: Index(['workex', 'gmat'], dtype='object')

Standardization
In [11]: from sklearn import preprocessing

# StandardScaler standardizes each column to zero mean and unit
# variance: z = (x - mean) / std.

scaler = preprocessing.StandardScaler()

# fit() learns each column's mean and standard deviation; transform()
# then applies z = (x - mean) / std. fit_transform() does both in one
# call — the original first assigned the fitted scaler OBJECT to mba1
# and immediately overwrote it with the transformed array, so the first
# assignment was dead code.
In [12]: mba1 = scaler.fit_transform(mba)  # standardized values as a NumPy array

In [13]: mba1

Out[13]: array([[-1.3336917 , 0.30134669],

[ 1.80853813, -2.42709832],

[-0.01833968, 0.98345794],

...,

[-1.07792881, -3.4502652 ],

[-1.73560482, -3.4502652 ],

[-0.20102746, -3.10920957]])
# Wrap the scaled NumPy array back into a DataFrame, restoring the
# column labels captured earlier in `names`.
In [14]: mba1 = pd.DataFrame(mba1, columns=names)

In [15]: mba1
...

Method 2 for Standardization


We can use the scale function to standardize the data.

The scale function is available in the sklearn.preprocessing module.

In [16]: from sklearn.preprocessing import scale

# scale() is a one-shot functional alternative to StandardScaler: it
# returns the standardized array directly, with no separate
# fit/transform steps.
# NOTE(review): cell In [23] later rebinds the name `scale` to a
# MinMaxScaler instance, shadowing this import — calling scale() after
# that point would fail.
mba2 = scale(mba)

In [17]: mba2
...

# Rebuild the standardized array as a DataFrame with the original
# column labels.
In [21]: mba2 = pd.DataFrame(mba2, columns=names)


mba2

...

Normalization - Rescaling the Data


In [22]: from sklearn.preprocessing import MinMaxScaler

# MinMaxScaler rescales each column into the [0, 1] range:
# x' = (x - min) / (max - min)

# Bind the scaler instance to a distinct name: the original used
# `scale = MinMaxScaler()`, which shadowed the scale() function imported
# in cell In [16] and would break any later call to it.
In [23]: minmax_scaler = MinMaxScaler()

# fit() learns each column's min and max; transform() applies
# (x - min) / (max - min). fit_transform() does both in one call — the
# original assigned the fitted scaler OBJECT to mba3 and immediately
# overwrote it with the transformed array.
In [24]: mba3 = minmax_scaler.fit_transform(mba)

In [26]: mba3

Out[26]: array([[0.04444444, 0.66666667],

[0.36296296, 0.22222222],

[0.17777778, 0.77777778],

...,

[0.07037037, 0.05555556],

[0.0037037 , 0.05555556],

[0.15925926, 0.11111111]])
# Rebuild the normalized array as a DataFrame with the original
# column labels.
In [27]: mba3 = pd.DataFrame(mba3, columns=names)

In [28]: mba3

...

Method 2 for Normalization


# Manual min-max normalization, applied column-wise:
# x' = (x - min) / (max - min). Same result as MinMaxScaler, but stays
# in pandas so the column labels are kept automatically.
In [29]: mba4 = mba.sub(mba.min()).div(mba.max() - mba.min())

In [30]: mba4

...

In [ ]: ​

You might also like