
Data Standardization:

The process of standardizing the data to a common format and a common range (typically zero mean and unit standard deviation for each feature).
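As an aside not in the original notebook, the transformation StandardScaler applies to each feature is the z-score, z = (x - mean) / std. A minimal self-contained sketch with made-up numbers:

# Illustrative sketch only: z-score standardization by hand.
# The values below are invented; they are not from the breast cancer dataset.
import numpy as np

x = np.array([10.0, 12.0, 14.0, 16.0, 18.0])   # one hypothetical feature
z = (x - x.mean()) / x.std()                    # subtract the mean, divide by the std
print(z.mean(), z.std())                        # approximately 0.0 and 1.0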

import numpy as np
import pandas as pd
import sklearn.datasets
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

# loading the dataset
dataset = sklearn.datasets.load_breast_cancer()

# loading the data to a pandas dataframe
df = pd.DataFrame(dataset.data, columns=dataset.feature_names)

df.head()

   mean radius  mean texture  mean perimeter  mean area  mean smoothness  mean compactness  mean concavity  mean concave points  mean symmetry  mean fractal dimension  radius error  texture error  perimeter error  ...
0        17.99         10.38          122.80     1001.0          0.11840           0.27760          0.3001              0.14710         0.2419                 0.07871        1.0950         0.9053            8.589  ...
1        20.57         17.77          132.90     1326.0          0.08474           0.07864          0.0869              0.07017         0.1812                 0.05667        0.5435         0.7339            3.398  ...
2        19.69         21.25          130.00     1203.0          0.10960           0.15990          0.1974              0.12790         0.2069                 0.05999        0.7456         0.7869            4.585  ...
3        11.42         20.38           77.58      386.1          0.14250           0.28390          0.2414              0.10520         0.2597                 0.09744        0.4956         1.1560            3.445  ...
4        20.29         14.34          135.10     1297.0          0.10030           0.13280          0.1980              0.10430         0.1809                 0.05883        0.7572         0.7813            5.438  ...

df.shape

(569, 30)

X = df
Y = dataset.target

print(X)

mean radius mean texture ... worst symmetry worst fractal dimension
0 17.99 10.38 ... 0.4601 0.11890
1 20.57 17.77 ... 0.2750 0.08902
2 19.69 21.25 ... 0.3613 0.08758
3 11.42 20.38 ... 0.6638 0.17300
4 20.29 14.34 ... 0.2364 0.07678
.. ... ... ... ... ...
564 21.56 22.39 ... 0.2060 0.07115
565 20.13 28.25 ... 0.2572 0.06637
566 16.60 28.08 ... 0.2218 0.07820
567 20.60 29.33 ... 0.4087 0.12400
568 7.76 24.54 ... 0.2871 0.07039

[569 rows x 30 columns]

Splitting the data into training data and test data

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=3)

print(X.shape, X_train.shape, X_test.shape)

(569, 30) (455, 30) (114, 30)

Standardize the data

print(dataset.data.std())

228.29740508276657
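This overall figure is dominated by the features with the largest ranges. As a quick illustrative check that is not part of the original notebook, the per-feature standard deviations differ by several orders of magnitude:

# Illustrative check (assumes df from the cells above): per-feature spread.
print(df.std().sort_values())   # feature stds differ by several orders of magnitude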
scaler = StandardScaler()

scaler.fit(X_train)

StandardScaler(copy=True, with_mean=True, with_std=True)

X_train_standardized = scaler.transform(X_train)

print(X_train_standardized)

[[ 1.40381088  1.79283426  1.37960065 ...  1.044121    0.52295995
   0.64990763]
 [ 1.16565505 -0.14461158  1.07121375 ...  0.5940779   0.44153782
  -0.85281516]
 [-0.0307278  -0.77271123 -0.09822185 ... -0.64047556 -0.31161687
  -0.69292805]
 ...
 [ 1.06478904  0.20084323  0.89267396 ...  0.01694621  3.06583565
  -1.29952679]
 [ 1.51308238  2.3170559   1.67987211 ...  1.14728703 -0.16599653
   0.82816016]
 [-0.73678981 -1.02636686 -0.74380549 ... -0.31826862 -0.40713129
  -0.38233653]]

X_test_standardized = scaler.transform(X_test)

print(X_train_standardized.std())

1.0

print(X_test_standardized.std())

0.8654541077212674
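The training data now has a standard deviation of 1, while the test data sits near 0.87 because the mean and scale were learned from the training split only and then reused on the test split. As an illustrative check that is not part of the original notebook, the same transform can be reproduced by hand from the fitted scaler's mean_ and scale_ attributes:

# Illustrative check (assumes scaler, X_train and X_test from the cells above):
# StandardScaler's transform computes (X - mean_) / scale_, column by column.
manual_train = (X_train - scaler.mean_) / scaler.scale_
manual_test = (X_test - scaler.mean_) / scaler.scale_

print(np.allclose(manual_train, X_train_standardized))  # True
print(np.allclose(manual_test, X_test_standardized))    # True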
