Experiment 01: AIM: To Perform Data Preparation Using Numpy and Panda. Theory

Uploaded by

Neha Chogale

EXPERIMENT 01

AIM: To perform data preparation using NumPy and pandas.

THEORY: Data preparation is the process of cleaning and transforming raw data before it is
processed and analyzed. It often involves reformatting data, correcting errors in the data,
and combining data sets to enrich the data.

Data preparation is often a lengthy undertaking for data professionals or business users,
but it is an essential prerequisite for putting data in context, turning it into insights,
and eliminating bias that results from poor data quality.

For example, the data preparation process usually includes standardizing data formats,
enriching source data, and/or removing outliers.

Data Preparation Steps

1. Gather data

The data preparation process begins with finding the right data. This can come from an
existing data catalog or can be added ad-hoc.

2. Discover and assess data

After collecting the data, it is important to explore each dataset. This step is about
getting to know the data and understanding what has to be done before the data becomes
useful in a particular context.
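
A minimal sketch of this discovery step with pandas is given here. The DataFrame and its column names (age, city, income) are hypothetical sample data, not taken from the experiment's notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "age": [25, 32, np.nan, 41, 29],
    "city": ["Mumbai", "Pune", "Pune", None, "Mumbai"],
    "income": [30000.0, 52000.0, 47000.0, np.nan, 39000.0],
})

shape = df.shape                      # (number of rows, number of columns)
dtypes = df.dtypes                    # data type of each column
summary = df.describe()               # count, mean, std, min, quartiles, max (numeric columns)
missing_per_column = df.isna().sum()  # how many values are missing in each column
unique_cities = df["city"].nunique()  # distinct non-missing values in a text column

print(shape)
print(dtypes)
print(summary)
print(missing_per_column)
```

These few calls are usually enough to decide which columns need cleaning before any further processing.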

Discovery is a big task, but Talend’s data preparation platform offers visualization tools
which help users profile and browse their data.

3. Cleanse and validate data

Cleaning up the data is traditionally the most time-consuming part of the data preparation
process, but it is crucial for removing faulty data and filling in gaps. Important tasks here
include:

● Removing extraneous data and outliers.
● Filling in missing values.
● Conforming data to a standardized pattern.
● Masking private or sensitive data entries.

Once data has been cleansed, it must be validated by testing for errors in the data
preparation process up to this point. Often, an error in the system will become
apparent during this step and will need to be resolved before moving forward.
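
The four cleansing tasks and the validation step can be sketched as follows. This is only an illustration under stated assumptions: the data is made up, the 1.5 * IQR rule is one common way to flag outliers, the median is one possible fill value, and keeping the last four digits is one possible masking scheme.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; names, phone numbers and salaries are invented.
df = pd.DataFrame({
    "name": ["  asha ", "RAVI", "meena", "Kiran ", "john"],
    "phone": ["9876543210", "9123456780", "9988776655", "9000011111", "9555566666"],
    "salary": [40000.0, 42000.0, np.nan, 41000.0, 900000.0],
})

# 1. Remove outliers using the 1.5 * IQR rule (an assumed, commonly used choice).
q1 = df["salary"].quantile(0.25)
q3 = df["salary"].quantile(0.75)
iqr = q3 - q1
in_range = df["salary"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
clean = df[in_range | df["salary"].isna()].copy()  # keep rows with missing salary for step 2

# 2. Fill missing values with the median of the remaining salaries.
clean["salary"] = clean["salary"].fillna(clean["salary"].median())

# 3. Conform text to one pattern: no surrounding spaces, title case.
clean["name"] = clean["name"].str.strip().str.title()

# 4. Mask sensitive entries: show only the last four digits of the phone number.
clean["phone"] = "******" + clean["phone"].str[-4:]

# Validate the result before moving on.
assert clean["salary"].notna().all()
assert clean["name"].str.fullmatch(r"[A-Z][a-z]+").all()

print(clean)
```

Whether an outlier should be dropped, capped, or kept depends on the data set, so the rule used here should be treated as a starting point.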

4. Transform and enrich data

Transforming data is the process of updating the format or value entries in order to reach
a well-defined outcome, or to make the data more easily understood by a wider audience.
Enriching data refers to adding and connecting data with other related information to
provide deeper insights.
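
As a rough illustration, transforming can mean converting text dates and units into analysis-friendly columns, and enriching can mean joining a lookup table. The tables and column names below (orders, cities, city_code, amount_paise) are hypothetical.

```python
import pandas as pd

# Hypothetical order records with dates stored as text and amounts stored in paise.
orders = pd.DataFrame({
    "order_id": [1, 2, 3],
    "order_date": ["2023-01-05", "2023-02-05", "2023-03-17"],
    "city_code": ["MUM", "PUN", "MUM"],
    "amount_paise": [125000, 89900, 40050],
})

# Hypothetical lookup table used to enrich the orders.
cities = pd.DataFrame({
    "city_code": ["MUM", "PUN"],
    "city": ["Mumbai", "Pune"],
    "state": ["Maharashtra", "Maharashtra"],
})

# Transform: parse dates, convert units, derive a new column.
orders["order_date"] = pd.to_datetime(orders["order_date"], format="%Y-%m-%d")
orders["amount_rupees"] = orders["amount_paise"] / 100
orders["order_month"] = orders["order_date"].dt.month

# Enrich: attach city and state to every order.
# validate="many_to_one" raises an error if a city_code is duplicated in the lookup table.
enriched = orders.merge(cities, on="city_code", how="left", validate="many_to_one")

print(enriched)
```

A left join keeps every order even when no matching lookup row exists, which makes unmatched codes visible as missing values instead of silently dropping rows.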

5. Store data

Once prepared, the data can be stored or channeled into a third-party application, such as
a business intelligence tool, clearing the way for processing and analysis to take place.
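
A small sketch of the storing step, assuming CSV as the storage format and a temporary directory as the destination (both are illustrative choices, and the DataFrame is made up):

```python
import os
import tempfile

import pandas as pd

# Hypothetical prepared data set.
prepared = pd.DataFrame({"id": [1, 2, 3], "score": [0.5, 0.75, 1.0]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "prepared.csv")
    prepared.to_csv(path, index=False)  # index=False avoids writing the row labels as a column
    restored = pd.read_csv(path)        # read the file back to confirm it round-trips

print(restored.equals(prepared))
```

Reading the file back is a cheap check that nothing was lost or retyped while writing.
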

(a) Derive an index field and add it to the data set.

(b) Find out the missing values.

i. Standardize the variable
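
The three tasks above can be sketched with NumPy and pandas as follows. This is not the code from the linked notebook: the data and column names are hypothetical, and "standardize" is assumed to mean the z-score, z = (x - mean) / standard deviation, computed here with the population standard deviation (ddof=0).

```python
import numpy as np
import pandas as pd

# Hypothetical data set with some missing entries.
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, np.nan, 170.0, 180.0],
    "weight_kg": [50.0, np.nan, 65.0, 70.0, 80.0],
})

# (a) Derive an index field (1, 2, 3, ...) and add it as the first column.
df.insert(0, "index_field", np.arange(1, len(df) + 1))

# (b) Find the missing values.
missing_mask = df.isna()                         # True where a value is missing
missing_counts = missing_mask.sum()              # missing values per column
rows_with_missing = df[missing_mask.any(axis=1)] # rows that contain at least one gap

# Standardize a variable: subtract the mean and divide by the standard deviation.
# pandas skips missing values when computing mean and std, so gaps stay NaN.
col = df["height_cm"]
df["height_z"] = (col - col.mean()) / col.std(ddof=0)

print(missing_counts)
print(rows_with_missing)
print(df)
```

After standardization the variable has mean 0 and standard deviation 1, which puts columns measured in different units on a comparable scale.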

Colab Link: https://colab.research.google.com/drive/1Tjb5LroRPzTK54SCl2hectjM_jXL1qcz?usp=sharing

Conclusion: Hence, we observe that NumPy and pandas make matrix manipulation easy. This
flexibility makes them very useful in machine learning model development.
