Open navigation menu

Welcome to Scribd!

Pandas PD Numpy NP: Import As Import As

Uploaded by

0% found this document useful (0 votes)

5 views2 pages

The document discusses preprocessing and analyzing a COVID-19 dataset. It loads the dataset, checks for missing values, formats date columns as datetime, drops categorical variables, and outputs basic statistics and the dataset shape. The goal is to clean the data and prepare it for further analysis.

Original Description:

Original Title

A1

Copyright

© © All Rights Reserved

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

The document discusses preprocessing and analyzing a COVID-19 dataset. It loads the dataset, checks for missing values, formats date columns as datetime, drops categorical variables, and outputs basic statistics and the dataset shape. The goal is to clean the data and prepare it for further analysis.

Copyright:

© All Rights Reserved

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

0% found this document useful (0 votes)

5 views2 pages

Pandas PD Numpy NP: Import As Import As

Uploaded by

The document discusses preprocessing and analyzing a COVID-19 dataset. It loads the dataset, checks for missing values, formats date columns as datetime, drops categorical variables, and outputs basic statistics and the dataset shape. The goal is to clean the data and prepare it for further analysis.

Copyright:

© All Rights Reserved

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Jump to Page

You are on page 1of 2

Search inside document

In

[9]: # Assignment - A1 | Name : Pratik Pingale | Roll No : 19CO056

In [1]: #subtask 1 - importing libraries

import pandas as pd

import numpy as np

In [2]: #subtask 2 - Dataset url

# Name - COVID -19 Global Reports early March 2022

# URL - https://www.kaggle.com/danielfesalbon/covid-19-global-reports-early-mar
# local machine relative address - /covid_19_clean_complete_2022.csv/

In [3]: #subtask 3 - Loading dataset

df = pd.read_csv("covid_19_clean_complete_2022.csv")

df = df.drop('Province/State', axis=1)

df.head()

Out[3]: Country/Region Lat Long Date Confirmed Deaths Recovered Active WHO Regio

2020- Easter
0 Afghanistan 33.93911 67.709953 0 0 0 0
01-22 Mediterranea

2020-
1 Albania 41.15330 20.168300 0 0 0 0 Europ
01-22

2020-
2 Algeria 28.03390 1.659600 0 0 0 0 Afric
01-22

2020-
3 Andorra 42.50630 1.521800 0 0 0 0 Europ
01-22

2020-
4 Angola -11.20270 17.873900 0 0 0 0 Afric
01-22

In [4]: #subtask 4 - data preprocessing - detecting NaN values and using describe() fun
column={}

for x in df.columns:

column[x] = df[x].isnull().any()

print(column)

df.describe()

{'Country/Region': False, 'Lat': True, 'Long': True, 'Date': False, 'Confirme

d': False, 'Deaths': False, 'Recovered': False, 'Active': False, 'WHO Region':
True}
Out[4]: Lat Long Confirmed Deaths Recovered Active

count 213348.000000 213348.000000 2.148940e+05 214894.000000 2.148940e+05 2.148940e+05

mean 20.528131 22.735337 4.578132e+05 9310.764693 1.079987e+05 3.405037e+05

std 25.899139 76.304185 2.708770e+06 47497.835275 8.470111e+05 2.516382e+06

min -71.949900 -178.116500 0.000000e+00 0.000000 0.000000e+00 -1.638280e+05

25% 6.426991 -27.932425 2.530000e+02 2.000000 0.000000e+00 1.600000e+01

50% 22.233350 21.752000 5.223000e+03 71.000000 4.500000e+01 1.243000e+03

75% 41.166070 88.658375 9.892275e+04 1675.000000 5.115750e+03 2.644675e+04

max 71.706900 178.065000 7.925051e+07 958144.000000 3.097475e+07 7.829236e+07

In [5]: #shape of dataset (dimensions)

df.shape

(214894, 9)
Out[5]:
In [6]: #subtask 5 - data formatting and normalization

df.dtypes

Country/Region object

Out[6]:
Lat float64

Long float64

Date object

Confirmed int64

Deaths int64

Recovered int64

Active int64

WHO Region object

dtype: object

In [7]: df['Date'] = pd.to_datetime(df['Date'])

df.dtypes

Country/Region object

Out[7]:
Lat float64

Long float64

Date datetime64[ns]

Confirmed int64

Deaths int64

Recovered int64

Active int64

WHO Region object

dtype: object

In [8]: #subtask 6 - handling categorical values

#dropping the categorical variable column

df = df.drop(['WHO Region','Country/Region'], axis=1)

df.head()

Out[8]: Lat Long Date Confirmed Deaths Recovered Active

0 33.93911 67.709953 2020-01-22 0 0 0 0

1 41.15330 20.168300 2020-01-22 0 0 0 0

2 28.03390 1.659600 2020-01-22 0 0 0 0

3 42.50630 1.521800 2020-01-22 0 0 0 0

4 -11.20270 17.873900 2020-01-22 0 0 0 0

You might also like

Math Reproducibles - Grade 6
From Everand
Math Reproducibles - Grade 6
Vicky Shiotsu
Rating: 5 out of 5 stars
5/5 (4)
MECH 3032 Arduino - LED Finite State Machine Part 1: LED Finite State Machine, Random States Arduino Code
Document4 pages
MECH 3032 Arduino - LED Finite State Machine Part 1: LED Finite State Machine, Random States Arduino Code
Rehman Khan
No ratings yet
Mbist Flow
Document2 pages
Mbist Flow
Jaita Bakshi Kulshreshtha
No ratings yet
GeoVES - V1 VES Software
Document5 pages
GeoVES - V1 VES Software
Aurangzeb Jadoon
No ratings yet
ML Project - Jupyter Notebook
Document5 pages
ML Project - Jupyter Notebook
JAISURYA S B.Sc.AIDA
No ratings yet
Digital Image Processing: Fundamentals and Applications
From Everand
Digital Image Processing: Fundamentals and Applications
Fouad Sabry
No ratings yet
Loading The Dataset: First We Load The Dataset and Find Out The Number of Columns, Rows, NULL Values, Etc
Document8 pages
Loading The Dataset: First We Load The Dataset and Find Out The Number of Columns, Rows, NULL Values, Etc
Divyani Chavan
100% (1)
Taxi Trips Analysis Project 1682332303
Document28 pages
Taxi Trips Analysis Project 1682332303
mohit c-35
100% (2)
CovidData - Ipynb - Colaboratory
Document4 pages
CovidData - Ipynb - Colaboratory
ammar jamal
No ratings yet
World Population Analysis
Document14 pages
World Population Analysis
deepakbhumra
No ratings yet
02 Multicollinearity
Document8 pages
02 Multicollinearity
Gabriel Gheorghe
100% (1)
Decision Tree
Document12 pages
Decision Tree
Kagade Ajinkya
No ratings yet
Taxi Fare Team 09
Document25 pages
Taxi Fare Team 09
ABDUL RAHMAN.M
No ratings yet
Fmsa315 - s11 - A.ipynb - Fernández - Constanza
Document6 pages
Fmsa315 - s11 - A.ipynb - Fernández - Constanza
ScribdTranslations
No ratings yet
Corporate Finance Chaitanya Ca 1
Document13 pages
Corporate Finance Chaitanya Ca 1
Chaitanya Gembali
No ratings yet
Prueba 15-06-21
Document11 pages
Prueba 15-06-21
reg_rimiap
No ratings yet
Capital Budgeting 1
Document5 pages
Capital Budgeting 1
Sohaib Riaz
No ratings yet
Institutional Quality Indicators and Their Impact On Attracting Foreign Direct Investment in Algeria Using The ARDL Model
Document14 pages
Institutional Quality Indicators and Their Impact On Attracting Foreign Direct Investment in Algeria Using The ARDL Model
Najib Benslimane
No ratings yet
Import: Sys - Executable - M Pip Install
Document23 pages
Import: Sys - Executable - M Pip Install
Aounaiza Ahmed
No ratings yet
Walmart - Sales: Pandas PD Seaborn Sns Numpy NP Matplotlib - Pyplot PLT Matplotlib Datetime
Document26 pages
Walmart - Sales: Pandas PD Seaborn Sns Numpy NP Matplotlib - Pyplot PLT Matplotlib Datetime
Ramaseshu Machepalli
100% (1)
California 1673295505
Document18 pages
California 1673295505
doudoudz
No ratings yet
(Export Report) Sales Overview 20201001-20201031
Document3 pages
(Export Report) Sales Overview 20201001-20201031
amierul33
No ratings yet
Cleaning
Document555 pages
Cleaning
ifan1984
No ratings yet
Fertilidad: Numpy NP Pandas PD Matplotlib - Pyplot PLT Ipywidgets
Document8 pages
Fertilidad: Numpy NP Pandas PD Matplotlib - Pyplot PLT Ipywidgets
Kevin Fernando Ochoa Martinez
100% (1)
Daalgawar
Document6 pages
Daalgawar
САНЧИР Ганбаатар
No ratings yet
Leer Los Datos: Import As Import As Import As From Import From Import
Document14 pages
Leer Los Datos: Import As Import As Import As From Import From Import
Ricardo Sebastian
100% (1)
Fin Data Extraction Inquiry: Extraction Code: ZBSBDC Balance Sheet-BDC ZBSBDC-21000000
Document24 pages
Fin Data Extraction Inquiry: Extraction Code: ZBSBDC Balance Sheet-BDC ZBSBDC-21000000
Muhammad Ali
No ratings yet
Data Analysis With Python - Jupyter Notebook
Document10 pages
Data Analysis With Python - Jupyter Notebook
Nitish Ravuvari
No ratings yet
410 - ShubhiGupta - TSA - 2 - Shubhi Gupta
Document12 pages
410 - ShubhiGupta - TSA - 2 - Shubhi Gupta
aman gupta
No ratings yet
ML Practical 1 Code
Document1 page
ML Practical 1 Code
C-34 Siddhesh Zajada
100% (1)
Retail Analysis Walmart
Document18 pages
Retail Analysis Walmart
Navin Mathad
No ratings yet
Fin - Model - Class2 - Excel - Problem - Review - Questions
Document4 pages
Fin - Model - Class2 - Excel - Problem - Review - Questions
Gel vira
No ratings yet
A. Generate The Cumulative (Non-Discounted) After-Tax Cash Flow Diagram
Document13 pages
A. Generate The Cumulative (Non-Discounted) After-Tax Cash Flow Diagram
Haziq Hakimi
100% (1)
2020-CLO Attainment Report (CE-150 Computer Programming (A) )
Document3 pages
2020-CLO Attainment Report (CE-150 Computer Programming (A) )
Kamran JUTT
No ratings yet
DCF Calculation
Document2 pages
DCF Calculation
Nguon Sophealey
No ratings yet
CaseStudy Indexes DowJones Shangai
Document8 pages
CaseStudy Indexes DowJones Shangai
melinaguimaraes
No ratings yet
FI6051 Dynamic Delta Hedging Example HullTable14!2!3
Document4 pages
FI6051 Dynamic Delta Hedging Example HullTable14!2!3
fmurphy
No ratings yet
ORA Preview H121
Document2 pages
ORA Preview H121
Bypass Icloudiphoneipad Bypasshello
No ratings yet
Data Untuk Routing Waduk: Hasil Perhitungan
Document2 pages
Data Untuk Routing Waduk: Hasil Perhitungan
Siti Rauhun
No ratings yet
Apotelesmata 2000+ Robust
Document4 pages
Apotelesmata 2000+ Robust
Adamantini Ioannou
No ratings yet
Demo Ndivind$defaultview Spreadsheet
Document19 pages
Demo Ndivind$defaultview Spreadsheet
Daniela Tunaru
No ratings yet
Loading The Dataset: 'Churn - Modelling - CSV'
Document6 pages
Loading The Dataset: 'Churn - Modelling - CSV'
Divyani Chavan
No ratings yet
VaR & Python & BVC
Document4 pages
VaR & Python & BVC
Siham Ait ima
100% (1)
Bda
Document24 pages
Bda
Abin
No ratings yet
Valuation Final Exam
Document4 pages
Valuation Final Exam
Jeane Mae Boo
100% (1)
Assignment Sujith S
Document13 pages
Assignment Sujith S
sujith
No ratings yet
Section II: Question 1 (20 Marks)
Document5 pages
Section II: Question 1 (20 Marks)
tech damn
No ratings yet
DATA SCIENCE IDC 302 End Sem Project
Document1 page
DATA SCIENCE IDC 302 End Sem Project
soniakar3000
No ratings yet
2022 2 Bbac 322 Exam
Document18 pages
2022 2 Bbac 322 Exam
sipanjegiven
No ratings yet
MCS1040 Advanced Concepts in Data Communication Assignment 1 - 2011 (NS-2) Student No:2010/MCS/051 Senario 1
Document4 pages
MCS1040 Advanced Concepts in Data Communication Assignment 1 - 2011 (NS-2) Student No:2010/MCS/051 Senario 1
Saman
No ratings yet
BILANT 2015: Romania JUDETUL Teleorman Comuna Galateni Anexa 1 (Anexa 1 La Normele Metodologice)
Document13 pages
BILANT 2015: Romania JUDETUL Teleorman Comuna Galateni Anexa 1 (Anexa 1 La Normele Metodologice)
Sorin Georgian
No ratings yet
Lab1 Student
Document9 pages
Lab1 Student
fancy242
No ratings yet
Advertising Data Analysis
Document12 pages
Advertising Data Analysis
Vishal Sharma
No ratings yet
Covid-19 Graph
Document7 pages
Covid-19 Graph
Jude
No ratings yet
Business Mathematics
Document11 pages
Business Mathematics
Kabutu Chuunga
No ratings yet
Practical-5 - Jupyter Notebook
Document8 pages
Practical-5 - Jupyter Notebook
Harsha Gohil
100% (1)
ACCESSINFINITYLTD Case Study
Document50 pages
ACCESSINFINITYLTD Case Study
Revanth Kumar Reddy
No ratings yet
Corporate Finance
Document3 pages
Corporate Finance
Ritesh Bang
No ratings yet
Python Course Cheat Sheet
Document30 pages
Python Course Cheat Sheet
Vinay
No ratings yet
Graph of Height and Versus Time: (DH/DT)
Document19 pages
Graph of Height and Versus Time: (DH/DT)
Đức Hà Anh
No ratings yet
Decision Tree Regressor and Ensemble Techniques - Regressors
Document18 pages
Decision Tree Regressor and Ensemble Techniques - Regressors
S A
No ratings yet
Country Ranking Analysis Template
Document12 pages
Country Ranking Analysis Template
sendano 0
No ratings yet
Software Requirements Specification: 2. Scope
Document4 pages
Software Requirements Specification: 2. Scope
Gm Abu sayeed
No ratings yet
Using A MFRC522 Reader To Read and Write MIFARE RFID Cards On ARDUINO Through The MFRC522 Library BY COOQROBOT
Document20 pages
Using A MFRC522 Reader To Read and Write MIFARE RFID Cards On ARDUINO Through The MFRC522 Library BY COOQROBOT
Biraj
No ratings yet
26 28
Document25 pages
26 28
Marlon Campoverde
No ratings yet
SPI Slave VHDL Design - Surf-VHDL
Document12 pages
SPI Slave VHDL Design - Surf-VHDL
अमरेश झा
No ratings yet
IoT Based Low-Cost Weather Station and Monitoring System For Precision Agriculture in India
Document6 pages
IoT Based Low-Cost Weather Station and Monitoring System For Precision Agriculture in India
nelsitotovar
No ratings yet
2021-02 - LaunchWizard - For - SAP - External
Document13 pages
2021-02 - LaunchWizard - For - SAP - External
Isp ot
No ratings yet
G6 Glossary of Terms WPS Office
Document2 pages
G6 Glossary of Terms WPS Office
Francia Vergara
No ratings yet
HƯỚNG DẪN SỬ DỤNG OPEN REFINE
Document2 pages
HƯỚNG DẪN SỬ DỤNG OPEN REFINE
Cuong Huynh
No ratings yet
Cloud Wars
Document10 pages
Cloud Wars
Atharva Sachin Kulkarni
No ratings yet
Image Database
Document16 pages
Image Database
mulayam singh yadav
No ratings yet
Huawei Cloud Overview Thailand Presentation 2022 Ver 2
Document14 pages
Huawei Cloud Overview Thailand Presentation 2022 Ver 2
Krittamet Nethwonge
No ratings yet
Identificación para El Desarrollo - ID4D
Document19 pages
Identificación para El Desarrollo - ID4D
Felix Jimenez
No ratings yet
Machine Learning Enabled Wireless Communication Network System
Document5 pages
Machine Learning Enabled Wireless Communication Network System
papot mahmul
No ratings yet
Integrator Guide BD9XX
Document79 pages
Integrator Guide BD9XX
Kon
No ratings yet
19 Inch LCD COLOR MONITOR SERVICE MANUAL HP L1940T NT68663MEFG A02 PDF
Document52 pages
19 Inch LCD COLOR MONITOR SERVICE MANUAL HP L1940T NT68663MEFG A02 PDF
Rendy Adam Farhan
No ratings yet
ARCON - PAM - Deck - Latest - 2021
Document7 pages
ARCON - PAM - Deck - Latest - 2021
shako12
No ratings yet
Harshvardhan Resume
Document2 pages
Harshvardhan Resume
Harshvardhan Khopade
No ratings yet
VF-co-cc: Mobile Prescott & SIS655FX
Document41 pages
VF-co-cc: Mobile Prescott & SIS655FX
Kamel Nait
No ratings yet
Installation Guide KWC-100-101 Vista
Document6 pages
Installation Guide KWC-100-101 Vista
Trace Hdez
No ratings yet
Programming Assign Unit 7
Document5 pages
Programming Assign Unit 7
Mahmoud Heretani
No ratings yet
First Generation: Vacuum Tubes (1940-1956) : A UNIVAC Computer at The Census Bureau
Document17 pages
First Generation: Vacuum Tubes (1940-1956) : A UNIVAC Computer at The Census Bureau
anam
No ratings yet
Laconictester Tricentis Tosca
Document3 pages
Laconictester Tricentis Tosca
Mallikharjuna Chowdary Solo
No ratings yet
Ict Exam
Document2 pages
Ict Exam
Joms Villanueva
No ratings yet
Hadoop: A Report Writing On
Document13 pages
Hadoop: A Report Writing On
dilip kodmour
No ratings yet
"Fuzzy Logic Using Matalb": Miss. Smita A.Khadake
Document13 pages
"Fuzzy Logic Using Matalb": Miss. Smita A.Khadake
Smita Khadake-Patil
No ratings yet
Test IO Exploratory Testing Manual 2019
Document16 pages
Test IO Exploratory Testing Manual 2019
Felipe Pérez Upegui
No ratings yet
Product Name(s) : Model Number(s) :: HP LASER 408dn Firmware Readme
Document7 pages
Product Name(s) : Model Number(s) :: HP LASER 408dn Firmware Readme
Luis Roberto
No ratings yet
Lecture - 5 - Error Handling
Document9 pages
Lecture - 5 - Error Handling
gunarathneisuri11
No ratings yet