Adult Income Prediction Using Machine Learning Algorithms

Uploaded by

ritika singh

Original Title

BDA Project.pptx

Adult Income Prediction using Machine Learning Algorithms

Submitted by:
Sanchit Kaushal (2K19/BMBA/14)
Ritika (2K19/BMBA/13)

Research Questions

• Does education play a major role in salary, and what is the minimum level of education needed to ensure a high salary?
• Does marital status affect a person's salary?
• With all other factors being the same, does a person's sex determine whether they get a higher salary?
• Does a person's age play a significant role in determining salary?
• Is a person's race a significant factor in determining salary?

Overview

• The data set we are analysing is census data with a focus on the income of the population.
• Total size of the data: 32,561 rows and 15 columns (14 predictors plus the income target).
• The following is one row of data from the dataset:
39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family,
White, Male, 2174, 0, 40, United-States, <=50K

Goal: Identify how salary is affected by demographics.

Briefing with Dataset

The dataset has 14 feature variables and the target column, income.

Import Libraries and Load Data

We will first load the Python libraries that we are going to use, as well as the adult data. The last column will be our target variable, ‘income’, and the rest will be the features.
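The loading step can be sketched with the standard library alone. This is a minimal illustration, not the authors' code: the column names are the ones published with the UCI Adult dataset (the slides name only fnlwgt, education_num and income), the helper `load_adult` is hypothetical, and the single sample row from the Overview stands in for the full file.

```python
import csv
import io

# Column names as published with the UCI Adult dataset (an assumption here;
# the slides name only fnlwgt, education_num and income explicitly).
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education_num",
    "marital_status", "occupation", "relationship", "race", "sex",
    "capital_gain", "capital_loss", "hours_per_week", "native_country",
    "income",
]

# The sample row quoted in the Overview stands in for the full data file.
SAMPLE = (
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n"
)


def load_adult(source):
    """Parse comma-separated census rows into a list of dicts keyed by column."""
    reader = csv.reader(source, skipinitialspace=True)
    return [dict(zip(COLUMNS, row)) for row in reader if row]


rows = load_adult(io.StringIO(SAMPLE))

# The last column is the target; the remaining 14 columns are the features.
features = [{k: v for k, v in r.items() if k != "income"} for r in rows]
target = [r["income"] for r in rows]
print(len(features[0]), target[0])
```

To read the real data, an open file handle can be passed to `load_adult` in place of the `io.StringIO` object.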
Data Analysis

An initial exploration of the dataset, such as finding the number of records and the number of individuals making more or less than 50K, shows how many individuals fall into each group.
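A first exploration of this kind might look like the sketch below. The labels are hypothetical values made up for illustration, not figures from the dataset, and `summarize_income` is an assumed helper name.

```python
from collections import Counter

# Hypothetical income labels for illustration only; not taken from the dataset.
incomes = ["<=50K", ">50K", "<=50K", "<=50K", ">50K"]


def summarize_income(labels):
    """Return total records, count per income group, and the share earning >50K."""
    counts = Counter(labels)
    total = len(labels)
    share_high = counts[">50K"] / total if total else 0.0
    return total, counts, share_high


total, counts, share_high = summarize_income(incomes)
print(f"records: {total}")
print(f"<=50K: {counts['<=50K']}, >50K: {counts['>50K']}")
print(f"share >50K: {share_high:.1%}")
```

The share earning more than 50K is worth checking early because a strongly imbalanced target affects how later accuracy figures should be read.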
Data Pre-processing

Data must be preprocessed before it can be used in machine learning algorithms. This preprocessing phase includes cleaning and preparing the data.

• Missing Values:
   ◦ Missing values, which were denoted by “?”, were removed.
   ◦ na.omit() was used to remove the affected rows.
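The slides name R's na.omit() for this step. A rough Python analogue, assuming records are held as dictionaries of strings, is to drop every record that contains the "?" marker. The rows and the `drop_missing` helper below are hypothetical and serve only as illustration.

```python
MISSING = "?"  # marker for missing values, as stated in the slides

# Hypothetical rows for illustration only; the second has a missing workclass.
rows = [
    {"age": "39", "workclass": "State-gov", "income": "<=50K"},
    {"age": "54", "workclass": "?", "income": ">50K"},
    {"age": "28", "workclass": "Private", "income": "<=50K"},
]


def drop_missing(records, marker=MISSING):
    """Keep only records in which no field equals the missing-value marker."""
    return [r for r in records if all(v.strip() != marker for v in r.values())]


clean = drop_missing(rows)
print(len(rows), "->", len(clean))
```

Dropping rows is the simplest option; it is reasonable when few records are affected, which should be confirmed by comparing the counts before and after.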
• Data Modification
   ◦ Removed less significant columns (“fnlwgt” and “education_num”).
   ◦ Data Binning: grouping multiple categories into a smaller number of bins.
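Both modifications can be sketched together. The slides do not say which categories were merged, so the education grouping below is purely hypothetical, as is the `modify` helper. Only the two dropped column names come from the slides.

```python
DROP = {"fnlwgt", "education_num"}  # columns the slides removed

# Hypothetical grouping of education levels; the slides do not give the bins.
EDUCATION_BINS = {
    "9th": "School", "10th": "School", "11th": "School", "12th": "School",
    "HS-grad": "HS-grad",
    "Bachelors": "Graduate",
    "Masters": "Postgraduate", "Doctorate": "Postgraduate",
}


def modify(record):
    """Drop the removed columns and replace education with its coarser bin."""
    out = {k: v for k, v in record.items() if k not in DROP}
    # Categories missing from the mapping fall into a catch-all bin.
    out["education"] = EDUCATION_BINS.get(out["education"], "Other")
    return out


row = {"age": "39", "fnlwgt": "77516", "education": "Bachelors",
       "education_num": "13", "income": "<=50K"}
print(modify(row))
```

An explicit mapping with a catch-all bin keeps the grouping easy to review and avoids failures on categories the mapping does not list.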

Normalization

• Some type of scaling is recommended for numerical features. Scaling changes the values of numeric columns in the dataset to a common scale without distorting differences in the ranges of values.
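The slides do not name the scaling method. One common choice that fits the description is min-max scaling, which maps each value x to (x - min) / (max - min). The sketch below assumes that method, and the input values are hypothetical.

```python
def min_max_scale(values):
    """Rescale numbers linearly to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Avoid division by zero when every value is identical.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical hours-per-week values for illustration only.
hours = [20, 40, 60]
print(min_max_scale(hours))
```

Because the mapping is linear, the relative spacing between values is preserved, and only the range changes.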
