
Prediction of Adult Income based on Census Data

Naitik Patel (20BCP238)


Vinit Premlani (20BCP243)
Bhavya Patel (20BCP229)

Under Guidance: Dr. Chintan Bhatt


Abstract
• Predicting adult income helps us understand our personal
value in the world.
• Data-mining-based forecasting techniques for analysing
people’s income data can help predict their income.
• We provide a comprehensive classification and comparison of
the techniques that have frequently been used for such
prediction and analysis models.
• Moreover, we highlight the challenges and future research
directions in this area that can be considered in order to develop
optimized solutions for prediction of income after graduation.
Introduction
The goal of this project is to predict whether an individual’s income
exceeds $50K or not using machine learning classification
algorithms, and also to find patterns in the dataset using
association rules. This helps us determine various things,
such as how lucrative it is to set up a business in a city based
on the average income of its people, real estate preferences, and
bank loan eligibility for a particular person. In addition, we
can also figure out what type of tourist places a particular
stratum of people would like to visit and whether a person’s
children would prefer a public or private college in future.
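Association rules of the kind mentioned above are usually scored by support and confidence. The following is a minimal sketch of those two measures, not the project's actual implementation. The toy records, the item strings, and the helper names `support` and `confidence` are all hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical toy records: each person is a set of "attribute=value" items.
# These rows are made up for illustration and are NOT taken from the census data.
records: List[FrozenSet[str]] = [
    frozenset({"education=Bachelors", "marital=Married", "income=>50K"}),
    frozenset({"education=Bachelors", "marital=Single", "income=<=50K"}),
    frozenset({"education=HS-grad", "marital=Married", "income=<=50K"}),
    frozenset({"education=Bachelors", "marital=Married", "income=>50K"}),
]


def support(itemset: Iterable[str], rows: List[FrozenSet[str]]) -> float:
    """Fraction of rows that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for row in rows if items <= row) / len(rows)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               rows: List[FrozenSet[str]]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(antecedent, rows)
    if base == 0:
        return 0.0  # the rule never fires, so its confidence is reported as 0
    return support(set(antecedent) | set(consequent), rows) / base


# Candidate rule: {Bachelors, Married} -> {income >50K}
rule_lhs = {"education=Bachelors", "marital=Married"}
rule_rhs = {"income=>50K"}

rule_support = support(rule_lhs | rule_rhs, records)
rule_confidence = confidence(rule_lhs, rule_rhs, records)
print(rule_support, rule_confidence)
```

A rule is typically kept only if both values exceed chosen thresholds; the thresholds themselves are a tuning decision not specified here.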
Literature Survey
• Developing a machine learning model, based on association rule mining,
to predict whether an individual’s income is above $50K or not.
• Keeping in mind parameters such as age, work class, fnlwgt,
education, education num, marital status, occupation, relationship,
race, sex, capital gain, capital loss, hours per week, native country
and income.
• This model uses various machine learning algorithms and neural
network concepts to predict whether income will be above $50K or
not.
Related Work
• There have not been many major works in the specific domain
of income prediction, but some Kaggle datasets are available,
and the models people have built on them were quite helpful to
us while making this project.
• Communities have been discussing this topic more and more in
recent times, and it has seen some major developments. With
the datasets being provided publicly, analysts can build their own
models and analyse the data independently.
Dataset Description
• An individual’s annual income results from various factors. Intuitively, it is
influenced by the individual’s education level, age, gender, occupation, etc.
• This is a widely cited KNN dataset. I encountered it during my course, and I wish to
share it here because it is a good starter example for data pre-processing and
machine learning practices.
• Fields
The dataset contains 16 columns
Target field: Income
-- The income is divided into two classes: <=50K and >50K
Number of attributes: 14
-- These are the demographic and other features that describe a person
• We can explore the possibility of predicting income level based on the individual’s
personal information.
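Since the target has two classes, a classifier needs it as a binary label. A minimal sketch of that encoding follows. The function name `encode_income` and the sample labels are hypothetical, and stripping whitespace is only a defensive assumption about how the raw file might be formatted.

```python
from collections import Counter
from typing import List


def encode_income(label: str) -> int:
    """Map the two income classes to 0 (<=50K) and 1 (>50K)."""
    cleaned = label.strip()  # assumption: raw labels may carry stray spaces
    if cleaned == ">50K":
        return 1
    if cleaned == "<=50K":
        return 0
    raise ValueError(f"unexpected income label: {label!r}")


# Hypothetical labels for illustration only, not real census rows.
sample_labels: List[str] = ["<=50K", ">50K", " <=50K", "<=50K"]

encoded = [encode_income(label) for label in sample_labels]
class_counts = Counter(encoded)
share_above_50k = class_counts[1] / len(encoded)
print(encoded, share_above_50k)
```

Checking the class share early is useful because an imbalanced target makes accuracy alone a misleading metric.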
Proposed Work
• Gathering the Data:
The first step in solving any machine learning challenge is to prepare the
data. For this problem, we use datasets from Kaggle. The dataset contains
qualitative data. We use a data-driven approach.

• Preprocessing the Data:
The data contains various null values and also the symptoms are in text
format. Hence, the symptoms are first replaced with their respective
weights in the severity dataset and null values are replaced with 0 (no
related symptoms).
Proposed Work (Cont.)
• Model Building:
After gathering and cleaning the data, the data is ready and can be used to train
a machine learning model. We use this cleaned data to train and evaluate several
models: logistic regression, association rules, CART algorithms, ensemble
learning, and neural networks, validated with K-Fold cross validation.
We compare these models based on their accuracy, precision, and recall
computed from the confusion matrix.

• Inference:
Once these models have been trained, we use them to predict the income
class of an individual.
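The comparison step described above (K-Fold splits, then accuracy, precision, and recall from the confusion matrix) can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the project's code: the helper names, the unshuffled contiguous folds, the choice of 3 folds, and the toy label vectors are all hypothetical, and 1 is assumed to encode the >50K class.

```python
from typing import List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), treating `positive` as the >50K class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy_precision_recall(tp: int, fp: int, fn: int,
                              tn: int) -> Tuple[float, float, float]:
    """Compute the three comparison metrics, returning 0.0 on empty denominators."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous, unshuffled (train, test) pairs."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


# Hypothetical true labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
metrics = accuracy_precision_recall(*counts)
folds = k_fold_indices(len(y_true), k=3)
print(counts, metrics, [test for _, test in folds])
```

In practice each model would be trained on every fold's training indices and scored on its test indices, with the metrics averaged across folds before the models are compared.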
Conclusion
• Finally, from the dataset we predict whether a person makes over
$50K a year or not.
• Find patterns in the dataset
• K-Fold cross validation
• In this study, systematic efforts are made to design a system that
predicts adult income
