S.No.  Enroll. No.  Name of the student  Branch  Mobile No.  E-mail address
1
2      02215002816  Karan Sehgal         E.C.E   8826608475  Karansehagl151998@gmail.com
3      04215002816  Rishabh Rawat        E.C.E   8750644313  rishabhrawat570@gmail.com
● Preprocessing of data.
The roadblocks encountered during the project, and the workarounds
adopted for them, are explained below.
❖ Missing values in the dataset
➢ Some important values were found to be missing while
working on the dataset, most notably in the passengers'
age column.
➢ The first workaround was to fill every NULL age with the
average age of all passengers. This did not give good
results, and the overall average proved to be an unreliable
estimate.
➢ After considering various solutions, a better approach was
implemented: the average age was calculated separately for
every category of passenger (kids, women, men, young boys
and young girls), as identified by the salutation in the
passenger's name, with all salutations present in the data
included. Each NULL age was then filled with the average
for that passenger's category.
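The category-wise fill described above can be sketched as follows. This is a minimal illustration in plain Python, not the project's actual code: the sample rows, the `name` and `age` keys, the "Surname, Title. Given name" layout and the helper names `extract_title` and `fill_age_by_title` are all assumptions made for the example.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows (not taken from the project's dataset).
# Names are assumed to follow the "Surname, Title. Given name" layout.
passengers = [
    {"name": "Braund, Mr. Owen", "age": 22.0},
    {"name": "Allen, Mr. William", "age": 36.0},
    {"name": "Moran, Mr. James", "age": None},
    {"name": "Heikkinen, Miss. Laina", "age": 26.0},
    {"name": "Sandstrom, Miss. Marguerite", "age": 4.0},
    {"name": "O'Dwyer, Miss. Ellen", "age": None},
    {"name": "Palsson, Master. Gosta", "age": 2.0},
    {"name": "Rice, Master. Eugene", "age": None},
]

# Captures the text between the comma and the first full stop.
TITLE_PATTERN = re.compile(r",\s*([^.]+)\.")


def extract_title(name):
    """Return the salutation in a name, or "Unknown" if none is found."""
    match = TITLE_PATTERN.search(name)
    return match.group(1).strip() if match else "Unknown"


def fill_age_by_title(rows):
    """Return copies of rows with each missing age replaced by the
    mean age of passengers sharing the same salutation."""
    ages_by_title = defaultdict(list)
    for row in rows:
        if row["age"] is not None:
            ages_by_title[extract_title(row["name"])].append(row["age"])

    title_mean = {title: mean(ages) for title, ages in ages_by_title.items()}

    # Fallback for a salutation that has no known ages at all.
    known_ages = [age for ages in ages_by_title.values() for age in ages]
    overall_mean = mean(known_ages) if known_ages else None

    filled = []
    for row in rows:
        new_row = dict(row)
        if new_row["age"] is None:
            title = extract_title(new_row["name"])
            new_row["age"] = title_mean.get(title, overall_mean)
        filled.append(new_row)
    return filled


filled = fill_age_by_title(passengers)
```

Falling back to the overall mean for a salutation with no known ages is a design choice of this sketch; the report does not say how such cases were handled.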
❖ Incompatible chunks of data
➢ Many rows in the dataset had more than 90% of their values
NULL or missing, and they did not contribute to the
accuracy of the prediction.
➢ The initial goal was to fill those missing/NULL values, but
after some evaluation it became clear that dropping those
rows was the better option.
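Dropping rows that are more than 90% empty could look like the sketch below. It is only an illustration under stated assumptions: rows are modelled as dictionaries, `None` and the empty string are treated as missing, and the column list, the sample rows and the helper names `missing_fraction` and `drop_sparse_rows` are hypothetical.

```python
# Assumed column layout; only some of these names appear in the report.
COLUMNS = ["passengerId", "survived", "pclass", "name", "sex", "age",
           "sibsp", "parch", "ticket", "fare", "cabin", "embarked"]


def make_row(**known):
    """Build a row in which every column not given is missing (None)."""
    row = dict.fromkeys(COLUMNS)
    row.update(known)
    return row


def missing_fraction(row):
    """Fraction of a row's values that are None or an empty string."""
    values = list(row.values())
    if not values:
        return 1.0
    missing = sum(1 for value in values if value is None or value == "")
    return missing / len(values)


def drop_sparse_rows(rows, threshold=0.9):
    """Keep only rows whose missing fraction does not exceed threshold."""
    return [row for row in rows if missing_fraction(row) <= threshold]


# Hypothetical sample rows.
rows = [
    # Only the cabin is missing: 1 of 12 values.
    make_row(passengerId=1, survived=0, pclass=3, name="Braund, Mr. Owen",
             sex="male", age=22.0, sibsp=1, parch=0, ticket="T-001",
             fare=7.25, embarked="S"),
    # 11 of 12 values missing, which is more than 90%.
    make_row(passengerId=2),
    # Every value missing.
    make_row(),
]

kept = drop_sparse_rows(rows)
```

With twelve columns, a row is dropped only when at least eleven of its values are missing, so ordinary rows with one or two gaps are retained for the fill step.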
❖ Redundant information in the dataset
➢ Many columns in the dataset made no contribution to the
end result.
➢ Dropping those columns was the best way to reduce the
computational overhead. The columns dropped were:
● Cabin,
● name,
● survived,
● passengerId, and
● ticket
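Removing the listed columns can be sketched as below, again in plain Python with rows as dictionaries. The sample rows and the helper names are hypothetical. One assumption goes beyond the report: the survived column is copied into a separate list of labels before it is removed from the input features, on the reasoning that a prediction model would still need it as its target.

```python
# Columns named in the list above, lower-cased for case-insensitive matching.
DROPPED_COLUMNS = {"cabin", "name", "survived", "passengerid", "ticket"}


def extract_labels(rows, label_column="survived"):
    """Copy the label column into its own list (an assumption of this
    sketch; the report only says the column was dropped)."""
    return [row[label_column] for row in rows]


def drop_columns(rows, dropped=DROPPED_COLUMNS):
    """Return copies of rows without the columns named in dropped."""
    return [
        {key: value for key, value in row.items()
         if key.lower() not in dropped}
        for row in rows
    ]


# Hypothetical sample rows.
rows = [
    {"passengerId": 1, "survived": 0, "pclass": 3,
     "name": "Braund, Mr. Owen", "sex": "male", "age": 22.0,
     "ticket": "T-001", "Cabin": None},
    {"passengerId": 2, "survived": 1, "pclass": 1,
     "name": "Heikkinen, Miss. Laina", "sex": "female", "age": 26.0,
     "ticket": "T-002", "Cabin": "C85"},
]

labels = extract_labels(rows)
features = drop_columns(rows)
```

After this step each feature row holds only pclass, sex and age, while the survival labels remain available separately.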
Percentage of work completed till date: …………………………..
Evaluation Criteria
S.No.  Enroll. No.  Name of the student  Branch  Regularity (02)  Progress of work done (06)  Timely submission of progress report (02)  Total Marks (10)
1
2
3