
Titanic Survival

USING MACHINE LEARNING


The mission
Predict the survival of passengers on the Titanic using:
• Passenger ID
• Survived
• Passenger Class
• Name
• Sex
• Age
• Number of siblings / spouses
• Number of parents / children
• Ticket number
• Fare
• Cabin Number
• Port of Embarkation
• Survival
Data inspection
REDUCING DIMENSIONALITY

• Removed features not contributing to the story


• Removed Cabin feature

DATA CLEANUP

• Age imputed with median value (28)


• Imputed fare with median of each passenger class
• Removed Capt. Crosby
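The two imputation steps above can be sketched with pandas. This is a minimal illustration rather than the original notebook: the column names `Pclass`, `Age` and `Fare` are assumed to follow the usual Kaggle CSV layout, the helper `impute_age_and_fare` is hypothetical, and the toy frame is made up.

```python
import pandas as pd


def impute_age_and_fare(df: pd.DataFrame) -> pd.DataFrame:
    """Fill missing Age with the overall median and missing Fare with the
    median fare of the passenger's class (hypothetical helper)."""
    out = df.copy()
    # Overall median age; the slides report 28 for the real data.
    out["Age"] = out["Age"].fillna(out["Age"].median())
    # Per-class median fare, aligned row by row with the original frame.
    class_median_fare = out.groupby("Pclass")["Fare"].transform("median")
    out["Fare"] = out["Fare"].fillna(class_median_fare)
    return out


# Tiny made-up frame for illustration only (not the real Titanic data).
toy = pd.DataFrame({
    "Pclass": [1, 1, 3, 3, 3],
    "Age": [40.0, None, 20.0, 28.0, None],
    "Fare": [80.0, 60.0, 8.0, None, 10.0],
})
cleaned = impute_age_and_fare(toy)
```

Using `transform("median")` keeps the result the same length as the input, so it can be passed straight to `fillna` without a merge.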
Age distribution
Fare distribution
Feature engineering

from sklearn.preprocessing import QuantileTransformer

QuantileTransformer(n_quantiles=300,
                    output_distribution='normal')

Port embarkation one-hot encoded
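As a rough sketch of how these two steps might fit together: the slides do not say which columns were passed through the quantile transform, so applying it to `Fare` is an assumption, as are the column names and the synthetic data. One-hot encoding is shown with `pandas.get_dummies`, which may differ from the encoder actually used.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import QuantileTransformer

# Synthetic, skewed stand-in data (assumption: not the real Titanic values).
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "Fare": rng.lognormal(mean=3.0, sigma=1.0, size=500),
    "Embarked": rng.choice(["C", "Q", "S"], size=500),
})

# Map the skewed feature onto an approximately normal distribution,
# using the parameters quoted on the slide.
qt = QuantileTransformer(n_quantiles=300, output_distribution="normal",
                         random_state=0)
toy["Fare_qt"] = qt.fit_transform(toy[["Fare"]]).ravel()

# One indicator column per port of embarkation.
features = pd.get_dummies(toy, columns=["Embarked"], prefix="Embarked")
```

Note that `n_quantiles` cannot exceed the number of samples; scikit-learn lowers it with a warning when it does, which is why the toy frame has 500 rows.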


Model Selection
CLASSIFIER MODELS ACCURACY

Model                          Accuracy   Score
Logistic Regression            94.5%      0.80
Naive Bayes                    86.8%      0.78
Linear Discriminant Analysis   96.4%      0.80
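A comparison of these three classifier families could be set up roughly as follows. This is a sketch under stated assumptions: the data is synthetic, `GaussianNB` stands in for whichever Naive Bayes variant was used, and 5-fold cross-validated accuracy is only one possible way of scoring, so the numbers it prints are unrelated to those on the slide.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# Synthetic binary classification problem (assumption: stand-in for the
# prepared Titanic feature matrix).
X, y = make_classification(n_samples=400, n_features=8, n_informative=4,
                           random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "Linear Discriminant Analysis": LinearDiscriminantAnalysis(),
}

# Mean accuracy over 5 cross-validation folds for each model.
mean_accuracy = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, acc in mean_accuracy.items():
    print(f"{name}: {acc:.3f}")
```

Scoring every model on the same folds keeps the comparison like for like; a held-out test set would serve the same purpose.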
Next time out
1. Natural language processing

a. Investigate names/titles

b. Determine siblings/parent relations

2. Destructure passenger class feature

3. Refine model

a. Investigate data transformations

b. Reduce dimensionality
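For item 1a, a simple starting point might be a regular expression rather than full natural language processing. The sketch below assumes names follow the "Surname, Title. Given names" layout of the usual Kaggle CSV; `extract_title` and `TITLE_PATTERN` are hypothetical names introduced here.

```python
import re
from typing import Optional

# Capture the text between the first comma and the next full stop,
# e.g. "Mr" in "Braund, Mr. Owen Harris" (assumed name layout).
TITLE_PATTERN = re.compile(r",\s*([^.,]+)\.")


def extract_title(name: str) -> Optional[str]:
    """Return the honorific from a passenger name, or None if absent."""
    match = TITLE_PATTERN.search(name)
    return match.group(1).strip() if match else None


examples = ["Braund, Mr. Owen Harris", "Heikkinen, Miss. Laina"]
titles = [extract_title(n) for n in examples]
```

The extracted titles could then be grouped into a small set of categories and one-hot encoded like the port of embarkation.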
