This document summarizes an analysis of Titanic passenger survival data using machine learning models. Key steps included:
1. Cleaning the data by removing unused features such as Cabin and imputing missing Age and Fare values.
2. Engineering features by transforming variables with QuantileTransformer and one-hot encoding Port.
3. Evaluating classifier models such as Logistic Regression, Naive Bayes, and Linear Discriminant Analysis, and finding that Logistic Regression and Linear Discriminant Analysis achieved the best accuracy, at around 94.5-96.4%.
4. Proposing next steps like using NLP on names/titles, deconstructing passenger class, and refining the model through
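As a rough illustration of the model comparison in step 3, the sketch below scores the three classifier families with cross-validated accuracy. It is not the original analysis: the data is a synthetic stand-in produced by `make_classification`, the Gaussian variant of Naive Bayes is an assumption (the summary does not say which variant was used), and the helper name `compare_models` is hypothetical. The scores it prints say nothing about the Titanic results quoted above.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for the prepared feature matrix (assumption, not Titanic data).
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5, random_state=0
)

# The three model families named in the summary; GaussianNB is an assumed variant.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "Linear Discriminant Analysis": LinearDiscriminantAnalysis(),
}


def compare_models(models, X, y, cv=5):
    """Return the mean cross-validated accuracy for each named model."""
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
        for name, model in models.items()
    }


scores = compare_models(models, X, y)
# Print models from highest to lowest mean accuracy.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

Cross-validation is used here instead of a single train/test split so that the comparison does not hinge on one particular partition of the data; the original deck may have evaluated differently.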
The mission
Predict the survival of passengers on the Titanic using:
• Passenger ID
• Survived
• Passenger Class
• Name
• Sex
• Age
• Number of siblings / spouses
• Number of parents
• Ticket number
• Fare
• Cabin Number
• Port of Embarkation
• Survival
Data inspection
REDUCING DIMENSIONALITY
• Removed features not contributing to the story
• Removed Cabin feature
DATA CLEANUP
• Age imputed with median value (28)
• Imputed fare with median of each passenger class
• Removed Capt. Crosby
[Figures: Age distribution, Fare distribution]
Feature engineering
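The cleanup steps listed above, together with the QuantileTransformer and one-hot encoding steps named in the summary, can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the original notebook: the column names follow the public Kaggle Titanic file (`Pclass`, `Age`, `Fare`, `Embarked`, `Cabin`), the tiny data frame is made up for illustration, the function names `clean` and `engineer` are hypothetical, and the choice of `output_distribution="normal"` is a guess because the slides do not state the transformer settings. Removing an individual passenger row is not shown.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import QuantileTransformer

# Made-up rows using the Kaggle column names (assumption, not real passenger data).
df = pd.DataFrame({
    "Pclass": [1, 1, 2, 2, 3, 3],
    "Age": [38.0, np.nan, 30.0, 26.0, np.nan, 22.0],
    "Fare": [80.0, 60.0, np.nan, 14.0, 8.0, 7.0],
    "Embarked": ["C", "S", "S", "Q", "S", "C"],
    "Cabin": ["C85", None, None, None, None, None],
})


def clean(df):
    """Drop Cabin, impute Age with the overall median and Fare with the per-class median."""
    out = df.drop(columns=["Cabin"])
    out["Age"] = out["Age"].fillna(out["Age"].median())
    # transform("median") broadcasts each class median back to the rows of that class.
    class_median_fare = out.groupby("Pclass")["Fare"].transform("median")
    out["Fare"] = out["Fare"].fillna(class_median_fare)
    return out


def engineer(df):
    """Quantile-transform Age and Fare, then one-hot encode the port of embarkation."""
    out = df.copy()
    qt = QuantileTransformer(
        n_quantiles=min(len(out), 100),  # n_quantiles should not exceed the sample count
        output_distribution="normal",    # assumed setting; the default is "uniform"
        random_state=0,
    )
    out[["Age", "Fare"]] = qt.fit_transform(out[["Age", "Fare"]])
    return pd.get_dummies(out, columns=["Embarked"], prefix="Port")


cleaned = clean(df)
features = engineer(cleaned)
print(features.columns.tolist())
```

Keeping cleanup and feature engineering in separate functions makes it easy to inspect the imputed values before they are rescaled by the quantile transform.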