0% found this document useful (0 votes)

65 views5 pages

Document 26 1

Bom

Uploaded by

abdullahfarooq849

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

65 views5 pages

Document 26 1

Bom

Uploaded by

abdullahfarooq849

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

Titanic Survival Prediction Using

Machine Learning Algorithms

Group Leader

Institution

University of Management and Technology (UMT)

Date

June 1, 2025

Abstract

This exploration investigates predicting passenger survival on the Titanic

using multiple machine learning algorithms. The Titanic dataset from Kaggle,
which includes features such as age, sex, class, and fare, was used to build
and test six different classification models: Logistic Regression, Decision
Trees, Random Forest, K-Nearest Neighbors (KNN), Support Vector Classifier
(SVC), and XGBoost. Among these models, Logistic Regression achieved the
highest accuracy at 80.76%. The methodology used in this paper is
explained, and the performance of each model is analyzed and compared.

1. Introduction

The Titanic dataset presents an opportunity to explore binary classification:

predicting whether a passenger survived based on their socio-demographic
characteristics. This type of classification problem is common in various
fields, ranging from healthcare to business. Machine learning algorithms are
frequently applied to the Titanic dataset to predict survival probabilities
based on features like age, gender, class, and family size. This study
evaluates the performance of six machine learning algorithms on the Titanic
dataset and compares their accuracy and effectiveness.

2. Methodology

Dataset

The Kaggle Titanic dataset includes the following features:

 PassengerId: A unique identifier for each passenger.

 Pclass: Passenger class (1st, 2nd, or 3rd).
 Sex: Gender (male/female).
 Age: Age of the passenger.
 SibSp: Number of siblings/spouses aboard.
 Parch: Number of parents/children aboard.
 Fare: Fare paid by the passenger.
 Embarked: Embarkation port (C = Cherbourg, Q = Queenstown, S =
Southampton).
 Survived: Target variable (1 = survived, 0 = did not survive).

Data Preprocessing

 Missing Data Handling: Missing values were handled by filling the

"Age" column with its median value and the "Embarked" column with
the most frequent value.
 Encoding Features: Categorical features such as "Sex" and
"Embarked" were encoded using Label Encoding and One-Hot
Encoding, respectively.
 Scaling: Numerical features (e.g., Age, Fare, SibSp) were scaled using
StandardScaler to standardize their range.

Model Selection

The following machine learning models were used:

1. Logistic Regression: A simple classifier for binary classification.

2. Decision Tree Classifier: A tree-based classifier that splits the
dataset based on feature values.
3. Random Forest Classifier: An ensemble method consisting of
multiple decision trees.
4. K-Nearest Neighbors (KNN): A non-parametric method based on
distance measures.
5. Support Vector Classifier (SVC): A classifier that maximizes the
margin between different classes.
6. XGBoost Classifier: A gradient-boosting ensemble classifier known
for its high performance.

The dataset was split into training (80%) and testing (20%) sets, and
accuracy was used as the evaluation metric.

3. Results

The performance of each model was measured in terms of accuracy:

Accuracy
Model
(%)
Logistic Regression 80.76
Decision Tree Classifier 79.33
Random Forest Classifier 79.33
K-Nearest Neighbors
75.12
(KNN)
Support Vector Classifier
72.45
(SVC)
XGBoost Classifier 79.50

4. Model Performance Analysis

 Logistic Regression: This model achieved the highest accuracy

(80.76%) and was the most interpretable. Its simplicity made it the
best choice for this task.
 Decision Tree and Random Forest: These models showed similar
performance (79.33%), but were more prone to overfitting.
 KNN and SVC: Both models struggled with scaling and distance
measures, leading to relatively poor performance.
 XGBoost: While usually a high-performance model, it was
outperformed by Logistic Regression in this case, possibly due to the
smaller dataset and nature of the features.

5. Comparison and Discussion

 Comparison of Models: Logistic Regression emerged as the best

performer with an accuracy of 80.76%. This suggests that the
relationship between the features and the target variable is relatively
straightforward. Decision Trees and Random Forests, while able to
capture non-linear relationships, were susceptible to overfitting.
XGBoost, a typically high-accuracy algorithm, did not outperform
Logistic Regression here, likely due to the dataset's size and feature
set. KNN and SVC performed poorly due to their sensitivity to feature
scaling and data complexity.
 Feature Importance: The most important features for predicting
survival were "Pclass," "Sex," and "Age." This supports the hypothesis
that wealthier passengers and women had a higher chance of survival.
 Model Complexity: Simpler models like Logistic Regression
performed well on this relatively small dataset, while more complex
models like Random Forest and XGBoost may require more data to
truly shine.

6. Summary

This study aimed to predict Titanic passenger survival using machine

learning models. Logistic Regression was found to be the best-performing
model with an accuracy of 80.76%, outperforming more complex models
such as Random Forest and XGBoost. While more advanced models hold
promise, Logistic Regression’s simplicity and interpretability make it the
most practical solution for this task. Future work may explore
hyperparameter tuning, feature engineering, and advanced models like deep
learning to further improve accuracy.
References

1. Kaggle Titanic Dataset. Accessible at: Kaggle Titanic Data

2. Alvarez, D., & Gomez, M. (2018). Applying predictive models to the
Titanic dataset. Journal of Machine Learning and Data Science.
3. Zhang, Y., & Zheng, Y. (2020). Machine learning for Titanic survival
prediction. International Journal of Data Science and Machine Learning.
4. Kaggle Titanic Competition Participants (2015). Predicting Titanic
survival with machine learning models. Kaggle Titanic Dataset.

MCA - Project Documentation Guidelines 2024-2025
No ratings yet
MCA - Project Documentation Guidelines 2024-2025
26 pages
Iml Project
No ratings yet
Iml Project
13 pages
Titanic Passenger Survival Prediction Analysis
No ratings yet
Titanic Passenger Survival Prediction Analysis
8 pages
Indraneel S (RA2211003010421)
No ratings yet
Indraneel S (RA2211003010421)
21 pages
Titanic Survival Prediction with ML
No ratings yet
Titanic Survival Prediction with ML
13 pages
LP3 - ML Mini-Project Report Format Shreeyas
No ratings yet
LP3 - ML Mini-Project Report Format Shreeyas
13 pages
Titanic Survival Prediction Using ML
No ratings yet
Titanic Survival Prediction Using ML
24 pages
Technical Assignment II
No ratings yet
Technical Assignment II
4 pages
Using Titanic Dataset For Comprehensive Machine Learning Model Training
No ratings yet
Using Titanic Dataset For Comprehensive Machine Learning Model Training
3 pages
Exploratory Data Analysis of Titanic Survival Prediction Using Machine Learning Techniques
No ratings yet
Exploratory Data Analysis of Titanic Survival Prediction Using Machine Learning Techniques
5 pages
Titanic Survival Prediction Using ML Miniproject
No ratings yet
Titanic Survival Prediction Using ML Miniproject
21 pages
Titanic Survival Prediction with ML
No ratings yet
Titanic Survival Prediction with ML
7 pages
Titanic
No ratings yet
Titanic
3 pages
Titanic ML Survival Prediction Analysis
No ratings yet
Titanic ML Survival Prediction Analysis
3 pages
Titanic Survival Prediction Project
No ratings yet
Titanic Survival Prediction Project
5 pages
Titanic Documentation-1722102624939
No ratings yet
Titanic Documentation-1722102624939
34 pages
Titanic Survival Prediction Using Machine Learning
No ratings yet
Titanic Survival Prediction Using Machine Learning
34 pages
Mini Project ml111
No ratings yet
Mini Project ml111
2 pages
ML Mini Project 2
No ratings yet
ML Mini Project 2
26 pages
? Titanic Survival Prediction Project
No ratings yet
? Titanic Survival Prediction Project
3 pages
Predictive Modeling of Titanic Survivors
No ratings yet
Predictive Modeling of Titanic Survivors
12 pages
Titanic Survival Prediction - Step-by-Step Guide
No ratings yet
Titanic Survival Prediction - Step-by-Step Guide
4 pages
Titanic Survival Prediction with ML
No ratings yet
Titanic Survival Prediction with ML
6 pages
Titanic Survival Analysis with ML Models
No ratings yet
Titanic Survival Analysis with ML Models
11 pages
ML - Other Pracs
No ratings yet
ML - Other Pracs
7 pages
A Comparative Study On Machine Learning Techniques Using Titanic Dataset
No ratings yet
A Comparative Study On Machine Learning Techniques Using Titanic Dataset
6 pages
Titanic Data Analysis Project
No ratings yet
Titanic Data Analysis Project
14 pages
Machine Learning for Titanic Survival Analysis
100% (1)
Machine Learning for Titanic Survival Analysis
62 pages
Titanic Survival Prediction Using ML
No ratings yet
Titanic Survival Prediction Using ML
7 pages
Titanic PuneethRegonda
No ratings yet
Titanic PuneethRegonda
8 pages
Report TSP
No ratings yet
Report TSP
13 pages
ML Mini Project
No ratings yet
ML Mini Project
17 pages
Titanic ML Review Paper
No ratings yet
Titanic ML Review Paper
2 pages
ML WebApp for Titanic Survival Prediction
No ratings yet
ML WebApp for Titanic Survival Prediction
2 pages
Titanic Survival Prediction with ML
No ratings yet
Titanic Survival Prediction with ML
5 pages
Titanic Survival Prediction with ML
No ratings yet
Titanic Survival Prediction with ML
7 pages
Titanic Survival Prediction Model Insights
No ratings yet
Titanic Survival Prediction Model Insights
20 pages
ML Mini Project - Docx New (A)
No ratings yet
ML Mini Project - Docx New (A)
17 pages
Assaignment 2 2
No ratings yet
Assaignment 2 2
7 pages
Titanic Survival Prediction Model
No ratings yet
Titanic Survival Prediction Model
1 page
Random Forest Algorithm - Titanic Dataset
No ratings yet
Random Forest Algorithm - Titanic Dataset
12 pages
Titanic Survival Prediction Analysis
No ratings yet
Titanic Survival Prediction Analysis
18 pages
Titanic Survival Prediction Model
No ratings yet
Titanic Survival Prediction Model
20 pages
Titanic Survival Prediction Assignment
No ratings yet
Titanic Survival Prediction Assignment
3 pages
Titanic Classification Project
No ratings yet
Titanic Classification Project
17 pages
Cruise Traveler Survival Prediction
No ratings yet
Cruise Traveler Survival Prediction
10 pages
Titanic
No ratings yet
Titanic
6 pages
Titanic & Airline ML Analysis Guide
No ratings yet
Titanic & Airline ML Analysis Guide
3 pages
Titanic ML Kaggle
No ratings yet
Titanic ML Kaggle
3 pages
Titanic Survival Prediction with ML
No ratings yet
Titanic Survival Prediction with ML
14 pages
Titanic Survival Prediction Model Report
No ratings yet
Titanic Survival Prediction Model Report
3 pages
Titanic Disaster Logistic Regression Analysis
No ratings yet
Titanic Disaster Logistic Regression Analysis
35 pages
Predictive Modeling: Titanic & Housing Datasets
No ratings yet
Predictive Modeling: Titanic & Housing Datasets
13 pages
Titanic Classification Project
No ratings yet
Titanic Classification Project
17 pages
Titanic Survival Prediction with Decision Trees
No ratings yet
Titanic Survival Prediction with Decision Trees
1 page
Data Science Insights on Titanic
No ratings yet
Data Science Insights on Titanic
24 pages
Titanic Survival Prediction with ML
No ratings yet
Titanic Survival Prediction with ML
20 pages
Submitted To The Savitribai Phule Pune University, Pune FOR
No ratings yet
Submitted To The Savitribai Phule Pune University, Pune FOR
4 pages
Titanic Survival Prediction
100% (1)
Titanic Survival Prediction
14 pages
CH04 - Starting Out With Python
No ratings yet
CH04 - Starting Out With Python
49 pages
Clang Analyzer Guide v0.1
No ratings yet
Clang Analyzer Guide v0.1
64 pages
Software Engineering Activities Overview
No ratings yet
Software Engineering Activities Overview
14 pages
Inheritance and Polymorphism in Java
No ratings yet
Inheritance and Polymorphism in Java
39 pages
Clarifye Zeiss 2021
No ratings yet
Clarifye Zeiss 2021
9 pages
Java Lab Manual (Bcs306a) - Cse (1) - 1
No ratings yet
Java Lab Manual (Bcs306a) - Cse (1) - 1
22 pages
Summer Training Report
No ratings yet
Summer Training Report
49 pages
Online Food Ordering System Proposal
No ratings yet
Online Food Ordering System Proposal
8 pages
Introduction To Python Programming
No ratings yet
Introduction To Python Programming
2 pages
Advanced Series WALLPAPER CUSTOMES
No ratings yet
Advanced Series WALLPAPER CUSTOMES
9 pages
Embedded Web Technology-1
100% (1)
Embedded Web Technology-1
16 pages
Os Lab Viva Questions
No ratings yet
Os Lab Viva Questions
3 pages
Java Hash Tables: Basics and Examples
No ratings yet
Java Hash Tables: Basics and Examples
34 pages
Effective and Efficient Software Testing
No ratings yet
Effective and Efficient Software Testing
5 pages
C Programming Quiz: Data Types & Outputs
No ratings yet
C Programming Quiz: Data Types & Outputs
12 pages
1-D Array Operations and Linked List Programs
No ratings yet
1-D Array Operations and Linked List Programs
99 pages
Outlook PST Repair Guide
No ratings yet
Outlook PST Repair Guide
12 pages
C++ Object-Oriented Programming Concepts
No ratings yet
C++ Object-Oriented Programming Concepts
4 pages
Hive Workshop Practical
No ratings yet
Hive Workshop Practical
29 pages
CM72L1eL - M8 - A1 Rename PC With CM 7.2 Installed Up To SR4 Guidance Doc - 201410 PDF
No ratings yet
CM72L1eL - M8 - A1 Rename PC With CM 7.2 Installed Up To SR4 Guidance Doc - 201410 PDF
19 pages
Python Relational and Conditional Operators
No ratings yet
Python Relational and Conditional Operators
11 pages
Lab 05 Analyzing Types of Attacks and Mitigation Techniques
No ratings yet
Lab 05 Analyzing Types of Attacks and Mitigation Techniques
8 pages
Full Stack Python/React Developer Resume
No ratings yet
Full Stack Python/React Developer Resume
3 pages
Sap Change Management
No ratings yet
Sap Change Management
62 pages
Microsoft Excel - Wikipedia
No ratings yet
Microsoft Excel - Wikipedia
28 pages
Computer Skills Syllabus 2023-2024
No ratings yet
Computer Skills Syllabus 2023-2024
1 page
Java Programming Model Answers
No ratings yet
Java Programming Model Answers
5 pages
Music Recommendation System Powerpoint Presentation
100% (4)
Music Recommendation System Powerpoint Presentation
16 pages
Periodic Table of DevOps Tools V3
No ratings yet
Periodic Table of DevOps Tools V3
1 page
ICS 2302 Software Engineering Exam Guide
No ratings yet
ICS 2302 Software Engineering Exam Guide
3 pages

Document 26 1

Uploaded by

Document 26 1

Uploaded by

Titanic Survival Prediction Using

Machine Learning Algorithms

University of Management and Technology (UMT)

This exploration investigates predicting passenger survival on the Titanic

The Titanic dataset presents an opportunity to explore binary classification:

The Kaggle Titanic dataset includes the following features:

 PassengerId: A unique identifier for each passenger.

 Missing Data Handling: Missing values were handled by filling the

The following machine learning models were used:

1. Logistic Regression: A simple classifier for binary classification.

The performance of each model was measured in terms of accuracy:

4. Model Performance Analysis

 Logistic Regression: This model achieved the highest accuracy

5. Comparison and Discussion

 Comparison of Models: Logistic Regression emerged as the best

This study aimed to predict Titanic passenger survival using machine

1. Kaggle Titanic Dataset. Accessible at: Kaggle Titanic Data

You might also like