Welcome to Scribd!

Titanic in Class Competition

Uploaded by

0% found this document useful (0 votes)

7 views2 pages

The document discusses a Titanic passenger dataset that has been split into a training set and test set. The training set includes the outcome (survival) for each passenger and is to be used to build machine learning models based on features like gender, class, age, etc. The test set does not include the outcomes and is to be used to predict survival for each passenger. Issues to consider are data balance, missing data, and using voting/averaging in predictions.

Original Description:

Copyright

Available Formats

DOCX, PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as DOCX, PDF, TXT or read online from Scribd

Flag for inappropriate content

0% found this document useful (0 votes)

7 views2 pages

Titanic in Class Competition

Uploaded by

Gifted David Odesola

Copyright:

Available Formats

Download as DOCX, PDF, TXT or read online from Scribd

Flag for inappropriate content

Jump to Page

You are on page 1of 2

Search inside document

Titanic in Class Completion (ticc)

The data has been split into two groups: training set (ticc_train.csv) and test set (ticc_test.csv).

The training set should be used to build your machine learning models. For the training set, the
outcome (also known as the “ground truth”) is provided for each passenger. Your model will be based
on “features” like passengers’ gender and class. You can also use feature engineering to create new
features (see page 21 of the “Python Machine Learning by Example” book).

The test set should be used to see how well your model performs on unseen data. For the test set, you
are not provided with the ground truth for each passenger. It is your job to predict these outcomes. For
each passenger in the test set, use the models you trained to predict whether they survived the sinking
of the Titanic.

Variable Definition Key

survival Survival 0 = No, 1 = Yes
pclass Ticket class 1 = 1st, 2 = 2nd, 3 = 3rd
sex Sex
Age Age in years
# of siblings / spouses aboard
sibsp
the Titanic
# of parents / children aboard
parch
the Titanic
ticket Ticket number
fare Passenger fare
cabin Cabin number
C = Cherbourg,
embarked Port of Embarkation Q = Queenstown,
S = Southampton

Variable Notes

pclass: A proxy for socio-economic status (SES)

1st = Upper; 2nd = Middle; 3rd = Lower

age: Age is fractional if less than 1. If the age is estimated, is it in the form of xx.5

sibsp: The dataset defines family relations in this way...

Sibling = brother, sister, stepbrother, stepsister
Spouse = husband, wife (mistresses and fiancés were ignored)

parch: The dataset defines family relations in this way...

Parent = mother, father
Child = daughter, son, stepdaughter, stepson
Some children travelled only with a nanny, therefore parch=0 for them.
Issues to thing about (and maybe deal with)

Is the data set balanced? Data that is overwhelmed by one particular sub class is often ineffectual
when training a predictive system. What could be done in such situations?

How are you going to deal with missing data? (See page 22 of the “Python Machine Learning by
Example” book.)

What part might voting and averaging play in your final prediction? (See page 27 of the “Python
Machine Learning by Example” book.)

Titanic
Document2 pages
Titanic
ajms_1973
No ratings yet
Neural Network Project
Document4 pages
Neural Network Project
ayten55zoweil
No ratings yet
Titanic-The Challenge
Document15 pages
Titanic-The Challenge
Bruno Reis
No ratings yet
Titanic Dataset Variables
Document3 pages
Titanic Dataset Variables
Javier Mansilla
No ratings yet
University Institute of Engineering Department of Computer Science & Engineering
Document18 pages
University Institute of Engineering Department of Computer Science & Engineering
Ankit SiNgh
No ratings yet
Penny Genetics How Well Does A Punnett Square Predict The Actual Ratios
Document3 pages
Penny Genetics How Well Does A Punnett Square Predict The Actual Ratios
nata
No ratings yet
Titanic
Document1 page
Titanic
carloslariogomez
No ratings yet
Probstat 09242023
Document5 pages
Probstat 09242023
Nathan Aduca
No ratings yet
What Is Cross-Entropy?: 1 Answer
Document3 pages
What Is Cross-Entropy?: 1 Answer
MD. ABIR HASAN 1704018
No ratings yet
Bays Classifier (Machine Learning)
Document16 pages
Bays Classifier (Machine Learning)
Suman Kundu
No ratings yet
Work For Sept. 23/24 Starts Here:: Unit 2: Probability
Document52 pages
Work For Sept. 23/24 Starts Here:: Unit 2: Probability
Riddhiman Pal
No ratings yet
CH 10
Document30 pages
CH 10
rodkov
No ratings yet
Stat273 - Chapter 03 (S) - Part#1
Document10 pages
Stat273 - Chapter 03 (S) - Part#1
Husain Ali
No ratings yet
The Preschooler’s Handbook: ABC’s, Numbers, Colors, Shapes, Matching, School, Manners, Potty and Jobs, with 300 Words that every Kid should Know
From Everand
The Preschooler’s Handbook: ABC’s, Numbers, Colors, Shapes, Matching, School, Manners, Potty and Jobs, with 300 Words that every Kid should Know
Dayna Martin
No ratings yet
Lecture 13
Document4 pages
Lecture 13
api-3702538
No ratings yet
Gambling and Probability
Document10 pages
Gambling and Probability
Leonardo Pisano
No ratings yet
Mathematics s4 Sample Offline Learning Basic Probability RL
Document18 pages
Mathematics s4 Sample Offline Learning Basic Probability RL
dhngoc179
No ratings yet
Logic
Document55 pages
Logic
yacob
No ratings yet
Business Analytics Assignment
Document6 pages
Business Analytics Assignment
bilal bashir
No ratings yet
Biostat Formulas
Document3 pages
Biostat Formulas
Pat Bongon
No ratings yet
C2 - Data Mining - Van - DUE - 2021 - P2
Document20 pages
C2 - Data Mining - Van - DUE - 2021 - P2
Quý Thanh
No ratings yet
Illustration of Events, Union and Intersection of Events: Presented By: Belery C. Flaviano
Document23 pages
Illustration of Events, Union and Intersection of Events: Presented By: Belery C. Flaviano
Ellen Rose Olbe
No ratings yet
Talambuhay
Document1 page
Talambuhay
mikaycorpuz31
No ratings yet
LBYEC3C Project 1 Data Processing
Document4 pages
LBYEC3C Project 1 Data Processing
Jedrek Dy
No ratings yet
Titanic DS Callenge
Document24 pages
Titanic DS Callenge
ARCHANA R
No ratings yet
6 Basic Steps To Completing Single Cross
Document3 pages
6 Basic Steps To Completing Single Cross
gilliane
No ratings yet
ARTIC /K/ & /G/ Game Boards: No Prep
Document29 pages
ARTIC /K/ & /G/ Game Boards: No Prep
Оля о жизни в Корее Olga my life in Korea
No ratings yet
W1-3 Q3 Math-8B Simple-Probability
Document6 pages
W1-3 Q3 Math-8B Simple-Probability
Shanelle Santillana
No ratings yet
Probability Laws, Demorgans Laws: Chris Morgan, Math G160 Csmorgan@Purdue - Edu January 11, 2012
Document24 pages
Probability Laws, Demorgans Laws: Chris Morgan, Math G160 Csmorgan@Purdue - Edu January 11, 2012
Anima Anwar
No ratings yet
Chapter 3
Document59 pages
Chapter 3
MATHIS DO CAO
No ratings yet
CSE 103: Discrete Mathematics: Predicate Rules of Inference
Document31 pages
CSE 103: Discrete Mathematics: Predicate Rules of Inference
Abu OUbaida
No ratings yet
Computational Genome Analysis: Lecture-4
Document60 pages
Computational Genome Analysis: Lecture-4
Swayam
No ratings yet
Trigonometric Identities
Document10 pages
Trigonometric Identities
lekz re
No ratings yet
Mea 319 Titanic Project
Document26 pages
Mea 319 Titanic Project
api-518559206
No ratings yet
L03 Logic
Document26 pages
L03 Logic
shomik1610
No ratings yet
Chapter 1: Stochastic Processes: STATS 310 Statistics STATS 210 Foundations of Statistics and Probability
Document12 pages
Chapter 1: Stochastic Processes: STATS 310 Statistics STATS 210 Foundations of Statistics and Probability
Caden Lee
No ratings yet
Format - ExpeLabReport (Repaired)
Document21 pages
Format - ExpeLabReport (Repaired)
Aly Magano
No ratings yet
Mathematics 8 4th: of Ways The Event Can Occur (Favorable Outcomes)
Document5 pages
Mathematics 8 4th: of Ways The Event Can Occur (Favorable Outcomes)
jeff
No ratings yet
Permutation vs. Combination
Document8 pages
Permutation vs. Combination
Queenie Austria
No ratings yet
Example Lesson From Music Last Year
Document3 pages
Example Lesson From Music Last Year
api-557208367
No ratings yet
The Kindergartener’s Handbook: ABC’s, Vowels, Math, Shapes, Colors, Time, Senses, Rhymes, Science, and Chores, with 300 Words that every Kid should Know (Engage Early Readers: Children's Learning Books)
From Everand
The Kindergartener’s Handbook: ABC’s, Vowels, Math, Shapes, Colors, Time, Senses, Rhymes, Science, and Chores, with 300 Words that every Kid should Know (Engage Early Readers: Children's Learning Books)
Dayna Martin
No ratings yet
Weeks 6 9 Puzzles and Riddles ARG 1
Document1 page
Weeks 6 9 Puzzles and Riddles ARG 1
MEDINA Lawrence Jassen F.
No ratings yet
Chapter 3 - Stories Categorical Data Tell
Document15 pages
Chapter 3 - Stories Categorical Data Tell
Bryant Bachelor
No ratings yet
SM - Sheet 4
Document5 pages
SM - Sheet 4
lovely_mhmd
No ratings yet
Gann Ttta-2
Document48 pages
Gann Ttta-2
sUPRAKASH
75% (4)
Chapter # 4 Exhaustive Events
Document5 pages
Chapter # 4 Exhaustive Events
MUHAMMAD BILAL
No ratings yet
4-3 Right Triangle Trigonometry: Pre-Calculus
Document38 pages
4-3 Right Triangle Trigonometry: Pre-Calculus
Mary Rose David
No ratings yet
Practice Test - Get Ready
Document5 pages
Practice Test - Get Ready
HaThanh Nguyen
No ratings yet
Assignment 1 - TITANIC
Document6 pages
Assignment 1 - TITANIC
MIGUEL GARCIA DE JULIAN
No ratings yet
ATD Exercicio Excel
Document13 pages
ATD Exercicio Excel
Teresa Guarda
No ratings yet
Week1 Random Experiment
Document23 pages
Week1 Random Experiment
Juliano Medellu
No ratings yet
Continuous Time Fourier Transform
Document7 pages
Continuous Time Fourier Transform
Onur Can Ervakit
No ratings yet
HW 2
Document3 pages
HW 2
Tarik Adnan Moon
No ratings yet
Lab: The Chi-Square Test
Document13 pages
Lab: The Chi-Square Test
Joey Ma
100% (1)
Permutation and Combination
Document30 pages
Permutation and Combination
Dreiza Patria Sunodan
No ratings yet
Assignment
Document10 pages
Assignment
shubham9nag
No ratings yet
Pippa and Pop Scope and Sequence
Document6 pages
Pippa and Pop Scope and Sequence
Anabel Mariel Gelsi
No ratings yet
Lesson 17 - Listening
Document18 pages
Lesson 17 - Listening
Tuấn Khải Vương
No ratings yet
Penny Genetics
Document3 pages
Penny Genetics
Dzulfiqar
0% (2)
Are We There Yet?: A Participatory Play for Teens
From Everand
Are We There Yet?: A Participatory Play for Teens
Jane Heather
No ratings yet
Chapter 4 Notes Hall Aud Cis
Document3 pages
Chapter 4 Notes Hall Aud Cis
ziahnepostreli
No ratings yet
SUMERA - Kmeans Clustering - Jupyter Notebook
Document7 pages
SUMERA - Kmeans Clustering - Jupyter Notebook
Alleah Laylo
No ratings yet
Algorithms in ML
Document15 pages
Algorithms in ML
nami
No ratings yet
College Professior Emails
Document155 pages
College Professior Emails
NaveenJain
No ratings yet
Data Science 2015
Document229 pages
Data Science 2015
hubner janampa
No ratings yet
JCR - Cmc-Comput Mater Con - 2021
Document23 pages
JCR - Cmc-Comput Mater Con - 2021
emadnabilcs
No ratings yet
Whats App
Document23 pages
Whats App
Râjä Sékhãr
No ratings yet
Fundamentals of Artificial Intelligence - Spring 2022 Lab 1: Expert Systems
Document2 pages
Fundamentals of Artificial Intelligence - Spring 2022 Lab 1: Expert Systems
Mihai Filipescu
No ratings yet
Yaqoob Et Al. - 2016 - Big Data From Beginning To Future
Document19 pages
Yaqoob Et Al. - 2016 - Big Data From Beginning To Future
Bruno Oliveira
No ratings yet
Research On Complaint Operation Management System Based On Digital Transformation
Document6 pages
Research On Complaint Operation Management System Based On Digital Transformation
Mattew Olawumi
No ratings yet
Software Engineering: UNIT-1
Document37 pages
Software Engineering: UNIT-1
Jayavarapu Karthik J
No ratings yet
Linked Data Structures and Parse Trees: ISTA 350 Hw2, Due Thursday 9/16/2021 at 11:59 PM
Document4 pages
Linked Data Structures and Parse Trees: ISTA 350 Hw2, Due Thursday 9/16/2021 at 11:59 PM
mahnoor
No ratings yet
QP-Computer Science-12-Practice Paper-1
Document8 pages
QP-Computer Science-12-Practice Paper-1
subashreebaski16
No ratings yet
Vector Data Model (GIS)
Document34 pages
Vector Data Model (GIS)
Arden Lo
No ratings yet
Lab Book: F. Y. B. B. A. (C.A.) Semester I
Document47 pages
Lab Book: F. Y. B. B. A. (C.A.) Semester I
3427 Sohana Joglekar
No ratings yet
Comparing ML Algorithms On Financial Fraud Detection N
Document5 pages
Comparing ML Algorithms On Financial Fraud Detection N
Prashant Chafle
No ratings yet
Ebook Applied Machine Learning PDF Full Chapter PDF
Document67 pages
Ebook Applied Machine Learning PDF Full Chapter PDF
michael.ortiz653
100% (23)
Investigating Reproducibility and Tracking Provena
Document15 pages
Investigating Reproducibility and Tracking Provena
Ahmad Solihin
No ratings yet
U3 - C8 - T2 - Peer-To-Peer Middleware
Document9 pages
U3 - C8 - T2 - Peer-To-Peer Middleware
Ashutosh Srivastava
No ratings yet
Annotation Imprtant
Document5 pages
Annotation Imprtant
hicham zmaimita
No ratings yet
Caie A2 Level: Computer SCIENCE (9618)
Document16 pages
Caie A2 Level: Computer SCIENCE (9618)
Zane soh
No ratings yet
2012 Lekkerkerk MSC Integrating-Data
Document154 pages
2012 Lekkerkerk MSC Integrating-Data
Huibert-Jan Lekkerkerk
No ratings yet
1 s2.0 S0952197622002081 Main
Document3 pages
1 s2.0 S0952197622002081 Main
manuelcq2
No ratings yet
Resume Parser and Summarizer
Document6 pages
Resume Parser and Summarizer
IJARSCT Journal
No ratings yet
A Review Paper On Cryptography
Document5 pages
A Review Paper On Cryptography
cikechukwujohn
No ratings yet
Chapter 3-Part II
Document26 pages
Chapter 3-Part II
Abebaw Gebre
100% (1)
Challenges
Document9 pages
Challenges
N Latha Reddy
No ratings yet
The History of Geographic Information Systems (Gis) .
Document11 pages
The History of Geographic Information Systems (Gis) .
Shabbir Ahmad
No ratings yet
Possible Thesis Title For Computer Science Students
Document4 pages
Possible Thesis Title For Computer Science Students
donnakuhnsbellevue
No ratings yet
Ashish (Riya) - DATA4000 Introduction To Business Analytics - Edited
Document13 pages
Ashish (Riya) - DATA4000 Introduction To Business Analytics - Edited
Jiya Bajaj
No ratings yet