DATA ANALYTICS 01

MORE MODELING
With the preprocessing operators we have discussed so far, you can blend and prepare most data
sets for building a predictive model. In this lecture, we will use one of the most widely used
machine learning methods, the Decision Tree, to predict who would survive the sinking of the
Titanic. Of course, there is nothing you can do about this now, after the ship has sunk, but you
can still use such a model in similar situations and make predictions then. Should you really buy a
third-class ticket when traveling with your family? The model will show!

RETRIEVE THE TITANIC DATA.

1. Drag the Titanic data into your process.

2. Add Set Role, connect it, and configure it as you did in the previous lecture: change the
role of the attribute Survived to label.

Note: Remember that the attribute with the role label is the one you want to predict. Setting the
label is important because machine learning methods like the decision tree algorithm use
existing data with known label values (a training set) to find hidden patterns. The resulting
model then applies those patterns to new data whose labels are unknown or withheld (the
testing set) to create predictions.
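To make the role of the label concrete outside RapidMiner, here is a minimal Python sketch. The six passenger records and the helper name `split_label` are invented for illustration; they are not the Titanic data that ships with RapidMiner. The sketch only shows the idea of separating the label from the other attributes and of keeping some rows aside as a testing set.

```python
# Hypothetical mini data set; all values are made up for illustration.
passengers = [
    {"Sex": "Female", "Passenger Class": "First", "Survived": "Yes"},
    {"Sex": "Male", "Passenger Class": "Third", "Survived": "No"},
    {"Sex": "Female", "Passenger Class": "Second", "Survived": "Yes"},
    {"Sex": "Male", "Passenger Class": "Second", "Survived": "No"},
    {"Sex": "Female", "Passenger Class": "Third", "Survived": "Yes"},
    {"Sex": "Male", "Passenger Class": "First", "Survived": "No"},
]

LABEL = "Survived"  # the attribute we want to predict


def split_label(rows, label):
    """Return (features, labels), where features no longer contain the label."""
    features = [{k: v for k, v in row.items() if k != label} for row in rows]
    labels = [row[label] for row in rows]
    return features, labels


# The first four rows act as the training set, the remaining two as the testing set.
train_rows, test_rows = passengers[:4], passengers[4:]
X_train, y_train = split_label(train_rows, LABEL)
X_test, y_test = split_label(test_rows, LABEL)

print("training attributes:", sorted(X_train[0]))
print("training labels:", y_train)
print("testing labels (hidden from the model):", y_test)
```

A learner would only see `X_train` together with `y_train`; the labels in `y_test` are held back and compared with the predictions afterwards.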
REMOVE UNNECESSARY ATTRIBUTES.

1. Add Select Attributes to the process and connect it.

2. Set attribute filter type to subset and click Select Attributes.

3. In the resulting dialog, select Survived, Sex, Passenger Class, Passenger Fare, and the
No of... attributes for parents, children, siblings, and spouses.

Note: You removed (that is, did not select) Life Boat because passengers who made it onto a
lifeboat are very likely survivors. Including this attribute would lead to a trivial model that
depends almost entirely on this one piece of information. The real question is: who made it onto
a lifeboat in the first place? Name and ticket number are just different kinds of ID, so you left
them out as well.
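The effect described in the note can be illustrated with a small Python sketch. The rows below are invented (they are not the real Titanic records), a missing lifeboat is represented by `None`, and `leak_rule` and `KEEP` are hypothetical names. The sketch shows that a one-line rule on Life Boat already "predicts" the label on this toy data, and how a subset of attributes can be kept.

```python
# Hypothetical rows; None means no lifeboat was recorded for that passenger.
rows = [
    {"Name": "A", "Sex": "Female", "Passenger Class": "First", "Life Boat": "2", "Survived": "Yes"},
    {"Name": "B", "Sex": "Male", "Passenger Class": "Third", "Life Boat": None, "Survived": "No"},
    {"Name": "C", "Sex": "Male", "Passenger Class": "Second", "Life Boat": None, "Survived": "No"},
    {"Name": "D", "Sex": "Female", "Passenger Class": "Third", "Life Boat": "C", "Survived": "Yes"},
    {"Name": "E", "Sex": "Male", "Passenger Class": "First", "Life Boat": "5", "Survived": "Yes"},
    {"Name": "F", "Sex": "Female", "Passenger Class": "Third", "Life Boat": None, "Survived": "No"},
]


def leak_rule(row):
    """Trivial 'model': anyone with a recorded lifeboat is predicted to survive."""
    return "Yes" if row["Life Boat"] is not None else "No"


# On this toy data the rule is always right, which is exactly why the
# attribute is uninformative for the question we actually care about.
accuracy = sum(leak_rule(r) == r["Survived"] for r in rows) / len(rows)
print("accuracy of the lifeboat rule:", accuracy)

# Keep only a subset of attributes, dropping the leaking column and the ID.
KEEP = ["Survived", "Sex", "Passenger Class"]
selected = [{k: r[k] for k in KEEP} for r in rows]
print(selected[0])
```

The `KEEP` list here is shorter than the selection in the steps above only because the toy rows have fewer columns.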
BUILD A DECISION TREE MODEL.

1. Drag in the Decision Tree operator, connect the input, and connect the "mod" output
port to the results port.

Note that the data connections are blue while the model connections are green. This
makes it easy to find and verify the correct connection ports.

Dr. Stephan Kupsch

2. Run the process.

3. Inspect the decision tree model.
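For readers who want to see what a decision tree learner does in principle, here is a minimal pure-Python sketch. It is a simplification under stated assumptions: it uses Gini impurity and binary "attribute equals value" tests, which is not necessarily what RapidMiner's Decision Tree operator does, and the eight rows are invented. All function names (`gini`, `best_split`, `build_tree`, `predict`, `depth`) are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means all labels agree)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, label):
    """Return the (attribute, value) test that lowers impurity most, or None."""
    best, best_score = None, gini([r[label] for r in rows])
    for attr in rows[0]:
        if attr == label:
            continue
        for value in sorted({r[attr] for r in rows}):
            left = [r[label] for r in rows if r[attr] == value]
            right = [r[label] for r in rows if r[attr] != value]
            if not left or not right:
                continue  # the test does not separate anything
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (attr, value), score
    return best


def build_tree(rows, label, max_depth, depth=0):
    """Grow a tree recursively; a leaf is simply the majority label."""
    majority = Counter(r[label] for r in rows).most_common(1)[0][0]
    split = best_split(rows, label) if depth < max_depth else None
    if split is None:
        return majority
    attr, value = split
    yes = [r for r in rows if r[attr] == value]
    no = [r for r in rows if r[attr] != value]
    return {"test": split,
            "yes": build_tree(yes, label, max_depth, depth + 1),
            "no": build_tree(no, label, max_depth, depth + 1)}


def predict(tree, row):
    """Follow the tests from the root down to a leaf."""
    while isinstance(tree, dict):
        attr, value = tree["test"]
        tree = tree["yes"] if row[attr] == value else tree["no"]
    return tree


def depth(tree):
    """Number of tests on the longest path from the root to a leaf."""
    if not isinstance(tree, dict):
        return 0
    return 1 + max(depth(tree["yes"]), depth(tree["no"]))


# Hypothetical training rows, invented for illustration.
rows = [
    {"Sex": "Female", "Passenger Class": "First", "Survived": "Yes"},
    {"Sex": "Female", "Passenger Class": "Second", "Survived": "Yes"},
    {"Sex": "Female", "Passenger Class": "Third", "Survived": "No"},
    {"Sex": "Male", "Passenger Class": "First", "Survived": "Yes"},
    {"Sex": "Male", "Passenger Class": "Second", "Survived": "No"},
    {"Sex": "Male", "Passenger Class": "Third", "Survived": "No"},
    {"Sex": "Male", "Passenger Class": "Third", "Survived": "No"},
    {"Sex": "Female", "Passenger Class": "First", "Survived": "Yes"},
]

full = build_tree(rows, "Survived", max_depth=3)
shallow = build_tree(rows, "Survived", max_depth=1)
print("full tree:", full)
print("depths:", depth(full), depth(shallow))
```

The `max_depth` argument caps how many tests may be stacked on top of each other, so a smaller value yields a smaller tree that fits the training rows less exactly.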

Note: Interestingly, for women, family size matters more than passenger class. This
pattern could not be detected for men. In general, men were less likely to
survive ("women and children first!").

By now you have learned how to use the most common data preprocessing operators
and have even built your first predictive model in RapidMiner. This is an exciting moment - celebrate!

TASKS:

• Can you find out how to restrict the depth of the decision tree, i.e. reduce its complexity?
Why could this be a good idea?
• Limit the depth of the decision tree to 4. Use the parameter setting you found above.
• Re-execute the process and look at the reduced tree. It should now have a depth of only 4.
The width of each colored bar in the tree represents how many passengers fall into
this bucket. Can you figure out which group was the largest group of survivors and hence had
the highest likelihood of surviving?
• What would you say was the rough probability of survival for this group? How does this
compare to the survival probability for men?
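When answering the last two tasks, it can help to keep two numbers apart: how many survivors a group contains, and what share of the group survived. The following Python sketch computes both on invented data (not the real Titanic figures); the helper names `make` and `survival_by_group` are hypothetical.

```python
from collections import defaultdict


def make(sex, pclass, outcomes):
    """Build one hypothetical row per outcome for the given group."""
    return [{"Sex": sex, "Passenger Class": pclass, "Survived": o} for o in outcomes]


# Invented toy data: three groups of different size.
rows = (make("Female", "First", ["Yes", "Yes"])
        + make("Female", "Third", ["Yes", "Yes", "Yes", "No", "No"])
        + make("Male", "Third", ["No", "No", "Yes"]))


def survival_by_group(rows, key):
    """Count survivors and passengers per group and derive the survival rate."""
    counts = defaultdict(lambda: [0, 0])  # group -> [survivors, total]
    for row in rows:
        group = key(row)
        counts[group][0] += row["Survived"] == "Yes"
        counts[group][1] += 1
    return {g: {"survivors": s, "total": t, "rate": s / t}
            for g, (s, t) in counts.items()}


stats = survival_by_group(rows, key=lambda r: (r["Sex"], r["Passenger Class"]))
for group, s in stats.items():
    print(group, s)

# The group with the most survivors and the group with the highest
# survival rate are computed separately.
most_survivors = max(stats, key=lambda g: stats[g]["survivors"])
highest_rate = max(stats, key=lambda g: stats[g]["rate"])
print("most survivors:", most_survivors)
print("highest rate:", highest_rate)
```

In the tree view, the bar width corresponds to the `total` column here, while the rough survival probability of a bucket corresponds to `rate`.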

