02 Machine-Learning-V-Statistics

Machine learning

Brian Caffo, Jeff Leek, Roger Peng

@bcaffo
www.bcaffo.com
Machine learning is a set of algorithms that can take a set of inputs (data) and return a prediction.
Non-exhaustive list of ML activities:
● Unsupervised learning
● Supervised learning
Unsupervised learning: trying to uncover unobserved factors. Examples: clustering, mixture models, principal components.
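As a minimal sketch of the clustering idea, the following runs a one-dimensional k-means (Lloyd's algorithm) on unlabeled numbers. The data, the starting centers, and the function name `kmeans_1d` are made up for illustration and are not from the slides.

```python
def kmeans_1d(points, centers, iterations=10):
    """Lloyd's algorithm in one dimension: assign points, then re-average."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster.
        # An empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical data with no labels: two groups are visible, but unobserved.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centers, clusters = kmeans_1d(points, centers=[0.0, 6.0])
print(centers)
print(clusters)
```

No outcome is supplied anywhere: the grouping is the "unobserved factor" the algorithm tries to recover from the inputs alone.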
Supervised learning: using a collection of predictors and some observed outcomes to build an algorithm to predict the outcome when it is not observed. Examples: random forests, boosting, SVMs.
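The supervised setup can be sketched with a much simpler method than the ones named above: a one-nearest-neighbor rule, swapped in here only because it fits in a few lines. The predictors, outcome labels, and function name are hypothetical.

```python
def nearest_neighbor_predict(train_x, train_y, new_x):
    """Predict the outcome of new_x as the outcome of its closest training point."""
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    best = min(range(len(train_x)), key=lambda i: squared_distance(train_x[i], new_x))
    return train_y[best]


# Hypothetical training data: predictors with observed outcomes.
train_x = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5)]
train_y = ["short", "short", "long", "long"]

# A new case whose outcome has not been observed.
prediction = nearest_neighbor_predict(train_x, train_y, (5.5, 5.0))
print(prediction)
```

The pattern is the same for random forests, boosting, or SVMs: fit on cases where the outcome is known, then predict it for cases where it is not.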
Contrasting machine learning and traditional statistical analyses

Machine learning
● Emphasizes predictions
● Evaluates results via prediction performance
● Concern for overfitting but not model complexity per se
● Emphasis on performance
● Generalizability is obtained through performance on novel datasets
● Usually no superpopulation model specified
● Concern over performance and robustness

Traditional statistical analyses
● Emphasizes superpopulation inference
● Focuses on a-priori hypotheses
● Simpler models preferred over complex ones (parsimony), even if the more complex models perform slightly better
● Emphasis on parameter interpretability
● Statistical modeling or sampling assumptions connect data to a population of interest
● Concern over assumptions and robustness
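One way to see the contrast is to fit the same straight line and then ask the two different questions of it. This is a rough sketch on made-up numbers: the statistical view reports the slope with a standard error (which relies on the usual linear-model assumptions), while the machine learning view reports error on data the fit never saw. All names and values are assumptions for illustration.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, se_b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    a = mean_y - b * mean_x
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    return a, b, se_b


def mean_squared_error(a, b, x, y):
    """Average squared prediction error of the fitted line on (x, y)."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / len(x)


# Hypothetical data, roughly following y = 2x + 1.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]
test_x = [6, 7]            # novel data, held out from fitting
test_y = [13.2, 14.8]

a, b, se_b = fit_line(train_x, train_y)

# Statistical view: interpret the parameter and its uncertainty.
slope_interval = (b - 2 * se_b, b + 2 * se_b)  # rough +/- 2 SE band

# Machine learning view: judge the fit by performance on novel data.
test_mse = mean_squared_error(a, b, test_x, test_y)

print(b, slope_interval)
print(test_mse)
```

The model is identical in both cases; what differs is which number counts as the result.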
Examples

Machine learning
● build an automated movie recommender system
● success - anything that produces reliable recommendations

Statistical analysis
● build a parsimonious and interpretable model to better understand why people choose the movies that they do
● success - anything true learned about movie choices
Examples

Machine learning
● build an automated system for predicting hospital stays from previous claims
● success - anything that produces reliable predictions

Statistical analysis
● build a parsimonious and interpretable model to better understand why people stay in the hospital longer
● success - anything true learned about hospital stays
Google Flu Trends

Machine learning
● build an automated system for predicting flu outbreaks
● success - anything that produces reliable predictions

Statistical analysis
● build a parsimonious and interpretable model to better understand flu outbreaks
● success - anything true learned about the flu
Lessons learned

● Both approaches are extremely valuable and have their place

● Amount of tolerable model/algorithm complexity changes dramatically between the approaches
● Goals of the approaches are very different
○ Caveat: there’s a fair amount of work on making machine learning algorithms more interpretable and work on producing better predictions from more traditional approaches
○ (Meeting in the middle?)
Further reading

“In this paper I will argue that the focus in the statistical community on data models has:
● Led to irrelevant theory and questionable scientific conclusions;
● Kept statisticians from using more suitable algorithmic models;
● Prevented statisticians from working on exciting new problems;”
(Leo Breiman, “Statistical Modeling: The Two Cultures”, Statistical Science, 2001)
Further reading

“… so that the apparent superiority of more sophisticated methods may be something of an illusion. In particular, simple methods typically yield performance almost as good as more sophisticated methods, to the extent that the difference in performance may be swamped by other sources of uncertainty that generally are not considered in the classical supervised classification paradigm.”
(David J. Hand, “Classifier Technology and the Illusion of Progress”, Statistical Science, 2006)
