
INTRODUCTION TO MACHINE LEARNING

LINEAR AND LOGISTIC REGRESSION + HANDS-ON WITH AZURE ML


LEARNING OBJECTIVES

• Understand linear & logistic regression

• Be able to perform logistic regression with AzureML

• Understand classification performance metrics


LINEAR REGRESSION
LINEAR REGRESSION: UNIVARIATE

We want to predict a continuous variable


LINEAR REGRESSION
We want to predict a continuous variable

Objective: Minimize the Mean Squared Error (MSE):

MSE = (1/n) · Σᵢ (yᵢ − ŷᵢ)²

where yᵢ is the observed value and ŷᵢ is the predicted value.
LINEAR REGRESSION FORMULA
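For reference, the formula (assumed here in its standard multivariate form, matching the beta notation in the note below):

ŷ = β₀ + β₁·x₁ + β₂·x₂ + … + βₖ·xₖ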

“Training” a model means finding the parameters (in this case the beta values) that minimize the error (MSE for linear regression)
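As a minimal sketch of what this training step does (not part of the course's Azure ML hands-on; assumes numpy), the betas that minimize the MSE can be found with ordinary least squares:

import numpy as np

rng = np.random.default_rng(42)
x = rng.uniform(0, 10, 100)
y = 3.0 + 2.0 * x + rng.normal(0, 1.0, 100)    # true betas: beta0=3, beta1=2, plus noise

X = np.column_stack([np.ones_like(x), x])      # prepend a column of 1s for beta0
betas, *_ = np.linalg.lstsq(X, y, rcond=None)  # minimizes ||X·betas − y||², i.e. the MSE
y_hat = X @ betas

print("betas:", betas)                         # close to [3.0, 2.0]
print("MSE:", np.mean((y - y_hat) ** 2))       # close to the noise variance (1.0)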
LINEAR REGRESSION METRICS
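The metric definitions on these slides are assumed here to be the standard ones:

MSE  = (1/n) · Σᵢ (yᵢ − ŷᵢ)²              (mean squared error)
RMSE = √MSE                               (root mean squared error, in the units of y)
MAE  = (1/n) · Σᵢ |yᵢ − ŷᵢ|               (mean absolute error)
R²   = 1 − Σᵢ (yᵢ − ŷᵢ)² / Σᵢ (yᵢ − ȳ)²   (fraction of variance explained; 1 is a perfect fit)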
LOGISTIC REGRESSION

We want to predict a binary (0/1) variable.

Modeling y directly as a linear combination, y = β₀ + β₁·x₁ + … + βₖ·xₖ, doesn't work because the right side is unconstrained (-infinity to +infinity) while y can only be 0 or 1.

Instead, we will predict a probability p and model its log-odds:

Ln(p / (1 − p)) = β₀ + β₁·x₁ + … + βₖ·xₖ

This works because the left side is unconstrained too (-infinity to +infinity): as p goes from 0 to 1, Ln(p / (1 − p)) covers the whole range.
LOGISTIC REGRESSION FORMULA
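Solving the log-odds equation above for p gives the standard logistic regression formula (assumed to match the slide's graphic):

p = 1 / (1 + e^−(β₀ + β₁·x₁ + … + βₖ·xₖ))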
LOGISTIC REGRESSION – LOSS FUNCTION

• We predict a probability "p" of the outcome "y" being equal to 1

• We penalize being wrong

• The cost function is:

  • −Ln(p) when y=1 (penalty high if p close to 0). Since 0 <= p <= 1, the log is a negative number; that is why we put the "−" sign in front, to make the cost function positive.
  • −Ln(1 − p) when y=0 (penalty high if p close to 1)

• The algorithm seeks to minimize the cost function
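The two cases combine into a single expression, the standard log-loss (cross-entropy):

Cost(p, y) = −[ y·Ln(p) + (1 − y)·Ln(1 − p) ]

When y=1 the second term vanishes, leaving −Ln(p); when y=0 the first term vanishes, leaving −Ln(1 − p).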


IMPORTANT PROPERTIES OF LINEAR MODELS

Applies to both linear regression and logistic regression:

1. As the name implies ("linear"), each input feature has a linear and monotonic impact on the output.
   => These models cannot, on their own, handle non-linear relationships.

2. Each input feature has, in the end, a separate and distinct impact on the output.
   => These models do not, on their own, take into account the relationships between input variables.

3. Linear and logistic regressions can be VERY sensitive to outliers (see link); the sketch below illustrates this.

Applies only to linear regression:

• The output of a linear regression is not bounded (it can go from -infinity to +infinity).
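As a minimal sketch of property 3 (not from the slides; assumes numpy and scikit-learn), a single extreme point is enough to visibly pull the fitted slope:

# Fit a line to clean data, then corrupt one point and refit.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = np.linspace(0, 10, 50).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0 + rng.normal(0, 0.5, 50)    # true slope: 2

clean_slope = LinearRegression().fit(X, y).coef_[0]

y_out = y.copy()
y_out[-1] = 200.0                                     # one extreme outlier
outlier_slope = LinearRegression().fit(X, y_out).coef_[0]

print(f"slope on clean data:         {clean_slope:.2f}")    # close to 2.0
print(f"slope with a single outlier: {outlier_slope:.2f}")  # pulled well away from 2.0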
HOW TO DEAL WITH CATEGORICAL VARIABLES?
One-hot encoding (also known as creating dummy variables)
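A minimal sketch (in pandas rather than Azure ML, which the hands-on uses) of what one-hot encoding produces:

import pandas as pd

df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

# One 0/1 indicator ("dummy") column per category
dummies = pd.get_dummies(df["color"], prefix="color", dtype=int)
print(dummies)
#    color_blue  color_green  color_red
# 0           0            0          1
# 1           0            1          0
# 2           1            0          0
# 3           0            1          0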
DO IT WITH AZURE ML!
CLASSIFICATION PERFORMANCE METRICS: AUC
The ROC curve is built by sweeping the classification threshold from 0.0 to 1.0 and plotting, at each threshold:

TPR = [True Positives] / [Observed Positives]
FPR = [False Positives] / [Observed Negatives]

At Threshold=0.0 every observation is predicted positive (TPR = FPR = 1); at Threshold=1.0 none is (TPR = FPR = 0).

AUC = Area Under the Curve (the ROC curve)

[Figure: confusion matrices (Predicted 1/0 vs. Observed 1/0) at Threshold=0.0 and Threshold=1.0, and the resulting ROC curve]
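A minimal sketch (assumes scikit-learn) of computing the ROC curve and its AUC from predicted probabilities:

from sklearn.metrics import roc_curve, roc_auc_score

y_true  = [0, 0, 1, 1, 0, 1, 1, 0]                   # observed 0/1 labels
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.3]  # predicted probabilities

# roc_curve sweeps the threshold and returns the curve's (FPR, TPR) points
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print("AUC =", roc_auc_score(y_true, y_score))       # 0.9375 here; 1.0 = perfect, 0.5 = random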
CLASSIFICATION PERFORMANCE METRICS: RECALL, PRECISION, F1
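The definitions (assumed here to be the standard ones this slide covers), with TP/FP/FN = true positives, false positives, false negatives:

Precision = TP / (TP + FP)   (of the predicted positives, how many are right)
Recall    = TP / (TP + FN)   (of the observed positives, how many are found)
F1        = 2 · (Precision · Recall) / (Precision + Recall)   (harmonic mean of the two)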
OVERFIT, HOLD-OUT DATA AND CROSS-VALIDATION

Overfit is when a model begins to "memorize" the training data rather than "learning" to generalize from a trend.

One way to measure overfit is to train on one dataset but assess model performance on a different dataset, called the testing or hold-out dataset => that's why we SPLIT the data into train and test.

A slightly more sophisticated version is to use cross-validation, which splits an initial dataset into pairs of (train, test) folds (a minimal code sketch follows below).

In general, for a fixed-size dataset, the more variables you have, the higher the risk of overfit => if it doesn't hurt performance, fewer variables is better.
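A minimal sketch (assumes scikit-learn; the hands-on itself uses Azure ML) of both ideas:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# 1. Hold-out: fit on the train split, assess on the unseen test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("hold-out accuracy:", model.score(X_test, y_test))

# 2. Cross-validation: 5 (train, test) folds, one score per fold
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("accuracy per fold:", scores)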
