
Generalized Linear Models

1. Regression
2. Why GLM?
3. What is GLM?
4. Family of distributions
5. Modelling with GLM

By mathXplorers 1​
Lesson 01: Regression

Regression is a statistical technique that models the relationship between a dependent variable and one or more independent variables.

Regression is crucial for prediction, forecasting, and understanding the direction and strength of relationships between variables.
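As a minimal sketch of the idea, the simplest case (one independent variable, fitted by least squares) can be written in a few lines of Python. The function name and the toy data are hypothetical and chosen only for illustration.

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) minimising the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = sum of cross-deviations / sum of squared deviations of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical toy data lying exactly on the line y = 1 + 2x
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_simple_linear_regression(x, y)
```

The sign of the slope gives the direction of the relationship, and its size gives the change in the dependent variable per unit change in the independent variable.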

Lesson 02: Why GLM?

Diverse Distributions: GLMs are designed to handle a wide range of probability distributions from the exponential family, including normal, binomial, Poisson, and more.

Handling a Wide Range of Data Types: They can model binary (yes/no) responses, count data, and continuous data.

Non-Linear Relationships: Unlike traditional linear models, GLMs allow the mean of the response to depend non-linearly on the predictors. A GLM still assumes linearity on the scale of the link function: the transformed mean is a linear function of the predictors.

Lesson 03: What is GLM?

The Probability Distribution:

A GLM assumes that the response variable follows a specified probability distribution, chosen according to the nature of the response (e.g., normal, binomial, or Poisson).

The Linear Predictor (eta):

A GLM uses a linear predictor, denoted eta (η), which is a linear combination of the independent variables: η = β0 + β1x1 + ... + βpxp. It is the 'raw' prediction before the inverse of the link function is applied.

The Link Function (g(μ) = η):

The link function (g) relates the expected value of the response variable (μ) to the linear predictor (η). Because μ = g⁻¹(η), the mean can depend on the predictors non-linearly even though η itself is linear in them.
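To make the three components concrete, here is a small sketch of the canonical link functions for the normal, Poisson, and binomial distributions, together with their inverses. The function names and the value of eta are hypothetical, chosen only for illustration.

```python
import math


def identity_link(mu):
    """Canonical link for the normal distribution: g(mu) = mu."""
    return mu


def log_link(mu):
    """Canonical link for the Poisson distribution: g(mu) = log(mu)."""
    return math.log(mu)


def logit_link(mu):
    """Canonical link for the binomial distribution: g(mu) = log(mu / (1 - mu))."""
    return math.log(mu / (1.0 - mu))


def inverse_log(eta):
    """Map a linear predictor to a positive mean: mu = exp(eta)."""
    return math.exp(eta)


def inverse_logit(eta):
    """Map a linear predictor to a probability: mu = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


eta = 0.5  # hypothetical value of the linear predictor
mean_count = inverse_log(eta)     # always positive, suitable for counts
probability = inverse_logit(eta)  # always between 0 and 1
```

Applying the inverse link is what keeps predictions in a valid range: counts stay positive and probabilities stay between 0 and 1, whatever value the linear predictor takes.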

Lesson 04: Modelling with GLM

Define the Variables

Identify the Probability Distribution

Choose the Correct Link Function

Equate the Link Function of the Mean to the Linear Predictor

Construct the Model
Qualitative
Quantitative
Mixed
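The steps above can be sketched end to end for one simple case. This is a minimal illustration, not a production fitting routine: it assumes a count response (Poisson distribution), a log link, and a single quantitative predictor, and fits the model by iteratively reweighted least squares. The function name, variable names, and data are all hypothetical.

```python
import math


def fit_poisson_glm(x, y, max_iter=50, tol=1e-10):
    """Fit log(mu) = b0 + b1 * x by iteratively reweighted least squares."""
    # Start from the intercept-only model: mu equals the sample mean.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(max_iter):
        eta = [b0 + b1 * xi for xi in x]   # linear predictor
        mu = [math.exp(e) for e in eta]    # inverse log link
        # Working response and weights for the Poisson/log-link case
        z = [e + (yi - mi) / mi for e, yi, mi in zip(eta, y, mu)]
        w = mu
        # Solve the 2x2 weighted least squares normal equations
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = sw * swxx - swx * swx
        new_b0 = (swxx * swz - swx * swxz) / det
        new_b1 = (sw * swxz - swx * swz) / det
        converged = abs(new_b0 - b0) < tol and abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1


# Hypothetical data: age in decades and number of claims per policy
age_decades = [2.0, 3.0, 4.0, 5.0, 6.0]
claims = [1, 1, 2, 4, 6]

b0, b1 = fit_poisson_glm(age_decades, claims)
fitted = [math.exp(b0 + b1 * a) for a in age_decades]
```

With a log link, each coefficient acts multiplicatively on the mean: increasing the predictor by one unit multiplies the expected count by exp(b1).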

Lesson 05: Types of Models

01. Quantitative Models:


The data are quantitative in nature: they take numerical values that can be measured or counted.


02. Qualitative Models:

The data are qualitative in nature: they take non-numerical values that are descriptive or categorical.


03. Mixed Models:

The data are a mix of qualitative and quantitative variables.
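As a rough sketch of how the two kinds of data enter a linear predictor: a quantitative variable contributes its coefficient times its value, while a qualitative factor contributes a constant that depends on its level, measured relative to a baseline level. The function name, coefficient values, and factor levels are hypothetical.

```python
def linear_predictor(age, gender, coefficients):
    """eta = intercept + (age coefficient * age) + effect of the gender level."""
    # Quantitative variable: coefficient multiplied by the numerical value
    eta = coefficients["intercept"] + coefficients["age"] * age
    # Qualitative factor: each non-baseline level adds its own constant.
    # Levels missing from the dictionary are treated as the baseline (effect 0).
    eta += coefficients["gender"].get(gender, 0.0)
    return eta


# Hypothetical coefficients; "female" is the baseline level of the factor
coefficients = {
    "intercept": -2.0,
    "age": 0.03,
    "gender": {"male": 0.2},
}

eta_female = linear_predictor(40, "female", coefficients)
eta_male = linear_predictor(40, "male", coefficients)
```

A model with only the age term would be quantitative, one with only the gender term would be qualitative, and this one, with both, is mixed.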

Lesson 06: Model Testing

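Scaled Deviance Test & AIC:

The scaled deviance compares the likelihood under the fitted model with the likelihood under the saturated model: it is twice the difference between the two log-likelihoods, so a smaller value means the model is closer to the data.

The saturated model uses the same distribution and link function as the current model, but has as many parameters as there are data points, so it reproduces the data exactly.

AIC - Akaike's Information Criterion

AIC penalizes a model for having more parameters. When comparing models fitted to the same data, the smaller the AIC, the better the balance between fit and complexity.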
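Both quantities can be computed directly from the log-likelihood. The sketch below assumes a Poisson model, for which the scaled deviance is 2 × (saturated log-likelihood − model log-likelihood), and uses the standard definition AIC = −2 × log-likelihood + 2 × number of parameters. The function names, data, and fitted values are hypothetical.

```python
import math


def poisson_log_likelihood(y, mu):
    """Sum over observations of y*log(mu) - mu - log(y!)."""
    total = 0.0
    for yi, mi in zip(y, mu):
        # y * log(mu) is taken to be 0 when y == 0
        term = yi * math.log(mi) if yi > 0 else 0.0
        total += term - mi - math.lgamma(yi + 1)
    return total


def scaled_deviance(y, fitted):
    """Twice the gap between the saturated and fitted log-likelihoods."""
    saturated = poisson_log_likelihood(y, y)  # saturated model: mu_i = y_i
    model = poisson_log_likelihood(y, fitted)
    return 2.0 * (saturated - model)


def aic(log_likelihood, n_parameters):
    """Akaike's Information Criterion: lower is better."""
    return -2.0 * log_likelihood + 2.0 * n_parameters


# Hypothetical observed counts and fitted means from a 2-parameter model
observed = [1, 1, 2, 4, 6]
fitted = [1.0, 1.5, 2.5, 3.5, 5.5]

deviance = scaled_deviance(observed, fitted)
model_aic = aic(poisson_log_likelihood(observed, fitted), n_parameters=2)
```

Adding a parameter always lowers the deviance or leaves it unchanged, which is why AIC adds a penalty of 2 per parameter before models are compared.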

Lesson 08: Case Study in R

We're analyzing a dataset of 10 health insurance policies. It includes age (numeric) and categorical data such as gender, region, and existing health conditions.

Our goal is to determine the right premium rates. The number of claims is the dependent variable, and age, along with the categorical factors, are the predictors.
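One way to write such a model down, as a sketch: the slide does not state the distribution or link, so the Poisson distribution with a log link is assumed here as a common choice for claim counts. The symbols for the coefficients and factor effects are likewise assumptions made for illustration.

```latex
\begin{aligned}
Y_i &\sim \text{Poisson}(\mu_i), \\
\log \mu_i &= \beta_0 + \beta_1\,\text{age}_i
  + \alpha_{\text{gender}(i)} + \gamma_{\text{region}(i)} + \delta_{\text{condition}(i)}
\end{aligned}
```

Here Y_i is the number of claims on policy i, age enters as a quantitative variable, and each categorical factor contributes a level-specific constant, which makes this a mixed model.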

Lesson 07: Major Applications of GLM

Insurance:
Modelling claim frequency and severity, factor-based pricing

Healthcare:
Predicting disease outcomes, modelling survival rates

Finance:
Market trends, risk assessment, credit scoring

Marketing:
Customer response rates, customer churn, and the effectiveness of advertising campaigns

Thank you!

If you enjoyed this video, please do not forget to like, share and subscribe.

In case of any doubts, please feel free to reach out to me at mathxplorers.mxp@gmail.com
