
Machine learning and statistical learning

2.2 Quantile Regression

Ewen Gallic
ewen.gallic@gmail.com

MASTER in Economics - Track EBDS - 2nd Year

2020-2021
1. Quantiles

Quantiles

• Quantiles are used to describe a distribution
• Very useful for highly dispersed distributions
  – e.g., with income
• Useful if we are interested in relative variations (rather than absolute variations)

Figure 1: Total Income Growth by Percentile Across All World Regions, 1980-2016. Source: Alvaredo et al. (2018)


Some references
• Arellano, M. (2009). Quantile methods, class notes.
• Charpentier, A. (2020). Introduction à la science des données et à l'intelligence artificielle, https://github.com/freakonometrics/INF7100
• Charpentier, A. (2018). Big Data for Economics, Lecture 3.
• Davino et al. (2014). Quantile Regression. John Wiley & Sons, Ltd.
• Givord, P., D'Haultfœuille, X. (2013). La régression quantile en pratique, INSEE.
• He, X., Wang, H. J. (2015). A Short Course on Quantile Regression.
• Koenker and Bassett Jr (1978). Regression quantiles. Econometrica: Journal of the Econometric Society, 46:33–50.
• Koenker (2005). Quantile Regression. Number 38. Cambridge University Press.
1.1 Median

Median: a specific quantile


• The median m of a real-valued probability distribution separates this probability distribution into two halves
  – 50% of the values below m, 50% of the values above m
• For a random variable Y with cumulative distribution F, the median is such that:

  P(Y ≤ m) ≥ 1/2 and P(Y ≥ m) ≥ 1/2

Figure 2: Probability density function of a F(5, 2).

Median: a specific quantile


0.8

0.6
• From the cumulative distribution func-
tion F , that gives F (y) = P(Y ≤ y)

F(x)
0.4

I find the antecedent of .5


I if the cumulative distribution is strictly
0.2
increasin, the antecedent is unique
I if not, all antecedents of 0.5 are pos-
m
sible values for the median 0.0

0 1 2 3 4 5
x

Figure 3: Cumulative distribution function of a F (5, 2).



Median and symmetric distribution


If the probability distribution is symmetric, the median m and the mean (if it exists) are the same: m = E(X), the point at which the symmetry occurs.

Figure 4: PDF of the N(0, 1) distribution.



Numerical optimization

It may be useful to think of the median as the solution of a minimization problem:

• The median minimizes the expected absolute deviation:

  median(Y) ∈ arg min_m E|Y − m|

• An analogy can be made with the mean µ of a random variable.
  – It can be obtained using the following numerical optimization, where m is defined as the center of the distribution:

  µ = arg min_m E(Y − m)²
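These two programs can be checked numerically on a sample. A minimal sketch in Python (assuming NumPy and SciPy are available; the gamma sample is an arbitrary skewed example):

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(42)
y = rng.gamma(shape=2.0, scale=1.0, size=10_000)  # an arbitrary skewed sample

# The median minimizes the empirical counterpart of E|Y - m|
res_med = minimize_scalar(lambda m: np.mean(np.abs(y - m)))
print(res_med.x, np.median(y))   # both close to the sample median

# The mean minimizes the empirical counterpart of E(Y - m)^2
res_mean = minimize_scalar(lambda m: np.mean((y - m) ** 2))
print(res_mean.x, y.mean())      # both close to the sample mean
```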

1.2 Quantiles: definition

What is a quantile?

In statistics and probability, quantiles are cut points dividing the range of a
probability distribution into continuous intervals with equal probabilities, or
dividing the observations in a sample in the same way. There is one fewer
quantile than the number of groups created (Wikipedia)

We thus define a probability threshold τ and then look for the value Q(τ) (i.e., the τth quantile) such that:
• a proportion τ of the observations have a value lower than or equal to Q(τ);
• a proportion 1 − τ of the observations have a value greater than or equal to Q(τ).


Formal definition
The quantile QY(τ) of level τ ∈ (0, 1) of a random variable Y is such that:

P{Y ≤ QY(τ)} ≥ τ and P{Y ≥ QY(τ)} ≥ 1 − τ

We can also define it as:

QY(τ) = FY⁻¹(τ) = inf{y ∈ ℝ : FY(y) ≥ τ}

The most commonly used quantiles are:

• τ = 0.5: the median
• τ = {0.1, 0.9}: the first and last deciles
• τ = {0.25, 0.75}: the first and last quartiles
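The inf-based definition translates directly into code on a sample. A minimal sketch in Python (the helper quantile_inf is ours; np.quantile with method="inverted_cdf", available in recent NumPy versions, should implement the same left-continuous inverse of the empirical CDF):

```python
import numpy as np

def quantile_inf(sample, tau):
    """Empirical Q(tau): the smallest y such that F_n(y) >= tau."""
    ys = np.sort(np.asarray(sample))
    n = len(ys)
    k = int(np.ceil(tau * n)) - 1   # F_n(ys[k]) = (k + 1) / n
    return ys[max(k, 0)]

y = np.random.default_rng(7).exponential(scale=2.0, size=1001)
print(quantile_inf(y, 0.25), np.quantile(y, 0.25, method="inverted_cdf"))
```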

Illustration
Figure 5: Quantiles of a probability distribution (here, F(5, 2)). Panel A: probability density function, with 25% of the mass below Q(0.25) and 75% above; Panel B: cumulative distribution function, on which Q(0.25) is read as the antecedent of 0.25.


Numerical optimization

Once again, quantiles can be viewed as particular centers of the distribution, minimizing a weighted absolute sum of deviations. For the τth quantile:

Qτ(Y) ∈ arg min_m E[ρτ(Y − m)]

where ρτ(u) is a loss function defined as follows:

ρτ(u) = (τ − 1(u < 0)) × u, (1)

for 0 < τ < 1.
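The loss of Eq. (1) is a one-liner in code. A minimal sketch in Python (the function name rho is ours):

```python
import numpy as np

def rho(u, tau):
    """Check (pinball) loss of Eq. (1): (tau - 1{u < 0}) * u."""
    u = np.asarray(u, dtype=float)
    return (tau - (u < 0)) * u

print(rho([-1.0, 1.0], tau=0.1))  # [0.9, 0.1]: weight 1 - tau on negative deviations
```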


Loss function

This loss function is
• a continuous piecewise linear function
• non-differentiable at u = 0

Then,
• a (1 − τ) weight is assigned to the negative deviations
• a τ weight is assigned to the positive deviations

Figure 6: Quantile regression loss function ρτ (plotted for τ = 0.1).

Numerical optimization

The minimization program Qτ(Y) = arg min_m E[ρτ(Y − m)] can be written as:

• For a discrete variable Y with pdf f(y) = P(Y = y):

  Qτ(Y) = arg min_m { (1 − τ) Σ_{y ≤ m} |y − m| f(y) + τ Σ_{y > m} |y − m| f(y) }

• For a continuous variable Y with pdf f(y):

  Qτ(Y) = arg min_m { (1 − τ) ∫_{−∞}^{m} |y − m| f(y) dy + τ ∫_{m}^{+∞} |y − m| f(y) dy }
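Minimizing the empirical counterpart of this program over m recovers the sample quantile. A minimal numerical sketch in Python (assuming NumPy and SciPy; the exponential sample is an arbitrary choice):

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(5)
y = rng.exponential(scale=2.0, size=50_000)

tau = 0.75
# Empirical expected check loss E[rho_tau(Y - m)], minimized over the scalar m
res = minimize_scalar(lambda m: np.mean((tau - (y - m < 0)) * (y - m)))
print(res.x, np.quantile(y, tau))  # both close to the 75% quantile
```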

1.3 Graphical Representation

Boxplots

A famous graphical representation of the distribution of data relies on quantiles: the boxplot.

Figure 7: Boxplot of Sepal Width. The box spans the quartiles Q1, Q2, Q3; IQR denotes the interquartile range; the whiskers extend at most 1.5 × IQR beyond the box.
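The ingredients of the boxplot are simple quantile computations. A minimal sketch in Python (simulated data standing in for the Sepal Width variable):

```python
import numpy as np

x = np.random.default_rng(1).normal(loc=3.0, scale=0.4, size=150)  # stand-in for Sepal Width

q1, q2, q3 = np.quantile(x, [0.25, 0.50, 0.75])  # the three quartiles of the box
iqr = q3 - q1
# Whiskers: most extreme observations within 1.5 x IQR of the box
lower_whisker = x[x >= q1 - 1.5 * iqr].min()
upper_whisker = x[x <= q3 + 1.5 * iqr].max()
print(q1, q2, q3, iqr, lower_whisker, upper_whisker)
```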

Choropleth maps
Figure 8 compares two choropleth maps of the same variable: in one panel the class breaks are cuts of equal width, in the other the class breaks are based on quantiles of the mapped variable (so each class contains roughly the same number of voting centers).

Note: Example adapted from Arthur Charpentier's slides (INF7100, Été 2020, slide 224)
Figure 8: Proportion of individuals between the ages of 18 and 25 (inclusive) registered on the electoral lists in Marseilles, by voting center.
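The two class-break schemes of Figure 8 correspond to two standard binning functions. A minimal sketch in Python with pandas (simulated data; the real map uses the voting-center proportions):

```python
import numpy as np
import pandas as pd

x = pd.Series(np.random.default_rng(3).gamma(4.0, 2.5, size=900))  # a skewed variable

equal_cuts = pd.cut(x, bins=9)   # intervals of equal width
quantile_cuts = pd.qcut(x, q=9)  # intervals containing roughly equal numbers of observations

print(equal_cuts.value_counts().sort_index())     # very unbalanced class counts
print(quantile_cuts.value_counts().sort_index())  # about 100 observations per class
```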
2. Quantile regression

2.1 Principles

Quantile regression

In the linear regression context, we have focused on the conditional distribution of y, but only paid attention to the mean effect.
In many situations, we only look at the effects of a predictor on the conditional mean of the response variable. But there might be some asymmetry in the effects across the quantiles of the response variable:
• the effect of a variable may not be the same for all observations (e.g., if we increase the minimum wage, the effect on wages may differ between low wages and high wages).
Quantile regression offers a way to account for these possible asymmetries.


Figure 9: BMI-for-age percentile charts for boys and girls, ages 2 to 20, with percentile curves at 5, 10, 25, 50, 75, 85, 90, and 95. Source: Centers for Disease Control and Prevention (National Center for Health Statistics, 2000), http://www.cdc.gov/growthcharts


Principles of Quantile Regression


Let Y be the response variable we want to predict using p + 1 predictors X (including the constant).
In the linear model, using least squares, we write:

Y = X⊤β + ε,

with ε a zero-mean error term with variance σ²In.
Thus, the conditional expectation writes:

E(Y | X = x) = x⊤β

Here, β represents the marginal change in the mean of the response Y following a marginal change in x.

Principles of Quantile Regression


Instead of looking at the mean effect, we can look at the effect at a given quantile. The conditional quantile is defined as:

Qτ(Y | X = x) = inf{y : F(y | x) ≥ τ}

The linear quantile regression model assumes:

Qτ(Y | X = x) = x⊤βτ (2)

where βτ = [β0,τ, β1,τ, β2,τ, ..., βp,τ]⊤ is the vector of quantile coefficients:
• it corresponds to the marginal change in the τth quantile following a marginal change in x.

Principles of Quantile Regression

If we assume that Qτ(ετ | X) = 0, then Eq. (2) is equivalent to:

Y = X⊤βτ + ετ (3)


Asymmetric absolute loss

Recall that the asymmetric absolute loss function is defined as:

ρτ(u) = (τ − 1(u < 0)) × u,

for 0 < τ < 1 (see Figure 6, plotted for τ = 0.1).

Asymmetric absolute loss

With ρτ(u) used as a loss function, it can be shown that Qτ minimizes the expected loss, i.e.:

Qτ(Y) ∈ arg min_m {E[ρτ(Y − m)]} (4)

We can note that in the case in which τ = 1/2, this corresponds to the median.
Quantiles may not be unique:
• any element of {x ∈ ℝ : FY(x) = τ} minimizes the expected loss
• if the solution is not unique, we have an interval of τth quantiles
  – the smallest element is chosen (this way, the quantile function remains left-continuous).

Empirically

Now, let us turn to the estimation. Let us consider a random sample {(x1, y1), ..., (xn, yn)}.
In the case of the least squares estimation, the expectation E(Y) minimizes the risk that corresponds to the quadratic loss function, i.e.:

E(Y) = arg min_m E[(Y − m)²]

• The sample mean solves min_m Σ_{i=1}^n (yi − m)²
• The least squares estimates of the parameters are obtained by minimizing Σ_{i=1}^n (yi − xi⊤β)²


Empirically

The τth quantile minimizes the risk associated with the asymmetric absolute loss function:

Qτ(Y) ∈ arg min_m {E[ρτ(Y − m)]}

The τth sample quantile of Y solves:

min_m Σ_{i=1}^n ρτ(yi − m)

If we assume that Qτ(Y | X) = X⊤βτ, then the quantile estimator of the parameters is given by:

β̂τ ∈ arg min_β (1/n) Σ_{i=1}^n ρτ(yi − xi⊤β) (5)
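The program of Eq. (5) can be cast as a linear program; common implementations include quantreg::rq in R and QuantReg in Python's statsmodels. A minimal sketch on simulated data (the data-generating process below is illustrative, not from the slides):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(123)
n = 500
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.5 * x + (0.2 + 0.1 * x) * rng.standard_normal(n)  # heteroskedastic noise

X = sm.add_constant(x)  # intercept + predictor

# One fit per probability level tau
for tau in (0.1, 0.5, 0.9):
    fit = sm.QuantReg(y, X).fit(q=tau)
    print(f"tau={tau}: intercept={fit.params[0]:.3f}, slope={fit.params[1]:.3f}")
```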

Figure 11: Food Expenditure as a function of Income. Panel A: food expenditure (BEF/year) against household income (BEF/year), with the fitted line for τ = 0.1; Panel B: slope of the quantile regression of expenditure on income, for probability levels τ from 0.1 to 0.9 (the slope rises from about 0.4 to about 0.7). Source: Koenker and Bassett Jr (1982).


In the previous slide, let us focus on τ = .1.

• Left graph: the estimated slope and intercept for that quantile show that if the household income is 3,500 Belgian Francs (BEF):
  – there is a 10% chance that food expenditure for the household is lower than 1,500 BEF;
  – there is a 90% chance that food expenditure for the household is greater than 1,500 BEF.
• Right graph: plotting the slope of the quantile regression of expenditure on income, for different values of τ:
  – the slope coefficient is always positive
  – the value increases with the level of the quantile: for households with relatively high food expenditure (given their income), the marginal effect of income on food expenditure is higher.


Figure 12: Food expenditure (BEF/year) as a function of household income (BEF/year): fitted lines of the linear quantile regression for several probability levels.


From the previous graph, we observe that the slopes change quite substantially with the level of the quantile:
• in the lower part of the conditional distribution of food expenditure, the slope is relatively small (for τ = .1, it is 0.40)
• in the upper part of that distribution, the slope is relatively higher (for τ = .9, it is 0.69)
Therefore, the higher the quantile of expenditure, the greater the effect of income on consumption.


Location-shift model

Let us consider a simple model:

Y = X⊤γ + ε (6)

We have:

Qτ(Y | X = x) = x⊤γ + Qτ(ε)

In this model, known as the location-shift model, the only coefficient that varies with τ is the one associated with the constant: β0,τ = γ0 + Qτ(ε)


Location-shift model

• The conditional distributions FY|X=x are parallel when x varies.
• As a consequence, the conditional quantiles are linearly dependent on X, and the only coefficient that varies with the quantile is β0,τ, i.e., the coefficient associated with the constant:
  – β0,τ = γ0 + Qτ(ε)
  – βj,τ = γj for all the coefficients except the constant.


Location-shift model: example

Consider the following true process:

Y = β0 + β1 x + ε, with β0 = 3 and β1 = −0.1, and where ε ∼ N(0, 4).

Let us generate 500 observations from this process.

Figure: fitted linear regression and quantile regressions (τ ∈ {0.01, 0.25, 0.5, 0.75, 0.9}): the fitted quantile lines are parallel.
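The figure can be reproduced along the following lines. A minimal sketch in Python (assuming statsmodels; coefficients are printed rather than plotted): under the location-shift model, only the intercept should move with τ.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0, 100, size=n)
y = 3 - 0.1 * x + rng.normal(0.0, 2.0, size=n)  # eps ~ N(0, 4), i.e., sd = 2

X = sm.add_constant(x)
for tau in (0.01, 0.25, 0.5, 0.75, 0.9):
    fit = sm.QuantReg(y, X).fit(q=tau)
    # Location-shift: the slope stays near -0.1, the intercept shifts by Q_tau(eps)
    print(f"tau={tau}: intercept={fit.params[0]:.2f}, slope={fit.params[1]:.3f}")
```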


Location-scale model

Now, let us consider a location-scale model. In this model, we assume that the predictors have an impact both on the mean and on the variance of the response:

Y = X⊤β + (X⊤γ)ε,

where ε is independent of X.
As Qτ(aY + b) = aQτ(Y) + b for a > 0, we can write:

Qτ(Y | X = x) = x⊤(β + γQτ(ε))

• Setting βτ = β + γQτ(ε), assumption (2) still holds.
• The impact of the predictors will vary across quantiles.
• The slopes of the lines corresponding to the quantile regressions are not parallel.
Ewen Gallic Machine learning and statistical learning 38/53
2. Quantile regression
2.1. Principles

Location-scale model: example

Consider the following true process:

Y = β0 + β1 x + ε, with β0 = 3 and β1 = −0.1, and where ε is a normally distributed error with zero mean and non-constant variance.

Let us generate 500 observations from this process, and then estimate a quantile regression at different probability levels.

Figure: fitted linear regression and quantile regressions (τ ∈ {0.01, 0.25, 0.5, 0.75, 0.9}): the fitted quantile lines are no longer parallel.
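The same exercise under a location-scale design shows the slopes fanning out. A minimal sketch in Python (the noise-scale specification 0.5 + 0.05x is an arbitrary choice, not the one used for the slide's figure):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0, 100, size=n)
scale = 0.5 + 0.05 * x                       # the noise scale grows with x
y = 3 - 0.1 * x + scale * rng.standard_normal(n)

X = sm.add_constant(x)
for tau in (0.1, 0.5, 0.9):
    fit = sm.QuantReg(y, X).fit(q=tau)
    # Location-scale: the slope now varies with tau (beta_1 + 0.05 * Q_tau(eps))
    print(f"tau={tau}: slope={fit.params[1]:.3f}")
```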

2.2 Inference

Asymptotic distribution

As there is no explicit form for the estimator β̂τ, establishing its consistency is not as easy as it is with the least squares estimator.
First, we need some assumptions to achieve consistency:
• the observations {(x1, y1), ..., (xn, yn)} must be conditionally i.i.d.
• the predictors must have a bounded second moment, i.e., E[‖Xi‖²] < ∞
• the error terms εi must be continuously distributed given Xi, and centered, i.e., ∫_{−∞}^{0} fε(e) de = 0.5
• E[fε(0)XX⊤] must be positive definite (local identification property).


Asymptotic distribution

Under those weak conditions, β̂τ is asymptotically normal:

√n (β̂τ − βτ) →d N(0, τ(1 − τ) Dτ⁻¹ Ωx Dτ⁻¹) (7)

where

Dτ = E[fε(0)XX⊤] and Ωx = E[XX⊤]

For the location-shift model (Eq. 6), the asymptotic variance of β̂τ writes:

V̂ar[β̂τ] = (τ(1 − τ) / [f̂ε(0)]²) × ((1/n) Σ_{i=1}^n xi xi⊤)⁻¹ (8)

where f̂ε(0) can be estimated using a histogram (see Powell 1991).
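A plug-in version of Eq. (8) can be computed directly. A minimal sketch in Python (assuming statsmodels; the bandwidth of the histogram estimate of f̂ε(0) is a rule-of-thumb assumption, not a prescription from the slides):

```python
import numpy as np
import statsmodels.api as sm

def var_location_shift(y, X, tau):
    """Plug-in variance expression of Eq. (8) under the location-shift model."""
    n = len(y)
    beta_hat = sm.QuantReg(y, X).fit(q=tau).params
    eps = y - X @ beta_hat                      # residuals
    h = 1.06 * eps.std() * n ** (-1 / 5)        # rule-of-thumb bandwidth (assumption)
    f0 = np.mean(np.abs(eps) <= h) / (2 * h)    # histogram estimate of f_eps(0)
    xx_bar_inv = np.linalg.inv(X.T @ X / n)     # ((1/n) sum x_i x_i')^{-1}
    return tau * (1 - tau) / f0**2 * xx_bar_inv
```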

Confidence intervals

If we have obtained an estimator of Var[β̂τ], the confidence interval of level 1 − α is given by:

β̂τ ± z_{1−α/2} √(V̂ar[β̂τ]) (9)

Otherwise, we can rely on bootstrap or resampling methods (see Emmanuel Flachaire's course):

• Generate a sample {(x1⁽ᵇ⁾, y1⁽ᵇ⁾), ..., (xn⁽ᵇ⁾, yn⁽ᵇ⁾)} from {(x1, y1), ..., (xn, yn)}
• Then estimate βτ⁽ᵇ⁾ using β̂τ⁽ᵇ⁾ = arg min_β Σ_{i=1}^n ρτ(yi⁽ᵇ⁾ − xi⁽ᵇ⁾⊤β)
• Repeat the two previous steps B times
• The bootstrap estimate of the variance of β̂τ is then computed as:

V̂ar*(β̂τ) = (1/B) Σ_{b=1}^B (β̂τ⁽ᵇ⁾ − β̂τ)²
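The resampling scheme above (a pairs bootstrap) is straightforward to implement. A minimal sketch in Python, assuming statsmodels:

```python
import numpy as np
import statsmodels.api as sm

def bootstrap_var(y, X, tau, B=200, seed=0):
    """Pairs-bootstrap estimate of Var(beta_hat_tau), one value per coefficient."""
    rng = np.random.default_rng(seed)
    n = len(y)
    beta_hat = sm.QuantReg(y, X).fit(q=tau).params
    draws = np.empty((B, X.shape[1]))
    for b in range(B):
        idx = rng.integers(0, n, size=n)              # resample (x_i, y_i) pairs
        draws[b] = sm.QuantReg(y[idx], X[idx]).fit(q=tau).params
    return np.mean((draws - beta_hat) ** 2, axis=0)   # (1/B) sum (beta^(b) - beta_hat)^2
```

A 1 − α confidence interval then follows from Eq. (9): β̂τ ± z_{1−α/2} √(V̂ar*).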
2.3 Example: birth data

Birth data

Let us consider an example provided in Koenker (2005), about the impact of demographic characteristics and maternal behaviour on the birth weight of infants born in the US.
Let us use the Natality Data from the National Vital Statistics System, freely available on the NBER website (https://data.nber.org/data/vital-statistics-natality-data.html).
The raw data for 2018 concern more than 3M babies born in that year, from American mothers aged between 18 and 50.
For the sake of the exercise, we only use a sample of 20,000 individuals.


Variables kept
The predictors used are:

• education (less than high school (ref), high school, some college, college graduate)
• prenatal medical care (no prenatal visit, first prenatal visit in the first trimester of
pregnancy (ref), first visit in the second trimester, first visit in the last trimester)
• the sex of the infant
• the marital status of the mother
• the ethnicity of the mother
• the age of the mother
• whether the mother smokes
• the number of cigarettes per day
• the weight gain of the mother

Let us focus only on a couple of them.



Age
Figure: Panel A: birth weight (in grams) against the age of the mother (years), with the fitted line for τ = 0.1; Panel B: slope of the quantile regression of birth weight on the mother's age, for probability levels τ from 0.1 to 0.9.


Age

• If the slopes are not parallel, and the observed differences are significant, then the effect of age on the baby's weight differs depending on where the baby lies in the distribution.
• At the lower part of the distribution of weights (τ = .1, relatively light babies): the effect of age on weight is negative but not significant.
• At the upper part of the distribution of weights (τ = .9, relatively heavy babies): the effect of age on weight is positive and significant.


Age: nonlinear effect?


Figure 13: Birth weight (in grams) as a function of the age of the mother (years): non-linear quantile regression fits for τ ∈ {0.1, 0.25, 0.5, 0.75, 0.9}.

Age: nonlinear effect?

The influence of the mother's age on birth weight may be thought to be non-linear.
For heavy babies (and also for light babies), birth weight is relatively lower at the extremes of the mother's age distribution.


Cigarettes
Figure 14: Birth weight depending on whether the mother smokes. Panel A: birth weight (in grams) by smoking status; Panel B: slope of the quantile regression (smoking coefficient, in grams) for probability levels τ from 0.1 to 0.9.



Cigarettes

• Smoking during pregnancy is associated with a decrease in birth weight.
• For relatively light babies, the slope coefficient is markedly lower than for relatively heavy babies:
  – the fact that the mother smoked prior to the pregnancy has a significant impact on the newborn's weight, especially for the relatively lighter babies.


References I

Alvaredo, F., Chancel, L., Piketty, T., Saez, E., and Zucman, G. (2018). The elephant curve of global inequality and growth. AEA Papers and Proceedings, 108:103–108.

Davino, C., Furno, M., and Vistocco, D. (2014). Quantile Regression. John Wiley & Sons, Ltd.

Koenker, R. (2005). Quantile Regression. Number 38. Cambridge University Press.

Koenker, R. and Bassett Jr, G. (1978). Regression quantiles. Econometrica: Journal of the Econometric Society, 46:33–50.

Koenker, R. and Bassett Jr, G. (1982). Robust tests for heteroscedasticity based on regression quantiles. Econometrica: Journal of the Econometric Society, pages 43–61.

Powell, J. L. (1991). Estimation of monotonic regression models under quantile restrictions. Nonparametric and Semiparametric Methods in Econometrics, pages 357–384.
