Lectures 14 and 15

Lane Changing Process
▪ Modeling the lane changing process is complicated
✓ The entire lane changing process is latent in nature
✓ Only the execution of the lane change (the final gap acceptance) is observed
▪ Time at which lane change decision is made cannot be observed
▪ Once a decision to change lanes is made, a driver may continue to search for gaps or may change his/her mind, which are observed
▪ Lane change decision is continuous in nature
▪ Drivers are assumed to make decisions about lane changes at every discrete point in time, irrespective of the decisions made earlier
✓ For modeling process, time is discretized
▪ Impact of past lane changing decisions on the current lane changing decision is not modeled
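If decisions at successive time steps are treated as independent, as assumed above, the probability that the first lane change occurs at a given step follows from the per-step probabilities alone. The sketch below illustrates this; the function name and the per-step probabilities are hypothetical and are not taken from the lecture.

```python
def first_change_probabilities(step_probs):
    """P(first lane change occurs at step t), assuming independent per-step decisions."""
    result = []
    no_change_so_far = 1.0          # probability that no lane change has happened yet
    for p in step_probs:
        result.append(no_change_so_far * p)
        no_change_so_far *= (1.0 - p)
    return result

# Hypothetical per-step lane change probabilities for three time steps
print(first_change_probabilities([0.5, 0.5, 0.5]))  # [0.5, 0.25, 0.125]
```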
Lane Change Identification
▪ Important parameters/factors required to identify a lane change (LC)
✓ Position
✓ Speed of all vehicles involved in the LC process (in both the current lane and adjacent lanes)
✓ Acceleration/deceleration
▪ Data to be obtained from vehicle trajectories collected using videographic techniques, drones, etc.
▪ The LC window is the time interval in which the following conditions are satisfied (Venthuruthiyil et al. 2020)
✓ Lateral speed of the vehicle increases from a zero state, reaches its maximum, then decreases back to zero
✓ Maximum lateral displacement during this period is more than 1.5 times the vehicle width
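The two LC window conditions can be checked on a sampled trajectory roughly as follows. This is a minimal sketch, not the procedure of Venthuruthiyil et al.: the function name, the tolerance `eps` used to decide that lateral speed is "zero", and the sample data are all assumptions made for illustration.

```python
def find_lc_windows(lat_speed, lat_pos, vehicle_width, eps=0.05):
    """Return (start, end) sample indices of candidate lane change windows."""
    windows = []
    start = None
    for t, v in enumerate(lat_speed):
        moving = abs(v) > eps                 # eps is an assumed "zero speed" tolerance
        if moving and start is None:
            start = max(t - 1, 0)             # last sample with (near) zero lateral speed
        elif not moving and start is not None:
            end = t
            seg = [abs(s) for s in lat_speed[start:end + 1]]
            peak = seg.index(max(seg))
            # Condition 1: speed rises from zero to a maximum, then falls back to zero
            rises = all(a <= b for a, b in zip(seg[:peak], seg[1:peak + 1]))
            falls = all(a >= b for a, b in zip(seg[peak:], seg[peak + 1:]))
            # Condition 2: maximum lateral displacement exceeds 1.5 x vehicle width
            disp = max(abs(lat_pos[k] - lat_pos[start]) for k in range(start, end + 1))
            if rises and falls and disp > 1.5 * vehicle_width:
                windows.append((start, end))
            start = None
    return windows

# Hypothetical trajectory: lateral speed (m/s) and lateral position (m)
speed = [0, 0, 0.5, 1.0, 1.5, 1.0, 0.5, 0, 0]
pos = [0, 0, 0.25, 1.0, 2.25, 3.5, 4.25, 4.5, 4.5]
print(find_lc_windows(speed, pos, vehicle_width=1.8))  # [(1, 7)]
```

Real trajectory data is noisy, so the strict rise-then-fall check would normally be applied after smoothing the lateral speed.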
Factors affecting decision to change lane and lane selection
▪ Speed and position of the subject vehicle
▪ Relative speed and position of the subject vehicle with respect to surrounding vehicles
▪ Roadway factors such as the presence of permanent and temporary obstructions, lane use regulations
▪ Traffic factors such as traffic state, safety criteria, etc.
▪ Driver characteristics such as desired speed of the driver, driving style, personal discomfort, experience, etc.
Random utility theory
▪ Random utility theory considers that individual/driver preferences are latent and unobservable to the analyst
▪ Latent utility can be expressed as the sum of a systematic and a random component
▪ The true utility of alternative ‘i’ to driver ‘n’ is
$U_{in} = V_{in} + \epsilon_{in}$
where $V_{in} = \sum_i b_i x_i$ is the systematic component,
$\epsilon_{in}$ is the error, i.e. the portion of the utility unknown to the analyst,
$x_i$ are the explanatory variables related to alternative ‘i’, and
$b_i$ are the coefficients related to the variables
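The decomposition of utility into a systematic and a random part can be sketched numerically. The coefficients, the variable values, the two alternatives, and the choice of a standard Gumbel error are all assumptions made for illustration; the slides do not specify an error distribution.

```python
import math
import random

def systematic_utility(b, x):
    """V = sum of coefficient times explanatory variable."""
    return sum(bk * xk for bk, xk in zip(b, x))

def total_utility(b, x, rng):
    """U = V + epsilon, with epsilon drawn from a standard Gumbel (an assumption)."""
    u = max(rng.random(), 1e-12)            # guard against log(0)
    epsilon = -math.log(-math.log(u))
    return systematic_utility(b, x) + epsilon

# Hypothetical coefficients and explanatory variables for two alternatives
b = [0.5, -0.2]
alternatives = {"stay": [2.0, 3.0], "change": [3.0, 1.0]}

rng = random.Random(0)
for name, x in alternatives.items():
    print(name, systematic_utility(b, x), total_utility(b, x, rng))
```

The analyst only models the systematic part; the random draw stands in for everything the analyst cannot observe.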
Linear and Logistic regression
[Figure: structure of linear regression and logistic regression. In both, the independent variables 1, x1, x2, …, xn are weighted by β0, β1, β2, …, βn. In linear regression the dependent variable Y is a real value. In logistic regression ln(P(y=1)/P(y=0)) is passed through the sigmoid/logistic function and the dependent variable Y takes the categories/groups 0 and 1.]
▪ Linear regression: Y is a linear function of the independent variables
✓ Examples: speed of the vehicle, pricing of a house, etc.
▪ Logistic regression: log-odds is a linear function of the independent variables
✓ Examples: decision to change lane, risk of developing cancer, etc.
Logistic regression
▪ It is a classification technique that classifies/groups the given data
▪ An effort is made to study the probability of an event occurrence, so
that it can be classified into categories or groups
▪ Dependent variable of logistic regression: Y ~ Bernoulli(p), i.e. Binomial(n, p) with n = 1, where p is unknown
▪ Examples of logistic regression are:
✓ Predicting whether an individual will pass a given exam (given the number of hours studied, classes attended, etc.), or
✓ the risk of developing cancer (given age, gender, results of various tests, etc.), or
✓ whether a driver will change lane (given the speed of his/her vehicle, the front vehicle, the clear front gap, etc.)
▪ Note: The Bernoulli distribution is just a special case of the Binomial distribution where n = 1
Log-odds
▪ The success is “1” and the failure is “0”
▪ Odds are the chances of success over the chances of failure
$\text{Odds} = \frac{P(y = 1)}{P(y = 0)} = \frac{\text{Probability of occurrence of an event}}{\text{Probability of non-occurrence of the same event}}$
▪ Probability of success, P(y = 1), is represented as p, and probability of failure, P(y = 0), is q = 1 − p
▪ $\text{Log-odds} = \ln(\text{odds}) = \ln\left(\frac{p}{q}\right) = \ln\left(\frac{p}{1-p}\right)$
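As a worked illustration of these definitions, take an assumed success probability of p = 0.8 (a value chosen only for this example):

```latex
p = 0.8, \quad q = 1 - p = 0.2 \\
\text{Odds} = \frac{p}{q} = \frac{0.8}{0.2} = 4 \\
\text{Log-odds} = \ln(4) \approx 1.386
```

A probability of 0.5 gives odds of 1 and log-odds of 0; probabilities below 0.5 give negative log-odds.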
Logit function
▪ In logistic regression, we estimate an unknown p for any given linear combination of independent variables (the estimate of p is $\hat{p}$)
▪ The link that connects the independent variables to the dependent variable (distributed Bernoulli(p)) is the logit function
▪ The logit function relates the linear combination of variables, which can take any real value, to the Bernoulli probability p, which lies between 0 and 1
▪ Logit(p) is the natural log of the odds
$\text{logit}(p) = \ln(\text{odds}) = \ln\frac{P(y=1)}{P(y=0)} = \ln\frac{p}{1-p} = \ln(p) - \ln(q)$
Image source: https://www.javatpoint.com/linear-regression-vs-logistic-regression-in-machine-learning

Comparison of logit(p) vs. inverse logit(p) (logit(p) is defined on the domain 0 to 1; the base of the logarithm is e)
Image source: https://math.stackexchange.com/questions/3816925/how-to-adjust-logit-functions-input-domain
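The relationship between logit(p) and its inverse can be checked numerically. This is a small sketch with hypothetical function names; the inverse is written in a form that avoids overflow for large negative inputs.

```python
import math

def logit(p):
    """Natural log of the odds; defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inverse_logit(z):
    """Sigmoid/logistic function; maps any real z to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Round trip: applying inverse_logit to logit(p) should return p
for p in (0.1, 0.5, 0.9):
    z = logit(p)
    print(p, round(z, 3), round(inverse_logit(z), 3))
```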

Estimated regression equation
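From the figure/structure of logistic regression,
$P(Y = 1 \mid X_1, X_2, \ldots, X_k) = F(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_k X_k)$
$\text{logit}(p) = \ln\frac{p}{1-p} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_k X_k$
Taking the antilog,
$\frac{p}{1-p} = e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_k X_k}$
$p = (1 - p)\, e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_k X_k}$
$p + p\, e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_k X_k} = e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_k X_k}$
Estimated regression equation: $\hat{p} = \frac{e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_k X_k}}{1 + e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_k X_k}}$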
▪ If the calculated $\hat{p}$ is greater than the chosen threshold (default = 0.5), Y = 1; otherwise, Y = 0
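The estimated regression equation and the threshold rule can be sketched as below. The coefficient values and inputs are hypothetical, and the probability is written as 1/(1 + e^(−z)), which is algebraically the same as e^z/(1 + e^z).

```python
import math

def predict_probability(beta, x):
    """p_hat = e^z / (1 + e^z) with z = beta0 + beta1*x1 + ... + betak*xk."""
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-z))     # fine for moderate z; very negative z would overflow

def classify(beta, x, threshold=0.5):
    """Y = 1 if p_hat exceeds the threshold, otherwise Y = 0."""
    return 1 if predict_probability(beta, x) > threshold else 0

# Hypothetical coefficients (beta0, beta1, beta2) and one observation (x1, x2)
beta = [-1.0, 0.8, 0.5]
x = [2.0, 1.0]
print(round(predict_probability(beta, x), 3), classify(beta, x))  # 0.75 1
```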
Logistic regression coefficients
▪ In logistic regression, the regression coefficients are chosen to maximize the likelihood of the observed y for the given X, using the maximum likelihood estimation (MLE) technique
▪ A positive coefficient indicates a positive impact, but by how much?
▪ For example, the equations are of the form
▪ For linear model: $y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_k X_k$
▪ For logistic model: $\text{logit}(p) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_k X_k$
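One simple way to see what maximum likelihood estimation does is to climb the log-likelihood by gradient ascent. The sketch below is illustrative only: the data, learning rate, and iteration count are assumptions, and statistical packages typically use faster optimizers than plain gradient ascent.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def log_likelihood(beta, X, y):
    """Sum of y*ln(p) + (1 - y)*ln(1 - p) over all observations."""
    ll = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi)))
        ll += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return ll

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Gradient ascent on the log-likelihood; returns [beta0, beta1, ..., betak]."""
    k = len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi)))
            err = yi - p                      # gradient contribution of one observation
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta

# Hypothetical data: one explanatory variable and a binary outcome
X = [[0.5], [1.0], [1.5], [2.0], [2.5], [3.0]]
y = [0, 0, 1, 0, 1, 1]
beta = fit_logistic(X, y)
print(beta, log_likelihood(beta, X, y))
```

The fitted coefficients give a higher log-likelihood than the all-zero starting point, which is the sense in which they are "chosen to maximize" the likelihood.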
Linear regression coefficients
▪ For the linear model, where y is the marks scored in the exam, the fitted regression equation is
$y = 0.07 + 13\,(\text{Study\_hours}) - 4\,(\text{Male})$
▪ Interpretation: keeping all other independent variables constant,
✓ A 1-unit increase in study hours results in an increase of 13 marks (here the gender variable is kept constant)
✓ For categorical variables like gender, males score 4 marks less than females in the class (here the study_hours variable is kept constant)
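The interpretation above can be verified by evaluating the fitted equation directly. The function name and the input values are hypothetical; the coefficients are those of the fitted equation.

```python
def predicted_marks(study_hours, male):
    """Fitted linear model: y = 0.07 + 13*Study_hours - 4*Male (Male = 1 or 0)."""
    return 0.07 + 13 * study_hours - 4 * male

# Effect of one extra study hour, gender held constant
print(predicted_marks(3, 0) - predicted_marks(2, 0))
# Effect of gender, study hours held constant
print(predicted_marks(3, 1) - predicted_marks(3, 0))
```

Up to floating-point rounding, the two differences are 13 and −4, matching the coefficients.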
Logistic regression coefficients
For the logistic model, where y indicates passing the driving test (Y = 1: pass in the test; Y = 0: fail in the test), the fitted regression equation is
$\text{logit}(p) = 0.07 + 0.13\,(\text{Practice\_hours}) - 1.82\,(\text{Female})$
($e^{0.13} = 1.14$, $e^{-1.82} = 0.16$)
▪ Interpretation: keeping all other independent variables constant,
✓ For a 1-unit increase in practice hours (keeping the gender variable constant), there is a 14% increase ($e^{0.13} - 1 = 0.14$) in the odds of passing the test
✓ For a categorical variable like gender (keeping the practice_hours variable constant), the odds of passing the test are 84% lower ($e^{-1.82} - 1 = 0.16 - 1 = -0.84$) for females than for males
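The odds interpretation can be reproduced from the fitted equation. In the sketch below the function names and the choice of 10 practice hours are hypothetical; the coefficients are those of the fitted logistic model.

```python
import math

B0, B_PRACTICE, B_FEMALE = 0.07, 0.13, -1.82

def odds_of_passing(practice_hours, female):
    """Odds = e^(logit(p)) for the fitted model (female = 1 or 0)."""
    return math.exp(B0 + B_PRACTICE * practice_hours + B_FEMALE * female)

def probability_of_passing(practice_hours, female):
    odds = odds_of_passing(practice_hours, female)
    return odds / (1.0 + odds)

# Ratio of odds for one extra practice hour, gender held constant
ratio_hours = odds_of_passing(11, 0) / odds_of_passing(10, 0)
# Ratio of odds for female vs. male, practice hours held constant
ratio_female = odds_of_passing(10, 1) / odds_of_passing(10, 0)
print(round(ratio_hours, 2), round(ratio_female, 2))  # 1.14 0.16
```

Note that the 14% and 84% changes apply to the odds, not to the probability of passing; the change in probability depends on the starting values of the variables.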
