
Lane Changing Process

▪ Modeling lane changing process is complicated


✓ Entire lane changing process is latent in nature
✓ Only the execution of the lane change (the final gap acceptance) is observed

▪ Time at which lane change decision is made cannot be observed

▪ Once a decision to change lanes is made, a driver may continue to search for gaps or may change his/her mind, which are observed

▪ Lane change decision is continuous in nature

▪ Drivers are assumed to make decisions about lane changes at every discrete point in time, irrespective of the decisions made earlier
✓ For modeling process, time is discretized

▪ Impact of past lane changing decisions on the current lane changing decision is not modeled
Lane Change Identification
▪ Important parameters/factors required to identify LC
✓ Position
✓ Speed of all vehicles involved in the LC process (in both the current lane and adjacent lanes)
✓ Acceleration/deceleration

▪ Data are obtained from vehicle trajectories collected using videographic techniques, drones, etc.

▪ The LC window is the time interval in which the following conditions are satisfied (Venthuruthiyil et al. 2020):
✓ The lateral speed of the vehicle increases from zero, reaches a maximum, and then decreases back to zero
✓ The maximum lateral displacement during this period is more than 1.5 times the vehicle width
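
The two LC window conditions above can be sketched in Python as follows. This is a simplified illustration, not the procedure of Venthuruthiyil et al. (2020): the function name, the `speed_eps` tolerance used to treat lateral speed as zero, and the synthetic trajectory are all assumptions, and the sketch does not verify that the lateral speed has a single peak.

```python
import numpy as np

def find_lane_change_windows(t, y, vehicle_width, speed_eps=0.05):
    """Return (start_time, end_time) pairs of candidate lane-change windows.

    t: sample times (s); y: lateral position (m); speed_eps: hypothetical
    tolerance (m/s) below which lateral speed is treated as zero.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    v_lat = np.gradient(y, t)             # lateral speed by finite differences
    moving = np.abs(v_lat) > speed_eps    # True while lateral speed is non-zero
    windows, start = [], None
    for k, is_moving in enumerate(moving):
        if is_moving and start is None:
            start = max(k - 1, 0)         # last sample still in the zero state
        elif not is_moving and start is not None:
            # Lateral speed is back to (near) zero: check the displacement rule.
            displacement = np.max(np.abs(y[start:k + 1] - y[start]))
            if displacement > 1.5 * vehicle_width:
                windows.append((float(t[start]), float(t[k])))
            start = None
    return windows

# Synthetic trajectory: a smooth lateral shift between t = 2 s and t = 6 s.
t = np.linspace(0.0, 10.0, 101)
shape = 0.5 * (1.0 - np.cos(np.pi * np.clip((t - 2.0) / 4.0, 0.0, 1.0)))
y_change = 3.5 * shape    # 3.5 m shift: a full lane change
y_wobble = 0.3 * shape    # 0.3 m drift inside the lane, should not qualify

windows_change = find_lane_change_windows(t, y_change, vehicle_width=1.8)
windows_wobble = find_lane_change_windows(t, y_wobble, vehicle_width=1.8)
print(windows_change, windows_wobble)
```

With a 1.8 m wide vehicle the displacement threshold is 2.7 m, so only the 3.5 m shift is reported as an LC window.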
Factors affecting decision to change lane and lane selection

▪ Speed and position of the subject vehicle

▪ Relative speed and position of the subject vehicle with respect to surrounding vehicles

▪ Roadway factors such as presence of permanent and temporary obstructions, lane use regulations

▪ Traffic factors such as traffic state, safety criteria, etc.

▪ Driver characteristics such as desired speed of the driver, driving style, personal discomfort, experience, etc.

Random utility theory

▪ Random utility theory considers that individual/driver preferences are latent and unobservable to the analyst

▪ Latent utility can be expressed as the sum of a systematic component and a random component

▪ The true utility of alternative 'i' to the driver 'n' is

$$U_{in} = V_{in} + \epsilon_{in}$$

where
✓ $V_{in}$ is the systematic component, with $V_{in} = \sum_i b_i x_i$
✓ $\epsilon_{in}$ is the error, or the portion of the utility unknown to the analyst
✓ $x_i$ are the explanatory variables related to alternative 'i'
✓ $b_i$ are the coefficients related to the variables
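
A minimal numerical sketch of the utility decomposition above, for two alternatives ("change lane" and "stay in lane"). The coefficient and variable values are made up for illustration, and the random components are assumed to be i.i.d. standard Gumbel, which is an assumption of this sketch and not stated in the slides. Under that assumption the difference of the two errors is logistic, so the simulated share choosing to change lane should be close to the logistic function of the utility difference.

```python
import numpy as np

# Hypothetical coefficients and explanatory variables (illustrative values only).
b = np.array([0.5, -0.2])
x_change = np.array([4.0, 3.0])   # attributes of the "change lane" alternative
x_stay = np.array([0.0, 0.0])     # attributes of the "stay in lane" alternative

def systematic_utility(b, x):
    """Systematic component V: sum of coefficient * variable."""
    return float(np.dot(b, x))

v_change = systematic_utility(b, x_change)
v_stay = systematic_utility(b, x_stay)

# Random components: assumed i.i.d. standard Gumbel (assumption of this sketch).
rng = np.random.default_rng(0)
n_drivers = 20000
eps = rng.gumbel(size=(n_drivers, 2))
u_change = v_change + eps[:, 0]   # true utility of changing lane, per driver
u_stay = v_stay + eps[:, 1]       # true utility of staying, per driver

# Each simulated driver picks the alternative with the higher true utility.
share_change = float(np.mean(u_change > u_stay))
logit_share = float(1.0 / (1.0 + np.exp(-(v_change - v_stay))))
print(v_change, v_stay, round(share_change, 3), round(logit_share, 3))
```

Drivers with identical systematic utility still make different choices because of the random component, which is why only choice probabilities can be modeled.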
Linear and Logistic regression
▪ Linear regression
✓ Independent variables $1, x_1, x_2, \ldots, x_n$ are weighted by $\beta_0, \beta_1, \beta_2, \ldots, \beta_n$ to give the dependent variable Y (a real value)
✓ Y is a linear function of the independent variables
✓ Examples: speed of the vehicle, pricing of a house, etc.

▪ Logistic regression
✓ Independent variables $1, x_1, x_2, \ldots, x_n$ are weighted by $\beta_0, \beta_1, \beta_2, \ldots, \beta_n$ and passed through the sigmoid/logistic function to give the dependent variable Y (0 or 1, i.e. categories/groups)
✓ The log-odds, $\ln\frac{P(y=1)}{P(y=0)}$, is a linear function of the independent variables
✓ Examples: decision to change lane, risk of developing cancer, etc.
Logistic regression
▪ It is a classification technique that classifies/groups the given data
▪ It models the probability that an event occurs, so that each observation can be classified into categories or groups
▪ Dependent variable of logistic regression: $Y \sim \text{Binomial}(n, p)$ with n = 1, that is, $Y \sim \text{Bernoulli}(p)$, where p is unknown
▪ Examples of logistic regression are predicting whether an individual:
✓ will pass a given exam (given the number of hours studied, classes attended, etc.), or
✓ will develop cancer (given the age, gender, results of various tests, etc.), or
✓ will change lane (given the speed of his/her vehicle, the speed of the front vehicle, the clear front gap, etc.)
▪ Note: The Bernoulli distribution is a special case of the Binomial distribution where n = 1
Log-odds

▪ The success is “1” and the failure is “0”

▪ The odds are the chance of success divided by the chance of failure

$$\text{Odds} = \frac{P(y = 1)}{P(y = 0)} = \frac{\text{Probability of occurrence of an event}}{\text{Probability of non-occurrence of the same event}}$$

▪ The probability of success, P(y = 1), is denoted p, and the probability of failure, P(y = 0), is q = 1 - p

▪ $\text{Log-odds} = \ln(\text{odds}) = \ln\frac{p}{q} = \ln\frac{p}{1-p}$
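
As a quick numerical check of the definitions above, the following sketch computes the odds and log-odds for a few illustrative probabilities (the function names and the chosen values of p are assumptions for the example).

```python
import math

def odds(p):
    """Odds of success: p / (1 - p)."""
    return p / (1.0 - p)

def log_odds(p):
    """Natural log of the odds."""
    return math.log(odds(p))

for p in (0.2, 0.5, 0.8):
    print(p, round(odds(p), 3), round(log_odds(p), 3))
```

For p = 0.8 the odds are 4 (four successes per failure) and the log-odds are about 1.386; for p = 0.5 the odds are 1 and the log-odds are 0; p = 0.2 mirrors p = 0.8 with log-odds of about -1.386.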
Logit function

▪ In logistic regression, we estimate an unknown p for any given linear combination of independent variables (the estimate of p is $\hat{p}$)

▪ The link which connects the independent variables to the dependent variable (distributed Bernoulli(p)) is the logit function

▪ The logit function links the linear combination of variables, which can take any real value, to the Bernoulli probability p, which lies between 0 and 1

▪ Logit(p) is the natural log of the odds

$$\text{logit}(p) = \ln(\text{odds}) = \ln\frac{P(y = 1)}{P(y = 0)} = \ln\frac{p}{1-p} = \ln(p) - \ln(q)$$
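
The mapping described above can be illustrated with a short sketch: the inverse logit (sigmoid) takes any real value to a probability between 0 and 1, and the logit takes it back. The function names and the sample values of z are assumptions for the example.

```python
import math

def logit(p):
    """Natural log of the odds; defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inverse_logit(z):
    """Sigmoid/logistic function; maps any real z to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

for z in (-5.0, -1.0, 0.0, 1.0, 5.0):
    p = inverse_logit(z)
    # The last column recovers z, showing that logit undoes inverse_logit.
    print(z, round(p, 4), round(logit(p), 4))
```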
[Figure omitted. Image source: https://www.javatpoint.com/linear-regression-vs-logistic-regression-in-machine-learning]


[Figure omitted: comparison of logit(p) v/s inverse logit(p), with logit(p) in the domain of 0 to 1 and the base of the logarithm equal to e. Image source: https://math.stackexchange.com/questions/3816925/how-to-adjust-logit-functions-input-domain]


Estimated regression equation
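From the structure of logistic regression,

$$P(Y = 1 \mid X_1, X_2, \ldots, X_k) = F(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k)$$

$$\text{logit}(p) = \ln\frac{p}{1-p} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$$

Taking the antilog,

$$\frac{p}{1-p} = e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k}$$

$$p = (1 - p)\, e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k}$$

$$p + p\, e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k} = e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k}$$

Estimated regression equation:

$$\hat{p} = \frac{e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k}}{1 + e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k}}$$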

▪ If the calculated $\hat{p}$ is greater than the chosen threshold (default = 0.5), then Y = 1; otherwise, Y = 0
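
A minimal sketch of the estimated regression equation and the threshold rule above. The coefficient values and inputs are hypothetical, chosen only to show one observation on each side of the default threshold.

```python
import math

def predict_probability(beta, x):
    """p_hat = e^z / (1 + e^z), with z = beta[0] + beta[1]*x[0] + ... + beta[k]*x[k-1]."""
    z = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    return math.exp(z) / (1.0 + math.exp(z))

def classify(beta, x, threshold=0.5):
    """Return Y = 1 if p_hat exceeds the threshold, else Y = 0."""
    return 1 if predict_probability(beta, x) > threshold else 0

beta = [-1.0, 0.8, 0.5]   # hypothetical beta_0, beta_1, beta_2
x_a = [2.0, 1.0]          # z = 1.1, p_hat is about 0.75
x_b = [0.0, 0.0]          # z = -1.0, p_hat is about 0.27

print(round(predict_probability(beta, x_a), 3), classify(beta, x_a))
print(round(predict_probability(beta, x_b), 3), classify(beta, x_b))
```

Raising the threshold (for example `classify(beta, x_a, threshold=0.8)`) makes the rule more conservative about assigning Y = 1.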
Logistic regression coefficients

▪ In logistic regression, the regression coefficients are chosen to maximize the likelihood of the observed y for the given X, using the Maximum Likelihood Estimation (MLE) technique

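▪ A positive coefficient indicates a positive impact, but how should its size be interpreted?

▪ For example, the equations are of the form:

▪ For linear model: $y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$

▪ For logistic model: $\text{logit}(p) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$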
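
To make the maximum likelihood idea concrete, the sketch below fits the logistic model by plain gradient ascent on the log-likelihood. This is one simple way to reach the maximum likelihood estimate, not necessarily what any particular software package does; the synthetic data, the "true" coefficients (0.5 and 1.5), the learning rate and the iteration count are all assumptions of the example.

```python
import numpy as np

def add_intercept(X):
    """Prepend a column of ones so that beta[0] acts as the intercept."""
    return np.column_stack([np.ones(len(X)), X])

def log_likelihood(beta, X, y):
    """Bernoulli log-likelihood: sum of y*z - ln(1 + e^z), with z = X beta."""
    z = add_intercept(X) @ beta
    return float(np.sum(y * z - np.log1p(np.exp(z))))

def fit_logistic_mle(X, y, learning_rate=0.5, n_iter=3000):
    """Gradient ascent on the mean log-likelihood."""
    Xd = add_intercept(X)
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))              # current fitted probabilities
        beta += learning_rate * Xd.T @ (y - p) / len(y)   # step along the gradient
    return beta

# Synthetic data generated from a known logistic model.
rng = np.random.default_rng(1)
x = rng.normal(size=5000)
p_true = 1.0 / (1.0 + np.exp(-(0.5 + 1.5 * x)))
y = (rng.random(5000) < p_true).astype(float)

beta_hat = fit_logistic_mle(x, y)
print(np.round(beta_hat, 2), round(log_likelihood(beta_hat, x, y), 1))
```

The fitted coefficients should land near the values used to generate the data, and the log-likelihood at the fitted coefficients is higher than at the all-zero starting point.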

Linear regression coefficients
▪ For the linear model, where y is the marks scored in the exam, the fitted regression equation is
$y = 0.07 + 13\,(\text{Study\_hours}) - 4\,(\text{Male})$

▪ Interpretation: Keeping all other independent variables constant,
✓ A 1 unit increase in study hours results in an increase of 13 marks (here the gender variable is kept constant)
✓ For categorical variables like gender, males score 4 marks less than females in the class (here the study_hours variable is kept constant)
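
The interpretation above can be checked directly by evaluating the fitted equation; the study-hour values used here are arbitrary.

```python
def predicted_marks(study_hours, male):
    """Fitted linear equation: y = 0.07 + 13*Study_hours - 4*Male."""
    return 0.07 + 13 * study_hours - 4 * male

# One extra study hour, gender held constant.
effect_of_hour = predicted_marks(3, male=0) - predicted_marks(2, male=0)
# Male versus female, study hours held constant.
effect_of_male = predicted_marks(3, male=1) - predicted_marks(3, male=0)
print(round(effect_of_hour, 2), round(effect_of_male, 2))
```

The differences equal the coefficients themselves (13 and -4), whatever baseline values are chosen, which is what makes linear coefficients directly readable.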

Logistic regression coefficients

▪ For the logistic model, where y is the outcome of the driving test (Y = 1: pass in the test, Y = 0: fail in the test), the fitted regression equation is
$\text{logit}(p) = 0.07 + 0.13\,(\text{Practice\_hours}) - 1.82\,(\text{Female})$
($e^{0.13} = 1.14$, $e^{-1.82} = 0.16$)

▪ Interpretation: Keeping all other independent variables constant,
✓ For a 1 unit increase in practice hours (keeping the gender variable constant), the odds of passing the test increase by 14% ($e^{0.13} - 1 = 0.14$)
✓ For a categorical variable like gender (keeping the practice_hours variable constant), the odds of passing the test are 84% lower for females compared to males ($e^{-1.82} - 1 = 0.16 - 1 = -0.84$)
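
The odds interpretation above can be reproduced from the fitted coefficients. The sketch below uses the coefficients from the slide; the choice of 10 practice hours for the two hypothetical candidates is an assumption made only to have concrete probabilities to compare.

```python
import math

b0, b_practice, b_female = 0.07, 0.13, -1.82   # coefficients from the fitted equation

def pass_probability(practice_hours, female):
    """Inverse logit of the fitted linear predictor."""
    z = b0 + b_practice * practice_hours + b_female * female
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    return p / (1.0 - p)

# Percentage change in odds implied by each coefficient.
pct_per_hour = (math.exp(b_practice) - 1.0) * 100.0
pct_female = (math.exp(b_female) - 1.0) * 100.0

# Two hypothetical candidates with 10 practice hours each.
p_male = pass_probability(10, female=0)
p_female = pass_probability(10, female=1)
odds_ratio = odds(p_female) / odds(p_male)

print(round(pct_per_hour), round(pct_female))
print(round(p_male, 3), round(p_female, 3), round(odds_ratio, 2))
```

The ratio of the two candidates' odds equals $e^{-1.82} \approx 0.16$ regardless of the number of practice hours, whereas the difference in their pass probabilities does depend on it. This is why logistic coefficients are interpreted on the odds scale.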
