Causal AI For LQG - Ben Steiner - Apr 2023
4/17/2023
2021: CAUSAL TECHNIQUES RECOGNIZED
2022: CAUSAL TECHNIQUES HYPED
3. Appropriate expectations for investment management: Don’t expect a crystal ball inside a black box
If prediction accuracy is the goal, there are better ways to curve-fit (with enough data and no concept drift)
For investment management specifically: benefits include communication and collaboration to accelerate innovation
More generally: causal methods can unify domain experts and machine learning
ROADMAP
1. Intro
• What do we mean by causality?
• Introduction to causal AI
2. Foundations
• Causal Graphs
• Structural Causal Models
• Causal Discovery
• Causal Inference
3. Examples
• Examples from capital markets and asset management
4. Final Thoughts
• Assumptions, Weaknesses & Limitations
• What are good and bad uses of causal approaches?
Shark Attacks
Figure source: Judea Pearl and Dana Mackenzie, The Book of Why
INTRODUCTION TO CAUSAL AI
CORRELATION VERSUS CAUSATION
“Correlation is not causation”
But that's the problem... because sometimes it is!
The challenge is knowing when it is, and when it isn't.
CAUSAL DRIVERS ARE A SUBSET OF CORRELATED ASSOCIATIONS
1. Of all the correlated factors in a model…
2. … only some are true causal drivers.
3. Extra causal drivers can be discovered.
4. (a) Differentiate between causal and correlated. (b) Discover more causal drivers.
X CAN BE CORRELATED WITH Y… BUT NOT CAUSAL
Correlation can be the same in these 5 cases:
• X → Y: X is a true causal driver of Y; changes in X lead to changes in Y. Use X to model Y.
• Y → X: Now the other way around, Y is a driver of X. Do not use X to model Y.
• X ← C → Y: A 3rd variable is the driver of both of them. Do not use X to model Y.
• X, Y (no causal link): No causal relationship, but association detected in limited sample size. Do not use X to model Y.
• X → M → Y: X is the driver of a 3rd variable, which is the driver of Y. X could be used to model Y…
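The point that different causal structures can produce the same correlation can be checked with a small simulation. The sketch below is illustrative only: the linear Gaussian structures, the noise scales, and the names `simulate` and `pearson` are assumptions chosen so that each structure gives a correlation near 0.71. The "limited sample size" case is left out because it is sampling noise, not a structure.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(case, n=20000, seed=0):
    """Draw n (x, y) pairs from one of four hypothetical linear structures."""
    rng = random.Random(seed)
    s_conf = math.sqrt(math.sqrt(2) - 1)  # noise scale for the confounder case
    s_med = math.sqrt(0.5)                # noise scale for the mediator case
    xs, ys = [], []
    for _ in range(n):
        if case == "x_causes_y":      # X -> Y
            x = rng.gauss(0, 1)
            y = x + rng.gauss(0, 1)
        elif case == "y_causes_x":    # Y -> X
            y = rng.gauss(0, 1)
            x = y + rng.gauss(0, 1)
        elif case == "confounder":    # X <- C -> Y
            c = rng.gauss(0, 1)
            x = c + rng.gauss(0, s_conf)
            y = c + rng.gauss(0, s_conf)
        elif case == "mediator":      # X -> M -> Y
            x = rng.gauss(0, 1)
            m = x + rng.gauss(0, s_med)
            y = m + rng.gauss(0, s_med)
        else:
            raise ValueError(f"unknown case: {case}")
        xs.append(x)
        ys.append(y)
    return xs, ys


CASES = ["x_causes_y", "y_causes_x", "confounder", "mediator"]
correlations = {case: pearson(*simulate(case)) for case in CASES}
for case, r in correlations.items():
    print(f"{case:12s} corr(X, Y) = {r:.2f}")
```

All four printed correlations are close to each other, so the correlation alone cannot tell the structures apart.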
POTENTIAL FOR BOTH ACCURACY AND EXPLAINABILITY
Causal AI:
• Causal Discovery algorithms: data used to uncover true causal drivers
• Human Guided Research: humans can see models, interact with them & modify them
ROADMAP
1. Intro
• What do we mean by causal AI?
• Introduction to causal AI
2. Foundations
• Causal Graphs
• Structural Causal Models
• Causal Discovery
• Causal Inference
3. Examples
• Examples from capital markets and asset management
4. Final Thoughts
• Assumptions, Weaknesses & Limitations
• Value add for investment management
Causal Graphs
Causal Discovery
Causal Inference
Counterfactuals
‘What-if’
Algorithmic recourse for explainability
Counterfactuals for model validation
CAUSAL GRAPHS
[Figure: example causal graphs relating X and Y, including graphs with a third variable M or C, and one with Y, X1, and X2]
Example
A negative correlation is observed between Y and X5, but X5 is not a causal driver of Y.
Y is directly caused by X3 and X4 only.
[Figure: graph of X1, Y, and X2 with Y = f(X1, X2); edge labels ‘Strictly -ve’ and ‘Neither’]
• Causal Discovery (data & ML only)
• Human Experts
• Human / ML iteration
• Orthogonal Factor Discovery
• Model Validation
FOUNDATIONAL CONCEPTS:
1. STRUCTURAL CAUSAL MODELS (SCM)
2. CAUSAL DISCOVERY
3. CAUSAL INFERENCE
CAUSAL GRAPH (DAG) AND STRUCTURAL CAUSAL MODEL (SCM)
Directed Acyclic Graph (DAG) is the name for a graph with directed edges (arrows) used to describe cause and effect.
An SCM {U,V,F} is fully described by exogenous variables, U, endogenous variables, V, and a set of functions, F. The set of functions, F, assigns values to variables in V based on other variables in the model.
Key point: DAGs and SCMs are alternative ways to represent causal relationships in a system
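A minimal sketch of how an SCM {U,V,F} might look in code, assuming a hypothetical three-variable system (temperature Y driving ice cream sales X and shark attacks Z). The coefficients, the names `F`, `ORDER`, and `sample`, and the Gaussian noise are illustrative assumptions, not taken from the slides.

```python
import random

# Hypothetical SCM: Y (temperature) drives X (ice cream sold) and Z (shark attacks).
# F maps each endogenous variable in V to a function of the values computed
# so far (v) and that variable's own exogenous noise term (u, one element of U).
F = {
    "Y": lambda v, u: 20.0 + 5.0 * u,
    "X": lambda v, u: 2.0 * v["Y"] + u,
    "Z": lambda v, u: 0.1 * v["Y"] + 0.5 * u,
}
ORDER = ["Y", "X", "Z"]  # a topological order of the DAG: parents come first


def sample(rng, do=None):
    """Draw one joint sample; `do` optionally forces variables to fixed values."""
    do = do or {}
    v = {}
    for name in ORDER:
        u = rng.gauss(0, 1)  # exogenous noise is drawn for every variable
        # An intervention replaces the structural function with a constant.
        v[name] = do[name] if name in do else F[name](v, u)
    return v


observed = sample(random.Random(1))
intervened = sample(random.Random(1), do={"X": 100.0})
print("observed:  ", observed)
print("do(X=100): ", intervened)
```

Forcing X leaves Y and Z unchanged here because no arrow leaves X in this hypothetical graph, which is the information the DAG encodes and the functions F implement.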
TRADITIONAL ALGEBRA COULD NOT HANDLE CAUSAL QUESTIONS
2. Did the new tax break cause sales to go up? Or the new marketing campaign?
Y = aX vs. Y ← aX
NEW MATHEMATICAL LANGUAGE TO HANDLE ‘BECAUSE’ NOT ‘WHEN’
As an example, assume X and Y are correlated. Further, assume that one causes the other (but we don’t know which).
How can we express the distribution of Y (our target)? Answer: It depends on whether Y is caused by X (or not).
Traditional mathematical notation (conditional probability) does not differentiate between ‘X causes Y’ and ‘Y causes X’.
Both factorizations of the joint distribution are equally possible, P(X, Y) = P(Y | X) P(X) = P(X | Y) P(Y), so the notation is unable to express the direction of causality.
In contrast: “do-calculus” allows two different distributions after intervening on X (applying the “do” operator).
If we intervene on X, fixing it to x, only one of equations [1] or [2] below will be correct, depending on the causal relationship:
● If X causes Y, X → Y, then intervening on X changes the conditional distribution of Y:
[1] P(Y | do(X = x)) = P(Y | X = x)
● If Y causes X, Y → X, then intervening on X does not change the distribution of Y:
[2] P(Y | do(X = x)) = P(Y)
The RHS of [1] and [2] are different.
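The effect of the “do” operator can be illustrated by simulation. The sketch below assumes a two-variable linear model with unit Gaussian noise and no confounders. The function name `mean_y_under_do` and the coefficients are hypothetical. Forcing X moves the average of Y only when the arrow runs from X to Y.

```python
import random


def mean_y_under_do(direction, x_value, n=20000, seed=0):
    """Average Y after forcing X = x_value in a two-variable linear model."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        if direction == "x_to_y":
            x = x_value                  # do(X = x): X is set, not sampled
            y = x + rng.gauss(0, 1)      # Y still responds to X
        elif direction == "y_to_x":
            y = rng.gauss(0, 1)          # Y is generated without reference to X
            x = x_value                  # do(X = x) cuts the arrow Y -> X
        else:
            raise ValueError(f"unknown direction: {direction}")
        total += y
    return total / n


for direction in ("x_to_y", "y_to_x"):
    print(direction, round(mean_y_under_do(direction, 2.0), 2))
```

Under X → Y the average of Y follows the forced value of X. Under Y → X it stays near the unconditional mean of Y, even though X and Y are correlated in observational data in both cases.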
THREE APPROACHES TO CAUSAL DISCOVERY
CAUSAL DISCOVERY FROM DATA: MANY DIFFERENT APPROACHES
CONDITIONAL INDEPENDENCE
Stats 101: X and Z are independent if P(X) = P(X | Z), or alternatively, P(X∩Z) = P(X)*P(Z)
Intuition: If we know Y already, then knowing X tells us nothing more about Z (nor Z about X)
Example:
• Ice cream sold (X) & shark attacks (Z) are not independent. They are both higher when it's hot (Y)
• However, if we know the temperature (‘condition on Y’) then X and Z are conditionally independent
[Figure: graph of Y, X, and Z; X and Z are conditionally independent given Y]
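The ice cream and shark attack example can be reproduced with simulated data. This is a sketch under assumed linear Gaussian relationships: the coefficients, the standardized temperature scale, and the band width of 0.1 are all hypothetical. "Conditioning on Y" is approximated by keeping only the samples whose temperature falls in a narrow band.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)
ice_cream, sharks = [], []      # all samples
band_ice, band_sharks = [], []  # samples with temperature held (almost) fixed
for _ in range(50000):
    temp = rng.gauss(0, 1)      # Y: standardized temperature
    x = temp + rng.gauss(0, 1)  # X: ice cream sold, driven by temperature
    z = temp + rng.gauss(0, 1)  # Z: shark attacks, driven by temperature
    ice_cream.append(x)
    sharks.append(z)
    if abs(temp) < 0.1:         # 'condition on Y': keep a narrow temperature band
        band_ice.append(x)
        band_sharks.append(z)

marginal = pearson(ice_cream, sharks)
conditional = pearson(band_ice, band_sharks)
print(f"corr(X, Z)            = {marginal:.2f}")
print(f"corr(X, Z | Y near 0) = {conditional:.2f}")
```

The first correlation is clearly positive while the second is close to zero. A near-zero correlation only indicates independence under the linear Gaussian assumption made here.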
CAUSAL DISCOVERY FROM DATA: EQUIVALENCE CLASSES
CONDITIONAL INDEPENDENCIES IN DAGS
[Figure: two DAGs over X, Y, and Z]
MARKOV EQUIVALENCE CLASS:
[Figure: three DAGs over X, Y, and Z]
X is independent of Z given Y:
X → Y → Z
X ← Y ← Z
X ← Y → Z
X is not independent of Z given Y:
X → Y ← Z
Most causal discovery algorithms work by performing conditional independence tests to identify the Markov equivalence class.
In a ‘collider’ relationship (X → Y ← Z), X and Z are not conditionally independent given Y.
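A rough sketch of how conditional independence tests separate a collider from the chain and fork structures, assuming linear Gaussian data so that a partial correlation can stand in for a conditional independence test. The function names, the fixed threshold of 0.05, and the simulated structures are illustrative assumptions, not a specific published algorithm.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(xs, zs, ys):
    """First-order partial correlation of X and Z given Y."""
    r_xz, r_xy, r_zy = pearson(xs, zs), pearson(xs, ys), pearson(zs, ys)
    return (r_xz - r_xy * r_zy) / math.sqrt((1 - r_xy ** 2) * (1 - r_zy ** 2))


def simulate(structure, n=20000, seed=0):
    """Draw samples of (X, Y, Z) from a hypothetical linear structure."""
    rng = random.Random(seed)
    xs, ys, zs = [], [], []
    for _ in range(n):
        if structure == "chain":       # X -> Y -> Z
            x = rng.gauss(0, 1)
            y = x + rng.gauss(0, 1)
            z = y + rng.gauss(0, 1)
        elif structure == "fork":      # X <- Y -> Z
            y = rng.gauss(0, 1)
            x = y + rng.gauss(0, 1)
            z = y + rng.gauss(0, 1)
        elif structure == "collider":  # X -> Y <- Z
            x = rng.gauss(0, 1)
            z = rng.gauss(0, 1)
            y = x + z + rng.gauss(0, 1)
        else:
            raise ValueError(f"unknown structure: {structure}")
        xs.append(x)
        ys.append(y)
        zs.append(z)
    return xs, ys, zs


def classify(xs, ys, zs, threshold=0.05):
    """Label the triple X - Y - Z from one marginal and one conditional test."""
    dependent = abs(pearson(xs, zs)) > threshold
    dependent_given_y = abs(partial_corr(xs, zs, ys)) > threshold
    if dependent and not dependent_given_y:
        return "chain or fork"  # same Markov equivalence class
    if not dependent and dependent_given_y:
        return "collider"
    return "undetermined"


for structure in ("chain", "fork", "collider"):
    print(structure, "->", classify(*simulate(structure)))
```

The chain and the fork receive the same label because they imply the same conditional independencies, so these tests alone cannot orient their edges. Only the collider leaves a distinct signature.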
CAUSAL DISCOVERY FROM DATA: EXAMPLE ALGORITHMS
CAUSAL GRAPHS INCORPORATING HUMAN DOMAIN KNOWLEDGE
Market experts can define the causal drivers and relationships according to their beliefs. These are encoded as constraints by the machine, which calibrates edge functions based on data and constraints.
Step 1: Human can specify the drivers of the target
• This node is a direct parent of the target
• This node is not a parent of the target
Step 2: Human can constrain edges between nodes
• The relationship between two nodes is strictly positive (or negative)
• The relationship between two nodes is piecewise linear
CAUSAL MODEL: CALIBRATION OF EDGE FUNCTIONS FROM DATA
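As a minimal sketch of calibrating one edge function under a human-specified sign constraint, the example below fits a single linear edge by least squares and clips the slope to a lower bound. It assumes one parent, a linear edge, and a non-negativity bound standing in for "strictly positive". The name `fit_edge` and the toy data are hypothetical, and this is not the calibration method used in the presentation.

```python
def fit_edge(xs, ys, min_slope=0.0):
    """Least-squares fit of y = intercept + slope * x with slope >= min_slope.

    With a single parent the constrained optimum is the ordinary least-squares
    slope clipped to the bound, with the intercept refit for that slope.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = max(min_slope, sxy / sxx)  # apply the expert's sign constraint
    intercept = my - slope * mx
    return slope, intercept


# Data that agree with the expert's belief: the constraint is not binding.
print(fit_edge([1, 2, 3, 4], [2, 4, 6, 8]))
# Data that contradict the belief: the slope is held at the bound.
print(fit_edge([1, 2, 3, 4], [8, 6, 4, 2]))
```

When the data disagree with the constraint, the fitted edge sits on the boundary, which is a useful prompt for the human / machine iteration described above.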
ROADMAP
1. Intro
• What do we mean by causal AI?
• Introduction to causal AI
2. Foundations
• Causal Graphs
• Structural Causal Models
• Causal Discovery
• Causal Inference
3. Examples
• Examples from capital markets and asset management
4. Final Thoughts
• Assumptions, Weaknesses & Limitations
• Value for investment management
PERFORMANCE ATTRIBUTION
UNDERSTAND WHY PERFORMANCE OCCURS, NOT JUST WHEN
An explainable framework to understand performance more precisely:
Supplement traditional performance attribution,
Investigate the source of a manager’s performance,
Monitor if a manager is doing what they claim to do.
• Combine macroeconomic factors with security-specific and trade-specific factors
• Find causal drivers, beyond linear models, to gain a deeper understanding of performance
• Explainable and transparent causal attribution
PORTFOLIO CONSTRUCTION
SUPERIOR PORTFOLIO CONSTRUCTION AT LOWER TURNOVER
Traditional methods require accurate correlation forecasts (of the future)
• Markowitz Mean-Variance uses the covariance matrix (& correlations)
• Hierarchical Risk Parity clusters on a distance measure (using correlations)
• Correlations: risk of overfitting to historical data, & unable to capture the asymmetry of causal drivers
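To illustrate why these methods depend on accurate correlation forecasts, the sketch below computes the two-asset minimum-variance weight and a correlation-based distance for a range of correlation estimates. The volatilities and correlation values are hypothetical inputs. The distance formula is the one commonly used with Hierarchical Risk Parity and is assumed here, since the slides do not state which distance measure is meant.

```python
import math


def min_variance_weight(sigma1, sigma2, rho):
    """Weight on asset 1 in the two-asset minimum-variance portfolio."""
    cov = rho * sigma1 * sigma2
    return (sigma2 ** 2 - cov) / (sigma1 ** 2 + sigma2 ** 2 - 2 * cov)


def hrp_distance(rho):
    """Correlation-based distance, assumed to be sqrt(0.5 * (1 - rho))."""
    return math.sqrt(0.5 * (1.0 - rho))


# Hypothetical volatilities; only the correlation estimate changes.
for rho in (0.0, 0.3, 0.6):
    w1 = min_variance_weight(0.20, 0.30, rho)
    print(f"rho={rho:.1f}  weight on asset 1={w1:.3f}  distance={hrp_distance(rho):.3f}")
```

The weight on the first asset moves materially as the correlation estimate changes, so an unstable correlation forecast translates directly into portfolio turnover.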
Using causal relationships, we find more stable relationships to improve traditional methods
Markowitz Mean-Variance -> Causal Quadratic Optimisation
Hierarchical Risk Parity -> Causal Hierarchical Risk Parity
Higher out-of-sample generalizability and lower turnover with similar performance
Solution
• Deliver key propensity indicators and recommend ‘next-best-actions’
• Trace changes in client behavior to their root-cause drivers (fees, discounts, client engagement, etc.)
• Causal model is explainable and fully compliant with banks' model risk guidelines and AI regulations
Benefits
• Proactive client engagement with early warnings for relationship managers
• Lower cost to serve existing clients and improved scalability
• Models are inherently explainable, so non-technical users can interrogate and trust them
ROADMAP
1. Intro
• What do we mean by causal AI?
• Introduction to causal AI
2. Foundations
• Causal Graphs
• Structural Causal Models
• Causal Discovery
• Causal Inference
3. Examples
• Examples from capital markets and asset management
4. Final Thoughts
• Assumptions, Weaknesses & Limitations
• Value for investment management
QUESTIONS?
BEN STEINER
LINKEDIN.COM/IN/STEINERBEN/
BS3283@COLUMBIA.EDU