
A Meta-Transfer Objective for Learning to Disentangle

Causal Mechanisms

Krishna Prasad Neupane

kpn3569@rit.edu

July 11, 2020



Observational and Causal Concepts

Key point:
An observational or associational concept is any relationship that can be
defined in terms of a joint distribution of observed variables, and a causal
concept is any relationship that cannot be defined from the distribution
alone. [a]

[a] https://ftp.cs.ucla.edu/pub/stat_ser/r350.pdf

Observational p(y|x): What is the distribution of Y given that we observe
that variable X takes value x? It is a conditional distribution, which can
be computed as p(y|x) = p(x, y) / p(x).

Interventional p(y|do(x)): What is the distribution of Y if we set the value
of X to x? This describes the distribution of Y we would observe if we
intervened in the data-generating process by artificially forcing the
variable X to take value x, but otherwise simulating the rest of the
variables according to the original process that generated the data
(a toy simulation contrasting the two follows below). [1]

[1] https://www.inference.vc
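As a toy illustration (not from the slides), the following sketch contrasts the two quantities in a made-up linear-Gaussian SCM in which Y causes X, echoing one of the scripts from the blog example: conditioning on X = 3 selects samples and shifts our belief about Y, while do(X = 3) re-runs the generative process with X forced to 3 and leaves Y untouched.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

def sample(do_x=None):
    """Toy SCM in which Y causes X:  Y := N(0, 1),  X := Y + N(0, 1)."""
    y = rng.normal(0.0, 1.0, N)
    x = (y + rng.normal(0.0, 1.0, N)) if do_x is None else np.full(N, do_x)
    return x, y

# Observational p(y | x ~ 3): select samples whose observed X is near 3.
x, y = sample()
y_cond = y[np.abs(x - 3.0) < 0.1]

# Interventional p(y | do(x=3)): force X = 3, leave the mechanism for Y untouched.
_, y_do = sample(do_x=3.0)

print(y_cond.mean())  # ~ 1.5: observing X = 3 is evidence about its cause Y
print(y_do.mean())    # ~ 0.0: forcing X has no effect on its cause Y
```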
Example

Figure: Three different scripts and their corresponding joint distributions. [2]

[2] https://www.inference.vc
Example

Figure: Joint distribution plot after intervention and after conditioning at x = 3. [3]

Note: The conditional distribution p(y|x) and the interventional (causal)
distribution p(y|do(x)) are distinct.

[3] https://www.inference.vc
Summary of the Paper

To meta-learn causal structures based on how fast a learner adapts to new
distributions arising from sparse distributional changes. (Meta-structure)
Based on the assumption of a small change in the right knowledge
representation space, the paper defines a meta-learning objective that
measures the speed of adaptation. (Out-of-distribution)
By optimizing for fast transfer and adaptation, the paper recovers a good
approximation of the true causal decomposition into independent
mechanisms. (Disentangle)

Main Idea
If we have the right knowledge representation, then we should get fast
adaptation to the transfer distribution when starting from a model that is
well trained on the training distribution.



Which is Cause and Which is Effect?
The problem of determining whether variable A causes variable B or vice
versa.
Compare the two hypotheses (A → B vs. B → A) in terms of how fast the
corresponding models adapt to a transfer distribution (a sketch of this
comparison follows the figure below).

Figure: We see that the correct causal model adapts faster (smaller regret),
and that the most informative part of the trajectory (where the two models
generalize the most differently) is in the first 10-20 examples.
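A minimal sketch (not the authors' code) of this comparison, assuming discrete variables with N categories and the joint modelled either as p(A)p(B|A) or as p(B)p(A|B). Pretraining on the training distribution is omitted and the transfer batches below are random placeholders; only the few-step adaptation and online-likelihood scoring is shown.

```python
import torch
import torch.nn.functional as F

N = 10                     # number of categories per variable (assumed for the sketch)
torch.manual_seed(0)

class Hypothesis(torch.nn.Module):
    """One causal hypothesis: categorical tables for p(cause) and p(effect | cause)."""
    def __init__(self):
        super().__init__()
        self.marg = torch.nn.Parameter(torch.zeros(N))     # logits of p(cause)
        self.cond = torch.nn.Parameter(torch.zeros(N, N))  # logits of p(effect | cause)

    def log_prob(self, cause, effect):
        lp_c = F.log_softmax(self.marg, dim=0)[cause]
        lp_e = F.log_softmax(self.cond, dim=1)[cause, effect]
        return (lp_c + lp_e).sum()

def adapt_and_score(model, batches, cause_first, lr=0.1):
    """Adapt with a few gradient steps on transfer data; return the accumulated
    online log-likelihood (higher = faster adaptation)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    online_ll = 0.0
    for a, b in batches:
        cause, effect = (a, b) if cause_first else (b, a)
        ll = model.log_prob(cause, effect)
        online_ll += ll.item()
        opt.zero_grad()
        (-ll).backward()
        opt.step()
    return online_ll

# Placeholder transfer batches of (A, B) pairs; in the paper they come from an
# intervened SCM, and both models are first pretrained on the training distribution.
batches = [(torch.randint(N, (64,)), torch.randint(N, (64,))) for _ in range(20)]
ll_ab = adapt_and_score(Hypothesis(), batches, cause_first=True)    # hypothesis A -> B
ll_ba = adapt_and_score(Hypothesis(), batches, cause_first=False)   # hypothesis B -> A
print("prefer A -> B" if ll_ab > ll_ba else "prefer B -> A")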
Experiments on Adaptation to the Transfer Distribution
In this section, the paper uses only a few gradient updates on a small set
of data coming from a different but related distribution.
Experimental comparison of the learning curves of the correct vs. incorrect
causal model.
Adaptation with only a few gradient steps on data coming from a different,
but related, transfer distribution is critical for obtaining a signal that
the meta-learning algorithm can leverage.

Figure: Train (red) and transfer (green and blue) samples from an SCM for
the joint distribution of A and B.
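A sketch of how such train and transfer samples could be generated, assuming (as in the paper's bivariate setup) that the transfer distribution arises from an intervention that changes the marginal of the cause A while the mechanism p(B|A) stays fixed. The categorical distributions below are illustrative placeholders, not the paper's actual settings.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10   # number of categories per variable (illustrative)

# Fixed causal mechanism p(B | A): one categorical distribution per value of A.
p_b_given_a = rng.dirichlet(np.ones(N), size=N)

def sample_pairs(p_a, n):
    """Draw (A, B) pairs: A ~ p_a, then B ~ p(B | A); the mechanism never changes."""
    a = rng.choice(N, size=n, p=p_a)
    b = np.array([rng.choice(N, p=p_b_given_a[ai]) for ai in a])
    return a, b

p_a_train = rng.dirichlet(np.ones(N))      # training marginal of the cause A
p_a_transfer = rng.dirichlet(np.ones(N))   # intervention on A: new marginal, same mechanism

a_train, b_train = sample_pairs(p_a_train, 10_000)        # analogous to the red samples
a_transfer, b_transfer = sample_pairs(p_a_transfer, 500)  # analogous to the green/blue samples
```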



Parameter Counting Argument

This argument helps to explain what we observe in Figure 1.


Proposition 1: For modules that were correctly learned on the training
distribution and whose ground-truth conditional distribution did not change
under the transfer distribution, the parameters are already at a maximum of
the log-likelihood on the transfer distribution.
Proposition 2: The gradient of the meta-objective with respect to the
structural parameter is driven by the difference between the log-likelihoods
of the two hypotheses on the transfer data (see the derivation sketched
below).
Proposition 3: Stochastic gradient descent (with an appropriately decreasing
learning rate) on the expected meta-objective over transfer data makes the
sigmoid of the structural parameter converge to 1 for the correct causal
hypothesis and to 0 for the incorrect one.
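A sketch of the calculation behind Propositions 2 and 3, assuming (as in the paper) a smooth parameterization of the structure by a scalar γ, where σ(γ) is the probability assigned to the hypothesis A → B, and taking the meta-transfer objective to be the regret on transfer data D:

```latex
% Regret: negative log of the sigma(gamma)-weighted mixture of the two hypotheses'
% likelihoods L_{A->B}(D) and L_{B->A}(D) on the transfer data D.
\mathcal{R}(\gamma) = -\log\!\Big[\sigma(\gamma)\, L_{A\to B}(D)
                                  + \big(1-\sigma(\gamma)\big)\, L_{B\to A}(D)\Big]

% Differentiating with respect to the structural parameter gamma:
\frac{\partial \mathcal{R}}{\partial \gamma}
  = \sigma(\gamma) - P\big(A\to B \mid D\big)
  = \sigma(\gamma) - \sigma\!\big(\gamma + \log L_{A\to B}(D) - \log L_{B\to A}(D)\big)
```

So the gradient is governed by the difference between the two hypotheses' log-likelihoods on the transfer data, and stochastic gradient descent pushes σ(γ) towards 1 when A → B explains the transfer data better and towards 0 otherwise, which is the convergence behaviour stated in Proposition 3.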



Experimental Results

Convergence result from Proposition 3: learning the structural parameter in
a bivariate model.
MLPs parametrize the conditional distributions used to decide whether one
variable is a direct causal parent of another (a minimal sketch of the
structural-parameter update follows this list).
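A minimal sketch (not the authors' implementation) of the resulting structural-parameter update, reusing the regret from the derivation above. The two log-likelihood values are random placeholders standing in for the online log-likelihoods of the two adapted models.

```python
import torch
import torch.nn.functional as F

gamma = torch.zeros((), requires_grad=True)   # structural parameter; sigmoid(gamma) = P(A -> B)
opt = torch.optim.SGD([gamma], lr=0.5)

for episode in range(100):
    # Random placeholders for the online log-likelihoods of the two hypotheses on one
    # transfer episode; in the paper these come from adapting the two models for a few steps.
    log_l_ab = -100.0 + 5.0 * torch.rand(())   # hypothesis A -> B (fits better on average here)
    log_l_ba = -105.0 + 5.0 * torch.rand(())   # hypothesis B -> A

    # Regret: negative log of the sigmoid(gamma)-weighted mixture of the two likelihoods,
    # computed in log-space for numerical stability.
    regret = -torch.logsumexp(torch.stack([F.logsigmoid(gamma) + log_l_ab,
                                           F.logsigmoid(-gamma) + log_l_ba]), dim=0)
    opt.zero_grad()
    regret.backward()
    opt.step()

print(torch.sigmoid(gamma).item())   # drifts towards 1, i.e. towards the A -> B structure
```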

Figure: Learning the structural parameter, and the cross-entropy between the
ground-truth SCM structure and the learned SCM structure.



Representation Learning

In many realistic scenarios, a learning agent does not observe the true
causal variables directly but only sensory-level data, like pixels and
sounds; the assumption is that, in the right representation space, the
correct causal graph is sparsely connected.
To tackle this, the paper follows the deep-learning objective of
disentangling the underlying causal variables, learning a representation in
which these properties hold.
The learner must map its raw observations to a hidden representation space
H via an encoder E. The encoder is trained such that the hidden space H
helps to optimize the meta-transfer objective (a minimal sketch follows).
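A minimal sketch, not the authors' implementation: a hypothetical linear encoder E maps raw observations to a two-dimensional latent (A, B), simple Gaussian conditional models stand in for the two hypotheses, and a single regret step of the same form as on the earlier slides sends gradients through H into the encoder. In the paper the likelihoods are accumulated over adaptation episodes rather than a single batch, and the encoder for the bivariate case is a simple rotation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Hypothetical encoder E: maps raw observations to a 2-D latent space H = (A, B)."""
    def __init__(self, obs_dim=2):
        super().__init__()
        self.net = nn.Linear(obs_dim, 2, bias=False)

    def forward(self, x):
        return self.net(x)

def gaussian_ll(params, target):
    """Log-likelihood of `target` under a Gaussian whose mean/log-std come from `params`."""
    mean, log_std = params.chunk(2, dim=-1)
    return torch.distributions.Normal(mean, log_std.exp()).log_prob(target).sum()

encoder = Encoder()
model_ab = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 2))  # models p(B | A)
model_ba = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 2))  # models p(A | B)
gamma = torch.zeros((), requires_grad=True)                              # structural parameter

params = (list(encoder.parameters()) + list(model_ab.parameters())
          + list(model_ba.parameters()) + [gamma])
opt = torch.optim.Adam(params, lr=1e-3)

x = torch.randn(64, 2)                    # placeholder raw observations from a transfer batch
h = encoder(x)                            # hidden representation H
a, b = h[:, :1], h[:, 1:]
log_l_ab = gaussian_ll(model_ab(a), b)    # log-likelihood under hypothesis A -> B
log_l_ba = gaussian_ll(model_ba(b), a)    # log-likelihood under hypothesis B -> A

# Regret as before; its gradient flows through H, so the encoder is trained
# to make fast adaptation (and hence structure identification) possible.
regret = -torch.logsumexp(torch.stack([F.logsigmoid(gamma) + log_l_ab,
                                       F.logsigmoid(-gamma) + log_l_ba]), dim=0)
opt.zero_grad()
regret.backward()
opt.step()
```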





The End

