A Meta-Transfer Objective for Learning to Disentangle Causal Mechanisms
kpn3569@rit.edu
Key point:
An observational or associational concept is any relationship that can be
defined in terms of a joint distribution of observed variables, and a causal
concept is any relationship that cannot be defined from the distribution
alone. [a]

[a] https://ftp.cs.ucla.edu/pub/stat_ser/r350.pdf
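The key point can be made concrete with a small sketch (not from the slides; the SCM parameters are illustrative assumptions): two different generating scripts, A → B and B → A, induce exactly the same joint Gaussian over (A, B), so no amount of observational data alone can distinguish the causal direction.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Script 1: A causes B.  A ~ N(0, 1), B = A + N(0, 1).
a1 = rng.normal(0.0, 1.0, n)
b1 = a1 + rng.normal(0.0, 1.0, n)

# Script 2: B causes A.  B ~ N(0, 2), A = B/2 + N(0, 1/2).
b2 = rng.normal(0.0, np.sqrt(2.0), n)
a2 = 0.5 * b2 + rng.normal(0.0, np.sqrt(0.5), n)

# Both scripts induce the same joint Gaussian over (A, B):
# zero means, Var(A) = 1, Var(B) = 2, Cov(A, B) = 1.
for a, b in [(a1, b1), (a2, b2)]:
    print(np.var(a), np.var(b), np.cov(a, b)[0, 1])
```

Any observational query (the joint, marginals, conditionals) agrees between the two scripts; only interventions separate them.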
Figure: Three different scripts and their corresponding joint distributions. [2]

[2] https://www.inference.vc
Krishna Prasad Neupane (RIT) Causal Mechanisms July 11, 2020 3 / 12
Example
Figure: Joint distribution plot after intervention and after conditioning at x=3. [3]

[3] https://www.inference.vc
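A hedged sketch of the distinction the figure illustrates (the confounded SCM below is an illustrative assumption, not necessarily the figure's exact model): conditioning on X = 3 selects samples and thereby shifts our beliefs about X's causes, while the intervention do(X = 3) overrides X without touching them, so the two yield different distributions for Y.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

# Hypothetical SCM with a confounder: Z -> X and Z -> Y.
z = rng.normal(0.0, 1.0, n)
x = z + 0.5 * rng.normal(size=n)
y = z + 0.5 * rng.normal(size=n)

# Conditioning: E[Y | X ~ 3] -- select samples where X is near 3.
# Seeing a large X makes a large Z (and hence a large Y) more likely.
mask = np.abs(x - 3.0) < 0.1
cond_mean = y[mask].mean()

# Intervention do(X = 3): set X by fiat; Z, and hence Y, is unaffected.
y_do = z + 0.5 * rng.normal(size=n)
int_mean = y_do.mean()

print(cond_mean, int_mean)  # conditioning shifts Y upward; intervening does not
```

Here E[Y | X = 3] ≈ 2.4 while E[Y | do(X = 3)] = 0, which is exactly the observational-vs-causal gap from the key point above.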
Summary of the Paper
Main Idea
If we have the right knowledge representation, then we should get fast
adaptation to the transfer distribution when starting from a model that is
well trained on the training distribution.
Figure: We see that the correct causal model adapts faster (smaller regret),
and that the most informative part of the trajectory (where the two models
generalize the most differently) is in the first 10-20 examples.
Experiments on Adaptation to the Transfer Distribution
In this section, the paper uses only a few gradient updates on a small
set of data coming from a different but related distribution.
Experimental comparison of the learning curves of the correct vs.
incorrect causal models.
Adaptation with only a few gradient steps on data coming from a
different, but related, transfer distribution is critical for obtaining a
signal that their meta-learning algorithm can leverage.
Figure: Train (red) and transfer (green and blue) samples from an SCM for joint
distribution of A and B.
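A minimal sketch of why the causal direction matters for adaptation (a linear-Gaussian SCM chosen for illustration, not the paper's exact setup): after an intervention on the cause A, the conditional p(B|A) is unchanged, so a model factorized in the causal direction p(A)p(B|A) only needs to re-fit the marginal, whereas in the anticausal factorization p(B)p(A|B) the conditional itself must change.

```python
import numpy as np

rng = np.random.default_rng(2)

def sample(mu_a, n):
    # Hypothetical ground-truth SCM (A -> B): A ~ N(mu_a, 1), B = A + N(0, 0.5^2)
    a = rng.normal(mu_a, 1.0, n)
    return a, a + rng.normal(0.0, 0.5, n)

def linfit(x, y):
    # Slope and intercept of E[y | x], i.e. the Gaussian conditional's parameters.
    slope = np.cov(x, y)[0, 1] / np.var(x)
    return slope, y.mean() - slope * x.mean()

a_tr, b_tr = sample(0.0, 200_000)   # training distribution
a_tf, b_tf = sample(2.0, 200_000)   # transfer: intervention shifts the cause A

bgivena_tr = linfit(a_tr, b_tr)
bgivena_tf = linfit(a_tf, b_tf)
agivenb_tr = linfit(b_tr, a_tr)
agivenb_tf = linfit(b_tf, a_tf)

print("p(B|A) train/transfer:", bgivena_tr, bgivena_tf)   # invariant
print("p(A|B) train/transfer:", agivenb_tr, agivenb_tf)   # intercept shifts
```

Because the causal model's conditional is already correct on the transfer distribution, the few gradient steps above go entirely into the marginal, which is the asymmetry the meta-learning objective exploits.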
Many realistic scenarios involve learning agents that do not observe
the true causal variables directly but only sensory-level data, such as
pixels and sounds; the working assumption is that, over the right
variables, the correct causal graph is sparsely connected.
To tackle this, the paper follows the deep learning objective of
disentangling the underlying causal variables, learning a representation
in which these properties hold.
The learner must map its raw observations to a hidden representation
space H via an encoder E. The encoder is trained so that the hidden
space H optimizes the meta-transfer objective.
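A toy sketch of this idea, with a 2-D rotation standing in for the encoder E and an exhaustive grid search standing in for the paper's gradient-based training (both are simplifying assumptions): each candidate encoder is scored by how little the conditional p(H2|H1) has to change after an intervention on the cause, and the true mixing angle is recovered.

```python
import numpy as np

rng = np.random.default_rng(3)

def rot(t):
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

def sample(mu_a, n, theta_true):
    # Hypothetical SCM A -> B, observed only through a rotation ("pixels").
    a = rng.normal(mu_a, 1.0, n)
    b = a + rng.normal(0.0, 0.5, n)
    return rot(theta_true) @ np.vstack([a, b])

def conditional(h1, h2):
    # Parameters of the Gaussian conditional p(h2 | h1): slope and intercept.
    slope = np.cov(h1, h2)[0, 1] / np.var(h1)
    return slope, h2.mean() - slope * h1.mean()

theta_true = 0.7
x_train = sample(0.0, 100_000, theta_true)   # training distribution
x_trans = sample(2.0, 100_000, theta_true)   # intervention on the cause

def shift(theta):
    # Proxy meta-transfer objective: prefer the encoder under which
    # p(H2 | H1) stays invariant across the intervention, so that only
    # the marginal of H1 would need to adapt.
    h_tr = rot(-theta) @ x_train
    h_tf = rot(-theta) @ x_trans
    _, b_tr = conditional(h_tr[0], h_tr[1])
    _, b_tf = conditional(h_tf[0], h_tf[1])
    return abs(b_tf - b_tr)

grid = np.linspace(0.0, np.pi, 181)
best = grid[np.argmin([shift(t) for t in grid])]
print("true encoder angle:", theta_true, "recovered:", best)
```

The paper instead backpropagates an online adaptation regret through the encoder's parameters; the grid search here only illustrates what that objective is selecting for.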