[Supplementary Figure 1 panels: FACS plots (axes: Flk1, Runx1) for the PS, NP, HF, 4SG and 4SFG- populations, and cell numbers per embryo by stage.]
Supplementary Figure 1: Fluorescence activated cell sorting of cells with blood potential. (a)
Gating strategy for Flk1+ cells at PS, NP and HF stages, and Runx1-ires-GFP+ cells (4SG) or
Flk1+Runx1-ires-GFP- cells (4SFG-) at the 4S stage. (b) Total numbers of cells per embryo at
each stage, estimated from FACS data. Each point represents one embryo and lines indicate
the median.
[Supplementary Figure 2 panels: PCA plots (axes: PC 1 scores, PC 2 scores) for PS, NP, HF, 4SG and 4SFG-; legends: Stage (PS, NP, HF, 4SG, 4SFG-) and Cluster (Cluster1, Cluster2, Cluster3).]
Supplementary Figure 2: Development is asynchronous. (a) PCA of the 3,934 cells, coloured
retrospectively according to the stage from which they were sorted. Blue, PS; green, NP; orange,
HF; red, 4SG; purple, 4SFG-. (b) For each stage, the cells from different embryos are shown on
the PCA as different colours (cells from other stages shown in grey). (c) PCA coloured according
to the clusters cells belong to in Figure 1e. Green, I; Blue, II; Pink, III. (d) For each embryo, the
percentage of cells in clusters I, II and III was calculated. The mean and standard deviation were
then calculated for each cluster in each stage. The number of embryos is shown in Figure 1c.
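The calculation described for panel (d) can be sketched as follows. This is a minimal illustration, not the authors' code: the `cells` records, the embryo identifiers and the use of the sample standard deviation are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from statistics import mean, stdev

# Hypothetical records: (stage, embryo_id, cluster) for each sorted cell.
cells = [
    ("PS", "e1", "I"), ("PS", "e1", "I"), ("PS", "e1", "II"), ("PS", "e1", "I"),
    ("PS", "e2", "I"), ("PS", "e2", "II"), ("PS", "e2", "II"), ("PS", "e2", "I"),
    ("HF", "e3", "II"), ("HF", "e3", "III"), ("HF", "e3", "III"), ("HF", "e3", "III"),
    ("HF", "e4", "II"), ("HF", "e4", "II"), ("HF", "e4", "III"), ("HF", "e4", "III"),
]
CLUSTERS = ("I", "II", "III")


def cluster_percentages(cells):
    """Return {stage: {embryo: {cluster: percent of that embryo's cells}}}."""
    counts = defaultdict(Counter)
    for stage, embryo, cluster in cells:
        counts[(stage, embryo)][cluster] += 1
    result = defaultdict(dict)
    for (stage, embryo), per_cluster in counts.items():
        total = sum(per_cluster.values())
        result[stage][embryo] = {
            k: 100.0 * per_cluster[k] / total for k in CLUSTERS
        }
    return result


def summarise(per_embryo):
    """Mean and sample SD of per-embryo percentages, per stage and cluster."""
    summary = {}
    for stage, embryos in per_embryo.items():
        for k in CLUSTERS:
            values = [pct[k] for pct in embryos.values()]
            sd = stdev(values) if len(values) > 1 else 0.0
            summary[(stage, k)] = (mean(values), sd)
    return summary


summary = summarise(cluster_percentages(cells))
```

Percentages are computed within each embryo first, so embryos contributing many cells do not dominate the stage-level mean.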
[Figure panels, caption not recovered: PCA plots (axes: PC 1 scores, PC 2 scores); lineage schematic labels Mesoderm, Ectoderm, Endoderm, Meso-haemangioblasts and Haemangioblasts; expression (FPKM) at PS, NP and HF for Myod1, Myog, Myf5; Hand1, Hand2, Mesp1; Mixl1, Eomes; Acta2, Myocd, Cald1; Cdh5, Kdr, Sox7; Runx1, Tal1, Gata1; panel titles Somitic muscle, Cardiac muscle, Smooth muscle, Endothelium and Blood progenitors.]
[Figure panel, caption not recovered: expression (dCt; ND marked) for Cbfa2t3h, Egfl7, Eif2b1, Ets1, Ets2, Etv6, Fli1, Foxh1, Foxo4, Gfi1, Gfi1b, Hhex, Hoxb2, Hoxd8, Ikzf1, Itga2b, Kit, Ldb1, Lmo2, Lyl1, Mecom, Meis1, Mitf, Nfe2, Notch1, Pecam1, Polr2a, Procr, Spi1, Sox17, Tbx3, Tbx20, Ubc, Mrpl19 and Myb.]
Supplementary Figure 5: Diffusion plots. Diffusion plots showing expression patterns of additional genes not shown in Figure 2.
[Supplementary Figure 6 panels: diffusion plots for PS, NP, HF, 4SG and 4SFG- (axes: Diffusion component 1, Diffusion component 2, Diffusion component 4).]
Supplementary Figure 6: Diffusion plots highlighting each stage. Top left plot shows all populations, other
plots were coloured according to embryonic stage with cells from other stages shown in grey. Blue, PS;
green, NP; orange, HF; red, 4SG; purple, 4SFG-.
Nature Biotechnology: doi:10.1038/nbt.3154
[Supplementary Figure 7 panels: Diffusion map, ICA, PCA, t-SNE.]
Supplementary Figure 7: Comparison of multivariate analysis techniques. The data for all
3,934 cells were plotted using diffusion maps, principal component analysis, independent
component analysis and t-SNE. Blue, PS; green, NP; orange, HF; red, 4SG; purple, 4SFG-.
Supplementary Figure 8: Some states occur in multiple cells. State transition graph coloured
by the number of times each binary state occurs in the 3,934 expression profiles. Grey, occurs
once; blue, occurs twice; red, occurs more than twice.
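The colouring rule in this caption amounts to counting identical binary states. A minimal sketch, assuming each expression profile has already been discretised to a tuple of 0/1 values (the toy `profiles` below are hypothetical, not data from the study):

```python
from collections import Counter


def colour_for(count):
    """Map an occurrence count to the colour scheme given in the caption."""
    if count == 1:
        return "grey"
    if count == 2:
        return "blue"
    return "red"


# Hypothetical binarised profiles: one tuple of 0/1 values per cell.
profiles = [
    (1, 0, 0), (1, 0, 0),
    (0, 1, 1),
    (1, 1, 0), (1, 1, 0), (1, 1, 0),
]

state_counts = Counter(profiles)  # each distinct state becomes one graph node
colours = {state: colour_for(n) for state, n in state_counts.items()}
```

Tuples are used because they are hashable, so each distinct state can serve directly as a dictionary key and as a node identifier.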
[Supplementary Figure 9 panels: PCA plots (axes: PC 1 scores, PC 2 scores) for PS, NP, HF, 4SG and 4SFG-; legend: HF simulated (black), HF real (pink).]
Supplementary Figure 9: Populations mask variation and asynchrony. Pools of 20 cells were simulated
from single cell data and projected onto the principal component analysis for all 3,934 cells. Each plot
shows one embryonic stage and the simulated pools of 20 cells for that population in black (normalized in
the same way as single cells). Bottom right hand plot shows the head fold stage and simulated pools as
well as actual pools of 20 cells sorted from embryos (pink).
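The pooling step can be illustrated with a short sketch. It assumes that a pool is the per-gene mean of 20 cells drawn without replacement from one stage; the caption does not state how pools were combined, so that choice, the toy data and the function name `simulate_pools` are all hypothetical. Projection onto the principal components is omitted.

```python
import random
from statistics import mean, pvariance


def simulate_pools(cells, pool_size=20, n_pools=100, seed=0):
    """Average `pool_size` randomly drawn cells, gene by gene, `n_pools` times."""
    rng = random.Random(seed)
    pools = []
    for _ in range(n_pools):
        members = rng.sample(cells, pool_size)  # draw without replacement
        pools.append([mean(values) for values in zip(*members)])
    return pools


# Hypothetical single-cell matrix: 200 cells x 3 genes with on/off-like values.
rng = random.Random(1)
cells = [
    [rng.choice((0.0, 10.0)) + rng.gauss(0, 1) for _ in range(3)]
    for _ in range(200)
]

pools = simulate_pools(cells)
single_var = pvariance([cell[0] for cell in cells])
pool_var = pvariance([pool[0] for pool in pools])
```

In this toy example the variance of the first gene across pools is far smaller than across single cells, which is the masking effect the figure describes.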
[Figure panel, caption not recovered: TFBS search across Mouse, Human, Platypus, Chicken and Xenopus sequences.]
Factor type   TFs                            Motif
AP1-like      Nfe2                           TCAY
Ets           Erg, Ets1, Etv2, Fli1, PU.1    GGAW
Ebox          Scl, Lyl1                      CANNTG
Gata          Gata1                          GATA
Gfi           Gfi1, Gfi1b                    AATC
HMG box       Sox7, Sox17                    WWCAAWG
Homeobox      Hhex                           YWATTAAR
Homeobox      Hoxb4                          TAAT
Ikaros        Ikaros                         HRGGAW
Myb           Myb                            YAACNG
Rbpj          Notch1                         TTCCCAA
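The consensus motifs listed for the TFBS search (for example GATA, CANNTG and GGAW) are written in IUPAC nucleotide codes. The sketch below shows one way such a motif could be matched on either strand of a sequence. It is an illustration only: the search tool and parameters actually used are not given here, and the function names are hypothetical.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "S": "[CG]", "R": "[AG]", "Y": "[CT]",
    "K": "[GT]", "M": "[AC]", "H": "[ACT]", "B": "[CGT]",
    "D": "[AGT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement of each IUPAC code (for example R = A/G pairs with Y = C/T).
COMPLEMENT = str.maketrans("ACGTWSRYKMHBDVN", "TGCAWSYRMKDVHBN")


def revcomp(motif):
    """Reverse complement of an IUPAC motif."""
    return motif.translate(COMPLEMENT)[::-1]


def motif_regex(motif):
    """Compile an IUPAC motif into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif))


def has_motif(sequence, motif):
    """True if the motif occurs on either strand of the sequence."""
    seq = sequence.upper()
    return any(motif_regex(m).search(seq) for m in (motif, revcomp(motif)))
```

For instance, `has_motif("AACAGCTGTT", "CANNTG")` is true because `CAGCTG` matches the Ebox consensus, and `has_motif("GGTTCCGG", "GGAW")` is true through the reverse-strand pattern `WTCC`.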
[Supplementary Figure 11 panels: (a) ChIP-Seq tracks for HoxB4 day6, HoxB4 day16 and HoxB4 day26 at the Erg locus (mm10; scale bar, 100 kb); (b) Flk1 versus CD41 flow cytometry plots, Day 3 to Day 7, for Hsp/Venus, Hsp/Venus/Erg+85, Hsp/Venus/Erg+85Hox, Hsp/Venus/Erg+85Sox and Hsp/Venus/Erg+85HoxSox; (c) Venus YFP versus SSC plots for the same constructs and days.]
Supplementary Figure 11: The Erg+85 enhancer is active during haematopoietic development. (a) UCSC Genome Browser tracks of re-analysed
ChIP-Seq data for HoxB4 (ref. 5) indicate that HoxB4 binds to the Erg+85 enhancer (shown in grey) during a 26-day differentiation time course that produces
HSCs from ES cells, with HSCs able to contribute to long-term engraftment in recipient mice. Binding is absent at day 6, but becomes apparent by
day 16. (b) Flow cytometric analysis of Flk1 and CD41 expression across a time course of EB differentiation from Figure 4b indicates that different
clones have similar kinetics of differentiation, with Flk1+ cells giving rise to Flk1+CD41+ and then Flk1-CD41+ cells. (c) Flow cytometric analysis of YFP
expression in all live cells across a time course of EB differentiation from Figure 4b illustrates a reduction of YFP-positive cells in ES cell clones where
YFP expression is driven by mutant enhancer constructs. In (b) and (c), a representative experiment is shown from three biological repeats of 2-3
clones per construct.
[Supplementary Figure 12 panels: (a) clustered heat map of partial correlation coefficients (colour scale: Correlation, -1.0 to 1.0); (b) network diagram. Labelled factors: Cbfa2t3h (Eto2), Erg, Ets1, Ets2, Etv2, Etv6, Fli1, Foxh1, Foxo4, Gata1, Gfi1, Gfi1b, Hhex, Hoxb2, Hoxb4, Hoxd8, Ikaros, Ldb1, Lmo2, Lyl1, Mecom, Meis1, Mitf, Myb, Nfe2, Notch1, Runx1, Sox7, Sox17, Spi1 (PU.1), Tal1, Tbx3, Tbx20.]
Supplementary Figure 12: Partial correlation analysis. (a) Hierarchical clustering of partial correlation coefficients between pairs of transcription
factors for all 3,934 expression profiles. Pairwise coefficients were calculated while controlling for the effect of all other transcription factors. Erg does
not correlate with Sox or Hox factors. (b) Network diagram showing putative undirected activating (red) and inhibiting (blue) relationships suggested
by significant correlations (p-value < 1e-10).
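To make the idea of a partial correlation concrete, the sketch below computes a first-order partial correlation: the correlation between two variables after removing the linear effect of a single third variable. This is a simplification chosen for illustration. The analysis described above controls for all other transcription factors at once, and the toy vectors are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after controlling for one variable z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical example: z drives both x and y, plus unrelated offsets.
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [zi + e for zi, e in zip(z, [1, -1, 1, -1, 1, -1, 1, -1])]
y = [zi + e for zi, e in zip(z, [1, 1, -1, -1, 1, 1, -1, -1])]

raw = pearson(x, y)
controlled = partial_corr(x, y, z)
```

Here `x` and `y` are strongly correlated only because both follow `z`; once `z` is controlled for, the coefficient falls close to zero.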
Supplementary Table 1: List of TaqMan assays used for single cell gene expression analysis.
Gene name   Protein name   Assay ID
Cbfa2t3h    Eto2           Mm00486780_m1
Cdh1        E-cadherin     Mm01247357_m1
Cdh5        VE-cadherin    Mm00486938_m1
Egfl7       Egfl7          Mm00618004_m1
Eif2b1      Eif2b1         Mm01199614_m1
Erg         Erg            Mm01214246_m1
Ets1        Ets1           Mm01175819_m1
Ets2        Ets2           Mm00468977_m1
Etv2        Etv2           Mm00468389_m1
Etv6        Tel            Mm01261325_m1
Fli1        Fli1           Mm00484409_m1
FoxH1       FoxH1          AJD1TLL (custom assay)
FoxO4       FoxO4          AJLJIMX (custom assay)
Gata1       Gata1          Mm00484678_m1
Gata2       Gata2          Mm00492300_m1
Gfi1        Gfi1           Mm00515855_m1
Gfi1b       Gfi1b          Mm00492318_m1
Hbb-bH1     Hbb-bH1        Mm00756487_mH
Hhex        Hhex           Mm00433954_m1
HoxB2       HoxB2          Mm04209931_m1
HoxB3       HoxB3          Mm00650701_m1
HoxB4       HoxB4          Mm00657964_m1
HoxD8       HoxD8          AJFARRT (custom assay)
Ikzf1       Ikaros         Mm01187882_m1
Itga2b      CD41           Mm00439768_m1
Kdr         Flk1           Mm01222421_m1
Kit         c-Kit          Mm00445212_m1
Ldb1        Ldb1           Mm00440156_m1
Lmo2        Lmo2           Mm01281680_m1
Lyl1        Lyl1           Mm01247198_m1
Mecom       MDS1/Evi1      Mm01289155_m1
Meis1       Meis1          Mm00487659_m1
Mitf        Mitf           Mm01182480_m1
Mrpl19      Mrpl19         Mm03048937_m1
Myb         Myb            Mm00501741_m1
Nfe2        Nfe2           Mm00801891_m1
Notch1      Notch1         Mm00435249_m1
Pecam1      CD31           Mm01242584_m1
Polr2a      Polr2a         Mm00839493_m1
Procr       Epcr           Mm00440993_mH
Runx1       Runx1          Mm01213405_m1
Spi1        PU.1           Mm00488142_m1
Sox17       Sox17          Mm04208182_m1
Sox7        Sox7           Mm00776876_m1
Tal1        Scl            Mm01187033_m1
Tbx20       Tbx20          Mm00451517_m1
Tbx3        Tbx3           Mm01195726_m1
Ubc         Ubc            Mm01201237_m1
Supplementary Table 2: Boolean update rules for the network in Figure 3c, synthesized from
the state graph in Figure 3b. The final column indicates whether the DNA binding motifs of the
predicted regulators are present in the locus of the target gene (see Supplementary Figure 10).
Gene (in table order): Scl, Etv2, Fli1, Lyl1, Sox7, Erg, Notch1, Gata1, HoxB4, Sox17, Ets1, Gfi1, Gfi1b, Eto2, Hhex, Ikaros, Lmo2, Nfe2, Pu.1, Myb

Update function        % Non-observed transitions disallowed (Ni)   Motifs present
Fli1                   98                                           Yes
Notch1                 96                                           Yes
Etv2                   96                                           Yes
Sox7                   97                                           Yes
Sox7                   92                                           Yes
Sox17 HoxB4            82                                           No (Sox missing)
(HoxB4 Lyl1) Sox17     84                                           Yes
(HoxB4 Tal1) Sox17     83                                           Yes
Sox7                   94                                           Yes
Gfi1b Lmo2             86                                           Yes
Gfi1b Hhex             84                                           No (Hhex missing)
Gfi1b Ets1             84                                           Yes
(Lyl1 Ets1) Gata1      65                                           Yes
(Lyl1 Nfe2) Gata1      65                                           Yes
(Lyl1 Ikaros) Gata1    65                                           No (Ikaros missing)
Lyl1 Gfi1b             77                                           No (Gfi missing)
(Eto2 Sox7) Gfi1b      76                                           No (Gfi missing)
(Eto2 Tal1) Gfi1b      75                                           No (Gfi missing)
Notch1                 96                                           Yes
Gata1 Sox17            88                                           Yes
Nfe2 Sox17             88                                           Yes
Nfe2 Myb               87                                           Yes
Pu.1 Ikaros            86                                           No (Ikaros missing)
Pu.1 Nfe2              86                                           Yes
Pu.1 Myb               86                                           Yes
Sox7                   93                                           No (Sox missing)
Hhex                   92                                           No (Hhex missing)
Ets1 Fli1              94                                           No (Ets missing)
Sox7                   97                                           No (Sox missing)
Notch1                 93                                           No (Rbpj missing)
Nfe2 Gfi1b             84                                           Yes
Nfe2 Gata1             83                                           Yes
Nfe2 Gfi1              82                                           Yes
Sox7 Gfi1              79                                           Yes
Sox7 Erg               79                                           Yes
Sox7 HoxB4             77                                           Yes
Ikaros                 78                                           Yes
Gfi1 Erg               67                                           Yes
HoxB4                  64                                           Yes
Supplementary Table 3: Boolean update rules after multiple rounds of bootstrapping (a-e).
In general, the number of possible solutions for a gene's update function grows as the amount of
data used is decreased, and including the full data set narrows these possibilities.
Supplementary Table 3a
Gene: Scl, Etv2, Fli1, Lyl1, Sox7, Erg, Notch1, Gata1, HoxB4, Sox17, Ets1, Gfi1, Gfi1b, Eto2, Hhex, Ikaros, Lmo2, Nfe2, Pu.1, Myb
Supplementary Table 3b
Gene: Scl, Etv2, Fli1, Lyl1, Sox7, Erg, Notch1, Gata1, HoxB4, Sox17, Ets1, Gfi1, Gfi1b, Eto2, Hhex, Ikaros, Lmo2, Nfe2, Pu.1, Myb
Supplementary Table 3c
Gene: Scl, Etv2, Fli1, Lyl1, Sox7, Erg, Notch1, HoxB4, Sox17, Ets1, Gfi1, Gfi1b, Eto2, Hhex, Ikaros, Lmo2, Nfe2, Pu.1, Myb
Supplementary Table 3d
Gene: Scl, Etv2, Fli1, Lyl1, Sox7, Erg, Notch1, Gata1, HoxB4, Sox17, Ets1, Gfi1, Gfi1b, Eto2, Hhex, Ikaros, Lmo2, Nfe2, Pu.1, Myb
Supplementary Table 3e
Gene: Scl, Etv2, Fli1, Lyl1, Sox7, Erg, Notch1, Gata1, HoxB4, Sox17, Ets1, Gfi1, Gfi1b, Eto2, Hhex, Ikaros, Lmo2, Nfe2, Pu.1, Myb
Supplementary Table 4: Boolean update rules synthesised from data filtered according to a
more stringent limit of detection (Ct 23).
Gene: Scl, Etv2, Fli1, Lyl1, Sox7, Erg, Notch1, Gata1, HoxB4, Sox17, Ets1, Gfi1, Gfi1b, Eto2, Hhex, Ikaros, Lmo2, Nfe2, Pu.1, Myb
[Table residue, caption not recovered: significance (ns, *, **, ***) at Day3 to Day7 for the Hox, Sox and HoxSox mutants in the Flk1+CD41-, Flk1+CD41+ and Flk1-CD41+ populations.]
Update function
Gata2 (Pu.1 (Gata1 Fog1))
(Gata1 Gata2 Fli1) Pu.1
Gata1
Gata1 Fli1
Gata1 EKLF
Gata1 Pu.1
Cebpa (Scl (Fog1 Gata1))
(Cebpa Pu.1) (Gata1 Gata2)
Pu.1 Gfi1
(Pu.1 cJun) Gfi1
Cebpa EgrNab
The results of applying our synthesis method are shown in the table below. The method successfully reconstructs the Boolean update function for all but one gene (EgrNab), in some cases
uniquely identifying the correct function. We note that when multiple solutions are found for an
update function, these solutions, while not exact, all provide useful regulatory information that
could be verified in an experimental scenario. For example, both solutions for Scl successfully
predict Scl's activation by Gata1, although one of the two solutions omits its repression by Pu.1.
We found that these results were robust when performing bootstrapping, randomly discarding
10% of the state space data and rerunning the analysis.
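The bootstrapping procedure can be sketched as below. This is a hedged illustration, not the synthesis method itself. `scl_update` encodes one assumed reading of the Scl rule described above (activation by Gata1, repression by Pu.1, combined as AND NOT). `bootstrap_rounds` and the toy `states` are hypothetical names and data, and the step that re-synthesises rules from each subset is left out.

```python
import random


def scl_update(state):
    """Assumed Scl rule: on when Gata1 is on and Pu.1 is off."""
    return state["Gata1"] and not state["Pu.1"]


def bootstrap_rounds(states, n_rounds=5, discard_fraction=0.10, seed=0):
    """Return `n_rounds` random subsets, each missing 10% of the states."""
    rng = random.Random(seed)
    n_keep = len(states) - round(discard_fraction * len(states))
    return [rng.sample(states, n_keep) for _ in range(n_rounds)]


# Hypothetical state space: 20 toy states over two genes.
states = [
    {"Gata1": bool(i & 1), "Pu.1": bool(i & 2), "id": i} for i in range(20)
]

rounds = bootstrap_rounds(states)
# Each subset would then be passed to the rule-synthesis step (not shown).
```

Fixing the seed makes the discarded states reproducible, so a rule that changes between rounds can be traced back to the specific states that were removed.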
Table 2: Synthesised Boolean functions
Gene
Gata2
Gata1
Fog1
EKLF
Fli1
Scl
Cebpa
Pu.1
cJun
EgrNab
Gfi1
Comments
Unique
Unique
Unique
Unique
Incorrect
Unique
REFERENCES
1. Sacilotto, N. et al. Analysis of Dll4 regulation reveals a combinatorial role for Sox and Notch in arterial development. Proc. Natl. Acad. Sci. U. S. A. 110, 11893–8 (2013).
2.
3.
4. Göttgens, B. et al. Analysis of vertebrate SCL loci identifies conserved enhancers. Nat. Biotechnol. 18, 181–186 (2000).
5.
6. Warren, A. J. et al. The Oncogenic Cysteine-rich LIM domain protein Rbtn2 is essential for erythroid development. Cell 78, 45–57 (1994).
7. Hart, A. et al. Fli-1 is required for murine vascular and megakaryocytic development and is hemizygously deleted in patients with thrombocytopenia. Immunity 13, 167–77 (2000).
8.
9.