
Article

Evolution of neuronal cell classes and types in the vertebrate retina

https://doi.org/10.1038/s41586-023-06638-9
Received: 30 March 2023
Accepted: 13 September 2023
Published online: 13 December 2023

Joshua Hahn1,16, Aboozar Monavarfeshani2,16, Mu Qiao3,15, Allison H. Kao2, Yvonne Kölsch4, Ayush Kumar1, Vincent P. Kunze5, Ashley M. Rasys6, Rose Richardson7, Joseph B. Wekselblatt8, Herwig Baier4, Robert J. Lucas7, Wei Li5, Markus Meister3, Joshua T. Trachtenberg9, Wenjun Yan2, Yi-Rong Peng10, Joshua R. Sanes2 ✉ & Karthik Shekhar1,11,12,13,14 ✉

Open access
The basic plan of the retina is conserved across vertebrates, yet species differ profoundly
in their visual needs1. Retinal cell types may have evolved to accommodate these
varied needs, but this has not been systematically studied. Here we generated and
integrated single-cell transcriptomic atlases of the retina from 17 species: humans,
two non-human primates, four rodents, three ungulates, opossum, ferret, tree shrew,
a bird, a reptile, a teleost fish and a lamprey. We found high molecular conservation
of the six retinal cell classes (photoreceptors, horizontal cells, bipolar cells, amacrine
cells, retinal ganglion cells (RGCs) and Müller glia), with transcriptomic variation
across species related to evolutionary distance. Major subclasses were also conserved,
whereas variation among cell types within classes or subclasses was more pronounced.
However, an integrative analysis revealed that numerous cell types are shared across
species, based on conserved gene expression programmes that are likely to trace back
to an early ancestral vertebrate. The degree of variation among cell types increased
from the outer retina (photoreceptors) to the inner retina (RGCs), suggesting that
evolution acts preferentially to shape the retinal output. Finally, we identified rodent
orthologues of midget RGCs, which comprise more than 80% of RGCs in the human
retina, subserve high-acuity vision, and were previously believed to be restricted
to primates2. By contrast, the mouse orthologues have large receptive fields and
comprise around 2% of mouse RGCs. Projections of both primate and mouse
orthologous types are overrepresented in the thalamus, which supplies the primary
visual cortex. We suggest that midget RGCs are not primate innovations, but are
descendants of evolutionarily ancient types that decreased in size and increased in
number as primates evolved, thereby facilitating high visual acuity and increased
cortical processing of visual information.

The ability to assess gene conservation among species has been of great value in multiple ways. It has revealed the evolutionary history of specific genes, highlighted crucial developmental and functional pathways, informed strategies for rational in vivo manipulations and helped guide choices of animal models that mimic human diseases3,4. Comparative genomics was enabled by advances in DNA sequencing, as well as statistical methodologies for sequence alignment and phylogenetic inference5. Advances in high-throughput single-cell RNA sequencing (scRNA-seq) and single-nucleus RNA sequencing (snRNA-seq) have enabled related activity focused on determining the extent to which cell types, the functional units of complex tissues6,7, are conserved among species. Analysing patterns of cell-type conservation across phylogeny can serve as a conceptual foundation for reconstructing the evolution of cell types and identifying conserved developmental programmes8–10.
The neural retina, the portion of the brain that resides in the back of the eye, is well-suited for this type of analysis. It is arguably as complex as any other part of the brain, but its compactness and accessibility facilitate detailed investigations of structure and function11. Moreover,

1Department of Chemical and Biomolecular Engineering, University of California, Berkeley, Berkeley, CA, USA. 2Department of Cellular and Molecular Biology, Center for Brain Science, Harvard
University, Cambridge, MA, USA. 3Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA. 4Max Planck Institute for Biological Intelligence,
Martinsried, Germany. 5Retinal Neurophysiology Section, National Eye Institute, National Institutes of Health, Bethesda, MD, USA. 6Department of Cellular Biology, University of Georgia, Athens,
GA, USA. 7Division of Neuroscience and Centre for Biological Timing, Faculty of Biology Medicine and Health, University of Manchester, Manchester, UK. 8Division of Chemistry and Chemical
Engineering, California Institute of Technology, Pasadena, CA, USA. 9Department of Neurobiology, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA. 10Department of
Ophthalmology, Stein Eye Institute, UCLA David Geffen School of Medicine, Los Angeles, CA, USA. 11Helen Wills Neuroscience Institute, Vision Science Graduate Group, University of California,
Berkeley, Berkeley, CA, USA. 12Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA. 13Center for Computational Biology, Biophysics Graduate
Group, University of California, Berkeley, Berkeley, CA, USA. 14California Institute of Quantitative Biosciences (QB3), University of California, Berkeley, Berkeley, CA, USA. 15Present address:
LinkedIn, Mountain View, CA, USA. 16These authors contributed equally: Joshua Hahn, Aboozar Monavarfeshani. ✉e-mail: sanesj@mcb.harvard.edu; kshekhar@berkeley.edu

Nature | Vol 624 | 14 December 2023 | 415


[Figure 1 image. Panel a: cartoon of a retinal section with labels r, c, OS, ONL, OPL, HC, BC, MG, AC, INL, IPL, GCL and RGC. Panel b: phylogeny of the 17 species; axis: Evolutionary distance (Myr), 0 to 500. Panel c: retinal sections stained for CHX10, AP2A, RBPMS and DAPI from squirrel, mouse, Peromyscus, tree shrew, human, sheep, cow and pig, with ONL, INL and GCL labelled.]

Fig. 1 | Conserved retinal structure across vertebrates. a, Cartoon of a section through a vertebrate retina showing the arrangement of its six major cell classes: photoreceptors (including rods (r) and cones (c)), horizontal cells (HC), bipolar cells (BC), amacrine cells (AC), retinal ganglion cells (RGC) and Müller glia (MG). The outer segments of rods and cones (OS), outer nuclear layer (ONL), inner nuclear layer (INL) and ganglion cell layer (GCL), which contain cell somata, are indicated, as are the outer (synaptic) layer (OPL) and inner plexiform layer (IPL). b, Phylogeny of the 17 vertebrate species analysed in this work. The scale bar on the right indicates estimated divergence time. c, Sections from retinas of eight species immunostained for RBPMS (a pan-RGC marker), CHX10 (also known as VSX2) (a pan-bipolar cell marker) and AP2A (also known as TFAP2A) (a pan-amacrine cell marker) and stained with the nuclear stain DAPI. Scale bars, 25 µm. Figures are representative of images from three retinas.

unlike other brain regions (for example, the cerebral cortex), the basic structural blueprint of the retina is highly conserved among vertebrates1. The retina contains five neuronal classes: photoreceptors, horizontal cells, bipolar cells, amacrine cells and retinal ganglion cells (RGCs), together with a resident glial class called Müller glia12. The cell somata are arranged in three nuclear layers separated by two plexiform (synaptic) layers (Fig. 1a) with information flowing through them in a defined direction: photoreceptors in the outer nuclear layer sense light and transmit visually evoked signals to interneurons in the inner nuclear layer; the interneurons (horizontal cells, bipolar cells and amacrine cells) process the information and supply it to RGCs in the innermost layer; and the RGCs send axons through the optic nerve to visual centres in the brain. Several of the neuronal classes can be subdivided into subclasses, and all classes comprise multiple types that differ in morphology, physiology, connectivity and molecular composition6,11–14. The specificity of connections between interneuronal and RGC types endows many RGC types with selective responsiveness to small subsets of visual features such as edges, directional motion and chromaticity14,15. As a result of neural computations in the retina, the optic nerve transmits a set of parallel representations of the visual scene to the rest of the brain for further processing16,17.
Despite these conserved features, vertebrate species differ greatly in their visual needs1. Some species are diurnal, others are nocturnal; some are terrestrial, others are aquatic; and some mainly hunt, whereas others forage for colourful fruits. It is likely that variations in retinal cell types across species emerged during the course of evolution to serve these diverse needs. However, the evolutionary relationships among retinal cell types have not been mapped systematically. Here we address this gap by using single-cell transcriptomics to compare retinal cell classes, subclasses and types in 17 vertebrate species (Fig. 1b,c). First, we show that the conserved functional and morphological character of the six cell classes is mirrored by marked cross-species similarities in gene expression. This principle extends to identified subclasses of photoreceptors, bipolar cells and amacrine cells. Transcription factors implicated in cell and subclass specification are also evolutionarily conserved, pointing to common programmes of retinal development. Within each cell class, the transcriptomic variation across species increases with evolutionary time in a manner incompatible with purely ‘neutral’ evolution18. Second, we assessed the extent of evolutionary variation among cell types within photoreceptors, horizontal cells, bipolar cells and RGCs, which have been comprehensively classified in mice19–21 and primates22–24. We identify numerous evolutionarily conserved types but find that variation is more extensive in RGCs than in other classes, suggesting that natural selection acts preferentially to shape the retinal output. Finally, we identify non-primate orthologues of midget RGCs, which account for more than 80% of RGCs in humans and are primarily responsible for high-acuity vision. To our knowledge, no counterparts of these cell types have previously been identified in non-primates, precluding mechanistic analysis of blinding diseases involving RGC loss, such as glaucoma. This orthology suggests that rather than appearing de novo in primates, midget RGCs evolved from cell types that were present in the common mammalian ancestor.

Retinal cell atlases of 17 species
Previously, we used scRNA-seq and snRNA-seq to study retinal cell types in five species: Mus musculus19,20,25,26 (hereafter referred to as ‘mouse’), cynomolgus macaque22 (Macaca fascicularis), human23 (Homo sapiens), chick27 (Gallus gallus) and zebrafish28 (Danio rerio). For the present study, we generated atlases from 12 additional species: ferret (Mustela putorius furo), brown anole lizard (Anolis sagrei), deer mouse (Peromyscus maniculatus bairdii), tree shrew (Tupaia belangeri chinensis), pig (Sus domesticus), sheep (Ovis aries), cow (Bos taurus), opossum (Monodelphis domestica), marmoset (Callithrix jacchus), 4-striped



[Figure 2 image. Panel a: heat map of class marker genes (RBPMS, RBPMS2, SLC17A6, THY1, NEFL, NEFM, POU4F1, VSX1, VSX2, OTX2, GRM6, TRPM1, CABP5, GRIK1, TFAP2A, TFAP2B, TFAP2C, GAD1, GAD2, PAX6, ROBO1, SLC6A9, ONECUT1, ONECUT2, LHX1, CALB1, VIM, SLC4A3, WDR72, SAG, PDC, GNGT1, GNGT2, NRL, SLC1A3, RLBP1, GLUL, SFRP2, RCVRN, APOE) across classes RGC, BC, AC, HC, PR and MG; colour scale: Normalized expression. Panels b and c: cross-correlation matrices; colour scale: Spearman correlation. Panel d: UMAP1 versus UMAP2 embedding labelled by class, with feature plots for cones (ARR3), rods (RHO), glycinergic ACs (SLC6A9), GABAergic ACs (GAD1), OFF BCs (GRIK1) and ON BCs (ISL1); colour scale: Normalized expression, 0 to 6. Panel e: one plot per class (RGC, BC, AC, HC, PR, MG); y axis: Mean squared divergence, 0 to 0.6; x axis: Evolutionary divergence (substitutions per 100 bp), 0 to 1.2.]

Fig. 2 | Class- and subclass-specific transcriptomic signatures. a, Heat map showing average expression of marker genes (columns) within each major cell class in 17 species (rows). Rows are grouped by cell class (left). Within each class, species are ordered as in Fig. 1b. Grey tiles indicate data that are missing owing to the absence of the corresponding orthologue in the species annotation. Colours indicating cell class are uniform in a–e. PR, photoreceptor. b, Cross-correlation matrix (Spearman) of pseudobulk transcriptomic profiles for the 16 jawed vertebrates. Rows and columns are grouped by class, and then ordered by phylogeny within a class. A total of 4,560 1:1 gene orthologues were used to calculate the correlation values. c, As in b, with rows and columns grouped by species instead of class. Matrices including lamprey are shown in Extended Data Fig. 7c,d. d, Left, uniform manifold approximation and projection (UMAP) embedding of integrated cross-species data, with points indicating class identity (left) or expression levels of subclass-specific markers (right). GAD1, a marker for GABAergic amacrine cells, is also expressed by some horizontal cells, and ISL1, a marker for ON bipolar cells, is also expressed by some RGCs, horizontal cells, and amacrine cells. Details of gene expression by species are shown in Extended Data Fig. 8d. e, Pairwise mean squared divergence of class-specific pseudobulk gene expression profiles between species (y axis) increases with evolutionary distance, as estimated by substitutions per 100 bp (x axis). Data from mammals, chicken and lizard are included. Data including zebrafish are presented in Extended Data Fig. 7e. Solid lines represent power law ($y = ax^{b}$) regression fits. Across the graphs, $a \in [0.33, 0.47]$ and $b \in [0.23, 0.35]$. The coefficient of determination ($R^{2}$) values range from 0.75 to 0.92.
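The power-law regression named in the Fig. 2 legend can be illustrated with a short sketch. This is a minimal example that assumes an ordinary least-squares fit in log-log space; the fitting procedure actually used for Fig. 2e is described in Methods and may differ. The function name `fit_power_law` and the demonstration values are hypothetical and are not data from this study.

```python
import numpy as np


def fit_power_law(x, y):
    """Fit y = a * x**b by least squares on log-transformed data.

    Returns (a, b, r2), where r2 is the coefficient of determination
    computed on the original (untransformed) scale.
    Assumption: a log-log linear fit; this is an illustrative choice.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if np.any(x <= 0) or np.any(y <= 0):
        raise ValueError("x and y must be strictly positive for a log-log fit")

    # log(y) = log(a) + b * log(x), so a degree-1 polynomial fit gives (b, log a)
    b, log_a = np.polyfit(np.log(x), np.log(y), deg=1)
    a = float(np.exp(log_a))
    b = float(b)

    # Goodness of fit against the fitted curve on the original scale
    y_hat = a * x ** b
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = float(1.0 - ss_res / ss_tot)
    return a, b, r2


# Synthetic, noise-free illustration (hypothetical values, not study data)
x_demo = np.array([0.1, 0.2, 0.4, 0.8, 1.2])  # substitutions per 100 bp
y_demo = 0.4 * x_demo ** 0.3                  # mean squared divergence
a_hat, b_hat, r2_hat = fit_power_law(x_demo, y_demo)
```

An exponent b below 1, as reported in the legend, means that divergence rises steeply at short evolutionary distances and then flattens, unlike the linear relationship expected under purely neutral evolution.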

grass mouse (Rhabdomys pumilio), 13-lined ground squirrel (Ictidomys tridecemlineatus) and sea lamprey (Petromyzon marinus) (Fig. 1b,c). We also profiled around 185,000 nuclei from 18 human donors, thereby allowing us to identify over 30 more cell types than had been detected in the dataset analysed previously23, including 10 additional RGC types (Extended Data Fig. 1). To obtain sufficient numbers of bipolar cells and RGCs for comprehensive analysis, we enriched these classes in some collections (Extended Data Figs. 2–6 and Methods). We also collected cells without enrichment to ensure representation of all classes.
We used a standardized computational pipeline to normalize, correct batch effects, reduce dimensionality and cluster the data from each species separately29 (Methods). Cells that did not belong to the six canonical classes named above (for example, microglia or endothelial cells) were not analysed further. Biological replicates within each collection exhibited a high degree of concordance (Extended Data Figs. 3–6). The numbers of cells in each class for each species are summarized in Supplementary Table 1.

Molecular conservation of neuronal classes
We analysed the expression of class markers that have been validated in mice and primates; that is, genes that are co-expressed within a retinal cell class but exhibit little or no expression in other retinal cell classes19,20,22–26. Many showed similar expression patterns in other vertebrates (Fig. 2a). Using these markers, we assigned cells within each species to one of the six classes. We then assessed the

interspecies similarity of classes by comparing ‘pseudobulk’ transcriptomic profiles on the basis of shared orthologous genes (Methods). A cross-correlation analysis among the 16 jawed vertebrates showed that transcriptomic similarity was driven by cell class identity rather than species identity; for example, bipolar cells of a given species are more closely related to bipolar cells of other species than they are to other classes from the same species (Fig. 2b,c and Extended Data Fig. 7a,b). Qualitatively similar results were obtained when lamprey, a jawless vertebrate, was included, although the signal was attenuated because fewer orthologous genes were available (Extended Data Fig. 7c,d). Thus, class identity dominates species identity in the transcriptional profile of a retinal cell.
We found that conserved genes within a cell class included many genes encoding known lineage-determining transcription factors, such as POU4F1 (RGCs), VSX2 (bipolar cells and Müller glia), OTX2 (photoreceptors and bipolar cells), TFAP2A–C (amacrine cells), ONECUT1/2 (horizontal cells) and CRX (photoreceptors)30 (Fig. 2a). This suggests that the genetic mechanisms underlying neurogenesis and fate specification of cell classes are evolutionarily ancient.
We assessed evolutionary trends by comparing mean squared expression divergence in pseudobulk profiles and evolutionary distance among pairs of species for each cell class. Expression divergence increased with evolutionary distance according to a power law that was qualitatively similar across all cell classes18 ($R^{2}$ = 0.75–0.92) (Fig. 2e and Extended Data Fig. 7e). The trends were inconsistent with purely neutral transcriptome evolution, which predicts a linear relationship between average expression distance and evolutionary distance18,31. Although variation at the pseudobulk level can arise from changes in cell-type composition as well as from changes in gene expression in individual cell types, the finding that the variance of Müller glia, a single cell type, was similar to that of more complex cell classes suggests that the variation at pseudobulk level is dominated by changes in gene expression in individual cell types. Thus, stabilizing and/or positive selection may contribute to the evolution of retinal cell class-specific transcriptomes.

Molecular conservation of neuronal subclasses
Classically, three of the retinal cell classes have been subdivided into subclasses12: photoreceptors comprise rods, specialized for low-light vision, and cones, which mediate chromatic vision. Nearly all amacrine cells use either GABA (γ-aminobutyric acid) or glycine as their neurotransmitter, and transmitter choice is highly correlated with key morphological features. Bipolar cells can be subdivided into those that depolarize and hyperpolarize to illumination (ON and OFF types, respectively). Within photoreceptors, amacrine cells and bipolar cells, cells from different species segregated on the basis of subclass identity and expressed orthologues of gene markers that have been well-characterized in mice (Fig. 2d and Extended Data Fig. 8). Thus, the evolutionary conservation of cell classes extends to subclasses.
Several transcription factor-encoding genes are expressed selectively in mouse retinal subclasses, including NRL and NR2E3 in rods, THRB and LHX4 in cones, MEIS2 in GABAergic amacrine cells, TCF4 in glycinergic amacrine cells, FEZF2 and LHX3 in OFF bipolar cells, and ISL1 and ST18 in ON bipolar cells30. Some, including NRL, NR2E3, THRB and ISL1, have been implicated in the differentiation of the subclass that expresses them. The subclass-specific expressions of these transcription factors were broadly conserved across species (Extended Data Fig. 8a–d), suggesting that the programmes specifying subclasses, like those specifying classes, are evolutionarily ancient.

Tight conservation of outer retinal cell types
We next considered the conservation of neuronal types within classes. We began by analysing the evolutionary variation among mammalian bipolar cell types. In mice, there are 15 bipolar cell types: 6 OFF and 9 ON bipolar cell types; one of the ON bipolar cell types receives input predominantly from rods (RBCs) and all others receive input predominantly from cones19.
Initial clustering of mammalian bipolar cells generated groups that were defined by species (Fig. 3a). The datasets were therefore reanalysed using an integration method that minimizes species-specific signals, thereby emphasizing other transcriptomic relationships29 (Methods). This analysis intermixed the species while retaining structure that separates ON cone, OFF cone and ON RBCs from each other (Fig. 3b).
The integrated data revealed 14 groups of cells based on shared transcriptomic signatures (Fig. 3c). Even though species-specific cluster labels were not an input to the analysis, mouse bipolar cell types mapped to the integrated groups in a 1:1 fashion, with the sole exception of two closely related and sparsely represented types (BC8 and BC9) that mapped to the same group (Fig. 3d and Extended Data Fig. 9a). We call these groups neuronal orthotypes although, as in the case of BC8 and BC9, they may sometimes contain small sets of related types. We named the bipolar cell orthotypes according to the mouse types; thus, the orthotype containing mouse BC1A is called oBC1A, and so on. Each bipolar cell orthotype was represented in nearly all mammals (Extended Data Fig. 9b) and 91% of mammalian bipolar cell clusters (172 out of 190) predominantly mapped specifically to a single orthotype (Fig. 3d, middle and Supplementary Table 3). We identified differentially expressed genes that distinguished the bipolar cell orthotypes (Fig. 3e).
The ‘mammalian’ orthotypes remained robust when mammalian, chick, lizard and zebrafish bipolar cells were integrated together. Although 32% fewer orthologous genes were available to guide the analysis, many bipolar cell clusters in chick, several in lizard and a few in zebrafish mapped to these mammalian orthotypes (Fig. 3d, right). However, two additional ‘non-mammalian’ orthotypes emerged, comprising OFF bipolar cells and ON bipolar cells from the non-mammals (Extended Data Fig. 9c–e and Supplementary Table 3). Attempts to find additional substructure in these non-mammalian bipolar cell orthotypes were unsuccessful, probably because chick, lizard and zebrafish are nearly as evolutionarily distant from each other as they are from mammals. Nonetheless, the fact that several chick and lizard bipolar cell clusters map to the mammalian orthotypes suggests that some type-specific bipolar cell identities have been conserved for more than 300 million years.
To illustrate the utility of the integration, we highlight two bipolar cell orthotypes: oRBC and oBC1B (Fig. 3f). RBCs receive most of their input from rods, as their name implies, and they connect with specific amacrine cell types rather than connecting directly with RGCs32. oRBC contained RBCs from all mammals (Fig. 3f). Mammalian RBCs were distinguished by the high expression of PRKCA and LRRTM4 (Fig. 3e), both of which are RBC-specific in mice19. RBCs also exhibit species-specific gene expression (Extended Data Fig. 9f). RBCs have been described in chicks and zebrafish, but these types did not map to oRBC.
The second orthotype represents a non-canonical OFF bipolar cell described in mice, named BC1B19 or GluMI33. The name BC1B reflects its transcriptional similarity to BC1A. However, unlike canonical bipolar cells, BC1B retracts its dendrite during early postnatal life and therefore has no direct connection with mature photoreceptors19. No BC1B equivalent has yet been identified in other species, probably because it lacks this connection. However, 10 of the 13 mammals profiled here, as well as chicks and lizards, contained a bipolar cell cluster that mapped exclusively to oBC1B (Fig. 3f), whereas two mammals (Peromyscus and ferret) contained a cluster that mapped to both oBC1A and oBC1B. Thus, transcriptomics enabled the identification of a potentially conserved cell type that would have been difficult to identify by conventional morphological methods; its type-specific markers can now be used to seek morphological and physiological validation.



[Figure 3 image. Panel a: UMAPs titled 'BCs (no integration)' and 'BCs (integrated)', coloured by species (cow, ferret, human, macaque, marmoset, mouse, opossum, Peromyscus, pig, Rhabdomys, sheep, squirrel, tree shrew). Panel b: feature plots for rod BCs (PRKCA), ON BCs (rod + cone) (ISL1) and OFF cone BCs (GRIK1); colour scale: Expression, 0 to 6. Panel c: 'Bipolar cell orthotypes' (oBC1A, oBC1B, oBC2, oBC3A, oBC3B, oBC4, oBC5A, oBC5B, oBC5C, oBC5D, oBC6, oBC7, oBC8/9, oRBC). Panel d: confusion matrices of orthotypes against species clusters; colour scale: Mapping (%), 0 to 100. Panel e: 'Orthotype differentially expressed genes'; dot size: No. of species, 0 to 13; colour: Average expression, −1 to 3. Panel f: 'Orthotypes corresponding to rod BCs and BC1B'; colour scale: Mapping (%), 0 to >60. Panel g: 'Species PR types to orthotypes' (o_Rod, o_sCone, o_mlCone); colour scale: Mapping (%), 0 to 100. Panel h: 'Species HC types to orthotypes' (oHC1, oHC2); colour scale: Mapping (%), 30 to 70.]

Fig. 3 | Multispecies integration of bipolar cells. a, UMAP of mammalian bipolar cells computed with the raw (left) and integrated (right) gene expression matrices. Cells are coloured by species of origin. b, Feature plots showing expression within the integrated space of the rod bipolar cell marker PRKCA, the ON bipolar cell marker ISL1, and the OFF bipolar cell marker GRIK1. c, As in a, but with cells coloured by orthotype identity. d, Left, confusion matrix showing the percentage of cells from each mouse bipolar cell type mapping to each mammalian bipolar cell orthotype. Each column sums to 100%. See Extended Data Fig. 9a for a higher magnification view. Centre, confusion matrix showing specific mapping between mammalian bipolar cell orthotypes and bipolar cell clusters within each mammalian species (Extended Data Figs. 1 and 3–6). Right, confusion matrices showing the mapping of bipolar cell clusters in lizard, chick and zebrafish to the mammalian bipolar cell orthotypes. Mapping that includes the non-mammalian orthotypes is shown in Extended Data Fig. 9e. e, Dot plot showing differentially expressed genes within each bipolar cell orthotype. The size of the dot represents the number of mammalian species (out of 13 mammalian species in total) that express the gene in at least 30% of cells in the corresponding orthotype, and the colour represents normalized expression level. f, Confusion matrix showing the species bipolar cell clusters (columns) that map specifically to the orthotypes oRBC and oBC1B. Bipolar cell types are named on the basis of their species of origin and within-species bipolar cell cluster ID (Extended Data Figs. 1 and 3–6); for example, Peromyscus bipolar cell cluster 1 is called ‘Per_1’. g,h, Confusion matrix showing mapping of mammalian photoreceptor (g) and horizontal cell (h) types to orthotype. Format as in d, centre.
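The dot-size statistic described for the Fig. 3e dot plot (the number of species expressing a gene in at least 30% of cells of an orthotype) can be sketched as follows. This is a minimal illustration that assumes a cell 'expresses' a gene when its value is greater than zero; that definition, the function name `count_species_expressing` and the toy values are hypothetical and are not taken from the study.

```python
import numpy as np


def count_species_expressing(expr_by_species, min_fraction=0.30):
    """Count species in which a gene is detected in >= min_fraction of cells.

    expr_by_species maps a species name to the expression values of ONE gene
    across the cells of ONE orthotype in that species.
    Assumption: a cell expresses the gene if its value is greater than zero.
    """
    n_species = 0
    for values in expr_by_species.values():
        values = np.asarray(values, dtype=float)
        if values.size == 0:
            continue  # species with no cells in this orthotype is skipped
        fraction_expressing = np.mean(values > 0)
        if fraction_expressing >= min_fraction:
            n_species += 1
    return n_species


# Hypothetical toy values (not study data)
demo_expression = {
    "species_A": [0, 0, 1, 2],                    # 50% of cells express
    "species_B": [0, 0, 0, 0, 0, 0, 0, 0, 0, 3],  # 10% of cells express
    "species_C": [1, 0, 0],                       # about 33% of cells express
    "species_D": [],                              # no cells in this orthotype
}
dot_size = count_species_expressing(demo_expression)
```

Here species A and C pass the 30% criterion and species B does not, so the dot for this gene and orthotype would be sized for two species.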

We repeated the orthotype analysis for photoreceptors and horizontal cells, which are less diverse classes than bipolar cells. As noted above, photoreceptors are divided into two subclasses, rods and cones. Most mammals have a single rod type and two cone types, tuned to respond best to short wavelengths (S cones, also known as blue cones) and medium wavelengths (M cones, also known as green cones), respectively. However, many primates have a third cone type (L cones, also known as red cones) that is sensitive to longer wavelengths34. Orthotype analysis separated mammalian M and L cones from S cones effectively, with the few exceptions probably being due to insufficient cell numbers (Fig. 3g). Similarly, most mammals have two horizontal cell types, called H1 and H2, although mice and perhaps other rodents have only a single horizontal cell type. Again, orthotype analysis separated horizontal cells into two groups (Fig. 3h). Many non-mammalian vertebrates are more complex in these respects, with 4 or 6 photoreceptor types and 4 horizontal cell types in birds (including chicken) and fish27,34,35 (including zebrafish); these species mapped less well onto the mammalian orthotypes.

Retinal ganglion cell orthotypes
We next performed orthotype analysis on RGCs, the only output neurons in the retina. We identified 21 RGC orthotypes in mammals and found differentially expressed genes that distinguished them (Fig. 4a–c

[Figure 4 image. Panel a: integrated UMAP of RGCs coloured by species (ferret, human fovea, human periphery, macaque fovea, macaque periphery, marmoset fovea, marmoset periphery, mouse, opossum, Peromyscus, pig, Rhabdomys, sheep, squirrel, tree shrew). Panel b: the same UMAP labelled by orthotype (oRGC1 to oRGC21). Panel c: dot plot of differentially expressed genes by orthotype; dot size: No. of species, 0 to 12; colour: Average expression, −1 to 3. Panel d: confusion matrices of RGC orthotype against species clusters, with fovea (F) and periphery (P) shown separately for primates; colour scale: Mapping (%), 0 to 100. Panel e: mouse RGC types grouped by transcription factor subset (EOMES, IRX3, TBR1, NEUROD2, MAFB, FOXP2, TFAP2D, BNC2, Other) against RGC orthotype, with a dot plot of the same transcription factors; dot size: No. of species, 0 to 12; colour: Average expression, −1 to 3.]

Fig. 4 | Multispecies integration of retinal ganglion cells. a, Integrated UMAP of RGCs from 12 mammals (cow was excluded owing to the paucity of RGC data). Cells are labelled by species of origin. For primates, cells from fovea and periphery are plotted separately. b, As in a, with RGCs labelled by orthotype. c, Dot plot showing differentially expressed genes within each RGC orthotype. Representation as in Fig. 3e. d, Left, confusion matrices showing that species-specific RGC clusters (Extended Data Figs. 1 and 3–6) map to mammalian RGC orthotypes in a specific fashion. Representation as in Fig. 3d, centre, except that clusters from fovea (F) and periphery (P) are mapped separately for primates. Right, confusion matrices showing the mapping of RGC clusters (columns) in lizard, chick and zebrafish to the 21 mammalian RGC orthotypes. Mapping to the single non-mammalian RGC orthotype is shown in Extended Data Fig. 10d. e, Left, confusion matrix showing that mouse RGC types (rows; naming as in ref. 20) belonging to transcription factor-based subsets39 (colours) map to the same orthotypes (columns). Right, dot plot showing specific expression patterns of subclass-specific transcription factor-encoding genes39 in orthotypes. Representation as in Fig. 3e.

and Extended Data Fig. 10a). Eighty-one per cent of mammalian RGC clusters (329 out of 408) mapped predominantly to a single orthotype (Fig. 4d). In species that contain more RGC types than orthotypes, transcriptomically similar RGC clusters mapped to the same orthotype. As was the case for bipolar cells, RGC orthotypes remained stable when lizard, chick and zebrafish were included in the integration (Fig. 4d, right), but were supplemented by an additional orthotype dominated by non-mammalian species (Extended Data Fig. 10b–d and Supplementary Table 3).

To test the reliability of orthotype analysis for RGCs, we searched for orthologues of an evolutionarily ancient set of RGC types called intrinsically photosensitive RGCs (ipRGCs). ipRGCs contain the photopigment melanopsin (encoded by OPN4), which enables them to generate visually evoked signals without input from photoreceptors36.

420 | Nature | Vol 624 | 14 December 2023


They mediate crucial non-image-forming visual functions, such as circadian entrainment and the pupillary light reflex. ipRGCs have been detected in the retinas of diverse vertebrate orders, including several of the species profiled here, generally on the basis of OPN4 expression37. ipRGCs also express the transcription factor-encoding gene EOMES (also known as TBR2), although some EOMES-expressing RGCs have not been functionally validated as ipRGCs. RGCs in two orthotypes, oRGC8 and oRGC9, expressed OPN4 (Extended Data Fig. 10e). oRGC9 contained five mouse RGC types, three of which were the ipRGC types M1a, M1b and M2, which express the highest levels of melanopsin. oRGC8 contained the paralogous types, MX and C8. Overall, out of 35 clusters from 11 species in these 2 oRGCs, 25 expressed OPN4 and 33 expressed EOMES. OPN4-expressing RGC types from chick and lizard also mapped to these orthotypes. Thus, cross-species integration captures an RGC group with a conserved physiological property.

We showed recently that 45 molecularly defined mouse RGC types, many of which map to physiologically and morphologically defined mouse RGC types38, can be grouped into subsets defined by selectively expressed transcription factor-encoding genes20,39,40. Some of these transcription factor-encoding genes (for example, EOMES, TBR1 and NEUROD2) have been implicated in RGC development41–44. Although many RGC subsets defined according to transcription factor-encoding gene expression align with morphologically or functionally defined RGC subclasses (for example, EOMES+ ipRGCs and Tbr1+ T-RGCs), others are novel (for example, Irx3+ RGCs and Bnc2+ RGCs). The mapping of mouse RGC types to RGC orthotypes mirrored these transcription factor-defined subsets (Fig. 4e, left), and subset-defining transcription factor expression patterns were recovered in a large proportion of species (Fig. 4e, right). These results suggest that, as noted above for photoreceptor, bipolar cell and amacrine cell subclasses, it may be possible to classify RGCs into evolutionarily conserved subclasses.

Although orthotypes for all neuronal classes were represented in all mammals, the number of neuronal types within a species varied over a greater range for RGCs (29 ± 10 (mean ± s.d.)) than for other classes (photoreceptors, 3–4; horizontal cells, 1–2; and bipolar cells, 14 ± 2) (Extended Data Figs. 1 and 3–6). Similarly, RGC orthotypes were associated with more types within a species (1.62 ± 1.39, corresponding to a coefficient of variation (CV) of 0.86) than other classes: 1 ± 0.05, CV = 0.05 for photoreceptors; 1.1 ± 0.25, CV = 0.22 for horizontal cells; and 1.13 ± 0.44, CV = 0.4 for bipolar cells (amacrine cells are poorly annotated and cannot be integrated across species at this time). Thus, the extent of variation within cell classes increases systematically from outer to inner retina in the order photoreceptor < horizontal cell < bipolar cell < RGC.

Orthologues of midget and parasol RGCs
In most species studied to date, no RGC type comprises more than about 10% of all RGCs. By contrast, the retina of many primates, including humans, is dominated by two closely related RGC types, ON and OFF midget RGCs, named for their diminutive dendritic trees45. Together they account for more than 80% of all RGCs in macaque and human, with similar abundance in fovea and periphery22,23. However, despite their importance for vision, no non-primate orthologues of midget RGCs have been found, and our own previous comparison of mouse and macaque primate RGCs did not find any correspondence22. Similarly, attempts to find orthologues of the next most abundant primate RGC types, ON and OFF parasol RGCs (5–10% of all RGCs), have remained inconclusive2.

We used orthotypes to revisit this issue. Each of the four abundant primate types mapped to a distinct orthotype (oRGC1, oRGC2, oRGC4 and oRGC5), and each of these orthotypes contained the corresponding cell type from both fovea and periphery of human, macaque and marmoset (Fig. 5a and Extended Data Fig. 11a). Remarkably, the mouse RGC types mapping to these orthotypes included a set of four related types called α-RGCs46; of the five mouse cell types mapping to the ON and OFF midget- and OFF parasol-containing orthotypes, three were α-RGCs. A resemblance of parasol RGCs to α-RGCs has been suggested previously22,47, but the correspondence was unexpected for midget RGCs, because α-RGCs are present at low abundance (around 2%) and are among the largest mouse RGCs. Nonetheless, several lines of evidence support the orthology between primate midgets and parasols and the mouse α-RGC types.

First, the four α-RGC types can be distinguished on the basis of response polarity (ON versus OFF) and response kinetics (sustained (s) versus transient (t)): αONs, αOFFs, αONt and αOFFt46. Mouse αONs and αOFFs mapped to ON and OFF midgets, respectively, and mouse αONt and αOFFt mapped to ON and OFF parasols, respectively. Second, midgets and parasols exhibit sustained and transient light responses, respectively, that match the kinetics of their mouse orthologues46,48. Third, dendrites of matched types arborize in homologous sublaminae of the inner plexiform layer, with the parasol and α-transient types nearer the centre of the layer than the midget and α-sustained types49. Fourth, morphological studies have identified the bipolar cell types that innervate midgets, parasols and α-RGCs50–52. In each case, the primate bipolar cell type that provides the majority of excitatory input to the midget or parasol RGC type is a member of the same bipolar cell orthotype as a mouse bipolar cell type that provides substantial input to the corresponding α-RGC type. Thus, although none of these metadata were provided explicitly, the integration matched types correctly based on their polarity, response kinetics, dendritic lamination and inputs (Fig. 5b). In addition, orthologues exhibit similar response properties: midget RGCs and sustained α-RGCs primarily report on contrast and are minimally feature-selective, whereas parasol RGCs and transient α-RGCs are motion-sensitive53,54.

We assessed the strength of the primate midget and parasol to mouse α-RGC correspondence with two additional statistical approaches. The first is factorized linear discriminant analysis55 (FLDA) (Extended Data Fig. 12a and Supplementary Note 2). Given single-cell transcriptomic data from cells that carry multiple categorical attributes, FLDA attempts to factorize the gene expression data into a low-dimensional representation in which each axis captures the variation along one attribute while minimally co-varying with other attributes. We applied FLDA to project primate midgets and parasols and mouse α-RGCs onto a 3D space whose three axes represent species (mouse–primate), kinetics (sustained–transient) and polarity (ON–OFF). FLDA generated a projection in which the relative arrangement of the four primate and the four mouse cell types was consistent with their attributes (Fig. 5c and Extended Data Fig. 12b). We then tested whether α-RGCs were a better transcriptomic match to midgets and parasols than other mouse RGC types carrying similar attributes. For this purpose, we identified a set of 20 mouse RGC types for which polarity (ON–OFF) and kinetics (sustained–transient) are known (Supplementary Table 4). We matched all 432 possible combinations of 4 drawn from this set with the midgets and parasols, calculated the FLDA projections, and ranked them on the basis of the magnitude of the variance captured by FLDA along the polarity and kinetics axes (Extended Data Fig. 12c). The best match comprised all four α-RGC types, and the next three matches contained three α-RGC types plus one other type (Extended Data Fig. 12d).

The second statistical method, geometric analysis of gene expression (GAGE), focuses on the geometric arrangement of the cluster means of RGC types in gene expression space (Supplementary Note 3). The cluster centroids for the macaque midget and parasol types form a four-cornered shape in the space of gene expression values. GAGE tests whether there are groups of mouse RGC types that form that same shape, except for a linear translation corresponding to species differences (Fig. 5d, inset). For every combination of four mouse cell types in the set described above, we scored how well the mouse shape

[Fig. 5 graphic omitted. Recoverable labels: a, confusion matrix of RGC clusters from each species (columns, including mouse C42_AlphaOFFS, C43_AlphaONS_M4, C45_AlphaOFFT and C41_AlphaONT) against RGC orthotypes oRGC1–oRGC21 (rows; colour scale, mapping (%)). b, Schematic pairing primate and mouse types across the inner nuclear layer (INL), inner plexiform layer (IPL) and ganglion cell layer (GCL): bipolar cells FMB and mBC1 (oBC1A), IMB and mBC7 (oBC7), DB3a and mBC3a (oBC3A), DB4 and mBC5 (oBC5AD); RGCs OFF MGC and C42 αOFF (oRGC1; OFF, sustained), ON MGC and C43 αON (oRGC4; ON, sustained), OFF PGC and C45 αOFF (oRGC5; OFF, transient), ON PGC and C41 αON (oRGC2; ON, transient). c, 3D projection of primate MGC and PGC types and mouse α-RGC types, with axes labelled mouse–macaque, sustained–transient and OFF–ON. d, Histogram of no. of mouse combinations against fraction of explained variance. e, Proportion of oRGC1 (MGC OFF) and oRGC4 (MGC ON) by species.]

Fig. 5 | Mammalian orthologues of midget and parasol RGCs. a, Confusion matrix showing RGC clusters from different species that map specifically to oRGC1, oRGC4, oRGC5 and oRGC2, which contain OFF and ON midget RGCs (MGCs) and OFF and ON parasol RGCs (PGCs). Representation as in Fig. 3f. Column names corresponding to primate midget and parasol types are shown in red, and mouse α-RGC types are shown in blue. b, Schematic delineating morphological and physiological similarities between primate midget and parasol RGCs and their α-RGC orthologues. Orthotypes (OTs) of each pair as well as the orthology among bipolar cell types that innervate them are also shown. Morphologies of neuronal types were created on the basis of published data (Supplementary Note 1). Within each pair, the left column corresponds to primate types and the right column corresponds to mouse types. c, FLDA projection of the scRNA-seq data for primate midget and parasol types and mouse α-RGC types onto the corresponding 3D space, with axes representing species, polarity and kinetics (see Supplementary Note 2). d, Matching MGCs and PGCs to mouse types by GAGE. Inset, given sets of mouse and primate RGC types, the model fits the arrangement of their cluster centroids in gene expression space by assuming a shape that is simply shifted to the other species via a linear translation. Symbols mark the four response types: circle, sustained; square, transient; open, ON; filled, OFF. The graph is a histogram of the fraction of explained variance showing, for each proposed combination of four mouse cell types, how well the resulting shape fits the macaque RGC geometry. The red bar shows the set of four α-RGC types. Green bars show combinations containing three α-RGC types. Grey bars, remaining sets of four mouse cell types as shown in Supplementary Table 4. e, Relative proportion of OFF and ON midget RGC orthologues in mammalian species based on frequencies of cells in oRGC1 and oRGC4.

matches the macaque shape (Methods). The four α-RGC types produced the strongest match by a large margin, followed by several combinations containing three α-RGC types (Fig. 5d). Finally, we considered matches for all 3,575,880 possible combinations of 4 drawn from the 45 transcriptomically defined mouse RGC types20. The four α-RGC types with the correct matching of polarity and kinetics with the MGCs and PGCs scored in second place out of all such combinations. The top match was biologically implausible (see Extended Data Fig. 12e).
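
The count of 3,575,880 quoted above equals 45 × 44 × 43 × 42, the number of ordered selections of 4 from 45, which is what one would expect if each chosen mouse type is assigned to a specific primate type. The Python sketch below illustrates that kind of exhaustive search on toy data. It is not the authors' GAGE implementation (Supplementary Note 3 and Methods describe that); the scoring rule, the function names and every coordinate are assumptions made purely for illustration.

```python
import itertools
import math


def mean_vector(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def explained_variance(target, candidate):
    """Fraction of the target shape's variance explained by the candidate
    shape after the least-squares translation (which aligns the two means).
    This definition is an illustrative assumption, not the published one."""
    t_mean = mean_vector(target)
    c_mean = mean_vector(candidate)
    total = 0.0
    residual = 0.0
    for t, c in zip(target, candidate):
        for ti, ci, tm, cm in zip(t, c, t_mean, c_mean):
            total += (ti - tm) ** 2
            residual += ((ti - tm) - (ci - cm)) ** 2
    return 1.0 - residual / total


def rank_matches(target, pool):
    """Score every ordered choice of len(target) types from the pool."""
    scored = []
    for names in itertools.permutations(sorted(pool), len(target)):
        candidate = [pool[name] for name in names]
        scored.append((explained_variance(target, candidate), names))
    return sorted(scored, reverse=True)


# Hypothetical macaque centroids, ordered as
# (ON midget, OFF midget, ON parasol, OFF parasol).
macaque = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [2.0, 2.0, 0.0]]

# Hypothetical mouse centroids: four form nearly the same square, shifted by
# about (10, 5, 1); the two decoys do not fit the shape.
mouse = {
    "alphaONs": [10.1, 5.0, 1.0],
    "alphaOFFs": [12.0, 4.9, 1.0],
    "alphaONt": [10.0, 7.1, 1.0],
    "alphaOFFt": [11.9, 7.0, 1.0],
    "decoyA": [11.0, 6.0, 3.0],
    "decoyB": [14.0, 9.0, 0.0],
}

ranking = rank_matches(macaque, mouse)
best_score, best_names = ranking[0]
print(best_names, round(best_score, 3))

# Ordered selections of 4 from the 45 mouse types, as quoted in the text.
print(math.perm(45, 4))
```

Because the best translation simply aligns the two means, the score reduces to comparing mean-centred shapes, so a species-wide offset in expression does not penalize a match while a wrong pairing of polarity or kinetics does.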



Together, these results provide strong support for the orthology of primate midget and parasol RGCs with mouse α-RGCs, suggesting that midget and parasol RGCs are not primate innovations as they have been considered to be. Moreover, the presence of midget and parasol orthologues in all the mammals studied here (Fig. 5e and Extended Data Fig. 11b) suggests that they are likely to have evolved from antecedent types present in the mammalian common ancestor.

For midget RGCs, we suggest a relationship between their marked expansion in the primate lineage (Fig. 5e) and the evolution of visual processing. In primates, the principal retinorecipient region is the dorsolateral geniculate nucleus (dLGN), whereas in mice it is the superior colliculus56. Midget RGCs project almost exclusively to the dLGN57. In mice, anterograde16 and retrograde58,59 tracing studies suggest that α-RGCs are overrepresented among those RGCs that project to the dLGN (two- to fourfold in ref. 53). The dLGN provides the dominant visual input to the primary visual cortex, whereas superior colliculus projects in large part to areas that control reflexive motor responses, including eye movements60. In primates, complex visual processing occurs largely at the cortical level, and may be best served by the relatively unprocessed, high-acuity rendering of the visual world that midget RGCs provide. The modest loss in response time in this system is presumably compensated by the greater flexibility in response type. As the cortex has a key role in primate vision, midget-like RGCs already present in the mammalian ancestor may have decreased in receptive field size and increased in number to facilitate this flexibility as primates evolved.

Conclusions
We integrated single-cell transcriptomic cell atlases of the retina from 17 vertebrate species and used them to assess the extent to which cell classes, subclasses and types have been conserved through vertebrate evolution. Our main results and the conclusions we draw from them are as follows. First, retinal cell classes and subclasses are highly conserved at the molecular level through evolution, mirroring their structural and functional conservation. The pattern of gene expression variation in classes is inconsistent with neutral transcriptome evolution, suggesting that selective pressures shape the cellular repertoire of the retina. Second, although greater cross-species variation exists at the level of cell types, numerous conserved types can be detected using an analytical framework that identifies transcriptomic groups, which we call orthotypes. Third, evolutionary divergence among types is more pronounced for RGCs than for other retinal classes, suggesting that the outer retina is built from a conserved parts list, whereas natural selection acts more strongly on diversifying those neuronal types that transmit information from the retina to the rest of the brain. Fourth, conserved transcription factors at all three levels (class, subclass and type) suggest that developmental programmes for the specification of retinal neurons have an ancient origin. Fifth, midget and parasol RGCs, which together comprise more than 90% of human RGCs, have orthologues in other mammalian species, suggesting that these primate cell types are derived from the expansion and modification of types present more than 300 million years ago in the retina of the last common ancestor of mammals. In mice, the orthologues are a numerically minor set of types called α-RGCs. The marked (approximately 40-fold) difference in abundance of midget orthologues between mice and humans correlates with the greater prominence of visual processing in the primate cortex. Knowing the orthologues of midget and parasol RGCs in several accessible models will aid efforts to slow their degeneration in blinding diseases such as glaucoma.

Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-023-06638-9.

1. Baden, T., Euler, T. & Berens, P. Understanding the retinal basis of vision across species. Nat. Rev. Neurosci. 21, 5–20 (2020).
2. Berson, D. M. in The Senses: A Comprehensive Reference (eds Masland, R. H. & Albright, T.) 491–520 (Academic Press, 2008).
3. Koonin, E. V. Orthologs, paralogs, and evolutionary genomics. Annu. Rev. Genet. 39, 309–338 (2005).
4. Alfoldi, J. & Lindblad-Toh, K. Comparative genomics as a tool to understand evolution and disease. Genome Res. 23, 1063–1068 (2013).
5. Durbin, R., Eddy, S. R., Krogh, A. & Mitchison, G. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids (Cambridge Univ. Press, 1998).
6. Zeng, H. & Sanes, J. R. Neuronal cell-type classification: challenges, opportunities and the path forward. Nat. Rev. Neurosci. 18, 530–546 (2017).
7. Zeng, H. What is a cell type and how to define it? Cell 185, 2739–2755 (2022).
8. Marioni, J. C. & Arendt, D. How single-cell genomics is changing evolutionary and developmental biology. Annu. Rev. Cell Dev. Biol. 33, 537–553 (2017).
9. Tanay, A. & Sebe-Pedros, A. Evolutionary cell type mapping with single-cell genomics. Trends Genet. 37, 919–932 (2021).
10. Roberts, R. J. V., Pop, S. & Prieto-Godino, L. L. Evolution of central neural circuits: state of the art and perspectives. Nat. Rev. Neurosci. 23, 725–743 (2022).
11. Dowling, J. E. The Retina: An Approachable Part of the Brain 2nd edn (Harvard Univ. Press, 2012).
12. Masland, R. H. The neuronal organization of the retina. Neuron 76, 266–280 (2012).
13. Cajal, S. R. Y. La retine des vertebres. Cellule 9, 119–255 (1893).
14. Sanes, J. R. & Masland, R. H. The types of retinal ganglion cells: current status and implications for neuronal classification. Annu. Rev. Neurosci. 38, 221–246 (2015).
15. Kerschensteiner, D. Feature detection by retinal ganglion cells. Annu. Rev. Vis. Sci. 8, 135–169 (2022).
16. Martersteck, E. M. et al. Diverse central projection patterns of retinal ganglion cells. Cell Rep. 18, 2058–2072 (2017).
17. Robles, E., Laurell, E. & Baier, H. The retinal projectome reveals brain-area-specific visual representations generated by ganglion cell diversity. Curr. Biol. 24, 2085–2096 (2014).
18. Chen, J. et al. A quantitative framework for characterizing the evolutionary history of mammalian gene expression. Genome Res. 29, 53–63 (2019).
19. Shekhar, K. et al. Comprehensive classification of retinal bipolar neurons by single-cell transcriptomics. Cell 166, 1308–1323.e1330 (2016).
20. Tran, N. M. et al. Single-cell profiles of retinal ganglion cells differing in resilience to injury reveal neuroprotective genes. Neuron 104, 1039–1055.e1012 (2019).
21. Rheaume, B. A. et al. Single cell transcriptome profiling of retinal ganglion cells identifies cellular subtypes. Nat. Commun. 9, 2759 (2018).
22. Peng, Y. R. et al. Molecular classification and comparative taxonomics of foveal and peripheral cells in primate retina. Cell 176, 1222–1237.e1222 (2019).
23. Yan, W. et al. Cell atlas of the human fovea and peripheral retina. Sci. Rep. 10, 9802 (2020).
24. Cowan, C. S. et al. Cell types of the human retina and its organoids at single-cell resolution. Cell 182, 1623–1640.e1634 (2020).
25. Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).
26. Yan, W. et al. Mouse Retinal Cell Atlas: molecular identification of over sixty amacrine cell types. J. Neurosci. 40, 5177–5195 (2020).
27. Yamagata, M., Yan, W. & Sanes, J. R. A cell atlas of the chick retina based on single-cell transcriptomics. eLife 10, e63907 (2021).
28. Kolsch, Y. et al. Molecular classification of zebrafish retinal ganglion cells links genes to cell types to behavior. Neuron 109, 645–662.e649 (2021).
29. Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902.e1821 (2019).
30. Petridou, E. & Godinho, L. Cellular and molecular determinants of retinal cell fate. Annu. Rev. Vis. Sci. 8, 79–99 (2022).
31. Bedford, T. & Hartl, D. L. Optimization of gene expression by natural selection. Proc. Natl Acad. Sci. USA 106, 1133–1138 (2009).
32. Grimes, W. N., Songco-Aguas, A. & Rieke, F. Parallel processing of rod and cone signals: retinal function and human perception. Annu. Rev. Vis. Sci. 4, 123–141 (2018).
33. Della Santina, L. et al. Glutamatergic monopolar interneurons provide a novel pathway of excitation in the mouse retina. Curr. Biol. 26, 2070–2077 (2016).
34. Baden, T. & Osorio, D. The retinal basis of vertebrate color vision. Annu. Rev. Vis. Sci. 5, 177–200 (2019).
35. Song, P. I., Matsui, J. I. & Dowling, J. E. Morphological types and connectivity of horizontal cells found in the adult zebrafish (Danio rerio) retina. J. Comp. Neurol. 506, 328–338 (2008).
36. Hattar, S., Liao, H. W., Takao, M., Berson, D. M. & Yau, K. W. Melanopsin-containing retinal ganglion cells: architecture, projections, and intrinsic photosensitivity. Science 295, 1065–1070 (2002).
37. Do, M. T. H. Melanopsin and the intrinsically photosensitive retinal ganglion cells: biophysics to behavior. Neuron 104, 205–226 (2019).
38. Goetz, J. et al. Unified classification of mouse retinal ganglion cells using function, morphology, and gene expression. Cell Rep. 40, 111040 (2022).
39. Shekhar, K., Whitney, I. E., Butrus, S., Peng, Y. R. & Sanes, J. R. Diversification of multipotential postmitotic mouse retinal ganglion cell precursors into discrete types. eLife 11, e73809 (2022).
40. Whitney, I. E. et al. Vision-dependent and -independent molecular maturation of mouse retinal ganglion cells. Neuroscience 508, 153–173 (2023).
41. Cherry, T. J. et al. NeuroD factors regulate cell fate and neurite stratification in the developing retina. J. Neurosci. 31, 7365–7379 (2011).

42. Kiyama, T. et al. Essential roles of Tbr1 in the formation and maintenance of the orientation-selective J-RGCs and a group of OFF-sustained RGCs in mouse. Cell Rep. 27, 900–915.e905 (2019).
43. Mao, C. A. et al. T-box transcription regulator Tbr2 is essential for the formation and maintenance of Opn4/melanopsin-expressing intrinsically photosensitive retinal ganglion cells. J. Neurosci. 34, 13083–13095 (2014).
44. Liu, J. et al. Tbr1 instructs laminar patterning of retinal ganglion cell dendrites. Nat. Neurosci. 21, 659–670 (2018).
45. Polyak, S. L. The Retina (Univ. of Chicago Press, 1941).
46. Krieger, B., Qiao, M., Rousso, D. L., Sanes, J. R. & Meister, M. Four alpha ganglion cell types in mouse retina: function, structure, and molecular signatures. PLoS ONE 12, e0180091 (2017).
47. Crook, J. D. et al. Y-cell receptive field and collicular projection of parasol ganglion cells in macaque monkey retina. J. Neurosci. 28, 11277–11291 (2008).
48. de Monasterio, F. M. Center and surround mechanisms of opponent-color X and Y ganglion cells of retina of macaques. J. Neurophysiol. 41, 1418–1434 (1978).
49. Nassi, J. J. & Callaway, E. M. Parallel processing strategies of the primate visual system. Nat. Rev. Neurosci. 10, 360–372 (2009).
50. Tsukamoto, Y. & Omi, N. OFF bipolar cells in macaque retina: type-specific connectivity in the outer and inner synaptic layers. Front. Neuroanat. 9, 122 (2015).
51. Tsukamoto, Y. & Omi, N. ON bipolar cells in macaque retina: type-specific synaptic connectivity with special reference to OFF counterparts. Front. Neuroanat. 10, 104 (2016).
52. Yu, W. Q. et al. Synaptic convergence patterns onto retinal ganglion cells are preserved despite topographic variation in pre- and postsynaptic territories. Cell Rep. 25, 2017–2026.e2013 (2018).
53. Wang, F., Li, E., De, L., Wu, Q. & Zhang, Y. OFF-transient alpha RGCs mediate looming triggered innate defensive response. Curr. Biol. 31, 2263–2273.e2263 (2021).
54. Manookin, M. B., Patterson, S. S. & Linehan, C. M. Neural mechanisms mediating motion sensitivity in parasol ganglion cells of the primate retina. Neuron 97, 1327–1340.e1324 (2018).
55. Qiao, M. Factorized discriminant analysis for genetic signatures of neuronal phenotypes. Front. Neuroinform. https://doi.org/10.3389/fninf.2023.1265079 (2023).
56. Seabrook, T. A., Burbridge, T. J., Crair, M. C. & Huberman, A. D. Architecture, function, and assembly of the mouse visual system. Annu. Rev. Neurosci. 40, 499–538 (2017).
57. Dacey, D. M., Peterson, B. B., Robinson, F. R. & Gamlin, P. D. Fireworks in the primate retina: in vitro photodynamics reveals diverse LGN-projecting ganglion cell types. Neuron 37, 15–27 (2003).
58. Rosón, M. R. et al. Mouse dLGN receives functional input from a diverse population of retinal ganglion cells with limited convergence. Neuron 102, 462–476.e468 (2019).
59. Johnson, K. P. et al. Cell-type-specific binocular vision guides predation in mice. Neuron 109, 1527–1539.e1524 (2021).
60. Ito, S. & Feldheim, D. A. The mouse superior colliculus: an emerging model for studying circuit formation and function. Front. Neural Circuits 12, 10 (2018).

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2023



Methods on an Illumina NovaSeq at the Bauer Core Facility at Harvard University.
Sequencing data were demultiplexed and aligned using Cell Ranger
Ethical compliance software (version 4.0.0, 10X Genomics).
Human eyes were obtained post-mortem at a median of 6 h from death
either from Massachusetts General Hospital via the Rapid Autopsy Histology
Program or from The Lion’s Eye Bank in Murray, UT. Acquisition and Whole eyes were fixed in 4% paraformaldehyde (in PBS) for 1–2 h and
use of post-mortem human tissue samples were approved by either then transferred to PBS. Either whole retinas or 8-mm punches of cen-
the Institutional Review Board of the University of Utah (protocol tral retina were dissected out and sunk in 30% sucrose in PBS overnight
IRB_00010201), or the Human Study Subject Committees at Harvard at 4 °C, before being embedded in tissue freezing medium and sec-
(Dana Farber/Harvard Cancer Center protocol no. 13-416), and pro- tioned coronally at 20 μm in a cryostat. Sections were mounted onto
cedures were compliant with the National Human Genome Research coated slides. Slides were incubated for 1 h with 5% donkey serum (with
Institute policies. All donors were confirmed to have no history or 0.1% Triton X-100) at room temperature, then overnight with primary
clinical evidence of ocular disease or intraocular surgery. Informed antibodies (1:500 RBPMS (PhosphoSolutions 1832-RBPMS); 1:400
consent was obtained from all donors per IRB protocols. Pig, cow and CHX10 (Novus Biologicals NBP1-84476); 1:50 AP2A (DSHB 3B5)) at 4 °C,
sheep eyes were obtained, on average, 1 h after death from an abattoir and finally for 2 h with secondary antibodies in PBS at room tempera-
located in West Groton, MA. Other animal eyes were obtained ture. Images were acquired on Zeiss LSM 900 confocal microscopes
from animal colonies maintained at Brandeis University (ferret), with 405, 488, 568 and 647 nm lasers, and processed using Zeiss ZEN
California Institute of Technology (tree shrew), Harvard University software suites.
(Peromyscus), MIT (marmoset), NIH (squirrel), University of Manchester,
UK (Rhabdomys), University of Georgia (lizard) and University of Preprocessing of transcriptomic data
California, Los Angeles (lamprey and opossum). Animals of both sexes We used Cellranger (v7.0, 10X Genomics) to align the scRNA-seq and
were included when possible. Animal experiments conducted in the snRNA-seq datasets, following the manufacturer’s instructions. For
USA were approved by the Institutional Animal Care and Use Com- each species, sequencing reads were demultiplexed into distinct sam-
mittees (IACUCs) in each location. Rhabdomys tissue was collected ples and the.fastq.gz files corresponding to each sample were aligned to
in accordance with the Animals, Scientific Procedures Act of 1986 reference transcriptomes to obtain binary alignment map (.bam) files.
(UK) and approved by the University of Manchester ethical review The reference transcriptomes used are listed in Supplementary Table 5.
committee. To include both exonic and intronic reads in the quantification of gene
expression for each sample, regardless of cellular or nuclear origin,
Number of animals and cells or nuclei used we applied velocyto61 to the corresponding.bam files. This generated
The number of animals used, biological replicates sequenced, and two separate gene expression matrices (GEMs) (genes × cells) for each
high-quality cells or nuclei collected are indicated for each species in sample, corresponding to ‘spliced’ and ‘unspliced’ reads. The two GEMs
Extended Data Figs. 1 and 3–6. The number of cells or nuclei recovered were summed element by element to obtain the ‘total’ GEM for each
for each class within each species is indicated in Supplementary Table 1. sample. For each species, GEMs from different samples were combined
See also ‘Statistics and reproducibility’. (column-wise concatenated) to yield a species GEM.
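The construction of the 'total' and species-level GEMs described under 'Preprocessing of transcriptomic data' can be sketched as follows. This is a minimal illustration in plain Python on toy count matrices (rows are genes, columns are cells); the function names `total_gem` and `species_gem` are hypothetical and are not taken from the authors' scripts, which operate on velocyto output in R.

```python
from typing import List

Matrix = List[List[int]]  # rows = genes, columns = cells


def total_gem(spliced: Matrix, unspliced: Matrix) -> Matrix:
    """Element-by-element sum of the 'spliced' and 'unspliced' count matrices."""
    if len(spliced) != len(unspliced):
        raise ValueError("matrices must have the same number of genes")
    total = []
    for s_row, u_row in zip(spliced, unspliced):
        if len(s_row) != len(u_row):
            raise ValueError("matrices must have the same number of cells")
        total.append([s + u for s, u in zip(s_row, u_row)])
    return total


def species_gem(sample_gems: List[Matrix]) -> Matrix:
    """Column-wise concatenation of per-sample GEMs that share one gene order."""
    n_genes = len(sample_gems[0])
    if any(len(gem) != n_genes for gem in sample_gems):
        raise ValueError("all samples must contain the same genes")
    return [
        [count for gem in sample_gems for count in gem[i]]
        for i in range(n_genes)
    ]


# Toy example: two genes; sample 1 has two cells, sample 2 has one cell.
sample1 = total_gem([[1, 0], [2, 3]], [[0, 1], [1, 0]])
sample2 = total_gem([[4], [0]], [[1], [2]])
combined = species_gem([sample1, sample2])
print(combined)
```

The sketch assumes that every sample of a species was aligned to the same reference, so that gene rows line up before concatenation.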

snRNA-seq Computational analysis


Nuclei isolation and sorting. For isolation of nuclei, frozen retinal Analysis of the GEMs was performed in R. Our workflow was based on
tissues were homogenized in a Dounce homogenizer in 1 ml lysis buffer Seurat v4.3.0 for single-cell analysis developed and maintained by the
consisting of 0.1% NP-40 in a solution containing 10 mM Tris, 1 mM Satija laboratory29,62 (https://satijalab.org/seurat/) and includes sev-
CaCl2, 8 mM MgCl2, 15 mM NaCl, 0.1 U μl−1 RNAse inhibitor (Promega eral packages used for statistical calculations and data visualizations
RNasin Ribonuclease Inhibitor N2615), and 0.02 U μl−1 DNAse (D4527, including MASS v7.3.60, pvclust v2.2.0, reshape2 v1.4.4, stats v4.3.0,
Sigma Aldrich). The homogenized tissue was passed through a 40-µm ggplot2 v3.4.2, dendextend v1.17.1 and ggdendro v0.1.23. We describe
cell strainer. The filtered nuclei were pelleted at 500 rcf for 5 min, the analysis steps here at a high level. We have also made the analysis
resuspended in staining buffer (Tween 0.02% and 2% BSA in the Tris scripts available via Zenodo (https://zenodo.org/record/8067826)
base buffer) and stained with anti-NEUN (1:300, Sigma FCMAB317PE and on our Github page (https://github.com/shekharlab/RetinaEvo-
or MAB377A5) and anti-CHX10 (1:600, Santa Cruz Biotechnology lution).
sc-365519 AF647) for 12 min at 4 °C.
Following staining, nuclei were centrifuged, resuspended in sorting Segregation of major retinal cell classes. Data from each species
buffer (2% BSA in the Tris base buffer), and counterstained with DAPI were separately analysed through a clustering procedure to identify
(1:1,000). The NEUN+ and CHX10+ nuclei were sorted into separate high-quality cells, and segregate the major cell classes (photorecep-
tubes using BD FACSDiva v8.02 (Extended Data Fig. 2a–c), pelleted tor, bipolar cell, horizontal cell, amacrine cell, RGC and Müller glia). In
again at 500 rcf for 5 min, resuspended in 0.04% non-acetylated BSA/ brief, GEMs from different replicates were combined, and transcript
PBS solution, and adjusted to a concentration of 1,000 nuclei per µl. counts in each cell were normalized to a total library size of 10,000 and
The integrity of the nuclear membrane and presence of non-nuclear log-transformed (X → log (X + 1)). We identified the top 2,000 highly
material were assessed under a bright-field microscope (Extended variable genes and applied principal components analysis to factorize
Data Fig. 2d,e) before loading into a 10X Chromium Single Cell Chip the submatrix corresponding to these highly variable genes. Using
(10X Genomics) with a targeted recovery of 8,000 nuclei per channel. the subspace corresponding to the top 20 principal components, we
built a k-nearest neighbour graph on the data, and then clustered with
Library preparation. Single-nuclei libraries were generated with a resolution parameter of 0.5 using Seurat’s FindClusters function.
either Chromium 3′ V3, or V3.1 platform (10X Genomics) following the The same principal components were used to embed the cells onto
manufacturer’s protocol. In brief, single nuclei were partitioned into a 2D visualization using uniform manifold approximation and projection (UMAP)63. The
Gel-beads-in-Emulsion where nuclear lysis and barcoded reverse tran- 2D embeddings were solely used to visualize clustering structure and
scription of RNA would take place to yield cDNA; this was followed by gene expression patterns post hoc.
amplification, enzymatic fragmentation and 5′ adapter and sample Each cluster was assigned to one of the six major retinal cell
index attachment to yield the final libraries. Libraries were sequenced classes based on expression of orthologues of canonical markers
characterized in mice25: photoreceptors (Arr3, Rho and Crx), horizon- further analysis. Overall, we found 1,905 1:1 orthologues among all
tal cells (Calb1, Onecut1, Onecut2 and Lhx1), bipolar cells (Vsx1, Otx2 17 species, 4,560 among the 16 jawed vertebrates (that is, omitting
and Grik1), amacrine cells (Gad1, Gad2, Tfap2a, Tfap2b and Tfap2c), lamprey) and 6,693 among the 13 mammals. The number of shared
RGCs (Rbpms, Nefl, Nefm and Slc17a6) and Müller glia (Glul, Apoe and orthologues decreased with evolutionary distance, and we found fewer
Rlbp1). Clusters that mapped to other cell types found at much lower orthologues shared between mammals and non-mammalian vertebrates
frequency (such as endothelial cells or microglia) or that contained than among mammals.
low-quality cells were not considered further. The number of cells of
each class in each species is indicated in Supplementary Table 1. We Visualization of cell classes. For an alternative view on the cell classes,
note that because many experiments were designed to enrich certain we subsampled each cell class to 200 per species, and then combined
classes (RGCs or bipolar cells), the relative frequencies do not reflect the GEMs. The resulting GEMs were integrated with Seurat, treating each
endogenous values. species as a ‘batch’. Note that batch correction was not performed for
samples within a species, nor was cell class information provided to the
Integration and clustering to identify species-specific types for integration. The resulting integrated data was visualized on a UMAP
photoreceptors, horizontal cells, bipolar cells and RGCs. We sepa- (Fig. 2d and Extended Data Fig. 8). Dendrograms for the cell-averaged
rated photoreceptors, horizontal cells, bipolar cells and RGCs within profiles were constructed using hclust (package stats), and then plot-
each species, and clustered them independently using the following ted in a circular representation using the circlize_dendrogram function
procedure. After subsetting the data by class, cells with abnormally (package dendextend) (Extended Data Fig. 7a).
high (>mean + 2 × s.d.) or low (<mean − 2 × s.d.) counts were removed.
We also removed replicate batches that contained the class of inter- Evolutionary variation of pseudobulk transcriptomes. For each spe-
est at a frequency less than 50 cells. We split the cells by replicate cies, we computed cell-averaged (or pseudobulk) gene expression
ID and used Seurat’s integration pipeline to remove batch effects, vectors for the six major cell classes (photoreceptor, horizontal cell,
reduce dimensionality and cluster the data in a shared low-dimensional bipolar cell, amacrine cell, RGC and Müller glia). Each pseudobulk vec-
integrated space. We selected the top 20–25 latent variables tor was z-scored (subtract mean and divide by variance) prior to sub-
in the integrated space to identify clusters and generate 2D UMAP sequent computations. The mean squared expression distance (MSD)
visualizations. between two species for a cell class was calculated as the squared Euclidean
We initially deliberately overclustered the data using a resolution distance between the corresponding pseudobulk vectors,
parameter of 1.1. Clusters were then merged or pruned as follows: for $\mathrm{MSD}(a, b) = \lVert a - b \rVert^{2}$. To analyse evolutionary trends within a class
each cluster, we calculated differentially expressed marker genes, and (Fig. 2e), we compared MSD(a, b) to evolutionary time separating the
these markers were inspected to determine if clusters should be merged corresponding species, $t(a, b)$. To estimate the evolutionary time for
or removed. Some clusters were also removed if their top differentially each pair of species, we downloaded a phylogenetic tree of vertebrate
expressed markers were widely expressed in several clusters, if they had species from the UCSC Genome Browser at http://hgdownload.cse.
lower RNA counts compared to other clusters, or if several of the top ucsc.edu/goldenpath/hg19/multiz100way/65. Evolutionary time separating
differentially expressed markers were canonical markers for contaminant a pair of species was assumed to be the branch length between
cell classes. If more than 20% of cells were removed via pruning, the corresponding nodes of this tree, measured in units of substitutions
the filtered data was subjected to another round of integration and per 100 bp of neutrally evolving sites. Branch lengths were computed
clustering. Two or more clusters were merged if a differential expression using the Environment for Tree Exploration toolkit66. We then fit the
test failed to find markers that sufficiently distinguished the clusters. MSD versus t using a power law model, $\mathrm{MSD} = a t^{b}$, introduced earlier18,
We applied these steps to define photoreceptor, horizontal cell, which is reported in Fig. 2e and Extended Data Fig. 7e. We also attempted
bipolar cell and RGC clusters for species initially reported in this paper: to fit the data with a linear model $\mathrm{MSD} = a + bt$ and an Ornstein–Uhlenbeck
Peromyscus, ferret, opossum, brown anole lizard, cow, sheep, pig, model $\mathrm{MSD} = a(1 - e^{-bt})$, but both produced fits with lower
13-lined ground squirrel, 4-striped grass mouse, marmoset and tree $R^{2}$ than the power law model.
shrew. Individual clusters correspond to individual cell types, and in
some cases, to small groups of closely related types. For the sake of Data integration and identification of orthotypes. We identified
consistency, we also applied the same procedure to photoreceptor, orthotypes separately for photoreceptors, horizontal cells, bipolar
horizontal cell, bipolar cell and RGC data of species published elsewhere cells and RGCs. In each case, we used the following steps: (1) Within
(mouse19,20, macaque22, human23, zebrafish28 and chick27). In all each species, the corresponding GEM for each type was downsampled
cases, our clusters were largely consistent with published annotations, cluster-wise to include no more than 200 cells per cluster. This ensures
and we therefore labelled these clusters based on their published equitable representation of the transcriptomic clusters indicated in
labels. Extended Data Figs. 3–6; (2) the downsampled species-specific GEMs
were combined along the set of shared gene orthologues, normalized to
Selection of shared orthologous genes. Orthologous genes were 10,000 counts per cell, and log-transformed; (3) 2,000 highly variable
identified using orthology tables via Ensembl BioMart (https://useast. genes were selected within each species, and features that were repeat-
ensembl.org/info/data/biomart/index.html). Using mouse as a refer- edly variable were used for anchor finding, integrated dimensionality
ence species, pairwise orthology tables were generated between mouse reduction, and clustering of GEMs based on the Seurat pipeline29. The
and every other species. These orthology tables contained information resulting clusters were called orthotypes. A resolution of 0.5 was used
about the number of predicted orthologues for every mouse gene for the clustering. Transcriptomically proximal orthotypes based on a
within each species. Mouse genes that had a 1:1 orthologue in every gene expression dendrogram that contained distinct subsets of species
other species were retained as the set of orthologous features, with the were merged. Note that other than the downsampling step, species
exception of zebrafish. Due to a whole-genome duplication, zebrafish has cluster IDs were not used to influence the selection of variable genes,
several paralogous pairs of genes (for example, rbpms2a and rbpms2b) integration or clustering steps.
known as ‘ohnologs’64. The prevalence of ohnologs results in a paucity
of 1:1 orthologues. To address this issue, we collapsed each ohnolog Integrating mammalian and non-mammalian datasets. In several
pair by summing over their expression (for example, rbpms2a and cases, cells from non-mammalian species formed orthotypes sepa-
rbpms2b to rbpms2). If the ohnologs were the only orthologues of a rate from those containing cells from mammalian species. We believe
gene, then the composite gene was regarded as the 1:1 orthologue for that this result largely reflects three issues. First, the representation
of species classes in our study is skewed: 13 mammals vs 1 reptile, 1 Supplementary Note 2 shows that u, v and w are solutions to general-
bird and 1 fish. Second, non-mammalian species are generally more ized eigenvalue problems.
evolutionarily distant from each other than mammalian species are
from each other. Third, the number of 1:1 orthologous genes decreas- Geometric analysis of gene expression
es as more distant species are co-analysed, which further compro- This approach is similar in intent to FLDA in that the goal is to identify
mises integration due to the loss of features. Including additional axes in gene expression space that capture the structure of the data, and
non-mammalian species and/or improving computational methods that the choice of these axes is guided by a structure imposed through a
may lead to greater inclusion of non-mammalian cell types in the cur- Cartesian classification of cell types (for example ON vs OFF or primate
rent mammalian orthotypes. vs mouse). The main difference is that FLDA also attempts to capture
the variance across cells within a type, and this influences the selection
Statistics and reproducibility of the composite axes u, v and w. By contrast, GAGE only seeks to model
Based on the cluster-informed downsampling procedure described the shape formed by the gene expression centroids of the cell types
above, n = 32,350 cells of multiple cell classes were used to generate under consideration. Thus, for a quartet of primate cell types (MGC
Fig. 2d, and 38,366 bipolar cells, 61,161 RGCs, 13,605 photoreceptors OFF, MGC ON, PGC OFF and PGC ON) that form some shape in gene
and 5,405 horizontal cells were used to generate the orthotype results expression space, this method asks if there is a quartet of mouse cell
shown in Figs. 3 and 4. The mammalian orthotypes remained robust types that forms the same shape. The mathematical and implementa-
to different downsampling trials (see below), as well as the inclusion tion details of this method are delineated in Supplementary Note 3.
of non-mammals in the analysis (refer to Fig. 3d and Extended Data
Fig. 9d for bipolar cells, and Fig. 4d and Extended Data Fig. 10c for Reporting summary
RGCs). Across downsampling trials, we found that cells mapping to Further information on research design is available in the Nature
a given orthotype were present in the same cluster >90% of the time. Portfolio Reporting Summary linked to this article.
As the orthotypes are the result of a clustering of the integrated data,
the number of orthotypes depends on the resolution parameter. We
varied the clustering resolution and tracked the number of orthotypes, Data availability
the adjusted Rand index (ARI) of the clustering, and the number of The raw and processed sequencing data produced in this work are
species-specific orthotypes. The bipolar cell orthotypes were robust available via the Gene Expression Omnibus (GEO) under acces-
across a wide range of resolution (0.4–1.5), as indicated by a stable num- sion GSE237215. The species-specific datasets are available via
ber of orthotypes (16–21), high values of the ARI (0.88–0.96), and very the subseries accession numbers GSE237202–GSE237214. Previ-
few, if any, species-specific orthotypes. The RGC orthotypes exhibited ously published data utilized in this paper were downloaded from
higher sensitivity to the resolution parameter over the same range, with GEO repositories with accession numbers GSE81905, GSE137400,
the number of clusters ranging from 26 to 46. For resolution values over GSE152842, GSE148077, GSE15910 and GSE236005. Species phylogenetic
1, more than 5 species-specific orthotypes were consistently observed trees were downloaded from the UCSC Genome Browser
across trials. However, ARI values were reasonably high across values database (https://genome.ucsc.edu), and species reference genomes
tested (0.625–0.849). The results presented in the main text are for a are available on Ensembl (https://www.ensembl.org). Source data are
resolution of 0.5. provided with this paper.
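The adjusted Rand index (ARI) used in 'Statistics and reproducibility' compares two labellings of the same cells and corrects the raw pair agreement for chance. The following is a minimal stdlib-only sketch of the standard ARI formula; the function name `adjusted_rand_index` is ours, and which two clusterings were compared in the paper (for example, a given resolution against a reference resolution) is an assumption not settled by the text.

```python
from collections import Counter
from math import comb
from typing import Hashable, Sequence


def adjusted_rand_index(labels_a: Sequence[Hashable],
                        labels_b: Sequence[Hashable]) -> float:
    """Adjusted Rand index between two labellings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labellings must cover the same items")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: the partitions agree trivially

    # Pairs of items placed together in both, in A only, and in B only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs  # agreement expected by chance
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # both partitions are trivial (for example, one cluster each)
    return (together_both - expected) / (maximum - expected)


# Identical partitions with permuted label names give an ARI of 1.
same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Partitions that never co-cluster the same pair give a negative ARI.
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(same, crossed)
```

Because the ARI is invariant to the naming of clusters, it can be applied directly to cluster IDs produced at different resolution values.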
We repeated the orthotype analysis for bipolar cells using three
alternative integration methods: Harmony67, Liger68 and scVI69. All
three methods produced results consistent with those from Seurat, Code availability
but they generated several additional species-specific orthotypes and scRNA-seq data clustering, integration and visualization was per-
also did not resolve some known distinctions among bipolar cell types. formed in the R statistical language, and heavily relied on the Seurat
We therefore used Seurat to obtain the results presented in the text. package (https://satijalab.org/seurat/). All scripts are available via
Zenodo (https://zenodo.org/record/8067826) and on our GitHub page
Factorized linear discriminant analysis (https://github.com/shekharlab/RetinaEvolution). FLDA analysis was
FLDA seeks a low-dimensional factorization of high-dimensional gene performed in Python, and the code and documentation are available at
expression data from cells with multiple categorical attributes such that https://github.com/muqiao0626/FLDA. GAGE analysis was performed
each axis of the low-dimensional space captures the variation along in Python, and the code and documentation are available at https://
one attribute while minimizing co-variation with other attributes. The github.com/markusmeister/Gene-Geometry.
mathematical derivations underlying FLDA are described in a previous
paper55, and are summarized in Supplementary Note 2. In this study, we 61. La Manno, G. et al. RNA velocity of single cells. Nature 560, 494–498 (2018).
62. Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587.
applied FLDA to factorize transcriptomic data for RGCs carrying three e3529 (2021).
categorical attributes: response polarity (ON vs OFF), response kinetics 63. Becht, E. et al. Dimensionality reduction for visualizing single-cell data using UMAP. Nat.
(transient vs sustained) and species (mouse vs primate). Using A, B and Biotechnol. 37, 38–44 (2019).
64. Howe, K. et al. The zebrafish reference genome sequence and its relationship to the human
C to represent these attributes, the total gene expression covariance genome. Nature 496, 498–503 (2013).
matrix can be expressed as: 65. Tyner, C. et al. The UCSC Genome Browser database: 2017 update. Nucleic Acids Res. 45,
D626–D634 (2017).
ΣT = ΣA + ΣB + ΣC + Σe 66. Huerta-Cepas, J., Dopazo, J. & Gabaldon, T. ETE: a python environment for tree exploration.
BMC Bioinf. 11, 24 (2010).
67. Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony.
Nat. Methods 16, 1289–1296 (2019).
where ΣT is the total covariance matrix, and ΣA, ΣB and ΣC are covariance
68. Welch, J. D. et al. Single-cell multi-omic integration compares and contrasts features of
explained by attributes A, B and C respectively. Σe is the residual vari- brain cell identity. Cell 177, 1873–1887.e1817 (2019).
ance that is not explained by these attributes. 69. Lopez, R., Regier, J., Cole, M. B., Jordan, M. I. & Yosef, N. Deep generative modeling for
single-cell transcriptomics. Nat. Methods 15, 1053–1058 (2018).
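The covariance decomposition above can be illustrated for a single gene, where each covariance matrix reduces to a scalar variance. The sketch below is a simplified analogue, not the FLDA implementation: it assumes a balanced design with only two attributes (response polarity and kinetics), toy expression values, and a hypothetical function name `variance_components`. Under those assumptions the total variance splits exactly into a part explained by each attribute plus a residual.

```python
from statistics import fmean
from typing import Dict, List, Tuple


def variance_components(
    values: Dict[Tuple[str, str], List[float]],
) -> Tuple[float, float, float, float]:
    """Return (total, attribute A, attribute B, residual) variances for one gene.

    `values[(a, b)]` holds the expression of cells with level `a` of attribute A
    and level `b` of attribute B. The additive split is exact only when every
    (a, b) combination contains the same number of cells.
    """
    cells = [(a, b, x) for (a, b), xs in values.items() for x in xs]
    n = len(cells)
    grand = fmean([x for _, _, x in cells])
    mean_a = {a: fmean([x for a2, _, x in cells if a2 == a]) for a, _, _ in cells}
    mean_b = {b: fmean([x for _, b2, x in cells if b2 == b]) for _, b, _ in cells}

    var_total = sum((x - grand) ** 2 for _, _, x in cells) / n
    var_a = sum((mean_a[a] - grand) ** 2 for a, _, _ in cells) / n
    var_b = sum((mean_b[b] - grand) ** 2 for _, b, _ in cells) / n
    # Residual: whatever the two attribute means do not explain.
    var_e = sum((x - mean_a[a] - mean_b[b] + grand) ** 2 for a, b, x in cells) / n
    return var_total, var_a, var_b, var_e


# Toy data: two cells per combination of polarity and kinetics.
toy = {
    ("ON", "transient"): [5.0, 7.0],
    ("ON", "sustained"): [3.0, 5.0],
    ("OFF", "transient"): [2.0, 2.0],
    ("OFF", "sustained"): [0.0, 2.0],
}
var_total, var_a, var_b, var_e = variance_components(toy)
print(var_total, var_a + var_b + var_e)
```

FLDA itself works with full gene-by-gene covariance matrices and three attributes; the scalar case is shown only to make the additive structure of the decomposition concrete.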
FLDA identifies a 3D embedding (u, v, w) of the cells such that u
maximizes the variance of attribute A while minimizing variances
of attributes B and C, v maximizes the variance of attribute B while Acknowledgements This work was supported by the NIH (K99EY033457 (A.M.),
R00EY028625 (K.S.), R01EY023871 (J.T.T.), R21EY028633 (J.R.S.), U01MH105960 (J.R.S.),
minimizing variances of attributes C and A, and w maximizes the vari- R01NS111477 (M.M.), and T32GM007103 (A.M.R.)), the Chan-Zuckerberg Initiative (CZF-2019-
ance of attribute C while minimizing variances of attributes A and B. 002459; J.R.S.), Simons Foundation 543015 (M.M.), the Glaucoma Research Foundation (K.S.),
startup funds from the UC Berkeley (K.S.), an award from Research to Prevent Blindness and a M.Q. performed the FLDA analysis, and M.M. performed the GAGE analysis. R.J.L. and R.R.
Klingenstein-Simons Fellowship Award (Y.-R.P.), a Wellcome Trust Investigator Award provided an annotated Rhabdomys genome. J.R.S. and K.S. wrote the paper with input from
(210684/Z/18/Z) (R.J.L.), an ARCS Foundation Scholarship and a Society for Developmental the other authors.
Biology Emerging Models grant (A.M.R.), and grants from Children’s Glaucoma Foundation and
NSF (1827647) to J.D. Lauderdale and D.B. Menke. The authors thank J.D. Lauderdale and D.B.
Menke for supervision of A.M.R.; M. Laboulaye and R. Schaffer for assistance; G. Feng for Competing interests The authors declare no competing interests.
marmoset tissue; S. Van Hooser for ferret tissue; J. Chen for helpful discussions; R. Louie for
feedback; and S. Yun for assisting with data curation and visualization. Icons for species in the Additional information
figures were obtained from BioRender.com. Supplementary information The online version contains supplementary material available at
https://doi.org/10.1038/s41586-023-06638-9.
Author contributions J.R.S. and K.S. conceived the study and supervised the project. J.H. Correspondence and requests for materials should be addressed to Joshua R. Sanes or
performed the computational analysis, with contributions from K.S., W.Y. and A.K. A.M. Karthik Shekhar.
performed the scRNA-seq, snRNA-seq and histology experiments with contributions from Peer review information Nature thanks Tom Baden, Alex Pollen and Gregory Schwartz for their
A.H.K., Y.K., and Y.-R.P., respectively. A.M.R., R.R., J.B.W., V.P.K. and J.T.T. provided tissue. H.B., contribution to the peer review of this work.
R.J.L. and W.L. provided guidance on zebrafish, Rhabdomys and squirrel studies, respectively. Reprints and permissions information is available at http://www.nature.com/reprints.
Extended Data Fig. 1 | Figure image not reproduced. Panels a–f (UMAP embeddings and a dot plot of marker genes (Features), with legends Avg Exp, Pct Exp and Proportion) are described in the caption.

Extended Data Fig. 1 | snRNA-seq data from the fovea/macula and peripheral
retina of healthy human donors (n = 18). a. UMAP embedding of nuclei
(n = 184,808) from the central and peripheral retina of healthy human donors,
with individual points colored by cell class. PRs have been divided into rod and
cone subclasses, and ACs have been divided into GABAergic and glycinergic
subclasses. b. Same as a, with points colored by sample identity. c. UMAP
embedding of RGC nuclei (n = 80,032) from the foveal and peripheral retina of
healthy human donors, with individual points colored by type identity. Only
ON and OFF midget RGCs are labeled. d. UMAP embedding of non-midget RGC
nuclei (n = 6,615) from c, with individual points colored by type identity.
ON and OFF parasol ganglion cells are labeled. e. UMAP embedding of BC nuclei
(n = 9,126) from the fovea and peripheral retina of healthy human donors, with
individual points colored by type identity. f. Dotplot showing expression of
cell class-specific markers (columns) in the human clusters (rows). The size of
each dot represents the fraction of cells in the group with non-zero
expression, and the color represents expression level. The six classes are MG,
HC, PR (subdivided into Rod and Cone), AC (subdivided into GABAergic ACs
(GabaAC) and glycinergic ACs (GlyAC)), BC and RGC. Only BCs and RGCs have been
subclustered. Rows corresponding to BC and RGC clusters are ordered based on
hierarchical clustering (dendrograms, left). Barplot on the right of the
dotplot depicts the relative frequency of each cluster within a class (colors).
The rightmost heatmap depicts the distribution of each cluster across
biological replicates (columns).
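[Figure image for Extended Data Fig. 2 not reproduced. Panel labels: a, Pig: whole retina; b, Human: macula (FACS plots of single nuclei with NEUN, CHX10, DAPI and FSC axes, showing RGC and BC gates); c, d, images of sorted nuclei; e, NEUN, CHX10 and DAPI staining of retinal sections (ONL, INL, GCL) from human, pig, sheep, cow, squirrel and tree shrew.]
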
Extended Data Fig. 2 | Nuclear enrichment strategies for retinal ganglion
cells (RGCs) and bipolar cells (BCs). a. Examples of gating strategy in
fluorescence-activated cell sorting (FACS) experiments for collecting single
nuclei labeled with either PE-conjugated NEUN, which enriches RGCs, or
APC-conjugated CHX10 (also known as VSX2), which enriches BCs. Data shown
are representative of experiments in the pig retina. NEUN- and CHX10-based
enrichment resulted in ~90% yield for RGCs and ~95% yield for BCs. b. Same as
panel a, for human macular retina samples. NEUN-based enrichment resulted
in ~90% yield for RGCs; BCs were not analyzed in this experiment. c. Brightfield
image showing the morphology and integrity of FACS-purified nuclei. d.
Confocal image of DAPI-stained FACS-purified nuclei. e. Retinal sections from
six species show that PE-conjugated NEUN (red) and APC-conjugated CHX10/VSX2
label RGCs and BCs, respectively. Retinal sections were co-stained with
DAPI (blue) to visualize nuclei. Scale bar, 50 μm. Images in panels a–e are
representative of n ≥ 3 experiments.
Extended Data Fig. 3 | Figure image not reproduced. Dot plots for a, tree shrew; b, sheep; c, cow; d, pig (x axis, Features; y axis, Identity; legends, Pct Exp, Avg Exp and Proportion) are described in the caption.

Extended Data Fig. 3 | Summary of cell type atlases for tree shrew, sheep,
cow, and pig. a. Dotplot showing expression of cell class-specific markers
(columns) in the tree shrew (n = 3 animals; 71,571 nuclei) clusters (rows). The
size of each dot represents the fraction of nuclei in the group with non-zero
expression, and the color represents expression level. The six classes are MG,
HC, PR (subdivided into Rod and Cone), AC (subdivided into GABAergic AC
(GabaAC) and glycinergic AC (GlyAC)), BCs and RGCs. Only BCs and RGCs have
been subclassified through a within-species integration and clustering analysis
(Methods). Rows corresponding to BC and RGC clusters are ordered based on a
hierarchical clustering analysis (dendrograms, left). Barplot on the right of the
dotplot depicts the relative frequency of each cluster within a class (colors).
The rightmost heatmap depicts the distribution of each cluster across samples
(columns). Panels b–d depict the same information as panel a for sheep (n = 6
animals; 65,490 nuclei) (b), cow (n = 6 animals; 75,794 nuclei) (c), and pig (n = 4
animals; 49,955 nuclei) (d). Note that in this figure, as well as Extended Data
Figs. 1 and 4–6, the proportions shown accurately report our data but do not
necessarily represent the true endogenous proportions. This is because in
many cases we depleted photoreceptors or enriched BCs or RGCs to obtain
sufficient numbers of rare cell types (see Methods).
[Figure image not reproduced: dot plots for a, Peromyscus; b, ferret; c, opossum; d, lizard (x axis, Features; y axis, Identity; legends, Pct Exp, Avg Exp and Proportion).]

PD

E6
SA
R

C
O

AB

AL

1
LB
A
A

AP

A
K
R

R
C

C
B
SL

Features Features

Extended Data Fig. 4 | Summary of cell type atlases for Peromyscus, ferret, 49,972 cells) (b), opossum (n = 5 animals; 76,763 nuclei) (c), and brown anole
opossum, and brown anole lizard. Panels a-d depict the atlases (as in Extended lizard (n = 3 animals; 42,848 nuclei) (d).
Data Fig. 3) for peromyscus (n = 3 animals; 44,223 cells) (a), ferret (n = 2 animals;

Extended Data Fig. 5 | Summary of cell type atlases for Rhabdomys, squirrel, marmoset and sea-lamprey. Panels a-d depict atlases (as in Extended Data Fig. 3) for Rhabdomys (n = 2 animals; 65,338 nuclei) (a), squirrel (n = 1 animal; 22,821 cells) (b), marmoset (n = 2 animals; 52,559 cells) (c), and sea-lamprey (n = 2 animals; 18,928 cells) (d).

Extended Data Fig. 6 | Summary of cell type atlases for macaque, mouse, chick and zebrafish. Panels a-d depict atlases (as in Extended Data Fig. 3) for macaque (n = 4 animals; 146,054 cells) (a), mouse (n = 10 animals; 51,162 cells) (b), chick (n = 4 animals; 34,788 cells) (c), and zebrafish (n = 15 biological replicates; 60,657 cells) (d). Cluster labels are consistent with published annotations19,20,22,27,28. Each biological replicate in zebrafish involved a pooling of eyes from multiple (5-8) fish.

Extended Data Fig. 7 | Evolutionary conservation of retinal classes. a. Dendrogram showing transcriptional relationships among pseudobulk expression vectors following integration. Each node is a cell class within a particular species. Dendrograms were computed via hierarchical clustering analysis (correlation distance, average linkage). b. Same as Fig. 2d, with cells colored by species of origin. Inset shows a magnified region containing samples from all species. c. Cross-correlation matrix (Spearman) of class- and species-specific cell-averaged profiles for all 17 vertebrates (compare with Fig. 2b). Rows and columns are grouped by class, and then ordered by phylogeny within a class. d. Same as panel c, but rows and columns grouped based on species instead of class (compare with Fig. 2c). e. Pairwise mean-squared distance of class-specific cell-averaged gene expression profiles between all 16 jawed vertebrate species (y-axis) increases with evolutionary divergence, as estimated by substitutions per 100 bp (x-axis) (compare with Fig. 2e). Gray shaded regions demarcate species pairs involving zebrafish. Solid lines represent power law (y = ax^b) regression fits. Across the panels, a ∈ [0.34, 0.47] and b ∈ [0.29, 0.45]. The coefficient of determination (R^2) values range from 0.79-0.93.
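As an illustration of the power-law regression named in panel e, the following minimal sketch fits y = ax^b by ordinary least squares on log-transformed values. This is an assumption made for illustration: the caption does not state which fitting procedure was used, and a nonlinear least-squares fit in linear space would give somewhat different parameters and R^2. The function name `fit_power_law` and the synthetic data are hypothetical; all x and y values must be strictly positive.

```python
import math


def fit_power_law(x, y):
    """Fit y = a * x**b via linear regression of log(y) on log(x).

    Returns (a, b, r2), where r2 is the coefficient of determination
    computed in log space.
    """
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n

    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx                      # exponent
    log_a = mean_y - b * mean_x        # log of the prefactor

    ss_res = sum((v - (log_a + b * u)) ** 2 for u, v in zip(log_x, log_y))
    ss_tot = sum((v - mean_y) ** 2 for v in log_y)
    r2 = 1.0 - ss_res / ss_tot
    return math.exp(log_a), b, r2


# Synthetic, noise-free data generated from y = 0.4 * x**0.35 (hypothetical values).
divergence = [0.1, 0.4, 0.8, 1.6, 2.4]
distance = [0.4 * v ** 0.35 for v in divergence]
a, b, r2 = fit_power_law(divergence, distance)
```

On noise-free input the sketch recovers the generating parameters; with real pairwise distances, the fitted exponent b < 1 would correspond to the sublinear increase shown in the panel.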

Extended Data Fig. 8 | Evolutionary conservation of retinal subclasses. a. UMAP embedding of integrated cross-species data (as in Fig. 2d), highlighting PR subclasses cones and rods. Insets show feature plots of cone-specific (top) and rod-specific (bottom) transcription factors (TFs). b. Same as panel a, for AC subclasses GABAergic ACs (GabaAC) and glycinergic ACs (GlyAC). Insets show feature plots of a GABAergic TF MEIS2 and a glycinergic TF TCF4. c. Same as panel a, for BC subclasses ON BCs and OFF BCs. Insets show feature plots of OFF BC-specific (top) and ON BC-specific (bottom) transcription factors (TFs). d. Heatmap showing average expression of subclass-specific genes (columns) within the six subclasses across 17 species (rows). Rows are grouped by subclass (annotation bar, left). Within each subclass, species are ordered as in Fig. 1b, with top and bottom nodes in each dendrogram corresponding to lamprey and human, respectively (corresponding to right and left in Fig. 1a). Gray tiles correspond to missing orthology information. e. Cross-correlation matrix (Spearman) of subclass- and species-specific pseudobulk transcriptomic profiles for all 16 jawed vertebrates. Rows and columns are grouped by subclass, and then ordered by phylogeny within a class. Lamprey was excluded due to paucity of shared orthologs. f. Same as panel d, but rows and columns grouped based on species instead of subclass.
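A Spearman cross-correlation matrix of pseudobulk profiles, as in panel e, can be sketched as follows. This is a minimal pure-Python illustration, not the authors' pipeline: the helper names, the profile labels and the expression values are hypothetical, and each profile is assumed to be a vector of average expression over the same ordered set of shared orthologous genes.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(u, v):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


def spearman(u, v):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(u), average_ranks(v))


def correlation_matrix(profiles):
    """profiles: {label: expression vector}. Returns nested dict of correlations."""
    names = list(profiles)
    return {a: {b: spearman(profiles[a], profiles[b]) for b in names} for a in names}


# Hypothetical pseudobulk profiles over four shared genes.
profiles = {
    "Mouse_RGC": [5.0, 1.0, 0.0, 3.0],
    "Human_RGC": [4.0, 2.0, 0.5, 3.5],
    "Mouse_MG": [0.0, 4.0, 6.0, 1.0],
}
matrix = correlation_matrix(profiles)
```

Rank-based correlation is insensitive to monotonic differences in expression scale between species, which is one plausible reason to prefer it for cross-species comparisons.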

Extended Data Fig. 9 | Bipolar Cell OrthoType analysis including non-mammals. a. Confusion matrix showing the rationale behind naming mammalian BC OTs (rows) based on the mapping patterns of mouse BC types (columns)19. Representation as in Fig. 3d, with each column summing to 100%. OT BC8/9 contains mappings from both mouse BC8 and BC9, which are transcriptionally proximal. b. Barplot showing within-species relative frequencies (y-axis) of the 13 cone BC OTs within each mammalian species (x-axis). The foveal and peripheral data from primates are plotted separately. c. Integrated UMAP of BCs from all 16 jawed vertebrates. Cells are colored by species of origin. Lamprey, a jawless vertebrate, was excluded from the analysis due to the paucity of shared orthologous genes. d. Same as c, with cells colored by OT identity. The integration of all jawed vertebrates recovers all the mammalian BC OTs listed in Fig. 3c, but additionally identifies two OTs enriched for non-mammalian BCs from chick, lizard and zebrafish. The two OTs, named NM_OFF and NM_ON, are enriched for OFF and ON BCs from non-mammals (also see panel e). e. Confusion matrices showing the mapping of species-specific BC clusters (columns) to BC OTs (rows) identified by integrating BCs from all jawed vertebrates (panel c). Representation as in Fig. 3d’. Mammalian BC clusters predominantly map to the mammalian OTs (rows 1-14), and the pattern of mapping is similar to Fig. 3d. Chick, lizard and zebrafish BCs largely map to the non-mammalian OTs NM_OFF and NM_ON (rows 15-16). f. Dotplot showing species-specific genes (columns) expressed in RBC orthologs in mammals (rows). The size and color of each dot represent the percentage of cells within the species cluster expressing the gene and the average expression level, respectively.
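The column-normalized confusion matrices described in panels a and e (species clusters as columns, OTs as rows, each column summing to 100%) can be sketched as below. The function name `confusion_percent` and the per-cell labels are hypothetical toy values and do not reproduce the mapping reported in the figure.

```python
from collections import Counter


def confusion_percent(cluster_labels, ot_labels):
    """Cross-tabulate per-cell cluster and OT labels.

    Returns {ot: {cluster: percent}}, where the percentages for any one
    cluster (a column of the matrix) sum to 100 across OTs (rows).
    """
    pair_counts = Counter(zip(ot_labels, cluster_labels))
    column_totals = Counter(cluster_labels)
    matrix = {}
    for (ot, cluster), count in pair_counts.items():
        matrix.setdefault(ot, {})[cluster] = 100.0 * count / column_totals[cluster]
    return matrix


# Hypothetical per-cell labels: species-specific cluster and assigned OT.
cluster_labels = ["BC8", "BC8", "BC9", "BC9", "BC9", "BC7"]
ot_labels = ["oBC8/9", "oBC8/9", "oBC8/9", "oBC8/9", "oBC7", "oBC7"]
matrix = confusion_percent(cluster_labels, ot_labels)

# Percentage of each cluster's cells across all OTs (should total 100).
bc9_total = sum(row.get("BC9", 0.0) for row in matrix.values())
```

Normalizing by column rather than by row makes clusters of very different sizes directly comparable, at the cost of hiding how many cells each OT contains.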

Extended Data Fig. 10 | Retinal Ganglion Cell OrthoType analysis including non-mammals. a. Barplot showing within-species relative frequencies (y-axis) of the 21 RGC OTs within mammalian species (x-axis) (Fig. 4b). The foveal and peripheral data from primates are shown separately. Cow is excluded due to the paucity of data. b. Integrated UMAP of RGCs from all 15 jawed vertebrates (excluding cow). Cells are colored by species of origin. For primates, fovea and periphery are plotted separately. c. Same as b, with cells colored by RGC OT. OTs 1-21 map 1:1 to the mammalian OTs in Fig. 4b, but we recover an additional OT (NM) predominantly containing non-mammalian RGCs from chick, lizard and zebrafish (also see panel d). d. Confusion matrices showing the mapping of species RGC clusters (columns) to RGC OTs (rows) identified by integrating RGCs from all jawed vertebrates (panel c). Representation as in Fig. 4d. Mammalian RGC clusters predominantly map to the mammalian OTs (rows 1-21), and the pattern of mapping is similar to Fig. 4d. Except for ipRGCs, chick, lizard and zebrafish RGCs largely map to oRGC_NM (row 22). e. Confusion matrix showing the species-specific RGC clusters (columns) that map to oRGC8 and oRGC9, which correspond to intrinsically photosensitive (ip) RGCs. Representation as in Fig. 3f. Annotation bar (bottom) highlights species-specific RGC clusters that express OPN4 and EOMES, a transcription factor expressed selectively by ipRGCs20,21.

Extended Data Fig. 11 | Midget and Parasol OTs. a. Dotplot showing examples of DE genes across OT1-4 and their expression across orthologous species-specific clusters. The size and color of each dot represent the percentage of cells within the species cluster expressing the gene and the average expression level, respectively. Column order as in Fig. 5a. b. Relative proportion of parasol RGC orthologs in mammalian species based on the frequencies of cells in oRGC2 and oRGC5.

Extended Data Fig. 12 | Factorized Linear Discriminant Analysis (FLDA) and Geometric Analysis of Gene Expression (GAGE). a. FLDA workflow and eigenvalue analysis. The gene expression matrices of primate and mouse RGCs were combined by their shared orthologous genes. Highly variable genes were selected, and PCA was applied to remove multicollinearity. FLDA was performed on different combinations of mouse RGC candidates with known polarity and kinetics listed in Supplementary Table 4. The combinations were ranked based on their FLDA eigenvalues, which measure the variance along each attribute captured in the projection. b. Visualization of the FLDA projection (Fig. 5c) along the 2D subspace corresponding to polarity (x-axis) and kinetics (y-axis). c. Scatter plot of the FLDA eigenvalues for the kinetics (y-axis) vs. polarity (x-axis), measuring the magnitude of the variance corresponding to these attributes captured in the projection. Inset highlights the top four matches (numbered 1-4) from the 432 combinations of 4 mouse types shown in Supplementary Table 4. d. Mouse RGC types present within the top four combinations out of the 432 combinations in panel c. The top matched set contains all four α-RGC types; the next three include 3 α-RGC types. e. Geometric analysis of gene expression (GAGE) in which primate MGCs and PGCs are compared to all combinations of 4 mouse RGC types (45 choose 4 * 4! = 3,575,880) rather than only the 432 curated combinations used to generate Fig. 5d. Grey bars: histogram of scores for all sets of 4 mouse types. Red bar highlights the set of 4 α-RGC types with the correct matching of polarity and kinetics with the primate types, also marked by the red arrow located at a score of x = 0.657. The bulk of the distribution is approximated as a Gaussian with mean 0.50 and standard deviation 0.0374 (blue line). The 4 α-RGC fit has the second highest score among ~3.6 million candidates. The null hypothesis that this arises by chance has a p-value of p < 10^−6 based on a one-sided Student’s t-test. The top scoring combination with a score of 0.658 involves mouse RGC types C18, C7, C39 and C8 corresponding to the ON PGC, ON MGC, OFF PGC and OFF MGC respectively. Of the four mouse types, two (C18 and C8) have been physiologically characterized to exhibit sustained ON responses38, which violates their expected phenotypic correspondence to ON PGC (ON transient) and OFF MGC (OFF sustained).

Panel d. Top matching mouse types (FLDA), by combination rank (1-4):

Primate type   1        2             3         4
MGC ON         αONs     αONs          C40_M1a   C36
MGC OFF        αOFFs    C38_FmidiON   αOFFs     αOFFs
PGC ON         αONt     αONt          αONt      αONt
PGC OFF        αOFFt    αOFFt         αOFFt     αOFFt
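The size of the GAGE search space quoted in panel e follows from choosing 4 of the 45 mouse RGC types and assigning them, in order, to the 4 primate types. The short sketch below checks that arithmetic and enumerates the ordered assignments for a small toy case; the function name `ordered_assignments` and the toy type labels are hypothetical, and the scoring of each assignment is not reproduced here.

```python
import math
from itertools import permutations

n_mouse_types = 45    # mouse RGC types considered (from the caption)
n_primate_types = 4   # ON/OFF midget and parasol RGCs

# Unordered sets of 4 mouse types, times the 4! ways to pair each set
# with the four primate types.
unordered_sets = math.comb(n_mouse_types, n_primate_types)
total_assignments = unordered_sets * math.factorial(n_primate_types)


def ordered_assignments(mouse_types, primate_types):
    """Yield every one-to-one assignment of distinct mouse types to primate types."""
    for combo in permutations(mouse_types, len(primate_types)):
        yield dict(zip(primate_types, combo))


# Toy enumeration with 3 hypothetical mouse types and 2 primate types: 3 * 2 = 6.
toy = list(ordered_assignments(["C1", "C2", "C3"], ["MGC ON", "MGC OFF"]))
```

Here `unordered_sets` is 148,995 and `total_assignments` is 3,575,880, matching the count stated in the caption and equal to `math.perm(45, 4)`.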