
Envisioning the Future of Multiomics

Innovative Tools Driving Research and Discovery in Cancer and Immunology
Article Collection

Sponsored by:
Resolve cancer
with single cell and spatial multiomics
Fundamentally alter your understanding of cancer and accelerate translational
research with flexible and innovative solutions for single cell sequencing and
spatially-resolved transcriptional profiling from 10x Genomics.

• Unravel the complexities of heterogeneous cancer samples to detect tumor clones and unique cellular states that drive malignancy
• Resolve the tumor microenvironment and explore the influence of cancer on its
resident tissue
• Advance immunotherapies by characterizing the tumor immune response and
the molecular mechanisms underlying therapeutic response and resistance

Chromium Single Cell Solutions


Single Cell Gene Expression
Single Cell Immune Profiling
Single Cell Epigenomic Profiling
Single Cell Protein Expression
Targeted Gene Expression

Visium Spatial Solutions


Spatial Gene Expression
Spatial Protein Expression
Targeted Gene Expression

Learn more at 10xgenomics.com/cancer


Contents

Introduction

Single-Cell Sequencing in Translational Cancer Research and Challenges to Meet Clinical Diagnostic Needs
BY ULRICH PFISTERER, JULIA BRÄUNIG, PER BRATTÅS, MARKUS HEIDENBLAD, GÖRAN KARLSSON, THOAS FIORETOS

Identification of a Tumor–Specific Gene Regulatory Network in Human B-cell Lymphoma
BY 10x GENOMICS

Recent advances in single-cell multimodal analysis to study immune cells
BY RAYMOND HY LOUIE & FABIO LUCIANI

Genomic Cytometry and New Modalities for Deep Single-Cell Interrogation
BY ROBERT SALOMON, LUCIANO MARTELOTTO, FATIMA VALDES-MORA, DAVID GALLEGO-ORTEGA

Computational Approaches for High-Throughput Single-Cell Data Analysis
BY HELENA TODOROV AND YVAN SAEYS

COVER IMAGE © 10x Genomics

Introduction

From cancer to immunology, single cell RNA-sequencing (RNA-seq) has dramatically changed how researchers approach biology. Single cell resolution has progressed the concept of inherent heterogeneity of biological systems and led to novel advances in how we understand developmental processes, treat disease, and develop therapeutics. Now, biologists can further increase the breadth of their understanding with multiomic single cell analysis. In addition to a readout of mRNA abundance from single cell RNA-seq, single cell techniques can now be applied to profile DNA, chromatin state, and the proteome. To take it one step further, some methods enable next generation multiomics, the ability to capture multiple measurements simultaneously from the same single cell, rather than examining one readout at a time. This abundant data can provide novel insights, but it also presents new challenges, including how to collect, store, and manage data; integrate different modalities; and properly interpret findings.

This collection of articles provides an overview of the exciting innovations occurring at the forefront of multiomics. The first two articles focus on oncology. Cancer research is dedicated to improving cancer diagnostics, patient stratification, treatment monitoring, and therapeutic development. Single cell multiomics has provided increasingly detailed cell atlases that let researchers gain a better picture of tumor heterogeneity and investigate how that heterogeneity impacts disease progression and treatment response. Pfisterer et al. (2020) describe how the latest single cell multiomic techniques can be applied to cancer research, review the methods available for single cell isolation, and highlight recent multiomic single cell oncology studies. In our Data Spotlight from 10x Genomics, the simultaneous readout of epigenomic and transcriptomic data from the same cells enables the direct reconstruction of cell type–specific gene regulatory networks for B-cell lymphoma. This study highlights the power of using Chromium Single Cell Multiome ATAC + Gene Expression, the first commercial solution for paired ATAC-seq and RNA-seq analysis of single cells. The data from this study is available for download so you can continue to explore the possibilities yourself.

In Louie and Luciani (2021) our attention shifts to the immune system, another heterogeneous system that benefits from single cell investigation. Analysis of multiple modalities, including chromatin state, transcription status, and protein expression, can provide greater stratification of immune cell states, including a cell’s ability to bind antigens, attack invading cells, and follow a path of differentiation. Cell states can change over time and across space, and multiomic technologies have been developed to evaluate each of these variables. This article discusses recent single cell multiomic applications to immunology, focusing on next generation multiomic techniques that enable simultaneous measurement of at least two distinct modalities from the same single cell. Of particular relevance for immunologists is the ability to track clonal differentiation of T or B cells using receptor sequencing in the context of CAR-T therapy, autoimmune disease, and lineage tracing of hematopoietic progenitor cells.

The proliferation of multiomic single cell approaches has been made possible by the confluence of several disparate technologies, including genomics, microfluidics, cytometry, and informatics. Genomic Cytometry, described by Salomon et al. (2020), is any technique that provides cell-by-cell measurement of multiple modalities, including protein, mRNA, DNA, and epigenetic states, through a sequencing-based readout, therefore overcoming the limitations of fluorescence and mass cytometry by opening up unlimited analytic space to quantify hundreds of thousands of different molecular species at once. Multiple methods exist to perform Genomic Cytometry, including plate-based, droplet-based microfluidics, solid microfluidics, in situ combinatorial indexing, image-based approaches, and spatial transcriptomics.

Gathering data is only the beginning, however. In Todorov and Saeys (2019), we examine the process underlying analyzing a single cell experiment, including power calculations performed during experimental design, inclusion of controls during data generation, pre-processing, checking for batch effects during data visualization, cell type identification, differential analysis, and more. This article reviews methods for dimensionality reduction and cell clustering, compares approaches for trajectory analysis, and provides an introduction to single cell multiomic data integration.

These articles are designed to provide a comprehensive understanding of the technical innovations happening right now in single cell multiomics, and highlight how these advances are fueling the future of biological research and medicine.

REVIEW ARTICLE

Single-cell sequencing in translational cancer research and challenges to meet clinical diagnostic needs

Ulrich Pfisterer1,2 | Julia Bräunig1,2 | Per Brattås1,2 | Markus Heidenblad1,2 | Göran Karlsson3 | Thoas Fioretos1,2,4

1 Center for Translational Genomics, Lund University, Lund, Sweden
2 Clinical Genomics Lund, Science for Life Laboratory, Lund University, Lund, Sweden
3 Division of Molecular Hematology, Lund Stem Cell Center, Lund University, Lund, Sweden
4 Division of Clinical Genetics, Department of Laboratory Medicine, Lund University, Lund, Sweden

Correspondence
Ulrich Pfisterer, Department of Laboratory Medicine, Center for Translational Genomics, Lund University, Lund, Sweden.
Email: ulrich.pfisterer@med.lu.se

Thoas Fioretos, Division of Clinical Genetics, Department of Laboratory Medicine, Lund University, Lund, Sweden.
Email: thoas.fioretos@med.lu.se

Funding information
Governmental ALF grants; Lund University Cancer Center (LUCC); Medical Faculty Lund University; SciLifeLab Stockholm; StemTherapy Lund University

Abstract
The ability to capture alterations in the genome or transcriptome by next-generation sequencing has provided critical insight into molecular changes and programs underlying cancer biology. With the rapid technological development in single-cell sequencing, it has become possible to study individual cells at the transcriptional, genetic, epigenetic, and protein level. Using single-cell analysis, an increased resolution of fundamental processes underlying cancer development is obtained, providing comprehensive insights otherwise lost by sequencing of entire (bulk) samples, in which molecular signatures of individual cells are averaged across the entire cell population. Here, we provide a concise overview on the application of single-cell analysis of different modalities within cancer research by highlighting key articles of their respective fields. We furthermore examine the potential of existing technologies to meet clinical diagnostic needs and discuss current challenges associated with this translation.

KEYWORDS
cancer research, clinical diagnostics, clinical utility, single-cell sequencing

1 | INTRODUCTION

Cancer represents highly complex and diverse pathological conditions, characterized by aberrant genomic, epigenomic and transcriptomic features, such as structural alterations, single nucleotide and copy number variations (SNVs, CNVs),1 and altered epigenetic and transcriptional signatures.2,3 Both intra- and intertumoral heterogeneity contribute to the complexity of cancer with mutations in driver genes adding on to clonal evolution4 and consequently to dynamic clonal architecture throughout disease progression.5 Recent years have witnessed dramatic progress in studying the genetic and molecular basis of human cancer, enabled, in part, by the rapid technological developments in next-generation sequencing.6,7 Clinical applications for these platforms span the areas of diagnostics, prognostics and therapeutics using massively parallel sequencing for whole-genome (WGS) or targeted DNA-sequencing (eg, whole-exome sequencing, WES), RNA-sequencing, chromatin immunoprecipitation (ChIP)-sequencing, and DNA methylation assays for epigenetic mapping.

In order to further improve clinical application of sequencing-based technology and to ultimately provide better cancer diagnosis, patient stratification, treatment monitoring, and personalized therapy, the recent initiative of the Human Tumor Atlas Network aims at the generation of longitudinal cell atlases of various tumor types employing single-cell and spatially resolved technologies.8

This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any
medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
© 2021 The Authors. Genes, Chromosomes and Cancer published by Wiley Periodicals LLC.

The considerable cellular heterogeneity present in most tumors is likely to contribute to the currently ineffective and highly individual responses of patients to therapeutic approaches. While bulk analyses of tumor tissues have provided important insight into, for example, the transcriptional signature or overall genetic variability of a given tissue,9-12 they do not resolve the cellular composition of malignant and normal cells. Hence, resolving tumor composition at single-cell resolution offers great potential not only to provide critical insights into tumor biology per se, but also to shed light on other therapeutically relevant issues related to heterogeneity such as tumor microenvironment, cell-of-origin, and cancer stem cells. Thus, the advent of single-cell analyses promises to improve diagnosis, facilitate monitoring of both disease progression and treatment response and will, hopefully, pave the way to more personalized therapeutic approaches to realize the promises of precision medicine (Figure 1A-D).

The importance of elucidating cancer at single-cell resolution has been demonstrated in a plethora of studies which have allowed investigators to assess tumor heterogeneity, to define cell types and states in healthy specimens and tumors, as well as to examine heterogeneous treatment response and drug resistance, among other clinically relevant applications.13,14

Rapid technological development makes it feasible today to access many different modalities in single-cells,15 in some cases to profile more than one measure from a single-cell simultaneously16-23 and to perform advanced computational analyses.24-26 Several different modalities have been applied to cancer research using either dissociated single-cells or intact tissue with spatial resolution (Figure 2). While single-cell sequencing is progressively applied to study clinical cancer samples, its broader translation into clinical diagnostics has yet to come.

In order to translate single-cell analyses into reliable clinical applications, thorough assessment will be essential to define a technology's overall clinical applicability, which relies on its demonstrated analytical and clinical validity as well as clinical utility.27-29 Here, we define analytical validity as the confidence of a given test to measure the presence or absence of a disease-related alteration. In contrast, clinical validity is determined as the accuracy and confidence with which a detected variation can be related to a distinct disease phenotype. Finally, clinical utility determines whether a test result will yield medical intervention to ultimately improve the patients' health or, where treatment is unavailable, support clinical diagnosis of patients.28

This review provides an overview of currently available single-cell sequencing technologies and how such technologies have been used recently to provide important insights into the molecular basis of cancer. Furthermore, it discusses selected studies of different tumor types, the results of which suggest that single-cell sequencing will have great clinical utility in the near future, and highlights challenges and hurdles that exist in order for single-cell sequencing to meet clinical diagnostic needs.
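The definitions of analytical and clinical validity given in the introduction can be made concrete with a small calculation. The sketch below assumes, as one common convention rather than anything this review prescribes, that a test is summarized by its sensitivity and specificity against a reference method and by its positive and negative predictive values; the class name, function name and example counts are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing a test against a reference (hypothetical inputs)."""
    true_pos: int
    false_pos: int
    true_neg: int
    false_neg: int


def validity_metrics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, PPV and NPV as fractions in [0, 1]."""
    def ratio(num: int, den: int) -> float:
        # A ratio with an empty denominator is undefined; report it as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(c.true_pos, c.true_pos + c.false_neg),
        "specificity": ratio(c.true_neg, c.true_neg + c.false_pos),
        "ppv": ratio(c.true_pos, c.true_pos + c.false_pos),
        "npv": ratio(c.true_neg, c.true_neg + c.false_neg),
    }


# Illustrative numbers only: 100 alteration-positive and 100 negative cases.
example = ConfusionCounts(true_pos=60, false_pos=5, true_neg=95, false_neg=40)
metrics = validity_metrics(example)
```

With these made-up counts, 40 of 100 true alterations are missed, so sensitivity is the limiting figure even though specificity is high. Which thresholds would be acceptable for a diagnostic assay is outside the scope of this sketch.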

FIGURE 1 Schematic representation of different scenarios where single-cell resolution is beneficial. A, Monitoring cellular tumor composition
from diagnosis through treatment to monitor treatment response and to potentially refine therapy. B, Immune profiling of tumors to decompose
the different immune cells and cell states infiltrating the tumor tissue. C, Large-scale analysis of tumor composition to decipher intratumor
heterogeneity as well as heterogeneity of tumors of the same origin among patients. D, Monitoring clonal composition for example during
treatment to determine whether a specific therapeutic approach is efficient

FIGURE 2 Overview of different modalities for single-cell analysis of cancer tissues. Tumor tissues may be analyzed using either tissue-
destructive methods following tissue dissociation or by maintaining the spatial location of the cells in a given tissue. To date, a plethora of
platforms and chemistries exist to access different modalities in single-cells in tumor tissue. They enable metabolome analysis, genome
sequencing, cell surface and immune cell receptor profiling, epigenetic modifications as well as sequencing of the transcriptome. To date, only
certain modalities can be studied at spatial resolution (dotted line), whereas tissue-destructive methodologies are available to study all of the
single-cell modalities depicted

2 | APPLICATION OF SINGLE-CELL mRNA-SEQUENCING IN CANCER

Isolation of single cells may follow different principles: individual cells may be handpicked or sorted into PCR plates by flow cytometry. It is further possible to directly dispense cells into chips harboring several thousand nanowells, to trap single cells in channels and capture sites of microfluidic devices, as well as to encapsulate single cells in nano oil-droplets using yet other microfluidic devices. With a growing demand to study large cell numbers, microfluidic devices for droplet-based cell capturing are at present among the most common single-cell platforms used. Importantly, the principle of cell isolation does not necessarily restrict the modalities which can be analyzed. An overview of the most widely adopted single-cell isolation principles, platforms and modalities can be found in Table 1.

TABLE 1 Overview of different single-cell isolation principles and corresponding platforms most commonly used in the cited literature of this review

Single-cell isolation principle | Examples of platforms or chemistries | Modalities studied in selected references applying various platforms
Manual isolation and dispensation into tubes or plates | Serial dilution | SNV
Manual isolation and dispensation into tubes or plates | Hand picking | mRNA, inferred SNV
Manual isolation and dispensation into tubes or plates | Mouth pipetting | mRNA, TCR expression, SNV, CNV, methylome
Fluorescence-activated cell sorting into tubes or plates | Various chemistries (eg, Smart-seq2) | mRNA, TCR expression, SNV, CNV, methylome, ATAC
Fluorescence-activated cell sorting into tubes or plates | MARS-seq | mRNA
Fluorescence-activated cell sorting into tubes or plates | TCR-seq | TCR expression
Fluorescence-activated cell sorting into tubes or plates | QRP DOP-PCR | CNV
Immunomagnetic cell separation | MagSweeper | SNV, CNV
Cell dispensation into nanowells | iCell8cx | mRNA, DNA
Cell dispensation into nanowells | cellenONE, sciFLEXARRAYER S3 | CNV
Cell dispensation into nanowells | Seq-well | mRNA
Microfluidics with capture sites | Fluidigm C1 | mRNA, SNV, CNV, ATAC
Microfluidics with capture sites | DEPArray (Menarini Silicon Biosystems) | CNA
Microfluidics with nanodroplets | 10X Genomics | mRNA, TCR expression, inferred SNV/CNV, ATAC, cell surface proteins
Microfluidics with nanodroplets | inDrop | mRNA, TCR expression
Microfluidics with nanodroplets | MissionBio | SNV/CNV, cell surface proteins
Microfluidics with nanodroplets | Custom-built | Chromatin immunoprecipitation

Abbreviations: ATAC, assay for transposase-accessible chromatin; CNV, copy number variation; MARS-seq, massively parallel RNA single-cell sequencing; QRP DOP-PCR, quasi-random priming degenerate oligonucleotide primed polymerase chain reaction; SNV, single-nucleotide variant; TCR, T cell receptor; TCR-seq, T cell receptor sequencing.

Single-cell transcriptomics has greatly increased our understanding of the composition of complex tissues30-33 and has facilitated the study of a wide variety of human diseases at unprecedented depth.34-39 At present, a large number of single-cell transcriptomics methodologies and platforms are at hand (Smart-seq2,40 Smart-seq3,41 STRTseq,42 Cyto-seq,43 inDrop,44 Drop-seq,45 10X Genomics46 as well as CEL-seq2,47 Quartz-seq,48 MARS-seq,23 Seq-Well49). Combinatorial indexing,50 where individual cells undergo several rounds of molecular barcoding, in combination with droplet-based methods as exemplified in a preprint,51 has furthermore greatly increased the number of cells which can be profiled in a single experiment. This elevated throughput enabled large-scale studies, as exemplified by profiling 690 000 single-cells of the adult mouse brain, giving rise to a comprehensive cell atlas of the rodent brain.52 While the most relevant single-cell genomics applications have been used in a plethora of different studies53 and even have been subjected to systematic comparison regarding cost and information content,54,55 transcriptomics is still in the early stage of clinical translation and application.

One of the very first attempts to assess the transcriptome of single cancer cells was described with the development of the original Smart-seq chemistry.56 The potential of single-cell RNA-sequencing for diagnostic purposes was initially demonstrated on a metastatic breast cancer cell line (MDA-MB-231) by monitoring clonal evolution manifested in transcriptional alterations and mutation analyses inferred from mRNA reads along treatment with Paclitaxel, which provided novel insight into drug resistance dynamics.57 Furthermore, single-cell sequencing of lung adenocarcinoma (ADC) cells identified a distinct transcriptional signature of cells associated with resistance to anti-cancer drugs.58 Similarly, single-cell resolution of the transcriptome in renal cell carcinoma has shed new light on intratumor heterogeneity and led to the derivation of a new, combinatorial therapeutic strategy.59 Moreover, two single-cell studies assessed the transcriptome of circulating tumor cells (CTCs) in prostate cancer (PC)60 and breast cancer.61 This led to the identification of two distinct phenotypes of breast cancer CTCs with the capacity to interconvert and potentially contribute to treatment resistance.61 Single-cell analysis further revealed great diversity of PC CTCs among treated patients and identified splice variants and mutations in the androgen receptor (AR) gene, associating failed AR inhibitor treatment with noncanonical Wnt signaling.60 Overall, analysis of single CTCs may open up exciting possibilities for future non-invasive, single-cell diagnostics.

Tumor heterogeneity has been elucidated in glioblastomas38 and breast cancer,62 and large-scale tumor cell atlases have been generated in lung,63,64 renal,65 and pediatric brain tumors66 at single-cell resolution. Interestingly, validating single-cell transcriptomics data with bulk RNA-sequencing, proteomics and functional studies confirmed novel phenotypes of endothelial cells, which in turn potentially opens up new therapeutic target points for blocking tumor angiogenesis in lung cancer.64 Further technological advancement leveraging increased cellular throughput led to the identification of discrete transcriptional programs and cellular compositions in relation to increasing clinical grade tumors of glioma,37 distinct transcriptional programs of tumor-associated macrophages in glioma67 and to the determination of varying gene signatures in malignant cells in head and neck squamous cell carcinoma.68 Unbiased clustering of single-cells obtained from colorectal cancer not only discovered novel cancer-associated fibroblast types and unmasked tumor heterogeneity, but importantly also demonstrated that single-cell transcriptomics provides prognostic insight previously hidden in bulk sequencing data.69 Furthermore, deciphering of cellular composition of breast cancer patient-derived xenografts (PDX) identified a stem-like cell type with high epidermal growth factor receptor (EGFR) gene expression levels and further linked high EGFR expression to an elevated mesenchymal gene signature,70 similar to another study identifying elevated expression of epithelial-to-mesenchymal transition (EMT)-associated genes in breast cancer cells.71 Following breast cancer samples along the treatment course of several years, integrated single-cell genome and transcriptome analyses identified discrete phenotypes associated with chemoresistance, with the most prominent upregulation being an EMT gene signature.72

While combined analysis of DNA and RNA in individual cells is feasible,22 current protocols are not amenable to high cellular throughput and have therefore not frequently been used. Complementation of gene expression with inferred CNV from full-length
single-cell mRNA to distinguish malignant cells37,38,71,73,74 or targeted genotyping35 present attractive alternatives to comprehensively study human malignancies. Accordingly, chromosomal aberrations characteristic for glioblastoma were inferred onto tumor cells38 and classification of malignant cells in glioma was corroborated.37 In line with this, cancer-specific genomic aberrations could be inferred from single-cell transcriptomics data and were restricted to malignant glioma cells. Haplotype inference additionally revealed heterozygous loss of chromosome 14 alleles in glioma tumors.74 Interestingly, RNA-inferred CNV information clearly distinguished immune from carcinoma cells in breast cancer,71 opening up the possibility of unrestrained profiling of both cell types without usage of cell surface markers. Deducing genomic alterations such as CNV from mRNA-sequencing also aided the delineation of cellular hierarchies in oligodendroglioma.75

More recently, high throughput single-cell mRNA approaches such as Seq-well49 combined with targeted genotyping were used to elucidate molecular hierarchies in acute myeloid leukemia (AML) and confidently identified six malignant AML cell types with mutations being absent in healthy donor samples.76 This study furthermore combined both short- and long-read sequencing technologies to determine genetic aberrations such as insertions, deletions and gene-fusions in individual cells. It additionally employed a large cohort of longitudinally collected AML samples (diagnosis, treatment, remission),76 thereby suggesting that single-cell transcriptome analysis may be applied to monitor treatment response and putatively aid clinical decision making. However, in order for this approach to provide analytical and ultimately clinical validity, it needs to possess greater detection sensitivity of mutation signatures. Hence, while this work leveraged large-scale analysis, about 40% of the targeted sites were not detected and mutations located in proximity to either the 3′ end of the mRNA or to an internal polyadenylation site were captured more efficiently. This is directly linked to the design of the sequencing library preparation in Seq-well,49 which preferentially yields sequences toward the 3′ end of mRNA transcripts via polyT-capture sequences. In line with this, Petti and co-workers utilized a droplet-based platform to infer genomic information from single-cell transcriptomes and were able to deduce SNV information in 23% of the cells analyzed.77 In addition, the authors confidently distinguished normal from tumor cells and successfully identified a cell-surface marker (CD99) from the single-cell transcriptomics data, enabling the precise isolation of distinct clonal cells.77 While this study fell short in identifying novel cell-surface markers, it nicely demonstrated the possibility for precise isolation of malignant cells for refined downstream analyses. Recently, uveal melanoma (UM) was studied by integrated mRNA and B and T cell receptor (BCR and TCR) expression.78 Inferring genomic aberrations present in the single-cell transcriptomics data using the software inferCNV, both canonical and non-canonical CNVs were identified across all samples, delineating clonal structures in the tumor tissue.78 This furthermore demonstrates the applicability of single-cell transcriptome analysis to deduce genomic variation in cancer.

Single-cell transcriptomics is a rapidly evolving technology which already has yielded critical insight into cellular diversity of complex tissues.31 The highlighted research in this section provided first insights into pathological transcriptional changes underlying cancer development and progression as well as response to therapy. These studies clearly demonstrate a great potential for single-cell mRNA-sequencing to become a clinical diagnostic tool in the near future. Most likely, the first clinical applicability will be as a prognostic tool in the diagnostic setting, for example in hematologic malignancies, to decipher the cellular composition of normal and malignant tissue based on their transcriptional signatures. However, this will require several large-scale studies to demonstrate that cellular composition correlates with important clinical parameters. Along with increased sensitivity and reproducibility, single-cell mRNA-sequencing, in combination with other modalities (see below), is likely to become increasingly important in monitoring treatment response.

3 | SINGLE-CELL IMMUNE PROFILING IN CANCER

In the thymus, lymphoid progenitors are molded into committed T cells which in turn play an important role in shaping the adaptive immune system. Besides the acquisition of somatic mutations throughout life in normal cells of different tissues, contributing to cancerogenesis, progressive decline in T cell production in the thymus has been associated with an increased incidence of age-related diseases, including cancer.79 Moreover, the type of immune cells and their location and density in a given tumor were postulated to possess prognostic value, and it was suggested that high frequency of cytotoxic memory T cells in a tumor tissue was indicative of disease relapse post treatment.80 These results exemplified the potential clinical benefit of comprehensive immune cell profiling of tumors and strengthened the ultimate necessity to retain spatial information within the tumor tissue.

Since the development of T cells involves both the differentiation of T lymphocytes and the generation and maintenance of a diverse TCR repertoire, precise comprehension of developmental processes underlying T cell specification is of significant importance to understand disease progression in cancer. While targeted TCR analysis via nested PCR has allowed analysis of several hundreds of single-cells,81 recent developments have made it possible to probe even larger numbers of T cells in an unbiased fashion,73,82,83 as well as in combination with targeted TCR analysis.84 In addition, simultaneous profiling of both transcriptomic and TCR signatures from thousands of individual cells has been reported.85-87

A recent study revealed bias in VDJ gene usage during recombination of TCRβ throughout differentiation toward mature T cells by integrating transcriptional signatures of cell states with the expression data on TCR chains α and β.88 This observed bias in TCR recombination might impact the adaptive immune response and consequently an individual's response to antigenic stimuli.
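Paired profiling of TCR chains in single cells yields, for each cell, an α and a β sequence that can be grouped into clonotypes. As a minimal sketch, assuming each cell has already been annotated with a TRBV gene and CDR3 amino-acid sequences for both chains, clone sizes and V-gene usage can be tallied as shown below. The record layout, function names and toy sequences are hypothetical and not taken from any of the cited studies, and defining a clonotype by identical paired CDR3 sequences is only one of several conventions in use.

```python
from collections import Counter
from typing import NamedTuple


class TcrCell(NamedTuple):
    """One cell with paired TCR annotation (hypothetical record layout)."""
    barcode: str
    trbv: str
    cdr3_alpha: str
    cdr3_beta: str


def clone_sizes(cells):
    """Count cells per clonotype, defined here as identical paired CDR3s."""
    return Counter((c.cdr3_alpha, c.cdr3_beta) for c in cells)


def v_gene_usage(cells):
    """Fraction of distinct clonotypes that use each TRBV gene.

    Counting clonotypes instead of cells keeps a single expanded clone
    from dominating the usage estimate.
    """
    gene_by_clonotype = {}
    for c in cells:
        gene_by_clonotype.setdefault((c.cdr3_alpha, c.cdr3_beta), c.trbv)
    counts = Counter(gene_by_clonotype.values())
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}


def expanded_fraction(cells, min_size=2):
    """Fraction of cells that belong to clonotypes with at least min_size cells."""
    if not cells:
        return 0.0
    sizes = clone_sizes(cells)
    return sum(n for n in sizes.values() if n >= min_size) / len(cells)


# Toy data: three cells share one clonotype, two cells are singletons.
cells = [
    TcrCell("cell1", "TRBV7-9", "CAVRDSNYQLIW", "CASSLGQETQYF"),
    TcrCell("cell2", "TRBV7-9", "CAVRDSNYQLIW", "CASSLGQETQYF"),
    TcrCell("cell3", "TRBV7-9", "CAVRDSNYQLIW", "CASSLGQETQYF"),
    TcrCell("cell4", "TRBV7-9", "CAASGGSYIPTF", "CASSPGTGELFF"),
    TcrCell("cell5", "TRBV20-1", "CAVNTGNQFYF", "CSARDRGNTEAFF"),
]
sizes = clone_sizes(cells)
usage = v_gene_usage(cells)
expanded = expanded_fraction(cells)
```

Comparing such usage fractions between cell states, or expanded fractions between patient groups, would additionally require appropriate statistics and far larger cell numbers than this toy example.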

In the attempt to elucidate the tumor microenvironment, single-cell analysis enabled the identification of molecular signatures of exhaustion programs in T cells, their associated markers, and linked dysfunctional signatures to tumor reactivity in human melanoma.73,84,85 It further led to the determination of a transcriptional signature of specific immune cells which could in turn be linked to patient survival and improved existing prognostication of breast cancer patients.89 Moreover, integrated mRNA- and targeted TCR-sequencing revealed that dysfunctional T cells exhibited prominent clonal expansion with continuous proliferation in metastatic melanoma.84

Single-cell analysis further defined clonotypes of T cells while suggesting their activation status in the human hepatocarcinoma (HCC) microenvironment,90 and identified a distinct dendritic cell type capable of migrating from the tumor tissue to the hepatic lymph node.91 Furthermore, valuable insight into transcriptional signatures of tumor-infiltrating myeloid cells in lung ADC has been obtained using high throughput single-cell mRNA-sequencing.92

A recent study utilized single-cell technology to elucidate the composition of immune cells in the tumor microenvironment of breast cancer.86 High-throughput integrated mRNA- and TCR-sequencing revealed an increased phenotypic diversity of both lymphoid and myeloid cells in tumorous tissue, as opposed to normal breast tissue, and exhibited inter-patient variation in metabolic signatures.86 Corroborating their findings using two different platforms (inDrop and 10X Genomics), the authors identified continuous T cell activation, which in part could be explained by broad stimuli activating TCR repertoire, and showed that tumor residing T cells were comprised of different clonotype clusters with varying activation states.86 Overall, distinct phenotypes are shaped by the TCR repertoire in response to antigenic stimuli but diversity is also mediated by environmental stimuli such as hypoxia.86

More recently, integrated analysis of mRNA and TCR repertoires in 141 623 T cells was performed in four different types of cancers (non-small-cell lung ADC, endometrial ADC, colorectal ADC and renal clear cell carcinoma) as well as in histologically normal adjacent tissue (NAT) and peripheral blood.87 This study led to the discovery that

pre-existing in the tumor.95 Analysis of patients with metastatic melanoma responsive to ICB treatment displayed a greater fraction of large T cell clones as opposed to non-responsive patients.85 Interestingly, transcriptional alterations and gene modules induced by ICB treatment did not correlate with the clinical outcome observed in patients,85 rendering simultaneous profiling of TCR clonality a necessity to deduce clinically relevant information.

Despite the correlation of therapy response to clone size, T cell clonal specificity to distinct tumor antigens still needs to be determined and integrated to define lasting predictive markers for the outcome of different ICB therapies. Interestingly, TCR repertoire analysis of CD8+ T cells in UM revealed that these cells strongly expressed the checkpoint marker gene LAG3, whereas, unexpectedly, expression of PD1 was minimal.78 This may in part explain the lack of responsiveness of UM to checkpoint immunotherapy targeting PD1. Moreover, single-cell analysis of immune cells from glioblastoma combined with murine models identified a distinct macrophage type which in turn appeared to be a potential target for combinatorial immune therapy.96

Very recently, the development of single-cell metabolic regulome profiling (scMEP) made it possible to study the highly dynamic functions exerted by immune cells manifested in metabolomic alterations at spatial resolution, deciphering metabolic profiles of CD8+ T cells in the tumor microenvironment.97 The ability to analyze immune cell migration into diseased tissue, which is tightly regulated by the cells' metabolism, holds great promise to understand immune cell-mediated processes in the tumor following treatment. In addition to studying tumor immune cells based on mRNA and TCR expression or metabolites, recent technological advances made it possible to generate cell atlases of human tumors based on the expression of cell surface markers complemented with single-cell transcriptome sequencing, exemplified by the analysis of lung ADC.98

Taken together, single-cell immune profiling holds great potential to refine existing therapies (Figure 1A) and has greatly increased our understanding of how clonal composition of immune cells, both within
diverse clonal expansion patterns across patients with clonotypes being the tumor and adjacent tissue, is encoded in the transcriptome and
either expanded similarly in the tumor and NAT or following differing receptor repertoire (Figure 1B). Single-cell resolution has further
patterns.87 Moreover, this work shed light on the existence of a strong offered insight into intra-tumoral heterogeneity of immune cells and
correlation between peripheral and intratumoral clone size, a finding potential bias in responsiveness to treatment, how regulation of meta-
which was substantiated by re-analyzing data of related studies investi- bolic pathways underlies immune cell function, and how these path-
gating T cells in non-small-lung cancer93 and colorectal cancer cells.94 ways may be exploited to device novel therapeutic strategies
In addition, non-exhausted T cell clones were more likely to be blood- enhancing the overall immune cell response. Given the dramatic
associated as opposed to exhausted clones and different clonal expan- impact of ICB in cancer treatment during recent years and the realiza-
sion patterns were correlated with the clinical response of patients.87 tion that the immune system plays a critical role in cancer develop-
These findings suggest that the detection of clones in blood may ment and progression, single-cell immune profiling is most likely to
serve a useful proxy to determine the presence of clinically relevant, become one of the first strategies reaching clinical diagnostics.
expanded clones in the tumor, opening up for the possibility of “liquid
biopsies” for monitoring treatment response following therapy with
Atezolizumab, Sunitinib, or IMmotion150 using single-cell technology. 4 | EP I G E N E T I C A NA L Y S E S O F C A N C E R A T
Utilizing the same combined mRNA- and TCR-sequencing S I N G L E - C E L L RE S O LU T I O N
approach on basal and squamous cell carcinoma samples before and
after immune checkpoint blockade (ICB) treatment suggested that Besides immune infiltration, transcriptomic and genomic alterations,
novel T cells exert treatment response rather than T cell clones epigenetic changes underlie cancer development and evolution, but

10
also disease prognosis and treatment outcome.99,100 Epigenetics constitutes heritable cellular regulation of gene expression, which occurs independently of the genetic information. Chromatin status, accessibility and conformation are highly regulated by histone and genome modifications, and by interactions between DNA and protein structures. DNA methylation and histone acetylation have been the subject of intensive research. Hypermethylation of promoter regions, a general reduction in genomic 5-methylcytosine levels as well as the loss of histone acetylation are commonly observed in cancer cells101,102 and ultimately contribute to altered gene expression regulation. In contrast to genomic mutations and aberrations, epigenetic marks and their deregulation are often reversible.

To date, several DNA methyltransferase inhibitors (DMTIs) and histone deacetylase inhibitors have been investigated as anticancer drugs and are approved by the FDA for several cancers.103 First trials with DMTIs yielded promising treatment results, but they also evoked severe side effects.104,105 Lower treatment dosages of DMTIs were similarly successful, but no major demethylation effect was observed in bulk sequencing experiments in contrast to higher treatment concentrations.106,107 Analysis of monoclonal populations of the human colon carcinoma cell line HCT116 showed that every clone has a distinct partial demethylation pattern and that the resulting changes in epigenetic regulation are sufficient to slow cancer cell proliferation.107 This monoclonal analysis exemplified the necessity for single-cell resolution in cancer epigenetics in order to unravel cellular heterogeneity, to devise novel therapies and to monitor treatment. Today, several single-cell methods for DNA methylation and chromatin accessibility are available to study cancer (scATAC-seq, sciATAC-seq, scRRBS-seq, scChip-seq, scTrio-seq).108-112

Single-cell reduced-representation bisulfite sequencing (scRRBS-seq) was used to trace cancer evolution by measuring alterations in the methylome in both healthy individuals and patients with chronic lymphocytic leukemia (CLL) before and after treatment.113 Overall, this study revealed impaired B cell development in diseased individuals and increased cell-to-cell heterogeneity of B cells in CLL as opposed to healthy controls and normal B cells.113,114

The application of single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq) showed that breast cancer cell lines clustered separately before and after JQ1-treatment based on their epigenetic state.115 Furthermore, scATAC-seq identified a PD-1 immunotherapy-responsive T cell subpopulation and its underlying regulatory mechanism in basal cell carcinoma,116 and has pinpointed distinct transcription factor motifs which drive cancer heterogeneity in leukemic cells.117

Unlike scATAC-seq, single-cell chromatin immunoprecipitation (scChip-seq) also captures repressed regions of the chromatin in addition to accessible sites.110 Using this approach, a recent study concluded that tumor cells resistant to the cytostatic drug Capecitabine can be discriminated from non-resistant tumor cells based on their chromatin status in a triple-negative breast cancer model, and that distinct repressed H3K27me3 regions were associated with genes responsible for therapy resistance.118

Besides monomodal single-cell approaches, multimodal methods offer the possibility to assign an epigenome to a transcriptome, genome, or proteome, revealing the regulatory correlations between them. Several scATAC-seq and single-cell bisulfite sequencing protocols provided enough genome coverage to analyze CNVs.112,116 One of the earliest single-cell studies in cancer epigenetics utilized bisulfite sequencing, CNV and transcriptome analysis (scTrio-seq) to investigate hepatocellular carcinoma (HCC).112 The authors found that the event of a CNV did not alter the methylation pattern of the affected DNA region and that aberrantly methylated regions did not overlap with the presence of CNVs, but that both influenced transcriptional levels. These results additionally confirmed that DNA methylation in promoter regions correlates negatively with gene expression, whereas DNA methylation in the gene body correlates positively with transcription, as demonstrated in HepG2 and HCC cells. However, only 26 single HCC cells from one patient were investigated, thus limiting the general conclusions that can be drawn for HCC from this study.112

ScRRBS-seq and Smart-seq2 data were obtained from the same cell by separating mRNA and DNA, revealing an Ibrutinib-sensitive B cell subpopulation in CLL patients, which is expelled from the lymph node upon treatment.113

Combining CITE-seq, Smart-seq2, and scATAC-seq to investigate mixed-phenotype acute leukemia (MPAL) revealed that analyses based on either surface-protein expression, chromatin accessibility or mRNA expression yielded reproducible and comparable cell clusters.108 While MPAL is a rare disease displaying characteristics of both AML and acute lymphoblastic leukemia (ALL), MPAL patients are more responsive to ALL treatment compared to AML therapies.119 Single-cell ATAC-seq and Smart-seq2 data provided the necessary resolution to show that distinct genes are universally upregulated in either MPAL or AML cancer cells,108 possibly explaining why AML treatments often fail in MPAL patients. In addition, RUNX1 was associated with transcription factor binding motifs in MPAL cancer cells.108 Using single-cell combinatorial indexing ATAC-seq (sciATAC-seq), the potential regulatory role of RUNX transcription factor motifs was investigated in a murine lung ADC model, revealing that accessible RUNX transcription factor motifs were mainly present during the metastatic stage. Additionally, transcription factor scores were matched with different tumor stages, as well as RUNX and NKX2.1 transcription factors, which correlated with patient prognosis.109 Interestingly, while the transcription factor NKX2.1 is used as a diagnostic marker in clinical lung ADC,120 the metastatic sciATAC-seq cluster correlated better with overall patient survival than the NKX2.1 cluster,109 suggesting that the accessible chromatin status could be used as an improved diagnostic marker.

Besides genomic DNA, mitochondrial DNA (mtDNA) is subjected to epigenetic alterations, SNVs and CNVs, which play a role in tumorigenesis, cancer progression and drug resistance.121 Modification of a droplet-based scATAC-seq protocol facilitated capturing of mtDNA (scmtATAC-seq) and demonstrated that a 50x coverage of the mitochondrial genome can yield robust CNV and even SNV data in addition
to accessible chromatin information.122 This revealed mutations and CNVs related to disease progression and drug resistance in CLL patients, with individual subpopulations showing impaired methylation patterns in genes related to drug resistance such as TIAM1 and ZNF257. Interestingly, the small size of the mitochondrial genome, only 16 kb, strongly reduces sequencing costs, potentially facilitating broader application areas.

At present, published single-cell studies in the field of cancer epigenetics have demonstrated that available methods and protocols are sufficient to distinguish between healthy and diseased cell types, and to illuminate cancer heterogeneity, progression, and treatment effects. Identified subpopulations, transcription factor motifs, and regulatory mechanisms could potentially predict patient outcome and drug resistance, suggesting sufficient analytical and potentially even clinical validity. However, as this is a relatively young field, modalities need further refinement to accomplish analytical validity. ScRRBS-seq covers fewer CpG islands than bulk bisulfite sequencing110 and, in comparison with single-cell bisulfite sequencing methods, scChip-seq has an overall lower genome coverage and a higher ratio of background noise.123

Some methods, like scTrio-seq, offer a lower throughput, impeding the possibility to assess cancer heterogeneity in its entirety. In the recently developed method Cleavage Under Targets and Tagmentation (Cut&Tag), antibodies target defined histone modifications and conjugated Tn5 cuts accessible DNA, which reduces unspecific signals. Overall, analytical validity of novel methods such as Cut&Tag124 remains to be demonstrated in cancer research.

Finally, integration of epigenetic modifications with other modalities such as mRNA or cell surface protein expression will be of importance to gain more complete insight into how the disease is manifested and regulated, as well as to explain the effect of cancer-induced epigenetic changes.

5 | ASSESSMENT OF CLONAL HETEROGENEITY IN CANCER BY SINGLE-CELL DNA-SEQUENCING

Continuous gain of genetic variation in individual cells underlies tumor initiation, maintenance and evolution. In particular, ongoing cell division within tumor tissue fosters genetic mosaicism manifested in CNVs, SNVs and gene breakpoints.1 While bulk DNA-sequencing has demonstrated substantial genetic heterogeneity in cancers, such as AML125 or primary renal carcinomas,126 determination of clonal structure of cancer types necessitates single-cell resolution.

Among the first methods to be used to interrogate clonal diversity at single-cell resolution were PCR-based methods such as degenerate oligonucleotide primed PCR (DOP-PCR),127 isothermal multiple displacement amplification (MDA)128-130 as well as PicoPlex131 and multiple annealing and looping-based amplification cycles.132 Increased cellular throughput was achieved by employing microfluidic devices130 and single-cell combinatorial indexed sequencing (sci-seq),133 with microfluidics enabling stringent quality control via cell imaging, while simultaneously reducing contaminating ambient DNA interfering with genomic analyses.

Further optimization in part addressed the shortcomings of existing approaches with regard to low genomic coverage and allelic dropout rates, lack of uniformity, and polymerase-induced errors.134 As such, recent methods have utilized DNA transposition in combination with linear amplification135 or direct construction of sequencing-ready libraries.136,137

Existing technologies facilitate the investigation of CNVs and SNVs; however, other structural variations such as translocations and inversions, which are relevant measures of disease prognosis, are more challenging to identify at single-cell resolution. Strand-seq enables the generation of directional sequencing libraries and strand-specific sequencing reads, yielding homolog resolution in single cells.138 This was recently utilized to investigate evolutionary differences between human and macaque based on genetic inversions139 and to develop the analytical tool single-cell tri-channel processing (scTRIP),140 which extracts and utilizes additional information from Strand-seq data. While this approach enables more comprehensive analysis of genomic complexity, it relies on the possibility to label nascent DNA during replication, which excludes its application to clinical samples containing non-dividing cells or nuclei.

Single-cell DNA-sequencing has been used intensively to decipher clonal structures in different cancer types and to augment our knowledge of tumor clonal evolution. An early study applied DOP-PCR on 100 single nuclei isolated from two human breast cancer cases and demonstrated that clonal evolution patterns can be inferred from shallow single-cell WGS.127 While this study did not provide sufficient coverage to resolve SNVs in a genome-wide manner, subsequent utilization of G2/M nuclei yielded comparatively higher genome coverage and improved both allelic dropout and false positive rate in breast cancer samples.141 In this study, the authors indicated that structural genomic alterations occur early during breast cancer evolution, while SNVs are acquired progressively and gradually contribute to clonal diversity.141 The finding that the majority of single-cell CNVs were clonal and stable during tumor growth of breast cancer142 additionally strengthened the notion that copy number aneuploidy is acquired early during tumor evolution. Single-cell analysis of breast cancer xenografts moreover corroborated that clonal expansion dynamics represent reproducible trajectories, indicating that clonal selection follows a non-random process with distinct mutation genotypes defining clonal fitness and therefore clonal expansion processes.143

In longitudinal breast cancer samples, bulk exome sequencing integrated with single-cell DNA and RNA analyses provided insight into clonal extinction in response to treatment and identified resistant clones selectively expanded as a result of chemotherapy.144 Single-cell analysis furthermore enabled the identification of patient-individual clonal seeding patterns in colorectal cancer leading to the metastatic tumor.145

Highly relevant with regard to clinical application was the discovery that a large fraction of both trunk and metastatic mutations
could be recapitulated in CTCs from PC146 and that the CNV pattern on a whole-genome scale of CTCs was not altered during the treatment course of lung cancer.147 Furthermore, CNVs detected in CTCs of ADC and small-cell lung cancer (SCLC) were reproducible between cells and individuals.147 In line with this, copy number aberrations in CTCs of SCLC were used to determine classifiers supporting categorization of chemosensitive or chemorefractory SCLCs.148 Single-cell WGS of 88 CTCs generated classifiers with sufficient power to assign the vast majority (>80%) of CTC test samples to either a chemosensitive or chemorefractory treatment response.148 This suggests an exciting possibility for single-cell analysis to provide analytical validity for future diagnostic purposes similar to single-cell transcriptome studies targeting CTCs,60,61 especially in the absence of primary tumor tissue. However, to move closer to clinical validity and utility, the persistence of CNV patterns in lung cancer CTCs needs to be corroborated in larger patient cohorts. In addition, molecular classifiers capable of predicting treatment response of SCLCs will require a larger starting number of cells covering a more complete space of genomic alterations.

Single-cell sequencing of hematologic malignancies such as AML provided insight into the clonal architecture underlying this heterogeneous disease entity, although in limited sample numbers.131 In contrast, droplet-based cell capturing and barcoding opened up the possibility to profile known genomic loci in AML at unprecedented throughput.149,150 More recently, a similar approach leveraged analysis of 735 483 single cells obtained from 123 AML patients, unveiling clonal evolution patterns and correlation of AML driver mutations.151 In total, 530 validated mutations were included in the analysis, which in the case of a subset of longitudinal AML samples provided additional insight into the clonal evolution processes during treatment.151 Furthermore, a very recent study utilized droplet-based, targeted single-cell DNA-sequencing in AML on a large cohort of samples, providing insight into clonal complexity and co-occurring mutations in epigenetic modifiers in AML along with changes in cell surface protein expression underlying the pathogenesis of clonal hematopoiesis.152

Integration of bulk exome and whole-genome sequencing on 51 cases of childhood ALL identified aberrant RAG recombinase activity as a critical driving force for genomic aberrations underlying leukemic transformation.153 Targeted genotyping of mutations and structural variants derived from bulk exome sequencing allowed the construction of phylogenetic trees. This confirmed that the fusion gene ETV6-RUNX1, which is considered one of the initiating genomic lesions in this form of ALL, was found in the root of both trees.153 Indications of RAG-mediated deletions in cells spanning the entire phylogenetic tree further suggested that the genomic aberrations observed were formed through a continuous process in these two cases.153 Shortcomings of this study included the limited number of single cells processed, the relatively small number of genomic lesions analyzed, and the lack of a thorough assessment of the dropout rates of mutant alleles. In contrast, microfluidic MDA targeted genome sequencing of six patients with ALL provided higher cellular throughput (1479 single cells) and combined different computational approaches for the identification of clonal structures and the removal of low-quality cells due to WGA-induced noise.154 This allowed the authors to identify clones co-occurring in most patients and to suggest a more precise hierarchical clonal structure for ALL where the majority of structural aberrations preceded point mutation acquisition and VDJ recombination.154

The literature summarized above clearly demonstrates that single-cell DNA-sequencing is capable of providing analytical validity, for example, in elucidating tumor heterogeneity, and of monitoring clonal evolution in response to treatment (Figure 1A,C,D), features of importance in personalized medicine. Particularly in cases with a high prevalence of genomic lesions specific for a given cancer type, targeted genomic approaches hold great potential to become a routine diagnostic application in the near future. Finally, it will be essential to understand how different clones and their respective expansion patterns influence tumor evolution and treatment response.

6 | SPATIAL RESOLUTION TO AID CANCER TISSUE ANALYSES AND DIAGNOSTICS

In order to truly understand tumor behavior, particularly in solid cancers, both disease-related transcriptional and genomic alterations need to be related to the cells' phenotypes in the spatial context of the tumor microenvironment. Retaining information on type, density and location of immune cells in colorectal cancer tissue demonstrated an association of spatial immune cell composition with clinical outcome.80 It was suggested that such immunological criteria could be relevant for a clinical application in cancers where the density of tumor-infiltrating T cells is linked to favorable prognosis. Current technologies for massively parallel processing of mRNA or genomic alterations lack spatial resolution as a consequence of tissue dissociation, which in addition has been shown to potentially induce misleading transcriptional signatures.155

Different approaches have evolved to profile mRNA or protein expression with spatial resolution while either preserving the tissue structure156-159 or disrupting it by usage of molecular tags providing spatial information,160 imaging mass cytometry (IMC),161 or laser catapulting.162,163 Co-detection by indexing (CODEX) allows for highly multiplexed profiling of protein markers and has been used to decipher differences in tissue composition in murine normal and diseased spleen at single-cell resolution.159 Its applicability to clinical human samples, however, still needs to be shown in large-scale studies. Moreover, multiplex immunohistochemistry has enabled parallel visualization of distinct immune checkpoint molecules at single-cell resolution.164

The GeoMx/DSP platform has been applied to identify protein markers associated with treatment outcome in melanoma,156 to evaluate the PC micro-environment,158 and to assess B and T cell phenotypes in melanoma tumors.165 Furthermore, this platform has been used to study B cell localization in tertiary lymphoid structures using
TABLE 2 Overview of technical details of translational research articles highlighted in this review which describe transcriptome or immune
profiling of single-cells

Abbreviations: ADC, lung adenocarcinoma; AML, acute myeloid leukemia; ATRT, atypical teratoid/rhabdoid tumors; BCC, basal cell carcinoma; BCR, B cell
receptor; ccRCC, clear cell renal carcinoma; CML, chronic myeloid leukemia; CNV, copy number variation; CRC, colorectal cancer; CTC, circulating tumor
cells; CyTOF, cytometry by time of flight; ETMR, embryonal tumors with multilayered rosettes; FACS, fluorescence-activated cell sorting; GBM,
glioblastoma multiforme; HCC, hepatocellular carcinoma; HNSCC, head and neck squamous cell carcinoma; MACS, magnetic-activated cell sorting; MARS-seq,
massively parallel RNA single-cell sequencing; MM, metastatic melanoma; NSCLC, non-small cell lung cancer; NSCL ADC, non-small-cell lung
adenocarcinoma; PC, prostate cancer; PDX, patient-derived xenograft; RCC, renal cell carcinoma; SCC, squamous cell carcinoma; SNV, single-nucleotide
variant; TCR, T cell receptor; TCR-seq, T cell receptor sequencing; UM, uveal melanoma; WGA, Whole-genome amplification; WNT MB, WNT-subtype
medulloblastoma.

multiplex protein analysis,166 and to profile mRNA and protein simultaneously in colorectal tumor tissue.167 While this approach allows for combined multiplex mRNA and protein analysis, the GeoMx/DSP platform lacks single-cell resolution and requires a priori knowledge of target protein markers or mRNAs together with reliable markers to visualize tissue structure. Interestingly, GeoMx-based analysis of B cells was complemented by technologies assessing mRNA and surface proteins at single-cell resolution.166

In contrast, high-definition spatial transcriptomics (HDST) enables unbiased mRNA profiling with 49% of the spatial barcodes being assigned to a single-cell type and was successfully used to distinguish cell types in breast cancer,160 while offering greater spatial resolution compared to similar approaches.168,169 Combining spatial transcriptomics with conventional high-throughput single-cell sequencing enabled more refined spatial cell type annotations in pancreatic ductal ADC.170 Despite these promises in spatial transcriptomics, future developments need to improve the current sparsity of HDST and to demonstrate compatibility of this method with Formalin-Fixed Paraffin-Embedded (FFPE) sections, which represent the dominant form in which solid tumor specimens are preserved to date. Interestingly, the commercially available Visium chemistry (10X Genomics) has recently been applied successfully to FFPE sections of the mouse brain and ovarian carcinosarcoma as exemplified by a study currently available as a preprint,171 opening up the possibility to perform spatial transcriptome analysis on clinical FFPE samples.

Alternatively, highly specific in-situ hybridization (RNAscope) enables detection of mRNA molecules in FFPE tissue157 and was used successfully for automated, quantitative profiling of HER2 status in breast carcinoma.172 While providing cellular and subcellular resolution, a priori knowledge of targets is necessary and highly multiplexed tissue analysis is currently not possible. However, a higher degree of multiplexed targeted gene mRNA detection in breast cancer has been achieved using padlock sequencing.173 Expansion of RNAscope by the usage of oligonucleotides conjugated to metal-chelated reporters to bind RNA-probes during the final hybridization step facilitates simultaneous labeling of protein structures using metal-conjugated antibodies. This in turn enables simultaneous profiling of mRNA and proteins from the same section using IMC and was shown to successfully correlate with mRNA and protein expression levels in a large cohort of samples, providing architectural maps of breast cancer tissue at spatial single-cell resolution.161

In addition to spatially resolved mRNA and protein expression, recent technological advances linked the genomic profile of a single cell to its position in a given tissue.162,163 Common to both approaches is that the tumor tissue is subjected to hematoxylin & eosin (H&E) staining to visualize structural elements, followed by subsequent isolation of single cells using UV laser162 or isolation of groups of cells down to single cells using single infra-red (IR) pulses.163 This made it possible to spatially resolve genomic aberrations occurring during an early stage tumor such as ductal breast carcinoma,162 and holds great potential to increase our understanding of how tumor infiltration and invasion processes occur at the single-cell level in the context of the tumor microenvironment.

Taken together, a broad variety of technologies allowing single-cell readouts in a spatial context are available, which have offered highly relevant insights by integrating different modalities, such as mRNA and protein expression or genomic aberrations within the tissue context. These approaches differ in their capacity for cellular throughput, compatibility with clinical samples, single-cell resolution, preservation of tissue integrity for downstream analyses, the degree of multiplexed detection, and the necessity of a priori knowledge of tissue-specific targets. Therefore, selection of a methodology for spatial tissue analysis often necessitates a compromise with regard to several aspects. For example, FFPE-compatible mRNA analysis at single-cell resolution requires a trade-off in the number of transcripts that can be processed at the same time.

Overall, spatial tissue analysis at single-cell resolution still needs to overcome several limitations in order to become a widely used clinical diagnostic technology, but given the rapid development and demonstrated high promise of this technology it is likely that we will witness significant advancement toward clinical applicability in the near future.

7 | CHALLENGES FOR CLINICAL TRANSLATION OF SINGLE-CELL SEQUENCING

As described in previous sections, rapid technological progress has made routine analysis of tumor tissue at single-cell resolution feasible in a research setting. Single-cell analysis in cancer has opened up the possibility to aid diagnostics,148,172 to monitor treatment response72,76,85,87,149 or to refine treatment processes59-61,72 toward personalized therapies, and has thus spurred great interest in translating such technologies into routine clinical applications.

Single-cell analysis, regardless of modality or spatial resolution, currently requires cost-intensive, large-scale sequencing reactions in order to process a clinically informative cohort of patient samples and
to extract sufficient numbers of single cells to guarantee statistically valid analyses to infer reliable diagnostic information. Novel strategies for multiplexing single-cell transcriptomes50,51 or single-cell genomes133 have greatly increased the possible cellular throughput per sample. However, WES and WGS of single cells for de novo SNV calling in particular necessitate high sequencing depths, rendering these approaches currently economically challenging for large-scale studies. Additionally, increasing single-cell throughput together with multiomic readouts and higher-dimensional data requires sophisticated expertise for data analysis in addition to large computational infrastructure. Approaches of combined low and high coverage analyses,130,141,147 targeted qPCR-based analyses153,174 and more recently microfluidic droplet-based analysis149,175 present more cost-effective alternatives to assess genomic aberration in cancer.
Alternatively, inference of genomic alterations from less cost-intensive mRNA-sequencing strategies may provide an attractive strategy to facilitate introduction of single-cell sequencing in a clinical diagnostic setting.35,37,38,71,73,74,76 However, such approaches provide only indirect genomic information, which is furthermore limited because a genomic lesion needs to be manifested on the mRNA level and, on top of that, needs to be captured by the chosen single-cell chemistry. Also, most solid tumor samples are stored as FFPE tissue blocks, which often yield low-quality mRNA, thus rendering single-cell transcriptomics challenging.
While Smart-seq2-based full-length mRNA-sequencing at single-cell resolution generates more transcriptional information which can be used to infer genomic aberrations, its analysis cost per cell on sorted plates amounts to approximately 30 USD as opposed to 4 USD on commercial platforms such as the iCell8cx. In contrast, droplet-based mRNA-sequencing on the 10X Chromium provides a more cost-effective alternative, with a processing cost of 0.5 USD per cell. For further comparison of different platforms, methodologies, and cohort sizes in selected reference literature, see Tables 2 and 3.
Initially, single-cell studies focused on manual isolation of individual cells,128,129 an approach that does not meet the required cellular throughput for clinical applications. Technological advancements such as the use of microwells43,49 and nanodroplets44-46 have greatly increased the cellular throughput; however, such methods require rather large starting cell numbers and provide comparably low capture rates,45,46 thereby rendering these technologies less favorable in a clinical setting when sample size may be small and limiting. The necessity to enrich for distinct cell populations via antibody staining and flow cytometry provides another example of sample loss, which becomes particularly unfavorable when analyzing low-input samples and rare cell types such as CTCs.60,61,146-148 In order to generate clinically valuable results, capturing of extremely rare clones in a tumor sample must be guaranteed, which in turn requires processing of patient-derived samples in their entirety, agnostic of sample size. Platforms to isolate scarce clinical samples for high-throughput analysis, for example, the sciFLEXARRAYER S3 or cellenONE systems, are becoming available and have recently been used successfully for low-coverage CNV profiling of human breast cancer samples.137
Another important obstacle to achieving clinical translation of single-cell technologies is sample comparability with regard to sample isolation, molecular characterization and downstream computational analyses. It is known that cell dissociation strategies can alter transcriptional signatures155 and that cell isolation needs to be evaluated carefully prior to starting an experiment.176 Furthermore, it has been shown that gene expression patterns are induced which are distinct for biopsy- and autopsy-derived brain samples177 and which bias downstream analyses. Thus, the effect of sample isolation, storage and sample type on the modality analyzed (mRNA, DNA, protein) needs to be systematically assessed in order to define common classifiers to facilitate comparability of results between different research centers and across a large space of clinical samples. Integration of heterogeneous data sets obtained at different research centers using different methodologies will require advanced batch correction24 and computational tools to combine these data sets in a meaningful way while minimizing overcorrection and maintaining relevant biological differences.178
Extensive benchmarking of single-cell technologies179 together with interrogation of sampling artefacts180 is pivotal in order to achieve overall comparability of test results. A recent study has carried out a systematic comparison of different cell and nuclei isolation strategies on a diverse range of clinical cancer samples. Testing several isolation protocols per sample type, the authors based their evaluation, among other metrics, on the cellular diversity a given protocol would reproduce. Taken together, this study provides an extensive resource suggesting highly specified isolation protocols for various cancer tissues with the overarching goal to provide standardizable, robust and comparable single-cell workflows for use in a clinical setting.181
The results of an interrogation of clinical samples by single-cell analysis may vary depending on the platform used. It is therefore crucial to systematically assess existing platforms for single-cell genome and spatial tissue analysis, similarly to single-cell transcriptome chemistries,54,55 to determine the most suitable applications and experimental conditions for defined clinical questions.
With regard to mRNA analysis, definition of a statistically critical number of cells for computational analyses, optimal sequencing depth in addition to the minimum number of cells necessary to define a cell type or state, among others, will be crucial for improved comparability. In order to provide sequencing data that allow for comparable mutation analysis and variant calling, a unified way to define clonal structures will need to be established.134 Novel computational tools are required to manage data analysis in increasingly large data sets, as has been demonstrated previously.52,178 Computational challenges associated with the analysis of cancer samples by single-cell transcriptomics are reviewed elsewhere.26 In addition, analysis of longitudinal clinical samples is likely to provide insight into disease progression and treatment response.175 In order to relate longitudinal samples, existing trajectory inference methods182,183 need to be developed further to integrate different modalities of the same sample along a pseudo-time axis, to resolve multiple clonal subtypes and to perform high-dimensional comparative analyses against large patient cohorts while maintaining the longitudinal order of cell states and clones.
Moreover, to reach clinical validity, existing single-cell methodologies need to possess optimal detection sensitivity and specificity. The

TABLE 3 Overview of technical details of translational research articles highlighted in this review, which describe genomic, epigenetic, or
spatial analyses

Abbreviations: ADC, adenocarcinoma; AML, acute myeloid leukemia; ALL, acute lymphoblastic leukaemia; ATAC, assay for transposase-accessible
chromatin; BCC, basal cell carcinoma; CITEseq, cellular indexing of transcriptomes and epitopes by sequencing; CLL, chronic lymphocytic leukemia; CNA,
copy number alteration; CNV, copy number variation; DSP, digital spatial profiler; FACS, fluorescence-activated cell sorting; FL, follicular lymphoma; HDST,
high density spatial transcriptomics; HCC, hepatocellular carcinoma; MPAL, mixed-phenotype acute leukemia; MRD AML, minimal residual disease acute
myeloid leukemia; PC, prostate cancer; PDAC, pancreatic ductal adenocarcinoma; PHLI-seq, phenotype-based high-throughput laser-aided isolation and
sequencing; QRP DOP-PCR, quasi-random priming degenerate oligonucleotide primed polymerase chain reaction; RCC, renal cell carcinoma; SNV, single-nucleotide variant;
SCLC, small cell lung cancer; SS, synovial sarcoma; ST, spatial transcriptomics; TNBC, triple-negative breast cancer; WES, whole-exome
sequencing; WGS, whole-genome sequencing.
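
As a rough planning illustration of the cost and throughput considerations discussed in the text, the following Python sketch combines two back-of-the-envelope calculations: how many cells would need to be profiled to sample at least one cell of a rare clone with a given probability, and what that would cost at the approximate per-cell prices quoted in the text (30 USD for Smart-seq2 on sorted plates, 4 USD on the iCell8cx, 0.5 USD on the 10X Chromium). The function names, the example clone frequency, the target probability, and the cohort size are illustrative assumptions rather than values from the cited studies. The calculation also assumes unbiased, independent sampling of cells with no capture loss, which real workflows will not achieve.

```python
import math

# Approximate per-cell processing costs (USD) as quoted in the text.
COST_PER_CELL_USD = {
    "Smart-seq2 (sorted plates)": 30.0,
    "iCell8cx": 4.0,
    "10X Chromium (droplets)": 0.5,
}


def cells_needed(clone_fraction: float, detection_prob: float) -> int:
    """Smallest n with P(at least one clone cell among n cells) >= detection_prob.

    Simplifying assumption: cells are sampled independently and without bias,
    so P = 1 - (1 - clone_fraction) ** n.
    """
    if not 0.0 < clone_fraction < 1.0:
        raise ValueError("clone_fraction must be in (0, 1)")
    if not 0.0 < detection_prob < 1.0:
        raise ValueError("detection_prob must be in (0, 1)")
    n = math.log(1.0 - detection_prob) / math.log(1.0 - clone_fraction)
    return math.ceil(n)


def cohort_cost(cells_per_sample: int, n_samples: int, cost_per_cell: float) -> float:
    """Total processing cost for a cohort, ignoring sequencing and labor."""
    return cells_per_sample * n_samples * cost_per_cell


if __name__ == "__main__":
    # Hypothetical scenario: clone at 0.1% of cells, 95% chance of sampling it,
    # in a cohort of 50 patient samples.
    n_cells = cells_needed(clone_fraction=0.001, detection_prob=0.95)
    print(f"Cells per sample: {n_cells}")
    for platform, price in COST_PER_CELL_USD.items():
        total = cohort_cost(n_cells, n_samples=50, cost_per_cell=price)
        print(f"{platform}: {total:,.0f} USD for 50 samples")
```

Under these assumptions, a clone present in 0.1% of cells calls for roughly 3,000 profiled cells per sample, which illustrates why the per-cell price dominates the feasibility of large clinical cohorts.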

FIGURE 3 Timeline illustrating key references utilizing single-cell technology in translational cancer research for transcriptome analysis and
immune profiling. Chronological appearance of selected references highlighted in this review in which RNA analysis and immune profiling in the
context of various different cancer types were performed. At the bottom, schematic illustration of key platforms used over time to isolate and
process single cells such as FACS isolation into PCR plates, microfluidic devices such as the Fluidigm C1, droplet-based technologies such as 10X
Genomics, laser-catapulting, imaging mass cytometry and spatially resolved analysis of the transcriptome. AML = acute myeloid leukemia;
ATRT = atypical teratoid/rhabdoid tumors; CML = chronic myeloid leukemia; CTCs = circulating tumor cells; CyTOF = cytometry by time of flight;
ETMR = embryonal tumors with multilayered rosettes; FACS = fluorescence-activated cell sorting; GBM = glioblastoma multiforme;
HNSCC = head and neck squamous carcinoma; MARS-seq = massively parallel RNA single-cell sequencing; PDX = patient-derived xenograft;
TCR-seq = T cell receptor sequencing; WGA = whole-genome amplification; WNT MB = WNT-subtype medulloblastoma (n.a.: not available/
disclosed in article)

FIGURE 4 Timeline illustrating key references utilizing single-cell technology to study epigenetic alterations and genomic aberrations in single
cancer cells as well as selected references studying cancer tissues with spatial resolution. Chronological appearance of selected research articles
highlighted in this review assessing epigenetic and DNA changes in cancer samples and the progressive emergence of studies spatially resolving
different modalities such as mRNA and protein expression as well as genomic alterations in cancer tissue. Schematic representations of key
platforms frequently used in various different single-cell studies such as FACS isolation into PCR plates, microfluidic devices such as the Fluidigm
C1, droplet-based technologies such as 10X Genomics, laser-catapulting, imaging mass cytometry and spatially resolved analysis of the
transcriptome. ALL = acute lymphoblastic leukaemia; AML = acute myeloid leukemia; ATAC-seq = assay for transposase-accessible chromatin;
CITE-seq = cellular indexing of transcriptomes and epitopes by sequencing; CLL = chronic lymphocytic leukemia; CML = chronic myeloid
leukemia; DOP PCR = degenerate oligonucleotide primed polymerase chain reaction; FACS = fluorescence-activated cell sorting; GeoMx
DSP = GeoMx digital spatial profiler; GS = genome sequencing; HDST = high density spatial transcriptomics; IP = immunoprecipitation;
MALBAC = multiple annealing and looping based amplification cycles; MDA = multiple displacement amplification; MRD AML = minimal residual
disease AML; PDAC = pancreatic ductal adenocarcinoma; QRP DOP-PCR = quasi-random priming DOP-PCR; PHLI-seq = phenotype-based high-
throughput laser-aided isolation and sequencing; RCA = rolling circle amplification; SCLC = small cell lung cancer; ST = spatial transcriptomics;
TNBC = triple-negative breast cancer; Trio-seq = triple omics sequencing; WES = whole-exome sequencing; WGA = whole-genome amplification
(n.a.: not available/disclosed in article)

minute starting amounts of mRNA or DNA in single cells often require extensive amplification prior to sequencing, which introduces technical noise such as allelic dropouts or polymerase-induced errors,134 as well as dropout events resulting from transcripts that were not captured during reverse transcription.184 Such noise impairs overall detection confidence. Both continuous improvement of sample preparation and

computational models are required to correct for such errors for any platform and chemistry which is intended to be used in a clinical setting,184,185 as exemplified here for single-cell mRNA-sequencing.
Advances in this field will facilitate not only the cellular deconvolution of cancer tissues but also the building of cancer-specific classifiers based on several modalities, thereby refining current cancer classification and treatment of cancer. Ultimately, single-cell data of distinct modalities will need to be put into the context of the tumor tissue, where transcriptomic and genomic signatures are translated into altered functionality of cells in the diseased state.

8 | CONCLUSIONS

The tremendous technological development in single-cell sequencing of the past decade has yielded a broad toolbox to study many modalities in cancer, such as mRNA,37,38,57,59 DNA alterations,141,144,146-149,154,175 immune cell composition of tumors,85-87,90,92,95 chromatin changes116-118,122 and metabolic effectors97 in dissociated single cells or nuclei as well as within the context of diseased tissue156-164,167,172 (Figures 3 and 4). Mono-, bi-, or even multimodal approaches have within a short time facilitated cancer research at unprecedented depth and yielded invaluable information on tumor composition and classification, clonal evolution in cancer, disease progression and treatment response.
Nevertheless, many methods fall short in providing information on tissue context, which can provide further information of prognostic value.80,186 The achievement of clinical translation of spatially resolved methodologies depends, among others, on the ability to comprehensively analyse high-dimensional data comprised of information on cell type and state, cell boundaries to adjacent cells together with the cell's location within the tissue. This in turn requires the development of computational tools which allow for robust identification of given patterns of cells in a tissue or tissue motifs,187 which then may enable in silico construction of tissue network structures and ultimately the inference of pathological processes, necessary to classify patients and to aid diagnosis.
While current technical limitations prevent broad clinical application of the aforementioned methodologies, it seems clear that single-cell analyses will become an integral part in clinical diagnostics, prognostication, disease follow-up, and treatment selection in the coming years. This is strongly emphasized by the large number of studies and their diverse scope employing transcriptome-sequencing of single cells (Figure 3). In addition, existing single-cell DNA applications often present sufficient analytical validity, and additional refinement regarding detection sensitivity and specificity of those methods may ultimately render obsolete bulk WGS/WES, which are currently often used to either substantiate findings obtained via single-cell sequencing144 or to nominate genomic lesions for targeted single-cell analysis.149,154,175 Further, each existing technology possesses specific opportunities but also technical shortcomings, which will affect their analytical validity and which will therefore lead to varying time frames for clinical translation. Nevertheless, the literature highlighted in this review clearly demonstrates the applicability and usefulness of single-cell analysis in cancer research and diagnostics.

CONFLICT OF INTEREST
The authors declare no potential conflict of interest.

DATA AVAILABILITY STATEMENT
Data sharing not applicable to this article as no datasets were generated or analysed during the current study.

ORCID
Ulrich Pfisterer https://orcid.org/0000-0002-4613-6427

REFERENCES
1. Burrell RA, Mcgranahan N, Bartek J, Swanton C. The causes and consequences of genetic heterogeneity in cancer evolution. Nature. 2013;501:338-345. https://doi.org/10.1038/nature12625.
2. Jones PA, Issa JPJ, Baylin S. Targeting the cancer epigenome for therapy. Nat Rev Genet. 2016;17:630-641. https://doi.org/10.1038/nrg.2016.93.
3. Wouters BJ, Delwel R. Epigenetics and approaches to targeted epigenetic therapy in acute myeloid leukemia. Blood. 2016;127:42-52. https://doi.org/10.1182/blood-2015-07-604512.
4. Landau DA, Carter SL, Stojanov P, et al. Evolution and impact of subclonal mutations in chronic lymphocytic leukemia. Cell. 2013;152(4):714-726. https://doi.org/10.1016/j.cell.2013.01.019.
5. Anderson K, Lutz C, Van Delft FW, et al. Genetic variegation of clonal architecture and propagating cells in leukaemia. Nature. 2011;469(7330):356-361. https://doi.org/10.1038/nature09650.
6. Metzker ML. Sequencing technologies the next generation. Nat Rev Genet. 2010;11(1):31-46. https://doi.org/10.1038/nrg2626.
7. Shendure J, Ji H. Next-generation DNA sequencing. Nat Biotechnol. 2008;26(10):1135-1145. https://doi.org/10.1038/nbt1486.
8. Rozenblatt-Rosen O, Regev A, Oberdoerffer P, et al. The human tumor atlas network: charting tumor transitions across space and time at single-cell resolution. Cell. 2020;181(2):236-249. https://doi.org/10.1016/j.cell.2020.03.053.
9. Ley TJ, Miller C, Ding L, et al. Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. N Engl J Med. 2013;368(22):2059-2074. https://doi.org/10.1056/NEJMoa1301689.
10. Abeshouse A, Adebamowo C, Adebamowo SN, et al. Comprehensive and integrated genomic characterization of adult soft tissue sarcomas. Cell. 2017;171(4):950-965.e28. https://doi.org/10.1016/j.cell.2017.10.014.
11. Koboldt DC, Fulton RS, McLellan MD, et al. Comprehensive molecular portraits of human breast tumours. Nature. 2012;490(7418):61-70. https://doi.org/10.1038/nature11412.
12. Ciriello G, Gatza ML, Beck AH, et al. Comprehensive molecular portraits of invasive lobular breast Cancer. Cell. 2015;163(2):506-519. https://doi.org/10.1016/j.cell.2015.09.033.
13. Tirosh I, Suvà ML. Deciphering human tumor biology by single-cell expression profiling. Annu Rev Cancer Biol. 2019;3(1):151-166. https://doi.org/10.1146/annurev-cancerbio-030518-055609.
14. Lawson DA, Kessenbrock K, Davis RT, Pervolarakis N, Werb Z. Tumour heterogeneity and metastasis at single-cell resolution. Nat Cell Biol. 2018;20(12):1349-1360. https://doi.org/10.1038/s41556-018-0236-7.
15. Stuart T, Satija R. Integrative single-cell analysis. Nat Rev Genet. 2019;1:257-272. https://doi.org/10.1038/s41576-019-0093-7.
16. Fuzik J, Zeisel A, Mate Z, et al. Integration of electrophysiological recordings with single-cell RNA-seq data identifies neuronal subtypes. Nat Biotechnol. 2016;34:175-183. https://doi.org/10.1038/nbt.3443.
17. Cadwell CR, Palasantza A, Jiang X, et al. Electrophysiological, transcriptomic and morphologic profiling of single neurons using patch-seq. Nat Biotechnol. 2015;34:199-203. https://doi.org/10.1038/nbt.3445.

18. Chen S, Lake BB, Zhang K. High-throughput sequencing of the transcriptome and chromatin accessibility in the same cell. Nat Biotechnol. 2019;37:1452-1457. https://doi.org/10.1038/s41587-019-0290-0.
19. Stoeckius M, Hafemeister C, Stephenson W, et al. Simultaneous epitope and transcriptome measurement in single cells. Nat Methods. 2017;14:865-868. https://doi.org/10.1038/nmeth.4380.
20. Peterson VM, Zhang KX, Kumar N, et al. Multiplexed quantification of proteins and transcripts in single cells. Nat Biotechnol. 2017;35(10):936-939. https://doi.org/10.1038/nbt.3973.
21. Macaulay IC, Ponting CP, Voet T. Single-cell Multiomics: multiple measurements from single cells. Trends Genet. 2017;33:155-168. https://doi.org/10.1016/j.tig.2016.12.003.
22. Macaulay IC, Haerty W, Kumar P, et al. G&T-seq: parallel sequencing of single-cell genomes and transcriptomes. Nat Methods. 2015;12:519-522. https://doi.org/10.1038/nmeth.3370.
23. Jaitin DA, Kenigsberg E, Keren-Shaul H, et al. Massively parallel single-cell RNA-seq for marker-free decomposition of tissues into cell types. Science. 2014;343:776-779. https://doi.org/10.1126/science.1247651.
24. Luecken MD, Theis FJ. Current best practices in single-cell RNA-seq analysis: a tutorial. Mol Syst Biol. 2019;15(6):e8746. https://doi.org/10.15252/msb.20188746.
25. La Manno G, Soldatov R, Zeisel A, et al. RNA velocity of single cells. Nature. 2018;560(7719):494-498. https://doi.org/10.1038/s41586-018-0414-6.
26. Fan J, Slowikowski K, Zhang F. Single-cell transcriptomics in cancer: computational challenges and opportunities. Exp Mol Med. 2020;52(9):1452-1465. https://doi.org/10.1038/s12276-020-0422-0.
27. Burke W. Clinical validity and clinical utility of genetic tests. Curr Protoc Hum Genet. 2004;42:15.1-15.6. https://doi.org/10.1002/0471142905.hg0915s42 Chap. 9.
28. Burke W. Genetic tests: clinical validity and clinical utility. Curr Protoc Hum Genet. 2014;81:1-14. https://doi.org/10.1002/0471142905.hg0915s81.
29. Katsanis SH, Katsanis N. Molecular genetic testing and the future of clinical genomics. Nat Rev Genet. 2013;14(6):415-426. https://doi.org/10.1038/nrg3493.
30. Han X, Wang R, Zhou Y, et al. Mapping the mouse cell atlas by microwell-Seq. Cell. 2018;172(5):1091-1097.e17. https://doi.org/10.1016/j.cell.2018.02.001.
31. Regev A, Teichmann SA, Lander ES, et al. The human cell atlas. Elife. 2017;6:e27041. https://doi.org/10.7554/eLife.27041.
32. Schaum N, Karkanias J, Neff NF, et al. Single-cell transcriptomics of 20 mouse organs creates a tabula Muris. Nature. 2018;562:367-372. https://doi.org/10.1038/s41586-018-0590-4.
33. Rozenblatt-Rosen O, Stubbington MJT, Regev A, Teichmann SA. The human cell atlas: from vision to reality. Nature. 2017;550:451-453. https://doi.org/10.1038/550451a.
34. Velmeshev D, Schirmer L, Jung D, et al. Single-cell genomics identifies cell type–specific molecular changes in autism. Science. 2019;364:685-689. https://doi.org/10.1126/science.aav8130.
35. Giustacchini A, Thongjuea S, Barkas N, et al. Single-cell transcriptomics uncovers distinct molecular signatures of stem cells in chronic myeloid leukemia. Nat Med. 2017;23(6):692-702. https://doi.org/10.1038/nm.4336.
36. Zhang F, Wei K, Slowikowski K, et al. Defining inflammatory cell states in rheumatoid arthritis joint synovial tissues by integrating single-cell transcriptomics and mass cytometry. Nat Immunol. 2019;20:928-942. https://doi.org/10.1038/s41590-019-0378-1.
37. Venteicher AS, Tirosh I, Hebert C, et al. Decoupling genetics, lineages, and microenvironment in IDH-mutant gliomas by single-cell RNA-seq. Science. 2017;355:eaai8478. https://doi.org/10.1126/science.aai8478.
38. Patel AP, Tirosh I, Trombetta JJ, et al. Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma. Science. 2014;344:1396-1401. https://doi.org/10.1126/science.1254257.
39. Skene NG, Bryois J, Bakken TE, et al. Genetic identification of brain cell types underlying schizophrenia. Nat Genet. 2018;50:825-833. https://doi.org/10.1038/s41588-018-0129-5.
40. Picelli S, Faridani OR, Bjorklund AK, Winberg G, Sagasser S, Sandberg R. Full-length RNA-seq from single cells using smart-seq2. Nat Protoc. 2014;9(1):171-181. https://doi.org/10.1038/nprot.2014.006.
41. Hagemann-Jensen M, Ziegenhain C, Chen P, et al. Single-cell RNA counting at allele- and isoform-resolution using smart-seq3. Nat Biotechnol. 2020;38:708-714. https://doi.org/10.1038/s41587-020-0497-0.
42. Islam S, Kjällquist U, Moliner A, et al. Highly multiplexed and strand-specific single-cell RNA 5′ end sequencing. Nat Protoc. 2012;7:813-828. https://doi.org/10.1038/nprot.2012.022.
43. Fan HC, Fu GK, Fodor SP. Expression profiling. Combinatorial labeling of single cells for gene expression cytometry. Science. 2015;347(6222):1258367. https://doi.org/10.1126/science.1258367.
44. Klein AM, Mazutis L, Akartuna I, et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell. 2015;161(5):1187-1201. https://doi.org/10.1016/j.cell.2015.04.044.
45. Macosko EZ, Basu A, Satija R, et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell. 2015;161:1202-1214. https://doi.org/10.1016/j.cell.2015.05.002.
46. Zheng GXY, Terry JM, Belgrader P, et al. Massively parallel digital transcriptional profiling of single cells. Nat Commun. 2017;8:14049. https://doi.org/10.1038/ncomms14049.
47. Hashimshony T, Senderovich N, Avital G, et al. CEL-Seq2: sensitive highly-multiplexed single-cell RNA-Seq. Genome Biol. 2016;17(1):1-7. https://doi.org/10.1186/s13059-016-0938-8.
48. Sasagawa Y, Danno H, Takada H, et al. Quartz-Seq2: a high-throughput single-cell RNA-sequencing method that effectively uses limited sequence reads. Genome Biol. 2018;19(1):29. https://doi.org/10.1186/s13059-018-1407-3.
49. Gierahn TM, Wadsworth MH, Hughes TK, et al. Seq-well: portable, low-cost rna sequencing of single cells at high throughput. Nat Methods. 2017;14(4):395-398. https://doi.org/10.1038/nmeth.4179.
50. Rosenberg AB, Roco CM, Muscat RA, et al. Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding. Science. 2018;360:176-182. https://doi.org/10.1126/science.aam8999.
51. Datlinger P, Rendeiro AF, Boenke T, Krausgruber T, Barreca D, Bock C. Ultra-high throughput single-cell RNA sequencing by combinatorial fluidic indexing. bioRxiv. 2019;1-27. https://doi.org/10.1101/2019.12.17.879304.
52. Saunders A, Macosko EZ, Wysoker A, et al. Molecular diversity and specializations among the cells of the adult mouse brain. Cell. 2018;174:1015-1030.e16. https://doi.org/10.1016/j.cell.2018.07.028.
53. Svensson V, Vento-Tormo R, Teichmann SA. Exponential scaling of single-cell RNA-seq in the past decade. Nat Protoc. 2018;13(4):599-604. https://doi.org/10.1038/nprot.2017.149.
54. Ziegenhain C, Vieth B, Parekh S, et al. Comparative analysis of single-cell RNA sequencing methods. Mol Cell. 2017;65(4):631-643.e4. https://doi.org/10.1016/j.molcel.2017.01.023.
55. Zhang X, Li T, Liu F, et al. Comparative analysis of droplet-based ultra-high-throughput single-cell RNA-Seq systems. Mol Cell. 2019;73(1):130-142.e5. https://doi.org/10.1016/j.molcel.2018.10.020.
56. Ramskold D, Luo S, Wang YC, et al. Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nat Biotechnol. 2012;30(8):777-782. https://doi.org/10.1038/nbt.2282.
57. Lee MCW, Lopez-Diaz FJ, Khan SY, et al. Single-cell analyses of transcriptional heterogeneity during drug tolerance transition in

cancer cells by RNA sequencing. Proc Natl Acad Sci U S A. 2014;111(44):E4726-E4735. https://doi.org/10.1073/pnas.1404656111.
58. Kim KT, Lee HW, Lee HO, et al. Single-cell mRNA sequencing identifies subclonal heterogeneity in anti-cancer drug responses of lung adenocarcinoma cells. Genome Biol. 2015;16(1):1-15. https://doi.org/10.1186/s13059-015-0692-3.
59. Kim KT, Lee HW, Lee HO, et al. Application of single-cell RNA sequencing in optimizing a combinatorial therapeutic strategy in metastatic renal cell carcinoma. Genome Biol. 2016;17:80. https://doi.org/10.1186/s13059-016-0945-9.
60. Miyamoto DT, Zheng Y, Wittner BS, et al. RNA-Seq of single prostate CTCs implicates noncanonical Wnt signaling in antiandrogen resistance. Science. 2015;349(6254):1351-1356. https://doi.org/10.1126/science.aab0917.
61. Jordan NV, Bardia A, Wittner BS, et al. HER2 expression identifies dynamic functional states within circulating breast cancer cells. Nature. 2016;537(7618):102-106. https://doi.org/10.1038/nature19328.
62. Gao R, Kim C, Sei E, et al. Nanogrid single-nucleus RNA sequencing reveals phenotypic diversity in breast cancer. Nat Commun. 2017;8(1):228. https://doi.org/10.1038/s41467-017-00244-w.
63. Lambrechts D, Wauters E, Boeckx B, et al. Phenotype molding of stromal cells in the lung tumor microenvironment. Nat Med. 2018;24(8):1277-1289. https://doi.org/10.1038/s41591-018-0096-5.
64. Goveia J, Rohlenova K, Taverna F, et al. An integrated gene expression landscape profiling approach to identify lung tumor endothelial cell heterogeneity and Angiogenic candidates. Cancer Cell. 2020;37(1):21-36.e13. https://doi.org/10.1016/j.ccell.2019.12.001.
65. Young MD, Mitchell TJ, Vieira Braga FA, et al. Single-cell transcriptomes from human kidneys reveal the cellular identity of renal tumors. Science. 2018;361(6402):594-599. https://doi.org/10.1126/science.aat1699.
66. Jessa S, Blanchet-Cohen A, Krug B, et al. Stalled developmental programs at the root of pediatric brain tumors. Nat Genet. 2019;51:1702-1713. https://doi.org/10.1038/s41588-019-0531-7.
67. Müller S, Kohanbash G, Liu SJ, et al. Single-cell profiling of human gliomas reveals macrophage ontogeny as a basis for regional differences in macrophage activation in the tumor microenvironment. Genome Biol. 2017;18(1):1-14. https://doi.org/10.1186/s13059-017-1362-4.
68. Puram SV, Tirosh I, Parikh AS, et al. Single-cell Transcriptomic analysis of primary and metastatic tumor ecosystems in head and neck Cancer. Cell. 2017;171(7):1611-1624.e24. https://doi.org/10.1016/j.cell.2017.10.044.
69. Li H, Courtois ET, Sengupta D, et al. Reference component analysis of single-cell transcriptomes elucidates cellular heterogeneity in human colorectal tumors. Nat Genet. 2017;49(5):708-718. https://doi.org/10.1038/ng.3818.
70. Savage P, Blanchet-Cohen A, Revil T, et al. A targetable EGFR-dependent tumor-initiating program in breast Cancer. Cell Rep. 2017;21(5):1140-1149. https://doi.org/10.1016/j.celrep.2017.10.015.
71. Chung W, Eum HH, Lee HO, et al. Single-cell RNA-seq enables comprehensive tumour and immune cell profiling in primary breast cancer. Nat Commun. 2017;8(May):1-12. https://doi.org/10.1038/ncomms15081.
72. Brady SW, McQuerry JA, Qiao Y, et al. Combating subclonal evolution of resistant cancer phenotypes. Nat Commun. 2017;8(1):1231. https://doi.org/10.1038/s41467-017-01174-3.
73. Tirosh I, Izar B, Prakadan SM, et al. Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science. 2016;352(6282):189-196. https://doi.org/10.1126/science.aad0501.
74. Filbin MG, Tirosh I, Hovestadt V, et al. Developmental and oncogenic programs in H3K27M gliomas dissected by single-cell RNA-seq. Science. 2018;360(6386):331-335. https://doi.org/10.1126/science.aao4750.
75. Tirosh I, Venteicher AS, Hebert C, et al. Single-cell RNA-seq supports a developmental hierarchy in human oligodendroglioma. Nature. 2016;539(7628):309-313. https://doi.org/10.1038/nature20123.
76. Van Galen P, Hovestadt V, Ii MHW, et al. Single-cell RNA-Seq reveals AML hierarchies relevant to disease progression and immunity. Cell. 2019;176:1-17. https://doi.org/10.1016/j.cell.2019.01.031.
77. Petti AA, Williams SR, Miller CA, et al. A general approach for detecting expressed mutations in AML cells using single cell RNA-sequencing. Nat Commun. 2019;10(1):3660. https://doi.org/10.1038/s41467-019-11591-1.
78. Durante MA, Rodriguez DA, Kurtenbach S, et al. Single-cell analysis reveals new evolutionary complexity in uveal melanoma. Nat Commun. 2020;11(1):496. https://doi.org/10.1038/s41467-019-14256-1.
79. Palmer S, Albergante L, Blackburn CC, Newman TJ. Thymic involution and rising disease incidence with age. Proc Natl Acad Sci U S A. 2018;115(8):1883-1888. https://doi.org/10.1073/pnas.1714478115.
80. Pagès F, Galon J, Dieu-Nosjean MC, Tartour E, Sautès-Fridman C, Fridman WH. Immune infiltration in human tumors: a prognostic factor that should not be ignored. Oncogene. 2010;29(8):1093-1102. https://doi.org/10.1038/onc.2009.416.
81. Han A, Glanville J, Hansmann L, Davis MM. Linking T-cell receptor sequence to functional phenotype at the single-cell level. Nat Biotechnol. 2014;32(7):684-692. https://doi.org/10.1038/nbt.2938.
82. Zemmour D, Zilionis R, Kiner E, Klein AM, Mathis D, Benoist C. Single-cell gene expression reveals a landscape of regulatory T cell phenotypes shaped by the TCR. Nat Immunol. 2018;19(3):291-301. https://doi.org/10.1038/s41590-018-0051-0.
83. Moral JA, Leung J, Rojas LA, et al. ILC2s amplify PD-1 blockade by activating tissue-specific cancer immunity. Nature. 2020;579(7797):130-135. https://doi.org/10.1038/s41586-020-2015-4.
84. Li H, van der Leun AM, Yofe I, et al. Dysfunctional CD8 T cells form a proliferative, dynamically regulated compartment within human melanoma. Cell. 2019;176(4):775-789.e18. https://doi.org/10.1016/j.cell.2018.11.043.
85. Fairfax BP, Taylor CA, Watson RA, et al. Peripheral CD8+ T cell characteristics associated with durable responses to immune checkpoint blockade in patients with metastatic melanoma. Nat Med. 2020;26(2):193-199. https://doi.org/10.1038/s41591-019-0734-6.
86. Azizi E, Carr AJ, Plitas G, et al. Single-cell map of diverse immune phenotypes in the breast tumor microenvironment. Cell. 2018;174(5):1293-1308.e36. https://doi.org/10.1016/j.cell.2018.05.060.
87. Wu TD, Madireddi S, de Almeida PE, et al. Peripheral T cell expansion predicts tumour infiltration and clinical response. Nature. 2020;579(7798):274-278. https://doi.org/10.1038/s41586-020-2056-8.
88. Park JE, Botting RA, Conde CD, et al. A cell atlas of human thymic development defines T cell repertoire formation. Science. 2020;367(6480):eaay3224. https://doi.org/10.1126/science.aay3224.
89. Savas P, Virassamy B, Ye C, et al. Single-cell profiling of breast cancer T cells reveals a tissue-resident memory subset associated with improved prognosis. Nat Med. 2018;24(7):986-993. https://doi.org/10.1038/s41591-018-0078-7.
90. Zheng C, Zheng L, Yoo JK, et al. Landscape of infiltrating T cells in liver Cancer revealed by single-cell sequencing. Cell. 2017;169(7):1342-1356.e16. https://doi.org/10.1016/j.cell.2017.05.035.
91. Zhang Q, He Y, Luo N, et al. Landscape and dynamics of single immune cells in hepatocellular carcinoma. Cell. 2019;179(4):829-845.e20. https://doi.org/10.1016/j.cell.2019.10.003.
92. Zilionis R, Engblom C, Pfirschke C, et al. Single-cell Transcriptomics of human and mouse lung cancers reveals conserved myeloid populations across individuals and species. Immunity. 2019;50(5):1317-1334.e10. https://doi.org/10.1016/j.immuni.2019.03.009.
93. Guo X, Zhang Y, Zheng L, et al. Global characterization of T cells in non-small-cell lung cancer by single-cell sequencing. Nat Med. 2018;24(7):978-985. https://doi.org/10.1038/s41591-018-0045-3.

22
94. Zhang L, Yu X, Zheng L, et al. Lineage tracking reveals dynamic rela- hepatocellular carcinomas. Cell Res. 2016;26(3):304-319. https://
tionships of T cells in colorectal cancer. Nature. 2018;564(7735): doi.org/10.1038/cr.2016.23.
268-272. https://doi.org/10.1038/s41586-018-0694-x. 113. Gaiti F, Chaligne R, Gu H, et al. Epigenetic evolution and lineage his-
95. Yost KE, Satpathy AT, Wells DK, et al. Clonal replacement of tumor- tories of chronic lymphocytic leukaemia. Nature. 2019;569(7757):
specific T cells following PD-1 blockade. Nat Med. 2019;25(8):1251- 576-580. https://doi.org/10.1038/s41586-019-1198-z.
1259. https://doi.org/10.1038/s41591-019-0522-3. 114. Pastore A, Gaiti F, Lu SX, et al. Corrupted coordination of epigenetic
96. Goswami S, Walle T, Cornish AE, et al. Immune profiling of human modifications leads to diverging chromatin states and transcriptional
tumors identifies CD73 as a combinatorial target in glioblastoma. Nat heterogeneity in CLL. Nat Commun. 2019;10(1):1874. https://doi.
Med. 2020;26(1):39-46. https://doi.org/10.1038/s41591-019-0694-x. org/10.1038/s41467-019-09645-5.
97. Hartmann FJ, Mrdjen D, McCaffrey E, et al. Single-cell metabolic 115. Shu S, Wu HJ, Ge JY, et al. Synthetic lethal and resistance interac-
profiling of human cytotoxic T cells. Nat Biotechnol. 2020;39: tions with BET Bromodomain inhibitors in triple-negative breast
186–197. https://doi.org/10.1101/2020.01.17.909796. Cancer. Mol Cell. 2020;78(6):1096–1113 e8. https://doi.org/10.
98. Lavin Y, Kobayashi S, Leader A, et al. Innate immune landscape in 1016/j.molcel.2020.04.027.
early lung adenocarcinoma by paired single-cell analyses. Cell. 2017; 116. Satpathy AT, Granja JM, Yost KE, et al. Massively parallel single-cell
169(4):750-765.e17. https://doi.org/10.1016/j.cell.2017.04.014. chromatin landscapes of human immune cell development and
99. Wilting RH, Dannenberg JH. Epigenetic mechanisms in tumorigene- intratumoral T cell exhaustion. Nat Biotechnol. 2019;37(8):925-936.
sis, tumor cell heterogeneity and drug resistance. Drug Resist Updat. https://doi.org/10.1038/s41587-019-0206-z.
2012;15(1-2):21-38. https://doi.org/10.1016/j.drup.2012.01.008. 117. Litzenburger UM, Buenrostro JD, Wu B, et al. Single-cell epigenomic
100. Darwiche N. Epigenetic mechanisms and the hallmarks of cancer: an variability reveals functional cancer heterogeneity. Genome Biol.
intimate affair. Am J Cancer Res. 2020;10(7):1954-1978. 2017;18(1):1-12. https://doi.org/10.1186/s13059-016-1133-7.
101. Ehrlich M. DNA methylation in cancer: too much, but also too little. 118. Grosselin K, Durand A, Marsolier J, et al. High-throughput single-cell
Oncogene. 2002;21(35):5400-5413. https://doi.org/10.1038/sj.onc. ChIP-seq identifies heterogeneity of chromatin states in breast can-
1205651. cer. Nat Genet. 2019;51(6):1060-1066. https://doi.org/10.1038/
102. Audia JE, Campbell RM. Histone modifications and Cancer. Cold s41588-019-0424-9.
Spring Harb Perspect Biol. 2016;8(4):a019521. https://doi.org/10. 119. Maruffi M, Sposto R, Oberley MJ, Kysh L, Orgel E. Therapy for chil-
1101/cshperspect.a019521. dren and adults with mixed phenotype acute leukemia: a systematic
103. Cheng Y, He C, Wang M, et al. Targeting epigenetic regulators for review and meta-analysis. Leukemia. 2018;32(7):1515-1528.
cancer therapy: mechanisms and advances in clinical trials. Signal https://doi.org/10.1038/s41375-018-0058-4.
Transduct Target Ther. 2019;4(1):62. https://doi.org/10.1038/ 120. Stenhouse G, Fyfe N, King G, Chapman A, Kerr KM. Thyroid tran-
s41392-019-0095-0. scription factor 1 in pulmonary adenocarcinoma. J Clin Pathol. 2004;
104. Issa JP, Garcia-Manero G, Giles FJ, et al. Phase 1 study of low-dose 57(4):383-387. https://doi.org/10.1136/jcp.2003.007138.
prolonged exposure schedules of the hypomethylating agent 5-aza- 121. Kim HK, Noh YH, Nilius B, et al. Current and upcoming mitochon-
20 -deoxycytidine (decitabine) in hematopoietic malignancies. Blood. drial targets for cancer therapy. Semin Cancer Biol. 2017;47:154-
2004;103(5):1635-1640. https://doi.org/10.1182/blood-2003-03- 167. https://doi.org/10.1016/j.semcancer.2017.06.006.
0687. 122. Lareau CA, Ludwig LS, Muus C, et al. Massively parallel single-cell
105. Kantarjian H, Oki Y, Garcia-Manero G, et al. Results of a randomized mitochondrial DNA genotyping and chromatin profiling. Nat Bio-
study of 3 schedules of low-dose decitabine in higher-risk technol. 2020. https://doi.org/10.1038/s41587-020-0645-6.
myelodysplastic syndrome and chronic myelomonocytic leukemia. 123. Lo PK, Zhou Q. Emerging techniques in single-cell epigenomics and
Blood. 2007;109(1):52-57. https://doi.org/10.1182/blood-2006-05- their applications to cancer research. J Clin Genomics. 2018;1(1).
021162. https://doi.org/10.4172/JCG.1000103.
106. Issa JP, Kantarjian HM. Targeting DNA methylation. Clin Cancer Res. 124. Kaya-Okur HS, Wu SJ, Codomo CA, et al. CUT&tag for efficient epi-
2009;15(12):3938-3946. https://doi.org/10.1158/1078-0432.CCR- genomic profiling of small samples and single cells. Nat Commun.
08-2783. 2019;10(1):1930. https://doi.org/10.1038/s41467-019-09982-5.
107. Takeshima H, Yoda Y, Wakabayashi M, Hattori N, Yamashita S, 125. Ding L, Ley TJ, Larson DE, et al. HHS public. Access. 2012;481
Ushijima T. Low-dose DNA demethylating therapy induces repro- (7382):506-510. https://doi.org/10.1038/nature10738.Clonal.
gramming of diverse cancer-related pathways at the single-cell level. 126. Gerlinger M, Rowan AJ, Sc B, et al. Intratumor heterogeneity and branched
Clin Epigenetics. 2020;12(1):142. https://doi.org/10.1186/s13148- evolution revealed by multiregion sequencing. N Engl J Med. 2012;366(10):
020-00937-y. 883-892. https://doi.org/10.1056/NEJMoa1113205.Intratumor.
108. Granja JM, Klemm S, McGinnis LM, et al. Single-cell multiomic analy- 127. Navin N, Kendall J, Troge J, et al. Tumour evolution inferred by
sis identifies regulatory programs in mixed-phenotype acute leuke- single-cell sequencing. Nature. 2011;472:90-94. https://doi.org/10.
mia. Nat Biotechnol. 2019;37(12):1458-1465. https://doi.org/10. 1038/nature09807.
1038/s41587-019-0332-7. 128. Hou Y, Song L, Zhu P, et al. Single-cell exome sequencing and mono-
109. LaFave LM, Kartha VK, Ma S, et al. Epigenomic state transitions clonal evolution of a JAK2-negative myeloproliferative neoplasm.
characterize tumor progression in mouse lung adenocarcinoma. Can- Cell. 2012;148:873-885. https://doi.org/10.1016/j.cell.2012.02.028.
cer Cell. 2020;38(2):212-228 e13. https://doi.org/10.1016/j.ccell. 129. Xu X, Hou Y, Yin X, et al. Single-cell exome sequencing reveals
2020.06.006. single-nucleotide mutation characteristics of a kidney tumor. Cell.
110. Guo H, Zhu P, Guo F, et al. Profiling DNA methylome landscapes of 2012;148(5):886-895. https://doi.org/10.1016/j.cell.2012.02.025.
mammalian cells with single-cell reduced-representation bisulfite 130. Wang J, Fan HC, Behr B, Quake SR. Genome-wide single-cell analy-
sequencing. Nat Protoc. 2015;10(5):645-659. https://doi.org/10. sis of recombination activity and de novo mutation rates in human
1038/nprot.2015.039. sperm. Cell. 2012;150(2):402-412. https://doi.org/10.1016/j.cell.
111. Rotem A, Ram O, Shoresh N, et al. Single-cell ChIP-seq reveals cell 2012.06.030.
subpopulations defined by chromatin state. Nat Biotechnol. 2015;33 131. Hughes AEO, Magrini V, Demeter R, et al. Clonal architecture of
(11):1165-1172. https://doi.org/10.1038/nbt.3383. secondary acute myeloid leukemia defined by single-cell sequencing.
112. Hou Y, Guo H, Cao C, et al. Single-cell triple omics sequencing PLoS Genet. 2014;10(7):e1004462. https://doi.org/10.1371/journal.
reveals genetic, epigenetic, and transcriptomic heterogeneity in pgen.1004462.

23
132. Zong C, Lu S, Chapman AR, Xie XS. Genome-wide detection of in older patients with AML. Blood. 2020;135(11):791-803. https://doi.
single-nucleotide and copy-number variations of a single human cell. org/10.1182/blood.2019003988.
Science. 2012;338(6114):1622-1626. https://doi.org/10.1126/ 151. Morita K, Wang F, Jahn K, et al. Clonal evolution of acute myeloid
science.1229164. leukemia revealed by high-throughput single-cell genomics. Nat
133. Vitak SA, Torkenczy KA, Rosenkrantz JL, et al. Sequencing thousands Commun. 2020;11(1):5327. https://doi.org/10.1038/s41467-020-
of single-cell genomes with combinatorial indexing. Nat Methods. 19119-8.
2017;14:302-308. https://doi.org/10.1038/nmeth.4154. 152. Miles LA, Bowman RL, Merlinsky TR, et al. Single-cell mutation anal-
134. Gawad C, Koh W, Quake SR. Single-cell genome sequencing: current ysis of clonal evolution in myeloid malignancies. Nature. 2020;587:
state of the science. Nat Rev Genet. 2016;17(3):175-188. https:// 477-482. https://doi.org/10.1038/s41586-020-2864-x.
doi.org/10.1038/nrg.2015.16. 153. Papaemmanuil E, Rapado I, Li Y, et al. RAG-mediated recombination
135. Chen C, Xing D, Tan L, et al. Single-cell whole-genome analyses is the predominant driver of oncogenic rearrangement in
by linear amplification via transposon insertion (LIANTI). Science. ETV6-RUNX1 acute lymphoblastic leukemia. Nat Genet. 2014;46(2):
2017;356(6334):189-194. https://doi.org/10.1126/science.aak9787. 116-125. https://doi.org/10.1038/ng.2874.
136. Zahn H, Steif A, Laks E, et al. Scalable whole-genome single-cell 154. Gawad C, Koh W, Quake SR. Dissecting the clonal origins of child-
library preparation without preamplification. Nat Methods. 2017;14 hood acute lymphoblastic leukemia by single-cell genomics. Proc
(2):167-173. https://doi.org/10.1038/nmeth.4140. Natl Acad Sci U S A. 2014;111(50):17947-17952. https://doi.org/10.
137. Laks E, McPherson A, Zahn H, et al. Clonal decomposition and DNA repli- 1073/pnas.1420822111.
cation states defined by scaled single-cell genome sequencing. Cell. 2019; 155. Van Den Brink SC, Sage F, Vértesy A,  et al. Single-cell sequencing
179(5):1207-1221.e22. https://doi.org/10.1016/j.cell.2019.10.026. reveals dissociation-induced gene expression in tissue subpopula-
138. Falconer E, Hills M, Naumann U, et al. DNA template strand tions. Nat Methods. 2017;14(10):935-936. https://doi.org/10.1038/
sequencing of single-cells maps genomic rearrangements at high res- nmeth.4437.
olution. Nat Methods. 2012;9(11):1107-1112. https://doi.org/10. 156. Toki MI, Merritt CR, Wong PF, et al. High-Plex predictive marker
1038/nmeth.2206. discovery for melanoma immunotherapy–treated patients using digi-
139. Maria Maggiolini FA, Sanders AD, Shew CJ, et al. Single-cell strand tal spatial profiling. Clin Cancer Res. 2019;25(18):5503-5512.
sequencing of a macaque genome reveals multiple nested inversions https://doi.org/10.1158/1078-0432.ccr-19-0104.
and breakpoint reuse during primate evolution. Genome Res. 2020; 157. Wang F, Flanagan J, Su N, et al. RNAscope: a novel in situ RNA analy-
30(11):1680-1693. https://doi.org/10.1101/gr.265322.120. sis platform for formalin-fixed, paraffin-embedded tissues. J Mol Diagn.
140. Sanders AD, Meiers S, Ghareghani M, et al. Single-cell analysis of 2012;14(1):22-29. https://doi.org/10.1016/j.jmoldx.2011.08.002.
structural variations and complex rearrangements with tri-channel 158. Ihle CL, Provera MD, Straign DM, et al. Distinct tumor microenviron-
processing. Nat Biotechnol. 2020;38(3):343-354. https://doi.org/10. ments of lytic and blastic bone metastases in prostate cancer
1038/s41587-019-0366-x. patients. J Immunother Cancer. 2019;7(1):1-9. https://doi.org/10.
141. Wang Y, Waters J, Leung ML, et al. Clonal evolution in breast cancer 1186/s40425-019-0753-3.
revealed by single nucleus genome sequencing. Nature. 2014;512 159. Goltsev Y, Samusik N, Kennedy-Darling J, et al. Deep profiling of
(7513):155-160. https://doi.org/10.1038/nature13600. mouse splenic architecture with CODEX multiplexed imaging. Cell.
142. Gao R, Davis A, McDonald TO, et al. Punctuated copy number evo- 2018;174(4):968-981.e15. https://doi.org/10.1016/j.cell.2018.07.010.
lution and clonal stasis in triple-negative breast cancer. Nat Genet. 160. Vickovic S, Eraslan G, Salmén F, et al. High-definition spatial trans-
2016;48(10):1119-1130. https://doi.org/10.1038/ng.3641. criptomics for in situ tissue profiling. Nat Methods. 2019;16(10):987-
143. Eirew P, Steif A, Khattra J, et al. Dynamics of genomic clones in 990. https://doi.org/10.1038/s41592-019-0548-y.
breast cancer patient xenografts at single-cell resolution. Nature. 161. Schulz D, Zanotelli VRT, Fischer JR, et al. Simultaneous multiplexed
2015;518(7539):422-426. https://doi.org/10.1038/nature13952. imaging of mRNA and proteins with subcellular resolution in breast
144. Kim C, Gao R, Sei E, et al. Chemoresistance evolution in triple-negative cancer tissue samples by mass cytometry. Cell Syst. 2018;6(1):25-36.
breast Cancer delineated by single-cell sequencing. Cell. 2018;173(4): e5. https://doi.org/10.1016/j.cels.2017.12.001.
879-893.e13. https://doi.org/10.1016/j.cell.2018.03.041. 162. Casasent AK, Schalck A, Gao R, et al. Multiclonal invasion in breast
145. Leung ML, Davis A, Gao R, et al. Single-cell DNA sequencing reveals tumors identified by topographic single cell sequencing. Cell. 2018;
a latedissemination model in metastatic colorectal cancer. Genome 172(1-2):205-217.e12. https://doi.org/10.1016/j.cell.2017.12.007.
Res. 2017;27(8):1287-1299. https://doi.org/10.1101/gr.209973.116. 163. Kim S, Lee AC, Lee HB, et al. PHLI-seq: constructing and visualizing
146. Lohr JG, Adalsteinsson VA, Cibulskis K, et al. Whole-exome cancer genomic maps in 3D by phenotype-based high-throughput
sequencing of circulating tumor cells provides a window into meta- laser-aided isolation and sequencing. Genome Biol. 2018;19:158.
static prostate cancer. Nat Biotechnol. 2014;32(5):479-484. https:// https://doi.org/10.1186/s13059-018-1543-9.
doi.org/10.1038/nbt.2892. 164. Gorris MAJ, Halilovic A, Rabold K, et al. Eight-color multiplex immu-
147. Ni X, Zhuo M, Su Z, et al. Reproducible copy number variation pat- nohistochemistry for simultaneous detection of multiple immune
terns among single circulating tumor cells of lung cancer patients. checkpoint molecules within the tumor microenvironment. J Immunol.
Proc Natl Acad Sci U S A. 2013;110:21083-21088. https://doi.org/ 2018;200(1):347-354. https://doi.org/10.4049/jimmunol.1701262.
10.1073/pnas.1320659110. 165. Cabrita R, Lauss M, Sanna A, et al. Tertiary lymphoid structures
148. Carter L, Rothwell DG, Mesquita B, et al. Molecular analysis of improve immunotherapy and survival in melanoma. Nature.
circulating tumor cells identifies distinct copy-number profiles in 2020;577(7791):561-565. https://doi.org/10.1038/s41586-019-
patients with chemosensitive and chemorefractory small-cell lung 1914-8.
cancer. Nat Med. 2017;23(1):114-119. https://doi.org/10.1038/nm. 166. Helmink BA, Reddy SM, Gao J, et al. B cells and tertiary lymphoid
4239. structures promote immunotherapy response. Nature. 2020;577
149. Pellegrino M, Sciambi A, Treusch S, et al. High-throughput single-cell (7791):549-555. https://doi.org/10.1038/s41586-019-1922-8.
DNA sequencing of acute myeloid leukemia tumors with droplet 167. Merritt CR, Ong GT, Church SE, et al. Multiplex digital spatial profil-
microfluidics. Genome Res. 2018;28(9):1345-1352. https://doi.org/ ing of proteins and RNA in fixed tissue. Nat Biotechnol. 2020;38(5):
10.1101/gr.232272.117. 586-599. https://doi.org/10.1038/s41587-020-0472-9.
150. DiNardo CD, Tiong IS, Quaglieri A, et al. Molecular patterns of 168. Rodriques SG, Stickels RR, Goeva A, et al. Slide-seq: a scalable tech-
response and treatment failure after frontline venetoclax combinations nology for measuring genome-wide expression at high spatial

24
resolution. Science. 2019;363(6434):1463-1467. https://doi.org/10. Genome Biol. 2020;21(1):1-16. https://doi.org/10.1186/s13059-
1126/science.aaw1219. 020-02032-0.
169. Stahl PL, Salmen F, Vickovic S, et al. Visualization and analysis of 181. Slyper M, Porter CBM, Ashenberg O, et al. A single-cell and single-
gene expression in tissue sections by spatial transcriptomics. Science. nucleus RNA-Seq toolbox for fresh and frozen human tumors. Nat
2016;353(6294):78-82. https://doi.org/10.1126/science.aaf2403. Med. 2020;26(5):792-802. https://doi.org/10.1038/s41591-020-
170. Moncada R, Barkley D, Wagner F, et al. Integrating microarray- 0844-1.
based spatial transcriptomics and single-cell RNA-seq reveals tissue 182. Bendall SC, Davis KL, Amir EAD, et al. Single-cell trajectory detec-
architecture in pancreatic ductal adenocarcinomas. Nat Biotechnol. tion uncovers progression and regulatory coordination in human b
2020;38(3):333-342. https://doi.org/10.1038/s41587-019-0392-8. cell development. Cell. 2014;157:714-725. https://doi.org/10.
171. Villacampa EG, Larsson L, Kvastad L, Andersson A, Carlson J, 1016/j.cell.2014.04.005.
Lundeberg J. Genome-wide spatial expression profiling in FFPE tis- 183. Trapnell C, Cacchiarelli D, Grimsby J, et al. The dynamics and regula-
sues. bioRxiv. 2020. https://doi.org/10.1101/2020.07.24.219758 tors of cell fate decisions are revealed by pseudotemporal ordering
172. Wang Z, Portier BP, Gruver AM, et al. Automated quantitative RNA of single cells. Nat Biotechnol. 2014;32:381-386. https://doi.org/10.
in situ hybridization for resolution of equivocal and heterogeneous 1038/nbt.2859.
ERBB2 (HER2) status in invasive breast carcinoma. J Mol Diagn. 2013; 184. Kharchenko PV, Silberstein L, Scadden DT. Bayesian approach to
15(2):210-219. https://doi.org/10.1016/j.jmoldx.2012.10.003. single-cell differential expression analysis. Nat Methods. 2014;11(7):
173. Ke R, Mignardi M, Pacureanu A, et al. In situ sequencing for RNA 740-742. https://doi.org/10.1038/nmeth.2967.
analysis in preserved tissue and cells. Nat Methods. 2013;10(9):857- 185. Grün D, Kester L, Van Oudenaarden A. Validation of noise models
860. https://doi.org/10.1038/nmeth.2563. for single-cell transcriptomics. Nat Methods. 2014;11(6):637-640.
174. Potter N, Ermini L, Papaemmanuil E, et al. Single-cell mutational pro- https://doi.org/10.1038/nmeth.2930.
filing and clonal phylogeny in cancer. Genome Res. 2013;23:2115- 186. Pagès F, Mlecnik B, Marliot F, et al. International validation of the
2125. https://doi.org/10.1101/gr.159913.113.23. consensus Immunoscore for the classification of colon cancer: a
175. Ediriwickrema A, Aleshin A, Reiter JG, et al. Single-cell mutational profil- prognostic and accuracy study. Lancet. 2018;391(10135):2128-
ing enhances the clinical evaluation of AML MRD. Blood Adv. 2020; 2139. https://doi.org/10.1016/S0140-6736(18)30789-X.
4(5):943-952. https://doi.org/10.1182/bloodadvances.2019001181. 187. Bodenmiller B. Multiplexed epitope-based tissue imaging for discov-
176. Nguyen QH, Pervolarakis N, Nee K, Kessenbrock K. Experimental ery and healthcare applications. Cell Syst. 2016;2(4):225-238.
considerations for single-cell RNA sequencing approaches. Front Cell https://doi.org/10.1016/j.cels.2016.03.008.
Dev Biol. 2018;6:108. https://doi.org/10.3389/fcell.2018.00108.
177. Hodge RD, Bakken TE, Miller JA, et al. Conserved cell types with
divergent features in human versus mouse cortex. Nature. 2019;
573:61-68. https://doi.org/10.1038/s41586-019-1506-7.
178. Barkas N, Petukhov V, Nikolaeva D, et al. Joint analysis of heteroge-
neous single-cell RNA-seq dataset collections. Nat Methods. 2019;
16:695-698. https://doi.org/10.1038/s41592-019-0466-z.
179. Mereu E, Lafzi A, Moutinho C, et al. Benchmarking single-cell RNA-
sequencing protocols for cell atlas projects. Nat Biotechnol. 2020;38
(6):747-755. https://doi.org/10.1038/s41587-020-0469-4.
180. Massoni-Badosa R, Iacono G, Moutinho C, et al. Sampling
time-dependent artifacts in single-cell genomics studies.

25
Identification of a tumor–specific gene
regulatory network in human B-cell lymphoma

Introduction
Simultaneous readout of transcriptomic and epigenomic data from the same cell at single cell resolution allows for
direct reconstruction of cell type–specific gene regulatory networks that does not rely on inference or assumptions to
tie the two data types together. Here, we show how multiomic analysis of paired RNA-seq and ATAC-seq data from
the same single cells using Chromium Single Cell Multiome ATAC + Gene Expression enables direct linkage of
differentially accessible DNA regions to proximal differentially expressed genes to identify putative regulatory
targets. As a result, you can answer questions not only about what genes are expressed in a single cell, but how
expression is regulated through associated open chromatin regions. In a diffuse small B-cell lymphoma sample, we
confirmed Paired Box 5 (PAX5) as an important regulator in tumor B cells and identified a network of potential
PAX5 target genes.

Highlights
• Distinguish tumor versus normal cells in a heterogeneous sample
• Reconstruct cell type–specific gene regulatory network
• Confirm PAX5 as a critical regulator specific to tumor B cells
• Identify putative target genes downstream of PAX5

[Figure 1 workflow: Sample prep (prepare nuclei suspension) → GEM generation → Library construction → Sequencing → Data processing → Data visualization]

Figure 1. Experimental methods for nuclei isolation and multiomic data generation. Flash-frozen intra-abdominal lymph
node tumor, with pathologist annotation of diffuse small B-cell lymphoma tissue, was acquired from BioIVT Asterand®.
Nuclei were isolated following the Nuclei Isolation from Complex Tissues for Single Cell Multiome ATAC + Gene Expression
Sequencing Demonstrated Protocol (CG000375). Isolated nuclei were flow sorted before permeabilization. Nuclei were
transposed in bulk before single nuclei encapsulation in GEMs (Gel Bead-in-emulsion), where DNA fragments and the 3’
ends of mRNA were barcoded. Paired ATAC and gene expression libraries were generated from 14,000 total nuclei as
described in the Chromium Next GEM Single Cell Multiome ATAC + Gene Expression User Guide (CG000338 Rev A) and
sequenced on an Illumina NovaSeq™ 6000 v1.5.

[Figure 2 panels. A: UMAP projections (UMAP 1 versus UMAP 2) titled "Gene Expression", "ATAC", "Gene Expression, T cells",
and "ATAC, T cells". Cell type legend: B, Tumor B, Tumor B cycling, T, T cycling, Mono, pDC, Fibroblasts, Stromal cells, NA.
T cell subtype legend: CD4 Naive, CD4 memory, CD4 Tfh, CD4 cytotoxic, Treg, CD8 Naive, CD8 memory, CD8 Exhausted,
Proliferating T, NK, NKT. B: UMAP projections showing expression of MS4A1, BANK1, and PAX5, with tumor B cells labeled.]

Figure 2. Simultaneous measurement of gene expression and open chromatin profiles from the same single nuclei enables
clustering based on either modality. A. Shown are clustering and manual annotation based on gene expression for all 14,000
nuclei (left); gene expression-derived annotations layered on ATAC projections (middle); and the gene expression plot on the
left restricted to the T-cell populations (right). B. Highlighted are expression levels of select genes, including MS4A1, a canonical
B-cell marker (left); BANK1, an attenuator of BCR activation pathway that is repressed in tumor cells relative to normal B cells
(middle); and PAX5, required for B-cell differentiation (right).

[Figure 3 panels. A: Feature linkage. B: Annotate peaks linked to DEGs.]

Figure 3. Computational strategy for identification of cell type–specific gene regulatory networks. A. In 10x Genomics Cell
Ranger ARC software, feature linkages are defined as pairs of genomic features, such as peaks and genes, that exhibit signifi-
cant correlation in their chromatin accessibility and transcript level, respectively, across cells. Feature linkages can be positively
or negatively correlated. For example, an open enhancer region may have a positive correlation with gene expression of its
associated transcript (blue), while the binding of a repressor would result in a negatively correlated feature linkage (red). The
greater the correlation between open chromatin signal and gene expression, the taller the arc. B. To identify a gene regulatory
network in tumor B cells, genes were first filtered based on significant transcriptional upregulation in tumor B cells relative to
normal B cells (p < 10⁻²⁰), resulting in 198 differentially expressed genes (DEGs, green). Peaks associated with DEGs (green) were
identified using feature linkages. Tumor B cell–specific enriched motifs were then identified using DEG-linked peaks. Enriched
motifs and linked upregulated genes were used to define a B cell lymphoma–specific gene regulatory network (Figure 4).
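
The feature linkage idea described in the Figure 3 caption can be illustrated with a small sketch. It treats a linkage as a peak and gene pair whose accessibility and expression are correlated across cells, keeping both positive and negative correlations. This is a simplified illustration only: it uses a plain Pearson correlation with an arbitrary cutoff, which is an assumption and not the procedure Cell Ranger ARC implements. The function names, the `min_abs_corr` parameter, and the toy values are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant vector carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def feature_linkages(peak_access, gene_expr, min_abs_corr=0.5):
    """Return (peak, gene, r) for pairs whose |r| passes the cutoff.

    peak_access and gene_expr map a feature name to per-cell values,
    with cells in the same order in both dictionaries.
    """
    links = []
    for peak, access in peak_access.items():
        for gene, expr in gene_expr.items():
            r = pearson(access, expr)
            if abs(r) >= min_abs_corr:
                links.append((peak, gene, r))
    return links


# Toy values for six cells (illustrative only, not data from the study).
peak_access = {
    "peak_enhancer": [0, 1, 2, 3, 4, 5],        # opens as expression rises
    "peak_repressor_site": [5, 4, 3, 2, 1, 0],  # opens as expression falls
    "peak_unrelated": [1, 3, 1, 3, 1, 3],       # weakly related to expression
}
gene_expr = {"GENE_A": [0, 2, 4, 6, 8, 10]}

links = feature_linkages(peak_access, gene_expr)
for peak, gene, r in links:
    sign = "positive" if r > 0 else "negative"
    print(f"{peak} -> {gene}: r = {r:.2f} ({sign} linkage)")
```

In this toy setup the enhancer-like peak gives a positive linkage, the repressor-like peak a negative one, and the weakly correlated peak is dropped by the cutoff.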

What to look for
Since mRNA and ATAC data are generated from the same cells, cell-type annotations can be transferred from one
modality to the other (Figure 2A, middle). In addition to the identification of B cells, monocytes, and T-cell
subtypes using canonical cell markers like the B-cell marker MS4A1, tumor B cells were distinguishable from normal
B cells based on upregulated CD40 expression (data not shown) and reduced BANK1 (Figure 2B). PAX5 was
significantly upregulated in tumor B cells relative to normal B cells (Figure 2B), and has previously been identified as
a core regulator of chronic lymphocytic leukemia (CLL) (Ott et al., 2018).

Paired gene expression and open chromatin signals pave the way for high-confidence gene regulatory network
predictions using feature linkages, which are calculated automatically in Cell Ranger ARC (Figure 3A). Feature
linkages help build putative gene regulatory networks by providing correlated gene expression and open chromatin
regions across the genome. To identify tumor B cell–specific gene regulatory networks, we first annotated feature
linkages by genes upregulated in tumor B cells to identify peaks that were potential drivers of differential
expression. We then identified motifs enriched in these peaks relative to a set of matched background motifs within
tumor B cells (Figure 3B). Using this method, we found that the PAX1 motif was the most enriched (Figure 4).

PAX1 and PAX5 motifs are highly similar; however, PAX1 is not expressed in tumor B cells, while PAX5 is highly
expressed (Figure 4). Therefore, it is likely the PAX5 transcription factor is binding the identified PAX1 motif. This
inference is only possible with paired gene expression and open chromatin information from the same cells.

To understand the role of PAX5 in tumor B cells, we zoomed in on the PAX5 locus, which is differentially
expressed between B cells and tumor B cells (Figure 5). Expression of PAX5 is highly correlated with open PAX5
motif sites in a previously identified super-enhancer, suggesting autoregulation (Figure 5, dashed box). Additional
feature linkages contribute further to the reconstruction of a putative tumor B cell–specific gene regulatory
network, and suggest PAX5 may also regulate the immune transcription factor genes NFATC1, TCF4, IKZF1, and
IRF8 (Figure 4). The importance of PAX5 and its position as a key genetic regulator in tumor B cells is consistent
with previously published results showing that, of 147 transcription factors tested, loss of PAX5 had the
greatest effect on cell proliferation in a CLL cell line (Ott et al., 2018). While confirmation of individual links in our
predicted gene regulatory network requires functional tests, the confidence in regulatory connections is greatly
increased by joint measurement of mRNA and ATAC data.
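
The motif enrichment step described above (motifs over-represented in DEG-linked peaks relative to a background) can be sketched as follows. The text does not say which statistic was used, so this sketch assumes a one-sided hypergeometric test of DEG-linked peaks against a set of background peaks. The function names, motif names, and peak sets are hypothetical toy values, not data from the study.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    total = comb(population, draws)
    upper = min(successes, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def motif_enrichment(motif_peaks, foreground, background):
    """Test each motif for over-representation in the foreground peaks.

    motif_peaks maps a motif name to the set of peaks containing it.
    Returns {motif: (foreground hits, total hits, p-value)}.
    """
    universe = foreground | background
    results = {}
    for motif, peaks in motif_peaks.items():
        hits_fg = len(peaks & foreground)
        hits_all = len(peaks & universe)
        p = hypergeom_sf(hits_fg, len(universe), hits_all, len(foreground))
        results[motif] = (hits_fg, hits_all, p)
    return results


# Toy example: 10 DEG-linked peaks and 40 background peaks (made up).
foreground = {f"fg{i}" for i in range(10)}
background = {f"bg{i}" for i in range(40)}
motif_peaks = {
    # Present in 8 of 10 foreground peaks but only 2 of 40 background peaks.
    "MOTIF_X": {f"fg{i}" for i in range(8)} | {"bg0", "bg1"},
    # Present at the same 20% rate in foreground and background.
    "MOTIF_Y": {"fg0", "fg1"} | {f"bg{i}" for i in range(8)},
}

results = motif_enrichment(motif_peaks, foreground, background)
ranked = sorted(results, key=lambda m: results[m][2])
for motif in ranked:
    hits_fg, hits_all, p = results[motif]
    print(f"{motif}: {hits_fg}/{hits_all} hits in foreground, p = {p:.3g}")
```

Ranking motifs by p-value places the motif concentrated in the DEG-linked peaks first, which is the role the PAX1 motif plays in the analysis described in the text.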

[Figure 4 panel: motifs (PAX5, ONECUT1, PAX1, CUX1, PAX9, CUX2) plotted against target genes, grouped as immune TFs and other
immune genes. Side plots: motif enrichment and log10 UMI. Color scale: linkage significance (0 to 200).]

Figure 4. Feature linkages help build a tumor-specific gene regulatory network. The table summarizes significant feature
linkages between motifs in the PAX/CUX/ONECUT family and a selection of immune-related transcription factors (TFs) and
other immune genes that are differentially expressed in tumor B cells. At far left, the blue line plot shows motif enrichment
scores, calculated using the analysis outlined in Figure 3. Gene expression levels of the transcription factors expected to bind
each motif are indicated in the adjacent bar graph. For every differentially expressed gene–PAX/CUX/ONECUT motif pair, the
significance of the most significant feature linkage is indicated by a colored square.
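
The reasoning behind pairing motif enrichment with transcription factor expression can be sketched in a few lines: when several factors could bind an enriched motif, only those actually expressed in the cell population remain plausible regulators. The function name, the `min_expr` cutoff, the motif-to-factor grouping, and the expression values below are hypothetical; they only mirror the qualitative statement in the text that PAX1 is not expressed in tumor B cells while PAX5 is highly expressed.

```python
def expressed_candidates(motif_to_tfs, mean_expr, min_expr=1.0):
    """Keep, for each motif, only candidate TFs expressed at or above min_expr."""
    return {
        motif: [tf for tf in tfs if mean_expr.get(tf, 0.0) >= min_expr]
        for motif, tfs in motif_to_tfs.items()
    }


# Hypothetical grouping: factors with highly similar motifs share one entry.
motif_to_tfs = {"PAX1_motif": ["PAX1", "PAX5"]}

# Made-up mean expression in tumor B cells (arbitrary units).
mean_expr = {"PAX1": 0.0, "PAX5": 12.0}

candidates = expressed_candidates(motif_to_tfs, mean_expr)
print(candidates)
```

With these toy inputs, only PAX5 survives as a candidate for the PAX1 motif, matching the inference drawn from the paired data.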

Figure 5. Loupe Browser enables visualization of feature linkages. Positively correlated feature linkages are denoted by
arcs at top. Highlighted by the dotted box is a highly significant feature linkage between PAX5 and a previously annotated
CLL super-enhancer that is depicted in black (Ott et al., 2018). Below the illustrated feature linkages are open chromatin peaks
identified for each cell cluster across a 0.3 Mb region. Annotated cell types are color coded. On the right are plots showing the
expression level of PAX5 (top) and accessibility of the linked super-enhancer (bottom) for each annotated cell type. Tumor
B cells (blue), in contrast to normal B cells (red), have elevated PAX5 expression and open chromatin at this super-enhancer.
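
Feature linkages of the kind shown in Figures 4 and 5 pair a gene with a chromatin peak whose accessibility tracks the expression of that gene across cells. The Python sketch below illustrates only that core idea with a plain Pearson correlation. It is not the vendor's linkage algorithm, which this document does not describe; the function names (`pearson`, `link_peaks_to_gene`), the 0.5 cutoff and the six-cell toy values are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant feature carries no linkage signal
    return cov / math.sqrt(vx * vy)


def link_peaks_to_gene(gene_expr, peak_access, min_abs_r=0.5):
    """Return {peak_name: r} for peaks whose accessibility correlates with the gene.

    gene_expr   : per-cell expression of one gene
    peak_access : {peak_name: per-cell accessibility}, same cell order
    min_abs_r   : hypothetical cutoff on |r|; not a value from the source
    """
    links = {}
    for peak, access in peak_access.items():
        r = pearson(gene_expr, access)
        if abs(r) >= min_abs_r:
            links[peak] = r
    return links


# Toy data for six cells (hypothetical values, not from the lymphoma dataset).
gene_expr = [0.1, 0.2, 0.1, 2.5, 3.0, 2.8]
peaks = {
    "enhancer_A": [0, 0, 1, 4, 5, 4],  # opens in the cells that express the gene
    "peak_B": [2, 3, 2, 3, 2, 3],      # fluctuates independently of expression
}
links = link_peaks_to_gene(gene_expr, peaks)
print(links)
```

On this toy input only `enhancer_A` passes the cutoff. A real analysis would also need a significance estimate for each linkage, as the color scale in Figure 4 implies, which this sketch leaves out.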

Explore what you can do

Chromium Single Cell Multiome ATAC + Gene Expression helps you identify the critical regulators and pathways behind cell state. Putative gene regulatory networks can be built based on correlated gene expression and open chromatin sites with greater accuracy and confidence than would be possible with a single modality. At the same time, the identity of likely transcriptional regulators can be constrained by both expression level and motif availability. Multiomic readout at the transcriptional and epigenetic levels, particularly from the same single cell, takes much of the guesswork out of network reconstruction based on gene expression alone, enabling a deeper understanding of the molecular mechanisms underpinning disease progression, developmental differentiation, and therapeutic response.

Resources

To explore the dataset further, download the data here:
https://support.10xgenomics.com/single-cell-multiome-atac-gex/datasets/1.0.0/lymph_node_lymphoma_14k

References

Ott CJ, et al. Enhancer Architecture and Essential Core Regulatory Circuitry of Chronic Lymphocytic Leukemia. Cancer Cell. 34: 982–995, 2018.

Contact us
10xgenomics.com | info@10xgenomics.com
© 2021 10x Genomics, Inc. FOR RESEARCH USE ONLY. NOT FOR USE IN DIAGNOSTIC PROCEDURES.
LIT000110 - Rev A - Data Spotlight - Tumor–specific gene regulatory network in human B-cell lymphoma

SPECIAL FEATURE REVIEW

Recent advances in single-cell multimodal analysis to study immune cells

Raymond HY Louie & Fabio Luciani
School of Medical Sciences, The Kirby Institute, University of New South Wales (UNSW), Sydney, NSW, Australia

Abstract
Recent advances in single-cell technologies have enabled the profiling of the genome, epigenome, transcriptome and proteome, along with temporal and spatial information of individual cells. These technologies have provided unique opportunities to understand mechanisms underpinning the immune system, such as characterizations of the molecular cell state, how the cell state evolves along its lineage and the impact of spatial location on cell state. In this review, we discuss how these mechanisms have been studied through recent advances in single-cell multimodal technologies.

Keywords
cell state, cell–cell interaction, clonal analysis, immune cells, lineage, multimodal analysis, pseudotime, single-cell technology, temporal analysis

Correspondence
Fabio Luciani, School of Medical Sciences and the Kirby Institute, University of New South Wales (UNSW), Sydney, NSW 2052, Australia.
E-mail: luciani@unsw.edu.au

Received 5 September 2020; Revised 30 October, 24 November and 9 December 2020; Accepted 9 December 2020

doi: 10.1111/imcb.12432

Immunology & Cell Biology 2021; 99: 157–167

INTRODUCTION

Recent advances in single-cell technology have made it possible to simultaneously extract different types of information, or “modalities,” from the same single cell. These modalities can arise from the genome, epigenome, transcriptome and proteome (Figure 1a). In each of these ome-layers, different modalities exist, such as mutations and copy number variations at the genome layer, DNA methylation and chromatin accessibility at the epigenome layer and unspliced and spliced messenger RNA (mRNA) at the transcriptome layer. Single-cell multimodal analysis has been used for different immunological applications. For example, a common application is to characterize the molecular cell state, which can be described by a single modality, for example, the expression of certain genes, or by a combination of modalities spanning across the genome, epigenome, transcriptome and proteome (Figure 1a).

Single-cell multimodal analysis can also be used to describe how the molecular cell state evolves along different stages of cellular differentiation. This can be achieved through combining “ome-layers” and temporal modalities, thus capturing the information related to the ordering of cells at different stages of differentiation. Combining ome-layer and temporal modalities can characterize how the molecular state of an immune cell evolves along its lineage, from hematopoietic stem cells (HSCs) in the bone marrow to its cell fate. A cell’s lineage is created by the developmental history of the cell, with each cell belonging to the same or sister clones.

Single-cell multimodal analysis can also inform on how molecular cell state is location dependent, which requires spatial modalities. These modalities include (1) the spatial location of the cell in the body or (2) which cells are interacting with each other. Cell-to-cell interactions can be determined by examining receptor–ligand pair interactions and are important, as neighboring cells can modulate cell function through these interactions.

Single-cell multiple modalities can be obtained either experimentally or bioinformatically. Experimental modalities can be obtained through gathering cytometric

[Figure 1 graphic. (a) Cellular components: genome (DNA mutations), epigenome (chromatin accessibility, methylation), transcriptome (RNA), surface proteome and intracellular proteome. (b) Single-cell multimodal applications: molecular cell state (heterogeneous population, mode 1 versus mode 2), temporal (marker expression, e.g. gene or protein, over time) and spatial (location in body, cell–cell interaction).]

Figure 1. Single-cell multimodal analysis: components and applications. (a) Cellular components of single-cell multimodal analysis. (b)
Applications of single-cell multimodal analysis to molecular cell state, temporal evolution and spatial analysis.

information before a destructive assay, separation of cellular components or a conversion of cellular information into a common molecular format.1 Detailed descriptions of these methods are given in an excellent review,1 and will not be discussed further. Bioinformatic tools can also be used to extract multiple modalities. The key difference from the experimental approach is that these modes are extracted from data generated from the same assay, where the data are typically sequencing reads. For example, at the transcriptomics layer, reads aligned to a transcriptome can be processed bioinformatically to yield unspliced and spliced RNA.2 Data obtained at one layer can also be used to computationally predict modalities at another layer. For example, transcriptomic data have been used to predict cell-to-cell interactions at the spatial layer,3,4 the temporal ordering of cells5,6 and the future states of cells2 at the temporal layer.

In this review, we will discuss recent single-cell multimodal applications to human or mouse immune cells, in contrast to broad reviews which focus on general applications of single-cell multimodal analysis.7 We define single-cell multimodal analysis as the analysis of data sets arising from at least two modalities obtained from the same cell, as opposed to bioinformatically

integrated data sets arising from different samples (addressed in previous reviews1). As summarized in Table 1, we will review recent advances demonstrating the impact of single-cell multimodal analysis in understanding the molecular cell state, temporal evolution and spatial location of immune cells (Figure 1b). Finally, we will discuss future research opportunities.

MOLECULAR CELL STATE

The molecular state of an immune cell can be characterized by a combination of modalities from the genome, epigenome, transcriptome and proteome (Figure 1a). A common application of multimodal information is to isolate cells with a certain state using one modality, and then examine the cell state of these isolated cells in another modality. This process is sometimes repeated multiple times at different modalities. For example, surface protein markers have been traditionally used to first isolate or sort cells by fluorescence-activated cell sorting, followed by analysis using gene expression, immune receptors, chromatin accessibility regions or combinations of these modalities.

One of the key advantages of single-cell analysis is to dissect the cellular and molecular heterogeneity in a tissue or sample, and even to identify subsets within the same cell type. The identification of cell states using multimodal analysis has been applied to analyze immune cells in healthy and disease samples, including pathogen infection, autoimmune and cancer samples.

Identification of cell state in healthy samples

Single-cell multimodal analysis can be used to isolate cell subsets and characterize their molecular signatures from healthy samples, which can then be used as a baseline reference when comparing with immune cells from disease samples. For example, a recent study explored T-cell composition in lymphoid and nonlymphoid tissues from both healthy humans and mice.8 By combining single-cell gene expression with T-cell receptor (TCR) sequences, this study showed distinctive signatures between regulatory and memory subsets across lymphoid and nonlymphoid tissues, and also similar subsets of regulatory T cells across humans and mice. Unexpectedly, this integrated analysis also revealed that the same T-cell clones (i.e. with identical TCR) could be identified in lymphoid and nonlymphoid samples, thus suggesting migration of regulatory cells between organs.

Single cells can be separated using high-purity fluorescence-activated cell sorting into wells, after which mRNA or DNA is extracted for single-cell analyses. This is the case for plate-based approaches such as Smart-seq2 or similar protocols, as previously reviewed.9 However, these methods are laborious and only applicable to small cell numbers. New methods are available to isolate single cells at high throughput, for instance, utilizing cellular barcodes and unique molecular identifiers [e.g. microfluidics technology (10x Chromium) or nano plates (Rhapsody)].9 These methods require demultiplexing of individual cells, which is performed bioinformatically. While these approaches were first developed to perform single-cell RNA sequencing (scRNA-seq), more recently they have also been developed to perform multimodal analyses. For example, Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE-Seq)10 and AbSeq11 are two technologies which can simultaneously extract extracellular (surface) protein and gene expression in the same cell. These technologies have been used to explore heterogeneous populations in both healthy and disease samples.10–12 For example, CITE-Seq10 was applied in combination with 10x Chromium to profile cord blood mononuclear cells and successfully identify natural killer cells based on the CD16 and CD56 surface markers, after which gene expression analysis revealed differentially expressed signatures of natural killer subtypes between healthy and disease samples, including cytotoxic markers such as GZMB, GZMK and PRF1.

Although technologies such as CITE-Seq and AbSeq allow simultaneous measurements of surface protein and gene expression, extracting both intracellular protein and gene expression within the same single cell remains largely unexplored. This is because these measurements require permeabilization of the cell membrane, which may result in cell death, thus impairing the possibility to utilize current approaches for combining intracellular protein expression quantification with other modalities, such as scRNA-seq. This roadblock has been recently addressed by intracellular staining and sequencing (INs-seq),13 which permits the measurement of both intracellular protein and mRNA. INs-seq was applied to several immune subsets, including dendritic cells, myeloid cells and T cells. For the latter, intracellular quantification of the transcription factors FOXP3, TCF7 and ID2 in combination with scRNA-seq data revealed gene modules associated with these transcription factors; for example, TCF7+ cells had gene modules associated with a naïve phenotype (CCR7, SELL and LEF1), whereas ID2+ cells revealed genes related to cytotoxicity (GNLY, GZMA/B, PRF1).

Identification of cell state in diseases

Identifying cell states using multimodal analysis can lead to the discovery of novel correlates of disease, clinical

Table 1. Overview of the current applications of single-cell multimodal analysis to study immune cells

Applications to immunology | Genome | Epigenome | Transcriptome | Targeted proteins | Component details | Reference
Molecular cell state | - | - | ✓ | ✓ | Surface protein (sorting) + mRNA | 8
Molecular cell state | - | ✓ | - | ✓ | Surface protein (sorting) + TF binding + chromatin accessibility + clone (TCR) | 29
Molecular cell state | - | - | ✓ | ✓ | Surface protein (barcode) + mRNA | 10,51
Molecular cell state | - | - | ✓ | ✓ | Surface protein (barcode) + mRNA + clone (BCR and TCR) | 26
Molecular cell state | - | - | ✓ | ✓ | Intracellular protein (sorting) + mRNA | 13
Molecular cell state | - | - | ✓ | - | mRNA + clone (TCR) | 15–19
Molecular cell state | ✓ | - | ✓ | ✓ | Surface protein (sorting) + mRNA + somatic mutations + clone (BCR) | 27
Molecular cell state | ✓ | - | ✓ | ✓ | Surface protein (sorting) + mRNA + somatic mutations | 30
Molecular cell state | - | ✓ | - | ✓ | Surface protein (sorting) + TF binding + chromatin accessibility + clone (TCR) | 29
Temporal | - | - | ✓ | ✓ | Surface protein (sorting) + mRNA + pseudotime (mRNA) | 5,6
Temporal | - | ✓ | - | ✓ | Surface protein (sorting) + chromatin accessibility + pseudotime (chromatin accessibility) | 36
Temporal | - | - | ✓ | - | mRNA + pseudotime (mRNA) | 35,62
Temporal | - | - | ✓ | - | mRNA + pseudotime (mRNA) + clone (TCR) | 44
Temporal | - | - | ✓ | ✓ | Surface protein (sorting) + mRNA + clone (barcode) | 38
Temporal | - | - | ✓ | ✓ | Surface protein (sorting) + mRNA + clone (genetics) | 63
Temporal | - | ✓ | ✓ | ✓ | Surface protein (sorting) + mRNA + chromatin accessibility + clone (mitochondrial DNA) | 34
Temporal | - | ✓ | - | ✓ | Surface protein (sorting) + chromatin accessibility + clone (mitochondrial DNA) | 39
Temporal | - | - | ✓ | ✓ | Surface protein (sorting) + mRNA + clone (TCR) | 33
Temporal | - | - | ✓ | ✓ | Surface protein (sorting) + mRNA + clone (BCR) | 37
Temporal | - | - | ✓ | ✓ | Surface protein (sorting) + mRNA + clone (Ag-specific TCR) | 64,65
Spatial | - | - | ✓ | ✓ | Surface protein (sorting) + mRNA + spatial (cell location) | 41
Spatial | - | - | ✓ | - | mRNA + spatial (cell–cell) | 3,4,43,44

BCR, B-cell receptor; mRNA, messenger RNA; TCR, T-cell receptor; TF, transcription factor.
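
The temporal rows of Table 1 rely on pseudotime, which orders cells along a trajectory in a reduced-dimensional embedding of their profiles. As a rough illustration, the Python sketch below projects cells onto the first principal axis of their expression profiles and rescales the projection to the range 0 to 1. It is a deliberately minimal stand-in and not any of the published trajectory algorithms; the function names, the use of a single linear axis, the root-cell convention and the toy matrix are all assumptions.

```python
import math


def first_principal_axis(centered, n_iter=200):
    """Leading eigenvector of the sample covariance matrix, by power iteration."""
    n, d = len(centered), len(centered[0])
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Asymmetric start vector, to avoid starting orthogonal to the leading axis.
    v = [float(j + 1) for j in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance in the data
        v = [x / norm for x in w]
    return v


def pseudotime(cells, root_index=0):
    """Assign each cell a value in [0, 1] along the first principal axis.

    cells      : list of per-cell expression vectors (same gene order)
    root_index : cell assumed to sit at the start of the trajectory
    """
    n, d = len(cells), len(cells[0])
    means = [sum(c[j] for c in cells) / n for j in range(d)]
    centered = [[c[j] - means[j] for j in range(d)] for c in cells]
    axis = first_principal_axis(centered)
    scores = [sum(c[j] * axis[j] for j in range(d)) for c in centered]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * n
    scaled = [(s - lo) / (hi - lo) for s in scores]
    # The sign of a principal axis is arbitrary, so orient it from the root cell.
    if scaled[root_index] > 0.5:
        scaled = [1.0 - s for s in scaled]
    return scaled


# Toy matrix (hypothetical): gene 1 falls and gene 2 rises with differentiation,
# gene 3 is roughly constant. Cell 0 is taken as the stem-like root.
cells = [
    [5.0, 0.2, 1.0],
    [1.0, 4.1, 1.1],
    [3.9, 1.0, 0.9],
    [2.1, 3.0, 1.0],
    [3.0, 2.1, 1.2],
]
pt = pseudotime(cells, root_index=0)
order = sorted(range(len(pt)), key=lambda i: pt[i])
print(order)
```

A single linear axis cannot represent branching, so this sketch would not recover several differentiation trajectories from one root; it only shows the embed-then-order logic.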

parameters and outcome. For example, a study performed proteomic and transcriptomic analysis using CITE-Seq and scRNA-seq from the peripheral blood mononuclear cells of healthy individuals vaccinated with influenza or yellow fever vaccine.14 This analysis revealed a distinctive baseline signature across low and high responders following vaccination. Within each cell type identified by CITE-Seq protein data, gene expression was used to identify significant differences between low and high responders within the plasmacytoid dendritic cell and lymphocyte clusters, suggesting that people who respond well to vaccines have a distinct activation status of cells at baseline (i.e. before vaccination).

Single-cell multimodal analysis has also been utilized to simultaneously study gene expression and clonal expansion of T cells and B cells. For instance, gene expression and immune receptor sequencing from both of these subsets were simultaneously measured from peripheral blood mononuclear cells of patients with metastatic melanoma treated with anti-CTLA-4 and anti-PD-1 immunocheckpoint blockade.15 By employing machine learning techniques, the authors of this study showed that a clonally expanded subset of peripheral CD8+ T cells was associated with a long-term treatment response. Single-cell gene expression and immune receptor sequencing have also been applied to discover new cell states in cancer such as hepatocellular carcinoma, colorectal cancer and lung cancer,16–18 as well as in tumor-infiltrating T cells in the context of novel immunocheckpoint blockade therapies (e.g. in melanoma).19

In the case of viral infections, single-cell multimodal analysis has proven extremely useful in the identification of viral-specific T cells and B cells. These cells are generally found in low numbers within the pool of circulating and resident cells, which poses challenges for their identification and separation for molecular and phenotypic analyses. Single-cell analysis has provided a means to accurately characterize rare cell populations.20 Several teams, including ours, have applied single-cell multimodal analyses to separate viral-specific CD8+ T cells using tetramers and then utilized index sorting and scRNA-seq (Smart-seq2) to simultaneously identify their gene expression and full-length TCR in individuals infected with hepatitis C virus.21,22 These analyses were then used to identify the active and resting subsets within these viral-specific responses, along with their clonal expansion. Similar applications have also been utilized to study chronic HIV infection, for instance, to demonstrate the existence of HIV-specific CD8+ T cells that recognize epitopes within the HLA-II instead of class I,23 and influenza-specific CD8+ T cells,24 to reveal evolving molecular signatures of influenza-specific CD8+ T cells across different stages of infection.

The importance of single-cell multimodal analysis has led to several recent studies of coronavirus disease 2019 (COVID-19). Single-cell analysis of both gene expression profile and immune receptor sequencing has also been performed on bronchoalveolar lavage fluids from patients with mild or severe disease.25 This analysis revealed that patients with mild COVID-19 disease were characterized by highly clonally expanded CD8+ T cells, and that proinflammatory monocyte-derived macrophages were abundant in the bronchoalveolar lavage fluid from severe COVID-19 cases. The use of proteomics, gene expression and clonal information has also been investigated.26 Here, surface protein using CITE-Seq, in addition to scRNA-seq and B-cell receptor and TCR information, was used to investigate the peripheral blood mononuclear cells of COVID-19 patients. These authors showed that a pre-exhaustion phenotype in HLA-DR+CD38+-activated T cells and an anti-inflammatory signature in monocytes are associated with progressive disease, whereas a TCR and B-cell receptor analysis revealed a skewed clonal distribution of CD8+ T- and primary B-cell response.

Single-cell multimodal analyses have been recently applied for the first time in rare pathogenic B cells secreting autoantibodies in the context of Sjögren syndrome.27 In this study, B cells were first sorted as CD19+CD27+IgD− memory cells from patients with Sjögren syndrome, to isolate clonally related cells responsible for autoantibodies associated with cryoglobulinemic vasculitis. By utilizing single-cell genome and transcriptome sequencing,28 full-length gene expression data from each cell were analyzed with VDJPuzzle22 to reconstruct the full-length heavy and light immunoglobulin chains of B cells secreting autoantibodies, thus demonstrating the expansion of a single “rogue” clone dominating the observed phenotype. Single-cell DNA sequencing was then utilized to identify lymphoma driver somatic mutations present only within the rogue clone of autoantibody-forming B cells. This study provided the first direct evidence that somatic mutations drive loss of tolerance and disease pathogenesis.

Single-cell multimodal analysis has also been useful to investigate the epigenetic profile of T-cell subsets and their clonal expansion in the context of leukemia.29 By combining assay for transposase-accessible chromatin using sequencing with TCR sequencing, this study first identified regulatory elements and transcription factors associated with each canonical T-cell subset in healthy donors. Surprisingly, this study found that the epigenetic profiles of canonical T-cell subsets form a continuum of states, suggesting significant regulatory variability within cell surface marker-defined subpopulations. By applying this approach to T cells derived from leukemia patients

the authors identified the state of abnormal clones, hence determining the mechanisms driving disease. In a separate study,30 mutations from scRNA-seq data were used to identify and isolate three clones in a bone marrow sample from a patient with acute myeloid leukemia. Gene expression was then used to identify the cell-type compositions of these clones, determining that these clones belonged to progenitor-like, monocyte-like and dendritic cell-like cells.

TEMPORAL ANALYSES

As discussed, multimodal measurements of immune cells can lead to a deeper understanding of the heterogeneity inherent in these immune cells, and the changes that a disease can cause to cell state. However, the molecular state of an immune cell is a dynamic process, from HSC generation in the bone marrow to its differentiated fate. Single-cell measurements are crucial in obtaining an accurate estimation of this temporal differentiation. This is because bulk samples contain a mixture of cells at various differentiation stages, thus tracking the average bulk expression across time may not reflect the terminal differentiation trajectory.31 In order to study the evolving state along a cell lineage, the molecular-state modalities will need to be coupled with temporal modalities. We define temporal modalities as information related to the time ordering of cells during their differentiation process. The ideal scenario would be to obtain measurements of cell states belonging to the same clone at different time points. However, this information is not always available, for instance, from cross-sectional studies. Recent single-cell technologies have attempted to address these issues, which has allowed for the study of cell state32 and clonal differentiation.33,34

Molecular cell state differentiation

Numerous algorithms have been proposed to estimate time information for each cell with a metric known as “pseudotime” using either gene expression or chromatin accessibility data.32 This metric describes how a modality changes in a continuous differentiation process along a trajectory. To obtain this trajectory, a dimension reduction step is first performed so that each cell is embedded in a lower dimensional space. A trajectory is then formed in this space, with cells positioned along this trajectory depending on their transcriptional or accessibility profiles.32 Despite pseudotime values being only an estimate of how cell state evolves over time, and clonal information not being known for each cell, numerous immunological discoveries have already been made using this metric, which we will now review.

HSCs are stem cells derived from the bone marrow which give rise to myeloid and lymphoid lineages and are thus a natural starting point to study pseudotime using single-cell multimodal analysis. Several works have utilized the inherent relationship between HSCs and differentiated immune cells to infer the differentiation trajectories using single-cell genomics. For instance, differentiation trajectories were obtained from scRNA-seq data of HSCs from the bone marrow of mouse,5,6 which revealed three differentiation trajectories,5 originating from sorted CD48−CD150+CD45+EPCR+ HSCs, and ending with erythroid, granulocyte–macrophage and lymphoid progenitors.

Another natural avenue for pseudotime analysis is the study of T-cell selection in the thymus. In a recent study,35 transcriptomic data were obtained from developing and postnatal thymus samples covering the entire period of active thymic function. Pseudotime values obtained from scRNA-seq revealed developmental marker genes for different cell types during T-cell development, such as ST18 for early double negative, and AQP3 for double positive. The TCR was also obtained, which revealed that the dependence of nonproductive on productive recombination events was associated with different cell types. For example, there was a higher amount of fully recombined TCRβ compared with nonproductive chains in double-negative stages that dropped to basal levels as cells entered double-positive stages, thus demonstrating the impact of thymic selection on the TCR repertoire.

Although pseudotime trajectories have been mostly derived from scRNA-seq data, this metric can also be obtained from other modalities, such as single-cell chromatin accessibility data. For example, in a recent study, cellular populations were sorted from CD34+ human bone marrow cells, including myeloid, erythroid and lymphoid lineages.36 Pseudotime was then generated from chromatin accessibility data, which showed motif accessibility dynamics along myeloid cell differentiation. For example, they showed that accessibility at transcription factor motifs associated with HOXB8 and GATA1 was high in HSCs and decreased through differentiation to common myeloid progenitors.

Clonal differentiation

Although trajectory analyses using pseudotime have provided important insights into cell state differentiation, these approaches have limitations in revealing the true cell lineage endpoint. To achieve this goal, novel methods have been developed which can identify clonal markers in individual cells, in addition to also measuring “omes”

and temporal information in the same single cell. We will discuss several of these approaches.

The most natural way to track clones in T cells and B cells is by their unique cell receptor. In the context of cellular immunotherapies, such as chimeric antigen receptor (CAR) T cells, single-cell multimodal analysis has been recently applied to study clonality, gene signatures and kinetics. TCRs were used to track CAR-T cells in patients undergoing anti-CD19 CAR-T immunotherapy in leukemia, in order to understand characteristics of clonally expanded CAR-T cells.33 In this study, CAR-T cells were sorted from blood samples from patients with B-cell acute or chronic lymphoblastic leukemia to isolate CD8+ CAR-T cells using a truncated version of the epidermal growth factor receptor, which is coexpressed with the CAR on the T-cell surface. A decrease in TCR diversity was observed after CAR-T infusion, suggesting that CAR-T cells underwent clonal expansion. Gene expression analysis showed that clones which increase in frequency after infusion displayed higher expression of cytotoxic genes. Gene expression analysis of the infusion product showed distinct clusters distinguished by expression of activation, cytotoxicity, mitochondrial and cell cycle-associated genes. Tracking clones via their immune receptor has also been applied to autoimmune diseases.37 Transitional IgDlow B cells were first sorted from peripheral blood mononuclear cells collected longitudinally from patients with myasthenia gravis who relapsed after treatment with rituximab, a B-cell-depleting drug. B-cell receptor clones were then isolated using the gene expression data, which were shown to be related to clones identified previously from untreated patients. This then allowed identification of persistent B cells. Clustering using gene expression revealed 820 persistent clones in both memory B-cell and antibody-secreting cell clusters.

A recent approach has been to identify clones with “barcodes,” which can be identified at the single-cell level. This approach has been applied to HSCs using a lentiviral delivery system.38 Cells cultured in vitro and cells transplanted in vivo were collected over several days, and then sorted to isolate oligopotent and multipotent progenitor cells using flow cytometric markers. In this study, the early transcriptional signature of HSCs was linked to the clonal fates via barcoding. This high-throughput system allowed mapping of more than 300 000 cells and 10 968 distinct clones, and identified genes correlating with fate, revealing two routes of monocyte differentiation that give rise to distinct subsets in immune compartments.

Another promising approach to track clones is the use of mitochondrial DNA.34 This was performed by applying a single-cell chromatin accessibility assay (assay for transposase-accessible chromatin using sequencing) to cultured CD34+ HSCs, collected over the course of 20 days. Mitochondrial DNA, which incrementally accumulates genetic mutations passed onto daughter cells, was extracted and subsequently used for lineage tracing. Combining lineage tracing with chromatin profiles revealed possible fates of hematopoietic stem and progenitor cells (HSPCs), in particular distinguishing bipotent progenitors from those biased in favor of an erythroid versus monocytic fate. Lineage tracing using mitochondrial DNA has also been applied to study acute myeloid leukemia.39 Clones were first isolated based on mutations in the mitochondrial DNA from assay for transposase-accessible chromatin using sequencing data, taken from primary blood samples of a patient with acute myeloid leukemia. This allowed new insights into “preleukemic” HSCs, adding to the evidence that this cell population is heterogeneous with multiple clones, and that the lineage giving rise to acute myeloid leukemia is not the lineage with the optimal potential among pluripotent HSCs.

SPATIAL ANALYSES AND CELL–CELL COMMUNICATION

Multimodal applications at the single-cell level have also been applied to study how molecular-state modalities are affected by spatial modalities, in particular a cell’s spatial location within a tissue, and its location relative to other cells. Technologies which measure spatial location are becoming increasingly available,40,41 and some of these have already been applied in immunology. For example, single-cell spatially resolved transcriptomics was applied to mouse bone marrow niches using an improved version of laser-capture microdissection coupled with sequencing.41 This allowed the transcriptional profile of major bone marrow cell types to be determined, together with their spatial location in distinct bone marrow niches. This analysis also showed that Cxcl12-abundant-reticular cell subsets differentially localize to sinusoidal and arteriolar surfaces and act locally as “professional cytokine-secreting-cells.”

Some studies have utilized single-cell multiomics to investigate cell–cell interactions, and this has recently been applied to COVID-19. For example, cell–cell interaction was estimated using CellPhoneDB,42 which was applied to scRNA-seq data obtained from nasopharyngeal and bronchial samples in patients with moderate or critical disease.43 This analysis revealed a higher number of epithelium–immune cell interactions in patients with critical COVID-19, in particular for CD8+ T cells, nonresident macrophages and monocyte-derived macrophages, thus likely contributing to clinical observations of heightened inflammatory tissue damage.
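
Receptor–ligand analyses such as the one above score pairs of cell types by the expression of a ligand in one and of its receptor in the other. The Python sketch below shows a simplified version of that idea: it averages expression per cell type and keeps the sender and receiver combinations in which both partners exceed a threshold. It is not the CellPhoneDB algorithm or API (it omits, for example, any statistical test), and the function names, the threshold and the toy values are hypothetical.

```python
from statistics import mean


def cluster_means(expr, labels):
    """Average expression of each gene within each cell-type label.

    expr   : {gene: {cell_id: expression}}
    labels : {cell_id: cell_type}
    """
    cells_by_label = {}
    for cell, label in labels.items():
        cells_by_label.setdefault(label, []).append(cell)
    return {
        label: {gene: mean(values[c] for c in cells)
                for gene, values in expr.items()}
        for label, cells in cells_by_label.items()
    }


def interaction_scores(expr, labels, pairs, min_mean=0.5):
    """Score (sender, receiver, ligand, receptor) combinations.

    The score is the mean of the ligand average in the sender and the
    receptor average in the receiver. Combinations in which either
    average falls below min_mean (an assumed cutoff) are dropped.
    """
    means = cluster_means(expr, labels)
    scores = {}
    for ligand, receptor in pairs:
        for sender in means:
            for receiver in means:
                lig = means[sender][ligand]
                rec = means[receiver][receptor]
                if lig >= min_mean and rec >= min_mean:
                    scores[(sender, receiver, ligand, receptor)] = (lig + rec) / 2
    return scores


# Toy input (hypothetical values): two stromal cells and two T cells.
expr = {
    "CXCL12": {"c1": 3.0, "c2": 2.0, "c3": 0.0, "c4": 0.1},
    "CXCR4": {"c1": 0.0, "c2": 0.2, "c3": 2.5, "c4": 1.5},
}
labels = {"c1": "stromal", "c2": "stromal", "c3": "T cell", "c4": "T cell"}
pairs = [("CXCL12", "CXCR4")]

scores = interaction_scores(expr, labels, pairs)
print(scores)
```

On this toy input the only combination retained is stromal cells sending CXCL12 to CXCR4 on T cells. Counting retained combinations per pair of cell types would give a crude analogue of the "number of interactions" compared between patient groups.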

Both spatial and temporal single-cell multimodal analyses have also been performed on
bronchoalveolar lavage fluid from mild and critical patients.44 TCR clonal information,
gene expression and pseudotime analysis revealed that patients with mild COVID-19 were
characterized by fully differentiated resident memory T cells undergoing active clonal
expansion, whereas in critical COVID-19 patients, these resident memory T cells fail to
differentiate or expand. In the same study, the authors also applied CellPhoneDB to show
differences in immune cell-type interactions between mild and severe COVID-19. For
example, they showed that interactions between monocytes/macrophages and neutrophils
almost always involve promigratory interactions in critical COVID-19, but interleukin
signaling in mild COVID-19. Other non-COVID applications include those which studied
cellular interactions between melanoma and head and neck cancer cells and various immune
cells, including T cells and macrophages3 and isolated T-cell subsets from
transcriptomics data.4

CONCLUSIONS AND FUTURE DIRECTIONS

Single-cell multimodal technologies have led to exciting discoveries on the mechanisms
underpinning the immune system. We have highlighted some of these discoveries, which
have provided insight into the molecular state of immune cells, how these states evolve
over time and the impact of spatial location. These technologies are becoming
increasingly available and easy to apply, as exemplified by the recent publications in
the field of COVID-19 research in the last few months.25,26,43,44 These technologies can
also be used to answer more general questions, such as quantifying the relationship
between transcripts and proteins, or dissecting the landscape of post-translational
modifications. In this area, there remains more work to be done, as shown by recent
studies investigating promoter accessibility–gene expression45 and gene
expression–protein46 correlation. In immunology, there have already been significant
advances in the last decade using single-cell multimodal analysis, as we have reviewed.
However, despite these achievements, there remain promising avenues to be explored. For
example, it is conceivable that with the development of new multimodal technologies,
further cellular states which comprise a combination of modalities across different
“omes” will be discovered. Combined with spatial modalities, these cellular states may
also be spatially dependent. Other promising avenues will now be discussed.

Novel multimodal technologies are being proposed every year, with some yet to be applied
to immune cells. For example, an important and yet unresolved problem is to identify the
target genes which are linked to a transcription factor, as this can lead to a better
understanding of the molecular network modules that drive immune-cell lineages and their
differentiation. Two multimodal technologies can potentially address this issue. The
first is thiol(SH)-linked alkylation for the metabolic sequencing of RNA, which
integrates scRNA-seq with metabolic RNA labeling to provide two modalities in the
transcriptome: total RNA levels and recently transcribed RNA.47 When combined with
perturbation methods, thiol(SH)-linked alkylation for the metabolic sequencing of RNA
can identify target genes of transcriptional regulators, as has been shown in cancer
cells.48 Single-nucleus chromatin accessibility and mRNA expression sequencing can also
be used to understand gene regulation, as this technology provides high-throughput
sequencing of the transcriptome and chromatin accessibility in the same cell.49 Other
advances can be applied to the different problem of characterizing molecular cell state,
and include those utilizing transcriptomics (mRNA) as one of their modalities, in
addition to either DNA,50 protein expression,10,51 chromatin accessibility52,53 or DNA
methylation.54–57

Lineage tracing of immune cells that do not carry a natural bar code, such as TCR or
B-cell receptor, can also benefit from recent technologies using synthetic barcodes,
such as CellTagging which uses a lentiviral approach.58 This approach offers advantages
over alternate approaches where gene editing is challenging.58

Despite the rapid increase of single-cell multimodal approaches, several computational
and technical caveats still need to be addressed for optimal analysis of these data. For
example, the sequencing output may be too shallow to identify the immune receptor, and
optimal gene expression analysis requires deeper coverage to control for technical noise
and dropout of low-expressing genes.59 Similarly, better tools with deeper sequencing
are required for identification of more complex gene expression quantities, such as
isoforms. Multimodal technologies also carry a substantial level of technical noise
which can blur the true biological variation that exists,1 for instance, between gene
and protein expression, and this is an important area for future work. Finally, a
significant challenge is the development of bioinformatics tools to permit integration
of these data, as recently reviewed.1

The rapid growth of single-cell multimodal technologies has also generated debate about
the precise definition of a “mode” or an “omic”. We have opted for a broad definition,
considering a mode as any type of information from the same single cell, which is
consistent with a previous definition in a highly cited review.1 We have thus included
temporal and spatial

information as separate modalities, in addition to different information obtained from
the same data set. Temporal and spatial information has also been considered by other
authors as its own separate omic or modality.60,61 We envisage that future research in
this field will lead to more advanced approaches and methods to better quantify the
temporal and spatial modalities of immune cells and converge on a less confusing
language, thus increasing the involvement of immunologists in the single-cell field.

Single-cell data sets are growing remarkably fast. For instance, the Human Cell Atlas
(https://data.humancellatlas.org) already comprises approximately 2.7 million cells and
19.68 TB of data across multiple tissues in both health and disease. This initiative has
already significantly contributed to immunology by providing novel data sets across
thymus, spleen and other organs, as well as characterizing novel subsets of lymphocytes
and monocytes through development and into adulthood. A major aim of data-gathering
initiatives such as the Human Cell Atlas is to increase sample size and permit
interrogation of single-cell multimodal data for more complex questions, such as
identifying the entire cell composition of the human body, or predicting the clinical
outcome in disease with machine learning algorithms. We envisage that single-cell
multimodal technologies will pervade basic and translational immunology research and
will become a tool to discover mechanisms and new cell states, as well as to mold novel
immune therapies to effectively target specific molecular pathways in disease and allow
the identification of target cells.

ACKNOWLEDGMENTS

This research was supported by a NHMRC Project grant (APP1121643 to FL). FL is funded by
an NHMRC CDA fellowship (APP1128416).

AUTHOR CONTRIBUTIONS

Raymond HY Louie: Conceptualization; Investigation; Writing-original draft;
Writing-review & editing. Fabio Luciani: Conceptualization; Investigation; Supervision;
Writing-review & editing.

CONFLICT OF INTEREST

The authors declare no conflicts of interest.

REFERENCES

1. Stuart T, Satija R. Integrative single-cell analysis. Nat Rev Genet 2019; 20: 257–272.
2. Manno GL, Soldatov R, Zeisel A, et al. RNA velocity of single cells. Nature 2018; 560: 494–498.
3. Ren X, Zhong G, Zhang Q, Zhang L, Sun Y, Zhang Z. Reconstruction of cell spatial organization from single-cell RNA sequencing data based on ligand-receptor mediated. Cell Res 2020; 30: 763–778.
4. Braga FV, Kar G, Berg M, Carpaij O, Polanski K. A cellular census of healthy lung and asthmatic airway wall identifies novel cell states in health and disease. Nat Med 2019; 25: 1153–1163.
5. Nestorowa S, Hamey FK, Pijuan Sala B, et al. A single-cell resolution map of mouse hematopoietic stem and progenitor cell differentiation. Blood 2016; 128: 20–32.
6. Tikhonova AN, Dolgalev I, Hu H, et al. The bone marrow microenvironment at single-cell resolution. Nature 2019; 569: 222–228.
7. Macaulay IC, Ponting CP, Voet T. Single-cell multiomics: multiple measurements from single cells. Trends Genet 2017; 33: 155–168.
8. Miragaia RJ, Gomes T, Chomka A, et al. Single-cell transcriptomics of regulatory T cells reveals trajectories of tissue adaptation. Immunity 2019; 50: 493–504.
9. Hwang B, Lee JH, Bang D. Single-cell RNA sequencing technologies and bioinformatics pipelines. Exp Mol Med 2018; 50: 1–14.
10. Stoeckius M, Hafemeister C, Stephenson W, et al. Simultaneous epitope and transcriptome measurement in single cells. Nat Methods 2017; 14: 865–868.
11. Mair F, Erickson JR, Voillet V, et al. A targeted multi-omic analysis approach measures protein expression and low-abundance transcripts on the single-cell level. Cell Rep 2020; 31: 1–13.
12. Granja JM, Klemm S, McGinnis LM, et al. Single-cell multiomic analysis identifies regulatory programs in mixed-phenotype acute leukemia. Nat Biotechnol 2019; 37: 1458–1465.
13. Katzenelenbogen Y, Sheban F, Katzenelenbogen Y, et al. Coupled scRNA-Seq and intracellular protein activity reveal an immunosuppressive role of TREM2 in cancer. Cell 2020; 182: 1–14.
14. Kotliarov Y, Sparks R, Martins AJ, et al. Broad immune activation underlies shared set point signatures for vaccine responsiveness in healthy individuals and disease activity in patients with lupus. Nat Med 2020; 26: 618–629.
15. Fairfax BP, Taylor CA, Watson RA, et al. Peripheral CD8+ T cell characteristics associated with durable responses to immune checkpoint blockade in patients with metastatic melanoma. Nat Med 2020; 26: 193–199.
16. Zhang Q, He Y, Luo N, et al. Landscape and dynamics of single immune cells in hepatocellular carcinoma. Cell 2019; 179: 829–845.
17. Wu TD, Madireddi S, de Almeida PE, et al. Peripheral T cell expansion predicts tumour infiltration and clinical response. Nature 2020; 579: 274–278.
18. Guo X, Zhang Y, Zheng L, et al. Global characterization of T cells in non-small-cell lung cancer by single-cell sequencing. Nat Med 2018; 24: 978–985.
19. Sade-Feldman M, Yizhak K, Bjorgaard SL, et al. Defining T cell states associated with response to checkpoint immunotherapy in melanoma. Cell 2018; 175: 998–1013.
20. Nguyen A, Phan TG. Single cell RNA sequencing of rare immune cell populations. Front Immunol 2018; 9: 1–11.

21. Eltahla AA, Rizzetto S, Pirozyan MR, et al. Linking the T cell receptor to the single cell transcriptome in antigen-specific human T cells. Immunol Cell Biol 2016; 94: 604–611.
22. Rizzetto S, Koppstein DNP, Samir J, et al. B-cell receptor reconstruction from single-cell RNA-seq with VDJPuzzle. Bioinformatics 2018; 16: 2846–2847.
23. Ranasinghe S, Lamothe PA, Soghoian DZ, et al. Antiviral CD8+ T Cells restricted by human leukocyte antigen class II exist during natural HIV infection and exhibit clonal expansion. Immunity 2016; 45: 917–930.
24. Wang Z, Zhu L, Nguyen THO, et al. Clonally diverse CD38+HLA-DR+CD8+ T cells persist during fatal H7N9 disease. Nat Commun 2018; 9: 1–12.
25. Liao M, Liu Y, Yuan J, et al. Single-cell landscape of bronchoalveolar immune cells in patients with COVID-19. Nat Med 2020; 26: 842–844.
26. Unterman A, Sumida TS, Nouri N, et al. Single-cell omics reveals dyssynchrony of the innate and adaptive immune system in progressive COVID-19. medRxiv 2020. https://doi.org/10.1101/2020.07.16.20153437. [Epub ahead of print].
27. Singh M, Jackson KJL, Wang JJ, et al. Lymphoma driver mutations in the pathogenic evolution of an iconic human autoantibody. Cell 2020; 180: 878–894.
28. Macaulay IC, Haerty W, Kumar P, et al. G&T-seq: parallel sequencing of single-cell genomes and transcriptomes. Nat Methods 2015; 12: 519–522.
29. Satpathy AT, Saligrama N, Buenrostro JD, et al. Transcript-indexed ATAC-seq for precision immune profiling. Nat Med 2018; 24: 580–590.
30. van Galen P, Hovestadt V, Wadsworth MH, et al. Single-cell RNA-Seq reveals AML hierarchies relevant to disease progression and immunity. Cell 2019; 176: 1265–1281.
31. Trapnell C. Defining cell types and states with single-cell genomics. Genome Res 2015; 25: 1491–1498.
32. Saelens W, Cannoodt R, Todorov H, Saeys Y. A comparison of single-cell trajectory inference methods. Nat Biotechnol 2019; 37: 547–554.
33. Sheih A, Voillet V, Hana L, et al. Clonal kinetics and single-cell transcriptional profiling of CAR-T cells in patients undergoing CD19 CAR-T immunotherapy. Nat Commun 2020; 11: 219.
34. Lareau CA, Ludwig LS, Muus C, et al. Massively parallel single-cell mitochondrial DNA genotyping and chromatin profiling. Nat Biotechnol 2020; https://doi.org/10.1038/s41587-020-0645-6
35. Park JE, Botting RA, Conde CD, et al. A cell atlas of human thymic development defines T cell repertoire formation. Science 2020; 367: eaay3224.
36. Buenrostro JD, Corces MR, Lareau CA, et al. Integrated single-cell analysis maps the continuous regulatory landscape of human hematopoietic differentiation. Cell 2018; 173: 1535–1548.
37. Jiang R, Fichtner ML, Hoehn KB, et al. Single-cell repertoire tracing identifies rituximab-resistant B cells during myasthenia gravis relapses. JCI insight 2020; 5: 1–18.
38. Weinreb C, Rodriguez-Fraticelli A, Camargo FD, Klein AM. Lineage tracing on transcriptional landscapes links state to fate during differentiation. Science 2020; 367: eaaw3381.
39. Xu J, Nuno K, Litzenburger UM, et al. Single-cell lineage tracing by endogenous mutations enriched in transposase accessible mitochondrial DNA. Elife 2019; 8: 1–14.
40. Codeluppi S, Borm LE, Zeisel A, et al. Spatial organization of the somatosensory cortex revealed by osmFISH. Nat Methods 2018; 15: 932–935.
41. Baccin C, Al-Sabah J, Velten L, et al. Combined single-cell and spatial transcriptomics reveal the molecular, cellular and spatial bone marrow niche organization. Nat Cell Biol 2020; 22: 38–48.
42. Efremova M, Vento-Tormo M, Teichmann SA, Vento-Tormo R. CellPhoneDB: inferring cell–cell communication from combined expression of multi-subunit ligand–receptor complexes. Nat Protoc 2020; 15: 1484–1506.
43. Chua RL, Lukassen S, Trump S, et al. COVID-19 severity correlates with airway epithelium–immune cell interactions identified by single-cell analysis. Nat Biotechnol 2020; 38: 970–979.
44. Wauters E, Van Mol P, Garg A, et al. Discriminating mild from critical COVID-19 by innate and adaptive immune single-cell profiling of bronchoalveolar lavages. bioRxiv 2020. https://doi.org/10.1101/2020.07.09.196519. [Epub ahead of print].
45. Starks RR, Biswas A, Jain A, Tuteja G. Combined analysis of dissimilar promoter accessibility and gene expression profiles identifies tissue-specific genes and actively repressed networks. Epigenetics and Chromatin 2019; 12: 1–16.
46. Liu Y, Beyer A, Aebersold R. On the dependency of cellular protein levels on mRNA abundance. Cell 2016; 165: 535–550.
47. Herzog VA, Reichholf B, Neumann T, et al. Thiol-linked alkylation of RNA to assess expression dynamics. Nat Methods 2017; 14: 1198–1204.
48. Muhar M, Ebert A, Neumann T, et al. SLAM-seq defines direct gene-regulatory functions of the BRD4-MYC axis. Science 2018; 360: 800–805.
49. Chen S, Lake BB, Zhang K. High-throughput sequencing of the transcriptome and chromatin accessibility in the same cell. Nat Biotechnol 2019; 37: 1452–1457.
50. Dey SS, Kester L, Spanjaard B, Bienko M, Van Oudenaarden A. Integrated genome and transcriptome sequencing of the same cell. Nat Biotechnol 2015; 33: 285–289.
51. Shahi P, Kim SC, Haliburton JR, Gartner ZJ, Abate AR. Abseq: Ultrahigh-throughput single cell protein profiling with droplet microfluidic barcoding. Sci Rep 2017; 7: 1–12.
52. Liu L, Liu C, Quintero A, et al. Deconvolution of single-cell multi-omics layers reveals regulatory heterogeneity. Nat Commun 2019; 10: 1–10.
53. Cao J, Cusanovich DA, Ramani V, et al. Joint profiling of chromatin accessibility and gene expression in thousands of single cells. Science 2018; 361: 1380–1385.

54. Hu Y, Huang K, An Q, et al. Simultaneous profiling of transcriptome and DNA methylome from a single cell. Genome Biol 2016; 17: 1–11.
55. Angermueller C, Clark SJ, Lee HJ, et al. Parallel single-cell sequencing links transcriptional and epigenetic heterogeneity. Nat Methods 2016; 13: 229–232.
56. Clark SJ, Argelaguet R, Kapourani CA, et al. ScNMT-seq enables joint profiling of chromatin accessibility DNA methylation and transcription in single cells. Nat Commun 2018; 9: 1–9.
57. Wang Y, Yuan P, Yan Z, et al. Single-cell multiomics sequencing reveals the functional regulatory landscape of early embryos. bioRxiv 2019. https://doi.org/10.1101/803890. [Epub ahead of print].
58. Kong W, Biddy BA, Kamimoto K, Amrute JM, Butka EG, Morris SA. CellTagging: combinatorial indexing to simultaneously map lineage and identity at single-cell resolution. Nat Protoc 2020; 15: 750–772.
59. Rizzetto S, Eltahla AA, Lin P, et al. Impact of sequencing depth and read length on single cell RNA sequencing data of T cells. Sci Rep 2017; 7: 1–11.
60. Lederer AR, La Manno G. The emergence and promise of single-cell temporal-omics approaches. Curr Opin Biotechnol 2020; 63: 70–78.
61. Bingham GC, Lee F, Naba A, Barker TH. Spatial-omics: Novel approaches to probe cell heterogeneity and extracellular matrix biology. Matrix Biol 2020; 91–92: 152–166.
62. Schulte-Schrepping J, Reusch N, Paclik D, et al. Severe COVID-19 is marked by a dysregulated myeloid cell compartment. Cell 2020; 182: 1–22.
63. Upadhaya S, Sawai CM, Papalexi E, et al. Kinetics of adult hematopoietic stem cell differentiation in vivo. J Exp Med 2018; 215: 2815–2832.
64. Yao C, Sun HW, Lacey NE, et al. Single-cell RNA-seq reveals TOX as a key regulator of CD8+ T cell persistence in chronic infection. Nat Immunol 2019; 20: 890–901.
65. Koutsakos M, Illing PT, Nguyen THO, et al. Human CD8+ T cell cross-reactivity across influenza A, B and C viruses. Nat Immunol 2019; 20: 613–625.

© 2021 Australian and New Zealand Society for Immunology, Inc.

REVIEW ARTICLE

Genomic Cytometry and New Modalities for Deep Single-Cell Interrogation

Robert Salomon,1,2* Luciano Martelotto,3 Fatima Valdes-Mora,4,5 David Gallego-Ortega1,4,6

1 Institute for Biomedical Materials and Devices, The University of Technology Sydney,
  Ultimo, New South Wales, 2006, Australia
2 ACRF Child Cancer Liquid Biopsy Program, Children’s Cancer Institute. Lowy Cancer
  Research Centre, University of New South Wales (UNSW) Sydney, Randwick, New South
  Wales, 2031, Australia
3 Centre for Cancer Research, University of Melbourne, Parkville, Victoria, Australia
4 St Vincent’s Clinical School, Faculty of Medicine, University of New South Wales (UNSW)
  Sydney, Darlinghurst, New South Wales, 2010, Australia
5 Cancer Epigenetic Biology and Therapeutics. Personalised Medicine Theme. Children’s
  Cancer Institute. Lowy Cancer Research Centre, University of New South Wales (UNSW)
  Sydney, Randwick, New South Wales, 2031, Australia
6 Tumour Development Lab, The Kinghorn Cancer Centre, Garvan Institute of Medical
  Research, Darlinghurst, New South Wales, 2010, Australia

Received 6 November 2019; Revised 28 June 2020; Accepted 7 August 2020

* Correspondence to: Robert Salomon, Institute for Biomedical Materials and Devices, The
  University of Technology Sydney, Ultimo New South Wales 2006, Australia
  Email: rob@rob-salomon.com or rsalomon@ccia.org.au

Published online 5 September 2020 in Wiley Online Library (wileyonlinelibrary.com)
DOI: 10.1002/cyto.a.24209

Abstract
In the past few years, the rapid development of single-cell analysis techniques has
allowed for increasingly in-depth analysis of DNA, RNA, protein, and epigenetic states,
at the level of the individual cell. This unprecedented characterization ability has been
enabled through the combination of cytometry, microfluidics, genomics, and informatics.
Although traditionally discrete, when properly integrated, these fields create the
synergistic field of Genomic Cytometry. In this review, we look at the individual
methods that together gave rise to the broad field of Genomic Cytometry. We further
outline the basic concepts that drive the field and provide a framework to understand
this increasingly complex, technology-intensive space. Thus, we introduce Genomic
Cytometry as an emerging field and propose that synergistic rationalization of disparate
modalities of cytometry, microfluidics, genomics, and informatics under one banner will
enable massive leaps forward in the understanding of complex biology.
© 2020 International Society for Advancement of Cytometry

Key terms
genomic cytometry; technology; cytometry; genomics; microfluidics; single-cell

The cell is the basic unit of life and is capable of a vast array of biological
complexity. In order to understand how different populations of cells can functionally
coexist to form organs, organisms, and indeed disease, it is critical to profile all
aspects of the individual cells. The capability to perform in-depth single-cell analysis
has provided us with a more complete understanding of disease, development, and normal
function. Moreover, the application of single-cell genomic technologies has already
identified many of the molecular features of cell populations within tissues, organs,
and diseases.

Techniques that together comprise the field of Genomic Cytometry have already been used
to reveal a fundamental aspect of biology, most notably that cell populations are more
heterogeneous than ever imagined. Each individual cell is unique in terms of space
(e.g., physical position in tissues and/or organs), time (e.g., phases of cell cycle,
activation or developmental state), and molecular profile. This uniqueness makes
understanding the underlying biology a significant challenge. While in the past,
scientists could interrogate, enumerate, and classify cell types according to their
appearance under the microscope, this analysis is limited in the number of
characteristics that can be simultaneously probed and in the rate at which observations
can be made, and it has relied heavily on the individual interpreting the data. Modern
flow cytometry emerged to give an additional level of detail to the classification
process. By making use of multi-parameter, multi-laser instruments, flow cytometry has
redefined cell classification at the molecular level, aided the discovery and definition
of major and minor cell subsets, and has quickly become an essential tool for dissecting
the functional complexity of cell populations. In general, however, it is primarily used
to identify cellular protein expression profiles and despite being able


to process many millions of cells in a rapid manner, it is also hampered by limited
dimensionality and the inherent loss of anatomical context.

Limits around fluorochrome uniqueness and detector numbers result in characterization
ability topping out around the 30-parameter mark. In line with advances in fluorescent
cytometry and the development of spectral cytometers (1), instruments such as the CYTOF
(2) have matured and are now capable of 40+ parameters (3, 4). While there is some debate
around the benefits and trade-offs associated with the use of mass cytometry (5), the
advent of scanning ablation and ion beam systems (6–8) has helped to bridge the gap
between imaging and flow cytometry. In doing so, they have provided tools that allow 2D
reconstruction of tissue sections such that anatomical location of protein expression can
be performed down to the micrometer range. A recent study by Keren et al. has improved
this resolution down to 260 nm (9). These imaging systems, however, tend to be much
slower than traditional cytometry and have their own unique challenges.

Given that even the most advanced methods in fluorescence and mass cytometry are still
limited, it is clear that new methods must emerge to allow deep single-cell
characterization. In order to be widely applicable in biological studies, these systems
should provide throughputs similar to current fluorescent flow cytometric techniques
while also providing improved dimensionality (hundreds to thousands of parameters
simultaneously). By combining advances in cytometry with the tools emerging from the
field of single-cell genomics, we are entering a new era of Genomic Cytometry. With the
tools and workflows being created by today’s emerging genomic cytometrists, we now can
understand, in a concerted manner, many aspects of individual cells.

Genomic Cytometry techniques, while focused on the single cell, allow us to identify and
characterize a group of single cells that share a similar function. The characteristics
able to be probed are no longer limited to protein expression profiles, but now include
aspects such as DNA, RNA, proteins, metabolites, and even epigenetic modifications. This
unprecedented ability to sensitively interrogate large numbers of individual cells at a
reduced cost is accelerating discovery and challenging existing paradigms in cytometry.
Perhaps more importantly, this technological leap is transforming how we understand
basic and translational biology.

The single-cell multi-omics revolution has fostered a parallel development of
computational approaches, necessary to integrate and understand the data generated from
single-cell genomic techniques. These methodologies and approaches have been described
elsewhere (10–14). In this review, we analyze the factors and motivations that have given
rise to the field of Genomic Cytometry. We also provide an overview of the tools
currently available in this space.

UNRAVELING CELLULAR COMPLEXITY

Cells are complex assemblies of macromolecules and chemicals that function as a single
unit during homeostasis, development, and disease. Currently, there are a multitude of
different tools and methodologies that can be used to characterize a cell. These tools
can measure the physical characteristics of a cell (such as size, deformability,
electrical impedance, and density) as well as biochemical aspects such as DNA, RNA, and
protein (concentration, monomer composition, and chemical status including mutation,
acetylation, phosphorylation, methylation, etc.). Importantly, emerging tools are
increasingly allowing the simultaneous characterization of these parameters. This is
known as multi-omics.

Although many aspects of a cell are able to be assessed by traditional cytometry, it has
primarily been leveraged to characterize protein expression profiles at the level of the
individual cell. From the early advent of fluorescence cytometry in the late 1960s (15)
and cell sorting in 1965 (16), the underlying technology has remained relatively static.
Instrument manufacturers have added additional laser lines and increased detector
numbers in order to improve multiplexed single-cell characterization; however, flow
cytometry is still hampered by a lack of spectrally unique fluorochromes. Recent
developments in dye technology, particularly around tunable polymer-based dyes (17), have
allowed flow cytometry assays to reach into the 28 color range (18–22). However, if we
look at the total cellular complexity, it is clear that even high dimensional fluorescent
flow cytometry is incapable of completely characterizing the full range of cellular
identities and cellular states.

Table 1. Potential complexity of the individual human cell

MEASURABLE CHARACTERISTIC                           ESTIMATED OBSERVABLE NUMBERS
DNA                                                 3,000,000,000 (bases)
Epigenetic states
  Open chromatin regions (enhancers and promoters)  100,000–150,000 (peaks)
  DNA methylation                                   25,000 (CpG islands)
  Three-dimensional genome architecture             7,000,000 (long-range contacts) (23)
RNA                                                 19,000 (coding genes)–100,000 (noncoding RNAs)
Proteins                                            >19,000
CD markers                                          >400

To understand the challenge of fully characterizing a single cell, we must look at the
complexity within cells (Table 1). The human genome is composed of 3 billion nitrogenous
bases. These are structurally organized into regions that can be transcribed to RNA and
subsequently translated to protein. These regions are known as genes. Although there is
still conjecture around the number of genes (24, 25), studies suggest that the number of
human genes sits somewhere in excess of 19,000 (26–29). Of the estimated 19,000
protein-coding genes, it is possible to make many different proteins; some authors
suggest as many as 100 different proteins can be made from


each gene (30). To date, the Human Cell Differentiation Molecule (HCDM) group has
defined over 400 cluster of differentiation markers (31).

In addition to regions that code for proteins, noncoding regions of the DNA also exist.
These regions include enhancers, insulators, and promoters, which are key for gene
expression regulation and thus important markers of cell type. Epigenetic mechanisms like
DNA methylation, histone post-translational modifications, expression of noncoding RNAs,
three-dimensional structure, and nucleosome positioning all shape the conformation of the
chromatin to regulate gene transcription, adding an additional layer of complexity to the
characteristics of the cell (32).

COMMON GENOMIC CYTOMETRY APPROACHES

Broadly speaking, it is possible to arrange Genomic Cytometry techniques into five main
methodology categories. These categories, shown in Figure 1, are:

1. Plate-based approaches (making use of traditional Fluorescent Activated Cell Sorting
   [FACS]).
2. Microfluidics: (1) Droplet-based microfluidics (aqueous reaction chambers created
   within a water-in-oil droplet); and (2) solid microfluidics (miniaturized single-cell
   handling tools with associated molecular workflows for downstream characterization).
3. In situ combinatorial indexing (using the cell as the reaction chamber itself).
4. Image-based approaches (making use of direct imaging or spatially traceable barcodes
   to create high dimensional, anatomically relevant images).
5. Spatial transcriptomics (combining basic imaging with novel positionally traceable
   cellular barcodes).

Plate-Based Approaches

Plate-based assays are the most familiar to the traditional cytometrist and are one of
the few high throughput Genomic Cytometry methods that currently allow active single-cell
deposition. Active cell deposition is usually achieved using FACS, which allows selective
deposition of cells based on characteristics measurable by traditional flow cytometry
techniques.

Mechanically, plate sorting is most commonly achieved through the use of electrostatic
droplet-based cell sorting. In these systems, single cells are sequentially flowed
through an interrogation point, characterized and deflected into the well of a
microtiter plate. By incorporating a system capable of moving the microtiter plate with
repeated micron-level accuracy, it is possible to target individual wells sequentially.
Cells are deposited into 96- or 384-well microtiter plates; however, in some cases higher
density plates can be used. In addition to ensuring the target cell is deposited into the
correct well, most instruments will allow the operator to control the likelihood that
(1) a cell is in the deflected drop, (2) more than one target cell is not deposited, and
(3) the nontarget cell contamination is minimized. For traditional FACS, these are
controlled through the application of a sort mask and allow the operator to balance
cellular throughput and deflection accuracy with the requirements of high-speed cell
sorting.

As current cytometers have not yet overcome the randomness of cell arrival times, many
cells that meet the selection criteria are not deposited into the sort well. Single-cell
masks look at the predicted position of the cell in the individual drop and will abort
the sort if the cell is located in either the leading or trailing edge of the drop. This
means that single cells on the periphery of the drop are not deflected, which adds to the
cell losses associated with the requirement to abort sort packets that contain coincident
events. The ability to deterministically control cell location with relation to time and
space will remove inefficiencies associated with the Poisson distribution of cells in
drops and will result in higher throughput, lower loss single-cell approaches, while
still retaining the characterization complexity afforded by traditional FACS.
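
The Poisson limitation described above can be illustrated with a short calculation. The
following is a minimal sketch, not a model of any particular instrument: it assumes cells
arrive at random, so that the number of cells per drop follows a Poisson distribution with
mean equal to the event rate divided by the drop generation rate. The function names, the
50 kHz drop rate, and the event rates are hypothetical values chosen only for
illustration, and real sort masks also take neighboring drops into account, which this
sketch ignores.

```python
import math


def mean_occupancy(event_rate_hz: float, drop_rate_hz: float) -> float:
    """Average number of cells per drop (lambda), assuming random arrivals."""
    return event_rate_hz / drop_rate_hz


def drop_occupancy(lam: float) -> dict:
    """Poisson probabilities that a drop holds 0, exactly 1, or 2+ cells."""
    p_empty = math.exp(-lam)
    p_single = lam * math.exp(-lam)
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": 1.0 - p_empty - p_single,
    }


def coincident_fraction(lam: float) -> float:
    """Fraction of occupied drops that contain more than one cell."""
    if lam <= 0:
        raise ValueError("mean occupancy must be positive")
    occ = drop_occupancy(lam)
    return occ["multiple"] / (1.0 - occ["empty"])


if __name__ == "__main__":
    DROP_RATE_HZ = 50_000  # hypothetical drop generation frequency
    for event_rate_hz in (1_000, 5_000, 20_000):  # hypothetical event rates
        lam = mean_occupancy(event_rate_hz, DROP_RATE_HZ)
        occ = drop_occupancy(lam)
        print(
            f"{event_rate_hz:>6} events/s: lambda={lam:.2f}, "
            f"empty={occ['empty']:.3f}, single={occ['single']:.3f}, "
            f"coincident share of occupied drops={coincident_fraction(lam):.3f}"
        )
```

Under these assumptions, raising the event rate increases the share of occupied drops
that hold more than one cell, which is the trade-off between throughput and sort aborts
that deterministic cell positioning would be expected to remove.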
Figure 1. The main methodology categories that comprise the field of Genomic Cytometry:
plate based, droplet microfluidic, solid microfluidic, in situ, imaging, and spatial.
[Color figure can be viewed at wileyonlinelibrary.com]

While electrostatic droplet-based FACS is by far the most common method for depositing
cells into microtiter plates, emerging technologies such as the CellenONE and the WOLF
cell sorters are providing alternatives. Both of these systems use a low-pressure
microfluidics-based approach and can thus be used on highly friable cell types that may
be sensitive to the stresses of traditional FACS. The CellenONE system is a unique
ultra-low volume liquid handler that utilizes an active image-based cell sorting approach
to improve cell deposition accuracy while simultaneously minimizing cell loss (sort
aborts are simply collected without dilution for subsequent reanalysis and deposition).
Both the WOLF and the CellenONE systems are slow when compared to FACS, and can only
handle limited cell numbers; for this reason, they
can only handle limited cell numbers, for this reason, they

tend to have specific applications and often require pre-enrichment steps when dealing with rare cell populations.
Modern FACS instruments also include a software module that tracks the characteristics of the cell sorted and links this to the well coordinates. This process, known as index sorting, is critical to multi-omic studies as it allows protein expression profiles (captured as part of the sort decision) to be cross-correlated to the genomic data generated in downstream assays.
Assays that take advantage of a plate-based approach include: Smart-Seq (33), Smart-Seq2 (34), Smart-Seq3 (35), STRT-seq (36), STRT-seq-2i (37), CEL-Seq, CEL-Seq2 (38), MARS-Seq (39), mcSCRB-seq (40), Quartz-Seq (41), Quartz-Seq2 (42), scBS-seq (43), and single-cell Hi-C (44).

Microfluidics
Microfluidics has expanded massively in popularity in the past two decades (45). In recent years, the field has also made a significant contribution to both our understanding of biology and to many areas of health care (45–48). Using microfluidics, entirely new assays can be created and traditional assays miniaturized. With reactions performed in the nanoliter to picoliter range (49), microfluidic-driven miniaturization can result in log-fold differences in the reaction volume. Because miniaturization can improve reaction efficiencies while simultaneously reducing reagent and sample input, microfluidics is becoming increasingly critical to our ability to perform high-throughput, high-resolution, high-sensitivity assays in a cost-effective manner.
Microfluidics is used in a range of technologies but, with reference to genomics, its application in massively parallel sequencing technologies was a significant contributor to the precipitous drop in sequencing cost. It is also being used in most of today’s commercially available, high-throughput Genomic Cytometry platforms, such as the 10x Genomics Chromium, BD Rhapsody, Dolomite Bio Nadia, Mission Bio Tapestri, ICELL8, Bio-Rad ddSEQ, inDrops, and Fluidigm C1 systems. To assist with the categorization of the many microfluidic approaches available, we have split the techniques into two subcategories: those that involve droplets and those that utilize miniaturized solid reaction chambers.

Droplet Microfluidics
The realization that droplet microfluidics is useful in the study of biology came of age with the simultaneous publication of two seminal papers from Harvard and the Broad Institute in 2015 (50, 51). These papers showed, for the first time, the application of high-throughput droplet-based generators in single-cell RNA-seq (scRNA-seq). Since then, commercial systems such as the 10x Genomics Chromium, Bio-Rad ddSEQ, Dolomite Bio Nadia, and the Mission Bio Tapestri systems have been released. Among these, the 10x Genomics Chromium system has the broadest acceptance. This is likely because it was the first to include a highly defined kit-based approach combined with an accessible data interface. At the time, this created a uniquely user-friendly ecosystem. With this, a biologist without deep expertise in microfluidics and genomics could generate single-cell data with relative ease. As the field becomes more mature and competitors increasingly enter the market, we expect the dominance of a single platform to be significantly challenged.
Mechanistically, droplet-based microfluidic systems work by mixing two immiscible liquids to create a water-in-oil emulsion. The oil forms a self-contained reaction vessel around an aqueous phase. The aqueous phase contains both cells and a bead containing uniquely barcoded mRNA capture probes in lysis buffer. For 3′ scRNA-seq assays, the capture probe contains a poly-dT region of around 22–25 nucleotides, which binds to polyadenylated transcripts released upon cell lysis. Thus, as the mRNA is released, the polyadenylated region of the transcript is immediately bound to an oligo containing (1) a cell barcode, (2) a Unique Molecular Identifier (UMI), and (3) a nucleotide region that assists with subsequent transcript amplification. As the aim of these systems is to co-locate a single bead with a single cell in a single droplet, the RNA profile for each captured cell can be obtained by informatically pooling reads that share a cell barcode. The UMI tracks individual transcripts, allowing for correction of amplification bias. This dual-barcode approach allows digital transcript counting at the level of the single cell.
Droplet volume is dependent on the flow rates and the chip geometry, and while this can be used to create a wide range of droplet sizes, the droplets used in scRNA-seq applications generally range from a few hundred picoliters to a few nanoliters (52). Droplet volume has been shown to be inversely related to the number of transcripts detected in the final library; for this reason, applications such as DroNC-Seq (designed for polyadenylated RNA transcripts from the cell nucleus) are better suited to systems that produce smaller droplets (75 vs 120 μm diameter droplets) (53).
Droplet microfluidics has been extensively utilized in the context of Genomic Cytometry. In addition to the work mentioned above, it has been used to profile transcriptomes at single-cell resolution (54) and in other non-RNA-based applications. These include (1) single-cell epigenetic approaches such as single-cell ChIP-seq (55, 56), dscATAC-seq (57, 58), ChIA-Drop (59), and single-cell ATAC-seq (23); and (2) single-cell DNA approaches like single-cell gDNA-seq (60–62) and a variety of multi-omic workflows including CITE-seq (63), REAP-seq (64) (protein and transcriptome), ECCITE-seq (65) (transcriptome, protein, clonotypes, and CRISPR perturbations), and SNARE-seq (66) (chromatin and transcriptome).

Solid Microfluidics
Solid microfluidic platforms use physical barriers to create individual reaction chambers, often at high physical densities but always with ultra-low volumes. These chambers can be made from a variety of materials but commonly include plastic, metal, or polydimethylsiloxane (PDMS). Because solid microfluidics uses physical confinement on solid substrates, it is possible to perform imaging on the cells in the well. If each well location can be associated with the unique cell barcode, then it is also possible to associate this data with the
downstream genomic characterization. Systems that allow this tend to have lower throughput and include the Fluidigm C1™ and ICELL8™ systems.
The Fluidigm C1™ system is perhaps the best known solid microfluidics platform. The C1 utilizes an intricate microfluidics architecture to provide high-level control of the complex molecular reactions required for single-cell analysis. The C1 system has been used in a number of studies characterizing single cells at the level of RNA, DNA, and epigenetic changes (67–69). Despite the system's advanced approach, problems have been identified and care should be taken with its use (70). The ICELL8™ system is a commercial miniaturized plate-based system that allows high-density fluid handling to achieve microfluidic-scale single-cell genomics.
Recently, Becton Dickinson has released a high-throughput scRNA-seq system, the BD Rhapsody. It uses a similar approach to the CytoSeq (71) and Seq-Well protocols (72). By using a microwell approach, the Rhapsody system can place a single bead in virtually every well and does not expose the cells to the same pressures associated with droplet generation. This may be an important consideration when working with cells highly sensitive to pressure-related stress.
In addition to high recovery rates, Rhapsody workflows also allow both a whole transcriptome and a targeted transcriptomics approach. While commercial modifications have occurred, the molecular workflow is similar to that used in many droplet-based scRNA-seq techniques. The targeted scRNA-seq approach, however, is currently unique to the Rhapsody and, while it requires a priori knowledge of the system being interrogated, it allows transcripts of interest to be deeply probed without incurring the high sequencing cost associated with reading common housekeeping and lowly informative transcripts. Depending on the panel, it is possible to obtain the same sequencing saturation with up to 10 times fewer sequencing reads than are needed when using a WTA approach (73).

In Situ Combinatorial Indexing
In situ single-cell methods provide an ingenious way to use the inherent structure of the cell or nucleus as the reaction chamber itself. This is achieved by first fixing the cell using methanol, or beginning with an intact nucleus, and subjecting these to multiple sequential barcoding steps using a split-pool approach. Through successive integration of molecular barcodes into the cell/nucleus itself, in situ combinatorial methods are capable of building up a library of uniquely barcoded single cells. For these methods to work effectively, it is critical to ensure that the number of barcodes that can be created is well in excess of the number of cells/nuclei being labeled. As the total number of barcodes possible is a combination of (1) the number of unique starting oligos and (2) the number of successive split-barcode-pool-split steps, these methods require careful balancing of cell inputs to available barcodes. Failure to do this will result in cells/nuclei sharing the same barcode.
Notable examples of in situ combinatorial approaches include a 2015 method to perform single-cell combinatorial indexing ATAC-seq (74), sci-RNA-seq (single-cell combinatorial indexing RNA sequencing) (75), split-pool ligation-based transcriptome sequencing (SPLiT-seq) (76), single-cell combinatorial indexed sequencing (SCI-seq) (77), sci-CAR (78), single-cell transposome hypersensitive sites sequencing (THS-seq) (79), single-cell DNA methylation (sci-MET) (80), droplet-based sci-ATAC (57), and single-cell Hi-C (Sci-Hi-C) (81). Recently, SplitBio announced commercial release of a single-cell RNA sequencing kit utilizing in situ combinatorial indexing.

Image-Based Approaches
In contrast to spatial transcriptomic systems that rely primarily on spatially attributable cell barcodes, image-based Genomic Cytometry techniques rely on in situ imaging of cells. These systems have been used to directly image the location of both RNA and protein in tissue sections. Because sample handling is reduced and solid tissue does not require digestion, such systems may provide the most representative method to study cellular composition in solid tissues. Example systems include the Codex and a number of highly multiplexed fluorescent in situ hybridization (FISH)-based approaches.
Highly multiplexed FISH approaches take advantage of specially designed probes combined with multiple rounds of hybridization and imaging to build anatomically localized transcript maps on tissue sections. Examples of such approaches include MERFISH (82), STARmap (83), seqFISH+ (84), and DNA microscopy (84).
The Codex system (85) can perform high-dimensional image-based protein detection with the use of oligo-conjugated antibodies. The system has been adapted for slide imaging and super-resolution imaging, and has also been shown to work with volumetric imaging. By using a series of fluorescently labeled bases and relying on the specificity of complementary binding of the fluorescently labeled base pair sequence to the oligo attached to the antibody, it is possible to perform highly multiplexed protein detection in tissue. Codex has been validated in both FFPE and frozen samples and can detect more than 40 proteins from the same individual cell.

Spatial Transcriptomics
Spatial transcriptomic workflows are complicated and require complex bioinformatics pathways. However, they can be simplified to a number of key steps: (1) a tissue section is cut, (2) the section is laid on a solid imageable surface containing immobilized region-specific capture probes (these are akin to the cellular barcode used in other methods), (3) the section is imaged, (4) the sample is then permeabilized, and finally (5) the polyadenylated mRNA is captured by spatial probes. Following this, cDNA is synthesized, and libraries are created and then sequenced. As the location of the unique oligo sequence for the capture probe can be traced back to a discrete physical location, it is possible to create a single-cell transcriptomic library that retains anatomical information. The resolution of the system is governed by both the spot size of the deposited
capture probes and the distance between the centers of adjacent capture probe spots. The very first spatial transcriptomic system (86) had a spot size of 100 μm, with a distance between spot centers of 200 μm, and an estimated 200 million capture oligos per spot. Academic systems with spot sizes approaching that of the single cell include Slide-seq (87) and HDST (88). These systems have a resolution of 10 and 2 μm, respectively. Recently, alterations to the molecular component of the Slide-seq method, including “the bead barcode synthesis, array sequencing pipeline and the enzymatic processing of cDNA”, were used to improve sensitivity by an order of magnitude (Slide-seqV2) and allow better transcript representation (89).
Commercial methods such as the Visium from 10x Genomics are currently available but not yet in widespread use. These methods are also not yet at the level of the single cell. Instead, they have spot sizes that contain many cells and have large gaps between the spots. The Visium platform uses spot sizes of 55 μm, with the separation between spot centers being 100 μm. One caveat of such systems is the need to optimize permeabilization time, which varies from sample to sample. We expect that as spatial transcriptomics is further developed, it will become a valuable method for deeply characterizing patient disease. However, until standardized protocols across a number of tissue types can be determined, the widespread clinical adoption of such systems will likely be hindered.

MULTI-OMICS
Multi-omics is the science of combining measurements afforded by the different omics modalities on the same sample. In Genomic Cytometry, multi-omics involves the simultaneous measurement of more than one class of cellular characteristics at the level of the single cell. Generally, this includes the measurement of (1) RNA with protein, (2) RNA with DNA, (3) DNA with protein, or (4) epigenetic analysis with protein. However, approaches allowing three modalities to be probed simultaneously are emerging.
Low-throughput multi-omics has been possible since the advent of FACS-based index sorting for downstream scRNA-seq applications. By simply varying the downstream genomic analysis method, it is possible to use index sorting for a multitude of single-cell multi-omic studies. This approach is often used in mid-throughput scRNA-seq plate-based assays such as Smart-Seq (33, 34), CEL-Seq2 (38), and MARS-seq (39). Even inherently multi-omic methods such as G&T-seq (90) can be combined with index sorting to add a protein dimension to the multi-omic analysis. The idea of using index sorting to boost multi-omic identification of single cells at the level of RNA, DNA, and protein has recently been leveraged in the TARGET-Seq (91) protocol.
In order to facilitate high-throughput multi-omic approaches involving protein detection, a number of oligonucleotide-conjugated antibody techniques have been developed. These include CITE-seq (63), REAP-seq (64), and Ab-seq (92). The use of oligonucleotide-labeled antibodies has allowed a substantial step forward in the ability to perform high-dimensional single-cell protein detection. By incorporating a unique oligo onto the antibody, it is possible to detect the extracellular protein expression on the cell using common 3′ scRNA-seq. The oligos attached to the antibodies contain (1) an antibody-specific base pair sequence (to identify antigen specificity), (2) a PCR handle (to allow amplification during library preparation), and (3) a poly-A sequence (to allow the antibody-conjugated oligo to be captured by the poly-dT region of the capture probe). This approach has been commercialized by both BioLegend and Becton Dickinson.
The use of oligonucleotide-conjugated antibodies has been shown to be effective at detecting many antigens. However, the technology is relatively new, and care should still be taken when designing panels. While we do not yet have guidelines for panel design, factors such as (1) epitope expression density, (2) cell numbers stained, (3) sequencing depth, (4) relative expression ratios, and (5) library complexity are likely to affect the outcome of oligo antibody characterization studies.
One of the criticisms of oligonucleotide-conjugated antibodies is that the sequence allocation required to detect all bound antibodies is dependent on the relative expression across all proteins in the panel. In panels that contain a few very high-expressing antigens, most of the sequence reads can be taken up by a small number of antigens. In this case, the dynamic range of the remaining antibodies is significantly reduced. While there are a number of ways to approach this (including antibody titration and spiking in cold, unlabeled antibody), one approach that will undoubtedly become popular is to first sort populations of cells defined by high-expressing antigens using fluorescently labeled antibodies, prior to labelling the sorted fractions with oligo-antibodies to identify the remaining antigen profiles.
This FACS-assisted sequencing approach ensures an efficient use of sequencing reads and, when combined with hashtag antibodies (93) or lipid-modified or cholesterol-modified oligos (94) (to molecularly barcode each sorted population), it becomes a powerful high-throughput multi-omics strategy. A comparison of this approach, including its impact on sequencing read allocation, is modeled in Figure 2.

APPLICATION OF GENOMIC CYTOMETRY
Since around 2015, there has been an explosion of methods aimed at single-cell genomic characterization. Alongside this, there has been an increasing number of studies making use of scRNA-seq approaches; see review (95). Indeed, following the completion of the human genome project (96), scientists have become increasingly aware that bulk genomic approaches lack the precision to unravel subtle changes at the level of the individual cell. This is critically important in diseases such as cancer and immune disorders, where a single rogue cell can be the base of disease. It is also important for the understanding of many developmental processes where single cells give rise to many cells.


Figure 2. FACS-assisted sequencing provides an efficient and targeted multi-omics approach. (A) Comparison of a standard full oligo
antibody panel (unbiased, top) with FACS-assisted sequencing (targeted, bottom), which uses fluorescently labeled antibodies
to preselect populations before oligo antibody labelling. (B) Sequencing read utilization in a mock panel. An
imaginary 30-plex panel was created, consisting of 5 high-expression epitopes, 14 medium-density epitopes, and 11 low
expressors. To assess the effect of removing the high-expressing antigens from the sequencing run, we compared the relative
proportion of sequencing reads used by each oligo tag under both conditions. Each concentric circle in the radar plots marks
1% of the sequencing reads used by a marker. In the model, with the highly expressed antigens removed
from the oligo antibody panel, low-expressing antigens were associated with higher read counts when FACS-assisted
sequencing was used. [Color figure can be viewed at wileyonlinelibrary.com]
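
The read-allocation argument behind Figure 2 can be illustrated with a small calculation. The sketch below is not the authors' model, whose parameters are not given in the caption. It only assumes that sequencing reads are distributed across oligo tags in proportion to tag abundance. The panel composition (5 high, 14 medium, 11 low expressors) comes from the caption; the relative abundance values in ABUNDANCE are hypothetical and chosen purely for illustration.

```python
from fractions import Fraction

# Panel composition taken from the Figure 2 caption (30-plex mock panel).
PANEL = {"high": 5, "medium": 14, "low": 11}

# Hypothetical relative tag abundance per marker of each class.
# These values are illustrative assumptions, not data from the article.
ABUNDANCE = {"high": 1000, "medium": 100, "low": 10}


def read_share(panel, abundance):
    """Fraction of all reads taken by ONE marker of each class,
    assuming reads are sampled in proportion to tag abundance."""
    total = sum(count * abundance[cls] for cls, count in panel.items())
    return {cls: Fraction(abundance[cls], total) for cls in panel}


# Unbiased approach: every marker in the panel is sequenced.
full = read_share(PANEL, ABUNDANCE)

# FACS-assisted approach: high expressors are resolved by fluorescence
# and left out of the oligo antibody panel.
targeted_panel = {cls: n for cls, n in PANEL.items() if cls != "high"}
targeted = read_share(targeted_panel, ABUNDANCE)

for cls in targeted:
    gain = targeted[cls] / full[cls]
    print(
        f"{cls:>6}: {float(full[cls]):.3%} -> {float(targeted[cls]):.3%} "
        f"of reads per marker ({float(gain):.2f}x)"
    )
```

Under these assumed abundances, the five high expressors consume most of the reads in the full panel, so removing them raises the per-marker share of every remaining tag by the same factor. Different abundance ratios change the size of the gain but not its direction.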

Whole transcriptome analysis, the method used by the majority of scRNA-seq studies to date, allows global profiling of many of the RNA species found in cells in an unbiased manner and without the need for a priori knowledge of the cells or the cell system to be studied. Although single-cell transcriptomic methods are not capable of amplifying every single mRNA, even relatively poorly performing methods are proving capable of accurately identifying many existing and novel cell populations (97). Furthermore, many of the original methodologies are being improved with molecular techniques aimed at increasing transcript detection sensitivity. Notable examples include Chromium V3, Smart-Seq3, Quartz-Seq2, and Seq-Well S^3 (35, 42, 98, 99).
Early studies have tended to be descriptive efforts, primarily aimed at uncovering the cellular heterogeneity and identifying unique cell types. These studies have formed the basis of the Human Cell Atlas (HCA) project (100). The HCA is a multicenter, international effort aiming to create a database of all cell types in the human body using single-cell approaches. This is an important effort and is the next logical step following on from the human genome project. Just as the human genome project provides the reference data that has allowed deep interrogation of the biology associated with genomic changes, the completion of the HCA should provide the reference data to allow classification of individual cells from their unique omic signatures. This is particularly important, as many of the databases that we are currently using to interpret single-cell genomic studies are based on bulk genomics.
While there is clear virtue in these types of studies, this shotgun approach is only designed to provide a fundamental base for more nuanced approaches. The approach required for biologically directed studies will depend on (1) the biology of the system, (2) the questions being asked, (3) the technical expertise of the scientists running the experiment, and (4) the funds available. While the decision of which technology is best suited to the biological question being asked is not always straightforward, we have outlined some of the more common questions involved in the decision-making process in Figure 3.

Figure 3. Flowchart for determining the most suitable Genomic Cytometry method for the biological question. [Color figure can be viewed at wileyonlinelibrary.com]

CONCLUSION
As we move into the age of Genomic Cytometry, we are now looking to synergistically leverage the modalities of genomics, informatics, microfluidics, and cytometry toward a single aim. To do this, we must develop ways to work in a cross-disciplinary fashion such that microfluidics and FACS-based techniques can be seamlessly integrated into molecular workflows and high-dimensional data analysis frameworks. The combination of these four traditionally distinct areas of expertise provides the foundation for the new field of Genomic Cytometry.
With Genomic Cytometry, it is possible to study cellular characteristics more deeply than ever before. The new tools emerging to allow RNA-seq, DNA-seq, epigenetic analysis, and protein detection at the level of the single cell will fundamentally change what we know about biological processes and how quickly we can deeply interrogate complex biological systems. We are beginning to see a systems-based approach that will allow us to do accurate single-cell multi-omics studies at a sensitivity, efficiency, and cost that allow true biology to be uncovered. This deep characterization is allowing us to unravel cellular complexity in highly heterogeneous samples, to find the root cause of disease, and to unravel the cellular complexity of development. Eventually, we believe it will give us the power to analyze the DNA, RNA, protein, and epigenetic states of individual cells at throughputs that will rival those of current flow cytometers.
As the field of single-cell genomics matures, and we begin to embrace the broader field of Genomic Cytometry, it will become increasingly evident that results from single-cell omics studies will need to be supported and validated by alternate systems. These systems will include traditional imaging, lineage tracing, and fluorescence cytometry methods. This will create a circle of discovery and validation that unites the fields of genomics and cytometry. For this reason, although we envision a dramatic shift in the tools
available to the traditional cytometrist, cytometry will still hold a critical place in the emerging application of single-cell genomics. It is for this reason that Genomic Cytometry will become the modality of choice for single-cell analysis.

CONFLICT OF INTEREST
The authors declared no potential conflict of interest.

AUTHOR CONTRIBUTIONS
Luciano Martelotto: Conceptualization; writing-review and editing. Fatima Valdes-Mora: Conceptualization; writing-review and editing. David Gallego-Ortega: Conceptualization; supervision; writing-original draft; writing-review and editing.

LITERATURE CITED
1. Futamura K, Sekino M, Hata A, Ikebuchi R, Nakanishi Y, Egawa G, Kabashima K, Watanabe T, Furuki M, Tomura M. Novel full-spectral flow cytometry with multiple spectrally-adjacent fluorescent proteins and fluorochromes and visualization of in vivo cellular movement. Cytometry A 2015;87(9):830–842.
2. Bendall SC, Simonds EF, Qiu P, Amir el AD, Krutzik PO, Finck R, Bruggner RV, Melamed R, Trejo A, Ornatsky OI, et al. Single-cell mass cytometry of differential immune and drug responses across a human hematopoietic continuum. Science 2011;332(6030):687–696.
3. Simoni Y, Chng MHY, Li S, Fehlings M, Newell EW. Mass cytometry: A powerful tool for dissecting the immune landscape. Curr Opin Immunol 2018;51:187–196.
4. Bandura DR, Baranov VI, Ornatsky OI, Antonov A, Kinach R, Lou X, Pavlov S, Vorobiev S, Dick JE, Tanner SD. Mass cytometry: Technique for real time single cell multitarget immunoassay based on inductively coupled plasma time-of-flight mass spectrometry. Anal Chem 2009;81(16):6813–6822.
5. Bendall SC, Nolan GP, Roederer M, Chattopadhyay PK. A deep profiler’s guide to cytometry. Trends Immunol 2012;33(7):323–332.
6. Angelo M, Bendall SC, Finck R, Hale MB, Hitzman C, Borowsky AD, Levenson RM, Lowe JB, Liu SD, Zhao S, et al. Multiplexed ion beam imaging of human breast tumors. Nat Med 2014;20(4):436–442.
7. Cornett DS, Reyzer ML, Chaurand P, Caprioli RM. MALDI imaging mass spectrometry: Molecular snapshots of biochemical systems. Nat Methods 2007;4(10):828–833.
8. Schober Y, Guenther S, Spengler B, Rompp A. Single cell matrix-assisted laser desorption/ionization mass spectrometry imaging. Anal Chem 2012;84(15):6293–6297.
9. Keren L, Bosse M, Thompson S, Risom T, Vijayaragavan K, McCaffrey E, Marquez D, Angoshtari R, Greenwald NF, Fienberg H, et al. MIBI-TOF: A multiplexed imaging platform relates cellular phenotypes and tissue structure. Sci Adv 2019;5(10):eaax5851.
10. Luecken MD, Theis FJ. Current best practices in single-cell RNA-seq analysis: A tutorial. Mol Syst Biol 2019;15(6):e8746.
11. Chen G, Ning B, Shi T. Single-cell RNA-seq technologies and related computational data analysis. Front Genet 2019;10:317.
12. Hwang B, Lee JH, Bang D. Single-cell RNA sequencing technologies and bioinformatics pipelines. Exp Mol Med 2018;50(8):96.
13. Huang X, Liu S, Wu L, Jiang M, Hou Y. High throughput single cell RNA sequencing, bioinformatics analysis and applications. Adv Exp Med Biol 2018;1068:33–43.
14. Ji F, Sadreyev RI. Single-cell RNA-seq: Introduction to bioinformatics analysis. Curr Protoc Mol Biol 2019;127(1):e92.
15. Van Dilla MA, Trujillo TT, Mullaney PF, Coulter JR. Cell microfluorometry: A method for rapid fluorescence measurement. Science 1969;163(3872):1213–1214.
16. Fulwyler MJ. Electronic separation of biological cells by volume. Science 1965;150(3698):910–911.
17. Chattopadhyay PK, Gaylord B, Palmer A, Jiang N, Raven MA, Lewis G, Reuter MA, Nur-ur Rahman AK, Price DA, Betts MR, et al. Brilliant violet fluorophores: A new class of ultrabright fluorescent compounds for immunofluorescence experiments. Cytometry A 2012;81(6):456–466.
18. Nettey L, Giles AJ, Chattopadhyay PK. OMIP-050: A 28-color/30-parameter fluorescence flow cytometry panel to enumerate and characterize cells expressing a wide array of immune checkpoint molecules. Cytometry A 2018;93(11):1094–1096.
23. Buenrostro JD, Wu B, Litzenburger UM, Ruff D, Gonzales ML, Snyder MP, Chang HY, Greenleaf WJ. Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 2015;523(7561):486–490.
24. Salzberg SL. Open questions: How many genes do we have? BMC Biol 2018;16(1):94.
25. Willyard C. New human gene tally reignites debate. Nature 2018;558(7710):354–355.
26. Ezkurdia I, Juan D, Rodriguez JM, Frankish A, Diekhans M, Harrow J, Vazquez J, Valencia A, Tress ML. Multiple evidence strands suggest that there may be as few as 19,000 human protein-coding genes. Hum Mol Genet 2014;23(22):5866–5878.
27. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, et al. The sequence of the human genome. Science 2001;291(5507):1304–1351.
28. International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature 2004;431(7011):931–945.
29. Clamp M, Fry B, Kamal M, Xie X, Cuff J, Lin MF, Kellis M, Lindblad-Toh K, Lander ES. Distinguishing protein-coding and noncoding genes in the human genome. Proc Natl Acad Sci USA 2007;104(49):19428–19433.
30. Ponomarenko EA, Poverennaya EV, Ilgisonis EV, Pyatnitskiy MA, Kopylov AT, Zgoda VG, Lisitsa AV, Archakov AI. The size of the human proteome: The width and depth. Int J Anal Chem 2016;2016:7436849.
31. Engel P, Boumsell L, Balderas R, Bensussan A, Gattei V, Horejsi V, Jin BQ, Malavasi F, Mortari F, Schwartz-Albiez R, et al. CD nomenclature 2015: Human leukocyte differentiation antigen workshops as a driving force in immunology. J Immunol 2015;195(10):4555–4563.
32. Allis CD, Jenuwein T. The molecular hallmarks of epigenetic control. Nat Rev Genet 2016;17(8):487–500.
33. Ramskold D, Luo S, Wang YC, Li R, Deng Q, Faridani OR, Daniels GA, Khrebtukova I, Loring JF, Laurent LC, et al. Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nat Biotechnol 2012;30(8):777–782.
34. Picelli S, Faridani OR, Björklund ÅK, Winberg G, Sagasser S, Sandberg R. Full-length RNA-seq from single cells using Smart-Seq2. Nat Protoc 2014;9(1):171–181.
35. Hagemann-Jensen M, Ziegenhain C, Chen P, Ramsköld D, Hendriks G-J, Larsson AJM, Faridani OR, Sandberg R. Single-cell RNA counting at allele- and isoform-resolution using Smart-Seq3. bioRxiv 2019:817924.
36. Islam S, Kjallquist U, Moliner A, Zajac P, Fan JB, Lonnerberg P, Linnarsson S. Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Genome Res 2011;21(7):1160–1167.
37. Hochgerner H, Lönnerberg P, Hodge R, Mikes J, Heskol A, Hubschle H, Lin P, Picelli S, la Manno G, Ratz M, et al. STRT-seq-2i: Dual-index 5′ single cell and nucleus RNA-seq on an addressable microwell array. Sci Rep 2017;7(1):16327.
38. Hashimshony T, Senderovich N, Avital G, Klochendler A, de Leeuw Y, Anavy L, Gennert D, Li S, Livak KJ, Rozenblatt-Rosen O, et al. CEL-Seq2: Sensitive highly-multiplexed single-cell RNA-Seq. Genome Biol 2016;17:77.
39. Jaitin DA, Kenigsberg E, Keren-Shaul H, Elefant N, Paul F, Zaretsky I, Mildner A, Cohen N, Jung S, Tanay A, et al. Massively parallel single-cell RNA-seq for marker-free decomposition of tissues into cell types. Science 2014;343(6172):776–779.
40. Bagnoli JW, Ziegenhain C, Janjic A, Wange LE, Vieth B, Parekh S, Geuder J, Hellmann I, Enard W. Sensitive and powerful single-cell RNA sequencing using mcSCRB-seq. Nat Commun 2018;9(1):2937.
41. Sasagawa Y, Nikaido I, Hayashi T, Danno H, Uno KD, Imai T, Ueda HR. Quartz-Seq: A highly reproducible and sensitive single-cell RNA sequencing method, reveals non-genetic gene-expression heterogeneity. Genome Biol 2013;14(4):3097.
42. Sasagawa Y, Danno H, Takada H, Ebisawa M, Tanaka K, Hayashi T, Kurisaki A, Nikaido I. Quartz-Seq2: A high-throughput single-cell RNA-sequencing method that effectively uses limited sequence reads. Genome Biol 2018;19(1):29.
43. Smallwood SA, Lee HJ, Angermueller C, Krueger F, Saadeh H, Peat J, Andrews SR, Stegle O, Reik W, Kelsey G. Single-cell genome-wide bisulfite sequencing for assessing epigenetic heterogeneity. Nat Methods 2014;11:817–820.
44. Stevens TJ, Lando D, Basu S, Atkinson LP, Cao Y, Lee SF, Leeb M, Wohlfahrt KJ, Boucher W, O’Shaughnessy-Kirwan A, et al. 3D structures of individual mammalian genomes studied by single-cell Hi-C. Nature 2017;544:59–64.
45. Sackmann EK, Fulton AL, Beebe DJ. The present and future role of microfluidics in biomedical research. Nature 2014;507(7491):181–189.
46. Kulasinghe A, Wu H, Punyadeera C, Warkiani ME. The use of microfluidic technology for cancer applications and liquid biopsy. Micromachines (Basel) 2018;9(8):19.
47. Guo MT, Rotem A, Heyman JA, Weitz DA. Droplet microfluidics for high-throughput biological assays. Lab Chip 2012;12(12):2146–2155.
48. Velve-Casquillas G, le Berre M, Piel M, Tran PT. Microfluidic tools for cell biological research. Nano Today 2010;5(1):28–47.
49. Collins DJ, Neild A, deMello A, Liu AQ, Ai Y. The Poisson distribution and beyond: Methods for microfluidic droplet production and single cell encapsulation. Lab Chip 2015;15(17):3439–3459.
50. Macosko EZ, Basu A, Satija R, Nemesh J, Shekhar K, Goldman M, Tirosh I,
Bialas AR, Kamitaki N, Martersteck EM, et al. Highly parallel genome-wide expres-
19. Liechti T, Roederer M. OMIP-060-30-parameter flow cytometry panel to assess T sion profiling of individual cells using Nanoliter droplets. Cell 2015;161(5):
cell effector functions and regulatory T cells. Cytometry A 2019;95:1129–1134. 1202–1214.
20. Liechti T, Roederer M. OMIP-051 – 28-color flow cytometry panel to characterize 51. Klein AM, Mazutis L, Akartuna I, Tallapragada N, Veres A, Li V, Peshkin L,
B cells and myeloid cells. Cytometry A 2019;95(2):150–155. Weitz DA, Kirschner MW. Droplet barcoding for single-cell transcriptomics
21. Liechti T, Roederer M. OMIP-058: 30-parameter flow cytometry panel to charac- applied to embryonic stem cells. Cell 2015;161(5):1187–1201.
terize iNKT, NK, unconventional and conventional T cells. Cytometry A 2019;95 52. Salomon R, Kaczorowski D, Valdes-Mora F, Nordon RE, Neild A, Farbehi N,
(9):946–951. Bartonicek N, Gallego-Ortega D. Droplet-based single cell RNAseq tools: A practi-
22. Mair F, Prlic M. OMIP-044: 28-color immunophenotyping of the human dendritic cal guide. Lab Chip 2019;19:1706–1727.
cell compartment. Cytometry A 2018;93(4):402–405.

49
REVIEW ARTICLE

53. Habib N, Avraham-Davidi I, Basu A, Burks T, Shekhar K, Hofree M, mouse brain and spinal cord with split-pool barcoding. Science 2018;360(6385):
Choudhury SR, Aguet F, Gelfand E, Ardlie K, et al. Massively parallel single- 176–182.
nucleus RNA-seq with DroNc-seq. Nat Methods 2017;14(10):955–958. 77. Vitak SA, Torkenczy KA, Rosenkrantz JL, Fields AJ, Christiansen L, Wong MH,
54. Zheng GX, Terry JM, Belgrader P, Ryvkin P, Bent ZW, Wilson R, Ziraldo SB, Carbone L, Steemers FJ, Adey A. Sequencing thousands of single-cell genomes with
Wheeler TD, McDermott GP, Zhu J, et al. Massively parallel digital transcriptional combinatorial indexing. Nat Methods 2017;14(3):302–308.
profiling of single cells. Nat Commun 2017;8:14049. 78. Cao J, Cusanovich DA, Ramani V, Aghamirzaie D, Pliner HA, Hill AJ, Daza RM,
55. Rotem A, Ram O, Shoresh N, Sperling RA, Goren A, Weitz DA, Bernstein BE. Sin- McFaline-Figueroa JL, Packer JS, Christiansen L, et al. Joint profiling of chromatin
gle-cell ChIP-seq reveals cell subpopulations defined by chromatin state. Nat Bio- accessibility and gene expression in thousands of single cells. Science 2018;361
technol 2015;33(11):1165–1172. (6409):1380–1385.
56. Grosselin K, Durand A, Marsolier J, Poitou A, Marangoni E, Nemati F, 79. Lake BB, Chen S, Sos BC, Fan J, Kaeser GE, Yung YC, Duong TE, Gao D, Chun J,
Dahmani A, Lameiras S, Reyal F, Frenoy O, et al. High-throughput single-cell Kharchenko PV, et al. Integrative single-cell analysis of transcriptional and epige-
ChIP-seq identifies heterogeneity of chromatin states in breast cancer. Nat Genet netic states in the human adult brain. Nat Biotechnol 2018;36(1):70–80.
2019;51(6):1060–1066. 80. Mulqueen RM, Pokholok D, Norberg SJ, Torkenczy KA, Fields AJ, Sun D,
57. Lareau CA, Duarte FM, Chew JG, Kartha VK, Burkett ZD, Kohlway AS, Sinnamon JR, Shendure J, Trapnell C, O’Roak BJ, et al. Highly scalable generation
Pokholok D, Aryee MJ, Steemers FJ, Lebofsky R, et al. Droplet-based combinatorial of DNA methylation profiles in single cells. Nat Biotechnol 2018;36(5):428–431.
indexing for massive-scale single-cell chromatin accessibility. Nat Biotechnol 2019; 81. Ramani V, Deng X, Qiu R, Lee C, Disteche CM, Noble WS, Shendure J, Duan Z.
37(8):916–924. Sci-Hi-C: A single-cell Hi-C method for mapping 3D genome organization in large
58. Satpathy AT, Granja JM, Yost KE, Qi Y, Meschi F, McDermott GP, Olsen BN, number of single cells. Methods 2019;170:61–68.
Mumbach MR, Pierce SE, Corces MR, et al. Massively parallel single-cell chroma- 82. Chen KH, Boettiger AN, Moffitt JR, Wang S, Zhuang X. RNA imaging. Spatially
tin landscapes of human immune cell development and intratumoral T cell exhaus- resolved, highly multiplexed RNA profiling in single cells. Science 2015;348(6233):
tion. Nat Biotechnol 2019;37(8):925–936. aaa6090.
59. Zheng M, Tian SZ, Capurso D, Kim M, Maurya R, Lee B, Piecuch E, Gong L, 83. Wang X, Allen WE, Wright MA, Sylwestrak EL, Samusik N, Vesuna S, Evans K,
Zhu JJ, Li Z, et al. Multiplex chromatin interactions with single-molecule precision. Liu C, Ramakrishnan C, Liu J, et al. Three-dimensional intact-tissue sequencing of
Nature 2019;566(7745):558–562. single-cell transcriptional states. Science 2018;361(6400):eaat5691.
60. Pellegrino M, Sciambi A, Treusch S, Durruthy-Durruthy R, Gokhale K, Jacob J, 84. Eng CL, Lawson M, Zhu Q, Dries R, Koulena N, Takei Y, Yun J, Cronin C,
Chen TX, Geis JA, Oldham W, Matthews J, et al. High-throughput single-cell Karp C, Yuan GC, et al. Transcriptome-scale super-resolved imaging in tissues by
DNA sequencing of acute myeloid leukemia tumors with droplet microfluidics. RNA seqFISH. Nature 2019;568(7751):235–239.
Genome Res 2018;28(9):1345–1352.
85. Goltsev Y, Samusik N, Kennedy-Darling J, Bhate S, Hale M, Vazquez G,
61. Velazquez-Villarreal EI, Maheshwari S, Sorenson J, Fiddes IT, Kumar V, Yin Y, Black S, Nolan GP. Deep profiling of mouse splenic architecture with CODEX
Webb M, Catalanotti C, Grigorova M, Edwards PA. Resolving sub-clonal heteroge- multiplexed imaging. Cell 2018;174(4):968–981.e15.
neity within cell-line growths by single cell sequencing genomic DNA. bioRxiv
2019:757211. 86. Stahl PL, Salmen F, Vickovic S, Lundmark A, Navarro JF, Magnusson J,
Giacomello S, Asp M, Westholm JO, Huss M, et al. Visualization and analysis of
62. Hosokawa M, Nishikawa Y, Kogawa M, Takeyama H. Massively parallel whole gene expression in tissue sections by spatial transcriptomics. Science 2016;353
genome amplification for single-cell sequencing using droplet microfluidics. Sci (6294):78–82.
Rep 2017;7:5199.
87. Rodriques SG, Stickels RR, Goeva A, Martin CA, Murray E, Vanderburg CR,
63. Stoeckius M, Hafemeister C, Stephenson W, Houck-Loomis B, Chattopadhyay PK, Welch J, Chen LM, Chen F, Macosko EZ. Slide-seq: A scalable technology for mea-
Swerdlow H, Satija R, Smibert P. Simultaneous epitope and transcriptome mea- suring genome-wide expression at high spatial resolution. Science 2019;363(6434):
surement in single cells. Nat Methods 2017;14(9):865. 1463–1467.
64. Peterson VM, Zhang KX, Kumar N, Wong J, Li L, Wilson DC, Moore R, 88. Vickovic S, Eraslan G, Salmen F, Klughammer J, Stenbeck L, Schapiro D, Ajio T,
McClanahan TK, Sadekova S, Klappenbach JA. Multiplexed quantification of pro- Bonneau R, Bergenstrahle L, Navarro JF, et al. High-definition spatial trans-
teins and transcripts in single cells. Nat Biotechnol 2017;35(10):936–939. criptomics for in situ tissue profiling. Nat Methods 2019;16(10):987–990.
65. Mimitou EP, Cheng A, Montalbano A, Hao S, Stoeckius M, Legut M, Roush T, 89. Stickels RR, Murray E, Kumar P, Li J, Marshall JL, Di Bella D, Arlotta P, Macosko
Herrera A, Papalexi E, Ouyang Z, et al. Multiplexed detection of proteins, trans- EZ, Chen F. Sensitive spatial genome wide expression profiling at cellular resolu-
criptomes, clonotypes and CRISPR perturbations in single cells. Nat Methods tion. bioRxiv 2020:2020.03.12.989806.
2019;16(5):409–412.
90. Macaulay IC, Haerty W, Kumar P, Li YI, Hu TX, Teng MJ, Goolam M, Saurat N,
66. Chen S, Lake BB, Zhang K. Linking transcriptome and chromatin accessibility in Coupland P, Shirley LM, et al. G&T-seq: Parallel sequencing of single-cell genomes
nanoliter droplets for single-cell sequencing. bioRxiv 2019:692608. and transcriptomes. Nat Methods 2015;12(6):519–522.
67. Li H, Courtois ET, Sengupta D, Tan Y, Chen KH, Goh JJL, Kong SL, Chua C, Hon 91. Rodriguez-Meira A, Buck G, Clark SA, Povinelli BJ, Alcolea V, Louka E,
LK, Tan WS, et al. Reference component analysis of single-cell transcriptomes elu- McGowan S, Hamblin A, Sousos N, Barkas N, et al. Unravelling Intratumoral het-
cidates cellular heterogeneity in human colorectal tumors. Nat Genet 2017;49(5): erogeneity through high-sensitivity single-cell mutational analysis and parallel
708–718. RNA sequencing. Mol Cell 2019;73(6):1292.
68. Proserpio V, Piccolo A, Haim-Vilmovsky L, Kar G, Lonnberg T, Svensson V, 92. Shahi P, Kim SC, Haliburton JR, Gartner ZJ, Abate AR. Abseq: Ultrahigh-
Pramanik J, Natarajan KN, Zhai W, Zhang X, et al. Single-cell analysis of CD4+ T- throughput single cell protein profiling with droplet microfluidic barcoding. Sci
cell differentiation reveals three major cell states and progressive acceleration of Rep 2017;7:44447.
proliferation. Genome Biol 2016;17:103.
93. Stoeckius M, Zheng S, Houck-Loomis B, Hao S, Yeung BZ, Mauck WM,
69. Szulwach KE, Chen P, Wang X, Wang J, Weaver LS, Gonzales ML, Sun G, Unger Smibert P, Satija R. Cell hashing with barcoded antibodies enables multiplexing
MA, Ramakrishnan R. Single-cell genetic analysis using automated microfluidics to and doublet detection for single cell genomics. Genome Biol 2018;19(1):224.
resolve somatic mosaicism. PLoS One 2015;10(8):e0135007.
94. McGinnis CS, Patterson DM, Winkler J, Conrad DN, Hein MY, Srivastava V,
70. Xin Y, Kim J, Ni M, Wei Y, Okamoto H, Lee J, Adler C, Cavino K, Murphy AJ, Hu JL, Murrow LM, Weissman JS, Werb Z, et al. MULTI-seq: Sample multiplexing
Yancopoulos GD, et al. Use of the Fluidigm C1 platform for RNA sequencing of for single-cell RNA sequencing using lipid-tagged indices. Nat Methods 2019;16
single mouse pancreatic islet cells. Proc Natl Acad Sci USA 2016;113(12): (7):619–626.
3293–3298.
95. Svensson V, da Veiga Beltrame E, Pachter L. A curated database reveals trends in
71. Fan HC, Fu GK, Fodor SP. Expression profiling. Combinatorial labeling of single single-cell transcriptomics. bioRxiv 2019:742304.
cells for gene expression cytometry. Science 2015;347(6222):1258367.
96. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K,
72. Gierahn TM, Wadsworth MH II, Hughes TK, Bryson BD, Butler A, Satija R, Dewar K, Doyle M, FitzHugh W, et al. Initial sequencing and analysis of the
Fortune S, Love JC, Shalek AK. Seq-well: Portable, low-cost RNA sequencing of human genome. Nature 2001;409(6822):860–921.
single cells at high throughput. Nat Methods 2017;14(4):395–398.
97. Mereu E, Lafzi A, Moutinho C, Ziegenhain C, MacCarthy DJ, Alvarez A, Batlle E,
73. Mair F, Erickson JR, Voillet V, Simoni Y, Bi T, Tyznik AJ, Martin J, Gottardo R, Newell Sagar, Grün D, Lau JK, et al. Benchmarking single-cell RNA sequencing protocols
EW, Prlic M. A targeted multi-omic analysis approach measures protein expression for cell atlas projects. bioRxiv 2019:630087.
and low abundance transcripts on the single cell level. bioRxiv 2019:700534.
98. Hughes TK, Wadsworth MH, Gierahn TM, Do T, Weiss D, Andrade PR, Ma F, de
74. Cusanovich DA, Daza R, Adey A, Pliner HA, Christiansen L, Gunderson KL, Andrade Silva BJ, Shao S, Tsoi LC, et al. Highly efficient, massively-parallel single-
Steemers FJ, Trapnell C, Shendure J. Multiplex single cell profiling of chromatin cell RNA-seq reveals cellular states and molecular features of human skin pathol-
accessibility by combinatorial cellular indexing. Science 2015;348(6237):910–914. ogy. bioRxiv 2019:689273.
75. Cao J, Packer JS, Ramani V, Cusanovich DA, Huynh C, Daza R, Qiu X, Lee C, 99. Ding J, Adiconis X, Simmons SK, Kowalczyk MS, Hession CC, Marjanovic ND,
Furlan SN, Steemers FJ, et al. Comprehensive single-cell transcriptional profiling of Hughes TK, Wadsworth MH, Burks T, Nguyen LT, et al. Systematic comparative
a multicellular organism. Science 2017;357(6352):661–667. analysis of single cell RNA-sequencing methods. bioRxiv 2019:632216.
76. Rosenberg AB, Roco CM, Muscat RA, Kuchina A, Sample P, Yao Z, Graybuck LT, 100. Rozenblatt-Rosen O, Stubbington MJT, Regev A, Teichmann SA. The human cell
Peeler DJ, Mukherjee S, Chen W, et al. Single-cell profiling of the developing atlas: From vision to reality. Nature 2017;550(7677):451–453.

50
REVIEW ARTICLE

Computational approaches for high-throughput single-cell data analysis
Helena Todorov1,2,3 and Yvan Saeys1,2
1 Data Mining and Modelling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium
2 Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Belgium
3 Centre International de Recherche en Infectiologie, Inserm, U1111, Université Claude Bernard Lyon 1, CNRS, UMR5308, École Normale Supérieure de Lyon, Univ Lyon, France

During the past decade, the number of novel technologies to interrogate biological systems at the single-cell level has skyrocketed. Numerous approaches for measuring the proteome, genome, transcriptome and epigenome at the single-cell level have been pioneered, using a variety of technologies. All these methods have one thing in common: they generate large and high-dimensional datasets that require advanced computational modelling tools to highlight and interpret interesting patterns in these data, potentially leading to novel biological insights and hypotheses. In this work, we provide an overview of the computational approaches used to interpret various types of single-cell data in an automated and unbiased way.

Keywords
bioinformatics; computational tools; proteome; single cell; transcriptome

Correspondence
Y. Saeys, Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Technologiepark 927, 9052 Gent, Belgium
Fax: +32 9 221 76 73
Tel: +32 9 331 37 40
E-mail: yvan.saeys@ugent.be

(Received 22 February 2018, revised 4 June 2018, accepted 25 July 2018)

doi:10.1111/febs.14613

Abbreviations
DE, differential expression; HVGs, highly variable genes; scRNA-Seq, single-cell RNA sequencing; TI, trajectory inference.

Introduction
Single-cell technologies are currently revolutionising the way life scientists are studying biological systems from different perspectives. Three major classes of technologies can be distinguished: imaging-based techniques, techniques based on flow or mass cytometry and techniques based on next-generation sequencing. However, this is only a rough classification, as some recent innovations combine elements of different classes of techniques. While many of the early data preprocessing steps are specific to each class of techniques, several downstream computational analyses are generally applicable to any form of single-cell data, and one of the goals of this work is to provide a unifying overview of these generally applicable approaches.

Historically, microscopy-based techniques were the first methodology to study organisms at single-cell resolution [1]. While initially consisting largely of manual labour and thus being very low-throughput, automated image acquisition and segmentation have enabled high-throughput image-based screening, by analysing up to hundreds of thousands of cells in single-well plates [2]. Similarly, many other microscopy-based techniques allow the extraction of information at the single-cell level, although at a lower throughput. These include most types of light and electron microscopy, with a broad variety of applications. Common to all these image-based approaches is the fact that advanced image-analysis pipelines are needed to arrive at single-cell resolution [3]. A typical image processing pipeline first performs segmentation of the single cells from the image, followed by a feature extraction step, typically extracting several hundreds of features for each
individual cell [4]. In comparison to other single-cell approaches where cells are dissociated in suspension, a major advantage of image-based single-cell profiling methodology is that it inherently provides the user with two- or three-dimensional spatial information, as knowing a cell’s spatial context is often the key to discover novel biological findings.

Flow cytometry allows profiling and analysing cells in a high-throughput fashion and is based on passing cells through a laser beam in a rapidly flowing fluid stream. This core technology is in essence very similar to the original design from the late 1960s [5], illustrating the robustness of the technology [4,6]. The field of flow cytometry has emerged as a powerful methodology for single-cell analysis due to continuous innovations such as (a) multicolour assays enabling the measurement of a large number of proteins simultaneously [7], (b) spectral flow cytometry [8] in which classical mirrors, optics and detectors are replaced by dispersive optics and a linear array of detectors allowing highly complex fluorochrome combinations, (c) imaging flow cytometry [9] combining flow cytometry and microscopy for high-throughput imaging of single cells, and (d) acoustic-based focusing and sorting [10]. In addition, other technological advances such as mass cytometry have replaced the fluorescent labelling and readout using optics by labelling using heavy isotopes, and subsequent readout by mass spectrometry [11]. This eliminates the problem of spectral overlap in classical flow cytometry, allowing the theoretical measurement of up to 100 proteins simultaneously. Mass cytometry can also be performed on tissue slices, thereby scanning the tissue spot-by-spot and performing a single experiment per spot. This approach, named imaging mass cytometry, allows performing spatial proteomics in a high-throughput fashion [12]. The ability to measure increasing amounts of proteins simultaneously [7] complicates the analysis of this type of data, which can no longer be analysed manually as was done with datasets containing a few markers per cell, but needs new computational approaches to correctly identify cell populations [13].

Recent developments in microvolume sequencing have led to a new wave of single-cell ‘-omics’ profiling technologies [14–18], permitting the quantification of whole genomes, epigenomes and transcriptomes at the single-cell level. Novel computational tools are being developed in order to deal with the continuously increasing dimensionality of these datasets, since a single experiment can quantify molecular characteristics of up to tens of thousands of cells, measuring tens of thousands of parameters (e.g. transcripts in the case of single-cell transcriptomics). A high level of resolution is provided by single-cell omics tools, as they aim to sequence all of the cell’s content, instead of focusing on a set of user-defined targets as is done in cytometry. This allows performing novel types of analyses, such as studying the heterogeneity of cell populations in much greater detail, identifying rare cell types, and studying the dynamics of cellular systems. Furthermore, the field continues to evolve by combining single-cell RNA sequencing with other technologies such as spatial transcriptomics [19] and CRISPR-mediated knockout screens (Perturb-Seq [20]/CRISP-seq [21]). Recent approaches combine transcriptomics with other types of omics data at a single-cell resolution such as single-cell proteomics (CITE-seq [22]/REAP-seq [23]), single-cell genomics (G&T-seq [24]) and single-cell methylomics (scM&T-seq [25]). These emerging ‘single-cell multi-omics’ technologies [26] integrate several types of measurements on the same single cell and are likely to be part of the everyday methodology of molecular biologists in the future.

While all techniques described above provide the user with information at single-cell level, the throughput, resolution, cost and type of information acquired differ drastically between technologies. We will take a computational perspective here, and compare the main dataset characteristics for the three major classes of single-cell data introduced above. Classical imaging-based techniques typically offer a low throughput, measuring a few hundreds of cells, while more advanced high-content screening methods allow high-throughput measurements of hundreds of thousands to millions of cells. When applying segmentation and feature extraction, for example using popular pipelines such as CELLPROFILER [27], almost a thousand image-derived features can be extracted per cell. However, many of those capture redundant information and thus are very correlated. Flow and mass cytometry allow measuring cells at high throughput, up to millions of cells for classical flow cytometry. Only a few tens of parameters can be quantified simultaneously per single cell, but these parameters often represent very complementary information, as they are manually chosen by an expert. Single-cell omics technologies offer medium throughput, measuring thousands to tens of thousands of cells in a single run. However, these data are very rich in information, measuring thousands of transcripts in the case of single-cell transcriptomics.

While the profiling methodology and dataset characteristics in each of these technologies are very different, many of the applications and computational workflows are quite similar. In the remainder of the paper, we will discuss the differences and commonalities in computational workflows for the different applications.

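The redundancy among image-derived features noted above can be made concrete with a small sketch. It is not taken from any of the cited pipelines: the feature names, the synthetic per-cell values and the 0.9 cut-off are assumptions chosen purely for illustration, and a greedy correlation filter is only one of several ways such redundancy could be reduced.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant feature carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def prune_correlated(features, threshold=0.9):
    """Greedily keep features whose |r| with every kept feature is <= threshold.

    `features` maps a feature name to its list of per-cell values.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Synthetic stand-in for a per-cell feature table (hypothetical feature names).
rng = random.Random(0)
n_cells = 500
area = [rng.uniform(50.0, 150.0) for _ in range(n_cells)]
features = {
    "area": area,
    # Perimeter grows with area, so it is largely redundant with it.
    "perimeter": [4.0 * math.sqrt(a) + rng.gauss(0.0, 0.2) for a in area],
    # Intensity is drawn independently of cell size.
    "mean_intensity": [rng.gauss(0.5, 0.1) for _ in range(n_cells)],
}

kept = prune_correlated(features, threshold=0.9)
print(kept)
```

In this toy table the perimeter feature is dropped because it is almost perfectly correlated with area, while the independently drawn intensity feature is retained. With close to a thousand real features per cell, the same idea would normally be applied through a numerical library rather than pure Python loops.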
Computational workflow for single-cell experiments

Regardless of the specific technology used to generate a single-cell dataset, a common pipeline can be devised, starting with the experimental design, data generation, technology-specific preprocessing, quality control and subsequent data analysis (Fig. 1). A detailed design of the experiment is a crucial step towards minimising technical variation and improving scientific reproducibility. This not only includes standardisation of experimental protocols and equipment, but also careful planning and consultation with statisticians and/or bioinformaticians regarding sample size, specific setup related to the biological questions that should be answered or specific types of computational analyses that should be carried out. Subsequently the experiment should be performed, ensuring that standardised procedures are followed for sample preparation, handling equipment and data acquisition while appropriate controls are added at multiple steps of the experiments.

Fig. 1. The computational workflow for single-cell experiments detailed in steps.

The next step in the pipeline is the preprocessing and quality control. This step will likely take a considerable amount of time, as it is crucial to start from good quality data if good quality results are desired. Therefore, it is important to perform technology-specific preprocessing steps, a topic that will be covered in the section ‘Data preprocessing and quality control’. After data preprocessing, an initial exploration of the data can be performed using visualisation techniques, in order to perform early detection of any possible batch effects or unexpected subpopulations. Applying visualisation techniques may also help to visualise the population structure within samples, and to compare this structure between different samples. In this step, interesting populations or trends may be observed that require further investigation.

Next, several types of in-depth analyses can be performed, in most cases starting with an automated clustering of the cells into cell types. This clustering allows quantifying and comparing different cell types in the samples and identifying new cell types or transition states. Novel computational approaches to model gradual transitions between cell states (trajectory inference) can also be applied at this stage. Other alternatives include specific predictive modelling approaches such as classification, regression and survival analysis modelling. All these approaches have the potential to extract novel biomarkers from single-cell data, with important diagnostic and therapeutic potential.

Finally, more advanced computational approaches can be applied to single-cell omics data. The correlations in gene expression within cells can be studied to assess gene regulatory networks (network inference). In the case of multi-omics datasets, data integration
approaches can be used to combine the information on single-cell mechanisms.

Data preprocessing and quality control

Single-cell imaging
The preprocessing of single-cell imaging data usually starts by accounting for batch effects through illumination correction, and image-wise processing such as noise removal, aligning or cropping [28,29]. This procedure is commonly followed by the segmentation of the individual cells within the images, and finally by a feature extraction process that yields a vector of numeric features for each individual cell, usually in a tabular format.

CELLPROFILER [27] is widely used to extract numerical features from two-dimensional microscopy images (such as in high-content screening assays). The main difficulty faced by CELLPROFILER is the segmentation of the cells or objects of interest present in the image. CELLPROFILER contains several fast algorithms that can extract well-separated objects; however, in many cases, these objects appear clumped, hindering their segmentation and making it prone to both false negatives (when the borders between objects cannot be found) and false positives (when the sensitivity of the detection is too high). In order to deal with this difficulty, CELLPROFILER also provides a more complex segmentation algorithm that follows a hierarchical process: first, it finds primary level objects that are typically well-separated (such as cell nuclei, visible on DNA-stain channels); then, the boundaries of secondary level objects (such as cell edges) are searched around the primary level objects.

However, it is also possible that the primary level objects appear clumped, which is why CELLPROFILER divides their detection into several steps following the guidelines of previously published algorithms [30–34]. Clumped objects are first detected, segmented and separated by dividing lines, thus avoiding false negatives. Finally, some of the objects are either removed or merged to reduce the false positive rate. Once the primary level objects are properly detected, it becomes simpler to find secondary level objects around them. CELLPROFILER provides an improved algorithm to properly detect the borders even when the objects are clumped against each other. Once the objects have been segmented, multiple features can be extracted from each of them in a per-channel basis (area, shape, intensity, texture, etc.) or at the whole-image level (number of cells, background intensity, etc.). CELLPROFILER has a modular structure that allows the user to select and configure the individual algorithms that will be applied, which in turn defines the specific preprocessing applied and the features that are obtained at the end of the pipeline. The resulting features can later be used for visualisation, clustering or differential downstream analyses for instance.

Flow/mass cytometry
In conventional flow cytometry, the first preprocessing step is typically compensation of the spectral overlap, to correct for spillover of the fluorescent signal into neighbouring channels. This is typically accounted for in the experimental procedure, by measuring the fluorescence of single stains in the different channels, allowing for the calculation of a compensation matrix. In mass cytometry, this issue is largely avoided by using rare isotopes instead of light measurements, although the measurement of certain isotopes can still be polluted due to metal impurity levels, oxidation and abundance sensitivity [35]. Mass cytometry panels should therefore be designed with caution by pairing strong intensity markers with less sensitive channels in order to avoid interference between channels [36]. The data is then transformed through a biexponential or hyperbolic arcsine transformation, which improves the separation between negative and positive cells for the different markers. Fluctuations in measurements can also be caused by an unsteady flow rate. Typically, up to 10 000 cells are measured per second at a steady rate in flow cytometry. Mass cytometry has a slightly lower throughput, measuring a few thousand cells per second. However, obstructions in the fluid stream and manual interventions can disturb the flow, which also impacts the amount of protein levels measured. To remove these technical artefacts, the data needs to be either manually gated against time or screened by tools such as FLOWCLEAN [37], FLOWQ [38] and FLOWAI [39], which can automatically identify and remove sections in which the flow was perturbed.

The acquisition level of cytometers can slightly change from one day to another, or even within hours. The use of control tubes to calibrate the machine before running an experiment can help to make different samples more comparable, but batch effects are often observed between two experiments. The resulting slight shift in protein expression can be accounted for manually, by shifting the gates of every sample that differs, or in an automated way using the FLOWSTATS [40] package. In mass cytometry, beads are commonly used in the experiments, allowing normalisation of the data based on the signal of these beads to have more
comparable samples. Some markers can also be used to barcode cells, and then pool several samples
together, to avoid technical bias between different experimental conditions. When performing
experiments on different days, it may be advisable to include additional control samples, such as an
aliquot from the same sample that is taken along all different experiment days, in order to allow
normalisation between experiment days later on. Once batch effects have been accounted for, debris,
doublets and other low quality cells can be removed either by manual gating or using OPENCYTO [41],
or FLOWDENSITY [42].

As flow cytometry allows the measurement of proteins at the single-cell level while preserving the
integrity of the cells, it is sometimes used to sort specific cells into wells before sequencing
their transcriptome. The cells can either be sorted by cell population, based on a set of common
markers, or index-sorted, in which case single cells are sorted into wells and barcoded, so that
their protein expression profile is kept. In this case, doublets and empty wells might occur, which
should be carefully removed from the analysis before any further processing step.

Single-cell omics

Preprocessing single-cell omics data based on NGS technologies further builds on the wide
availability of NGS preprocessing tools that are already available from experiments on bulk RNA or
DNA. However, single-cell omics technologies lead to a number of additional challenges when going
through the process from the individual reads to the mapped genomes or transcriptomes. We will focus
here more specifically on methods for single-cell transcriptomics, as this is the most widely used
type of single-cell omics data at present. Several scRNA-Seq protocols were developed, usually
focusing either on sequencing a large number of cells, or a high amount of genes at an increased
sequencing depth [43]. Due to the low amount of transcripts in the cells, scRNA-Seq data usually
contain a lot of technical variance, requiring specific computational tools to perform quality
control, normalisation and downstream analyses [44–47].

When performing a computational analysis on scRNA-Seq data coming from multiple experiments, batch
effects can arise, leading to an increased interexperimental variability. Two recently published
algorithms can be used in order to reduce batch effects. These algorithms either identify a gene
correlation structure [48], or a subset of cells coming from the same population [49], that are
shared between the datasets coming from different experiments. Proper data transformation is then
applied to align similar cell populations, resulting in more consistent datasets that can be further
analysed together.

Several quality control metrics, such as the library size and the percentage of mitochondrial genes,
are used to filter out abnormal cells, in order to reduce the technical variance of the data [50].
Additionally, a great part of intercellular variability can be caused by the cell cycle, and it is
up to the user to decide whether this variability should be removed from the data or not. Cyclone
[51] is a method that can be used to predict the cell cycle stage, which can subsequently be used to
either remove cycling cells, or tag them so that they can be easily identified later in the
analysis. F-scLVM [52] is another algorithm that identifies the amount of variability across the
expression of each gene that is due to cell cycle differences. It can be used to infer ‘corrected’
gene expression values, removing the effect of the cell cycle.

The next step in the process regards the normalisation of the count data, since a large part of the
observed variability can be due to differences in size, viability, capturing efficiency and
amplification biases between cells. Some methods aim to standardise the total number of reads per
cell (RPKM [53], TPM [54], downsampling) or proportions of the total number of reads per cell (UQ,
full quantile [55]). However, these methods can be seriously impacted by false negative counts [56].
Indeed, the number of transcripts in a cell being very low for certain genes, there is a high
probability that these transcripts will be missed, resulting in a zero count in the final expression
data. These missed transcripts are called dropouts, and lead to a high technical variance that can
affect the final results. High-throughput scRNA-Seq protocols typically show higher dropout rates
[43], but high amounts of sequenced cells can help to infer dropout probabilities. ZIFA [57] is a
method which identifies zero counts that are most likely resulting from dropout events, and gives
less weight to these counts. ZINB-WAVE [58] is another method which not only assesses the
probability for a zero to be a dropout based on the sequencing depth, but also accounts for batch
effects between samples, and computes global-scaling normalisation factors, which allow it to be
used directly on non-normalised data.

Some methods rely on spike-ins to distinguish technical variability from biologically relevant
changes in gene expression [59] (BASICS [60], GRM [61], SAMSTRT [62]). Spike-ins are control RNA
transcripts which are added in the same quantity to all the samples to be sequenced. They can be
used to normalise the data, as all cells should have exactly the same amount of

spike-ins after sequencing, and the differences in spike-in amounts should only be the consequence
of technical artefacts. However, the most commonly used spike-in set (ERCC [63]) cannot always
faithfully account for the intrinsic gene variability, as they have been shown to have a length and
GC content that differ from mammalian transcripts [58]. Moreover, choosing the quantity of spike-ins
that should be added to the cells can be challenging, as a significant amount of spike-ins has to be
used in order to reflect faithfully the intercellular variability, but may eclipse the intracellular
transcripts of interest. However, ERCC spike-ins are still commonly used to filter out low quality
cells [50]. Overall, the views on the use of spike-ins for single-cell RNA Seq normalisation are
still conflicting [64–66].

The methods cited above apply global scaling factors to all cells equally, assuming that the
relation between the number of genes measured per cell and the sequencing depth is the same for all
genes. However, this assumption of a constant gene-count/sequencing depth ratio has been shown to
hold on bulk RNA data, but not in single-cell datasets [67]. Applying global scaling factors to
scRNA-Seq data might therefore lead to biased correction of lowly and highly expressed genes. Two
algorithms can be used to perform single-cell specific normalisation of scRNA-Seq datasets. The
SCnorm method [67] relies on the fact that the normalisation should not be applied in the same way
to all the genes, as they differ in various properties such as transcript length and GC content.
SCnorm first groups genes with similar dependencies on sequencing depth and subsequently estimates
different scale factors for each group of genes. Alternatively, SCRAN [50] first groups cells with
similar expression profiles together, and applies intragroup normalisation before performing
intergroup normalisation.

Table 1. Dimensionality reduction-based and clustering-based tools for visualisation of single-cell
high-dimensional data.

Class of method | Name | Description
Dimensionality reduction | PCA | Linear reduction in the dimensions holding the highest variance into orthogonal principal components
Dimensionality reduction | MDS | Nonlinear reduction in the dimensions by preserving the intercellular distances of high dimensions in the lower dimensions
Dimensionality reduction | tSNE | Nonlinear dimensionality reduction, preserves the local similarities between cells
Dimensionality reduction | Diffusion maps | Nonlinear dimensionality reduction, computes transition probabilities between cells
Dimensionality reduction | SPRING | k-Nearest Neighbour force directed graph, preserves the high-dimensional relationships between cells
Clustering | SPADE | Hierarchical clustering of the cells followed by the representation of these clusters in a minimal spanning tree
Clustering | FLOWSOM | SOM clustering followed by the representation of these clusters in a minimal spanning tree
Clustering | Scaffold Maps | Semisupervised method: new cells are grouped with the user-provided cell populations to which they are most similar
Clustering | FLOWMAP | Hierarchical clustering of the cells, followed by the representation of these clusters in a strong connected graph structure
Clustering | Phenograph | Groups cells which share the same neighbours together and identifies communities which maximise the Louvain modularity
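
The global scaling factors discussed in the normalisation paragraphs can be illustrated with a small sketch. It is only an illustration of library-size scaling under simple assumptions: the function names and the toy count matrix are hypothetical, the median library size is an arbitrary choice of target, and none of this reproduces SCnorm or SCRAN, which estimate factors per group of genes or per pool of cells.

```python
from statistics import median


def library_size_factors(counts):
    """Return one global scaling factor per cell (each row of `counts` is a cell)."""
    totals = [sum(cell) for cell in counts]
    if any(total == 0 for total in totals):
        raise ValueError("cells with zero total counts should be filtered out first")
    target = median(totals)  # assumption: scale every cell to the median library size
    return [total / target for total in totals]


def normalise(counts):
    """Divide every count of a cell by that cell's single scaling factor."""
    factors = library_size_factors(counts)
    return [[value / factor for value in cell] for cell, factor in zip(counts, factors)]


# Toy data: three cells (rows) measured on four genes (columns).
counts = [
    [10, 0, 5, 5],    # library size 20
    [20, 0, 10, 10],  # library size 40
    [40, 4, 16, 20],  # library size 80
]
normalised = normalise(counts)
```

Because one factor is applied to all genes of a cell, lowly and highly expressed genes are corrected identically, which is the assumption the text reports as problematic for single-cell data.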

Visualising high-dimensional single-cell data

Once the data has been preprocessed, visualisation tools can help to get a first insight into the
structure of the data. A quick principal component analysis (PCA) plot of the data can, for
instance, allow identifying any remaining source of technical variability between samples, which
should be removed by normalisation. Structures in the data or biological differences between the
samples may then be investigated using different approaches: dimensionality reduction techniques,
clustering techniques, or the novel class of techniques to model cell trajectories and state
transitions. A list of visualisation tools and their principal characteristics is provided in
Table 1.

Dimensionality reduction tools aim to capture the structure of the high-dimensional data by
projecting it to a lower dimensional space that keeps the most important structural properties of
the original, high-dimensional space. The lower dimensional projection allows the human expert to
visualise and explore the data. Dimensionality reduction can be performed either in a linear way
(the lower dimensional projections are a linear combination of the original dimensions), or in a
nonlinear way. PCA is a linear dimensionality reduction technique, in which the features with the
largest variability are preserved in principal components. The main sources of variability in the
data can then be optimally laid out. A PCA can

therefore be applied to check for batch effects in the data, or to identify any main source of
variability. The use of nonlinear dimensionality reduction methods (e.g. tSNE [t-stochastic
neighbour embedding, 68], MDS [multidimensional scaling, 69], diffusion maps [70], SPRING [71])
allows optimal plotting of the data in two dimensions while preserving the local similarities
between cells.

Clustering-based visualisation methods group similar cells together and may be combined with a
subsequent visualisation step, for example by laying out the resulting clusters in two dimensions.
This reduces computation time and can simplify the understanding of the resulting plot. Several
methods have been proposed for the visualisation of clusters in single-cell data (SPADE
[Spanning-tree Progression Analysis of Density-normalized Events] [72], FLOWSOM [73], FLOWMAP [74]).
These methods represent the clusters under the form of a graph in which the most similar clusters
are linked by an edge. FLOWSOM also allows performing meta-clustering, grouping clusters into larger
populations, which has been shown to return results very similar to manual labelling of cytometry
data [75]. Single-Cell Analysis by Fixed Force- and Landmark-Directed (Scaffold) maps [76] were
specifically designed to simplify the identification of user defined cell populations in cytometry
data. Finally, Phenograph [77] identifies closely linked communities of cells in a graph structure.
This algorithm therefore identifies populations without any previous knowledge on the number of
expected populations, which can be very useful in discovery studies. While most of these methods
were initially developed for flow cytometry data, FlowSOM and Phenograph are scalable to high
dimensional datasets. These methods can therefore be applied to mass cytometry and scRNA-Seq
datasets, or to features extracted from images, allowing the visualisation of structure in the data.

However, scRNA-Seq and image derived data typically contain many more dimensions than the usual
10–30 colour panels used in cytometry. When dealing with features extracted from images, a first
step can consist in performing principal component analysis, which will help to reduce the
redundancy of these highly correlated features. One can then choose to work with the principal
components containing 95% of the data variability. These principal components can be analysed as new
features, using visualisation or clustering techniques. scRNA-Seq datasets tend to contain noise
which might bias clustering studies, especially due to the high amount of lowly expressed genes and
dropouts. Therefore, the highly variable genes (HVGs) can first be filtered on this type of data
[50,78], which considerably reduces the number of features and the noise they contain, while
preserving the main biologically relevant sources of variability. Another algorithm was implemented
in the SEURAT R package [79] to filter HVGs. Visualisation, clustering or any downstream analysis
algorithms can then be applied either to the HVGs, or, if the dimensions of the data are still too
high, on the principal components of a PCA run on these HVGs.

In order to highlight the differences between the different methods cited above, we applied two
dimensionality reduction tools (PCA and tSNE) and two clustering-based tools (FLOWSOM, Phenograph)
on a publicly available scRNA-Seq dataset [16] of 3000 peripheral blood mononuclear cells (PBMCs)
from the 10X Genomics platform (Fig. 2). We first preprocessed the dataset as described in the data
preprocessing section by filtering out low quality cells and genes. We then selected the most highly
variable genes, to which we applied the different visualisation methods. This filtering on highly
variable genes has two advantages. It significantly reduces the size of the dataset, therefore
reducing the analysis time, and it helps to focus on the genes that are driving heterogeneity across
cells [50]. The PBMC dataset had previously been expert-labelled in the Seurat R pipeline [79],
which allowed us to use the cell identities to simplify the comparison of the outputs from the
different methods. The different methods provided complementary information on the structure of the
data. For instance, all methods except PCA identified the rare megakaryocyte cell population, and
all methods except FlowSOM represented these megakaryocytes close to the monocyte cell population.
As a general guideline, it is often advisable to apply several techniques in parallel to acquire a
deeper understanding of the data structure.

Cell type identification

While the clustering approach to single-cell analysis assumes that cells are forming well separated
groups, other types of techniques focus on better detecting cells that are in transition between
cell states. In the first case, the expression of certain markers is expected to differ drastically,
providing hard separations between cell populations. In the second case, the markers are seen as
continuous variables which smoothly change from one cell to another, leading to structural patterns
in the data which can be seen as developmental trajectories (Fig. 3). The choice between the two
sets of methods depends on the biological question, but a good practice can be to first apply a
clustering algorithm to identify the main populations in the data,

Fig. 2. Comparison of (A) tSNE, (B) PCA, (C) FLOWSOM and (D) Phenograph on the PBMC dataset. (A) The cell colours correspond to the
labels provided by experts in the Seurat R pipeline. (B) The main differences between cell types can be seen on the horizontal (1st principal
component) and vertical (2nd principal component) axis. (C) The colours inside the pies correspond to the cell colours on the tSNE plot. The
background colours correspond to the meta-clusters identified by FlowSOM. Discrepancies between the pie colour and the background
colour highlight the cells for which FlowSOM’s results diverged from the manual annotation. (D) The similarities between the different cell
types are nicely laid out on a Phenograph plot.
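
The comparison in Fig. 2 was run on highly variable genes. As a rough illustration of what such a filter does, the sketch below ranks genes by dispersion (variance divided by mean) and keeps the top ones. This is a deliberately simplified assumption, not the procedure of the Seurat or scran packages; the function name, the gene names and the toy matrix are all hypothetical.

```python
from statistics import mean, pvariance


def top_variable_genes(expr, gene_names, n_top):
    """Rank genes by dispersion (variance / mean) and return the n_top gene names.

    `expr` is a list of cells, each cell being a list of expression values
    in the same order as `gene_names`.
    """
    scores = []
    for j, name in enumerate(gene_names):
        values = [cell[j] for cell in expr]
        mu = mean(values)
        dispersion = pvariance(values) / mu if mu > 0 else 0.0
        scores.append((dispersion, name))
    scores.sort(reverse=True)  # highest dispersion first
    return [name for _, name in scores[:n_top]]


# Toy data: four cells, four genes.
gene_names = ["HOUSEKEEPING", "MARKER_A", "MARKER_B", "NOISE"]
expr = [
    [5, 0, 9, 1],
    [5, 10, 0, 1],
    [5, 0, 8, 2],
    [5, 12, 1, 1],
]
selected = top_variable_genes(expr, gene_names, n_top=2)
```

In this toy matrix the constant gene and the near-constant gene are dropped, so the downstream visualisation or clustering would only see the two genes that actually differ between cells.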

and then perform trajectory inference on a specific group of similar cells. Indeed, trajectory
inference tools will tend to identify trajectories in any dataset, so they should be applied to
specifically delineated sets of cells. The identification of trajectories in highly variable
datasets is a current challenge, which has only recently been described in the literature [80].

Clustering-based approaches

Several tools have been implemented in order to identify similar groups of cells in cytometry data,
comparing either the similarities between cells (SPADE [81], FLOWSOM [73]), the distances between
cells in a lower dimensional space (Accense [82]) or the shared neighbours in a graph (Phenograph
[77]). A benchmark study of clustering tools, the FLOWCAP I [83] challenge, provided several
mammalian datasets to assess the ability of different clustering methods to identify cell
populations accurately. Most tools provided a good delineation of cell populations compared to
manual gating, and ensemble methods which merged the outputs of several clustering methods showed
the best results. However, due to the increasing number of markers used in cytometry data, there is
a need to perform benchmark studies regularly, as tools which were very efficient with
low-dimensional datasets might not necessarily perform equally well in higher dimensions [84].
Another study [75] compared 18 clustering methods for conventional flow and mass cytometry data,
taking into account the clustering accuracy as well as the computational time, which becomes more
important when dealing with large datasets. The FLOWSOM [73] algorithm showed the best clustering
accuracy and was one of the fastest methods when applied to large datasets, with a linear complexity
with respect to the number of cells. CytoCompare [85] is a tool which was created to perform the
comparison of the clustering results of three methods: SPADE, ViSNE/Accense [82] and Citrus [86].

The clustering algorithms described above can also be applied to image derived features, although,
as was the case for visualisation techniques, the high correlation between features might bias
clustering results. The redundancy of the features can be reduced by first applying a PCA to this
type of data, and performing

[Fig. 3 schematic: expression data analysed either by clustering (similarities within clusters are
preserved) or by trajectory inference (similarities between cells are preserved and displayed in
lower dimensions, along pseudotime).]

Fig. 3. In order to identify structures in an expression data matrix, two types of methods can be used. Clustering-based methods will tend
to maximise the similarities between cells within clusters while maximising the differences between clusters. These methods thus help to
identify homogeneous groups of cells in the data. On the other hand, trajectory inference methods will tend to preserve the local similarities
between cells, ordering them along trajectories which represent gradual changes between similar cells.
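
The clustering side of Fig. 3 can be made concrete with k-means, used here only as a familiar example of a method that groups each cell with its most similar centroid. The sketch is a minimal plain-Python version under stated assumptions: the initial centroids are supplied by the caller, distances are squared Euclidean, and the toy coordinates are hypothetical. It is not the implementation of any tool cited in this review.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, n_iter=20):
    """Alternate between assigning points to the nearest centroid and updating centroids."""
    centroids = [list(c) for c in initial_centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centroids.append(list(old))  # keep an empty cluster where it was
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# Toy data: two well separated groups of "cells" in two dimensions.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centroids = kmeans(points, initial_centroids=[points[0], points[3]])
```

On well separated groups like these the result is stable, but on noisy and sparse data the outcome depends on the initialisation, which is one reason consensus or ensemble strategies are used.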

clustering on the principal components of the PCA. In scRNA-Seq data, clustering is more tricky
because the gene expression contains noise and the data is very sparse. Cells may mistakenly be
grouped together based on technical noise attributed to sequencing depth or library size, rather
than actual biological effects. This raises the need for new tools, which are able to overcome this
issue. Several tools do not compare the expression patterns of cells directly anymore, but apply
tricks to perform more accurate clustering: SC3 [87] computes a consensus clustering over several
k-means runs at a high computational cost, BackSPIN [88] uses a biclustering method and DIMM-SC [89]
was designed specifically for droplet-based single-cell RNA seq data.

Another characteristic of scRNA-Seq data is the high amount of dropout events. Some clustering
methods were specifically designed to deal with this artefact, either by imputing the expected value
of dropout candidates (CIDR [90]), or by computing the similarities between cells with techniques
that are robust to dropouts (SIMLR [91], SNN-Cliq [92], SCENIC [93]). The PAGODA [94] algorithm also
accounts for technical biases such as the expression magnitude and the cell cycle.

Approaches for modelling gradual transitions

Another set of approaches, called trajectory inference (TI) methods, aim to reconstruct the
developmental process that cells are undergoing. The resulting trajectory consists of states and
transitions, with each cell mapped to a pseudotemporal location in the trajectory (Fig. 4A).
Various visualisation techniques can aid in interpreting

[Fig. 4, panels A to D. Labels recoverable from the figure: State 1 to State 5; Expression of
Marker 3; Marker 1 to Marker 4; Pseudotime.]

Fig. 4. There are several approaches to visualising trajectory models inferred by TI methods. (A) The most common visualisation is a
dimensionality reduction where similar cells are placed close together. The cells are typically coloured based on prior knowledge (e.g. cell
type) or computationally inferred clustering, and are overlaid by the trajectory inferred by the TI method. (B) A scatter plot can be used to
demonstrate a response in gene expression over pseudotime. (C) Colouring of the cells in the dimensionality reduction plot can also be
used to compare the gene expression profiles. (D) In order to obtain an overview of the dynamics of a large number of genes, these genes
can be grouped together into modules, and one path along the trajectory can be visualised in the form of a heatmap.
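
To give a feel for how a pseudotemporal location can be assigned to each cell, the sketch below measures the shortest-path distance from a chosen root cell over a k-nearest-neighbour graph built on low-dimensional coordinates. This is an assumption-laden simplification for illustration: the root cell, the value of k, the function names and the toy coordinates are hypothetical, and it does not reproduce any of the TI methods cited here.

```python
import heapq
import math


def knn_graph(coords, k):
    """Build a symmetric k-nearest-neighbour graph weighted by Euclidean distance."""
    graph = {i: {} for i in range(len(coords))}
    for i, p in enumerate(coords):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(coords) if j != i)
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def pseudotime(coords, root, k=2):
    """Pseudotime of each cell = shortest-path distance from the root (Dijkstra)."""
    graph = knn_graph(coords, k)
    dist = {root: 0.0}
    heap = [(0.0, root)]
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist.get(i, math.inf):
            continue  # stale queue entry
        for j, w in graph[i].items():
            nd = d + w
            if nd < dist.get(j, math.inf):
                dist[j] = nd
                heapq.heappush(heap, (nd, j))
    # Cells that cannot be reached from the root get an infinite pseudotime.
    return [dist.get(i, math.inf) for i in range(len(coords))]


# Toy data: five cells along a bent path, listed in shuffled order.
coords = [(2, 0), (0, 0), (4, 2), (1, 0), (3, 1)]
pt = pseudotime(coords, root=1)
order = sorted(range(len(coords)), key=pt.__getitem__)
```

Sorting the cells by the returned values recovers their order along the path, which is the kind of ordering that the scatter plot and heatmap panels of Fig. 4 are drawn against.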

the cell state- and branching point delineation, by visualising the expression value of a marker
over time (Fig. 4B), comparing the gene expression values in cells within the reduced dimensions
(Fig. 4C), or grouping genes together in pseudotemporally coregulated modules (Fig. 4D). Cannoodt et
al. [95] provide an overview of several commonly used TI methods, organising them by the different
components they are based on.

Trajectory inference was first explored on mass cytometry in order to reconstruct the
differentiation of hematopoietic stem cells into naive B cells [96]. Since then, TI methods have
been used increasingly to reconstruct cell developmental trajectories. There are several strategies
TI methods use to tackle this complexity, and the choice of which method is most appropriate will
thereby depend on the characteristics of the given dataset [97]. Pioneering TI methods were often
specialised in producing a fixed trajectory type (e.g. linear [96,98], bifurcating [70,99] or
cyclical [100]). Some methods require specific input [101], while others

are capable of inferring the trajectory structure in an unbiased way [72,102]. A recent comparative
review [97] assessed the performance of more than thirty TI methods on both synthetic and real
scRNA-Seq datasets, providing useful practical guidelines to choose the most appropriate methods.
Notably, no method consistently outperformed the others on all datasets. Rather, various sets of
methods were better suited to specific trajectories in the datasets, with some methods better
identifying linear trajectories, and others efficiently identifying cycles. A good practice would
therefore be to identify a set of TI methods to apply to the data based on the expected structure,
and to compare the results of at least 2–3 methods to confirm the biological findings.

Differential analysis

Cytometry-based approaches

In order to identify cell populations which differ between different experimental conditions (e.g.
between samples of patients with different clinical outcomes), cytometry data can first be
clustered, and these clusters can be compared between the conditions. In FLOWSOM [73], the user can
provide a fold-change threshold, to colour clusters which differ between the conditions. The Citrus
[86] and COMPASS [103] algorithms both perform model selection to identify the clusters which are
best associated with a certain condition. A similar method was implemented, which groups cells into
hyperspheres instead of clusters (Cydar [59]). Convolutional neural networks have also been used to
identify subpopulations of cells which differ the most between two conditions (CellCNN [104]).
However, none of these methods directly cope with complex experiments and may therefore be sensitive
to batch effects, which might be misinterpreted as the main difference between the conditions. One
solution is to first remove possible batch effects in a preprocessing step before performing
differential analysis. A CYTOF workflow [105] has been proposed, which first applies clustering and
then uses Gaussian linear mixture models to perform differential analysis while accounting for
possible batch effect, paired experiments and other sources of technical variance in the data.

Sequencing-based approaches

The technical biases which have to be dealt with are even larger in single-cell and bulk RNA-Seq
data, as many genes are lowly expressed and noisy. Several methods were proposed to specifically
tackle differential expression (DE) of genes in scRNA-Seq data (SCDE [106], MAST [107], scDD [108]).
These methods use mixture models or Bayesian modelling frameworks to identify both the technical
effects between samples (mainly caused by the gene detection rate) and the variance which is related
to the condition being tested. Another method, CENSUS [72], normalises the single-cell gene
expression into relative transcript counts (accounting for technical variability between cells) in
time series studies specifically, allowing for the identification of genes whose expression varies
along time. These single-cell specific DE methods aim to free themselves from the idea that gene
expression is unimodal across cells. Indeed, as many cells often show unmeasured genes, either due
to biological or technical effects, these methods model gene expression through more elaborate
distributions.

However, a recent study [109], which compared 36 differential gene expression approaches, concluded
that methods that were largely used for the DE analysis of bulk RNA datasets (such as DESEQ2 [110],
edgeR [111], VOOM [112]) were in fact not performing worse than single-cell specific DE methods on
scRNA-Seq datasets. Single-cell specific DE approaches also required more computational time,
although they scaled well with increasing cell numbers. This comparative study highlighted that
accurate gene filtering generally improved the results of a DE analysis: it reduces noise in lowly
expressed genes, leading to fewer false positive genes being identified as differentially expressed.

Advanced computational approaches

Network inference

Single-cell transcriptomics provide a rich source of data, by quantifying the expression profiles of
thousands of cells. The intercellular heterogeneity which naturally results from biological
stochasticity [113] allows inferring mechanisms of gene regulation involving transcription factors
and their target genes. More complex, nonlinear interactions between genes can be studied at the
single-cell level, as was shown with the PIDC [114] algorithm, which was able to infer regulatory
networks involved in developmental processes from sc-qPCR datasets. However, inferring one global
regulatory network from thousands of cells might not always prove accurate. Different subpopulations
of cells in the data might be undergoing different regulatory processes, which is why some methods
were implemented specifically to compute differential regulatory networks. These methods derive one
regulatory

network for each cell subtype (CSRF [115], Pólya tree models [116]).

In order to improve the inference of gene regulatory networks, external sources of information can
be provided. As was discussed in the section ‘Approaches for modelling gradual transitions’, cells
can be ordered along developmental trajectories. Some network inference methods can include the
information from these inferred trajectories to reconstruct dynamic regulatory networks (AR1MA1
[117], SCODE [118]). Another source of external information could come from perturbational studies,
in which genes are knocked out and the consequences on the transcriptome can be observed [21]. New
tools will be needed to optimally use this type of data in order to infer regulatory networks.

Single-cell transcriptomics data represent a rich source of information to infer interactions which
occur between genes and transcription factors. However, new studies are highlighting the need to not
only focus on a single-cell’s transcripts, but also the methylation state of the DNA, the chromatin
state and other epigenomic data that might enrich our knowledge of the gene regulation dynamics
[119,120].

Single-cell multi-omics data integration

Single-cell transcriptomics, proteomics, genomics and epigenomics have provided a level of
understanding of the cellular heterogeneity that could not be reached with bulk studies. However,
the models which are inferred from single technologies are by definition incomplete. Indeed, the
relationships between the genome, the amount of transcripts and proteins in a single cell are not
always straightforward. Transcriptional regulatory mechanisms such as methylation may for instance
alter the correlation between the gene copy number and the associated number of transcripts.
Moreover, post-transcriptional mechanisms regulating protein translation and stability may also
influence the relation between the number of transcripts and proteins in a cell. In order to fully
understand and to start modelling the mechanisms involved in single cells, it will therefore be
essential to integrate complementary types of data from the same single cells [26].

New experimental approaches have already been able to achieve a simultaneous and multiparameter
measurement by combining methods. The study of the genome together with the transcriptome [24,121]
for instance has confirmed the existence of a strong correlation between genes with high copy
numbers and the number of mRNA transcripts. The joint analysis of the methylome together with the
transcriptome [25] also corroborated the negative relation between the methylation of a gene and its
transcription. More surprisingly, the measurement of both transcripts and proteins [122,123] in
single cells has highlighted the fact that the amount of these two entities was poorly correlated.
This could be due to the fact that transcription occurs in bursts, resulting in high discrepancies
between the numbers of transcripts, whereas protein levels have been shown to be more stable for
particular genes [124].

The experimental procedures cited above led to low-throughput datasets, typically containing 100
cells at most, and could therefore be analysed by regular correlation studies to assess the links
between different omics entities. The recently published CITE-seq [22] and REAP-seq [23] methods
have allowed the simultaneous measurement of the transcriptome as well as 100 proteins in thousands
of cells, and have the potential to measure thousands of proteins in single cells, as these proteins
are tagged with synthetic oligonucleotides. Some studies have also achieved a broader
characterisation of single cells by combining proteomics- and imaging-based approaches [125,126]. As
new experimental procedures keep providing larger and larger datasets, and new tools allow getting
more insight into the mechanisms of regulation at the single-cell level [127,128], there is a great
need for multi-omics integrative computational tools. These tools should have the ability to combine
the information coming from complementary sources to infer complex global models.

Conclusions and future perspectives

Various high-throughput approaches currently allow studying cell populations in unprecedented depth.
The rapid development of novel technologies or hybridisations between them is generating large and
complex datasets that require designing novel computational approaches for preprocessing,
visualising and extracting novel patterns from them. As novel technologies arise, the development of
computational tools and the adequate benchmarking between them is lagging behind. Indeed, many
computational approaches to study single-cell data are continuously being published, but benchmark
studies that objectively compare these methods remain under-represented. Nevertheless, such
benchmarks are essential to extract useful guidelines for biologists who want to use these tools,
pinpoint limitations of current approaches and highlight novel directions for future tool
development. While current methods mainly focus on cells in suspension, novel advances that include
the spatial context will stimulate novel classes of computational tools that will enable modelling
cellular interactions and cell dynamics in much greater depth. Such techniques

will allow going from cells in isolation to tissues and organs, offering new perspectives for multiscale modelling. On the other hand, single-cell multi-omics approaches are providing complementary information that can relate epigenetic, transcriptional and translational information, paving the way for single-cell multi-omics and multi-source data integration.

All of these advances strengthen the idea that the life sciences are becoming even more data-driven sciences. To be able to analyse and correctly interpret the results of computational pipelines, young researchers thus should be trained adequately in properly using and understanding the principles of these novel computational approaches.

Acknowledgements

We thank Sofie Van Gassen, Robrecht Cannoodt, Niels Vandamme and Daniel Peralta for critical comments and valuable input. HT is funded by a BOF-IOP grant from Ghent University; YS is an ISAC Marylou Ingram scholar.

Conflict of interest

The authors declare no competing interests.

References

1 Liu Z, Lavis LD & Betzig E (2015) Imaging live-cell dynamics and structure at the single-molecule level. Mol Cell 58, 644.
2 Abraham V, Taylor D & Haskins J (2004) High content screening applied to large-scale cell biology. Trends Biotechnol 22, 15–22.
3 Goodman A & Carpenter AE (2016) High-throughput, automated image processing for large-scale fluorescence microscopy experiments. Microsc Microanal 22, 538–539.
4 Kamentsky L, Jones TR, Fraser A, Bray MA, Logan DJ, Madden KL, Ljosa V, Rueden C, Eliceiri KW & Carpenter AE (2011) Improved structure, function and compatibility for Cell Profiler: modular high-throughput image analysis software. Bioinformatics 27, 1179–1180.
5 Fulwyler MJ (1965) Electronic separation of biological cells by volume. Science (New York, NY) 150, 910–911.
6 Robinson JP & Roederer M (2015) Flow cytometry strikes gold. Science 350, 739–740.
7 Perfetto SP, Chattopadhyay PK & Roederer M (2004) Innovation: Seventeen-colour flow cytometry: unravelling the immune system. Nat Rev Immunol 4, 648–655.
8 Nolan JP, Condello D, Nolan JP & Condello D (2013) Spectral flow cytometry. In Current Protocols in Cytometry, p. 1.27.1–1.27.13. John Wiley & Sons, Inc., Hoboken, NJ.
9 McGrath KE, Bushnell TP & Palis J (2008) Multispectral imaging of hematopoietic cells: where flow meets morphology. J Immunol Methods 336, 91–97.
10 Goddard G, Martin JC, Graves SW & Kaduchak G (2006) Ultrasonic particle-concentration for sheathless focusing of particles for analysis in a flow cytometer. Cytometry Part A 69A, 66–74.
11 Bandura DR, Baranov VI, Ornatsky OI, Antonov A, Kinach R, Lou X, Pavlov S, Vorobiev S, Dick JE & Tanner SD (2009) Mass cytometry: technique for real time single cell multitarget immunoassay based on inductively coupled plasma time-of-flight mass spectrometry. Anal Chem 81, 6813–6822.
12 Giesen C, Wang HA, Schapiro D, Zivanovic N, Jacobs A, Hattendorf B, Schüffler PJ, Grolimund D, Buhmann JM, Brandt S et al. (2014) Highly multiplexed imaging of tumor tissues with subcellular resolution by mass cytometry. Nat Methods 11, 417–422.
13 Saeys Y, Van Gassen S & Lambrecht BN (2016) Computational flow cytometry: helping to make sense of high-dimensional immunology data. Nat Rev Immunol 16, 449–462.
14 Picelli S, Björklund ÅK, Faridani OR, Sagasser S, Winberg G & Sandberg R (2013) Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat Methods 10, 1096–1098.
15 Macosko EZ, Basu A, Satija R, Nemesh J, Shekhar K, Goldman M, Tirosh I, Bialas AR, Kamitaki N, Martersteck EM et al. (2015) Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214.
16 Zheng GXY, Terry JM, Belgrader P, Ryvkin P, Bent ZW, Wilson R, Ziraldo SB, Wheeler TD, McDermott GP, Zhu J et al. (2017) Massively parallel digital transcriptional profiling of single cells. Nat Commun 8, 14049.
17 Gierahn TM, Wadsworth MH, Hughes TK, Bryson BD, Butler A, Satija R, Fortune S, Love JC & Shalek AK (2017) Seq-Well: portable, low-cost RNA sequencing of single cells at high throughput. Nat Methods 14, 395–398.
18 Rosenberg AB, Roco C, Muscat RA, Kuchina A, Mukherjee S, Chen W, Peeler DJ, Yao Z, Tasic B, Sellers DL et al. (2017) Scaling single cell transcriptomics through split pool barcoding. bioRxiv [preprint].
19 Ståhl PL, Salmén F, Vickovic S, Lundmark A, Navarro JF, Magnusson J, Giacomello S, Asp M, Westholm JO, Huss M et al. (2016) Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science (New York, NY) 353, 78–82.
20 Dixit A, Parnas O, Li B, Chen J, Fulco CP, Jerby-Arnon L, Marjanovic ND, Dionne D, Burks T,
Raychowdhury R et al. (2016) Perturb-Seq: dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell 167, 1853–1866.e17.
21 Jaitin DA, Weiner A, Yofe I, Lara-Astiaso D, Keren-Shaul H, David E, Meir Salame T, Tanay A, van Oudenaarden A & Amit I (2016) Dissecting immune circuits by linking CRISPR-pooled screens with single-cell RNA-Seq. Cell 167, 1883–1896.e15.
22 Stoeckius M, Hafemeister C, Stephenson W, Houck-Loomis B, Chattopadhyay PK, Swerdlow H, Satija R & Smibert P (2017) Simultaneous epitope and transcriptome measurement in single cells. Nat Methods 14, 865–868.
23 Peterson VM, Zhang KX, Kumar N, Wong J, Li L, Wilson DC, Moore R, McClanahan TK, Sadekova S & Klappenbach JA (2017) Multiplexed quantification of proteins and transcripts in single cells. Nat Biotechnol 35, 936–939.
24 Macaulay IC, Haerty W, Kumar P, Li YI, Hu TX, Teng MJ, Goolam M, Saurat N, Coupland P, Shirley LM et al. (2015) G&T-seq: parallel sequencing of single-cell genomes and transcriptomes. Nat Methods 12, 519–522.
25 Angermueller C, Clark SJ, Lee HJ, Macaulay IC, Teng MJ, Hu TX, Krueger F, Smallwood SA, Ponting CP, Voet T et al. (2016) Parallel single-cell sequencing links transcriptional and epigenetic heterogeneity. Nat Methods 13, 229–232.
26 Macaulay IC, Ponting CP & Voet T (2017) Single-cell multiomics: multiple measurements from single cells. TIG 33, 155–168.
27 Carpenter AE, Jones TR, Lamprecht MR, Clarke C, Kang IH, Friman O, Guertin DA, Chang J, Lindquist RA, Moffat J et al. (2006) Cell Profiler: image analysis software for identifying and quantifying cell phenotypes. Genome Biol 7, R100.
28 Peng T, Thorn K, Schroeder T, Wang L, Theis FJ, Marr C & Navab N (2017) A BaSiC tool for background and shading correction of optical microscopy images. Nat Commun 8, 14836.
29 Smith K, Li Y, Piccinini F, Csucs G, Balazs C, Bevilacqua A & Horvath P (2015) CIDRE: an illumination-correction method for optical microscopy. Nat Methods 12, 404–406.
30 Wählby C (2003) Algorithms for applied digital image cytometry. Acta Universitatis Upsaliensis. Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 896, 75 pp., Uppsala. ISBN 91-554-5759-2.
31 Malpica N, de Solórzano CO, Vaquero JJ, Santos A, Vallcorba I, García-Sagredo JM & del Pozo F (1998) Applying watershed algorithms to the segmentation of clustered nuclei. Cytometry 28, 289–297.
32 Wahlby C, Sintorn IM, Erlandsson F, Borgefors G & Bengtsson E (2004) Combining intensity, edge and shape information for 2D and 3D segmentation of cell nuclei in tissue sections. J Microsc 215, 67–76.
33 Ortiz de Solórzano C, García Rodriguez E, Jones A, Pinkel D, Gray JW, Sudar D & Lockett SJ (1999) Segmentation of confocal microscope images of cell nuclei in thick tissue sections. J Microsc 193, 212–26.
34 Meyer F & Beucher S (1990) Morphological segmentation. J Vis Commun Image Represent 1, 21–46.
35 Leipold MD (2015) Another step on the path to mass cytometry standardization. Cytometry Part A 87, 380–382.
36 Takahashi C, Au-Yeung A, Fuh F, Ramirez-Montagut T, Bolen C, Mathews W & O’Gorman WE (2017) Mass cytometry panel optimization through the designed distribution of signal interference. Cytometry Part A 91, 39–47.
37 Fletez-Brant K, Špidlen J, Brinkman RR, Roederer M & Chattopadhyay PK (2016) flowClean: automated identification and removal of fluorescence anomalies in flow cytometry data. Cytometry Part A 89, 461–471.
38 Bashashati A & Brinkman RR (2009) A survey of flow cytometry data analysis methods. Adv Bioinform 2009, 584603.
39 Monaco G, Chen H, Poidinger M, Chen J, de Magalhães JP & Larbi A (2016) flowAI: automatic and interactive anomaly discerning tools for flow cytometry data. Bioinformatics 32, 2473–2480.
40 Hahne F, Khodabakhshi AH, Bashashati A, Wong CJ, Gascoyne RD, Weng AP, Seyfert-Margolis V, Bourcier K, Asare A, Lumley T et al. (2010) Per-channel basis normalization methods for flow cytometry data. Cytometry Part A 77, 121–131.
41 Finak G, Frelinger J, Jiang W, Newell EW, Ramey J, Davis MM, Kalams SA, De Rosa SC & Gottardo R (2014) OpenCyto: an open source infrastructure for scalable, robust, reproducible, and automated, end-to-end flow cytometry data analysis. PLoS Comput Biol 10, e1003806.
42 Malek M, Taghiyar MJ, Chong L, Finak G, Gottardo R & Brinkman RR (2015) flowDensity: reproducing manual gating of flow cytometry data by automated density-based cell population identification. Bioinformatics 31, 606–607.
43 Ziegenhain C, Vieth B, Parekh S, Reinius B, Guillaumet-Adkins A, Smets M, Leonhardt H, Heyn H, Hellmann I & Enard W (2017) Comparative analysis of single-cell RNA sequencing methods. Mol Cell 65, 631–643.
44 Stegle O, Teichmann SA & Marioni JC (2015) Computational and analytical challenges in single-cell transcriptomics. Nat Rev Genet 16, 133–145.
45 Poirion OB, Zhu X, Ching T & Garmire L (2016) Single-cell transcriptomics bioinformatics and computational challenges. Front Genet 7, 163.
46 Bacher R & Kendziorski C (2016) Design and computational analysis of single-cell RNA-sequencing experiments. Genome Biol 17, 63.
47 McCarthy DJ, Campbell KR, Lun ATL & Wills QF (2016) scater: pre-processing, quality control, normalisation and visualisation of single-cell RNA-seq data in R. bioRxiv [preprint].
48 Butler A, Hoffman P, Smibert P, Papalexi E & Satija R (2018) Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat Biotechnol 36, 411–420.
49 Haghverdi L, Lun ATL, Morgan MD & Marioni JC (2018) Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors. Nat Biotechnol 36, 421–427.
50 Lun ATL, McCarthy DJ & Marioni JC (2016) A step-by-step workflow for low-level analysis of single-cell RNA-seq data with bioconductor. F1000Research 5, 2122.
51 Scialdone A, Natarajan KN, Saraiva LR, Proserpio V, Teichmann SA, Stegle O, Marioni JC & Buettner F (2015) Computational assignment of cell-cycle stage from single-cell transcriptome data. Methods 85, 54–61.
52 Buettner F, Pratanwanich N, McCarthy DJ, Marioni JC & Stegle O (2017) f-scLVM: scalable and versatile factor analysis for single-cell RNA-seq. Genome Biol 18, 212.
53 Mortazavi A, Williams BA, McCue K, Schaeffer L & Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5, 621–628.
54 Wagner GP, Kin K & Lynch VJ (2012) Measurement of mRNA abundance using RNA-seq data: RPKM measure is inconsistent among samples. Theory Biosci 131, 281–285.
55 Bullard JH, Purdom E, Hansen KD & Dudoit S (2010) Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics 11, 94.
56 Vallejos CA, Risso D, Scialdone A, Dudoit S & Marioni JC (2017) Normalizing single-cell RNA sequencing data: challenges and opportunities. Nat Methods 14, 565–571.
57 Pierson E & Yau C (2015) ZIFA: dimensionality reduction for zero-inflated single-cell gene expression analysis. Genome Biol 16, 241.
58 Risso D, Perraudeau F, Gribkova S, Dudoit S & Vert JP (2017) ZINB-WaVE: a general and flexible method for signal extraction from single-cell RNA-seq data. bioRxiv [preprint].
59 Lun ATL, Richard AC & Marioni JC (2017) Testing for differential abundance in mass cytometry data. Nat Methods 14, 707–709.
60 Vallejos CA, Marioni JC & Richardson S (2015) BASiCS: Bayesian analysis of single-cell sequencing data. PLoS Comput Biol 11, e1004333.
61 Ding B, Zheng L, Zhu Y, Li N, Jia H, Ai R, Wildberg A & Wang W (2015) Normalization and noise reduction for single cell RNA-seq experiments. Bioinformatics 31, 2225–2227.
62 Katayama S, Töhönen V, Linnarsson S & Kere J (2013) SAMstrt: statistical test for differential expression in single-cell transcriptome with spike-in normalization. Bioinformatics 29, 2943–2945.
63 Reid LH (2005) Proposed methods for testing and selecting the ERCC external RNA controls. BMC Genom 6, 150.
64 Baran-Gale J, Chandra T & Kirschner K (2017) Experimental design for single-cell RNA sequencing. Brief Funct Genomics 17, 233–239.
65 Tung PY, Blischak JD, Hsiao CJ, Knowles DA, Burnett JE, Pritchard JK & Gilad Y (2017) Batch effects and the effective design of single-cell gene expression studies. Sci Rep 7, 39921. https://doi.org/10.1038/srep39921
66 Lun AT, Calero-Nieto FJ, Haim-Vilmovsky L, Gottgens B & Marioni JC (2017) Assessing the reliability of spike-in normalization for analyses of single-cell RNA sequencing data. bioRxiv [preprint].
67 Bacher R, Chu LF, Leng N, Gasch AP, Thomson JA, Stewart RM, Newton M & Kendziorski C (2017) SCnorm: robust normalization of single-cell RNA-seq data. Nat Methods 14, 584–586.
68 van der Maaten L & Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9, 2579–2605.
69 Kruskal JB (1964) Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29, 1–27.
70 Haghverdi L, Buettner F & Theis FJ (2015) Diffusion maps for high-dimensional single-cell analysis of differentiation data. Bioinformatics 31, 2989–2998.
71 Weinreb C, Wolock S & Klein A (2017) SPRING: a kinetic interface for visualizing high dimensional single-cell expression data. bioRxiv [preprint].
72 Qiu X, Mao Q, Tang Y, Wang L, Chawla R, Pliner HA & Trapnell C (2017) Reversed graph embedding resolves complex single-cell trajectories. Nat Methods 14, 979–982.
73 Van Gassen S, Callebaut B, Van Helden MJ, Lambrecht BN, Demeester P, Dhaene T & Saeys Y (2015) FlowSOM: using self-organizing maps for visualization and interpretation of cytometry data. Cytometry Part A 87, 636–645.
74 Zunder ER, Lujan E, Goltsev Y, Wernig M & Nolan GP (2015) A continuous molecular roadmap to iPSC reprogramming through progression analysis of single-cell mass cytometry. Cell Stem Cell 16, 323–337.
75 Weber LM & Robinson MD (2016) Comparison of clustering methods for high-dimensional single-cell flow and mass cytometry data. Cytometry Part A 89, 1084–1096.
76 Spitzer MH, Gherardini PF, Fragiadakis GK, Bhattacharya N, Yuan RT, Hotson AN, Finck R, Carmi Y, Zunder ER, Fantl WJ et al. (2015) An interactive reference framework for modeling a dynamic immune system. Science 349, 1259425.
77 Levine JH, Simonds EF, Bendall SC, Davis KL, EaD A, Tadmor MD, Litvin O, Fienberg HG, Jager A, Zunder ER et al. (2015) Data-driven phenotypic dissection of AML reveals progenitor-like cells that correlate with prognosis. Cell 162, 184–197.
78 Klein A, Mazutis L, Akartuna I, Tallapragada N, Veres A, Li V, Peshkin L, Weitz DA & Kirschner MW (2015) Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161, 1187–1201.
79 Satija R, Farrell JA, Gennert D, Schier AF & Regev A (2015) Spatial reconstruction of single-cell gene expression data. Nat Biotechnol 33, 495–502.
80 Campbell KR & Yau C (2016) Order under uncertainty: robust differential expression analysis using probabilistic models for pseudotime inference. PLoS Comput Biol 12, e1005212.
81 Anchang B, Hart TDP, Bendall SC, Qiu P, Bjornson Z, Linderman M, Nolan GP & Plevritis SK (2016) Visualization and cellular hierarchy inference of single-cell data using SPADE. Nat Protoc 11, 1264–1279.
82 Shekhar K, Brodin P, Davis MM & Chakraborty AK (2014) Automatic classification of cellular expression by nonlinear stochastic embedding (ACCENSE). Proc Natl Acad Sci USA 111, 202–207.
83 Aghaeepour N, Finak G, Hoos H, Mosmann TR, Brinkman R, Gottardo R & Scheuermann RH (2013) Critical assessment of automated flow cytometry data analysis techniques. Nat Methods 10, 228–238.
84 Newell EW & Cheng Y (2016) Mass cytometry: blessed with the curse of dimensionality. Nat Immunol 17, 890–895.
85 Platon L, Pejoski D, Gautreau G, Targat B, Le Grand R & Beignon AS (2018) A computational approach for phenotypic comparisons of cell populations in high-dimensional cytometry data. Methods 132, 66–75.
86 Bruggner RV, Bodenmiller B, Dill DL, Tibshirani RJ & Nolan GP (2014) Automated identification of stratifying signatures in cellular subpopulations. Proc Natl Acad Sci USA 111, E2770–E2777.
87 Kiselev VY, Kirschner K, Schaub MT, Andrews T, Yiu A, Chandra T, Natarajan KN, Reik W, Barahona M, Green AR et al. (2017) SC3: consensus clustering of single-cell RNA-seq data. Nat Methods 14, 483–486.
88 Zeisel A, Muñoz-Manchado AB, Codeluppi S, Lönnerberg P, La Manno G, Juréus A, Marques S, Munguba H, He L, Betsholtz C et al. (2015) Brain structure. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science (New York, NY) 347, 1138–1142.
89 Sun Z, Wang T, Deng K, Wang XF, Lafyatis R, Ding Y, Hu M & Chen W (2018) DIMM-SC: a Dirichlet mixture model for clustering droplet-based single cell transcriptomic data. Bioinformatics 34, 139–146.
90 Lin P, Troup M & Ho JWK (2017) CIDR: ultrafast and accurate clustering through imputation for single-cell RNA-seq data. Genome Biol 18, 59.
91 Wang B, Zhu J, Pierson E, Ramazzotti D & Batzoglou S (2017) Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning. Nat Methods 14, 414–416.
92 Xu C & Su Z (2015) Identification of cell types from single-cell transcriptomes using a novel clustering method. Bioinformatics 31, 1974–1980.
93 Aibar S, González-Blas CB, Moerman T, Huynh-Thu VA, Imrichova H, Hulselmans G, Rambow F, Marine JC, Geur P & Aerts J (2017) SCENIC: single-cell regulatory network inference and clustering. Nat Methods 14, 1083–1086.
94 Fan J, Salathia N, Liu R, Kaeser GE, Yung YC, Herman JL, Kaper F, Fan J-B, Zhang K, Chun J et al. (2016) Characterizing transcriptional heterogeneity through pathway and gene set overdispersion analysis. Nat Methods 13, 241–244.
95 Cannoodt R, Saelens W & Saeys Y (2016) Computational methods for trajectory inference from single-cell transcriptomics. Eur J Immunol 46, 2496–2506.
96 Bendall SC, Davis KL, Amir EAD, Tadmor MD, Simonds EF, Chen TJ, Shenfeld DK, Nolan GP & Pe’er D (2014) Single-cell trajectory detection uncovers progression and regulatory coordination in human B cell development. Cell 157, 714–725.
97 Saelens W, Cannoodt R, Todorov H & Saeys Y (2018) A comparison of single-cell trajectory inference methods: towards more accurate and robust tools. bioRxiv [preprint].
98 Cannoodt R, Saelens W, Sichien D, Tavernier S, Janssens S, Guilliams M, Lambrecht BN, De PK & Saeys Y (2016) SCORPIUS improves trajectory inference and identifies novel modules in dendritic cell development. bioRxiv [preprint].
99 Setty M, Tadmor MD, Reich-Zeliger S, Angel O, Salame TM, Kathail P, Choi K, Bendall S, Friedman N & Pe’er D (2016) Wishbone identifies bifurcating developmental trajectories from single-cell data. Nat Biotechnol 34, 1–14.
100 Liu Z, Lou H, Xie K, Wang H, Chen N, Aparicio OM, Zhang MQ, Jiang R & Chen T (2017) Reconstructing cell cycle pseudo time-series via single-cell transcriptome data. Nat Commun 8, 22.
101 Trapnell C, Cacchiarelli D, Grimsby J, Pokharel P, Li S, Morse M, Lennon NJ, Livak KJ, Mikkelsen TS & Rinn JL (2014) The dynamics and regulators of cell
fate decisions are revealed by pseudotemporal ordering of single cells. Nat Biotechnol 32, 381–386.
102 Street K, Risso D, Fletcher RB, Das D, Ngai J, Yosef N, Purdom E & Dudoit S (2017) Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. bioRxiv [preprint].
103 Lin L, Finak G, Ushey K, Seshadri C, Hawn TR, Frahm N, Scriba TJ, Mahomed H, Hanekom W, Bart P-A et al. (2015) COMPASS identifies T-cell subsets correlated with clinical outcomes. Nat Biotechnol 33, 610–616.
104 Arvaniti E & Claassen M (2017) Sensitive detection of rare disease-associated cell subsets via representation learning. Nat Commun 8, 1–10.
105 Nowicka M, Krieg C, Weber LM, Hartmann FJ, Guglietta S, Becher B, Levesque MP & Robinson MD (2017) CyTOF workflow: differential discovery in high-throughput high-dimensional cytometry datasets. F1000Research 6, 748.
106 Kharchenko PV, Silberstein L & Scadden DT (2014) Bayesian approach to single-cell differential expression analysis. Nat Methods 11, 740–742.
107 Finak G, McDavid A, Yajima M, Deng J, Gersuk V, Shalek AK, Slichter CK, Miller HW, McElrath MJ, Prlic M et al. (2015) MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol 16, 278.
108 Korthauer KD, Chu LF, Newton MA, Li Y, Thomson J, Stewart R & Kendziorski C (2016) A statistical approach for identifying differential distributions in single-cell RNA-seq experiments. Genome Biol 17, 222.
109 Soneson C & Robinson MD (2018) Bias, robustness and scalability in single-cell differential expression analysis. Nat Methods 15, 255–261.
110 Love MI, Huber W & Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15, 550.
111 Robinson MD, McCarthy DJ & Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140.
112 Law CW, Chen Y, Shi W & Smyth GK (2014) voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol 15, R29.
113 Padovan-Merhar O & Raj A (2013) Using variability in gene expression as a tool for studying gene regulation. WIREs Syst Biol Med 5, 751–759.
114 Chan TE, Stumpf MPH & Babtie AC (2017) Gene regulatory network inference from single-cell data using multivariate information measures. Cell systems 5, 251–267.e3.
115 Xu R, Nettleton D & Nordman DJ (2016) Case-specific random forests. J Comput Graph Stat 25, 49–65.
116 Filippi S & Holmes CC (2017) A Bayesian nonparametric approach to testing for dependence between random variables. Bayesian Anal 12, 919–938.
117 Castillo MS, Blanco D, Luna IMT, Carrion MC & Huang Y (2018) A Bayesian framework for the inference of gene regulatory networks from time and pseudo-time series data. Bioinformatics 34, 964–970.
118 Matsumoto H, Kiryu H, Furusawa C, Ko MSH, Ko SBH, Gouda N, Hayashi T & Nikaido I (2017) SCODE: an efficient regulatory network inference algorithm from single-cell RNA-Seq during differentiation. Bioinformatics 33, 2314–2321.
119 Fiers MWEJ, Minnoye L, Aibar S, Bravo González-Blas C, Kalender Atak Z & Aerts S (2018) Mapping gene regulatory networks from single-cell omics data. Brief Funct Genomics 17, 246–254.
120 Äijö T & Bonneau R (2017) Biophysically motivated regulatory network inference: progress and prospects. Hum Hered 81, 62–77.
121 Dey SS, Kester L, Spanjaard B, Bienko M & Van Oudenaarden A (2015) Integrated genome and transcriptome sequencing of the same cell. Nat Biotechnol 33, 285–289.
122 Darmanis S, Gallant CJ, Marinescu VD, Niklasson M, Segerman A, Flamourakis G, Fredriksson S, Assarsson E, Lundberg M, Nelander S et al. (2016) Simultaneous multiplexed measurement of RNA and proteins in single cells. Cell Rep 14, 380–389.
123 Albayrak C, Jordi CA, Zechner C, Lin J, Bichsel CA, Khammash M & Tay S (2016) Digital quantification of proteins and mRNA in single mammalian cells. Mol Cell 61, 914–924.
124 Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W & Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473, 337–342.
125 Soh KT, Tario JD, Colligan S, Maguire O, Pan D, Minderman H & Wallace PK (2016) Simultaneous, single-cell measurement of messenger RNA, cell surface proteins, and intracellular proteins. Curr Protoc Cytom 75, 7.45.1–7.45.33.
126 Kochan J, Wawro M & Kasza A (2015) Simultaneous detection of mRNA and protein in single cells using immunofluorescence combined single-molecule RNA FISH. Biotechniques 59, 209–212, 214, 216.
127 Buenrostro JD, Wu B, Litzenburger UM, Ruff D, Gonzales ML, Snyder MP, Chang HY & Greenleaf WJ (2015) Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 523, 486–490.
128 Jin W, Tang Q, Wan M, Cui K, Zhang Y, Ren G, Ni B, Sklar J, Przytycka TM, Childs R et al. (2015) Genome-wide detection of DNase I hypersensitive sites in single cells and FFPE tissue samples. Nature 528, 142–146.