technology feature

A dream of single-cell proteomics


As single-cell proteomics emerges, perhaps labs can avoid the need to infer protein levels from mRNA abundances.

Vivien Marx

Nowadays, labs can generate massive
sets of single-cell genomic and
single-cell transcriptomic data. In
proteomics, high-throughput single-cell
methods have not yet arrived. The nascent
field of single-cell proteomics (sc-proteomics)
is bringing change, and perhaps helping to
avoid the need to infer proteins from cellular
mRNA levels. It’s early days, but it’s not
a distant dream to be able to tally the
proteins in single cells, says Ruedi Aebersold,
a proteomics researcher at ETH Zurich
and the University of Zurich. To make that
dream an everyday reality, labs push hurdles
out of the way. Proteins are tougher to work
with than RNA or DNA (for example, they’re
stickier), but eventually researchers might be
able to integrate single-cell mRNA and
single-cell proteomic measurements.
Over 20 years ago, says Aebersold,
Richard Smith at Pacific Northwest National
Laboratory and his colleagues characterized
hemoglobin from a single red blood
cell with a technique called capillary
electrophoresis-electrospray ionization
Fourier transform ion cyclotron resonance
mass spectrometry¹. The red blood cell
was a special case, says Aebersold, since
it mainly consists of hemoglobin. But this
was single-cell analysis.

[Figure: It’s not a faraway dream to be able to tally proteins in single cells. Credit: S. Larochelle, E. Dewalt, Springer Nature]

Fast forward to a recent approach that Aebersold and his colleague
Ben Collins call a “marriage across the ages”². Developed in the labs of
Edward Marcotte, Eric Anslyn and colleagues at the University of Texas at
Austin, the technique yields the amino acid sequence of individual
proteins in a highly parallelized fashion³. A spinout company, Erisyon,
has been launched to commercialize a single-molecule protein sequencer.

The approach involves Edman sequencing, with which proteins were
sequenced in the days before mass spec. Proteins are cleaved and the
peptides are labeled with identifying fluorescent tags; the tagged
peptides are then immobilized on a glass cover-slip. Successive rounds of
Edman degradation chemistry remove one amino acid at a time and the
peptides are imaged at each round. The team found dyes that handled the
process but some mishaps occurred: dyes fell off, didn’t attach well or
provided inadequate fluorescence. In their published study, the team
analyzed a zeptomolar mixture of proteins, but they note the approach is
“inherently single molecule” and thus “there are reasonable prospects for
decreasing sample volumes and protein abundance requirements.” They state
that once low-abundance proteins can be labeled with fluorescent tags,
this method has application potential, such as for single-cell proteomics
experiments.

It’s far from single-cell analysis, says Aebersold, but “the method
certainly has potential to do single cells, potentially much faster and
in ways that the mass spectrometer would have a hard time doing.” As a
student, he used Edman sequencing and he is intrigued to see its revival
for single-molecule fluorescence. The method piggybacks on flow-cell
technology used in high-throughput genome sequencers, he says. It will
take work to make this a routine application, he says, and other labs
have such approaches in their sights, too. This study is an important
proof of principle.

Single molecules
It’s hard to speculate how fast technology for sc-proteomics will
develop, says Harvard Medical School researcher Peter Kharchenko, but “I
am most excited about the ability to quantify phosphorylation and other
modifications.” This would move the field “beyond simple abundance-based
models to more accurate dynamic descriptions.” Humboldt University
researcher Rune Linding hopes such approaches might open up new ways to
analyze phosphorylation dynamics.

Proteomics and genomics labs pursue different questions “but they clearly
provide different views on the same system,” says Aebersold. “In
proteomics, we’re always kind of limping a while behind the genomics
field,” he says. Single-cell genomics and transcriptomics can capture “a
kind of genealogy of the cell,” and track cells as they evolve and change
through mutations, he says. Proteomics labs can now analyze
Nature Methods | www.nature.com/naturemethods
small numbers of cells. “It’s a beginning,” he says. This is how
single-cell transcriptomics started, followed by massive multiplexing and
barcoding strategies that allowed resolution of large numbers of cells.
Such trends will take a while to develop in proteomics.

Labs are exploring cytometry-related ways to reap single-cell data. A
team that includes researchers at the University of Ottawa and the
University of Oxford used single-cell mass cytometry (CyTOF) to capture
the cell-fate decisions during hematopoiesis, and tracked how
transcription factor expression changed during a cell’s lineage
commitment at 13 time-points. They measured 27 proteins simultaneously in
single cells. Another single-cell CyTOF effort by a team at the
University of Zurich, along with colleagues at other institutions,
profiled tumor and immune cells from 144 human breast tumor samples.

In mass cytometry/CyTOF, cells are prepped for analysis with
isotope-conjugated antibodies. In sc-proteomics, it will likely be key to
integrate different technologies and physical principles, as with the
combination of Edman sequencing and microscopy, says New York University
researcher Christine Vogel, who was a postdoctoral fellow in the Marcotte
lab.

Rethinking sample prep
“My ideology is to make this as accessible as possible,” says Nikolai
Slavov, at Northeastern University, about his approach to sc-proteomics
methods development. In keeping with his training in genetics and
biology, and interests in math, chemistry and physics, Slavov runs
cross-disciplinary sc-proteomics meetings: mass spec veterans attend
along with researchers lacking such expertise, as well as physicians,
computational scientists and industry researchers.

Slavov is happy to see how CyTOF, single-cell Westerns and immunoassays
are enabling quantification of proteins in single cells. To move the
possibilities for identification and quantification beyond these
techniques, his lab developed single cell proteomics by mass spectrometry
(SCoPE-MS)⁴ for identifying and quantifying peptides from mammalian cells
with liquid chromatography and tandem mass spectrometry (LC-MS/MS). They
rethought sample preparation: cell lysis, protein purification, digestion
and clean-up. And there’s mass spec instruments’ aversion to the clean-up
chemicals to consider, says Slavov. In standard mass spectrometry, sample
is always lost as it can, for example, stick to the sides of
chromatography columns. Bulk sample analysis helps labs cope with that.
But, when assessing single mammalian cells and their few hundred
picograms of proteins each, little if any cargo should go missing.

[Figure: The Slavov lab is testing their mass spec sample-prep method to lyse cells in pure water, using a freeze–heat cycle. Panel labels: water; freeze–heat cycle; protein is digested to peptides; 96 or 384-well plate. Credit: Slavov lab, Northeastern Univ.]

Slavov, his graduate student Harrison Specht and the team hunted for a
different way to extract proteins efficiently for LC-MS/MS analysis. “We
kept trying different approaches, most of which didn’t work,” says
Slavov. Sonication to lyse cells with focused acoustic waves led them to
develop SCoPE-MS. When they validated the method, a student in the lab
held the tube in the sonicator to lyse cells one at a time. The team
mixed labeled peptides with labeled ‘carrier’ peptides to avoid the
“never-ending chase” to quantify sample losses from the clean-up process.
They used isobaric tandem mass tags (TMT), which bind to all peptides,
including the carrier peptides. The TMT tags all have the same molecular
weight so “when they enter the instrument, they’re going to correspond to
a single peak in m/z space,” says Slavov.

Vogel likes SCoPE-MS and says it’s still necessary to test new buffers
and optimize the approach, which her lab is currently doing. In SCoPE-MS,
carrier cells act as a kind of internal reference, says Linding. “It
works,” he says, but eventually labs will want to avoid a ‘carrier
proteome’. Other methods may emerge for analyzing proteins and
modifications, such as phosphorylation. Right now, sample preparation and
labeling technology are limiting sc-proteomics, he says, and he believes
CyTOF will play an important role in validation and imaging.

Because Slavov wanted a more affordable method, in SCoPE2 the team
replaced sonication by lysing cells with a freeze–heat cycle in pure
water⁵. The team is testing the efficiency of this sample preparation
technique and “it appears to be at least as good, if not better than,
urea lysis,” he says. He and his team used this method to quantify 2,000
proteins in 356 cells, a sample containing both monocytes and
macrophages.

Vogel says that pure water might avoid artifacts from chemicals, but she
wonders how soluble proteins without ion content are in pure water.
Speaking more generally about sample preparation for sc-proteomics, she
says that proteins are trickier than RNA and DNA: they cannot be
amplified, they’re sticky and they degrade easily. “Until someone invents
something to ‘amplify proteins’ that’ll always be the problem,” she says.
Many methods, including hers and others, address sample preparation by
using techniques such as hydrostatic pressure or engineered surfaces for
peptide enrichment.

Multiplexing
With TMT, around 20 barcodes can be used. But labeling a protein or
peptide introduces complexities, says Aebersold: the reaction must be
just right, excess label has to be removed and there’s clean-up.
Multiplexing is a scale-up that makes the workflow more complex.
Transcriptome analysis is readily available to biologists with
commercially well-supported techniques, while proteome analysis is mainly
done in expert labs and “cannot easily reach the throughput, robustness
and reproducibility of transcriptome analysis,” as Aebersold and his
colleague Ben Collins point out. But it doesn’t have to stay that way.

The 200–300 picograms of protein in one mammalian cell cannot yet be
tallied, says Aebersold. A properly tuned and optimized mass spec
instrument can detect all or most proteins expressed in a cell only when
many cells are analyzed concurrently. His lab has detected 500 to 1,000
distinct proteins from a sample equivalent to a single cell using
SWATH-MS, a type of data-independent analysis. They have not yet
‘processed’ a single cell but they injected a proportional fraction from
a small number of cells into the mass spec. The goal is to eliminate
sample handling losses, he says, such as material sticking to the surface
of a microtiter plate or Eppendorf tube. The team hunted for the least
absorbent material, chose polydimethylsiloxane (PDMS) and built
microfabricated devices that “work quite well,” he says. Cells flow into
the device with one cell per compartment, which is confirmed with
imaging. Cells can be manipulated and lysed,


proteins can be washed, and the sample can then be worked up for mass
spec. The team is still testing the approach. Aebersold likes that it
only requires cell sorting, with no chemicals needed.

“Eventually, in single-cell analysis, each cell is a singleton,” says
Aebersold. No two cells are entirely identical. They might resemble one
another closely in terms of biochemical function, but appear to be
dissimilar due to differing cell-cycle phases. To address such
variability, Linding says he and his team try to synchronize cell cycles
in their cell lines. Even then, cells are “highly heterogeneous,” he
says, which makes analysis tough. Much understanding about the cell cycle
is based on population-level data but sc-proteomics-based measurements
might deliver new insights about cell cycle stages.

Until his and other sc-proteomics techniques mature, and labs can analyze
large numbers of single cells quickly, the techniques will not yield
biologically interesting results, says Aebersold. That is why he and his
team pursue an intermediate goal: analysis of small numbers of cells.
They cluster cells using multi-parameter fluorescence-activated cell
sorting (FACS) with 8–10 fluorescent colors, and group them according to
similarity. “Rather than doing single cells and then averaging them or
combining them, we combine them first and then measure the average,” he
says.

A year ago, the team needed around 20,000 cells; now the method works
with 100 cells or fewer. “Any cell population that can be sorted with a
FACS sorter, even with very low numbers, is proteomically accessible,”
says Aebersold. A lab might be looking at a rare type of cell or one that
is only available in low numbers. For example, his team plans to work
with a neuroscience lab to analyze mouse neuronal stem cells. One can
obtain around 100 such cells per animal and the goal is to use as few
animals as possible.

Aebersold acknowledges this intermediate approach may seem less exciting.
At an annual conference on mass spec or other technology-focused event, a
lab presenting protein measurement from single cells might make a big
impression. At a cancer meeting, however, an sc-proteomic analysis from a
tumor might reap a different reaction. “They’re not going to be
impressed, because they’re going to say: ‘so what have you learned?’” In
experiments with small numbers of cells, such as when exposing cells to a
drug or other perturbation, some similar cell types might disappear and
others appear. “This is interesting information,” says Aebersold.

Computed proteins
Slavov says that many researchers in single-cell transcriptomics want to
adapt their software for sc-proteomics. All tools in this space will need
to be benchmarked “so that people don’t fool themselves,” he says. Labs
need to diagnose data quality and identify what needs trouble-shooting.
He and his team have developed data-driven optimization of MS (DO-MS)⁶ to
visualize and analyze data. It’s programmed in R, built as a Shiny app
and available online. It can help, for example, when elution profiles are
not sampled at their apex, which is needed to maximize the number and
purity of ions. Slavov foresees much development work ahead for
computational tools in sc-proteomics, as well as the need for standards
and benchmarking to make sure quantification is well-performed.

At Harvard Medical School, Peter Kharchenko and his team have been
developing a computational tool for single-cell RNA-seq data analysis to
address heterogeneity issues. These challenges have grown now that
RNA-seq is being applied in complex study types involving many
measurements, on numerous samples, from different people. A graphing tool
from his lab, written in R, is clustering on network of samples (CONOS)⁷,
which tracks cell types across these heterogeneous datasets and clusters
similar cell sub-groups. It can also be applied to sc-proteomics, says
Kharchenko.

“We’ve designed it to be very tolerant with respect to diversity of
samples,” says Kharchenko, so users can do joint analysis related to
perturbations or across different tissues. A number of integration
methods are emerging. At the same time, integration has ‘subproblems’, he
says, such as technical variation, variability across individuals or
tissues, molecular modalities or species. With sc-proteomics data
analysis, the “relative nature of the signal” is challenging and these
data have more complex dropouts than transcriptomic data.

In Linding’s view, given that sc-proteomics data are ‘richer’ than single
cell RNA-seq data, “what you need, is an error model.” Cellular proteins
are plentiful but in classic proteomic analysis, a cell’s single peptides
are often thrown out. He’s addressing that with a new algorithm for
making more accurate error assessments⁸.

[Figure: In single-cell proteomics, tools are emerging to quantify and integrate data about cell behavior, signaling and regulatory networks, says Rune Linding. Panel labels: Treatment A; Treatment B; normalized feature values; deep learning; biological forecasting. Credit: Linding lab, Humboldt Univ.]

In proteomics, it can seem that insufficient numbers of the cell’s
proteins are detected or that only the most abundant ones are seen, says
Linding, but mass spec does detect plenty. Unlike increases in detection
sensitivity for mRNAs, which are much less abundant than cellular
proteins, even small increases in the sensitivity of protein detection
make a big difference.

Given that sc-proteomics is in its infancy, many challenges also remain
for computational analysis, says Jürgen Cox from the Max Planck Institute
of Biochemistry. In sc-proteomics using mass spec, isobaric labeling is
quite promising, as several single-cell channels can be multiplexed for a
single mass-spec measurement. Including additional channels, such as
multi-cell samples, enhances signal detection in the mass spectrometer
and helps establish quantification standards.

It remains challenging in computational proteomics to interpret the
multiplexed quantification channels, given that “isobaric labeling
techniques are notoriously plagued by co-fragmentation signals,” says
Cox.
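
The effect of co-fragmentation on multiplexed channels can be sketched with a toy calculation in Python. This is an illustration under loud assumptions, not a method from Cox's group or from any tool named in this article: the function name and the numbers are hypothetical, and the model simply assumes that a co-fragmented background peptide adds the same amount of reporter-ion signal to both channels being compared.

```python
def observed_ratio(true_ratio: float, interference: float) -> float:
    """Toy model of how co-fragmentation distorts a reporter-ion ratio.

    The peptide of interest contributes `true_ratio` units of reporter
    signal in channel A and 1 unit in channel B. A co-fragmented background
    peptide is ASSUMED (for illustration only) to add `interference` units
    of signal to each of the two channels.
    """
    if true_ratio <= 0 or interference < 0:
        raise ValueError("true_ratio must be > 0 and interference >= 0")
    channel_a = true_ratio + interference  # target signal plus background
    channel_b = 1.0 + interference         # reference signal plus background
    return channel_a / channel_b


if __name__ == "__main__":
    # A true fourfold difference, measured with growing shared background.
    for interference in (0.0, 0.5, 1.0, 4.0):
        ratio = observed_ratio(4.0, interference)
        print(f"interference={interference:.1f}  observed ratio={ratio:.2f}")
```

In this simplified picture, a true fourfold difference appears smaller and smaller as the shared background grows, which hints at why normalization and signal modeling are needed before multiplexed channels can be compared.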


Fragmented peptides are “measured involuntarily” and add unwanted
contributions to the signals from peptides of interest. “We are working
on normalization methods and signal modeling that will improve the
situation,” he says. Missing values in the data matrix are inherent to
all single-cell technologies and usually need more attention than with
bulk ‘omics data. Taken together with isobaric labeling, he says, special
algorithms are needed for these issues.

Big physics
When physicists work with particle accelerators, probabilities are
applied to the likelihood something is detected. “That, I think, we are
lacking in biology,” says Linding. A mass spec instrument is a little
like a particle accelerator: when a signal of a certain size is detected,
it might or might not be a true signal. He asks, “Can we assign
uncertainties to events?” Labs juggling big data should work with
probabilities, not averages, he says. Models can then help with data
integration, and with explaining cellular behavior, to tease out how
different quantitative data are related or to determine causality. “To do
all this, we need probability-based models,” he says, also for predicting
behaviors of protein networks.

More labs can now use large datasets to, for example, predict how
proteins are interacting. Beyond measuring protein abundance or
concentration, cancer research labs will want to track kinase activity
quantitatively, learn how many phosphorylation sites there are and how
many target proteins can be phosphorylated. “It is the activity of the
molecule that is important, not necessarily the abundance,” says Linding.
Biology is changing: more tools are emerging to quantify different
aspects of signaling and of the regulatory networks in and between cells,
and sc-proteomics helps with that. Traveling between data captured at
different scales is non-trivial, says Linding, which calls for new
algorithms, and for machine learning combined with more traditional
mathematical models.

Kharchenko agrees with the need for probabilistic data interpretation. In
many cases, the probability of detection will be close to 1, he says, but
the uncertainty in abundance will remain. It will take more experiments
to figure out the structure of variation in such measurements. Models are
needed to estimate this resulting uncertainty.

Counting mRNAs and proteins
At some point, labs will want to integrate mRNA and single-cell protein
measurements. That will bring a longstanding discussion to the fore. The
transcriptome correlates with the proteome but, given a certain mRNA
concentration, it’s hard to model how many proteins from that RNA are
present. Beyond cell-to-cell variability in both transcription and
translation, and varying levels of protein turnover, there are phenomena
such as transcriptional bursts and many external factors that influence
cells, mRNAs and proteins. Much about the mRNA–protein relationship has
yet to be determined.

Aebersold, along with Jürg Bähler of University College London, and
colleagues, quantified the transcriptome and proteome in fission yeast by
looking at two cell-states: quiescence and rapid proliferation⁹. “There
were a lot of surprises,” says Aebersold, that shed light on the
relationship between gene regulation, transcription, protein production
and physiology. When fission yeast shifts to dormancy, proteins and mRNA
are down-regulated, each in specific ways.

Proliferating cells had around 41,000 mRNAs per cell, with a mean of
around 3 transcripts per gene. Protein-coding genes produced a mean of
around 5,000 protein copies per cell, with a dynamic range covering five
orders of magnitude up to a total of around 60 million protein molecules
per cell.

Quiescent cells had a much-reduced transcriptome, with a little over
7,400 mRNAs, and around 31 million protein molecules per quiescent cell.
When adjusted for the lower cell volume in quiescent cells, protein
numbers dropped by around 10% of the levels in proliferating cells. The
mRNA levels drop, and the protein levels less so: when a good food source
comes, “they’re ready to run,” says Aebersold.

The shift to quiescence is accompanied by proteome remodeling: nearly
half of all proteins changed their copy numbers around twofold. Protein
levels drop but not indiscriminately, and those the cell will need when
it emerges from quiescence stay at higher levels. Overall, in quiescent
cells, mRNAs were 10,000- to 60,000-fold less abundant than the
corresponding proteins, with 1 to 10 mRNAs per gene. The results
highlight an sc-proteomics challenge, says Aebersold. For example, a
twofold difference in mRNA level can lead a lab to interpret the cells as
different, which they might be, or their cell physiology may have merely
shifted, temporarily and stochastically.

A human cell contains hundreds of mRNA copies with proteins numbering in
the tens of thousands, says Aebersold. One might find only a few copies
per cell of a lowly expressed mRNA, which makes it ever more important to
determine what kind of “stochastic time-space” a cell is in, he says, and
the risk of drawing false conclusions again rears its ugly head. “I think
there’s a lot of pitfalls in the interpretation, from a biological point
of view, of single-cell data, which one would have to be aware of.”
Techniques matter, but when developing and using them one must heed “what
they’re good for,” he says. “Then it gets interesting when they can
uncover something new.”

Sc-next
As sc-proteomics emerges, labs take a diverse set of paths. As Vogel
says, “it might be a combination of different advances that will lead to
near single-cell proteomics.” One needs efficient protein extraction from
cells, enrichment via specific surfaces, sensitive mass spec needing less
and less sample, tagging tricks to enhance identification, and advances
in mass spec data acquisition and computational processing. Nevertheless,
Vogel is happy about some recent developments, including new mass spec
data acquisition methods and more libraries of known spectra, which
increase the possibilities of mined data.

“There is something to be gained by having different approaches,
particularly in the early phase,” says Linding. Given that it’s early
days in the field, he says, this is not the time to bet only on one
horse. ❐

Vivien Marx
Technology editor for Nature Methods.
e-mail: v.marx@us.nature.com

Published: xx xx xxxx
https://doi.org/10.1038/s41592-019-0540-6

References
1. Hofstadler, S. A. et al. Anal. Chem. 67, 1477–1480 (1995).
2. Collins, B. C. & Aebersold, R. Nat. Biotechnol. 36, 1051–1053 (2018).
3. Swaminathan, J. et al. Nat. Biotechnol. 36, 1076–1082 (2018).
4. Budnik, B., Levy, E., Harmange, G. & Slavov, N. Genome Biol. 19, 161 (2018).
5. Specht, H., Emmott, E., Koller, T. & Slavov, N. Preprint at bioRxiv https://doi.org/10.1101/665307 (2019).
6. Huffman, R. G., Chen, A., Specht, H. & Slavov, N. J. Proteome Res. 18, 2493–2500 (2019).
7. Barkas, N. et al. Nat. Methods https://doi.org/10.1038/s41592-019-0466-z (2019).
8. Robin, X. et al. Preprint at bioRxiv https://doi.org/10.1101/621961 (2019).
9. Liu, Y., Beyer, A. & Aebersold, R. Cell 165, 535–550 (2016).
