
Q&A SCIENCE TALK

BIOCHEMISTRY

Journey to the Genetic Interior
What was once known as junk DNA turns out to hold hidden treasures, says computational biologist Ewan Birney
Interview by Stephen S. Hall

In the 1970s, when biologists first glimpsed the landscape of human genes, they saw that the small pieces of DNA that coded for proteins (known as exons) seemed to float like bits of wood in a sea of genetic gibberish. What on earth were those billions of other letters of DNA there for? No less a molecular luminary than Francis Crick, co-discoverer of DNA’s double-helical structure, suspected it was “little better than junk.”

The phrase “junk DNA” has haunted human genetics ever since. In 2000, when scientists of the Human Genome Project presented the first rough draft of the sequence of bases, or code letters, in human DNA, the initial results appeared to confirm that the vast majority of the sequence—perhaps 97 percent of its 3.2 billion bases—had no apparent function. The “Book of Life,” in other words, looked like a heavily padded text.

But beginning roughly at that same time, a consortium of dozens of international laboratories embarked on a massive, unglamorous and largely unnoticed project to annotate what one biologist has called the “humble, unpretentious non-gene” parts of the human genome. Known as the Encyclopedia of DNA Elements (ENCODE for short), the project required scientists, in essence, to crawl along the length of the double helix as they attempted to identify anything with a biological purpose. In 2007 the group published a preliminary report hinting that, like the stuff all of us park in the attic, there were indeed treasures aplenty amid the so-called junk.

Now, in a series of papers published in September in Nature (Scientific American is part of Nature Publishing Group) and elsewhere, the ENCODE group has produced a stunning inventory of previously hidden switches, signals and signposts embedded like runes throughout the entire length of human DNA. In the process, the ENCODE project is reinventing the vocabulary with which biologists study, discuss and understand human inheritance and disease.

Ewan Birney, 39, of the European Bioinformatics Institute in Cambridge, England, led the analysis by the more than 400 ENCODE scientists who annotated the genome. He recently spoke with Scientific American about the major findings. Excerpts follow.

IN BRIEF
who
EWAN BIRNEY
vocation/avocation
“Cat herder in chief” of the ENCODE consortium of 400 geneticists from around the world
where
European Bioinformatics Institute, Cambridge, England
research focus
Creating an encyclopedia detailing what the most mysterious parts of the human genome do
big picture
“I get this strong feeling that previously I was ignorant of my own ignorance, and now I understand my ignorance.”

Scientific American: The ENCODE project has revealed a landscape that is absolutely teeming with important genetic elements—a landscape that used to be dismissed as “junk DNA.” Were our old views of how the genome is organized too simplistic?

BIRNEY: People always knew there was more there than protein-coding genes. It was always clear that there was regulation. What we didn’t know was just quite how extensive this was.

Just to give you a sense here, about 1.2 percent of the bases are in protein-coding exons. And people speculated that “maybe there’s the same amount again involved in regulation or maybe a little bit more.” But even if we take quite a conservative view from our ENCODE data, we end up with something like 8 to 9 percent of the bases of the genome involved in doing something like regulation.

Thus, much more of the genome is devoted to regulating genes than to the protein-coding genes themselves?

And that 9 percent can’t be the whole story. The most aggressive view of the amount we’ve sampled is 50 percent. So certainly it’s going to go above 9 percent, and one could easily argue for something like 20 percent. That’s not an unfeasible number.

Should we be retiring the phrase “junk DNA” now?

Yes, I really think this phrase does need to be totally expunged from the lexicon. It was a slightly throwaway phrase to describe very interesting phenomena that were discovered in the 1970s. I am now convinced that it’s just not a very useful way of describing what’s going on.

What is one surprise you have had from the “junk”?

There has been a lot of debate, inside of ENCODE and outside of the project, about whether or not the results from our experiments describe something that is really going on in nature. And then there was a rather more philosophical question, which is whether it matters. In other words, these things may biochemically occur, but evolution, as it were, or our body doesn’t actually care.

That debate has been running since 2003. And then work by ourselves, but also work outside of the consortium, has made it much clearer that the evolutionary rules for regulatory elements are different from those for protein-coding elements. Basically the regulatory elements turn over a lot faster. So whereas if you find a particular protein-coding gene in a human, you’re going to find nearly the same gene in a mouse most of the time, and that rule just doesn’t work for regulatory elements.

In other words, there is more complex regulation of genes, and more rapid evolution of these regulatory elements, in humans?

Absolutely.

That’s a rather different way of thinking about genes—and evolution.

I get this strong feeling that previously I was ignorant of my own ignorance, and now I understand my ignorance. It’s slightly depressing as you realize how ignorant you are. But this is progress. The first step in understanding these things is having a list of things that one has to understand, and that’s what we’ve got here.

Earlier studies suggested that only, say, 3 to 15 percent of the genome had functional significance—that is, actually did something, whether coding for proteins, regulating how the genes worked or doing something else. Am I right that the ENCODE data imply, instead, that as much as 80 percent of the genome may be functional?

One can use the ENCODE data and come up with a number between 9 and 80 percent, which is obviously a very big range. What’s going on there? Just to step back, the DNA inside of our cells is wrapped around various proteins, most of them histones, which generally work to keep everything kind of safe and happy. But there are other types of proteins called transcription factors, and they have specific interactions with DNA. A transcription factor will bind only at 1,000 places, or maybe the biggest bind is at 50,000 specific places across the genome. And so, when we talk about this 9 percent, we’re really talking about these very specific transcription-factor-to-DNA contacts.

On the other hand, the copying of DNA into RNA seems to happen all the time—about 80 percent of the genome is actually transcribed. And there is still a raging debate about whether this large amount of transcription is a background process that’s not terribly important or whether the RNA that is being made actually does something that we don’t yet know about.

Personally, I think everything that is being transcribed is worth further exploration, and that’s one of the tasks that we will have to tackle in the future.

There is a widespread perception that the attempts to identify common genetic variants related to human disease through so-called genome-wide association studies, or GWAS, have not revealed that much. Indeed, the ENCODE results now show that about 75 percent of the DNA regions that the GWAS have previously linked to disease lie nowhere near protein-coding genes. In terms of disease, have we been wrong to focus on mutations in protein-coding DNA?

Genome-wide association studies are very interesting, but they are not some magic bullet for medicine. The GWAS situation had everyone sort of scratching their heads. But when we put these genetic associations alongside the ENCODE data, we saw that although the loci are
not close to a protein-coding gene, they really are close to one of these new elements that we’re discovering. That’s been a lovely thing. In fact, when I first saw it, it was a slightly too-good-to-be-true moment. And we spent a long time double-checking everything.

How does that discovery help us understand disease?

It’s like opening a door. Think about all the different ways you can study a particular disease, such as Crohn’s: Should we look at immune system cells in the gut? Or should we look at the neurons that fire to the gut? Or should we be looking at the stomach and how it does something else?

All those are options. Now suddenly ENCODE is letting you examine those options and say, “Well, I really think you should start by looking at this part of the immune system—the helper T cells—first.” And we can do that for a very, very big set of diseases. That’s really exciting.

Now that we are retiring the phrase “junk DNA,” is there another, better metaphor that might explain the emerging view of the genetic landscape?

What it feels like is genuinely a jungle—a completely dense jungle of stuff that you have to work your way through. You’re trying to hack your way to a certain position. And you’re really not sure where you are, you know? It’s quite easy to feel lost in there.

Over the past 20 years the public has been repeatedly told that these big genomic projects—starting with the Human Genome Project and going on through various other projects—were going to explain everything we needed to know about the “book of life.” Is ENCODE simply the latest in this sequence?

I think that each time we always said, “These are foundations. You build on them.” Nobody said, “Look, the human genome bases, that’s it. It’s all done and dusted—we’ve just got a bit of code breaking to do here.” Everybody said, “We’re going to be studying this for 50 years, 100 years. But this is the foundation that we start on.” I do get the feeling that the ENCODE project is the next layer in that foundational resource for other people to stand on top of and look further. The biggest change here is in our list of known unknowns. And I think people should understand that although finding out how much you don’t know can feel regressive and frustrating, identifying the gaps is really good.

Ten years ago we didn’t know what we didn’t know. There is no doubt that ENCODE poses many, many, many more questions than it directly answers. At the same time, for Crohn’s disease, say, and lots of other things, there are some effectively quick wins and low-hanging fruit—at least for researchers—where you start to say to people, “Oh my gosh, have you looked there?”

It’s just one more step. It’s an important step, but nowhere near the end, I’m afraid.

You sometimes refer to yourself as ENCODE’s “cat herder in chief.” How many people were involved in the consortium, and what was it like coordinating such a massive effort?

This is very much a different way of doing science. I am only one of 400 investigators, and I am the person who is charged to make sure that the analysis was delivered and that it all worked out. But I had to draw on the talents of many, many people.

So I’m more like the cat herder, the conductor, necessarily, than someone whose brain can absorb all of this. It comes back to that sense that it’s a bit of a jungle out there.

Well, you deserve a lot of credit. It’s more than just cats. They’re pretty opinionated cats.

Yeah, they are. What scientists are not are dogs. Dogs naturally run in packs. Cats? No. And I think that sums up the normal scientific phenotype. And so you have to cajole these people sometimes into sort of taking the same direction.

Do you see a point where all this complex information will resolve into a simpler message about human inheritance and human disease? Or do we have to accept the fact that complexity is, as it were, in our DNA?

We are complex creatures. We should expect that it’s complex out there. But I think we should be happy about that and maybe even proud about it.

Stephen S. Hall has written about science for the Atlantic, New York Times Magazine, New Yorker and many other magazines.

ENIGMA: Researchers have found greater complexity in human DNA than this “simple” model would suggest.
LEONARD LESSIN Photo Researchers, Inc.

MORE TO EXPLORE
The ENCODE project: Encyclopedia of DNA Elements: www.genome.gov/10005107

SCIENTIFIC AMERICAN ONLINE
Discover more about DNA at ScientificAmerican.com/oct2012/genes

84  Scientific American, October 2012


© 2012 Scientific American
