This paper discusses the claim that alternatives to standard evolutionary biology, based on novel advances in understanding the molecular and developmental bases of variation and inheritance, should be captured as a shift from "statistical" to "mechanistic" explanatory schemes (Pigliucci and Müller 2011).
Granted, statistical approaches characterized the Modern Synthesis, but by
examining the epistemic features of postgenomic science I claim that this is
not a proper characterization of the current epistemic shift. I will first char-
acterize the dual nature of the gene in development and inheritance, accounting
for it in terms of a difference between two sorts of causal ascriptions. Following the shift in postgenomic science regarding the concepts of genes, variation, and inheritance, I will then argue that, in contrast to mechanistic explanations, the shift provides us with novel topological explanatory frameworks for approaching genomic networks of many sorts, and with novel, nucleotide-focused statistical tools unlikely to translate directly into mechanistic modeling.
The author is extremely grateful to Chris Donohue, Antonine Nicoglou, and Arnaud
Pocheville, whose comments and suggestions have been indispensable for the arguments
of this paper. A first version of those arguments has been presented as an invited lecture
at the History and philosophy of biology seminar of the National Human Genome Research
Institute (Bethesda) in December 2016, and then at the Montréal-Duke-Toronto-Paris-
Cambridge Consortium for History and Philosophy of Biology in May 2017 in Montreal.
I thank both audiences; insightful comments have been given by Thomas Heams, Annick
Lesne, and Joseph McInnerney on the first version and they are also warmly acknowledged.
Hugh Desmond and Chris Donohue are thanked for a thorough language check.
Finally, sharp and detailed comments by two anonymous reviewers have substantially improved the paper. This work is supported by the LIA CNRS Paris-Montréal 'Epistemic and Conceptual Issues in Evolutionary Biology'. The author thanks the Institut de France and
the Fondation Desmaret for the support to the work that led to this publication.
Perspectives on Science 2019, vol. 27, no. 1
© 2019 by The Massachusetts Institute of Technology doi:10.1162/posc_a_00302
117
118 Epistemological Perspective on Genome Project Legacy
1. Introduction
Evolutionary biology, in the sense of the Modern Synthesis, which in the
1930s and 1940s articulated Darwinian natural selection and Mendelian
genetics around population and quantitative genetics, is currently under-
going a set of theoretical challenges based on various empirical and
conceptual advances (Pigliucci and Müller 2011; Laland et al. 2014;
Huneman and Walsh 2017; Müller 2017). Some authors propose an “Ex-
tended Synthesis,” which should integrate novel processes and modeling
styles within our approach to evolution and adaptation; other authors ar-
gue that the new empirical advances in evolutionary biology, such as the
acknowledgment of the role of phenotypic plasticity in evolution (West-
Eberhard 2003; Nicoglou 2015), the impact of epigenetic phenomena on
inheritance (Bonduriansky and Day 2009; Danchin et al. 2011), or cases
of directed variation through developmental constraints such as genetic
channeling (Brakefield 2006; Maynard Smith et al. 1985), are likely to
be accounted for within the conceptual framework set forth by the Modern Synthesis (Wray et al. 2014; Welch 2017). Granted, the Extended
Synthesis encompasses several claims, both about new problems that evo-
lutionary biologists should address, and new approaches, concepts, or
methods that should be considered. Recently, Müller (2017), one of its
major proponents, summarized the Extended Synthesis idea in these
terms:
Besides the expanded range of selection to multiple levels of
organization, the generative properties of developmental systems are
viewed as responsible for producing phenotypic specificity, whereas
natural selection serves to release that developmental potential.
Particular forms of phenotypic change are taken as the result of
internal generative conditions rather than external pruning. Thus,
a significant amount of explanatory weight is shifted from external
condition to the internal properties of evolving populations.
(…) Instead of chance variation in DNA composition, evolving
developmental interactions account for the specificities of phenotypic
construction.
As indicated here, Müller emphasizes the major role of development as a
“constructive” process within evolution. In this paper, I consider the claim
that development should be reintegrated within evolution, and ask what epistemological changes are brought about by this new approach to genes and genomics. To assess their consequences for evolutionary theorizing, I will consider a proposition made by Pigliucci and Müller (2011) about the substitution of a mechanistic explanatory mode for the statistical explanatory mode of the Modern Synthesis (§4). To this end, I will evaluate the
extent to which our mechanistic understanding of genes in development
and inheritance is enriched after the postgenomic turn. I will finally assess
Müller and Pigliucci’s claim about an epistemic turn supporting the Ex-
tended Synthesis project. In a word, I will argue that the epistemological
landscape of explanatory types we nowadays witness in this field is much
more complex and richer than what is claimed by some supporters of an
Extended Synthesis, and possibly, in general, by many participants in those
debates when they use philosophical arguments about explanatory types.
2. On the lack of unity in the Modern Synthesis from the viewpoint of a historian of
science, see Cain (2009).
4. See Tabery (2014) for an examination of Fisher’s claim that non-additive terms are
less important, and for the consequences of this claim.
5. Lewontin doesn’t say that inferring from heritability measures to causal factors in
individual traits is impossible, but that there is no general and principled way to do it. This
operation presupposes strict conditions on environmental homogeneity, kinds of popula-
tions considered, etc.
6. There are other views on this issue of process causation, for instance those of Bailly
and Longo (2011), but this is not the place to consider such controversies.
9. Think, for example, of Mayr’s well known attack on “bean bag genetics” (Mayr 1959).
its relations with other genes modulate its role in the expression of a
target gene, which involves both pleiotropy (one-to-many gene-protein relations) and epistasis (many-to-one gene-protein relations). Moreover, each gene is part of many networks, and each network involves many genes in different ways, all of which gives an idea of the complexity of gene-protein-trait relations. This could be conceived as a shift from the gene to the genome as the key agent in evolution and development (in contrast, for example, to the focus on the gene inherent in the "gene's eye view" (Dawkins 1982) and the discussions it triggered). Any revision of the epistemological role of genes in development and in inheritance should start from this overwhelming fact: gene networks now frame our understanding of genes.
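The one-to-many and many-to-one relations just mentioned can be pictured with a toy mapping; the gene and target names below are hypothetical, chosen only to make the two directions of the relation explicit:

```python
# Toy mapping (hypothetical gene and target names) of regulatory relations.
regulates = {                       # gene -> targets whose expression it modulates
    "gA": {"t1", "t2", "t3"},       # pleiotropy: one gene, many targets
    "gB": {"t1"},
    "gC": {"t1", "t4"},
}

# Inverting the mapping shows the converse relation.
regulated_by = {}
for gene, targets in regulates.items():
    for t in targets:
        regulated_by.setdefault(t, set()).add(gene)

print(regulated_by["t1"])           # epistasis: many genes, one target
```

Each gene appears in several targets' regulator sets, and each target collects several regulators, which is the many-to-many structure the text describes.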
b) A novel notion of variation arises from the sequencing of the human
genome (along with the genomes of fruit flies, mice, nematodes, yeast, the
sequencing of which was also part of the HGP), and challenges the idea
that variation is a question of recombination and mutation. Granted, those
may condition variation, but they are not sufficient conditions, as I will
now explain. Our progressive understanding of genomics indeed yielded
two so-called paradoxes regarding genes as DNA sequences: the C-value paradox and the G-value paradox (Elliott and Gregory 2015). The C-value paradox names the fact that the number of nucleotides in an organism's DNA correlates with neither the size of the organism (some plants and yeasts have as many nucleotides in their genomes as humans do) nor its complexity (assuming there is some correlation between complexity and size). As formulated later on, the G-value paradox names the fact that the number of coding sequences in DNA does not seem to correspond to the complexity of organisms (however complexity is defined): humans, like some plants and insects, have tens of thousands of genes, while some species have many more.10 Our intuition would be
that whatever the meaning of “complexity” is, more complex organisms
need more instructions to be built and to fulfill their functions, hence more
genes in the sense of coding sequences. But this is not the case at all. Thus,
basically, what explains the G-value and the C-value paradoxes is that a few genes can produce many variant transcripts (e.g., through alternative splicing or other processes) and hence many proteins, and that they produce more of them in some genomic and cellular contexts than in others. A plausible hypothesis is therefore that the uses of these genes within the genomic system, integrated in the cell machinery, account for the distinct complexities of organisms, not the mere number of (coding) genes. The genome is the primitive
10. Complexity and size may not evolve independently; the relation between their evo-
lutions has been intensively studied; for a recent overview see McCarthy and Enquist (2005).
system, and genes are now defined in genomics as units within those systems,
though there may be several ways to define them. This leads to the need for
several concepts of genes (e.g., Neumann-Held 2001; Moss 2003; Griffiths and
Stotz 2013), since DNA sequence as a coding sequence is not enough to denote
the content involved in the functioning and development of the cell.
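The combinatorics behind "few genes, many transcripts" is easy to sketch. The exon layout below is hypothetical; the point is only that with n cassette exons that splicing may include or skip, a single coding sequence can yield up to 2^n distinct transcripts:

```python
from itertools import product

# Hypothetical gene: exons E1 and E5 are constitutive (always retained),
# while E2-E4 are cassette exons that splicing may include or skip.
cassettes = ["E2", "E3", "E4"]

isoforms = set()
for choice in product([True, False], repeat=len(cassettes)):
    kept = [e for e, keep in zip(cassettes, choice) if keep]
    isoforms.add(tuple(["E1"] + kept + ["E5"]))   # exon order preserved

print(len(isoforms))   # 2**3 = 8 transcripts from a single coding sequence
```

With only a dozen cassette exons the same arithmetic already yields thousands of potential isoforms, which is one way a modest gene count can underwrite a large proteome.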
Conversely, many different genomic networks are likely to yield the same gene products or genomic expression profile (see Ciliberti et al. 2007 for an exploration of the network of GRNs). This overall fact challenges a key tenet of the Modern Synthesis, namely the claim, as in the earlier citation from Huxley, that variation is due to mutation and recombination. We now know that mutations may simply fail to produce variation, and that large variation can be achieved with almost no mutation, but rather through the reconfiguration of the network, whether by a change in the global network output after an environmental shift or by a small change in a regulatory gene that triggers major changes in the overall functioning of the network.
David et al. (2013) provided an experimental evolution model based on
the synthetic yeast genome in which adaptations to glucose deprivation
take place in various ways “with or without these significant mutations,
indicating that additional factors participated in this regulation and that
the regulatory network could reorganize in multiple ways to accommodate
different mutations." In the context of gene networks, variation as the fuel of adaptive evolution is not, in principle, to be equated with mutation.
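A toy Boolean network can make this concrete. The three-gene wiring and update rules below are entirely hypothetical; they show only that a fixed "genome" (the rules) settles into different stable expression profiles when the environment changes, and that altering a single regulatory interaction reorganizes the whole output without any widespread mutation:

```python
# Hypothetical three-gene Boolean network: A senses the environment,
# A activates B, and C integrates A and B. The "genome" (the rules) is
# fixed; only the environment or one regulatory interaction changes.

def step(state, env, repressor_on_C=True):
    a, b = state["A"], state["B"]
    return {
        "A": env,                   # A reads the environmental signal
        "B": a,                     # B is activated by A
        # C is repressed by B in the baseline wiring; a single regulatory
        # change (repressor_on_C=False) turns that repression into activation.
        "C": (a and not b) if repressor_on_C else (a or b),
    }

def attractor(env, repressor_on_C=True):
    """Iterate the synchronous dynamics until a state repeats."""
    state = {"A": False, "B": False, "C": False}
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state, env, repressor_on_C)
    return state

print(attractor(env=False))                        # everything off
print(attractor(env=True))                         # same genome, new stable profile
print(attractor(env=True, repressor_on_C=False))   # one regulatory change, another profile
```

The same rule set reaches different attractors under different environmental inputs, and flipping one regulatory sign changes the steady expression of C: variation in output with no change, or only a pointlike change, in the "genome."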
For this reason, the major debate about the relative roles of mutation
and selection in evolution, which pervaded the whole history of evolution-
ary theories, is to be reopened here. At the time of the Modern Synthesis, Fisher proposed the very strong argument that large mutations cannot be the source of the adaptive evolution of traits, since beneficial mutations must necessarily be small. On his view, the larger the mutation, the greater the chance that it impacts many traits within an organism, since organisms are always tightly integrated wholes; this means it would almost surely harm some of those traits. According to this argument, couched in a model now called Fisher's geometric model (see Martin and Lenormand 2008), macromutations cannot drive adaptive evolution. Large mutations a priori negatively affect the functional integration of organisms, so beneficial mutations must be small. However, if mutations can affect not only coding sequences but also non-coding regulatory genes (something we have known for several decades), and since these non-coding genes regulate the expression of various genes within a gene regulatory network, a mutation may be large and still affect traits only weakly and indirectly. Carroll (2005) claimed in
this sense that most of the mutations involved in the emergence of novelties
in evolution are mutations on regulatory genes rather than coding genes.
David and colleagues go further and indicate that a major adaptive change
structure and functioning indeed warrant such changes and therefore jus-
tify the sort of explanatory move pushed by the defenders of an Extended
Synthesis, we may first want to see such a justification formulated. For that
reason I will focus here on a philosophical rationale for the Extended Syn-
thesis formulated by Pigliucci and Müller (2011). In their attempt to for-
mulate a rationale for the move they are calling for in their book, these
authors suggest that whereas the Modern Synthesis biologists used statis-
tical models of gene frequency change to capture evolution because they
did not yet have access to fine-grained mechanisms, the Extended Synthe-
sis can rely on a much deeper knowledge of molecular and developmental
processes, and therefore is in a position to capture the mechanisms under-
lying evolution and adaptation. Thus, they contend that “the shift of em-
phasis from statistical correlation to mechanistic causation arguably
represents the most critical change in evolutionary theory today.”
As we have seen, the Modern Synthesis is indeed intrinsically based on statistical models and probabilistic knowledge. Müller and Pigliucci argue
that the insufficiency of statistical explanations in evolutionary biology
was made manifest in the recent advances of molecular and developmental
biology, including the postgenomic turn that I discussed in section 3. Such
a rationale appeals to a philosophical distinction between statistical and
mechanistic concepts, and therefore engages the current literature about
causal explanations and mechanisms in science (e.g., Craver and Darden
2013; Glennan 2017).16
In this section, I will therefore consider the explanatory regimes in-
volved in the exploration of genomes and their role in inheritance— hence
in evolution— as well as in development, in order to assess the validity of
Müller and Pigliucci’s interpretation. It will appear that while explanatory
innovations do occur in the postgenomic science of heredity and develop-
ment, those changes cannot be captured by the move from statistics to
mechanisms hypothesized by these authors. Therefore, their epistemolog-
ical argument cannot be the proper rationale for the move to an Extended
Synthesis, at least if one considers the legacy of the HGP for evolutionary
biology. However, a more general consequence will be that post-genomic
science, to the extent that it involves our knowledge of inheritance and
development, displays several epistemological regimes, so that an argu-
ment pinpointing a single explanatory change cannot justify the move
towards a new Synthesis in evolution.
16. There is a strong relation between this claim by Müller and Pigliucci and the so-
called “statisticalist interpretation” of natural selection, advanced initially by Walsh, Ariew,
Lewens and Matthen (Walsh et al. (2002); Matthen and Ariew (2002); Walsh (2010)). This
important debate about the nature of selection is wholly left aside here.
The first part of the section (4.2) addresses the way our postgenomic
science understands the role of genes in the development and functioning
of cells and organisms. I will consider whether the thinking emerging here
is a form of mechanistic thinking that could indeed be likely to supersede
the statistical thinking proper to the Modern Synthesis and mostly used to
handle genes as inheritance units. The second part (4.3) will look at geno-
mics’ attempts to identify markers of heritable traits, and see whether they
could translate into an ascription of causal roles which, in turn, would allow
a general mechanistic framework to address inheritance and evolution
together— in contrast with my discussion in section 2 of the explanatory
heterogeneity of the two roles of the gene.
such redundancy-based robustness from the many cases where the focal gene
does not have a redundant copy somewhere else (Wagner 2005b). In this
case, labeled “distributed robustness,” one should acknowledge that the gene
network keeps doing the same thing in the absence of the gene under focus.
This points to the following fact: genes have a context-sensitive, network-
defined effect rather than a specific causal role.17 Granted, one could always
say that each gene has a repertoire of several possible causal roles, disposi-
tionally triggered each time a gene is altered in the network; but given the
size of the network and the number of possible gene combinations, de-
scribing such a repertoire would be extremely difficult.
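A back-of-envelope count shows why such a repertoire is intractable. If a focal gene's effect depends on the on/off configuration of the other genes in a network of n genes, cataloguing its repertoire would require up to 2^(n-1) contexts (the network sizes below are arbitrary):

```python
# If a focal gene's effect depends on the on/off state of each of the other
# genes in a network of n genes, its repertoire of context-dependent causal
# roles would have to be catalogued over 2**(n-1) possible contexts.
contexts = {n: 2 ** (n - 1) for n in (10, 50, 300)}   # arbitrary network sizes
for n, c in contexts.items():
    print(n, c)
```

Already at n = 50 the count exceeds 10^14 contexts, so even a coarse dispositional repertoire could not be written down gene by gene.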
Thus, finding out the causal role of an allele in a given network (and not
just its influence on its neighbors) is not straightforward because this role
will depend upon thousands of interactions with other genes, and may change with changes in other genes or gene products in the network. What is explanatory, therefore, is often not the gene qua bearer of a causal role in the system but the whole network made up of the various effects of the genes in various contexts defined by various arrangements of
the network. Thus, the explanation of genomic expression (e.g., skeletoge-
netic micromere structure in sea urchins (Oliveri et al. 2008)) may partly
rely on network properties rather than on some allele’s specific causal role.
I am not arguing that this is the only way to infer behavior from our
knowledge of gene networks; in some cases, one can list the frequency at which a given gene impacts each phenotype and construct an expression profile for this gene.18 However, this strategy misses the originality of the network understanding made possible by GRN modeling. When Huxley in 1936 summarized the population geneticists' idea that genes are not simply proxies for traits, and that their phenotypic effects depend upon the genetic background in which they are embedded, he arguably licensed exactly this strategy. However, what's new in the GRN and other
gene networks is that this multiple context-dependent expression of genes
is fine-grained into a hierarchical pattern: genes are downregulated and/or
upregulated. Thus, a mere summary expression profile for genes would
miss such an interesting insight into the reasons why genes have distinct
expression profiles.
Exploring a gene network may, however, suggest dividing the genes, as nodes, into several categories defined internally to the network. For instance, in a paper about the GRN regulating the development of the heart in zebrafish, the authors, after having reconstructed the network,
17. This holds however they are defined, and whether they are DNA sequences, tran-
scripts, or sets of possible DNA sequences does not make much of a difference here.
18. My thanks to an anonymous reviewer for raising this objection.
19. Someone could object that specifying the topology is about the organization of the
mechanism, and that therefore the explanation that targets networks topology is a kind of
mechanistic explanation. This is not the place to answer such an objection, which has been
dealt with by Felline (2015) and Huneman (2017, 2018a).
cell and the organism will malfunction in their environment. Klemm and Bornholdt found that reliable attractor cycles are those constituted by pervasive triangular elements (minimal subgraphs) that deliver the signal through the triangle and then to the next node. The fluxes along the edges, as well as their orientation, are not decisive for the reliability of the performance. The topology of the network thus proves to be the main difference between reliable and unreliable attractors. Hence reliability can be assessed by examining the mere topological properties of the subgraphs.
As they stress: “The likelihood of reliable dynamical attractors strongly
depends on the underlying topology of a network.”
The same kind of reasoning applies to other networks such as the protein-
protein interaction networks (PPI). A recent study by Alvarez-Ponce et al. (2017), for instance, has shown that, even though the rates of evolution of different proteins in a lineage may be affected by many factors (e.g., their level of gene expression), the most important factor is a topological one, namely the centrality of the protein in the PPI network. Centrality is understood here on the basis of three topological parameters: degree, betweenness (the number of shortest paths between all pairs of other proteins that pass through a given protein), and closeness (one divided by the average distance between a protein and all other proteins in the network).
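These centrality parameters can be made concrete with a small sketch. The graph and protein names below are hypothetical; closeness is implemented exactly as defined above (one over the average distance to all other proteins), degree is the neighbor count, and betweenness, counting shortest paths through a node, can be computed along the same lines:

```python
from collections import deque

# Toy protein-protein interaction graph (hypothetical proteins), assumed
# connected, as an adjacency dict of undirected edges.
G = {
    "hub": {"p1", "p2", "p3", "p4"},
    "p1": {"hub", "p2"},
    "p2": {"hub", "p1"},
    "p3": {"hub"},
    "p4": {"hub", "p5"},
    "p5": {"p4"},
}

def distances(graph, src):
    """Shortest-path lengths from src to every node, via breadth-first search."""
    dist, queue = {src: 0}, deque([src])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def degree(graph, v):
    return len(graph[v])

def closeness(graph, v):
    """One over the average distance from v to all other nodes (the text's definition)."""
    d = distances(graph, v)
    total = sum(d[u] for u in graph if u != v)
    return (len(graph) - 1) / total

print(degree(G, "hub"), closeness(G, "hub"))   # central protein
print(degree(G, "p5"), closeness(G, "p5"))     # peripheral protein
```

The hub scores high on both measures and the peripheral protein low, which is the kind of purely positional difference the Alvarez-Ponce et al. result trades on.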
A more general indication of the fruitfulness of topological considerations in explanations about networks, as opposed to the focus on the activities of genes as entities within a mechanism to be reconstructed, is the notion put forth by Uri Alon of network "motifs." In metabolic networks, transcription-regulation gene networks, or PPI networks (Yeger-Lotem et al. 2004), there are topological patterns (small subnetworks) that recur much more frequently than expected by chance. Those "motifs," because of their topology, may do something in the transmission of signals within the overall network; they are like "logical gates" in a network (Figure 1). For instance, one motif may function as a feed-forward loop because of its topology (see C in figure 1). As the authors write, "The topology of this motif enables composite regulation schemes." The point here is that the shift towards genomic networks in the postgenomic turn does seem to offer topological explanations rather than mechanistic explanations, at least in the sense that neomechanicists give to the latter.
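The feed-forward loop can be given a minimal dynamical sketch. The wiring below (X activates Y and Z; Z fires only when both X and Y are present, with Y taking time to accumulate under X) is a toy rendition of the coherent feed-forward motif, not the published model; it shows how the motif's topology alone makes it filter brief input pulses:

```python
# Toy rendition of a coherent feed-forward loop (X -> Y, X -> Z, Y -> Z,
# with Z needing X AND Y): because Y takes `delay` steps to accumulate
# under X, brief pulses of X never switch Z on, while sustained input does.
# Wiring and signals are illustrative, not taken from the cited papers.

def run_ffl(x_signal, delay=2):
    y_timer = 0          # how long X has been continuously on
    z_trace = []
    for x in x_signal:
        y_timer = min(y_timer + 1, delay) if x else 0
        y = y_timer >= delay             # Y is "on" only after the delay
        z_trace.append(bool(x) and y)    # AND gate at Z
    return z_trace

brief = [0, 1, 0, 0, 0, 0, 0, 0]        # transient pulse of X
sustained = [0, 1, 1, 1, 1, 1, 1, 0]    # persistent X signal

print(run_ffl(brief))      # Z never fires
print(run_ffl(sustained))  # Z fires once Y has accumulated
```

The "gate" behavior falls out of the pattern of connections and the AND integration, not of the identity of any particular gene, which is what makes the explanation topological.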
To sum up, it is not the case in our explanations of development and
development-related network functioning in the postgenomic context that
a gene as a DNA sequence20 will itself be a unit of expression and develop-
mental action, and a causal-role bearer in an explanatory mechanism. Thus,
even though our understanding of variation involved in developmental
molecular mechanisms exceeds by far that of the MS, the kinds of expla-
nations we gather in this context are not stricto sensu, or not always, mech-
anistic explanations. This is not to deny that mechanistic explanations occur in this field, nor even that they may be the most frequent kind of explanation; but it surely counters the claim that what happens here is an intensification of mechanistic regimes of explanation at fine-grained levels of biological reality. From this perspective, one is not justified in thinking of the lessons of the postgenomic turn in terms of the replacement of a statistical explanatory scheme by a mechanistic understanding.
In addition, one could also consider that our understanding of topolog-
ical constraints allows us to understand something about evolution itself.
That is why the case I make here for the rise of topological explanations within postgenomic contexts, even though mechanistic explanations are still more frequent, is relevant to a discussion of the epistemological features of any evolutionary paradigm change. Erwin (2017) reminds us
of this in reviewing the various topological and metric spaces considered
by evolutionary biologists to make sense of novelty in evolution. Following
work by Wagner, Fontana, Schuster, Stadler, and others on RNA net-
works, he considers the space of RNA sequences and the way its topology
identify genes for traits and possibly change the explanatory landscape of
evolution and development.
4.3.1. ENCODE, Functions and the Genome. We saw that the dream of mapping particular genes onto phenotypes vanished, as the number of coding genes proved to be far smaller than the number of traits, especially with the HGP. Researchers gave up the idea that a gene could be mapped onto the trait, in a cell or an organism, for which it is responsible. However, a recent development could be seen as the continuation of this idea, namely the ENCODE
project, which aims at listing in a conveniently tractable way the functional
effects of genes within gene networks (rather than within cell or organismal
phenotype) (ENCODE Project 2012). ENCODE has strong ties with the
above-mentioned HapMap program, which was intended to explore varia-
tion in the human genome. Through ENCODE, many genes are now ascribed
some functions internal to their embedding within networks. Annotations are
also intended to capture variation. This led to criticism of the old idea that
most DNA is junk DNA, and of the very notion of junk DNA, since even
though many sequences are not exons, i.e., they are not expressed in traits,
they do something in the networks of gene regulation and gene expression,
and that something has been quantified (ENCODE Project Consortium
2012). The results of such enquiries have been labeled “functions” of the
genes. Yet, as highlighted by Doolittle (2013), the sense of “function” accord-
ing to which all sequences are ascribed functions— contrary to the previous
belief that many of them were junk DNA— is controversial, especially with
regard to the concept of function in evolutionary theory, namely functions as
“selected effects” (Neander 1991).
However, it is not even clear that those "functions" are causal-role functions, the main alternative philosophical account of biological functions (e.g., Cummins 1975). What these sequences do is so context-dependent that one cannot say that two given genes have two causal roles allowing them to be part of the same mechanism, since they perform a multiplicity of context-dependent things. Hence, nothing ensures that the two "functions" will be referred to the same context, which is a requisite for saying that they are two causal-role functions of a given item (since "being a function" is here ascribed in reference to a specific context, i.e., a system under study).21 Using
21. An analogous and sharper critique has been made by Graur and colleagues: “EN-
CODE adopted a strong version of the causal role definition of function, according to which
a functional element is a discrete genome segment that produces a protein or an RNA or
displays a reproducible biochemical signature (e.g., protein binding). Oddly, ENCODE not
only uses the wrong concept of functionality, it uses it wrongly and inconsistently. (…)
According to ENCODE, for a DNA segment to be ascribed functionality it needs to 1)
be transcribed, 2) be associated with a modified histone, 3) be located in an open-chromatin
area, 4) bind a transcription factor, or 5) contain a methylated CpG dinucleotide. (…) This
kind of argument is false because a DNA segment may display a property without neces-
sarily manifesting the putative function. For example, a random sequence may bind a tran-
scription factor, but that may not result in transcription” (Graur et al. 2013, p. 580).
22. Being a constraint on the system’s thermodynamic processing maintained by the very
activity of the system itself is the basic conception supporting the “organizational account” of
functions.
Figure 2. SNPs involved in human metabolic diseases. Excerpt from the EBI-NHGRI GWAS Catalog: https://www.ebi.ac.uk/gwas/. 2A is a magnified version of 2B, in order to make the colour representation of SNPs and the colour code more visible.
correlated with over 80 diseases and traits have been found. The methodology always consists in finding a correlation between variance in the trait and variation at a given locus, possibly across the whole genome (GWAS). No knowledge of the causal architecture of gene expression is required here (even though knowing the dominance, recessivity, and epistasis relations would help fine-grain the results); what matters is that those variants make a difference to the probability of the focal trait in the subpopulations that carry them. Moreover, as Laber
and Cox (2017) notice about SNPs involved in diabetes type II and obesity,
“the vast majority of these SNPs are in non-coding regions of the genome
and distal to promoters, suggesting they act through gene regulation which
makes their functional interpretation challenging.”23
These studies rely on the resolution power of our detection tools, which,
here, rests upon the size of the sample. The Major Depressive Disorder GWAS
(Genome-Wide Association Studies) consortium indeed noted in 2013: “Her-
itability alone reveals little about genetic architecture. In the absence of a de-
tailed understanding of genetic architecture, sample size and phenotypic
homogeneity are the critical determinants of discovering robust and replicable
genetic associations" (MDD 2013; my emphasis). The point is straightforward: the larger the sample, the greater the chance of finding a variant that is slightly more represented in the subpopulation with the trait under study than in the other subpopulation. This explains, for example, why researchers found many mutations involved in depression by switching to cohorts of about 180,000 people (Okbay et al. 2016) instead of 9,000 (MDD GWAS 2013). As
Mullins and Lewis (2017) indicate, “this number [of genetic associations] is
now expected to increase linearly with sample size, as seen in other polygenic
disorders.” Considering the study of genetic loci involved in depression, they
measured the effect of shifting sample size in the following way: “The
CHARGE study of 51,258 participants would have 50% power to detect
a variant accounting for 0.0058% of trait variance. A study of 180,000
participants, similar to SSGAC, could detect a variant accounting for
0.017% of trait variance (with 50% power).”24
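The quoted power figures follow from a standard back-of-envelope approximation (an assumption on my part, not the consortia's published method): for a one-degree-of-freedom association test, the test statistic for a variant explaining a fraction f of trait variance in N individuals is roughly normal with mean sqrt(N·f), and power reaches 50% when that mean equals the critical value, so the detectable f scales as 1/N:

```python
from statistics import NormalDist

# Back-of-envelope GWAS power sketch (a simplifying assumption, not the
# consortia's own model): a variant explaining fraction f of trait variance
# in N individuals yields a test statistic roughly normal with mean
# sqrt(N * f). Power hits 50% when sqrt(N * f) equals the critical value,
# so the smallest detectable f is z_crit**2 / N, i.e. proportional to 1/N.
norm = NormalDist()
alpha = 5e-8                            # usual genome-wide significance threshold
z_crit = norm.inv_cdf(1 - alpha / 2)    # two-sided critical value, ~5.45

def f_at_half_power(n):
    """Smallest variance fraction detectable with 50% power at alpha."""
    return z_crit ** 2 / n

for n in (51_258, 180_000):
    print(f"N = {n:>7,}: f ~ {f_at_half_power(n):.3%}")
```

At N = 180,000 this gives roughly 0.017%, in line with the SSGAC figure quoted above; the general point is the 1/N scaling, which is why associations accumulate roughly linearly with sample size.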
Undoubtedly the same trend will affect the research of genetic variants
in schizophrenia, in many diseases, and possibly in behavioral traits.25 As a
study of depression indicates, “it has become clear that the effects of com-
mon genetic variants for most complex human diseases are considerably
smaller than many had anticipated. This implies that sample sizes necessary
for identification of common genetic main effects were far larger than could be
attained by single-research groups or existing consortia” (MDD 2013).
The consequence of this trend towards larger sample sizes, however, is
that all those mutations detected by enlarging the cohort size will con-
tribute only very slightly to explaining phenotypic variance on a given trait.
Moreover, if one assumes that a difference-making effect on heritability can
translate into a causal role in individuals, the mutations will have a vanishingly
23. See also Ziegler and Sun (2012): “Unexpectedly, many of the identified associa-
tions did not map to genes but to gene deserts, and the biology underlying these discoveries
is rarely immediately apparent.”
24. CHARGE and SSGAC are two consortium studies on the genetics of depression.
25. See Longino (2013) and Schaffner (2016) on assessing the genetics of human behavior.
Perspectives on Science 143
small causal effect on the trait in a given individual. I call this the “paradox
of resolution.”
The reason for this paradox is the following. If one randomly picks an
individual from the whole population affected by the focal trait, the
chances that she carries this mutation (detected only in a very-large-
sample GWAS study) are in fact very low. Since a very large sample is
needed to make the statistical relevance of this mutation to the variance
manifest, such statistical relevance is not salient in small samples. In
other words, the probability that an individual has trait T given that she
carries allele X is only marginally higher than for an individual lacking X
(everything else being equal); and this vanishingly small increase in
probability is correlated with the resolution power: the higher the
required resolution, the smaller the increase. Therefore, for a randomly
chosen individual with allele X, chances are high that this allele makes no
significant difference to her having trait T. Seen from another
perspective, a putative mechanism explaining why she has T is unlikely to
include the focal mutation (or SNP allele) X as a causal entity, because
the chances that the individual carries allele X at all are small. The
paradox of resolution therefore entails that the greater our capacity to
detect alleles involved in the heritability of a given focal trait, the
less likely those alleles are to play a robust causal role in the
development of the trait, if we intend to derive causal roles from our
knowledge of alleles involved in heritability. Thus, what the new methods,
with their dependence upon resolution power, teach us about genes involved
in heritable traits will appear quite causally irrelevant if inferences are
drawn from detected trait-related SNPs to the causes of trait development.
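The paradox can be illustrated numerically. In the toy simulation below (all parameters are arbitrary assumptions, not empirical values), a trait is produced by one hundred SNPs of tiny effect plus environmental noise, and "affected" individuals are those in the top decile. Conditioning on any single focal allele barely shifts the probability of being affected.

```python
import random

random.seed(0)

N_IND, N_SNP, BETA = 20_000, 100, 0.05  # arbitrary toy parameters

# Each individual: carrier status (0/1) for each SNP, plus Gaussian noise.
genotypes = [[random.randint(0, 1) for _ in range(N_SNP)] for _ in range(N_IND)]
traits = [BETA * sum(g) + random.gauss(0.0, 1.0) for g in genotypes]

threshold = sorted(traits)[int(0.9 * N_IND)]  # top-decile cutoff
affected = [t > threshold for t in traits]

focal = 0  # one small-effect SNP singled out for inspection
carriers = [a for g, a in zip(genotypes, affected) if g[focal] == 1]
non_carriers = [a for g, a in zip(genotypes, affected) if g[focal] == 0]

p_t_given_x = sum(carriers) / len(carriers)
p_t_given_not_x = sum(non_carriers) / len(non_carriers)
risk_difference = p_t_given_x - p_t_given_not_x
print(p_t_given_x, p_t_given_not_x, risk_difference)
```

Here the allele is a genuine difference-maker for variance across the population, yet for a randomly chosen individual its presence shifts the probability of the trait only marginally.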
GWAS analyses indeed provide much more detailed detection of the genetic
bases of heritability of the traits under study than classical genetic
linkage analysis based on family groupings, since they address traits at
the level of nucleotides, not genes themselves (however these are
defined). However, as Juran and Lazaridis (2011) put it in a paper
about the “genomics in post-GWAS era,” “the main lesson learned from
the early GWAS efforts is that though many disease-associated variants
are often discovered, most have only a minor effect on disease, and in total
explain only a small amount of the apparent heritability,” which directly
stems from the paradox of resolution. The so-called post-GWAS methods
that come next do overcome some of the limitations of the GWAS methods,
especially the focus on common variants (since the target nucleotides,
SNPs, on which GWAS studies focus are common variants, and hence the
studies ignore all the rare variants),26 but they may not surmount the
paradox of resolution just sketched here, so long as they still rely on a
resolution power based on increased sample size.
Thus, when Lewontin criticized in principle the derivation of causal
weight from the analysis of phenotypic variance, he was pointing to a limit
that is not overcome by current genomics, since the paradox of resolution
now prevents such direct inferences, and in turn, prevents elaborating
mechanisms that would show genes robustly involved in the production
of traits.
This paradox does not hamper the GWAS and post-GWAS projects
themselves, since the intent of those studies is not to explain the causes
of the traits under scrutiny, but precisely to identify mutations likely to
increase a risk in the case of diseases. However, from the viewpoint
of the causal analysis of those traits, it affects the project of drawing major
mechanistic consequences from this extended knowledge of the bases of
the heritability of some focal traits.
GWAS methods and their successors allow genomes to be considered
broadly, without making any assumptions regarding the relation between
genes and traits or a direct correlation between the genotype and phenotype
levels; in sum, they allow genes to be considered as elements of genomic
networks rather than as instructions for traits. However, the very idea of a
polygenic score for a trait, namely the summation of the effects of the
detected SNPs that make a difference to the trait, as derived from GWAS
studies, does not fit into a mechanistic paradigm of explanation of the
trait and its development, since those scores say nothing about the possible
activities of the alleles involved.
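An additive polygenic score is nothing more than a weighted sum of allele counts. The sketch below (with hypothetical weights and genotypes) makes the point plain: the score encodes no information about what the alleles do.

```python
def polygenic_score(allele_counts, effect_sizes):
    """Additive polygenic score: per-SNP allele counts (0, 1, or 2)
    weighted by GWAS effect-size estimates. The score is purely
    statistical; it carries no information about the activities or
    mechanisms of the underlying alleles."""
    return sum(w * g for w, g in zip(effect_sizes, allele_counts))

# Hypothetical individual genotyped at three trait-associated SNPs:
score = polygenic_score([0, 1, 2], [0.10, 0.20, 0.30])
print(score)  # arithmetically, 0*0.10 + 1*0.20 + 2*0.30
```

Two individuals with identical scores may carry entirely different alleles, which is one way of seeing why the score floats free of any mechanism.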
The Modern Synthesis relied on sophisticated statistical tools to capture
evolutionary change. However, my cursory investigation of the study of the
genetic bases of traits in the postgenomic framework does not support the
view that this statistical approach has been replaced by a mechanistic
understanding of heritability and then of evolution. On the contrary, while
this knowledge relies on new methods, in principle it does not give us a
grip on the causal levers involved in the traits.
In fact, current epistemic use of GWAS data is limited by the purely
SNP-oriented character of the method, which tells us precisely nothing
about the varying context-dependent effects of the genes. Thus, polygenic
scores are very poorly informative about the biological reality of the genetic
architecture of traits and its dynamics. Geneticists increasingly acknowl-
edge this fact: “Given the complex genetic architecture and synergistic ef-
fects among these genes, the holistic effect of a gene network or a pathway
is expected to have a larger effect than the sum of individual effects from
each gene. In addition, it is usually challenging to interpret the genetic
associations for their functional connection with the trait only based on
the annotation of a single gene" (Sun 2012). The future of GWAS methods
lies precisely in coupling their results, rearticulated in terms of
reconstructed genomic networks, with the topological explanations to which
such networks give rise, as seen in section 4.2. Sun (2012) summarizes such
prospects in these
words: “Biological networks and pathways are built to represent the func-
tional or physical connectivity among genes. Integrated with GWAS data,
the network- and pathway-based methods complement the approach of
single genetic variant analysis, and may improve the power to identify
trait-associated genes.”
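One simple pathway-based method of this kind is over-representation analysis: asking whether trait-associated genes overlap a predefined gene set more often than chance would predict, via a hypergeometric tail probability. The sketch below is a generic illustration of the approach with toy numbers, not the specific method Sun (2012) discusses.

```python
from math import comb

def enrichment_p(genome_size, pathway_size, n_hits, n_hits_in_pathway):
    """Hypergeometric upper-tail probability that at least
    `n_hits_in_pathway` of `n_hits` trait-associated genes fall in a
    pathway of `pathway_size` genes, out of `genome_size` genes total."""
    total = comb(genome_size, n_hits)
    return sum(
        comb(pathway_size, k) * comb(genome_size - pathway_size, n_hits - k)
        for k in range(n_hits_in_pathway, min(n_hits, pathway_size) + 1)
    ) / total

# Toy numbers: 8 of 30 GWAS hits land in a 200-gene pathway
# out of a 20,000-gene genome -- far more than expected by chance.
print(enrichment_p(20_000, 200, 30, 8))
```

Aggregating weak single-variant signals over a gene set in this way is what can "improve the power to identify trait-associated genes," at the price of moving still further from any mechanistic story about individual alleles.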
Thus, rather than the embedding of developmental and evolutionary
knowledge within a single systematic mechanistic framework, the new
epistemological perspectives offered to evolutionary theory by a postgenomic
take on the role of genes in inheritance and development should be under-
stood as a plurality of explanatory modes, including topological explana-
tion and powerful nucleotide-focused statistical tests.
5. Conclusions
To sum up, besides some fine-grained mechanistic understanding of the
involvement of genes as signals in cascades, we found in postgenomic contexts
several novel epistemic features relevant to evolutionary theory: a set of topo-
logical explanations for network properties proper to gene or protein networks
(4.2); and a novel statistical understanding of genomic patterns as responsible
for traits, which goes beyond the one-or-two-loci genotype/phenotype
statistical correlations on which the Modern Synthesis relied and uses
novel detection and statistical tools (4.3). More precisely, pace Pigliucci
and Müller (2011), what we do in the postgenomic era, at least regarding
genomes, does not really foster a mechanistic explanation instead of the
classical use of genotype-phenotype correlation and statistical fitness-based
modeling of evolutionary dynamics. Rather, it involves several explanatory
practices intended to make sense of the high-throughput output of geno-
mic data, while discovering mechanisms appears difficult because of the
problems of applying some functional talk to the whole genome (4.3.1)
and because of the paradox of resolution (4.3.2).
An epistemological consequence of these shifts brought about by the
HGP thereby consists in a diversification of explanatory modes when it
comes to capturing the status of genomes and the production of phenotypic
traits through development as well as through inheritance. Even if, for some
reason not explored in this paper, we have to integrate development within
evolution, and if, by undertaking this, new advances in molecular and
developmental processes, as well as epigenetic inheritance, plasticity, or
cultural inheritance compel us to shift from the statistical kind of expla-
nations proper to the MS to a mechanistic kind of explanation, still the
146 Epistemological Perspective on Genome Project Legacy
References
Alvarez-Ponce, Daniel, F. Feyertag, and S. Chakraborty. 2017. “Position
Matters: Network Centrality Considerably Impacts Rates of Protein
Evolution in the Human Protein–Protein Interaction Network.” Genome
Biology and Evolution 9 (6): 1742–1756.
Amundson, Ron. 2005. The Changing Role of the Embryo in Evolutionary
Thought: Roots of Evo-Devo. Cambridge: Cambridge University Press.
Arnold, S. J., R. Bürger, P. A. Hohenlohe, B. C. Ajie, and A. G. Jones.
2008. “Understanding the Evolution and Stability of the G-Matrix.”
Evolution: International Journal of Organic Evolution 62 (10): 2451–2461.
Bailly, François, and Giuseppe Longo. 2011. Mathematics and the Natural
Sciences: The Physical Singularity of Life. New York: World Scientific.
Balaskas, N., A. Ribeiro, J. Panovska, E. Dessaud, N. Sasai, K. Page,
J. Briscoe, and V. Ribes. 2012. "Gene Regulatory Logic for Reading
the Sonic Hedgehog Signaling Gradient in the Vertebrate Neural
Tube.” Cell 148 (1): 273–284.
Beatty, John. 1986. “The Synthesis and the Synthetic Theory.” Pp. 125–136
in Integrating Scientific Disciplines. Edited by W. Bechtel. Dordrecht:
Nijhoff.
Beatty, John. 2016. “The Creativity of Natural Selection? Part I: Darwin,
Darwinism, and the Mutationists.” Journal of the History of Biology 49:
659–684.
Bonduriansky, Russell, and Troy Day. 2009. “Nongenetic Inheritance and
Its Evolutionary Implications.” Annual Review of Ecology, Evolution, and
Systematics 40: 103–125.
Brakefield, Paul M. 2006. "Evo-Devo and Constraints on Selection." Trends
in Ecology & Evolution 21: 362–368.