
Insect Molecular Biology (2006) 15(1), 45–56

Evolution of spider silks: conservation and diversification of the C-terminus
Blackwell Publishing Ltd

R. J. Challis*, S. L. Goodacre and G. M. Hewitt


*IEB, University of Edinburgh, Kings Buildings, West Mains
Road, Edinburgh, UK; Centre for Ecology, Evolution and
Conservation, School of Biological Sciences, University of
East Anglia, Norwich, UK
Abstract
Analysis of DNA sequences coding for the C-terminus
of spider silk proteins from a range of spiders suggests
that many silk C-termini share a common origin, and that
their physical properties have been highly conserved
over several hundred million years. These physical properties are compatible with roles in protein synthesis,
silk function and in recruiting accessory proteins.
Phylogenetic relationships among different silk genes
suggest that any recombination has been insufficient to
homogenize the different types of silk gene, which appear
to have evolved independently of one another. The types
of nucleotide substitutions that have occurred suggest
that selection may have operated differently in the various
silk lineages. Amino acid sequences of flagelliform silk
C-termini differ substantially from the other types of spider silk studied, but they are expected to have very similar
physical properties and may perform a similar function.
Keywords: Araneae; C-terminus; sequence evolution;
silk; spider.
doi: 10.1111/j.1365-2583.2005.00606.x
Received 27 May 2005; accepted following revision 1 August 2005. Correspondence: Dr S. Goodacre, School of Biological Sciences, University of
East Anglia, Norwich, NR4 7TJ. Tel.: +44-1603-593 853; fax: +44-1603-592
250; e-mail: s.goodacre@uea.ac.uk.
© 2006 The Royal Entomological Society

Introduction
Spider silks are multimeric, modular fibres that are of
considerable interest to biotechnologists because of their
unique physical properties (Jin & Kaplan, 2003). Silks are
classified according to the gland in which they are produced
and by the spinning apparatus of the spider. Those that
are woven by spiders with a cribellum (cribellate silks) are
extremely fine and typically achieve stickiness through van
der Waals interactions and hydrophobic nodes in their protein
sequence (Hawthorn & Opell, 2002). In contrast, the thicker
silks of spiders without a cribellum (ecribellate silks) are
optimized for strength and elasticity and are secreted with
a coating of an aqueous silk protein to achieve stickiness.
Cribellate silks have been isolated from only a few
species, including the primitive Mygalomorph spider Euagrus
chisoseus and the Haplogyne Plectreurys tristis (Fig. 1,
Gatesy et al., 2001). Much more attention has focused on
a range of ecribellate silks found in derived Entelegyne
spiders, particularly the Araneoidea (see Table 1), perhaps
because these possess the most impressive physical
properties. Two classes of ecribellate spidroin silks (MaSp1
and MaSp2) have been isolated from a range of Araneoid
spiders. These are produced by the major ampullate gland
and are used to form a two-component fibre for use as a
dragline (Hinman & Lewis, 1992). MaSp1 forms crystalline
regions of β-sheet (Hayashi & Lewis, 2000) that have the
tensile strength of Kevlar (Gosline et al., 1999) whilst retaining
35% extensibility (Hayashi & Lewis, 1998); whereas MaSp2
has a high proline content and contains sequence motifs
that confer elasticity (Hayashi & Lewis, 2000).
Further studies of spiders within the family Araneoidea
have identified an additional class of very different silk fibre,
which is produced by flagelliform glands. Flagelliform gland
silks (flags) have slightly lower tensile strength than the
MaSp silks but have many times the extensibility (up to 200%,
Vollrath & Edmonds, 1989). This is central to their mechanism
of prey capture because it enables webs to arrest the motion
of flying organisms without breaking (Hayashi & Lewis,
1998). Extensibility of flagelliform silk is enhanced through
interaction with an aqueous aggregate gland silk that
coalesces to form sticky droplets (reviewed in Vollrath, 1999),
which may serve to hydrate the flagelliform silk (Shao &
Vollrath, 1999). A final class of spidroin silk that has been
identified in the Araneoidea is spun from the minor ampullate
gland. These silks (MiSps) are used in web construction and
have lower extensibility than the flagelliform silks whilst
retaining similar tensile strength (Gosline et al., 1999).
Silk proteins typically consist of a non-repetitive N-terminus,
a highly repetitive (repeat) region and a non-repetitive
C-terminus. To date, most research has focused on the link
between sequence motifs in the repeat region and the physical properties of silk (see reviews by Hayashi et al., 1999
and Craig & Reikel, 2002). The N-terminus has a role in
transport as it encodes a signal peptide (Hayashi & Lewis,
1998) but the role of the C-terminus is unclear. Kerkham
et al. (1991) proposed that the C-terminus is important in
maintaining the aqueous state of silks prior to extrusion.
Beckwitt & Arcidiacono (1994) found the C-terminal sequence
of spider silk to be highly conserved and proposed a further
role in signalling. These functions are not incompatible and
both could require sequence conservation.

Figure 1. Simplified morphological phylogeny of the Araneae (based on Coddington & Levi, 1991).
Bold type indicates families considered in this study. Numbers in parentheses denote the number
of species of each family with silk sequence data available on GENBANK. Also indicated are the nodes
calibrated by fossil evidence: Rosamygale, 240 Mya (Selden & Gall, 1992) and Macryphantes,
125 Mya (Selden, 1990).

Table 1. Accession numbers of spider silk sequences in this study. All sequences are from mRNA apart from those indicated by *, which are from genomic DNA

Family          Species                    Protein   Accession number   Reference
Dipluridae      Euagrus chisoseus          Fib1      AF350271           Gatesy et al. (2001)
Plectreuridae   Plectreurys tristis        Fib1      AF350281           Gatesy et al. (2001)
                                           Fib2      AF350282           Gatesy et al. (2001)
                                           Fib3      AF350283           Gatesy et al. (2001)
                                           Fib4      AF350284           Gatesy et al. (2001)
Araneidae       Araneus bicentarius        MaSp2*    U20328             Hinman & Lewis (1992)
                Araneus diadematus         MaSp1     U47854             Guerette et al. (1996)
                                           MaSp2     U47856             Guerette et al. (1996)
                                           MiSp      U47853             Guerette et al. (1996)
                Argiope aurantia           MaSp2*    AF350263           Gatesy et al. (2001)
                Argiope trifasciata        Flag      AF350264           Gatesy et al. (2001)
                                           MaSp1     AF350266           Gatesy et al. (2001)
                                           MaSp2     AF350267           Gatesy et al. (2001)
                Gasteracantha mammosa      MaSp2*    AF350272           Gatesy et al. (2001)
Tetragnathidae  Nephila clavipes           Flag*     AF218621           Hayashi & Lewis (2000)
                                           MaSp1     U20329             Hinman & Lewis (1992)
                                           MiSp      AF027736           Colgin & Lewis (1998)
                Nephila madagascariensis   MaSp1*    AF350277           Gatesy et al. (2001)
                                           MaSp2*    AF350278           Gatesy et al. (2001)
                Nephila senegalensis       MaSp1*    AF350279           Gatesy et al. (2001)
                                           MaSp2*    AF350280           Gatesy et al. (2001)
                Tetragnatha kauaiensis     MaSp1*    AF350285           Gatesy et al. (2001)
                Tetragnatha versicolor     MaSp1*    AF350286           Gatesy et al. (2001)
Theridiidae     Latrodectus geometricus    MaSp1     AF350273           Gatesy et al. (2001)
Pisauridae      Dolomedes tenebrosus       AmSp1     AF350269           Gatesy et al. (2001)
                                           AmSp2     AF350270           Gatesy et al. (2001)

Figure 2. ClustalW alignment of C-terminal amino acid sequences, shaded to indicate similarities (grey) and identities (black). The region of highly conserved
sequence about a QALLE motif, which corresponds to a region of predicted α-helix in all silk types, is also shown.

Similarities among different spider silk genes suggest that
they share a common ancestor (reviewed in Craig & Reikel,
2002), but the evolutionary relationships among functional
homologues are unclear. It is thought that many of the
genes in this family have evolved through gene duplications
(Beckwitt & Arcidiacono, 1994). Functional relationships are
further complicated by the existence of duplicate silk glands,
spigots and spinnerets (Coddington & Levi, 1991).
In this study, we use sequence data for different silk
types from 16 species distributed across six spider families.
Sequences are from either genomic or cDNA and the species
included in the study come from both basal and terminal
clades within the Araneae (Fig. 1). For clarity, we use the
nomenclature of Gatesy et al. (2001), with the addition of
the abbreviation Fib for the ecribellate fibroins and AmSp
for the ampullate gland spidroin of Dolomedes tenebrosus.
We combine phylogenetic analysis of silk sequences with
prediction of secondary structure and physical properties to
investigate the evolution of the C-terminus of spider silk.

Results
Sequence conservation
There were 26 DNA sequences on GENBANK for which the
C-terminal silk sequence was available: five cribellate fibroins
and 21 ecribellate spidroins/fibroins. The total length of each
sequence varied since all are partial gene sequences with
variable repeat length, repeat number and C-terminal length.
Greater sequence conservation was observed at the C-terminus. There is a particularly conserved region at a QALLE
amino acid sequence motif (Fig. 2), at which the majority of
sequences share greater than 50% identity (Dayhoff similarity
matrix; Dayhoff et al., 1978). When the entire C-terminus
is considered, the similarity is lower, but most silks share at
least 30% identity. The flag C-termini are the most highly
diverged. They do not have a complete QALLE motif and
share as little as 23% sequence identity with other silks.
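
The identity figures quoted above are pairwise comparisons over aligned positions. As a minimal sketch of that kind of calculation, the Python function below counts identical residues over the columns where neither sequence has a gap. It is a plain identity count rather than the Dayhoff similarity scoring used in this study, and the function name and the two aligned fragments are hypothetical illustrations, not real silk sequences.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of aligned, gap-free columns in which the two residues match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only
silk_a = "SQALLEV-SALI"
silk_b = "SQALLEIVSGLV"
print(round(percent_identity(silk_a, silk_b), 1))
```

Whether gapped columns are excluded or counted as mismatches changes the result, so the convention should be stated whenever identities are compared between studies.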
Phylogenetic analysis
Phylogenetic analyses were performed on the entire silk C-terminal
data set (198 bp, 66 amino acids) and repeated with
the highly diverged flagelliform silk sequences removed.
Nucleotide-based trees constructed using both maximum
likelihood and Bayesian methods are shown in Fig. 3 (estimated proportion of invariant sites = 0.22/0.19 and shape
parameter = 1.66/1.69 for maximum likelihood/Bayesian
trees, respectively). When the highly diverged flag sequences
were removed from the analysis, those relationships among
remaining silks that were well-supported in the previous
analysis were found to remain the same (data not shown).
Trees constructed by the neighbour-joining method had the
same overall topology as those constructed by maximum
likelihood (data not shown). In all trees the ecribellate silks
(AmSp, MaSp and MiSp) of the derived Entelegyne spiders
(Fig. 1) cluster separately from the cribellate silks of the
basal genera Euagrus and Plectreurys. This relationship has
high support (posterior probability = 1.00) in the Bayesian
but not in the maximum likelihood (ML) or neighbour joining
(NJ) trees, although the overall topology is similar in each
case. The highly diverged flagelliform silks cluster most
closely to Plectreurys Fib4 but this relationship is not
strongly supported by any method of tree estimation.

Figure 3. (a) Unrooted maximum likelihood tree (198 base pairs) of C-termini. Rate matrix, proportion of invariant sites (0.22) and shape parameter α (1.66) estimated from an
initial neighbour-joining tree. Numbers indicate the support for individual branches from 100 bootstrap replicates (values above 70 shown). (b) Unrooted
phylogeny constructed using a Bayesian approach (computed with MRBAYES using 4 chains of 1 000 000 generations after a burn-in time of 100 000
generations), estimating the proportion of invariant sites (0.19) and shape parameter α (1.69). Probabilities for each branch are given. Tests for substitution rate heterogeneity
among branches labelled 1–4 are described in Table 2.


Within the MaSp/MiSp silk group there are few well-supported nodes in the ML or NJ trees but strong support
in the Bayesian tree for the following: AmSp and MiSp silks
cluster separately from the MaSp silks and MaSp silks
cluster in several, well-supported paraphyletic groups, with
strong support for several terminal groupings consisting
of either MaSp1 or MaSp2 silks but not both. The single
exception is MaSp1 of Araneus diadematus, which falls
within a well-supported group containing MaSp2 silks of
other species.
Maximum likelihood and Bayesian analysis of amino acid
sequences (JTT model of substitution) are shown in Fig. 4.
Well supported terminal groups in these trees were also
well supported by analysis of nucleotide sequences (Fig. 3).
Sequence evolution
Tests for recombination made using Recpars (Hein, 1993; with
any gaps in the alignment removed) inferred between 1
and 5 recombination events within the phylogeny when the
recombination:substitution cost was set at 1.5 : 1. When all
sequences were included, at least one recombination event
was inferred until the ratio was set at > 6 : 1. Several
transition:transversion costs were used (0.1 : 1, 0.5 : 1 and
1 : 1) and found not to affect the final threshold at which
no recombination events were inferred. The analysis was
repeated with Fib, Flag, MiSp or MaSp sequences removed.
At least one recombination event was inferred in each case,
apart from when the Fib silks were excluded, leaving only
Flag, MaSp, MiSp and AmSp silks. Similar results were
obtained using the DSS approach: two recombination events
were inferred when all sequences were included in the
analysis (F84 model of nucleotide substitution, 1000 bootstrap
replicates, threshold = 0.95), and at least one event when
Flag, MiSp or MaSp sequences were removed from the
alignment, but none when the Fib silks were excluded.

Figure 4. (a) Unrooted maximum likelihood tree of C-terminal amino acid sequences (66 amino acids) calculated using MOLPHY (JTT substitution
matrix; majority-rule consensus of the 50 trees produced is shown, with branches supported in more than 50% (= 25) of trees). (b) Unrooted Bayesian analysis
of amino acid sequences (JTT substitution matrix; chain number, burn-in time and branch probabilities as for Fig. 3b; majority-rule consensus shown).

Estimates of ω (the dN/dS ratio) were made based upon
the tree topology estimated by Bayesian analysis (Fig. 3).
The estimated value of ω was 0.088 when a single value
was assumed across all sites and branches within the tree
(Model = 0, NSites = M0). Parameters estimated assuming
different categories of ω are given in Table 2. No sites had
an estimated ω > 1 but the observed data under the model
NSites = M3 (discrete categories of ω) were found to be
significantly more likely by LRT (P < 0.0001) than Nsites =
M0 (all sites assumed to have the same ratio).
When a beta distribution with a free ratio of ω where ω
can exceed 1 was assumed (Nsites = M8), there was no
significant increase in likelihood when compared with a beta
distribution with all values < 1 (Nsites = M7). A non-significant
likelihood ratio in this case might simply reflect a sensitivity
to the number of sequences included in the analysis and
the level of sequence divergence (as reviewed by Bielawski
& Yang, 2003) but this explanation cannot be evaluated
without adding new data to the analysis. Estimates of ω
when the MaSp silks were analysed separately similarly
found no site classes with ω > 1 although the model assuming several discrete classes of ω (NSites = M3) was more
likely than the nested model (Nsites = M0), which assumes
no rate heterogeneity (P < 0.0001).
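
Under the beta model (Nsites = M7), ω at each site is treated as a draw from a beta distribution with shape parameters p and q, so the average ω implied by a fit is p / (p + q). The short sketch below applies that formula to the p and q estimates reported in Table 2; the function name is a hypothetical helper, and the calculation only illustrates how the beta parameters relate to a mean ω. It is not a re-analysis of the data.

```python
def beta_mean(p: float, q: float) -> float:
    """Mean of a beta(p, q) distribution, i.e. the average omega under model M7."""
    if p <= 0 or q <= 0:
        raise ValueError("beta shape parameters must be positive")
    return p / (p + q)


# p and q estimates for model M7 as reported in Table 2
all_sequences = beta_mean(2.75, 26.77)   # all sequences included
masp_only = beta_mean(0.92, 11.68)       # MaSp1 and MaSp2 sequences only

print(round(all_sequences, 3), round(masp_only, 3))
```

Both means are well below 1 and of the same order as the single-ratio estimates (0.088 and 0.063), which is consistent with the C-terminus being under purifying selection on average.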
Estimates of ω allowing for different values for individual
branches within the tree (indicated in Fig. 3b) show that
when models allowing for a difference in substitution ratio
between lineages are assumed, a significantly higher likelihood of the data is observed than when only one set of
ratios is assumed for all branches (Table 2). Estimates
allowing four different classes of ω with a different set of
ratios in a specified lineage gave estimates of ω > 1 for one
class in the Flag lineage (ω > 13, P = 0.002) and one class
in each of (1) the major ampullate, (2) major ampullate/
minor ampullate and (3) Fib lineages, although in these
cases no significant increase in likelihood of the data was
observed over the null model.
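
The likelihood ratio tests reported here compare twice the difference in log-likelihood between nested models with a chi-square distribution. The sketch below reproduces that arithmetic for the Flag-lineage branch-sites comparison using the likelihood values printed in Table 2. Two points are assumptions rather than statements from the source: the printed likelihoods are treated as log-likelihoods whose minus signs are not shown, and 2 degrees of freedom are used for this comparison (the text states 2 d.f. explicitly only for the analysis with Fib sequences removed). The function names are hypothetical, and the closed-form tail probability is valid only for an even number of degrees of freedom.

```python
import math


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Twice the log-likelihood difference between nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail chi-square probability, closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Assumed signs: Table 2 prints 3847.84 (null) and 3841.80 (Flag lineage free)
stat = lrt_statistic(lnl_null=-3847.84, lnl_alt=-3841.80)
p_value = chi2_sf_even_df(stat, df=2)
print(round(stat, 2), round(p_value, 3))
```

With these inputs the statistic and P-value round to the 12.08 and 0.002 given in Table 2, and a statistic of 8.16 with 2 d.f. gives a P-value that rounds to the 0.017 reported for the analysis without Fib sequences.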
The branch-sites analysis was repeated twice, each time with one of
the two most highly diverged sequence groups (Fib or Flag) removed.
With the Fib sequences removed, estimates of ω for one
class within the major ampullate lineage were again found
to be > 1 (estimated ω = 6.692) and the data were found to
be significantly more likely than when assuming one rate
for all branches (LRT 2Δℓ = 8.16, P = 0.017, 2 d.f.). When
the Flag sequences were removed, estimates of ω for one
class within the Fib lineage and MaSp lineages were also
> 1 (ω = 7.726 and 5.426, respectively) but the data were
not significantly more likely than under the null model.
The mean dS and dN estimates for all MaSp sequences were
1.53 and 0.23, respectively. Similarly, the mean dS and dN
estimates within all MaSp1 sequences were 1.27 and 0.23
(28 comparisons), within MaSp2 were 1.41 and 0.20 (21
comparisons) and between MaSp1 and MaSp2 were 1.42
and 0.26 (56 comparisons). Several intraspecific dS/dN
estimates were possible as follows: 1.49/0.16 (MaSp1/MaSp2
Nephila madagascariensis); 1.41/0.31 (MaSp1/MaSp2
Araneus diadematus); 1.39/0.09 (MaSp1/MaSp2 Argiope
trifasciata); 1.09/0.15 (MaSp1/MaSp2 Nephila senegalensis).
Similarly, the dS and dN estimates between the two
MiSp silks of Araneus diadematus and Nephila clavipes
were 0.99 and 0.52, with intraspecific dS and dN estimates
for MiSp/MaSp1 (or MaSp2) of 1.88 and 0.62 (1.86 and
0.92) for Araneus diadematus and 1.27 and 0.39 for N. clavipes. McDonald-Kreitman (1991) tests implemented in
DnaSP (Rozas et al., 2003) comparing interspecific substitution
ratios (MaSp1/MaSp1 and MaSp2/MaSp2) with intraspecific ratios (same species MaSp1 vs. MaSp2) found no
significant departures from neutrality (P > 0.05 for each
comparison).

Table 2. Estimates of ω under different models of heterogeneity across sites or branches within the tree topology (indicated on Fig. 3) and likelihood ratio tests
(LRTs) of nested models

Model                                             Parameter estimates                               Likelihood   2Δℓ, LRT
All sequences included
One ω across branches and sites
  Model = 0, NSites = 0                           ω (per branch) = 0.088                            3874.79
One ω across branches, rate variation among sites
  Model = 0, NSites = 3 (discrete classes)        ω (site classes) = 0.050, 0.124, 0.245            3846.70      56.18, P < 0.0001
  Model = 0, NSites = 7 (beta distribution)       p = 2.75, q = 26.77                               3850.14
  Model = 0, NSites = 8 (beta + free ratio of ω)  p0 = 1, p = 2.75, q = 26.77 (p1 = 0, ω = 2.60)                 NS (2 d.f.)
MaSp1 & 2 sequences only
One ω across branches, rate variation among sites
  Model = 0, NSites = 0                           ω (per branch) = 0.063                            1703.95
  NSites = 3 (discrete classes)                   ω (site classes) = 0.029, 0.130, 0.440            1679.78      48.34, P < 0.0001
  NSites = 7 (beta distribution)                  p = 0.92, q = 11.68                               1682.84
  NSites = 8 (beta + free ratio of ω)             p0 = 1, p = 0.82, q = 11.69, p1 = 0, ω = 2.73     1682.84      NS
All sequences included
One ω across branches, 2 site categories
  Model = 0, NSites = 3 (2 categories)            ω (site classes) = 0.054, 0.165                   3847.84
Branch-sites model (variation across sites and branches), Model = 2, NSites = 3
  ω1 ≠ ω2 = ω3 = ω4                               ω1 (site classes) = 0.052, 0.163, 13.420          3841.80      12.08, P = 0.002
                                                  ω2,3,4 (site classes) = 0.052, 0.163
  ω2 ≠ ω1 = ω3 = ω4                               ω2 (site classes) = 0.162, 0.524, 96.766          3847.26      NS
                                                  ω1,3,4 (site classes) = 0.162, 0.524
  ω3 ≠ ω1 = ω2 = ω4                               ω3 (site classes) = 0.053, 0.0165, 19.247         3847.87      NS
                                                  ω1,2,4 (site classes) = 0.053, 0.165
  ω4 ≠ ω1 = ω2 = ω3                               ω4 (site classes) = 0.053, 0.170, 4.019           3845.74      NS
                                                  ω1,2,3 (site classes) = 0.053, 0.170
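
The comparison counts quoted with the mean dS and dN estimates (28, 21 and 56) follow directly from the number of MaSp1 and MaSp2 sequences listed in Table 1, which contains eight MaSp1 and seven MaSp2 entries. The sketch below shows the arithmetic; the variable names are illustrative only.

```python
from math import comb

# Sequence counts taken from Table 1
n_masp1 = 8
n_masp2 = 7

within_masp1 = comb(n_masp1, 2)      # unordered pairs among MaSp1 sequences
within_masp2 = comb(n_masp2, 2)      # unordered pairs among MaSp2 sequences
between_types = n_masp1 * n_masp2    # every MaSp1 paired with every MaSp2

print(within_masp1, within_masp2, between_types)
```

This gives 28, 21 and 56 pairwise comparisons, matching the numbers reported in the text.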

Physical properties and function


A hydrophobicity profile of a representative silk gene C-terminus with the upstream final repeat units (MaSp1 of
N. senegalensis, 250 amino acids), is given in Fig. 5(a).
The gene shows an oscillating pattern of hydrophobic
and hydrophilic regions. All silks, with the exception of the
flagelliform types, show a similar pattern (data not shown)
and all silks (including the flagelliform types) have a peak in
hydrophobicity at the C-terminus. The C-terminus peak in
hydrophobicity is greater than that at any point in the
repetitive region of silk genes and is similar to that of the silk
N-terminus, which is known to act as a signal peptide in
protein transport.
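
A profile of this kind is a sliding-window average of per-residue hydropathy values. The following is a minimal sketch of a Kyte and Doolittle profile with a 13-residue window, matching the scan window given in the legend of Fig. 5. The function name and the example sequence are hypothetical (the sequence is not a real silk protein), and the sketch makes no attempt to reproduce the plotting or any smoothing applied by the software used in this study.

```python
# Kyte & Doolittle (1982) hydropathy values for the 20 standard amino acids
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 13) -> list[float]:
    """Mean hydropathy of every full window along the sequence."""
    residues = sequence.upper()
    if window < 1 or window > len(residues):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[r] for r in residues]  # KeyError on unknown residues
    running = sum(scores[:window])
    profile = [running / window]
    for i in range(window, len(scores)):
        running += scores[i] - scores[i - window]  # slide the window by one residue
        profile.append(running / window)
    return profile


# Hypothetical sequence, for illustration only
toy = "GGAGQGGYGGLGSQGAAAAAAGGSALVSILSSIGQ"
profile = hydropathy_profile(toy, window=13)
peak = max(range(len(profile)), key=profile.__getitem__)
print(len(profile), peak)
```

Each value describes the window starting at that index, so a plot would usually place it at the window centre (index + 6 for a 13-residue window).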
Hydrophobicity profiles for the C-terminal 90 amino acids
of all silk genes (the most hydrophobic region of the entire
silk gene) were also very similar (Fig. 5b). The exception
was the profile of silk from E. chisoseus (shown in bold).
This sequence contained an additional hydrophobic peak
region of greater than two units, 48 residues upstream of
the peak found in all silks (Fig. 5b). The height of the two
peaks is identical and they have a similar hydrophobicity
profile but less than 20% identity.
The region of high hydrophobicity at the 3′ end of the C-terminus
found in all silks (Fig. 5b) corresponds to the
QALLE region of high sequence conservation (Fig. 2). In
AmSp, MaSp and MiSp silks this region has two hydrophobic
maxima corresponding to the 8th and 12th residues of
the conserved QALLE motif (usually leucine and serine). In
Flagelliform silks there are also two hydrophobic maxima,
but these are three, as opposed to four, residues apart, in
line with the 9th and 12th residues of the conserved region.
The difference between Flag and non-Flag silks in this
region is illustrated in Fig. 5(c) by comparing the conserved
QALLE region (21 amino acids) of MaSp1 of N. senegalensis
with the Flag from N. clavipes.

Figure 5. Kyte and Doolittle mean hydrophobicity profiles (scan window = 13 residues) of: (a) Nephila senegalensis
MaSp1 silk final repeat units and C-terminus (250 amino acids); (b) C-terminus (90 amino acids) of all silk genes
(Euagrus chisoseus shown in bold); (c) hydrophobicity plot of conserved QALLE region (21 amino acids) showing the
difference between flagelliform (Nephila clavipes Flag, grey line) and non-flagelliform (Nephila senegalensis MaSp1,
black line) silks.

Comparison of silk sequences with proteins of known
structure using the 3D-PSSM server suggests that the highly
conserved C-terminal silk QALLE motifs form α-helices
(Fig. 2). All sequences have at least one additional region
upstream of the conserved QALLE motif that is likely to
form an α-helix (data not shown). In E. chisoseus Fib1 this
region lies within the region of its first hydrophobic peak
upstream of the QALLE motif.
Discussion
Evolution of spider silks: sequence conservation and
diversification
Previous studies have shown that the sequence motifs in the
repeat region of silk genes are highly conserved (Gatesy
et al., 2001). The present study demonstrates that there is
also a high degree of conservation of the non-repetitive
C-terminus of silk proteins in terms of primary amino acid
sequence, predicted secondary structure and physical
properties. This similarity exists between species that are
thought to have diverged up to 240 million years ago (Fig. 1,
Selden & Gall, 1992) and among silk proteins that have a
wide range of physical properties. These traits are thought to
be conferred largely by repeated amino acid motifs encoded
by regions upstream of the C-terminus (Hayashi & Lewis,
2000). The degree of similarity among physical properties
and predicted secondary structure of silk C-termini suggests
that they perform a common function and that their evolution
is likely to be constrained by selection against mutations
that disrupt this function.
Silk genes contain many GC-rich regions that are potential
recombination hotspots (Hayashi & Lewis, 2000) and it is
possible that they have evolved through a complex mode of
evolution involving both gene conversion and recombination,
which would have obscured their true origins. In accordance
with this prediction, the null hypothesis that there has been
no recombination is rejected for the entire dataset. However,
the null hypothesis is not rejected when Fib sequences are
excluded from the analysis. Furthermore, MiSp silks cluster
separately from their MaSp counterparts in phylogenetic
analyses and silks of the same species cluster according to
gene in most cases (e.g. Nephila MaSp1 and MaSp2 genes).
The observation that many silks cluster according to type,
rather than according to species, suggests that their evolution may be explained better by a birth-and-death process
involving gene duplication and loss of function (Nei, 1969),
than by a model of concerted evolution where recombination
homogenizes genes post duplication. Under a birth-and-death process,
genes are expected to cluster by gene or duplication order
rather than by species, low levels of sequence homogeneity
are expected between different genes (particularly at non-coding sites) and there will be evidence of gene loss or
pseudo-gene formation (as discussed by Nei et al., 1997).
In contrast, if recombination has had the greater influence
then sequence homogeneity between genes within species
is expected to be high and they may cluster together in
phylogenetic analyses.
It is not possible to assess the rate of gene loss within the
spider silk family, since much of the work so far has involved
identification of expressed genes through isolation of their
mRNA. However, the phylogenetic relationships observed
among silks of the same species presented in this analysis
are inconsistent with the sole explanation of recombination
under a model of concerted evolution and comparisons
between intra- and interspecific dS values also support this
view for the following reason: Concerted evolution is expected
to result in homogenization of different silk types and hence
decrease intraspecific values of dS, regardless of any potentially countervailing effects of selection, which is expected
to have a greater effect on dN. However, in spider silks,
intraspecific values of dS between silk types (e.g. 1.49,
1.41, 1.39) are of a similar magnitude or greater than those
calculated between species (1.27 for all MaSp1 sequences;
1.41 for MaSp2 and 1.53 for the combined data set of
MaSp1 and 2). Furthermore, McDonald-Kreitman tests find
no significant departure from neutrality; such a departure is
expected if dS has been reduced to a greater extent than
the associated dN.
Hydrophobicity profiles point to a potentially complex mode
of evolution of spider silks that could also involve duplication of regions within genes. The hydrophobicity profile of
the C-terminus of what is thought to be the most basal
species in this study, E. chisoseus Fib1, has two peaks, 48
residues apart, both of which are predicted to form α-helical
regions. The presence of two regions, with the same predicted
secondary structure, raises several important points about
the evolution of spider silks. They may have arisen from a
replication error, such as those thought to drive the evolution
of spider silks (Beckwitt et al., 1998). However, convergence
through selection for similar physical properties is a plausible
alternative explanation, given the low amino acid identity of
the two peaks. It is not possible to establish from this dataset
whether the two peaks arose in the E. chisoseus lineage
or if they represent an ancestral state, in which case they
could be present in other basal lineages. More sequences,
particularly from primitive spider lineages, would help to
answer this intriguing question.
Likelihood estimates of dN/dS ratios on a site-by-site
and branch-by-branch basis using different models
suggest that there is variable selective pressure on different
amino acids within spider silks, that the ratio is not uniform
throughout the tree and that there may be positive selection
at some sites within particular lineages. Evidence for positive
selection within particular types of silk, such as those of the
ampullate class, which form an intrinsic part of the web
architecture, is interesting because speciation has been
found in at least one case to be more rapid in web-building
than non-web-building spiders (Gillespie, 1999). However,
it is important to emphasize that tests for selection are
based on the assumption of no recombination and this null
hypothesis is rejected when Fib sequences are included in
the analyses (although the number of recombination events
inferred is lower than that predicted to significantly increase
type I errors; Anisimova et al., 2003).
dN/dS ratios were re-estimated with the Fib sequences
removed from the dataset (and similarly with the highly
diverged Flag sequences removed) and the same trend
was observed in the ampullate silk lineage. However, we
cannot be certain that there has been no recombination
among the remaining sequences within our data set because
tests for recombination can themselves be confounded by
positive selection at individual amino acid positions, or
by heterogeneity in branch lengths within a phylogeny.
Furthermore, they may not be sufficiently sensitive to detect
rare or ancient recombination events. As a result we cannot
rule out inter- or intragenic recombination, although our data
certainly indicate that recombination has not overwhelmed
the effects of gene duplication and independent diversification of different genes, and our analyses point to variable
selection within some regions of the C-terminus (which
could be the result of differences in functional constraint)
and a heterogeneous pattern of branchwise dN/dS substitution ratios.
Evolution of spider silks: origin and function
The origin of the Araneoid ecribellate silks as a whole remains
uncertain. If we accept the use of the silk from the basal
Mygalomorph species, E. chisoseus, as a suitable outgroup
then the results of the amino acid analyses are consistent
with the MaSp, AmSp and MiSp silks being the most derived
(Fig. 4). Bayesian analysis of nucleotide sequences supports
this hypothesis, but there is poor resolution in the maximum
likelihood tree (Fig. 3). It is important to emphasize that,
broadly speaking, the same relationships among silks are
recovered using all methods of analysis, but the inconsistency in confidence estimates deserves some comment.
Discrepancies between the support given by posterior probabilities (Bayesian analyses) and bootstrap values (maximum
likelihood analyses) are not unexpected (Douady et al.,
2003) and are likely to reflect sensitivities of either (or both)
measures to assumptions in the different models applied.
Similar sensitivities could also account for differences in
estimated branch length, which are greater for the Flag
silks in the maximum likelihood analysis than when using
either of the other methods, but the analyses are unlikely to
improve until additional silk sequences can be included.
It is interesting to note that sequence divergence within
the four cribellate silks, which are from a single species,
P. tristis, is greater than that within the entire MaSp1 and
2 group in both ML and Bayesian analyses. This diversity
might be peculiar to P. tristis, or it could reflect generally
higher levels of sequence divergence among cribellate silks.
Increased sequence divergence is expected to lower the
rate of recombination through removing recombination
hotspots, or simply by shortening homologous regions
required for recombination to occur. As such it is noteworthy
that it was those tests that included the Fib sequences that
rejected the null hypothesis of no recombination.
The flagelliform silks have the lowest sequence identity
to other sequences and show above average sequence
divergence on the tree of C-terminal regions (Fig. 3). Despite
this divergence, and the different physical properties of
flagelliform silks as a whole, the predicted physical properties of flagelliform C-termini are not dissimilar to the other
types studied. There are two possible hypotheses to explain
similar physical properties despite considerable amino
acid sequence divergence: either (1) ancestral amino acid
sequence similarity to other silks has been obscured by a
large number of mutations, or (2) the C-termini of flagelliform
and non-flagelliform silks do not share a common ancestor
and similarities in terms of predicted physical properties
between flagelliform and other silk C-termini are the result
of selection for similar physical properties.
Flagelliform silks are thought to have originated when the
Araneoid spiders split from their sister group, with MaSp
silks having evolved somewhat earlier at the divergence
of the Araneomorphae. In contrast with this hypothesized
origin, the flagelliform silk sequences appear to cluster more
closely to the primitive cribellate silks than to the MaSp silks
in phylogenetic analyses. This placement may be explained
by flagelliform silks being derived from primitive silks and not
from the MaSp lineage. However, an entirely independent,
non-homologous origin of the flagelliform silk C-terminus
could also explain the apparently elevated substitution ratios
along the Flag lineage and the high degree of sequence
divergence between flagelliform and non-flagelliform silks (with
long-branch attraction accounting for the basal position in
the tree, Philippe & Laurent, 1998). Non-homology of the Flag
silks would violate assumptions upon which the methods
for estimating all substitution ratios are based. Therefore,
although an independent origin seems unlikely, it is important
to highlight the fact that our analyses were repeated with
this lineage completely excluded and that similar trends
were found.
Silk C-termini show the same degree of hydrophobicity as
the silk protein N-terminal signal peptides of Nephila spp.
(data not shown) and it is possible that the C-termini are
similarly involved in signalling. This supports the previous
suggestion made by Beckwitt & Arcidiacono (1994) that
this region might be required for correct protein
transport and synthesis, although transport is unlikely to be
the only function since recent work on major ampullate
spidroins confirms that C-termini are indeed present in
the mature silk thread (Sponner et al., 2004). The highly
conserved hydrophobic regions in the C-termini (shown
in Fig. 5b) might, for example, be required for recruiting
accessory proteins such as chaperonins, in order to facilitate
correct protein folding (reviewed in Hartl & Hayer-Hartl,
2002).
Diversification in both the C-terminus and upstream
regions and the consequent changes in physical properties
may be driven by differences in how the silk is used. As an
example, bolas spiders (Mastophora spp.) have evolved an
exceptionally strong MaSp silk from which to hang a viscid
droplet (bolus), which attracts prey, rather than weaving
the orb-shaped web that is characteristic of many other
members of the MaSp weaving family (Cartan & Miyashita,
2000). This particular case illustrates the link between the
ability to evolve a new type of silk and the evolution of
differences in behaviour and ecology, which may themselves
be associated with other processes such as speciation
(Gillespie, 1999).
While experimental work is essential to further develop
hypotheses relating to structure and function of spider silks,
our study of the C-terminus has given some insight into
the evolution of this gene family. Analysis of the physical
properties of the C-termini is also informative about silk
fibre formation itself: flagelliform and major ampullate silks
represent the known extremes of extensibility and tensile
strength, and within the ampullate silks, upstream MaSp1
and MaSp2 regions have very different amino acid compositions, yet all C-termini are predicted to behave in a similar
manner. If this region is present in the mature protein and
is important in protein transport and folding as predicted,
the degree of sequence conservation suggests that a single
process may one day be used for the commercial production
of silks with diverse properties.
Experimental procedures
Silk sequences
Sequences were retrieved from GENBANK (Table 1) and aligned in
BioEdit v. 5.0.9 (Hall, 1999) using ClustalW (Thompson et al., 1994).

Sequence evolution
Phylogenetic analyses of nucleotide sequences (198 unambiguously
aligned base pairs/66 amino acids) were performed using several
methods: maximum likelihood (Felsenstein, 1981), neighbour-joining
and a probability-based, Bayesian approach.
Maximum likelihood trees were constructed in PAUP* v. 4.0b
(Swofford, 1999) using the general time-reversible (GTR) model
(Lanave et al., 1984). The GTR rate matrix, base frequencies, the
proportion of invariant sites and the shape parameter (α) for the
gamma distribution that describes rate heterogeneity across sites were
all estimated by likelihood using an iterative procedure based on
an initial simple neighbour-joining tree: parameter values were
estimated from this initial neighbour-joining tree using likelihood.
These parameters were then used to make a new neighbour-joining tree, and the parameters re-estimated by likelihood from
this new tree. The process was repeated until no further improvement in likelihood of the neighbour-joining tree was observed. The
final parameter estimates were used to construct a tree by maximum
likelihood. The phylogeny was rooted on E. chisoseus on the basis
of its basal position within the Araneae based upon morphological
data (Fig. 1). Tree searching involved a heuristic procedure with tree-bisection-reconnection branch swapping. Bootstrap resampling
(100 replicates; Felsenstein, 1985) was used to assign support for
particular branches within the tree. Neighbour-joining trees were
constructed in MEGA 2.1 (Kumar et al., 2001) using the Tajima-Nei (1984) model of nucleotide substitution.
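The bootstrap resampling step (Felsenstein, 1985) can be illustrated with a minimal Python sketch that resamples alignment columns with replacement to give 100 pseudo-replicate alignments. This is not the PAUP* implementation: the toy alignment, the function names and the random seed are hypothetical, and the tree search that would be run on each replicate is omitted.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one pseudo-replicate: columns drawn with replacement."""
    n_sites = len(next(iter(alignment.values())))
    # Draw as many column indices as there are sites, with replacement.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Return a list of pseudo-replicate alignments (seed is arbitrary)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Hypothetical toy alignment; real input would be the 198 bp alignment.
toy = {"seqA": "ATGGCC", "seqB": "ATGGCA", "seqC": "ATAGCA"}
replicates = bootstrap_replicates(toy, n_replicates=100)
```

Support for a branch would then be the proportion of replicate trees in which that branch is recovered.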
A probability-based, Bayesian approach to tree construction was
carried out using MRBAYES (Huelsenbeck & Ronquist, 2001).
This package uses a Metropolis-coupled Markov chain Monte Carlo
algorithm to allow the running of multiple chains. A run of four chains
for 1 000 000 generations with a burn-in time of 100 000 generations
was carried out to ensure Markov chain convergence. A general
time-reversible model of nucleotide substitution was used, allowing
for rate heterogeneity across sites, with a proportion of sites
allowed to be invariant.
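How a posterior probability for a clade might be summarised from the sampled trees once the burn-in has been discarded can be sketched as follows. This is a toy illustration and not MRBAYES code: representing each sampled tree as a set of clades, the function name and the sample values are all assumptions made for the example. With 1 000 000 generations and a burn-in of 100 000, the first 10% of samples would be discarded whatever the sampling interval.

```python
def clade_support(sampled_trees, clade, burn_in_samples):
    """Fraction of post-burn-in samples that contain `clade`.

    `sampled_trees` is a list in sampling order; each element is a set
    of clades, and each clade is a frozenset of taxon names
    (a hypothetical representation chosen for this sketch).
    """
    kept = sampled_trees[burn_in_samples:]
    if not kept:
        raise ValueError("no samples left after discarding burn-in")
    hits = sum(1 for clades in kept if clade in clades)
    return hits / len(kept)


# Toy chain of ten samples; the first two are treated as burn-in.
ab = frozenset({"A", "B"})
ac = frozenset({"A", "C"})
samples = [{ac}] * 2 + [{ab}] * 6 + [{ac}] * 2
support = clade_support(samples, ab, burn_in_samples=2)
```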
Maximum likelihood analysis of amino acid sequences using
the Jones, Taylor & Thornton (JTT; Jones et al., 1992) substitution matrix was
performed using the program MOLPHY v. 2.3 (Adachi & Hasegawa,
1996). Fifty bootstrap replicates were used to assign support for
individual nodes within the tree. Bayesian analysis of amino acid
sequences was also carried out using the JTT matrix with the
same burn-in and run parameters as before.
Tests for detecting recombination events based upon a phylogenetic
approach were carried out using the program Recpars (Hein, 1993).
Phylogenies with and without recombination events were evaluated
against one another by comparing their total costs using a range of
recombination to substitution costs (the recommended ratio is 1.5 : 1,
Wiuf et al., 2001). A further test for recombination was made using
the DSS (difference in sum of squares) approach (McGuire &
Wright, 2000; F84 distance measure used) as implemented in the
program TOPALI (Milne, Husmeier, McGuire & Wright, 2003, 04).
Estimates of ω, the parameter describing non-synonymous/synonymous
(dN/dS) amino acid substitution ratios, were made by
maximum likelihood using the program codeml in the software
package PAML (Yang, 1997). The method allows codon bias and
variable substitution rates to be incorporated into the analysis (Yang
& Bielawski, 2000), which is essential given the AT bias of third codon
positions in spider silk (Xu & Lewis, 1990; Hayashi & Lewis, 1998).
Estimates were made based upon a given tree topology, with the
following sets of criteria: (i) assuming a single value of ω across all
branches and sites in the tree (Model = 0, Nsites = M0); (ii) allowing
for heterogeneity in ω among codons within the tree (Model = 0, Nsites
= M3, M7 or M8); (iii) allowing a different value of ω along a specified
branch in the tree (Yang & Nielsen, 2002) whilst at the same time
allowing four different classes of ω for amino acid positions (using
Model = 2, Nsites = M3).
The tests described can theoretically detect the small number of
sites (or branches) for which ω > 1 even when ω < 1 for the majority
of sites. NSites M3 assumes discrete categories with different ω
ratios where ω can exceed 1; M7 assumes a continuous beta
distribution of ω across sites, where the distribution can take a
variety of shapes and where ω ≤ 1; and M8 assumes the same
continuous beta distribution of ω as M7, but with the addition of one
extra site class that has a free ω ratio estimated from the data. M8
is known to suffer from localized optima and was therefore run
several times using different starting values and the highest likelihood score taken.
Likelihood ratio tests for selection on individual branches were
performed by comparing the likelihoods of Model = 2, NSites = M3
(no. categories = 4) with the nested Model = 0, NSites = M3 (no.
categories = 2). Similarly, tests for selection on individual amino
acids rather than branches were performed by comparing the nested
models NSites = M0 (all sites have the same ω) with Nsites = M3
(3 discrete categories of ω; Nielsen & Yang, 1998) and
the model M7 (beta distribution of ω with values always ≤ 1) with
nested model M8 (beta distribution with an additional class of ω
that can exceed 1). Twice the difference between the log-likelihoods
of these models was compared with the χ² distribution (1 or 2
degrees of freedom for branch and site models, respectively).
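The likelihood ratio test described above can be sketched in Python using only the standard library, since the upper tail of the χ² distribution has a closed form for 1 and 2 degrees of freedom. The log-likelihood values below are hypothetical placeholders and not results from this study; in practice they would be read from the codeml output for the null and alternative models.

```python
import math


def chi2_upper_tail(x, df):
    """P(X >= x) for a chi-square variable with df = 1 or 2."""
    if x < 0:
        raise ValueError("test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or 2")


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, p-value) for a pair of nested models."""
    # Numerical noise can make the statistic slightly negative.
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return stat, chi2_upper_tail(stat, df)


# Hypothetical log-likelihoods, e.g. M7 (null) against M8 (alternative).
stat, p_value = likelihood_ratio_test(-1234.5, -1230.0, df=2)
```

A small p-value indicates that the alternative model, which allows a class of sites or a branch with a freely estimated ω, fits significantly better than the nested null model.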

Predicted structure, physical properties and function


Hydrophobicity of amino acid sequences was predicted using Kyte
and Doolittle mean hydrophobicity profiles (Kyte & Doolittle, 1982)
in BioEdit v. 5.0.9 (Hall, 1999). This technique was consistent over
a range of scan window sizes (5–20 residues) and gave results
comparable with other scales, such as the Parker HPLC (Parker
et al., 1986) and the Eisenberg scales (Eisenberg et al., 1984).
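A mean hydrophobicity profile of this kind can be sketched as a sliding-window average over the Kyte & Doolittle (1982) hydropathy values. The code below is an illustration and not BioEdit's implementation: the function name, the default window of nine residues and the toy peptide are assumptions, and window edges are handled simply by scoring complete windows only.

```python
# Kyte & Doolittle (1982) hydropathy index for the 20 standard residues.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy_profile(sequence, window=9):
    """Mean hydropathy for every complete window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# Hypothetical toy peptide, scanned with a five-residue window.
profile = mean_hydropathy_profile("AAAGGSLLIV", window=5)
```

Repeating the scan over several window sizes, as was done here for windows of 5 to 20 residues, shows whether the hydrophobic regions are robust to the choice of window.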
Secondary structure prediction was performed using the 3D-PSSM server (Fischer et al., 1999; Kelley et al., 1999, 2000). This
web-based program compares input query protein sequences with
an extensive database (fold library) of proteins of known structure.
Similarities between known and query sequences are used to
predict secondary structures of the latter. Estimates of confidence
in predicted secondary structure elements, based upon the similarity between query and known sequences, are calculated by the
program (shown as E-values). Only regions with more than 95%
confidence in their predicted structure are shown in this study.

Acknowledgements
The authors are grateful to Dr Brent Emerson, Amy Crowther
and Dr Alison Surridge for critically reading the manuscript.
This work was supported by the University of East Anglia
and by a BBSRC grant to Prof. Hewitt.
References
Adachi, J. and Hasegawa, M. (1996) MOLPHY, Version 2.3:
Programs for Molecular Phylogenetics Based on Maximum
Likelihood. Tokyo: Institute of Statistical Mathematics.
Anisimova, M., Nielsen, R. and Yang, Z. (2003) Effect of recombination on the accuracy of the likelihood method for detecting
positive selection at amino acid sites. Genetics 164: 1229–1236.
Beckwitt, R. and Arcidiacono, S. (1994) Sequence conservation in
the C-terminal region of spider silk proteins (spidroin) from
Nephila clavipes (Tetragnathidae) and Araneus bicentarius
(Araneidae). J Biol Chem 269: 6661–6663.
Beckwitt, R., Arcidiacono, S. and Stote, R. (1998) Evolution of
repetitive proteins from Nephila clavipes (Tetragnathidae) and
Araneus bicentarius (Araneidae). Insect Biochem Mol Biol 28:
121–130.
Bielawski, J.P. and Yang, Z. (2003) Maximum likelihood methods
for detecting adaptive evolution after gene duplication. J
Theoret Func Genomics 3: 201–212.
Cartan, K.C. and Miyashita, T. (2000) Extraordinary web and silk
properties of Cyrtarachne (Araneae, Araneidae): a possible link
between orb-webs and bolas. Biol J Linnean Soc 71: 219–235.
Coddington, J.A. and Levi, H.W. (1991) Systematics and evolution
of spiders (Araneae). Annu Rev Ecol Syst 22: 565–592.
Craig, C.L. and Reikel, C. (2002) Comparative architecture of silks,
fibrous proteins and their encoding genes in insects and
spiders. Comparative Biochem Physiol 133: 493–507.
Dayhoff, M.O., Schwartz, R.M. and Orcutt, B.C. (1978) A model of
evolutionary change in proteins. Matrices for detecting distant
relationships, pp. 345–358. In: Dayhoff, M.O., ed. Atlas of Protein Sequence and Structure, Vol. 5. National Biomedical
Research Foundation, Washington DC.
Douady, C.J., Delsuc, F., Boucher, Y., Doolittle, W.F. and Douzery, E.J.P.
(2003) Comparison of Bayesian and maximum likelihood bootstrap measures of phylogenetic reliability. Mol Biol Evol 20:
248–254.
Eisenberg, D., Schwarz, E., Komaromy, M. and Wall, R. (1984)
Analysis of membrane and surface protein sequences with the
hydrophobic moment plot. J Mol Biol 179: 125–142.
Felsenstein, J. (1981) Evolutionary trees from DNA sequences: a
maximum likelihood approach. J Mol Evol 17: 368–376.
Felsenstein, J. (1985) Confidence limits on phylogenies: an
approach using the bootstrap. Evolution 39: 783–791.
Fischer, D., Barret, C., Bryson, K., Elofsson, A., Godzik, A., Jones, D.,
Karplus, K.J., Kelley, L.A., Maccallum, R.M., Pawłowski, K., Rost, B.,
Rychlewski, L. and Sternberg, M.J. (1999) CAFASP-1: Critical
Assessment of Fully Automated Structure Prediction Methods.
Proteins: Structure, Function Genet Supplement 3: 209–217.
Gatesy, J., Hayashi, C., Motriuk, D., Woods, J. and Lewis, R.
(2001) Extreme diversity, conservation, and convergence of
spider silk fibroin sequences. Science 291: 2603–2605.
Gillespie, R.G. (1999) Comparison of rates of speciation in web-building and non-web-building groups within a Hawaiian spider
radiation. J Arachnol 27: 79–85.
Gosline, J.M., Guerette, P.A., Ortlepp, C.S. and Savage, K.N.
(1999) The mechanical design of spider silks: from fibroin
sequence to mechanical function. J Exp Biol 202: 3295–3303.
Guerette, P.A., Ginzinger, D.G., Weber, B.H.F. and Gosline, J.M.
(1996) Silk properties determined by gland-specific expression
of a spider fibroin gene family. Science 272: 112–115.
Hall, T.A. (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic
Acids Symposium Series 41: 95–98.
Hartl, F.U. and Hayer-Hartl, M. (2002) Molecular chaperones in the
cytosol: from nascent chain to folded protein. Science 295:
1852–1858.
Hawthorn, A.C. and Opell, B.D. (2002) Evolution of adhesive
mechanisms in cribellar spider prey capture thread: evidence
for van der Waals and hygroscopic forces. Biol J Linnean Soc
77: 1–8.
Hayashi, C.Y. and Lewis, R.V. (1998) Evidence from flagelliform
silk cDNA for the structural basis of elasticity and modular nature
of spider silks. J Mol Biol 275: 773–784.

Hayashi, C.Y. and Lewis, R.V. (2000) Molecular architecture and
evolution of a modular spider silk protein gene. Science 287:
1477–1479.
Hayashi, C.Y., Shipley, N.H. and Lewis, R.V. (1999) Hypotheses that
correlate the sequence, structure, and mechanical properties
of spider silk proteins. Int J Biol Macromolecules 24: 271–275.
Hein, J.J. (1993) A heuristic method to reconstruct the history of
sequences subject to recombination. J Mol Evol 20: 402–411.
Hinman, M.B. and Lewis, R.V. (1992) Isolation of a clone encoding
a second dragline silk fibroin. J Biol Chem 267: 19320–19324.
Huelsenbeck, J.P. and Ronquist, F. (2001) MRBAYES: Bayesian
inference of phylogenetic trees. Bioinformatics 17: 754–755.
Jin, H.-J. and Kaplan, D.L. (2003) Mechanism of silk processing in
insects and spiders. Nature 424: 1057–1061.
Jones, D.T., Taylor, W.R. and Thornton, J.M. (1992) The Rapid
Generation of Mutation Data Matrices from Protein Sequences.
CABIOS 8: 275–282.
Kelley, L.A., MacCallum, R.M. and Sternberg, M.J.E. (1999)
Recognition of remote protein homologies using three-dimensional information to generate a position specific scoring
matrix in the program 3D-PSSM, pp. 218–225. In: Istrail, S.,
Pevzner, P. and Waterman, M., eds. RECOMB 99, Proceedings
of the Third Annual Conference on Computational Molecular
Biology. The Association for Computing Machinery, New York.
Kelley, L.A., MacCallum, R.M. and Sternberg, M.J.E. (2000)
Enhanced Genome Annotation using Structural Profiles in the
Program 3D-PSSM. J Mol Biol 299: 499–520.
Kerkham, K., Viney, C., Kaplan, D. and Lombardi, S. (1991) Liquid
crystallinity of natural silk secretions. Nature 349: 596–598.
Kumar, S., Tamura, K., Jakobsen, I.B. and Nei, M. (2001) MEGA2:
Molecular Evolutionary Genetics Analysis software. Arizona
State University, Tempe, Arizona, USA.
Kyte, J. and Doolittle, R.F. (1982) A simple method for displaying
the hydrophobic character of a protein. J Mol Biol 157: 105–142.
Lanave, C., Preparata, G., Saccone, C. and Serio, G. (1984) A new
method for calculating evolutionary substitution rates. J Mol
Evol 20: 86–93.
McDonald, J.H. and Kreitman, M. (1991) Adaptive protein evolution
at the Adh locus in Drosophila. Nature 351: 652–654.
McGuire, G. and Wright, F. (2000) TOPAL 2.0: Improved Detection of
Mosaic Sequences within Multiple Alignments. Bioinformatics
16: 130–134.
Nei, M. (1969) Gene duplication and nucleotide substitution in
evolution. Nature 221: 40–42.
Nei, M., Gu, X. and Sitinikova, T. (1997) Evolution by the birth-and-death process in multigene families of the vertebrate immune
system. National Academy of Sciences Colloquium Genetics
and the Origin of Species: From Darwin to Molecular Biology
60 Years After Dobzhansky.
Nielsen, R. and Yang, Z. (1998) Likelihood models for detecting
positively selected amino acid sites and applications to the HIV-1
envelope gene. Genetics 148: 929–936.
Parker, J.M.R., Guo, D. and Hodges, R.S. (1986) New hydrophilicity
scale derived from High-Performance Liquid Chromatography
peptide retention data: correlation of predicted surface residues
with antigenicity and X-ray-derived accessible sites. Biochemistry 25: 5425–5432.
Philippe, H. and Laurent, J. (1998) How good are deep phylogenetic trees? Curr Opin Genet Dev 8: 616–623.
Rozas, J., Sánchez-DelBarrio, J.C., Messeguer, X. and Rozas, R.
(2003) DnaSP, DNA polymorphism analyses by the coalescent
and other methods. Bioinformatics 19: 2496–2497.
Selden, P.A. (1990) Lower Cretaceous spiders from the Sierra de Montsech, North-east Spain. Palaeontology 33: 257–285.
Selden, P.A. and Gall, J.C. (1992) A Triassic Mygalomorph spider
from the Northern Vosges, France. Palaeontology 35: 211–235.
Shao, Z. and Vollrath, F. (1999) The effect of solvents on the
contraction and mechanical properties of silks. Polymer 40:
1799–1806.
Sponner, A., Unger, E., Grosse, F. and Weisshart, A. (2004)
Conserved C-termini of spidroins are secreted by the major
ampullate glands and retained in the silk thread. Biomacromolecules 5: 840–845.
Swofford, D.L. (1999) PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods), Version 4. Sinauer Associates,
Sunderland, Massachusetts.
Tajima, F. and Nei, M. (1984) Estimation of evolutionary distance
between nucleotide sequences. Mol Biol Evol 1: 269–285.
Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTALW:
improving the sensitivity of progressive multiple sequence
alignment through sequence weighting, position specific gap
penalties and weight matrix choice. Nucl Acids Res 22: 4673–4680.
Vollrath, F. (1999) Biology of spider silk. Int J Biol Macromolecules
24: 81–88.
Vollrath, F. and Edmonds, D. (1989) Modulation of the mechanical
properties of spider silk by coating with water. Nature 340:
305–307.
Wiuf, C., Christensen, T. and Hein, J. (2001) A simulation study of
the reliability of recombination detection methods. Mol
Biol Evol 18: 1929–1939.
Xu, M. and Lewis, R.V. (1990) Structure of a protein superfiber:
spider dragline silk. Proc Natl Acad Sci USA
87: 7120–7124.
Yang, Z. (1997) PAML: a Program Package for Phylogenetic
Analysis by Maximum Likelihood. CABIOS 13: 555–556.
Yang, Z. and Bielawski, J.P. (2000) Statistical methods for detecting
molecular adaptation. Trends Ecol Evol 15: 496–503.
Yang, Z. and Nielsen, R. (2002) Codon-substitution models for
detecting molecular adaptation at individual sites along specific
lineages. Mol Biol Evol 19: 908–917.
