Our operation aims to breed non-GM MIV-VII High Oleic Soybean with known
disease resistance and high agronomic value (low lodging rate, high yield,
drought tolerant, etc.).
Given that the value of maturity groups has been established, how do soybeans “get” their
maturity groups? As with any trait, the answer is “genetics.” A gene is defined as a series of
nucleotides that can be transcribed into an RNA molecule (which is often translated into a
protein). The common dominant/recessive model of genes applies here (where given “Aa”,
“A” would be expressed, and given “aa”, “a” would be expressed), but so do some pleiotropic
and epistatic effects (the terms for one gene having multiple effects and for one gene’s effect
depending on other genes, respectively). Maturity Groups are primarily controlled by
nine “E” genes, E1, E2, E3,…E9. The dominant allele (the uppercase E) is correlated with
later maturation (except in the case of E6 and E9, where the dominant allele is the early one, because Science Is Like That™ sometimes).
So, the latest maturing soybean (MX) would have the following set of alleles in its genome:
E1E2E3E4E5e6E7E8e9, and the earliest maturing soybean (M000) would have the following set
of alleles: e1e2e3e4e5E6e7e8E9. This is…pretty awful news for breeders selecting for maturity
groups, because you could breed an MX with an M000 and get literally anything in between.
Luckily, many cultivars exist that have had their genomes sequenced to confirm what alleles
they have, allowing, say, an E1E2E3E4 to be crossed with an e1e2E3E4 to temper the
fluctuations in maturity group.
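To make the “literally anything in between” point concrete, here is a minimal sketch that lists the allele combinations a cross could eventually fix in an inbred line. It assumes each parent is homozygous at every E locus and that the loci assort independently (linkage is ignored); the function name `possible_inbred_progeny` is made up for this illustration and does not come from any of the cited papers.

```python
from itertools import product


def possible_inbred_progeny(parent_a, parent_b):
    """List every fixed (homozygous) allele combination two inbred parents could yield.

    Each parent is a sequence of allele names, one per locus, in the same locus order.
    Assumption: loci assort independently, so any mix of parental alleles is possible.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must list the same loci")
    # At each locus a progeny line can end up fixed for either parent's allele.
    per_locus = [sorted({a, b}) for a, b in zip(parent_a, parent_b)]
    return ["".join(combo) for combo in product(*per_locus)]


# The tempered cross from the text: the parents differ at only two loci.
similar = possible_inbred_progeny(("E1", "E2", "E3", "E4"), ("e1", "e2", "E3", "E4"))

# The extreme cross from the text: the parents differ at all nine loci.
latest = ("E1", "E2", "E3", "E4", "E5", "e6", "E7", "E8", "e9")
earliest = ("e1", "e2", "e3", "e4", "e5", "E6", "e7", "e8", "E9")
extreme = possible_inbred_progeny(latest, earliest)

print(similar)
print(len(extreme))
```

Under these assumptions the tempered cross can only produce 4 combinations, while the MX by M000 cross can produce 512, which is why knowing the alleles matters more than knowing the maturity group.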
These E genes do not have the same magnitude of effect – E1 has more of an influence
on maturity group than E2, and E2 than E3 (I have not found research confirming whether the trend
continues through E9, but it is safe to assume that there is some variance in magnitude of
effect). The E genes also code for specific things: E1 codes for a legume-specific
transcription factor, E2 is thought to be involved in circadian rhythm, and E3 and E4 are
phytochrome genes. Often, the “recessive” e alleles are simply nonfunctional, and
several distinct nonfunctional alleles have been preserved through breeding. In summary:
• Maturity groups are genetically controlled.
• There are thirteen groups that represent the length of time the plants should be in the ground
before ideal harvest.
• An incorrect maturity group for a geographic region will result in a loss of yield one way or
another.
• When breeding for maturity groups, it is paramount to do research on the specific alleles
present in a cultivar, not just its maturity group (see the MX/M000 example).
(papers majorly pulled from for this section include: Zhai, H., Lü, S., Wang, Y., Chen, X., Ren, H., Yang, J., ... & Wu, H. (2014).
Allelic variations at four major maturity E genes and transcriptional abundance of the E1 gene are associated with
flowering time and maturity of soybean cultivars. PloS one, 9(5), e97636., Langewisch, T., Lenis, J., Jiang, G. L., Wang,
D., Pantalone, V., & Bilyeu, K. (2017). The development and use of a molecular model for soybean maturity groups. BMC
plant biology, 17(1), 91.)
Trans Fats & High Oleic Acid
Soybean oil is clearly a major commodity, and not all soybeans are genetically identical.
There are variations of soybean grown across the world that have more desirable fatty acid
compositions. Since this is arguably the most advanced topic, first I am providing a short
glossary and accompanying visual (on the following page; I recommend zooming out to 70%
to read the bullet points while seeing the graphic).
Saturated and Unsaturated refer to whether the fatty acid has any double bonds in it. When there
are ONLY single bonds, the fatty acid is considered Saturated. When there is even ONE
double bond (or triple bond), the fatty acid is considered Unsaturated. Saturation, in effect,
means “saturated with Hydrogen atoms,” which is where the process of “Hydrogenation”
gets its name.
Cis/Trans refer to the geometry around those double bonds. Represented at the
bottom of the image, the left molecule has a single bond as the “primary bond” (the sigma,
σ, bond), and the p orbitals (represented in the lighter gray) are not
aligned or connected in a way to form a double bond. This allows the connected groups to
have Free Rotation. If a double bond is formed by interactions between p orbitals (a π bond), the bond
loses free rotation and the peripheral groups are locked into their positions (on opposite
sides, trans, or on the same side, cis). Trans is generally energetically favorable within chemistry, as
there is less steric interaction between large subgroups. The default state of most
fatty acids, though, is cis.
Omega (ω)/Delta (Δ) are simply terms for denoting where the double bonds are. An
“Omega 3” fatty acid you see in stores has its last double bond three carbons from the methyl end (the CH3
end). A “Delta 3, 7” fatty acid would have a double bond 3 carbons in from the carboxylic acid
end (the side with the O/OH) and another 7 carbons in (ex. oleic acid, with 18 carbons, is Δ9 and ω-9). These
positions have implications for how our bodies can digest them, but that is outside the scope of this
paper.
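Since Delta counts from the carboxylic acid end and Omega counts from the methyl end, one can be worked out from the other once the chain length is known. The sketch below shows that arithmetic; `omega_position` is a name invented for this example, and the double bond positions used (Δ9 for oleic, Δ9,12 for linoleic, Δ9,12,15 for alpha-linolenic) are the standard textbook values rather than anything taken from the cited papers.

```python
def omega_position(n_carbons, delta_positions):
    """Return the omega number: carbons from the methyl end to the nearest double bond."""
    if not delta_positions:
        raise ValueError("a saturated fatty acid has no omega position")
    # The double bond closest to the methyl end has the largest delta position.
    return n_carbons - max(delta_positions)


oleic = omega_position(18, [9])
linoleic = omega_position(18, [9, 12])
alpha_linolenic = omega_position(18, [9, 12, 15])

print(oleic, linoleic, alpha_linolenic)
```

This is why linolenic acid is the “Omega 3” on store labels while oleic acid is an Omega 9.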
The form “16:0” indicates that there are 16 Carbon atoms and 0 double bonds. 18:2 would be
Linoleic acid and indicate 18 Carbon atoms and 2 double bonds.
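The “carbons:double bonds” shorthand is easy to read mechanically. As a small sketch (the helper names `parse_shorthand` and `saturation_class` are hypothetical, and “monounsaturated” is the usual term for exactly one double bond):

```python
def parse_shorthand(notation):
    """Split a shorthand like '18:2' into (carbon atoms, double bonds)."""
    carbons_text, bonds_text = notation.split(":")
    return int(carbons_text), int(bonds_text)


def saturation_class(notation):
    """Classify a fatty acid by its number of double bonds."""
    _, double_bonds = parse_shorthand(notation)
    if double_bonds == 0:
        return "saturated"
    if double_bonds == 1:
        return "monounsaturated"
    return "polyunsaturated"


for name, notation in [("palmitic", "16:0"), ("oleic", "18:1"), ("linoleic", "18:2")]:
    print(name, parse_shorthand(notation), saturation_class(notation))
```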
Soybeans have five main fatty acids comprising their oil, as listed below:
Fatty acid              Source 1   Source 2   Source 3
Palmitic acid (16:0)    11%        11%        13%
Stearic acid (18:0)      4%         4%         4%
Oleic acid (18:1)       25%        23%        20%
Linoleic acid (18:2)    52%        54%        55%
Linolenic acid (18:3)    7%         8%         8%
It varies by source, but not by much. Oleic acid is around a quarter, Linoleic is a little more
than half, and Linolenic is 8%. The other two aren’t that important for us. Oleic, Linoleic, and
Linolenic acids are the unsaturated acids that comprise ~85% of the oil, alongside Palmitic
and Stearic acids for the saturated remaining 15%.
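The “~85% unsaturated” figure can be checked directly against the three sets of percentages listed above. This is only a quick arithmetic sketch: the source labels are placeholders (the sources are not named in the list), and the last entry of the third set is read as linolenic acid (18:3), which is my assumption about what was intended.

```python
# Percentages as listed above, keyed by carbons:double-bonds shorthand.
# "source 1/2/3" are placeholder labels; the 18:3 value for source 3 is an assumed reading.
PROFILES = {
    "source 1": {"16:0": 11, "18:0": 4, "18:1": 25, "18:2": 52, "18:3": 7},
    "source 2": {"16:0": 11, "18:0": 4, "18:1": 23, "18:2": 54, "18:3": 8},
    "source 3": {"16:0": 13, "18:0": 4, "18:1": 20, "18:2": 55, "18:3": 8},
}


def unsaturated_share(profile):
    """Sum the percentages of every fatty acid with at least one double bond."""
    return sum(pct for notation, pct in profile.items() if int(notation.split(":")[1]) > 0)


shares = {source: unsaturated_share(profile) for source, profile in PROFILES.items()}
print(shares)
```

The three sets come out at 84%, 85%, and 83% unsaturated, so “~85%” is a fair summary.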
Polyunsaturated fatty acids (PUFAs) have undeniable health benefits (the most cited are
reducing LDL cholesterol in the blood, and that the body can’t synthesize the essential PUFAs itself so
we have to get them from our diets), which is great, but the linolenic and linoleic acids prevalent in
commercial soybean oil give the oil low oxidative stability, so it goes rancid quickly.
To remedy this, companies hydrogenate the oil (add Hydrogen atoms to break up
double bonds) to reduce the amount of unstable PUFAs. When the hydrogenation is only partial, this process introduces
trans fats, which are linked to obesity and heart disease. You’re alive and go to
supermarkets. You know.
However, commercial soybean oil being high in polyunsaturated fatty acids that are
hydrogenated and become trans fats is not news. In 2015, there was an FDA ban on partially
hydrogenated oils. The ban gave a 3 year window for companies to update their equipment
and find alternatives, and as of June 18, 2018 the law is in effect (though some merchants got
an extension until 2020). In the search for alternatives and a way to continue using all the
soybean capital, scientists found that Oleic acid eliminates the need for hydrogenation. Oleic
acid is oxidatively stable (1 double bond instead of 2 or 3), and the bond is deep within the
molecule, and not toward a reactive end. And – while this is outside of the scope of this task –
flavor profiles are complex and often defined both by most prominent fatty acids as well as
ratios of those acids to each other, and HO soybean has been shown to maintain a similar
flavor profile. High Oleic acid Soybean can mimic other oils that have had commercial
success, and potentially break into new markets.
While the values are ranges (olive oil has a wide diversity of cultivars, regions, harvest timings, and
extraction methods), HO soybean oil falls into the same fatty acid ranges that olive oil does.
Wow! So, how do we increase Oleic Acid while decreasing Linolenic and Linoleic acid? As
is often the answer in biological research, the answer is “manipulating genes.” The goal of
the project is, straightforwardly, just eliminating the double bonds before they happen.
The way it has been done is to get rid of the enzymes that transform Oleic Acid into the
less desirable Linoleic acid. Fatty Acid Desaturase 2 (FAD2) is an enzyme with two main
genes, FAD2-1A and FAD2-1B that convert Oleic acid (18:1) to Linoleic acid (18:2). This can
be seen in the simple visual above and more complex visual on the next page. FAD3 can
then convert linoleic acid to linolenic acid by adding one more double bond.
The goal of the genetic side of HO breeding is to incorporate nonfunctional alleles of FAD2
and, to a lesser degree, FAD3 in order to reduce the amount of oleic acid converted to
linoleic acid and the amount of linoleic acid converted to linolenic acid. The more mutations that
ensure FAD2 doesn’t function, the better, but one can suffice.
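The logic of “eliminating the double bonds before they happen” can be pictured as a two-step chain: FAD2 turns oleic into linoleic, then FAD3 turns linoleic into linolenic. The toy model below only illustrates that chain. The activity values are invented fractions, not measurements, and real seed oil composition depends on much more than these two steps.

```python
def desaturation_outcome(fad2_activity, fad3_activity, oleic_pool=100.0):
    """Toy two-step model: oleic -> linoleic (FAD2) -> linolenic (FAD3).

    Each activity is a hypothetical fraction (0 to 1) of the substrate that gets converted.
    """
    for name, value in (("fad2_activity", fad2_activity), ("fad3_activity", fad3_activity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    made_linoleic = oleic_pool * fad2_activity
    made_linolenic = made_linoleic * fad3_activity
    return {
        "oleic": oleic_pool - made_linoleic,
        "linoleic": made_linoleic - made_linolenic,
        "linolenic": made_linolenic,
    }


# Hypothetical working enzymes versus a complete FAD2 knockout.
working = desaturation_outcome(fad2_activity=0.75, fad3_activity=0.125)
knockout = desaturation_outcome(fad2_activity=0.0, fad3_activity=0.125)

print(working)
print(knockout)
```

With FAD2 switched off in this sketch, nothing reaches FAD3 at all, which is why FAD2 is the primary target and FAD3 the secondary one.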
(papers majorly pulled from for this section include: Pham, A. T., Shannon, J. G., & Bilyeu, K. D. (2012). Combinations of
mutant FAD2 and FAD3 genes to produce high oleic acid and low linolenic acid soybean oil. Theoretical and applied
genetics, 125(3), 503-515., Sweeney, D. W., Carrero-Colón, M., & Hudson, K. A. (2017). Characterization of new allelic
combinations for high-oleic soybeans. Crop Science, 57(2), 611-616. )
FAD2-1A/FAD2-1B
(The Genetic Basis of Oleic Acid)
(FAD2-1A and FAD2-1B and FAD3A and FAD3B and FAD3C and FAD6 and FAD7)
At this point, the project is essentially laid out, and the specifics of the genetics are not
required for practical work. However, this remaining section aims to introduce the primary
genetic components and their complexity for those who may need it. The paper “The FAD2
Gene in Plants: Occurrence, Regulation, and Role” by Aejaz A. Dar, Choudhury, Kancharla, &
Arumugam (which is in the accompanying folder) is the absolute must-read on this topic. This
paper is almost impossible to summarize because it is so informative: its topics range from
actual enzymatic function and temperature and light’s effect on fatty acid desaturation to the FAD2
gene’s evolution, wounding response, and salt response. As such, I will not attempt to do so.
FAD2-1A/1B, though, do function very similarly to the E genes as far as inheritance and
non-function go. There are many ways (point mutations, frameshift
mutations, deletions) to have a faulty FAD2 gene, and these alleles ultimately
result in varying oleic acid concentrations in the oil. As such, genetic
research using basic CTAB kits, PCR, and Illumina sequencing has been done in order to
discover which alleles our soybean lines have.
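As a rough illustration of how the kinds of faulty alleles differ, the sketch below sorts a sequenced allele into a category by comparing its length to the reference. It is deliberately crude: the sequences are made-up toy strings (not real FAD2 sequence), `classify_allele` is a hypothetical helper, and a same-length allele is simply assumed to differ by substitutions.

```python
def classify_allele(reference, allele):
    """Crudely categorize an allele by comparing it to the reference coding sequence."""
    if allele == reference:
        return "identical"
    length_change = len(allele) - len(reference)
    if length_change == 0:
        # Same length: assume one or more base substitutions (point mutations).
        return "substitution"
    if length_change % 3 == 0:
        # Whole codons gained or lost: the reading frame is preserved.
        return "in-frame indel"
    # Any other length change shifts the reading frame for everything downstream.
    return "frameshift indel"


reference = "ATGGCTGGTAAA"            # toy sequence, 4 codons
print(classify_allele(reference, "ATGGCTGGTAAA"))
print(classify_allele(reference, "ATGGCTGATAAA"))   # one base changed
print(classify_allele(reference, "ATGGCTGTAAA"))    # one base deleted
print(classify_allele(reference, "ATGGCTAAA"))      # one codon deleted
```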
To touch on a smaller component, though, let’s talk about FAD3. Fatty Acid Desaturase 3
(FAD3) complex has three main genes that make it up (FAD3A, FAD3B, and FAD3C), and
the created protein converts linoleic acid to linolenic in the endoplasmic reticulum (see the
image two pages prior). That is to say, FAD3A forms a subunit of the protein, FAD3B forms a
subunit, and FAD3C forms a subunit that all are necessary for regular conversion of linoleic
acid to linolenic. Commercial soybean/wild type typically has 7-10% linolenic acid, while
FAD3A-mutated soybeans have 4% linolenic, FAD3A+FAD3B or FAD3A+FAD3C mutants have
3% linolenic, and those mutated in all three have 1% linolenic acid. Residual linolenic acid, such as
the 4-6% found in “typical” HO soybean, still would benefit from hydrogenation, so this
method for High Oleic Low Linolenic (HOLL) soybean is an important development to further
minimize the side effects of PUFAs.
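The FAD3 figures quoted above amount to a small lookup table from “which FAD3 genes are mutated” to “roughly how much linolenic acid to expect.” A minimal sketch of that lookup follows; the values are the ones quoted in this section, while the dictionary and function names are just for illustration, and combinations not mentioned in the text are deliberately left unreported.

```python
# Linolenic acid levels quoted in this section, keyed by the set of mutated FAD3 genes.
LINOLENIC_BY_MUTATED_FAD3 = {
    frozenset(): "7-10%",
    frozenset({"FAD3A"}): "4%",
    frozenset({"FAD3A", "FAD3B"}): "3%",
    frozenset({"FAD3A", "FAD3C"}): "3%",
    frozenset({"FAD3A", "FAD3B", "FAD3C"}): "1%",
}


def reported_linolenic(mutated_genes):
    """Look up the quoted linolenic acid level for a combination of mutated FAD3 genes."""
    return LINOLENIC_BY_MUTATED_FAD3.get(frozenset(mutated_genes), "not reported here")


print(reported_linolenic([]))
print(reported_linolenic(["FAD3A", "FAD3C"]))
print(reported_linolenic(["FAD3B"]))
```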
I have attempted to look into how/why FAD6/7 are not as big of a focus as FAD2 and
FAD3, as these proteins also follow the same fatty acid biosynthesis chain, and honestly
haven’t found much. The Dar et al. paper addresses it the most of anything involving
soybean, and the paper isn’t even directly about soybeans. I would bet that it has something
to do with gene expression timing, or perhaps FAD6/7 are just woefully inefficient and leaving
those untouched allows for the plant to maintain that pathway for fatty acid biosynthesis.
To conclude with a more global perspective, I included the Soja-Barometer from 2014,
which has some neat facts. (I cannot find any more recent ones, but it’s not like the world has
changed THAT much in four years when it comes to where we get our food). I similarly cannot
stress reading this through enough. It’s important to know where your food comes from. Even
if you can’t immediately change policy in Brazil or the Netherlands, being aware of the world
you live in lets you make more informed choices where you do live.
(papers majorly pulled from for this section include: Dar, A. A., Choudhury, A. R., Kancharla, P. K., & Arumugam, N.
(2017). The FAD2 gene in plants: Occurrence, regulation, and role. Frontiers in plant science, 8, 1789., Shi, Z., Bachleda,
N., Pham, A. T., Bilyeu, K., Shannon, G., Nguyen, H., & Li, Z. (2015). High-throughput and functional SNP detection
assays for oleic and linolenic acids in soybean. Molecular breeding, 35(8), 176.)
This marks the end of the revised edition
The revised edition was produced Spring 2019 after significant training in
the art and science of writing. The following pages were the original
document, written Summer 2018 before my writing ability and
professionalism had been honed. The difference, I believe, speaks for itself.
Maturity Groups
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEeeeeeeeeeeeeeeeeeeeeeeeeeee
(Cheez-It’s slogan is “waiting for their cheese to mature” if you didn’t know)
As I’ve understood it all, Soybeans are photoperiodic plants – meaning that their flowering period
depends on day length (long day/short night | long night/short day). Once the proper day length is
reached, the plant activates a series of genes that creates flowers instead of normal leaves (floral
quartet-model stuff, probably). Some sources say once day length is less than 12.8 hours, soybeans
start flowering.
(Hong Zhai, 2014): A simple paper detailing differences in growth stages given different latitudes (i.e.,
day lengths and temperature). All papers directly referenced in this will be in a folder called “HO
Highlights companion folder,” and all of the papers that I got the paraphrased information from are
included in another folder nearby.
BUT “proper day length” varies within landraces and varieties of soybean. Soybean
growers in Northern China (or farther from the Equator anywhere) will want soybeans that mature faster.
The longer one delays soybean flowering, the higher risk an unexpected frost event or blizzard destroys
the entire field. The main goal of talking about maturity groups is being able to not put something
“meant” for short day length, cold, high latitude environments in a tropical climate. It would GROW, but
the yield would be greatly reduced. Similarly, having a high maturity group (late maturing) plant in a
‘Northern’ farm would pose great risk as the plant had yet to flower into September and October when
it becomes too cold to survive and it hasn’t even produced any seed before dying.
Some paper I read likened it to having more and more factories before you attempt to start
making a new product. Leaves (and roots) are the factories that a plant invests in before the “switch”
to the flowering phase. The more roots and more leaves (Sources), the more nutrients that can be sent
to the flowers and subsequent reproductive organs (Sinks).
(((There are Thirteen maturity groups, 000, 00, 0, I, II, III, IV, V, VI, VII, VIII, IX, X. Higher maturity group
means later maturing)))
Maturity groups are not random, and it is possible to breed for certain ones (like most things).
Also like most things, the answer is always “genes” or “proteins.” This time it’s genes. Because
breeding. Maturity groups are determined through a combination of 9* E genes, E1, E2, E3…E9. “In
the E series, the dominant version of the gene confers later flowering and later maturity except
for E6 and E9, where the dominant alleles have an early-flowering phenotype.”
*for now :}
(Langewisch 2017. A paper on everything E gene. It’s a really good read, and where the bolded quote
is from, along with most of the specific information in the above/below paragraph.)
((For the paper, and general information: R1 beginning bloom, R2 full bloom, R3 beginning pod,
R4 full pod, R5 beginning seed, R6 full seed, R7 beginning maturity, R8 full maturity are
“Growth Stages”))
So, the latest maturing soybean variety would be E1E2E3E4E5e6E7E8e9, and the earliest
maturing soybean variety would be e1e2e3e4e5E6e7e8E9. These E genes, naturally, do different
things. E1 is a novel legume-specific transcription factor distantly related to the B3 superfamily (which
apparently is just a DNA binding domain), E2 is thought to be involved in circadian rhythm, E3 is a
phytochrome gene…basically it seems like TO ME that each E gene has a role in allowing the
plant to maintain flowering for longer with greater efficacy. Most of the ‘recessive’ e1, e2, e3,…E6, E9
alleles are simply n o n f u n c t i o n a l. E1, for instance is a functional transcription factor, but there
are multiple common alleles that eliminate or reduce function (e1-as has a missense mutation that
reduces function, and e1-fs and e1-nl have frameshift mutations and complete deletion of the gene that
remove function completely, respectively.)
So to recap: Higher maturity groups take longer to mature (~10 days longer per group)
but can have higher yield/more seeds because they had more time before flowering to make more
leaves and longer roots and MORE OF EVERYTHING. However, this can be a detriment in
environments not suited for soybeans for as long of time. Matching maturity group to environmental
conditions is crucial to effective growing. Additionally, a somewhat complex series of E genes regulate
maturity group. E1 has more influence than E2, and E2 than E3 [I have not found if this trend persists
all the way down to E9], and it’s possible to, hypothetically, cross an e1.e2.E3.E4.E5 (MG I) with
E1.E2.E3.e4.e5 (MG VII) and get either E1.E2.E3.E4.E5 (MG X) or e1.e2.E3.e4.e5 (MG 00) (these MG
#’s are made up for demonstration purposes, but you can absolutely cross two MG IV’s and get an MG VIII if the alleles line
up just right)
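The “~10 days longer per group” rule of thumb from the recap can be turned into a quick estimate. This is only a back-of-the-envelope sketch: the 10-day step is the rough figure quoted above, not a measured constant, and `approx_extra_days` is a name made up for the example.

```python
# The thirteen maturity groups, earliest to latest, as listed earlier.
MATURITY_GROUPS = ["000", "00", "0", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X"]
DAYS_PER_GROUP = 10  # rough figure from the recap above, not a measured constant


def approx_extra_days(earlier_group, later_group):
    """Estimate how many more days the later group needs than the earlier one."""
    steps = MATURITY_GROUPS.index(later_group) - MATURITY_GROUPS.index(earlier_group)
    return steps * DAYS_PER_GROUP


print(approx_extra_days("I", "VII"))
print(approx_extra_days("000", "X"))
```

By this estimate an MG VII line needs about 60 more days than an MG I line, and the full spread from MG 000 to MG X is about 120 days.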
Soybean oil is clearly a major commodity. There are variations of soybean that
have more desirable fatty acid compositions (think oil that more mimics olive oil, which
is just what people are used to/has some health benefits.)
(Zi Shi 2015. The paper is genetic-focused, but the introduction has some solid
information on why HO soybeans are important:)
But first, here are some crowded oil basics as a sort of glossary. Saturated and
Unsaturated refer to if the fatty acid has ~any~ double bonds in it. Cis/Trans refer to the
angles at which those double bonds are formed. Omega/Delta fatty acid (like “omega 3”
you see in stores) just means that it’s a fatty acid with a double bond three from the end
(methyl end). A Delta three fatty acid would be a double bond 3 carbons in from the
carboxylic acid end.
It clearly varies, but not by much. Oleic acid is around a quarter, Linoleic is a little
more than half, and Linolenic is 8%. The other two aren’t that important for us.
Polyunsaturated fatty acids (PUFAs) have undeniable health benefits (the most cited is
reducing LDL cholesterol in the blood, and that the body can’t synthesize PUFA
ourselves so we have to get it from our diets) which is great, but linolenic and linoleic
acids prevalent in commercial soybean oil makes the oil go rancid quickly and have low
oxidative stability. To cheaply remedy thi$, companie$ hydrogenate (add Hydrogen
atom$ to break up double bond$) the oil to reduce the amount of un$table PUFA$. This
process introduces tran$ fat$ which are linked to obesity and heart problems and
stuff. You’re alive and go to supermarkets. You know.
However, commercial soybean oil being high in polyunsaturated fatty acids that
are hydrogenated and become trans fats is not news. In 2015, there was an FDA ban
on partially hydrogenated oils. The ban gave a 3 year window for companies to update
their equipment and find alternatives or whatever, and as of June 18, 2018 the law is in
effect. In the search for alternatives and a way to continue using all the soybean capital,
scientists found that…
(Olive oil has a WIDE diversity of cultivars, regions, harvest timings, and
extraction methods. Olive/Palm oil found from general sources online)
(Sweeney 2017 provides HO soybean fatty acid composition with ‘all’
combinations of AABB, AaBb, … aabb)
While they are ranges, HO Soybean oil falls into the same ranges that Olive Oil
does. Wow! So, how do we increase Oleic Acid while decreasing Linolenic and Linoleic
acid? Proteins! (And therefore also genes)
The way it has been done is to get rid of the enzymes that transform Oleic Acid
into the less desirable Linoleic acid. Fatty Acid Desaturase 2 (FAD2) is an enzyme with
two main genes, FAD2-1A and FAD2-1B that convert Oleic acid (18:1) to Linoleic acid
(18:2). As can be seen, FAD3 can then convert linoleic acid to linolenic acid by adding
one more double bond.
(Dar 2017 has the primary information on the FAD2 gene and is as much of a must-read as the paper
on maturity groups is. If only one paper is read, this is the best one to read).
This paper is almost impossible to summarize because it’s so informative and topics range
from actual enzymatic function (The FAD2 in the ER utilizes phospholipids as substrates with NADH, NADH-cytochrome b5
reductase, and cytochrome b5 as electron donors. On the other hand, the plastidial oleate desaturase (FAD6) primarily uses glycolipids as
acyl carriers, and ferredoxin reduced by ferredoxin-NAD(P) reductase as electron donors), temperature and light’s effect on
fatty acid desaturation, and – since the paper is about the enzyme more than any one plant – a
comprehensive appreciation for soybean’s FAD2 can be gained. Just go to a HO Soybean
conference and ask people if they’ve read Aejaz A. Dar’s paper. Of course they have, who hasn’t?
Headers include: Fatty Acid Desaturase Genes, Phylogenetic or Evolutionary Relationship of FAD2
Genes, Regulation of FAD2 Gene (Temperature, Light, Wounding), FAD Gene Isolation and
Characterization, Expression of FAD2 Gene, Significance and Role of FAD2 (Fatty Acid Biosynthesis,
Plant Development, Cold Tolerance, Salt Tolerance).
I have attempted to look into how/why FAD6/7 are not as big of a focus as FAD2 and FAD3, and
honestly haven’t found much. The above paper addresses it the most of anything involving soybean,
and the paper isn’t even directly about soybeans. I would bet that it has something to do with gene
expression timing (FAD2 and FAD3 are expressed in the seed stage…the form that we are using).
Alternatively, FAD6/7 are just woefully inefficient but leaving those untouched allows for the plant to
maintain that pathway for fatty acid biosynthesis.
Soybeans and how you’ve probably consumed
dodecadillions
To get a more global perspective, I included the Soja-Barometer from 2014 with some neat facts.
(I can’t quite find any more recent ones, but it’s not like the world has changed THAT much in four
years, especially when it comes to where we get our food).
I similarly cannot stress reading this through enough. It’s important to know
where your food comes from. Even if you can’t immediately change policy in Brazil
or the Netherlands, being aware of the world you live in lets you make more
informed choices where you do live.
<3 Read this and the FAD2 paper <3
A note about genetic work: We’re looking for specific alleles of specific genes. The functional sequence
of nucleotides for FAD2-1A and FAD2-1B are known, so you are able to simply:
1. Isolate DNA
2. PCR (more DNA to work with)
3. RFLP/T-RFLP/restriction enzymes at a point upstream (and downstream) (smaller DNA is easier to
work with)
4. Sequence that DNA
5. Compare to the known sequence and determine if you have a functional, semi-functional, or
nonfunctional allele
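For the last step, one simple sign of a nonfunctional allele is a stop codon that shows up earlier than it does in the known sequence. The sketch below checks only for that; it would miss other causes of lost function, such as a damaging amino acid change. The sequences are toy strings rather than real FAD2 sequence, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon_index(coding_sequence):
    """Return the 0-based codon index of the first in-frame stop codon, or None."""
    sequence = coding_sequence.upper()
    for start in range(0, len(sequence) - 2, 3):
        if sequence[start:start + 3] in STOP_CODONS:
            return start // 3
    return None


def has_premature_stop(reference, sample):
    """True if the sample hits an in-frame stop codon earlier than the reference does."""
    reference_stop = first_stop_codon_index(reference)
    sample_stop = first_stop_codon_index(sample)
    if sample_stop is None:
        return False
    return reference_stop is None or sample_stop < reference_stop


reference = "ATGGCATTAGCCCCATAA"    # toy gene: ATG GCA TTA GCC CCA TAA
frameshift = "ATGGATTAGCCCCATAA"    # one base deleted: ATG GAT TAG ...
missense = "ATGGCATTCGCCCCATAA"     # one base changed, same reading frame

print(has_premature_stop(reference, frameshift))
print(has_premature_stop(reference, missense))
```

Here the single-base deletion shifts the reading frame into an early TAG, so the toy frameshift allele is flagged while the missense allele is not.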
Alternatively, HO trait is gaining a decent research base, and the HO lines ‘all’ come from particular
lines with particular mutations.
These are examples, and not necessarily true: If you knew your line came from Williams 82, and
Williams 82 had a G420 deletion causing frameshift mutation, you could just more cost-effectively look
for a marker at like, the 380th nucleotide to scan upwards from there. Similarly, if your line came from
17D, you would be able to map maturity group e1-ns, e2-xx, E3…etc. g e n e t i c s
Side/final note: I feel like PCR would already amplify only the gene, so…I don’t think you’d actually
need any restriction enzymes. But eh.