
A statistical case study, involving several vaccines, about biological hurdles and applications of computational
biology in Vaccine Development
BY MOHAMMAD SHAHID AKHTAR AND NAVEED ANJUM FAZILI

Abstract:
Vaccines have been paramount to the unparalleled development of humanity in the past century. The use of vaccines
has saved perhaps billions of lives from premature deaths. It has been the saving grace of humanity, defeating the
very natural process of pathogenic deaths.
Recent times have shown us how much we yearn for a vaccine, due to the Covid-19 onslaught. There are multiple
challenges that medical researchers across the world face, which hinder their efforts to develop a viable vaccine.
The world demands a vaccine for the novel coronavirus SARS-CoV-2, and amid this medical crisis, people with little
knowledge of the field have criticized the hard-working researchers for their supposed incompetence in developing a
vaccine.
Seeing people criticize the researchers made the co-authors of this paper, Naveed Anjum and Mohammad Shahid,
want to write an understandable yet scientifically accurate paper that could shed light on the difficulty of vaccine
development. In addition, we researched the average time taken to develop vaccines for different kinds of
pathogens, and compared the different categories of vaccines using statistical tests.
Finally, the paper discusses the implications of modern and upcoming techniques from biology for the faster
development of vaccines, and the ways AI could help in developing vaccines quickly and safely.

Table of contents:
• SECTION A- The major hurdles towards vaccine development:
1. During vaccine development
2. After vaccine development

• SECTION B- Data for different pathogenic categories:
1. Single-Stranded RNA Viruses
2. Double-Stranded RNA Viruses
3. Double-Stranded DNA Viruses
4. Major bacteria

• SECTION C- Mathematical testing for the presence, if any, of statistical difference between the vaccine
development times for different pathogenic groups:
1. Average developmental time for different pathogenic categories
2. Standard deviation for different pathogenic categories
3. Normal distribution for different pathogenic categories
4. Statistical comparison of efficiency of viral vaccines and bacterial vaccines
5. T-test comparison of viral vaccines and bacterial vaccines

• SECTION D- The applications of genetic modification, Machine Learning and AI in the development of vaccines:
1. Genetic modification
2. Machine Learning and Artificial Intelligence

• Conclusion.

• References.
SECTION A- The major hurdles towards vaccine development:
Abstract:
Vaccines are a highly important tool in our medical world and have been so for roughly the past two centuries, since
Edward Jenner discovered that injections of Cowpox, a similar but weaker cousin of the far more dangerous Smallpox,
would give immunity to the recipients against Smallpox.
After years of research, four forms of vaccines were developed [1]:
(a) Inactivated vaccines- Vaccines in which the pathogen has been “killed”.
(b) Attenuated pathogen vaccines- Vaccines with weakened pathogens.
(c) Toxoid vaccines- Vaccines containing inactivated toxins.
(d) Subunit vaccines- Vaccines containing only cleansed antigens.
The inactivated pathogens in (a) are still effective because all that a vaccine needs are surface antigens. The
antigens are receptors (i.e. binders) and identity markers of the pathogen that help the immune system to recognize
it. The immune system’s dendritic cells identify the pathogenic antigen as ‘foreign’ and present it to T-Lymphocytes
(T-Lymph) and B-Lymphocytes (B-Lymph). The macrophages and neutrophils are the first to act: they engulf the
pathogen via endocytosis, after which they destroy the pathogen(s) in vacuoles laced with digestive enzymes. The
macrophage isolates the antigens and presents them to developing B-Lymph in bone marrow. [2]
This creates an immune response, which generates increasing numbers of specific cells of two distinct types in bone
marrow: B-Lymphocytes and T-Lymphocytes. The undeveloped B-Lymph are involved in specific antibody production,
which is done by using antigenic information primarily from macrophages. The B-Lymph then develop and
differentiate into either Plasma cells, which provide immediate antibody production against pathogens, or memory
cells.
The T-Lymph also differentiate into either Killer T-Lymph or Helper T-Lymph, both of which are equipped with
T-Cell receptors. Each of the two T-Lymph types can differentiate into specialised memory cells, much like B-Lymph.
These memory cells give the person long-term immunity and the ability to defend against future infections by the
same pathogen. When an attack happens again, these memory cells divide to form new Plasma Cells (a form of
B-Lymph) that can attack the ‘memorised’ pathogen by releasing antibodies again.

1. The hurdles during vaccine development:


• Pathogenic identification and isolation:
The first hurdle for the researcher is to identify and isolate the specimen pathogen without any foreign
contamination. This isolation is done by centrifugation of the sample, decantation (with liquids) or the use of a
selective medium that suppresses contaminating pathogens. Centrifugation can be used here to separate the pathogens
according to their mass, as the heavier contaminating substances settle lower. Since the pathogen has a specific
mass, its position in the centrifuged sample can be predicted, and it can then be decanted.
Once the specific pathogen is isolated, the pathogenic genome can be separated by the lysis of the pathogen, which
releases the organelles including the genome, which can again be separated by centrifugation.
The separated genome can then be amplified by polymerase chain reaction (PCR) along with microarray detection[3],
in which the genome is read and simultaneously amplified using polymerase enzyme chain reactions[4]; primers such
as oligonucleotides can be used to amplify the small DNA sample present in the pathogen.[5] If the pathogen is
RNA-based, RT-PCR can be used to obtain DNA, which can be read in the same manner.
Either way, when the genetic material is read, the researchers gain some insight into the pathogen’s antigenic
structure. Furthermore, it helps in understanding how the pathogen works on the victim. This knowledge helps
researchers create temporary treatments or suggest common treatments already available.

Time taken for this step- About 2 days.[6]

• Weakening/inactivating/antigen purification/toxin inactivation of the pathogen and growing it in a culture:
Following the identification and isolation, the obtained sample is then either weakened or inactivated.
The weakening is primarily done by mutagenesis (i.e. genetic change) using genetic engineering, in which certain
virulence inducing genes are ‘deleted’ via homologous recombination [7], where the target virulent gene is replaced
by nucleotides in a different order. Then the successful weakened pathogens are selected by select Monoclonal
Antibodies (MABs) which are specific to them[8].
The inactivation of the pathogen, in contrast, is done by exposing the pathogens to certain chemical agents or
physical substances like formalin or β-propiolactone.[9] These chemicals “kill” the pathogen by disabling its
mechanisms to survive and reproduce. This renders the pathogen a simple antigenic host.
The resulting pathogens from either method are then cultivated in a culture, where the necessary nutrients and heat
are provided so that they divide continuously to form a pathogenic colony.
The third category of vaccines, the subunits, are made by dissolving the pathogen, collecting antigens by
chromatographic identification, and harvesting them via recombinant DNA insertion into other bacteria. [10]
The fourth category of vaccines, the toxoids, could be primarily employed against bacteria like Bordetella pertussis
which causes whooping cough via a toxin. The toxoid vaccine is made by isolating and then purifying the toxin
using either heat or formaldehyde or both. [11]
Time taken for this step - About a week for attenuated, inactivated and subunit vaccines (including cultivation). A
few hours to a day for toxoid vaccines.

• Harvesting the vaccine and approving it for use:

After the cultivation is successful, a batch of the vaccine is firstly verified to be working- primarily under a
microscope, where antigenic presence is confirmed, and virulence determined. After the confirmation, the harvested
vaccine usually undergoes a three-phase confirmatory process.

The first phase involves a small group of people who receive the trial form of the vaccine. If successful, the
vaccine then goes to people with similar features as in phase 1, but in larger numbers. Finally, if both phases are
successful, the vaccine enters the final phase, where it is given to thousands of people.[12] Sometimes, a fourth
phase is started in which the effectiveness and possible side-effects of the vaccine are studied.
A month or so after each phase, the test subjects undergo extensive blood tests, by which the blood antibody
(glycoproteins responsible for antigen binding) count as well as the memory cell count is determined. These tests
give the researchers an idea of vaccine efficiency.
The vaccine is deemed a failure if:
(a) The percentage of test subjects with adverse reactions is above a universal standard of safety.
(b) The antibody count and/or memory cell count is absent or below the requirement for active immunity to persist.
(c) The test subjects get infected by the said disease shortly after.
Time taken for this phase- From a few months to possibly years, depending on the complexity of the vaccine. [13]

SUMMARY- From above times, the hurdles of the developmental phase take anywhere from at least months- for
simple vaccines like yearly influenza shots- to possibly years or decades- for complex vaccines like Malarial
vaccine.

2. The hurdles after the public release of a vaccine:


• Antigenic drift:
After successful vaccine development, the researchers must monitor possible drifts of the original pathogenic
antigen, against which the vaccine was developed. This is a problem, because if due to a random genetic mutation,
even if one amino acid is substituted, the entire secondary, tertiary, or quaternary structure can possibly change due
to different R-group interactions. This can engender varyingly different changes; from minor structural deviations to
possibly an entirely different shape of the antigen due to piling mutations.
Due to the antigenic drift, various strains of the pathogen are developed, in the process possibly creating a dominant
new strain.
This essentially renders the vaccine incapable of providing strong active immunity to the recipient, since the
memory cells will provide antibodies that do not match the drifted antigens.

• Antigenic concealment:
Sometimes, pathogens mutate such that they use deception to evade the immune system’s effect on them.[13] This
deception is done by pathogens when they enter a cell, take it over, and use it as a shield by hiding inside it. This
prevents detection by dendritic cells or by memory cells. As a result, vaccine-induced immunity fails, because the
pathogen’s antigens are undetectable to the immune system. A notorious case is the Vibrio cholerae bacterium,
which conceals itself in the walls of the intestine, thereby preventing antibodies from binding. This has driven the
price of a viable vaccine to about 250 USD [14].

• Ethical Issues:
The primary ethical issue after the release of a vaccine is its content and the procedure used to make it.
Misinformation and rumours often include the narrative that vaccines contain Aluminium, formaldehyde, and even
dead foetuses. These remarks, most notably the last one, cause doubts and prevent the widespread use of
vaccines.

SECTION B- Time to develop vaccines for different categories of pathogens:


• Vaccines against Single-Stranded RNA Viruses:

1. MMR Vaccine:
MMR stands for measles, mumps, and rubella. The three conditions are caused by measles morbillivirus, mumps
orthorubulavirus and Rubella virus, respectively. All three are single-stranded RNA viruses; the former two are
negative-sense RNA viruses [15][16] and the latter is positive-sense.[17]
Time to develop MMR vaccine- 2 years[18]

Effectiveness of MMR in preventing Measles- 93% and 97% for one dose and two doses, respectively. [19]
Average effectiveness- 95%

2. Polio Vaccine:
Polio virus is one of the subtypes of Enterovirus C, which are single-stranded RNA viruses of positive-sense. The
virus causes disability usually in the legs.
Polio vaccine was developed and first approved on April 12, 1955 after years of research and development by a
team led by Jonas Salk. The polio vaccine utilised inactivated polio viruses to stimulate immune responses.

Time to develop Inactivated Polio vaccine- 7 years (Started in 1948 and ended with approval on 12th April
1955)[20][21]

Effectiveness of Inactivated Polio Virus (IPV) in preventing poliomyelitis- 90%, 99% and 100% for one dose, two
doses and three doses, respectively.
Average effectiveness- 96.3%[22]

3. Influenza A and B:
Discovered in 1933 by Wilson Smith, C.H. Andrewes, and P.P. Laidlaw[23], Influenza A, one of the forms of
Influenza, is a negative-sense single-stranded segmented RNA virus.
A bivalent vaccine (catering to both Influenza A and B) was approved in 1945.[24]
Influenza B, a second form of Influenza, was discovered by Thomas Francis in 1936.[25]
Influenza B, much like Influenza A, is a negative-sense single-stranded segmented RNA virus.
Time to develop the vaccine- 12 years for Influenza A, and 9 years for Influenza B.

Effectiveness of the vaccine- 37% against Influenza A and 50% against Influenza B (2019 data).[26]

Average Effectiveness against any form of Influenza- 43.5%.

4. Hepatitis A:
Hepatitis A is caused by Hepatovirus A, an unenveloped[27] virus which has a positive-sense single stranded
RNA[28] . The virus causes primarily jaundice, due to increased amounts of bilirubin. The Hepatovirus A was
isolated in 1979[29] and its vaccine was publicly available in 1992[30].

Time to develop the vaccine- 13 years


Effectiveness of vaccine- 95% [31]

• Vaccines against Double-stranded RNA Viruses:


*Note: Due to data constraints, some assumptions were made about Rotarix®’s start date

1. Rotavac®:
Rotavac is a monovalent vaccine against Rotavirus A, one of the only known double-stranded RNA viruses that
infect humans. The clinical trials of Rotavac® began in 2001, when an American-Indian alliance led by Bharat
Biotech had been formed. The vaccine was approved and made available in early 2014.[32]

Time to develop vaccine (approximate)- 13 years


Effectiveness of the vaccine- 56% [33]

2. Rotarix®:
Rotarix, another monovalent vaccine against Rotavirus A, was modelled after the strain RIX4414, which was derived
from a prior 89-12 vaccine. The 89-12 strain was isolated from stool in 1989, and the final vaccine, Rotarix®, was
licensed in Mexico in July 2004.[34]
Time to develop vaccine- 15 years
Effectiveness of the vaccine- 90.4%[35]

3. Rotasiil®:
Rotasiil®, the first heat-resistant Rotavirus vaccine, was planned starting in 2005[36] in India and was
completed in early 2020[37], when it was approved for use.

Time to develop vaccine- 15 years

Effectiveness of the vaccine- 55%[38]


• Vaccines against major bacteria:

1. Tetanus:
Tetanus is caused by the Gram-positive bacterium Clostridium tetani, which has a cell wall made of a thick
peptidoglycan layer that absorbs and retains the crystal violet stain. The first vaccine, a toxoid-based vaccine,
was developed in 1924[39] and was made commercially available in 1938[40].

Time taken for vaccine development- 14 years

Effectiveness of vaccine- 80% to 85% [41]


Taking average as 82.5%

2. Anthrax:
Anthrax is caused by Bacillus anthracis, a Gram-positive bacterium. The AVA (Anthrax Vaccine Adsorbed) was
developed in the early 1950s, and its clinical trials started in 1954 under Philip S. Brachman. The vaccine was
permitted for use in 1970 via USPHS evaluation.[42]

Time taken for vaccine development- 16 years

Effectiveness of vaccine- 92.5 percent[43]

3. Tuberculosis:
Tuberculosis is caused by the acid-fast bacterium Mycobacterium tuberculosis. The bacterium primarily causes
respiratory distress, among many other things. Development of the BCG vaccine was started in 1908 by Albert
Calmette and his assistant Camille Guérin, and the vaccine was accepted by the League of Nations in 1928.[44]
Time taken for vaccine development- 20 years
Effectiveness of the vaccine- 70% to 80% [45]
Average effectiveness is 75%

4. Neisseria meningitidis group A:

Neisseria meningitidis group A is a Gram-negative bacterium that is one of the several pathogens that cause
Meningitis, a condition where the protective layers of the brain and the spinal cord are inflamed. Amongst several
vaccines against Neisseria meningitidis group A, one of the more important ones is MenAfriVac®. Development
started in 2001 under a $10 Million donation by the Bill and Melinda Gates Foundation[47]. The vaccine was publicly
available in 2010 in sub-Saharan Africa[48].

Time to develop vaccine- 9 years

Vaccine efficiency- 94%[46]

• Vaccines against DNA Viruses:

1. Smallpox:
Smallpox is caused by Variola major, a double-stranded DNA virus. Development of the modern vaccine against
smallpox, ACAM2000, began in 2001 after 9/11 to counter bioterrorism. The vaccine got its approval in August 2007
from the US FDA.[49]

Time taken to develop vaccine- 6 years

Effectiveness of vaccine- 99% [50]

2. Chickenpox:
Chickenpox is caused by Human alphaherpesvirus 3, better known as Varicella-Zoster virus (VZV). The Varivax
vaccine’s development began in 2003[51] and ended with FDA approval in 2018.[52]

Time taken to develop vaccine- 15 years

Effectiveness of vaccine- 80% to 85%[53]


Average effectiveness- 82.5%

3. Human Papillomavirus (HPV):


HPV infection is caused by Human papillomavirus, a double stranded DNA virus belonging to the Papillomaviridae
family. The vaccine Gardasil’s development began in 1991 under Jian Zhou and Ian Frazer at The University of
Queensland, Australia.[54] The vaccine was approved for use by the FDA in June 2006.[51]

Time taken to develop vaccine- 15 years

Effectiveness of vaccine- 100%[55]

Section C: Mathematical models to quantitatively analyse vaccine development data:
Although data is limited, a few statistical devices can be employed to find the:

1. Average time to develop a vaccine:


Data for each category:
$\bar{x} = \dfrac{\sum t}{n}$, where $\bar{x}$ is the mean time to make a vaccine, $\sum t$ is the total time and $n$ is the number of vaccines.

(a) Single stranded RNA Viruses:


So $\bar{x} = \dfrac{2+7+12+9+13}{5} = 8.6$ years

(b) Double stranded RNA Viruses:


So $\bar{x} = \dfrac{13+15+15}{3} = 14.3$ years

(c) Major Bacterium:


So $\bar{x} = \dfrac{14+16+20+9}{4} = 14.8$ years

(d) DNA Viruses:


So $\bar{x} = \dfrac{6+15+15}{3} = 12.0$ years

So, mean of all vaccines = 12.07 years
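
The category means and the overall mean can be reproduced with a few lines of Python. This is a minimal sketch that only re-enters the development times listed in Section B; the dictionary keys and variable names are illustrative and not part of the original analysis.

```python
from statistics import mean

# Development times in years, copied from Section B.
development_years = {
    "ssRNA viruses": [2, 7, 12, 9, 13],
    "dsRNA viruses": [13, 15, 15],
    "bacteria": [14, 16, 20, 9],
    "DNA viruses": [6, 15, 15],
}

# Mean development time for each pathogenic category.
category_means = {name: mean(times) for name, times in development_years.items()}

# Pool every vaccine to obtain the overall mean.
all_times = [t for times in development_years.values() for t in times]
overall_mean = mean(all_times)

for name, value in category_means.items():
    print(f"{name}: {value:.2f} years")
print(f"all vaccines: {overall_mean:.2f} years")
```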

2. The standard deviation for developing a vaccine:


S.D for each:

$\sigma = \sqrt{\dfrac{\sum (x-\bar{x})^2}{n}}$, where $\sigma$ is the standard deviation, $\sum (x-\bar{x})^2$ is the sum of all $(\text{time} - \text{mean})^2$ and $n$ is the total number of
vaccines.

(a) Single stranded RNA Virus:

$\sigma = \sqrt{\dfrac{(2-8.6)^2+(7-8.6)^2+(9-8.6)^2+(12-8.6)^2+(13-8.6)^2}{5}} = 3.93$

(b) Double stranded RNA Virus:

$\sigma = \sqrt{\dfrac{(13-14.3)^2+(15-14.3)^2 \times 2}{3}} = 0.94$

(c) Major bacterium:

$\sigma = \sqrt{\dfrac{(14-14.8)^2+(16-14.8)^2+(20-14.8)^2+(9-14.8)^2}{4}} = 3.96$

(d) DNA Viruses:

$\sigma = \sqrt{\dfrac{(6-12)^2+(15-12)^2 \times 2}{3}} = 4.24$

Standard deviation for all = 4.48
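
As a check on the arithmetic, the sketch below recomputes each standard deviation with the population formula (division by n), which is the form used in the equation for σ above. The data are re-entered from Section B and the variable names are illustrative.

```python
from statistics import pstdev

# Development times in years, copied from Section B.
development_years = {
    "ssRNA viruses": [2, 7, 12, 9, 13],
    "dsRNA viruses": [13, 15, 15],
    "bacteria": [14, 16, 20, 9],
    "DNA viruses": [6, 15, 15],
}

# pstdev is the population standard deviation: sqrt(sum((x - mean)^2) / n).
category_sd = {name: pstdev(times) for name, times in development_years.items()}

# Pool every vaccine for the overall standard deviation.
all_times = [t for times in development_years.values() for t in times]
overall_sd = pstdev(all_times)

for name, value in category_sd.items():
    print(f"{name}: {value:.2f}")
print(f"all vaccines: {overall_sd:.2f}")
```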


3. The normal distribution for vaccine development times:
(a) Single stranded RNA Virus:

Figure 1
To find the probability that an ssRNA virus vaccine has development time t, the following equation is used:
$Z = \dfrac{t-8.6}{3.93}$

(b) Double stranded RNA Virus:

Figure 2
To find the probability that a dsRNA virus vaccine has development time t, the following equation is used:
$Z = \dfrac{t-14.3}{0.94}$

(c) Major Bacterium:

Figure 3
To find the probability that a bacterial vaccine has development time t, the following equation is used:
$Z = \dfrac{t-14.8}{3.96}$
(d) DNA Viruses:

Figure 4
To find the probability that a DNA virus vaccine has development time t, the following equation is used:
$Z = \dfrac{t-12}{4.24}$

Overall normal distribution for any vaccine development:

Figure 5
To find the probability that any vaccine has development time t, the following equation is used:
$Z = \dfrac{t-12.07}{4.48}$

4. Calculating probability of a vaccine to be formed in time t:


To calculate the probability that a vaccine is made within time t, the value of t is put either into the equation for
its respective category or into the general equation, and the value of Z is found. The value of Z is then looked up
in a standard normal distribution table, which gives the corresponding probability.
If the probability is greater than 0.5, then the vaccine will take more time to make than an average vaccine will.
If the probability is smaller than 0.5, then the vaccine will take less time to make than an average vaccine will.
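
The table lookup can also be done numerically. The sketch below is an illustration under the assumption, made throughout this section, that development times are normally distributed; it computes Z and the cumulative probability with the error function instead of a printed table. The function names are hypothetical helpers, not part of the original analysis.

```python
from math import erf, sqrt

def z_score(t, mean, sd):
    """Standardise a development time t (years) against a category mean and SD."""
    return (t - mean) / sd

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def probability_within(t, mean, sd):
    """P(development time <= t), assuming times are normally distributed."""
    return normal_cdf(z_score(t, mean, sd))

# Example: ssRNA virus vaccines (mean 8.6 years, SD 3.93), t = 10 years.
p = probability_within(10, 8.6, 3.93)
print(f"Z = {z_score(10, 8.6, 3.93):.2f}, P(T <= 10) = {p:.3f}")
```

A probability above 0.5, as in this example, corresponds to a time t that is longer than the category mean.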
Mathematical testing for the presence, if any, of statistical difference between the vaccine development times for
different pathogenic groups.

5. Testing whether viral vaccines or bacterial vaccines are more effective:


It can be determined which category of vaccine is more effective by finding the mean and the standard deviation of
each category:
Viral vaccine mean efficiency: $\dfrac{95+96.3+43.5+95+56+90.4+55+99+82.5+100}{10} = 81.3\%$

Standard deviation of viral vaccines: $\dfrac{202.69}{10} = 20.3$

Bacterial vaccine mean efficiency: $\dfrac{82.5+92.5+75+94}{4} = 86\%$

Standard deviation of bacterial vaccines: $\dfrac{30.95}{4} = 7.7$

From the above data, it can be deduced that bacterial vaccines are more effective on average (86% vs 81.3%) and are
more consistent, with less varying values (7.7 vs 20.3).
This suggests that bacterial vaccines are more stable, possibly because bacteria mutate less readily than viruses,
which mutate rapidly.
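
The effectiveness figures can be recomputed in the same way. This minimal sketch re-enters the average effectiveness values from Section B and again uses the population standard deviation; the list names are illustrative.

```python
from statistics import mean, pstdev

# Average effectiveness (%) of each vaccine, copied from Section B.
viral = [95, 96.3, 43.5, 95, 56, 90.4, 55, 99, 82.5, 100]
bacterial = [82.5, 92.5, 75, 94]

viral_mean, viral_sd = mean(viral), pstdev(viral)
bacterial_mean, bacterial_sd = mean(bacterial), pstdev(bacterial)

print(f"viral: mean {viral_mean:.1f}%, SD {viral_sd:.1f}")
print(f"bacterial: mean {bacterial_mean:.1f}%, SD {bacterial_sd:.1f}")
```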
6. Testing for presence, if any, of a statistically major difference between the developmental times of viruses and
bacteria:
Using t-tests, the data for viruses can be compared with the data for the bacteria to find whether pathogenic
complexity and mutagenic ability (which is more in viruses) has any effect on vaccine development time or not.

The following equation is used to compare the statistical impact of pathogenic complexity on a vaccine’s
development:
$t = \dfrac{|\bar{x}_1^2 - \bar{x}_2^2|}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$

Here, x1 is the mean of viruses, s1 is standard deviation of viruses and n1 is number of viral vaccines.
Here, x2 is the mean of bacteria, s2 is standard deviation of bacteria and n2 is number of bacterial vaccines.

Using this equation, we can plug in following values from previous section(s):

x1 = 11.09, s1 = 4.46, n1 = 11

x2 = 14.75, s2 = 3.96, n2 = 4

$t = \dfrac{|11.09^2 - 14.75^2|}{\sqrt{\dfrac{4.46^2}{11} + \dfrac{3.96^2}{4}}} = 39.5$

Taking the null hypothesis, “There is no statistically significant difference between the vaccine developmental
times of viruses and bacteria”, and a significance level of 5%, the number of degrees of freedom is
$v = (n_1 - 1) + (n_2 - 1)$.
Therefore, the number of degrees of freedom is $v = (11 - 1) + (4 - 1) = 13$.
From the table below, for 13 degrees of freedom and a significance level of 5%, the critical value of t is 2.160.

Degrees of freedom    Significance level of 5% (0.05)
13                    2.160
Figure 6: Part of T-test value table
As a result, it can be deduced that the difference is statistically significant, since the calculated t value is much
greater than the critical t value of 2.160. Hence, the null hypothesis is rejected, and pathogenic complexity, as
well as mutagenic ability, plays a big role in vaccine development time.
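
As a cross-check of the arithmetic, the sketch below evaluates the statistic in two ways from the same inputs: with the squared means, as in the calculation above, and with the plain difference of means, which is the numerator of the conventional unequal-variance (Welch) t statistic. The function names are illustrative, and the critical value 2.160 is taken from Figure 6.

```python
from math import sqrt

def standard_error(s1, n1, s2, n2):
    """Standard error of the difference: sqrt(s1^2/n1 + s2^2/n2)."""
    return sqrt(s1**2 / n1 + s2**2 / n2)

def squared_means_statistic(m1, s1, n1, m2, s2, n2):
    """Statistic as written in this section: |m1^2 - m2^2| / SE."""
    return abs(m1**2 - m2**2) / standard_error(s1, n1, s2, n2)

def welch_t(m1, s1, n1, m2, s2, n2):
    """Conventional Welch t statistic: |m1 - m2| / SE."""
    return abs(m1 - m2) / standard_error(s1, n1, s2, n2)

# Viruses: mean 11.09, SD 4.46, n 11. Bacteria: mean 14.75, SD 3.96, n 4.
inputs = (11.09, 4.46, 11, 14.75, 3.96, 4)
CRITICAL_T = 2.160  # from Figure 6: 13 degrees of freedom, 5% level

t_squared = squared_means_statistic(*inputs)
t_welch = welch_t(*inputs)
print(f"squared-means form: {t_squared:.2f}")
print(f"conventional form:  {t_welch:.2f} (critical value {CRITICAL_T})")
```

With these inputs the squared-means form reproduces the value of 39.5 reported above, while the conventional form gives roughly 1.53, which is below the critical value of 2.160. Whether the difference counts as significant therefore depends on which form is applied.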

SECTION D- The use of genetic technology, Machine Learning and AI in the development of vaccines:
1. Genetic Technology:
• Polymerase Chain Reaction (PCR) is a technique that is traditionally implemented in pathogenic isolation, as
mentioned in Section A, by ensuring no foreign genetic material exists.

The same technique can also be used to identify DNA-based or RNA-based genes that code for pathogenic antigens.
For DNA pathogens, this technique involves heating the isolated genetic material (post lysis of the pathogen),
causing DNA denaturation (separation into two strands). The resulting two strands can then be combined with DNA
Polymerase, which uses free nucleotides to synthesize two new DNA molecules. This process can be repeated until a
sufficient sample of DNA is made from a very small original sample of DNA.[56]
In the case of RNA pathogens, RT-PCR is utilized, in which the RNA is reverse transcribed into complementary DNA
(cDNA) by the enzyme reverse transcriptase. The resulting DNA is then amplified by PCR as described above.
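
The reason a very small sample is enough can be seen from the doubling in each cycle. The sketch below is an idealised illustration: it assumes every cycle copies a fixed fraction of the molecules (an `efficiency` of 1.0 means perfect doubling), which is a simplifying assumption and not a figure taken from the sources cited here.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Copies after `cycles` rounds; efficiency 1.0 means perfect doubling."""
    return initial_copies * (1 + efficiency) ** cycles

def cycles_needed(initial_copies, target_copies, efficiency=1.0):
    """Smallest number of cycles that reaches at least `target_copies`."""
    cycles = 0
    copies = initial_copies
    while copies < target_copies:
        copies *= 1 + efficiency
        cycles += 1
    return cycles

# A single template molecule after 30 ideal cycles.
print(pcr_copies(1, 30))
# Cycles needed to turn 10 molecules into at least one million.
print(cycles_needed(10, 1_000_000))
```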
In either case, the resulting DNA sample can then be read by a DNA sequencing machine, which produces all the
nucleotides in order by using a method known as “shotgun sequencing”, in which the DNA is split randomly into
small fragments, then read base by base. The sequence of the whole pathogen is then obtained by combining each
random fragment’s sequence.
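
The recombination of random fragments can be illustrated with a toy assembler. This is a minimal sketch, assuming error-free reads and using a simple greedy rule (repeatedly merge the two reads with the longest suffix-prefix overlap); real assemblers are far more elaborate, and the function names and example reads are hypothetical.

```python
def overlap(a, b, min_length=3):
    """Length of the longest suffix of a that is also a prefix of b."""
    start = 0
    while True:
        start = a.find(b[:min_length], start)
        if start == -1:
            return 0
        if b.startswith(a[start:]):
            return len(a) - start
        start += 1

def greedy_assemble(fragments, min_length=3):
    """Merge reads pairwise by largest overlap until no overlaps remain."""
    reads = list(dict.fromkeys(fragments))  # drop duplicate reads
    while len(reads) > 1:
        best = (0, None, None)
        for a in reads:
            for b in reads:
                if a != b:
                    length = overlap(a, b, min_length)
                    if length > best[0]:
                        best = (length, a, b)
        length, a, b = best
        if length == 0:
            break  # remaining reads do not overlap
        reads.remove(a)
        reads.remove(b)
        reads.append(a + b[length:])
    return reads

# Three overlapping reads from a made-up 15-base sequence.
contigs = greedy_assemble(["ACGTTAGC", "ATGGCGTA", "GCGTACGT"])
print(contigs)
```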
Following this, the antigen is isolated from the pathogen via high-speed centrifugation. The isolated antigen’s
protein subunits are then determined by Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry with
tandem mass spectrometry (MALDI-TOF MS/MS). Using this technique, the isolated antigen in a solution is mixed
with a matrix material on a metal plate and irradiated with a laser. This causes the ionization of the protein(s),
which are then accelerated through a mass spectrometer twice. The data obtained is read by a computer, which
matches it to a database, and the amino acid sequence is determined. Once the order of amino acids is established,
lists of the possible mRNA codons are made (since each amino acid can have multiple possible mRNA codons). Once
the possible mRNA orders for each protein are determined, the corresponding DNA/RNA sequence is established and
then the genome is read to identify where the gene(s) for the antigen is/are located.
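
The step from an amino acid sequence to its candidate mRNA sequences can be sketched as follows. The codon table is only a small subset of the standard genetic code, chosen to cover the example peptides, and the peptides themselves are hypothetical.

```python
from itertools import product

# Subset of the standard genetic code (mRNA codons) for a few amino acids.
CODONS = {
    "M": ["AUG"],         # methionine
    "W": ["UGG"],         # tryptophan
    "K": ["AAA", "AAG"],  # lysine
    "F": ["UUU", "UUC"],  # phenylalanine
    "D": ["GAU", "GAC"],  # aspartate
}

def possible_mrnas(peptide):
    """Enumerate every mRNA sequence that could encode the peptide."""
    options = [CODONS[amino_acid] for amino_acid in peptide]
    return ["".join(combo) for combo in product(*options)]

def count_possible_mrnas(peptide):
    """Number of candidate mRNAs without enumerating them."""
    count = 1
    for amino_acid in peptide:
        count *= len(CODONS[amino_acid])
    return count

print(possible_mrnas("MKW"))
print(count_possible_mrnas("MKFD"))
```

The number of candidates grows multiplicatively with peptide length, which is why the candidates are matched against the sequenced genome rather than synthesized directly.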
Once that is established, the specific section of the genome that codes for the antigen protein(s) can be reproduced
via the use of plasmid vectors, into which the gene for the antigen is added by cutting the genome with restriction
enzymes, which create sticky ends, and joining the pieces with DNA ligase.
The altered vectors are then inserted into a bacterium. Once the bacteria start mass-producing the pathogenic
antigen, the antigen obtained can be isolated by gel filtration chromatography, in which the larger bacteria are
separated from the smaller antigen proteins using a gel, and used in vaccines to safely trigger the immune system to
develop immunity.

• Another way vaccines can be progressively made more efficient and accurate is by the use of double RNA-guided
DNA endonuclease Cas9 Dalton proteins. The Cas9 protein is commonly used by bacteria against viruses, more
specifically DNA viruses, via their Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) immune
system.
The same Cas9 protein can be used to genetically alter DNA from a pathogen, or RNA (reverse transcribed into
DNA), using double guide RNAs (gRNAs), which are contained in CRISPR RNA (crRNA) that has a hairpin loop
shape.

The Cas9 protein cleaves the DNA at a certain position, which is determined by the specific gRNA (in crRNA) used.
The target point is followed by a part of the pathogenic DNA called the Protospacer Adjacent Motif (PAM), which is
used by the Cas9 protein to select the precise spot on the pathogenic DNA by binding to the base pairs at that
point.[57] Once the exact position on the DNA is fixed, as shown in the diagram below, the DNA at that point is cut
by either a single-stranded or a double-stranded break to form a cleavage.[58]

Figure 7: Cas9 Dalton Protein


Source:[59]
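
The PAM-based site selection described above can be illustrated with a short search routine. This is a sketch under two assumptions that are not stated in the text: the PAM is taken to be NGG (the motif commonly associated with Streptococcus pyogenes Cas9) and the guide is taken to match the 20 bases immediately before it. The function name and the example sequence are hypothetical, and only one strand is scanned.

```python
import re

def find_cas9_targets(dna, guide_length=20):
    """Return (start, protospacer, pam) for each NGG PAM on the given strand."""
    targets = []
    # A lookahead is used so that overlapping PAM candidates are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        if pam_start >= guide_length:
            protospacer = dna[pam_start - guide_length:pam_start]
            targets.append((pam_start - guide_length, protospacer, match.group(1)))
    return targets

# Made-up sequence: 20 bases of target followed by the PAM "TGG".
sequence = "ACGTACGTACGTACGTACGT" + "TGG" + "AA"
for start, protospacer, pam in find_cas9_targets(sequence):
    print(start, protospacer, pam)
```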

Figure 8 Source:[60] Figure 9 Source:[61]


Following this, using Homology Directed Repair (HDR) as in Figure 8, an artificial donor DNA containing genes
that code for repressor protein(s) against pathogenic genes can be added to the double-stranded break in the
pathogenic DNA.
After successful DNA editing, the target genome from Figure 8 is transcribed into crRNA as in Figure 9.
Multiple crRNAs combine with a trans-activating CRISPR RNA (tracrRNA), forming a single-guide RNA
(sgRNA). The sgRNA is then combined with the Cas9 and added to a vector plasmid, which can finally be added to a
bacterium. The bacterium can then grow in a culture and produce the pathogens with repressed harmful genes. As
such, the pathogens cannot infect human cells and can be used in vaccines safely.

2. Use of Artificial Intelligence (AI) and machine learning:


New pathogens have roused the scientific community with a call to action to combat the growing pandemic. At the
time of this writing, there are several pathogens on the run, yet no novel antiviral agents or approved vaccines
available for deployment as a frontline defense. Traditional approaches have failed to produce stable and protective
vaccines for hypervariable and rapidly-evolving viral pathogens, including influenza viruses [62]. The reasons for
this failure include inherent uncertainty in pathogen evolution [63]. While global surveillance efforts and data
sharing agreements have increased available information, vaccine design often ignores the underlying processes of
the global influenza meta-population which generates diversity that allows the viral populations to escape vaccine-
induced immune responses and anti-viral treatments. Understanding the pathobiology of viruses could aid scientists
in their discovery of potent antivirals by elucidating unexplored viral pathways. One method for accomplishing this
is the leveraging of computational methods to discover new candidate drugs and vaccines in silico. In the last
decade, machine learning-based models, trained on specific biomolecules, have offered inexpensive and rapid
implementation methods for the discovery of effective viral therapies. Given a target biomolecule, these models are
capable of predicting inhibitor candidates in a structural-based manner. If enough data are presented to a model, it
can aid the search for a drug or vaccine candidate by identifying patterns within the data.
Machine learning can be utilized in the form of complicated algorithms and AI to identify the antigens and/or
genetic material of a novel, previously unknown pathogen. This works because most pathogens belong to a
distinct family of pathogens whose members have similar properties to each other, such as similar proteins.
The important ways in which AI and ML could be used in vaccine development are:

• The process of distinguishing pathogenic genetic material from human DNA can be reduced to mere minutes with the
help of deep learning AI. The AI can be taught to identify pre-read genomes of human DNA by establishing
common repeating patterns of base sequences which are exclusive to humans. Therefore, during the sequencing of,
for example, a mixture of human DNA and a similar but unknown pathogenic DNA/RNA, the AI can identify the
pathogenic genome as being ‘non-human’ genetic material by recognizing the absence of prerequisite human base
patterns.

Figure 10: Common Human Base Sequences (Exclusive to all humans)


So, for example, in Figure 10, AI generates a pattern amongst three human DNA samples, resulting in AI
assumption that all humans have CHBS1 and CHBS2, which are human exclusive.

Figure 11: Differentiation of human DNA from pathogenic genome

In Figure 11 above, the AI is provided with data from a mixed random sample of human DNA
and non-human pathogenic DNA. Having learned from the data in Figure 10, the AI associates CHBS1 and
CHBS2 with human DNA. As a result, the AI recognizes DNA SAMPLE B as the non-human pathogenic DNA
since it lacks CHBS1 and CHBS2. Trained on many more human samples, the AI can accurately recognize base orders that
are exclusive to humans.

Figure 12: Human DNA mixed with unknown genomic sample

In another case, the human DNA is mixed in with an unknown genomic sample. By quickly identifying CHBS1 and
CHBS2, the AI can identify that Sample A is human DNA. Simultaneously, the AI can recognize that the unknown
sample is RNA, and hence not human DNA, due to the presence of uracil and the lack of thymine.

This in itself can speed up vaccine development, since researchers can quickly identify and separate the pathogenic
genome from its human host DNA. The genome can then be read sooner, which leads to faster vaccine development.
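The separation step described for Figures 10 to 12 can be illustrated with a minimal Python sketch. It is only a stand-in for the idea, not a deep learning model: the motif strings assigned to CHBS1 and CHBS2, the sample sequences and the function names are all hypothetical, and exact substring matching replaces the pattern recognition that a trained network would perform.

```python
# Hypothetical marker motifs standing in for CHBS1 and CHBS2 from the figures.
HUMAN_MARKERS = {"CHBS1": "GATTACAGGC", "CHBS2": "TTAGGGTTAG"}


def nucleic_acid_type(sequence):
    """Return 'RNA' if the read contains uracil and no thymine, else 'DNA'."""
    seq = sequence.upper()
    if "U" in seq and "T" not in seq:
        return "RNA"
    return "DNA"


def classify_read(sequence, markers=HUMAN_MARKERS):
    """Label a read 'human' only if it is DNA and carries every marker motif."""
    seq = sequence.upper()
    if nucleic_acid_type(seq) == "RNA":
        # Rule taken from the text: an RNA read is treated as non-human.
        return "non-human"
    if all(motif in seq for motif in markers.values()):
        return "human"
    return "non-human"


# Made-up toy reads; real reads are far longer and noisier.
samples = {
    "DNA SAMPLE A": "ACGT" + "GATTACAGGC" + "CCGTA" + "TTAGGGTTAG" + "AGC",
    "DNA SAMPLE B": "ATGCCGTTAACGGATCCGTAGCTA",
    "RNA SAMPLE": "AUGGCUAGCUUAGGCAUCG",
}
labels = {name: classify_read(seq) for name, seq in samples.items()}
for name, label in labels.items():
    print(name, "->", label)
```

In this sketch only DNA SAMPLE A is labelled human, because it is the only read that contains both motifs. A trained model would replace the fixed motif table with patterns learned from many human genomes.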

 AI can be used to generate a software model of a pathogenic antigen. This can be done by using an AI that has
been trained, by machine learning, to select useful information in a cohesive manner from NMR spectroscopy and
electrophoresis data. This is done primarily by filtering out unwanted ‘background’ data from these processes, which allows
the computer to highlight and read only the relevant data corresponding to the primary sequences.
This would significantly lower the time it takes to obtain useful data about the protein(s) in the antigen and their
primary structure(s) as well as their quaternary structure(s). Following this, an AI can be trained and used to quickly
sequence the bases of a DNA fragment by using the aforementioned “shotgun sequencing”, with the added benefit
of pattern recognition, comparing the sequence with sequences from related pathogens.
The AI speeds up the automated sequencing process by ‘predicting the order’ of the bases, and then scans the bases for orders
which code for the antigenic proteins.
As a result, AI simplifies the process by quickly locating the genes responsible for the antigens of the pathogen.
The genes are then inserted into a vector plasmid, and antigens are synthesized for vaccines. This reduces
the whole process to two steps and saves time in the initial developmental phase.
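As a rough illustration of scanning bases for stretches that could code for a protein, the sketch below searches the three forward reading frames of a DNA string for open reading frames (a start codon ATG followed in frame by a stop codon). This is a classical rule-based scan rather than an AI method, and the function name, the `min_codons` parameter and the toy sequence are assumptions made for the example. Real gene finding would also consider the reverse strand and far longer sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) for forward-strand ORFs: ATG ... stop."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the frame one codon (three bases) at a time.
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Keep the ORF only if it has enough codons before the stop.
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None
    return orfs


# Made-up toy fragment containing one short ORF.
toy_genome = "CCATGGCTGATTCTTAAGG"
candidates = find_orfs(toy_genome)
print(candidates)
```

Each candidate returned here would still need to be compared with the antigen genes of related pathogens before it could be treated as an antigen-coding gene.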

 AI can also be used to quickly identify a novel pathogen’s antigenic structure(s), with minimal NMR spectroscopy
and lab work involved. This can be done by allowing the AI to read the structure(s) of already analyzed antigens
belonging to pathogens which are closely related to the novel pathogen. Here, using deep learning, the AI can
recognize recurring patterns of genomic bases amongst these related pathogens. Therefore, the
AI can to some extent ‘predict’ some, or most, of the antigenic structure(s) of the novel pathogen. This is done by, for
example, recognizing:

1. Common bonds and their locations amongst the proteins on similar pathogenic antigens.
2. Common patterns of amino acid sequences in similar pathogenic antigens.
3. Common arrangements of the proteins in similar pathogenic antigens.
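The second item in the list, common patterns of amino acid sequences, can be sketched with a simple per-position consensus over already aligned sequences. This is a deliberately simple statistical stand-in for what a deep learning model would learn, and the sequences, the function name and the 0.75 agreement threshold are all hypothetical values chosen for the example.

```python
from collections import Counter


def consensus(aligned, threshold=0.75):
    """Per-column consensus of aligned sequences; '-' marks variable columns."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned):
        # Most frequent residue in this column and how often it occurs.
        residue, count = Counter(column).most_common(1)[0]
        result.append(residue if count / len(aligned) >= threshold else "-")
    return "".join(result)


# Made-up antigen fragments from four hypothetical related pathogens.
related_antigens = [
    "MKTLLVAGC",
    "MKTLIVAGC",
    "MKSLLIAGC",
    "MKTLLISGC",
]
pattern = consensus(related_antigens)
print(pattern)
```

Columns where the related antigens agree form a first guess for the corresponding positions of the novel antigen, while the columns marked '-' are the ones that would still need laboratory work.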

The more recent generation of context-based models are transformers, which use attention mechanisms and
self-supervision to extract representations from sequences [64]. Transformers have demonstrated the capacity to
predict drug–target interactions [65], model protein sequences [66], and predict retrosynthetic reactions. These
models learn to extract features from sequences based on the location, context, and order of the input tokens [67].
Recurrent neural networks (RNNs) and long short-term memory (LSTM) networks have also performed well
when trained on molecules or protein sequences for secondary structure prediction [68],
quantitative structure–activity relationship (QSAR) modeling [69], and function prediction [70].
RNNs and LSTMs can hence be used to help predict the antigenic protein’s secondary and quaternary structure, therefore
reducing the time it takes to understand the structure(s) of the antigen. This could reduce the development time by a large
margin and allow the vaccine to enter clinical trials sooner.
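To make the attention mechanism mentioned above more concrete, the following is a minimal sketch of scaled dot-product self-attention in plain Python. It is heavily simplified: queries, keys and values are all set equal to the input embeddings, there are no learned weights, and the two-dimensional one-hot embeddings and the peptide "MKM" are made-up toy values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Scaled dot-product self-attention with Q = K = V = embeddings."""
    dim = len(embeddings[0])
    outputs, weights = [], []
    for query in embeddings:
        # Similarity of this token to every token, scaled by sqrt(dim).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in embeddings
        ]
        row = softmax(scores)
        weights.append(row)
        # Output is the attention-weighted average of all token embeddings.
        outputs.append([
            sum(w * value[d] for w, value in zip(row, embeddings))
            for d in range(dim)
        ])
    return outputs, weights


# Toy one-hot embeddings for the residues of a made-up peptide "MKM".
EMBED = {"M": [1.0, 0.0], "K": [0.0, 1.0]}
tokens = [EMBED[residue] for residue in "MKM"]
outputs, weights = self_attention(tokens)
print(weights[0])
```

In this toy case the first residue attends more strongly to the other M than to the K. In a trained transformer, the learned projections decide which positions of a protein sequence inform each other.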

 Apart from pre-developmental aid, AI can also help after vaccine development is over, during the clinical trials. This can
be done by simulating how the altered (weakened, attenuated or dead) pathogen interacts with human cells.
Using data that ‘teaches’ the AI the normal behaviour of the pathogen, the AI tries to recreate
the same behaviour with the altered pathogen and the same human cells in computer simulation software such as
MOLSIM [71]. In software like MOLSIM, the molecular interactions between the pathogenic
antigen and human binding sites are simulated. If the altered pathogen successfully binds to human cells, the vaccine is a failure, and
vice versa.
This can be useful, firstly because it can prevent unnecessary endangerment of human lives in first trials of
vaccines, and secondly because it is cheap and effective. This potentially has the biggest impact on vaccine
development, because the trial stage alone takes up the most time in the development of a vaccine, due to safety
concerns which do not arise in a computer simulation.
Owing to recent developments, Graph Convolutional Neural Networks (GCNNs) have become a favored tool for drug
discovery applications [72]. These networks are able to handle graphs and extract features by encoding the
adjacency information within the features. Successful representation learning from molecules using GCNNs has
been demonstrated in drug property prediction [73], protein interface estimation, reactivity prediction [74], and
drug–target interactions [75]. Sequence-based models in genomics, proteomics, and transcriptomics have also
gained some attention in recent years due to the advancements made in the natural language processing domain.
The drug–target interaction prediction of GCNNs can also be used in this context, because GCNNs provide a more advanced
mechanism than MOLSIM. Using a GCNN’s drug–target interaction prediction, it is possible to simulate and predict the
interaction between the pathogenic antigen and human cell binding sites. This can be used to verify whether the vaccine
antigens work or not.
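How a graph convolution encodes adjacency information within the features can be sketched in a few lines of plain Python. The version below is a simplified variant that averages each node's features with those of its neighbours before applying a linear map and a ReLU; published GCNN layers typically use a different normalization and learned weights. The three-atom chain, the feature values and the identity weight matrix are made-up toy inputs.

```python
def gcn_layer(adjacency, features, weight):
    """One simplified graph-convolution step: neighbour mean, linear map, ReLU."""
    n = len(adjacency)
    in_dim = len(features[0])
    out_dim = len(weight[0])
    hidden = []
    for i in range(n):
        # Each node aggregates over itself and its direct neighbours.
        group = [j for j in range(n) if adjacency[i][j] or j == i]
        mean = [
            sum(features[j][d] for j in group) / len(group)
            for d in range(in_dim)
        ]
        hidden.append([
            max(0.0, sum(mean[d] * weight[d][k] for d in range(in_dim)))
            for k in range(out_dim)
        ])
    return hidden


# Toy molecule: a chain of three atoms, 0 - 1 - 2.
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]  # made-up atom-type features
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity matrix in place of learned weights
hidden = gcn_layer(adjacency, features, weight)
print(hidden)
```

After one step each atom's feature vector already reflects its neighbours, and stacking several such layers lets information travel further across the molecular graph. A drug–target interaction model would feed these learned representations into a further prediction stage.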

Conclusion:
From the hurdles shown, it is clear that vaccine development is no easy feat; from engineering a properly safe vaccine
to years of clinical trials, the whole procedure takes perseverance and patience from both the researchers and the
global population. In summary, vaccine development should not be rushed, for the consequences of rushing can be very
harsh for both the researchers and the people who use the vaccines.
The data also showed that vaccine development, irrespective of category, does take time. The data further
shows that viral vaccines, in comparison to bacterial vaccines, are less effective but
quicker to develop. This fits with the biological facts of viruses’ higher mutation rates and their
simplicity, respectively.
Finally, drawing on extensive research into CRISPR, genetic modification and the ways AI could be used to assist the
former, the future of vaccine development was outlined from a molecular engineering perspective
aided by revolutionary computer programs.

References:
1. https://www.who.int/vaccine_safety/initiative/tech_support/Part-1.pdf?ua=1

2. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3026618/

3. https://bmcmicrobiol.biomedcentral.com/articles/10.1186/1471-2180-7-78

4. https://www.biosyn.com/faq/What-is-PCR-amplification.aspx

5. https://www.news-medical.net/life-sciences/What-is-an-Oligonucleotide.aspx

6. https://bmcmicrobiol.biomedcentral.com/articles/10.1186/1471-2180-7-78

7. https://www.sciencedaily.com/releases/2015/10/151001151146.htm

8. https://www.sciencedirect.com/topics/neuroscience/attenuated-vaccine

9. https://www.sciencedirect.com/topics/medicine-and-dentistry/inactivated-virus-vaccine

10. https://www.creative-biolabs.com/vaccine/subunit-vaccine-design.htm

11. https://www.sciencedirect.com/topics/medicine-and-dentistry/toxoid

12. https://www.cdc.gov/vaccines/basics/test-approve.html

13. https://www.sciencedirect.com/science/article/pii/016947589390177H/pdf?md5=abd5d84dc8c31c1b21fc40cb65feb1a0&pid=1-s2.0-016947589390177H-main.pdf

14. http://www.medscape.com/viewarticle/878238

15. https://www.frontiersin.org/articles/10.3389/fmicb.2019.01832/full

16. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4268314

17. https://www.ncbi.nlm.nih.gov/books/NBK8200/

18. https://www.businessinsider.com/how-long-it-took-to-develop-other-vaccines-in-history-2020-7

19. https://www.cdc.gov/vaccines/vpd/measles/index.html

20. https://archive.org/details/jonassalkconquer00mcph

21. https://www.pbs.org/wgbh/americanexperience/features/transcript/polio-transcript/

22. https://www.cdc.gov/vaccines/vpd/polio/hcp/effectiveness-duration-protection.html

23. https://www.cdc.gov/vaccines/pubs/pinkbook/flu.html

24. https://www.cdc.gov/flu/pandemic-resources/pandemic-timeline-1930-and-beyond.htm

25. https://www.cdc.gov/vaccines/pubs/pinkbook/flu.html

26. https://www.aafp.org/news/health-of-the-public/20200226interimfluve.html

27. https://www.sciencedirect.com/science/article/abs/pii/S0168170207000068?via%3Dihub

28. https://www.sciencedirect.com/topics/veterinary-science-and-veterinary-medicine/hepatitis-a-virus

29. https://pubmed.ncbi.nlm.nih.gov/8182274/

30. https://link.springer.com/chapter/10.1007%2F3-540-36583-4_6

31. https://en.wikipedia.org/wiki/Hepatitis_A_vaccine

32. https://www.defeatdd.org/blog/celebrating-rotavac-rotavirus-vaccine-success-story-india-and-world

33. https://www.healthissuesindia.com/2016/11/24/rotavac-divides-experts-opinion-effectiveness/

34. https://academic.oup.com/cid/article/48/2/222/305770

35. https://www.who.int/immunization/sage/3_Detailed_Review_Paper_on_Rota_Vaccines_17_3_2009.pdf?ua=1

36. https://health.economictimes.indiatimes.com/news/pharma/sii-invents-the-first-ever-heat-stable-rotavirus-vaccine-in-the-world-rotasiil/70712802

37. https://www.thehindubusinessline.com/news/sii-launches-new-rotavirus-vaccine-rotasiil-liquid-for-diarrhoea/article30536952.ece

38. https://www.path.org/media-center/serum-institutes-vaccine-demonstrates-significant-efficacy-against-severe-rotavirus-gastroenteritis/

39. https://www.cdc.gov/vaccines/pubs/pinkbook/tetanus.html

40. https://www.nvic.org/vaccines-and-diseases/tetanus/vaccine-history.aspx

41. https://www.immunize.org/catg.d/p4220.pdf

42. https://en.wikipedia.org/wiki/Anthrax_vaccine_adsorbed

43. https://www.ncbi.nlm.nih.gov/books/NBK220536/

44. https://en.wikipedia.org/wiki/BCG_vaccine#History

45. https://www.nhs.uk/conditions/vaccinations/bcg-tuberculosis-tb-vaccine/

46. https://www.path.org/media-center/menafrivac-vaccine-cuts-incidence-of-meningitis-by-94-percent/

47. http://www.gatesfoundation.org/press-releases/Pages/path-and-who-receive-grant-010530.aspx

48. https://www.bbc.co.uk/news/business-11534311

49. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2880337/

50. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2880337/

51. https://www.medicines.org.uk/emc/medicine/15264

52. https://www.fda.gov/vaccines-blood-biologics/vaccines/varivax

53. https://academic.oup.com/jid/article/197/Supplement_2/S82/849104

54. https://en.wikipedia.org/wiki/Gardasil#History

55. http://www.hpvvaccine.org.au/the-hpv-vaccine/how-effective-is-the-vaccine.aspx

56. https://www.genome.gov/about-genomics/fact-sheets/Polymerase-Chain-Reaction-Fact-Sheet

57. https://en.wikipedia.org/wiki/CRISPR_gene_editing#cite_note-32

59. https://torontopubliclibrary.typepad.com/.a/6a00e5509ea6a188340240a4a4e039200d-700wi

60. https://paul-cacoango.blogspot.com/2018/12/el-uso-de-terapia-genetica-pulmonar-en.html

61. https://castormagazine.wordpress.com/2018/04/11/genetic-editing-a-double-edged-sword/

62. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6631137/#B30-vaccines-07-00045

63. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6631137/#B32-vaccines-07-00045

64. https://www.frontiersin.org/articles/10.3389/frai.2020.00065/full#B34

65. https://www.frontiersin.org/articles/10.3389/frai.2020.00065/full#B124

66. https://www.frontiersin.org/articles/10.3389/frai.2020.00065/full#B28

67. https://www.frontiersin.org/articles/10.3389/frai.2020.00065/full#B14

68. https://www.frontiersin.org/articles/10.3389/frai.2020.00065/full#B109

69. https://www.frontiersin.org/articles/10.3389/frai.2020.00065/full#B23

70. https://www.frontiersin.org/articles/10.3389/frai.2020.00065/full#B86

71. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5033024/

72. https://www.frontiersin.org/articles/10.3389/frai.2020.00065/full#B37

73. https://www.frontiersin.org/articles/10.3389/frai.2020.00065/full#B59

74. https://www.frontiersin.org/articles/10.3389/frai.2020.00065/full#B30

75. https://www.frontiersin.org/articles/10.3389/frai.2020.00065/full#B137
