
Proceedings of 11th National Science Symposium (February 03, 2019)

Organized by Christ College, Rajkot & Sponsored by Gujarat State Biotechnology Mission (GSBTM), DST,
Govt. of Gujarat. Bioinformatics
www.christcollegerajkot.edu.in, © Christ College, Rajkot, India, ISBN: 9788192952147

VIROINFORMATICS: DATABASES AND TOOLS


Angelin George, John J. Georrge*
Department of Bioinformatics, Christ College, Rajkot-360 005, Gujarat
johnjgeorrge@gmail.com

ABSTRACT
The amalgamation of virology and bioinformatics has led to the development of a new field known
as viroinformatics. More than 100 web servers and databases are currently available that provide
information on different viruses, for example, dengue virus, influenza virus, hepatitis virus,
human immunodeficiency virus (HIV), hemorrhagic fever virus (HFV), human papillomavirus (HPV), and
West Nile virus. These databases provide tools for homology modelling, phylogenetic tree
construction, multiple sequence alignment, and 3D visualization. The need for computer-assisted
analysis of the genome structure, function, and evolution of viruses is increasing immensely to
tackle various challenges in virology. This review presents an overview of the viroinformatics
databases and tools that can contribute to the development of new potential drugs.

1. INTRODUCTION
Viruses are ubiquitous infectious agents known to infect all types of life forms, from animals
and plants to microorganisms such as bacteria and archaea. Various human diseases are caused by
viruses. These include the common cold, influenza, chickenpox, and cold sores, as well as serious
diseases such as Ebola virus disease, dengue fever, AIDS, avian influenza, and severe acute
respiratory syndrome (SARS). The possible connection between human herpesvirus 6 (HHV6) and
neurological diseases such as multiple sclerosis and chronic fatigue syndrome is under
investigation (Komaroff, 2006).
Approximately 6 million deaths occur every year due to viruses, despite the availability of effective
vaccines and treatments for several diseases. Thus, it is crucial to develop remedies against these viral
invaders. A large amount of genomic and experimental data is being generated owing to advances in
molecular biology and bioinformatics. To store, examine, and disseminate all this information, 144
viroinformatics resources have been developed. The International Committee on Taxonomy of Viruses
(ICTV), which performs the task of naming and classifying viruses, lists 4,958 species. The genomes
of 8,110 viral strains have been sequenced (NCBI Viral Genome Resource).
Bioinformatics research on microorganisms, including data analysis and the development of tools and
databases, is growing steadily (Abouelwafa et al., 2017; George et al., 2018; George et al.,
2017; Georrge et al., 2011; Georrge et al., 2012; Georrge, 2016; Kotadiya et al., 2015; Lijo et al., 2012;
Nishita et al., 2015; Nishita et al., 2016; Ranipa et al., 2018; Sakina et al., 2016; Sharma et al., 2014;
Trivedi et al., 2016a, 2016b; Ukani et al., 2011; Vaidya et al., 2018). Powerful viroinformatics
resources have been developed that provide unprecedented opportunities to address fundamental
questions in virology. Bioinformatic analysis of viruses includes tasks related to the analysis of
novel sequences, such as gene identification, gene functional annotation, and analysis of phylogenetic
relationships. Because viral genomes are small, it is possible to sequence large numbers of isolates,
which in turn calls for specific methods of analysis. However, the current sequencing technologies
available for viral genomes pose challenges because most analysis steps are not easily automated
(Tumpey et al., 2005).
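As an illustration of the gene identification task mentioned above, the following minimal sketch
scans the three forward reading frames of a nucleotide sequence for open reading frames (ORFs). It
is a simplification under stated assumptions: only ATG start codons and the standard stop codons
are considered, the reverse strand is ignored, and the function name and the min_codons parameter
are hypothetical. Real viral gene finders must also handle overlapping genes, alternative start
codons, and frameshifts.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 30) -> List[Tuple[int, int, int]]:
    """Return (frame, start, end) for forward-strand ORFs.

    start is the index of the ATG, end is exclusive and includes the stop codon.
    min_codons counts the codons from ATG up to, but not including, the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    toy = "CCATGAAATTTGGGTAACC"  # toy sequence, not a real viral genome
    for frame, start, end in find_orfs(toy, min_codons=3):
        print(frame, start, end, toy[start:end])
```

The length threshold is the main design choice here: a low value reports many spurious ORFs, while
a high value can miss the short genes that are common in compact viral genomes.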

2. VIRUS-CENTERED RESOURCES
The biodiversity of viruses, and the multiple scales it spans, poses a challenge for algorithm
and software development (Hufsky et al., 2018). Many new databases and tools have recently become
available to virologists; these are discussed in the following sections.
Baculoviruses and Papillomaviruses
Baculoviruses are a family of viruses that infect insects. They have a large double-stranded DNA
(dsDNA) genome that can accommodate multiple additional foreign genes (Kamita et al., 2010).
www.virology.ca is a database that provides easy access to the genes, gene families, and genomes of
different virus families, including baculoviruses. Human papillomavirus (HPV) infection is caused
by a papillomavirus that can spread through skin-to-skin contact (Table 1). A database called
Papillomavirus Episteme (PaVE) contains curated papillomavirus genomic sequences and provides
several web-based sequence analysis tools (Van Doorslaer et al., 2016).
Table 1: Resources of Baculoviruses and Papillomaviruses
Resource | Specific features | URL
Baculoviruses
virology.ca | Database that provides access to viral genomic information | https://4virology.net/
Papillomaviruses
PaVE | Access to papillomavirus sequences and analysis tools | http://pave.niaid.nih.gov
Dengue Virus and West Nile Virus
Dengue virus is the causative agent of a common arthropod-borne viral disease in humans, with 50–100
million infections per year. To date, vaccines have been developed that target all the serotypes of
dengue virus. A set of virus-specific databases is available at NCBI, referred to as the Virus
Variation Resources (VVR) (Table 2). It is an integrated resource for dengue virus as well as West Nile
virus, where users can build complex queries and then apply various analysis tools to the results
(Resch et al., 2009). DengueNet is the World Health Organization's central data management
system for the global epidemiological surveillance of dengue fever (DF) and dengue haemorrhagic fever
(DHF) (Lawrence, 2002).
Table 2: Resources of Dengue Virus and West Nile Virus
Resource | Specific features | URL
Dengue Virus and West Nile Virus
NCBI-VVR | Set of virus-specific databases | http://www.ncbi.nlm.nih.gov/genomes/VirusVariation/
Dengue Virus
DengueNet | Data management system for surveillance of dengue fever | -

Influenza Virus
Human influenza virus is distributed worldwide. Influenza was brought to the forefront of the
world’s attention by the recent emergence of highly pathogenic avian influenza virus (AIV; H5N1),
which resulted in the death of more than 100 people and the slaughter of millions of poultry in Asia,
Europe and Africa (World Health Organization, http://www.who.int). To date, 10 web portals and tools
have been developed solely for influenza virus (Table 3).
The Influenza Virus Database (IVDB) was the first information resource to be developed. It contains
both the Beijing Institute of Genomics (BIG) data and published influenza virus (IV) sequences after
expert curation to ensure a high standard of accuracy and completeness. IVDB currently contains
43,875 influenza virus nucleotide sequences, 53,983 CDS sequences and 53,983 protein sequences. The
two main features of IVDB are (i) the Sequence Distribution Tool, which facilitates analysis of IV
global transmission and evolution, and (ii) the IV Sequence Quality Filter System (Q-filter), which
classifies and ranks the nucleotide sequences into 7 categories according to sequence content and
integrity (Chang et al., 2006).

Table 3: Resources of Influenza Virus
Resource | Specific features | URL
NCBI-IVR | Integrated information resource and analysis platform for genetic, genomic, and phylogenetic studies of influenza virus | http://influenza.psych.ac.cn/
FluTE | Influenza epidemic simulation tool | https://www.cs.unm.edu/~dlchao/flute/
IRD | Provides various visualization and analysis tools for comparative genomics | https://www.fludb.org/brc/home.spg?decorator=influenza
FluGenome | Web portal for genotyping influenza A virus | https://omictools.com/flugenome-tool
IVDB | An integrated information resource and analysis platform for influenza virus research | http://influenza.big.ac.cn/
EpiFlu | Provides access to influenza virus sequences and related clinical and epidemiological data associated with human viruses | http://platform.gisaid.org
GiRaF | Identifies influenza virus reassortments | http://kingsfordlab.cbd.cmu.edu/
OpenFluDB | Contains genomic and protein sequences as well as epidemiological data from more than 25'000 isolates | http://openflu.vital-it.ch/browse.php
ISED | Establishes influenza genomic sequences and compares the user’s sequences with those of vaccine strains | http://influenza.cdc.go.kr
ATIVS | Analysis tools for influenza virus surveillance | http://influenza.nhri.org.tw/ATIVS/

NCBI-IVR is the most cited resource and provides tools for genome annotation of influenza virus,
such as FLAN (FLu ANnotation), for user-provided influenza A virus or influenza B virus sequences
(Bao et al., 2007). IRD is a publicly accessible resource that integrates genomic, proteomic, immune
epitope, and surveillance data from a variety of sources, including public databases, computational
algorithms and scientific literature (Squires et al., 2012). In addition, FluGenome is a tool
developed for genotyping influenza A virus that also identifies reassortment events between
divergent lineages (Lu et al., 2007).
ISED (Influenza Sequence and Epitope Database) was established to analyse drug resistance mutations
in users' sequences and to provide information about epitopes. It also allows users to
visualize epitope-matching structures (Yang et al., 2008). ATIVS is a web server that carries out
both antigenic and genetic analyses of influenza isolates for influenza surveillance (Liao et al., 2009).
The OpenFluDB database contains genomic and protein sequences, as well as epidemiological data
from more than 27 000 isolates. It includes information such as virus type, host, geographical location
and experimentally tested antiviral resistance (Liechti et al., 2010). A web interface known as the
Influenza Primer Design Resource (IPDR) provides several important tools that aid in the
development of oligonucleotides that may be used to develop better diagnostics (Bose et al., 2008).
FluTE is a publicly available influenza epidemic simulation model that supports more realistic
intervention strategies and can run on a personal computer (Chao et al., 2010).
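To give a flavour of what a stochastic epidemic simulation with an intervention looks like, the
sketch below implements a very small chain-binomial susceptible-infectious-removed model in which a
number of people are vaccinated before the outbreak. This is not FluTE's model, which is far more
detailed; the function name, all parameters, and the assumption that vaccination confers complete
protection are hypothetical and chosen only for illustration.

```python
import random
from typing import List, Tuple


def simulate_epidemic(population: int, initial_infected: int, vaccinated: int,
                      beta: float, p_recover: float, days: int,
                      seed: int = 0) -> List[Tuple[int, int, int]]:
    """Return daily (susceptible, infectious, removed) counts.

    beta is the expected number of infectious contacts made by one infectious
    person per day; p_recover is the daily probability of recovery.
    """
    if initial_infected + vaccinated > population:
        raise ValueError("initial_infected + vaccinated exceeds population")
    rng = random.Random(seed)
    s = population - initial_infected - vaccinated
    i = initial_infected
    r = vaccinated  # assumption: vaccinated people are fully protected
    history = [(s, i, r)]
    for _ in range(days):
        # Probability that one susceptible person escapes all i infectious people today
        p_escape = (1.0 - beta / population) ** i
        new_infections = sum(rng.random() > p_escape for _ in range(s))
        new_recoveries = sum(rng.random() < p_recover for _ in range(i))
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    for coverage in (0, 300, 600):
        final_s, final_i, final_r = simulate_epidemic(
            1000, 5, coverage, beta=0.4, p_recover=0.2, days=120, seed=1)[-1]
        # People ever infected = removed at the end minus those vaccinated, plus still infectious
        print(coverage, final_r - coverage + final_i)
```

Running the same seed with different vaccination coverage gives a rough feel for how an intervention
changes the outbreak size, although a single stochastic run should not be over-interpreted.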
HIV and Human T-lymphotropic virus
Tremendous efforts are under way to curb the spread of human immunodeficiency virus (HIV), and many
resources have been developed for this purpose (Table 4). The LANL HIV database stores information
such as genetic sequences and drug-resistance-associated mutations, and offers several tools:
(1) Geography Search Interface: retrieves HIV sequences based on their geographical distribution.
(2) HIValign: uses the HMM alignment models already available in the database to align the query
sequences. (3) Sequence Locator: finds the position of an HIV or SIV nucleotide or protein sequence.
(4) Find Model: identifies the evolutionary model that best fits the query sequences (Kuiken et al.,
2003; Shaw et al., 2013).
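Choosing the evolutionary model that best fits a set of sequences is commonly framed as a model
selection problem. One widely used criterion is the Akaike information criterion,
AIC = 2k - 2 ln L, where k is the number of free parameters and ln L is the maximized
log-likelihood; the model with the lowest AIC is preferred. The sketch below assumes that
log-likelihoods have already been computed by some phylogenetics program. The log-likelihood values
are invented for illustration, and whether Find Model uses exactly this criterion is not stated in
this review.

```python
from typing import Dict, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


def best_model(fits: Dict[str, Tuple[float, int]]) -> str:
    """Return the name of the model with the lowest AIC.

    fits maps a model name to (maximized log-likelihood, free parameter count).
    """
    return min(fits, key=lambda name: aic(*fits[name]))


if __name__ == "__main__":
    # Parameter counts cover substitution-model parameters only; branch lengths
    # are shared by all models on the same tree and are therefore left out.
    # The log-likelihoods are hypothetical values, not real results.
    fits = {
        "JC69": (-1250.0, 0),
        "K80": (-1235.0, 1),
        "HKY85": (-1228.0, 4),
        "GTR": (-1226.5, 8),
    }
    for name, (lnl, k) in fits.items():
        print(name, aic(lnl, k))
    print("preferred:", best_model(fits))
```

The penalty term 2k is what keeps the most parameter-rich model from winning automatically, since
adding parameters can never lower the maximized likelihood.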
Some viruses mutate at high rates and rapidly develop resistance to existing antiviral drugs. As a
result, HIV drug resistance is increasing worldwide. An important resource, the Stanford HIV Drug
Resistance Database (HIVDB), was developed to enable HIV care providers to interpret HIV drug
resistance tests and choose the most appropriate treatment for their patients (Shaw et al., 2013).
A similar interpretation system, the EuResist Network, has been developed to explore several machine
learning techniques for building a treatment response prediction engine (Zazzi et al., 2012).
PIRSpred is a web server that predicts protein-inhibitor resistance as well as susceptibility for
HIV-1 (Jenwitheesuk et al., 2005). HIV Therapy Simulator (HIVSIM) is software that contains
computer simulation models useful for exploring the efficacy of HIV therapy regimens (Lim et al.,
2011).
Table 4: Resources of Human Immunodeficiency Virus
Resource | Specific features | URL
LANL HIV database | Stores HIV sequences, drug-resistance mutations and several other tools | http://hiv.lanl.gov
Stanford HIV drug resistance DB | Predicts drug-resistance mutations | http://hivdb.stanford.edu
EuResist Network | Treatment response prediction engine | https://www.euresist.org/
PIRSpred | Predicts protein-inhibitor resistance for HIV-1 | http://protinfo.compbio.washington.edu/pirspred/
SQUAT | Quality assessment tool | http://www.stat.brown.edu/CFAR/SQUAT
SCUEAL | Phylogenetic method for automatically subtyping an HIV-1 sequence | http://www.datamonkey.org/dataupload_scueal.php
bNAber | Stores detailed information about HIV bNAbs and provides visualization tools | http://bnaber.org
HIVCD | Tool for contamination detection | http://sourceforge.net/projects/hivcd/
vFitness | Tool developed to understand viral fitness | http://bis.urmc.rochester.edu/vFitness/
HIV Systems Biology | Protein interaction, HIV replication cycle site | http://hivsystemsbiology.org
HIVSIM | Explores the efficacy of HIV therapy treatment | https://sites.google.com/site/hivsimulator/
HIV positive selection mutation DB | Provides selection pressure maps of PR/RT | http://fold.doe-mbi.ucla.edu/HIV/

The web server SCUEAL is a model-based phylogenetic method for automatically subtyping an HIV-1
sequence, assigning parental sequences in recombinant strains, and computing confidence levels
for the inferred quantities (Pond et al., 2009). The Sequence Quality Analysis Tool (SQUAT), which
runs in the R statistical environment, was created for quality assessment prior to sequence analysis
(DeLong et al., 2012). The bNAber database provides access to detailed data on the rapidly growing
list of HIV bNAbs, including sequences and three-dimensional structures (Eroshkin et al., 2014).
HIV Contamination Detection (HIVCD) is an open-source tool used to make pairwise
comparisons of HIV-1 pol gene sequences from patients across the United States, and it
contributes to quality testing (Ebbert et al., 2013). The accurate estimation of viral fitness
depends on complicated statistical methods. This led to the development of vFitness, a web-based
computing tool for improving estimation of in vitro HIV-1 fitness experiments (Ma et
al., 2010). HIV Systems Biology is a website that collects big data on HIV and hosts tools
such as Gene Overlapper, HIV Replication Cycle site, GPS-Prot and AIDSVu (Bushman et al.,
2013).
Human T-cell lymphotropic virus type 1 (HTLV-1) is a human retrovirus that
causes a type of cancer called adult T-cell leukemia/lymphoma. The HTLV-1 Molecular
Epidemiology Database stores annotated HTLV-1 sequences from clinical, epidemiological,
and geographical studies (Araujo et al., 2012).
Hepatitis virus
Hepatitis is mainly caused by five unrelated hepatotropic viruses: hepatitis A virus
(HAV), hepatitis B virus (HBV), hepatitis C virus (HCV), hepatitis D virus (HDV), and
hepatitis E virus (HEV). HepSEQ is a public repository that contains data related to hepatitis
B virus (HBV) infection collected from international sources (Table 5). There are four major
sections in the web interface of HepSEQ. The first section shows a summary of the data in the
repository. The second section allows the user to access and submit data. The third section
generates pie and bar charts based on the relation between different factors. In the fourth section,
three tools are available for sequence analysis: Sequence Matcher, Genotyper and
Mutation Marker (Gnaneshan et al., 2006).
Table 5: Resources of Hepatitis B virus (HBV) and hepatitis C virus (HCV)
Resource | Specific features | URL
HBV
HepSEQ | Data repository of hepatitis B | http://www.hepseq.org/
HBVdb | Contains various sequences and analysis tools | http://hbvdb.ibcp.fr
SeqHepB | To determine resistance-associated mutations | http://www.seqhepb.com
HBVRegDB | Comparison and detection of regulatory elements in hepatitis B | http://lancelot.otago.ac.nz
HCV
euHCVdb | Resource of sequences, structures and tools | http://euhcvdb.ibcp.fr

HBVdb facilitates investigation of the genetic variability of hepatitis B virus (HBV) and
allows users to annotate their own sequences (Hayer et al., 2012). SeqHepB is both a sequence
analysis program and a database that contains data from multiple sources (Yuen et al.,
2007). HBVRegDB is a tool that enables annotation, comparison, detection and visualization
of regulatory elements in hepatitis B virus sequences (Panjaworayan et al., 2007). The
European hepatitis C virus database (euHCVdb) is a library of computer-annotated sequences
based on the reference genome (Combet et al., 2006).
ViPR
The Virus Pathogen Database and Analysis Resource (ViPR, www.viprbrc.org) contains various
records, including gene and protein annotations, immune epitopes, 3D structures, and host factor
data. ViPR provides three main functions. First, it stores data from external as well as internal
sources and groups these data into two main categories of virus families: those containing human
priority pathogens and those posing possible public health threats. Second, users can perform
analyses with the data analysis and visualization tools provided by ViPR. Third, the ViPR workbench
allows results to be stored and retrieved when necessary (Araujo et al., 2012).
Currently, ViPR contains a wide range of information on several human-pathogenic viruses
belonging to the Arenaviridae, Bunyaviridae, Caliciviridae, Coronaviridae, Filoviridae, Flaviviridae,
Hepeviridae, Herpesviridae, Paramyxoviridae, Picornaviridae, Poxviridae, Reoviridae,
Rhabdoviridae, and Togaviridae families.
ViPR stores data from three different types of sources: (i) data from the public archives
GenBank, UniProt, Protein Data Bank (PDB, http://www.rcsb.org/pdb), Immune Epitope Database
and PubMed; (ii) novel derived data that ViPR produces with the help of various automated
bioinformatics and comparative genomics algorithms; and (iii) data submitted directly to ViPR from
experiments and other independent institutions.

3. VIRUS-SPECIFIC TOOLS
Identity distribution and genotype sequencing are crucial for studying viral genomes. The
identity distribution is plotted as a histogram in which each bar represents an interval
of identities. A number of tools have been developed for the analysis of viral sequences; these are
listed below.
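The identity distribution described above can be sketched as follows. The example assumes that the
sequences are already aligned to equal length, that identity is computed over ungapped columns
only, and that the bins are equal-width intervals of percent identity. The function names and the
toy sequences are hypothetical and are not taken from any of the tools in this review.

```python
from itertools import combinations
from typing import List, Sequence


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two aligned sequences, ignoring columns with a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_histogram(seqs: Sequence[str], bin_width: int = 10) -> List[int]:
    """Count pairwise identities per interval [0, w), [w, 2w), ..., up to 100."""
    if bin_width <= 0 or 100 % bin_width != 0:
        raise ValueError("bin_width must be a positive divisor of 100")
    n_bins = 100 // bin_width
    counts = [0] * n_bins
    for a, b in combinations(seqs, 2):
        pid = percent_identity(a, b)
        # 100% identity is placed in the last interval
        counts[min(int(pid // bin_width), n_bins - 1)] += 1
    return counts


if __name__ == "__main__":
    aligned = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGAAC", "TTTTTTTTTT"]
    for k, count in enumerate(identity_histogram(aligned)):
        print(f"{k * 10:3d}-{k * 10 + 10:3d}% | " + "#" * count)
```

Each printed row corresponds to one bar of the histogram, so a cluster of closely related isolates
shows up as a tall bar near 100%.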

De novo assembly tools for viral genome
Velvet, ABySS, and Geneious are tools available for whole-genome assembly. Velvet is a set
of algorithms that leverage short reads to produce useful assemblies based on de Bruijn graphs
(Zerbino et al., 2008). ABySS (Assembly By Short Sequences) is a parallelized sequence assembler
that increases the amount of memory available to the assembly process (Simpson et al., 2009).
Geneious Basic is a software platform for the analysis and visualization of biological data (Kearse
et al., 2012). Because of repetitive UTR regions, these tools cannot be used for complete viral
genomes. As a result, alternative algorithms such as SPAdes (Bankevich et al., 2012) and IDBA-UD
(Peng et al., 2012), which were developed for single-cell assemblies, are used instead. VICUNA is
another algorithm, designed to generate assemblies from heterogeneous populations (Yang et al.,
2012). In addition, SOAPdenovo-Trans is not virus-specific but works efficiently for
memory-efficient short-read de novo assembly (Luo et al., 2012).
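The de Bruijn graph idea behind assemblers such as Velvet can be illustrated with a toy example:
every k-mer in the reads becomes an edge from its (k-1)-base prefix to its (k-1)-base suffix, and
contigs are read off along unambiguous paths. The sketch below is a simplification under stated
assumptions: the reads are error-free, only one strand is considered, and coverage is ignored. The
function names are hypothetical and do not correspond to the code of any assembler named above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def build_de_bruijn(reads: Iterable[str], k: int) -> Dict[str, List[str]]:
    """Map each (k-1)-mer to the (k-1)-mers that follow it in some distinct k-mer."""
    graph = defaultdict(list)
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer in seen:
                continue  # each distinct k-mer contributes a single edge
            seen.add(kmer)
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


def find_start_nodes(graph: Dict[str, List[str]]) -> List[str]:
    """Nodes with outgoing edges but no incoming edge."""
    targets = {t for successors in graph.values() for t in successors}
    return sorted(node for node in graph if node not in targets)


def walk_unambiguous_path(graph: Dict[str, List[str]], start: str) -> str:
    """Extend a contig from start while the current node has exactly one successor."""
    contig, node, visited = start, start, {start}
    while len(graph.get(node, [])) == 1:
        nxt = graph[node][0]
        if nxt in visited:
            break  # stop instead of looping around a repeat
        contig += nxt[-1]
        visited.add(nxt)
        node = nxt
    return contig


if __name__ == "__main__":
    reads = ["ATGGCG", "GGCGTG", "CGTGCA"]  # toy overlapping reads
    graph = build_de_bruijn(reads, k=4)
    for start in find_start_nodes(graph):
        print(walk_unambiguous_path(graph, start))
```

A repeated region, such as the repetitive UTRs mentioned above, makes some node branch or be
revisited, which is where this simple walk stops and where real assemblers need additional
information such as coverage or read pairs.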

Secondary structure prediction tools
To understand the regulatory functions of a virus, it is necessary to predict the secondary
structure of its nucleic acids. Advances in sequencing technology have led to the development of
computational programs and tools such as Mfold, RNAfold and LocARNA that predict these secondary
structures. Mfold predicts RNA and DNA folding as well as hybridization (Zuker, 2003). RNAfold is
used to predict the minimum free energy (MFE) secondary structure and also calculates the
equilibrium base-pairing probabilities (Gruber et al., 2008). LocARNA (Local Alignment of RNA) is a
tool that can deal with pseudoknot-free RNA secondary structures (Will et al., 2007).
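The dynamic programming principle behind pseudoknot-free structure prediction can be illustrated
with the classic Nussinov base-pair maximization algorithm. Note that this is a deliberately
simpler stand-in: Mfold and RNAfold minimize free energy with a thermodynamic model, whereas the
sketch below only maximizes the number of nested base pairs. The minimum hairpin loop length of 3
and the inclusion of G-U wobble pairs are assumptions made for illustration.

```python
from typing import List, Tuple

# Watson-Crick pairs plus the G-U wobble pair
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov(seq: str, min_loop: int = 3) -> Tuple[int, str]:
    """Return (maximum number of nested pairs, one optimal dot-bracket structure)."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0, ""
    # dp[i][j] = maximum number of pairs within seq[i..j]
    dp: List[List[int]] = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i + 1][j]  # case 1: base i stays unpaired
            for k in range(i + min_loop + 1, j + 1):  # case 2: base i pairs with k
                if (seq[i], seq[k]) in PAIRS:
                    right = dp[k + 1][j] if k < j else 0
                    best = max(best, 1 + dp[i + 1][k - 1] + right)
            dp[i][j] = best
    # Traceback to recover one structure that achieves dp[0][n - 1]
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if dp[i][j] == dp[i + 1][j]:
            stack.append((i + 1, j))
            continue
        for k in range(i + min_loop + 1, j + 1):
            if (seq[i], seq[k]) in PAIRS:
                right = dp[k + 1][j] if k < j else 0
                if dp[i][j] == 1 + dp[i + 1][k - 1] + right:
                    structure[i], structure[k] = "(", ")"
                    stack.append((i + 1, k - 1))
                    if k < j:
                        stack.append((k + 1, j))
                    break
    return dp[0][n - 1], "".join(structure)


if __name__ == "__main__":
    pairs, structure = nussinov("GGGAAAUCCC")  # toy hairpin, not a real viral element
    print(pairs)
    print(structure)
```

Because the recursion only ever combines nested intervals, crossing pairs (pseudoknots) can never
be produced, which is the same structural restriction noted above for LocARNA.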
Virus genotyping and Annotation
Genome annotation is an important step in identifying coding and non-coding regions. For
instance, GLUE (Genes Linked by Underlying Evolution) is software for interpreting sequence data
from different viruses, and it can also be used as a storage platform (Singer et al., 2018). It has
a simple interface that can be used in bioinformatics pipelines. In addition, ATHLATES is software
that determines HLA genotypes from Illumina exome sequencing. PriSM selects and matches primers for
viral genome amplification (Yu et al., 2010).
4. CONCLUSION
This review covers the majority of the databases and tools that contain significant information on
several viruses. This will allow virologists to select the best tool for specific experiments.
Bioinformatics tools facilitate the comparison of genetic and genomic data, which in turn helps in
understanding the evolutionary relationships between various species.
5. REFERENCES

Abouelwafa Manal & Georrge John J. (2017). Ebola virus and its potential drug targets. Paper
presented at the Proceedings of International Science Symposium on Recent Trends in Science
and Technology (ISBN: 9788193347553).
Araujo Thessika Hialla Almeida, Souza-Brito Leandro Inacio, Libin Pieter, Deforche Koen, Edwards
Dustin, de Albuquerque-Junior Antonio Eduardo, . . . Alcantara Luiz Carlos Junior. (2012). A
public HTLV-1 molecular epidemiology database for sequence management and data mining.
PloS one, 7(9), e42123.
Bankevich Anton, Nurk Sergey, Antipov Dmitry, Gurevich Alexey A, Dvorkin Mikhail, Kulikov
Alexander S, . . . Prjibelski Andrey D. (2012). SPAdes: a new genome assembly algorithm and
its applications to single-cell sequencing. Journal of computational biology, 19(5), 455-477.
Bose Michael E, Littrell John C, Patzer Andrew D, Kraft Andrea J, Metallo Jacob A, Fan Jiang &
Henrickson Kelly J. (2008). The Influenza Primer Design Resource: a new tool for translating
influenza sequence data into effective diagnostics. Influenza and other respiratory viruses, 2(1),
23-31.
Bushman Frederic D, Barton Spencer, Bailey Aubrey, Greig Caitlin, Malani Nirav, Bandyopadhyay
Sourav, . . . Krogan Nevan. (2013). Bringing it all together: big data and HIV research. AIDS
(London, England), 27(5), 835.
Chang Suhua, Zhang Jiajie, Liao Xiaoyun, Zhu Xinxing, Wang Dahai, Zhu Jiang, . . . Wang Jian.
(2006). Influenza Virus Database (IVDB): an integrated information resource and analysis
platform for influenza virus research. Nucleic acids research, 35(suppl_1), D376-D380.
Chao Dennis L, Halloran M Elizabeth, Obenchain Valerie J & Longini Jr Ira M. (2010). FluTE, a
publicly available stochastic influenza epidemic simulation model. PLoS computational
biology, 6(1), e1000656.
Combet Christophe, Garnier Nicolas, Charavay Celine, Grando Delphine, Crisan Daniel, Lopez Julien,
. . . Hulo Chantal. (2006). euHCVdb: the European hepatitis C virus database. Nucleic acids
research, 35(suppl_1), D363-D366.
DeLong Allison K, Wu Mingham, Bennett Diane, Parkin Neil, Wu Zhijin, Hogan Joseph W & Kantor
Rami. (2012). Sequence quality analysis tool for HIV type 1 protease and reverse transcriptase.
AIDS research and human retroviruses, 28(8), 894-901.
Ebbert Mark TW, Mallory Melanie A, Wilson Andrew R, Dooley Shane K & Hillyard David R. (2013).
Application of a new informatics tool for contamination screening in the HIV sequencing
laboratory. Journal of Clinical Virology, 57(3), 249-253.
Eroshkin Alexey M, LeBlanc Andrew, Weekes Dana, Post Kai, Li Zhanwen, Rajput Akhil, . . . Godzik
Adam. (2014). bNAber: database of broadly neutralizing HIV antibodies. Nucleic acids
research, 42(D1), D1133-D1139.
George Rija & Georrge John J. (2018). Statistical analysis of industrially important thermophilic
organisms producing alpha-amylase, DNA polymerase and protease. Paper presented at the
Proceedings of 10th National Science Symposium on Recent Trends in Science and Technology
(ISBN: 9788192952130).
George Rija, Thomas Sneha, Jacob Sarah & Georrge John J. (2017). Approaches for novel drug target
identification. Paper presented at the Proceedings of International Science Symposium on
Recent Trends in Science and Technology (ISBN: 9788193347553).

Georrge John J & Umrania Valentina. (2011). In silico identification of putative drug targets in
Klebsiella pneumonia MGH78578.
Georrge John J & Umrania VV. (2012). Subtractive genomics approach to identify putative drug targets
and identification of drug-like molecules for beta subunit of DNA polymerase III in
Streptococcus species. Applied Biochemistry and Biotechnology, 167(5), 1377-1395.
Georrge John J. (2016). A Bioinformatics Approach for the Identification of Potential Drug Targets and
Identification of Drug-like Molecules for Ribosomal Protein L6 of Staphylococcus species.
Paper presented at the Proceedings of 9th National Level Science Symposium on Recent Trends
in Science and Technology (ISBN: 9788192952123).
Gnaneshan Saravanamuttu, Ijaz Samreen, Moran Joanne, Ramsay Mary & Green Jonathan. (2006).
HepSEQ: international public health repository for hepatitis B. Nucleic acids research,
35(suppl_1), D367-D370.
Gruber Andreas R, Lorenz Ronny, Bernhart Stephan H, Neuböck Richard & Hofacker Ivo L. (2008).
The vienna RNA websuite. Nucleic acids research, 36(suppl_2), W70-W74.
Hayer Juliette, Jadeau Fanny, Deleage Gilbert, Kay Alan, Zoulim Fabien & Combet Christophe. (2012).
HBVdb: a knowledge database for Hepatitis B Virus. Nucleic acids research, 41(D1), D566-
D570.
Hufsky Franziska, Ibrahim Bashar, Beer Martin, Deng Li, Le Mercier Philippe, McMahon Dino P, . . .
Marz Manja. (2018). Virologists—Heroes need weapons. PLoS pathogens, 14(2), e1006771.
Jenwitheesuk Ekachai, Wang Kai, Mittler John E & Samudrala Ram. (2005). PIRSpred: a web server
for reliable HIV-1 protein-inhibitor resistance/susceptibility prediction. Trends in
microbiology, 13(4), 150-151.
Kamita SG, Kang KD, Hammock BD & Inceoglu AB. (2010). 10 Genetically Modified Baculoviruses
for Pest Insect Control. INSECT CONTROL.
Kearse Matthew, Moir Richard, Wilson Amy, Stones-Havas Steven, Cheung Matthew, Sturrock Shane,
. . . Duran Chris. (2012). Geneious Basic: an integrated and extendable desktop software
platform for the organization and analysis of sequence data. Bioinformatics, 28(12), 1647-1649.
Kotadiya Rohitkumar & Georrge John J. (2015). In silico approach to identify putative drugs from
natural products for Human papillomavirus (HPV) which cause cervical cancer. Life Sciences
Leaflets, 62, 1-13.
Kuiken Carla, Korber Bette & Shafer Robert W. (2003). HIV sequence databases. AIDS reviews, 5(1),
52.
Lawrence J. (2002). DengueNet–WHO’s internet based system for the global surveillance of dengue
fever and dengue haemorrhagic fever. Weekly releases (1997–2007), 6(39), 1883.
Liao Yu-Chieh, Ko Chin-Yu, Tsai Ming-Hsin, Lee Min-Shi & Hsiung Chao A. (2009). ATIVS:
analytical tool for influenza virus surveillance. Nucleic acids research, 37(suppl_2), W643-
W646.
Liechti Robin, Gleizes Anne, Kuznetsov Dmitry, Bougueleret Lydie, Le Mercier Philippe, Bairoch
Amos & Xenarios Ioannis. (2010). OpenFluDB, a database for human and animal influenza
virus. Database, 2010.
Lijo John, Georrge John J. & Trupti Kholia. (2012). A Reverse Vaccinology Approach for the
Identification of Potential Vaccine Candidates from Leishmania spp. Applied Biochemistry and
Biotechnology, 167(5), 1340-1350.
Lim Huat Chye, Curlin Marcel E & Mittler John E. (2011). HIV Therapy Simulator: a graphical user
interface for comparing the effectiveness of novel therapy regimens. Bioinformatics, 27(21),
3065-3066.
Luo Ruibang, Liu Binghang, Xie Yinlong, Li Zhenyu, Huang Weihua, Yuan Jianying, . . . Liu Yunjie.
(2012). SOAPdenovo2: an empirically improved memory-efficient short-read de novo
assembler. Gigascience, 1(1), 18.
Ma Jingming, Dykes Carrie, Wu Tao, Huang Yangxin, Demeter Lisa & Wu Hulin. (2010). vFitness: a
web-based computing tool for improving estimation of in vitro HIV-1 fitness experiments.
BMC bioinformatics, 11(1), 261.
Nishita Vaishnav, Aparna Gupta, Sneha Paul & Georrge John J. (2015). Overview of computational
vaccinology: vaccine development through information technology. Journal of Applied
Genetics, 56(3), 381-391.

Nishita Vaishnav, Suvagiya Pratiksha & Georrge John J. (2016). Modeling mutations, docking, primer
and probe designing of Cytochrome P450 2D6, a drug metabolizing enzyme. Paper presented
at the Proceedings of 9th National Level Science Symposium on Recent Trends in Science and
Technology (ISBN: 9788192952123).
Panjaworayan Nattanan, Roessner Stephan K, Firth Andrew E & Brown Chris M. (2007). HBVRegDB:
annotation, comparison, detection and visualization of regulatory elements in hepatitis B virus
sequences. Virology journal, 4(1), 136.
Peng Yu, Leung Henry CM, Yiu Siu-Ming & Chin Francis YL. (2012). IDBA-UD: a de novo assembler
for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics,
28(11), 1420-1428.
Pond Sergei L Kosakovsky, Posada David, Stawiski Eric, Chappey Colombe, Poon Art FY, Hughes
Gareth, . . . Frost Simon DW. (2009). An evolutionary model-based algorithm for accurate
phylogenetic breakpoint mapping and subtype prediction in HIV-1. PLoS computational
biology, 5(11), e1000581.
Ranipa Avani, Shrilal Anju, Nimavat Akash, Rank Jalpa, Kothari Ramesh & Georrge John J. (2018).
Aspergillus flavus-A menace to farmers. Paper presented at the Proceedings of 10th National
Science Symposium on Recent Trends in Science and Technology (ISBN: 9788192952130).
Resch Wolfgang, Zaslavsky Leonid, Kiryutin Boris, Rozanov Michael, Bao Yiming & Tatusova
Tatiana A. (2009). Virus variation resources at the National Center for Biotechnology
Information: dengue virus. BMC microbiology, 9(1), 65.
Sakina S. Vakhariya & Georrge John J. (2016). Curcumin: A multi-tasking molecule. Paper presented
at the Proceedings of 9th National Level Science Symposium on Recent Trends in Science and
Technology (ISBN: 9788192952123).
Sharma Arun, Dutta Prasun, Sharma Maneesh, Rajput Neeraj Kumar, Dodiya Bhavna, Georrge John J,
. . . Bhardwaj Anshu. (2014). BioPhytMol: a drug discovery community resource on anti-
mycobacterial phytomolecules and plant extracts. Journal of cheminformatics, 6(1), 46.
Shaw Timothy I & Zhang Ming. (2013). HIV N-linked glycosylation site analyzer and its further usage
in anchored alignment. Nucleic acids research, 41(W1), W454-W458.
Simpson Jared T, Wong Kim, Jackman Shaun D, Schein Jacqueline E, Jones Steven JM & Birol Inanç.
(2009). ABySS: a parallel assembler for short read sequence data. Genome research, 19(6),
1117-1123.
Singer Joshua B, Thomson Emma C, McLauchlan John, Hughes Joseph & Gifford Robert J. (2018).
GLUE: A flexible software system for virus sequence data. BMC bioinformatics, 19(1), 532.
Squires R Burke, Noronha Jyothi, Hunt Victoria, García‐Sastre Adolfo, Macken Catherine, Baumgarth
Nicole, . . . Larsen Christopher N. (2012). Influenza research database: an integrated
bioinformatics resource for influenza research and surveillance. Influenza and other respiratory
viruses, 6(6), 404-416.
Trivedi Gauravi & Georrge John J. (2016a). Bacteriocin producing bacteria from gut of Apis mellifera.
Paper presented at the Proceedings of 9th National Level Science Symposium on Recent Trends
in Science and Technology (ISBN: 9788192952123).
Trivedi Gauravi & Georrge John J. (2016b). Identification of novel drug targets and its Inhibitors from
essential genes of human pathogenic Gram positive bacteria. Paper presented at the
Proceedings of 9th National Level Science Symposium on Recent Trends in Science and
Technology (ISBN: 9788192952123).
Ukani Hetal, Purohit Megha K, Georrge John J, Paul Sneha & Singh Satya P. (2011). HaloBase:
development of database system for halophilic bacteria and archaea with respect to proteomics,
genomics and other molecular traits. J Sci Ind Res, 70, 976-981.
Vaidya Atman, Nair Varun S., Georrge John J. & Singh S. P. (2018). Comparative Analysis of
Thermophilic Proteases. Research Journal of Life Sciences, Bioinformatics, Pharmaceutical
and Chemical Sciences, 04(06), 65-91. doi:10.26479/2018.0406.06
Van Doorslaer Koenraad, Li Zhiwen, Xirasagar Sandhya, Maes Piet, Kaminsky David, Liou David,
. . . McBride Alison A. (2016). The Papillomavirus Episteme: a major update to the
papillomavirus sequence database. Nucleic acids research, 45(D1), D499-D506.

Will Sebastian, Reiche Kristin, Hofacker Ivo L, Stadler Peter F & Backofen Rolf. (2007). Inferring
noncoding RNA families and classes by means of genome-scale structure-based clustering.
PLoS computational biology, 3(4), e65.
Yang Seok, Lee Joo-Yeon, Lee Joon Seung, Mitchell Wayne P, Oh Hee-Bok, Kang Chun & Kim Kyung
Hyun. (2008). Influenza sequence and epitope database. Nucleic acids research, 37(suppl_1),
D423-D430.
Yang Xiao, Charlebois Patrick, Gnerre Sante, Coole Matthew G, Lennon Niall J, Levin Joshua Z, . . .
Henn Matthew R. (2012). De novo assembly of highly diverse viral populations. BMC
genomics, 13(1), 475.
Yu Qing, Ryan Elizabeth M, Allen Todd M, Birren Bruce W, Henn Matthew R & Lennon Niall J.
(2010). PriSM: a primer selection and matching tool for amplification and sequencing of viral
genomes. Bioinformatics, 27(2), 266-267.
Yuen Lilly KW, Ayres Anna, Littlejohn Margaret, Colledge Danielle, Edgely Andrew, Maskill William
J, . . . Bartholomeusz Angeline. (2007). SeqHepB: a sequence analysis program and relational
database system for chronic hepatitis B. Antiviral research, 75(1), 64-74.
Zazzi Maurizio, Incardona Francesca, Rosen-Zvi Michal, Prosperi Mattia, Lengauer Thomas, Altmann
Andre, . . . Kaiser Rolf. (2012). Predicting response to antiretroviral treatment by machine
learning: the EuResist project. Intervirology, 55(2), 123-127.
Zerbino Daniel R & Birney Ewan. (2008). Velvet: algorithms for de novo short read assembly using de
Bruijn graphs. Genome research, 18(5), 821-829.
Zuker Michael. (2003). Mfold web server for nucleic acid folding and hybridization prediction. Nucleic
acids research, 31(13), 3406-3415.

How to cite this Book Chapter?

APA Style
Angelin George and John J. Georrge (2019). Viroinformatics: Databases and Tools. Proceedings of 11th
National Science Symposium on Recent Trends in Science and Technology (pp. 117-126). ISBN:
9788192952147. Rajkot, Gujarat, India: Christ Publications.
MLA Style
Angelin George and John J. Georrge. “Viroinformatics: Databases and Tools”. Proceedings of 11th National
Science Symposium on Recent Trends in Science and Technology (ISBN: 9788192952147). Rajkot, Gujarat,
India: Christ Publications, 2019. pp. 117-126.
Chicago Style
Angelin George and John J. Georrge. “Viroinformatics: Databases and Tools”. In Proceedings of 11th National
Science Symposium on Recent Trends in Science and Technology (ISBN: 9788192952147), pp. 117-126.
Rajkot, Gujarat, India: Christ Publications, 2019.
