
COMSATS Institute of Information Technology,

Sahiwal

Allosteric pocket identification and structure based virtual screening for the
identification of potential inhibitor against zika protease NS2B-NS3
A research report submitted to Department of Biosciences, in the partial
fulfillment of the requirement for the degree of BS (Bioinformatics)

Submitted to:
Department of Biosciences,
COMSATS, Sahiwal

Submitted by:

Musrrat Fatima
SP14-BBI-006

Supervisor

Dr. Farrukh Jamil

Department of Biosciences

COMSATS Institute of Information Technology, Sahiwal, Pakistan


Fall 2017

Date: ________

FINAL APPROVAL
It is certified that we have read the thesis submitted by Musrrat Fatima and it is our judgement
that this project is of sufficient standard to warrant its acceptance by the COMSATS Institute of
Information Technology, Sahiwal for the B.S. degree in Bioinformatics.

COMMITTEE

External Examiner
Dr. Bashir Ahmad
Assistant Professor
International Islamic University, Islamabad

Supervisor
Dr. Farrukh Jamil
Department of Biosciences
COMSATS Institute of Information Technology, Sahiwal

Co-Supervisor
Muhammad Saad Khan
Department of Biosciences
COMSATS Institute of Information Technology, Sahiwal

Head of Department
Dr. Awais Ihsan
Department of Biosciences
COMSATS Institute of Information Technology, Sahiwal

A thesis submitted to the Department of Biosciences, COMSATS Institute of
Information Technology, Sahiwal, in partial fulfillment of the requirements for the
award of Bachelor of Science in Bioinformatics (BSBI)

Dedication

This thesis is dedicated to my father, Mubarak Ali, and my mother, Perveen Akhtar, for their
endless support, love, encouragement and invaluable trust in me.
I learned from them:

Success does not lie in “Results”
but in “Efforts”.
“Being” the best is not so important;
“Doing” the best is all that matters.

Declaration

I hereby solemnly declare that the work “Allosteric pocket identification and
structure based virtual screening for the identification of potential inhibitor
against zika protease NS2B-NS3” presented in this thesis is my own
effort, except where otherwise acknowledged, and that the research report is my
own composition. No part of the research report has previously been presented for
any other degree.

Dated: __________ Signature of the student


__________________
Musrrat Fatima
Reg No CIIT/SP14-BBI-006/SWL

Acknowledgements
All gratitude is to the most gracious, the most merciful ALLAH Almighty,
who guided and aided me in bringing forth this report, and all respect and reverence
are for the Holy Prophet
Hazrat Muhammad (S.A.W.W.), whose teaching is complete guidance for
humanity.
I would like to express deep gratitude to my supervisor, Dr. Farrukh Jamil,
and my co-supervisor, Muhammad Saad Khan, COMSATS Institute of
Information Technology, for their guidance, consistent encouragement,
constructive criticism and supervision, under which this research was
conducted. They skillfully designed and implemented the whole research project.
It is my privilege and honor to acknowledge the total support and standards of
excellence provided by them.
I would like to offer special thanks to my obliging and encouraging parents. They
have actively supported me in my determination to find and realize my
potential.
I am greatly thankful to Dr. Awais Ihsan, Head of the Department of Biosciences,
COMSATS Institute of Information Technology, Sahiwal, for giving me a great
opportunity to gain experience and knowledge and for providing research
facilities, with his inspiring attitude, during the course of study.
I especially thank my best friends Amina Amin, Saba Kanwal and
Noreen Shahzadi for their love and support.
Musrrat Fatima
CIIT/SP14-BBI-006/SWL

Contents
Summary
1. Introduction
1.1. Zika virus
1.2. Epidemiology
1.3. Zika Genome
1.4. Zika protein
1.5. Life cycle
1.6. Allosteric sites
1.7. Virtual screening
1.8. Statement of Problem
1.9. Virus Selection
1.10. Target protein
2. Material and method
2.1. Allosteric sites analysis
2.2. Protein dataset and preparation
2.3. Virtual Screening
2.4. Compound Drug-like properties
3. Result
3.1. PDB structures Retrieval
3.2. Protein Modifications Chimera
3.3. Protease Allosteric Sites
3.4. AutoDock
3.5. Virtual Screening result analysis
3.6. Docking complex of ligand with protein
Discussion
Conclusion

List of tables:

Table 1: Virtual Screening results
Table 2: Compound structure
Table 3: Ligand interaction with proteins
Table 4: Compound properties
Table 5: Compound interaction

List of Figures:

Figure 1: 3D structure of Zika protease 5GXJ
Figure 2: 5GXJ protein
Figure 3: Superimpose dengue 2FOM and ZIKA 5GXJ
Figure 4: Allosteric sites of proteins
Figure 5: Grid set against the Allosteric Sites of 5GXJ
Figure 6: Docking complex of ligand with protein

Summary:
Zika virus came to the attention of the general public after its 2015 outbreak in Brazil.
Its association with microcephaly and Guillain–Barré syndrome (GBS) has been reported. The current
in silico study uses pharmacoinformatics techniques to identify potential inhibitors of
virus replication. The inhibitors are targeted against the ZIKA protease NS2B-NS3, which
has a well-known role in the virus replication cycle. The protease cleaves the viral
polyprotein to form the viral structural and nonstructural proteins that are required
for virus replication. The target protein structure was available in the PDB. Allosteric sites were
determined by structural superimposition with the dengue protease. The allosteric sites were used for
structure-based virtual screening with Mcule, a drug discovery platform. After screening
100,000 (one lakh) compounds, 14 compounds were selected that pass drug-likeness filters:
they follow the Ro5 parameters and are nontoxic. These compounds form hydrogen bonds with
allosteric-site residues and show low (favourable) binding energies.

Chapter 1
Introduction

1. Introduction
1.1. Zika virus

ZIKV is a member of the mosquito-borne flavivirus genus, which contains important human
pathogens such as dengue virus (DENV), West Nile virus (WNV), Japanese encephalitis virus
(JEV), and yellow fever virus. ZIKV came to the attention of the general public in 2015,
when a large outbreak occurred in Brazil and rapidly spread to other countries in the region
(Ferguson et al., 2016). Local transmission was reported in an additional 52 countries and
territories, mainly in the Americas and the western Pacific, but also in Africa and South East
Asia. Aedes aegypti mosquitoes are the principal vectors, though other mosquito species may
contribute to transmission (Colman et al., 2009). ZIKV was first isolated in 1947 from the serum
of a febrile sentinel monkey in the Zika Forest, hence its name, and 1 year later from Aedes
africanus mosquitoes caught in the same forest (Dick, Kitchen, & Haddow, 1952). As in any other
flavivirus, the viral genome is composed of a single-stranded RNA molecule of positive polarity
about 10 kb in length that encodes a single open reading frame (ORF) flanked by two
untranslated regions at both ends. The polyprotein encoded by the single ORF in ZIKV is
cleaved by cellular and viral proteases into 10 mature proteins (three structural and seven
non-structural proteins) (Kuno & Chang, 2007).

1.2. Epidemiology

ZIKV is a flavivirus first discovered in 1947 in the Zika Forest of Uganda. In April 2007,
ZIKV spread beyond its usual geographic range and was detected outside Africa and Asia for the first
time, as an emerging pathogen, when an outbreak occurred on Yap Island in the South Western
Pacific Ocean (Hayes, 2009). Evidence pointing to possible sexual transmission was found during a
ZIKV outbreak in French Polynesia (2013): the virus was isolated from the semen of a patient in
Tahiti who sought treatment for hematospermia (Musso et al., 2015). In 2016, ZIKV spread rapidly
throughout the Americas after its initial appearance in Brazil in May 2015. In 2016, 48 countries and
territories in the Americas had reported more than 532,000 suspected cases of Zika, including 175,063
confirmed cases. In addition, 22 countries and territories reported 2,439 cases of a congenital
syndrome associated with Zika. Five countries had reported sexually transmitted Zika cases
(Ikejezie et al., 2017). ZIKV infection had long been perceived as a mild illness with fever, rash,
arthralgia, and conjunctivitis, and it may be misdiagnosed as DENV infection, which causes similar
symptoms (Tan, Sam, Chong, Lee, & Chan, 2017). However, with the increased incidence due to
the current outbreak of ZIKV in Brazil and throughout Latin America, new data suggest a
positive correlation between cases of infection and the rise of microcephaly, characterized by
abnormally small brains (Driggers et al., 2016). ZIKV infection can also cause neurological
disorders such as Guillain–Barré syndrome and acute myelitis (Nowakowski et al., 2016).

1.3. Zika Genome

The viral genome is composed of a single-stranded RNA molecule of positive polarity about 10
kb in length (Kuno & Chang, 2007). Like cellular mRNAs, the ZIKV genome
includes a cap structure at its 5′ end, but in contrast to cellular mRNAs, it lacks a 3′
poly(A) tract and ends with CU-OH.

The genome contains a single ORF flanked by two untranslated regions located at the 5′ and 3′
ends of the genome (Kuno & Chang, 2007).

1.4. Zika protein


ZIKV has a single positive-sense RNA genome of approx. 10 kb. It is initially translated as a
single polyprotein (Kuno & Chang, 2007) and then post-translationally cleaved into three
structural proteins, capsid (C), pre-membrane/membrane (prM), and envelope (E), as well as seven
nonstructural (NS) proteins (NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5) (Boehm et al.,
2000). The NS3 protein of ZIKV possesses putative protease activity at its N-terminus and putative
ATPase/helicase, nucleoside triphosphatase, and 5′-triphosphatase activities at its C-terminus
(Zhu et al., 2016). The viral serine protease is embedded in the N-terminal domain of NS3
(NS3pro) (Bollati et al., 2010). Since NS3 is essential to the life cycle of ZIKV, it is an attractive
target for the development of antiviral drugs (Lei et al., 2016).

1.5. Life cycle

A feature that distinguishes the genus Flavivirus from other genera of Flaviviridae is that the
5′ end of its (+)ssRNA genome is decorated with an RNA cap structure
(N7meGpppA2′Ome-RNA). 5′-end capping of the viral RNA is as important as it is for eukaryotic
mRNAs, not only to initiate translation but also to protect the viral RNA from
degradation by endogenous RNA exonucleases. Protein translation happens immediately
after the uncoating of the viral particle in the cytoplasm. The (+)ssRNA genome is used as a
template not only for gene expression but also for viral genome replication. Both viral RNA
replication and gene translation occur in the cytoplasm. For RNA replication, viral NS proteins
and cellular proteins interact to form a replication compartment (RC). During the period of viral
RNA replication in the cytoplasm, the RC consists of morphologically distinct, membrane-bound
compartments that also differ with respect to both function and NS protein composition
(Mackenzie, 2005). The NS3 and NS5 proteins are central to the viral RC, as together they
harbor most, if not all, of the catalytic activities required to both cap and replicate the viral RNA.
Following replication, the protected genomic RNA is packaged by the C protein to form a capsid
in a host-derived lipid bilayer in which the E protein is embedded and later integrated into the viral
envelope. The mature particles subsequently exit from the host cell by exocytosis (Armstrong,
Hou, & Tang, 2017).

1.6. Allosteric sites


Until now, the HCV NS3/4A protease is the only viral serine protease used as a drug target in clinical
practice. The global threat posed by the recent appearance of the Zika virus (ZIKV) might
accelerate the development of novel antiviral drugs targeting viral serine proteases, since the virus
encodes a serine protease crucial for its life cycle (Skoreński, Grzywa, & Sieńczyk, 2016). ZIKV
encodes a serine protease (NS2B–NS3) responsible for viral polyprotein processing. As the
crystal structure (Lei et al., 2016) and substrate specificity profile (Gruba et al., 2016) of ZIKV
NS2B–NS3 protease have already been determined, it is a potential target for
rational drug design. The highly hydrophilic substrate binding sites of flavivirus proteases limit drug
accessibility, so designing compounds that display high potency, bioavailability and the required
pharmacological profile is highly challenging. In contrast to flaviviral serine proteases, the active
site of HCV protease is less polar and more hydrophobic, which allows for developing drug
candidates containing no hydrophilic or protonated substituents (Poulsen, Kang, & Keller,
n.d.). Among already reported flaviviral protease inhibitors, the most potent are basic and highly
charged molecules, properties that could hinder membrane permeability and oral absorption. One
solution is the design of allosteric inhibitors that do not target the protease active site
(Shiryaev et al., 2017). Such an approach could facilitate and accelerate the development of
NS2B–NS3 protease inhibitors that are effective in ZIKV infections.

1.7. Virtual screening

Virtual screening (VS) is a computational technique used in drug discovery in which large
libraries of chemical structures are searched against a drug target to find the most suitable and
stable compound that binds to an enzyme. In VS, a set of compounds is screened based on their
rank/score using one or more computational procedures.

Depending on the data available, VS can be performed in different ways:

1. Ligand-based virtual screening is performed if one active molecule is known; that molecule is
used as a template for searching for similar structures.
2. If several active molecules are known, a 3D database search is done to construct a 3D
pharmacophore. This pharmacophore is then used for ligand-based virtual screening.
3. If the protein 3D structure is known, protein-ligand docking is performed as the method of
virtual screening, which is a structure-based approach.

The drug-likeness of a molecule is gauged against compound filtering parameters such as
molecular weight, logP, number of rotatable bonds, etc. Lipinski’s rule of five is one of the
methods used to check the drug-likeness of a compound. It states that poor absorption or permeation is
more likely when the molecular weight of the compound is greater than 500, the logP value is
greater than 5, there are more than 5 H-bond donors, or there are more than 10 H-bond
acceptors.
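
The four Lipinski thresholds stated above can be expressed as a simple violation counter. The following is a minimal sketch, not the procedure used by OSIRIS or Mcule; the `Compound` class, the function name and both example compounds are hypothetical and serve only to illustrate the rule.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Illustrative container for the four Ro5 descriptors (hypothetical)."""
    name: str
    mol_weight: float   # molecular weight
    logp: float         # octanol-water partition coefficient
    h_donors: int       # H-bond donors (NH + OH)
    h_acceptors: int    # H-bond acceptors (N + O)


def ro5_violations(c: Compound) -> int:
    """Count how many of Lipinski's four thresholds are exceeded."""
    checks = [
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ]
    return sum(checks)


# Made-up compounds, not taken from the screening results.
small = Compound("example-small", 350.0, 2.5, 1, 6)
bulky = Compound("example-bulky", 612.0, 5.8, 2, 7)

print(ro5_violations(small))  # 0
print(ro5_violations(bulky))  # 2
```

A compound with a count of 0 satisfies all four criteria; values exactly at a threshold (for example, a molecular weight of 500) do not count as violations in this sketch.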

Structure-based virtual screening includes protein-ligand docking, which aims to predict the ligand’s
appropriate conformation in the protein binding site. A score or rank is assigned to each
conformation, and the most stable conformation is ranked number 1. However, this method
must handle many degrees of freedom in terms of rotation/conformation and solvent effects.
Furthermore, it is computationally complex and requires the protein 3D structure as a starting point.
Protein 3D structures obtained through crystallography usually do not include hydrogen atoms, which
are required by the docking program. Docking also has the limitation that it cannot predict ADMET
(Absorption, Distribution, Metabolism, Excretion (Elimination), Toxicity) properties. A drug
must bind tightly to the target protein inside the cell, and it must pass
through the cell membrane to reach that target. Moreover, the drug must stay bound to
the target protein long enough to give maximum effect, followed by its
metabolism within the cell and its excretion from the body.

A large number of computational approaches have been used at different stages of drug
design. In early-stage screening they evaluate a large number of compounds and select the best
ones, and in the lead optimization stages they narrow down the search. In this way they reduce
experimental cost and time (Boehm et al., 2000).

1.8. Statement of Problem

Zika flavivirus infection during pregnancy appears to produce a higher risk of microcephaly, and
it also causes neurological problems such as Guillain–Barré syndrome. Zika virus
protease NS2B-NS3 is a potential drug target due to its vital role in viral replication.
Computational studies, including bioinformatics, can be used to identify potential lead compounds
against the allosteric sites of the ZIKA protease.

1.9. Virus Selection


Zika virus was selected because of the large research gap between its discovery in 1947 and its
2015 outbreak in Brazil. This pathogen was neglected for a long time because it causes
only mild symptoms, similar to those of dengue virus, and no major disease had been reported. After
this outbreak, however, many cases of microcephaly and Guillain–Barré syndrome associated with
viral infection were reported.

1.10. Target protein

ZIKA virus protease NS2B-NS3 was selected as the target protein because it is involved in
polyprotein cleavage, which yields three structural and seven nonstructural proteins. Targeting
this protein is therefore expected to stop virus replication.

Chapter 2
Methodology

Protein identification and structure retrieval

Prediction of the allosteric site of the model protein

Selection of the ligand database

Structure-based virtual screening

Ranked list of compounds on the basis of binding score

Toxicity analysis of compounds

Ro5 parameters of orally administered drugs

Interaction analysis of compounds

Novel lead compound identification

Tools and Techniques:
• AutoDock
• Chimera
• PyMOL
• Ligplot
• Osiris
• admetSAR
• Mcule
• LigandScout
• Open Babel

2. Material and method:


2.1. Allosteric sites analysis:

A high-resolution crystal structure of ZIKA protease (PDB: 5GXJ) was used for the analysis of allosteric
sites. The allosteric sites of dengue protease (PDB: 2FOM) reported in the literature
(Mukhametov, Newhouse, Aziz, Saito, & Alam, 2014) were used to determine the allosteric sites
of the ZIKA protease. Both protein structures were superimposed using PyMOL software (Lill &
Danielson, 2011). The allosteric sites were assigned on the basis of structural and sequence
homology between the two proteases.

2.2. Protein dataset and preparation:


The target protein structure (PDB: 5GXJ), with resolution 2.6 Å and R-value free 0.273, was
retrieved from the PDB (Protein Data Bank). Chain B and water molecules were deleted using
Chimera software to prepare the protein for docking. Polar hydrogens and Kollman charges were
added using AutoDock Vina (Trott & Olson, 2010). Hydrogens and charges were added because
PDB files do not contain them and most docking programs require them. For
target-based docking, a grid was set against the allosteric pocket of the Zika protease using the Grid
component of AutoDock. The grid size was set to 100 × 100 × 100 angstrom with
center x = -0.254, y = 11.832 and z = -24.015.
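
To make the grid definition concrete, the sketch below derives the corners of an axis-aligned search box from the centre and dimensions quoted above, and formats them with the `center_*` and `size_*` keys used in AutoDock Vina configuration files. The function names are hypothetical, and the sketch takes the dimensions as angstrom, as written in the text; if the value 100 instead denoted a number of grid points, the physical edge length would be the point count multiplied by the grid spacing, which is not modelled here.

```python
def box_bounds(center, size):
    """Return (min_corner, max_corner) of an axis-aligned box."""
    lo = tuple(c - s / 2.0 for c, s in zip(center, size))
    hi = tuple(c + s / 2.0 for c, s in zip(center, size))
    return lo, hi


def contains(center, size, point):
    """True if the point lies inside (or on the surface of) the box."""
    lo, hi = box_bounds(center, size)
    return all(l <= p <= h for l, p, h in zip(lo, point, hi))


def vina_config_text(center, size):
    """Format the box as Vina-style 'key = value' configuration lines."""
    lines = []
    for axis, c, s in zip("xyz", center, size):
        lines.append(f"center_{axis} = {c}")
        lines.append(f"size_{axis} = {s}")
    return "\n".join(lines)


# Values quoted in the text; the units are assumed to be angstrom.
center = (-0.254, 11.832, -24.015)
size = (100.0, 100.0, 100.0)

lo, hi = box_bounds(center, size)
print("min corner:", lo)
print("max corner:", hi)
print(vina_config_text(center, size))
```

A helper such as `contains` could be used to confirm that the coordinates of the allosteric-site residues fall inside the box before docking.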

2.3. Virtual Screening:

Virtual screening is a technique used in drug design in which large libraries are screened
against a target to find the most suitable and stable compound that binds to the target. If the protein
3D structure is known, protein-ligand docking is performed as the method of virtual
screening, which is a structure-based approach.

In our study, Mcule (Kiss, Sandor, & Szalai, 2012a), a drug discovery platform, was used to
perform the structure-based virtual screening. The Mcule Purchasable (In stock & virtual Ro5)
database was selected for screening. Mcule uses a built-in AutoDock Vina tool to
dock small ligand molecules against the target. Each ligand of the selected database was
docked against the selected allosteric pocket of the ZIKV protease NS2B-NS3 (5GXJ) and
scored according to its best binding affinity.

2.4. Compound Drug-like properties:

The best compounds after structure-based virtual screening (SBVS) were checked with OSIRIS
Property Explorer (Actelion Pharmaceuticals Ltd., Allschwil, Switzerland) for drug-like properties.
Molecular weight, logP, hydrogen bond donors, hydrogen bond acceptors and polar surface area
were checked. Compounds that fulfilled the Ro5 criteria were further filtered for
toxicity using the toxicity checker of the Mcule lead optimization tools (Kiss et al., 2012a).

Chapter 3
Results

3. Result
3.1. PDB structures Retrieval

The ZIKA protease structure was retrieved from the PDB (PDB ID: 5GXJ).

Figure 1 : 3D structure of Zika protease 5GXJ

3.2. Protein Modifications Chimera

UCSF Chimera was used for protein modifications. Water molecules were removed, as
required for docking (Madhavi Sastry, Adzhigirey, Day, Annabhimoju, & Sherman,
2013), and chain B was also removed.

Figure 2: 5GXJ protein

3.3. Protease Allosteric Sites

As ZIKA belongs to the same genus as dengue, the reported allosteric sites of
dengue (Mukhametov et al., 2014) were used on the basis of sequence and structure homology. The
target site comprises 15 residues: ASP71, LYS73, GLN74, TRP83, LEU85, ALA87, ALA88,
TRP89, GLY91, THR118, ASP120, ILE147, LEU149, ASN152 and VAL155.

Figure 3: Superimpose dengue 2FOM and ZIKA 5GXJ

Figure 4: Allosteric sites of proteins
(a), (b) Allosteric pocket of ZIKA 5GXJ

3.4. AutoDock:


AutoDock Vina was used with the help of MGL tools to set a grid against the allosteric site
of the target protein 5GXJ. Water molecules were removed; hydrogens and Kollman charges were added.
The grid was set with dimensions of 100 × 100 × 100 and center x = -0.254, y = 11.832 and
z = -24.015.

The grid was set against the allosteric pocket so that ligand molecules are docked specifically to the
allosteric pocket.

Figure 5: Grid set against the Allosteric Sites of 5GXJ.
3.5. Virtual Screening result analysis:
Mcule was used for the structure-based virtual screening. The Mcule Purchasable (In stock & virtual
Ro5) database was selected. A total of 100,000 compounds were docked against the target protein, and
the top 100 compounds by binding affinity were chosen for further analysis. The interactions of
these compounds with the protein were checked using Ligplot and Chimera; compound toxicity was
checked using the Mcule toxicity checker.

Table 1: Virtual Screening results:


Table 1 lists the top 100 compounds from the structure-based virtual screening of 100,000 (one
lakh) compounds. The compounds are ranked by binding energy, which ranges from -9.3 to -7.8.
Ro5 violations are given for each compound; compounds that fulfill all the parameters have
the value 0. Compound interactions were checked with Ligplot and toxicity with the Mcule toxicity
checker, and both are given in the table. The 14 highlighted compounds (also listed in Table 2)
were chosen because they have a high binding affinity, show interactions with the target protein,
are nontoxic and fulfill the Ro5 parameters.

Index  Mcule ID  Docking score  Ro5 violation  Interaction  Toxicity
1 MCULE-5858655441-0-1 -9.3 0 No ok
2 MCULE-8576320432-0-1 -9 0 1 fail
3 MCULE-1517911951-0-1 -9 1 3 fail
4 MCULE-4056976190-0-1 -9 1 1 ok
5 MCULE-4790935597-0-2 -9 1 No ok
6 MCULE-6483787945-0-3 -9 0 2 ok
7 MCULE-7018976980-0-1 -8.9 1 No ok
8 MCULE-8688163339-0-2 -8.9 1 1 ok
9 MCULE-8753167026-0-3 -8.9 1 No ok
10 MCULE-9833014684-0-1 -8.8 1 No ok
11 MCULE-2893284679-0-2 -8.7 0 1 fail
12 MCULE-5797379473-0-1 -8.7 1 No ok
13 MCULE-1112921902-0-1 -8.7 0 1 fail
14 MCULE-6943292956-0-2 -8.7 0 No fail
15 MCULE-4835981425-0-5 -8.7 1 No fail
16 MCULE-5502665281-0-1 -8.6 0 No fail
17 MCULE-2598332550-0-1 -8.6 0 1 ok
18 MCULE-6717908673-0-1 -8.5 0 1 ok
19 MCULE-6743039761-0-1 -8.5 1 2 fail
20 MCULE-6476405309-0-1 -8.5 0 1 fail
21 MCULE-6365210919-0-1 -8.5 1 No fail
22 MCULE-8606036216-0-1 -8.5 0 2 fail
23 MCULE-4017669026-0-1 -8.4 0 2 ok
24 MCULE-4177862943-0-1 -8.4 0 1 fail
25 MCULE-3458962287-0-1 -8.4 0 No fail
26 MCULE-9638303510-0-3 -8.4 0 1 fail
27 MCULE-3848302983-0-2 -8.4 1 No fail
28 MCULE-5509043461-0-1 -8.3 0 No fail
29 MCULE-9429907811-0-1 -8.3 1 No ok
30 MCULE-2396793617-0-1 -8.3 1 1 ok
31 MCULE-4345939790-0-2 -8.3 1 1 fail
32 MCULE-3343618873-0-1 -8.3 1 2 ok
33 MCULE-5105020739-0-1 -8.3 0 3 fail
34 MCULE-7874964319-0-1 -8.3 1 2 fail
35 MCULE-6265959891-0-1 -8.3 1 1 fail
36 MCULE-6551643242-0-1 -8.3 0 1 fail
37 MCULE-6512739557-0-1 -8.3 0 2 fail
38 MCULE-4088032230-0-2 -8.2 0 1 ok
39 MCULE-2787382345-0-2 -8.2 0 1 fail
40 MCULE-7624899190-0-1 -8.2 0 3 fail
41 MCULE-8187082036-0-1 -8.2 1 2 ok
42 MCULE-4152767969-0-1 -8.2 1 no fail
43 MCULE-3288486745-0-1 -8.2 0 2 fail
44 MCULE-5133591314-0-2 -8.2 1 no ok
45 MCULE-5133327836-0-1 -8.1 1 no fail
46 MCULE-3656105737-0-1 -8.1 1 2 fail
47 MCULE-1811580306-0-1 -8.1 1 no fail
48 MCULE-7517394417-0-2 -8.1 0 no fail
49 MCULE-7975976258-0-1 -8.1 0 no fail
50 MCULE-6554723875-0-1 -8.1 1 1 ok
51 MCULE-1924211718-0-6 -8.1 0 1 fail
52 MCULE-7559439527-0-1 -8.1 0 no fail
53 MCULE-8969381945-0-1 -8.1 0 1 ok
54 MCULE-6884893176-0-1 -8.1 0 1 fail
55 MCULE-8645525757-0-4 -8.1 1 no fail
56 MCULE-5121976021-0-1 -8.1 0 no fail
57 MCULE-8631834817-0-1 -8.1 0 1 ok
58 MCULE-5351948953-0-2 -8.1 1 no fail
59 MCULE-8974284720-0-1 -8.1 1 2 ok
60 MCULE-3248415882-0-2 -8.1 0 1 ok
61 MCULE-4130765792-0-2 -8.1 1 no fail
62 MCULE-9809513480-0-1 -8 0 no fail
63 MCULE-5412986170-0-1 -8 0 no fail
64 MCULE-1655374155-0-2 -8 1 no ok
65 MCULE-2525190839-0-1 -8 0 not plot fail
66 MCULE-2331126102-0-3 -8 1 1 fail
67 MCULE-2225920197-0-1 -8 1 3 fail
68 MCULE-2975261334-0-3 -8 1 no fail
69 MCULE-3498445429-0-1 -8 0 2 ok
70 MCULE-8186755717-0-1 -8 0 2 fail
71 MCULE-4827910256-0-1 -8 0 1 fail
72 MCULE-1182892224-0-1 -8 1 1 fail
73 MCULE-7810469179-0-1 -8 0 no ok
74 MCULE-4965062590-0-1 -8 0 no fail
75 MCULE-4504827771-0-1 -8 1 no fail
76 MCULE-6422812973-0-2 -8 0 1 fail
77 MCULE-5854885725-0-1 -8 0 2 fail
78 MCULE-3788572739-0-1 -8 1 no fail
79 MCULE-4051907463-0-1 -8 0 1 fail

80 MCULE-9591052026-0-1 -8 0 1 fail
81 MCULE-7229086120-0-1 -7.9 1 no fail
82 MCULE-3686500192-0-1 -7.9 0 1 ok
83 MCULE-2702614605-0-1 -7.9 1 no ok
84 MCULE-5466230883-0-1 -7.9 0 no ok
85 MCULE-7264291502-0-1 -7.9 1 1 fail
86 MCULE-1165852831-0-1 -7.9 0 2 ok
87 MCULE-7095580128-0-1 -7.9 0 1 ok
88 MCULE-7204888140-0-1 -7.9 0 2 ok
89 MCULE-8259183839-0-1 -7.9 1 1 ok
90 MCULE-2613165958-0-6 -7.9 1 no fail
91 MCULE-6361008803-0-1 -7.9 0 1 fail
92 MCULE-6224217834-0-4 -7.9 1 no fail
93 MCULE-8294041994-0-1 -7.9 0 1 fail
94 MCULE-1612291863-0-1 -7.8 0 no Fail
95 MCULE-3566731577-0-1 -7.8 0 1 Fail
96 MCULE-7581834121-0-1 -7.8 0 no Ok
97 MCULE-1439530303-0-5 -7.8 0 1 Fail
98 MCULE-2760593673-0-1 -7.8 0 1 Ok
99 MCULE-6517969169-0-1 -7.8 0 1 Fail
100 MCULE-2219339680-0-1 -7.8 1 no Fail
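
The selection rule described for Table 1 (no Ro5 violation, at least one interaction, toxicity check passed) can be sketched as a small filter. The rows below are copied from Table 1; reading the interaction column as a count, with "No" meaning none, is an assumption, and the function name is hypothetical.

```python
# (index, mcule_id, docking_score, ro5_violations, interactions, toxicity)
rows = [
    (1, "MCULE-5858655441-0-1", -9.3, 0, "No", "ok"),
    (2, "MCULE-8576320432-0-1", -9.0, 0, "1", "fail"),
    (4, "MCULE-4056976190-0-1", -9.0, 1, "1", "ok"),
    (6, "MCULE-6483787945-0-3", -9.0, 0, "2", "ok"),
    (17, "MCULE-2598332550-0-1", -8.6, 0, "1", "ok"),
    (96, "MCULE-7581834121-0-1", -7.8, 0, "no", "Ok"),
]


def is_selected(row):
    """Apply the three filters: Ro5, interaction present, toxicity passed."""
    _, _, _, ro5, interactions, toxicity = row
    text = interactions.strip()
    # Assumption: a numeric entry is an interaction count; anything else is none.
    has_interaction = text.isdigit() and int(text) > 0
    passes_toxicity = toxicity.strip().lower() == "ok"
    return ro5 == 0 and has_interaction and passes_toxicity


selected = [row[1] for row in rows if is_selected(row)]
print(selected)  # ['MCULE-6483787945-0-3', 'MCULE-2598332550-0-1']
```

Of the six sample rows, only entries 6 and 17 pass all three filters, which agrees with their appearance among the 14 chosen compounds in Table 2.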

Table 2: Compound structure
The chemical structures of the 14 compounds chosen after SBVS, which fulfill the drug-likeness
criteria, are given along with their respective Mcule IDs.

Index Molecule id Structure

1 MCULE-6483787945-0-3

2 MCULE-2598332550-0-1

3 MCULE-6717908673-0-1

4 MCULE-4017669026-0-1

5 MCULE-4088032230-0-2

6 MCULE-8969381945-0-1

7 MCULE-8631834817-0-1

8 MCULE-3248415882-0-2

9 MCULE-3498445429-0-1

10 MCULE-3686500192-0-1

11 MCULE-1165852831-0-1

12 MCULE-7095580128-0-1

13 MCULE-7204888140-0-1

14 MCULE-2760593673-0-1

Table 3: Ligand interaction with proteins
The Ligplot program was used to generate protein-ligand interaction diagrams. It reports
intermolecular interactions and their strengths, including hydrogen bonds and hydrophobic
interactions, as well as atom accessibilities. The interactions of the 14 selected compounds are
given in Table 3. Different allosteric residues are involved in binding, but ASN152 is involved in
hydrogen bonding in the majority of cases.

Ligand ID               Binding affinity (kcal/mol)   Interaction        Distance (Å)

MCULE-6483787945-0-3    -9                            O-ASN152:OD1       2.83
                                                      O-GLN1074:C        2.97
MCULE-2598332550-0-1    -8.6                          O2-ASN152:ND2      2.86
MCULE-6717908673-0-1    -8.5                          O2-TRP1083:NE1     3.04
MCULE-4017669026-0-1    -8.4                          F1-THR118:OG1      3.19
                                                      N2-ASN152:ND2      2.93
MCULE-4088032230-0-2    -8.2                          O-TRP1083:NE1      2.9
MCULE-8969381945-0-1    -8.1                          O1-ASN152:ND2      3.19
MCULE-8631834817-0-1    -8.1                          N2-ASN12:ND2       2.99
MCULE-3248415882-0-2    -8.1                          O1-ASN152:ND2      2.97
MCULE-3498445429-0-1    -8                            O2-ASN1152:ND2     2.98
MCULE-3686500192-0-1    -7.9                          N3-ASN1152:ND2     3.19
MCULE-1165852831-0-1    -7.9                          O1-ASN1152:ND2     3.06
                                                      O2-ASN1152:ND2     3.00
MCULE-7095580128-0-1    -7.9                          N3-GLN1074:OE1     2.89
MCULE-7204888140-0-1    -7.9                          O1-GLN1074:NE2     3.13
                                                      N2-ASN1152:ND2     3.13
MCULE-6517969169-0-1    -7.8                          O3-ASN1152:ND2     3.05

Table 4: Compound properties
Compound properties were calculated with the Osiris software. Lipinski's rule of five suggests
that poor absorption or permeation is more likely when any of the following conditions occur:
H-bond donors > 5 (sum of NH and OH), H-bond acceptors > 10 (N and O), molecular weight > 500
and log P > 5 (Lipinski, Lombardo, Dominy, & Feeney, 2001). The selected compounds are
drug-like because their H-bond donors range from 0 to 2, their H-bond acceptors range from 5 to
9, their molecular weights are below 500 and their cLogP values lie between 1 and 5.
Index  Molecule id            Molecular weight  cLogP   H-acceptor  H-donor  Relative PSA  Druglikeness
1      MCULE-6483787945-0-3   427.55            3       6           1        0.17071       -1.1253
2      MCULE-2598332550-0-1   404.322           4.3022  6           0        0.24966       -5.0935
3      MCULE-6717908673-0-1   446.958           2.7846  7           1        0.22571       6.65
4      MCULE-4017669026-0-1   375.349           2.1211  5           0        0.15715       -3.8807
5      MCULE-4088032230-0-2   326.423           1.5393  5           0        0.29039       2.1575
6      MCULE-8969381945-0-1   396.493           1.7641  8           1        0.23992       -0.25452
7      MCULE-8631834817-0-1   310.4             2.1138  5           2        0.23924       -0.8651
8      MCULE-3248415882-0-2   441.811           3.8176  5           1        0.17553       -1.5951
9      MCULE-3498445429-0-1   449.477           3.0763  7           0        0.21256       3.0027
10     MCULE-3686500192-0-1   333.346           3.0529  6           1        0.27512       3.6216
11     MCULE-1165852831-0-1   410.476           1.5397  9           0        0.26448       9.0182
12     MCULE-7095580128-0-1   384.459           3.2613  7           2        0.32795       2.6394
13     MCULE-7204888140-0-1   390.502           1.5855  6           2        0.25924       -0.50408
14     MCULE-2760593673-0-1   355.328           1.6942  8           0        0.32856       -0.07324
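
The rule-of-five check described for Table 4 can be illustrated with a short Python sketch. It is not the Osiris calculation itself. It only counts how many of the four Lipinski thresholds each compound exceeds, using the values copied from Table 4. The function name `ro5_violations` and the tuple layout are hypothetical choices made for this illustration.

```python
# Minimal sketch of a Lipinski rule-of-five check (illustrative, not the Osiris tool).
# Values are copied from Table 4: (Mcule id, molecular weight, cLogP, H-acceptors, H-donors).
COMPOUNDS = [
    ("MCULE-6483787945-0-3", 427.55, 3.0, 6, 1),
    ("MCULE-2598332550-0-1", 404.322, 4.3022, 6, 0),
    ("MCULE-6717908673-0-1", 446.958, 2.7846, 7, 1),
    ("MCULE-4017669026-0-1", 375.349, 2.1211, 5, 0),
    ("MCULE-4088032230-0-2", 326.423, 1.5393, 5, 0),
    ("MCULE-8969381945-0-1", 396.493, 1.7641, 8, 1),
    ("MCULE-8631834817-0-1", 310.4, 2.1138, 5, 2),
    ("MCULE-3248415882-0-2", 441.811, 3.8176, 5, 1),
    ("MCULE-3498445429-0-1", 449.477, 3.0763, 7, 0),
    ("MCULE-3686500192-0-1", 333.346, 3.0529, 6, 1),
    ("MCULE-1165852831-0-1", 410.476, 1.5397, 9, 0),
    ("MCULE-7095580128-0-1", 384.459, 3.2613, 7, 2),
    ("MCULE-7204888140-0-1", 390.502, 1.5855, 6, 2),
    ("MCULE-2760593673-0-1", 355.328, 1.6942, 8, 0),
]


def ro5_violations(mol_weight, clogp, acceptors, donors):
    """Return the number of Lipinski thresholds that a compound exceeds."""
    return sum([
        mol_weight > 500,   # molecular weight above 500
        clogp > 5,          # log P above 5
        acceptors > 10,     # more than 10 H-bond acceptors
        donors > 5,         # more than 5 H-bond donors
    ])


if __name__ == "__main__":
    for mcule_id, mw, clogp, acc, don in COMPOUNDS:
        print(mcule_id, "violations:", ro5_violations(mw, clogp, acc, don))
```

With the tabulated values, none of the 14 compounds exceeds any of the four thresholds, which agrees with the statement that the selected compounds are drug-like.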

Table 5: Compound interaction

The interactions of the 14 compounds are shown (Table 5). Protein residues involved in bonding
with the ligand are colored green, and the bond distances are labeled.

(1) MCULE-6483787945-0-3 (2) MCULE-2598332550-0-1

(3) MCULE-6717908673-0-1 (4) MCULE-4017669026-0-1

(5) MCULE-4088032230-0-2 (6) MCULE-8969381945-0-1

(7) MCULE-8631834817-0-1 (8) MCULE-3248415882-0-2

(9) MCULE-3498445429-0-1 (10) MCULE-3686500192-0-1

(11) MCULE-1165852831-0-1 (12) MCULE-7095580128-0-1

(13) MCULE-7204888140-0-1 (14) MCULE-6517969169-0-1

3.6. Docking complex of ligand with protein

The docking complexes show each ligand bound to the protein and illustrate how the
compounds bind to the allosteric residues of the protein (Figure 6). The ligand is highlighted
in green and the interacting protein residues in purple.

(a) MCULE-6483787945-0-3

(b) MCULE-2598332550-0-1

(c) MCULE-6717908673-0-1

Figure 6: Docking complex of ligand with protein

Chapter 4
Discussion

Discussion
Bioinformatics techniques make it possible to screen millions of compounds down to a manageable
number that can then be tested as drug candidates in the wet lab. Structure-based drug design
(SBDD) and ligand-based drug design (LBDD) are the two general types of computer-aided drug
design. In SBDD, 3-dimensional structural information about the target is available and is used
to identify its key sites (active or allosteric) and the interactions that are important for its
biological function. Such information can be used to design a drug that interacts with the target
protein and interrupts its biological function. LBDD is an approach used in the absence of a 3-D
receptor structure, and it is based on information about ligands that bind to the receptor
molecule. 3D quantitative structure-activity relationships (3D QSAR) and pharmacophore modeling
are the most important and widely used tools in ligand-based drug design.

ZIKV belongs to the genus Flavivirus. The serine protease of ZIKA is essential for viral
replication, and HCV protease inhibitors are already in clinical practice. There is no specific
medicine or vaccine for Zika virus. The Zika protease inhibitors reported so far target the
protease active site. Because the active site is highly charged, it is hydrophilic, which limits
drug accessibility and makes it highly challenging to design compounds that show high potency,
bioavailability and the required pharmacological profile. The reported compounds are charged and
basic, which could hinder membrane permeability and oral absorption. One solution is to design
allosteric inhibitors of the protease. In the current study, therefore, allosteric inhibitors
were identified through a structure-based virtual screening approach. A grid was set around the
allosteric pocket of the ZIKA protease with the help of the MGLTools of AutoDock. One hundred
thousand (one lac) compounds were docked into that pocket for screening, and the top 100
compounds ranked by binding energy were studied further. The fourteen best compounds, which are
non-toxic according to the toxicity checker of the Mcule lead optimization tool, interact well
with the protein and have the lowest binding energies, were selected as lead compounds. These
compounds can be investigated in future as potential inhibitors of the allosteric site of the
ZIKA protease.
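
The ranking and filtering steps summarized above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure actually run in the study. The six example rows are taken from Table 1, but the meaning assigned here to its three flag columns (number of Ro5 violations, number of hydrogen bonds with "no" read as zero, and toxicity check result) is an assumption inferred from the table. The names `DockingHit` and `select_leads` are hypothetical.

```python
from typing import NamedTuple


class DockingHit(NamedTuple):
    mcule_id: str
    binding_energy: float  # docking score; more negative means stronger predicted binding
    ro5_violations: int    # ASSUMED meaning of the first flag column of Table 1
    h_bonds: int           # ASSUMED meaning of the second column ("no" read as 0)
    toxicity_ok: bool      # ASSUMED meaning of the ok/fail column


# Example rows copied from Table 1 (entries 69, 70, 73, 82, 89 and 98).
HITS = [
    DockingHit("MCULE-3498445429-0-1", -8.0, 0, 2, True),
    DockingHit("MCULE-8186755717-0-1", -8.0, 0, 2, False),
    DockingHit("MCULE-7810469179-0-1", -8.0, 0, 0, True),
    DockingHit("MCULE-3686500192-0-1", -7.9, 0, 1, True),
    DockingHit("MCULE-8259183839-0-1", -7.9, 1, 1, True),
    DockingHit("MCULE-2760593673-0-1", -7.8, 0, 1, True),
]


def select_leads(hits, top_n=100):
    """Rank hits by binding energy, keep the top_n, then apply the assumed filters."""
    ranked = sorted(hits, key=lambda hit: hit.binding_energy)[:top_n]
    return [
        hit for hit in ranked
        if hit.toxicity_ok and hit.ro5_violations == 0 and hit.h_bonds > 0
    ]


if __name__ == "__main__":
    for lead in select_leads(HITS):
        print(lead.mcule_id, lead.binding_energy)
```

Under these assumed column meanings, the three rows that pass the filter are compounds that also appear among the fourteen selected leads, while the three rejected rows do not.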

Chapter 5
Conclusion

Conclusion
The allosteric pocket identified in the current study is less polar than the active sites
of flavivirus proteases. In silico screening against the allosteric pocket of the ZIKA
NS2B/NS3 protease led us to identify new ligands potentially able to bind it.
Fourteen compounds were chosen after a structure-based virtual screening of one hundred
thousand (one lac) compounds. These compounds bind to the allosteric pocket with the lowest
binding energies. In addition, these ligands form hydrogen bonds with the surrounding
residues of the allosteric site. These compounds are non-toxic and fulfill the Ro5
parameters for orally administered drugs.

Chapter 6
References

Armstrong, N., Hou, W., & Tang, Q. (2017). Biological and historical overview of Zika virus.
World Journal of Virology, 6(1), 1–8. https://doi.org/10.5501/wjv.v6.i1.1
Boehm, H. J., Boehringer, M., Bur, D., Gmuender, H., Huber, W., Klaus, W., … Mueller, F.
(2000). Novel inhibitors of DNA gyrase: 3D structure based biased needle screening, hit
validation by biophysical methods, and 3D guided optimization. A promising alternative to
random screening. Journal of Medicinal Chemistry, 43(14), 2664–74. Retrieved from
http://www.ncbi.nlm.nih.gov/pubmed/10893304
Bollati, M., Alvarez, K., Assenberg, R., Baronti, C., Canard, B., Cook, S., … Bolognesi, M.
(2010). Structure and functionality in flavivirus NS-proteins: Perspectives for drug design.
Antiviral Research, 87(2), 125–148. https://doi.org/10.1016/j.antiviral.2009.11.009
Cheng, F., Li, W., Zhou, Y., Shen, J., Wu, Z., Liu, G., … Tang, Y. (2012). admetSAR: A
Comprehensive Source and Free Tool for Assessment of Chemical ADMET Properties.
Journal of Chemical Information and Modeling, 52(11), 3099–3105.
https://doi.org/10.1021/ci300367a
Colman, E., Golden, J., Roberts, M., Egan, A., Weaver, J., Pharm, D., & Rosebraugh, C. (2009).
New England journal. New England Journal of Medicine, 361(9), 841–3.
https://doi.org/10.1056/NEJMp1415160
Dick, G. W., Kitchen, S., & Haddow, A. (1952). Zika Virus (I). Isolations and serological
specificity. Transactions of the Royal Society of Tropical Medicine and Hygiene, 46(5),
509–520. https://doi.org/10.1016/0035-9203(52)90042-4
Driggers, R. W., Ho, C.-Y., Korhonen, E. M., Kuivanen, S., Jääskeläinen, A. J., Smura, T., …
Vapalahti, O. (2016). Zika Virus Infection with Prolonged Maternal Viremia and Fetal
Brain Abnormalities. New England Journal of Medicine, 374(22), 2142–2151.
https://doi.org/10.1056/NEJMoa1601824
Ferguson, N. M., Cucunubá, Z. M., Dorigatti, I., Nedjati-Gilani, G. L., Donnelly, C. A., Basáñez,
M.-G., … Lessler, J. (2016). EPIDEMIOLOGY. Countering the Zika epidemic in Latin
America. Science (New York, N.Y.), 353(6297), 353–4.
https://doi.org/10.1126/science.aag0219
Wolber, G., & Langer, T. (2004). LigandScout: 3-D Pharmacophores Derived
from Protein-Bound Ligands and Their Use as Virtual Screening Filters.
https://doi.org/10.1021/CI049885E
Gruba, N., Rodriguez Martinez, J. I., Grzywa, R., Wysocka, M., Skoreński, M., Burmistrz, M.,
… Pyrć, K. (2016). Substrate profiling of Zika virus NS2B-NS3 protease. FEBS Letters,
590(20), 3459–3468. https://doi.org/10.1002/1873-3468.12443
Hayes, E. B. (2009). Zika virus outside Africa. Emerging Infectious Diseases, 15(9), 1347–50.
https://doi.org/10.3201/eid1509.090442
Ikejezie, J., Shapiro, C. N., Kim, J., Chiu, M., Almiron, M., Ugarte, C., … Aldighieri, S. (2017).
Zika Virus Transmission — Region of the Americas, May 15, 2015–December 15, 2016.
MMWR. Morbidity and Mortality Weekly Report, 66(12), 329–334.
https://doi.org/10.15585/mmwr.mm6612a4
Kiss, R., Sandor, M., & Szalai, F. A. (2012a). http://Mcule.com: a public web service for drug
discovery. Journal of Cheminformatics, 4(Suppl 1), P17.
https://doi.org/10.1186/1758-2946-4-S1-P17
Kiss, R., Sandor, M., & Szalai, F. A. (2012b). http://Mcule.com: a public web service for drug
discovery. Journal of Cheminformatics, 4(Suppl 1), P17.
https://doi.org/10.1186/1758-2946-4-S1-P17
Kuno, G., & Chang, G.-J. J. (2007). Full-length sequencing and genomic characterization of
Bagaza, Kedougou, and Zika viruses. Archives of Virology, 152(4), 687–696.
https://doi.org/10.1007/s00705-006-0903-z
Lei, J., Hansen, G., Nitsche, C., Klein, C. D., Zhang, L., & Hilgenfeld, R. (2016). Crystal
structure of Zika virus NS2B-NS3 protease in complex with a boronate inhibitor. Science,
353(6298), 503–505. https://doi.org/10.1126/science.aag2419
Lill, M. A., & Danielson, M. L. (2011). Computer-aided drug design platform using PyMOL.
Journal of Computer-Aided Molecular Design, 25(1), 13–19.
https://doi.org/10.1007/s10822-010-9395-8
Lipinski, C. A., Lombardo, F., Dominy, B. W., & Feeney, P. J. (2001). Experimental and
computational approaches to estimate solubility and permeability in drug discovery and
development settings. Advanced Drug Delivery Reviews, 46(1–3), 3–26. Retrieved from
http://www.ncbi.nlm.nih.gov/pubmed/11259830
Mackenzie, J. (2005). Wrapping Things up about Virus RNA Replication. Traffic, 6(11), 967–
977. https://doi.org/10.1111/j.1600-0854.2005.00339.x
Madhavi Sastry, G., Adzhigirey, M., Day, T., Annabhimoju, R., & Sherman, W. (2013). Protein
and ligand preparation: Parameters, protocols, and influence on virtual screening
enrichments. Journal of Computer-Aided Molecular Design, 27(3), 221–234.
https://doi.org/10.1007/s10822-013-9644-8
Mukhametov, A., Newhouse, E. I., Aziz, N. A., Saito, J. A., & Alam, M. (2014). Allosteric
pocket of the dengue virus (serotype 2) NS2B/NS3 protease: In silico ligand screening and
molecular dynamics studies of inhibition. Journal of Molecular Graphics and Modelling,
52, 103–113. https://doi.org/10.1016/j.jmgm.2014.06.008
Musso, D., Roche, C., Robin, E., Nhan, T., Teissier, A., & Cao-Lormeau, V.-M. (2015).
Potential sexual transmission of Zika virus. Emerging Infectious Diseases, 21(2), 359–61.
https://doi.org/10.3201/eid2102.141363
Nowakowski, T. J., Pollen, A. A., Di Lullo, E., Sandoval-Espinosa, C., Bershteyn, M., &
Kriegstein, A. R. (2016). Expression analysis highlights AXL as a candidate zika virus
entry receptor in neural stem cells. Cell Stem Cell, 18(5), 591–596.
https://doi.org/10.1016/j.stem.2016.03.012
Pettersen, E. F., Goddard, T. D., Huang, C. C., Couch, G. S., Greenblatt, D. M., Meng, E. C., &
Ferrin, T. E. (2004). UCSF Chimera - A visualization system for exploratory research and
analysis. Journal of Computational Chemistry, 25(13), 1605–1612.
https://doi.org/10.1002/jcc.20084
Poulsen, A., Kang, C., & Keller, T. H. (n.d.). Drug Design For Flavivirus Proteases: What Are
We Missing? Retrieved from
http://www.ingentaconnect.com/content/ben/cpd/2014/00000020/00000021/art00005
Shiryaev, S. A., Farhy, C., Pinto, A., Huang, C. T., Simonetti, N., Ngono, A. E., … Terskikh, A.
V. (2017). Characterization of the Zika virus two-component NS2B-NS3 protease and
structure-assisted identification of allosteric small-molecule antagonists. Antiviral Research,
143, 218–229. https://doi.org/10.1016/j.antiviral.2017.04.015
Skoreński, M., Grzywa, R., & Sieńczyk, M. (2016). Why should we target viral serine proteases
when developing antiviral agents? Future Virology, 11(12), 745–748.
https://doi.org/10.2217/fvl-2016-0106
Tan, C. W., Sam, I. C., Chong, W. L., Lee, V. S., & Chan, Y. F. (2017). Polysulfonate suramin
inhibits Zika virus infection. Antiviral Research, 143, 186–194.
https://doi.org/10.1016/j.antiviral.2017.04.017
Trott, O., & Olson, A. J. (2010). AutoDock Vina: improving the speed and accuracy of docking
with a new scoring function, efficient optimization, and multithreading. Journal of
Computational Chemistry, 31(2), 455–61. https://doi.org/10.1002/jcc.21334
Wallace, A. C., Laskowski, R. A., & Thornton, J. M. (1995). LIGPLOT: a program to generate
schematic diagrams of protein-ligand interactions. Protein Engineering, Design and
Selection, 8(2), 127–134. https://doi.org/10.1093/protein/8.2.127
Zhu, Z., Chan, J. F.-W., Tee, K.-M., Choi, G. K.-Y., Lau, S. K.-P., Woo, P. C.-Y., … Yuen, K.-
Y. (2016). Comparative genomic analysis of pre-epidemic and epidemic Zika virus strains
for virological factors potentially associated with the rapidly expanding epidemic. Emerging
Microbes & Infections, 5(3), e22. https://doi.org/10.1038/emi.2016.48
