
GENOME SEQUENCING AND DRUG DESIGNING

APPROACH FOR BRAIN CANCER

Project dissertation report submitted for partial fulfilment of the requirement for
the degree of M.Sc. in Life science

By

KUNAL ROY

(Enrolment No: 2021MLS18)

DEPARTMENT OF ANIMAL SCIENCE

SCHOOL OF LIFE SCIENCES

CENTRAL UNIVERSITY OF KARNATAKA

KADAGANCHI - 585367, GULBARGA(INDIA)

CERTIFICATE
This is to certify that Mr. Kunal Roy, M.Sc., Department of Life Sciences,
Central University of Karnataka, has worked on the project dissertation entitled
“Genome Sequencing and Computer-Aided Drug Designing for Brain Cancer
(PI3K-Gamma)” under my supervision from March 2023 to May 2023. This work was
carried out for the partial fulfilment of the M.Sc. in Life Science. The work
presented in the project report is original and has not been submitted anywhere
else for the award of any other degree or diploma.

Project Co-Supervisor Dean and Head

Dr. Kavishankar Gawli Prof. N Sathyanarayana


Assistant Professor Dean and Head
Department of Life Science School Department of Life Sciences
Central University of Karnataka Central University of Karnataka

DECLARATION

I, Kunal Roy, hereby declare that the dissertation project entitled “Genome
Sequencing and Computer-Aided Drug Designing for Brain Cancer (PI3K-Gamma)”
is submitted by me to the Department of Animal Science, School of Life Sciences,
Central University of Karnataka, for the partial fulfilment of the M.Sc. in Life
Science. I have completed this project work under the guidance and supervision
of Dr. Rasmita Samal, Assistant Professor, Department of Animal Science,
Central University of Karnataka, Kadaganchi (Gulbarga).

I also declare that this work has not been submitted anywhere else, by me or by
any other researcher or student, for the award of any other degree or diploma.

Kunal Roy

(2019MLS012)

ACKNOWLEDGEMENT

First and foremost, I sincerely thank my supervisor, Dr. Vinod Gupta (Sr.
Scientist, Rapture Biotech, Noida), for the careful and precious guidance, which
was extremely valuable for my study both theoretically and practically. I
perceive this opportunity as a big milestone in my career development. I also
express my deepest gratitude and special thanks to the Founder and CEO of
Rapture Biotech, Mr. Mayank Bhardwaj, who, in spite of being extraordinarily
busy with his duties, took time to hear, guide and keep me on the correct path,
allowed me to carry out my project at his esteemed organization, and extended
support during the training.

I also take this moment to thank Dr. Kavishankar Gawli, Assistant Professor,
Department of Animal Science, School of Life Sciences, Central University of
Karnataka, for his mentorship. I greatly respect his scientific thinking, and it
has been a privilege to work under his guidance. His enthusiasm for and
admiration of life science have been the prime encouragement for this project.

I would like to express my gratitude to my friends and family members, who have
always been a source of strength, achievement and inspiration, and who have
given me love and affection.

Kunal Roy

(2019MLS012)

Index

Abstract

Introduction
The term “brain tumours” refers to a mixed group of neoplasms originating from intracranial
tissues and the meninges with degrees of malignancy ranging from benign to aggressive.
Each type of tumour has its own biology, treatment, and prognosis and each is likely to be
caused by different risk factors. Even “benign” tumours can be lethal due to their site in the
brain, their ability to infiltrate locally, and their propensity to transform to malignancy. This
makes the classification of brain tumours a difficult science and creates problems in
describing the epidemiology of these conditions. Public perception generally fails to
distinguish between different tumour subtypes and although treatments and prognosis may
vary, the functional neurological consequences are frequently similar.
The anatomy of the brain is enormously complicated, with several sections
responsible for various nervous system activities. Brain tumours can appear anywhere in the
brain or skull, including the protective lining, the underside of the brain (skull base),
the brainstem, the sinuses and nasal cavity, and many other places. Depending on the tissue
from which they arise, there are over 120 distinct types of tumour that can grow in the
brain. Although all brain cancers are tumours, not all brain tumours are cancerous.
Benign brain tumours are noncancerous: they develop slowly, have well-defined boundaries,
and seldom spread. Benign tumours can nonetheless be harmful. They may cause serious
malfunction by damaging and compressing areas of the brain, and those that develop in a
crucial part of the brain can be fatal. Only very rarely does a benign tumour become
cancerous. Meningioma, vestibular schwannoma, and pituitary adenoma are examples of benign
tumours. Malignant brain tumours are cancerous. They usually develop quickly and infiltrate
healthy brain areas. Because of the alterations it brings to the brain’s essential
structures, brain cancer can be fatal. Olfactory neuroblastoma, chondrosarcoma, and
medulloblastoma are examples of malignant tumours that arise in or near the brain. Brain
tumours are generally graded according to how normal or aberrant the cells appear. Doctors
use this grade to help plan therapy. The grade also indicates how quickly the tumour is
likely to develop and spread.

Grade 1: The cells appear practically normal and grow slowly. Long-term survival is
probable.
Grade 2: The cells look slightly abnormal and grow slowly. The tumour might
spread to neighbouring tissue and return, possibly in a more life-threatening form.
Grade 3: The cells look abnormal and are actively growing into adjacent
brain tissue. These tumours tend to recur.
Grade 4: The cells look the most abnormal and proliferate and spread swiftly.
Some tumours change over time. In rare cases a benign tumour can become cancerous, and
a lower-grade tumour can become malignant.

Glioblastoma, the most lethal kind of brain cancer, kills over two-thirds
of individuals within two years of diagnosis. Brain cancers are the most frequent and deadly
of all paediatric solid tumours. Furthermore, children who survive these tumours
are frequently harmed by the long-term effects of exposing the developing brain to medical
procedures such as surgery, radiation, and/or chemotherapy. In recent decades, there have
been indications of an increase in the incidence of primary brain tumours, which should be
interpreted with care. Trends over time can be regarded as meaningful only if they are based
on data obtained using the same definitions and reporting practices. The reported rises,
which have been attributed to numerous causes, might be explained by inconsistencies and
changes in these practices over time. Improved diagnostic imaging, which began in the 1970s
and 1980s with the introduction of radioisotope imaging, computed tomography, and magnetic
resonance imaging, has resulted in higher detection rates and improved differential
diagnosis of brain tumours that were previously misdiagnosed as strokes or metastatic
tumours. Access to services has also increased, raising the likelihood of a tumour patient
being recorded. Furthermore, advances in histopathological technology have improved the
specificity of tumour diagnosis, so an apparent rise in particular tumour types, for example
astrocytomas, could just be the result of fewer non-specific diagnoses being recorded.
Specific investigations conducted in Norway and the United States to see whether these
factors may account for temporal patterns imply that the reported increase in brain tumours
is most likely an artefact of changing diagnostic and reporting practices. This
interpretation is supported by the fact that incidence in the United States levelled off in
the 1990s. Brain tumour rates have been steadily rising in children under the age of 14 and
in adults over the age of 70. Changes in the environment may explain this in children, but
the increase in older patients over time is more difficult to explain.

Literature Review
The phosphatidylinositol 3-kinase (PI3K) signaling pathway plays crucial roles in cell
growth, proliferation and survival [1,2]. The PI3K family of enzymes comprises 15 lipid
kinases with distinct substrate specificities, expression patterns and modes of action [3].
Mutations in PIK3CA, the gene encoding the p110α catalytic subunit of PI3Kα, are very
common in human cancers.

mTOR signalling pathway

Rapamycin is a 31-membered macrocyclic lactone that was first developed as an
immunosuppressant by Wyeth in 1997 and more recently as an anti-cancer agent in the form
of various analogs, often referred to as rapalogs. In complex with the small protein FKBP12,
rapamycin binds to the FKBP12-rapamycin binding (FRB) domain of mTOR and inhibits its kinase
activity through an allosteric mechanism that is still under investigation. Because of its
exquisite selectivity, rapamycin has been an indispensable pharmacological probe for
elucidating the biological functions of the mTOR serine/threonine kinase in governing cell
growth and proliferation. However, a series of discoveries over the last decade has revealed
a more complicated picture than initially suspected. Partly, this derives from the fact that
mTOR serves as the catalytic subunit in two protein complexes, named mTOR Complex 1
(mTORC1) and mTOR Complex 2 (mTORC2), with complementary but distinct roles [1].
Only mTORC1 is affected by acute rapamycin treatment, presumably because the
FKBP12-rapamycin binding site is occluded in mTORC2. Both complexes are activated by growth
factor signaling through the PI3K pathway, while mTORC1 is additionally regulated by
nutrient and energy signals. Together, these complexes control a diverse array of processes
that are required for basic cell growth, including protein translation, cell division,
autophagy and cell survival.

In addition to mTOR, mTORC1 contains the large scaffolding protein Raptor (Regulatory
associated protein of mTOR), the small WD-40 repeat protein mLST8, and two regulatory
proteins named PRAS40 and DEPTOR [1,2]. In cells, this complex is regulated by a diverse
collection of energy and nutrient signals. The majority of these are transmitted through
three primary modes: the GTPase Rag (Ras related GTPase) proteins, which signal the
availability of amino acids; the GTPase Rheb (Ras homolog enriched in brain), which
transmits signals from both growth factor and energy sensing pathways; and factors that
signal to mTORC1 directly, through phosphorylation or protein–protein interactions [1,3,4].
In the absence of amino acids, mTORC1 is diffusely localized throughout the cytoplasm [3].
In their presence, the Rag proteins bind to Raptor and sequester mTORC1 on the surface of an
endomembrane compartment. This re-localization is required, though insufficient, to activate
mTORC1, which additionally requires Rheb. In its GTP-bound state, Rheb binds to mTOR near
its kinase domain and directly stimulates its activity [5]. Rheb itself is regulated by a
large GTPase-activating TSC1/TSC2 heterodimer, which integrates inputs from many upstream
signaling pathways that reflect PI3K signaling (Akt), MAPK signaling (Erk), energetic stress
(AMPK) and hypoxic stress [4]. mTORC1 activity can also be modified by interactions with
PRAS40 and DEPTOR, which provide additional inputs from PI3K, or by direct
phosphorylation by kinases such as AMPK [1,2].

PI3K and mTOR dual inhibitors

On the basis of amino acid sequence similarity, mTOR is classified as a Class IV PI3K family
member. The PI3K family is divided between the phosphatidylinositol kinases (classes I, II
and III) and the class IV protein kinases, the 6-membered PIKK family, which includes ATM,
ATR, DNA-PK, SMG and mTOR. The first PI3K inhibitor recognized to also inhibit mTOR in an
ATP-competitive fashion was the morpholino quinazoline derivative PI-103, which was
discovered using high-throughput screening followed by a medicinal chemistry campaign at
Astellas Pharmaceutical in Japan [26]. PI-103 was shown to inhibit the biochemical kinase
activity of mTOR (IC50: 20–83 nM), PI3K (IC50: α, 3.6 nM; β, 3.0 nM) and DNA-PK
(IC50: 2.0 nM). PI-103 exerts anti-proliferative activity against a variety of transformed
cell lines, such as melanoma, cervix, lung and breast. In combination with the
ATP-competitive EGFR inhibitor erlotinib, PI-103 showed enhanced efficacy against an
EGFR-dependent PTEN-mutant glioma tumor model [28]. Although PI-103 has not been evaluated
in clinical trials because of poor ‘drug-like’ properties, it has served as a lead compound
for other PI3K and mTOR selective inhibitors.

The binding mode of ATP-competitive inhibitors has to a large extent been elucidated by the
pioneering efforts of Roger Williams’s research group to crystallize numerous inhibitors
with PI3Kγ. The binding mode observed by crystallography was quite similar to the
well-established mode exploited by ATP-competitive inhibitors that target protein kinases
(Fig. 2). The binding pocket can be roughly divided into three zones: a central part
normally occupied by the adenine ring of ATP, with two potential hydrogen bonds to the
kinase ‘hinge’ region (Val882 N–H and Glu880 C=O) [30]; a hydrophobic back pocket that is
typically exploited to obtain potency and selectivity; and a forward pocket that exits to
solvent. The crystal structure of the morpholine-chromenone class of structures, as
exemplified by LY294002, demonstrates that the morpholine oxygen forms a hydrogen bond with
the Val882 N–H, whereas the chromenone scaffold occupies the adenine-binding site and the
phenyl ring is directed out towards solvent. Consistent with the weak potency of this
inhibitor (cellular IC50: 1.4 μM) [31], the hydrophobic back pocket is not occupied.
Molecular modeling in conjunction with the X-ray structure of GDC-0941 suggests that PI-103
also exploits its morpholine oxygen as hinge binder but, in contrast to LY294002, its phenol
moiety extends into the hydrophobic back pocket, with the phenol forming a hydrogen bond
with residue Asp841, where it greatly contributes to potency and selectivity [32]. As
discussed further below, a variety of selective mTOR inhibitors including WAY-001, WYE-354,
WAY-600, WYE-687, Wyeth-BMCL-200910075-16b, Wyeth-BMCL-200910096-27, KU0063794 and
KU-BMCL-200908069-5 were developed using PI-103 as a lead compound, demonstrating that it is
possible to achieve selectivity for mTOR relative to PI3K. Interestingly, GDC-0941, an
inhibitor that exhibits selectivity for PI3K over mTOR, was also developed starting with
PI-103 as a lead compound, which demonstrates that the reciprocal selectivity profile can
also be achieved.

Phosphoinositide 3-kinases (PI3-kinases) are a ubiquitously expressed enzyme family
that phosphorylates membrane inositol lipids. Activation of PI3-kinases forms one of the
major pathways of intracellular signal transduction. A high proportion of cell-surface
receptors, especially those linked to tyrosine kinases, activate PI3-kinases, and a
bewildering variety of cellular functions and events appear to be influenced by the lipid
products generated by these enzymes. There is, however, increasing evidence that different
isoforms of the enzyme have specialized functions. This opens up the possibility of
developing therapeutically useful, isoform-selective inhibitors with limited toxicity.
Conditions in which PI3-kinase inhibition could prove valuable include cancer and diseases
with an inflammatory or immune component.

The PI3-kinase family

The primary enzymatic activity of the PI3-kinases is the phosphorylation of inositol lipids
at the 3 position. Different members of the PI3-kinase family generate different lipid
products. To date, four 3'-phosphorylated inositol lipids have been identified in vivo, of
which phosphatidylinositol (3,4)-bisphosphate [PtdIns(3,4)P2] and
PtdIns(3,4,5)-trisphosphate [PtdIns(3,4,5)P3, or PIP3] act as second messengers.

PI3-kinase was initially purified and cloned as a heterodimeric complex consisting of a 110
kDa catalytic subunit, now known as p110α, and an 85 kDa regulatory/adapter subunit,
p85α. To date, eight mammalian PI3-kinases have been identified. These are divided into
three main classes (Fig. 1) on the basis of sequence homology and substrate preference in
vitro. It is likely that all mammalian cells express representatives of each class. A single
member of each of the three classes is found in Drosophila [1].

A detailed review of PI3-kinase function and signalling pathways is beyond the scope of this
report and has been discussed elsewhere. However, aspects of PI3-kinase function and
signalling relevant to the potential consequences and therapeutic applications of PI3-kinase
inhibition are described below.

Class I kinases

Four class I enzymes have been identified in humans and other mammals. These are divided
into two subclasses (Ia and Ib) on the basis of their mechanism of activation. The principal
lipid generated by the class I kinases in vivo is PtdIns(3,4,5)P3 (see Box 1), although the
enzymes are also able to phosphorylate PtdIns and PtdIns(4)-phosphate [PtdIns(4)P] in vitro.

The Ia subgroup consists of the classical p110α and two additional, closely related enzymes,
p110β and p110δ. The p110α and p110β isoforms both have a near-ubiquitous tissue
distribution in adults, whereas p110δ is rather more restricted, with the highest levels in
leukocytes. All class Ia enzymes associate with a p85 regulatory/adapter subunit to form a
heterodimeric complex.

The p85 adapter proteins

The p85 adapter proteins contain a series of modular domains, all of which play defined
roles in protein–protein interactions.

Class II kinases

The three members of the class II enzymes form the least understood group of PI3-kinases.
Enzymes of this class are significantly larger than class I or III kinases, with molecular
weights of the order of 200 kDa, and they possess a second C2 domain at the extreme
C-terminus. The isolated C-terminal C2 domain of PI3-kinase C2β, like other C2 domains, can
bind to anionic phospholipids in vitro. Enzymatically, the class II kinases are
distinguished by a virtual inability to phosphorylate PtdIns(4,5)P2 in vitro. A further
distinguishing feature of these enzymes is that they are predominantly membrane-associated,
in contrast to class I kinases, which, in the resting state, are cytoplasmic.

There is considerable variability in the N-terminal sequences of this group of enzymes, and
comparison of the various murine and human sequences suggests that there could be splice
variants in this region. No adapters have been identified for the class II kinases. It has
recently become clear that a variety of membrane receptors, including tyrosine kinases and
integrins, can activate class II kinases, although neither the mechanism nor the in vitro
substrate for the enzyme has been identified. The cellular consequences of this activation
are unknown.

Class III kinases

The prototype of the class III PI3-kinases is the Saccharomyces cerevisiae enzyme Vps34p,
which is believed to be constitutively active. The only lipid substrate identified in vitro
is PtdIns. Vps34p associates with Vps15p, a Ser/Thr protein kinase that is myristoylated at
the N-terminus and has an essential role in protein trafficking through the vacuole. The
pathway is preserved in mammals, which possess homologs of both enzymes. The function of the
enzyme also appears to be conserved in mammalian cells, where it is involved in the traffic
of proteins through the lysosome, the mammalian equivalent of the vacuole. Mammalian
Vps34p is probably responsible for generating the majority of cellular PtdIns(3)P.

The enzyme responsible for generating PtdIns(3,5)P2 has been identified recently in S.
cerevisiae as Fab1p (murine ortholog, PIKfyve), which appears to act as a PtdIns(3)P
5-kinase. Although Fab1p is unrelated to any member of the PI3-kinase family, in yeast it
is, like Vps34p, required for normal vacuolar function.

PI3-kinase signalling

There has been extensive functional characterisation of the class I kinases, especially
those of class Ia, as, in a variety of cell types, these enzymes are readily activated by
ligand and can be inhibited by pharmacological and genetic techniques (reviewed in Ref. 15).
Paradoxically, little is known about the role of PI3-kinases in human disease, in part
because of the difficulty in detecting and quantifying the levels of inositol lipids in
cells and tissues and also because of a paucity of good antibodies against the class I
kinases.

The physiological consequences of enzyme activation fall into three broad categories, namely
cell growth/survival, intracellular trafficking and cellular motility. In addition, class Ia
kinases are activated by insulin, and many of the metabolic actions of insulin, including
the translocation of the GLUT4 glucose transporter to the cell membrane, have been linked to
PI3-kinase signalling.

Identification of the cellular effectors of PI3-kinases has been facilitated by the
discovery that pleckstrin homology (PH) domains, which are structurally conserved regions of
about 100 amino acids, bind to inositol lipids and their head groups, and that a subset of
these exhibit a preference for PtdIns(3,4)P2 and PtdIns(3,4,5)P3 over other lipids. In
contrast, the more recently described FYVE domain, which has been identified in a number of
proteins, including several implicated in yeast vacuole and mammalian endosome function
(such as Fab1p/PIKfyve), binds preferentially to the monophosphorylated PtdIns(3)P.

Targets of class I PI3-kinases

The majority of known effects of PI3-kinase activation can be explained, at least in part,
on the basis of the proteins that are known or believed to be PI3-kinase signalling targets.

One pathway that merits a brief discussion is the now well-characterized pathway involving
protein kinase B (PKB). PKB, also known as AKT, is the mammalian homologue of the oncogene
v-AKT. There is close homology between the kinase domain of PKB and those of protein kinases
A and C (PKA and PKC). These three kinases, together with PKG, p70S6 kinase and a number of
other related kinases, are sometimes termed the AGC kinases. PKB/AKT is activated by two
phosphorylation events, both of which are probably induced by the phosphoinositide-dependent
kinase PDK1 following PI3-kinase activation. Downstream of PKB/AKT lie a number of enzymes
implicated as effectors of the actions of insulin on glucose metabolism and protein
synthesis, and of PI3-kinase signalling on cell survival.

PDK1 potentially plays an important role in the activation and/or priming of other AGC
kinases. The activation of most of these enzymes requires multiple phosphorylation events,
including PI3-kinase-dependent phosphorylation. Critical phosphorylation sites for the full
[...] but not p110 is recruited to the endosomal compartment where the cellular pool of
GLUT4 resides [22]. This is possibly mediated through a direct association with the
transport protein RAB5 (which also binds VPS34).

These data suggest that the class Ia kinases are, to some extent, functionally distinct.
This distinction is likely to be mediated through a combination of expression patterns and,
perhaps, a preferential association of certain kinases (or combinations of kinases and
adapter subunits) with particular receptors. An example of the latter is the possible
preference of the insulin receptor for p110. It is almost certain that both the extent and
exact nature of functional specialisation will vary from tissue to tissue.

PI3-kinases in cancer

The evidence that class Ia PI3-kinases act as oncoproteins in some human cancers
(summarised in Box 3) is mostly indirect. Nevertheless, taken in its entirety, it makes a
convincing case that these kinases play an important role in some human cancers, although
at present it is impossible to estimate how widespread this is.

Direct evidence that activating mutations of the enzyme exist in human cancer is lacking.
However, a fully functional retroviral oncogene, v-p3k, an ortholog of p110, has been
isolated from the ASV16 avian sarcoma virus, which induces haemangiosarcomas in chickens, as
well as from ASV8905. The N-terminus of the p110-like sequence is fused to the viral gag
sequence. This results in membrane localisation and is probably the mechanism of its
constitutive activation. The activity of v-p3k is independent of association with either a
p85 adapter or RAS. A number of membrane-targeted ‘synthetic’ p110 constructs are also
constitutively active and, in appropriate conditions, these transform transfected
fibroblasts.

A second activated PI3-kinase has been isolated from a radiation-induced murine
lymphoma [32]. These tumor cells, in contrast to ASV16-induced tumors, contain an apparently
normal p110 associated with p65, a truncated p85 that is missing the C-terminal SH2
domain. The p65/p110 complex is constitutively active in lymphoma cells, although the
mechanism is not clear at present. Transgenic mice that express p65 in T cells have reduced
apoptosis and are prone to develop a lymphoproliferative condition and autoimmune
disease [33]. T cell lymphoma develops at an early age when the p65 transgene is expressed
in animals with a p53 null background. It seems likely that PKB/Akt activation is essential
for the effects of both v-p3k and p65/p110.

In humans, amplification of PIK3CA, the gene encoding p110α on chromosome 3q26, has been
reported in some ovarian cancer cell lines and primary ovarian cancers and also in cervical
cancers. At present, the functional significance of this is not fully elucidated. Evidence
for protein over-expression and enzymatic activation of class Ia PI3-kinases in a proportion
of colorectal tumors has been presented, although at present the mechanism of this is
unknown [36].

In addition to a direct oncogenic role as implied above, it is apparent that PI3-kinase is
part of a network of oncoproteins. Downstream of PI3-kinase, AKT2 (PKBβ) is amplified and
overexpressed in some ovarian cancers. Similarly, AKT3 (PKBγ) is overproduced in some
steroid-hormone-insensitive breast cancers. On evidence currently available, however,
including an analysis of AKT2 activation in ovarian cancer specimens, activation of
PKB/AKT seems more likely to reflect PI3-kinase activity than to be the consequence
of primary over-expression.

PI3-kinases are known to be activated by many cell surface receptors that have an
established oncogenic role, such as the PDGFR and EGFR families. PI3-kinase signalling is
essential for the transforming activity of some cytoplasmic oncoproteins, including v-Src,
the polyoma middle T antigen and bcr-abl in Ph-positive chronic myeloid leukaemia. In the
case of bcr-abl, mutants that are unable to activate PI3-kinase lose leukaemogenic
potential, and dominant-negative PKB/Akt can reverse bcr-abl-induced leukaemia development
in SCID mice.

An alternative mechanism for activation of PI3-kinase signalling is a failure to degrade the
phosphoinositide second messengers generated by the enzymes. The gene PTEN encodes a
lipid phosphatase with the ability to remove the 3’ phosphate from PtdInsP2 and PtdInsP3, an
activity which is essential for its role as a tumor suppressor. Germline loss of PTEN is
responsible for Cowden’s disease (and related conditions), which is characterized by the
development of hamartomas and a susceptibility to breast cancer. Mutation of PTEN is now
considered to be one of the more common somatic mutations in human cancer, occurring
particularly in glioblastoma and in prostatic, endometrial and endometrioid ovarian cancer.
This might well prove to be the single most significant contributor to PI3-kinase activation
in human cancer.

In drug design, physicochemical properties play important roles and must therefore be
considered. From a medicinal chemistry perspective, designing a drug capable of
penetrating the blood-brain barrier (BBB) and eliciting the desired receptor and biological
response is a formidable challenge. Oral administration is the most complex route in terms
of physicochemical properties, but the most relevant route for patient compliance. An
effective drug should present an appropriate bioavailability, which is related to its
pharmacokinetic profile and a certain balance in molecular properties. Absorption consists
of the passage of a molecule through the lipid bilayer of the membranes of cells in the gut,
which depends on dissolution of the drug in the gastrointestinal tract. Therefore, the
lipophilicity and hydrogen-bonding properties of drugs can significantly influence their
uptake profiles. Size, ionization, and molecular flexibility are other factors observed to
influence the transport of an organic compound across the intestinal membrane and the BBB.

In this context, Lipinski and coworkers (1997) developed the “Rule of Five” (Ro5), a set of
combined parameters capable of identifying a potential drug candidate that might present
problems with absorption and permeability. According to the rule, poor oral bioavailability
is more likely when there are more than 5 H-bond donors (HBD), more than 10 H-bond
acceptors (HBA), the molecular weight (MW) is greater than 500 and the calculated Log P
(CLog P, the calculated logarithm of the partition coefficient between n-octanol and water)
is greater than 5.
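
The four Ro5 criteria can be written as a short check. The Python sketch below is illustrative only and is not part of the workflow reported here: it assumes the four descriptors (HBD, HBA, MW and CLog P) have already been computed with a cheminformatics tool, and the names `Descriptors` and `ro5_violations`, as well as the example values, are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container for this sketch)."""
    hbd: int      # number of H-bond donors
    hba: int      # number of H-bond acceptors
    mw: float     # molecular weight in Da
    clogp: float  # calculated log P (n-octanol/water)


def ro5_violations(d: Descriptors) -> List[str]:
    """Return the Rule of Five criteria that the molecule exceeds."""
    violations = []
    if d.hbd > 5:
        violations.append("HBD > 5")
    if d.hba > 10:
        violations.append("HBA > 10")
    if d.mw > 500:
        violations.append("MW > 500")
    if d.clogp > 5:
        violations.append("CLogP > 5")
    return violations


# Illustrative values only; they do not describe any compound from this study.
candidate = Descriptors(hbd=2, hba=6, mw=350.0, clogp=2.5)
print(ro5_violations(candidate))  # prints [] because no threshold is exceeded
```

Returning the list of violated criteria, rather than a single pass/fail flag, leaves the decision on how many violations to tolerate to the user.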

The drugs can be divided into 22 therapeutic classes, most of them antidepressant,
anticonvulsant, antipsychotic and antiparkinsonian drugs. Of all the drugs included in this
study, 59 (69%) satisfied all evaluated physicochemical parameters, while 27 (31%) violated
at least one of the properties established by RoCNS. Figure 1 highlights the percentage of
drugs that follow the RoCNS both pre- (1985-1999) and post- (2000-2014) RoCNS publication.
The results indicate that there was a very slight decrease in the percentage of drugs that
meet the rules after they were first published.

The Ro5 methodology appears to be as useful today in defining therapeutically relevant
pharmacokinetic drugability as when it was proposed, although it must be recognized that the
database that we evaluated includes only drugs that successfully reached the market. As
shown earlier, Ro5 fails to discriminate drugs from “non-drugs”. From our perspective, we do
not view additional criteria as necessary, nor do we find significant deficiencies in the
four Ro5 criteria originally proposed by Lipinski and coworkers. The Biopharmaceutics Drug
Disposition Classification System (BDDCS) builds upon the Ro5 criteria and can quite
successfully predict drug disposition characteristics for drugs both meeting and not
meeting Ro5 criteria. More recent expansions of classification systems have been proposed
and do provide useful qualitative and quantitative predictions for clearance relationships.
However, the broad range of applicability of BDDCS beyond just clearance predictions gives
this system a great deal of further usefulness.

The Rule of 5 methodology appears to be as useful today in defining drugability as when it
was proposed, although we recognize that the database we used includes only drugs that
successfully reached the market. We do not view additional criteria as necessary, nor did we find
significant deficiencies in the four Rule of 5 criteria originally proposed by Lipinski and
coworkers. BDDCS builds upon the Rule of 5 and can quite successfully predict drug
disposition characteristics for drugs both meeting and not meeting Rule of 5 criteria. More
recent expansions of classification systems have been proposed and do provide useful
qualitative and quantitative predictions for clearance relationships. However, the broad
applicability of BDDCS beyond just clearance predictions makes the combined
Rule of 5/BDDCS system considerably more useful.
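
The four Rule of 5 criteria discussed above can be written as a simple filter. The sketch below assumes the commonly cited thresholds (molecular weight of at most 500 Da, logP of at most 5, at most 5 hydrogen-bond donors and at most 10 hydrogen-bond acceptors). The function name and the example descriptor values are hypothetical and serve only as an illustration.

```python
def ro5_violations(mol_weight, logp, h_donors, h_acceptors):
    """Return the list of Rule of 5 criteria that a compound violates."""
    # Thresholds are the commonly cited Lipinski limits (an assumption here,
    # since the text above does not list them explicitly).
    checks = [
        ("molecular weight > 500 Da", mol_weight > 500),
        ("logP > 5", logp > 5),
        ("H-bond donors > 5", h_donors > 5),
        ("H-bond acceptors > 10", h_acceptors > 10),
    ]
    return [name for name, violated in checks if violated]


# Hypothetical descriptor values, for illustration only.
print(ro5_violations(350.4, 2.1, 2, 5))    # compound within all limits
print(ro5_violations(720.9, 6.3, 4, 12))   # compound exceeding three limits
```

In practice the descriptor values would come from a tool such as SwissADME or Molinspiration rather than being typed in by hand.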

OBJECTIVE OF STUDY
1. To compare brain cancer-associated sequences and assess their variability with reference to
the genome by analysing the local and global similarity between nucleotide or protein
sequences and calculating the statistical significance of the matches.

2. Computational analysis of the sequence alignments for brain cancer, and analysis of the data.

3. Multiple sequence alignment (MSA) with Clustal Omega, a program that uses seeded guide
trees and HMM profile-profile techniques.

4. Use of the BLAST sequence alignment program to find the best possible alignment and to
calculate the alignment parameters, including the local alignment score of the brain cancer
target in comparison to the query sequence.

5. Phylogenetic analysis of Brain Cancer, reconstructing the evolutionary path and the
ancestral genome in the human host.

6. Use of ORF finder to search for open reading frames in a DNA/RNA sequence using the standard
or an alternative genetic code.

7. To analyse the structure, domains and function of the brain cancer target protein and to
prepare the associated ligands in 3D.

8. Molecular docking to study the interaction between the brain cancer target and the
associated ligands.

9. In silico analysis and computer-aided drug designing for brain cancer, including prediction of
ligand conformation, position and orientation within the binding site, and of binding affinity.

10. Use of a constraint-based alignment tool for multiple protein sequences to implement sequence
similarity search.

Tools and Materials
1. National Center for Biotechnology Information (NCBI)

The National Center for Biotechnology Information (NCBI) is part of the United States
National Library of Medicine (NLM), a branch of the National Institutes of Health (NIH). It
is approved and funded by the government of the United States. The NCBI is located
in Bethesda, Maryland, and was founded in 1988 through legislation sponsored by US
Congressman Claude Pepper.
The NCBI houses a series of databases relevant to biotechnology and biomedicine and is an
important resource for bioinformatics tools and services. Major databases
include GenBank for DNA sequences and PubMed, a bibliographic database for biomedical
literature. Other databases include the NCBI Epigenomics database. All these databases are
available online through the Entrez search engine. NCBI was directed by David Lipman, one
of the original authors of the BLAST sequence alignment program and a widely respected
figure in bioinformatics.

2. ClustalW

ClustalW is a general-purpose multiple alignment program for DNA or proteins. It
greatly improved the sensitivity of the commonly used progressive multiple sequence
alignment methods for the alignment of divergent protein sequences. ClustalW, like the other
Clustal tools, is used for aligning multiple nucleotide or protein sequences in an efficient
manner. It uses progressive alignment methods, which align the most similar sequences first
and work their way down to the least similar sequences until a global alignment is created.
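
The "most similar first" ordering used by progressive alignment can be illustrated with a small sketch. It is a simplification: ClustalW derives its order from pairwise alignment scores and a guide tree, whereas the code below uses plain percent identity on toy sequences that are assumed to be already aligned and of equal length. All names and sequences are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def rank_pairs(seqs):
    """Return (name1, name2, identity) tuples, most similar pair first."""
    scored = [
        (n1, n2, percent_identity(seqs[n1], seqs[n2]))
        for n1, n2 in combinations(sorted(seqs), 2)
    ]
    return sorted(scored, key=lambda t: t[2], reverse=True)


# Toy sequences (hypothetical), already of equal length.
toy = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "TCGAACCA"}
for pair in rank_pairs(toy):
    print(pair)
```

A progressive aligner would merge the top-ranked pair first and then add the remaining sequences in order of decreasing similarity.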

3. Clustal Omega

Clustal Omega is a multiple sequence alignment program for aligning three or more
sequences together in a computationally efficient and accurate manner. It produces
biologically meaningful multiple sequence alignments of divergent sequences. Evolutionary
relationships can then be viewed as cladograms or phylograms.

4. ProtParam

ProtParam is a tool that allows the computation of various
physical and chemical parameters for a given protein stored in Swiss-Prot or TrEMBL or for
a user entered protein sequence. The computed parameters include the molecular weight,
theoretical pI, amino acid composition, atomic composition, extinction coefficient, estimated
half-life, instability index, aliphatic index and grand average of hydropathicity (GRAVY).
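
Two of the parameters listed above are simple enough to sketch directly. The example below computes the amino acid composition and the GRAVY value (the sum of the Kyte-Doolittle hydropathy values of all residues divided by the sequence length). It is an illustrative sketch, not ProtParam's own code; the function names are hypothetical, and the test peptide is the first ten residues of the 3ML9 chain A sequence reported in the Results.

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale, used for the GRAVY calculation.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def composition(seq):
    """Amino acid composition as percentages, keyed by one-letter code."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in sorted(counts.items())}


peptide = "MSEESQAFQR"  # first ten residues of the 3ML9 chain A sequence
print(round(gravy(peptide), 3))
print(composition(peptide))
```

A negative GRAVY value indicates a predominantly hydrophilic sequence, a positive value a predominantly hydrophobic one.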

5. Basic Local Alignment Search Tool (BLAST)

The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between
sequences. The program compares nucleotide or protein sequences to sequence databases and
calculates the statistical significance of matches. BLAST can be used to infer functional and
evolutionary relationships between sequences as well as help identify members of gene
families.
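
The idea of a local alignment score can be made concrete with the Smith-Waterman dynamic programming recurrence. This is not BLAST's algorithm: BLAST uses a faster seed-and-extend heuristic and also reports statistical significance, which is not covered here. The scoring values (match +2, mismatch -1, gap -2) and the function name are assumptions chosen for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the
            # alignment local rather than global.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# The shared region "ACGT" drives the score; the flanks do not penalise it.
print(local_alignment_score("TTACGTTT", "CCACGTCC"))
```

Only two rows of the matrix are kept in memory because the sketch returns the score alone and does not trace back the aligned region.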

6. Protein Data Bank (PDB)

The Protein Data Bank (PDB) is a database for the three-dimensional structural data of large
biological molecules, such as proteins and nucleic acids. The data, typically obtained by X-
ray crystallography, NMR spectroscopy, or, increasingly, cryo-electron microscopy, and
submitted by biologists and biochemists from around the world, are freely accessible on the
Internet via the websites of its member organisations (PDBe, PDBj, RCSB, and BMRB). The
PDB is overseen by an organization called the Worldwide Protein Data Bank, wwPDB.
The PDB is a key resource in areas of structural biology, such as structural genomics. Most major
scientific journals and some funding agencies now require scientists to submit their structure
data to the PDB. Many other databases use protein structures deposited in the PDB. For
example, SCOP and CATH classify protein structures, while PDBsum provides a graphic
overview of PDB entries using information from other sources, such as Gene Ontology.

7. PubMed

PubMed is a free search engine accessing primarily the MEDLINE database of references
and abstracts on life sciences and biomedical topics. The United States National Library of
Medicine (NLM) at the National Institutes of Health maintains the database as part of
the Entrez system of information retrieval.
From 1971 to 1997, online access to the MEDLINE database had been primarily through
institutional facilities, such as university libraries. PubMed, first released in January 1996,
ushered in the era of private, free, home- and office-based MEDLINE searching. The
PubMed system was offered free to the public starting in June 1997.

8. Drug Bank

DrugBank Online is a comprehensive, free-to-access, online database containing information
on drugs and drug targets. As both a bioinformatics and a cheminformatics resource, it
combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with
comprehensive drug target (i.e. sequence, structure, and pathway) information. DrugBank
Online is widely used by the drug industry, medicinal chemists, pharmacists, physicians,
students and the general public. Because of its broad scope, comprehensive referencing, and
detailed data descriptions, DrugBank is enabling major advancements across the data-driven
medicine industry.

9. PubChem

PubChem is a database of chemical molecules and their activities against biological assays.
The system is maintained by the National Center for Biotechnology Information (NCBI), a
component of the National Library of Medicine, which is part of the United States National
Institutes of Health (NIH). PubChem can be accessed for free through a web user interface.
Millions of compound structures and descriptive datasets can be freely downloaded via FTP.
PubChem contains multiple substance descriptions and small molecules with fewer than 100
atoms and 1,000 bonds. More than 80 database vendors contribute to the growing PubChem
database.

10. Gene Card

GeneCards is a database that provides genomic, proteomic, transcriptomic, genetic and functional
information on all known and predicted human genes. It is developed and maintained
by the Crown Human Genome Center at the Weizmann Institute of Science.
The database aims to provide a quick overview of the currently
available biomedical information about the searched gene, including the human genes, the
encoded proteins, and the relevant diseases. The GeneCards database provides access to
free Web resources about more than 7000 known human genes, integrated from >90
data resources, such as HGNC, Ensembl, and NCBI. The core gene list is based on approved
gene symbols published by the HUGO Gene Nomenclature Committee (HGNC). The
information is carefully gathered and selected from these databases by its engine. If a
search does not return any results, the database gives several suggestions to help users
accomplish their search, depending on the type of query, and offers direct links to other
databases’ search engines.
Over time, the GeneCards database has developed a suite of tools (GeneDecks, GeneLoc,
GeneALaCart) with more specialised capabilities. Since 1998, the GeneCards database has
been widely used by the bioinformatics, genomics and medical communities.

11. SwissADME

SwissADME was made for application in drug discovery and medicinal chemistry contexts,
which call for a balance between accuracy and speed in order to deal with a large number
of molecules. Because of the predictive nature of the data returned by SwissADME, values
should be handled with due care. To be effective as a drug, a potent molecule must reach its
target in the body in sufficient concentration, and stay there in a bioactive form long enough
for the expected biological events to occur. Drug development involves assessment of
absorption, distribution, metabolism and excretion (ADME) increasingly earlier in the
discovery process, at a stage when the compounds under consideration are numerous but access
to physical samples is limited. In that context, computer models constitute valid alternatives to
experiments. The SwissADME web tool gives free access to a pool of fast yet robust
predictive models for physicochemical properties, pharmacokinetics, drug-likeness
and medicinal chemistry friendliness, among which are in-house methods
such as the BOILED-Egg, iLOGP and Bioavailability Radar.

12. PreADMET

A significant bottleneck that remains in the drug discovery procedure, in particular in the later
stages of lead discovery, is the analysis of the ADME and overt toxicity properties of drug
candidates. Over 50% of the candidates failed due to ADME/Tox deficiencies during
development. To avoid this failure at the development stage, a set of in vitro ADME/Tox screens
has been implemented in most pharmaceutical companies with the aim of discarding
compounds in the discovery phase that are likely to fail further down the line. Even though
early-stage in vitro ADME screening reduces the probability of failure at the development stage,
it is still time-consuming and resource-intensive. PreADMET is a web-based
application that has been developed in response to the need for rapid
prediction of drug-likeness and ADME/Tox data.

13. Molinspiration

Molinspiration offers a broad range of cheminformatics software tools supporting molecule
manipulation and processing, including SMILES and SDfile conversion, normalization of
molecules, generation of tautomers, molecule fragmentation, calculation of various molecular
properties needed in QSAR, molecular modelling and drug design, high quality molecule
depiction, and molecular database tools supporting substructure and similarity searches. Its
products also support fragment-based virtual screening, bioactivity prediction and data
visualization. Molinspiration tools are written in Java and can therefore be used on practically
any computer platform.

14. ORF Finder

ORF finder searches for open reading frames (ORFs) in the DNA sequence you enter. The
program returns the range of each ORF, along with its protein translation. Use ORF finder to
search newly sequenced DNA for potential protein encoding segments, verify predicted
protein using newly developed SMART BLAST or regular BLASTP.

The web version of ORF finder is limited to a subrange of the query sequence up to 50
kb long. The stand-alone version, which has no query sequence length limitation, is
available for Linux x64.
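
The principle of an ORF search can be sketched in a few lines. The code below is a simplified illustration and not the NCBI ORF finder implementation: it scans only the three forward reading frames, accepts only ATG as a start codon and uses the standard stop codons, whereas the real tool also offers alternative genetic codes and start codons. The function name and the `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def find_orfs(dna, min_codons=1):
    """Return (start, end, frame) for ATG...stop ORFs on the forward strand.

    start is 0-based and inclusive; end is exclusive and includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3  # coding codons before the stop
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Toy sequence (hypothetical): ATG TTT TGA sits in reading frame 2.
print(find_orfs("CCATGTTTTGACC"))
```

Each reported range can be sliced from the input and translated to obtain the candidate protein, which could then be checked with BLASTP as described above.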

15. Swiss Target Prediction

SwissTargetPrediction is based on the observation that similar bioactive molecules are more
likely to share similar targets. Therefore, the targets of a molecule can be predicted by
identifying proteins with known ligands that are highly similar to the query molecule.
This website allows you to estimate the most probable macromolecular targets of a small
molecule, assumed as bioactive. The prediction is founded on a combination of 2D and 3D
similarity with a library of 370000 known actives on more than 3000 proteins from three
different species.
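
The similarity principle described above can be sketched with the Tanimoto coefficient on fingerprint bit sets. This is a stand-in for illustration only: SwissTargetPrediction combines its own 2D and 3D similarity measures, which are not reproduced here. The fingerprints, target names and function names below are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_targets(query_fp, known_ligands):
    """Rank targets by the best similarity of any known ligand to the query.

    known_ligands maps a target name to a list of ligand fingerprints.
    """
    scores = {
        target: max(tanimoto(query_fp, fp) for fp in fps)
        for target, fps in known_ligands.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical fingerprints: each set holds the indices of the bits that are on.
query = {1, 2, 3, 4}
library = {
    "target_A": [{1, 2, 3, 5}, {7, 8}],
    "target_B": [{4, 9, 10}],
}
for target, score in rank_targets(query, library):
    print(target, round(score, 3))
```

The target whose known ligands most resemble the query molecule is listed first, which mirrors the reasoning that similar molecules tend to share targets.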
16. JPred

JPred is a web server that takes a protein sequence or multiple alignment of protein
sequences, and from these predicts the location of secondary structures using a neural
network called Jnet. JPred4 is the latest version of the popular JPred protein secondary
structure prediction server which provides predictions by the JNet algorithm, one of the most
accurate methods for secondary structure prediction.

17. PyRx

PyRx is virtual screening software for computational drug discovery that
can be used to screen libraries of compounds against potential drug targets.
PyRx enables medicinal chemists to run virtual screening from any platform
and helps users in every step of this process, from data preparation to job
submission and analysis of the results. While it is true that there is no magic
button in the drug discovery process, PyRx includes a docking wizard with an
easy-to-use user interface, which makes it a valuable tool for computer-aided
drug design. PyRx also includes chemical spreadsheet-like functionality and a
powerful visualization engine that are essential for structure-based drug
design.

18. Biovia Discovery Studio

BIOVIA Discovery Studio is a comprehensive suite of validated science applications built on
BIOVIA Pipeline Pilot. It is a suite of software for simulating small
molecule and macromolecule systems. The software delivers a unique blend of open, scalable,
collaborative research tools designed for today’s Life Sciences discovery research needs.

19. REACTOME

REACTOME is an open-source, open access, manually curated and peer-reviewed pathway
database. Its goal is to provide intuitive bioinformatics tools for the visualization,
interpretation and analysis of pathway knowledge to support basic and clinical research,
genome analysis, modeling, systems biology and education. Founded in 2003, the Reactome
project is led by Lincoln Stein of OICR, Peter D’Eustachio of NYULMC, Henning
Hermjakob of EMBL-EBI, and Guanming Wu of OHSU.

20. KEGG Database

KEGG (Kyoto Encyclopedia of Genes and Genomes) is a collection of databases dealing
with genomes, biological pathways, diseases, drugs, and chemical substances. KEGG is
utilized for bioinformatics research and education, including data analysis
in genomics, metagenomics, metabolomics and other omics studies, modeling and simulation
in systems biology, and translational research in drug development.
The KEGG database project was initiated in 1995 by Minoru Kanehisa, professor at the
Institute for Chemical Research, Kyoto University, under the then ongoing Japanese Human
Genome Program. Foreseeing the need for a computerized resource that can be used for
biological interpretation of genome sequence data, he started developing the KEGG
PATHWAY database. It is a collection of manually drawn KEGG pathway maps
representing experimental knowledge on metabolism and various other functions of
the cell and the organism. Each pathway map contains a network of molecular interactions
and reactions and is designed to link genes in the genome to gene products (mostly proteins)
in the pathway. This has enabled the analysis called KEGG pathway mapping, whereby the
gene content in the genome is compared with the KEGG PATHWAY database to examine
which pathways and associated functions are likely to be encoded in the genome.

21. STRING Database

STRING is a database of known and predicted protein-protein interactions. The interactions
include direct (physical) and indirect (functional) associations; they stem from computational
prediction, from knowledge transfer between organisms, and from interactions aggregated
from other (primary) databases.

Methodology
In Silico Studies and Characterization of Chemical drugs
The present study includes the collection and analysis of chemical drugs. Physical,
structural and chemical properties will be studied using computational resources such as
NCBI PubChem (https://pubchem.ncbi.nlm.nih.gov/) and the ChEMBL database
(https://www.ebi.ac.uk/chembl/). ChEMBL is a manually curated database of bioactive molecules
with drug-like properties. It brings together chemical, bioactivity and genomic data to aid the
translation of biological information into effective new drugs.

Drug Bank
An online curated database will be used to collect the most favourable drugs. DrugBank Online
is a comprehensive, free-to-access, online database containing information on drugs and drug
targets. As both a bioinformatics and a cheminformatics resource, it combines detailed drug
(i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target
(i.e. sequence, structure, and pathway) information.

Target Prediction for Active Phytochemicals with Therapeutic Relevance

Computational tools such as Swiss Target Prediction (http://www.swisstargetprediction.ch/)
and the ChEMBL database (https://www.ebi.ac.uk/chembl/) will be used to predict targets
that could have a role in anti-cancer therapeutics. Biological processes related to cancer
mechanisms will be further explored using the KEGG pathway database, the STRING
database and REACTOME (https://www.genome.jp/kegg/pathway, https://string-db.org/ and
https://reactome.org/, respectively).

CADD Approach: Molecular Interaction among Ligands and Target

Computer-aided drug designing tools and strategies will be used in the present study to identify
the targets. Target Identification and Validation: Based on the
computational studies, a protein structure will be selected for target identification and
validation. Three targets for anti-cancer activity evaluation were selected based on the
pathways in which they actively participate. Previous studies have shown that
many ligands that act as anti-cancer agents are under preclinical and clinical trials. Therefore,
the protein is selected as a target for designing anti-cancer drugs. Three-dimensional
structures of this target protein will be retrieved from the Protein Data Bank
(RCSB PDB) or modelled with SWISS-MODEL. Ligand structures will be
either derived from databases such as NCBI PubChem and DrugBank or built in
Chimera 1.14.

Molecular Docking and Virtual Screening

PatchDock (http://www.t3db.ca/), AutoDock and PyRx will be used to
perform screening of ligands derived from four DrugBank with anti-cancer
activity. These tools provide a user-friendly docking platform for flexible ligand
docking while keeping the target protein structure rigid. After importing and preparing the
protein structure, the docking wizard provides a selection of docking options. The docking tab
allows selection of a specific scoring function, such as kinetics Score and PLANTS score, based
on the laws of molecular mechanics.

Computer-Aided Prediction of ADME, Activity and Toxicity
The pharmacokinetic properties of the given ligands will be studied on SwissADME
(http://www.swissadme.ch/). This website allows us to compute physicochemical descriptors
as well as to predict ADME parameters, pharmacokinetic properties, drug-like nature and
medicinal chemistry friendliness of one or multiple small molecules to support drug
discovery. Toxicity studies and prediction will be performed on T3DB tools
(http://www.t3db.ca/).

Outline of the work

• In Silico Structural studies and characterization of PI3K-Gamma protein.
• 2D and 3D structure prediction and analysis of PI3K-Gamma protein.
• Protein-Ligand interaction and Molecular function mediated by
• Ligand identification and optimization.
• Target prediction for active phytochemicals having therapeutic relevance.
• Molecular docking and interaction studies among target and ligands.
• CADD approach to understand the molecular interaction among ligands and target with reference to brain cancer.
• Computer-aided prediction of ADME, activity and toxicity studies.
• Prediction of mode of action and effects on molecular pathways and biological processes for therapeutic significance.

Table of Contents

S.no Contents
1. Target Protein Analysis (PI3K-Gamma)
1.1 NCBI and Protein Data Bank Analysis
1.2 Gene Card Analysis
1.3 UniProt Analysis
1.4 Reactome Pathway
1.5 KEGG Pathway
1.6 String Pathway Analysis
1.7 Open Reading Frame Analysis (ORF Finder)
2. NCBI BLAST results of Target Protein (PI3K-Gamma) using nucleotide sequence
3. Multiple Sequence Alignment of Target Protein (PI3K-Gamma) in ClustalW
4. NCBI BLAST results of Target Protein (PI3K-Gamma) using PDB sequence
5. Multiple Sequence Alignment of Target Protein (PI3K-Gamma) in Clustal Omega
6. Drug Bank analysis to create Drug List
7. Ligand Analysis
7.1 Chemical Properties (PubChem)
7.2 Structural Analysis (PubChem)
7.3 Pharmacokinetic Analysis (SwissADME)
7.4 Bioactivity Prediction (Molinspiration)
7.5 Toxicity Prediction (PreADMET)
7.6 Target Prediction (Swiss Target Prediction)
8. JPred Analysis
9. Docking Results
9.1 Manual Docking in PyRx
9.2 Virtual Docking in Patch Dock

RESULTS

1. Target Protein Analysis
1.1 PDB and NCBI Analysis
PDB ID: 3ML9

PDB DOI: 10.2210/pdb3ML9/pdb

Classification: TRANSFERASE/TRANSFERASE INHIBITOR

Organism(s): Homo sapiens

Expression System: Spodoptera frugiperda

Mutation(s): No

Membrane Protein: Yes

Method: X-RAY DIFFRACTION

Resolution: 2.55 Å

R-Value Free: 0.298

R-Value Work: 0.242

R-Value Observed: 0.245

Global Symmetry: Asymmetric - C1

Global Stoichiometry: Monomer - A1

Total Structure Weight: 111.18 kDa

Atom Count: 6896

Modelled Residue Count: 847

Deposited Residue Count: 966

Unique protein chains: 1
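
The deposited and modelled residue counts reported above differ, which means that part of the deposited chain is not resolved in the structure. The short sketch below works out that difference from the two reported values; the function name is hypothetical.

```python
def model_coverage(deposited, modelled):
    """Return (unmodelled residue count, percent of deposited residues modelled)."""
    unmodelled = deposited - modelled
    percent = 100.0 * modelled / deposited
    return unmodelled, percent


# Values taken from the 3ML9 entry summary listed above.
unmodelled, percent = model_coverage(deposited=966, modelled=847)
print(f"{unmodelled} residues unmodelled; {percent:.1f}% of deposited residues modelled")
# prints: 119 residues unmodelled; 87.7% of deposited residues modelled
```

Unmodelled residues are worth checking before docking, since a gap near the binding site would affect the predicted interactions.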

Protein Sequence
>3ML9_1|Chain A|Phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit gamma isoform|Homo sapiens (9606)
MSEESQAFQRQLTALIGYDVTDVSNVHDDELEFTRRGLVTPRMAEVASRDPKLYAMHPWVTSKPLPEYLWKKIANNCIFIVIHRSTTSQTIKVSPDDT
PGAILQSFFTKMAKKKSLMDIPESQSEQDFVLRVCGRDEYLVGETPIKNFQWVRHCLKNGEEIHVVLDTPPDPALDEVRKEEWPLVDDCTGVTGYHE
QLTIHGKDHESVFTVSLWDCDRKFRVKIRGIDIPVLPRNTDLTVFVEANIQHGQQVLCQRRTSPKPFTEEVLWNVWLEFSIKIKDLPKGALLNLQIYC
GKAPALSSKASAESPSSESKGKVRLLYYVNLLLIDHRFLLRRGEYVLHMWQISGKGEDQGSFNADKLTSATNPDKENSMSISILLDNYCHPIALPKHQP
TPDPEGDRVRAEMPNQLRKQLEAIIATDPLNPLTAEDKELLWHFRYESLKHPKAYPKLFSSVKWGQQEIVAKTYQLLARREVWDQSALDVGLTMQL
LDCNFSDENVRAIAVQKLESLEDDDVLHYLLQLVQAVKFEPYHDSALARFLLKRGLRNKRIGHFLFWFLRSEIAQSRHYQQRFAVILEAYLRGCGTA
MLHDFTQQVQVIEMLQKVTLDIKSLSAEKYDVSSQVISQLKQKLENLQNSQLPESFRVPYDPGLKAGALAIEKCKVMASKKKPLWLEFKCADPTALS
NETIGIIFKHGDDLRQDMLILQILRIMESIWETESLDLCLLPYGCISTGDKIGMIEIVKDATTIAKIQQSTVGNTGAFKDEVLNHWLKEKSPTEEKFQAA
VERFVYSCAGYCVATFVLGIGDRHNDNIMITETGNLFHIDFGHILGNYKSFLGINKERVPFVLTPDFLFVMGTSGKKTSPHFQKFQDICVKAYLALRH
HTNLLIILFSMMLMTGMPQLTSKEDIEYIRDALTVGKNEEDAKKYFLDQIEVCRDKGWTVQFNWFLHLVLGIKQGEKHSAHHHHHH

Nucleotide Sequence
H.sapiens mRNA for phosphatidylinositol 3 kinase gamma
GenBank: X83368.1
>X83368.1 H.sapiens mRNA for phosphatidylinositol 3 kinase gamma
GAATTCGGCACGAGCACTTCCTTCTCGGCTAGATTATCTGAAACTGTTGTCGGTTCTTGAGATGATACTA
CCACCGAATGTCTGTGTTTCATTGTCTAGTCCAACCTGTATTGTGGATATCTACAACGTTCCGGCAATAG
TTTTGCAGGTGCATCACATTTTTGTTTTTGTTTTGGGAGGAAAAGGGAGGGCACGGCAGCCAGGCTTCAT
ATTCCTACAAGTGCATGCTTCAAGATTACTGTACTTACAGTGTTTCCAACATCTTCTCATAAAAGGGGAA
AGCTTCATAGCCTCAACCATGAAGGAAACCAGTCGCATAGGGCATGGAGCTGGAGAACTATAAACAGCCC
GTGGTGCTGAGAGAGGACAACTGCCGAAGGCGCCGGAGGATGAAGCCGCGCAGTGCTGCCAGCCTGTCCT
CCATGGAGCTCATCCCCATCGAGTTCGTGCTGCCCACCAGCCAGCGCAAATGCAAGAGCCCCGAAACGGC
GCTGCTGCACGTGGCCGGCCACGGCAACGTGGAGCAGATGAAGGCCCAGGTGTGGCTGCGAGCGCTGGAG
ACCAGCGTGGCGGCGGACTTCTACCACCGGCTGGGACCGCATCACTTCCTCCTGCTCTATCAGAAGAAGG
GGCAGTGGTACGAGATCTACGACAAGTACCAGGTGGTGCAGACTCTGGACTGCCTGCGCTACTGGAAGGC
CACGCACCGGAGCCCGGGCCAGATCCACCTGGTGCAGCGGCACCCGCCCTCCGAGGAGTCCCAAGCCTTC
CAGCGGCAGCTCACGGCGCTGATTGGCTATGACGTCACTGACGTCAGCAACGTGCACGACGATGAGCTGG
AGTTCACGCGCCGTGGCTTGGTGACCCCGCGCATGGCGGAGGTGGCCAGCCGCGACCCCAAGCTCTACGC
CATGCACCCGTGGGTGACGTCCAAGCCCCTCCCGGAGTACCTGTGGAAGAAGATTGCCAACAACTGCATC
TTCATCGTCATTCACCGCAGCACCACCAGCCAGACCATTAAGGTCTCACCCGACGACACCCCCGGCGCCA
TCCTGCAGAGCTTCTTCACCAAGATGGCCAAGAAGAAATCTCTGATGGATATTCCCGAAAGCCAAAGCGA
ACAGGATTTTGTGCTGCGCGTCTGTGGCCGGGATGAGTACCTGGTGGGCGAAACGCCCATCAAAAACTTC
CAGTGGGTGAGGCACTGCCTCAAGAACGGAGAAGAGATTCACGTGGTACTGGACACGCCTCCAGACCCGG
CCCTAGACGAGGTGAGGAAGGAAGAGTGGCCGCTGGTGGACGACTGCACGGGAGTCACCGGCTACCATGA
GCAGCTTACCATCCACGGCAAGGACCACGAGAGTGTGTTCACCGTGTCCCTGTGGGACTGCGACCGCAAG
TTCAGGGTCAAGATCAGAGGCATTGATATCCCCGTCCTGCCTCGGAACACCGACCTCACAGTTTTTGTAG
AGGCAAACATCCAGCATGGGCAACAAGTCCTTTGCCAAAGGAGAACCAGCCCCAAACCCTTCACAGAGGA
GGTGCTGTGGAATGTGTGGCTTGAGTTCAGTATCAAAATCAAAGACTTGCCCAAAGGGGCTCTACTGAAC
CTCCAGATCTACTGCGGTAAAGCTCCAGCACTGTCCAGCAAGGCCTCTGCAGAGTCCCCCAGTTCTGAGT
CCAAGGGCAAAGTTCGGCTTCTCTATTATGTGAACCTGCTGCTGATAGACCACCGTTTCCTCCTGCGCCG
TGGAGAATACGTCCTCCACATGTGGCAGATATCTGGGAAGGGAGAAGACCAAGGAAGCTTCAATGCTGAC
AAACTCACGTCTGCAACTAACCCAGACAAGGAGAACTCAATGTCCATCTCCATTCTTCTGGACAATTACT
GCCACCCGATAGCCCTGCCTAAGCATCAGCCCACCCCTGACCCGGAAGGGGACCGGGTTCGAGCAGAAAT
GCCCAACCAGCTTCGCAAGCAATTGGAGGCGATCATAGCCACTGATCCACTTAACCCTCTCACAGCAGAG
GACAAAGAATTGCTCTGGCATTTTAGATACGAAAGCCTTAAGCACCCAAAAGCATATCCTAAGCTATTTA
GTTCAGTGAAATGGGGACAGCAAGAAATTGTGGCCAAAACATACCAATTGTTGGCCAGAAGGGAAGTCTG
GGATCAAAGTGCTTTGGATGTTGGGTTAACAATGCAGCTCCTGGACTGCAACTTCTCAGATGAAAATGTA
AGAGCCATTGCAGTTCAGAAACTGGAGAGCTTGGAGGACGATGATGTTCTGCATTACCTTCTACAATTGG
TCCAGGCTGTGAAATTTGAACCATACCATGATAGCGCCCTTGCCAGATTTCTGCTGAAGCGTGGTTTAAG
AAACAAAAGAATTGGTCACTTTTTGTTTTGGTTCTTGAGAAGTGAGATAGCCCAGTCCAGACACTATCAG
CAGAGGTTCGCTGTGATTCTGGAAGCCTATCTGAGGGGCTGTGGCACAGCCATGCTGCACGACTTTACCC
AACAAGTCCAAGTAATCGAGATGTTACAAAAAGTCACCCTTGATATTAAATCGCTCTCTGCTGAAAAGTA
TGACGTCAGTTCCCAAGTTATTTCACAACTTAAACAAAAGCTTGAAAACCTGCAGAATTCTCAACTCCCC
GAAAGCTTTAGAGTTCCATATGATCCTGGACTGAAAGCAGGAGCGCTGGCAATTGAAAAATGTAAAGTAA
TGGCCTCCAAGAAAAAACCACTATGGCTTGAGTTTAAATGTGCCGATCCTACAGCCCTATCAAATGAAAC
AATTGGAATTATCTTTAAACATGGTGATGATCTGCGCCAAGACATGCTTATTTTACAGATTCTACGAATC
ATGGAGTCTATTTGGGAGACTGAATCTTTGGATCTATGCCTCCTGCCATATGGTTGCATTTCAACTGGTG
ACAAAATAGGAATGATCGAGATTGTGAAAGACGCCACGACAATTGCCAAAATTCAGCAAAGCACAGTGGG
CAACACGGGAGCATTTAAAGATGAAGTCCTGAATCACTGGCTCAAAGAAAAATCCCCTACTGAAGAAAAG
TTTCAGGCAGCAGTGGAGAGATTTGTTTATTCCTGTGCAGGCTACTGTGTGGCAACCTTTGTTCTTGGAA
TAGGCGACAGACACAATGACAATATTATGATCACCGAGACAGGAAACCTATTTCATATTGACTTCGGGCA
CATTCTTGGGAATTACAAAAGTTTCCTGGGCATTAATAAAGAGAGAGTGCCATTTGTGCTAACCCCTGAC
TTCCTCTTTGTGATGGGAACTTCTGGAAAGAAGACAAGCCCACACTTCCAGAAATTTCAGGACATCTGTG
TTAAGGCTTATCTAGCCCTTCGTCATCACACAAACCTACTGATCATCCTGTTCTCCATGATGCTGATGAC
AGGAATGCCCCAGTTAACAAGCAAAGAAGACATTGAATATATCCGGGATGCCCTCACAGTGGGGAAAAAT
GAGGAGGATGCTAAAAAGTATTTTCTTGATCAGATCGAAGTTTGCAGAGACAAAGGATGGACTGTGCAGT
TTAATTGGTTTCTACATCTTGTTCTTGGCATCAAACAAGGAGAGAAACATTCAGCCTAATACTTTAGGCT
AGAATCAAAAACAAGTTAGTGTTCTATGGTTTAAATTAGCATAGCAATCATCGAACTTGGATTTCAAATG
CAATAGACATTGTGAAAGCTGGCATTTCAGAAGTATAGCTCTTTTCCTACCTGAACTCTTCCCTGGAGAA
AAGATGTTGGCATTGCTGATTGTTTGGTTAAGCAATGTCCAGTGCTAGGATTATTTGCAGGTTTGGTTTT
TTCTCATTTGTCTGTGGCATTGGAGAATATTCTCGGTTTAAACAGACTAATGACTTCCTTATTGTCCCTG
ATATTTTGACTATCTTACTATTGAGTGCTTCTGGAAATTCTTTGGAATAATTGATGACATCTATTTTCAT
CTGGGTTTAGTCTCAATTTTGGTTATCTTTGTGTTCCTCAAGCTCTTTAAAGAAAAAGATGTAATCGTTG
TAACCTTTGTCTCATTCCTTAAATGATGCTTCCAAACATCTCCTTAGTGTCTGCAGGTGTTAGTGGTGTG
CTAAAAGCAAGGAAAGCGAGTTAGTCTTTTCAGTGTCTTTTGCAATTCAATTCTTTTGTCATGTATAACT
GAGACACACAAACACAGCAGGAGAAATCTAAACCGTTGTGCCTTGACCTTCCTCTGCTGGTCTTGTTCCA
GGGTTATGAATATGAAAAAATAGAGATGAGACTTTTTGTGTCAACTCTGTCCACAAGAGTGAGTTATCTA
GTATGATTAGTATAGCTTTCTCCAGCATGGCAGCAGGAAGTAACTACAGGGCCTCTTTTATGCCTGACAT
TTCTTCCCTTCCTTTTTCCCTGCCTCCCTTTTTCATCAATTGCAATGCTCCCACAACTCTTTACAGACTT
GTGAAATCTTCAAGAACACCTTTACTCTATAACTCAAAAATTAGTTGAAAAATAATTACTTCTCAAGGAT
TATTAGAATCTTAGGTACTTATTTGTAAAGATGTTTAGTGACTTTTTTTTCAAGTATCTATAAAGGAGGC
AGATTCTAGAAAATATGAATTAGTTTCCAAATGCCTTAATTTTAAACTTTGGCCTGAACAGTTTTTTCTT
TTTCTTAATGGAAGAAGATATTTAATATCTTAAAAATATTCCAAGTTAGGAAGAACACTACTTGCCTTAT
CCATTTCCCATTTAAAGGACTTTTAAACTTTGACACAGTCCTTCAGATTTCCTGAAAATCCTTGAAATAT
CTTACTTTAAAAATATTTTCATCTCTGAAATATCTCGTTATTTATTGGAGGTATTGTTTAACCTTAGATA
GACCATTAAATTATTTATAAAATATTTTGTAATTACTGTAGCTAATACATTACATAGAAAAAACTATGTT
AACAGTGTCTCTGTTTAAGTATAATCAGATATAAATATATAACTTAATTTTTTAATTTTAAAAAATAGAT
ACCTGTTTGACTTTGAGGTAGTCCAGGCCTTTTTCTTTTTTTTTTTTTTTAATGTGTGCAAAAGCCCAAA
GGTTCCTAAGCCTGGCTGCAAAGAAGAATCAACAGGGACACTTTTTAAAAACACTCTTATCAGCCTGGGG
CAACACAGTGAGACTCCATCTCTTAAAAAAAAAATTAGCTGGGTATAGTGGTATGTGCCTGTAGTCCCAG
GTACTCAGGAGGCTGAGGCAGGAGGATTGCCTGAGCCCAGGAGGTGGAAACTGCAGAGAGTCATGATCAT
GTCCTTACACTCCAGCCTGGATAACAGAGCGAGACCCTGTCTCAAAAAAAAAAAAAAAAAAAAAAAAAAA
ACTCGAG

2. NCBI BLAST Results for the nucleotide sequence of the target protein

RID: 9K5ENA9Y013 (Expires on 06-04 13:02 pm)
Query ID: lcl|Query_34595
Description: X83368.1 H.sapiens mRNA for phosphatidylinositol 3 kinase gamma
Molecule type: dna
Query Length: 5397
Database Name: nt
Database Description: Nucleotide collection (nt)
Program: BLASTN 2.13.0+
First 10 sequences producing significant alignments:

3. Multiple Sequence Alignment in ClustalW

Sequence type explicitly set to Protein


Sequence format is Pearson

Sequence 1: X83368.1 5397 aa


Sequence 2: NM_002649.3 7218 aa
Sequence 3: XM_001162884.6 7226 aa
Sequence 4: XM_003811196.4 7115 aa
Sequence 5: XM_004046020.3 5361 aa
Sequence 6: XM_002818334.3 5365 aa
Sequence 7: XM_032761353.1 7162 aa
Sequence 8: BX648341.1 5214 aa
Sequence 9: XM_012512141.2 5362 aa
Sequence 10: NM_001282427.2 6983 aa

Sequences (1:2) Aligned. Score: 98.7215


(Partial alignment)
Sequences (1:3) Aligned. Score: 83.6946
(Partial alignment)
Sequences (1:4) Aligned. Score: 98.0174
Sequences (1:5) Aligned. Score: 97.9668
(Partial alignment)
Sequences (1:6) Aligned. Score: 78.1361
(Partial alignment)
Sequences (1:7) Aligned. Score: 96.3128
(Partial alignment)
Sequences (1:8) Aligned. Score: 79.8427
Sequences (1:9) Aligned. Score: 96.5871
(Partial alignment)
Sequences (1:10) Aligned. Score: 52.9183
(Partial alignment)
Sequences (2:3) Aligned. Score: 41.826
(Partial alignment)
Sequences (2:4) Aligned. Score: 53.858
(Partial alignment)
Sequences (2:5) Aligned. Score: 55.8664
(Partial alignment)
Sequences (2:6) Aligned. Score: 55.5825
(Partial alignment)
Sequences (2:7) Aligned. Score: 45.3225
(Partial alignment)
Sequences (2:8) Aligned. Score: 76.6015
(Partial alignment)
Sequences (2:9) Aligned. Score: 70.3842
(Partial alignment)
Sequences (2:10) Aligned. Score: 42.8183
(Partial alignment)
Sequences (3:4) Aligned. Score: 59.8313
(Partial alignment)
Sequences (3:5) Aligned. Score: 55.7918
(Partial alignment)
Sequences (3:6) Aligned. Score: 55.5079
(Partial alignment)
Sequences (3:7) Aligned. Score: 46.0765
(Partial alignment)
Sequences (3:8) Aligned. Score: 59.4361
(Partial alignment)
Sequences (3:9) Aligned. Score: 63.9314
(Partial alignment)

Sequences (3:10) Aligned. Score: 42.6464
(Partial alignment)
Sequences (4:5) Aligned. Score: 55.8851
(Partial alignment)
Sequences (4:6) Aligned. Score: 55.6384
(Partial alignment)
Sequences (4:7) Aligned. Score: 52.818
(Partial alignment)
Sequences (4:8) Aligned. Score: 56.2908
(Partial alignment)
Sequences (4:9) Aligned. Score: 76.3148
(Partial alignment)
Sequences (4:10) Aligned. Score: 53.9453
(Partial alignment)
Sequences (5:6) Aligned. Score: 55.6799
(Partial alignment)
Sequences (5:7) Aligned. Score: 78.0824
(Partial alignment)
Sequences (5:8) Aligned. Score: 56.1181
(Partial alignment)
Sequences (5:9) Aligned. Score: 75.1352
(Partial alignment)
Sequences (5:10) Aligned. Score: 55.3815
(Partial alignment)
Sequences (6:7) Aligned. Score: 78.1547
(Partial alignment)
Sequences (6:8) Aligned. Score: 75.8343
Sequences (6:9) Aligned. Score: 97.7247
(Partial alignment)
Sequences (6:10) Aligned. Score: 53.0662
(Partial alignment)
Sequences (7:8) Aligned. Score: 75.8343
(Partial alignment)
Sequences (7:9) Aligned. Score: 45.1138
(Partial alignment)
Sequences (7:10) Aligned. Score: 46.4414
(Partial alignment)
Sequences (8:9) Aligned. Score: 78.6728
(Partial alignment)
Sequences (8:10) Aligned. Score: 77.1385
(Partial alignment)
Sequences (9:10) Aligned. Score: 68.743
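
The pairwise scores listed above are easier to compare once they are collected into a lookup table. The following is a minimal sketch, not part of the original analysis, that assumes the log lines keep the form "Sequences (i:j) Aligned. Score: x" shown above. The function names `parse_pairwise_scores` and `closest_pair` are hypothetical helpers introduced only for illustration.

```python
import re

# Matches ClustalW pairwise log lines such as
# "Sequences (1:4) Aligned. Score: 98.0174"
PAIR_RE = re.compile(r"Sequences \((\d+):(\d+)\) Aligned\. Score:\s*([\d.]+)")


def parse_pairwise_scores(log_text):
    """Return {(i, j): score} for every pairwise line found in log_text."""
    scores = {}
    for match in PAIR_RE.finditer(log_text):
        i = int(match.group(1))
        j = int(match.group(2))
        scores[(i, j)] = float(match.group(3))
    return scores


def closest_pair(scores):
    """Return ((i, j), score) for the highest-scoring pair."""
    return max(scores.items(), key=lambda item: item[1])


# A few lines copied from the log above; "(Partial alignment)" lines are skipped
# automatically because they do not match the pattern.
sample = """Sequences (1:4) Aligned. Score: 98.0174
(Partial alignment)
Sequences (1:5) Aligned. Score: 97.9668
Sequences (2:3) Aligned. Score: 41.826
"""

scores = parse_pairwise_scores(sample)
best_pair, best_score = closest_pair(scores)
print(best_pair, best_score)
```

Run on the complete log, the same two calls would give the full set of pairwise scores and the most similar pair of sequences.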

There are 9 groups


Start of Multiple Alignment

Group 1: Sequences: 2 Score:89854
Group 2: Sequences: 2 Score:84764
Group 3: Sequences: 2 Score:90743
Group 4: Sequences: 2 Score:103223
Group 5: Sequences: 4 Score:119289
Group 6: Sequences: 6 Score:127918
Group 7: Sequences: 8 Score:103758
Group 8: Sequences: 2 Score:100783
Group 9: Sequences: 10 Score:119988
Alignment Score 1748112

CLUSTAL 2.1 multiple sequence alignment
XM_002818334.3 -----------------------------------GCACTTCCTTCTCAGCTAGATTATT
XM_012512141.2 -----------------------------------GCACTTCCTTCTCGGCTAGATTATT
XM_001162884.6 TTTTTGCTCTTTGTGGGGTCTGACTCGGAATAGTGGCACTTCCTTCTCGGCTAGATTAT-
XM_003811196.4 -----------------------------------GCACTTCCTTCTCGGCTAGATTAT-
X83368.1 ----------------------GAATTCGGCACGAGCACTTCCTTCTCGGCTAGATTAT-
NM_002649.3 -----------------------------------GCACTTCCTTCTCGGCTAGATTAT-
XM_004046020.3 -----------------------------------GCACTTCCTTCTCGGCTAGATTATT
XM_032761353.1 -----------------------------------GCACTTCCTTCTCTGCTAGATTATT
BX648341.1 ------------------------------------------------------------
NM_001282427.2 ------------------------------------------------------------

XM_002818334.3 TTTATCTGAAACTGTTGTCGGTTCTCGAGATGATACTACCACAGAATGTCTGTGTTTCAT
XM_012512141.2 TTTATCTGAAACTGTTGTCGGTTCTCGAGATGATACTACCACCGAATGTCTGTGTTTCAT
XM_001162884.6 -----CTGAAACTGTTGTCGGTTCTTGAGATGATACTACCACCAAATGTCTGTGTTTCAT
XM_003811196.4 -----CTGAAACTGTTGTCGGTTCTTGAGATGATACTACCACCAAATGTCTGTGTTTCAT
X83368.1 -----CTGAAACTGTTGTCGGTTCTTGAGATGATACTACCACCGAATGTCTGTGTTTCAT
NM_002649.3 -----CTGAAACTGTTGTCGGTTCTTGAGATGATACTACCACCGAATGTCTGTGTTTCAT
XM_004046020.3 TTTATCTGAAACT--TGCCGGTTCTTGAGATGATACTACCACCGAATGTCTGTGTTTCAT
XM_032761353.1 TTTATCTGAAACTGTTGTCGGTTCTCGAGATGATACTACCACCGAATGTCTGTGTTTCAT
BX648341.1 ------------------------------------------------------------
NM_001282427.2 ------------------------------------------------------GCAACA

XM_002818334.3 TGTCTAGTCCAACCTGTATTGTGGATATCCACAAGGTTCCGGCAATAGTTTTGCAGGTGC
XM_012512141.2 TGTCTAGTCCAACCTGTATTGTGGATATCCACAAGGTTCCGGCAATAGTTTTGCAGGTGC
XM_001162884.6 TGTCTAGTCCAACCTGTATTGTGGATATCTACAAGGTTCCGGCAATAGTTTTGCAGGTGC
XM_003811196.4 TGTCTAGTCCAACCTGTATTGTGGATATCTACAAGGTTCCGGCAATAGTTTTGCAGGTGC
X83368.1 TGTCTAGTCCAACCTGTATTGTGGATATCTACAACGTTCCGGCAATAGTTTTGCAGGTGC
NM_002649.3 TGTCTAGTCCAACCTGTATTGTGGATATCTACAACGTTCCGGCAATAGTTTTGCAGGTGC
XM_004046020.3 TGTCTAGTCCAACCTGTATTGTGGATATCCACAAGGTTCCGGCAATAGTTTTGCAGGTGC
XM_032761353.1 TGTCTAGTCCAACCTGTATTGTGGATATCCACAAGGTTCCGGCAATAGTTTTGCAGGTGC
BX648341.1 -------GGCATCTGGATATGAAGGGAGCCCCAGAAAAGCGGAAGAATTTAGACG-----
NM_001282427.2 CTTCCTCTGCATCTGGATATGAAGGGAGCCCCAGAAAAGCGGAAGAATTTAGACG-----
**:* *:::**:.*. * * .**. .:: ***.*.:* **: .*.

XM_002818334.3 ATCACATTTTTGTTTTTGTTTTGGGAGGAAAAGCGAGGGCACGGCAGCCAGGCTTCATAT
XM_012512141.2 ATCAGATTTTTGTTTTCATTTTGGGAGGAAAAGGGAGGGCACGGCAGCCAGGCTTCATAT
XM_001162884.6 ATCACATTTTTGTTTTTGTTTTGGGAGGAAAAGGGAGGGCACGGCAGCCAGGCTTCATAT
XM_003811196.4 ATCACATTTTTGTTTTTGTTTTGGGAGGAAAAGGGAGGGCACGGCAGCCAGGCTTCATAT
X83368.1 ATCACATTTTTGTTTTTGTTTTGGGAGGAAAAGGGAGGGCACGGCAGCCAGGCTTCATAT
NM_002649.3 ATCACATTTTTGTTTTTGTTTTGGGAGGAAAAGGGAGGGCACGGCAGCCAGGCTTCATAT
XM_004046020.3 ATCACATTTTTGTTTTTGTTTTGGGAGGAAAAGGGAAGGCACGGCAGCCAGGCTTCATAT
XM_032761353.1 ATCAGATTTTTGTTTTCATTTTGGGAGGAAAAGGGAGGGCACGGCAGCCAGGCTTCATAT
BX648341.1 ------------------------------------------------------------
NM_001282427.2 ------------------------------------------------------------

XM_002818334.3 TCCTACAAGTGCATGCTTCAAGATTACTGTACTTACAGTGTTTCCAACATCTTCTCATGA
XM_012512141.2 TCCTACAAGTGCATGCTTCAAGATTACTGTACTTACAGTGTTTCCAACATCTTCTCATAA
XM_001162884.6 TCCTACAAGTGCATGCTTCAAGATTACTGTACTTACAGTGTTTCCAACATCTTCTCATAA
XM_003811196.4 TCCTACAAGTGCATGCTTCAAGATTACTGTACTTACAGTGTTTCCAACATCTTCTCATAA
X83368.1 TCCTACAAGTGCATGCTTCAAGATTACTGTACTTACAGTGTTTCCAACATCTTCTCATAA
NM_002649.3 TCCTACAAGTGCATGCTTCAAGATTACTGTACTTACAGTGTTTCCAACATCTTCTCATAA
XM_004046020.3 TCCTACAAGTGCATGCTTCAAGATTACTGTACTTACAGTGTTTCCAACATCTTCTCATAA
XM_032761353.1 TCCTACAAGTGCATGCTTCAAGATTACTGTACTTACAGTGTTTCCAACATCTTCTCATAA
BX648341.1 ---------------CACACCGGGTAGGTTTGAATTTGTTTTGTTTTCAAAAATTAAACA
NM_001282427.2 ---------------CACACTGG-------------------------------------
*: .. *.

XM_002818334.3 AAGGGGAAAGCTTCATAGCCTCAACCATGAAGGAAACCAGTCGCATAGGGCATGGAGCTG
XM_012512141.2 AAGGGAAAAGCTTCATAGCCTCAACCATGAAGGAAACCAGTCGCATAGGGCATGGAGCTG
XM_001162884.6 AAGGGGAAAGCTTCATAGCCTCAACCATGAAGGAAACCAGTCGCATAGGGCATGGAGCTG
XM_003811196.4 AAGGGGAAAGCTTCATAGCCTCAACCATGAAGGAAACCAGTCGCATAGGGCATGGAGCTG
X83368.1 AAGGGGAAAGCTTCATAGCCTCAACCATGAAGGAAACCAGTCGCATAGGGCATGGAGCTG
NM_002649.3 AAGGGGAAAGCTTCATAGCCTCAACCATGAAGGAAACCAGTCGCATAGGGCATGGAGCTG
XM_004046020.3 AAGGGGAAAGCTTCATAGCCTCAACCATGAAGGAAACCAGTCGCATAGGGCATGGAGCTG
XM_032761353.1 AAGGGAAAAGCTTCATAGCCTCAACCATGAAGGAAACCAGTCGCATAGGGCATGGAGCTG
BX648341.1 AATGATCCTTCAGCATCATCGCCTCCGCTGCTTTATCAGGTCGCATAGGGCATGGAGCTG
NM_001282427.2 ---------------------------------------GTCGCATAGGGCATGGAGCTG
*********************

XM_002818334.3 GAAAACTATAAACAGCCCGTGGTGCTGAGAGAGGACAACTGCCGCAGGCGCCGGAGGATG
XM_012512141.2 GAGAACTATGAACAGCCCGTGGTGCTGAGAGAGGACAACTGCCGCAGGCGCCGGAGGATG
XM_001162884.6 GAGAACTATAAACAGCCCGTGGTGCTGAGAGAGGACAACTGCCGAAGGCGCCGGAGGATG
XM_003811196.4 GAGAACTATAAACAGCCCGTGGTGCTGAGAGAGGACAACTGCCGAAGGCGCCGGAGGATG
X83368.1 GAGAACTATAAACAGCCCGTGGTGCTGAGAGAGGACAACTGCCGAAGGCGCCGGAGGATG
NM_002649.3 GAGAACTATAAACAGCCCGTGGTGCTGAGAGAGGACAACTGCCGAAGGCGCCGGAGGATG
XM_004046020.3 GAGAACTATAAACAGCCCGTGGTGCTGAGAGAGGACAACTGCCGAAGGCGCCGGAGGATG
XM_032761353.1 GAGAACTATGAACAGCCCGTGGTGCTGAGAGAGGACAACTGCCGCAGGCGCCGGAGGATG
BX648341.1 GAGAACTATAAACAGCCCGTGGTGCTGAGAGAGGACAACTGCCGAGG-CGCCGGAGGATG
NM_001282427.2 GAGAACTATAAACAGCCCGTGGTGCTGAGAGAGGACAACTGCCGAAGGCGCCGGAGGATG
**.******.**********************************..* ************

XM_002818334.3 AAGCCGCGCAGTGCTGCAGCCAGCCTGTCCTCCATGGAGCTCATCCCCATCGAGTTCGTG
XM_012512141.2 AAGCCGCGCAGTGCTGCGGCCAGCCTGTCCTCCATGGAGCTCATCCCCATCGAATTCGTG
XM_001162884.6 AAGCCGCGCAGTGCTGCGGCCAGCCTGTCCTCCATGGAGCTCATCCCCATCGAGTTCGTG
XM_003811196.4 AAGCCGCGCAGTGCTGCGGCCAGCCTGTCCTCCATGGAGCTCATCCCCATCGAGTTCGTG
X83368.1 AAGCCGCGCAGTGCT---GCCAGCCTGTCCTCCATGGAGCTCATCCCCATCGAGTTCGTG
NM_002649.3 AAGCCGCGCAGTGCTGCGGCCAGCCTGTCCTCCATGGAGCTCATCCCCATCGAGTTCGTG
XM_004046020.3 AAGCCGCGCAGTGCTGCGGCCAGCTTGTCCTCCATGGAGCTCATCCCCATCGAGTTCGTA
XM_032761353.1 AAGCCGCGCAGTGCTGCGGCCAGCCTGTCCTCCATGGAGCTCATCCCCATCGAATTCGTG
BX648341.1 AAGCCGCGCAGTGCTGCGGCCAGCCTGTCCTCCATGGAGCTCATCCCCATCGAGTTCGTG
NM_001282427.2 AAGCCGCGCAGTGCTGCGGCCAGCCTGTCCTCCATGGAGCTCATCCCCATCGAGTTCGTG
*************** ****** ****************************.*****.

XM_002818334.3 CTGCCCACCAGCCAGCGCAAATGCAAGAGCCCCGAAACGGCGCTGCTGCACGTGGCGGGC
XM_012512141.2 CTGCCCACCAGCCAGCGCAAATGCAAGAGCCCCGAAACGGCGCTGCTGCACGTGGCCGGC
XM_001162884.6 CTGCCCACCAGCCAGCGCAAATGCAAGAGCCCCGAAACGGCGCTGCTGCACGTGGCCGGC
XM_003811196.4 CTGCCCACCAGCCAGCGCAAATGCAAGAGCCCCGAAACGGCGCTGCTGCACGTGGCCGGC
X83368.1 CTGCCCACCAGCCAGCGCAAATGCAAGAGCCCCGAAACGGCGCTGCTGCACGTGGCCGGC
NM_002649.3 CTGCCCACCAGCCAGCGCAAATGCAAGAGCCCCGAAACGGCGCTGCTGCACGTGGCCGGC
XM_004046020.3 CTGCCCACCAGCCAGCGCAAATGCAAGAGCCCCGAAACGGCGCTGCTACACGTGGCGGGC
XM_032761353.1 CTGCCCACCAGCCAGCGCAAATGCAAGAGCCCCGAAACGGCGCTGCTGCACGTGGCCGGC
BX648341.1 CTGCCCACCAGCCAGCGCAAATGCAAGAGCCCCGAAACGGCGCTGCTGCACGTGGCCGGC
NM_001282427.2 CTGCCCACCAGCCAGCGCAAATGCAAGAGCCCCGAAACGGCGCTGCTGCACGTGGCCGGC
***********************************************.******** ***

XM_002818334.3 CACGGCAACGTGGAGCAGATGAAGGCCCAGGTGTGGCTGCGAGCTCTGGAGACCAGCGTG
XM_012512141.2 CACGGCAACGTGGAGCAGATGAAGGCCCAGGTGTGGCTGCGAGCGCTGGAGACCAGCGTG
XM_001162884.6 CACGGCAACGTGGAGCAGATGAAGGCCCAGGTGTGGCTGCGAGCGCTGGAGACCAGCGTG
XM_003811196.4 CACGGCAACGTGGAGCAGATGAAGGCCCAGGTGTGGCTGCGAGCGCTGGAGACCAGCGTG
X83368.1 CACGGCAACGTGGAGCAGATGAAGGCCCAGGTGTGGCTGCGAGCGCTGGAGACCAGCGTG
NM_002649.3 CACGGCAACGTGGAGCAGATGAAGGCCCAGGTGTGGCTGCGAGCGCTGGAGACCAGCGTG
XM_004046020.3 CACGGCAACGTGGAGCAGATGAAGGCCCAGGTGTGGCTGCGAGCACTGGAGACCAGCGTG
XM_032761353.1 CACGGCAACGTGGAGCAGATGAAGGCCCAGGTGTGGCTGCGAGCGCTGGAGACCAGCGTG
BX648341.1 CACGGCAACGTGGAGCAGATGAAGGCCCAGGTGTGGCTGCGAGCGCTGGAGACCAGCGTG
NM_001282427.2 CACGGCAACGTGGAGCAGATGAAGGCCCAGGTGTGGCTGCGAGCGCTGGAGACCAGCGTG
******************************************** ***************

XM_002818334.3 GCAGCGGACTTCTACCACCGGCTGGGCCCTGACCACTTCCTCCTGCTCTATCAGAAGAAG
XM_012512141.2 GCGGCGGACTTCTACCACCGGCTGGGCCCGGACCACTTCCTCCTGCTCTATCAGAAGAAG
XM_001162884.6 GCGGCGGACTTCTACCACCGGCTGGGGCCGGATCACTTCCTCCTGCTCTATCAGAAGAAG
XM_003811196.4 GCGGCGGACTTCTACCACCGGCTGGGGCCGGATCACTTCCTCCTGCTCTATCAGAAGAAG
X83368.1 GCGGCGGACTTCTACCACCGGCTGGGACCGCATCACTTCCTCCTGCTCTATCAGAAGAAG
NM_002649.3 GCGGCGGACTTCTACCACCGGCTGGGACCGCATCACTTCCTCCTGCTCTATCAGAAGAAG
XM_004046020.3 GCGGCGGACTTCTACCACCGGCTGGGCCCGGACCACTTCCTCCTGCTCTATCAGAAGAAG
XM_032761353.1 GCGGCGGACTTCTACCACCGGCTGGGCCCGGACCACTTCCTCCTGCTGTATCAGAAGAAG
BX648341.1 GCGGCGGACTTCTACCACCGGCTGGGACCGCATCACTTCCTCCTGCTCTATCAGAAGAAG
NM_001282427.2 GCGGCGGACTTCTACCACCGGCTGGGACCGCATCACTTCCTCCTGCTCTATCAGAAGAAG
**.*********************** ** * ************** ************

XM_002818334.3 GGGCAGTGGTACGAGATCTACGACAAGTACCAGGTGGTGCAGACTCTGGACTGCCTGCGC
XM_012512141.2 GGGCAGTGGTACGAGATCTACGACAAGTACCAGGTGGTGCAGACTCTGGACTGCCTGCAC
XM_001162884.6 GGGCAGTGGTACGAGATCTACGACAAGTACCAGGTGGTGCAGACTCTGGACTGCCTGCGC
XM_003811196.4 GGGCAGTGGTACGAGATCTACGACAAGTACCAGGTGGTGCAGACTCTGGACTGCCTGCGC
X83368.1 GGGCAGTGGTACGAGATCTACGACAAGTACCAGGTGGTGCAGACTCTGGACTGCCTGCGC
NM_002649.3 GGGCAGTGGTACGAGATCTACGACAAGTACCAGGTGGTGCAGACTCTGGACTGCCTGCGC
XM_004046020.3 GGGCAGTGGTACGAGATCTACGACAAGTACCAGGTGGTGCAGACTCTGGACTGCCTGCGC
XM_032761353.1 GGGCAGTGGTACGAAATCTACGACAAGTACCAGGTGGTGCAGACTCTGGACTGCCTGCGC
BX648341.1 GGGCAGTGGTACGAGATCTACGACAAGTACCAGGTGGTGCAGACTCTGGACTGCCTGCGC
NM_001282427.2 GGGCAGTGGTACGAGATCTACGACAAGTACCAGGTGGTGCAGACTCTGGACTGCCTGCGC
**************.*******************************************.*

XM_002818334.3 TACTGGAAGGCCACACACCGGAGCCCGGGCCAGATCCACCTGGTGCAGCGGCGCCCGCCC
XM_012512141.2 TACTGGAAGGCCACGCACAGGAGCCCGGGCCAGATCCACCTGGTGCAGCGGCGTCCGCCC
XM_001162884.6 TACTGGAAGGCCACGCACCGGAGCCCGGGCCAGATCCACCTGGTGCAGCGGCACCCGCCC
XM_003811196.4 TACTGGAAGGCCACGCACCGGAGCCCGGGCCAGATCCACCTGGTGCAGCGGCACCCGCCC
X83368.1 TACTGGAAGGCCACGCACCGGAGCCCGGGCCAGATCCACCTGGTGCAGCGGCACCCGCCC
NM_002649.3 TACTGGAAGGCCACGCACCGGAGCCCGGGCCAGATCCACCTGGTGCAGCGGCACCCGCCC
XM_004046020.3 TACTGGAAGGCCACGCACCGGAGCCCGGGCCAGATCCACCTGGTGCAGCGGCACCCGCCC
XM_032761353.1 TACTGGAAGGCCACGCACCGGAGCCCGGGTCAGATCCACCTGGTGCAGCGGCGTCCGCCC
BX648341.1 TACTGGAAGGCCACGCACCGGAGCCCGGGCCAGATCCACCTGGTGCAGCGGCACCCGCCC
NM_001282427.2 TACTGGAAGGCCACGCACCGGAGCCCGGGCCAGATCCACCTGGTGCAGCGGCACCCGCCC
**************.***.********** **********************. ******

XM_002818334.3 TCCGAAGAGTCGCAAGCCTTCCAGCGGCAGCTCACCGCGCTGATTGGCTATGACGTCACT
XM_012512141.2 TCTGAGGAGTCGCAAGCGTTCCAGCGGCAGCTCACCGCGCTGATCGGCTATGACGTCACG
XM_001162884.6 TCCGAGGAGTCCCAAGCCTTCCAGCGGCAGCTCACGGCGCTGATTGGCTATGACGTCACT
XM_003811196.4 TCCGAGGAGTCCCAAGCCTTCCAGCGGCAGCTCACGGCGCTGATTGGCTATGACGTCACT
X83368.1 TCCGAGGAGTCCCAAGCCTTCCAGCGGCAGCTCACGGCGCTGATTGGCTATGACGTCACT
NM_002649.3 TCCGAGGAGTCCCAAGCCTTCCAGCGGCAGCTCACGGCGCTGATTGGCTATGACGTCACT
XM_004046020.3 TCCGAGGAGTCCCAAGCCTTCCAGCGACAGCTCACCGGGCTGATTGGCTATGACGTCACT
XM_032761353.1 TCTGAGGAGTCGCAAACGTTCCAGCGGCAGCTCACCGCGCTGATCGGCTATGACGTCACG
BX648341.1 TCCGAGGAGTCCCAAGCCTTCCAGCGGCAGCTCACGGCGCTGATTGGCTATGACGTCACT
NM_001282427.2 TCCGAGGAGTCCCAAGCCTTCCAGCGGCAGCTCACGGCGCTGATTGGCTATGACGTCACT
** **.***** ***.* ********.******** * ****** **************

XM_002818334.3 GACGTCAGCAACGTGCACGACGACGAGCTGGAGTTCACGCGCCGTGGCTTGGTGACCCCG
XM_012512141.2 GACGTCAGCAACGTGCACGACGACGAGCTGGAGTTCACGCGCCGTGGCTTGGTGACCCCG
XM_001162884.6 GACGTCAGCAACGTGCACGACGATGAGCTGGAGTTCACGCGCCGTGGCTTGGTGACCCCG
XM_003811196.4 GACGTCAGCAACGTGCACGACGATGAGCTGGAGTTCACGCGCCGTGGCTTGGTGACCCCG
X83368.1 GACGTCAGCAACGTGCACGACGATGAGCTGGAGTTCACGCGCCGTGGCTTGGTGACCCCG
NM_002649.3 GACGTCAGCAACGTGCACGACGATGAGCTGGAGTTCACGCGCCGTGGCTTGGTGACCCCG
XM_004046020.3 GACGTCAGCAACGTGCACGACGATGAGCTGGAGTTCACGCGCCGTGGCTTGGTGACCCCG
XM_032761353.1 GACGTCAGCAACGTGCACGACGACGAGCTGGAGTTCACGCGCCGTGGCTTGGTGACCCCG
BX648341.1 GACGTCAGCAACGTGCACGACGATGAGCTGGAGTTCACGCGCCGTGGCTTGGTGGCCCCG
NM_001282427.2 GACGTCAGCAACGTGCACGACGATGAGCTGGAGTTCACGCGCCGTGGCTTGGTGACCCCG
*********************** ******************************.*****

XM_002818334.3 CGCATGGCGGAGGTGGCCAGCCGCGACCCCAAGCTCTACGCCATGCACCCGTGGGTGACG
XM_012512141.2 CGCATGGCAGAGGTGGCCAGCCGCGACCCCAAGCTCTACGCCATGCACCCCTGGGTGACG
XM_001162884.6 CGCATGGCGGAGGTGGCCAGCCGCGACCCCAAGCTCTATGCCATGCACCCGTGGGTGACG
XM_003811196.4 CGCATGGCGGAGGTGGCCAGCCGCGACCCCAAGCTCTATGCCATGCACCCGTGGGTGACG
X83368.1 CGCATGGCGGAGGTGGCCAGCCGCGACCCCAAGCTCTACGCCATGCACCCGTGGGTGACG
NM_002649.3 CGCATGGCGGAGGTGGCCAGCCGCGACCCCAAGCTCTACGCCATGCACCCGTGGGTGACG
XM_004046020.3 CGCATGGCGGAGGTGGCCAGCCGAGACCCCAAGCTCTACGCCATGCACCCGTGGGTGACG
XM_032761353.1 CGCATGGCAGAGGTGGCCAGCCGCGACCCCAAGCTCTACGCCATGCACCCCTGGGTGACG
BX648341.1 CGCATGGCGGAGGTGGCCAGCCGCGACCCCAAGCTCTACGCCATGCACCCGTGGGTGACG
NM_001282427.2 CGCATGGCGGAGGTGGCCAGCCGCGACCCCAAGCTCTACGCCATGCACCCGTGGGTGACG
********.**************.************** *********** *********

XM_002818334.3 TCCAAGCCCCTCCCGGAGTACCTGTGGAAGAAGATTGCCAACAACTGCATCTTCATCGTC
XM_012512141.2 TCCAAGCCCCTCCCGGAGTACCTGTGGAAGAAGATTGCCAACAACTGCATCTTCATCATC
XM_001162884.6 TCCAAGCCCCTCCCGGAGTACCTGTGGAAGAAGATTGCCAACAACTGCATCTTCATCGTC
XM_003811196.4 TCCAAGCCCCTCCCGGAGTACCTGTGGAAGAAGATTGCCAACAACTGCATCTTCATCGTC
X83368.1 TCCAAGCCCCTCCCGGAGTACCTGTGGAAGAAGATTGCCAACAACTGCATCTTCATCGTC
NM_002649.3 TCCAAGCCCCTCCCGGAGTACCTGTGGAAGAAGATTGCCAACAACTGCATCTTCATCGTC
XM_004046020.3 TCCAAGCCCCTCCCGGAGTACCTGTGGAAGAAGATTGCCAACAACTGCATCTTCATCGTC
XM_032761353.1 TCCAAGCCCCTCCCGGAGTACCTGTGGAAGAAGATTGCCAACAACTGCATCTTCATCGTC
BX648341.1 TCCAAGCCCCTCCCGGAGTACCTGTGGAAGAAGATTGCCAACAACTGCATCTTCATCGTC
NM_001282427.2 TCCAAGCCCCTCCCGGAGTACCTGTGGAAGAAGATTGCCAACAACTGCATCTTCATCGTC
*********************************************************.**

XM_002818334.3 ATTCACCGCAGCACCACCAGCCAGACCATTAAGGTCTCGCCCGACGACACCCCCGGCACC
XM_012512141.2 ATTCACCGCAGCACCACCAGCCAGACCATTAAGGTCTCGCCCGACGACACCCCCGGCGAC
XM_001162884.6 ATTCACCGCAGCACCACCAGCCAGACCATTAAGGTCTCACCCGACGACACCCCCGGCGCC
XM_003811196.4 ATTCACCGCAGCACCACCAGCCAGACCATTAAGGTCTCACCCGACGACACCCCCGGCGCC
X83368.1 ATTCACCGCAGCACCACCAGCCAGACCATTAAGGTCTCACCCGACGACACCCCCGGCGCC
NM_002649.3 ATTCACCGCAGCACCACCAGCCAGACCATTAAGGTCTCACCCGACGACACCCCCGGCGCC
XM_004046020.3 ATTCACCGCAGCACCACCAGCCAGACCATTAAGGTCTCACCCGACGACACCCCCGGCGCC
XM_032761353.1 ATTCACCGCAGCACCACCAGCCAGACCATTAAGGTCTCGCCCGACGACACCCCCGGCGAC
BX648341.1 ATTCACCGCAGCACCACCAGCCAGACCATTAAGGTCTCACCCGACGACACCCCCGGCGCC
NM_001282427.2 ATTCACCGCAGCACCACCAGCCAGACCATTAAGGTCTCACCCGACGACACCCCCGGCGCC
**************************************.******************..*

XM_002818334.3 ATCCTGCAGAGCTTCTTCACCAAGATGGCCAAGAAGAAATCTCTGATGGATATTCCCGAA
XM_012512141.2 ATCCTGCAGAGTTTCTTCACCAAGATGGCCAAGAAGAAATCTCTGATGGATATTCCCGAA
XM_001162884.6 ATCCTGCAGAGCTTCTTCACCAAGATGGCCAAGAAGAAATCTCTGATGGATATTCCCGAA
XM_003811196.4 ATCCTGCAGAGCTTCTTCACCAAGATGGCCAAGAAGAAATCTCTGATGGATATTCCCGAA
X83368.1 ATCCTGCAGAGCTTCTTCACCAAGATGGCCAAGAAGAAATCTCTGATGGATATTCCCGAA
NM_002649.3 ATCCTGCAGAGCTTCTTCACCAAGATGGCCAAGAAGAAATCTCTGATGGATATTCCCGAA
XM_004046020.3 ATCCTGCAAAGCTTCTTCACCAAGATGGCCAAGAAGAAATCTCTGATGGATATTCCCGAA
XM_032761353.1 ATCCTGCAGAGTTTCTTCACCAAGATGGCCAAGAAGAAATCTCTGATGGATATTCCCGAA
BX648341.1 ATCCTGCAGAGCTTCTTCACCAAGATGGCCAAGAAGAAATCTCTGATGGATATTCCCGAA
NM_001282427.2 ATCCTGCAGAGCTTCTTCACCAAGATGGCCAAGAAGAAATCTCTGATGGATATTCCCGAA
********.** ************************************************

XM_002818334.3 AGCCAAAGCGAACAGGATTTTGTGCTGCGCGTCTGTGGCCGGGATGAGTACCTGGTGGGC
XM_012512141.2 AGCCAAAGCGAACAGGATTTTGTGCTGCGCGTCTGTGGCCGGGATGAGTATCTGGTGGGC
XM_001162884.6 AGCCAAAGCGAACAGGATTTTGTGCTGCGCGTCTGTGGCCGGGATGAGTACCTGGTGGGC
XM_003811196.4 AGCCAAAGCGAACAGGATTTTGTGCTGCGCGTCTGTGGCCGGGATGAGTACCTGGTGGGC
X83368.1 AGCCAAAGCGAACAGGATTTTGTGCTGCGCGTCTGTGGCCGGGATGAGTACCTGGTGGGC
NM_002649.3 AGCCAAAGCGAACAGGATTTTGTGCTGCGCGTCTGTGGCCGGGATGAGTACCTGGTGGGC
XM_004046020.3 AGCCAAAGCGAACAGGATTTTGTGCTGCGCGTCTGTGGCCGGGATGAGTACCTGGTGGGC
XM_032761353.1 AGCCAAAGCGAACAGGATTTTGTGCTGCGCGTCTGTGGCCGGGATGAGTATCTGGTGGGC
BX648341.1 AGCCAAAGCGAACAGGATTTTGTGCTGCGCGTCTGTGGCCGGGATGAGTACCTGGTGGGC
NM_001282427.2 AGCCAAAGCGAACAGGATTTTGTGCTGCGCGTCTGTGGCCGGGATGAGTACCTGGTGGGC
************************************************** *********

XM_002818334.3 GAAGCGCCCATCAAAAACTTCCAGTGGGTGAGGCACTGCCTCAAGAACGGAGAAGAGATT
XM_012512141.2 GAAACGCCCATCAAAAACTTCCAGTGGGTGAGGCACTGCCTCAAGAACAGAGAAGAGATT
XM_001162884.6 GAAACGCCCATCAAAAACTTCCAGTGGGTGAGGCACTGCCTCAAGAACGGAGAAGAGATT
XM_003811196.4 GAAACGCCCATCAAAAACTTCCAGTGGGTGAGGCACTGCCTCAAGAACGGAGAAGAGATT
X83368.1 GAAACGCCCATCAAAAACTTCCAGTGGGTGAGGCACTGCCTCAAGAACGGAGAAGAGATT
NM_002649.3 GAAACGCCCATCAAAAACTTCCAGTGGGTGAGGCACTGCCTCAAGAACGGAGAAGAGATT
XM_004046020.3 GAAACGCCCATCAAAAACTTCCAGTGGGTGAGGCACTGCCTCAAGAACGGAGAAGAGATT
XM_032761353.1 GAAACGCCCATCAAAAACTTCCAGTGGGTGAGGCACTGCCTCAAGAACAGAGAAGAGATT
BX648341.1 GAAACGCCCATCAAAAACTTCCAGTGGGTGAGGCACTGCCTCAAGAACGGAGAAGAGATT
NM_001282427.2 GAAACGCCCATCAAAAACTTCCAGTGGGTGAGGCACTGCCTCAAGAACGGAGAAGAGATT
***.********************************************.***********

XM_002818334.3 CACGTGGTACTGGACACGCCTCCAGACCCGGCCCTAGACGAGGTGAGGAAGGAAGAGTGG
XM_012512141.2 CACCTGGTGCTGGACACGCCTCCAGACCCGGCCCTAGACGAGGTGAGGAAGGAAGAGTGG
XM_001162884.6 CACGTGGTACTGGACACGCCTCCAGACCCGGCCCTAGACGAGGTGAGGAAGGAAGAGTGG
XM_003811196.4 CACGTGGTACTGGACACGCCTCCAGACCCGGCCCTAGACGAGGTGAGGAAGGAAGAGTGG
X83368.1 CACGTGGTACTGGACACGCCTCCAGACCCGGCCCTAGACGAGGTGAGGAAGGAAGAGTGG
NM_002649.3 CACGTGGTACTGGACACGCCTCCAGACCCGGCCCTAGACGAGGTGAGGAAGGAAGAGTGG
XM_004046020.3 CACGTGGTACTGGACACGCCTCCAGACCCGGCCCTAGACGAGGTGAGGAAGGAAGAGTGG
XM_032761353.1 CACCTGGTGCTGGACACGCCTCCAGACCCGGCCCTAGACGAGGTGAGGAAGGAAGAGTGG
BX648341.1 CACGTGGTACTGGACACGCCTCCAGACCCGGCCCTAGACGAGGTGAGGAAGGAAGAGTGG
NM_001282427.2 CACGTGGTACTGGACACGCCTCCAGACCCGGCCCTAGACGAGGTGAGGAAGGAAGAGTGG
*** ****.***************************************************

XM_002818334.3 CCGCTGGTGGACGACTGCACGGGAGTCACCGGCTACCATGAGCAGCTTACCATCCACGGC
XM_012512141.2 CCGCTGGTGGACGACTGCACGGGAGTCACCGGCTACCATGAGCAGCTTACCATCCACGGC
XM_001162884.6 CCGCTGGTGGACGACTGCACGGGAGTCACCGGCTACCATGAGCAGCTTACCATCCACGGC
XM_003811196.4 CCGCTGGTGGACGACTGCACGGGAGTCACCGGCTACCATGAGCAGCTTACCATCCACGGC
X83368.1 CCGCTGGTGGACGACTGCACGGGAGTCACCGGCTACCATGAGCAGCTTACCATCCACGGC
NM_002649.3 CCACTGGTGGATGACTGCACGGGAGTCACCGGCTACCATGAGCAGCTTACCATCCACGGC
XM_004046020.3 CCGCTGGTGGACGACTGCACGGGAGTCACCGGCTACCATGAGCAGCTTACCATCCACGGC
XM_032761353.1 CCGCTGGTGGACGACTGCACGGGAGTCACCGGCTACCATGAGCAGCTTACCATCCACGGC
BX648341.1 CCGCTGGTGGACGACTGCACGGGAGTCACCGGCTACCATGAGCAGCTTACCATCCACGGC
NM_001282427.2 CCACTGGTGGATGACTGCACGGGAGTCACCGGCTACCATGAGCAGCTTACCATCCACGGC
**.******** ************************************************

XM_002818334.3 AAGGACCACGAGAGTGTGTTCACCGTGTCCCTGTGGGACTGCGACCGCAAGTTCAGAGTC
XM_012512141.2 AAGGACCACGAGAGTGTGTTCACCGTGTCCCTGTGGGACTGCGACCGCAAGTTCAGGGTC
XM_001162884.6 AAGGACCACGAGAGTGTGTTCACCGTGTCCCTGTGGGACTGCGACCGCAAGTTCAGGGTT
XM_003811196.4 AAGGACCACGAGAGTGTGTTCACCGTGTCCCTGTGGGACTGCGACCGCAAGTTCAGGGTT
X83368.1 AAGGACCACGAGAGTGTGTTCACCGTGTCCCTGTGGGACTGCGACCGCAAGTTCAGGGTC
NM_002649.3 AAGGACCACGAGAGTGTGTTCACCGTGTCCCTGTGGGACTGCGACCGCAAGTTCAGGGTC
XM_004046020.3 AAGGACCACGAGAGTGTGTTCACCGTGTCCCTGTGGGACTGCGACCGCAAGTTCAGGGTC
XM_032761353.1 AAGGACCACGAGAGTGTGTTCACCGTGTCCCTGTGGGACTGCGACCGCAAGTTCAGGGTC
BX648341.1 AAGGACCACGAGAGTGTGTTCACCGTGTCCCTGTGGGACTGCGACCGCAAGTTCAGGGTC
NM_001282427.2 AAGGACCACGAGAGTGTGTTCACCGTGTCCCTGTGGGACTGCGACCGCAAGTTCAGGGTC
********************************************************.**

XM_002818334.3 AAGATCAGAGGCATTGATATCCCCGTCCTGCCCCGGAACACCGACCTCACAGTTTTTGTA
XM_012512141.2 AAGATCAGAGGCATTGATATCCCCGTCCTGCCCCGGAACACCGACCTCACAGTTTTTGTA
XM_001162884.6 AAGATCAGAGGCATTGATATCCCCGTCCTGCCTCGGAACACCGACCTCACAGTTTTTGTA
XM_003811196.4 AAGATCAGAGGCATTGATATCCCCGTCCTGCCTCGGAACACCGACCTCACAGTTTTTGTA
X83368.1 AAGATCAGAGGCATTGATATCCCCGTCCTGCCTCGGAACACCGACCTCACAGTTTTTGTA
NM_002649.3 AAGATCAGAGGCATTGATATCCCCGTCCTGCCTCGGAACACCGACCTCACAGTTTTTGTA
XM_004046020.3 AAGATCAGAGGCATTGATATCCCCGTCCTGCCTCGGAACACCGACCTCACAGTTTTTGTA
XM_032761353.1 AAGATCAGAGGCATTGATATCCCCGTCCTGCCCCGGAACACCGACCTCACAGTTTTTGTA
BX648341.1 AAGATCAGAGGCATTGATATCCCCGTCCTGCCTCGGAACACCGACCTCACAGTTTTTGTA
NM_001282427.2 AAGATCAGAGGCATTGATATCCCCGTCCTGCCTCGGAACACCGACCTCACAGTTTTTGTA
******************************** ***************************

XM_002818334.3 GAGGCAAACATCCAGCATGGGCAACAAGTCCTTTGCCAAAGGAGAACCAGCCCCAAACCC
XM_012512141.2 GAGGCAAACATCCAGCATGGGCAACAAGTCCTTTGCCAAAGGAGAACCAGCCCCAAACCC
XM_001162884.6 GAGGCAAACATCCAGCATGGGCAACAAGTCCTTTGCCAAAGGAGAACCAGCCCCAAACCC
XM_003811196.4 GAGGCAAACATCCAGCATGGGCAACAAGTCCTTTGCCAAAGGAGAACCAGCCCCAAACCC
X83368.1 GAGGCAAACATCCAGCATGGGCAACAAGTCCTTTGCCAAAGGAGAACCAGCCCCAAACCC
NM_002649.3 GAGGCAAACATCCAGCATGGGCAACAAGTCCTTTGCCAAAGGAGAACCAGCCCCAAACCC
XM_004046020.3 GAGGCAAACATCCAGCATGGGCAACAAGTCCTTTGCCAAAGGAGAACCAGCCCCAAACCC
XM_032761353.1 GAGGCAAACATCCAGCATGGGCAACAAGTCCTTTGCCAAAGGAGAACCAGCCCCAAACCC
BX648341.1 GAGGCAAACATCCAGCATGGGCAACAAGTCCTTTGCCAAAGGAGAACCAGCCCCAAACCC
NM_001282427.2 GAGGCAAACATCCAGCATGGGCAACAAGTCCTTTGCCAAAGGAGAACCAGCCCCAAACCC
************************************************************

XM_002818334.3 TTCACAGAGGAGGTGCTGTGGAATGTGTGGCTTGAGTTCAGTATCAAAATCAAAGACTTG
XM_012512141.2 TTCACAGAGGAGGTGCTGTGGAATGTGTGGCTTGAGTTCAGTATCAAAATCAAAGACTTG
XM_001162884.6 TTCACAGAGGAGGTGCTGTGGAATGTGTGGCTTGAGTTCAGTATCAAAATCAAAGACTTG
XM_003811196.4 TTCACAGAGGAGGTGCTGTGGAATGTGTGGCTTGAGTTCAGTATCAAAATCAAAGACTTG
X83368.1 TTCACAGAGGAGGTGCTGTGGAATGTGTGGCTTGAGTTCAGTATCAAAATCAAAGACTTG
NM_002649.3 TTCACAGAGGAGGTGCTGTGGAATGTGTGGCTTGAGTTCAGTATCAAAATCAAAGACTTG
XM_004046020.3 TTCACAGAGGAGGTGCTGTGGAATGTGTGGCTTGAGTTCAGTATCAAAATCAAAGACTTG
XM_032761353.1 TTCACAGAGGAGGTGCTGTGGAATGTGTGGCTTGAGTTCAGTATCAAAATCAAAGACTTG
BX648341.1 TTCACAGAGGAGGTGCTGTGGAATGTGTGGCTTGAGTTCAGTATCAAAATCAAAGACTTG
NM_001282427.2 TTCACAGAGGAGGTGCTGTGGAATGTGTGGCTTGAGTTCAGTATCAAAATCAAAGACTTG
************************************************************

XM_002818334.3 CCCAAAGGGGCTCTACTGAACCTCCAGATCTACTGCGGTAAAGCTCCAGCACTGTCCAGC
XM_012512141.2 CCCAAAGGGGCTCTACTGAACCTCCAGATCTACTGCGGTAAAGCTCCAGCACTGTCCAGC
XM_001162884.6 CCCAAAGGGGCTCTACTGAACCTCCAGATCTACTGCGGTAAAGCTCCAGCATTGTCCAGC
XM_003811196.4 CCCAAAGGGGCTCTACTGAACCTCCAGATCTACTGCGGTAAAGCTCCAGCATTGTCCAGC
X83368.1 CCCAAAGGGGCTCTACTGAACCTCCAGATCTACTGCGGTAAAGCTCCAGCACTGTCCAGC
NM_002649.3 CCCAAAGGGGCTCTACTGAACCTCCAGATCTACTGCGGTAAAGCTCCAGCACTGTCCAGC
XM_004046020.3 CCCAAAGGGGCTCTACTGAACCTCCAGATCTACTGCGGTAAAGCTCCAGCACTGTCCAGC
XM_032761353.1 CCCAAAGGGGCTCTACTGAACCTCCAGATCTACTGCGGTAAAGCTCCAGCACTGTCCAGC
BX648341.1 CCCAAAGGGGCTCTACTGAACCTCCAGATCTACTGCGATAAAGCTCCAGCACTGTCCAGC
NM_001282427.2 CCCAAAGGGGCTCTACTGAACCTCCAGATCTACTGCGGTAAAGCTCCAGCACTGTCCAGC
*************************************.************* ********

XM_002818334.3 AAGGCCTCTGCAGAGTCCCCCAGTTCTGAGTCCAAGGGCAAAGTTCAACTTCTCTATTAT
XM_012512141.2 AAGGCCTCTGCAGAGTCCCCCAGTTCTGAGTCCAAGGGCAAAGTTCAGCTTCTCTATTAT
XM_001162884.6 AAGGCCTCTGCAGAGTCCCCCAGTTCTGAGTCCAAGGGCAAAGTTCAGCTTCTCTATTAT
XM_003811196.4 AAGGCCTCTGCAGAGTCCCCCAGTTCTGAGTCCAAGGGCAAAGTTCAGCTTCTCTATTAT
X83368.1 AAGGCCTCTGCAGAGTCCCCCAGTTCTGAGTCCAAGGGCAAAGTTCGGCTTCTCTATTAT
NM_002649.3 AAGGCCTCTGCAGAGTCCCCCAGTTCTGAGTCCAAGGGCAAAGTTCAGCTTCTCTATTAT
XM_004046020.3 AAGGCCTCTGCAGAGTCCCCCAGTTCTGAGTCCAAGGGCAAAGTTCAGCTTCTCTATTAC
XM_032761353.1 AAGGCCTCTGCAGAGTCCGCCAGTTCTGAGTCCAAGGGCAAAGTTCAGCTTCTCTATTAT
BX648341.1 AAGGCCTCTGCAGAGTCCCCCAGTTCTGAGTCCAAGGGCAAAGTTCAGCTTCTCTATTAT
NM_001282427.2 AAGGCCTCTGCAGAGTCCCCCAGTTCTGAGTCCAAGGGCAAAGTTCAGCTTCTCTATTAT
****************** ***************************..***********

XM_002818334.3 GTGAACCTGCTGCTGATAGACCACCGTTTCCTCCTGCGCCGTGGAGAATACGTCCTCCAC
XM_012512141.2 GTGAACCTGCTGCTGATAGACCACCGTTTCCTCCTGCGCCGTGGAGAATACGTCCTCCAC
XM_001162884.6 GTGAACCTGCTGCTGATAGACCACCGTTTCCTCCTGCGCCGTGGAGAATACGTCCTCCAC
XM_003811196.4 GTGAACCTGCTGCTGATAGACCACCGTTTCCTCCTGCGCCGTGGAGAATACGTCCTCCAC
X83368.1 GTGAACCTGCTGCTGATAGACCACCGTTTCCTCCTGCGCCGTGGAGAATACGTCCTCCAC
NM_002649.3 GTGAACCTGCTGCTGATAGACCACCGTTTCCTCCTGCGCCGTGGAGAATACGTCCTCCAC
XM_004046020.3 GTGAACCTGCTGCTGATAGACCACCGTTTCCTCCTGCGCCGTGGAGAATACGTCCTCCAC
XM_032761353.1 GTGAACCTGCTGCTGATAGACCACCGTTTCCTCCTGCGCCGTGGAGAATACGTCCTCCAC
BX648341.1 GTGAACCTGCTGCTGATAGACCACCGTTTCCTCCTGCGCCGTGGAGAATACGTCCTCCAC
NM_001282427.2 GTGAACCTGCTGCTGATAGACCACCGTTTCCTCCTGCGCCGTGGAGAATACGTCCTCCAC
************************************************************

XM_002818334.3 ATGTGGCAGATATCCGGGAAGGGAGAAGACCAAGGAAGCTTCAGTGCTGACAAACTCACG
XM_012512141.2 ATGTGGCAGATATCCGGGAAGGGAGAAGACCAAGGAAGCTTCAATGCTGACAAACTCACG
XM_001162884.6 ATGTGGCAGATATCTGGGAAGGGAGAAGACCAAGGAAGCTTCAATGCTGACAAACTCACG
XM_003811196.4 ATGTGGCAGATATCTGGGAAGGGAGAAGACCAAGGAAGCTTCAATGCTGACAAACTCACG
X83368.1 ATGTGGCAGATATCTGGGAAGGGAGAAGACCAAGGAAGCTTCAATGCTGACAAACTCACG
NM_002649.3 ATGTGGCAGATATCTGGGAAGGGAGAAGACCAAGGAAGCTTCAATGCTGACAAACTCACG
XM_004046020.3 ATGTGGCAGATATCCGGGAAGGGAGAAGACCAAGGAAGCTTCAATGCTGACAAACTCACG
XM_032761353.1 ATGTGGCAGATATCCGGGAAGGGAGAAGACCAAGGAAGCTTCAATGCTGACAAACTCACG
BX648341.1 ATGTGGCAGATATCTGGGAAGGGAGAAGACCAAGGAAGCTTCAATGCTGACAAACTCACG
NM_001282427.2 ATGTGGCAGATATCTGGGAAGGGAGAAGACCAAGGAAGCTTCAATGCTGACAAACTCACG
************** ****************************.****************

XM_002818334.3 TCTGCAACTAACCCAGACAAGGAGAACTCAATGTCCATCTCCATTCTTCTGGACAATTAC
XM_012512141.2 TCTGCAACTAACCCAGACAAGGAGAACTCAATGTCCATCTCCATTCTTCTGGACAATTAC
XM_001162884.6 TCTGCCACTAACCCAGACAAGGAGAACTCAATGTCCATCTCCATTCTTCTGGACAATTAC
XM_003811196.4 TCTGCCACTAACCCAGACAAGGAGAACTCAATGTCCATCTCCATTCTTCTGGACAATTAC
X83368.1 TCTGCAACTAACCCAGACAAGGAGAACTCAATGTCCATCTCCATTCTTCTGGACAATTAC
NM_002649.3 TCTGCAACTAACCCAGACAAGGAGAACTCAATGTCCATCTCCATTCTTCTGGACAATTAC
XM_004046020.3 TCTGCAACTAACCCAGACAAGGAGAACTCAATGTCCATCTCCATTCTTCTGGACAATTAC
XM_032761353.1 TCTGCAACTAACCCAGACAAGGAGAACTCAATGTCCATCTCCATTCTTCTGGACAATTAC
BX648341.1 TCTGCAACTAACCCAGACAAGGAGAACTCAATGTCCATCTCCATTCTTCTGGACAATTAC
NM_001282427.2 TCTGCAACTAACCCAGACAAGGAGAACTCAATGTCCATCTCCATTCTTCTGGACAATTAC
*****.******************************************************

XM_002818334.3 TGCCACCCGATAGCCCTGCCTAAGCATCAGCCCACCCCTGACCCGGAAGGGGACCGGGTT
XM_012512141.2 TGCCACCCGATAGCCCTGCCTAAGCATCAGCCCACCCCTGACCCGGAAGGGGACCGGGTT
XM_001162884.6 TGCCACCCGATAGCCCTGCCTAAGCATCAGCCCACCCCTGACCCGGAAGGGGACCGGGTT
XM_003811196.4 TGCCACCCGATAGCCCTGCCTAAGCATCAACCCACCCCTGACCCGGAAGGGGACCGGGTT
X83368.1 TGCCACCCGATAGCCCTGCCTAAGCATCAGCCCACCCCTGACCCGGAAGGGGACCGGGTT
NM_002649.3 TGCCACCCGATAGCCCTGCCTAAGCATCAGCCCACCCCTGACCCGGAAGGGGACCGGGTT
XM_004046020.3 TGCCACCCGATAGCCCTGCCTAAGCATCAGCCCACCCCTGACCCGGAAGGGGACCGGGTT
XM_032761353.1 TGCCACCCGATAGCCCTGCCTAAGCATCAGCCCACCCCTGACCCGGAAGGGGACCGGGTT
BX648341.1 TGCCACCCGATAGCCCTGCCTAAGCATCAGCCCACCCCTGACCCGGAAGGGGACCGGGTT
NM_001282427.2 TGCCACCCGATAGCCCTGCCTAAGCATCAGCCCACCCCTGACCCGGAAGGGGACCGGGTT
*****************************.******************************

XM_002818334.3 CGAGCAGAAATGCCCAACCAGCTTCGCAAGCAATTGGAGGCGATCATAGCCACTGATCCA
XM_012512141.2 CGAGCAGAAATGCCCAACCAGCTTCGCAAGCAATTGGAGGCGATCATAGCCACTGATCCA
XM_001162884.6 CGAGCAGAAATGCCCAACCAGCTTCGCAAGCAATTGGAGGCGATCATAGCCACTGATCCA
XM_003811196.4 CGAGCAGAAATGCCCAACCAGCTTCGCAAGCAATTGGAGGCGATCATAGCCACTGATCCA
X83368.1 CGAGCAGAAATGCCCAACCAGCTTCGCAAGCAATTGGAGGCGATCATAGCCACTGATCCA
NM_002649.3 CGAGCAGAAATGCCCAACCAGCTTCGCAAGCAATTGGAGGCGATCATAGCCACTGATCCA
XM_004046020.3 CGAGCAGAAATGCCCAACCAGCTTCGCAAGCAATTGGAGGCGATCATAGCCACTGATCCA
XM_032761353.1 CGAGCAGAAATGCCCAACCAGCTTCGCAAGCAATTGGAGGCGATCATAGCCACTGATCCA
BX648341.1 CGAGCAGAAATGCCCAACCAGCTTCGCAAGCAATTGGAGGCGATCATAGCCACTGATCCA
NM_001282427.2 CGAGCAGAAATGCCCAACCAGCTTCGCAAGCAATTGGAGGCGATCATAGCCACTGATCCA
************************************************************

XM_002818334.3 CTTAACCCTCTCACAGCAGAGGACAAAGAATTGCTCTGGCATTTTAGATATGAAAGCCTT
XM_012512141.2 CTTAACCCTCTCACAGCAGAGGACAAAGAATTGCTCTGGCATTTTAGATATGAAAGCCTT
XM_001162884.6 CTTAACCCTCTCACAGCAGAGGACAAAGAATTGCTCTGGCATTTTAGATACGAAAGCCTT
XM_003811196.4 CTTAACCCTCTCACAGCAGAGGACAAAGAATTGCTCTGGCATTTTAGATACGAAAGCCTT
X83368.1 CTTAACCCTCTCACAGCAGAGGACAAAGAATTGCTCTGGCATTTTAGATACGAAAGCCTT
NM_002649.3 CTTAACCCTCTCACAGCAGAGGACAAAGAATTGCTCTGGCATTTTAGATACGAAAGCCTT
XM_004046020.3 CTTAACCCTCTCACAGCAGAGGACAAAGAATTGCTCTGGCATTTTAGATATGAAAGCCTT
XM_032761353.1 CTTAACCCTCTCACAGCAGAGGACAAAGAATTGCTCTGGCATTTTAGATATGAAAGCCTT
BX648341.1 CTTAACCCTCTCACAGCAGAGGACAAAGAATTGCTCTGGCATTTTAGATACGAAAGCCTT
NM_001282427.2 CTTAACCCTCTCACAGCAGAGGACAAAGAATTGCTCTGGCATTTTAGATACGAAAGCCTT
************************************************** *********

XM_002818334.3 AAGCACCCAAAAGCATATCCTAAGCTATTTAGTTCAGTGAAATGGGGACAGCAAGAAATT
XM_012512141.2 AAGCACCCAAAAGCATATCCTAAGCTATTTAGTTCAGTGAAATGGGGACAGCAAGAAATT
XM_001162884.6 AAGCACCCAAAAGCATATCCTAAGCTATTTAGTTCAGTGAAATGGGGACAGCAAGAAATT
XM_003811196.4 AAGCACCCAAAAGCATATCCTAAGCTATTTAGTTCAGTGAAATGGGGACAGCAAGAAATT
X83368.1 AAGCACCCAAAAGCATATCCTAAGCTATTTAGTTCAGTGAAATGGGGACAGCAAGAAATT
NM_002649.3 AAGCACCCAAAAGCATATCCTAAGCTATTTAGTTCAGTGAAATGGGGACAGCAAGAAATT
XM_004046020.3 AAGCACCCAAAAGCATATCCTAAGCTATTTAGTTCAGTGAAATGGGGACAGCAAGAAATT
XM_032761353.1 AAGCACCCAAAAGCATATCCTAAGCTATTTAGTTCAGTGAAATGGGGACAGCAAGAAATT
BX648341.1 AAGCACCCAAAAGCATATCCTAAGCTATTTAGTTCAGTGAAATGGGGACAGCAAGAAATT
NM_001282427.2 AAGCACCCAAAAGCATATCCTAAGCTATTTAGTTCAGTGAAATGGGGACAGCAAGAAATT
************************************************************

XM_002818334.3 GTGGCCAAAACATACCAATTGTTGGCCAGAAGGGAAGTCTGGGATCAAAGTGCTTTGGAT
XM_012512141.2 GTGGCCAAAACATACCAATTGTTGGCCAGAAGAGAAGTCTGGGATCAAAGTGCTTTGGAT
XM_001162884.6 GTGGCCAAAACATACAAATTGTTGGCCAGAAGGGAAGTCTGGGATCAAAGTGCTTTGGAT
XM_003811196.4 GTGGCCAAAACATACCAATTGTTGGCCAGAAGGGAAGTCTGGGATCAAAGTGCTTTGGAT
X83368.1 GTGGCCAAAACATACCAATTGTTGGCCAGAAGGGAAGTCTGGGATCAAAGTGCTTTGGAT
NM_002649.3 GTGGCCAAAACATACCAATTGTTGGCCAGAAGGGAAGTCTGGGATCAAAGTGCTTTGGAT
XM_004046020.3 GTGGCCAAAACATACCAATTGTTGGCCAGAAGGGAAGTCTGGGATCAAAGTGCTTTGGAT
XM_032761353.1 GTGGCCAAAACATACCAATTGTTGGCCAGAAGAGAAGTCTGGGATCAAAGTGCTTTGGAT
BX648341.1 GTGGCCAAAACATACCAATTGTTGGCCAGAAGGGAAGTCTGGGATCAAAGTGCTTTGGAT
NM_001282427.2 GTGGCCAAAACATACCAATTGTTGGCCAGAAGGGAAGTCTGGGATCAAAGTGCTTTGGAT
***************.****************.***************************

XM_002818334.3 GTTGGGTTAACAATGCAACTCCTGGACTGCAACTTCTCAGATGAAAATGTAAGAGCCATT
XM_012512141.2 GTTGGGTTAACAATGCAACTCCTGGACTGCAACTTCTCAGATGAAAATGTAAGAGCCATT
XM_001162884.6 GTTGGGTTAACAATGCAACTCCTGGACTGCAACTTCTCAGATGAAAATGTAAGAGCCATT
XM_003811196.4 GTTGGGTTAACAATGCAACTCCTGGACTGCAACTTCTCAGATGAAAATGTAAGAGCCATT
X83368.1 GTTGGGTTAACAATGCAGCTCCTGGACTGCAACTTCTCAGATGAAAATGTAAGAGCCATT
NM_002649.3 GTTGGGTTAACAATGCAGCTCCTGGACTGCAACTTCTCAGATGAAAATGTAAGAGCCATT
XM_004046020.3 GTTGGGTTAACAATGCAACTCCTGGACTGCAACTTCTCAGATGAAAATGTAAGAGCCATT
XM_032761353.1 GTTGGGTTAACAATGCAACTCCTGGACTGCAACTTCTCAGATGAAAATGTAAGAGCCATT
BX648341.1 GTTGGGTTAACAATGCAGCTCCTGGACTGCAACTTCTCAGATGAAAATGTAAGAGCCATT
NM_001282427.2 GTTGGGTTAACAATGCAGCTCCTGGACTGCAACTTCTCAGATGAAAATGTAAGAGCCATT
*****************.******************************************

XM_002818334.3 GCAGTTCAGAAACTGGAGAGCTTGGAGGACGATGATGTTCTGCATTACCTTCTACAATTG
XM_012512141.2 GCGGTTCAGAAACTGGAGAGCTTGGAGGACGATGATGTTCTGCATTACCTTCTACAGTTG
XM_001162884.6 GCAGTTCAGAAACTGGAGAGCTTGGAGGACGATGATGTTCTGCATTACCTTCTACAATTG
XM_003811196.4 GCAGTTCAGAAACTGGAGAGCTTGGAGGACGATGATGTTCTGCATTACCTTCTACAATTG
X83368.1 GCAGTTCAGAAACTGGAGAGCTTGGAGGACGATGATGTTCTGCATTACCTTCTACAATTG
NM_002649.3 GCAGTTCAGAAACTGGAGAGCTTGGAGGACGATGATGTTCTGCATTACCTTCTACAATTG
XM_004046020.3 GCAGTTCAGAAACTGGAGAGCTTGGAGGATGATGATGTTCTGCATTACCTTCTACAATTA
XM_032761353.1 GCAGTTCAGAAACTGGAGAGCTTGGAGGACGATGATGTTCTGCATTACCTTCTACAGTTG
BX648341.1 GCAGTTCAGAAACTGGAGAGCTTGGAGGACGATGATGTTCTGCATTACCTTCTACAATTG
NM_001282427.2 GCAGTTCAGAAACTGGAGAGCTTGGAGGACGATGATGTTCTGCATTACCTTCTACAATTG
**.************************** **************************.**.

XM_002818334.3 GTCCAGGCTGTGAAATTTGAACCATACCATGATAGCGCCCTTGCCAGATTTCTGCTGAAG
XM_012512141.2 GTCCAGGCTGTGAAATTTGAACCATACCACGATAGCGCCCTTGCCAGATTTCTGCTGAAG
XM_001162884.6 GTCCAGGCTGTGAAATTTGAACCATACCATGATAGCGCCCTTGCCAGATTTCTGCTGAAG
XM_003811196.4 GTCCAGGCTGTGAAATTTGAACCATACCATGATAGCGCCCTTGCCAGATTTCTGCTGAAG
X83368.1 GTCCAGGCTGTGAAATTTGAACCATACCATGATAGCGCCCTTGCCAGATTTCTGCTGAAG
NM_002649.3 GTCCAGGCTGTGAAATTTGAACCATACCATGATAGCGCCCTTGCCAGATTTCTGCTGAAG
XM_004046020.3 GTCCAGGCTGTGAAATTTGAACCATACCATGATAGCGCCCTTGCCAGATTTCTGCTGAAG
XM_032761353.1 GTCCAGGCTGTGAAATTTGAACCATACCATGATAGCGCCCTTGCCAGATTTCTGCTGAAG
BX648341.1 GTCCAGGCTGTGAAATTTGAACCATACCATGATAGCGCCCTTGCCAGATTTCTGCTGAAG
NM_001282427.2 GTCCAGGCTGTGAAATTTGAACCATACCATGATAGCGCCCTTGCCAGATTTCTGCTGAAG
***************************** ******************************

XM_002818334.3 CGTGGTTTAAGAAACAAAAGAATTGGTCACTTTTTGTTTTGGTTCTTGAGAAGTGAGATA
XM_012512141.2 CGTGGTTTGAGAAACAAAAGAATTGGTCACTTTTTGTTTTGGTTCTTGAGAAGTGAGATA
XM_001162884.6 CGTGGTTTAAGAAACAAAAGAATTGGTCACTTTTTGTTTTGGTTCTTGAGAAGTGAGATA
XM_003811196.4 CGTGGTTTAAGAAACAAAAGAATTGGTCACTTTTTGTTTTGGTTCTTGAGAAGTGAGATA
X83368.1 CGTGGTTTAAGAAACAAAAGAATTGGTCACTTTTTGTTTTGGTTCTTGAGAAGTGAGATA
NM_002649.3 CGTGGTTTAAGAAACAAAAGAATTGGTCACTTTTTGTTTTGGTTCTTGAGAAGTGAGATA
XM_004046020.3 CGTGGTTTAAGAAACAAAAGAATTGGTCACTTTTTGTTTTGGTTCTTGAGAAGTGAGATA
XM_032761353.1 CGTGGTTTGAGAAACAAAAGAATTGGTCACTTTTTGTTTTGGTTCTTGAGAAGTGAGATA
BX648341.1 CGTGGTTTAAGAAACAAAAGAATTGGTCACTTTTTGTTTTGGTTCTTGAGAAGTGAGATA
NM_001282427.2 CGTGGTTTAAGAAACAAAAGAATTGGTCACTTTTTGTTTTGGTTCTTGAGAAGTGAGATA
********.***************************************************

XM_002818334.3 GCCCAGTCCAGACACTATCAGCAGAGGTTCGCTGTGATTCTGGAAGCCTATTTGAGGGGC
XM_012512141.2 GCCCAGTCCAGACACTATCAGCAGAGGTTCGCTGTGATTCTGGAAGCCTATTTGAGGGGC
XM_001162884.6 GCCCAGTCCAGACACTATCAGCAGAGGTTCGCTGTGATTCTGGAAGCCTATCTGAGGGGC
XM_003811196.4 GCCCAGTCCAGACACTATCAGCAGAGGTTCGCTGTGATTCTGGAAGCCTATCTGAGGGGC
X83368.1 GCCCAGTCCAGACACTATCAGCAGAGGTTCGCTGTGATTCTGGAAGCCTATCTGAGGGGC
NM_002649.3 GCCCAGTCCAGACACTATCAGCAGAGGTTCGCTGTGATTCTGGAAGCCTATCTGAGGGGC
XM_004046020.3 GCCCAGTCCAGACACTATCAGCAGAGGTTCGCTGTGATTCTGGAAGCCTATCTGAGGGGC
XM_032761353.1 GCCCAGTCCAGACACTATCAGCAGAGGTTCGCTGTGATTCTGGAAGCCTATTTGAGGGGC
BX648341.1 GCCCAGTCCAGACACTATCAGCAGAGGTTCGCTGTGATTCTGGAAGCCTATCTGAGGGGC
NM_001282427.2 GCCCAGTCCAGACACTATCAGCAGAGGTTCGCTGTGATTCTGGAAGCCTATCTGAGGGGC
*************************************************** ********

XM_002818334.3 TGTGGCACAGCCATGCTGCACGACTTTACCCAACAAGTCCAAGTAATCGAGATGTTACAA
XM_012512141.2 TGTGGCACAGCCATGCTGCATGACTTTACCCAACAAGTCCAAGTAATCGAGATGTTACAA
XM_001162884.6 TGTGGCACAGCCATGCTGCACGACTTTACCCAACAAGTCCAAGTAATCGAGATGTTACAA
XM_003811196.4 TGTGGCACAGCCATGCTGCACGACTTTACCCAACAAGTCCAAGTAATCGAGATGTTACAA
X83368.1 TGTGGCACAGCCATGCTGCACGACTTTACCCAACAAGTCCAAGTAATCGAGATGTTACAA
NM_002649.3 TGTGGCACAGCCATGCTGCACGACTTTACCCAACAAGTCCAAGTAATCGAGATGTTACAA
XM_004046020.3 TGTGGCACAGCCATGCTGCACGACTTTACCCAACAAGTCCAAGTAATCGAGATGTTACAA
XM_032761353.1 TGTGGCACAGCCATGCTGCATGACTTTACCCAACAAGTCCAAGTAATCGAGATGTTACAA
BX648341.1 TGTGGCACAGCCATGCTGCACGACTTTACCCAACAAGTCCAAGTAATCGAGATGTTACAA
NM_001282427.2 TGTGGCACAGCCATGCTGCACGACTTTACCCAACAAGTCCAAGTAATCGAGATGTTACAA
******************** ***************************************

XM_002818334.3 AAAGTCACCCTTGATATTAAATCGCTCTCTGCTGAAAAGTATGACGTCAGTTCCCAAGTT
XM_012512141.2 AAAGTCACCCTTGATATTAAATCGCTCTCTGCTGAAAAGTATGACGTCAGTTCCCAAGTT
XM_001162884.6 AAAGTCACCCTTGATATTAAATCGCTCTCTGCTGAAAAGTATGACGTCAGTTCCCAAGTT
XM_003811196.4 AAAGTCACCCTTGATATTAAATCGCTCTCTGCTGAAAAGTATGACGTCAGTTCCCAAGTT
X83368.1 AAAGTCACCCTTGATATTAAATCGCTCTCTGCTGAAAAGTATGACGTCAGTTCCCAAGTT
NM_002649.3 AAAGTCACCCTTGATATTAAATCGCTCTCTGCTGAAAAGTATGACGTCAGTTCCCAAGTT
XM_004046020.3 AAAGTCACCCTTGATATTAAATCGCTCTCTGCTGAAAAGTATGACGTCAGTTCCCAAGTT
XM_032761353.1 AAAGTCACCCTTGATATTAAATCGCTCTCTGCTGAAAAGTATGACGTCAGTTCCCAAGTT
BX648341.1 AAAGTCACCCTTGATATTAAATCGCTCTCTGCTGAAAAGTATGACGTCAGTTCCCAAGTT
NM_001282427.2 AAAGTCACCCTTGATATTAAATCGCTCTCTGCTGAAAAGTATGACGTCAGTTCCCAAGTT
************************************************************

XM_002818334.3 ATTTCACAACTTAAACAAAAGCTTGAAAACCTGCAGAATTCTCAACTCCCCAAAAGCTTT
XM_012512141.2 ATTTCACAACTTAAACAAAAGCTTGAAAACCTGCAGAATTCTCAACTCCCCAAAAGCTTT
XM_001162884.6 ATTTCACAACTTAAACAAAAGCTTGAAAACCTGCAGAATTCTCAACTCCCCGAAAGCTTT
XM_003811196.4 ATTTCACAACTTAAACAAAAGCTTGAAAACCTGCAGAATTCTCAACTCCCCGAAAGCTTT
X83368.1 ATTTCACAACTTAAACAAAAGCTTGAAAACCTGCAGAATTCTCAACTCCCCGAAAGCTTT
NM_002649.3 ATTTCACAACTTAAACAAAAGCTTGAAAACCTGCAGAATTCTCAACTCCCCGAAAGCTTT
XM_004046020.3 ATTTCACAACTTAAACAAAAGCTTGAAAACCTGCAGAATTCTCAACTCCCCGAAAGCTTT
XM_032761353.1 ATTTCACAACTTAAACAAAAGCTTGAAAACCTGCAGAATTCTCAACTCCCCAAAAGCTTT
BX648341.1 ATTTCACAACTTAAACAAAAGCTTGAAAACCTGCAGAATTCTCAACTCCCCGAAAGCTTT
NM_001282427.2 ATTTCACAACTTAAACAAAAGCTTGAAAACCTGCAGAATTCTCAACTCCCCGAAAGCTTT
***************************************************.********

XM_002818334.3 AGAGTTCCATATGATCCTGGACTGAAAGCAGGAGCGCTGGCAATTGAAAAATGTAAAGTA
XM_012512141.2 AGAGTTCCATATGATCCTGGACTGAAAGCAGGAGCGCTGGCAATTGAAAAATGTAAAGTA
XM_001162884.6 AGAGTTCCATATGATCCTGGACTGAAAGCAGGAGCGCTGGCAATTGAAAAATGTAAAGTA
XM_003811196.4 AGAGTTCCATATGATCCTGGACTGAAAGCAGGAGCGCTGGCAATTGAAAAATGTAAAGTA
X83368.1 AGAGTTCCATATGATCCTGGACTGAAAGCAGGAGCGCTGGCAATTGAAAAATGTAAAGTA
NM_002649.3 AGAGTTCCATATGATCCTGGACTGAAAGCAGGAGCGCTGGCAATTGAAAAATGTAAAGTA
XM_004046020.3 AGAGTTCCATATGATCCTGGACTGAAAGCAGGAGCGCTGGCAATTGAAAAATGTAAAGTA
XM_032761353.1 AGAGTTCCATATGATCCTGGACTGAAAGCAGGAGCGCTGGCAATTGAAAAATGTAAAGTA
BX648341.1 AGAGTTCCATATGATCCTGGACTGAAAGCAGGAGCGCTGGCAATTGAAAAATGTAAAGTA
NM_001282427.2 AGAGTTCCATATGATCCTGGACTGAAAGCAGGAGCGCTGGCAATTGAAAAATGTAAAGTA
************************************************************

XM_002818334.3 ATGGCCTCCAAGAAAAAACCACTGTGGCTTGAGTTTAAATGTGCTGATCCTACAGCCCTA
XM_012512141.2 ATGGCCTCCAAGAAAAAACCACTGTGGCTTGAGTTTAAATGTGCTGATCCTACAGCCCTA
XM_001162884.6 ATGGCCTCCAAGAAAAAACCACTGTGGCTTGAGTTTAAATGTGCTGATCCTACAGCCCTA
XM_003811196.4 ATGGCCTCCAAGAAAAAACCACTGTGGCTTGAGTTTAAATGTGCTGATCCTACAGCCCTA
X83368.1 ATGGCCTCCAAGAAAAAACCACTATGGCTTGAGTTTAAATGTGCCGATCCTACAGCCCTA
NM_002649.3 ATGGCCTCCAAGAAAAAACCACTATGGCTTGAGTTTAAATGTGCCGATCCTACAGCCCTA
XM_004046020.3 ATGGCCTCCAAGAAAAAACCACTGTGGCTTGAGTTTAAATGTGCTGATCCTACAGCCCTA
XM_032761353.1 ATGGCCTCCAAGAAAAAACCACTGTGGCTTGAGTTTAAATGTGCTGATCCTACAGCCCTA
BX648341.1 ATGGCCTCCAAGAAAAAACCACTATGGCTTGAGTTTAAATGTGCCGATCCTACAGCCCTA
NM_001282427.2 ATGGCCTCCAAGAAAAAACCACTATGGCTTGAGTTTAAATGTGCCGATCCTACAGCCCTA
***********************.******************** ***************

XM_002818334.3 TCAAATGAAACAATTGGAATTATCTTTAAACACGGTGATGATCTGCGCCAAGACATGCTT
XM_012512141.2 TCAAATGAAACAATTGGAATTATCTTTAAACACGGTGATGATCTGCGCCAAGACATGCTT
XM_001162884.6 TCAAATGAAACAATTGGAATTATCTTTAAACATGGTGATGATCTGCGCCAAGACATGCTT
XM_003811196.4 TCAAATGAAACAATTGGAATTATCTTTAAACATGGTGATGATCTGCGCCAAGACATGCTT
X83368.1 TCAAATGAAACAATTGGAATTATCTTTAAACATGGTGATGATCTGCGCCAAGACATGCTT
NM_002649.3 TCAAATGAAACAATTGGAATTATCTTTAAACATGGTGATGATCTGCGCCAAGACATGCTT
XM_004046020.3 TCAAATGAAACAATTGGAATTATCTTTAAACATGGTGATGATCTGCGCCAAGACATGCTT
XM_032761353.1 TCAAATGAAACAATTGGAATTATCTTTAAACACGGTGATGATCTGCGCCAAGACATGCTT
BX648341.1 TCAAATGAAACAATTGGAATTATCTTTAAACATGGTGATGATCTGCGCCAAGACATGCTT
NM_001282427.2 TCAAATGAAACAATTGGAATTATCTTTAAACATGGTGATGATCTGCGCCAAGACATGCTT
******************************** ***************************

XM_002818334.3 ATTTTACAGATTCTACGAATCATGGAGTCTATTTGGGAGACTGAATCTTTGGATCTGTGC
XM_012512141.2 ATTTTACAGATTCTACGAATCATGGAGTCTATTTGGGAGACTGAATCTTTGGATCTGTGC
XM_001162884.6 ATTTTACAGATTCTACGAATCATGGAGTCTATTTGGGAGACTGAATCTTTGGATCTGTGC
XM_003811196.4 ATTTTACAGATTCTACGAATCATGGAGTCTATTTGGGAGACTGAATCTTTGGATCTGTGC
X83368.1 ATTTTACAGATTCTACGAATCATGGAGTCTATTTGGGAGACTGAATCTTTGGATCTATGC
NM_002649.3 ATTTTACAGATTCTACGAATCATGGAGTCTATTTGGGAGACTGAATCTTTGGATCTATGC
XM_004046020.3 ATTTTACAGATTCTACGAATCATGGAGTCTATTTGGGAGACTGAATCTCTGGATCTGTGC
XM_032761353.1 ATTTTACAGATTCTACGAATCATGGAGTCTATTTGGGAGACTGAATCTTTGGATCTGTGC
BX648341.1 ATTTTACAGATTCTACGAATCATGGAGTCTATTTGGGAGACTGAATCTTTGGATCTATGC
NM_001282427.2 ATTTTACAGATTCTACGAATCATGGAGTCTATTTGGGAGACTGAATCTTTGGATCTATGC
************************************************ *******.***

XM_002818334.3 CTCCTGCCATATGGTTGCATTTCAACTGGTGACAAAATAGGAATGATCGAGATTGTGAAA
XM_012512141.2 CTCCTGCCATATGGTTGCATTTCAACTGGTGACAAAATAGGAATGATCGAGATTGTGAAA
XM_001162884.6 CTCCTGCCATATGGCTGCATTTCAACTGGTGACAAAATAGGAATGATCGAGATTGTGAAA
XM_003811196.4 CTCCTGCCATATGGCTGCATTTCAACTGGTGACAAAATAGGAATGATCGAGATTGTGAAA
X83368.1 CTCCTGCCATATGGTTGCATTTCAACTGGTGACAAAATAGGAATGATCGAGATTGTGAAA
NM_002649.3 CTCCTGCCATATGGTTGCATTTCAACTGGTGACAAAATAGGAATGATCGAGATTGTGAAA
XM_004046020.3 CTCCTGCCATATGGTTGCATTTCAACTGGTGACAAAATAGGAATGATCGAGATTGTGAAA
XM_032761353.1 CTCCTGCCATATGGTTGCATTTCAACTGGTGACAAAATAGGAATGATCGAGATTGTGAAA
BX648341.1 CTCCTGCCATATGGTTGCATTTCAACTGGTGACAAAATAGGAATGATCGAGATTGTGAAA
NM_001282427.2 CTCCTGCCATATGGTTGCATTTCAACTGGTGACAAAATAGGAATGATCGAGATTGTGAAA
************** *********************************************

XM_002818334.3 GACGCCACGACAATCGCCAAAATCCAGCAAAGCACAGTGGGCAACACGGGAGCATTTAAA
XM_012512141.2 GATGCCACGACAATTGCCAAAATTCAGCAAAGCACAGTGGGCAACACAGGAGCATTTAAA
XM_001162884.6 GACGCCACGACAATTGCCAAAATTCAGCAAAGCACAGTGGGCAACACGGGAGCATTTAAA
XM_003811196.4 GATGCCACGACAATTGCCAAAATTCAGCAAAGCACAGTGGGCAACACGGGAGCATTTAAA
X83368.1 GACGCCACGACAATTGCCAAAATTCAGCAAAGCACAGTGGGCAACACGGGAGCATTTAAA
NM_002649.3 GACGCCACGACAATTGCCAAAATTCAGCAAAGCACAGTGGGCAACACGGGAGCATTTAAA
XM_004046020.3 GACGCCACGACAATTGCCAAAATTCAGCAAAGCACAGTGGGCAACACGGGAGCATTTAAA
XM_032761353.1 GATGCCACGACAATTGCCAAAATTCAGCAAAGCACAGTGGGCAACACGGGAGCATTTAAA
BX648341.1 GACGCCACGACAATTGCCAAAATTCAGCAAAGCACAGTGGGCAACACGGGAGCATTTAAA
NM_001282427.2 GACGCCACGACAATTGCCAAAATTCAGCAAAGCACAGTGGGCAACACGGGAGCATTTAAA
** *********** ******** ***********************.************

XM_002818334.3 GATGAAGTCCTGAATCACTGGCTCAAAGAAAAATCCCCTACTGAAGAAAAGTTTCAGGCA
XM_012512141.2 GATGAAGTCCTGAATCACTGGCTCAAAGAAAAATCCCCTACAGAAGAAAAGTTTCAGGCA
XM_001162884.6 GATGAAGTCCTGAATCACTGGCTCAAAGAAAAATCCCCTACTGAAGAAAAGTTTCAGGCA
XM_003811196.4 GATGAAGTCCTGAATCACTGGCTCAAAGAAAAATCCCCTACTGAAGAAAAGTTTCAGGCA
X83368.1 GATGAAGTCCTGAATCACTGGCTCAAAGAAAAATCCCCTACTGAAGAAAAGTTTCAGGCA
NM_002649.3 GATGAAGTCCTGAATCACTGGCTCAAAGAAAAATCCCCTACTGAAGAAAAGTTTCAGGCA
XM_004046020.3 GATGAAGTCCTGAATCACTGGCTCAAAGAAAAATCCCCTACTGAAGAAAAGTTTCAGGCA
XM_032761353.1 GATGAAGTCCTGAATCACTGGCTCAAAGAAAAATCCCCTACTGAAGAAAAGTTTCAGGCA
BX648341.1 GATGAAGTCCTGAATCACTGGCTCAAAGAAAAATCCCCTACTGAAGAAAAGTTTCAGGCA
NM_001282427.2 GATGAAGTCCTGAATCACTGGCTCAAAGAAAAATCCCCTACTGAAGAAAAGTTTCAGGCA
*****************************************:******************

XM_002818334.3 GCAGTGGAGAGATTTGTTTATTCCTGTGCAGGCTACTGTGTGGCAACCTTTGTTCTTGGA
XM_012512141.2 GCAGTGGAGAGATTTGTTTATTCCTGTGCAGGCTACTGTGTGGCAACCTTTGTTCTTGGA
XM_001162884.6 GCAGTGGAGAGATTTGTTTATTCCTGTGCAGGCTACTGTGTGGCAACCTTTGTTCTTGGA
XM_003811196.4 GCAGTGGAGAGATTTGTTTATTCCTGTGCAGGCTATTGTGTGGCAACCTTTGTTCTTGGA
X83368.1 GCAGTGGAGAGATTTGTTTATTCCTGTGCAGGCTACTGTGTGGCAACCTTTGTTCTTGGA
NM_002649.3 GCAGTGGAGAGATTTGTTTATTCCTGTGCAGGCTACTGTGTGGCAACCTTTGTTCTTGGA
XM_004046020.3 GCAGTGGAGAGATTTGTTTATTCCTGTGCAGGCTACTGTGTGGCAACCTTTGTTCTTGGA
XM_032761353.1 GCAGTGGAGAGATTTGTTTATTCCTGTGCAGGCTACTGTGTGGCAACCTTTGTTCTTGGA
BX648341.1 GCAGTGGAGAGATTTGTTTATTCCTGTGCAGGCTACTGTGTGGCAACCTTTGTTCTTGGA
NM_001282427.2 GCAGTGGAGAGATTTGTTTATTCCTGTGCAGGCTACTGTGTGGCAACCTTTGTTCTTGGA
*********************************** ************************

XM_002818334.3 ATAGGCGACAGACACAATGACAATATTATGATCACCGAGACAGGAAACCTATTTCATATT
XM_012512141.2 ATAGGCGACAGACACAATGACAATATTATGATCACCGAGACAGGAAATCTATTTCATATT
XM_001162884.6 ATAGGCGACAGACACAATGACAATATTATGATCACCGAGACAGGAAACCTATTTCATATT
XM_003811196.4 ATAGGCGACAGACACAATGACAATATTATGATCACCGAGACAGGAAACCTATTTCATATT
X83368.1 ATAGGCGACAGACACAATGACAATATTATGATCACCGAGACAGGAAACCTATTTCATATT
NM_002649.3 ATAGGCGACAGACACAATGACAATATTATGATCACCGAGACAGGAAACCTATTTCATATT
XM_004046020.3 ATAGGCGACAGACACAATGACAATATTATGATCACCGAGACAGGAAACCTATTTCATATT
XM_032761353.1 ATAGGCGACAGACACAATGACAATATTATGATCACCGAGACAGGAAATCTATTTCATATT
BX648341.1 ATAGGCGACAGACACAATGACAATATTATGATCACCGAGACAGGAAACCTATTTCATATT
NM_001282427.2 ATAGGCGACAGACACAATGACAATATTATGATCACCGAGACAGGAAACCTATTTCATATT
*********************************************** ************

XM_002818334.3 GACTTCGGGCACATTCTTGGGAATTACAAAAGTTTCCTGGGCATTAATAAAGAGAGAGTG
XM_012512141.2 GACTTCGGGCACATTCTTGGGAATTACAAAAGTTTCCTGGGCATTAATAAAGAGAGAGTG
XM_001162884.6 GACTTCGGGCACATTCTTGGGAATTACAAAAGTTTCCTGGGCATTAATAAAGAGAGAGTG
XM_003811196.4 GACTTCGGGCACATTCTTGGGAATTACAAAAGTTTCCTGGGCATTAATAAAGAGAGAGTG
X83368.1 GACTTCGGGCACATTCTTGGGAATTACAAAAGTTTCCTGGGCATTAATAAAGAGAGAGTG
NM_002649.3 GACTTCGGGCACATTCTTGGGAATTACAAAAGTTTCCTGGGCATTAATAAAGAGAGAGTG
XM_004046020.3 GACTTCGGGCACATTCTCGGGAATTACAAAAGTTTCCTGGGCATTAATAAAGAGAGAGTG
XM_032761353.1 GACTTCGGGCACATTCTTGGGAATTACAAAAGTTTCCTGGGCATTAATAAAGAGAGAGTG
BX648341.1 GACTTCGGGCACATTCTTGGGAATTACAAAAGTTTCCTGGGCATTAATAAAGAGAGAGTG
NM_001282427.2 GACTTCGGGCACATTCTTGGGAATTACAAAAGTTTCCTGGGCATTAATAAAGAGAGAGTG
***************** ******************************************

XM_002818334.3 CCATTTGTGCTAACCCCTGACTTCCTCTTTGTGATGGGAACTTCTGGAAAGAAGACAAGC
XM_012512141.2 CCATTTGTGCTAACCCCTGACTTCCTCTTTGTGATGGGAACTTCTGGAAAGAAGACAAGC
XM_001162884.6 CCATTTGTGCTAACCCCTGACTTCCTCTTTGTGATGGGAACTTCTGGAAAGAAGACAAGC
XM_003811196.4 CCATTTGTGCTAACCCCTGACTTCCTCTTTGTGATGGGAACTTCTGGAAAGAAGACAAGC
X83368.1 CCATTTGTGCTAACCCCTGACTTCCTCTTTGTGATGGGAACTTCTGGAAAGAAGACAAGC
NM_002649.3 CCATTTGTGCTAACCCCTGACTTCCTCTTTGTGATGGGAACTTCTGGAAAGAAGACAAGC
XM_004046020.3 CCATTTGTGCTAACCCCTGACTTCCTCTTTGTGATGGGAACTTCTGGAAAGAAGACAAGC
XM_032761353.1 CCATTTGTGCTAACCCCTGACTTCCTCTTTGTGATGGGAACTTCTGGAAAGAAGACAAGC
BX648341.1 CCATTTGTGCTAACCCCTGACTTCCTCTTTGTGATGGGAACTTCTGGAAAGAGGACAAGC
NM_001282427.2 CCATTTGTGCTAACCCCTGACTTCCTCTTTGTGATGGGAACTTCTGGAAAGAAGACAAGC
****************************************************.*******

XM_002818334.3 CCACACTTCCAGAAATTTCAGGACATCTGTGTCAAGGCTTATCTAGCCCTTCGTCATCAC
XM_012512141.2 CCACACTTCCAGAAATTTCAGGACATCTGTGTAAAGGCTTATCTAGCCCTTCGTCATCAC
XM_001162884.6 CCACACTTCCAGAAATTTCAGGACATCTGTGTTAAGGCTTATCTAGCCCTTCGTCATCAC
XM_003811196.4 CCACACTTCCAGAAATTTCAGGACATCTGTGTTAAGGCTTATCTAGCCCTTCGTCATCAC
X83368.1 CCACACTTCCAGAAATTTCAGGACATCTGTGTTAAGGCTTATCTAGCCCTTCGTCATCAC
NM_002649.3 CCACACTTCCAGAAATTTCAGGACATCTGTGTTAAGGCTTATCTAGCCCTTCGTCATCAC
XM_004046020.3 CCACACTTCCAGAAATTTCAGGACATCTGTGTTAAGGCTTATCTAGCCCTTCGTCATCAC
XM_032761353.1 CCACACTTCCAGAAATTTCAGGACATCTGTGTAAAGGCTTATCTAGCCCTTCGTCATCAC
BX648341.1 CCACACTTCCAGAAATTTCAGGACATCTGTGTTAAGGCTTATCTAGCCCTTCGTCATCAC
NM_001282427.2 CCACACTTCCAGAAATTTCAGGACATCTGTGTTAAGGCTTATCTAGCCCTTCGTCATCAC
******************************** ***************************

XM_002818334.3 ACAAACCTACTGATCATCCTGTTCTCCATGATGCTGATGACGGGAATGCCTCAGTTAACA
XM_012512141.2 ACAAACCTGCTGATCATCCTGTTCTCCATGATGCTGATGACGGGAATGCCCCAGTTAACA
XM_001162884.6 ACAAACCTACTGATCATCCTGTTCTCCATGATGCTGATGACAGGAATGCCCCAGTTAACA
XM_003811196.4 ACAAACCTACTGATCATCCTGTTCTCCATGATGCTGATGACAGGAATGCCCCAGTTAACA
X83368.1 ACAAACCTACTGATCATCCTGTTCTCCATGATGCTGATGACAGGAATGCCCCAGTTAACA
NM_002649.3 ACAAACCTACTGATCATCCTGTTCTCCATGATGCTGATGACAGGAATGCCCCAGTTAACA
XM_004046020.3 ACAAACCTACTGATCATCCTGTTCTCCATGATGCTGATGACGGGAATGCCCCAGTTAACA
XM_032761353.1 ACAAACCTACTGATCATCCTGTTCTCCATGATGCTGATGACGGGAATGCCCCAGTTAACA
BX648341.1 ACAAACCTACTGATCATCCTGTTCTCCATGATGCTGATGACAGGAATGCCCCAGTTAACA
NM_001282427.2 ACAAACCTACTGATCATCCTGTTCTCCATGATGCTGATGACAGGAATGCCCCAGTTAACA
********.********************************.******** *********

XM_002818334.3 AGCAAAGAAGACATTGAATATATCCGGGATGCCCTCACAGTGGGGAAAAATGAGGAGGAT
XM_012512141.2 AGCAAAGAAGACATTGAATATATCCGGGATGCCCTCACAGTGGGGAAAAATGAGGAGGAT
XM_001162884.6 AGCAAAGAAGACATTGAATACATCCGGGATGCCCTCACAGTGGGGAAAAATGAGGAGGAT
XM_003811196.4 AGCAAAGAAGACATTGAATACATCCGGGATGCCCTCACAGTGGGGAAAAATGAGGAGGAT
X83368.1 AGCAAAGAAGACATTGAATATATCCGGGATGCCCTCACAGTGGGGAAAAATGAGGAGGAT
NM_002649.3 AGCAAAGAAGACATTGAATATATCCGGGATGCCCTCACAGTGGGGAAAAATGAGGAGGAT
XM_004046020.3 AGCAAAGAAGACATTGAATACATCCGGGATGCCCTCACAGTGGGGAAAAATGAGGAGGAT
XM_032761353.1 AGCAAAGAAGACATTGAATATATCCGGGATGCCCTCACAGTGGGGAAAAATGAGGAGGAT
BX648341.1 AGCAAAGAAGACATTGAATATATCCGGGATGCCCTCACAGTGGGGAAAAATGAGGAGGAT
NM_001282427.2 AGCAAAGAAGACATTGAATATATCCGGGATGCCCTCACAGTGGGGAAAAATGAGGAGGAT
******************** ***************************************

XM_002818334.3 GCTAAAAAGTATTTTCTTGATCAGATTGAAGTTTGCAGAGACAAAGGATGGACTGTGCAG
XM_012512141.2 GCTAAAAAGTATTTTCTTGATCAGATCGAAGTTTGCAGAGACAAAGGATGGACTGTGCAG
XM_001162884.6 GCTAAAAAGTATTTTCTTGATCAGATCGAAGTTTGCAGAGACAAAGGATGGACTGTGCAG
XM_003811196.4 GCTAAAAAGTATTTTCTTGATCAGATCGAAGTTTGCAGAGACAAAGGATGGACTGTGCAG
X83368.1 GCTAAAAAGTATTTTCTTGATCAGATCGAAGTTTGCAGAGACAAAGGATGGACTGTGCAG
NM_002649.3 GCTAAAAAGTATTTTCTTGATCAGATCGAAGTTTGCAGAGACAAAGGATGGACTGTGCAG
XM_004046020.3 GCTAAAAAGTATTTTCTTGATCAGATCGAAGTTTGCAGAGACAAAGGATGGACTGTGCAG
XM_032761353.1 GCTAAAAAGTATTTTCTTGATCAGATCGAAGTTTGCAGAGACAAAGGATGGACTGTGCAG
BX648341.1 GCTAAAAAGTATTTTCTTGATCAGATCGAAGTTTGCAGAGACAAAGGATGGACTGTGCAG
NM_001282427.2 GCTAAAAAGTATTTTCTTGATCAGATCGAAGTTTGCAGAGACAAAGGATGGACTGTGCAG
************************** *********************************

XM_002818334.3 TTTAATTGGTTTCTACATCTTGTTCTTGGCATCAAACAAGGAGAGAAACATTCAGCCTAA
XM_012512141.2 TTTAATTGGTTTCTACATCTTGTTCTTGGCATCAAACAAGGAGAGAAACATTCAGCCTAA
XM_001162884.6 TTTAATTGGTTTCTACATCTTGTTCTTGGCATCAAACAAGGAGAGAAACATTCAGCCTAA
XM_003811196.4 TTTAATTGGTTTCTACATCTTGTTCTTGGCATCAAACAAGGAGAGAAACATTCAGCCTAA
X83368.1 TTTAATTGGTTTCTACATCTTGTTCTTGGCATCAAACAAGGAGAGAAACATTCAGCCTAA
NM_002649.3 TTTAATTGGTTTCTACATCTTGTTCTTGGCATCAAACAAGGAGAGAAACATTCAGCCTAA
XM_004046020.3 TTTAATTGGTTTCTACATCTTGTTCTTGGCATCAAACAAGGAGAGAAACATTCAGCCTAA
XM_032761353.1 TTTAATTGGTTTCTACATCTTGTTCTTGGCATAAAACAAGGAGAGAAACATTCAGCCTAA
BX648341.1 TTTAATTGGTTTCTACATCTTGTTCTTGGCATCAAACAAGGAGAGAAACATTCAGCCTAA
NM_001282427.2 TTTAATTGGTTTCTACATCTTGTTCTTGGCATCAAACAAGGAGAGAAACATTCAGCCTAA
********************************.***************************

XM_002818334.3 TACTTTAGGCTAGAATCAAAAACAAGTTAGTGTTCTGTGGTTTAAATTAGCATAGCAATC
XM_012512141.2 TACTTTAGGCTAGAATCAAAAACAAGTTAGTGTTCTGTGGTTTAAATTAGCATAGCAATC
XM_001162884.6 TACTTTAGGCTAGAATCAAAAACAAGTTAGTGTTCTATGGTTTAAATTAGCATAGCAATC
XM_003811196.4 TACTTTAGGCTAGAATCAAAAACAAGTTAGTGTTCTATGGTTTAAATTAGCATAGCAATC
X83368.1 TACTTTAGGCTAGAATCAAAAACAAGTTAGTGTTCTATGGTTTAAATTAGCATAGCAATC
NM_002649.3 TACTTTAGGCTAGAATCAAAAACAAGTTAGTGTTCTATGGTTTAAATTAGCATAGCAATC
XM_004046020.3 TACTTTAGGCTAGAATCGAAAACAAGTTAGTGTTCTATGGTTTAAATTAGCATAGCAACC
XM_032761353.1 TACTTTAGGCTAGAATCAATAACAAGTTAGTGTTCTGTGCTTTAAATTAGCATGGCAATC
BX648341.1 TACTTTAGGCTAGAATCAAAAACAAGTTAGTGTTCTATGGTTTAAATTAGCATAGCAATC
NM_001282427.2 TACTTTAGGCTAGAATCAAAAACAAGTTAGTGTTCTATGGTTTAAATTAGCATAGCAATC
*****************.*:****************.** *************.**** *

XM_002818334.3 ATCGAACTTGGATTTCAAATGCAATAGACATTG-TGAAAGCTGGCATTTCAGAAGTATAG
XM_012512141.2 ATCGAACTTGGATTTCAAATGCAATAGACCATTGCGAAAGCTGGCATTTCAGAAGTATAG
XM_001162884.6 ATCGAACTTGGATTTCAAATGCAATAGACATTG-TGAAAGCTGGCATTTCAGAAGTATAG
XM_003811196.4 ATCGAACTTGGATTTCAAATGCAATAGACATTG-TGAAAGCTGGCATTTCAGAAGTATAG
X83368.1 ATCGAACTTGGATTTCAAATGCAATAGACATTG-TGAAAGCTGGCATTTCAGAAGTATAG
NM_002649.3 ATCGAACTTGGATTTCAAATGCAATAGACATTG-TGAAAGCTGGCATTTCAGAAGTATAG
XM_004046020.3 ATCAAACTTGGATTTCAAATGCAATAGACATTG-TGAAAGCTGGCATTTCAGAAGTATAG
XM_032761353.1 ATCGAACTTGGATTTCAAATGCAATAGACCATTGCGAAAGCTGGCATTTCAGAAGTATAG
BX648341.1 ATCGAACTTGGATTTCAAATGCAATAGACATTG-TGAAAGCTGGCATTTCAGAAGTATAG
NM_001282427.2 ATCGAACTTGGATTTCAAATGCAATAGACATTG-TGAAAGCTGGCATTTCAGAAGTATAG
***.*************************.:* *************************

XM_002818334.3 CTCTTTTCCTACCTGAACTCTTCCCTGGAGAAAAGATGTTGGCATTGCTGATTGTTTGGT
XM_012512141.2 CTCTTTTCCTACCTGAACTCTTCCCTGGAGAAAAGATGTTGGCATTGCTGATTGTTTGGT
XM_001162884.6 CTCTTTTCCTACCTGAACTCTTCCCTGGAGAAAAGATGTTGGCATTGCTGATTGTTTGGT
XM_003811196.4 CTCTTTTCCTACCTGAACTCTTCCCTGGAGAAAAGATGTTGGCATTGCTGATTGTTTGGT
X83368.1 CTCTTTTCCTACCTGAACTCTTCCCTGGAGAAAAGATGTTGGCATTGCTGATTGTTTGGT
NM_002649.3 CTCTTTTCCTACCTGAACTCTTCCCTGGAGAAAAGATGTTGGCATTGCTGATTGTTTGGT
XM_004046020.3 CTCTTTTCCTACCTGAACTCTTCCCTGGAGAAAAAATGTTGGCATTGCTGATTGTTTGGT
XM_032761353.1 CTCTTTTCCTACCTGAACTCTTCCCTGGAGAAAAGATGTTGGCATTGCTGATTGTTTGGT
BX648341.1 CTCTTTTCCTACCTGAACTCTTCCCTGGAGAAAAGATGTTGGCATTGCTGATTGTTTGGT
NM_001282427.2 CTCTTTTCCTACCTGAACTCTTCCCTGGAGAAAAGATGTTGGCATTGCTGATTGTTTGGT
**********************************.*************************

XM_002818334.3 TAAGCAATGTCCAGTGCTAGGATTATTTGCAGGTTTGGTTTTTTCTCATTTGTCTGTGGC
XM_012512141.2 TAAGCAATGTCCAGTGCTAGGATTATTTGCAGCTTTGGTTTTTTCTCATTTGTCGGTGGC
XM_001162884.6 TAAGCAATGTCCAGTGCTAGGATTATTTGCAGGTTTGGTTTTTTCTCATTTGTCTGTGGC
XM_003811196.4 TAAGCAATGTCCAGTGCTAGGATTATTTGCAGGTTTGGTTTTTTCTCATTTGTCTGTGGC
X83368.1 TAAGCAATGTCCAGTGCTAGGATTATTTGCAGGTTTGGTTTTTTCTCATTTGTCTGTGGC
NM_002649.3 TAAGCAATGTCCAGTGCTAGGATTATTTGCAGGTTTGGTTTTTTCTCATTTGTCTGTGGC
XM_004046020.3 TAAGCAATGTCCAGTGCTAGGATTATTTGCAGGTTTGGTTTTTTCTCATTTGTCTGTGGC
XM_032761353.1 TAAGCAATGTCCAGTGCTAGGATTATTTGCAGCTTTGGTTTTTTCTCATTTGTCAGTGGC
BX648341.1 TAAGCAATGTCCAGTGCTAGGATTATTTGCAGGTTTGGTTTTTTCTCATTTGTCTGTGGC
NM_001282427.2 TAAGCAATGTCCAGTGCTAGGATTATTTGCAGGTTTGGTTTTTTCTCATTTGTCTGTGGC
******************************** ********************* *****

XM_002818334.3 ATTGGAGAATATTCTCGGTTTAAACAGACTAATGACTTCTTTATTGTCCCTGATATTTTG
XM_012512141.2 ATTGGAGAATATTCTCGGTTTAAACAGACTAATGACTTCCTTATTGTCCCTGATATTTTG
XM_001162884.6 ATTGGAGAATATTCTCGGTTTAAACAGACTGATGACTTCCTTATTGTCCCTGATATTTTG
XM_003811196.4 ATTGGAGAATATTCTCGGTTTAAACAGACTGATGACTTCCTTATTGTCCCTGATATTTTG
X83368.1 ATTGGAGAATATTCTCGGTTTAAACAGACTAATGACTTCCTTATTGTCCCTGATATTTTG
NM_002649.3 ATTGGAGAATATTCTCGGTTTAAACAGACTAATGACTTCCTTATTGTCCCTGATATTTTG
XM_004046020.3 ATTGGAGAATATTCTCGGTTTAAACAGACTAATGACTTCCTTATTGTCCCTGATATTTTG
XM_032761353.1 ATTGGAGAATATTCTCGGTTTAAACAGACTAATGACTTCCTTATTGTCCCTGATATTTTG
BX648341.1 ATTGGAGAATATTCTCGGTTTAAACAGACTAATGACTTCCTTATTGTCCCTGATATTTTG
NM_001282427.2 ATTGGAGAATATTCTCGGTTTAAACAGACTAATGACTTCCTTATTGTCCCTGATATTTTG
******************************.******** ********************

XM_002818334.3 ACTATCTTACTATTGAGTGCTTCTGGAAATTCTTTGGAATAATTGATGACATCTATTTTC
XM_012512141.2 ACTGTCTTACTATTGAGTGCTTCTGGAAATTCTTTGGAATAATTGATGACATCTATTTTC
XM_001162884.6 ACTATCTTACTATTGAGTGCTTCTGGAAATTCTTTGGAATAATTGATGACATCTATTTTC
XM_003811196.4 ACTATCTTACTATTGAGTGCTTCTGGAAATTCTTTGGAATAATTGATGACATCTATTTTC
X83368.1 ACTATCTTACTATTGAGTGCTTCTGGAAATTCTTTGGAATAATTGATGACATCTATTTTC
NM_002649.3 ACTATCTTACTATTGAGTGCTTCTGGAAATTCTTTGGAATAATTGATGACATCTATTTTC
XM_004046020.3 ACTATCTTACTATTGAGTGCTTCTGAAAATTCTTTGGAATAATTGATGACATCTATTTTC
XM_032761353.1 ACTGTCTTACTATTGAGTGCTTCTGGAAATTCTTTGGAATAATTGATGACATCTATTTTC
BX648341.1 ACTATCTTACTATTGAGTGCTTCTGGAAATTCTTTGGAATAATTGATGACATCTATTTTC
NM_001282427.2 ACTATCTTACTATTGAGTGCTTCTGGAAATTCTTTGGAATAATTGATGACATCTATTTTC
***.*********************.**********************************

XM_002818334.3 ATCTGGGTTTACTCTCAATTTTGGTTATCTTTGTGTTCCTCAAGCTCTTTAAAGAAAAAG
XM_012512141.2 ATCTGGGTTTAGTCTCAATTTTGGTTATCTTTGTGTTCCTCGAGCTCTTTAAAGAAAAAG
XM_001162884.6 ATCTGGGTTTAGTCTCAATTTTGGTTATCTTTGTGTTCCTCAAGCTCTTTAAAGAAAAAG
XM_003811196.4 ATCTGGGTTTAGTCTCAATTTTGGTTATCTTTGTGTTCCTCAAGCTCTTTAAAGAAAAAG
X83368.1 ATCTGGGTTTAGTCTCAATTTTGGTTATCTTTGTGTTCCTCAAGCTCTTTAAAGAAAAAG
NM_002649.3 ATCTGGGTTTAGTCTCAATTTTGGTTATCTTTGTGTTCCTCAAGCTCTTTAAAGAAAAAG
XM_004046020.3 ATCTGGGTTTAGTCTCAATTTTGGTTATCTTTGTGTTCCTCAAGCTCTTTAAAGAAAAAG
XM_032761353.1 ATCTGGGTTTAGTCTCAATTTTGGTTATCTTTGTGTTCCTCGAGCTCTTTAAAGAAAAAG
BX648341.1 ATCTGGGTTTAGTCTCAATTTTGGTTATCTTTGTGTTCCTCAAGCTCTTTAAAGAAAAAG
NM_001282427.2 ATCTGGGTTTAGTCTCAATTTTGGTTATCTTTGTGTTCCTCAAGCTCTTTAAAGAAAAAG
*********** *****************************.******************

XM_002818334.3 ATGTAATCATTGTAACCTTTGTCTCATTCCTTACATGATGCTTCCAAACATCTCCTTAGT
XM_012512141.2 ATGTAATCATTGTAACCTTTGTCTCATTCCTTAAATGATGCTTCCAAACAGCTCCTTAGT
XM_001162884.6 ATGTAATCATTGTAACCTTTGTCTCATTCCTTAAATGATGCTTCCAAACATCTCCTTAGT
XM_003811196.4 ATGTAATCATTGTAACCTTTGTCTCATTCCTTAAATGATGCTTCCAAACATCTCCTTAGT
X83368.1 ATGTAATCGTTGTAACCTTTGTCTCATTCCTTAAATGATGCTTCCAAACATCTCCTTAGT
NM_002649.3 ATGTAATCGTTGTAACCTTTGTCTCATTCCTTAAATGATGCTTCCAAACATCTCCTTAGT
XM_004046020.3 ATGTAATCATTGTAACCTTTGTCTCATTCCTTAAATGATGCTTCCAAACATCTCCTTAGT
XM_032761353.1 ATGTAATCATTGTAACCTTTGTCTCATTCCTTAAATGATGCTTCCAAACAGCTCCTTAGT
BX648341.1 ATGTAATCGTTGTAACCTTTGTCTCATTCCTTAAATGATGCTTCCAAACATCTCCTTAGT
NM_001282427.2 ATGTAATCGTTGTAACCTTTGTCTCATTCCTTAAATGATGCTTCCAAACATCTCCTTAGT
********.************************.**************** *********

XM_002818334.3 GTCTGCAGGTGTTAGTGGTGTGCTAAAAGCAAGGAAAGCGAGTTAGTCTTTTCAGTGTCT
XM_012512141.2 GTCTGCAGGTGTTAGTGGTGTGCTAAAAGCAAGGAAAATGAGTTAGTCTTTTCAGTGTCT
XM_001162884.6 GTCTGCAGGTGTTAGTGGTGTGCTAAAAGCAAGGAAAGCGAGTTAGTCTTTTCAGTGTCT
XM_003811196.4 GTCTGCAGGTGTTAGTGGTGTGCTAAAAGCAAGGAAAGCGAGTTAGTCTTTTCAGTGTCT
X83368.1 GTCTGCAGGTGTTAGTGGTGTGCTAAAAGCAAGGAAAGCGAGTTAGTCTTTTCAGTGTCT
NM_002649.3 GTCTGCAGGTGTTAGTGGTGTGCTAAAAGCAAGGAAAGCGAGTTAGTCTTTTCAGTGTCT
XM_004046020.3 GTCTGCAGGTGTTAGTGGTGTGCTAAAAGCAAGGAAAGCGAGTTAGTCTTTTCAGTGTCT
XM_032761353.1 GTCTGCAGGTGTTAGTGGTGTGCTAAAAGCAAGGAAAGTGAGTTAGTCTTTTCAGTGTCT
BX648341.1 GTCTGCAGGTGTTAGTGGTGTGCTAAAAGCAAGGAAAGCGAGTTAGTCTTTTCAGTGTCT
NM_001282427.2 GTCTGCAGGTGTTAGTGGTGTGCTAAAAGCAAGGAAAGCGAGTTAGTCTTTTCAGTGTCT
*************************************. *********************

XM_002818334.3 TTTGCAATTCAATTCTTTTGTCATGTATAACTGAGACACACAAACACAGCAGGAGAAATC
XM_012512141.2 TTTGCAATTCAATTCTTTTGTCATGTATAACTGAGACACACAAACACAGCAGGAGAAATC
XM_001162884.6 TTTGCAATTCAATTCTTTTGTCATGTATAACTGAGACACACAGACACAGCAGGAGAAATC
XM_003811196.4 TTTGCAATTCAATTCTTTTGTCATGTATAACTGAGACACACAGACACAGCAGGAGAAATC
X83368.1 TTTGCAATTCAATTCTTTTGTCATGTATAACTGAGACACACAAACACAGCAGGAGAAATC
NM_002649.3 TTTGCAATTCAATTCTTTTGTCATGTATAACTGAGACACACAAACACAGCAGGAGAAATC
XM_004046020.3 TTTGCAATTCAATTCTTTTGTCATGTATAACTGAGACACACAAACACAGCAGGAGAAATC
XM_032761353.1 TTTGCAATTCAATTCTTTTGTCATGTATAACTGAGACACACAAACACAGCAGGAGAAATC
BX648341.1 TTTGCAATTCAATTCTTTTGTCATGTATAACTGAGACACACAAACACAGCAGGAGAAATC
NM_001282427.2 TTTGCAATTCAATTCTTTTGTCATGTATAACTGAGACACACAAACACAGCAGGAGAAATC
******************************************.*****************

XM_002818334.3 CAAACCATTGTGCCTTGACCTTCCTCTGCTGGTCTTGTTCCAGGGTTATGAATATGAAAA
XM_012512141.2 CGAACCATTGTGACTTGACCTTCCTCTGCTGGTCTTGTTCCAGGGTTATGAATATGAAAA
XM_001162884.6 TAAACCGTTGTGCTTTGACCTTCCTCTGCTGGTCTTGTTCCAGGGTTATGAATACGAAAA
XM_003811196.4 TAAACCGTTGTGCTTTGACCTTCCTCTGCTGGTCTTGTTCCAGGGTTATGAATACGAAAA
X83368.1 TAAACCGTTGTGCCTTGACCTTCCTCTGCTGGTCTTGTTCCAGGGTTATGAATATGAAAA
NM_002649.3 TAAACCGTTGTGCCTTGACCTTCCTCTGCTGGTCTTGTTCCAGGGTTATGAATATGAAAA
XM_004046020.3 TAAACCGTTGTGCCTTGACCTTCCTCTGCTGGTCTTGTTCCAGGGTTATGAATATGAAAA
XM_032761353.1 CGAACCGTTGTGACTTGACCTTCCTCTGCTGGTCTTGTTCCAGGGTTATGAATATGAAAA
BX648341.1 TAAACCGTTGTGCCTTGACCTTCCTCTGCTGGTCTTGTTCCAGGGTTATGAATATGAAAA
NM_001282427.2 TAAACCGTTGTGCCTTGACCTTCCTCTGCTGGTCTTGTTCCAGGGTTATGAATATGAAAA
.****.*****. **************************************** *****

XM_002818334.3 AATAGAGATGAGACTTTTTGTGTCAACTCTGTCCACAAGAGTGAGTTATCTAGTATGATT
XM_012512141.2 AATAGAGATGAGACTTTTTGTGTCAATTCTGTCCACAAGAGTGAGTTATCTAGTATGATT
XM_001162884.6 AATAGAGATGAGACTTTTTGTGTCAACTCTGTCCACAAGAGTGAGTTATCTAGTATGATT
XM_003811196.4 AATAGAGATGAGACTTTTTGTGTCAACTCTGTCCACAAGAGTGAGTTATCTAGTATGATT
X83368.1 AATAGAGATGAGACTTTTTGTGTCAACTCTGTCCACAAGAGTGAGTTATCTAGTATGATT
NM_002649.3 AATAGAGATGAGACTTTTTGTGTCAACTCTGTCCACAAGAGTGAGTTATCTAGTATGATT
XM_004046020.3 AATAGAGATGAGACTTTTTGTGTCAACTCTGTCCACAAGAGTGAGTTATCTAGTATGATT
XM_032761353.1 AATAGAGATGAGACTTTTTGTGTCAACTCTGTCCACAAGAGTGAGTTATCTAGTATGATT
BX648341.1 AATAGAGATGAGACTTTTTGTGTCAACTCTGTCCACAAGAGTGAGTTATCTAGTATGATT
NM_001282427.2 AATAGAGATGAGACTTTTTGTGTCAACTCTGTCCACAAGAGTGAGTTATCTAGTATGATT
************************** *********************************

XM_002818334.3 AGTATAGCTTTCTCCAGCATGGTAGCAGGAAGTAACTACAGGGCCTCTTTTATGCCTGAC
XM_012512141.2 AGTATAGCTTTCTCCAGCATGGCAGCAGGAAGTAACTACAGGGCCTCTTTTATGCCTGAC
XM_001162884.6 AGTATAGCTTTCTCCAGCATGGCAGCAGGAAGTAACTACAGGGCCTCTTTTATGCCTGAC
XM_003811196.4 AGTATAGCTTTCTCCAGCATGGCAGCAGGAAGTAACTACAGGGCCTCTTTTATGCCTGAC
X83368.1 AGTATAGCTTTCTCCAGCATGGCAGCAGGAAGTAACTACAGGGCCTCTTTTATGCCTGAC
NM_002649.3 AGTATAGCTTTCTCCAGCATGGCAGCAGGAAGTAACTACAGGGCCTCTTTTATGCCTGAC
XM_004046020.3 AGTATAGCTTTCTCCAGCATGGCAGCAGGAAGTAACTACAGGGCCTCTTTTATGCCTGAC
XM_032761353.1 AGTATAGCTTTCTCCAGCATGGCAGCAGGAAGTAACTACAGGGCCTCTTTTATGCCTGAC
BX648341.1 AGTATAGCTTTCTCCAGCATGGCAGCAGGAAGTAACTACAGGGCCTCTTTTATGCCTGAC
NM_001282427.2 AGTATAGCTTTCTCCAGCATGGCAGCAGGAAGTAACTACAGGGCCTCTTTTATGCCTGAC
********************** *************************************

XM_002818334.3 ATTTCTTCCCTTCCTTTTTCCCTGCCTCCCTTTTTCATCAGTTGTGATGCTTCCACAACT
XM_012512141.2 ATTTCTTCCCTTCCTATTTCCCTGCCTCCCTTTTTCATCAATTGCGATGCTCCCACAACT
XM_001162884.6 ATTTCTTCCCTTCCTTTTTCCCTGCCTCCCTTTTTCATCAATTGCGATGCTCCCACAACT
XM_003811196.4 ATTTCTTCCCTTCCTTTTTCCCTGCCTCCCTTTTTCATCAATTGCGATGCTCCCACAACT
X83368.1 ATTTCTTCCCTTCCTTTTTCCCTGCCTCCCTTTTTCATCAATTGCAATGCTCCCACAACT
NM_002649.3 ATTTCTTCCCTTCCTTTTTCCCTGCCTCCCTTTTTCATCAATTGCGATGCTCCCACAACT
XM_004046020.3 ATTTCTTCCCTTCCTTTTTCCCTGCCTCCGTTTTTCATCAATTGCGATGCTCCCACAACT
XM_032761353.1 ATTTCTTCCCTTCCTATTTCCCTGCCTCCCTTTTTCATCAATTGCGATGCTCCCACAACT
BX648341.1 ATTTCTTCCCTTCCTTTTTCCCTGCCTCCCTTTTTCATCAATTGCAATGCTCCCACAACT
NM_001282427.2 ATTTCTTCCCTTCCTTTTTCCCTGCCTCCCTTTTTCATCAATTGCGATGCTCCCACAACT
***************:************* **********.*** .***** ********

XM_002818334.3 CTTTACAGACTTGTGAAATCTTCAAGGACACCTTTACTCTAGAACTCAAAAATTAGTTGA
XM_012512141.2 CTTTACAGACTTGTGAAATCTTCAAGAACACCTTTACTCTAGAACTCAAAAATTAGCTGA
XM_001162884.6 CTTTACAGACTTGTGAAATCTTCAAGAACGCCTTTACTCTATAACTCAAAAATTAGTTGA
XM_003811196.4 CTTTACAGACTTGTGAAATCTTCAAGAACACCTTTACTCTATAACTCAAAAATTAGTTGA
X83368.1 CTTTACAGACTTGTGAAATCTTCAAGAACACCTTTACTCTATAACTCAAAAATTAGTTGA
NM_002649.3 CTTTACAGACTTGTGAAATCTTCAAGAACACCTTTACTCTATAACTCAAAAATTAGTTGA
XM_004046020.3 CTTTACAGACTTGTGAAATCTTCAAGAACACCTTTACTCTATAACTCAAAAATTAGTTGA
XM_032761353.1 CTTTACAGACTTGTGAAATCTTCAAGAACACCTTTACTCTAGAACTCAAAAATTAGCTGA
BX648341.1 CTTTACAGACTTGTGAAATCTTCAAGAACACCTTTACTCTATAACTCAAAAATTAGTTGA
NM_001282427.2 CTTTACAGACTTGTGAAATCTTCAAGAACACCTTTACTCTATAACTCAAAAATTAGTTGA
**************************.**.*********** ************** ***

XM_002818334.3 AAAATAATTACTTCTCAAGGATTATTAGAATCTTAGGTACTTATTTGTAAAGATGTTTAG
XM_012512141.2 AAAATAATTACTTCTCAAGGATTATTAGAATCTTAGGTACTTATTTGTAAAGACATTTAG
XM_001162884.6 AAAATAATTACTTCTCAAGGATTATTAGAATCTTAGGTACTTATTTGTAAAGATGTTTAG
XM_003811196.4 AAAATAATTACTTCTCAAGGATTATTAGAATCTTAGGTTCTTATTTGTAAAGATGTTTAG
X83368.1 AAAATAATTACTTCTCAAGGATTATTAGAATCTTAGGTACTTATTTGTAAAGATGTTTAG
NM_002649.3 AAAATAATTACTTCTCAAGGATTATTAGAATCTTAGGTACTTATTTGTAAAGATGTTTAG
XM_004046020.3 AAAATAATTACTTCTCAAGGATTATTAGAATCTTAGGTACTTATTTGTAAAGATGTTTAG
XM_032761353.1 AAAATAATTACTTCTCAAGGATTATTAGAATCTTAGGTACTTATTTGTAAAGACATTTAG
BX648341.1 AAAATAATTACTTCTCAAGGATTATTAGAATCTTAGGTACTTATTTGTAAAGATGTTTAG
NM_001282427.2 AAAATAATTACTTCTCAAGGATTATTAGAATCTTAGGTACTTATTTGTAAAGATGTTTAG
**************************************:************** .*****

XM_002818334.3 TGACTTTTTTTTCAAGTATCTTATTAAAGGAGGCATTCTAGAAAATATGAATTAGTTTCC
XM_012512141.2 TGACTTTTTTTTCAAGTGTCTTATTAAAGGAGGCATTCTAGAAAATATGAATTAGTTTCC
XM_001162884.6 TGACTTTTTTTTCAAGTATCTTATTAAAGGAGGCATTCTAGAAAATATGAATTAGTTTCC
XM_003811196.4 TGACTTTTTTTTCAAGTATCTTATTAAAGGAGGCATTCTAGAAAATATGAATTAGTTTCC
X83368.1 TGACTTTTTTTTCAAGTATCTATAAAGGAGGCAGATTCTAGAAAATATGAATTAGTTTCC
NM_002649.3 TGACTTTTTTTTCAAGTATCTTATTAAAGGAGGCATTCTAGAAAATATGAATTAGTTTCC
XM_004046020.3 TGACTTTTTTTTCAAGTATCTTATTAAAGGAGGCATTCTAGAAAATATGAATTAGTTTCC
XM_032761353.1 TGACTTTTTTTTCAAGTGTCTTATTAAAGGAGGCATTCTAGAAAATATGAATTAGTTTCC
BX648341.1 TGACTTTTTTTTCAAGTATCTTATTAAAGGAGGCATTCTAGAAAATATGAATTAGTTTCC
NM_001282427.2 TGACTTTTTTTTCAAGTATCTTATTAAAGGAGGCATTCTAGAAAATATGAATTAGTTTCC
*****************.***::::*...*. . **************************

XM_002818334.3 AAATGCCTTAATTTTAAACTTTGGCCTGAACAGTGTTTTCTTTTTTTTAATGGAAGAAGA
XM_012512141.2 AAATGCCTTAATTTTAAACTTTGGCCTGAACAGTTTTTTCTTTTTTTAATAG-AAGAAGA
XM_001162884.6 AAATGCCTTAATTTTAAACTTTGGCCTGAACAGTTTTTTCTTTTTCTTAATGGAAGAAGA
XM_003811196.4 AAATGCCTTAATTTTAAACTTTGGCCTGAACAGTTTTTTCTTTTTCTTAATGGAAGAAGA
X83368.1 AAATGCCTTAATTTTAAACTTTGGCCTGAACAGTTTTTTCTTTTTCTTAATGGAAGAAGA
NM_002649.3 AAATGCCTTAATTTTAAACTTTGGCCTGAACAGTTTTTTCTTTTTCTTAATGGAAGAAGA
XM_004046020.3 AAATGCCTTAATTTTAAACTTTGGCCTGAACAGTTTTTTCTTTTTCTTAATGGAAGAAGA
XM_032761353.1 AAATGCCTTAATTTTAAACTTCGGCCTGAACAGTTTTTTCTTTTTTTAATAG-AAGAAGA
BX648341.1 AAATGCCTTAATTTTAAACTTTGGCCTGAACAGTTTTTTCTTTTTCTTAATGGAAGAAGA
NM_001282427.2 AAATGCCTTAATTTTAAACTTTGGCCTGAACAGTTTTTTCTTTTTCTTAATGGAAGAAGA
********************* ************ ********** *:*::* *******

XM_002818334.3 TATTTAATATCTTAAAAATATTCCAAGTTAGGAAGAACACTACTTGCCTTATCCATTTCC
XM_012512141.2 TATTTAATATCTTAAAAATATTCCAAGTTAGGAAGAACACTACTTGCCTTATCCATTTCC
XM_001162884.6 TATTTAATATCTTAAAAATATTCCAAGTTAGGAAGAACACTACTTGCCTTATCCATTTCC
XM_003811196.4 TATTTAATATCTTAAAAATATTCCAAGTTAGGAAGAACACTACTTGCCTTATCCATTTCC
X83368.1 TATTTAATATCTTAAAAATATTCCAAGTTAGGAAGAACACTACTTGCCTTATCCATTTCC
NM_002649.3 TATTTAATATCTTAAAAATATTCCAAGTTAGGAAGAACACTACTTGCCTTATCCATTTCC
XM_004046020.3 TATTTAATATCTTAAAAATATTCAAAGTTAGGAAGAACACTACTTGCCTTATCCATTTCC
XM_032761353.1 TATTTAATATCTTAAAAATATTCCAAGTTAGGAAGAACACTACTTGCCTTATCCATTTCC
BX648341.1 TATTTAATATCTTAAAAATATTCCAAGTTAGGAAGAACACTACTTGCCTTATCCATTTCC
NM_001282427.2 TATTTAATATCTTAAAAATATTCCAAGTTAGGAAGAACACTACTTGCCTTATCCATTTCC
***********************.************************************

XM_002818334.3 CATTTAAAGGACTTTTAAACTTTGACACA-TCCTTCAGATTTCCTAAAAATAATTGAAAT
XM_012512141.2 CATTTAAAGGACTTTTAAACTTTGACATA-TCCTTCAGATTTCCTAAAAGTAATTGAAAT
XM_001162884.6 CATTTAAAGGACTTTTAAACTTTGACACA-TCCTTCAGATTTCCTGAAAATAATTGAAAT
XM_003811196.4 CATTTAAAGGACTTTTAAACTTTGACACA-TCCTTCAGATTTCCTGAAAATAATTGAAAT
X83368.1 CATTTAAAGGACTTTTAAACTTTGACACAGTCCTTCAGATTTCCTGAAAATCCTTGAAAT
NM_002649.3 CATTTAAAGGACTTTTAAACTTTGACACA-TCCTTCAGATTTCCTGAAAATAATTGAAAT
XM_004046020.3 CATTTAAAGGACTTTTAAACTTTGACACA-TCCTTCAGATTTCCTGAAAATAATTGAAAT
XM_032761353.1 CATTTAAAGGACTTTTAAACTTTGACATA-TCCTTCAGATTTCCTAAAAGTAACTGAAAT
BX648341.1 CATTTAAAGGACTTTTAAACTTTGACACA-TCCTTCAGATTTCCTGAAAATAATTGAAAT
NM_001282427.2 CATTTAAAGGACTTTTAAACTTTGACACA-TCCTTCAGATTTCCTGAAAATAATTGAAAT
*************************** * ***************.***.*.. ******

XM_002818334.3 ATCTTACTGTAAAAATATTTTCATCTCTTAAATATCTCATTATTTATTGGAGGTATTGTT
XM_012512141.2 ATCTTACTTTAAAAATATTTTCACCTCTTAAATATCTCATTATTTATTGGAGGTATTGTT
XM_001162884.6 ATCTTACTTTAAAAATATTTTCATCTCTTAAATATCTCGTTATTTATTGGAGGTATTGTT
XM_003811196.4 ATCTTACTTTAAAAATATTTTCATCTCTTAAATATCTCGTTATTTATTGGAGGTATTGTT
X83368.1 ATCTTACTTTAAAAATATTTTCATCTCTGAAATATCTCGTTATTTATTGGAGGTATTGTT
NM_002649.3 ATCTTACTTTAAAAATATTTTCATCTCTGAAATATCTCGTTATTTATTGGAGGTATTGTT
XM_004046020.3 ATCTTACTTTAAAAATATTTTCATCTCTTAAATATCTCGTTATTTATTGGAGGTATTGTT
XM_032761353.1 ATCTTACTTTAAAAATATTTTCATCTCTTAAATATCTCATTATTTATTGGAGGTATTGTT
BX648341.1 ATCTTACTTTAAAAATATTTTCATCTCTGAAATATCTCGTTATTTATTGGAGGTATTGTT
NM_001282427.2 ATCTTACTTTAAAAATATTTTCATCTCTGAAATATCTCGTTATTTATTGGAGGTATTGTT
******** ************** **** *********.*********************

XM_002818334.3 TAACCTTAGAGAGACCATTAAATTATTTATAAAATATTTTGTAATTACCTGTAGTTAATA
XM_012512141.2 TAACCTTAGAGAGACCATTAAATTATTTATAAAATATTTTGTAATTACCTGTAGTTAATA
XM_001162884.6 TAACCTTAGAGAGACCATTAAATTATTTATAAAATATTTTGTAATTACCTGTAGTTAATA
XM_003811196.4 TAACCTTAGAGAGACCATTAAATTATTTATAAAATATTTTGTAATTACCTGTAGTTAATA
X83368.1 TAACCTTAGATAGACCATTAAATTATTTATAAAATATTTTGTAATTACTG-TAGCTAATA
NM_002649.3 TAACCTTAGAGAGACCATTAAATTATTTATAAAATATTTTGTAATTACCTGTAGCTAATA
XM_004046020.3 TAACCTTAGAGAGACCATTAAATTATTTATAAAATATTTTGTAATTACCTGTAGTTAATA
XM_032761353.1 TAACCTTAGAGAGACCATTAAATTATTTATAAAATATTTTGTAATTACCTGTAGTTAATA
BX648341.1 TAACCTTAGAGAGACCATTAAATTATTTATAAAATATTTTGTAATTACCTGTAGCTAATA
NM_001282427.2 TAACCTTAGAGAGACCATTAAATTATTTATAAAATATTTTGTAATTACCTGTAGCTAATA
********** ************************************* *** *****

XM_002818334.3 CATTACATAGAAAAAAACTCTATGTTAACAGTGTCTATGTTTAAGTATAATCAGATATAA
XM_012512141.2 CATTACATAGAAAAAAACTCTATGTTAACAGTGTCTATGTTTAAGTATAATCAGATATAA
XM_001162884.6 CATTACATAGAAAAAAACTATGT--TAACAGTGTCTCTGTTTAAGTATAATCAGATATAA
XM_003811196.4 CATTACATAGAAAAAAACTATGT--TAACAGTGTCTCTGTTTAAGTATAATCAGATATAA
X83368.1 CATTACATAG-AAAAAACTATGT--TAACAGTGTCTCTGTTTAAGTATAATCAGATATAA
NM_002649.3 CATTACATAGAAAAAAACTATGT--TAACAGTGTCTCTGTTTAAGTATAATCAGATATAA
XM_004046020.3 CATTACATAGAAAAAAACTATGT--TAACAGTGTGTCTGTTTAAGTATAATCAGATATAA
XM_032761353.1 CATTACATAGAAAAAAACTCTATGTTAACAGTGTCTATGTTTAAGTATAATCAGATATGA
BX648341.1 CATTACATAGAAAAAAACTATGT--TAACAGTGTCTCTGTTTAAGTATAATCAGATATAA
NM_001282427.2 CATTACATAGAAAAAAACTATGT--TAACAGTGTCTCTGTTTAAGTATAATCAGATATAA
********** ********.*.* ********* *.*********************.*

XM_002818334.3 ATATATACTT--AATTTTTTAATTTTAAAAATAGATACCTGTTTGACTTTGAGGTAGTCC
XM_012512141.2 ATATATACTT--AATTTTTTAATTTTAAAAATAGCTACCTGTTTGACTTTGAGGTAGCCC
XM_001162884.6 ATATATACTT--AATTTTTTAATTTTAAAAATAGATACCTGTTTGACTTTGAGGTAGTCC
XM_003811196.4 ATATATACTT--AATTTTTTAATTTTAAAAATAGATACCTGTTTGACTTTGAGGTAGTCC
X83368.1 ATATATAACTTAATTTTTTAATTTTAAAAAATAGATACCTGTTTGACTTTGAGGTAGTCC
NM_002649.3 ATATATACTT--AATTTTTTAATTTTAAAAATAGATACCTGTTTGACTTTGAGGTAGTCC
XM_004046020.3 ATATATACTT--AATTTTTTAATTTAAAAAATAGATACCTGTTTGACTTTGAGGTAGTCC
XM_032761353.1 ATATATACTT--AATTTTTTAATTTTAAAAATAGCTACCTGTTTGACTTTGAGGTAGTCC
BX648341.1 ATATATACTT--AATTTTTTAATTTTAAAAATAGATACCTGTTTGACTTTGAGGTAGTCC
NM_001282427.2 ATATATACTT--AATTTTTTAATTTTAAAAATAGATACCTGTTTGACTTTGAGGTAGTCC
*******. * *:*****:*:***:********.********************** **

XM_002818334.3 AGACCTTTTTTTTTTTTTTTTTT----------AAATGTGTGCAAAAGCCCAGTGGTTCC
XM_012512141.2 AGACATTTTTTTTTTTTTTTTTTT-------TTAAATGTGTGCAAAAGCCCAATGGTTCC
XM_001162884.6 AGACCTTTTCTTTTTTTTTTTTTTTTTTTTTTTAAATGTGTGCAAAAGCCCAAAGGTTCC
XM_003811196.4 AGACCTTTTCTTTTTTTTTTTTTTTTTTT----AAATGTGTGCAAAAGCCCAAAGGTTCC
X83368.1 AGGCCTTTTTCTTTTTTTTTTTT-----TTT--AATG-TGTGCAAAAGCCCAAAGGTTCC
NM_002649.3 AGACCTTTTCTTTTTTTTTTTTT-----TTT--TAATGTGTGCAAAAGCCCAAAGGTTCC
XM_004046020.3 AGACCTTTTTTTTTTTTTTTTTT-----TTTTTAATGTG-TGCAAAAGCCCAGTGGTTCC
XM_032761353.1 AGACCTTTTTTTTTTTTTTTTTT-----TTTTAAAATCTGTGCAAAAGCCCAATGGTTCC
BX648341.1 AGACCTTTTCTTTTTTTTTTTTTT-------TTAATG-TGTGCAAAAGCCCAAAGGTTCC
NM_001282427.2 AGACCTTTTCTTTTTTTTTTTTTT-------TTTAATGTGTGCAAAAGCCCAAAGGTTCC
**.*.**** ************ :*: ************.:******

XM_002818334.3 TAAGCCTGGCTGCAAAGAAGAATCAACAGGGACACTTTTTAAAAACACTCTTATCAGCCT
XM_012512141.2 TAAGCCTGGCTGCAAGGAAGAATCAACAGGGACACTTTTTAAAAACACTC--ATCAGCCT
XM_001162884.6 TAAGCCTGGCTGCAAAGAAGAATCAACAGGGACACTTTTTAAAAACACTCTTATCAGCCT
XM_003811196.4 TAAGCCTGGCTGCAAAGAAGAATCAACAGGGACACTTTTTAAAAACACTCTTATCAGCCT
X83368.1 TAAGCCTGGCTGCAAAGAAGAATCAACAGGGACACTTTTTAAAAACACTCTTATCAGCCT
NM_002649.3 TAAGCCTGGCTGCAAAGAAGAATCAACAGGGACACTTTTTAAAAACACTCTTATCAGCCT
XM_004046020.3 TAAGCCTGGCTGCAAAGAAGAATCAACAGGGACACTTTTTAAAAACACTCTTATCAGCCT
XM_032761353.1 TAAGCCTGGCTGCAAGGAAGAATCAACAGGGACACTTTTTAAAAACACTCATCAGCCTGG
BX648341.1 TAAGCCTGGCTGCAAAGAAGAATCAACAGGGACACTTTTTAAAAACACTCTTATCAGCCT
NM_001282427.2 TAAGCCTGGCTGCAAAGAAGAATCAACAGGGACACTTTTTAAAAACACTCTTATCAGCCT
***************.********************************** .: .

XM_002818334.3 GGG-CAACACAGTGAGACCCCATCTCTTAAAAAAAAAAATTAACTGGGTATAGTGGTGTG
XM_012512141.2 GGG-CAACACAGTGAGACCCCATCTCTAAAAAAACTTAG----CTGGGTATAGTGGTGTG
XM_001162884.6 GGG-CAACACAGTGAGACTCCATCTCTTAAAAAAAAAAATTAGCTGGGTATAGTGGTATG
XM_003811196.4 GGG-CAACACAGTGAGACTCCATCTCTTAAAAAAAAAAATTAGCTGGGTATAGTGGTATG
X83368.1 GGGGCAACACAGTGAGACTCCATCTCTTAAAAAAAAAATTAG-CTGGGTATAGTGGTATG
NM_002649.3 GGG-CAACACAGTGAGACTCCATCTCTTAAAAAAAAAATTAG-CTGGGTATAGTGGTATG
XM_004046020.3 GGG---CAACACAGACTCCATCTCTTAAAAAAAAAAATTAG--CTGGGTATAGTGGTGTG
XM_032761353.1 GCA---ACACAGTGAGACCCCATCTCTAAAAAAAAAGTTAG--CTGGGTATAGTGGTGTG
BX648341.1 GGG-CAACACAGTGAGACTCCATCTCTTAAAAAAAAAATTAG-CTGGGTATAGTGGTATG
NM_001282427.2 GGG-CAACACAGTGAGACTCCATCTCTTAAAAAAAAAATTAG-CTGGGTATAGTGGTATG
* . ..*** :** :* . .*** ::******.: : **************.**

XM_002818334.3 TGCCTGTAGTCCCAGGTACTCGGGAGGCTGAGGCAGGAGGATTGCCTGAGCCCAGGAGGT
XM_012512141.2 TGCCTATAGTCCCAGGTACTCGGGAGGCTGAGGCAGGAGGATTTCCTGAGCCCAAGAGGT
XM_001162884.6 TGCCTGTAGTCCCAGGTACTCAGGAGGCTGAGGCAGGAGGATTGCCTGAGCCCAGGAGGT
XM_003811196.4 TGCCTGTAGTCCCAGGTACTCAGGAGGCTGAGGCAGGAGGATTGCCTGAGCCCAGGAGGT
X83368.1 TGCCTGTAGTCCCAGGTACTCAGGAGGCTGAGGCAGGAGGATTGCCTGAGCCCAGGAGGT
NM_002649.3 TGCCTGTAGTCCCAGGTACTCAGGAGGCTGAGGCAGGAGGATTGCCTGAGCCCAGGAGGT
XM_004046020.3 TGCCTGTAGTCCCAGGTACTCAGGAGGCTGAGGCAGGAGGATTGCCTGAGCCCAGGAGGT
XM_032761353.1 TGCCTGTAGTCCCAGGTACTCGGGAGGCTGAGGCAGGAGGATTTCCTGAGCCCAAGAGGT
BX648341.1 TGCCTGTAGTCCCAGGTACTCAGGAGGCTGAGGCAGGAGGATTGCCTGAGCCCAGGAGGT
NM_001282427.2 TGCCTGTAGTCCCAGGTACTCAGGAGGCTGAGGCAGGAGGATTGCCTGAGCCCAGGAGGT
*****.***************.********************* **********.*****

XM_002818334.3 GGAAACTGCAGACAGTGGTGATCATGTCCTTACACTCCAGCCTGGATAACAGAACGAGAC
XM_012512141.2 GGAAACTGCCGAGAGTCATGATCATGTCCTTACAGTCCAGCTTGGATAACAGAGCGAGAA
XM_001162884.6 GGAAACTGCAGAGAGTCATGATCATGTCCTTACACTCCAGCCTGGATAACAGAGCGAGAC
XM_003811196.4 GGAAACTGCAGAGAGTCATGATCATGTCCTTACACTCCAGCCTGGATAACAGAGCGAGAC
X83368.1 GGAAACTGCAGAGAGTCATGATCATGTCCTTACACTCCAGCCTGGATAACAGAGCGAGAC
NM_002649.3 GGAAACTGCAGAGAGTCATGATCATGTCCTTACACTCCAGCCTGGATAACAGAGCGAGAC
XM_004046020.3 GGAAACTGCAGAGAGTCATGATCATGTCCTTACACTCCAGCCTGGATAACAGAGCGAGAC
XM_032761353.1 GGAAACTGCCGAGAGTCATGATCATGTCCTTACAATCCAGCTTGGATAACAGAGCGAGAA
BX648341.1 GGAAACTGCAGAGAGTCATGATCATGTCCTTACACTCCAGCCTGGATAACAGAGCGAGAC
NM_001282427.2 GGAAACTGCAGAGAGTCATGATCATGTCCTTACACTCCAGCCTGGATAACAGAGCGAGAC
*********.** *** .**************** ****** ***********.*****.

XM_002818334.3 CCTGTCTCAAAAAAA---------------------------------------------
XM_012512141.2 CCTGTCTCAAAAAAA---------------------------------------------
XM_001162884.6 CCTGTCTCAAAAAAATAAAATAAAAAATAAAAACACCCTTGCCTGCGCTCCATTCCCAGG
XM_003811196.4 CCTGTCTCAAAAAAATAAAATAAAAAATAAAAACACCCTTGCCTGCGCTCCATTCCCAGG
X83368.1 CCTGTCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAACTCGAG------------------
NM_002649.3 CCTGTCTCAAAAAAATAAAATAAAAAATAAAAACACCCTTGCCTGCGCTCCATTCCCAGG
XM_004046020.3 CCTGTCTCAAAAAAA---------------------------------------------
XM_032761353.1 CCTGTCTCAAAAAAATAAAATAAAAAATAAAAACACCCTTGCCTGCGCTCCATTCCCAGG
BX648341.1 CCTGTCTCAAAAAAATAAAAAAAAAAAAAAAAAAAACA----------------------
NM_001282427.2 CCTGTCTCAAAAAAATAAAATAAAAAATAAAAACACCCTTGCCTGCGCTCCATTCCCAGG
***************

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 TTATAATTTAATTACTGTGGGATGAGACCCAGACATCAATATTTTTCTAGATTCTTCAGG
XM_003811196.4 TTATAATTTAATTACTGTGGGATGAGACCCAGACATCAATATTTTTCTAGATTCTTCAGG
X83368.1 ------------------------------------------------------------
NM_002649.3 TTATAATTTAATTACTGTGGGATGAGACCCAGACATCAATATTTTTCTAGATTCTTCAGG
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 TTATAATTTAATTACTGTGGGATGAGGCCCAGACATCAATATTTTTCTAGATTCTTTAGG
BX648341.1 ------------------------------------------------------------
NM_001282427.2 TTATAATTTAATTACTGTGGGATGAGACCCAGACATCAATATTTTTCTAGATTCTTCAGG

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 TGATTCTAATTCACAGCCAGAGTTGGAAGCCCTGCGTGGCCTTTGAAGGTCTAGATGATT
XM_003811196.4 TGATTCTAATTCACAGCCAGAGTTGGAAGCCCTGCGTGGCCTTTGAAGGTCTAGATGATT
X83368.1 ------------------------------------------------------------
NM_002649.3 TGATTCTAATTCACAGCCAGAGTTGGAAGCCCTGCGTGGCCTTTGAAGGTCTAGATGATT
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 TGATTCTAATTCACAGCCAGAGTTGGAAGCCCTGCGTGGCCTTTGAAGGTCTAGATGATT
BX648341.1 ------------------------------------------------------------
NM_001282427.2 TGATTCTAATTCACAGCCAGAGTTGGAAGCCCTGCGTGGCCTTTGAAGGTCTAGATGATT

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 CTTCTTCCTTGCCCTTTGAGCTTTTCCCCATCTCACAGGTATCTAGAAAAAAACTCCTCT
XM_003811196.4 CTTCTTCCTTGCCCTTTGAGCTTTTCCCCATCTCACAGGTATCTAGAAAAAAACTCCTCT
X83368.1 ------------------------------------------------------------
NM_002649.3 CTTCTTCCTTGCCCTTTGAGCTTTTCCCCATCTCACAGGTATCTAGAAAAAAACTCCTCT
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 CTTCTTCCTTGCCCTTTGAGCTTTTCCCCGTCTCACAGGTATCTAGAAAAAAACTCCTCT
BX648341.1 ------------------------------------------------------------
NM_001282427.2 CTTCTTCCTTGCCCTTTGAGCTTTTCCCCATCTCACAGGTATCTAGAAAAAAACTCCTCT

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 TCTTTGGCAACCTGTCCTTTTAAAATCACACTCTACCCACCTGTACAGAGGACCATGCCC
XM_003811196.4 TCTTTGGCAACCTGTCCTTTTAAAATCACACTCTACCCACCTGTACAGAGGACCATGCCC
X83368.1 ------------------------------------------------------------
NM_002649.3 TCTTTGGCAACCTGTCCTTTTAAAATCACACTCTACCCACCTGTACAGAAGACCATGCCC
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 TCTTTGGCAACCTGTCCTTTTAAAATCACACTCTACCTGCCTGTACAGAGGACCATGCCC
BX648341.1 ------------------------------------------------------------
NM_001282427.2 TCTTTGGCAACCTGTCCTTTTAAAATCACACTCTACCCACCTGTACAGAAGACCATGCCC

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 TATAATGAAATGTTTATTCCTATCTATAAATGGAGGATAAACATTTTGTGGCACTTCTGG
XM_003811196.4 TATAATGAAATGTTTATTCCTATCTATAAATGGAGGATAAACATTTTGTGGCACTTCTGG
X83368.1 ------------------------------------------------------------
NM_002649.3 TATAATGAAATGTTTATTCCTATCTATAAATGGAGGATAAACATTTTGTGGCACTTCTGG
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 TATAATGAAATGTTTATTCCTCTCTATAAATGTAGGATAAACATTTTGTGGCACTTCTGG
BX648341.1 ------------------------------------------------------------
NM_001282427.2 TATAATGAAATGTTTATTCCTATCTATAAATGGAGGATAAACATTTTGTGGCACTTCTGG

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 ACCAACTATTCCCTACTATTCTTTTGAAGAAAGGCAGGAAGAGTACTTTCTAATTCAGAA
XM_003811196.4 ACCAACTATTCCCTACTATTCTTTTGAAGAAAGGCAGGAAGAGTACTTTCTAATTCAGAA
X83368.1 ------------------------------------------------------------
NM_002649.3 ACCAACTATTCCCTACTATTCTTTTGAAGAAAGGCAGGAAGAGTACTTTCTAATTCAGAA
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 ACCAACTATTCCCTACTATTCTTTTGAAGAAAGGCAGGAAGAGTACTTTCTAATTCAGAA
BX648341.1 ------------------------------------------------------------
NM_001282427.2 ACCAACTATTCCCTACTATTCTTTTGAAGAAAGGCAGGAAGAGTACTTTCTAATTCAGAA

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 GAGGATGTTTTCACTATTCTGATAAACAATAGCCAAGTTCAGACCTTGTACAG---ATTC
XM_003811196.4 GAGGATGTTTTCACTATTCTGATAAACAATAGCCAAGTTCAGACCTTGTACAG---ATTC
X83368.1 ------------------------------------------------------------
NM_002649.3 GAGGATGTTTTCACTATTCTGATAAACAATAGCCAAGTTCAGACCTTGTACAGATTCT--
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 GAGGATGTTTTCACTATTCTGATAAACAATAGCCAAGTTCAGACCTTGTACAGATTCTTC
BX648341.1 ------------------------------------------------------------
NM_001282427.2 GAGGATGTTTTCACTATTCTGATAAACAATAGCCAAGTTCAGACCTTGTACAG---ATTC

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 TTTTTATTTGAATTGCTGAAATAATTTCTTGATGATGAAAAAAAGATTAGAGAGGAATAC
XM_003811196.4 TTTTTATTTGAATTGCTGAAATAATTTCTTGATGATGAAAAAAAGATTAGAGAGGAATAC
X83368.1 ------------------------------------------------------------
NM_002649.3 -TTTTATTTGAATTGCTGAAATAATTTATTGATGATGAAAAAAAGATTAGAGAGGAATAC
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 TTTTTATTTAAATTGCTGAAATAATTTCTTGATGATGAAAAAAAGATTAGAGAGGAATAT
BX648341.1 ------------------------------------------------------------
NM_001282427.2 TTTTTATTTGAATTGCTGAAATAATTTATTGATGATGAAAAAAAGATTAGAGAGGAATAC

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 ATTTATATTTAGCTTATTGGCACATGTGCATACATATTTCCTCTTCAAATGAACCAGTTC
XM_003811196.4 ATTTATATTTAGCTTATTGGCACATGTGCATACATATTTCCTCTTCAAATGAACCAGTTC
X83368.1 ------------------------------------------------------------
NM_002649.3 ATTTATATTTAGCTTATTGGCACATGTGCATACATATTTCCTCTTCAAATGAACCAGTTC
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 ATTTATATTTAGCTTATTGGCACATGTGCATACATATTTCCTCTTCAAATGAACCAGTTC
BX648341.1 ------------------------------------------------------------
NM_001282427.2 ATTTATATTTAGCTTATTGGCACATGTGCATACATATTTCCTCTTCAAATGAACCAGTTC

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 TTTCATTTCATTATGCTAATATATATAT--------------------------------
XM_003811196.4 TTTCATTTCATTATGCTAATATATATAT--------------------------------
X83368.1 ------------------------------------------------------------
NM_002649.3 TTTCATTTCATTATGCTAATATATATATATAACATAT---------------------AT
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 TTTCATTTCATTAGGCTAATATATATATCACATATATGTGCTAATATATATCACATATAT
BX648341.1 ------------------------------------------------------------
NM_001282427.2 TTTCATTTCATTATGCTAATATATATATATAACATAT---------------------AT

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 --------------ATCACATATATATGCTAATATATATATATCACACATATATATCACA
XM_003811196.4 --------------ATAACATATATATGCTAATATATATATATCACACATATATATCACA
X83368.1 ------------------------------------------------------------
NM_002649.3 ATGCTAATATATATATAACATATATATGCTAATATATATATATCACACATATATATCACA
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 ATGCTAATATATATATCACATATATATGCTAATATATATGTGCGATATATATATATCACA
BX648341.1 ------------------------------------------------------------
NM_001282427.2 ATGCTAATATATATATAACATATATATGCTAATATATATATATCACACATATATATCACA

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 GTTTTATATATATATATG---TGTGTGTGTGTGTGTATATATATATATATCACAGTTGGG
XM_003811196.4 GTTTTATATATATATATG---TGTGTGTGTGTGTATATATATATATATATCACAGTTGGG
X83368.1 ------------------------------------------------------------
NM_002649.3 GTTTTATATATATATATGTATGT-GTGTGTGTATATATATATATATATATCACAGTTGGG
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 TATATATGCTAATATATATATGTGATATATATATATATATATATATATATCACAGTTGGG
BX648341.1 ------------------------------------------------------------
NM_001282427.2 GTTTTATATATATATATGTATGTGTGTG-TGTATATATATATATATATATCACAGTTGGG

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 CTTGATTCTTCCGTATTCCAAAGAGCATAATTTCAGTTCTATAGACTTATAGATAAATAA
XM_003811196.4 CTTGATTCTTCCGTATTCCAAAGAGCATAATTTCAGTTCTATAGACTTATAGATAAATAA
X83368.1 ------------------------------------------------------------
NM_002649.3 CTTGATTCTTCCGTATTCCAAAGAGCATAATTTCAGTTCTATAGACTTATAGATAAATAA
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 CTTGATTCTTCCGTATTCCAAAGAGCATAATTTCAGTTCTATAGACTTACACATAAATAA
BX648341.1 ------------------------------------------------------------
NM_001282427.2 CTTGATTCTTCCGTATTCCAAAGAGCATAATTTCAGTTCTATAGACTTATAGATAAATAA

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 AAATCATCTTTGTGGGCTTCCTTCCTTTTCTGTTCGCAGTGAATTACATGACCAACAACT
XM_003811196.4 AAATCATCTTTGTGGGCTTCCTTCCTTTACTGTTTGCAGTGAATTACATGACCAACAACT
X83368.1 ------------------------------------------------------------
NM_002649.3 AAATCATCTTTGTGGGCTTCCTTCCTTTACTGTTCGCAGTGAATTACATGACGAACAACT
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 AAATCATCTTTGTGGGCTTCCTTCCTTTACTGTTCCCAGTGAATTACATGACCCACAACT
BX648341.1 ------------------------------------------------------------
NM_001282427.2 AAATCATCTTTGTGGGCTTCCTTCCTTTACTGTTCGCAGTGAATTACATGACGAACAACT

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 TCTATACCTTTGAAAATGTTCTAGAACTAGAATCATCCTGCTACTGGGAACTACCCACAG
XM_003811196.4 TCTATACCTTTGAAAATGTTCTAGAACTAGAATCATCCTGCTACTGGGAACTACCCACAG
X83368.1 ------------------------------------------------------------
NM_002649.3 TCTATACCTTTGAAAATGTTCTAGAACTAGAATCATCCTGCTACTGGGAACTACCCACAG
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 TCTATACCTTTGAAAATGTTCTAGAACTAGAATCATCCTGCTGCTGGGAACTACCCACAG
BX648341.1 ------------------------------------------------------------
NM_001282427.2 TCTATACCTTTGAAAATGTTCTAGAACTAGAATCATCCTGCTACTGGGAACTACCCACAG

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 CTCTATCTTCAATGCCAGGTGAAAACACAGATCACAAGTCAGATGAATCAGGCCAAAGCA
XM_003811196.4 CTCTATCTTCAATGCCAGGTGAAAACACAGATCACAAGTCAGATGAATCAGGCCAAAGCA
X83368.1 ------------------------------------------------------------
NM_002649.3 CTCTATCTTCAATGCCAGGTGAAAACACAGATCACAAGTCAGATGAATCAGGCCAAAGCA
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 CTCTATCTTCAATGCCAGATGAAAACACAGATCACAAGTCAGATGAATCAGGCCAAAGCA
BX648341.1 ------------------------------------------------------------
NM_001282427.2 CTCTATCTTCAATGCCAGGTGAAAACACAGATCACAAGTCAGATGAATCAGGCCAAAGCA

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 ACTTTTATGTATATCTAGGACTGGCGACATAATTTGCACAGCCCAGTGAAAAGTGAAAAT
XM_003811196.4 ACTTTTATGTATATCTAGGACTGGCGACATAATTTGCACAGCCCAGTGAAAAGTGAAAAT
X83368.1 ------------------------------------------------------------
NM_002649.3 ACTTTTATGTATATCTAGGACTGGCGACATAATTTGCAGAGCCCAGTGAACAGTGAAAAT
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 ACTTTTATGTATATCTAGGACTGGCGACGTAATTTGCAGAGCCCAGTGAAAAATGAAAAT
BX648341.1 ------------------------------------------------------------
NM_001282427.2 ACTTTTATGTATATCTAGGACTGGCGACATAATTTGCAGAGCCCAGTGAACAGTGAAAAT

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 GCAGTCCCATCATTCAAAAATTATTAAGAAATTCGAGATGGCAACAGCAGTTGTCACACC
XM_003811196.4 GCAGTCCCATCATTCAAAAATTATTAAGAAATTCGAGATGGCAACAGCAGTTGTCACACC
X83368.1 ------------------------------------------------------------
NM_002649.3 GCAGTCCCATCATTCAAAAATTATTAAGAAATTCGAGATGGCAACAGCAGTTGTCACACC
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 GCAGGCCCATCATTCAAAAATTATTAAGAAATTCAAGATGGCAACAGCAGTTGTCACACC
BX648341.1 ------------------------------------------------------------
NM_001282427.2 GCAGTCCCATCATTCAAAAATTATTAAGAAATTCGAGATGGCAACAGCAGTTGTCACACC

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 AAGCACAGTGCCCTTCTGAGCATGAGGCTCTGTGTGACTGCATGGGTCCATGAGACCAGC
XM_003811196.4 AAGCACAGTGCCCTTCTGAGCATGAGGCTCTGTGTGACTGCATGGGTCCATGAGACCAGC
X83368.1 ------------------------------------------------------------
NM_002649.3 AAGCACAGTGCCCTTCTGAGCATGAGGCTCTGTGTGACTGCATGGGTCCATGAGACCAGC
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 AAGCACAGTGCCTTTCTGAGCATGAGGCTCTGTGTGACTGCATGGGTTCATGAGACCAGC
BX648341.1 ------------------------------------------------------------
NM_001282427.2 AAGCACAGTGCCCTTCTGAGCATGAGGCTCTGTGTGACTGCATGGGTCCATGAGACCAGC

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 ACTGTGTGGGTCTGAATGATGGATTTGGTGTTCAGTTTAGCACACACGGTCTACCACGTC
XM_003811196.4 ACTGTGTGGGTCTGAATGATGGATTTGGTGTTCAGTTTAGCACACACGGTCTACCACGTC
X83368.1 ------------------------------------------------------------
NM_002649.3 ACTGTGTGGGTCTGAATGATGGCTTTGGTGTTCAGTTTAGCACACGCGGTCTACCACGTC
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 ACTGTGTGTGTCTGAATGATGGCTTTGGTGTTCAGTTTAGCACACACAGTCTACCACGTC
BX648341.1 ------------------------------------------------------------
NM_001282427.2 ACTGTGTGGGTCTGAATGATGGCTTTGGTGTTCAGTTTAGCACACGCGGTCTACCACGTC

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 TGCATGAGTGGTAAATATAGTGCCTGGGTCAAAATGTTCCTTCTGTTCAAATCGAAATAC
XM_003811196.4 TGCATGAGTGGTAAATATAGTGCCTGGGTCAAAATGTTCCTTCTGTTCAAATCGAAATAC
X83368.1 ------------------------------------------------------------
NM_002649.3 TGCATGAGTGGTAAATGTAGTGCCTGGGTCAAAATGTTCCTTCTGTTCAAATCGAAATAC
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 TGCATGAGTGGTAAATATAGTGCCTGGGTCAAAATGTTGCTTCTGTTCAAATCGAAATAC
BX648341.1 ------------------------------------------------------------
NM_001282427.2 TGCATGAGTGGTAAATGTAGTGCCTGGGTCAAAATGTTCCTTCTGTTCAAATCGAAATAC

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 CTCATCTCTATGGCTCTATGGCTGTACATTAGGACCTAGAACAGTGGCCCATTGCTCTTA
XM_003811196.4 CTCATCTCTATGGCTCTATGGCTGTACATTAGGACCTAGAACAGTGGCCCATTGCTCTTA
X83368.1 ------------------------------------------------------------
NM_002649.3 CTCATCTCTATGGCTCTATGGCTGTACATTAGGACCTAGAACAGTGGCCCATTGCTCTTA
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 CTCATCTCTATG--------GCTGTACATTAGGACCTAGAACAGTGGCCCATTGCTCTTA
BX648341.1 ------------------------------------------------------------
NM_001282427.2 CTCATCTCTATGGCTCTATGGCTGTACATTAGGACCTAGAACAGTGGCCCATTGCTCTTA

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 GACTGGAACCATGTCCACTAAAATAAACCTAAGCAGATGTTGTAGACCTAGCCCCACAGG
XM_003811196.4 GACTGGAACCATGTCCACTAAAATAAACCTAAGCAGATGTTGTAGACCTAGCCCCACAGG
X83368.1 ------------------------------------------------------------
NM_002649.3 GACTGGAACCATGTCCACTAAAATAAACCTAAGCAGATGTTGTAGACCTAGCCCCACAGG
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 GACTGGAACCATGTCCACTAAAATAAACCTAAGCAGATGTTGTAGACCTAGCCCCAAAGG
BX648341.1 ------------------------------------------------------------
NM_001282427.2 GACTGGAACCATGTCCACTAAAATAAACCTAAGCAGATGTTGTAGACCTAGCCCCACAGG

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 ACTGCATTTAGCTGCTTCAGTGACACTTTGATGAAAGTATGGAGAAGTGGAGACATTACA
XM_003811196.4 ACTGCATTTAGCTGCTTCAGTGACACTTTGATGAAAGTATGGAGAAGTGGAGACATTATA
X83368.1 ------------------------------------------------------------
NM_002649.3 ACTGCATTTAGCTGCTTCAGTGACACTTTGATGAAAGTATGGAGAAGTGGAGACATTATA
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 ACTGCATTTGACTGCTTCAGTGATACTTTGATGAAAGTATGGAGAAGTGGAGACATTATA
BX648341.1 ------------------------------------------------------------
NM_001282427.2 ACTGCATTTAGCTGCTTCAGTGACACTTTGATGAAAGTATGGAGAAGTGGAGACATTATA

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 GATAAAATATATCAATTCCCAGAGAAAACTCTTGACTTAAAAACTTAACTGTAGTAAATA
XM_003811196.4 GATAAAATATATCAATTCCCAGAGAAAACTGTTGACTTAAAAACTTAACTGTAGTAAATA
X83368.1 ------------------------------------------------------------
NM_002649.3 GATAAAATATATCAATTCCCAGAGAAAACTCTTGACTTAAAAACTTAACTGTAGTAAATA
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 GATAAAATATATCACTTCCCAAAGAAAACTCTTGACTTAAAAACTTAACTATAGTAAATA
BX648341.1 ------------------------------------------------------------
NM_001282427.2 GATAAAATATATCAATTCCCAGAGAAAACTCTTGACTTAAAAACTTAACTGTAGTAAATA

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 TATCTTTTTCAGGTGATGAATTATTTTTTTAAAAAAGGTTACATATAGGAATTCTGCAGT
XM_003811196.4 TATCTTTTTCAGGTGATGAATTATTTTTTTAAAAAAGGTTACATATAGGAATTCTGCAGT
X83368.1 ------------------------------------------------------------
NM_002649.3 TATCTTTTTCAGGTGATGAATTATTTTTTTAAAAAAGGTTACATATAGGAATTCTGCAGT
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 TATCTTTTTCAGGTGATGAATTATTGTTTTAAAAAAGGTTACATATAGGAATTCTGCAGC
BX648341.1 ------------------------------------------------------------
NM_001282427.2 TATCTTTTTCAGGTGATGAATTATTTTTTTAAAAAAGGTTACATATAGGAATTCTGCAGT

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 ACAATTTGGAGGCTATTAGTGCTATATTAATGGAAATTAATTATTTTTTAAGTAAGTCCA
XM_003811196.4 ATAATTTGGAGGCTATTAGTGCTATATTAATGGAAATTAATTATTTTTTAAGTAAGTCCA
X83368.1 ------------------------------------------------------------
NM_002649.3 ATAATTTGGAGGCTATTAGTGCTATATTAATGGAAATTAATTATTTTTTAAGTAAGTCCA
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 ATAATTTGGAGGCTATTAGTGCTATATTAATGGAAATTAATTATTTTTTAATAAATAATA
BX648341.1 ------------------------------------------------------------
NM_001282427.2 ATAATTTGGAGGCTATTAGTGCTATATTAATGGAAATTAATTATTTTTTAAGTAAGTCCA

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 AAAAATAATCTAGAAAGTAAGTTTCCAGAGCAAATCTGACCTAGCATTTGGTATGCTAGG
XM_003811196.4 AAAAATAATCTAGAAAGTAAGTTTCCAGAGCAAATCTGACCTAGCATTTGGTATGCTAGG
X83368.1 ------------------------------------------------------------
NM_002649.3 AAAAATAATCTAGAAAGTAAGTTTCCAGAGCAAATCTGACCTAGCATTTGGTATGCTAGG
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 AAAAATAATCTAGAAAGTAAGTTTCCAGAGCAAATCTGACCTAGCATTTGGTATGCTAGG
BX648341.1 ------------------------------------------------------------
NM_001282427.2 AAAAATAATCTAGAAAGTAAGTTTCCAGAGCAAATCTGACCTAGCATTTGGTATGCTAGG

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 CTCTGCTTTTCATGATTTTGAAATAAATCATAATTAGACTTAACAATATGGAGAAAATAA
XM_003811196.4 CTCTGCTTTTCATGATTTTGAAATAAATCATAATTAGACTTAACAATATGGAGAAAATAA
X83368.1 ------------------------------------------------------------
NM_002649.3 CTCTGCTTTTCATGATTTTGAAATAAATCATAATTAGACTTAACAATATGGAGAAAATAA
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 CTCTGCTTTTCATGATTTTGAAATAAATCATAATTAGACTTAACAATATGAAGAAAATAA
BX648341.1 ------------------------------------------------------------
NM_001282427.2 CTCTGCTTTTCATGATTTTGAAATAAATCATAATTAGACTTAACAATATGGAGAAAATAA

XM_002818334.3 ------------------------------------------------------------
XM_012512141.2 ------------------------------------------------------------
XM_001162884.6 ACTTGTATTTTTAAGTGTTCTGTTGGCTGATTTTCTGTTTCATCCAGCTCAATAATTCTG
XM_003811196.4 ACTTGTATTTTTAAGTGTT-----------------------------------------
X83368.1 ------------------------------------------------------------
NM_002649.3 ACTTGTATTTTTAAGTGTTCTGTTGGCTTATTTTCTGTTTCATCCAACTCAATAATTCTG
XM_004046020.3 ------------------------------------------------------------
XM_032761353.1 ACTTGTATTTTTAAGTGTT-----------------------------------------
BX648341.1 ------------------------------------------------------------
NM_001282427.2 ACTTGTATTTTTAAGTGTTCTGTTGGCTTATTTTCTGTTTCATCCAACTCAATAATTCTG

XM_002818334.3 ---------------------------------------
XM_012512141.2 ---------------------------------------
XM_001162884.6 ATAAATAAATTTGGTTCTAGTTTGGTGCTTA--------
XM_003811196.4 ---------------------------------------
X83368.1 ---------------------------------------
NM_002649.3 ATAAATAAATTTGGTTCTAGTTTGGTGCTTGAAAAAAAA
XM_004046020.3 ---------------------------------------
XM_032761353.1 ---------------------------------------
BX648341.1 ---------------------------------------
NM_001282427.2 ATAAATAAATTTGGTTCTAGTTTGGTGCTTGTC------
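
As a rough illustration of how two aligned rows can be compared, the sketch below counts identical columns between two rows of the final alignment block. The function name `pairwise_identity` is hypothetical, and skipping columns that contain a gap is an assumption; Clustal Omega reports its own percent identity matrix, which may use a different convention.

```python
def pairwise_identity(row_a: str, row_b: str) -> tuple[int, int]:
    """Return (identical columns, compared columns) for two aligned rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(row_a, row_b):
        if a == "-" or b == "-":
            continue  # assumption: columns with a gap in either row are skipped
        compared += 1
        if a == b:
            matches += 1
    return matches, compared


# Rows copied from the last alignment block (XM_001162884.6 and NM_002649.3)
xm_row = "ATAAATAAATTTGGTTCTAGTTTGGTGCTTA--------"
nm_row = "ATAAATAAATTTGGTTCTAGTTTGGTGCTTGAAAAAAAA"

matches, compared = pairwise_identity(xm_row, nm_row)
print(f"{matches}/{compared} identical columns")
```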

Phylogram

NCBI Blast using PDB Sequence
RID 9NR5PZ3201R
Query ID lcl|Query_827435
Description 3ML9_1|Chain A|Phosphatidylinositol-4,5-bisphosphate 3-kinase
catalytic subunit gamma isoform|Homo sapiens (9606)
Molecule type amino acid
Query Length 966
Database Name UniProtKB/Swiss-Prot
Program BLASTP

Clustal Omega
CLUSTAL O(1.2.4) multiple sequence alignment

Q9JHG7.2 MELENYEQPVVLREDNLRRRRRMKPRSA--------AGSLSSMELIPIEFVLPTSQRISK
52
P48736.3 MELENYKQPVVLREDNCRRRRRMKPRSA--------AASLSSMELIPIEFVLPTSQRKCK
52
O02697.2 MELENYEQPVVLREDNRRRRRRMKPRST--------AASLSSMELIPIEFVLPTSQRNTK
52
P32871.1 ----------------------MPPR-PSSGELWGIHL---MPPRILVECLLPNGMIVTL
34
P42336.2 ----------------------MPPR-PSSGELWGIHL---MPPRILVECLLPNGMIVTL
34
A0A0G2K344.1 ----------------------MPPR-PSSGELWGIHL---MPPRILVECLLPNGMIVTL
34
P42337.2 ----------------------MPPR-PSSGELWGIHL---MPPRILVECLLPNGMIVTL
34
P42338.1 ----------------MCFSFIMPPAMADILDIWAVDSQIASDGSIPVDFLLPTGIYIQL
44
Q8BTI9.2 ----------------------MPPAMADNLDIWAVDSQIASDGAISVDFLLPTGIYIQL
38
Q9Z1L0.1 ----------------MCFRSIMPPAMADTLDIWAVDSQIASDGSISVDFLLPTGIYIQL
44
* * * :: :**..

Q9JHG7.2 TPETALLHVAGHGNVEQMKAQVWLRALETSVAAEFYHRLGPDQFLLLYQK-KGQWYEIYD
111
P48736.3 SPETALLHVAGHGNVEQMKAQVWLRALETSVAADFYHRLGPHHFLLLYQK-KGQWYEIYD
111
O02697.2 TPETALLHVAGHGNVEQMKAQVWLRALETSVSADFYHRLGPDHFLLLYQK-KGQWYEIYD
111
P32871.1 -------ECLREATLITIKHELFKEARKYPLHQ---LLQDESSYIFVSVTQEAEREEFFD
84
P42336.2 -------ECLREATLITIKHELFKEARKYPLHQ---LLQDESSYIFVSVTQEAEREEFFD
84
A0A0G2K344.1 -------ECLREATLVTIKHELFKEARKYPLHQ---LLQDESSYIFVSVTQEAEREEFFD
84
P42337.2 -------ECLREATLVTIKHELFREARKYPLHQ---LLQDETSYIFVSVTQEAEREEFFD
84
P42338.1 -------EVPREATISYIKQMLWKQVHNYPMFN---LLMDIDSYMFACVNQTAVYEELED
94
Q8BTI9.2 -------EVPREATISYIKQMLWKQVHNYPMFN---LLMDIDSYMFACVNQTAVYEELED
88
Q9Z1L0.1 -------EVPREATISYIKQMLWKQVHNYPMFN---LLMDIDSYMFACVNQTAVYEELED
94
. ...: :* :: .. : : . ::: . . *: *

Q9JHG7.2 RYQVVQTLD-CLHYWKLMHKSPGQIHVVQRHVPSEETLAFQKQLTSLIGYDVTDISNVHD
170
P48736.3 KYQVVQTLD-CLRYWKATHRSPGQIHLVQRHPPSEESQAFQRQLTALIGYDVTDVSNVHD
170
O02697.2 KYQVVQTLD-CLRYWKVLHRSPGQIHVVQRHAPSEETLAFQRQLNALIGYDVTDVSNVHD
170
P32871.1 ETRRLCDLRLFQPFL----------KVIEPVG-NREEKILNREIGFAIGMPVCEFDMVKD
133
P42336.2 ETRRLCDLRLFQPFL----------KVIEPVG-NREEKILNREIGFAIGMPVCEFDMVKD
133
A0A0G2K344.1 ETRRLCDLRLFQPFL----------KVIEPVG-NREEKILNREIGFVIGMPVCEFDMVKD
133
P42337.2 ETRRLCDLRLFQPFL----------KVIEPVG-NREEKILNREIGFVIGMPVCEFDMVKD
133
P42338.1 ETRRLCDVRPFLPVL----------KLVTRSC-DPGE-KLDSKIGVLIGKGLHEFDSLKD
142
Q8BTI9.2 ETRRLCDVRPFLPVL----------KLVTRSC-DPAE-KLDSKIGVLIGKGLHEFDALKD
136
Q9Z1L0.1 ETRRLCDVRPFLPVL----------KLVTRSC-DPAE-KLDSKIGVLIGKGLHEFDALKD
142
. : : : ::: . :: :: ** : :.. ::*

Q9JHG7.2 DELEFTRRRLVTPRMAEVAGRDA------KLYAMHPWV-TSKPLPDYLSKKIANNCIFIV
223
P48736.3 DELEFTRRGLVTPRMAEVASRDP------KLYAMHPWV-TSKPLPEYLWKKIANNCIFIV
223
O02697.2 DELEFTRRRLVTPRMAEVAGRDP------KLYAMHPWV-TSKPLPEYLLKKITNNCVFIV
223
P32871.1 PEVQDFRRNILNVCKEAVDLRDLNSPHSRAMYVYPPNVESSPELPKHIYNKLDKGQIIVV
193
P42336.2 PEVQDFRRNILNVCKEAVDLRDLNSPHSRAMYVYPPNVESSPELPKHIYNKLDKGQIIVV
193
A0A0G2K344.1 PEVQDFRRNILNVCKEAVDLRDLNSPHSRAMYVYPPNVESSPELPKHIYNKLDKGQIIVV
193
P42337.2 PEVQDFRRNILNVCKEAVDLRDLNSPHSRAMYVYPPNVESSPELPKHIYNKLDKGQIIVV
193
P42338.1 PEVNEFRRKMRKFSEEKILSLVGLSWMDWLKQTYPPE--HEPSIPENLEDKLYGGKLIVA
200
Q8BTI9.2 PEVNEFRRKMRKFSEAKIQSLVGLSWIDWLKHTYPPE--HEPSVLENLEDKLYGGKLVVA
194
Q9Z1L0.1 PEVNEFRRKMRKFSEDKIQSLVGLSWIDWLKHTYPPE--HEPSVLENLEDKLYGGKLVVA
200
*:: ** : . : . * . : . : .*: . :.:.

Q9JHG7.2 IHRGT------TSQTIKVSADDTPGTILQSFFTKMA----KKKSLMNISESQSEQDFVLR
273
P48736.3 IHRST------TSQTIKVSPDDTPGAILQSFFTKMA----KKKSLMDIPESQSEQDFVLR
273
O02697.2 IHRST------TSQTIKVSADDTPGTILQSFFTKMA----KKKSLMDIPESQNERDFVLR
273
P32871.1 IWVIVSPNNDKQKYTLKINHDCVPEQVIAEAIRKKTRSMLLSSEQLKLCVLEYQGKYILK
253
P42336.2 IWVIVSPNNDKQKYTLKINHDCVPEQVIAEAIRKKTRSMLLSSEQLKLCVLEYQGKYILK
253
A0A0G2K344.1 IWVIVSPNNDKQKYTLKINHDCVPEQVIAEAIRKKTRSMLLSSEQLKLCVLEYQGKYILK
253
P42337.2 IWVIVSPNNDKQKYTLKINHDCVPEQVIAEAIRKKTRSMLLSSEQLKLCVLEYQGKYILK
253
P42338.1 VHFE----NCQDVFSFQVSPNMNPIKVNELAIQKRLT---IHGKEDE----VSPYDYVLQ
249
Q8BTI9.2 VHFE----NSQDVFSFQVSPNLNPIKINELAIQKRLT---IRGKEDE----ASPCDYVLQ
243
Q9Z1L0.1 VHFE----NSQDVFSFQVSPNLNPIKINELAIQKRLT---IRGKEEE----ASPCDYVLQ
249
: ::::. : * : : * . . .::*:

Q9JHG7.2 VCGRDEYLVGETPLKNFQWVRQCLKNGDEIHLVLDTPPDPALDEVRKEEWPLVDDCTGVT
333
P48736.3 VCGRDEYLVGETPIKNFQWVRHCLKNGEEIHVVLDTPPDPALDEVRKEEWPLVDDCTGVT
333
O02697.2 VCGRDEYLVGETPIKNFQWVRQCLKNGEEIHLVLDTPPDPALDEVRKEEWPLVDDCTGVT
333
P32871.1 VCGCDEYFLEKYPLSQYKYIRSCIMLGRMPNLMLMAKES--------LYSQLPMDCFTMP
305
P42336.2 VCGCDEYFLEKYPLSQYKYIRSCIMLGRMPNLMLMAKES--------LYSQLPMDCFTMP
305
A0A0G2K344.1 VCGCDEYFLEKYPLSQYKYIRSCIMLGRMPNLMLMAKES--------LYSQLPIDSFTMP
305
P42337.2 VCGCDEYFLEKYPLSQYKYIRSCIMLGRMPNLMLMAKES--------LYSQLPIDSFTMP
305
P42338.1 VSGRVEYVFGDHPLIQFQYIRNCVMNRALPHFILVECCK--------IKKMYEQEMIAIE
301
Q8BTI9.2 VSGRVEYVFGDHPLIQFQYIRNCVMNRTLPHFILVECCK--------IKKMYEQEMIAIE
295
Q9Z1L0.1 VSGRVEYVFGDHPLIQFQYIRNCVMNRTLPHFILVECCK--------IKKMYEQEMIAIE
301
*.* **.. . *: :::::* *: :.:* . : :

Q9JHG7.2 GYHEQ----LTIHGKDHESVFTVSLWDCDRKFRVKIRGIDIPVLPRNTDLTVFVEANIQH
389
P48736.3 GYHEQ----LTIHGKDHESVFTVSLWDCDRKFRVKIRGIDIPVLPRNTDLTVFVEANIQH
389
O02697.2 GYHEQ----LTIHGKDHESVFTVSLWDCDRKFRVKIRGIDIPVLPRTADLTVFVEANIQY
389
P32871.1 SYSRRISTATPYM---NGETSTKSLWVINSALRIKILCATYVNVNIRDIDKIYVRTGIYH
362
P42336.2 SYSRRISTATPYM---NGETSTKSLWVINSALRIKILCATYVNVNIRDIDKIYVRTGIYH
362
A0A0G2K344.1 SYSRRISTATPYM---NGETATKSLWVINSALRIKILCATYVNVNIRDIDKIYVRTGIYH
362
P42337.2 SYSRRISTATPYM---NGETSTKSLWVINSALRIKILCATYVNVNIRDIDKIYVRTGIYH
362
P42338.1 AAINRNSSNLPLPLPPKKTRIISHVWENNNPFQIVLVKG--NKLNTEETVKVHVRAGLFH
359
Q8BTI9.2 AAINRNSSNLPLPLPPKKTRVISHIWDNNNPFQITLVKG--NKLNTEETVKVHVRAGLFH
353
Q9Z1L0.1 AAINRNSSSLPLPLPPKKTRVISHVWGNNNPFQIVLVKG--NKLNTEETVKVHVRAGLFH
359
. .: : :* : ::: : : .:.*.:.: :

Q9JHG7.2 GQQVLCQRRTSPK-PFAEEVLWNVWLEFGIKIKDLPKGALLNLQIYCCKTPSLSSKASAE
448
P48736.3 GQQVLCQRRTSPK-PFTEEVLWNVWLEFSIKIKDLPKGALLNLQIYCGKAPALSSKASAE
448
O02697.2 GQQVLCQRRTSPK-PFTEEVLWNVWLEFSIKIKDLPKGALLNLQIYCGKAPALSGKTSAE
448
P32871.1 GGEPLCDNVNTQRVPC-SNPRWNEWLNYDIYIPDLPRAARLCLSICSVKGR---------
412
P42336.2 GGEPLCDNVNTQRVPC-SNPRWNEWLNYDIYIPDLPRAARLCLSICSVKGR---------
412
A0A0G2K344.1 GGEPLCDNVNTQRVPC-SNPRWNEWLNYDIYIPDLPRAARLCLSICSVKGR---------
412
P42337.2 GGEPLCDNVNTQRVPC-SNPRWNEWLNYDIYIPDLPRAARLCLSICSVKGR---------
412
P42338.1 GTELLCKTIVSSEVSGKNDHIWNEPLEFDINICDLPRMARLCFAVYAVLDKVKTKKSTKT
419
Q8BTI9.2 GTELLCKTVVSSEISGKNDHIWNEQLEFDINICDLPRMARLCFAVYAVLDKVKTKKSTKT
413
Q9Z1L0.1 GTELLCKTVVSSEISGKNDHIWNEQLEFDINICDLPRMARLCFAVYAVLDKVKTKKSTKT
419
* : **. : . .: ** *::.* * ***: * * : : .

Q9JHG7.2 TPG------SESKGKAQLLYYVNLLLIDHRFLLRHGDYVLHMWQISGKAEEQGSFNADKL
502
P48736.3 SPS------SESKGKVQLLYYVNLLLIDHRFLLRRGEYVLHMWQISGKGEDQGSFNADKL
502
O02697.2 MPS------PESKGKAQLLYYVNLLLIDHRFLLRHGEYVLHMWQLSGKGEDQGSFNADKL
502
P32871.1 ---------KGAKEEHCPLAWGNINLFDYTDTLVSGKMALNLWP-VPHGLED--LLNPIG
460
P42336.2 ---------KGAKEEHCPLAWGNINLFDYTDTLVSGKMALNLWP-VPHGLED--LLNPIG
460
A0A0G2K344.1 ---------KGAKEEHCPLAWGNINLFDYTDTLVSGKMALNLWP-VPHGLED--LLNPIG
460
P42337.2 ---------KGAKEEHCPLAWGNINLFDYTDTLVSGKMALNLWP-VPHGLED--LLNPIG
460
P42338.1 INPSKYQTIRKAGKVHYPVAWVNTMVFDFKGQLRTGDIILHSWSSFPDELEE--MLNPMG
477
Q8BTI9.2 INPSKYQTIRKAGKVHYPVAWVNTMVFDFKGQLRSGDVILHSWSSFPDELEE--MLNPMG
471
Q9Z1L0.1 INPSKYQTIRKAGKVHYPVAWVNTMVFDFKGQLRSGDVILHSWSSFPDELEE--MLNPMG
477
: : : * ::*. * *. *: * . :: :

Q9JHG7.2 TSATNPDKENSMSISILLDNYC-HPIALPKHRPT---------PDPEG-------DRV--
543
P48736.3 TSATNPDKENSMSISILLDNYC-HPIALPKHQPT---------PDPEG-------DRV--
543
O02697.2 TSRTNPDKENSMSISILLDNYC-HPIALPKHRPT---------PDPEG-------DRV--
543
P32871.1 VTGSNPNKETPC-LELEFDWFS-SVVKFPDMSV-IEEHANWSVSREAGFSYSHAGLSNRL
517
P42336.2 VTGSNPNKETPC-LELEFDWFS-SVVKFPDMSV-IEEHANWSVSREAGFSYSHAGLSNRL
517
A0A0G2K344.1 VTGSNPNKETPC-LELEFDWFS-SVVKFPDMSV-IEEHANWSVSREAGFSYSHTGLSNRL
517
P42337.2 VTGSNPNKETPC-LELEFDWFS-SVVKFPDMSV-IEEHANWSVSREAGFSYSHTGLSNRL
517
P42338.1 TVQTNPYTENATALHVKFPENKKQPYYYPPFDKIIEKAAEIASSDSAN-------VSS--
528
Q8BTI9.2 TVQTNPYAENATALHITFPENKKQPCYYPPFDKIIEKAAELASGDSAN-------VSS--
522
Q9Z1L0.1 TVQTNPYAENATALHIKFPENKKQPYYYPPFDKIIEKAAEIASGDSAN-------VSS--
528
. :** *. : : : * .

Q9JHG7.2 --RAEMPNQLRKQLEAIIATDPLNPLTAEDKELLWHFRYESLK-HPKAYPKLFSSVKWGQ
600
P48736.3 --RAEMPNQLRKQLEAIIATDPLNPLTAEDKELLWHFRYESLK-HPKAYPKLFSSVKWGQ
600
O02697.2 --RAEMPNQLRKQLEAIIATDPLNPLTAEDKELLWHFRYESLK-DPKAYPKLFSSVKWGQ
600
P32871.1 ARDNELRENDKEQLRAICTRDPLSEITEQEKDFLWSHRHYCVT-IPEILPKLLLSVKWNS
576
P42336.2 ARDNELRENDKEQLKAISTRDPLSEITEQEKDFLWSHRHYCVT-IPEILPKLLLSVKWNS
576
A0A0G2K344.1 ARDNELRENDKEQLRALCTRDPLSEITEQEKDFLWSHRHYCVT-IPEILPKLLLSVKWNS
576
P42337.2 ARDNELRENDKEQLRALCTRDPLSEITEQEKDFLWSHRHYCVT-IPEILPKLLLSVKWNS
576
P42338.1 ----RGGKKFLPVLKEILDRDPLSQLCENEMDLIWTLRQDCREIFPQSLPKLLLSIKWNK
584
Q8BTI9.2 ----RGGKKFLAVLKEILDRDPLSQLCENEMDLIWTLRQDCRENFPQSLPKLLLSIKWNK
578
Q9Z1L0.1 ----RGGKKFLAVLKEILDRDPLSQLCENEMDLIWTLRQDCRENFPQSLPKLLLSIKWNK
584
. :: *. : ***. : :: :::* * . *: ***: *:**..

Q9JHG7.2 QEIVAKTYQLLARREIWDQSALDVGLTMQLLDCNFSDENVRAIAVQKLES-LEDDDVLHY
659
P48736.3 QEIVAKTYQLLARREVWDQSALDVGLTMQLLDCNFSDENVRAIAVQKLES-LEDDDVLHY
659
O02697.2 QEIVAKTYQLLAKREVWDQSALDVGLTMQLLDCNFSDENVRAIAVQKLES-LEDDDVLHY
659
P32871.1 RDEVAQMYCLV---KDWPP--IKPEQAMELLDCNYPDPMVRGFAVRCLEKYLTDDKLSQY
631
P42336.2 RDEVAQMYCLV---KDWPP--IKPEQAMELLDCNYPDPMVRGFAVRCLEKYLTDDKLSQY
631
A0A0G2K344.1 RDEVAQMYCLV---KDWPP--IKPEQAMELLDCNYPDPMVRSFAVRCLEKYLTDDKLSQY
631
P42337.2 RDEVAQMYCLV---KDWPP--IKPEQAMELLDCNYPDPMVRSFAVRCLEKYLTDDKLSQY
631
P42338.1 LEDVAQLQALL---QIWPK--LPPREALELLDFNYPDQYVREYAVGCLRQ-MSDEELSQY
638
Q8BTI9.2 LEDVAQLQALL---QIWPK--LPPREALELLDFNYPDQYVREYAVGCLRQ-MSDEELSQY
632
Q9Z1L0.1 LEDVAQLQALL---QIWPK--LPPREALELLDFNYPDQYVREYAVGCLRQ-MSDEELSQY
638
: **: *: : * : :::*** *: * ** ** *.. : *:.: :*

Q9JHG7.2 LLQLVQAVKFEPYHDSALARFLLKRGLRNKRIGHFLFWFLRSEIAQSRHYQQRFAVILEA
719
P48736.3 LLQLVQAVKFEPYHDSALARFLLKRGLRNKRIGHFLFWFLRSEIAQSRHYQQRFAVILEA
719
O02697.2 LLQLVQAVKFEPYHDSALARFLLKRGLRNKRIGHFLFWFLRSEIAQSRHYQQRFAVILEA
719
P32871.1 LIQLVQVLKYEQYLDNLLVRFLLKKALTNQRIGHFFFWHLKSEMHNK-TVSQRFGLLLES
690
P42336.2 LIQLVQVLKYEQYLDNLLVRFLLKKALTNQRIGHFFFWHLKSEMHNK-TVSQRFGLLLES
690
A0A0G2K344.1 LIQLVQVLKYEQYLDNLLVRFLLKKALTNQRIGHFFFWHLKSEMHNK-TVSQRFGLLLES
690
P42337.2 LIQLVQVLKYEQYLDNLLVRFLLKKALTNQRIGHFFFWHLKSEMHNK-TVSQRFGLLLES
690
P42338.1 LLQLVQVLKYEPFLDCALSRFLLERALGNRRIGQFLFWHLRSEVHIP-AVSVQFGVILEA
697
Q8BTI9.2 LLQLVQVLKYEPFLDCALSRFLLERALDNRRIGQFLFWHLRSEVHTP-AVSVQFGVILEA
691
Q9Z1L0.1 LLQLVQVLKYEPFLDCALSRFLLERALDNRRIGQFLFWHLRSEVHTP-AVSIQFGVILEA
697
*:****.:*:* : * * ****::.* *:***:*:**.*:**: . :*.::**:

Q9JHG7.2 YLRGCGTAMLQDFTQQVHVIEMLQKVTIDIKSLSAEKYDVSSQVISQLKQKLESLQNSNL
779
P48736.3 YLRGCGTAMLHDFTQQVQVIEMLQKVTLDIKSLSAEKYDVSSQVISQLKQKLENLQNSQL
779
O02697.2 YLRGCGTAMLHDFTQQVQVIDMLQKVTIDIKSLSAEKYDVSSQVISQLKQKLENLQNLNL
779
P32871.1 YCRACGMYLKH-LNRQVEAMEKLINLTDILKQE-KKDETQKVQ-MKFLVEQMRRPDFMDA
747
P42336.2 YCRACGMYLKH-LNRQVEAMEKLINLTDILKQE-KKDETQKVQ-MKFLVEQMRRPDFMDA
747
A0A0G2K344.1 YCRACGMYLKH-LNRQVEAMEKLINLTDILKQE-KKDETQKVQ-MKFLVEQMRQPDFMDA
747
P42337.2 YCRACGMYLKH-LNRQVEAMEKLINLTDILKQE-KKDETQKVQ-MKFLVEQMRQPDFMDA
747
P42338.1 YCRGSVGHMKV-LSKQVEALNKLKTLNSLIKLN-AVKLNRAKG-KEAMHTCLKQSAYREA
754
Q8BTI9.2 YCRGSVGHMKV-LSKQVEALNKLKTLNSLIKLN-AVKLSRAKG-KEAMHTCLKQSAYREA
748
Q9Z1L0.1 YCRGSVGHMKV-LSKQVEALNKLKTLNSLIKLN-AMKLNRAKG-KEAMHTCLKQSAYREA
754
* *.. : :.:**..:: * .:. :* . . : :. :

Q9JHG7.2 PESFRVPYDPGLKAGTLVIEKCKVMASKKKPLWLEFKCADPT-VLSNETIGIIFKHGDDL
838
P48736.3 PESFRVPYDPGLKAGALAIEKCKVMASKKKPLWLEFKCADPT-ALSNETIGIIFKHGDDL
838
O02697.2 PQSFRVPYDPGLKAGALVIEKCKVMASKKKPLWLEFKCADPT-ALSNETIGIIFKHGDDL
838
P32871.1 LQGFLSPLNPAHQLGNLRLEECRIMSSAKRPLWLNWENPDIMSELLFQNNEIIFKNGDDL
807
P42336.2 LQGFLSPLNPAHQLGNLRLEECRIMSSAKRPLWLNWENPDIMSELLFQNNEIIFKNGDDL
807
A0A0G2K344.1 LQGFLSPLNPAHQLGNLRLEECRIMSSAKRPLWLNWENPDIMSELLFQNNEIIFKNGDDL
807
P42337.2 LQGFLSPLNPAHQLGNLRLEECRIMSSAKRPLWLNWENPDIMSELLFQNNEIIFKNGDDL
807
P42338.1 LSDLQSPLNPCVILSELYVEKCKYMDSKMKPLWLVYNNKVFGE----DSVGVIFKNGDDL
810
Q8BTI9.2 LSDLQSPLNPCVILSELYVEKCKYMDSKMKPLWLVYSSRAFGE----DSVGVIFKNGDDL
804
Q9Z1L0.1 LSDLQSPLNPCVILSELYVEKCRYMDSKMKPLWLVYSNRAFGE----DAVGVIFKNGDDL
810
..: * :* . * :*:*: * * :**** :. : :***:****

Q9JHG7.2 RQDMLILQILRIMESIWETESLDLCLLPYGCISTGDKIGMIEIVKDATTIAQIQQST--V
896
P48736.3 RQDMLILQILRIMESIWETESLDLCLLPYGCISTGDKIGMIEIVKDATTIAKIQQST--V
896
O02697.2 RQDMLILQILRIMESIWETESLDLCLLPYGCISTGDKIGMIEIVKDATTIAKIQQST--V
896
P32871.1 RQDMLTLQIIRIMENIWQNQGLDLRMLPYGCLSIGDCVGLIEVVRNSHTIMQIQCKG-GL
866
P42336.2 RQDMLTLQIIRIMENIWQNQGLDLRMLPYGCLSIGDCVGLIEVVRNSHTIMQIQCKG-GL
866
A0A0G2K344.1 RQDMLTLQIIRIMENIWQNQGLDLRMLPYGCLSIGDCVGLIEVVRNSHTIMQIQCKG-GL
866
P42337.2 RQDMLTLQIIRIMENIWQNQGLDLRMLPYGCLSIGDCVGLIEVVRNSHTIMQIQCKG-GL
866
P42338.1 RQDMLTLQMLRLMDLLWKEAGLDLRMLPYGCLATGDRSGLIEVVSTSETIADIQLNSSNV
870
Q8BTI9.2 RQDMLTLQMLRLMDLLWKEAGLDLRMLPYGCLATGDRSGLIEVVSTSETIADIQLNSSNV
864
Q9Z1L0.1 RQDMLTLQMLRLMDLLWKEAGLDLRMLPYGCLATGDRSGLIEVVSTSETIADIQLNSSNV
870
***** **::*:*: :*: .*** :*****:: ** *:**:* : ** .** . :

Q9JHG7.2 GNTGAFKDEVLNHWLKEKCPIEEKFQAAVERFVYSCAGYCVATFVLGIGDRHNDNIMISE
956
P48736.3 GNTGAFKDEVLNHWLKEKSPTEEKFQAAVERFVYSCAGYCVATFVLGIGDRHNDNIMITE
956
O02697.2 GNTGAFKDEVLSHWLKEKCPIEEKFQAAVERFVYSCAGYCVATFVLGIGDRHNDNIMISE
956
P32871.1 KGALQFNSHTLHQWLKDKNK-GEIYDAAIDLFTRSCAGYCVATFILGIGDRHNSNIMVKD
925
P42336.2 KGALQFNSHTLHQWLKDKNK-GEIYDAAIDLFTRSCAGYCVATFILGIGDRHNSNIMVKD
925
A0A0G2K344.1 KGALQFNSHTLHQWLKDKNK-GEIYDAAIDLFTRSCAGYCVATFILGIGDRHNSNIMVKD
925
P42337.2 KGALQFNSHTLHQWLKDKNK-GEIYDAAIDLFTRSCAGYCVATFILGIGDRHNSNIMVKD
925
P42338.1 AAAAAFNKDALLNWLKEYNS-GDDLDRAIEEFTLSCAGYCVASYVLGIGDRHSDNIMVKK
929
Q8BTI9.2 AATAAFNKDALLNWLKEYNS-GDDLDRAIEEFTLSCAGYCVASYVLGIGDRHSDNIMVKK
923
Q9Z1L0.1 AATAAFNKDALLNWLKEYNS-GDDLDRAIEEFTLSCAGYCVASYVLGIGDRHSDNIMVKK
929
: *:...* :***: : : *:: *. ********:::*******..***:..

Q9JHG7.2 TGNLFHIDFGHILGNYKSFLGINKERVPFVLTPDFLFVMGS--SGKKTSPHFQKFQDVCV
1014
P48736.3 TGNLFHIDFGHILGNYKSFLGINKERVPFVLTPDFLFVMGT--SGKKTSPHFQKFQDICV
1014
O02697.2 TGNLFHIDFGHILGNYKSFLGINKERVPFVLTPDFLFVMGT--SGKKTSLHFQKFQDVCV
1014
P32871.1 DGQLFHIDFGHFLDHKKKKFGYKRERVPFVLTQDFLIVISKGAQECTKTREFERFQEMCY
985
P42336.2 DGQLFHIDFGHFLDHKKKKFGYKRERVPFVLTQDFLIVISKGAQECTKTREFERFQEMCY
985
A0A0G2K344.1 DGQLFHIDFGHFLDHKKKKFGYKRERVPFVLTQDFLIVISKGAQEYTKTREFERFQEMCY
985
P42337.2 DGQLFHIDFGHFLDHKKKKFGYKRERVPFVLTQDFLIVISKGAQEYTKTREFERFQEMCY
985
P42338.1 TGQLFHIDFGHILGNFKSKFGIKRERVPFILTYDFIHVIQQGKTG--NTEKFGRFRQCCE
987
Q8BTI9.2 TGQLFHIDFGHILGNFKSKFGIKRERVPFILTYDFIHVIQQGKTG--NTEKFGRFRQCCE
981
Q9Z1L0.1 TGQLFHIDFGHILGNFKSKFGIKRERVPFILTYDFIHVIQQGKTG--NTEKFGRFRQCCE
987
*:********:*.: *. :* ::*****:** **: *: .: .* :*:: *

Q9JHG7.2 RAYLALRHHTNLLIILFSMMLMTGMPQLTSKEDIEYIRDALTVGKSEEDAKKYFLDQIEV
1074
P48736.3 KAYLALRHHTNLLIILFSMMLMTGMPQLTSKEDIEYIRDALTVGKNEEDAKKYFLDQIEV
1074
O02697.2 KAYLALRHHTNLLIILFSMMLMTGMPQLTSKEDIEYIRDALTVGKSEEDAKKYFLDQIEV
1074
P32871.1 KAYLAIRQHANLFINLFSMMLGSGMPELQSFDDIAYIRKTLALDKTEQEALEYFMKQMND
1045
P42336.2 KAYLAIRQHANLFINLFSMMLGSGMPELQSFDDIAYIRKTLALDKTEQEALEYFMKQMND
1045
A0A0G2K344.1 KAYLAIRQHANLFINLFSMMLGSGMPELQSFDDIAYIRKTLALDKTEQEALEYFTKQMND
1045
P42337.2 KAYLAIRQHANLFINLFSMMLGSGMPELQSFDDIAYIRKTLALDKTEQEALEYFTKQMND
1045
P42338.1 DAYLILRRHGNLFITLFALMLTAGLPELTSVKDIQYLKDSLALGKSEEEALKQFKQKFDE
1047
Q8BTI9.2 DAYLILRRHGNLFITLFALMLTAGLPELTSVKDIQYLKDSLALGKSEEEALKQFKQKFDE
1041
Q9Z1L0.1 DAYLILRRHGNLFITLFALMLTAGLPELTSVKDIQYLKDSLALGKSEEEALKQFKQKFDE
1047
*** :*:* **:* **::** :*:*:* * .** *::.:*::.*.*::* : * .:::

Q9JHG7.2 CRDKGWTVQFNWFLHLVLGIKQGEKHSA 1102
P48736.3 CRDKGWTVQFNWFLHLVLGIKQGEKHSA 1102
O02697.2 CRDKGWTVQFNWFLHLVLGIKQGEKHSA 1102
P32871.1 AHHGGWTTKMDWIFHTIKQHALN----- 1068
P42336.2 AHHGGWTTKMDWIFHTIKQHALN----- 1068
A0A0G2K344.1 AHHGGWTTKMDWIFHTIKQHALN----- 1068
P42337.2 AHHGGWTTKMDWIFHTIKQHALN----- 1068
P42338.1 ALRESWTTKVNWMAHTVRKDYRS----- 1070
Q8BTI9.2 ALRESWTTKVNWMAHTVRKDYRS----- 1064
Q9Z1L0.1 ALRESWTTKVNWMAHTVRKDYRS----- 1070
. .**.:.:*: * : .

Phylogram

GeneCards Analysis
PIK3CG (Phosphatidylinositol-4,5-Bisphosphate 3-Kinase Catalytic Subunit Gamma) is a
Protein Coding gene. Diseases associated with PIK3CG include Immunodeficiency 97 With
Autoinflammation and Pertussis. Among its related pathways are Immune response CCR3
signaling in eosinophils and Development Angiotensin activation of ERK. Gene Ontology
(GO) annotations related to this gene include transferase activity, transferring phosphorus-
containing groups and binding. An important paralog of this gene is PIK3CB.

Genomic Locations for PIK3CG Gene


Latest Assembly: chr7:106,865,278-106,908,980 (GRCh38/hg38)
Size: 43,703 bases
Orientation: Plus strand
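
The reported size follows directly from the coordinates. The short sketch below shows the arithmetic, assuming the coordinates form a closed, 1-based interval; the helper name `span_length` is hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Length in bases of a closed, 1-based genomic interval [start, end]."""
    return end - start + 1


# PIK3CG on chr7 (GRCh38/hg38), coordinates as listed above
gene_start, gene_end = 106_865_278, 106_908_980
gene_size = span_length(gene_start, gene_end)
print(f"{gene_size:,} bases")
```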

Genomic View for PIK3CG Gene

Cytogenetic band: 7q22.3 by HGNC, 7q22.3 by Entrez Gene, 7q22.3 by Ensembl


PIK3CG Gene in genomic location: bands according to Ensembl, locations according to GeneLoc (and/or Entrez Gene
and/or Ensembl if different)

Localization of PI3K Gene

UniProt Analysis
Protein: Phosphatidylinositol 4,5-bisphosphate 3-kinase catalytic subunit gamma isoform
Gene: PIK3CG
Organism: Homo sapiens (Human)

Function

Phosphoinositide-3-kinase (PI3K) that phosphorylates PtdIns(4,5)P2 (Phosphatidylinositol
4,5-bisphosphate) to generate phosphatidylinositol 3,4,5-trisphosphate (PIP3). PIP3 plays a
key role by recruiting PH domain-containing proteins to the membrane, including AKT1 and
PDPK1, activating signaling cascades involved in cell growth, survival, proliferation, motility
and morphology. Links G-protein coupled receptor activation to PIP3 production. Involved in
immune, inflammatory and allergic responses. Modulates leukocyte chemotaxis to
inflammatory sites and in response to chemoattractant agents. May control leukocyte
polarization and migration by regulating the spatial accumulation of PIP3 and by regulating
the organization of F-actin formation and integrin-based adhesion at the leading edge.
Controls motility of dendritic cells. Together with PIK3CD is involved in natural killer (NK)
cell development and migration towards the sites of inflammation. Participates in T-
lymphocyte migration. Regulates T-lymphocyte proliferation and cytokine production.
Together with PIK3CD participates in T-lymphocyte development. Required for B-
lymphocyte development and signaling. Together with PIK3CD participates in neutrophil
respiratory burst. Together with PIK3CD is involved in neutrophil chemotaxis and
extravasation. Together with PIK3CB promotes platelet aggregation and thrombosis.
Regulates alpha-IIb/beta-3 integrins (ITGA2B/ ITGB3) adhesive function in platelets
downstream of P2Y12 through a lipid kinase activity-independent mechanism. May have also
a lipid kinase activity-dependent function in platelet aggregation. Involved in endothelial
progenitor cell migration. Negative regulator of cardiac contractility. Modulates cardiac
contractility by anchoring protein kinase A (PKA) and PDE3B activation, reducing cAMP
levels. Regulates cardiac contractility also by promoting beta-adrenergic receptor
internalization by binding to GRK2 and by non-muscle tropomyosin phosphorylation. Also
has serine/threonine protein kinase activity: both lipid and protein kinase activities are
required for beta-adrenergic receptor endocytosis. May also have a scaffolding role in
modulating cardiac contractility. Contributes to cardiac hypertrophy under pathological
stress. Through simultaneous binding of PDE3B to RAPGEF3 and PIK3R6 is assembled in a
signaling complex in which the PI3K gamma complex is activated by RAPGEF3 and which
is involved in angiogenesis.

Disease & Variants
Type: Mutagenesis

Position   Description
833        Loss of kinase activity. Loss of autophosphorylation. Reduced inflammatory reactions but no alterations in cardiac contractility.
947        Abolishes protein and lipid kinase activity. Does not abolish interaction with GRK2.
1011       Loss of autophosphorylation. No effect on phosphatidylinositol-4,5-bisphosphate 3-kinase activity.

Interaction

Subunit
Heterodimer of a catalytic subunit PIK3CG and a PIK3R5 or PIK3R6 regulatory subunit.
Interacts with GRK2 through the PIK helical domain. Interaction with GRK2 is required for
targeting to agonist-occupied receptor. Interacts with PDE3B (By similarity).
Interacts with TPM2. Interacts with EPHA8; regulates integrin-mediated cell adhesion to
substrate. Interacts with HRAS; the interaction is required for membrane recruitment and
beta-gamma G protein dimer-dependent activation of the PI3K gamma complex
PIK3CG:PIK3R6 (By similarity).
Binary interactions
P48736 has binary interactions with 5 proteins

Features for beta strand, helix and turn.
Legend: Helix, Beta Strand, Turns

Family & Domains

Type     Position   Description            Evidence
Domain   217-309    PI3K-RBD               Automatic Annotation
Domain   357-521    C2 PI3K-type           Automatic Annotation
Domain   541-723    PIK helical            Automatic Annotation
Domain   797-1080   PI3K/PI4K catalytic    Automatic Annotation
Region   803-809    G-loop                 Automatic Annotation
Region   943-951    Catalytic loop         Automatic Annotation
Region   962-988    Activation loop

Reactome Pathway
G-protein beta:gamma signalling

KEGG Pathway
PI3K-Akt signaling pathway

The phosphatidylinositol 3'-kinase (PI3K)-Akt signaling pathway is activated by many types
of cellular stimuli or toxic insults and regulates fundamental cellular functions such as
transcription, translation, proliferation, growth, and survival. The binding of growth factors to
their receptor tyrosine kinase (RTK) or G protein-coupled receptors (GPCR) stimulates class
Ia and Ib PI3K isoforms, respectively. PI3K catalyzes the production of phosphatidylinositol-
3,4,5-trisphosphate (PIP3) at the cell membrane. PIP3 in turn serves as a second messenger
that helps to activate Akt. Once active, Akt can control key cellular processes by
phosphorylating substrates involved in apoptosis, protein synthesis, metabolism, and cell
cycle.

STRING Pathway

Data Representation

Nodes
Network nodes represent proteins. Splice isoforms or post-translational modifications are
collapsed, i.e. each node represents all the proteins produced by a single, protein-coding gene
locus.

Edges
Edges represent protein-protein associations that are meant to be specific and meaningful, i.e.
proteins jointly contribute to a shared function; this does not necessarily mean they are
physically binding to each other.

Predicted Functional Partners:

ProtParam Analysis
Number of amino acids: 966
Molecular weight: 110605.24
Theoretical pI: 6.49

Amino acid composition:


[Pie chart: amino acid composition as percentages]

Ala (A)  54     Arg (R)  44     Asn (N)  32     Asp (D)  60
Cys (C)  18     Gln (Q)  50     Glu (E)  64     Gly (G)  48
His (H)  36     Ile (I)  60     Leu (L) 108     Lys (K)  71
Met (M)  21     Phe (F)  43     Pro (P)  43     Ser (S)  59
Thr (T)  50     Trp (W)  17     Tyr (Y)  26     Val (V)  62
Pyl (O)   0     Sec (U)   0

(B) 0 0.0%
(Z) 0 0.0%
(X) 0 0.0%

Total number of negatively charged residues (Asp + Glu): 124
Total number of positively charged residues (Arg + Lys): 115
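
The totals reported above can be cross-checked from the per-residue counts. The following is a minimal sketch, not part of the ProtParam output; the dictionary name `composition` and the derived variables are illustrative.

```python
# Residue counts as listed in the amino acid composition table
composition = {
    "Ala": 54, "Arg": 44, "Asn": 32, "Asp": 60, "Cys": 18,
    "Gln": 50, "Glu": 64, "Gly": 48, "His": 36, "Ile": 60,
    "Leu": 108, "Lys": 71, "Met": 21, "Phe": 43, "Pro": 43,
    "Ser": 59, "Thr": 50, "Trp": 17, "Tyr": 26, "Val": 62,
}

total_residues = sum(composition.values())
negative = composition["Asp"] + composition["Glu"]  # acidic residues
positive = composition["Arg"] + composition["Lys"]  # basic residues

# Percentage of each residue, rounded to one decimal place
percent = {aa: round(100 * n / total_residues, 1) for aa, n in composition.items()}

print("Total residues:", total_residues)
print("Asp + Glu:", negative)
print("Arg + Lys:", positive)
print("Most abundant:", max(composition, key=composition.get))
```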
Atomic composition:

Carbon    C   4979
Hydrogen  H   7810
Nitrogen  N   1340
Oxygen    O   1432
Sulfur    S     39

Formula: C4979H7810N1340O1432S39
Total number of atoms: 15600
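
The atom total and the approximate molecular weight can be recovered from the formula. This is a sketch under stated assumptions: the atomic weights are rounded standard values, which may differ slightly from those ProtParam uses internally, and `parse_formula` is a hypothetical helper.

```python
import re

# Rounded standard atomic weights (assumption: not ProtParam's exact constants)
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def parse_formula(formula: str) -> dict[str, int]:
    """Split a formula such as 'C2H6O' into element counts."""
    counts: dict[str, int] = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


counts = parse_formula("C4979H7810N1340O1432S39")
total_atoms = sum(counts.values())
approx_mass = sum(ATOMIC_WEIGHT[el] * n for el, n in counts.items())

print("Atoms:", total_atoms)
print(f"Approximate molecular weight: {approx_mass:.1f}")
```

The result is expected to lie close to the reported molecular weight of 110605.24; small differences come from the rounding of the atomic weights.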
Extinction coefficients:

Extinction coefficients are in units of M-1 cm-1, at 280 nm measured in water.

Ext. coefficient 133365
Abs 0.1% (=1 g/l) 1.206, assuming all pairs of Cys residues form cystines

Ext. coefficient 132240
Abs 0.1% (=1 g/l) 1.196, assuming all Cys residues are reduced
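
The two extinction coefficients can be reproduced from the Tyr, Trp and Cys counts. The sketch below assumes the per-residue molar absorptivities at 280 nm documented for ProtParam (Tyr 1490, Trp 5500, cystine 125 M-1 cm-1); the function name is illustrative.

```python
# Molar absorptivities at 280 nm in M^-1 cm^-1 (values documented for ProtParam)
EXT_TYR, EXT_TRP, EXT_CYSTINE = 1490, 5500, 125


def extinction_coefficient(n_tyr: int, n_trp: int, n_cystine: int = 0) -> int:
    """Extinction coefficient at 280 nm from residue counts."""
    return n_tyr * EXT_TYR + n_trp * EXT_TRP + n_cystine * EXT_CYSTINE


n_tyr, n_trp, n_cys = 26, 17, 18   # counts from the composition table
mol_weight = 110605.24             # molecular weight reported above

ext_cystines = extinction_coefficient(n_tyr, n_trp, n_cys // 2)  # all Cys paired
ext_reduced = extinction_coefficient(n_tyr, n_trp)               # all Cys reduced

# Absorbance of a 1 g/l (0.1%) solution is the coefficient divided by the mass
abs_cystines = round(ext_cystines / mol_weight, 3)
abs_reduced = round(ext_reduced / mol_weight, 3)

print(ext_cystines, abs_cystines)
print(ext_reduced, abs_reduced)
```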
Estimated half-life:

The N-terminal of the sequence considered is M (Met).

The estimated half-life is: 30 hours (mammalian reticulocytes, in vitro).
>20 hours (yeast, in vivo).
>10 hours (Escherichia coli, in vivo).
Instability index:

The instability index (II) is computed to be 43.09.
This classifies the protein as unstable (an index above 40 predicts instability).
Aliphatic index: 92.03
Grand average of hydropathicity (GRAVY): -0.297
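
The aliphatic index can be recomputed from the composition using Ikai's formula, in which Val is weighted by 2.9 and Ile plus Leu by 3.9 relative to Ala. The sketch below also applies the usual cut-off of 40 for the instability index; both function names are hypothetical.

```python
def aliphatic_index(n_ala: int, n_val: int, n_ile: int, n_leu: int, length: int) -> float:
    """Aliphatic index from residue counts (mole percent, weighted by side-chain volume)."""
    weight_val, weight_ile_leu = 2.9, 3.9
    weighted = n_ala + weight_val * n_val + weight_ile_leu * (n_ile + n_leu)
    return 100 * weighted / length


def stability_class(instability_index: float) -> str:
    """Classify a protein using the conventional threshold of 40."""
    return "stable" if instability_index < 40 else "unstable"


# Counts taken from the amino acid composition (966 residues)
index = aliphatic_index(n_ala=54, n_val=62, n_ile=60, n_leu=108, length=966)
print(f"Aliphatic index: {index:.2f}")
print("Instability class:", stability_class(43.09))
```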

PubChem Analysis
1. Icotinib
   PubChem CID: 22024915
   Molecular Formula: C22H21N3O4
   Canonical SMILES: C#CC1=CC(=CC=C1)NC2=NC=NC3=CC4=C(C=C32)OCCOCCOCCO4
   Molecular Weight: 391.4 g/mol
   Hydrogen Bond Donor Count: 1
   Hydrogen Bond Acceptor Count: 7
   Exact Mass: 391.15320616 Da
   Formal Charge: 0
   Topological Polar Surface Area: 74.7 Ų
   Rotatable bonds: 3

2. PF-04691502
   PubChem CID: 25033539
   Molecular Formula: C22H27N5O4
   Canonical SMILES: CC1=C2C=C(C(=O)N(C2=NC(=N1)N)C3CCC(CC3)OCCO)C4=CN=C(C=C4)OC
   Molecular Weight: 425.5 g/mol
   Hydrogen Bond Donor Count: 2
   Hydrogen Bond Acceptor Count: 8
   Exact Mass: 425.20630436 Da
   Formal Charge: 0
   Topological Polar Surface Area: 124 Ų
   Rotatable bonds: 6

3. Amuvatinib
   PubChem CID: 11282283
   Molecular Formula: C23H21N5O3S
   Canonical SMILES: C1CN(CCN1C2=NC=NC3=C2OC4=CC=CC=C43)C(=S)NCC5=CC6=C(C=C5)OCO6
   Molecular Weight: 447.5 g/mol
   Hydrogen Bond Donor Count: 1
   Hydrogen Bond Acceptor Count: 7
   Exact Mass: 447.13651072 Da
   Formal Charge: 0
   Topological Polar Surface Area: 74.7 Ų
   Rotatable bonds: 3

4. Wortmannin
   PubChem CID: 312145
   Molecular Formula: C23H24O8
   Canonical SMILES: CC(=O)OC1CC2(C(CCC2=O)C3=C1C4(C(OC(=O)C5=COC(=C54)C3=O)COC)C)C
   Molecular Weight: 428.4 g/mol
   Hydrogen Bond Donor Count: 0
   Hydrogen Bond Acceptor Count: 8
   Exact Mass: 428.14711772 Da
   Formal Charge: 0
   Topological Polar Surface Area: 109 Ų
   Rotatable bonds: 4

5. LY294002
   PubChem CID: 3973
   Molecular Formula: C19H17NO3
   Canonical SMILES: C1COCCN1C2=CC(=O)C3=C(O2)C(=CC=C3)C4=CC=CC=C4
   Molecular Weight: 307.3 g/mol
   Hydrogen Bond Donor Count: 0
   Hydrogen Bond Acceptor Count: 4
   Exact Mass: 307.12084340 Da
   Formal Charge: 0
   Topological Polar Surface Area: 38.8 Ų
   Rotatable bonds: 2

6. Dactolisib
   PubChem CID: 11977753
   Molecular Formula: C30H23N5O
   Canonical SMILES: CC(C)(C#N)C1=CC=C(C=C1)N2C3=C4C=C(C=CC4=NC=C3N(C2=O)C)C5=CC6=CC=CC=C6N=C5
   Molecular Weight: 469.5 g/mol
   Hydrogen Bond Donor Count: 0
   Hydrogen Bond Acceptor Count: 4
   Exact Mass: 469.19026037 Da
   Formal Charge: 0
   Topological Polar Surface Area: 73.1 Ų
   Rotatable bonds: 3

7. Temozolomide
   PubChem CID: 5394
   Molecular Formula: C6H6N6O2
   Canonical SMILES: CN1C(=O)N2C=NC(=C2N=N1)C(=O)N
   Molecular Weight: 194.15 g/mol
   Hydrogen Bond Donor Count: 1
   Hydrogen Bond Acceptor Count: 5
   Exact Mass: 194.05522346 Da
   Formal Charge: 0
   Topological Polar Surface Area: 106 Ų
   Rotatable bonds: 1

8. Lomustine
   PubChem CID: 3950
   Molecular Formula: C9H16ClN3O2
   Canonical SMILES: C1CCC(CC1)NC(=O)N(CCCl)N=O
   Molecular Weight: 233.69 g/mol
   Hydrogen Bond Donor Count: 1
   Hydrogen Bond Acceptor Count: 3
   Exact Mass: 233.0931045 Da
   Formal Charge: 0
   Topological Polar Surface Area: 61.8 Ų
   Rotatable bonds: 3
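
A quick consistency check on the table is to recompute each molecular weight from its molecular formula. The sketch below is one way to do this, assuming rounded standard atomic weights; `molecular_weight` is a hypothetical helper and is not a PubChem function.

```python
import re

# Rounded standard atomic weights (assumption: PubChem's values may differ in the last digit)
ATOMIC_WEIGHT = {
    "C": 12.011, "H": 1.008, "N": 14.007,
    "O": 15.999, "S": 32.06, "Cl": 35.45,
}


def molecular_weight(formula: str) -> float:
    """Average molecular weight in g/mol for a simple formula such as 'C6H6N6O2'."""
    tokens = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    return sum(ATOMIC_WEIGHT[element] * int(number or 1) for element, number in tokens)


# Molecular formulas as listed in the PubChem table
formulas = {
    "Icotinib": "C22H21N3O4",
    "PF-04691502": "C22H27N5O4",
    "Amuvatinib": "C23H21N5O3S",
    "Wortmannin": "C23H24O8",
    "LY294002": "C19H17NO3",
    "Dactolisib": "C30H23N5O",
    "Temozolomide": "C6H6N6O2",
    "Lomustine": "C9H16ClN3O2",
}

weights = {name: molecular_weight(formula) for name, formula in formulas.items()}
for name, weight in weights.items():
    print(f"{name:13s} {formulas[name]:12s} {weight:7.2f} g/mol")
```

A value that disagrees with the formula by more than rounding error points to a transcription slip in the table row.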

Structural Analysis
(All pictures are taken from PubChem)

S.no  Drug Name       2D Structure   3D Conformer
1.    Icotinib
2.    PF-04691502
3.    Amuvatinib
4.    Wortmannin
5.    LY294002
6.    Dactolisib
7.    Temozolomide
8.    Lomustine

Pharmacokinetic Analysis
(All data has been collected from SwissADME)

Drug Name: Icotinib

Drug Name: PF-04691502

Drug Name: Amuvatinib

Drug Name: Wortmannin

Drug Name: LY294002

Drug Name: Dactolisib

Drug Name: Temozolomide

Drug Name: Lomustine

S.no  Generic Name    Follows Lipinski's Rule
1     Icotinib        Yes
2     PF-04691502     Yes
3     Amuvatinib      Yes
4     Wortmannin      Yes
5     LY294002        Yes
6     Dactolisib      Yes
7     Temozolomide    Yes
8     Lomustine       Yes
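
The verdicts in this table can be partly reproduced from the PubChem properties. The sketch below checks three of Lipinski's four criteria (molecular weight at most 500, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors). The logP criterion is left out because no logP values are listed in the PubChem table, so this is an incomplete check and not the SwissADME calculation. The `Compound` class and function name are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    mol_weight: float   # g/mol
    h_donors: int
    h_acceptors: int


def lipinski_violations(c: Compound) -> int:
    """Count violations of the three criteria checkable without logP."""
    return sum([c.mol_weight > 500, c.h_donors > 5, c.h_acceptors > 10])


# Values from the PubChem table; the Amuvatinib weight is the one
# derived from its formula C23H21N5O3S (about 447.5 g/mol).
compounds = [
    Compound("Icotinib", 391.4, 1, 7),
    Compound("PF-04691502", 425.5, 2, 8),
    Compound("Amuvatinib", 447.5, 1, 7),
    Compound("Wortmannin", 428.4, 0, 8),
    Compound("LY294002", 307.3, 0, 4),
    Compound("Dactolisib", 469.5, 0, 4),
    Compound("Temozolomide", 194.15, 1, 5),
    Compound("Lomustine", 233.69, 1, 3),
]

violations = {c.name: lipinski_violations(c) for c in compounds}
for name, count in violations.items():
    print(f"{name:13s} violations: {count}")
```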

Molinspiration

1. Icotinib

Molinspiration bioactivity score v2021.03


GPCR ligand 0.16
Ion channel modulator 0.14
Kinase inhibitor 0.71
Nuclear receptor ligand -0.07
Protease inhibitor -0.14
Enzyme inhibitor 0.22

2. Amuvatinib

Molinspiration bioactivity score v2021.03


GPCR ligand -0.05
Ion channel modulator -0.37
Kinase inhibitor -0.11
Nuclear receptor ligand -0.84
Protease inhibitor -0.58
Enzyme inhibitor -0.08

3. Wortmannin

Molinspiration bioactivity score v2021.03


GPCR ligand -0.23
Ion channel modulator -0.32
Kinase inhibitor -0.10
Nuclear receptor ligand 0.22
Protease inhibitor -0.21
Enzyme inhibitor 0.73

4. LY294002

Molinspiration bioactivity score v2021.03


GPCR ligand 0.20
Ion channel modulator -0.42
Kinase inhibitor 0.79
Nuclear receptor ligand -0.21
Protease inhibitor -0.06
Enzyme inhibitor 0.23

5. Dactolisib
Molinspiration bioactivity score v2021.03
GPCR ligand 0.30

Ion channel modulator 0.08
Kinase inhibitor 0.48
Nuclear receptor ligand -0.14
Protease inhibitor -0.05
Enzyme inhibitor 0.34

6. Temozolomide

Molinspiration bioactivity score v2021.03


GPCR Ligand -0.49
Ion channel modulator -0.17
Kinase inhibitor -0.14
Nuclear receptor ligand -1.59
Protease inhibitor -0.71
Enzyme inhibitor 0.07

7. Lomustine

Molinspiration bioactivity score v2021.03


GPCR Ligand -0.41
Ion channel modulator -0.23
Kinase inhibitor -0.62
Nuclear receptor ligand -0.93
Protease inhibitor -0.46
Enzyme inhibitor -0.11
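
To compare the compounds on the property most relevant to a kinase target, the kinase inhibitor scores above can be ranked. The sketch below does this; the cut-offs in `activity_class` (above 0 active, between -0.5 and 0 moderately active, below -0.5 inactive) are a commonly quoted reading of Molinspiration scores and are an assumption here, not something stated in the output.

```python
# Kinase inhibitor scores copied from the Molinspiration tables above
kinase_score = {
    "Icotinib": 0.71,
    "Amuvatinib": -0.11,
    "Wortmannin": -0.10,
    "LY294002": 0.79,
    "Dactolisib": 0.48,
    "Temozolomide": -0.14,
    "Lomustine": -0.62,
}


def activity_class(score: float) -> str:
    """Assumed interpretation of a bioactivity score (see lead-in)."""
    if score > 0:
        return "active"
    if score >= -0.5:
        return "moderately active"
    return "inactive"


# Highest score first
ranked = sorted(kinase_score, key=kinase_score.get, reverse=True)
for name in ranked:
    score = kinase_score[name]
    print(f"{name:13s} {score:5.2f}  {activity_class(score)}")
```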

Pre ADME Analysis
(For toxicity prediction)

1. Icotinib

ID Value
algae_at 0.0163833
Ames_test mutagen
Carcino_Mouse negative
Carcino_Rat positive
daphnia_at 0.0216088
hERG_inhibition medium_risk
medaka_at 0.00103959
minnow_at 0.00221139
TA100_10RLI positive
TA100_NA negative
TA1535_10RLI negative
TA1535_NA negative

2. Amuvatinib

Value
CMC_like_Rule Qualified
CMC_like_Rule_Violation_Fields
CMC_like_Rule_Violations 0
Lead-like_Rule_Violation_Fields Molecular_weight, AlogP98_value
Lead_like_Rule Violated
Lead_like_Rule_Violations 2
MDDR_like_Rule Mid-structure
MDDR_like_Rule_Violation_Fields No_Rotatable_bonds
MDDR_like_Rule_Violations 1
Rule_of_Five Suitable
Rule_of_Five_Violation_Fields
Rule_of_Five_Violations 0
WDI_like_Rule Out of 90% cutoff
WDI_like_Rule_Violation_Fields AMolRef, 1st_Zagreb
WDI_like_Rule_Violations 2

3. Wortmannin

Value
algae_at 0.11397
Ames_test mutagen
Carcino_Mouse negative
Carcino_Rat positive
daphnia_at 0.275294
hERG_inhibition low_risk
medaka_at 0.12815
minnow_at 0.101235
TA100_10RLI negative
TA100_NA negative
TA1535_10RLI negative
TA1535_NA negative

4. LY294002

Value
algae_at 0.0657848
Ames_test mutagen
Carcino_Mouse negative
Carcino_Rat negative
daphnia_at 0.0757216
hERG_inhibition medium_risk
medaka_at 0.00979408
minnow_at 0.0150018
TA100_10RLI positive
TA100_NA positive
TA1535_10RLI negative
TA1535_NA negative

5. Dactolisib

Value
algae_at 0.00368811
Ames_test mutagen
Carcino_Mouse negative
Carcino_Rat negative
daphnia_at 0.00186204
hERG_inhibition medium_risk
medaka_at 1.39354e-005
minnow_at 0.000159849
TA100_10RLI negative
TA100_NA negative
TA1535_10RLI negative
TA1535_NA negative

6. Temozolomide

Value
algae_at 0.331441
Ames_test mutagen
Carcino_Mouse negative
Carcino_Rat negative
daphnia_at 2.17405
hERG_inhibition low_risk
medaka_at 5.70405
minnow_at 1.83475
TA100_10RLI positive
TA100_NA positive
TA1535_10RLI positive
TA1535_NA positive

7. Lomustine

Value
algae_at 0.0402313
Ames_test mutagen
Carcino_Mouse negative
Carcino_Rat negative
daphnia_at 0.232962
hERG_inhibition low_risk
medaka_at 0.0701823
minnow_at 0.0546158
TA100_10RLI positive
TA100_NA positive
TA1535_10RLI negative
TA1535_NA negative

Swiss Target Prediction

Drug: Icotinib

Target Classes

Target: Epidermal growth factor receptor erbB1
Probability
Protein: Epidermal growth factor receptor
Gene: EGFR
Organism: Homo sapiens (Human)
Function: Activates at least 4 major downstream signaling cascades, including the
RAS-RAF-MEK-ERK, PI3 kinase-AKT, PLCgamma-PKC and STATs modules.
Catalytic activity: ATP + L-tyrosyl-[protein] = ADP + H+ + O-phospho-L-tyrosyl-[protein]

Drug PF-04691502

101
Target Classes

Target Serine/threonine-protein kinase mTOR


Probablity

Protein Serine/threonine-protein kinase mTOR


Gene MTOR
Organism Homo sapiens (Human)
Function MTOR directly or indirectly regulates the phosphorylation of at least
800 proteins. Functions as part of 2 structurally and functionally distinct
signalling complexes, mTORC1 and mTORC2. The mTORC1 signalling cascade also
controls the MiT/TFE factors TFEB and TFE3: in the presence of nutrients, it
mediates phosphorylation of TFEB and TFE3, promoting their cytosolic retention
and inactivation.
Catalytic activity ATP + L-seryl-[protein] = ADP + H+ + O-phospho-L-seryl-[protein]

Drug Amuvatinib

Target Classes

Target Type-1 angiotensin II receptor (by homology)

Probability

Protein Type-1 angiotensin II receptor


Gene AGTR1
Organism Homo sapiens (Human)
Function Receptor for angiotensin II. Mediates its action by association with G
proteins that activate a phosphatidylinositol-calcium second messenger
system.

Catalytic activity

Drug Wortmannin

Target Classes

Target DNA-dependent protein kinase


Probability

Protein DNA-dependent protein kinase catalytic subunit


Gene PRKDC
Organism Homo sapiens (Human)
Function Serine/threonine-protein kinase that acts as a molecular sensor for DNA
damage. Phosphorylates 'Ser-139' of histone variant H2AX, thereby regulating
the DNA damage response mechanism. Involved in DNA non-homologous end joining
(NHEJ) required for double-strand break (DSB) repair and V(D)J recombination.

Catalytic activity ATP + L-seryl-[protein] = ADP + H+ + O-phospho-L-seryl-[protein]

Drug LY294002

Target Classes

Target Phosphodiesterase 5A
Probability

Protein cGMP-specific 3',5'-cyclic phosphodiesterase


Gene PDE5A
Organism Homo sapiens (Human)
Function Plays a role in signal transduction by regulating the intracellular
concentration of cyclic nucleotides.

Catalytic activity 3',5'-cyclic GMP + H2O = GMP + H+

Drug Dactolisib

Target Classes

Target PI3-kinase p110-alpha/p85-alpha

Probability

Protein Phosphatidylinositol 3-kinase regulatory subunit alpha


Gene PIK3R1
Organism Homo sapiens (Human)
Function Binds to activated (phosphorylated) protein-Tyr kinases, through its
SH2 domain, and acts as an adapter, mediating the association of the
p110 catalytic unit to the plasma membrane.
Catalytic activity

Drug Temozolomide

Target Classes

Target Lysine-specific demethylase 5C


Probability

Protein Lysine-specific demethylase 5C


Gene KDM5C
Organism Homo sapiens (Human)
Function Participates in transcriptional repression of neuronal genes by recruiting
histone deacetylases and REST at neuron-restrictive silencer elements.

Catalytic activity 3 2-oxoglutarate + N6,N6,N6-trimethyl-L-lysyl4-[histone H3] + 3 O2 =
3 CO2 + 3 formaldehyde + L-lysyl4-[histone H3] + 3 succinate

Drug Lomustine

Target Classes

Target Alpha-2a adrenergic receptor


Probability

Protein Alpha-2A adrenergic receptor

Gene ADRA2A

Organism Homo sapiens (Human)


Function Alpha-2 adrenergic receptors mediate the catecholamine-induced
inhibition of adenylate cyclase through the action of G proteins.
Catalytic activity

ORF Finder

Six Frame Translation
E F G T S T S F S A R L S E T V V G S * D D T T T E C L C F
ORF8 N S A R A L P S R L D Y L K L L S V L E M I L P P N V C V S
I R H E H F L L G * I I * N C C R F L R * Y Y H R M S V F
1
GAATTCGGCACGAGCACTTCCTTCTCGGCTAGATTATCTGAAACTGTTGTCGGTTCTTGAGATGATACTACCACCGAATGTCT
GTGTTTC
F E A R A S G E R S S * R F S N D T R S I I S G G F T Q T E
I R C S C K R R P * I I Q F Q Q R N K L H Y * W R I D T N *
ORF23 N P V L V E K E A L N D S V T T P E Q S S V V V S H R H K

I V * S N L Y C G Y L Q R S G N S F A G A S H F C F C F G R
ORF8 L S S P T C I V D I Y N V P A I V L Q V H H I F V F V L G G
H C L V Q P V L W I S T T F R Q * F C R C I T F L F L F W E
91
ATTGTCTAGTCCAACCTGTATTGTGGATATCTACAACGTTCCGGCAATAGTTTTGCAGGTGCATCACATTTTTGTTTTTGTTTT
GGGAGG
ORF19 N D L G V Q I T S I * L T G A I T K C T C * M K T K T K P P
Q R T W G T N H I D V V N R C Y N Q L H M V N K N K N Q S S
ORF23 M T * D L R Y Q P Y R C R E P L L K A P A D C K Q K Q K P L

K R E G T A A R L H I P T S A C F K I T V L T V F P T S S H
ORF8 K G R A R Q P G F I F L Q V H A S R L L Y L Q C F Q H L L I
E K G G H G S Q A S Y S Y K C M L Q D Y C T Y S V S N I F S
181
AAAAGGGAGGGCACGGCAGCCAGGCTTCATATTCCTACAAGTGCATGCTTCAAGATTACTGTACTTACAGTGTTTCCAACAT
CTTCTCAT
ORF19 F P L A R C G P K M N R C T C A E L N S Y K C H K W C R R M
F P P C P L W A E Y E * L H M S * S * Q V * L T E L M K E Y
F L S P V A A L S * I G V L A H K L I V T S V T N G V D E *

K R G K L H S L N H E G N Q S H R A W S W R T I N S P W C *
ORF8 K G E S F I A S T M K E T S R I G H G A G E L * T A R G A E
ORF13 * K G K A S * P Q P * R K P V A * G M E L E N Y K Q P V V L
271
AAAAGGGGAAAGCTTCATAGCCTCAACCATGAAGGAAACCAGTCGCATAGGGCATGGAGCTGGAGAACTATAAACAGCCCG
TGGTGCTGA
ORF19 F P S L K M A E V M F S V L R M P C P A P S S Y V A R P A S
ORF32 F P F A E Y G * G H L F G T A Y P M S S S F * L C G T T S L
ORF22 L L P F S * L R L W S P F W D C L A H L Q L V I F L G H H Q

E R T T A E G A G G * S R A V L P A C P P W S S S P S S S C
R G Q L P K A P E D E A A Q C C Q P V L H G A H P H R V R A
ORF13 R E D N C R R R R R M K P R S A A S L S S M E L I P I E F V
361
GAGAGGACAACTGCCGAAGGCGCCGGAGGATGAAGCCGCGCAGTGCTGCCAGCCTGTCCTCCATGGAGCTCATCCCCATCG
AGTTCGTGC
L P C S G F A G S S S A A C H Q W G T R W P A * G W R T R A
ORF32 S S L Q R L R R L I F G R L A A L R D E M S S M G M S N T S
ORF22 S L V V A S P A P P H L R A T S G A Q G G H L E D G D L E H

C P P A S A N A R A P K R R C C T W P A T A T W S R * R P R
ORF9 A H Q P A Q M Q E P R N G A A A R G R P R Q R G A D E G P G
ORF13 L P T S Q R K C K S P E T A L L H V A G H G N V E Q M K A Q
451
TGCCCACCAGCCAGCGCAAATGCAAGAGCCCCGAAACGGCGCTGCTGCACGTGGCCGGCCACGGCAACGTGGAGCAGATGA
AGGCCCAGG
A W W G A C I C S G R F P A A A R P R G R C R P A S S P G P
G V L W R L H L L G S V A S S C T A P W P L T S C I F A W T
ORF22 Q G G A L A F A L A G F R R Q Q V H G A V A V H L L H L G L

C G C E R W R P A W R R T S T T G W D R I T S S C S I R R R
ORF9 V A A S A G D Q R G G G L L P P A G T A S L P P A L S E E G

ORF13 V W L R A L E T S V A A D F Y H R L G P H H F L L L Y Q K K
541
TGTGGCTGCGAGCGCTGGAGACCAGCGTGGCGGCGGACTTCTACCACCGGCTGGGACCGCATCACTTCCTCCTGCTCTATCA
GAAGAAGG
T A A L A P S W R P P P S R G G A P V A D S G G A R D S S P
H S R A S S V L T A A S K * W R S P G C * K R R S * * F F P
ORF22 H P Q S R Q L G A H R R V E V V P Q S R M V E E Q E I L L L

G S G T R S T T S T R W C R L W T A C A T G R P R T G A R A
ORF9 A V V R D L R Q V P G G A D S G L P A L L E G H A P E P G P
ORF13 G Q W Y E I Y D K Y Q V V Q T L D C L R Y W K A T H R S P G
631
GGCAGTGGTACGAGATCTACGACAAGTACCAGGTGGTGCAGACTCTGGACTGCCTGCGCTACTGGAAGGCCACGCACCGGA
GCCCGGGCC
A T T R S R R C T G P P A S E P S G A S S S P W A G S G P G
C H Y S I * S L Y W T T C V R S Q R R * Q F A V C R L G P W
P L P V L D V V L V L H H L S Q V A Q A V P L G R V P A R A

R S T W C S G T R P P R S P K P S S G S S R R * L A M T S L
ORF9 D P P G A A A P A L R G V P S L P A A A H G A D W L * R H *
ORF13 Q I H L V Q R H P P S E E S Q A F Q R Q L T A L I G Y D V T
721
AGATCCACCTGGTGCAGCGGCACCCGCCCTCCGAGGAGTCCCAAGCCTTCCAGCGGCAGCTCACGGCGCTGATTGGCTATGA
CGTCACTG
S G G P A A A G A R R P T G L R G A A A * P A S Q S H R * Q
ORF31 I W R T C R C G G E S S D W A K W R C S V A S I P * S T V S
L D V Q H L P V R G G L L G L G E L P L E R R Q N A I V D S

T S A T C T T M S W S S R A V A W * P R A W R R W P A A T P
R Q Q R A R R * A G V H A P W L G D P A H G G G G Q P R P Q
ORF13 D V S N V H D D E L E F T R R G L V T P R M A E V A S R D P
811
ACGTCAGCAACGTGCACGACGATGAGCTGGAGTTCACGCGCCGTGGCTTGGTGACCCCGCGCATGGCGGAGGTGGCCAGCC
GCGACCCCA
R * C R A R R H A P T * A G H S P S G A C P P P P W G R G W
ORF31 T L L T C S S S S S N V R R P K T V G R M A S T A L R S G L
V D A V H V V I L Q L E R A T A Q H G R A H R L H G A A V G

S S T P C T R G * R P S P S R S T C G R R L P T T A S S S S
A L R H A P V G D V Q A P P G V P V E E D C Q Q L H L H R H
ORF13 K L Y A M H P W V T S K P L P E Y L W K K I A N N C I F I V
901
AGCTCTACGCCATGCACCCGTGGGTGACGTCCAAGCCCCTCCCGGAGTACCTGTGGAAGAAGATTGCCAACAACTGCATCTT
CATCGTCA
A R R W A G T P S T W A G G P T G T S S S Q W C S C R * R *
ORF30 S * A M C G H T V D L G R G S Y R H F F I A L L Q M K M T M
L E V G H V R P H R G L G E R L V Q P L L N G V V A D E D D

F T A A P P A R P L R S H P T T P P A P S C R A S S P R W P
S P Q H H Q P D H * G L T R R H P R R H P A E L L H Q D G Q
ORF13 I H R S T T S Q T I K V S P D D T P G A I L Q S F F T K M A
991
TTCACCGCAGCACCACCAGCCAGACCATTAAGGTCTCACCCGACGACACCCCCGGCGCCATCCTGCAGAGCTTCTTCACCAA
GATGGCCA
E G C C W W G S W * P R V R R C G R R W G A S S R * W S P W
ORF30 * R L V V L W V M L T E G S S V G P A M R C L K K V L I A L
N V A A G G A L G N L D * G V V G G A G D Q L A E E G L H G

R R N L * W I F P K A K A N R I L C C A S V A G M S T W W A
E E I S D G Y S R K P K R T G F C A A R L W P G * V P G G R
ORF13 K K K S L M D I P E S Q S E Q D F V L R V C G R D E Y L V G
1081
AGAAGAAATCTCTGATGGATATTCCCGAAAGCCAAAGCGAACAGGATTTTGTGCTGCGCGTCTGTGGCCGGGATGAGTACCT
GGTGGGCG
S S I E S P Y E R F G F R V P N Q A A R R H G P H T G P P R
ORF29 F F D R I S I G S L W L S C S K T S R T Q P R S S Y R T P S
L L F R Q H I N G F A L A F L I K H Q A D T A P I L V Q H A

K R P S K T S S G * G T A S R T E K R F T W Y W T R L Q T R
N A H Q K L P V G E A L P Q E R R R D S R G T G H A S R P G
ORF13 E T P I K N F Q W V R H C L K N G E E I H V V L D T P P D P
1171
AAACGCCCATCAAAAACTTCCAGTGGGTGAGGCACTGCCTCAAGAACGGAGAAGAGATTCACGTGGTACTGGACACGCCTC
CAGACCCGG
F A W * F S G T P S A S G * S R L L S E R P V P C A E L G P

ORF29 V G M L F K W H T L C Q R L F P S S I * T T S S V G G S G A
F R G D F V E L P H P V A E L V S F L N V H Y Q V R R W V R

ORF1 P * T R * G R K S G R W W T T A R E S P A T M S S L P S T A
P R R G E E G R V A A G G R L H G S H R L P * A A Y H P R Q
ORF13 A L D E V R K E E W P L V D D C T G V T G Y H E Q L T I H G
1261
CCCTAGACGAGGTGAGGAAGGAAGAGTGGCCGCTGGTGGACGACTGCACGGGAGTCACCGGCTACCATGAGCAGCTTACCA
TCCACGGCA
G L R P S S P L T A A P P R S C P L * R S G H A A * W G R C
ORF28 R S S T L F S S H G S T S S Q V P T V P * W S C S V M W P L
ORF21 G * V L H P L F L P R Q H V V A R S D G A V M L L K G D V A

ORF1 R T T R V C S P C P C G T A T A S S G S R S E A L I S P S C
G P R E C V H R V P V G L R P Q V Q G Q D Q R H * Y P R P A
ORF13 K D H E S V F T V S L W D C D R K F R V K I R G I D I P V L
1351
AGGACCACGAGAGTGTGTTCACCGTGTCCCTGTGGGACTGCGACCGCAAGTTCAGGGTCAAGATCAGAGGCATTGATATCCC
CGTCCTGC
P G R S H T * R T G T P S R G C T * P * S * L C Q Y G R G A
ORF28 S W S L T N V T D R H S Q S R L N L T L I L P M S I G T R G
ORF21 L V V L T H E G H G Q P V A V A L E P D L D S A N I D G D Q

ORF1 L G T P T S Q F L * R Q T S S M G N K S F A K G E P A P N P
S E H R P H S F C R G K H P A W A T S P L P K E N Q P Q T L
ORF13 P R N T D L T V F V E A N I Q H G Q Q V L C Q R R T S P K P
1441
CTCGGAACACCGACCTCACAGTTTTTGTAGAGGCAAACATCCAGCATGGGCAACAAGTCCTTTGCCAAAGGAGAACCAGCCC
CAAACCCT
E S C R G * L K Q L P L C G A H A V L G K G F S F W G W V R
ORF28 R F V S R V T K T S A F M W C P C C T R Q W L L V L G L G K
ORF21 R P V G V E C N K Y L C V D L M P L L D K A L P S G A G F G

ORF2 S Q R R C C G M C G L S S V S K S K T C P K G L Y * T S R S
H R G G A V E C V A * V Q Y Q N Q R L A Q R G S T E P P D L
ORF13 F T E E V L W N V W L E F S I K I K D L P K G A L L N L Q I
1531
TCACAGAGGAGGTGCTGTGGAATGTGTGGCTTGAGTTCAGTATCAAAATCAAAGACTTGCCCAAAGGGGCTCTACTGAACCT
CCAGATCT
* L P P A T S H T A Q T * Y * F * L S A W L P E V S G G S R
V S S T S H F T H S S N L I L I L S K G L P A R S F R W I *
E C L L H Q P I H P K L E T D F D F V Q G F P S * Q V E L D

T A V K L Q H C P A R P L Q S P P V L S P R A K F G F S I M
L R * S S S T V Q Q G L C R V P Q F * V Q G Q S S A S L L C
ORF13 Y C G K A P A L S S K A S A E S P S S E S K G K V R L L Y Y
1621
ACTGCGGTAAAGCTCCAGCACTGTCCAGCAAGGCCTCTGCAGAGTCCCCCAGTTCTGAGTCCAAGGGCAAAGTTCGGCTTCT
CTATTATG
S R Y L E L V T W C P R Q L T G W N Q T W P C L E A E R N H
ORF27 Q P L A G A S D L L A E A S D G L E S D L P L T R S R * * T
V A T F S W C Q G A L G R C L G G T R L G L A F N P K E I I

* T C C * * T T V S S C A V E N T S S T C G R Y L G R E K T
E P A A D R P P F P P A P W R I R P P H V A D I W E G R R P
ORF13 V N L L L I D H R F L L R R G E Y V L H M W Q I S G K G E D
1711
TGAACCTGCTGCTGATAGACCACCGTTTCCTCCTGCGCCGTGGAGAATACGTCCTCCACATGTGGCAGATATCTGGGAAGGG
AGAAGACC
S G A A S L G G N G G A G H L I R G G C T A S I Q S P L L G
ORF27 F R S S I S W R K R R R R P S Y T R W M H C I D P F P S S W
H V Q Q Q Y V V T E E Q A T S F V D E V H P L Y R P L S F V

ORF3 K E A S M L T N S R L Q L T Q T R R T Q C P S P F F W T I T
R K L Q C * Q T H V C N * P R Q G E L N V H L H S S G Q L L
ORF13 Q G S F N A D K L T S A T N P D K E N S M S I S I L L D N Y
1801
AAGGAAGCTTCAATGCTGACAAACTCACGTCTGCAACTAACCCAGACAAGGAGAACTCAATGTCCATCTCCATTCTTCTGGA
CAATTACT
L F S * H Q C V * T Q L * G L C P S S L T W R W E E P C N S
ORF27 P L K L A S L S V D A V L G S L S F E I D M E M R R S L * Q
ORF20 L S A E I S V F E R R C S V W V L L V * H G D G N K Q V I V

ORF3 A T R * P C L S I S P P L T R K G T G F E Q K C P T S F A S
P P D S P A * A S A H P * P G R G P G S S R N A Q P A S Q A

ORF13 C H P I A L P K H Q P T P D P E G D R V R A E M P N Q L R K
1891
GCCACCCGATAGCCCTGCCTAAGCATCAGCCCACCCCTGACCCGGAAGGGGACCGGGTTCGAGCAGAAATGCCCAACCAGC
TTCGCAAGC
G G S L G A * A D A W G Q G P L P G P E L L F A W G A E C A
ORF26 W G I A R G L C * G V G S G S P S R T R A S I G L W S R L C
ORF20 A V R Y G Q R L M L G G R V R F P V P N S C F H G V L K A L

N W R R S * P L I H L T L S Q Q R T K N C S G I L D T K A L
I G G D H S H * S T * P S H S R G Q R I A L A F * I R K P *
ORF13 Q L E A I I A T D P L N P L T A E D K E L L W H F R Y E S L
1981
AATTGGAGGCGATCATAGCCACTGATCCACTTAACCCTCTCACAGCAGAGGACAAAGAATTGCTCTGGCATTTTAGATACGA
AAGCCTTA
I P P S * L W Q D V * G E * L L P C L I A R A N * I R F G *
ORF26 N S A I M A V S G S L G R V A S S L S N S Q C K L Y S L R L
ORF20 L Q L R D Y G S I W K V R E C C L V F F Q E P M K S V F A K

S T Q K H I L S Y L V Q * N G D S K K L W P K H T N C W P E
ORF10 A P K S I S * A I * F S E M G T A R N C G Q N I P I V G Q K
ORF13 K H P K A Y P K L F S S V K W G Q Q E I V A K T Y Q L L A R
2071
AGCACCCAAAAGCATATCCTAAGCTATTTAGTTCAGTGAAATGGGGACAGCAAGAAATTGTGGCCAAAACATACCAATTGTT
GGCCAGAA
ORF18 A G L L M D * A I * N L S I P V A L F Q P W F M G I T P W F
C G F A Y G L S N L E T F H P C C S I T A L V Y W N N A L L
L V W F C I R L * K T * H F P S L L F N H G F C V L Q Q G S

G K S G I K V L W M L G * Q C S S W T A T S Q M K M * E P L
ORF10 G S L G S K C F G C W V N N A A P G L Q L L R * K C K S H C
ORF13 R E V W D Q S A L D V G L T M Q L L D C N F S D E N V R A I
2161
GGGAAGTCTGGGATCAAAGTGCTTTGGATGTTGGGTTAACAATGCAGCTCCTGGACTGCAACTTCTCAGATGAAAATGTAAG
AGCCATTG
ORF18 P L R P D F H K P H Q T L L A A G P S C S R L H F H L L W Q
S T Q S * L A K S T P N V I C S R S Q L K E S S F T L A M A
P F D P I L T S Q I N P * C H L E Q V A V E * I F I Y S G N

Q F R N W R A W R T M M F C I T F Y N W S R L * N L N H T M
S S E T G E L G G R * C S A L P S T I G P G C E I * T I P *
ORF13 A V Q K L E S L E D D D V L H Y L L Q L V Q A V K F E P Y H
2251
CAGTTCAGAAACTGGAGAGCTTGGAGGACGATGATGTTCTGCATTACCTTCTACAATTGGTCCAGGCTGTGAAATTTGAACC
ATACCATG
ORF18 L E S V P S S P P R H H E A N G E V I P G P Q S I Q V M G H
T * F S S L K S S S S T R C * R R C N T W A T F N S G Y W S
C N L F Q L A Q L V I I N Q M V K * L Q D L S H F K F W V M

I A P L P D F C * S V V * E T K E L V T F C F G S * E V R *
* R P C Q I S A E A W F K K Q K N W S L F V L V L E K * D S
ORF13 D S A L A R F L L K R G L R N K R I G H F L F W F L R S E I
2341
ATAGCGCCCTTGCCAGATTTCTGCTGAAGCGTGGTTTAAGAAACAAAAGAATTGGTCACTTTTTGTTTTGGTTCTTGAGAAGT
GAGATAG
Y R G Q W I E A S A H N L F C F F Q D S K T K T R S F H S L
L A R A L N R S F R P K L F L L I P * K K N Q N K L L S I A
I A G K G S K Q Q L T T * S V F S N T V K Q K P E Q S T L Y

P S P D T I S R G S L * F W K P I * G A V A Q P C C T T L P
P V Q T L S A E V R C D S G S L S E G L W H S H A A R L Y P
ORF13 A Q S R H Y Q Q R F A V I L E A Y L R G C G T A M L H D F T
2431
CCCAGTCCAGACACTATCAGCAGAGGTTCGCTGTGATTCTGGAAGCCTATCTGAGGGGCTGTGGCACAGCCATGCTGCACGA
CTTTACCC
G T W V S D A S T R Q S E P L R D S P S H C L W A A R S * G
W D L C * * C L N A T I R S A * R L P Q P V A M S C S K V W
G L G S V I L L P E S H N Q F G I Q P A T A C G H Q V V K G

ORF4 N K S K * S R C Y K K S P L I L N R S L L K S M T S V P K L
T S P S N R D V T K S H P * Y * I A L C * K V * R Q F P S Y
ORF13 Q Q V Q V I E M L Q K V T L D I K S L S A E K Y D V S S Q V
2521
AACAAGTCCAAGTAATCGAGATGTTACAAAAAGTCACCCTTGATATTAAATCGCTCTCTGCTGAAAAGTATGACGTCAGTTC
CCAAGTTA
V L G L L R S T V F L * G Q Y * I A R Q Q F T H R * N G L *

C T W T I S I N C F T V R S I L D S E A S F Y S T L E W T I
L L D L Y D L H * L F D G K I N F R E R S F L I V D T G L N

ORF4 F H N L N K S L K T C R I L N S P K A L E F H M I L D * K Q
F T T * T K A * K P A E F S T P R K L * S S I * S W T E S R
ORF13 I S Q L K Q K L E N L Q N S Q L P E S F R V P Y D P G L K A
2611
TTTCACAACTTAAACAAAAGCTTGAAAACCTGCAGAATTCTCAACTCCCCGAAAGCTTTAGAGTTCCATATGATCCTGGACTG
AAAGCAG
K V V * V F A Q F G A S N E V G R F S * L E M H D Q V S L L
ORF25 E C S L C F S S F R C F E * S G S L K L T G Y S G P S F A P
N * L K F L L K F V Q L I R L E G F A K S N W I I R S Q F C

E R W Q L K N V K * W P P R K N H Y G L S L N V P I L Q P Y
S A G N * K M * S N G L Q E K T T M A * V * M C R S Y S P I
ORF13 G A L A I E K C K V M A S K K K P L W L E F K C A D P T A L
2701
GAGCGCTGGCAATTGAAAAATGTAAAGTAATGGCCTCCAAGAAAAAACCACTATGGCTTGAGTTTAAATGTGCCGATCCTAC
AGCCCTAT
L A P L Q F I Y L L P R W S F V V I A Q T * I H R D * L G I
ORF25 A S A I S F H L T I A E L F F G S H S S N L H A S G V A R D
S R Q C N F F T F Y H G G L F F W * P K L K F T G I R C G *

ORF5 Q M K Q L E L S L N M V M I C A K T C L F Y R F Y E S W S L
K * N N W N Y L * T W * * S A P R H A Y F T D S T N H G V Y
ORF13 S N E T I G I I F K H G D D L R Q D M L I L Q I L R I M E S
2791
CAAATGAAACAATTGGAATTATCTTTAAACATGGTGATGATCTGCGCCAAGACATGCTTATTTTACAGATTCTACGAATCATG
GAGTCTA
ORF17 L H F L Q F * R * V H H H D A G L C A * K V S E V F * P T *
ORF25 F S V I P I I K L C P S S R R W S M S I K C I R R I M S D I
* I F C N S N D K F M T I I Q A L V H K N * L N * S D H L R

ORF5 F G R L N L W I Y A S C H M V A F Q L V T K * E * S R L * K
L G D * I F G S M P P A I W L H F N W * Q N R N D R D C E R
ORF13 I W E T E S L D L C L L P Y G C I S T G D K I G M I E I V K
2881
TTTGGGAGACTGAATCTTTGGATCTATGCCTCCTGCCATATGGTTGCATTTCAACTGGTGACAAAATAGGAATGATCGAGATT
GTGAAAG
ORF17 K P S Q I K P D I G G A M H N C K L Q H C F L F S R S Q S L
ORF25 Q S V S D K S R H R R G Y P Q M E V P S L I P I I S I T F S
N P L S F R Q I * A E Q W I T A N * S T V F Y S H D L N H F

T P R Q L P K F S K A Q W A T R E H L K M K S * I T G S K K
R H D N C Q N S A K H S G Q H G S I * R * S P E S L A Q R K
ORF13 D A T T I A K I Q Q S T V G N T G A F K D E V L N H W L K E
2971
ACGCCACGACAATTGCCAAAATTCAGCAAAGCACAGTGGGCAACACGGGAGCATTTAAAGATGAAGTCCTGAATCACTGGC
TCAAAGAAA
ORF17 R W S L Q W F E A F C L P C C P L M * L H L G S D S A * L F
A V V I A L I * C L V T P L V P A N L S S T R F * Q S L S F
V G R C N G F N L L A C H A V R S C K F I F D Q I V P E F F

N P L L K K S F R Q Q W R D L F I P V Q A T V W Q P L F L E
I P Y * R K V S G S S G E I C L F L C R L L C G N L C S W N
ORF13 K S P T E E K F Q A A V E R F V Y S C A G Y C V A T F V L G
3061
AATCCCCTACTGAAGAAAAGTTTCAGGCAGCAGTGGAGAGATTTGTTTATTCCTGTGCAGGCTACTGTGTGGCAACCTTTGTT
CTTGGAA
I G * Q L F T E P L L P S I Q K N R H L S S H P L R Q E Q F
D G V S S F N * A A T S L N T * E Q A P * Q T A V K T R P I
F G R S F F L K L C C H L S K N I G T C A V T H C G K N K S

* A T D T M T I L * S P R Q E T Y F I L T S G T F L G I T K
R R Q T Q * Q Y Y D H R D R K P I S Y * L R A H S W E L Q K
ORF13 I G D R H N D N I M I T E T G N L F H I D F G H I L G N Y K
3151
TAGGCGACAGACACAATGACAATATTATGATCACCGAGACAGGAAACCTATTTCATATTGACTTCGGGCACATTCTTGGGAA
TTACAAAA
L R C V C H C Y * S * R S L F G I E Y Q S R A C E Q S N C F
P S L C L S L I I I V S V P F R N * I S K P C M R P F * L L
Y A V S V I V I N H D G L C S V * K M N V E P V N K P I V F

V S W A L I K R E C H L C * P L T S S L * W E L L E R R Q A
F P G H * * R E S A I C A N P * L P L C D G N F W K E D K P

ORF13 S F L G I N K E R V P F V L T P D F L F V M G T S G K K T S
3241
GTTTCCTGGGCATTAATAAAGAGAGAGTGCCATTTGTGCTAACCCCTGACTTCCTCTTTGTGATGGGAACTTCTGGAAAGAAG
ACAAGCC
N G P C * Y L S L A M Q A L G Q S G R Q S P F K Q F S S L G
K R P M L L S L T G N T S V G S K R K T I P V E P F F V L G
T E Q A N I F L S H W K H * G R V E E K H H S S R S L L C A

H T S R N F R T S V L R L I * P F V I T Q T Y * S S C S P *
T L P E I S G H L C * G L S S P S S S H K P T D H P V L H D
ORF13 P H F Q K F Q D I C V K A Y L A L R H H T N L L I I L F S M
3331
CACACTTCCAGAAATTTCAGGACATCTGTGTTAAGGCTTATCTAGCCCTTCGTCATCACACAAACCTACTGATCATCCTGTTCT
CCATGA
V S G S I E P C R H * P K D L G E D D C L G V S * G T R W S
ORF24 C K W F N * S M Q T L A * R A R R * * V F R S I M R N E M I
W V E L F K L V D T N L S I * G K T M V C V * Q D D Q E G H

ORF6 C * * Q E C P S * Q A K K T L N I S G M P S Q W G K M R R M
A D D R N A P V N K Q R R H * I Y P G C P H S G E K * G G C
ORF13 M L M T G M P Q L T S K E D I E Y I R D A L T V G K N E E D
3421
TGCTGATGACAGGAATGCCCCAGTTAACAAGCAAAGAAGACATTGAATATATCCGGGATGCCCTCACAGTGGGGAAAAATG
AGGAGGATG
A S S L F A G T L L C L L C Q I Y G P H G * L P S F H P P H
ORF24 S I V P I G W N V L L S S M S Y I R S A R V T P F F S S S A
H Q H C S H G L * C A F F V N F I D P I G E C H P F I L L I

ORF6 L K S I F L I R S K F A E T K D G L C S L I G F Y I L F L A
* K V F S * S D R S L Q R Q R M D C A V * L V S T S C S W H
ORF13 A K K Y F L D Q I E V C R D K G W T V Q F N W F L H L V L G
3511
CTAAAAAGTATTTTCTTGATCAGATCGAAGTTTGCAGAGACAAAGGATGGACTGTGCAGTTTAATTGGTTTCTACATCTTGTT
CTTGGCA
* F T N E Q D S R L K C L C L I S Q A T * N T E V D Q E Q C
L F Y K R S * I S T Q L S L P H V T C N L Q N R C R T R P M
S F L I K K I L D F N A S V F S P S H L K I P K * M K N K A

ORF6 S N K E R N I Q P N T L G * N Q K Q V S V L W F K L A * Q S
Q T R R E T F S L I L * A R I K N K L V F Y G L N * H S N H
ORF13 I K Q G E K H S A * Y F R L E S K T S * C S M V * I S I A I
3601
TCAAACAAGGAGAGAAACATTCAGCCTAATACTTTAGGCTAGAATCAAAAACAAGTTAGTGTTCTATGGTTTAAATTAGCAT
AGCAATCA
* V L L S V N L R I S * A L I L F L N T N * P K F * C L L *
L C P S F C E A * Y K L S S D F V L * H E I T * I L M A I M
D F L S L F M * G L V K P * F * F C T L T R H N L N A Y C D

S N L D F K C N R H C E S W H F R S I A L F L P E L F P G E
R T W I S N A I D I V K A G I S E V * L F S Y L N S S L E K
I E L G F Q M Q * T L * K L A F Q K Y S S F P T * T L P W R
3691
TCGAACTTGGATTTCAAATGCAATAGACATTGTGAAAGCTGGCATTTCAGAAGTATAGCTCTTTTCCTACCTGAACTCTTCCC
TGGAGAA
R V Q I E F A I S M T F A P M E S T Y S K E * R F E E R S F
S S P N * I C Y V N H F S A N * F Y L E K G V Q V R G Q L F
D F K S K L H L L C Q S L Q C K L L I A R K R G S S K G P S

K M L A L L I V W L S N V Q C * D Y L Q V W F F L I C L W H
ORF11 R C W H C * L F G * A M S S A R I I C R F G F F S F V C G I
K D V G I A D C L V K Q C P V L G L F A G L V F S H L S V A
3781
AAGATGTTGGCATTGCTGATTGTTTGGTTAAGCAATGTCCAGTGCTAGGATTATTTGCAGGTTTGGTTTTTTCTCATTTGTCTG
TGGCAT
ORF16 L H Q C Q Q N N P * A I D L A L I I Q L N P K K E N T Q P M
S T P M A S Q K T L C H G T S P N N A P K T K E * K D T A N
F I N A N S I T Q N L L T W H * S * K C T Q N K R M Q R H C

W R I F S V * T D * * L P Y C P * Y F D Y L T I E C F W K F
ORF11 G E Y S R F K Q T N D F L I V P D I L T I L L L S A S G N S
L E N I L G L N R L M T S L L S L I F * L S Y Y * V L L E I
3871
TGGAGAATATTCTCGGTTTAAACAGACTAATGACTTCCTTATTGTCCCTGATATTTTGACTATCTTACTATTGAGTGCTTCTGG
AAATTC
ORF16 P S Y E R N L C V L S K R I T G S I K V I K S N L A E P F E

S F I R P K F L S I V E K N D R I N Q S D * * Q T S R S I R
Q L I N E T * V S * H S G * Q G Q Y K S * R V I S H K Q F N

F G I I D D I Y F H L G L V S I L V I F V F L K L F K E K D
ORF11 L E * L M T S I F I W V * S Q F W L S L C S S S S L K K K M
L W N N * * H L F S S G F S L N F G Y L C V P Q A L * R K R
3961
TTTGGAATAATTGATGACATCTATTTTCATCTGGGTTTAGTCTCAATTTTGGTTATCTTTGTGTTCCTCAAGCTCTTTAAAGAA
AAAGAT
ORF16 K S Y N I V D I K M Q T * D * N Q N D K H E E L E K F F F I
Q F L Q H C R N E D P N L R L K P * R Q T G * A R * L F L H
K P I I S S M * K * R P K T E I K T I K T N R L S K L S F S

V I V V T F V S F L K * C F Q T S P * C L Q V L V V C * K Q
* S L * P L S H S L N D A S K H L L S V C R C * W C A K S K
C N R C N L C L I P * M M L P N I S L V S A G V S G V L K A
4051
GTAATCGTTGTAACCTTTGTCTCATTCCTTAAATGATGCTTCCAAACATCTCCTTAGTGTCTGCAGGTGTTAGTGGTGTGCTAA
AAGCAA
Y D N Y G K D * E K F S A E L C R R L T Q L H * H H A L L L
L R Q L R Q R M G * I I S G F M E K T D A P T L P T S F A L
T I T T V K T E N R L H H K W V D G * H R C T N T T H * F C

G K R V S L F S V F C N S I L L S C I T E T H K H S R R N L
E S E L V F S V S F A I Q F F C H V * L R H T N T A G E I *
R K A S * S F Q C L L Q F N S F V M Y N * D T Q T Q Q E K S
4141
GGAAAGCGAGTTAGTCTTTTCAGTGTCTTTTGCAATTCAATTCTTTTGTCATGTATAACTGAGACACACAAACACAGCAGGAG
AAATCTA
ORF15 S L S N T K E T D K A I * N K Q * T Y S L C V F V A P S I *
F A L * D K * H R K C N L E K T M Y L Q S V C V C C S F D L
P F R T L R K L T K Q L E I R K D H I V S V C L C L L L F R

N R C A L T F L C W S C S R V M N M K K * R * D F L C Q L C
T V V P * P S S A G L V P G L * I * K N R D E T F C V N S V
K P L C L D L P L L V L F Q G Y E Y E K I E M R L F V S T L
4231
AACCGTTGTGCCTTGACCTTCCTCTGCTGGTCTTGTTCCAGGGTTATGAATATGAAAAAATAGAGATGAGACTTTTTGTGTCA
ACTCTGT
ORF15 V T T G Q G E E A P R T G P N H I H F F L S S V K Q T L E T
G N H R S R G R S T K N W P * S Y S F I S I L S K T D V R D
F R Q A K V K R Q Q D Q E L T I F I F F Y L H S K K H * S Q

P Q E * V I * Y D * Y S F L Q H G S R K * L Q G L F Y A * H
ORF12 H K S E L S S M I S I A F S S M A A G S N Y R A S F M P D I
S T R V S Y L V * L V * L S P A W Q Q E V T T G P L L C L T
4321
CCACAAGAGTGAGTTATCTAGTATGATTAGTATAGCTTTCTCCAGCATGGCAGCAGGAAGTAACTACAGGGCCTCTTTTATGC
CTGACAT
ORF15 W L L S N D L I I L I A K E L M A A P L L * L A E K I G S M
V L T L * R T H N T Y S E G A H C C S T V V P G R K H R V N
G C S H T I * Y S * Y L K R W C P L L F Y S C P R K * A Q C

F F P S F F P A S L F H Q L Q C S H N S L Q T C E I F K N T
ORF12 S S L P F S L P P F F I N C N A P T T L Y R L V K S S R T P
F L P F L F P C L P F S S I A M L P Q L F T D L * N L Q E H
4411
TTCTTCCCTTCCTTTTTCCCTGCCTCCCTTTTTCATCAATTGCAATGCTCCCACAACTCTTTACAGACTTGTGAAATCTTCAAGA
ACACC
E E R G K E R G G K K M L Q L A G V V R * L S T F D E L V G
R G K R K G Q R G K E D I A I S G C S K V S K H F R * S C R
K K G E K K G A E R K * * N C H E W L E K C V Q S I K L F V

F T L * L K N * L K N N Y F S R I I R I L G T Y L * R C L V
ORF12 L L Y N S K I S * K I I T S Q G L L E S * V L I C K D V * *
L Y S I T Q K L V E K * L L L K D Y * N L R Y L F V K M F S
4501
TTTACTCTATAACTCAAAAATTAGTTGAAAAATAATTACTTCTCAAGGATTATTAGAATCTTAGGTACTTATTTGTAAAGATG
TTTAGTG
K S * L E F I L Q F I I V E * P N N S D * T S I Q L S T * H
* E I V * F N T S F Y N S R L S * * F R L Y K N T F I N L S
K V R Y S L F * N F F L * K E L I I L I K P V * K Y L H K T

T F F S S I Y K G G R F * K I * I S F Q M P * F * T L A * T
L F F Q V S I K E A D S R K Y E L V S K C L N F K L W P E Q

D F F F K Y L * R R Q I L E N M N * F P N A L I L N F G L N
4591
ACTTTTTTTTCAAGTATCTATAAAGGAGGCAGATTCTAGAAAATATGAATTAGTTTCCAAATGCCTTAATTTTAAACTTTGGCC
TGAACA
S K K * T D I F S A S E L F Y S N T E L H R L K L S Q G S C
K K K L Y R Y L L C I R S F I F * N G F A K I K F K P R F L
V K K E L I * L P P L N * F I H I L K W I G * N * V K A Q V

V F S F S * W K K I F N I L K I F Q V R K N T T C L I H F P
F F L F L N G R R Y L I S * K Y S K L G R T L L A L S I S H
S F F F F L M E E D I * Y L K N I P S * E E H Y L P Y P F P
4681
GTTTTTTCTTTTTCTTAATGGAAGAAGATATTTAATATCTTAAAAATATTCCAAGTTAGGAAGAACACTACTTGCCTTATCCAT
TTCCCA
N K R K R L P L L Y K I D * F Y E L N P L V S S A K D M E W
K K K K K I S S S I * Y R L F I G L * S S C * K G * G N G M
T K E K E * H F F I N L I K F I N W T L F F V V Q R I W K G

F K G L L N F D T V L Q I S * K S L K Y L T L K I F S S L K
L K D F * T L T Q S F R F P E N P * N I L L * K Y F H L * N
I * R T F K L * H S P S D F L K I L E I S Y F K N I F I S E
4771
TTTAAAGGACTTTTAAACTTTGACACAGTCCTTCAGATTTCCTGAAAATCCTTGAAATATCTTACTTTAAAAATATTTTCATCT
CTGAAA
K F S K * V K V C D K L N G S F G Q F I K S * F Y K * R Q F
* L V K L S Q C L G E S K R F I R S I D * K L F I K M E S I
N L P S K F K S V T R * I E Q F D K F Y R V K F I N E D R F

Y L V I Y W R Y C L T L D R P L N Y L * N I L * L L * L I H
I S L F I G G I V * P * I D H * I I Y K I F C N Y C S * Y I
I S R Y L L E V L F N L R * T I K L F I K Y F V I T V A N T
4861
TATCTCGTTATTTATTGGAGGTATTGTTTAACCTTAGATAGACCATTAAATTATTTATAAAATATTTTGTAATTACTGTAGCTA
ATACAT
I E N N I P P I T * G * I S W * I I * L I N Q L * Q L * Y M
D R * K N S T N N L R L Y V M L N N I F Y K T I V T A L V N
Y R T I * Q L Y Q K V K S L G N F * K Y F I K Y N S Y S I C

Y I E K T M L T V S L F K Y N Q I * I Y N L I F * F * K I D
T * K K L C * Q C L C L S I I R Y K Y I T * F F N F K K * I
L H R K N Y V N S V S V * V * S D I N I * L N F L I L K N R
4951
TACATAGAAAAAACTATGTTAACAGTGTCTCTGTTTAAGTATAATCAGATATAAATATATAACTTAATTTTTTAATTTTAAAA
AATAGAT
V Y F F S H * C H R Q K L I I L Y L Y I V * N K L K L F Y I
C L F F * T L L T E T * T Y D S I F I Y S L K K I K F F L Y
* M S F V I N V T D R N L Y L * I Y I Y L K I K * N * F I S

ORF7 T C L T L R * S R P F S F F F F L M C A K A Q R F L S L A A
P V * L * G S P G L F L F F F F * C V Q K P K G S * A W L Q
Y L F D F E V V Q A F F F F F F F N V C K S P K V P K P G C
5041
ACCTGTTTGACTTTGAGGTAGTCCAGGCCTTTTTCTTTTTTTTTTTTTTTAATGTGTGCAAAAGCCCAAAGGTTCCTAAGCCTG
GCTGCA
G T Q S Q P L G P R K R K K K K * H T C F G L P E * A Q S C
R N S K S T T W A K K K K K K K L T H L L G F T G L G P Q L
V Q K V K L Y D L G K E K K K K K I H A F A W L N R L R A A

ORF7 K K N Q Q G H F L K T L L S A W G N T V R L H L L K K K L A
R R I N R D T F * K H S Y Q P G A T Q * D S I S * K K N * L
K E E S T G T L F K N T L I S L G Q H S E T P S L K K K I S
5131
AAGAAGAATCAACAGGGACACTTTTTAAAAACACTCTTATCAGCCTGGGGCAACACAGTGAGACTCCATCTCTTAAAAAAAA
AATTAGCT
L L I L L S V K * F C E * * G P A V C H S E M E * F F F * S
S S D V P V S K L F V R I L R P C C L S V G D R L F F I L Q
F F F * C P C K K F V S K D A Q P L V T L S W R K F F F N A

ORF7 G Y S G M C L * S Q V L R R L R Q E D C L S P G G G N C R E
G I V V C A C S P R Y S G G * G R R I A * A Q E V E T A E S
W V * W Y V P V V P G T Q E A E A G G L P E P R R W K L Q R
5221
GGGTATAGTGGTATGTGCCTGTAGTCCCAGGTACTCAGGAGGCTGAGGCAGGAGGATTGCCTGAGCCCAGGAGGTGGAAAC
TGCAGAGAG
P I T T H A Q L G L Y E P P Q P L L I A Q A W S T S V A S L

T Y H Y T G T T G P V * S A S A P P N G S G L L H F S C L T
P Y L P I H R Y D W T S L L S L C S S Q R L G P P P F Q L S

S * S C P Y T P A W I T E R D P V S K K K K K K K K K L E
H D H V L T L Q P G * Q S E T L S Q K K K K K K K K N S
ORF14 V M I M S L H S S L D N R A R P C L K K K K K K K K K T R
5311
TCATGATCATGTCCTTACACTCCAGCCTGGATAACAGAGCGAGACCCTGTCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAA
CTCGAG
* S * T R V S W G P Y C L S V R D * F F F F F F F F F E L
M I M D K C E L R S L L A L G Q R L F F F F F F F F V R
D H D H G * V G A Q I V S R S G T E F F F F F F F F F S S
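
The layout of the six-frame translation above (three forward frames printed over each nucleotide block, three reverse-strand frames printed under it) can be reproduced with a short script. This is a minimal sketch that assumes the standard genetic code; the function names `reverse_complement`, `translate` and `six_frame` are hypothetical and are not taken from ORF Finder. Reverse-strand frames are returned in 5'→3' order of the reverse strand, so they read right to left relative to the listing above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq, offset=0):
    """Translate seq from the given offset; '*' marks stop, 'X' unknown codons."""
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame(seq):
    """Return translations for frames +1, +2, +3 and -1, -2, -3."""
    seq = seq.upper()
    rev = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq, offset)
        frames[f"-{offset + 1}"] = translate(rev, offset)
    return frames


if __name__ == "__main__":
    # First 30 nucleotides of the sequence listed above.
    for name, protein in six_frame("GAATTCGGCACGAGCACTTCCTTCTCGGCT").items():
        print(name, protein)
```

For the first 30 nucleotides, the three forward frames begin EFGTSTSFSA, NSARALPSR and IRHEHFLLG, which agrees with the first block of the listing.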

DOCKING RESULTS

Ligand Binding Affinity rmsd/ub rmsd/lb
3ml9_prepared_3950_uff_E=164.64 -5.8 0 0
3ml9_prepared_3950_uff_E=164.64 -5.8 4.883 3.769

3ml9_prepared_3950_uff_E=164.64 -5.8 30.804 28.836
3ml9_prepared_3950_uff_E=164.64 -5.6 24.245 22.999
3ml9_prepared_3950_uff_E=164.64 -5.5 31.184 29.176
3ml9_prepared_3950_uff_E=164.64 -5.5 21.277 19.729
3ml9_prepared_3950_uff_E=164.64 -5.4 28.934 27.751
3ml9_prepared_3950_uff_E=164.64 -5.3 22.377 20.903
3ml9_prepared_3950_uff_E=164.64 -5.3 43.24 41.484
3ml9_prepared_3973_uff_E=438.90 -10 0 0
3ml9_prepared_3973_uff_E=438.90 -9.2 29.899 28.642
3ml9_prepared_3973_uff_E=438.90 -8.8 18.012 14.95
3ml9_prepared_3973_uff_E=438.90 -8.3 38.387 36.269
3ml9_prepared_3973_uff_E=438.90 -8.3 17.913 15.301
3ml9_prepared_3973_uff_E=438.90 -8.3 19.114 16.133
3ml9_prepared_3973_uff_E=438.90 -8.2 18.336 14.718
3ml9_prepared_3973_uff_E=438.90 -8.2 5.639 3.604
3ml9_prepared_3973_uff_E=438.90 -8 26.161 24.608
3ml9_prepared_312145_uff_E=785.80 -7.3 0 0
3ml9_prepared_312145_uff_E=785.80 -7.3 42.115 39.444
3ml9_prepared_312145_uff_E=785.80 -7.2 41.892 39.196
3ml9_prepared_312145_uff_E=785.80 -7.2 5.293 2.238
3ml9_prepared_312145_uff_E=785.80 -7.1 5.283 2.335
3ml9_prepared_312145_uff_E=785.80 -7.1 43.293 40.313
3ml9_prepared_312145_uff_E=785.80 -7.1 41.504 39.096
3ml9_prepared_312145_uff_E=785.80 -7.1 47.246 44.004
3ml9_prepared_312145_uff_E=785.80 -7 6.788 2.418
3ml9_prepared_5394_uff_E=324.48 -6.7 0 0
3ml9_prepared_5394_uff_E=324.48 -6.5 37.896 36.469
3ml9_prepared_5394_uff_E=324.48 -6.4 22.225 20.415
3ml9_prepared_5394_uff_E=324.48 -6.4 5.374 1.891
3ml9_prepared_5394_uff_E=324.48 -6.3 40.646 39.6
3ml9_prepared_5394_uff_E=324.48 -6.1 50.451 48.78
3ml9_prepared_5394_uff_E=324.48 -6 40.507 39.271
3ml9_prepared_5394_uff_E=324.48 -6 42.073 40.498
3ml9_prepared_5394_uff_E=324.48 -6 40.379 39.299
3ml9_prepared_11282283_uff_E=799.46 -9.7 0 0
3ml9_prepared_11282283_uff_E=799.46 -8.9 40.833 36.393
3ml9_prepared_11282283_uff_E=799.46 -8.9 27.006 25.922
3ml9_prepared_11282283_uff_E=799.46 -8.9 5.68 3.939
3ml9_prepared_11282283_uff_E=799.46 -8.7 28.656 26.464
3ml9_prepared_11282283_uff_E=799.46 -8.4 47.685 43.799
3ml9_prepared_11282283_uff_E=799.46 -8.4 40.204 36.013
3ml9_prepared_11282283_uff_E=799.46 -8.3 44.875 41.308
3ml9_prepared_11282283_uff_E=799.46 -8.2 45.496 43.32
3ml9_prepared_11977753_uff_E=934.52 -11 0 0
3ml9_prepared_11977753_uff_E=934.52 -10.4 2.395 1.761
3ml9_prepared_11977753_uff_E=934.52 -10.2 8.14 2.982
3ml9_prepared_11977753_uff_E=934.52 -10 3.669 2.443
3ml9_prepared_11977753_uff_E=934.52 -9.5 9.215 4.915
3ml9_prepared_11977753_uff_E=934.52 -9.5 7.14 2.5
3ml9_prepared_11977753_uff_E=934.52 -9.3 9.157 4.744
3ml9_prepared_11977753_uff_E=934.52 -8.9 42.153 39.119
3ml9_prepared_11977753_uff_E=934.52 -8.6 28.857 26.973
3ml9_prepared_25033539_uff_E=397.64 -8.7 0 0
3ml9_prepared_25033539_uff_E=397.64 -8.5 49.058 46.081
3ml9_prepared_25033539_uff_E=397.64 -8.4 2.089 1.745
3ml9_prepared_25033539_uff_E=397.64 -8.1 6.062 2.669
3ml9_prepared_25033539_uff_E=397.64 -7.9 36.938 33.807
3ml9_prepared_25033539_uff_E=397.64 -7.9 7.595 3.501
3ml9_prepared_25033539_uff_E=397.64 -7.6 47.831 46.419
3ml9_prepared_25033539_uff_E=397.64 -7.6 46.016 44.669
3ml9_prepared_25033539_uff_E=397.64 -7.6 23.492 20.217
3ml9_prepared_Icotinib_uff_E=513.14 -10.6 0 0
3ml9_prepared_Icotinib_uff_E=513.14 -9.9 2.721 2.523
3ml9_prepared_Icotinib_uff_E=513.14 -9.4 3.757 3.404
3ml9_prepared_Icotinib_uff_E=513.14 -9.3 8.318 3.965
3ml9_prepared_Icotinib_uff_E=513.14 -8.9 8.018 5.076
3ml9_prepared_Icotinib_uff_E=513.14 -8.9 7.362 3.64
3ml9_prepared_Icotinib_uff_E=513.14 -8.9 8.316 3.657
3ml9_prepared_Icotinib_uff_E=513.14 -8.8 13.917 11.897
3ml9_prepared_Icotinib_uff_E=513.14 -8.8 29.867 27.412

Rows highlighted in orange show the best binding affinity (the most negative values).

Rows highlighted in yellow show comparatively lower binding affinity.
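
The ranking implied by the table can be reproduced with a short script. The sketch below is not part of the PyRx workflow used in this report; it assumes each row has four whitespace-separated fields (ligand name, binding affinity, rmsd/ub, rmsd/lb) and that ligand names follow the `<receptor>_prepared_<ligand>_uff_E=<energy>` pattern seen in the table. The function names `ligand_id` and `rank_ligands` are illustrative.

```python
# A few rows copied from the docking table (best pose of each ligand plus extras).
SAMPLE = """
3ml9_prepared_3950_uff_E=164.64 -5.8 0 0
3ml9_prepared_3973_uff_E=438.90 -10 0 0
3ml9_prepared_3973_uff_E=438.90 -9.2 29.899 28.642
3ml9_prepared_312145_uff_E=785.80 -7.3 0 0
3ml9_prepared_5394_uff_E=324.48 -6.7 0 0
3ml9_prepared_11282283_uff_E=799.46 -9.7 0 0
3ml9_prepared_11977753_uff_E=934.52 -11 0 0
3ml9_prepared_11977753_uff_E=934.52 -10.4 2.395 1.761
3ml9_prepared_25033539_uff_E=397.64 -8.7 0 0
3ml9_prepared_Icotinib_uff_E=513.14 -10.6 0 0
"""


def ligand_id(name):
    """Extract the ligand part of a row name (assumed naming pattern)."""
    parts = name.split("_")
    return parts[2] if len(parts) >= 3 else name


def rank_ligands(text):
    """Return (ligand, best affinity) pairs, most negative affinity first."""
    best = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank lines, headers and malformed rows
        try:
            affinity = float(fields[1])
        except ValueError:
            continue
        lig = ligand_id(fields[0])
        # A more negative binding affinity indicates a more favourable pose.
        if lig not in best or affinity < best[lig]:
            best[lig] = affinity
    return sorted(best.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for lig, affinity in rank_ligands(SAMPLE):
        print(f"{lig:>10}  {affinity:6.1f}")
```

On these rows the two top-ranked ligands are 11977753 and Icotinib, which matches the ligands selected for visualization in the figures that follow.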

Molecular Docking results with PyRx

Fig:- PyRx Molecular Docking:- Image showing binding of all 8 selected ligand molecules,
in all their different conformations, with the target protein.

Fig:- PyRx Molecular Docking:- Image showing binding of ligands 11977753 and Icotinib
in all their different conformations.

Fig:- PyRx Molecular Docking:- Image showing binding of ligands 11977753 and Icotinib
in their best conformations, i.e. those with the highest binding affinity. Both ligands
show good results with the target protein.

Fig:- PyRx Molecular Docking:- Image showing binding of ligand 11977753 in its best
conformation, which shows the highest binding affinity. Ligand 11977753 has the highest
binding affinity of all the ligand molecules tested.

Protein visualization results with ligand Dactolisib:-

Fig:- Dactolisib docked ligand:- This ligand shows the best binding affinity with the target protein.

Before Docking

After Docking

After Docking(2-D Structure)

Fig:- Ligand Structure(Dactolisib)

Fig:- 2D diagram with Dactolisib

Protein visualization results with ligand Icotinib:-

Fig:- Icotinib docked ligand:- This ligand shows good binding affinity with the target
protein, although lower than that of Dactolisib.

Before Docking

After Docking

After Docking(2-D Structure)

Fig:- Ligand Structure(Icotinib)

Fig:- 2D diagram with Icotinib

References:

1. Discovery of the highly potent PI3K/mTOR dual inhibitor PF-04691502 through structure
based drug design by Hengmiao Cheng, Shubha Bagrodia, Simon Bailey, Martin Edwards,
Jacqui Hoffman, Qiyue Hu, Robert Kania, Daniel R. Knighton, Matthew A. Marx,
Sacha Ninkovic, Shaoxian Su and Eric Zhang

2. mTOR mediated anti-cancer drug discovery by Qingsong Liu, Carson Thoreen, Jinhua
Wang, David Sabatini, Nathanael S. Gray

3. Tp53 gene therapy: a key to modulating resistance to anticancer therapies? by Esther H.
Chang, Kathleen F. Pirollo and Kerrie B. Bouker

4. Wnt/beta-catenin and PI3K/Akt/mTOR Signaling Pathways in Glioblastoma: Two Main
Targets for Drug Design: A Review by Seyed Hossein Shahcheraghi, Venant Tchokonte-Nan,
Marzieh Lotfi, Malihe Lotfi, Ahmad Ghorbani and Hamid Reza Sadeghnia

5. Identification of Potent VEGF Inhibitors for the Clinical Treatment of Glioblastoma, A
Virtual Screening Approach by Mohini Yadav, Ravina Khandelwal, Urvy Mudgal, Sivaraj
Srinitha, Natasha Khandekar, Anuraj Nayarisseri, Sugunakar Vuree, Sanjeev Kumar Singh

6. Molecular targeted therapy of glioblastoma by Emilie Le Rhun, Matthias Preusser,
Patrick Roth, David A. Reardon, Martin van den Bent, Patrick Wen, Guido
Reifenberger, Michael Weller

7. Structure-Based Drug Design and Synthesis of PI3Kα-Selective Inhibitor (PF-06843195)
by Hengmiao Cheng, Suvi T. M. Orr, Simon Bailey, Alexei Brooun, Ping Chen, Judith G.
Deal, Yali L. Deng et al.

8. Exploiting the PI3K/AKT pathway for cancer drug discovery by
Bryan T. Hennessy, Debra L. Smith, Prahlad T. Ram, Yiling Lu and Gordon B. Mills

9. Das, Pratik & Saha, Puja & Abdul, A. (2017). A REVIEW ON COMPUTER AIDED
DRUG DESIGN IN DRUG DISCOVERY. World Journal of Pharmacy and Pharmaceutical
Sciences. 10.20959/wjpps20177-9450.

10. Yu W, MacKerell AD Jr. Computer-Aided Drug Design Methods. Methods Mol Biol.
2017;1520:85-106. doi: 10.1007/978-1-4939-6634-9_5. PMID: 27873247; PMCID:
PMC5248982.

11. Surabhi, Surabhi & Singh, BK. (2018). COMPUTER AIDED DRUG DESIGN: AN
OVERVIEW. Journal of Drug Delivery and Therapeutics. 8. 504-509.
10.22270/jddt.v8i5.1894.

THE END
