
Discovery of Mature MicroRNA Sequences within the Protein-Coding Regions of Global HIV-1 Genomes:
Predictions of Novel Mechanisms for Viral Infection and Pathogenicity

By
Bryan Holland

A Dissertation Presented to the


FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(MOLECULAR BIOLOGY)

May 2017

Copyright 2017 Bryan Holland


Dedication

To my girls, Amber, Layla, and Zooey


Who show their love in ways both big and small

And who, each in their own way, inspire me more than they could ever know.

Acknowledgements

First and foremost, I would like to thank Dr. Suraiya Rasheed. I feel extremely fortunate to have
had Dr. Rasheed as a mentor and adviser. Dr. Rasheed and I have now had a very long and
interesting road together, and I can’t thank her enough for giving me the chance to complete my
Ph.D. at USC.

Dr. Rasheed took many chances with me. She accepted me into her lab from another
department, on another campus. She was extremely patient with me while I tried to get up to
speed on the new material in her field. When my career path took an unexpected turn, she was
accepting and supportive. When I asked to return again to her lab after many years, she again
took a chance by accepting me back into her lab to continue and ultimately finish my graduate
work.
Dr. Rasheed has always been available to me whenever I needed help or had a question to ask.
Her commitment to teaching and to students, not only me, is extraordinary. Her integrity and
ethics are among the highest that I have ever had the pleasure to work with. Finally, for her
willingness to see a long-time graduate student through to his goal, I am extremely and eternally
grateful.
I would also like to extend my gratitude to my liaison adviser Dr. Bob Baker. Dr. Baker has also
been on this extended journey with me. I first met Dr. Baker as my professor in my first
undergraduate molecular biology course, and his teaching and instruction inspired me to
continue on in the field of molecular biology. Dr. Baker served as my first adviser and has
continued to be on my committee for my entire tenure at USC. Dr. Baker has championed my
efforts to others and put in a good word for me when one was needed. His support and
encouragement have helped me immeasurably to remain persistent in my pursuit of my graduate
degree. I feel very thankful and grateful to have had such a supportive adviser, and friend, as Dr.
Baker.
In addition, I would like to thank Dr. Carolyn Phillips for her help in serving on my committee. I
first met Dr. Phillips in 2015 right after she came to USC, and ever since our first meeting she
has always been extremely receptive and always willing to help. When it came time for me to
add another member to my committee, she agreed without hesitation and for that I am extremely
grateful.
I would furthermore like to thank Dr. David Hinton for serving on my committee and spending
his valuable time in my committee process.

Also, my sincerest of thanks goes to collaborator Meng Li, working with Yibu Chen in
Bioinformatics Services in the Norris Medical Library. Meng has been an immense help over
the years. She has provided valuable insight and technical expertise that has helped my project
move forward. She has always had an answer and a smile for me and for those that she has
helped along the way. A huge thanks also goes to former graduate student and friend Michael
Philips who has helped me often along the way with his advice and experience.
I would like to acknowledge the Department of Molecular and Computational Biology for their
enduring support over the years. The professors and faculty have helped me academically
without hesitation time and again. Additionally, I would like to thank all the great people on the
administrative side of the department who have helped me to navigate the ‘ins and outs’ of the
graduate school environment. Thanks go to Doug Burleson and Hayley Peltz. Also, I would like
to extend a special thank you to Linda Bazilian, who not only has supported me along the way and
been a great and longtime friend, but who has been instrumental in facilitating my return and
acceptance back into USC.
Additionally, I would like to thank the Department of Pathology for providing me a graduate
student position at the Health Sciences Campus, and also to thank Lisa Doumak in the front
office for always being so helpful with any administrative needs.
Finally, I owe a great debt of gratitude to my family, who has always supported me and never
questioned why I went back to school in the first place, but instead convinced and encouraged
me to complete the work that I started so long ago. My love to you all and I can never thank you
enough.

Table of Contents

Dedication
Acknowledgements
List of Tables
List of Figures
Abstract

1 Introduction
  1.1 Overview
    The Functional Significance of miRNAs in Biology and Virology
  1.2 History
  1.3 miRNA Biogenesis
  1.4 Mechanisms of miRNA Regulation
  1.5 Principles of miRNA-mRNA Interactions
  1.6 Viral miRNAs
  1.7 miRNA vs. miR-like

2 Identification of Human MicroRNA-Like Sequences Embedded within the Protein-Encoding Genes of the Human Immunodeficiency Virus
  2.1 Abstract
  2.2 Introduction
  2.3 Results
    Identification of Cellular microRNA Sequences Associated with Dysregulated Proteins in HIV-Infected Cells
    Discovery of miRNA-195 homologues in the HIV-ENV Gene
    Identification of HIV-ENV Gene Homology Domains in other Cellular microRNAs
    Identification of miRNA-Homology Domains in other Regions of HIV genomes
    Clustal Analyses and Mapping of miRNA-like Sequences
    The miRNA-like sequences are not a classical human cellular miRNA
  2.4 Discussion
    There are other viral miRNAs and target sites
    Possible biological functions of miRNAs discovered
    Our findings are not cellular miRNA sequences but truncated versions
    These viral miRNA-like sequences may mimic their cellular miRNA counterparts
    The viral miRNA-like sequences are from coding regions, and from V1-V5
  2.5 Materials and Methods

3 High-throughput Analysis of Global HIV-1 Sequences Identifies and Maps Full-Length Mature MicroRNAs within HIV-1 Genomes
  3.1 Abstract
  3.2 Introduction
    Background
    Previous Research
  3.3 Results
    Identifying MicroRNA Sequences Embedded in the HIV Genome
    Discovery of 15 Novel microRNA Sequences in Global HIV Genomes
    Relationship Between Viral miRNA Sequences and Cellular miRNAs
    Distribution of the 15 miRNAs in Global HIV Isolates by Clade
    Mapping viral miRNA Sequences to HIV Genomic Regions
    Graphic Representation of miRNAs present in HIV-1 Genomes
    Taking a Closer Look at Env
  3.4 Discussion
    The viral miRNAs we have identified are distinct from other reported viral miRNAs
    These miRNAs may be virally encoded cellular homologues
    These miRNAs can affect transcription as well as translation
    Other ‘Mir-like’ Research
    Exosomes
    Role of HIV-associated miRNAs as tumor suppressors
  3.5 Conclusion
  3.6 Materials and Methods

4 Discovery of a Mature MicroRNA Sequence Which Encodes a Highly Conserved Furin-Binding Site Within Global HIV-1 Envelope gp160 Proteins: Prediction of Novel MicroRNA-Regulatory Functions
  4.1 Introduction
  4.2 Results
    miR-4644 has Multiple Matches within the Envelope Region
    Phylogenetic Analysis of miR-4644 matching HIV Isolates
    miR-4644 – Sequence Alignment and Mapping to HIV gp160
    Targets of viral miRNAs Identified – Furin as a possible target
    FURIN: 3’UTR has multiple target sites for 4 of our identified miRNAs
    3’UTR has conserved sites for miR-4644
    Targets of viral miRNAs Identified – PAPPA as a possible target
    PAPPA: 3’UTR has multiple target sites for 5 of our identified miRNAs
    3’UTR has conserved sites for miR-195 and miR-4644
  4.3 Discussion
    miR-4644
    Envelope Expression and Processing
    Env Gene Expression
    Furin Processing
    REKR is a highly conserved motif in HIV-1
    34 Matching Strains have a synonymous substitution at the nucleotide level
    Searching for the ‘REKR’ amino acid motif in other miRNAs – ‘translating’ miR-4644
    Proposed Regulatory Pathway for miR-4644, Furin, and gp160
  4.4 Conclusion
    Our Proposed Generalized Regulatory Pathway
  4.5 Materials and Methods

5 Discovery of Mature Cellular miR-6763 Sequence within the LTR of Several HIV-1 Isolates Represents a Duplicated Form of the Sp1 Transcription Factor Binding Site
  5.1 Introduction
  5.2 Background
    miR-6763
    Sp1
    Sp1 Sites in the HIV-LTR
  5.3 Results
    miR-6763, Identified in HIV Genomic Sequences
    Alignment of Matching Strains Shows Insertions of Sp1 Binding Site
    Duplication of the Sp1-I Site is Unique to these three HIV Isolates
    miR-6763 is Predicted to Target CD4 in silico
  5.4 Discussion
    There is a duplication event of the Sp1-I binding site
    Only three HIV strains have this duplication
    HIV can therefore duplicate a transcription factor binding site precisely
    There may be a selective viral advantage to this duplication
    This duplication has not been previously described
    The duplication event ‘creates’ a cellular miRNA mimic in the HIV genome
    The viral miR-6763 sequence is predicted to downregulate CD4, aiding viral replication

6 Identification of Human MicroRNA Sequences Within HIV-1 Genomes Offers Novel Mechanism for CD4 T-Cell Receptor Downregulation
  6.1 Abstract
  6.2 Introduction
    CD4 Receptor
    CD4 Receptor is the primary receptor for HIV during infection of T-cells
    CD4 requires a secondary receptor
    Background on CD4 downregulation
    Nef
    Vpu
    Viral Enhancement by CD4 Sequestration
  6.2 Results
    Target Sites in CD4 for miR-195, miR-4644, and miR-6763
    Identification of other miRNA targets in CD4 mRNA
  6.3 Discussion

References

List of Tables

Table 2.1: MicroRNA Sequences Used For Analysis
Table 2.2: Sequence Similarity Between Human miR-195 and HIV Envelope Genes
Table 2.3: Sequence Similarity Between Three Mature Human microRNAs and HIV Envelope Genes
Table 2.4: Sequence Similarity Between Full Length Human microRNAs and HIV Genes
Table 2.5: Sequence Homology Domains in the V5 Regions of HIV Envelope
Table 2.6: MicroRNA-like Sequence Alignments with Cellular Genome Sequences

Table 3.1: Identification of 15 microRNAs Embedded in HIV Genomes from Various Global Regions
Table 3.2: Specificity of Sequence Matches Between MicroRNA and HIV Encoding Genes
Table 3.3: Software for Filtering miRNA Data by Species

Table 4.1: Human miR-4644 Sequences Embedded within the HIV Envelope gp120 Regions of HIV Strains Isolated Globally
Table 4.2: miR-4644 Significant Matches and Alignments
Table 4.3: Viral Proprotein Furin Cleavage Sites
Table 4.4: Potential Translation Products of miRNAs Identified

List of Figures

Figure 1.1: microRNA Biogenesis
Figure 1.2: Mechanisms of miRNA Regulation
Figure 1.3: Principles of microRNA-mRNA Interactions

Figure 2.1: Phylogenetic Tree of miR-195-like sequences in different HIV clades
Figure 2.2: Location of hsa-miR-195-like sequence in HXB2 Env Gene
Figure 2.3: Localization of miR-like sequences in HIV envelope variable regions

Figure 3.1: Workflow for Identifying microRNA Sequences Embedded within the HIV Genome
Figure 3.2: Graphic Representation of Matching Positions Between Cellular Mature miRNAs and HIV Isolates
Figure 3.3: Graphic Distribution of microRNA Sequences in Different HIV Clades Isolated Globally
Figure 3.4: Localization of miRNA Sequences in HIV Genomes
Figure 3.5: Mapping of Mature microRNA Positions and miR-like Sequences in the HIV Genome
Figure 3.6: MicroRNA Sequences Localized to the Envelope Region of the HIV Genome

Figure 4.1: Phylogenetic Tree of miR-4644 Sequences in Different HIV Clades
Figure 4.2: miR-4644 Maps to the gp120-gp41 Junction in the HIV Genome of 34 Isolates
Figure 4.3: Alignment of HIV Isolate Sequences Matching Human miR-4644
Figure 4.4: Target Sites in the 3’UTR of Furin, a gp160 Protease
Figure 4.5: miR-4644 has Multiple Conserved Target Sites in the Furin 3’UTR
Figure 4.6: Pairing of miR-4644 Seed with Furin 3’UTR
Figure 4.7: miR-4644 Pairing Interactions with the Furin mRNA 3’UTR Target Site
Figure 4.8: miR-4644 Targets Furin mRNA by Base Pairing within the 3’UTR
Figure 4.9: Target sites in the 3’UTR of PAPPA, an Embryonic, Down-Regulated Protein
Figure 4.10: miR-195 has Multiple Conserved Target Sites in the PAPPA 3’UTR
Figure 4.11: miR-195 Base Pairs with PAPPA 3’UTR via Seed and Other Interactions
Figure 4.12: miR-4644 has Multiple Conserved Target Sites in the PAPPA 3’UTR
Figure 4.13: Pairing of miR-4644 Seed with PAPPA 3’UTR
Figure 4.14: miR-4644 Base Pairs with PAPPA 3’UTR via Seed and Other Interactions
Figure 4.15: HIV-1 Envelope Glycoprotein
Figure 4.16: gp160 Expression and Cleavage by Furin
Figure 4.17: Furin Cleaves gp160 at Conserved R-E-K-R Amino Acid Motif
Figure 4.18: miRNA ‘Y’ Regulates Gene ‘X’, and is Transcribed from a Different Promoter
Figure 4.19: miRNA ‘Y’ Regulates Gene ‘X’, and is Co-Transcribed from the Same Promoter
Figure 4.20: miRNA ‘Y’ Regulates Gene ‘X’, is Transcribed and Spliced from Within Gene ‘X’
Figure 4.21: Expression of miR-4644 may Downregulate Furin in a Self-Regulatory Loop and Contribute to HIV Latency
Figure 4.22: miR-4644 / Furin / gp160 Proposed Regulatory Pathway
Figure 4.23: Our Generalized Regulatory Pathway

Figure 5.1: Sp1 Transcription Factor
Figure 5.2: Three HIV Isolates Contain Cellular miR-6763 Sequence
Figure 5.3: Alignment of Target Strains Reveals Sp1-I Duplications
Figure 5.4: Sp1 Duplication as a miR-6763 Site
Figure 5.5: Target Sites for miR-6763 in the 3’UTR of CD4

Figure 6.1: miR-195 Target Sites in the CD4 3’UTR
Figure 6.2: miR-4644 Target Sites in the CD4 3’UTR
Figure 6.3: miR-6763 Target Sites in the CD4 3’UTR
Figure 6.4: HIV-Encoded vmiRNA Target Sites in the CD4 3’UTR
Figure 6.5: Multiple Target Sites in the 3’UTR of CD4

Abstract

MicroRNAs (miRNAs) are non-coding RNAs, 17-24 nucleotides (nt) long, which regulate gene expression by
direct base pairing with the 3’ UTR of their target mRNAs. This form of post-transcriptional
regulation between miRNAs and their targets is quickly being recognized as a
central regulatory mechanism necessary for most normal cellular functions. In fact, this
regulation is essential from the development of the embryo to differentiation of numerous cell
types, and from maintenance of health to the development of cancer and other diseases by
dysregulated miRNAs. Understanding the complexities of miRNA-mRNA interactions at
different stages of growth and development is therefore critical for delineating pathogenic
pathways of various human diseases, and for the development of disease-specific therapeutics.
Chapter One of this dissertation presents an introduction to miRNAs and a short synopsis of the
major topics associated with miRNAs such as the history of miRNA discovery, miRNA
biogenesis, mechanisms by which miRNAs regulate their targets, and the basic principles of
miRNA-mRNA interactions. Also presented are short discussions on the concept of viral
miRNAs and an explanation of the terminologies used such as miRNA and miRNA-like.
The next chapter introduces our first report on the discovery of novel miRNA-like sequences in
the genomes of several HIV-1 isolates and is presented exactly as published. The work
presented in this thesis builds on proteomic profiling of HIV infection by Dr. Rasheed’s
laboratory, which indicated that HIV-1 infection alone can dysregulate hundreds
of proteins during chronic viral replication. Since microRNAs are directly involved in
regulating protein expression, we proposed to search for viral miRNAs using in silico methods.
Using metagenomics tools we were able to identify 8 cellular microRNAs which are predicted to
bind and regulate the mRNAs of multiple proteins that were dysregulated by experimental HIV
infection of CD4+ T-cells in vitro. Subsequent analysis of these miRNAs revealed that one
cellular miRNA, miR-195, showed near perfect homology with five HIV-1 genomes from South
Africa and four additional miRNAs showed significant homology with other African strains. We
propose that these miRNA sequences may have evolved to self-regulate survival of the virus
within its host by evading innate immune responses.
In Chapter Three we take this idea further by comparing all known human mature miRNA sequences
with all known genomic and subgenomic HIV sequences contained in global databases
(approximately n = 400,000), using current bioinformatics and high-throughput analyses.
This examination resulted in the discovery of 15
mature human microRNAs within the protein-coding regions of 20 distinct HIV strains. While
most of these miRNAs were clustered within the env region of HIV-1 genomes, several miRNAs
were present in the gag, pol, nef, and LTR viral regions of multiple HIV-1 strains. Furthermore,
most of these viral miRNA sequences exhibited 100% homology to their cellular
miRNA counterparts, while others had minor mutations. Of particular interest in this thesis is the
key finding of human miR-4644 in 34 distinct HIV isolates from different regions of the globe.
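
As a rough illustration of the kind of comparison summarized above, the following minimal sketch scans a genome sequence for a mature miRNA sequence while allowing a small number of mismatches. It is not the pipeline used in this work: the function name, the mismatch threshold, and the toy sequences are assumptions made purely for illustration.

```python
def find_mirna_matches(mirna_rna, genome_dna, max_mismatches=0):
    """Return (start, mismatches) for every window of genome_dna that matches
    the miRNA, written as DNA, with at most max_mismatches substitutions."""
    query = mirna_rna.upper().replace("U", "T")  # miRNAs are listed as RNA
    genome = genome_dna.upper()
    hits = []
    for start in range(len(genome) - len(query) + 1):
        window = genome[start:start + len(query)]
        mismatches = sum(a != b for a, b in zip(query, window))
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Hypothetical toy sequences, not real miRNA or HIV data.
toy_mirna = "AUGGCUACGUUA"
toy_genome = "CCGGATGGCTACGTTAGGGATGGCTTCGTTACC"

exact_hits = find_mirna_matches(toy_mirna, toy_genome)
near_hits = find_mirna_matches(toy_mirna, toy_genome, max_mismatches=1)
```

With `max_mismatches=0` only perfect matches are reported; raising the threshold also reports sequences that carry minor mutations relative to the cellular miRNA.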

Chapter Four presents an in-depth investigation into the nature of miR-4644 and its
matching HIV-1 isolates. These miR-4644 sequences were localized to exactly the same position
in each of the 34 strains, and that position is the gp120-gp41 junction of the env
gene. This viral region, which encodes a particular highly conserved amino acid motif, is also
recognized as the binding site for furin, the cellular protease which catalyzes the eventual
cleavage of gp160 into its active form.
Another interesting twist on this discovery is the fact that miR-4644 is predicted to target the
furin mRNA in silico. This means that the miR-4644 miRNA, embedded within the protein-
coding region of the HIV-1 genome, may regulate the very enzyme that ultimately catalyzes
proteolytic cleavage at precisely the same position, at the amino acid level. The implications are
further discussed, and include the possibility that this viral miRNA may serve to regulate a
critical maturation step in the infectivity of the HIV particle and may therefore utilize a new
pathway in the modulation of HIV pathogenesis.
In Chapter Five we examine another miRNA, miR-6763, more closely. Of the fifteen viral
miRNAs that were discovered in relation to proteins that were dysregulated due to HIV-1
infection of T-cells, miR-6763 is especially interesting for two reasons: 1) the mature miRNA
exhibits 100% homology across its entire length with its cellular miRNA counterpart,
and 2) this miRNA is present in three different and geographically diverse HIV-1 isolates.
Upon further investigation, we have found that this miRNA lies in the 3’ LTR region of the HIV-
1 viral genome and is co-located with a binding site for the transcription factor called specificity
protein 1 (Sp1). Moreover, the Sp1 site is duplicated in all three HIV isolates. The two Sp1 sites
are placed in tandem and together show 100% homology with miR-6763.
The miR-6763 is also predicted to target and downregulate CD4 in silico. The significance of
these findings is discussed in Chapter Six, which examines target sites in the 3’ UTR of the CD4
mRNA. The CD4 molecule is integral to the establishment of infection of HIV by providing a
critical cell surface receptor for virus binding. In our investigation, we found that all three of the
newly identified miRNAs that we have investigated in detail, miR-195, miR-4644, and miR-6763,
are predicted to downregulate the CD4 mRNA by binding to target sites in the CD4 3’UTR,
presumably limiting expression of this molecule. An additional interesting note is that we have
found that two putative HIV-encoded miRNAs, known as hiv1-miR-TAR-5p and hiv1-miR-H1,
also have a predicted target site in the CD4 mRNA.

Finally, we discuss the novel findings of this research and their implications for
viral infection, replication, latency, pathogenesis and human disease. The possibility of a new
regulatory pathway is explored in the context of this new information. Our identification of
novel viral miRNA molecules in the HIV-1 genome may serve to increase the
knowledge of viral miRNA regulation and aid in the development of targeted therapies toward
the assuagement of this global pandemic.

Chapter 1

Introduction

1.1  Overview
The Functional Significance of miRNAs in Biology and Virology:
The discovery of non-coding RNAs has opened up a new dimension of genetic regulation which
was unknown a generation ago, and the depth and breadth of this regulation by these molecules
may rival or even exceed the extent of protein regulation currently known. MicroRNAs
(miRNAs) are a class of non-coding RNAs which are 17-24 nucleotides (nt) long and are
primary regulators of gene expression. Their very name evokes the idea that these small
molecules were not easily discovered, and indeed they were not known until 1993; however, the
impact of this class of small RNA molecules on cellular biochemistry and molecular biology is
ever-growing. The ubiquitous nature of miRNAs among species at both the multi- and
unicellular level indicates a very ancient and evolutionarily conserved history for these
molecules, and underscores the importance of their role in cell processes and regulation.
miRNAs appear to be involved in the regulation of almost every cellular pathway and process
which has been examined; in fact, well over half of the human transcriptome is predicted to be
under miRNA regulation. miRNA-directed gene regulation has been shown to be involved in
basic biological and cellular functions such as cell proliferation, apoptosis, and development. As
part of the RNAi pathway, it is involved in cellular defense against foreign pathogens. As a
critical regulator of cell cycle proteins, it is involved in a multitude of cancers.
The entirety of miRNAs that are expressed at any given time comprises a cellular miRNA
expression profile that can indicate the overall health and status of the cell. When this cellular
miRNA profile is dysregulated, cellular trauma can result. Indeed, the pathology of human
disease may be indicated by, as well as caused by, global changes in cellular miRNA expression.
Viruses, having co-evolved with other forms of life, are both subject to regulation by miRNAs and
able to manipulate them. Some viruses have evolved their own viral miRNAs. Others modify or
regulate cellular miRNAs to aid in their own replication and pathogenicity.
In this thesis, we propose to show that the human immunodeficiency virus (HIV) may mimic or
incorporate human cellular miRNAs as a means of enhancing viral defense, infection, latency,
replication or pathogenicity. The means by which this is accomplished are varied and may include
direct effects of viral miRNAs on the virus’s own replication, or indirect effects through the
regulation of cellular processes and other cellular miRNAs. We believe that this is a novel
discovery in the field of retrovirology, and its implications will be thoroughly discussed.

1.2 History:
The first members of the miRNA family were discovered as essential regulators of development in
the nematode Caenorhabditis elegans, and were known as small temporal RNAs (stRNAs)
because they played a role in the developmental transitions of C. elegans. Further study showed
that these stRNAs were really part of a new class of small, regulatory RNAs that came to be
known as microRNAs (miRNAs) [1, 2].
The first miRNA, discovered in 1993, was known as lin-4, and was identified in C. elegans by
screening for phenotypic changes in nematode post-embryonic development [3]. In C. elegans,
there are four different larval stages, known as L1 through L4. When mutated, the lin-4 gene
alters the temporal regulation of larval development by causing L1 expression patterns to repeat
throughout later stages of development. It was found that the lin-4 gene controls the timing of C.
elegans larval development through repression of the lin-14 gene. lin-14 encodes a nuclear
protein which must be downregulated for larval progression from L1 to L2 [4].
However, the lin-4 gene does not produce a protein-encoding mRNA; instead, it makes a small,
non-coding RNA. This 22 nt RNA contains a sequence which is partly complementary to seven
regions in the 3’ UTR of the lin-14 mRNA. It was therefore proposed that this complementarity
is involved in the negative regulation of lin-14 by lin-4, a proposal supported by the
observation that an intact 3’ UTR in the lin-14 mRNA is necessary for negative regulation to
occur [5].
Another small C. elegans RNA, called let-7, was identified in 2000 and was found to repress
another gene, lin-41, by a similar mechanism. Moreover, the fact that let-7 was found to
be conserved among many species suggested that these small RNAs, and the mechanism by
which they functioned, might be part of a larger class of molecules.
Eventually, both lin-4 and let-7 RNAs were shown to be present in both Drosophila and human
cells, although their expression patterns were different. This differential expression suggested
that this growing group of small RNAs, by then referred to as microRNAs, might be regulating
other cellular processes in addition to development [4].
The number of identified miRNAs has grown exponentially in the last decade; the latest version
of miRBase, a database repository for all known miRNA sequences, reports a total of
28,645 entries across all species. While miRNAs were initially known to be associated with
developmental events in C. elegans, they are now known to be essential components of nearly
every cellular process and pathway. Their diverse expression patterns demonstrate an ability of
these molecules to react to changing conditions and environments and, by doing so, to guide
cellular processes toward adaptation.

1.3 miRNA Biogenesis:
miRNA genes are normally located in intergenic regions but may also be intronic [6]. The first
step in miRNA production is transcription of the gene by RNA Pol II to produce a
hairpin-loop RNA structure known as a pri-miRNA [7]. These nascent transcripts are first
processed by the nuclear enzyme Drosha, an RNase III-type endonuclease complexed
with the dsRNA-binding protein DGCR8, into ~70 nt precursors called pre-miRNAs [8].
Pre-miRNAs are then transported to the cytoplasm by the nuclear export receptor Exportin 5,
where the imperfect stem-loop RNA structures are further processed by the enzyme Dicer, also an
RNase III-type enzyme [9, 10]. Dicer removes the loop of the hairpin to yield a ~22 nt imperfect
miRNA duplex containing 2 nt overhangs at each end. The resulting duplex thus contains two
RNA strands, known as the guide strand and the passenger strand. Although either strand may
become the mature miRNA, normally one strand is preferentially selected for incorporation into
the RISC complex [11].
After processing by Dicer, the selected miRNA strand joins a ribonucleoprotein complex called
RISC, or the RNA-Induced Silencing Complex. Central to the formation of the RISC complex is
a protein of the Argonaute (AGO) family. In mammals, four different AGO proteins (AGO1-AGO4)
may take part in the formation of the RISC complex [12]. The AGO protein, bound to
the mature miRNA strand along with other associated proteins such as TRBP (the HIV-1
transactivating response RNA (TAR) binding protein), forms a complete RISC unit which targets
mRNAs complementary to the miRNA contained in the complex for regulation [8]. The overall
pathway is depicted in Figure 1.1.

[Figure 1.1 image. Diagram labels, nucleus: DNA; transcription; pri-miRNA; 'mirtron'; processing (Drosha, DGCR8); splicing of intron; pre-miRNA; Exportin 5. Cytoplasm: maturation (Dicer, TRBP); AGO1-4; strand selection and miRNP assembly; endonucleolytic cleavage; translational repression or deadenylation.]

Figure 1.1: microRNA Biogenesis
Expression and maturation of miRNAs in the canonical pathway are depicted. miRNAs are
transcribed as pri-miRNAs in the nucleus and processed by the cellular enzyme Drosha to form a
pre-miRNA. Alternatively, cellular splicing of mirtron sequences may also produce a pre-miRNA
without the need for Drosha processing. Following export into the cytoplasm by Exportin 5,
further processing occurs via the Dicer enzyme and the selected guide strand is bound to the Ago
protein and loaded onto the RISC complex, which subsequently represses its target through
endonucleolytic cleavage (plants) or via translational repression and deadenylation (animals).
(Original figure adapted from Filipowicz et al., 2008 [13]).

1.4 Mechanisms of miRNA Regulation:


When a miRNA incorporated into a RISC complex encounters its target mRNA, gene silencing
occurs. This gene silencing occurs in several ways, the most typical of which are
translational repression and destabilization of the mRNA.
Repression of translation may occur by preventing initiation of translation, which is achieved by
blocking recognition of the 5’ cap or by preventing the 60S ribosomal subunit from
joining the initiation complex [14-16]. Repression of translation at the elongation step may also
occur by slowing of the ribosomes and promoting ribosomal drop-off from the transcript [17].
Alternatively, the binding of RISC to the 3’ UTR target sequence of the mRNA may result in
destabilization and decay of the message by recruitment of deadenylation factors such as CCR4-
NOT, which remove the poly(A) tail, rendering the mRNA susceptible to exonucleolytic
degradation [18].
An additional mode of repression has been proposed in which the nascent polypeptide is
proteolytically degraded, although the protease involved in such a mechanism has not been
identified [19].

Figure 1.2: Mechanisms of miRNA Regulation
Four possible mechanisms of mRNA downregulation via miRNA interaction are shown. 1)
Deadenylation may occur via the CCR4-NOT deadenylase complex, which is recruited by the
miRNA-ribonucleoprotein complex (miRNP); this destabilizes the transcript and makes it vulnerable
to degradation; 2) the RISC complex may block translation initiation by repressing 5’ cap recognition
or joining of the 60S ribosomal subunit; 3) proteolytic degradation of the nascent peptide
may occur; 4) the miRNP may block elongation of translation by slowing the ribosomes
or promoting what is known as ‘ribosome drop-off’. (Original figure adapted from Filipowicz et
al., 2008 [13]).

1.5 Principles of miRNA-mRNA Interactions:
miRNAs interact with their mRNA targets through base pairing. The nature of this base pairing
differs significantly between plants and metazoans. In plants, complementarity between miRNA
and mRNA target tends to be complete or very near complete, and the end result of miRNA
binding to its target is endonucleolytic cleavage of the target mRNA somewhere within the base
pairing region. In animals, base pairing tends to be much more incomplete, and typically results
in translational repression or mRNA degradation, and not target cleavage.
miRNAs are small molecules and therefore have limited sequence specificity, which
implies that a single miRNA can presumably interact with many mRNA targets. In addition, the
fact that partial pairing between a miRNA and its target may be sufficient for regulation means
that an even larger pool of potential mRNA targets exists. This creates a vast and complex
problem of how to accurately predict mRNA targets for a particular miRNA. Algorithms created
for miRNA target prediction tend to overestimate the number of actual targets and give large
numbers of false positives.
However, through multiple experimental and computational studies, there are several traits
which can be applied to describe the nature of a given miRNA and its mRNA target.
The major determinant of this interaction is full and complete complementarity between
nucleotides 2 through 8 of the miRNA, known as the seed sequence, and its mRNA target.
Additionally, the presence of certain nucleotides at mRNA positions ‘across’ from miRNA #1
and #9 can aid in the strength of association of the miRNA:mRNA duplex. Imperfect pairing in
the seed region can sometimes be compensated for by extensive complementarity in the 3’ region
of the miRNA. Many seed sequences tend to be highly conserved between species.
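
The seed rule described above can be illustrated with a minimal sketch. It assumes a simple ungapped search for the reverse complement of miRNA nucleotides 2 through 8 within a 3’ UTR; the function names and both sequences are hypothetical and were made up for illustration. Published target-prediction algorithms weigh many additional features, so this is not a substitute for them.

```python
# Minimal sketch: locate perfect seed matches (miRNA nt 2-8) in a 3' UTR.
# All sequences below are hypothetical and written 5'->3' as RNA.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def find_seed_sites(mirna, utr, start=2, end=8):
    """Return 0-based UTR indices where the seed (miRNA positions
    start..end, 1-based inclusive) is perfectly complementary."""
    seed = mirna.upper()[start - 1:end]
    site = reverse_complement(seed)  # what the seed pairs with on the mRNA
    utr = utr.upper()
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)  # keep scanning for further sites
    return hits


toy_mirna = "UACGGAUCCGAUUAGCCAUGA"    # hypothetical 21-nt miRNA
toy_utr = "AAAAGAUCCGUCCCCGAUCCGUAA"   # hypothetical UTR with two seed sites

print(find_seed_sites(toy_mirna, toy_utr))
```

Because a 7-nucleotide site is expected to occur by chance many times in a long transcript, a search of this kind returns many candidates, which is consistent with the high false-positive rates noted above.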
Base pairing usually occurs in the 3’ UTR of the mRNA which is being targeted, and the
nature of the 3’ UTR influences miRNA binding as well as regulation of the mRNA transcript.
For example, AU-rich regions in the 3’ UTR can serve as potent regulatory signals, and may act
by creating a stretch of the 3’ UTR which is more accessible to miRNA binding [20]. Genes
with longer 3’ UTRs have more miRNA target sites and a higher density of target sites and tend
to be involved in developmental processes, while those with shorter 3’ UTRs are more likely to
be involved in basic cellular processes [21, 22]. While the primary region of the target for
miRNA binding is the 3’ UTR, it has been demonstrated that some miRNAs can bind to the 5’
UTR [23, 24].
Another feature of the miRNA:mRNA duplex is a tendency for bulges or mismatches to exist,
particularly in the central region. This may serve to preclude the AGO-mediated cleavage that is
common in plants, where base pairing is present in this region [13].
A certain amount of complementarity between the 3’ end of the miRNA and the mRNA also
appears to play a role in stabilizing the interaction between miRNA and its target. While
mismatches and bulges are generally tolerated in this region, complementarity at nucleotide
positions 13-16 seems to be important for miRNA binding.

The location and number of miRNA binding sites within the 3’ UTR is also a determining factor
in miRNA effectiveness. Locating the miRNA target site fairly close to the end of the ORF of
the gene seems to be favorable. Multiple target sites for the same miRNA are common, and
having multiple sites within close proximity can also induce cooperativity resulting in a higher
degree of repression by the miRNA [13].
miRNAs normally act on their target by repressing its expression, but there are instances where
miRNAs have been shown to upregulate target expression, involving an interaction between Ago
and the protein known as fragile X mental retardation protein (FMR1) [25].

 
Figure 1.3: Principles of microRNA-mRNA Interactions


A nucleotide-level adaptation of the proposed interactions between miRNA and mRNA is shown.
Key to the interaction is the perfect or near-perfect base pairing between nucleotides 2-8 of the
miRNA and the mRNA sequence, known as the ‘seed’ and shown in red and green. Flanking
nucleotides shown in yellow are thought to strengthen this relationship. A bulge in the central
region of the miRNA:mRNA duplex is common, though further base pairing downstream,
particularly at positions 13-16, shown in orange, is thought to be important. Favorable placement
of the mRNA target sequence is approximately 15 nucleotides downstream of the open reading
frame. (Original figure adapted from Filipowicz et al., 2008 [13]).

1.6 Viral miRNAs:
While the presence and abundance of cellular miRNAs have been widely studied, the possible
presence of miRNAs in viral genomes is less clear. Although miRNAs of viral origin have been
discovered in the genomes of certain larger viruses and DNA viruses such as herpesviruses, the
presence of miRNAs within RNA genomes, particularly those of retroviruses such as Human
Immunodeficiency Virus 1 (HIV-1), has remained controversial. Further complicating the issue
is whether miRNAs in viral genomes may have arisen through viral evolution, or rather through
viral acquisition of cellular miRNA genes. The primary goal of this research is to ascertain
whether HIV-1 viral sequences may contain miRNAs which came from, or mimic, cellular
miRNA genes. We will examine this topic in great detail.

1.7 miRNA vs. miRNA-like:


In our previously published research, which is contained in this thesis as Chapter Two, the viral
sequences that we identified were similar, but not exact, copies of cellular miRNAs. Therefore,
we referred to them as miRNA-like.
The distinction between miR and miR-like is not well defined or understood. The
implication of miR-like is that the sequence may retain some of the properties inherent in its ‘real’ miRNA
counterpart. Indeed, studies have shown in vitro that a functional mature miRNA can tolerate
mutations, and that the more mutations a miRNA molecule accumulates, the greater the
reduction in miRNA-mediated repression [26].
In our recent research we have identified additional viral sequences with homology to cellular
miRNAs, as we will show. Some of the additional sequences we identified have greater
homology than previous findings, and other sequences were in fact exact copies of human
cellular mature miRNAs. Therefore, we will refer to these newly identified sequences as viral
miRNAs. The differences between these putative viral miRNAs and their cellular counterparts
(in terms of expression, biogenesis and function) will be discussed as well.

Chapter 2

Identification of Human microRNA-like Sequences Embedded Within the
Protein-Encoding Genes of the Human Immunodeficiency Virus

Bryan Holland, Jonathan Wong, Meng Li and Suraiya Rasheed¹

Laboratory of Viral Oncology and Proteomics Research, Keck School of Medicine,
University of Southern California, Cancer Research Laboratory Bldg., 1303 N. Mission Rd.
Los Angeles, California 90033

¹Corresponding Author: srasheed@usc.edu

This chapter has been published and is available here:
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0058586
2.1 Abstract

Background: MicroRNAs (miRNAs) are highly conserved, short (18-22nts), non-coding
RNA molecules that regulate gene expression by binding to the 3’ untranslated regions
(3’UTRs) of mRNAs. While numerous cellular microRNAs have been associated with
the progression of various diseases including cancer, miRNAs associated with
retroviruses have not been well characterized. Herein we report identification of
microRNA-like sequences in coding regions of several HIV-1 genomes.

Results: Based on our earlier proteomics and bioinformatics studies, we have identified 8
cellular miRNAs that are predicted to bind to the mRNAs of multiple proteins that are
dysregulated during HIV-infection of CD4+ T-cells in vitro. In silico analysis of the full
length and mature sequences of these 8 miRNAs and comparisons with all the genomic
and subgenomic sequences of HIV-1 strains in global databases revealed that the first
18 nucleotides of the mature hsa-miR-195 sequence (including the short seed
sequence) matched perfectly (18/18, 100%), or with one nucleotide mismatch, within the
envelope (env) genes of five HIV-1 genomes from Africa. In addition, we have identified
4 other miRNA-like sequences (hsa-miR-30d, hsa-miR-30e, hsa-miR-374a and hsa-miR-
424) within the env and the gag-pol encoding regions of several HIV-1 strains, albeit
with reduced homology. Mapping of the miRNA-homologues of env within HIV-1
genomes localized these sequences to the functionally significant variable regions of the
env glycoprotein gp120 designated V1, V2, V4 and V5.

Conclusions: We conclude that microRNA-like sequences are embedded within the
protein-encoding regions of several HIV-1 genomes. Given that the V1 to V5 regions of
HIV-1 envelopes contain specific, well-characterized domains that are critical for
immune responses, virus neutralization and disease progression, we propose that the
newly discovered miRNA-like sequences within the HIV-1 genomes may have evolved
to self-regulate survival of the virus in the host by evading innate immune responses and
therefore influencing persistence, replication and/or pathogenicity.

2.2 Introduction:

MicroRNAs (miRNAs) are highly conserved, naturally occurring, 18-22 nucleotide long,
noncoding RNA molecules that are processed from long primary transcripts (pri-miRNAs).
The cellular miRNAs are transcribed by the same transcription factors and RNA polymerase II
that control the transcription of protein-encoding messenger RNAs (mRNAs) [27]. The
pri-miRNA transcripts are processed into mature miRNAs by complex molecular processes
involving cleavage by the RNase III-like enzyme Drosha in the nucleus to yield precursor
hairpins (pre-miRNAs), followed by their transport to the cytoplasm mediated by Exportin-5
and a second cleavage by Dicer, another RNase III-like enzyme. One strand of the resulting
double-stranded miRNA forms the “mature” 18-22 nucleotide
sequence, which becomes a part of a multi-protein RNA-induced silencing complex (RISC) that
guides it to the mRNA target site. The second strand is believed to be degraded [27].

The human genome contains >1,500 miRNA genes that have been predicted or experimentally
shown to play critical roles in normal cellular functions, such as maintaining homeostasis, and
regulating or modulating viral and cellular gene expression. Specific alterations in gene or
protein expression profiles can steer normal cells toward the development of
cancer and other disorders [28, 29]. An important post-transcriptional regulatory step in gene
expression is that the 5’ ends of miRNAs can base-pair with the complementary sequences in the
3’ untranslated regions (UTRs) of their target mRNAs and suppress translational capacities of
those mRNAs [30]. While the exact mechanism of miRNA-mediated regulation of the mRNA
targets is still not fully understood, two aspects of the gene-expression controls are noteworthy.
First, a short nucleotide sequence (nucleotides 2-7 from the 5’ end of a miRNA), referred to as
the ‘seed sequence’, base-pairs completely and continuously with its target mRNA; second, this
initial binding through the seed sequence allows complete or incomplete binding of
the rest of the mature miRNA sequence with the target mRNA. Thus, gene repression by a miRNA
appears to depend primarily on complete base-pairing of its seed sequence with the
complementary mRNA target; however, sometimes incomplete base-pairing of the rest of the
miRNA sequence also destabilizes the target transcript, represses the translation of the protein or
influences degradation of the target mRNA [31]. Also, one miRNA can interact with multiple
mRNA targets and a single mRNA can be regulated by multiple miRNAs. Upregulation or
downregulation of a given miRNA can dysregulate protein expression profiles and therefore
result in disruption of normal biological processes such as cell proliferation, development,
differentiation, apoptosis, and signal transduction [30, 32, 33].

The bulk of well-characterized miRNAs are either from cellular genomes or from several DNA
viral genomes and both can impact expression of cellular and viral genes respectively in infected
cells [34]. Recently, a retrovirus, the bovine leukemia virus (BLV), has been shown to encode a
conserved cluster of miRNAs that is transcribed by RNA polymerase III and mimics
miRNAs involved in B-cell cancers [35].

In silico studies have predicted that the genomes of the human immunodeficiency virus-1 (HIV-
1) may contain target sites for binding of cellular miRNAs [36]. For instance, hsa-miR-29a is
reported to target the nef region of the HIV-1 genome [37]. Researchers have also shown that
cellular miRNA expression patterns in HIV-1 infected peripheral blood mononuclear cells are
altered post-infection [38, 39]. A recent report proposed that altered cellular miRNA profiles of
HIV-infected cells could be used as early indicators of host cellular dysfunctions [40]. The
existence of HIV-1-encoded viral miRNAs (vmiRNAs) that may regulate both viral and host
gene expression has also been suggested [41].

Based on the RNA structure and folding parameters consistent with cellular miRNA molecules,
five HIV-encoded miRNAs have been predicted computationally [42]. However, there has been
disagreement about the existence of vmiRNAs in HIV-1 genomes [43]. Currently, three HIV-
encoded vmiRNAs have been listed in miRBase: miR-H1, miR-N367 and miR-TAR-5p/3p.
These vmiRNAs have been reported to target “viral” transcripts, as in the case of miR-N367
[44], and host cellular factors for miR-H1 and miR-TAR-5p/3p [45, 46]. More recent research
seems to support that miRNAs are in fact present in the viral genomes: Ouellet et al. [47]
showed that functional miRNAs are processed from the HIV-1 TAR element; Yeung et al. [48]
have identified small non-coding RNAs in HIV-1 infected cells and corroborated the existence of
virally encoded miRNAs in nef and TAR; and Schopman et al. [49] have identified HIV-
encoded small RNAs in virus-infected cells. While studies have reported that human
endogenous retroviruses, retroelements and several exogenous retroviral sequences may have
homologies with cellular miRNAs [50], the viral homologues of human miRNAs in HIV-1
genomes have not been identified and the role of miRNAs in HIV-1 infection has not been
elucidated.

Our earlier proteomics and bioinformatics studies had identified a significant number of
functionally relevant proteins that were upregulated, downregulated or produced de novo in a
chronically HIV-infected CD4+ clonal T-cell line (RH9) [51, 52]. Based on these findings we
hypothesized that the differential expression of proteins in HIV-infected versus uninfected
counterpart cells is due to the dysregulation of cellular (or viral) microRNAs. Using multiple
bioinformatics and computational tools we have identified 8 miRNAs that are predicted to bind
to the mRNA targets of several proteins that are differentially expressed in HIV-infected T-cells
(Table 1). To understand the biological significance of these miRNAs we searched for
nucleotide sequence similarities between each of the miRNAs identified, and the whole or partial
HIV-1 genome sequences present in the global databases. Herein, we report the discovery of a
sequence homologue of hsa-miR-195 within the HIV-1 protein-coding regions of several HIV-1
strains from Africa (Table 2). In addition, we have identified sequence similarities between 3
other mature human miRNAs (hsa-miR-30d, hsa-miR-374a and hsa-miR-424) (Table 3) and one
full-length miRNA (hsa-miR-30e) within the HIV-1 envelope regions (Table 4). A detailed
search of the whole HIV-1 genome sequence also displayed similarity between 2 full-length
human miRNAs (hsa-miR-30d and hsa-miR-424) and the gag-pol protein-encoding regions or
pol only of 4 HIV-1 isolates (Table 4). These data have provided new insights into possible roles
of viral env-gene associated miRNA-like sequences in the survival of HIV-1 genomes in the
host, immune responses, virus replication, and pathogenesis.

2.3 Results:
Identification of Cellular microRNA Sequences Associated with Dysregulated Proteins in
HIV-Infected Cells

We used the program “GeneSet2miRNA” to identify potential microRNAs that could impact the
activities of the proteins whose expression had been shown to be dysregulated in HIV-infected
RH9 T-cells [51, 52]. For example, the microtubule-actin cross-linking factor 1 (MACF1) [53] is
one of the proteins which we found was downregulated due to HIV-infection. Bioinformatics
analyses indicated that the 3’ untranslated region of MACF1 contains a sequence (5’
GGACAAUAGCUGCUA 3’) that is complementary to the seed sequence of the mature
hsa-miR-195 (5’ UAGCAGCACAGAAAU 3’), suggesting that this miRNA can bind the MACF1 transcript.
MACF1 has an actin-regulated ATPase activity necessary for cross-linking actin to other
cytoskeletal proteins. In addition, MACF isoforms act as positive regulators of the Wnt receptor
signaling pathway and are essential for controlling focal adhesion assembly. Actin-related
proteins also play critical roles during HIV-1 infection and latency in T-cells ([54] and our
unpublished data).
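
The complementarity between the MACF1 fragment and the hsa-miR-195 seed can be checked by hand or with a short script. The sketch below is only an illustration: it uses the two sequences exactly as quoted in the paragraph above, takes the seed as nucleotides 2-7 (the definition used in this chapter), and the helper name `seed_binding_site` is hypothetical.

```python
# Sequences exactly as quoted in the text above (RNA, written 5'->3').
MIR195_QUOTED = "UAGCAGCACAGAAAU"     # quoted portion of mature hsa-miR-195
MACF1_UTR_QUOTED = "GGACAAUAGCUGCUA"  # quoted MACF1 3' UTR fragment

PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_binding_site(mirna, first=2, last=7):
    """Return the mRNA sequence (5'->3') that would pair with
    miRNA positions first..last (1-based, inclusive)."""
    seed = mirna[first - 1:last]
    return "".join(PAIR[base] for base in reversed(seed))


site = seed_binding_site(MIR195_QUOTED)   # site complementary to nt 2-7
index = MACF1_UTR_QUOTED.find(site)       # 0-based start, -1 if absent
# The mRNA nucleotide facing miRNA position 1 lies immediately 3' of the site.
facing_pos1 = MACF1_UTR_QUOTED[index + len(site):index + len(site) + 1]

print(site, index, facing_pos1)
```

For the quoted sequences, the site complementary to nucleotides 2-7 is found once, near the 3’ end of the fragment, and the nucleotide facing miRNA position 1 is an adenosine. The pairing does not extend to miRNA position 8, where an A faces an A in the fragment.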

Using a cutoff of 0.05 for the adjusted p-value of the enrichment (adjusted for multiple testing
by Monte-Carlo simulations), we identified 7 microRNAs that were significantly associated with multiple
mRNA targets in these cells. In addition, we selected one more miRNA from our initial search
which fulfilled the criterion of being the best single-model match of the GeneSet2miRNA
program. These 8 miRNA sequences are listed in Table 1. These analyses also revealed that
hsa-miR-195 was one of the most frequently detected cellular miRNAs among the 8 miRNAs
identified to be associated with mRNAs of the proteins modulated in our experimentally HIV-1
infected T-cells. The miRNA-mRNA associations were highly significant for 7 of
the 8 miRNAs identified (p-value = 0.002 to 0.008).

Discovery of miRNA-195 homologues in the HIV-ENV Gene

Using both the mature (~22 nucleotides) and full length (~70 nucleotides) sequences of each
of the 8 cellular miRNAs listed in Table 1 as query sequences, we have used multiple
bioinformatics tools including BLAST (Basic Local Alignment Search Tool) and Clustal to
align and scrutinize potential matches in HIV-1 genomes. An extensive search of more than
3000 complete HIV-1 genomes and over 400,000 HIV-1 subgenomic sequences present in
both the Los Alamos and NCBI databases resulted in thousands of potential matches. Each
one of these matches was subsequently screened on the basis of the degree of homology and
degree of sequence continuity with the full length and mature sequences of each of the 8
previously identified cellular miRNAs. These analyses resulted in the discovery of a perfect
match (100%) from nucleotide positions 1 through 18 of the mature hsa-miR-195 sequence,
including the seed region at nucleotide positions 2-7, within the envelope (env) protein-coding
sequences of an HIV-1 isolate from South Africa (accession #GU216763) (Table 2). Two
other isolates, #GU216768 and #GU216773, from the same individual also showed homology
with hsa-miRNA-195 in the same region of the Env gene and again matched miR-195 at
positions 1-18, with 17 of 18 (94%) nucleotides matching perfectly (Table 2). This single
nucleotide difference (i.e. 17 versus 18 matching nucleotides) reflects the genetic diversity due to
mutations known to be present in all HIV-1 envelope genes of both acutely and chronically
HIV-1 infected individuals [55]. In addition to the sequence homology in the South African
HIV-1 isolates (#GU216763), we have identified two other strains from Tanzania,
#HM215313 and #DQ199139, whose Env gene sequences also matched the same region
of hsa-miR-195 as in the GU216763 isolate, at 17 of 18 nucleotide positions (Table 2).
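
The kind of match reported here (18/18, or 17/18 with a single mismatch) can be illustrated with a simple ungapped sliding-window comparison. This sketch is not the BLAST or Clustal procedure used in the study; it is a plain position-by-position scan, the function names are hypothetical, and the subject sequence is invented for illustration. Only the 15-nucleotide portion of hsa-miR-195 quoted earlier in this chapter is used as the query.

```python
def best_ungapped_match(query, subject):
    """Slide the query along the subject and return (start, matches) for
    the window with the most identical positions (first window on ties)."""
    # Database entries for HIV-1 are stored as DNA, so compare in DNA letters.
    query = query.upper().replace("U", "T")
    subject = subject.upper().replace("U", "T")
    best_start, best_matches = -1, -1
    for start in range(len(subject) - len(query) + 1):
        window = subject[start:start + len(query)]
        matches = sum(q == s for q, s in zip(query, window))
        if matches > best_matches:
            best_start, best_matches = start, matches
    return best_start, best_matches


def percent_identity(matches, length):
    """Identity as a percentage of aligned positions."""
    return 100.0 * matches / length


mir195_quoted = "UAGCAGCACAGAAAU"          # portion quoted in Section 2.3
toy_subject = "CCGGTAGCAGCACTGAAATGGCC"    # hypothetical sequence, one mismatch

start, matches = best_ungapped_match(mir195_quoted, toy_subject)
print(start, matches, round(percent_identity(matches, len(mir195_quoted)), 1))
print(round(percent_identity(17, 18), 1))  # the 17-of-18 case from the text
```

The last line reproduces the arithmetic behind the 94% figure quoted for the 17-of-18 matches.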

Identification of HIV-ENV Gene Homology Domains in other Cellular microRNAs

Analyses of the remaining seven cellular miRNAs that were predicted to be associated with
multiple dysregulated proteins expressed in our experimentally HIV-infected cells (Table 1) also
showed sequence similarities between 4 microRNA sequences and the env regions of several
other HIV-1 strains (Tables 3, 4). Each of the seven miRNAs was screened using the BLAST
algorithm to find the best matches within all the complete and partial HIV-1 genome sequences
contained in both NCBI and the Los Alamos HIV-1 databases. Again, several thousand
preliminary matches were screened for the degree of homology, degree of continuity and
presence of the seed sequence. These analyses indicated that, in addition to hsa-miR-195,
three other cellular mature miRNAs (hsa-miR-424, hsa-miR-374a, hsa-miR-30d)
displayed homologous regions 13-18 nucleotides in length within the Env genes of four different
HIV-1 strains (Table 3). We have also tested full-length (~70 nucleotides) miRNAs and found
that hsa-miR-30e exhibited sequence homology within the env regions of 5 different HIV-1
isolates (Table 4). These sequences were distinct from those aligned with miR-195 and were
located in different regions of the HIV-1 envelope than hsa-miR-195. Thus, the number, length,
and percentage of matches within the seed region of each miRNA and their respective HIV-1
env-gene sequence were variable.
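
The three screening criteria named above (degree of homology, degree of continuity and presence of the seed sequence) can be expressed compactly for an ungapped alignment. The sketch below is one possible way to compute them and is not the screening code used in this study; the function name, the returned field names and the two hit sequences are hypothetical, and the seed is taken as nucleotides 2-7.

```python
def screen_alignment(query, hit, seed_start=2, seed_end=7):
    """Summarise an ungapped alignment of two equal-length sequences:
    fraction identical, longest contiguous run of matches, and whether
    every seed position (1-based, inclusive) matches."""
    if not query or len(query) != len(hit):
        raise ValueError("sequences must be non-empty and of equal length")
    # Compare in DNA letters so RNA queries and DNA hits are comparable.
    query = query.upper().replace("U", "T")
    hit = hit.upper().replace("U", "T")
    flags = [q == h for q, h in zip(query, hit)]
    longest = current = 0
    for same in flags:
        current = current + 1 if same else 0
        longest = max(longest, current)
    return {
        "identity": sum(flags) / len(flags),
        "longest_run": longest,
        "seed_intact": all(flags[seed_start - 1:seed_end]),
    }


query = "UAGCAGCACAGAAAU"            # portion of hsa-miR-195 quoted earlier
hit_outside_seed = "TAGCAGCACTGAAAT"  # hypothetical hit, mismatch at position 10
hit_inside_seed = "TAGGAGCACAGAAAT"   # hypothetical hit, mismatch at position 4

print(screen_alignment(query, hit_outside_seed))
print(screen_alignment(query, hit_inside_seed))
```

Both hypothetical hits have the same overall identity, but only the first keeps the seed intact, which is why identity alone is not a sufficient screening criterion.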

Identification of miRNA-Homology Domains in other Regions of HIV genomes

To determine if any other region of the HIV-1 genome might show homology domains, all the
preliminary alignments were screened again using the query sequences of each of the 8 cellular
miRNAs that we had identified earlier. Our results indicated that 2 of the 8 predicted
miRNAs (hsa-miR-424 and hsa-miR-30d) showed homologies in different regions within the
HIV-1 genome (Table 4). Both of these full-length cellular miRNAs
exhibited domains of 76%-100% homology within the gag-pol regions of four isolates from three
HIV-1 strains (Table 4). These findings suggest that multiple cellular miRNAs have homology
domains in different regions of various HIV-1 genomes (Tables 2, 3 & 4). Thus, out of the 8
miRNAs that we identified for further analysis, 5 (hsa-miR-195, hsa-miR-424, hsa-miR-30d,
hsa-miR-30e and hsa-miR-374a) showed homology domains in the HIV-1 genomes of several
strains, and 3 (hsa-miR-15b, hsa-miR-16-1 and hsa-miR-16-2) did not.

Clustal Analyses and Mapping of miRNA-like Sequences

Since the most significant matches in our study occurred between the cellular miRNA hsa-miR-
195 and the env protein encoding regions of HIV-1 isolates from Africa, we further scrutinized
and characterized these sequences by aligning the genomes of different HIV-1 clades from
various regions around the world with the hsa-miR-195-like sequence homology domains.
Using the ClustalW2 algorithm, as well as the Los Alamos compendium of all HIV-1
alignments, we were able to show the exact regions of these sequences in each of the 6 clades
that corresponded to the position of the hsa-miR-195-like sequence we have identified from the
African HIV-1 strains (Table 5).
We chose 17 HIV-1 genomes that had been identified as representative strains from each of the
6 HIV-1 clades (A, B, C, D, E and G) to perform alignments, and used whole genome sequences
to produce the best alignment. These analyses revealed that the hsa-miR-195-like region that we
have identified aligns best to a specific region in the Env gene of African strains (Table 5, shaded
area). However, while miR-195-like sequences were shared by all representative strains among
the 6 clades that we examined, these sequences were highly divergent among the different clades
(Table 5).
In addition to defining the specificity of sequence alignments, we used the TreeDyn software for
the construction of a sequence-based relational tree using the alignment data generated by the
Clustal algorithm. As can be seen in Figure 1, all viral genomes that shared miR-195-like
sequences are clustered together in Clade C.
While detailed maps of the env region exist for several HIV-1 strains, there is no map at present
for the HIV-1 genomes from Africa in which we found the greatest homology. We therefore
cross-referenced our sequence with the HXB2 strain (Clade B) for which a detailed map is
available (GenBank accession number K03455). The HXB2 genome is used as a common
reference strain for many different functional studies, and it has a standardized position
numbering scheme available at the Los Alamos National Laboratory website
(http://www.hiv.lanl.gov/). The alignment given in Table 5 can be cross-referenced with the
nucleotide-specific map of the HXB2 strain and indicates that the ‘hsa-miRNA-195-like’
sequence maps to the segment from position 7611 through 7628 within the env gene of the
HXB2 genome (Figure 2). It is interesting to note that Yeung et al. [48] list a small RNA
species in their supplementary table at position 7601, which corresponds to our miR-195-like
sequence, albeit with less homology because they used a clade B strain (Table 5). The segment
at positions 7611-7628 corresponds to the V5 region of the envelope glycoprotein. As can be seen from Figure 2, the
miRNA-195-like sequence is embedded entirely within the V5 region of the HIV-1 genome.
Next, we used the Clustal program to similarly align the homologous regions of hsa-miR-424,
miR-374a, and miR-30d (Table 3) to the HXB2 genome. Our results show that the ‘hsa-miR-
424-like’ sequence maps inside the V1 region of the HXB2 env gene at nucleotide positions
6682-6694, ‘hsa-miR-374a-like’ maps inside the V2 region (position 6763-6781), and ‘miR-30d-
like’ lies inside the V4 domain at positions 7386-7398 (Figure 3). There was no homologous
domain present in the V3 region to any of the miRNAs that we examined.

The miRNA-like sequences are not classical human cellular miRNAs

To determine if the ‘miR-like’ sequences that we identified in the HIV-1 envelope gene are of
viral or cellular origin, we searched the entire NCBI human genome sequence database for
possible homology domains. Using the BLAST algorithm with the 18-nucleotide miR-195-like
sequence as the query, we found only one significant match, localized to nucleotides 6524363
through 6524380 (18/18 matches) on human chromosome 17 (Table 6). While this region
corresponds to the published location of the 21 nt hsa-miR-195 sequence [56], our HIV-env-associated
miR-195-like sequence is not that of a typical miRNA because: 1) this sequence
encodes a functional protein; 2) it is not a part of a stem-loop structure necessary for the
processing of miRNAs; 3) there is no evidence that the miR-195-like sequence is processed
from a long precursor transcript (pri-miRNA) by Drosha; and 4) there is no evidence that this
sequence is cleaved into mature miRNA by Dicer enzyme. Further, it appears unlikely that the
newly discovered miR-195-like sequence in the HIV genomes could be incorporated into RISC
complexes that guide the miRNA to the mRNA target site.
Regardless of these controversies, Yeung et al. [48] first suggested that small RNA molecules in
HIV-infected cells may represent products of Dicer cleavage. Subsequently, a high-throughput
deep sequencing study of siRNA from HIV-infected cells suggested that the viral dsRNA
intermediates may be processed by Drosha and Dicer [49]. However, the small RNAs found in
these studies are non-coding and preliminary transfection of these “viral microRNA” clones did
not show significant changes in virus production [49]. In addition, the miRNA-like sequences
we have discovered are not related to those reported in any published studies.
The other miR-like sequences were also similarly localized to their genomic regions using the
BLAST algorithm (Table 6). The results show that each miR-like sequence matched most
significantly to the position in the human genome that corresponds to its actual cellular
miRNA region. The BLAST searches also showed several less significant matches of
miR-like sequences in other parts of the human genome, but none of these was a 100% match.

2.4 Discussion:
There are other viral miRNAs and target sites
The potential involvement of miRNAs in HIV-1 proliferation and life cycle is the subject of much
research. While the dysregulation of miRNA expression in HIV-1 infected cells has been known
since 2005 [38], studies have now identified HIV-TAR and Nef-LTR regulating miRNAs [47, 48].
In addition, target sites for cellular miRNAs have been predicted in HIV-1 genomes, for example
in the nef-LTR region [36], indicating that human cellular miRNAs can modulate HIV-1 expression and
replication [39]. Yeung et al. [48] first suggested that small RNA molecules in HIV-infected
cells may represent products of Dicer cleavage. Subsequently a high-throughput deep sequencing
study of siRNA from HIV-infected cells suggested that the viral dsRNA intermediates may be
processed by Drosha and Dicer [49]. However, the small RNAs found in these studies are non-
coding and preliminary transfection of these “viral microRNA” clones did not show significant
changes in virus production although some small RNAs showed some inhibition [49]. Thus, the

miRNA-like sequences we have discovered in HIV genomes are not related to those reported in
any published studies.

Possible biological functions of miRNAs discovered


HIV-1 genomes can interact with miRNAs in two ways: directly, through binding of a cellular miRNA to
a viral transcript or the RNA genome itself, or indirectly, through a host-cellular factor
that is required for HIV-1 infection and the viral life cycle. While most miRNAs regulate gene
expression by suppressing their target mRNAs, a cellular miRNA could target host factors that
either suppress or enhance HIV-1 infection. Also, a single cellular miRNA can target as many as
100 transcripts. This makes it likely that any given cellular miRNA involved in HIV-1 infection
serves as both an activator and a suppressor of HIV-1 at the same time. While the
multitude of interactions inside a cell is too complex and dynamic to define the
exact role of a given cellular miRNA as purely an up- or down-regulator of HIV-1 infection, there
is evidence for both direct and indirect regulation of cellular and viral gene expression by cellular
miRNAs [57].
We have conducted a thorough literature search for the possible biological functions of cellular
hsa-miR-195, miR-30d, miR-424 and miR-374a as they relate to HIV infection. A total of 17
papers were found to report on a wide range of functionalities of miR-195, 6 citations were
associated with miR-30d function, 8 papers discussed miR-424 gene function and only 1 citation
described miR-374a function. Most papers associate these microRNAs with cancer, apoptosis,
Alzheimer’s disease and signal transduction [58-61], and none was related to any of the miRNAs
expressed during HIV infection.

Our findings are not cellular miRNA sequences but truncated versions
The miRNA-like sequences we have identified in HIV-1 are unique in that they do not seem to
be derived from cellular miRNA, nor do they appear to represent viral miRNAs or their targets.
The hsa-miR-195-like sequence corresponds to the first 18 nucleotides of the mature hsa-miR-
195, which has a length of 21 nucleotides. Any functional similarity of this sequence to the
cellular hsa-miR-195 may be speculative. However, it should be noted that there are miRNAs as
short as 17nt; and that HIV-1 has been reported to encode a viral miRNA, designated TAR-3p,
whose cloned length is 17nt [46]. Further, the potential action of a miRNA is mostly dependent
on base pairing between the miRNA seed sequence and its target; positions 13-16 of the miRNA
may aid in pairing as well [31, 62]. The miR-195-like sequence we have identified in
#GU216763 contains both the seed region and positions 13-16 of hsa-miR-195 and is 100%
conserved in these regions. It has also been predicted computationally that the cellular hsa-miR-
195 may interact with the HIV-1 Nef in the 3’ LTR region based on a perfect complementarity of
a 7 nucleotide seed sequence with its viral target [57].

These viral miRNA-like sequences may mimic their cellular miRNA counterparts
Whether the microRNA-like sequences are of viral origin or products of provirus integration
millions of years ago remains to be explored. However, the fact that this sequence is part of a
functional coding region of a vital viral gene makes the integration event scenario seem unlikely.
Our data suggest that the miR-like sequences present in the HIV-1 envelope region are viral RNA
sequences, which emulate a cellular miRNA.
The implications of a viral miRNA mimicking a cellular miRNA remain speculative.
However, there is precedent for a viral miRNA outcompeting its cellular competitor. HIV-1
TAR RNA has been reported to act as a sort of miRNA ‘decoy’, decreasing the host cell’s RNAi
activity by binding and sequestering TRBP, a TAR RNA-binding protein and an essential Dicer
cofactor [63].
We therefore propose that the hsa-miR-like sequences we have identified in the Env genes of
several viruses may similarly be titrating out the related cellular miRNA targets.

The viral miRNA-like sequences are from coding regions, and from V1-V5

A major point of distinction between miRNAs and our newly identified miRNA-like sequences
is that while cellular miRNAs are derived from non-coding regions of the DNA, the miRNA-like
sequences we have identified are located in the coding regions of vital HIV-1 genes. The HIV-1
envelope glycoprotein contains 5 variable regions (V1-V5) interspersed by conserved regions
C1-C5. The miRNA-like sequences we discovered have been mapped to the V1, V2, V4 and V5
regions of the HIV-1 envelope and are integral components of the HIV-1 gene, which codes for a
functional envelope gp120 (Figure 3). The finding of several viral sequences homologous to
cellular miR-30d, miR-30e, miR-374a, miR-424 and miR-195 in different regions of the
HIV-1 genome, but primarily in the envelope regions of several HIV-1 strains, indicates that the
phenomenon of cellular miRNA-like sequences in the HIV-1 genome may be widespread.

Changes in the length of amino acid sequences or glycosylation patterns in the variable V-
regions are critical to HIV-1 infection because they can affect not only the cellular tropism but
can also modulate sensitivity to virus neutralization and disease progression [64]. Some of the
major determinants that contribute to biological activities of HIV-1 strains including replication,
viral tropism (ability to infect T-cells versus macrophages or other cell types), sensitivity to
neutralization, modulation of the CD4 antigen, and cytopathogenicity are localized in the V1 to
V5 regions of the HIV-1 envelope glycoprotein gp120 [65-67]. These sequences are critical for
effective humoral responses and virus neutralization. The V1 to V5 regions of the HIV-1
envelope have been associated with the rate of replication, virus neutralization and pathogenesis
of HIV-1 strains [68, 69]. While the primary virus replicates in the body, it becomes resistant to
neutralization because some of the specific V1 to V5 domains are either lost or have been
modified by increased numbers of N-linked glycosylation sites and therefore cellular immune
responses to HIV-1 infection are compromised [65-67].

Finally, our findings of four different miRNA-like sequences within the V1, V2, V4 and V5
regions of the HIV-1 env gene may provide new insights that will contribute to a better
understanding of the molecular complexities of HIV-1 infection and pathogenesis. The miRNA-
like sequences may emulate a non-coding cellular miRNA and therefore could represent the first
examples of human cellular homologues of miRNAs in HIV-1 coding regions. These sequences
may therefore play a role in HIV-1 replication, immunity, and virus neutralization, and thus may
influence pathogenesis in HIV-infected individuals. Constructing vectors that contain the
human microRNA-like sequences we have discovered would allow us to develop a humanized
mouse model system to test the effects of these vectors in vivo. Such experiments could
demonstrate the influence of microRNA-like sequences on virus replication, as well as on
antibody and antigen production, in vitro and in vivo.

2.5 Materials and Methods:


Source of Data: Previous proteomics and bioinformatics research in our laboratory had
identified >200 differentially expressed, functionally relevant proteins in an HIV-1 infected
CD4+ T-cell line (RH9) analyzed sequentially over a period of approximately 2 years [51, 52].
In this study, we used GeneSet2miRNA [70] to identify potential microRNAs that could impact
the activities of our differentially regulated proteins. Using a cutoff of 0.05 for the adjusted
p-value of the enrichment (adjusted for multiple testing by Monte-Carlo simulations), we
identified 7 miRNAs that may bind to multiple mRNA targets at a statistically significant
level. We also selected one other miRNA because it was identified as the best single-model
match. These 8 miRNAs are listed in Table 2.1.

Identification of homologous sequences: To identify HIV-1 sequences that could be homologous
to the selected miRNAs, we downloaded from the miRBase database (http://mirbase.org/) the
full length and mature sequences of the 8 human microRNAs that had been shown to be
significantly associated with the proteins modulated by HIV infection of CD4+ T-cells. Each of
the 8 miRNAs was used as a separate query, utilizing both the mature and full-length versions
of each miRNA. The BLAST (Basic Local Alignment Search Tool) [71] program was used to search
the entire HIV-1 databases (HIV taxid: 11676) at the National Center for Biotechnology
Information (NCBI) (http://blast.ncbi.nlm.nih.gov/) and the Los Alamos HIV databases
(http://www.hiv.lanl.gov/content/sequence/BASIC_BLAST/basic_blast.html). All full-length
and partial HIV-1 genome sequences, representative of all HIV-1 clades and strains, were used
for the analyses. These sequences have been identified by the International Committee on
Taxonomy of Viruses (ICTV) and are available in the global public databases. The outputs from
both database searches were compared, and the best matches from all microRNA query searches
were selected based on the length of the match, percentage identity of the match, lack of gaps or
deletions, and inclusion of the seed sequence.
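
The selection criteria listed above can be expressed as a simple ranking. The sketch below is illustrative only and is not the procedure used in this study: the order in which the criteria are applied, the `Hit` record, and the use of seed coverage over positions 2-7 are all assumptions, and the last hit in the example list is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    accession: str
    q_start: int      # first matched miRNA position (1-based)
    q_end: int        # last matched miRNA position (1-based)
    identities: int   # identical positions in the alignment
    gaps: int         # gap characters in the alignment

    @property
    def length(self):
        return self.q_end - self.q_start + 1

    @property
    def identity(self):
        return self.identities / self.length

    @property
    def seed_coverage(self):
        # Number of seed positions (assumed here to be 2-7) spanned by the match.
        return max(0, min(self.q_end, 7) - max(self.q_start, 2) + 1)


def rank(hits):
    """Drop gapped hits, then sort by seed coverage, match length and identity."""
    ungapped = [h for h in hits if h.gaps == 0]
    return sorted(ungapped, key=lambda h: (h.seed_coverage, h.length, h.identity), reverse=True)


hits = [
    Hit("AY169802", 3, 15, 13, 0),    # coordinates from Table 2.3
    Hit("GU216768", 1, 18, 17, 0),    # coordinates from Table 2.2
    Hit("GU216763", 1, 18, 18, 0),    # coordinates from Table 2.2
    Hit("EXAMPLE01", 1, 20, 19, 1),   # hypothetical gapped hit, for illustration only
]

for h in rank(hits):
    print(h.accession, h.length, f"{h.identity:.0%}", h.seed_coverage)
```

Sorting on a tuple keeps the criteria explicit and makes it easy to reorder them if a different priority is preferred.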

Clustal Analyses and Mapping of newly identified Sequences: The Clustal algorithm was used
for multiple sequence alignments [72, 73] (http://www.ebi.ac.uk/Tools/msa/clustalw2/).
We used the five HIV-1 sequences most homologous to hsa-miR-195, as identified by our
BLAST searches and shown in Table 2.2, as well as the sequences of 17 other representative
HIV-1 strains from 6 clades, to generate alignments using the Clustal algorithm. Complete genome
sequences from each of the representative HIV-1 strains were used to perform alignments of the
different clades with our best matches. Results from the Clustal algorithm were then checked
against the Los Alamos HIV Compendium
(http://www.hiv.lanl.gov/content/sequence/HIV/COMPENDIUM/compendium.html) to verify that
the alignments from both sources were in agreement.

In addition to defining the specificity of sequence alignments, we used the TreeDyn software
program for the construction of a sequence-based relational tree using the alignment data
generated by the Clustal algorithm (http://www.treedyn.org/). The target regions of the
alignment were then mapped to the HXB2 strain gene map from the Los Alamos National
Laboratory HIV genome database (http://www.hiv.lanl.gov/), because this is one of the
most complete reference sequence data maps available for HIV-1.

Acknowledgement
We thank Zisu Mao and Jane M.C. Chan for technical assistance with our proteomics studies
using two-dimensional gel electrophoresis and mass spectrometry, respectively.

Figure Legends

Figure 1. Phylogenetic Tree of miR-195-like sequences in different HIV clades.

The phylogenetic tree was constructed with the TreeDyn program, using the hsa-miR-195
sequence TAGCAGCACAGAAATATT for an alignment similar to that used in Table 2.5. Each leaf
represents one of the 5 African microRNA-like sequences or one of the 17 reference strains
from the 6 clades which were used for alignment. The five sequences highlighted in red
represent the 5 African microRNA-like sequences identified in Table 2.2.

Figure 2. Location of hsa-miR-195-like Sequence in HXB2 Env Gene

This figure shows the envelope region of the HIV HXB2 genome to which the hsa-miR-195-like
sequence maps. The area corresponding to the V5 region of HXB2 spans nucleotide positions
7603 through 7632 and is shaded. The hsa-miR-195-like region is also shaded. The mismatches
between the HXB2 sequence and the miR-like sequence present in the African strain #GU216763
are due to the sequence variations in envelope regions of all HIV genomes.

Figure 3. Localization of miR-like sequences in HIV envelope variable regions

This figure shows the miR-like domains listed in Tables 2.2 and 2.3, mapped to their
corresponding positions in the HXB2 envelope gene (at nucleotide positions 6682-6694,
6765-6782, 7386-7398 and 7611-7628). The miR-424-like, miR-374a-like, miR-30d-like, and
miR-195-like sequences are embedded in the hypervariable regions corresponding to V1, V2, V4
and V5 respectively and are depicted in blue. No miRNA-like sequence was detected in the V3
region of the envelope.
 

 
 

Table 2.1
MicroRNA Sequences Used For Analysis

microRNA       Length            miRBase Accession #         Chromosomal   Mature Cellular
               (Full Length /    (Full Length / Mature)      Location      miRNA Sequence
               Mature)

hsa-miR-195    87 nt / 21 nt     MI0000489 / MIMAT0000461    17p13.1       UAGCAGCACAGAAAUAUUGGC
hsa-miR-15b    98 nt / 22 nt     MI0000438 / MIMAT0000417    3q25.33       UAGCAGCACAUCAUGGUUUACA
hsa-miR-16-1   89 nt / 22 nt     MI0000070 / MIMAT0000069    13q14.2       UAGCAGCACGUAAAUAUUGGCG
hsa-miR-16-2   81 nt / 22 nt     MI0000115 / MIMAT0000069    3q25.33       UAGCAGCACGUAAAUAUUGGCG
hsa-miR-30d    70 nt / 22 nt     MI0000255 / MIMAT0000245    8q24.22       UGUAAACAUCCCCGACUGGAAG
hsa-miR-30e    92 nt / 22 nt     MI0000749 / MIMAT0000692    1p34.2        UGUAAACAUCCUUGACUGGAAG
hsa-miR-374a   72 nt / 22 nt     MI0000782 / MIMAT0000727    Xq13.2        UUAUAAUACAACCUGAUAAGUG
hsa-miR-424    98 nt / 21 nt     MI0001446 / MIMAT0001341    Xq26.3        CAGCAGCAAUUCAUGUUUUGAA
 

 
 
 

 
 
 

 

Table 2.2
Sequence Similarity Between Human miR-195 and HIV Envelope Genes

In each alignment, the top line is the hsa-miR-195 (MI0000489) mature sequence and the
bottom line is the env region of the respective virus.

Gene Name /      Accession # /      Alignment                       Nucleotide     Seed
Description      Subtype /                                          Matches        Matches
                 Country of Origin

HIV-1 Isolate    GU216763              1 tagcagcacagaaatatt 18      18/18 (100%)   6/6 (100%)
#169b1a12        Subtype C               ||||||||||||||||||
Envelope gene    South Africa       1401 tagcagcacagaaatatt 1418

HIV-1 Isolate    GU216768              1 tagcagcacagaaatatt 18      17/18 (94%)    5/6 (83%)
#169b1c4         Subtype C               |||| |||||||||||||
Envelope gene    South Africa       1404 tagcggcacagaaatatt 1421

HIV-1 Isolate    GU216773              1 tagcagcacagaaatatt 18      17/18 (94%)    5/6 (83%)
#169b1d7         Subtype C               |||| |||||||||||||
Envelope gene    South Africa       1392 tagcggcacagaaatatt 1409

HIV-1 Isolate    HM215313              1 tagcagcacagaaatatt 18      17/18 (94%)    5/6 (83%)
#401-F1_8_10     Subtype CD              ||||| ||||||||||||
Envelope gene    Tanzania           1362 tagcaccacagaaatatt 137

HIV-1 Isolate    DQ199139              1 tagcagcacagaaatatt 18      17/18 (94%)    6/6 (100%)
#TZB0573         Subtype C               |||||||||||||| |||
Envelope gene    Tanzania            585 tagcagcacagaaacatt 602

HIV-1 isolates #169b1a12, #169b1c4, and #169b1d7 are from the same patient.
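
The match counts in this table can be recomputed directly from the aligned sequences. The following is a minimal sketch under two assumptions: the alignments are ungapped and of equal length, and the six seed positions scored in the table are positions 2-7 of the miRNA. The function names are illustrative.

```python
QUERY = "tagcagcacagaaatatt"   # hsa-miR-195 positions 1-18, as aligned in Table 2.2

ENV_MATCHES = {                # accession -> env region from Table 2.2
    "GU216763": "tagcagcacagaaatatt",
    "GU216768": "tagcggcacagaaatatt",
    "GU216773": "tagcggcacagaaatatt",
    "HM215313": "tagcaccacagaaatatt",
    "DQ199139": "tagcagcacagaaacatt",
}


def match_line(query, subject):
    """Build the '|' / ' ' line shown between two ungapped aligned sequences."""
    return "".join("|" if q == s else " " for q, s in zip(query, subject))


def count_matches(query, subject, start=1, end=None):
    """Count identical positions between 1-based, inclusive positions start..end."""
    end = len(query) if end is None else end
    return sum(q == s for q, s in zip(query[start - 1:end], subject[start - 1:end]))


summary = {}
for accession, env in ENV_MATCHES.items():
    total = count_matches(QUERY, env)
    seed = count_matches(QUERY, env, 2, 7)   # assumed six-position seed window
    summary[accession] = (total, seed)
    print(f"{accession}  {match_line(QUERY, env)}  {total}/{len(QUERY)}  seed {seed}/6")
```

A mismatch lowers the seed score only when it falls inside the seed window, which is why DQ199139 (mismatch at position 15) retains a full seed while GU216768 (mismatch at position 5) does not.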
 
 

 
 

 
 
 

Table 2.3
Sequence Similarity Between Three Mature Human microRNAs and HIV Envelope Genes

In each alignment, the top line is the hsa-miR sequence and the bottom line is the env
region of the respective virus.

Cellular miRNA   Accession # /      Gene Name /        Alignment                      Nucleotide     Seed
(Mature          Subtype /          Description                                       Matches        Matches
Sequence)        Country of Origin

hsa-miR-30d      AY169802           HIV-1 Strain          3 taaacatccccga 15          13/13 (100%)   5/6 (83%)
                 Group O            #98CMA104               |||||||||||||
                 Cameroon           Complete genome    6901 taaacatccccga 6889

hsa-miR-374a     AJ429907           HIV-1 Strain          4 taatacaacctgataag 20      13/17 (76%)    4/6 (67%)
                 Group M            #00NE079                |||||| |  |||| ||
                 Subtype 6cpx       Envelope            175 taataccaattgatcag 191
                 Niger

                 AF391235           HIV-1 Clone           2 tataatacaacctgataa 19     17/18 (94%)    6/6 (100%)
                 Group M            #TV006c9.1              ||||||||||| ||||||
                 Subtype C          Envelope            542 tataatacaacttgataa 559
                 South Africa

hsa-miR-424      GU080167           HIV-1 Clone           6 gcaattcatgtttt 19         14/14 (100%)   2/6 (33%)
                 Group M            704MC009F               ||||||||||||||
                 Subtype C          Envelope            417 gcaattcatgtttt 404
                 South Africa

 
 
 
 

 
 
 
 
 
Table 2.4
Sequence Similarity Between Full Length Human microRNAs and HIV Genes

Cellular       Accession # /      Gene Name /       Alignment                                            Nucleotide    Seed
miRNA          Subtype /          Description                                                            Matches       Matches
               Country of Origin

hsa-miR-30e    FJ147129           HIV-1 Isolate        1 tgtaaacatccttgactggaag 22                       17/22 (77%)   4/6 (67%)
               Subtype B          #VC2T2C1               | ||||   | |||||||||||
               US                 Envelope gene      279 tttaaattgcgttgactggaag 300

               FJ147130           HIV-1 Isolate        1 tgtaaacatccttgactggaag 22                       17/22 (77%)   4/6 (67%)
               Subtype B          #VC2T2C2               | ||||   | |||||||||||
               US                 Envelope gene      273 tttaaattgcgttgactggaag 294

               U13543             HIV-1 Isolate        1 tgtaaacatccttgactggaag 22                       18/22 (82%)   5/6 (83%)
               Subtype D          #93UG059               | |||||  | |||||||||||
               Uganda             Envelope gene       45 tttaaactgcattgactggaag 66

               DQ208474           HIV-1 Isolate        1 tgtaaacatccttgactggaag 22                       18/22 (82%)   5/6 (83%)
               Subtype AD         ML35.W0M.G2            | |||||  | |||||||||||
               Kenya              Envelope gene      381 tttaaactgcattgactggaag 402

               DQ208473           HIV-1 Isolate        1 tgtaaacatccttgactggaag 22                       18/22 (82%)   5/6 (83%)
               Subtype AD         ML35.W0M.F3            | |||||  | |||||||||||
               Kenya              Envelope gene      381 tttaaactgcattgactggaag 402

hsa-miR-424    AM181808           HIV-1 Isolate       33 gtgttctaaatggttcaaaacgtgaggcgctgctatac 70       29/38 (76%)   4/6* (67%)
               Subtype 13cpx      #01CMVP/CE             |||||||||||||||| |||  |   |   ||||||||
               Cameroon           Gag-Pol           1049 gtgttctaaatggttctaaaattttcgtcatgctatac 1012

               GU207082           HIV-1 isolate       33 gtgttctaaatggttcaaaacgtgaggcgctgctatac 70       29/38 (76%)   4/6* (67%)
               Subtype 13cpx      #VP_CE_104             |||||||||||||||| |||  |   |   ||||||||
               Cameroon           Pol gene           817 gtgttctaaatggttctaaaattttcgtcatgctatac 780

               GQ344965           HIV-1 isolate       33 gtgttctaaatggttcaaaacgtgaggcgctgcta 67          27/35 (77%)   4/6* (67%)
               Subtype AG         #06CM06BDH             |||||||||||||||| |||  |  ||   |||||
               Cameroon           Pol gene           520 gtgttctaaatggttctaaaattttggtcatgcta 486

hsa-miR-30d    GQ288251           HIV-1 isolate       23 ggaagctgtaagacacag 40                           18/18 (100%)  0/6 (0%)
               Subtype B          #3077_051503           ||||||||||||||||||
               US                 Pol gene           120 ggaagctgtaagacacag 103

* Represents seed matches which are in the reverse complementary strand

Table 2.5: Sequence Homology Domains in the V5 regions of HIV Envelope

Description    hsa-miR-195 Sequence          Matches   Accession
               Alignment With V5 Regions

hsa-miR-195    TAGCAGCACAGAAATATT
               ||||||||||||||||||
M-Clade C      TAGCAGCACAGAAATATT            18/18     GU216763
M-Clade C      TAGCGGCACAGAAATATT            17/18     GU216773
M-Clade C      TAGCGGCACAGAAATATT            17/18     GU216768
M-Clade C      TAGCACCACAGAAATATT            17/18     HM215313
M-Clade C      TAGCAGCACAGAAACATT            17/18     DQ199139

M-Clade C      TAGCACAAAAGAGATATT            14/18     U46016
M-Clade G      TAGCACTAGTGAGATCTT            12/18     AF084936
M-Clade B      GAACCAGACCGAGATCTT            11/18     U21135
M-Clade B      TAATGACACCGAGGTCTT            10/18     K02007
M-Clade E      TGAGACCATCGAAACCTT            10/18     AF197341
M-Clade B      TAATGATACCGAGACCTT            9/18      AF004394
M-Clade B      --AAGACACTGAGATCTT            9/18      AF042101
M-Clade A      TACAAAAAATGAGACCTT            9/18      M62320
M-Clade A      CAGTGAACCTGAAACCTT            9/18      AF484509
M-Clade A      CAATGTAAATGAAACCTT            8/18      AF004885
M-Clade E      TGCGACTAATGAGACCTT            8/18      AF197340
M-Clade D      --GTACTAACGAGACCTT            8/18      K03454
M-Clade B      --ATGGGTCCGAGATCTT            8/18      K02013
M-Clade B      --ATGAGACCGAGACCTT            7/18      D10112
M-Clade B      --ATGAGTCCGAGATCTT            7/18      AF033819
M-Clade B #    --ATGAGTCCGAGATCTT            7/18      K03455
M-Clade B      --ATGAGTCCGAGATCTT            7/18      D86069
                         **    **

Selected alignments of V5 regions of 22 different HIV-1 strains containing the
hsa-miR-195-like sequence. The hsa-miR-195 sequence was first aligned to the 5 African
strains (shaded and also shown in Table 2.2), and then aligned with 17 representative HIV
strains from different clades as determined by the ClustalW2 algorithm. The mismatches
between the clades are due to the variability of the HIV envelope sequences, including V5
regions.
# Represents the HIV HXB2 strain which was used for gene mapping in Figure 2.
Asterisks represent positions of 100% conservation.
 

Table 2.6
MicroRNA-like sequence alignments with cellular genome sequences.

miR-195-like:            1 TAGCAGCACAGAAATATT 18
                           ||||||||||||||||||
Human chr 17:      6524380 TAGCAGCACAGAAATATT 6524363

miR-30d-like:            1 TAAACATCCCCGA 13
                           |||||||||||||
Human chr 8:      49090730 TAAACATCCCCGA 49090718

miR-374a-like:           1 TATAATACAACCTGATAA 18
                           ||||||||||||||||||
Human chr X:      11825168 TATAATACAACCTGATAA 11825151

miR-424-like:            1 GCAATTCATGTTTT 14
                           ||||||||||||||
Human chr X:      17948436 GCAATTCATGTTTT 17948423

BLAST search result and alignment using miR-195-like sequence against the human genome
sequence and locations on the chromosomal DNAs.  

 
 

Chapter 3

High-throughput Analysis of Global HIV-1 Sequences Identifies and
Maps Full-Length Mature MicroRNAs within HIV-1 Genomes

3.1 Abstract
While the biogenesis and functions of small non-coding microRNAs (miRNAs) have been
extensively studied in plants, invertebrates and vertebrates (including humans), the roles of these
regulatory sequences in retroviruses have not been well explored. Encouraged by
our recent report of human “microRNA-like” sequences within the protein-coding envelope
sequence of several HIV-1 isolates, we conducted high-throughput analyses of all 400,000
genomic and sub-genomic HIV-1 sequences from GenBank and systematically compared each
sequence with the entire catalogue of about 2,500 microRNA (miRNA) entries in miRBase. This
exhaustive search resulted in more than 250,000 matches that contained the microRNA “seed”
of both cellular and viral sequences. A series of multifactorial algorithms and bioinformatics
filters was applied to remove arbitrary sequences that matched only the microRNA “seed.”
These selective processes resulted in the discovery of 15 mature, full length (~22 nucleotides)
human microRNAs within different protein-coding regions of 20 distinct HIV-1 strains from
different regions of the globe. These strains belong to multiple clades, including the B-clade,
which circulates predominantly in the United States and European countries. Subsequent
mapping revealed that the 15 microRNA sequences were embedded not only within the HIV
envelope gp120-coding sequences but also in the gag, pol, nef and LTR regions of different
HIV-1 isolates.

3.2 Introduction:
Background
MicroRNAs (miRNAs) are a class of small (17-25 nucleotides in length), noncoding RNA
molecules that are thought to be essential regulators of post-transcriptional eukaryotic gene
expression. By binding to cellular mRNAs through sequence complementarity, they are able to
regulate protein expression via mRNA degradation or translational repression [30, 31].
MicroRNAs are found in plants, fungi, and animals [74, 75], as well as human viruses [76]. The
discovery of miRNAs in plant and animal viruses has highlighted their importance as potential
antiviral targets.
Since their discovery in 1993 in the model system C. elegans [3, 77], advances in technology
have fueled the discovery of thousands of miRNAs. The human genome contains over 2,500
miRNA genes [78] that have been predicted or experimentally shown to play critical roles in
normal cellular functions, such as differentiation and apoptosis, as well as regulating viral gene
expression. Some estimates put the percentage of mRNAs that are regulated by miRNAs as high
as 60% [79], which underscores the crucial role that these noncoding RNAs play in cellular
translation. More recently, evidence has been building that miRNAs may act at the level of
transcriptional regulation as well [80].
A miRNA initiates its interaction with a target mRNA through complementary binding of a short
‘seed sequence’, defined as nucleotides 2-8 of the miRNA. Thus, through this short interaction,
a given miRNA may regulate dozens of mRNAs, and these mRNAs may in turn be regulated by
multiple miRNAs [81, 82]. The complex and interwoven nature of these interactions leads to the
conclusion that changes in the expression of a few miRNAs can bring about rapid changes in the
expression of multiple genes [83]. Upregulation or downregulation of a given miRNA can
dysregulate protein expression profiles and therefore result in disruption of normal biological
processes such as cell proliferation, development, differentiation, apoptosis, and signal
transduction [30, 32, 33]. Therefore, the endogenous miRNA pathway represents a modality by
which the expression of multiple cellular genes can be intricately and precisely regulated. It also
represents an innate immune response by the cell to virus challenge [84-86].
While the majority of miRNAs identified are cellular in origin, DNA viruses such as the
herpesviruses, including Epstein-Barr virus, are known to encode miRNAs [34]. Whether or not
RNA viruses, including the retrovirus HIV-1, encode their own viral miRNAs is a topic of much
debate. While many researchers have reported the isolation of HIV-1 encoded miRNAs [45, 47, 49],
others report that they do not exist [87].
What is clear is that cellular miRNAs can affect HIV-1 replication and pathogenesis, and that
HIV-1 infection alters microRNA expression profiles. Since Yeung et al. first demonstrated
changes in miRNA expression profiles in HIV-transfected cells [38], many specific miRNAs
have been found that either enhance or inhibit HIV-1 replication. For example, it has been
shown that miR-132 enhances viral replication [88], as do miR-34a [89], miR-146a [90] and
miR-222 [91]. Meanwhile, miR-29a, miR-196b, miR-1290, and miR-155 are a few of the
microRNAs implicated in viral latency [92-94].

The restriction of HIV-1 in resting CD4+ T cells appears to be miRNA-mediated [95], and HIV-
1 infection has also been shown to suppress circulating viral restriction miRNAs [96].
Additionally, it has been shown that miRNAs can bind to the nucleocapsid domain of the Gag
protein, and the resulting miRNA-Gag complexes interfere with assembly, therefore inhibiting
viral budding and production [97].
Indeed, even the microRNA machinery itself is susceptible to viral interference. The HIV-1
protein Vpr has been shown to target Dicer, the integral endonuclease of the microRNA
biogenesis pathway, for proteasomal degradation in macrophages [98].
The identification of cellular or virally encoded miRNAs could be very useful as an aid in drug
design. For example, cellular miR-122 has been shown to be essential for hepatitis C virus (HCV)
replication [99], and an antisense inhibitor of miR-122, developed as miravirsen, has been
shown to reduce HCV viremia in chimpanzees [100] and is in further trials [101].
Currently, three HIV-encoded vmiRNAs are listed in miRBase: miR-H1, miR-N367 and miR-
TAR-5p/3p. These vmiRNAs have been reported to target “viral” transcripts, as in the case of
miR-N367 [44], or host cellular factors as happens for miR-H1 and miR-TAR-5p/3p [45, 46].
More recent research seems to support the presence of miRNAs in viral genomes:
Ouellet et al. [47] showed that functional miRNAs are processed from the HIV-1 TAR element;
Yeung et al. [48] have identified small non-coding RNAs in HIV-1 infected cells and
corroborated the existence of virally encoded miRNAs in nef and TAR; and Schopman et al.
[49] have identified HIV-encoded small RNAs in virus-infected cells.
Our research has focused on the possibility that retroviruses, and HIV-1 in particular, may harbor
viral homologues of cellular miRNAs. This would comprise a third class of miRNAs involved
in viral infection: viral miRNAs that essentially mimic the function of their
natural cellular miRNA counterparts. While retroviruses are undoubtedly targeted by a large
number of human cellular miRNAs [50], and while retroviruses may possibly encode their own
unique vmiRNAs, the possible existence of viral homologues of cellular miRNAs has not been
experimentally validated.

Previous Research
In previous research [102] we used bioinformatics software (GeneSet2miRNA) to identify 8
human miRNAs that were predicted to be co-involved in the dysregulation of proteins expressed
by chronically infected H9 cells [51, 52].
Using both the mature 22 nt and full length ~70nt sequences of each of the 8 cellular miRNAs
mentioned above as query sequences, we used multiple bioinformatics and computational tools
to align and scrutinize potential matches in HIV-1 genomes.
The investigation of these 8 miRNAs revealed that several HIV-1 isolates contained homologues
of the mature miR-195 sequence. Three of the other miRNAs, miR-30d, miR-374a, and miR-424,
also had homologous sequences in various HIV-1 strains. These miRNA-like sequences all
mapped to the envelope (env) region of the HIV-1 genome, and specifically to the variable (V)
regions of this gene.
These promising results, based on the examination of only 8 miRNAs, prompted us to ask how
many more miRNA-like sequences there might be in HIV-1 genomes. We therefore performed a
global search of all known human miRNAs for homology with all known HIV-1 sequences.
Herein, we report the discovery of 15 novel microRNA sequences, including miR-195, contained
within the protein-coding regions of various global strains of HIV-1 (Table 3.1). This global
search reveals a clustering of these viral miRNA sequences primarily within the HIV-1 envelope
region (Figure 3.6). In addition, one of our findings, miR-4644, is present in 34 distinct HIV-1
isolates which are distributed globally. The miRNAs discovered are further characterized in this
report and could provide new insights into the viral replication and pathogenesis of HIV-1.

3.3 Results:

Identifying MicroRNA Sequences Embedded in the HIV Genome


Our last paper, “Identification of Human MicroRNA-like Sequences Embedded within the
Protein-Encoding Genes of the Human Immunodeficiency Virus” [102], which is included in this
thesis as Chapter Two, focused on the search for sequences within the HIV-1 genome that were
homologous to cellular miRNAs.
Based on prior proteomics analysis, we focused our efforts on 8 microRNAs that we believed to
be involved in the activity of dysregulated proteins present in RH9 T-cells chronically infected
with HIV-1. As an example, pappalysin-1, or pregnancy-associated plasma protein A (PAPPA), is a
protein found to be downregulated by HIV-1 infection. Our analyses indicated that the 3’
untranslated region (3’UTR) of PAPPA contains a sequence (5’-CUCUCCA-3’) that is
complementary to the seed sequence of the mature hsa-miR-4644 (5’-UGGAGAG-3’), which
suggests an affinity for, and interaction with, the mRNA of this protein. PAPPA is a
metalloproteinase which cleaves insulin-like growth factor binding proteins.
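
The complementarity between the quoted seed and the quoted 3’UTR site can be verified by taking the reverse complement of the seed. The sketch below is illustrative: the two 7-nt sequences come from the text above, whereas the surrounding UTR fragment is invented purely to exercise the search and is not the PAPPA sequence.

```python
def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def find_seed_sites(utr, seed):
    """Return 1-based start positions of perfect seed-complementary sites in utr."""
    site = reverse_complement_rna(seed)
    width = len(site)
    return [i + 1 for i in range(len(utr) - width + 1) if utr[i:i + width] == site]


SEED_4644 = "UGGAGAG"   # seed quoted in the text for hsa-miR-4644
SITE = "CUCUCCA"        # 3'UTR site quoted in the text

# Hypothetical UTR fragment: four invented flanking bases on each side of the quoted site.
toy_utr = "AAGC" + SITE + "GUUA"

print(reverse_complement_rna(SEED_4644))
print(find_seed_sites(toy_utr, SEED_4644))
```

Only Watson-Crick pairs are considered here; G:U wobble pairing, which some target-prediction tools allow, is deliberately left out of this sketch.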
Through multiple bioinformatics analyses and computer searches including BLAST (Basic Local
Alignment Search Tool) and Clustal, we were able to show that there were 4 microRNA-like
sequences (hsa-miR-195, hsa-miR-30d, hsa-miR-374a and hsa-miR-424) present in the variable
regions of the HIV-1 envelope of several different strains. This promising finding prompted us
to expand our search in an effort to find other homologues and potential relationships between
cellular miRNAs and HIV-1. Our basic workflow proceeded as follows:
Summary:
1)   Assemble all known human miRNAs
2)   Perform global search using all known HIV genomic and subgenomic sequences
3)   Select best matches using adjusted p-value
4)   Remove matches without seed
5)   Group these best matches by miRNA species
6)   Separate viral sequence matches from cellular integration site matches

1) Assemble all known human miRNAs -> 2,588 miRNAs


Our previous paper was based on the selective search for 8 microRNA sequences in the HIV
genome which were chosen based on computational analyses of predicted HIV dysregulated
protein-miRNA interactions. In order to find out what other miRNA sequences might be present
in the HIV genome, we decided to compare all known human miRNA sequences with all known
HIV-1 sequences.

To this end, we designed and developed software to compile all known mature human miRNA
sequences from the miRBase database (http://mirbase.org/) (Table 3.3). The output file
generated by this software would serve as our master query list. We defined the search space to
be all HIV-1 sequences contained in the NCBI database, which is a global set of databases
containing more than 3,000 complete HIV-1 genomes and over 400,000 subgenomic HIV-1
sequences and represents all known HIV-1 sequence data.
We ran the BLAST program using all of the human microRNAs extracted from the miRBase
database as queries and the NCBI HIV-1 database as the search space. miRBase contains both
experimentally validated and predicted human miRNAs, so the sequences that were analyzed
are likely not all actual miRNAs. We obtained 2,588 human miRNA sequences from miRBase, and
we opted to use all of these known and predicted miRNA sequences in our effort to elucidate
all possible homologies.
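
Compiling a query list of this kind amounts to filtering a FASTA file of mature sequences for human entries. The following is a minimal sketch and not the software developed for this study. It assumes that each FASTA header begins with the miRNA name, that human entries carry the "hsa-" prefix, and that sequences are converted from the RNA to the DNA alphabet before searching nucleotide databases. The embedded sample stands in for a downloaded file; its two human records are taken from Table 2.1 and its third record is a hypothetical placeholder.

```python
def read_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def human_mature_queries(text):
    """Return {miRNA name: DNA-alphabet sequence} for entries whose name starts with 'hsa-'."""
    queries = {}
    for header, seq in read_fasta(text):
        name = header.split()[0]
        if name.startswith("hsa-"):
            queries[name] = seq.upper().replace("U", "T")
    return queries


# Stand-in for a downloaded mature-sequence file (see the assumptions above).
sample = """>hsa-miR-195 MIMAT0000461
UAGCAGCACAGAAAUAUUGGC
>hsa-miR-30d MIMAT0000245
UGUAAACAUCCCCGACUGGAAG
>xxx-miR-0000 PLACEHOLDER
ACGUACGUACGUACGUACGUA
"""

queries = human_mature_queries(sample)
for name, seq in queries.items():
    print(name, len(seq), seq)
```

Keeping the parser separate from the species filter makes it straightforward to rebuild the list when the database is updated.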
2) Perform global search using all known HIV genomic and subgenomic sequences ->
250,000 Search Results
Using our list of all reported human miRNA sequences from miRBase, we performed thousands
of individual BLAST searches. An exhaustive search of all globally available HIV-1 sequences
yielded thousands of potential matches. By compiling the results of each individual search into a
master list, we put together a preliminary list of candidate matches between all known miRNAs
and all known HIV-1 sequences. This master list contained approximately 250,000 individual
entries generated from search results. Each one of these 250,000 potential matches was
subsequently screened on the basis of the degree of homology and degree of sequence continuity
with the full-length sequences of each of the more than 2,500 human cellular miRNAs.
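Screening by "degree of sequence continuity" can be sketched as finding the longest uninterrupted run shared by a miRNA and a target sequence. The sketch below is a plain dynamic-programming longest-common-substring search, not BLAST, and it assumes the miRNA is given as RNA (U) while the target is DNA (T). The function name and the synthetic target are hypothetical.

```python
def longest_contiguous_match(mirna_rna, target_dna):
    """Longest exact, gap-free run shared by a miRNA and a DNA target.

    Returns 1-based inclusive coordinates on both sequences, or None.
    """
    query = mirna_rna.upper().replace("U", "T")  # compare in the DNA alphabet
    target = target_dna.upper()
    best_len, best_q_end, best_t_end = 0, 0, 0
    prev = [0] * (len(target) + 1)
    for i in range(1, len(query) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if query[i - 1] == target[j - 1]:
                # Extend the run that ended at the previous position pair.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_q_end, best_t_end = curr[j], i, j
        prev = curr
    if best_len == 0:
        return None
    return {
        "length": best_len,
        "mirna_start": best_q_end - best_len + 1,
        "mirna_end": best_q_end,
        "target_start": best_t_end - best_len + 1,
        "target_end": best_t_end,
    }


MIR_195_5P = "UAGCAGCACAGAAAUAUUGGC"  # sequence as listed in Table 3.1
# Synthetic target: the first 18 nt of the miRNA (as DNA) between arbitrary flanks.
synthetic_target = "GGG" + MIR_195_5P.replace("U", "T")[:18] + "CCC"

hit = longest_contiguous_match(MIR_195_5P, synthetic_target)
print(hit)
```

On this synthetic target the search reports an 18-nucleotide run starting at miRNA position 1, which is the kind of length and start position used in the later filtering steps.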
3) Select best matches using adjusted p-value -> 322 matches
The search returned a large number of matches, many of which were not statistically significant.
By sorting the matches according to the adjusted p-value of the enrichment (adjusted for
multiple testing by Monte-Carlo simulations) and applying a cutoff of 0.10, we were able to trim
our preliminary list of matches down to 322 candidates.

4) Remove matches without seed -> 219 matches


From these results, we further reduced the candidates by removing those matches which do not
contain a seed sequence. The ‘seed’ sequence is typically defined as nucleotide positions 2-8 of
the mature miRNA, with the minimum base pairing between any given miRNA and its target
sequence occurring at positions 2-7. Therefore, any HIV sequence candidate match to a cellular
miRNA sequence would presumably contain at least seed positions 2-7. It is known that
there can be a limited amount of mismatch within the seed sequence, typically on the order of 1
nucleotide, so we relaxed our criteria to require, at a minimum, nucleotides 3-7 of the seed.
Any candidate whose match does not start at position 1, 2, or 3 therefore does not contain a
sufficient seed sequence, and such candidates were eliminated.
This step reduced our 322 candidates down to 219.
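The seed criterion described above can be written as a simple positional test. This is a minimal sketch: the function name, its parameters, and the three example candidates are hypothetical, and it assumes each candidate is described by the 1-based miRNA positions at which its continuous match starts and ends.

```python
def has_sufficient_seed(match_start, match_end, latest_start=3, last_required=7):
    """True if a continuous match covers at least miRNA positions 3-7.

    A match that starts at position 1, 2, or 3 and extends through
    position 7 contains the minimal seed described in the text.
    """
    return match_start <= latest_start and match_end >= last_required


# Hypothetical candidates: (label, match start, match end) on the miRNA.
candidates = [
    ("full-length match", 1, 22),
    ("match starting at position 3", 3, 20),
    ("match starting at position 4", 4, 21),
]

kept = [label for label, start, end in candidates if has_sufficient_seed(start, end)]
print(kept)
```

The first two candidates are retained, while the match starting at position 4 is removed because it lacks position 3 of the seed.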

5) Group these best matches by miRNA species -> 27 miRNAs
The remaining 219 candidates were then sorted into groups by the particular miRNA species to
which they matched. As stated previously, the candidate matches are viral sequences extracted
from the NCBI database based on homology with a particular cellular miRNA. Although there
were 219 unique viral sequence candidate matches, they corresponded to only 27 distinct
cellular miRNAs, and these 27 miRNAs had multiple matches.
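Grouping candidate matches by miRNA amounts to collecting the matching database entries under each miRNA name. The sketch below uses a small subset of the miRNA and accession pairs that appear in Table 3.1 purely as illustrative input; the function name is hypothetical.

```python
from collections import defaultdict


def group_by_mirna(pairs):
    """Collect HIV accession numbers under the miRNA they matched."""
    groups = defaultdict(list)
    for mirna, accession in pairs:
        groups[mirna].append(accession)
    return dict(groups)


# Illustrative subset of (miRNA, HIV accession) pairs taken from Table 3.1.
matches = [
    ("miR-586", "JF294164"),
    ("miR-586", "EF063092"),
    ("miR-586", "Z83303"),
    ("miR-6763-5p", "DQ322230"),
    ("miR-6763-5p", "AF019395"),
    ("miR-6763-5p", "U43141"),
    ("miR-195-5p", "GU216763"),
]

groups = group_by_mirna(matches)
print(len(matches), "matches in", len(groups), "miRNA groups")
for mirna, accessions in groups.items():
    print(mirna, accessions)
```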

6) Separate viral sequence matches from cellular integration site matches -> 15 miRNAs
The final step in our analysis of the 27 miRNA-matching groups of sequences was to further
divide them into two groups based on whether the candidate matching sequences were viral or
cellular in origin.
One group, composed of 15 miRNAs, all matched HIV-1 viral sequences in various and different
regions of the HIV-1 genome, and thus these matching sequences were considered viral in
origin.
The second group, composed of 12 miRNAs, were all associated with matching sequences from
HIV-1 integration sites. Upon further inspection it was found that in fact these 12 miRNAs
matched with the cellular portion of the HIV-1 integration site. In other words, the match was
between our query miRNA and cellular DNA sequences which flanked an HIV-1 integration site.
It is worth elaborating on how these cellular sequences appeared in our list of matches.
We defined our search space as all known HIV-1 sequences, which includes full length viral
genomes as well as subgenomic HIV-1 sequence fragments or gene sequences. However, any
sequences which are part of an HIV-1 integration site are also contained in this search space.
Most database entries for viral integration sites also contain, as part of their sequence data,
flanking regions of cellular DNA immediately up- or downstream of the actual integration site. In
this way, some cellular DNA sequence data was introduced into our search because those
cellular DNA sequences were adjacent to, and thus associated with,
HIV-1 viral sequences. While this group of miRNAs associated with integration sites represents
a unique and interesting find, we decided to limit our search to those candidate matches that were
directly associated with HIV-1 genomic and subgenomic sequences.
Therefore, from our initial search involving nearly 2,600 known human miRNA sequences, by
conducting multiple statistical searches and bioinformatics analyses, along with stepwise
filtering of candidate results, we were able to find and narrow matching candidate sequences to a
final number of 15 matching human cellular miRNAs.
Our workflow is summarized in Figure 3.1.

Figure 3.1
Workflow for Identifying MicroRNA Sequences
Embedded within the HIV Genome

Source and assemble all ≈25,000 known mature miRNAs
Separate by species-sorting algorithm -> ≈2,500 known mature human miRNAs
Perform quantitative comparison search using all known HIV sequences -> 250,000 HIV sequence 'hits' or candidate matches
Filter by statistical significance -> 322 matches
Remove candidates without 'seed' -> 219 matches
Group into miRNA families -> 27 miRNAs
Filter for viral sequences (non-cellular) -> 15 miRNAs

Final 15 miRNAs:
miR-586, miR-4644, miR-195, miR-6763, miR-4652,
miR-6774-3p, miR-6875-3p, miR-548ah-3p, miR-548am-3p, miR-5197,
miR-548av-3p, miR-6124, miR-6766, miR-7151, miR-7156

Figure 3.1: Workflow for Identifying microRNA Sequences Embedded within the HIV Genome
The figure shows the steps of the initial analysis, in which all known human mature
miRNAs were compared to all known HIV genomic and sub-genomic sequences in global databases.
Steps are shown in green, the various filtering steps are shown in blue, and the final 15 miRNAs
are listed individually in orange.

Discovery of 15 Novel microRNA Sequences in Global HIV Genomes


Based on these analyses, we have identified 15 human microRNA sequences, including the
previously identified miR-195, which have perfect or nearly perfect homologous sequences in 20
HIV-1 genomes contained in the NCBI database. These 15 miRNA sequences are listed in Table
3.1.

The 15 miRNAs shown in Table 3.1, along with their regions of homology, are listed in order
from the longest length of match to the shortest. The longest match that we found was between
human miR-586 and three different HIV-1 isolates from Zambia, Portugal, and Romania. This
match was 22 nucleotides long and covered the entire mature miRNA, although it should be
noted that each of the alignments between miR-586 and these isolates contained one gap
(reported as 96% homology in Table 3.2).
The next longest match was with miR-6763, which also matched three different
HIV-1 isolates. These isolates were from Canada, Spain, and Germany. Like miR-586, the
region of homology represented a 100% match between human miR-6763 and the three
HIV-1 isolates over the full miRNA length of 19 nucleotides, in this case with no gaps.
Besides these two miRNAs which show 100% homology across their entire length, Table 3.1
lists 6 more miRNAs which show spans of continuous and uninterrupted homology
18 nt long with 7 different HIV-1 isolates. Although these matches do not cover the entire
miRNA, they do contain the miRNA seed.
The last 7 miRNAs represent sequence matches which are 17 nt long. These matching sequences
are also 100% continuous and uninterrupted matches, and like the others, they also contain the
seed region.
It should be noted that all of the matching sequences shown in Table 3.1, except those for
miR-4652-5p and miR-6774-3p (which match 5 of 6 seed positions), contain the entire seed
region, which is deemed to be crucial to miRNA function.
All of the sequences listed in Table 3.1 show 100% homology across all or the majority of the
length of their corresponding miRNA, including the seed sequence. Two of the miRNAs listed,
miR-586 and miR-6763-5p, have matching sequences spanning their entire length in 6 HIV
genomes.

While 11 of the miRNAs identified in our search have a single match with a single HIV-1
isolate, there are 4 miRNAs which match multiple isolates: miR-6875-3p (miRBase accession
#MIMAT0027651) matches two HIV-1 isolates from Australia; miR-586 (#MIMAT0003252)
matches three isolates from Zambia, Portugal, and Romania; miR-6763-5p (#MIMAT0027426)
matches three isolates from Canada, Spain, and Germany; and miR-4644 (#MIMAT0019704)
matches 34 different isolates from various regions around the world. The significance of the
multiple matches for miR-4644 in various isolates will be further discussed in the next
chapter.

Table 3.1
Identification of 15 microRNAs Embedded in HIV Genomes
From Various Global Regions

MicroRNA/     MicroRNA Sequence/          HIV Isolate/         Clade  Country of    Length of  Seed
Accession #   Location in chromosome      Accession #                 Origin        Match

miR-586       UAUGCAUUGUAUUUUUAGGUCC      PP24A-A0564-ESM11B9  C      Zambia        22/22*     6/6
MIMAT0003252  chr6: 45197674-45197770     JF294164

miR-586       UAUGCAUUGUAUUUUUAGGUCC      envNTM62/54          G      Portugal      22/22*     6/6
MIMAT0003252  chr6: 45197674-45197770     EF063092

miR-586       UAUGCAUUGUAUUUUUAGGUCC      RO-BCI9              F1     Romania       22/22*     6/6
MIMAT0003252  chr6: 45197674-45197770     Z83303

miR-6763-5p   CUGGGGAGUGGCUGGGGAG         HDM003V09-5          B      Canada        19/19*     6/6
MIMAT0027426  chr12: 132581997-132582061  DQ322230

miR-6763-5p   CUGGGGAGUGGCUGGGGAG         NP625                B      Spain         19/19*     6/6
MIMAT0027426  chr12: 132581997-132582061  AF019395

miR-6763-5p   CUGGGGAGUGGCUGGGGAG         HAN-2                B      Germany       19/19*     6/6
MIMAT0027426  chr12: 132581997-132582061  U43141

miR-4652-5p   AGGGGACUGGUUAAUAGAACUA      KRB902899            B      South Korea   18/22      5/6
MIMAT0019716  chr7: 93716928-93717005     KC690469

miR-6774-3p   UCGUGUCCCUCUUGUCCACAG       02-9593OA            AD     Uganda        18/21      5/6
MIMAT0027449  chr16: 85918347-85918416    AY803358

miR-6875-3p   AUUCUUCCUGCCCUGGCUCCAU      MS2004-37_040        B      Australia     18/22      6/6
MIMAT0027651  chr7: 100868036-100868107   EF178338

miR-6875-3p   AUUCUUCCUGCCCUGGCUCCAU      500437               B      Australia     18/22      6/6
MIMAT0027651  chr7: 100868036-100868107   AY856971

miR-195-5p    UAGCAGCACAGAAAUAUUGGC       169b1a12             C      South Africa  18/21      6/6
MIMAT0000461  chr17: 7017615-7017701      GU216763

miR-548ah-3p  CAAAAACUGCAGUUACUUUUGC      93ZR001.3            D      Zaire         18/22      6/6
MIMAT0020957  chr4: 76575551-76575626     U27419

miR-548am-3p  CAAAAACUGCAGUUACUUUUGU      93ZR001.3            D      Zaire         18/22      6/6
MIMAT0019076  chrX: 16627012-16627085     U27419

miR-548av-3p  AAAACUGCAGUUACUUUUGC        93ZR001.3            D      Zaire         17/20      6/6
MIMAT0022304  chr18: 72853321-72853382    U27419

miR-5197-5p   CAAUGGCACAAACUCAUUCUUGA     207_G8               C      South Africa  17/23      6/6
MIMAT0021130  chr5: 143679860-143679971   JQ777084

miR-6124      GGGAAAAGGAAGGGGGAGGA        16538                B      Italy         17/20      6/6
MIMAT0024597  chr11: 12163683-12163767    JX666417

miR-6766-5p   CGGGUGGGAGCAGAUCUUAUUGAG    N-54                 B      Italy         17/24      6/6
MIMAT0027432  chr15: 89326739-89326810    U89855

miR-7151-5p   GAUCCAUCUCUGCCUGUAUUGGC     49_F                 B      USA           17/23      6/6
MIMAT0028212  chr10: 67403351-67403410    GQ451375

miR-7156-5p   UUGUUCUCAAACUGGCUGUCAGA     01AOCSE118           A1     Angola        17/23      6/6
MIMAT0028222  chr1: 77060143-77060202     EU031842

miR-4644      UGGAGAGAGAAAAGAGACAGAAG     SO186_H6_5C          C      South Africa  17/23      6/6
MIMAT0019704  chr6: 170639849-170639932   JN681247

Table 3.1: Identification of 15 microRNAs Embedded in HIV Genomes from Various Global
Regions
The microRNAs above were identified through our global screening method as those having
significant homology with various HIV genomic sequences. Listed along with each miRNA is the
length of the matching region of the HIV genome, clade of matching HIV genome, length of seed
match, the HIV isolate’s country of origin, along with name and accession information.
* indicates 100% homology between mature cellular miRNA and HIV isolate.
- Length of Match represents uninterrupted sequence homology.
- MicroRNAs shown in color have matches with multiple HIV isolates.
- Note: miR-4644 has additional matches throughout the Env genes of different HIV genomes.

Relationship Between Viral miRNA Sequences and Cellular miRNAs


Figure 3.2 shows a graphic representation of the 15 miRNAs in our study. Because three of the
miRNAs in our study had multiple matches (miR-586, miR-6763-5p, and miR-6875-3p), there
are 20 matches which represent the most significant sequence homologies to the 15 miRNAs we
used for our study. It can be seen in this figure that these microRNAs show significant overall
sequence identity, as well as near complete seed sequence identity.

When discussing regions of sequence identity, or ‘match’, between miRNA sequences and
portions of HIV-1 sequences, there are several items to consider.
First, the length of the match may not be as long as the length of the miRNA. For example,
miR-548av-3p matches HIV-1 isolate 93ZR001.3 with 100% homology
from position 1 through 17, giving a match length of 17, though the full length of the mature
miRNA is 20 (Figure 3.2, entry #14). Therefore, on the figure, positions corresponding to
nucleotides 1-17 are shaded in either blue (matching) or orange (matching and seed).
Non-matching positions 18-20 are boxed but not shaded.
Second, the match may or may not include nucleotide positions 2-8, which constitute the seed
region of the miRNA. To continue our example, miR-548av-3p shows 100% homology
with the entire seed region. However, because the last three nucleotide positions of this 20 nt
miRNA, 18-20, do not match the HIV-1 sequence, this match does not constitute a full-length
match.
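The shading scheme just described can be reproduced as a text bar. This is a minimal sketch under stated assumptions: the function names and the one-character codes are hypothetical, the default seed for shading is positions 2-8 as defined for Figure 3.2, and the seed count assumes that the 6-position seed reported in Table 3.1 refers to positions 2-7.

```python
def shade_positions(mirna_length, match_start, match_end, seed=(2, 8)):
    """One character per miRNA position: S = matching and in seed,
    M = matching outside the seed, . = not matching."""
    bar = []
    for pos in range(1, mirna_length + 1):
        matched = match_start <= pos <= match_end
        in_seed = seed[0] <= pos <= seed[1]
        if matched and in_seed:
            bar.append("S")
        elif matched:
            bar.append("M")
        else:
            bar.append(".")
    return "".join(bar)


def seed_coverage(match_start, match_end, seed=(2, 7)):
    """Number of seed positions covered by the match, and the seed size."""
    covered = sum(
        1 for pos in range(seed[0], seed[1] + 1) if match_start <= pos <= match_end
    )
    return covered, seed[1] - seed[0] + 1


# miR-548av-3p: 20 nt, matching positions 1-17 (entry #14).
print(shade_positions(20, 1, 17), seed_coverage(1, 17))
# miR-4652-5p: 22 nt, matching positions 3-20 (entry #7).
print(shade_positions(22, 3, 20), seed_coverage(3, 20))
```

Under these assumptions the two examples give seed counts of 6 of 6 and 5 of 6, in line with the Seed column of Table 3.1 for these two miRNAs.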

Figure 3.2
Graphic Representation of Matching Positions Between Cellular Mature miRNAs and HIV Isolates
(Positions of homology are shaded on a 1-24 nucleotide position grid in the original figure.)

Entry  MicroRNA           Matching HIV Isolate  Length of Match (nt)  Length of miRNA (nt)
1)     hsa-miR-586        PP24A-A0564-ESM11B9   22                    22
2)     hsa-miR-586        envNTM62/54           22                    22
3)     hsa-miR-586        RO-BCI9               22                    22
4)     hsa-miR-6763-5p    HDM003V09-5           19                    19
5)     hsa-miR-6763-5p    NP625                 19                    19
6)     hsa-miR-6763-5p    HAN-2                 19                    19
7)     hsa-miR-4652-5p    KRB902899             18                    22
8)     hsa-miR-6774-3p    02-9593OA             18                    21
9)     hsa-miR-6875-3p    MS2004-37_040         18                    22
10)    hsa-miR-6875-3p    500437                18                    22
11)    hsa-miR-195-5p     169b1a12              18                    21
12)    hsa-miR-548ah-3p   93ZR001.3             18                    22
13)    hsa-miR-548am-3p   93ZR001.3             18                    22
14)    hsa-miR-548av-3p   93ZR001.3             17                    20
15)    hsa-miR-5197-5p    207_G8                17                    23
16)    hsa-miR-6124       16538                 17                    20
17)    hsa-miR-6766-5p    N-54                  17                    24
18)    hsa-miR-7151-5p    49_F                  17                    23
19)    hsa-miR-7156-5p    01AOCSE118            17                    23
20)    hsa-miR-4644       SO186_H6_5C           17                    23

 
Figure 3.2: Graphic Representation of Matching Positions Between Cellular Mature miRNAs
and HIV Isolates
This figure shows all 20 sequence matches in our study listed by miRNA and the HIV isolate
exhibiting the homology. The length of each miRNA is shown graphically by a bar where the
numbers at the top indicate the length in nucleotides of the miRNA (for example, miR-586 has a
length of 22nt). Regions of 100% homology between the miRNA and matching HIV sequence are
colored in blue. The white or unshaded regions of each bar show the portion of each miRNA which
does not exhibit sequence identity with its associated HIV sequence.
Also, the shaded blue regions of miRNA entries #1 through #6 (miR-586 to miR-6763-5p) are
hatched to indicate that there is complete homology across the entire mature length miRNA and its
corresponding HIV isolate.
The seed region is defined as positions 2 through 8 of the miRNA. The region of seed match for
each entry is shown in orange.
As an example: microRNA entry #7, mature miR-4652-5p, has a total length of 22 nt and matches
HIV isolate KRB902899 with 100% sequence identity from positions 3-20. The sequences do not
match at positions 1-2 and 21-22. The orange shading from positions 3-7 indicates that the seed
region of the miRNA matches at positions 3 through 7 but does not match at position 2.

Distribution of the 15 miRNAs in Global HIV Isolates by Clade


Additionally, given clade information from Table 3.1, we can show here a graphic representation
of the distribution by clade of the miRNA sequences we have identified.
The 20 different sequences that we identified through our investigation came from different
clades. Clade, or subtype, was determined by reference to the GenBank entry for each identified
sequence. Of the twenty HIV-1 sequences given, nine were found to be subtype B, four subtype
C, three subtype D, and one each from clades AD, A1, F1, and G. These results are shown in
Figure 3.3. Of note is the observation that nearly half of the sequences identified were from
clade B, which is typically associated with non-African HIV-1 strains.
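The clade tally can be checked directly from the Clade column of Table 3.1. The sketch below simply counts the 20 clade labels in table order and expresses each as a percentage of the whole; the variable names are illustrative.

```python
from collections import Counter

# Clade of each of the 20 matching HIV-1 sequences, in the order of Table 3.1.
clades = [
    "C", "G", "F1",          # miR-586 isolates
    "B", "B", "B",           # miR-6763-5p isolates
    "B", "AD", "B", "B",     # miR-4652-5p, miR-6774-3p, miR-6875-3p (x2)
    "C", "D", "D", "D",      # miR-195-5p, miR-548ah/am/av-3p
    "C", "B", "B", "B",      # miR-5197-5p, miR-6124, miR-6766-5p, miR-7151-5p
    "A1", "C",               # miR-7156-5p, miR-4644
]

counts = Counter(clades)
total = len(clades)
percentages = {clade: 100 * n / total for clade, n in counts.items()}

for clade, n in counts.most_common():
    print(f"{clade}: {n} of {total} ({percentages[clade]:.0f}%)")
```

Clade B accounts for 9 of the 20 sequences (45%), which is the "nearly half" noted in the text.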

Figure 3.3: Graphic Distribution of microRNA Sequences in Different HIV Clades Isolated
Globally
Clade Distribution of the 20 Homologous HIV Sequences studied is given in this pie chart, as a
percentage of the whole.

Mapping viral miRNA Sequences to HIV Genomic Regions
In our previous paper, we identified one miRNA (miR-195) which showed perfect or near
perfect homology with several HIV-1 strains and 3 other miRNA-like sequences (miR-30d,
miR-374a and miR-424) within the env encoding regions, although these had reduced homology
[102].
Since our previous research focused on miRNA-like sequences that were mapped to the env gene
and, specifically, to the hypervariable V-regions within the env gene, we further scrutinized our
results by mapping our most significant matches to all genomic regions of the HIV-1 genome.
We present our results for miRNA mapping to the various HIV-1 genes, organized by genomic
region, in Table 3.2.

Table 3.2
Specificity of Sequence Matches Between
MicroRNA and HIV Encoding Genes
(The Alignment column of the original table is graphical.)

miRNA         HIV Isolate    Country       Gene  Percent Homology  HIV Position
miR-6774-3p   02-9593OA      Uganda        gag   100%              2031-2048
miR-6124      16538          Italy         pol   100%              2672-2688
miR-7151      49_F           USA           env   100%              6264-6280
miR-548ah-3p  93ZR001.3      Zaire         env   100%              6734-6752
miR-548am-3p  93ZR001.3      Zaire         env   100%              6734-6752
miR-548av-3p  93ZR001.3      Zaire         env   100%              6735-6751
miR-7156      01AOCSE118     Angola        env   100%              7275-7294
miR-586       PP24A-A0564    Zambia        env   96%               7324-7347
miR-586       envNTM62/54    Portugal      env   96%               7325-7348
miR-586       RO-BCI9        Romania       env   96%               7325-7348
miR-4652      KRB902899      South Korea   env   100%              7415-7435
miR-5197      207_G8         South Africa  env   100%              7413-7430
miR-195       169b1a12       South Africa  env   100%              7611-7628
miR-4644      SO186_H6_5C    South Africa  env   100%              7741-7757
miR-6766      N-54           Italy         nef   100%              8882-8899
miR-6875      MS2004-37_040  Australia     nef   100%              9242-9259
miR-6875      500437         Australia     nef   100%              9241-9258
miR-6763      HDM003V09-5    Canada        LTR   100%              9482-9500
miR-6763      NP625          Spain         LTR   100%              9482-9500
miR-6763      HAN-2          Germany       LTR   100%              9482-9500

Table 3.2: Specificity of Sequence Matches Between MicroRNA and HIV Encoding Genes
Individual miRNAs are listed here along with the HIV isolates to which they matched and their
alignment. The most significant regions of homology between the HIV genome and human cellular
mature miRNAs are shown. Results are sorted according to HIV genome position. The top row in
each alignment represents the miRNA sequence, and the bottom row represents the HIV sequences.
The numbers on either side of each sequence correspond to respective position numbers.
- Percent homology represents the amount of sequence identity between the two sequences of each
alignment.
- HIV position refers to the corresponding mapped position of each sequence in a standard HXB2
genome.

Table 3.2 lists the 15 miRNAs for which we discovered homologous sequences, together with
the HIV-1 strains in which they were found. Alignments, percent homology and
position in the HIV-1 genome are given as well. All of these sequences show 100% homology
(or, in the case of miR-586, 96% because of one gap in the alignment).
To further characterize these sequences, it was important to map them accurately to a
standardized HIV-1 genome in order to compare them. When searching global databases
containing HIV-1 sequence data, many of the database entries are subgenomic, meaning that
sometimes only one gene is sequenced, or only a few genes, as opposed to an entire full length
genomic sequence. Furthermore, in almost all cases, the diversity of HIV-1 populations means
that a given position in one genome does not correspond to the same position in another. In
order to compare the positions of our findings to each other, it is necessary to map each one to
a known, or reference, sequence.
We chose the HXB2 strain as our mapping standard because it has been well studied and
characterized, and its full sequence and sequence features are readily available (GenBank
accession number K03455). The HXB2 genome is used as a common reference strain for many
different functional studies, and it has a standardized position numbering scheme available at the
Los Alamos National Laboratory website (http://www.hiv.lanl.gov/). By using the Clustal
alignment algorithm we were able to definitively map each of our matches to the HXB2 genome
and thus to determine exactly where in the HIV-1 genome it lies. These positions are also shown
in Table 3.2 under the column ‘HIV Position’.
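The coordinate mapping can be sketched as a walk along a pairwise alignment. The example assumes that an aligned reference row and an aligned query row (with `-` for gaps) are already available from an alignment program; the function name, its parameters, and the toy alignment are hypothetical and do not reproduce the Clustal output used in this work.

```python
def map_to_reference(ref_aln, qry_aln, qry_start, qry_end, ref_offset=1):
    """Map a 1-based inclusive interval on the ungapped query onto
    reference coordinates, given two equal-length aligned rows.

    ref_offset is the reference coordinate of the first reference base
    in the alignment. Query bases aligned to reference gaps are skipped.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned rows must have equal length")
    ref_pos = ref_offset - 1
    qry_pos = 0
    mapped = []
    for ref_base, qry_base in zip(ref_aln, qry_aln):
        if ref_base != "-":
            ref_pos += 1
        if qry_base != "-":
            qry_pos += 1
            if qry_start <= qry_pos <= qry_end and ref_base != "-":
                mapped.append(ref_pos)
    if not mapped:
        return None
    return mapped[0], mapped[-1]


# Toy alignment: the query has a one-base deletion and a two-base insertion.
reference_row = "ACGT--ACGTAC"
query_row     = "AC-TGGACG-AC"

print(map_to_reference(reference_row, query_row, 3, 8))
print(map_to_reference(reference_row, query_row, 3, 8, ref_offset=7000))
```

In the toy alignment, query positions 3-8 map to reference positions 4-7, or 7003-7006 when the alignment starts at reference coordinate 7000.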
Once we determined the position in a standardized HXB2 genome, it was possible to infer the
exact genomic region or gene in which a given sequence lies. The genomic region or gene
associated with each miRNA match is shown in Table 3.2 under the heading ‘Gene.’ Thus, it
can be seen that in addition to the envelope region, we have identified matching miRNAs in the
gag, pol, nef, and LTR of the HIV-1 genome. The matches mapping to the envelope were
further localized to their precise positions within the envelope region, including whether they
are part of a hypervariable, or V, region. These results will be discussed further and are shown
graphically in Figure 3.4, Figure 3.5 and Figure 3.6.
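Inferring the region from an HXB2 position is an interval-overlap lookup. In the sketch below, the vif, vpr, vpu, env, nef, and 3'LTR coordinates are those shown on the gene map of Figure 3.5; the gag and pol coordinates are assumed HXB2 landmark values that should be checked against the Los Alamos reference, and the multi-exon tat and rev genes are omitted for brevity. The names are illustrative.

```python
# HXB2 coordinates (1-based, inclusive). gag and pol are assumed values;
# the remaining entries follow the gene map of Figure 3.5.
HXB2_REGIONS = {
    "gag": (790, 2292),    # assumption, not given in this chapter
    "pol": (2085, 5096),   # assumption, not given in this chapter
    "vif": (5041, 5619),
    "vpr": (5559, 5850),
    "vpu": (6062, 6310),
    "env": (6225, 8795),
    "nef": (8797, 9417),
    "3'LTR": (9086, 9719),
}


def regions_overlapping(start, end, regions=HXB2_REGIONS):
    """Names of all regions that overlap the interval [start, end]."""
    return [
        name for name, (lo, hi) in regions.items() if start <= hi and end >= lo
    ]


print(regions_overlapping(2031, 2048))  # miR-6774-3p position from Table 3.2
print(regions_overlapping(7611, 7628))  # miR-195 position from Table 3.2
print(regions_overlapping(9242, 9259))  # miR-6875 position from Table 3.2
print(regions_overlapping(9482, 9500))  # miR-6763 position from Table 3.2
```

Because HIV-1 reading frames overlap, a position can fall in more than one region. With these coordinates the miR-6875 position lies in both nef and the 3'LTR, whereas Table 3.2 reports a single gene for each match.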

For those miRNAs exhibiting multiple matching HIV-1 isolates (miR-586, miR-6875, and
miR-6763, and as will be demonstrated for miR-4644) it is interesting to note that in each case the
multiple matches mapped to the same region in the HIV-1 genome, indicating that this is
not an isolated event. It is also interesting to note that as a result of this mapping exercise, we
can see that the miRNAs are not localized only to the envelope region but in fact are distributed
throughout the HIV-1 genome.
Finally, it is worthwhile to note that miR-6763, for which we identified three matching isolates
that all mapped to the LTR, exhibits a perfect match over the entire miRNA, which means that
exact copies of the mature human miR-6763 are contained within the LTR region of these three
isolates.

Graphic Representation of miRNAs present in HIV-1 Genomes


By using the Clustal algorithm and other bioinformatics methods, we were able to map each of
the 15 miRNAs we identified to a standardized HXB2 HIV-1 genome. This allowed us to
display the relative positions of all 15 miRNAs together on a single map.
Figure 3.4 shows visually how the 15 miRNAs are distributed throughout the HIV-1 genome. In
particular, it can be seen that the majority of our significant findings reside in the envelope
region. All but two sequences reside in either the env, nef, or LTR region of the HIV-1
genome.
The results in the envelope region are further localized to the gp120 subregion of the envelope
gene: we found twelve significant sequences located in the gp120 subregion, but none of the
sequences we identified were located within the gp41 subregion.
Figure 3.5 shows the same 15 miRNAs in addition to 4 other miRNA-like
sequences previously identified by our group and focuses on the downstream region of the
HIV-1 genome where many of the identified sequences are clustered. In previous research we
identified four miR-like sequences that were present in different variable or V regions of the
HIV-1 genome. By adding these four miR-like sequences to our newly identified 15 miRNAs,
we can see a more complete picture of the positions of all miRNAs identified in our study. The
addition of these four data points demonstrates even more clearly the preponderance of
miRNAs localized to the env region.
While the sequences are seen to be distributed throughout the genome, they are highly clustered
towards the 3’ end of the genome and in particular, in the gp120 region of the envelope gene.
Within gp120, the identified sequences reside both within the hypervariable V regions and also
completely outside of these V regions.

Figure 3.4
Localization of miRNA Sequences in HIV Genomes
Genome map (positions 0-9719) with miRNA positions marked by region:
gag: miR-6774
pol: miR-6124
env: miR-7151, miR-548ah, miR-548am, miR-548av, miR-7156, miR-586, miR-4652, miR-5197, miR-195, miR-4644
nef: miR-6766, miR-6875
LTR: miR-6763
 
Figure 3.4: Localization of miRNA Sequences in HIV Genomes
Positions of all miRNA sequences identified in this study are indicated on Figure 3.4. 15 miRNA
sequences representing 20 homologous sequences in different HIV isolates are shown. Blue arrows
drawn onto the demarcated line show the positions of the miRNA sequences in a standard HIV
genome. A generic HIV-1 gene map is drawn below to provide a reference for the positions.
Clustering in the envelope region can be readily seen.

Figure 3.5
Mapping of Mature microRNA Positions and miR-like Sequences in the HIV Genome
Partial gene map (HIV genome positions 6000-9719), with start and end positions by gene:
vif 5041-5619; vpr 5559-5850; tat 5831-6045 and 8379-8469; rev 5970-6045 and 8379-8653;
vpu 6062-6310; env 6225-8795 (gp120, gp41); nef 8797-9417; 3’LTR 9086-9719
miRNA positions marked:
miR-6774 (gag); miR-6124 (pol); miR-7151 (env); miR-548ah, miR-548am, miR-548av (env);
miR-7156, miR-586, miR-4652, miR-5197, miR-195, miR-4644 (env);
miR-6766 (nef); miR-6875 (nef); miR-6763 (LTR)

Figure 3.5: Mapping of Mature microRNA Positions and miR-like Sequences in the HIV Genome
Figure 3.5 shows graphically the locations of miRNA and miR-like sequences that we have
identified in globally isolated HIV genomes. This figure expands the view of Figure 3.4 by
emphasizing location of the downstream regions of the HIV genome. A partial gene map of the
HIV genome is shown along with coordinate numbers on the ruled line. The beginning and ending
nucleotide positions for each HIV gene region are shown attached to it on the figure.
Various HIV gene regions are separated and shown by their reading frame indicated on the left.
Positions of miRNA sequences are indicated by blue arrows. Results are shown for all 15 miRNAs
which are listed in Table 3.1. Positions of the miR-like sequences for miR-424, miR-374a, and
miR-30d, from previous work, are indicated by asterisks on the coordinate line.

Taking a Closer Look at Env
A more detailed view of the miRNA sequences present in the HIV-1 envelope region from
various global strains can be seen in Figure 3.6.
The sequences we identified can be cross-referenced with the nucleotide-specific map of the
HXB2 strain to show not only the position and general location of each sequence
but also specific sub-region information and sequence features, such as binding sites,
where available. From this resource we were able to localize our identified
sequences to not only the envelope region, but to specific hypervariable regions within the
envelope gene. For example, it can be seen that the miR-195 sequence maps to the segment
from position 7611 through 7628 within the env gene of the HXB2 genome (Figure 3.6). This
corresponds to the V5 region of the envelope glycoprotein, which spans nucleotide positions
7603-7635. Thus, we can say that the miR-195 sequence is contained entirely within the V5
region of the HIV-1 genome.
From our previous research, we found that several sequences were embedded in the
hypervariable regions of the envelope gene [102]. These are miR-424 (V1), miR-374a (V2),
miR-30d (V4) and miR-195 (V5). By mapping these new findings we have now identified
several additional sequences that are contained within the hypervariable regions. These are:
miR-548ah-3p, miR-548am-3p and miR-548av-3p (V2); and miR-4652 and miR-5197 (V4).
Like the sequences previously identified, these sequences are also embedded entirely within their
respective hypervariable V regions in the HIV-1 genome.
We also mapped several miRNA sequences that are contained within the gp120 section of the
envelope gene but are not part of any hypervariable region. These are miR-7151, miR-7156,
miR-586 and miR-4644, which are also shown graphically in Figure 3.6. It is interesting to note
that while several sequences were found in the hypervariable regions V1, V2, V4, and V5, no
sequences were identified in the V3 hypervariable region.
There were no identified sequences which were contained partially within a hypervariable region
and partly outside. All sequences were either entirely contained within a hypervariable ‘V’
region, or entirely outside it.
It can also be seen in the figure that miR-4644, at nucleotide positions 7741-7757, lies within the
gp120 region but ends at exactly the border between gp120 and the beginning of gp41, which
occurs at nucleotide position 7758. The significance of this finding will be discussed further.
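The containment statements above reduce to interval comparisons on HXB2 coordinates. The sketch below uses only values given in the text (the V5 region at 7603-7635, miR-195 at 7611-7628, miR-4644 at 7741-7757, and the start of gp41 at 7758); the function name and the partially overlapping interval are hypothetical.

```python
def classify(interval, region):
    """Return 'inside', 'outside', or 'partial' for a 1-based inclusive
    interval relative to a region."""
    start, end = interval
    lo, hi = region
    if lo <= start and end <= hi:
        return "inside"
    if end < lo or start > hi:
        return "outside"
    return "partial"


V5_REGION = (7603, 7635)   # V5 nucleotide positions given in the text
GP41_START = 7758          # first position of gp41 given in the text

mir_195 = (7611, 7628)
mir_4644 = (7741, 7757)
hypothetical = (7600, 7610)  # illustrative interval straddling the V5 start

print("miR-195 vs V5:", classify(mir_195, V5_REGION))
print("miR-4644 vs V5:", classify(mir_4644, V5_REGION))
print("hypothetical vs V5:", classify(hypothetical, V5_REGION))

# A sequence ends exactly at the gp120/gp41 border when its last
# position immediately precedes the first position of gp41.
ends_at_border = mir_4644[1] + 1 == GP41_START
print("miR-4644 ends at gp120/gp41 border:", ends_at_border)
```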

Figure 3.6
MicroRNA Sequences Localized to the
Envelope Region of the HIV Genome

New Findings Within Env:
miR-7151 (Signal Peptide); miR-548ah, miR-548am, miR-548av (V2); miR-7156 (C3);
miR-586 (CD4 Loop); miR-4652, miR-5197 (V4); miR-4644 (gp120/gp41 Boundary)

env map: gp120 begins at 6225 (V1, V2, V3, V4, V5); gp41 spans 7758 to 8795

Previously Described:
miR-424 (V1); miR-374a (V2); miR-30d (V4); miR-195 (V5)

Figure 3.6: MicroRNA Sequences Localized to the Envelope Region of the HIV Genome
This figure shows graphically the miRNA domains listed in Table 3.2 and mapped to their
corresponding positions in the HXB2 envelope gene (at positions 6264-6280, 6734-6752, 7275-7294,
7324-7347, 7415-7435, and 7741-7757, respectively). Also shown on the bottom are the positions of
miRNA-like sequences previously identified by our lab for miR-424, miR-374a, miR-30d and
miR-195 (at nucleotide positions 6682-6694, 6765-6782, 7386-7398 and 7611-7628). Hypervariable
regions V1 through V5 are depicted in orange.

3.4 Discussion:

These viral miRNAs are distinct from other reported viral miRNAs
There has been a great deal of research directed at the involvement of miRNAs in HIV-1
infection and pathogenesis, and this involvement can work in one of two directions: infection by
HIV-1 can change cellular miRNA expression globally, and cellular miRNAs can modulate
HIV-1 infection. Yeung et al. [38] first showed that HIV-1 infection can dysregulate patterns of
cellular miRNA expression; since then, other studies have identified cellular miRNAs which
appear to regulate the HIV-TAR and Nef-LTR regions [37, 47, 48]. Additionally, Hariharan et
al. [36] have predicted target sites for cellular miRNAs, such as the nef-LTR region, in HIV-1
genomes, also indicating that human cellular miRNAs can modulate HIV-1 expression and
replication [39].
While multiple cellular miRNAs have been shown to modulate HIV-1 infection and
pathogenesis, the existence of putative HIV-encoded miRNAs remains a controversial subject.
HIV-1 encoded miRNAs and their cellular targets were first predicted from in silico studies [42],
and there have been several studies since that have attempted to validate their existence. While
there have been reported identifications of virally encoded miRNAs by various investigators,
other labs have in some cases been unable to confirm the presence of these miRNAs. For
example, two newly purported vmiRNAs, vmiR88 and vmiR99, were identified in the HIV-1
3’LTR [103]; while distinct from the sequences we have identified based on their reported
location, they are not listed in miRBase.
Nevertheless, there are three HIV-1 miRNAs listed in the official database of microRNAs,
miRBase [104]: miR-H1-5p, miR-N367-3p, and miR-TAR-5p/3p. There has also been a fourth
report of an HIV-encoded miRNA, hiv1-miR-H3. In an effort to determine whether our findings
are distinct from previous research, we examined each of the four reports and compared them
to our results.
Ouellet et al. [47] in 2008 isolated miRNA sequences derived from the HIV-1 TAR element,
called hiv1-miR-TAR-5p and hiv1-miR-TAR-3p. In subsequent research they were able to
demonstrate the functional ability of these miRNAs to regulate gene expression in
vitro. Further research by Klase et al. [46] has shown that miR-TAR protects against
apoptosis by altering cellular gene expression. We have localized this miRNA sequence to
HXB2 position 9543-9566, and this region is not shared by any of the miRNA candidates
found in our investigation.
It is interesting to note that one of the miRNAs we found, miR-6763, is present just upstream of
the TAR element at HXB2 position 9482-9500. This region is also a binding site for the cellular
transcription factor Sp1-I, a zinc finger transcription factor that binds to GC-rich motifs of many
promoters and is involved in many different cellular processes. The potential significance of a
miR-6763 sequence overlying an Sp1-I binding site in the HIV-1 LTR deserves further attention.
Whether miR-6763 could interact with the Sp1 binding site to affect transcription is unknown;
however, it has been shown by Younger, et al. [80] that miRNAs can target gene promoters to
effect transcriptional gene silencing in mammalian cells.
Also present in the 3’LTR is the virally encoded miRNA hiv1-miR-H1, reported by Kaul et
al. [45]. Kaul proposes that hiv1-miR-H1 promotes viral replication by downregulating cellular
miR-149, which targets the viral Vpr gene, and also that miR-H1 specifically targets and
downregulates expression of the cellular apoptosis antagonizing transcription factor (AATF)
gene, causing a molecular cascade which ultimately results in cell death.
We have localized hiv1-miR-H1 to position 9458-9477 in the HXB2 genome, a
location distinct from that of any of the miRNAs that we have identified. This location is,
however, very close to that of miR-6763, which we mapped to position 9482-9500. The
location of hiv1-miR-H1 also completely overlaps the transcription factor binding site Sp1-III. As
previously stated, the miR-6763 sequence we identified overlaps the Sp1-I site in the LTR. The
two sequences therefore show an interesting similarity: miR-H1 lies in the LTR and overlaps an
Sp1 site, and miR-6763 lies in the LTR and overlaps another Sp1 site just a few
nucleotides downstream.
Zhang et al. [105] have identified a novel HIV-1 encoded miRNA called hiv1-miR-H3, which is
embedded in and encoded by a conserved region within the HIV-1 pol gene. They suggest that
miR-H3 can increase viral production and viral RNA expression through an interaction with the
TATA box in the 5’LTR promoter to activate viral RNA expression.
It is interesting to note that while this report exists in the literature, a database entry for hiv1-
miR-H3 is not present in miRBase. Nevertheless, using this reference we have localized this
unverified hiv1-miR-H3 to HXB2 position 3101-3121. While our lab has also identified a
putative miRNA located within the pol gene, miR-6124, this sequence lies at position 2672-2688
and is therefore distinct from the miR-H3 previously identified.
It is also interesting that according to Zhang, hiv1-miR-H3 seems to work by binding the TATA
region of the LTR promoter to activate transcription, while Younger’s research [80] suggests
transcriptional silencing by miRNAs which can bind to the promoter upstream of the TATA box.
In any case, it serves as another report of the potential for virally encoded miRNAs to interact
with their promoters to regulate gene expression at the transcription level.
Finally, Omoto et al. [106] identified an HIV-1 derived miRNA, called hiv1-miR-N367, which is
embedded and expressed within the nef gene, was shown to repress Nef expression, and is
associated with lower viremia in long-term non-progressors. Omoto asserts that nef-derived miRNAs
are produced in HIV-1 persistently infected cells, and that hiv1-miR-N367 can block HIV-1 Nef
expression in vitro. They also suggest that there may be other nef/U3 miRNAs produced in
HIV-1 infected cells, and that they may suppress both Nef function and HIV-1 virulence through
the RNAi pathway.
This represents a reported example of a virally encoded miRNA involved in the regulation of the
gene in which it is embedded. An apparent parallel to our findings is that hiv1-miR-N367 is an
embedded sequence within the coding region of nef, and that this miRNA produced from within
nef also serves to down-regulate Nef. As we will see in Chapter Four, this parallels our own
findings with miR-4644 – that miR-4644 represents a miRNA embedded in the coding region of
a gene that it too may downregulate.

We have localized hiv1-miR-N367 to HXB2 position 9205-9228. This is distinct from the
locations of any of the miRNAs which we have identified. The nearest of our findings to hiv1-
miR-N367 is miR-6875 at position 9242-9259.
Therefore, of the four putative HIV-1 derived miRNAs described here, we have been able to
demonstrate that none is the same as any of the miRNA sequences that we have identified, and
that our findings are distinct from the prior literature.

These miRNAs may be virally encoded cellular homologues


The sequences that we have identified may represent a unique finding in the HIV-1 genome: a
widespread occurrence of cellular miRNA homologue sequences among different worldwide
strains. These sequences are not classical miRNAs for several reasons: 1) the sequences are
embedded in coding regions of the viral genome; 2) they do not appear to be part of a stem-loop
structure associated with miRNA processing to generate mature miRNAs; 3) there is no
evidence that they are processed from longer transcripts by Drosha, or cleaved into mature
miRNAs by Dicer; 4) their homologous lengths are somewhat shorter than their cellular
counterparts. Nevertheless, they may still function in a miRNA-like fashion.
It should be noted that there exist other instances of miRNAs that are not processed by a
canonical biogenesis pathway; so-called mirtrons are generated in a Drosha-independent
manner. These are miRNAs which are encoded by the intron of another gene. After one of these
genes goes through splicing, the intron-miRNA sequences resemble the structure of pre-
miRNAs; therefore, they can be exported to the cytoplasm and be processed by normal miRNA
processing machinery without Drosha-mediated cleavage [107, 108]. Dozens of these mirtrons
have been discovered to date [109].
Another instance of noncanonical biogenesis involves the HIV-1 virus itself. It has been stated
that retroviruses need an alternate pathway to generate miRNAs so that the viral genomes
themselves are not cleaved [110]. To prevent cleavage of their RNA genome, a different RNA
substrate must serve as the miRNA precursor. Harwig et al. demonstrated that a miRNA-like
small RNA is produced from the 3’ side of the TAR hairpin, which aids in the production of
short TAR RNAs that serve as a miRNA precursor. These TAR RNAs are cleaved by Dicer, and
this biogenesis pathway differs from the canonical Drosha/Dicer processing and allows HIV-1 to
produce an miRNA without cleavage of its RNA genome.
While several of the viral miRNA sequences we have identified represent 100% full length
cellular homologues, most of the sequences are shorter versions of their cellular counterparts.
All of the sequences, however, contain at least the entire seed sequence of positions 2-7 (with the
exception of miR-4652 and miR-6774), which is a major determinant of miRNA targeting and
function. In fact, the sequences we have identified present 100% homology in most cases from
positions 1-17 or 1-18. Thus, they contain the entire miRNA sequence except for the last few
nucleotides.

Are the last few nucleotides of a given miRNA critical for biological function, or can a truncated
version still be active? Houzet et al. examined how the degree of complementarity between
miRNAs and their targets affects function [26] and found that the extent of complementarity
correlates positively with how effective or potent a miRNA can be in HIV-1 restriction.
They found that a cellular miRNA, miR-326, which has a predicted target site in the HIV-1 genome,
decreased HIV-1 replication in a manner dependent on the degree of complementarity:
as mutations were introduced into miR-326 it functioned less effectively, but
still functioned. These results suggest that a shortened version of a miRNA may still retain
targeting function.
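The notion of "degree of complementarity" can be made concrete with a short sketch. The function below is illustrative only and is not the method used by Houzet et al.; the function names are hypothetical. It assumes that the miRNA and its target site are equal in length and both written 5' to 3', and it counts only Watson-Crick pairs (G-U wobble pairs are deliberately not counted).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the two RNA bases form a Watson-Crick pair (A-U or G-C). */
static int pairs(char x, char y)
{
    return (x == 'A' && y == 'U') || (x == 'U' && y == 'A') ||
           (x == 'G' && y == 'C') || (x == 'C' && y == 'G');
}

/* Counts Watson-Crick pairs between a miRNA and an equally long target
 * site, both written 5'->3'. Pairing is antiparallel, so miRNA index i
 * faces target index len-1-i. Returns -1 if the lengths differ. */
int complementary_positions(const char *mirna, const char *site)
{
    assert(mirna != NULL && site != NULL);
    size_t len = strlen(mirna);
    if (strlen(site) != len)
        return -1;

    int count = 0;
    for (size_t i = 0; i < len; i++)
        if (pairs(mirna[i], site[len - 1 - i]))
            count++;
    return count;
}
```

Dividing the returned count by the miRNA length gives a simple fraction of paired positions; each mutation introduced into the miRNA or the site lowers the count by at most one.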
miR-195 represents a unique find in our investigation; as we have reported previously, we
identified five African HIV-1 strains with near-perfect homology to this miRNA. miR-195 is a
member of the microRNA-15/16/195/424/497 family, which has been shown to have important
functional roles within cells by regulating the expression of cell cycle proteins such as WEE1,
CDK6, cyclin D1, E2F3 and Bcl-2 [111]. Also, miR-195 is predicted to potentially target the
RISC protein TRBP, as well as the deadenylase protein CNOT6L and the DNA helicase protein
DDX3, as determined by the miRNA target prediction algorithm Targetscan [62, 112]. These
proteins have been shown to be integral for gene silencing by miRNAs. Also, DDX3 has been
shown to have an essential role in HIV-1 Rev-RRE export of viral mRNAs [113]. While we
were the first to report the existence of miR-195-like sequences in the HIV-1 genome,
experimental validation has not been performed, nor has it been determined that these sequences
serve any role related to HIV-1 infection. Nevertheless, speculation on the possible role of an
miR-195 sequence in viral replication holds interesting implications. If this vmiRNA could
function similarly to cellular host miR-195, it might be important in regulating the above
mentioned proteins and modulating the cell cycle of infected cells. In this study, we have
additionally shown that miR-195 is predicted to target PAPPA, a protein known to be
downregulated by HIV-1 infection, and is also predicted to target furin, a critical component of
the viral maturation pathway.
There is also evidence in the literature that the miR-15 family, which includes miR-195, has anti-
HIV-1 properties. It has been found that miR-15a, miR-15b, and miR-16 downregulate the
expression of a cellular protein called Purine-rich Element Binding Protein alpha (Pur-a) in
monocytes. Pur-a restricts HIV-1 infection by blocking monocytic differentiation into dendritic
cells (DCs), which are susceptible to HIV-1 infection [114]. Because miR-195 is a member of
the miR-15/16 family and contains the same seed sequence, we examined the possibility that
miR-195 could likewise target Pur-a. In our own investigation, we have verified using
bioinformatics software that miR-195 is indeed also predicted to target Pur-a.

These miRNAs could affect transcription as well as translation

In addition to the complex and active role that microRNAs play in modulating gene expression at
the translational level, there is building evidence for the role of miRNAs in the regulation of both
cellular and viral transcription [115]. This transcriptional regulation is accomplished by
interaction with the promoter regions. It appears that viral miRNAs can act as positive or
negative regulators of transcription.

At the cellular level, it has been demonstrated that a mimic to cellular miR-423-5p was able to
silence gene transcription of the progesterone receptor (PR) promoter. This transcriptional
silencing was associated with recruitment of Argonaute 2 (AGO2) to the miRNA transcript that
overlaps the PR gene promoter [80].

Zhang et al. have shown that some cellular miRNAs up-regulate transcription by interaction with
promoter TATA-box motifs in human peripheral blood mononuclear cells (PBMCs) [116].
These miRNAs associate with RNA polymerase II (Pol II) and TATA-box binding protein (TBP)
to facilitate the assembly of pre-initiation complexes (PICs). It was demonstrated that miR-138,
miR-92a and miR-181d, in particular, enhance promoter activity of the insulin, calcitonin, and
c-myc genes, respectively. Zhang has also demonstrated that the HIV-encoded viral miRNA they
identified, hiv1-miR-H3, upregulates HIV-1 RNA transcription and protein expression [105].
They determined that miR-H3 targets the TATA box of the HIV 5’LTR and activates viral
transcription in a sequence-specific manner.

Other ‘Mir-like’ Research


Since our last publication, there has been an increase in the search for microRNA-like sequences.
As discussed above, Harwig et al. [110] describe a small TAR RNA referred to as a microRNA-like
sequence; another group has identified microRNA-like molecules in the antigenome RNA
of Hepatitis C virus; and yet another found microRNA-like molecules in the antigenome RNA
of Hepatitis A virus, an RNA virus which replicates in the cytoplasm and would not be
considered likely to encode traditional miRNAs due to its lack of access to the miRNA processing
machinery in the nucleus [117].
In other related research, Kato et al. and Zhang et al. both artificially produced microRNA-like
constructs which inhibited HIV-1 replication in HeLa and MT-4 cells [118, 119].

Exosomes
Increasing attention is being given to the miRNA profiles of exosomes secreted from infected vs.
uninfected cells. Exosomes have the capability to transfer proteins and RNA molecules from
cell to cell, influencing viral pathogenesis. Exosomes may carry either cellular or viral miRNAs,
and exosomes derived from infected cells have been shown to contain viral miRNAs. It was
recently shown that cell culture supernatants from HIV-1 infected cells contain HIV-1 TAR
miRNA [120]. Several studies have also compared miRNA profiles from the exosomes of
infected and uninfected cells. As an interesting exercise we obtained the results from two of
those studies in an effort to find out if any of the miRNA sequences that we have identified are
present in their data.

Roth et al. compared miRNA exosomal content in HIV-1 infected macrophages to uninfected
macrophages and found 38 miRNA species that were present in the exosomes from infected but
not uninfected cells [121]. None of the sequences we have identified are present in their sample.

We next looked at the study conducted by Aqil et al., in which exosomes were purified from
U937 monocyte cells which stably expressed the viral Nef protein and their miRNA contents were
compared to those of U937 cells which were not infected [122]. Their data show that two of our
miRNA sequences, miR-195 and miR-586, were present at increased levels in infected-cell exosomes:
miR-195 showed a 1.61-fold change and miR-586 a 3.27-fold change in the amount of miRNA present
in the exosomes from these infected cells. Additionally, we searched for the presence of
miR-4644. While miR-4644 was not tested for, this study did find a 3.25-fold
increase in the amount of miR-185 in the infected exosomes. miR-4644 is from the miR-185
family, meaning that they share the same seed sequence and presumably many of the same
targets and targeting pathways.
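Family membership by shared seed can be checked mechanically. The sketch below extracts positions 2-7 of two mature miRNA sequences and compares them. The function names are hypothetical, and the definition of the seed as positions 2-7 follows the usage elsewhere in this chapter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SEED_START 1   /* zero-based index of miRNA position 2 */
#define SEED_LEN   6   /* seed spans positions 2-7 */

/* Copies the seed (positions 2-7) into out, which must hold SEED_LEN + 1
 * characters. Returns 0 on success, -1 if the sequence is too short. */
int extract_seed(const char *mirna, char out[SEED_LEN + 1])
{
    assert(mirna != NULL && out != NULL);
    if (strlen(mirna) < (size_t)(SEED_START + SEED_LEN))
        return -1;
    memcpy(out, mirna + SEED_START, SEED_LEN);
    out[SEED_LEN] = '\0';
    return 0;
}

/* Returns 1 if both sequences carry an identical seed, 0 otherwise. */
int same_seed(const char *a, const char *b)
{
    char sa[SEED_LEN + 1], sb[SEED_LEN + 1];
    if (extract_seed(a, sa) != 0 || extract_seed(b, sb) != 0)
        return 0;
    return strcmp(sa, sb) == 0;
}
```

As a usage example, same_seed("UGGAGAGAGAAAAGAGACAGAAG", "UGGAGAGAAAGGCAGUUCCUGA") compares miR-4644 (sequence as given in Table 4.1) with a miR-185-5p sequence. The latter string is supplied here as an assumption and should be checked against the current miRBase entry before use.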

Based on this evidence, we hypothesize that several of the miRNAs we have identified, miR-195,
miR-586, and miR-4644, may be upregulated and contained in the exosomes of cells infected by
HIV-1 in vivo.

Role of HIV-associated miRNAs as tumor suppressors

miRNAs affect the expression of a seemingly endless array of genes, including those involved in
cell differentiation and apoptosis. It has also been demonstrated that many miRNAs are involved
in human cancers, including lung, breast, brain, liver, colon, and leukemia [123]. Specifically,
some miRNAs may function as oncogenes or tumor suppressors. More than 50% of miRNA
genes are located in cancer-associated genomic regions [124]. Overexpressed miRNAs in
cancers may function as oncogenes, while underexpressed miRNAs in cancers may be tumor
suppressors and may function as cancer inhibitors by regulating oncogenes and/or genes that
control cell cycle status. It is known, for example, that miR-185 targets Bcl-2, a central gene
involved in cell cycle and apoptosis [125]. In our investigation, we have found that miR-195
acts as a tumor suppressor in cancers such as breast, colon, prostate, liver and cervical, and
miR-4644 may act as a tumor suppressor because it is from the same microRNA family and has the
same seed as miR-185, a known tumor suppressor.
As another example, it has been shown that miR-195 acts as a tumor suppressor in breast cancer
by modulating Insulin receptor substrate 1 (IRS1), inhibiting tumor growth and angiogenesis
[126], and in addition, by targeting FASN, HMGCR, ACACA and CYP27B1, miR-195 inhibited
proliferation, invasion and metastasis in breast cancer cells [127].
It has also been shown that downregulation of miR-195 promotes prostate cancer [128]. In this
study, overexpression of miR-195 inhibited cell cycle progression and tumorigenesis by directly
targeting HMGA1.
Finally, a recent study demonstrated that miR-195 is a key negative regulator of hepatocellular
carcinoma metastasis by targeting FGF2 and VEGFA, and that low expression of miR-195 is
associated with lung metastasis of HCC [129].
There is no documentation in the current literature regarding the role of miR-4644 as a possible
tumor suppressor. However, miR-4644, by virtue of sharing the same seed sequence as
miR-185, and therefore likely many of the same targets, could be involved in the same pathways
as miR-185. miR-185 is implicated as a tumor suppressor in cancers such as colorectal cancer
[130], breast cancer [131], prostate cancer [132], and lung cancer [133].

3.5 Conclusion:
In this investigation we have discovered 15 miRNA sequences contained within the protein-
coding regions of 20 different HIV-1 isolates. These miRNA sequences are homologous to
cellular miRNAs and may represent cellular homologues which are present in viral genomes and
could have some functional capacity. These purported viral homologues of cellular miRNAs
could play a role in viral infection and pathogenesis, based on the preliminary examination of
miRNA targeting and examples in the literature related to these miRNAs. Also, our
investigation shows that a few of the discovered miRNAs, namely miR-195, miR-4644, and
miR-586, may have connections to tumor suppressor activity, exosome presence and activity, and
possible regulation at the level of transcription. Finally, the presence of miR-4644 in 34 different
HIV-1 isolates also deserves additional attention and will be addressed in the next chapter.
The discovery of so many miRNA sequences contained within the protein-coding regions of the
HIV-1 virus suggests a novel and widespread phenomenon. These sequences come from isolates
which are geographically diverse; the sequences, while mainly in the envelope gene region, are
spread throughout the viral genome; and the sequences are also contained in regions of high
conservation. Further analysis of both the origin and function of these miRNA sequences may
yield many valuable insights into the pathogenesis of the human immunodeficiency virus.

3.6 Materials and Methods:


Source of Data: Previous proteomics and bioinformatics research in our laboratory had
identified >200 differentially expressed, functionally relevant proteins in an HIV-1 infected
CD4+ T-cell line (RH9) analyzed sequentially over a period of approximately 2 years [51, 52].
In our previous study, we used GeneSet2miRNA [70] to identify potential microRNAs that could
impact the activities of our differentially regulated proteins. Using an adjusted p-value of the
enrichment (adjusted for multiple testing by Monte-Carlo simulations) cutoff of 0.05, we
identified 8 miRNAs that may significantly bind to multiple mRNA targets.

Identification of homologous sequences: In this study, in order to identify all miRNA sequences
that could be homologous to HIV-1, we downloaded the entire database of all mature miRNA
sequences from the miRBase database (http://mirbase.org/). We used software designed and
developed internally to sort and subdivide all human mature miRNA sequences from the entire
database and to prepare our master query list (see Table 3.3). Each of the 2,588 mature human
miRNA sequences was used as a separate query, utilizing the mature length version of each
miRNA. The BLAST (Basic Local Alignment Search Tool) [71] program was used to search
each query against the entire HIV-1 database (http://blast.ncbi.nlm.nih.gov/) (HIV taxid: 11676) at the
National Center for Biotechnology Information (NCBI), with additional selective searches in the Los
Alamos HIV databases
(http://www.hiv.lanl.gov/content/sequence/BASIC_BLAST/basic_blast.html). All full-length
and partial HIV-1 genome sequences, representative of all HIV-1 clades and strains, were used
for the analyses. These sequences have been identified by the International Committee on
Taxonomy of Viruses (ICTV) and are available in the global public databases. The outputs from
the database searches were examined and the best matches from all microRNA query searches
were selected based on the length of the match, percentage of identity of match, lack of gaps or
deletions, and inclusion of the seed sequence. Clade or subtype was determined from individual
entries in the NCBI and Los Alamos databases.
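The selection criteria above can be expressed as a simple predicate. The sketch below is an illustration only: the struct layouts, field names, and numeric cutoffs are hypothetical, since the criteria were applied by inspection of the BLAST output and no fixed thresholds are stated in the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one BLAST hit; field names are illustrative. */
struct hit {
    int query_start;   /* 1-based first matched miRNA position */
    int align_len;     /* length of the aligned region (nt) */
    int identities;    /* identical positions within the alignment */
    int gaps;          /* number of gaps reported for the alignment */
};

/* Hypothetical cutoffs supplied by the caller. */
struct criteria {
    int min_align_len;     /* minimum acceptable match length (nt) */
    double min_identity;   /* minimum identity as a fraction, 0.0 to 1.0 */
};

/* Returns 1 if the hit satisfies all four criteria: sufficient length,
 * sufficient identity, no gaps, and inclusion of the seed. */
int hit_passes(const struct hit *h, const struct criteria *c)
{
    assert(h != NULL && c != NULL);
    if (h->align_len <= 0 || h->gaps != 0)
        return 0;
    if (h->align_len < c->min_align_len)
        return 0;
    if ((double)h->identities / h->align_len < c->min_identity)
        return 0;

    /* The seed (miRNA positions 2-7) must lie inside the aligned span. */
    int query_end = h->query_start + h->align_len - 1;
    return h->query_start <= 2 && query_end >= 7;
}
```

For instance, a gap-free hit starting at miRNA position 1 with 17 of 17 identical positions passes when the caller requires at least 15 nucleotides at 100% identity; those two numbers are chosen here purely for illustration.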

Clustal Analyses and Mapping of newly identified Sequences: The Clustal algorithm was used
for multiple sequence alignments [72, 73] (http://www.ebi.ac.uk/Tools/msa/clustalw2/), and in
particular to align and map the 20 matching HIV-1 isolates that we identified to the
HXB2 genome. The target regions of the alignment were then mapped to the HXB2 strain gene
map from the Los Alamos National Laboratory HIV genome database
(http://www.hiv.lanl.gov/), because this is one of the most complete reference sequence
data maps available for HIV-1. Results from the Clustal algorithm were then checked against the
Los Alamos HIV Compendium
(http://www.hiv.lanl.gov/content/sequence/HIV/COMPENDIUM/compendium.html) to verify that
the alignments from both sources were in agreement.

Table 3.3
Software for Filtering miRNA Data by Species
//
// main.c
// C_Programming
//
// Created by Bryan Holland on 12/10/12.
// Copyright (c) 2012 Bryan Holland. All rights reserved.
//

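#include <stdio.h>
#include <string.h>

/* Larger buffers so that long header or sequence lines are not split
   across two reads, which would shift the two-line entry pairing. */
#define NAME_LEN 256
#define SEQ_LEN 64

int main(void)
{
    FILE *infile, *outfile;
    char name[NAME_LEN], sequence[SEQ_LEN];

    infile = fopen("/Users/Desktop/miRNA Full Database Input.txt", "r");
    if (!infile)
    {
        puts("Input File Error!");
        return 1;
    }

    outfile = fopen("/Users/Desktop/filter.txt", "w");
    if (!outfile)
    {
        puts("Output File Error!");
        fclose(infile);
        return 1;
    }

    /* Each entry is two lines: a header line, then the sequence line. */
    while (fgets(name, NAME_LEN, infile) != NULL)
    {
        if (fgets(sequence, SEQ_LEN, infile) == NULL)
            break; /* header without a sequence line: stop reading */

        /* if line starts with '>hsa' then send entry to output file */
        if (strncmp(name, ">hsa", 4) == 0)
        {
            fputs(name, outfile);
            fputs(sequence, outfile);
        }
    }
    fclose(infile);
    fclose(outfile);

    /* Report on the console rather than writing into the results file. */
    puts("Results file created successfully!");
    return 0;
}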

Table 3.3: Software for Filtering miRNA Data by Species
The program above was developed internally by this lab to filter the miRBase database of all
known sequences into a subset of human miRNAs. This code reads in the entire miRBase database
of mature miRNA sequences, and pulls out the miRNAs which are human in origin. These human
miRNA sequences, representing all known human miRNAs, were written to a separate file for
further analysis.  

Chapter 4

Discovery of a Mature MicroRNA Sequence Which Encodes a
Highly Conserved Furin-Binding Site within Global HIV-1
Envelope gp160 Proteins: Prediction of Novel
MicroRNA-Regulatory Functions

4.1 Introduction:
In Chapter Three we used extensive computational and bioinformatics methods which resulted in
the discovery of 15 mature, full length (~22 nucleotides) human microRNAs within different
protein-coding regions of 20 distinct HIV-1 strains from different regions of the globe.
Subsequent mapping of the microRNA sequences discovered revealed that the 15 microRNAs
were embedded not only within the HIV-envelope gp120-coding sequences, as previously
described, but were also present in the gag, pol, nef and LTR regions of different HIV-1 isolates.
The miRNA sequences which we discovered, while all homologues of cellular miRNAs, are
each unique cellular miRNA sequences. They lie in different regions of the HIV-1 genome.
They likely have different origins, and the consequences of their genomic presence are as yet not
explored. Therefore, the direction we will take in these next few chapters is to investigate
individually several of these newly discovered miRNAs in order to further characterize them.
Of the 15 miRNA sequences discovered, most are present in only a single HIV-1
isolate, and a few are present in two or three isolates.
However, for one of the sequences, human cellular miRNA miR-4644, we
found 34 HIV-1 isolates containing the sequence. In our effort to further
characterize miR-4644 we have examined the nature of the 34 matching isolates and the location
of the sequence within these isolates, a region which also codes for protein. We have also studied
the phylogenetic relationships of the isolates. Finally, we have investigated possible targets of
miR-4644 in an effort to elucidate a possible connection between HIV-1 and other agents
involved in viral infection.

This chapter contains the results of our investigation into miR-4644, including the novel
finding that miR-4644 is embedded at exactly the same position in the gp120-coding
sequences of 34 HIV-1 strains isolated from different countries around the globe. A particularly
notable finding is that the sequence that constitutes miR-4644 contains a
region that encodes an REKR amino acid motif, the binding site for furin, a highly
conserved proprotein convertase essential for the cleavage of the HIV gp160 precursor into
gp120 and gp41. Without this cleavage HIV-1 cannot mature into an infectious particle. Because
this cleavage is essential for HIV infectivity, and the virus is rendered
noninfectious or less infectious if gp160 is not cleaved, we propose that miR-4644
embedded within HIV-1 gp160 would, once transcribed, bind its target site(s) in cellular furin
mRNA and downregulate or translationally silence furin expression in a self-regulatory loop,
thereby lowering or inhibiting the infectivity of these viruses and making them less
pathogenic. Alternatively, the presence or absence of mature miR-4644 embedded within the
HIV envelope sequences could be used as a predictive biomarker to distinguish
non-pathogenic or less pathogenic strains from pathogenic HIV strains.
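The coding potential of the matching region can be examined with a short translation sketch. The code below translates an RNA string in each of the three forward reading frames using the standard genetic code and tests for a motif. The function names are hypothetical, and the sketch makes no assumption about which frame the env gene actually uses in a given isolate.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Standard genetic code, indexed 16*b1 + 4*b2 + b3 with bases ordered
 * U, C, A, G; '*' marks a stop codon. */
static const char CODON_TABLE[] =
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG";

static int base_index(char b)
{
    switch (b) {
    case 'U': case 'T': return 0;
    case 'C': return 1;
    case 'A': return 2;
    case 'G': return 3;
    default:  return -1;
    }
}

/* Translates rna starting at offset frame (0, 1 or 2) into one-letter
 * amino acid codes; codons with unknown bases yield 'X'. The output is
 * NUL-terminated and truncated to out_size - 1 residues. Returns the
 * number of residues written. */
size_t translate_frame(const char *rna, int frame, char *out, size_t out_size)
{
    assert(rna != NULL && out != NULL && out_size > 0);
    assert(frame >= 0 && frame <= 2);
    size_t len = strlen(rna), n = 0;
    for (size_t i = (size_t)frame; i + 3 <= len && n + 1 < out_size; i += 3) {
        int a = base_index(rna[i]);
        int b = base_index(rna[i + 1]);
        int c = base_index(rna[i + 2]);
        out[n++] = (a < 0 || b < 0 || c < 0) ? 'X'
                                             : CODON_TABLE[16 * a + 4 * b + c];
    }
    out[n] = '\0';
    return n;
}

/* Returns 1 if the motif occurs in the given frame's translation.
 * Only the first 63 codons are examined in this sketch. */
int frame_has_motif(const char *rna, int frame, const char *motif)
{
    char protein[64];
    translate_frame(rna, frame, protein, sizeof protein);
    return strstr(protein, motif) != NULL;
}
```

Applied to the 17-nucleotide matching sequence UGGAGAGAGAAAAGAGA, frames 0 and 2 translate to WREKR and EREKR respectively, both containing REKR, while frame 1 (GERKE) does not.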
Through targeting studies, we also present evidence that both miRNA-4644 and miRNA-195
contain sequences which target binding domains in the 3’UTR of both Furin and PAPPA
mRNAs, indicating that these two microRNAs could possibly downregulate the gene expression
of these important products after HIV-infection of T cells.

4.2 Results:
miR-4644 has Multiple Matches within the Envelope Region of Various HIV strains
To recap our analysis above, we have scanned the more than 2,500 known human microRNA
sequences that are listed in miRBase for any homology with all known HIV-1 genomic and
subgenomic sequences that are contained in the NCBI database. Our analysis revealed 15 human
miRNAs showing significant homology in various HIV-1 isolates, with the homologies being
distributed across different genes or regions of the HIV-1 genome. Several of these miRNAs
have multiple matches, including miR-4644.
Human miR-4644 represents a unique finding in our investigation of the most significant
homologous sequences between human microRNAs and HIV-1 sequences. While all of the
other microRNAs we studied match to one, or at most a few HIV-1 isolates, miR-4644 has 34
significant matching HIV-1 strains in the NCBI HIV database. This number of matches
suggests some degree of conservation among HIV species.
These 34 matches are shown in Table 4.1, sorted according to clade. We see that 11 different
clades are represented, as well as many different countries, though mostly African. Seed match
and percent homology are also shown.

As with our other miRNA sequences, microRNA-4644 exhibits 100% homology, continuous and
uninterrupted, across its matching domain of 17nt. This represents nucleotide positions #1-17 of
the mature 23nt microRNA.
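The length of the uninterrupted match can be recomputed directly from the two sequences. The following is a minimal sketch with hypothetical function names; it assumes that both strings are aligned at miRNA position 1 and that unmatched viral positions are written as X, as in Table 4.1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the uninterrupted identical run starting at position 1
 * (index 0) of the miRNA. Comparison stops at the first mismatch or at
 * the end of the shorter sequence. */
size_t prefix_match_length(const char *mirna, const char *viral)
{
    assert(mirna != NULL && viral != NULL);
    size_t n = 0;
    while (mirna[n] != '\0' && viral[n] != '\0' && mirna[n] == viral[n])
        n++;
    return n;
}

/* Returns 1 if the seed (miRNA positions 2-7) lies entirely inside a
 * match of the given length that begins at position 1, 0 otherwise. */
int seed_is_covered(size_t match_len)
{
    return match_len >= 7;
}
```

Called with the miR-4644 sequence UGGAGAGAGAAAAGAGACAGAAG and the HIV sequence UGGAGAGAGAAAAGAGAXXXXXX, prefix_match_length returns 17, which corresponds to the 17/23 match length reported for each isolate.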
Additionally, alignments were performed between human mature miR-4644 and all matching
sequences, and are shown in Table 4.2.
In order to further characterize the miR-4644 matching HIV-1 sequences, we mapped all 34
matches to the HXB2 genome, using the Clustal alignment method as described above. This
mapping exercise showed that all 34 matching HIV-1 isolate sequences map to the same exact
position in the gp120 region of the envelope gene, at positions #7741-7757. This position lies
precisely at the border of the gp120 and gp41 regions of the envelope gene. The significance of
the position of this region will be discussed in more detail. The fact that all 34 sequences map to
the same position suggests a broad conservation of this miR-4644 sequence across many
different HIV-1 clades and countries of origin.

Table 4.1
Human miR-4644 Sequences Embedded within the HIV Envelope gp120
Regions of HIV Strains Isolated Globally

miR-4644 Sequence: UGGAGAGAGAAAAGAGACAGAAG


HIV Sequence: UGGAGAGAGAAAAGAGAXXXXXX

#   Matching HIV Isolate         Country of Origin  Clade   Seed Match  Percent Homology  Accession # of HIV Clone
1   T503006_sga01                Thailand           01_AE   6/6         100%              JF297228
2   T503006_sga22                Thailand           01_AE   6/6         100%              HQ691027
3   05GX001                      China              01_AE   6/6         100%              GU564221
4   2011.ANHUI.WH69              China              01_BC   6/6         100%              KC183781
5   01CM.1402MV                  Cameroon           01_F2   6/6         100%              GU201505
6   E7582                        Guinea-Bissau      02_AG   6/6         100%              JN863758
7   08.RU.SP-R589.VI.F8          Russia             A       6/6         100%              GU481555
8   SAL21                        Russia             A1      6/6         100%              JQ180262
9   RU00051                      Russia             A1      6/6         100%              EF545108
10  H01_3326                     Kenya              A1      6/6         100%              FJ346507
11  537-36                       Cameroon           A1      6/6         100%              EU618809
12  01CM.1404MV                  Cameroon           A1/F2   6/6         100%              AY371164
13  00CMNYU2541                  Cameroon           A1G     6/6         100%              EF025324
14  SE8603                       Uganda             A1/CD   6/6         100%              AF075702
15  09YNLC216002sg               China              BC      6/6         100%              KC898985
16  NARI-IVC2_NEM.J8             India              C       6/6         100%              EU908214
17  PC_388_04R                   Botswana           C       6/6         100%              KC628995
18  S121_016_1                   Botswana           C       6/6         100%              KF374143
19  3041_8                       Malawi             C       6/6         100%              KC862929
20  6838.v5.c36                  Tanzania           C       6/6         100%              HM215342
21  Z153MPL13MAR04ENV4.1         Zambia             C       6/6         100%              HM068597
22  34M.BMR.661                  Zambia             C       6/6         100%              HM036948
23  258_I_9                      Zimbabwe           C       6/6         100%              HQ708062
24  SO186_H6_5C                  South Africa       C       6/6         100%              JN681247
25  707PKE51F3                   South Africa       C       6/6         100%              HM623586
26  C.x.07.CAP229.4.19H5         South Africa       C       6/6         100%              KC863410
27  200b9b4                      South Africa       C       6/6         100%              GU216797
28  035b9h7                      South Africa       C       6/6         100%              GU216746
29  99ZALT15                     South Africa       C       6/6         100%              AY522723
30  02ZAPS001MB1                 South Africa       C       6/6         100%              DQ275648
31  C.705010185.w08.67dps.3_C12  South Africa       C       6/6         100%              JX973127
32  C.CAP239.w49.330dps.3_15_I5  South Africa       C       6/6         100%              JX976695
33  BP00031_env                  South Africa       C       6/6         100%              JN687821
34  03ZASK211B1                  South Africa       C       6/6         100%              DQ093601

Table 4.1: Human miR-4644 Sequences Embedded within the HIV Envelope gp120 Regions of
HIV Strains Isolated Globally
This table lists the 34 HIV isolates which show significant homology with the mature miR-4644
sequence.

Country of origin, clade, seed match and percent homology for each sequence are shown as well.
Percent homology refers to the percent match of nucleotides in the seed region and also the percent
of overall uninterrupted homology between miR-4644 and each HIV sequence (the first 17
nucleotides of miR-4644 match each isolate shown here with 100% homology).

Table 4.2
miR-4644 Significant Matches and Alignments

#   HIV Isolate           Country of Origin  HIV Region  Position   Percent Homology  Length of Match
1   SO186_H6_5C           South Africa       gp120       7741-7757  100%              17/23
2   707PKE51F3            South Africa       gp120       7741-7757  100%              17/23
3   C.x.07.CAP229.4.19H5  South Africa       gp120       7741-7757  100%              17/23
4   200b9b4               South Africa       gp120       7741-7757  100%              17/23
5   035b9h7               South Africa       gp120       7741-7757  100%              17/23
6   99ZALT15              South Africa       gp120       7741-7757  100%              17/23
7   3041_8                Malawi             gp120       7741-7757  100%              17/23
8   PC_388_04R            Botswana           gp120       7741-7757  100%              17/23
9   S121_016_1            Botswana           gp120       7741-7757  100%              17/23
10  258_I_9               Zimbabwe           gp120       7741-7757  100%              17/23
11  6838.v5.c36           Tanzania           gp120       7741-7757  100%              17/23
12  Z153MPL13MAR04ENV4.1  Zambia             gp120       7741-7757  100%              17/23
13  34M.BMR.661           Zambia             gp120       7741-7757  100%              17/23
14  H01_3326              Kenya              gp120       7741-7757  100%              17/23
15  E7582                 Guinea-Bissau      gp120       7741-7757  100%              17/23
16  537-36                Cameroon           gp120       7741-7757  100%              17/23
17  SAL21                 Russia             gp120       7741-7757  100%              17/23

[Alignment column: sequence alignment graphics not reproduced]

Table 4.2: miR-4644 Significant Matches and Alignments
Individual HIV isolate sequences are listed here along with the miR-4644 matching sequence and
their alignment. The HIV gene region is also shown, along with the nucleotide position in a
standardized HIV HXB2 strain to which the match was aligned.
The top row in each alignment represents the miRNA sequence, and the bottom row represents the
HIV sequence. The numbers on either side of each sequence correspond to respective position
numbers in that miRNA or isolate sequence.
Percent homology represents the amount of sequence identity between the two sequences of each
alignment, while length of match refers to the length of uninterrupted sequence homology between
miR-4644 and each isolate listed.

Phylogenetic Analysis of miR-4644 matching HIV Isolates


To better understand the phylogenetic relationships among the miR-4644-matching HIV-1
isolates, we aligned the env region sequences from all 34 isolates identified in our search using
the ClustalW2 program. From this alignment a tree file in Newick format was produced, which
encodes the relationships inferred from the alignment and was used to create the figure. The
final tree image can be seen in Figure 4.1.
We used the FigTree software program (http://tree.bio.ed.ac.uk/software/figtree/) to create the
tree image. The 34 sequences depicted are largely from the C clade, but intersubtype
recombinant sequences and nonrecombinant sequences of other subtypes are present as well.
Finally, the tree was re-rooted on a node which we believe is closer to the true center of the
HIV-1 M group.
Each leaf represents one of the viral isolates identified in our search as having a sequence
matching miR-4644.
The tree shows a strong cluster of the C clade sequences, as would be expected. The figure also
highlights the diversity of clades in which these sequences were found and suggests broad
conservation among HIV-1 strains.
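
As a quick sanity check on a Newick tree file, one can confirm that every isolate appears as a leaf before the tree is drawn. The sketch below is only an illustration under stated assumptions: the function name `newick_leaf_names` and the toy tree (its topology and branch lengths) are invented, it handles unquoted labels only, and a dedicated parser would be preferable for real tree files.

```python
import re

def newick_leaf_names(newick):
    """Return leaf labels from a Newick string (unquoted labels only)."""
    # A leaf label directly follows "(" or ","; a label after ")" would
    # name an internal node and is therefore not captured.
    return re.findall(r"(?<=[(,])\s*([^\s(),:;]+)", newick)

# Hypothetical toy tree: the isolate names come from Table 4.1, but the
# topology and branch lengths are invented for illustration only.
toy_tree = "((SO186_H6_5C:0.02,99ZALT15:0.03):0.01,(3041_8:0.04,PC_388_04R:0.05):0.02);"

leaves = newick_leaf_names(toy_tree)
print(len(leaves), leaves)
```

For the tree described here, the same check would be expected to report 34 leaves, one per matching isolate.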

Figure 4.1
Phylogenetic Tree of miR-4644 Sequences in Different HIV Clades

Figure 4.1: Phylogenetic Tree of miR-4644 matching HIV Isolates
The phylogenetic tree was constructed from an alignment of the envelope region sequences of the
miR-4644-matching isolates. Each leaf represents one of the 34 isolates identified in our study.
Clade clustering as well as diversity between clades can be seen.

miR-4644 - Sequence Alignment and Mapping to HIV gp160


The identification of 34 matching isolates for miR-4644 within 11 different clades prompted
further investigation into the nature of these particular HIV-1 strains. As we have noted, these
strains are mostly from Africa and Asia, and the match maps to the same location in all of them,
positions 7741-7757 within the envelope gene.
The large number of significant matches and their broad distribution among clades indicate a
level of conservation within the HIV-1 envelope gp120 region, and therefore led our lab to
explore whether the sequence could have a functional role.
Since we have mapped the miR-4644 sequence to positions 7741-7757 in the HXB2 genome, we
can visualize its position on an HXB2 genomic map diagram. Figure 4.2 shows that, in the
isolates we have identified, the miR-4644 sequence lies within the envelope gene but not within
a hypervariable or ‘V’ region. Instead it lies at the very end of the gp120 region: it ends exactly
at the junction of gp120 and gp41, and lies completely on the gp120 side.

Figure 4.2
miR-4644 Maps to the gp120-gp41 Junction in the HIV Genome of 34 Isolates

HXB2 Nucleotide  Amino   Envelope  miR-4644
Position         Acid    Region    Sequence
7735             R arg   gp120
7736             -       gp120
7737             -       gp120
7738             V val   gp120
7739             -       gp120
7740             -       gp120
7741             V val   gp120     U
7742             -       gp120     G
7743             -       gp120     G
7744             Q gln   gp120     A
7745             -       gp120     G
7746             -       gp120     A
7747             R arg   gp120     G
7748             -       gp120     A
7749             -       gp120     G
7750             E glu   gp120     A
7751             -       gp120     A
7752             -       gp120     A
7753             K lys   gp120     A
7754             -       gp120     G
7755             -       gp120     A
7756             R arg   gp120     G
7757             -       gp120     A
7758             -       gp41      <-- gp120-gp41 Junction / Furin Cleavage Site
7759             A ala   gp41
7760             -       gp41
7761             -       gp41
7762             V val   gp41
7763             -       gp41
7764             -       gp41
7765             G gly   gp41
7766             -

The presence of a miR-4644 sequence precisely at the gp120-gp41 junction appeared unlikely to
be coincidental and warranted further investigation. The next question was what distinguishes
these 34 sequences in this region from all other HIV sequences. To answer this question, an
alignment was performed; the results are shown in Figure 4.3.

Figure 4.3
Alignment of HIV Isolate Sequences Matching Human miR-4644

[Figure: alignment image not recoverable. Summary rows: miR-4644; All Matching Strains; Non-Matching Strains]

Figure 4.3: Alignment of HIV Isolate Sequences Matching Human miR-4644
Alignment of 34 HIV Isolates from 11 Different Clades is given along with miR-4644. The
alignment shows the region of match between miR-4644 and the various HIV isolates in the gp120-
gp41 junction region. Also shown is the sequence for the HIV reference strain HXB2. The bottom
of the figure is a summary which shows the miR-4644 with all of the 34 matching strains shown in
yellow as one sequence (they are all the same in this region) and the sequence which is
representative of most other strains, as exemplified by HXB2. The single nucleotide difference of
an ‘A’ at position 14 is highlighted in red.

In the figure above, each of the matching HIV-1 isolates is listed along with its sequence in the
gp120-gp41 region. The positions corresponding to each individual sequence are given on the
right. The gp120 and gp41 regions are shown graphically on a bar at the top of the figure, and
the demarcation between the gp120 and gp41 regions is also shown.
At the top of this figure we show the miR-4644 sequence aligned to the 34 sequences, with
regions of match highlighted in yellow. The miR-4644 sequence matches each of the 34 isolate
sequences with 100% homology over its first 17 nucleotides, while its last 6 nucleotides do not
match the strains shown. The region of match also ends exactly at the end of the gp120 region,
at the junction between gp120 and gp41.
As we will discuss in detail later, this gp120-gp41 region of the HIV-1 genome is functionally
critical because it codes for a protease cleavage site. Cleavage of the gp160 nascent polyprotein
into its constituent gp120 and gp41 subunits is necessary to produce infectious virus. Because of
its functional importance, this region is highly conserved, and this conservation can be seen in
the figure.
We would like to know how these 34 isolate sequences differ from all other HIV-1 sequences. As
a first pass, we examine this alignment in the context of the HXB2 sequence, which is given at
the bottom of the figure. HXB2 is not one of our matching strains; rather, it is a reference
strain against which we look for divergences in the alignment. Comparing the alignment with the
HXB2 sequence, which is representative of non-matching strains, makes the difference evident:
there is a single nucleotide change at the position corresponding to nucleotide 14 of the
miR-4644 sequence. In our 34 sequences, as in miR-4644, the nucleotide at this position is a
‘G’, while the nucleotide at this position in the HXB2 sequence is an ‘A’.
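
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used in this study: the 17-nucleotide miR-4644 segment is read from Figure 4.2, while `non_matching` is a hypothetical sequence constructed only from the statement that non-matching strains carry an ‘A’ at position 14, and the function name `mismatch_positions` is likewise an assumption.

```python
def mismatch_positions(query, subject):
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(query) != len(subject):
        raise ValueError("sequences must have the same length")
    return [i for i, (q, s) in enumerate(zip(query, subject), start=1) if q != s]

# First 17 nucleotides of miR-4644, read from Figure 4.2 (HXB2 positions 7741-7757).
mir4644_17 = "UGGAGAGAGAAAAGAGA"

# Hypothetical non-matching sequence: identical except for an 'A' at position 14,
# as the text describes. It is not copied from an actual HXB2 record.
non_matching = mir4644_17[:13] + "A" + mir4644_17[14:]

print(mismatch_positions(mir4644_17, non_matching))
```

Applied to real sequences, the same function would list every position of divergence within the aligned window, so additional differences would not be hidden.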

An enlargement of the bottom part of Figure 4.3 shows this difference:

[Figure detail: aligned rows for miR-4644, All Matching Strains, and Non-Matching Strains; sequence image not recoverable]

As we shall see by comparison with other HIV-1 sequences, this substitution of a ‘G’ for an ‘A’
is the distinguishing feature of these 34 isolates in relation to all other HIV-1 sequences.
Only these 34 HIV isolates contain a mature miR-4644 sequence because only they carry the ‘G’
nucleotide at this position.
A one-nucleotide difference at this position is therefore enough to essentially ‘create’ a
miR-4644 sequence. This substitution has ramifications for the amino acid conservation of this
functionally critical site, for the pathogenesis of these isolates, and for the possible
functional role that the miR-4644 sequence may confer on the virus. These topics will be
discussed at length.

Targets of viral miRNAs Identified – Furin as a possible target


FURIN 3’UTR has multiple target sites for 4 of our identified miRNAs
We will next discuss miRNA targeting in light of these findings. Our discovery that a miR-4644
sequence is co-located with a furin substrate cleavage site in HIV-1 genomes prompted us to
investigate furin more closely, to see whether there was a connection between this protease and
any of the miRNAs we have identified.
Using several target-prediction software packages, including TargetScan, we examined the Furin
3’UTR in detail and found miRNA target sites for several of the miRNAs identified in our study.
In total, 4 of the 15 miRNAs (miR-4644, miR-195, miR-4652, and miR-6766) have 8 target sites in
this one 3’UTR. These include miR-4644, the miRNA whose sequence resides at the exact position
of subsequent proteolytic cleavage by Furin. The implications of this will be discussed further.
These results can be visualized in Figure 4.4. Each of the 8 target sites is shown as a box
indicating which miRNA targets the site and the target position in the 3’UTR, with an arrow
indicating its position on the map. The figure shows the number of target sites for each miRNA:
miR-4644 has 2 distinct 3’UTR target sites, miR-6766 has 3, miR-4652 has 2, and miR-195 has 1.
Figure 4.4
Target Sites in the 3'UTR of Furin, a gp160 Protease

Human FURIN 3'UTR Length: 1580
Sites for miRNAs identified in our study:

miRNA     Target site positions in the 3'UTR
miR-195   829
miR-4644  431, 1177
miR-6766  201, 575, 820
miR-4652  1319, 1337

4 miRNAs (miR-4644, miR-195, miR-4652 and miR-6766) in our study have 8 target sites in the Furin 3'UTR

Figure 4.4: Target Sites in the 3’UTR of Furin, a gp160 Protease
Schematic version of the human Furin 3’UTR, showing distances and target sites for miRNAs
identified in this study. miRNA labels are shown graphically at the position of their respective
target sites and positions are also shown numerically. (Original figure adapted from TargetScan,
Nam et al., 2014 [134]).

Multiple target sites for a single miRNA in a given 3’UTR are a common phenomenon and
represent a fine-tuning mechanism for gene regulation through RNA silencing.
Our identification of target sites for 4 different miRNAs suggests a possible cooperative
regulation of this particular protein by these miRNAs.
In the next figure we will examine one of these target sites in more detail.
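
The site map in Figure 4.4 can be tabulated to confirm the total and to list sites that lie close together in the 3’UTR. The sketch below is illustrative only: the positions are transcribed from Figure 4.4, while the function name `nearby_site_pairs` and the 40-nucleotide window are arbitrary assumptions, and proximity by itself does not establish cooperative regulation.

```python
from itertools import combinations

# Target-site positions in the human Furin 3'UTR, transcribed from Figure 4.4.
furin_sites = {
    "miR-195": [829],
    "miR-4644": [431, 1177],
    "miR-6766": [201, 575, 820],
    "miR-4652": [1319, 1337],
}

def nearby_site_pairs(sites, max_gap):
    """Return pairs of (position, miRNA) sites lying within max_gap nucleotides."""
    flat = sorted((pos, name) for name, positions in sites.items() for pos in positions)
    return [(a, b) for a, b in combinations(flat, 2) if b[0] - a[0] <= max_gap]

total = sum(len(positions) for positions in furin_sites.values())
print(total)

# max_gap=40 is an arbitrary illustrative window, not a value taken from this study.
for pair in nearby_site_pairs(furin_sites, max_gap=40):
    print(pair)
```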

FURIN 3’UTR has conserved sites for miR-4644


To understand in more detail the nature of these target sites, we chose one of them, the
miR-4644 target site at position 431 in the human Furin 3’UTR, for closer examination. Results
are shown in Figure 4.5.
We used the TargetScan software to investigate this miR-4644 target site. TargetScan displays
the entire Furin 3’UTR with all of its associated target sites, and also allows any individual
target site to be examined in detail. An important consideration when evaluating possible
miRNA target sites in 3’UTRs is conservation between species. To show the degree of
conservation of this site, the site at position 431 is expanded and shown as an alignment of 14
species: the human target site sequence, listed first, and the aligned sequences of 13 other
species. The seed match for this target is highlighted in yellow and indicates a ‘7mer-A1’ type
target site.
The alignment shows that this particular target site is highly conserved among vertebrates.
The yellow shading marks the species in which the seed is 100% conserved: the exact seed
sequence is present in 11 of the species, and the 3 other species have a seed sequence which
differs by only one nucleotide. Broad conservation across species is taken as supporting
evidence for the validity and functionality of predicted target sites.
The presence of conserved target sites for miR-4644 in the 3’UTR of the furin mRNA provides a
possible mechanism for the regulation of furin expression by this miRNA and suggests a possible
connection between miR-4644 and HIV-1 envelope glycoprotein expression.

Figure 4.5
miR-4644 has Multiple Conserved Target Sites in the Furin 3' UTR

[Figure: schematic of the Furin mRNA (5’ UTR, start, coding sequence, stop, 3’UTR, poly-A tail); 3' UTR length: 1580, with two conserved miR-4644 target sites marked; alignment of the miR-4644 target site at position 431 for Human, Chimp, Rhesus, Squirrel, Mouse, Rat, Rabbit, Pig, Cow, Cat, Dog, Brown bat, Elephant, and Opossum. Image not recoverable.]

Figure 4.5: miR-4644 has Multiple Conserved Target Sites in the Furin 3’UTR
This figure shows the human Furin mRNA diagrammatically at the top and expands the 3’UTR in the
center, showing length information and miRNA target positions and revealing two distinct target
sites for miR-4644. The bottom of the figure shows one of the miR-4644 target sites, at position
#431, expanded in the form of an alignment. The seed sequence is shaded in yellow for each
species in the alignment which contains the miR-4644 target site.

We next examined the sequence detail of the interaction between miR-4644 and its target sites
in the 3’UTR of furin mRNA. Figure 4.6 shows both miR-4644 Furin target sites, which reside at
positions #431-437 and #1177-1183 in the 3’UTR. The figure shows the nature of the interaction
between miR-4644 and each target site, which consists of base pairing in the seed region. Both
target sites are of the ‘7mer-A1’ type, which means that base pairing exists at seed positions
2-7, with the addition of an ‘A’ nucleotide in the target mRNA opposite position 1. A ‘7mer-A1’
target site is considered a stronger site than one which matches the seed at positions 2-7 only.
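
The site-type rule described above can be expressed as a short classifier. This is a sketch under stated assumptions, not TargetScan's implementation: the function name `find_seed_sites` and the example UTR fragments are hypothetical, the miRNA string is the 17-nucleotide segment read from Figure 4.2, and the ‘7mer-m8’ and ‘8mer’ labels follow the usual TargetScan convention (a match at position 8, without or with the ‘A’ opposite position 1).

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna):
    """Return the reverse complement of an RNA string (Watson-Crick pairs only)."""
    return rna.translate(COMPLEMENT)[::-1]

def find_seed_sites(mirna, utr):
    """Scan a 3'UTR for seed matches and label each as 6mer, 7mer-A1, 7mer-m8 or 8mer.

    Returns (position, site_type) tuples; position is the 1-based index of the
    first UTR nucleotide that pairs with the seed (miRNA positions 2-7).
    """
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    core = reverse_complement(mirna[1:7])   # UTR sequence pairing with positions 2-7
    m8_base = reverse_complement(mirna[7])  # UTR base pairing with position 8
    sites = []
    start = utr.find(core)
    while start != -1:
        end = start + len(core)
        has_m8 = start > 0 and utr[start - 1] == m8_base
        has_a1 = end < len(utr) and utr[end] == "A"  # 'A' opposite miRNA position 1
        if has_m8 and has_a1:
            kind = "8mer"
        elif has_m8:
            kind = "7mer-m8"
        elif has_a1:
            kind = "7mer-A1"
        else:
            kind = "6mer"
        sites.append((start + 1, kind))
        start = utr.find(core, start + 1)
    return sites

# First 17 nucleotides of miR-4644, read from Figure 4.2.
mir4644 = "UGGAGAGAGAAAAGAGA"

# Hypothetical UTR fragments, invented for illustration only.
print(find_seed_sites(mir4644, "GGGCUCUCCAGGG"))
print(find_seed_sites(mir4644, "AAUCUCUCCGAA"))
```

Only contiguous Watson-Crick pairing is considered here; G:U wobble pairs and pairing outside the seed are ignored.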

Figure 4.6
Pairing of miR-4644 seed with Furin 3' UTR

Predicted seed pairing of target region (top) and miRNA (bottom)     Site type
Position 431-437 of FURIN 3' UTR (5' to 3')
hsa-miR-4644 (3' to 5')                                              7mer-A1
Position 1177-1183 of FURIN 3' UTR (5' to 3')
hsa-miR-4644 (3' to 5')                                              7mer-A1

[Paired sequences not recoverable]

Figure 4.6: Sequence Detail of the Pairing of miR-4644 Seed with Human Furin 3’ UTR
Target region and miRNA pairing for two miR-4644 target sites in the furin 3’ UTR are given with
base pairing regions shown joined in red. Position information for the UTR and target site type are
also displayed. (Original figure adapted from TargetScan, Nam et al., 2014 [134]).

Additional base pairing between miR-4644 and these target sites beyond the seed region is not
immediately evident from the alignment shown in this figure. We therefore used the software
RNAhybrid, which calculates free energies of RNA-RNA interactions, to estimate the binding
strength, and therefore the likelihood, of a miR-4644/Furin mRNA interaction. RNAhybrid
computed an alignment with a minimum free energy (mfe) of -18.2 kcal/mol; the alignment is
shown graphically in Figure 4.7. Here we see complete pairing of miR-4644 through the seed
region, defined as positions 2-7, as well as predicted base pairing at position 1. The model
also predicts pairing at positions 18-21, which adds stability to the resulting RNA structure.
Figure 4.7
miR-4644 Pairing Interactions with the Furin mRNA 3’ UTR Target Site

[Figure: predicted duplex of Furin mRNA and miR-4644, with 5’ and 3’ ends and the seed region labeled; mfe: -18.2 kcal/mol. Image not recoverable.]

Figure 4.7: Pairing of Human miR-4644 with its Target Site in the Furin 3’ UTR
Binding of miR-4644 with Furin as predicted by RNAhybrid is shown. Predicted mfe = -18.2
kcal/mol. Note pairing throughout the seed region at positions 2-7 as well as at position 1, and
further at nucleotide positions 18-21.

Finally, Figure 4.8 illustrates the base pairing interaction between miR-4644 and Furin mRNA at
successive scales, from ‘zoomed out’ to ‘zoomed in’, to provide a more comprehensive picture of
the miR-4644/Furin interaction.

Figure 4.8
microRNA-4644 Targets Furin mRNA by Base Pairing Within the 3'UTR

[Figure: three-level schematic. Top: Furin mRNA (5’ cap, 5’UTR, start codon, coding region, stop codon, 3’UTR, poly A tail). Middle: 3’UTR expanded (UTR Length: 1580) with miRNA-4644 bound to the target mRNA. Bottom: base pairing between Furin mRNA and miR-4644. Image not recoverable.]

Figure 4.8: microRNA-4644 Targets Furin mRNA by Base Pairing within the 3’UTR
miR-4644 base pairing interaction within the Furin mRNA 3’UTR is shown schematically. Top
row represents the complete Furin mRNA; middle section shows the Furin 3’UTR expanded to
reveal the location of the target site within the UTR to which miR-4644 binds; and the bottom
shows the specific base pairing interactions between miR-4644 and its target within the mRNA as
depicted in Figure 8-3.

Targets of viral miRNAs Identified – PAPPA as a possible target
PAPPA 3’UTR has multiple target sites for 5 of our identified miRNAs
PAPPA is a protein identified in previous research as differentially expressed (down-regulated)
in chronically infected H9 cells. Our preliminary research indicated that PAPPA may be targeted
by many of the 15 miRNAs which we have identified. We therefore examined the human PAPPA 3’UTR
in detail and identified multiple target sites for miRNAs from this study. In total, 5 of the 15
miRNAs in this study (miR-195, miR-4644, miR-586, miR-548ah-3p, and miR-548am-3p) show 15 target
sites in this one 3’UTR.
These target sites are shown in diagram form in Figure 4.9. Each of the 15 target sites is shown
as a box indicating which miRNA targets the site and its position in the 3’UTR, with an arrow
indicating the position of the target site on the map. The figure shows multiple sites for each
miRNA: miR-195 and miR-4644 have 4 target sites each, miR-586 has 3, and miR-548ah-3p and
miR-548am-3p have 2 each.
As with the Furin 3’UTR, the identification of target sites for 5 different miRNAs in our study
suggests a possible cooperative regulation of this protein by these miRNAs. We chose target
sites for two miRNAs, miR-195 and miR-4644, to examine more closely.
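
The counts given above can be checked directly against the positions labeled in Figure 4.9. The following sketch is illustrative only: the positions are transcribed from the figure, and the variable names are assumptions. Grouping by position also shows which labeled positions carry more than one miRNA.

```python
from collections import defaultdict

# Target-site positions in the human PAPPA 3'UTR, transcribed from Figure 4.9.
pappa_sites = {
    "miR-195": [831, 846, 3996, 5337],
    "miR-4644": [90, 1470, 2264, 2290],
    "miR-586": [399, 2798, 3576],
    "miR-548ah-3p": [3983, 5369],
    "miR-548am-3p": [3983, 5369],
}

total_sites = sum(len(positions) for positions in pappa_sites.values())

# Group miRNAs by position to reveal positions labeled for more than one miRNA.
by_position = defaultdict(list)
for name, positions in pappa_sites.items():
    for pos in positions:
        by_position[pos].append(name)
shared = {pos: names for pos, names in by_position.items() if len(names) > 1}

print(total_sites, len(by_position))
print(shared)
```

In this tabulation the two miR-548 family members are labeled at the same two positions, so the 15 miRNA-site pairs fall on 13 distinct positions.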

Figure 4.9
Multiple Target Sites in the 3'UTR of PAPPA, an Embryonic, Down-Regulated Protein

Human PAPPA 3'UTR Length 5705
Conserved sites for miRNAs identified in our study:

miRNA         Target site positions in the 3'UTR
miR-195       831, 846, 3996, 5337
miR-4644      90, 1470, 2264, 2290
miR-586       399, 2798, 3576
miR-548ah-3p  3983, 5369
miR-548am-3p  3983, 5369

Figure 4.9: Target Sites in the 3’UTR of PAPPA, an Embryonic, Down-Regulated Protein
Diagram version of the human PAPPA 3’UTR, showing distances and target sites for miRNAs
identified in this study. miRNA labels are shown graphically at the position of their respective
target sites and positions are also shown numerically. 5 miRNAs (miR-195, miR-4644, miR-586,
miR-548ah-3p, and miR-548am-3p) have a total of 15 target sites in the PAPPA 3’UTR. (Original
figure adapted from TargetScan, Nam et al., 2014 [134]).

PAPPA 3’UTR has conserved sites for miR-195 and miR-4644
Examination of miR-195 target site in Human PAPPA 3’UTR shows broad conservation
among species
To understand in more detail the nature of the target sites shown in the above figure, we chose
two target sites in the human PAPPA 3’UTR for closer examination. The first was the miR-195
target site at position #833 in the 3’UTR, shown in Figure 4.10.
Using the TargetScan software again, we are able to visualize the entire PAPPA 3’UTR. Figure
4.10 shows the target site for miR-195 beginning at position #833. The seed match for this target
is highlighted and in this instance indicates an ‘8mer’ type target site, which TargetScan
defines as a match to positions 2 through 8 of the mature miRNA together with an ‘A’ in the
target opposite position 1, and which is considered the strongest type of target site. The
sequence detail for this site is presented as an alignment, so that we can see not just the
human target site sequence, listed first, but also the aligned sequences for 16 other species.
The sequence alignment in Figure 4.10 shows that this particular target is very highly conserved
among vertebrates. The conservation extends beyond mammals, as can be seen from the last species
listed, X. tropicalis (frog). Broad conservation among species is taken as supporting evidence
for the functionality of predicted target sites.

Figure 4.10
miR-195 has Multiple Conserved Target Sites in the PAPPA 3' UTR

[Figure: schematic of the PAPPA mRNA (5’ UTR, start, coding sequence, stop, 3’UTR, poly-A tail); 3' UTR length: 5706, with four conserved miR-195 target sites marked; alignment of the miR-195 target site at position 833 for Human, Chimp, Rhesus, Squirrel, Mouse, Rat, Rabbit, Pig, Cow, Cat, Dog, Brown bat, Elephant, Opossum, Macaw, Chicken, and Frog. Image not recoverable.]

Figure 4.10: miR-195 has Multiple Conserved Target Sites in the PAPPA 3’UTR
This figure gives a macro-to-micro view of the PAPPA mRNA and one arbitrarily selected miRNA
target within its 3’UTR. The top of the figure shows PAPPA depicted as a typical mRNA, with a
cap, 5’ UTR, open reading frame, 3’ UTR and poly A tail. The middle section of the figure zooms
in on the 3’ UTR for this mRNA, showing its length and depicting miRNA target sites for miR-195.
The bottom of the figure zooms in again on one target site for miR-195 beginning at position #833
of the 3’ UTR.

Next, to visualize the base pairing interactions between miR-195 and this target site, we used
the program RNAhybrid to estimate the binding strength of this particular miR-195/PAPPA
interaction. RNAhybrid computed an alignment with a minimum free energy (mfe) of -17.7
kcal/mol. The alignment is shown graphically in Figure 4.11. Here we see complete pairing of
miR-195 through the seed region, defined here as positions 2-8, as well as at position 1. The
model also predicts pairing at positions 13-15, where complementarity is thought to be
important for in vivo miRNA targeting, and further at positions 18-20.

Figure 4.11
miRNA-195 Base Pairs with PAPPA 3’ UTR via Seed and Other Interactions

[Figure: predicted duplex of miR-195 and PAPPA mRNA, with 5’ and 3’ ends and the seed region labeled; mfe: -17.7 kcal/mol. Image not recoverable.]

Figure 4.11: miR-195 Base Pairs with PAPPA 3’UTR via Seed and Other Interactions
Binding of miR-195 with PAPPA as predicted by RNAhybrid is shown. Predicted mfe = -17.7
kcal/mol. Note pairing throughout seed region at positions 2-8 as well as at position 1 and further
at nucleotide positions 13-15 and positions 18-20.

Examination of miR-4644 target site in Human PAPPA 3’UTR also shows broad conservation
among species
The second miRNA that we chose to examine more closely is miR-4644. Figure 4.9, given
previously, shows 4 target sites for miR-4644 in the PAPPA 3’UTR. Using the same methods as
above, we examined these target sites. Figure 4.12 shows the target site for miR-4644 beginning
at position #90. This particular site was chosen arbitrarily to illustrate the detail of the
target site interaction and to show conservation among species. The seed match for this target
is highlighted and in this instance indicates a ‘7mer-A1’ type target site, which means there is
a perfect match to positions 2-7 (the seed) of the miRNA, with an ‘A’ in the target opposite
position 1.
Again, we see a high level of conservation of this target site among vertebrates, with 11 of the
17 species in the alignment exhibiting a perfect match of the seed sequence, and 2 other species
differing by just one nucleotide.
Figure 4.12
miR-4644 has Multiple Conserved Target Sites in the PAPPA 3'UTR

[Figure: schematic of the PAPPA mRNA (5’ UTR, start, coding sequence, stop, 3’UTR, poly-A tail); 3' UTR length: 5706, with four conserved miR-4644 target sites marked; alignment of the miR-4644 target site at position 90 for Human, Chimp, Rhesus, Squirrel, Mouse, Rat, Rabbit, Pig, Cow, Cat, Dog, Brown bat, Elephant, Opossum, Macaw, Chicken, and Frog. Image not recoverable.]

Figure 4.12: miR-4644 has Multiple Conserved Target Sites in the PAPPA 3’UTR
This figure gives a macro-to-micro view of the PAPPA gene and one arbitrarily selected miR-4644
target within its 3’UTR. The top of the figure shows the PAPPA depicted as a generic mRNA. The
middle section of the figure zooms in on the 3’ UTR for this mRNA, showing its length and
depicting miRNA target sites. Finally, the bottom of the figure zooms in again on one target site for
miR-4644 beginning at position #90 of the 3’ UTR.
Sequence detail for the interaction between miR-4644 and its target sites in the 3’UTR of PAPPA
mRNA is shown in Figure 4.13. Using TargetScan, we display all four target sites for this miRNA,
which reside at positions 91-97, 1473-1479, 2265-2271, and 2293-2299 of the 3’UTR. The figure
shows the nature of the interaction between miR-4644 and each target site. Three of the target
sites are of the ‘7mer-A1’ type, meaning that base pairing exists at seed positions 2-7 and, in
addition, the paired sequence in the mRNA is followed by an ‘A’. The other target site is a
‘7mer-m8’ type, which means that there is a seed match at positions 2-7 plus a match at
position 8.
Both the ‘7mer-A1’ and the ‘7mer-m8’ types are considered stronger sites than a site which
exhibits a seed match only (6mer).

Figure 4.13
Pairing of miR-4644 seed with PAPPA 3' UTR

Predicted seed pairing of target region (top) and miRNA (bottom)     Site type
Position 91-97 of PAPPA 3' UTR (5' to 3')
hsa-miR-4644 (3' to 5')                                              7mer-A1
Position 1473-1479 of PAPPA 3' UTR (5' to 3')
hsa-miR-4644 (3' to 5')                                              7mer-m8
Position 2265-2271 of PAPPA 3' UTR (5' to 3')
hsa-miR-4644 (3' to 5')                                              7mer-A1
Position 2293-2299 of PAPPA 3' UTR (5' to 3')
hsa-miR-4644 (3' to 5')                                              7mer-A1

[Paired sequences not recoverable]

Figure 4.13: Sequence Detail of the Pairing of miR-4644 Seed with Human PAPPA 3’ UTR
Target region and miRNA pairing for four miR-4644 target sites in the PAPPA 3’UTR are given
with seed base pairing regions shown joined in red. Position information for the UTR and target
site type are also displayed. (Original figure adapted from TargetScan, Nam et al., 2014 [134]).

To visualize the base pairing interactions between miR-4644 and this target site, we again used
the program RNAhybrid to estimate the binding strength of the miR-4644/PAPPA interaction.
RNAhybrid computed an alignment with a minimum free energy (mfe) of -17.8 kcal/mol. The
alignment is shown graphically in Figure 4.14. Here we see complete pairing of miR-4644 through
the seed region, shown here as positions 1-7. The model also predicts pairing at position 9, at
positions 14-17, where complementarity is thought to be important for in vivo miRNA targeting,
and lastly at positions 19-20.

Figure 4.14
miR-4644 Base Pairs with PAPPA 3'UTR via Seed and Other Interactions

[Figure: predicted duplex of miR-4644 and PAPPA mRNA, with 5’ and 3’ ends and the seed region labeled; mfe: -17.8 kcal/mol. Image not recoverable.]

Figure 4.14: miR-4644 Base Pairs with PAPPA 3’UTR via Seed and Other Interactions
Binding of miR-4644 with PAPPA as predicted by RNAhybrid is shown. Predicted mfe = -17.8
kcal/mol. Note pairing throughout seed region at positions 1-7 as well as positions 9, 14-17, and
further at nucleotide positions 19-20.

4.3 Discussion:

miR-4644
A paper published in 2011 used deep sequencing to identify miRNA sequences in breast tumor
tissue and to compare them with those in non-tumorigenic tissue and tumor-adjacent tissue. The
authors analyzed 5 breast cancer patients, with three samples from each patient. In total they
found 361 miRNA precursor hairpin sequences and 535 new mature miRNAs [135]. In 2011, only 904
mature miRNAs had been identified and listed in miRBase, so this was a significant addition.
Of the miRNA sequences they found, two-thirds were present in other human tissue samples,
based on the bioinformatics data available to the researchers at the time. About half of the
sequences were found associated with Ago2 in MCF7 cells. Several miRNAs were found in regions
of genomic amplification associated with cancer pathogenesis, suggesting that they may be
connected.
miR-4644 is one of the sequences identified in this study. The only data given in this paper on
miR-4644 specifically show that it is a mature miRNA present on the -3p side of the precursor
miRNA hairpin structure with no -5p counterpart, and that it was not found associated with
Ago2, as many of the other miRNAs were. The paper does not state whether miR-4644 was found in
any other human tissue sample data. miRBase gives no other references on the initial
identification of human miR-4644.
A current literature search reveals that miR-4644 has been implicated in pancreatic cancer.
Madhavan et al. show that miR-4644 is significantly upregulated in 83% of pancreatic cancer
(PaCa) serum exosomes but not in control groups, indicating that miR-4644 may serve as a
biomarker for this disease [136]. miR-4644 also appears to be involved in the gemcitabine
resistance that develops among pancreatic cancer patients [137].

Envelope Expression and Processing


env Gene Expression
The HIV-1 envelope glycoprotein is a virally encoded molecule present on the surface of the
virion; it mediates membrane fusion and delivery of the viral capsid into the cell.
Transcription of the envelope gene yields a singly spliced mRNA consisting of a 5’ leader joined
to the coding region of env, resulting in a bicistronic Vpu/Env mRNA 4.3 kb in length [138]. The
first translated amino acids comprise a hydrophobic signal peptide which localizes the nascent
protein to the rough endoplasmic reticulum (ER). This leader peptide is subsequently removed by
a signal peptidase in the ER. The Env precursor, known as gp160, is an integral membrane
protein. Specific hydrophobic peptide motifs in the gp41 transmembrane (TM) domain anchor Env
gp160 in the internal cellular membranes during its processing in the ER and Golgi. A schematic
diagram of the HIV-1 envelope glycoprotein is shown in Figure 4.15.
[Figure: schematic of gp160, divided into gp120 and gp41. Labeled elements: SP; V1, V2, V3, V4, V5; FP, HR1, HR2, TMD. Image not recoverable.]

Figure 4.15: HIV-1 Envelope Glycoprotein


Structure and regions of the HIV-1 Envelope glycoprotein are shown. The entire Envelope
precursor has a molecular weight of 160 kD and is referred to as gp160. Gp160 is eventually
cleaved into two associated subunits, gp120 and gp41. SP=signal peptide, which localizes the
nascent polyprotein to the rough endoplasmic reticulum. C1 through C5 are conserved (constant)
regions of the gp120 subunit. V1 through V5 are regions of extensive variability. In the gp41
domain, FP=fusion peptide; HR1 and HR2=heptad repeats involved in membrane fusion;
TMD=transmembrane domain. (Original figure adapted from Fields et al., 2007 [138]).

Gp160 is co-translationally glycosylated with oligosaccharides on asparagine (Asn) residues.
The Env proteins are then folded and oligomerized in the ER, and this oligomerization is
required for stable expression. The oligomers form heterotrimers, and this folding is catalyzed by
chaperones present in the ER. Cys-Cys disulfide bonds that form between residues in the gp120
subunit join together the V1-V4 regions.
The gp160 polyprotein is eventually transported to the Golgi network, where it is cleaved by
Furin, a cellular protease that is discussed in further detail below [139]. The highly conserved
recognition motif Lys/Arg-X-Lys/Arg-Arg (R/K-X-R/K-R) is recognized by Furin or other
proprotein convertases (PCs), and proteolytic cleavage occurs immediately on the carboxyl side of this
motif. Cleavage is essential for the normal function of the Env protein and for virus infectivity
[140]. This cleavage of gp160 into gp120 (SU) and gp41 (TM) subunits causes rearrangements
of the polypeptide chain and reveals a hydrophobic N-terminus on gp41, known as the fusion
peptide, which mediates fusion between viral and host membranes.

While the cleavage of gp160 into gp120 and gp41 is thought to be important for transport to the
plasma membrane, it has been demonstrated that uncleaved forms of gp160 molecules are highly
represented at the cell surface as well [141, 142].
At the cell surface, gp41 remains attached to the membrane and has an extracellular domain, a
membrane-spanning segment, and a cytoplasmic tail, while gp120 lies completely outside the cell,
and ultimately the virion, where it is held in place through interactions with gp41. The gp120 and gp41
subunits associate noncovalently, and the complex is transported as a heterotrimer from the
Golgi to the cell surface. Because this noncovalent interaction is weak, gp120 is frequently lost
from the cell surface [138].
Gp120 is known to be extensively glycosylated, and these modifications may make up nearly half
of the weight of the Envelope heterotrimer molecule. This glycosylation is important for virus
infectivity, and also probably serves to aid the virus in evasion of immune surveillance. By
creating a sort of molecular shroud around the gp120 antigen, glycosylation may hide surface
peptides of Env from neutralizing antibodies. Glycosylation is also a means by which the virus
can achieve cell adhesion through lectin binding.
Upon reaching the cell surface, the gp120-gp41 glycoprotein complex is rapidly internalized, and
quickly recycled back into the cell if it is not taken up into a budding virion [143]. This acts to
limit the amount of glycoprotein at the cell surface which could trigger an immune response. It
also means that relatively few glycoproteins get integrated into the virion – HIV-1 only has about
10 glycoprotein spikes, considerably fewer than other viruses.
The C5 region (aa 489-511) of gp120 lies at the C-terminus and associates with gp41 in the
glycoprotein complex at the cell surface. Disruption of this binding of the C5 domain to gp41
inhibits HIV infection. The ‘R-E-K-R’ motif that we have described as associated with sequence
miR-4644 in gp120 represents the last four amino acids, 508-511, of C5. The last 11 amino
acids in this conserved region, aa 501-511, are described as hydrophilic, charged, exposed,
immunologic, and antibody-accessible. The highly conserved nature of the C5 domain in gp120,
(where the miR-4644 sequence lies), and of the gp41 fusion peptide, makes these regions
attractive targets for vaccines and structure-based drugs designed for the prevention and
treatment of HIV infection [144].
An image showing an overview of envelope expression and cleavage by Furin is given in Figure
4.16. The miR-4644 sequence that we have identified is also shown in context for reference.

Figure 4.16: env gp160 Expression and Cleavage by Furin
HIV envelope expression, processing and cleavage by Furin are depicted. The location of newly
identified sequence miR-4644 at the gp120-gp41 border is shown for reference. Co-expression of
miR-4644 during env processing may contribute to a regulatory pathway in HIV expression.

Furin Processing
In the late 1960s, it was first proposed that large protein precursors are cleaved to
yield their active forms. For example, the single-chain precursor proinsulin is cleaved at sites of
basic amino acid residues into the two-chain mature form of insulin. This phenomenon of
endoproteolytic cleavage to form active proteins was subsequently shown to exist in organisms
from yeast through mammals.
Eventually, Kex2p, a yeast protease, was identified as being related to bacterial serine proteases
belonging to the subtilisin family. Searching for mammalian homologs of the yeast KEX2 gene,
researchers identified a gene which was located near the oncogene c-fes and was thus
named fur, standing for c-fes upstream region. The protein product of the fur gene was named
furin. Also identified was another Kex2p homolog called PC2, for proprotein convertase 2, and it
was shown that all three shared conserved sequences in their catalytic domains. Eventually an
entire proprotein convertase, or PC, family was identified.
Since it was known that cleavage of the HIV viral glycoprotein gp160 into gp120 and gp41 was
necessary for viral infectivity [140, 145], and it had been shown by northern blotting that CD4+
T-lymphocytes express both furin and PC1, it was suspected and subsequently shown that both
of these endoproteases can cleave gp160 [146, 147].
Furin is thus a cellular endoprotease and a member of the proprotein convertase (PC) family,
which comprises seven distinct enzymes. The members of this family are calcium-dependent
serine endoproteases homologous to the previously known yeast enzyme Kex2p. Serine
proteases are characterized by a distinctive structure consisting of two beta barrel domains that
converge at the catalytic active site.
Serine proteases cleave peptide bonds, with a serine residue serving as the
nucleophilic amino acid at the enzyme’s active site. There are two types of serine proteases,
chymotrypsin-like and subtilisin-like, which differ in their substrate specificity. Subtilisin is a
serine protease in prokaryotes. Furin is a subtilisin-like enzyme. Furin is active over a wide pH range,
between 5 and 8, and can therefore operate in many cellular compartments.
Proprotein convertases typically cleave substrates on the C-terminal side of paired basic amino
acids, such as Lys-Arg-(cut) or Arg-Arg-(cut). Furin requires an additional Arg at the P4
position (upstream), and therefore the consensus sequence motif for furin cleavage is
Arg-Xxx-(Lys/Arg)-Arg. Other residues may be important for furin function as well. Figure 4.17 below
shows a graphic view of this R-E-K-R cleavage in the context of the HIV-1 envelope gene and its
expression:

Figure 4.17: Furin Cleaves gp160 at Conserved R-E-K-R Amino Acid Motif
This figure shows an expanded view of the gp120-gp41 junction revealing the nucleic acid and
amino acid detail of the R-E-K-R recognition site for cleavage by Furin, a cellular endoprotease.
Furin cleaves substrates immediately following the R-X-(R/K)-R amino acid motif which is
common among many viral proproteins. Expression of the HIV envelope gene and final cleavage
step in the maturation of the HIV envelope glycoprotein is also shown on the left.
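
The consensus described above can be expressed as a simple pattern search. The following is a minimal sketch, assuming the minimal motif Arg-Xxx-(Lys/Arg)-Arg; the function name `find_furin_sites` and the use of a regular-expression lookahead are illustrative choices, not part of the analysis performed in this work. The junction peptide is the one translated from isolate JN681247 elsewhere in this chapter, and K-Q-K-R is the Ebola Reston sequence quoted from the literature.

```python
import re

# Minimal furin consensus as stated in the text: Arg-Xxx-(Lys/Arg)-Arg.
# The lookahead lets overlapping candidate sites all be reported.
FURIN_MOTIF = re.compile(r"(?=(R.[KR]R))")

def find_furin_sites(protein):
    """Return (motif, cut_index) pairs; the cut falls directly after the P1 Arg."""
    return [(m.group(1), m.start() + 4) for m in FURIN_MOTIF.finditer(protein)]

# gp120-gp41 junction peptide of isolate JN681247, as translated in this chapter.
junction = "AKRRVVEREKRAVGIG"
for motif, cut in find_furin_sites(junction):
    print(motif, junction[:cut] + "|" + junction[cut:])

# Ebola Reston sequence K-Q-K-R lacks the P4 Arg, so no site is reported.
print(find_furin_sites("KQKR"))
```

A scan of this kind only flags candidate sites by sequence; whether a site is actually cleaved also depends on accessibility and on residues outside P4 through P1.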

Furin is expressed throughout development in all tissues and in multiple protein-processing
compartments such as the trans-Golgi network (TGN), cell surface and endosomes. The Golgi
apparatus has a cis-network, which is next to the ER, a medial stack and the final stage, the
trans-Golgi network, which packages proteins into secretory vesicles headed to lysosomes or the cell
surface. While furin is found in many areas of the cell, it is primarily located in the
trans-Golgi network. However, it has been demonstrated that furin is shuttled back and forth between the
TGN and the cell surface via endosomes. It can be tethered at the surface and remain there, or be
internalized by clathrin-dependent endocytosis [143, 148].
Furin substrates include growth factors and hormones, cell surface receptors, plasma proteins,
matrix metalloproteinases, bacterial exotoxins, and viral envelope glycoproteins [149]. A
sampling of viral proteins that are cleaved by Furin, along with their cleavage site amino acid
sequence, is given here in Table 4.3:

Table 4.3: Viral Proprotein Furin Cleavage Sites


This table shows a variety of viruses, all of which have a receptor protein which is cleaved by the
cellular protease Furin during virus maturation. The cleavage site amino acid sequences are listed
as P6-P5-P4-P3-P2-P1-P1’-P2’, where the recognition site for furin is typically in the range P4-P3-
P2-P1, shown in red and blue, and has the general recognition motif R-X-K/R-R, where X can
represent any amino acid, and the protease cut occurs between P1 and P1’. P6 and P5 may be
important for furin cleavage in some instances as well, and K/R sites at P6 are shown in green.
(Original figure adapted from Molloy et al., 1999 [143]).
Furin can also operate at the cell surface, via an endocytic pathway in which cleavage takes place
within endosomes (as with the diphtheria and Pseudomonas toxins), and may even be secreted in truncated
form from the cell, where it might participate in the proteolytic maturation of extracellular
substrates such as BMP-1.
Furin is the primary endoprotease responsible for cleaving HIV gp160 into its constituent parts,
gp120 and gp41, and cleavage occurs in the TGN of the Golgi apparatus [150]. However,
Ohnishi et al. have demonstrated that a furin-defective cell line, LoVo, is able to correctly process
gp160 [151]. This suggests that other proprotein convertases can contribute to gp160
maturation. Other cellular enzymes may contribute as well. Cleavage of gp160 is vital for viable virus;
uncleaved envelope gp160 is unable to form a functional envelope heterotrimer and will
yield a non-infectious virus.
There is a second furin cleavage site in the HIV gp160 molecule located 8 amino acids upstream
of the primary cleavage site
(Pro-Thr-Lys-Ala-Lys-Arg-*2nd site*-Arg-Val-Val-Gln-Arg-Glu-Lys-Arg-*primary site*) [142]. It is not clear what biological relevance this secondary site may
have. Approximately 15% of gp41 molecules contain a fusion peptide extended to the second cleavage
site. These gp41 molecules are inefficiently packaged into virions, however [142, 152].
Numerous pathogens require furin cleavage of their viral envelope glycoproteins or
bacterial toxins. Both anthrax and diphtheria toxins have Arg-Xxx-(Lys/Arg)-Arg cleavage site
sequences and can be activated by furin to yield the two-chain mature toxin on the cell surface
[149].
Furin cleavage can also determine virulence. For example, the Reston strain of Ebola virus
glycoprotein lacks a furin cleavage site (its amino acid sequence is Lys-Gln-Lys-Arg or
K-Q-K-R) and this virus is not known to be pathogenic. However, the Ebola Zaire and Ebola
Ivory Coast strains contain consensus furin sequences at their respective glycoprotein cleavage
sites and have a high lethality rate [143].
While the furin cleavage site is highly conserved in HIV-1 (R-E-K-R), it is not so in influenza viruses.
Since proteolytic activation of envelope glycoproteins is necessary for the entry of viruses into
the host cell, the ability of a glycoprotein to be processed by furin is an important determinant of
viral tropism. The hemagglutinins (HAs) of some influenza viruses lack the first Arg at the cleavage site, and thus are
not susceptible to cleavage by the ubiquitous furin. These virus infections are therefore limited to the
respiratory and alimentary tracts, where cleavage occurs via an alternate protease, tryptase
Clara [153].
The central role of furin in glycoprotein maturation, and therefore virus infectivity, has generated
much interest in the physiological function of this enzyme, which remains an active focus for
further research and possible therapeutic intervention.

REKR is a highly conserved motif in HIV-1
As we have mentioned, part of HIV gp160 polyprotein processing is gp120-gp41 cleavage,
carried out by the host cellular protease furin, which recognizes an Arg-Xxx-Lys-Arg motif and
cleaves immediately after that motif. Thus, the codons encoding the Arg-Xxx-Lys-Arg motif are highly
conserved not only in the HIV-1 M group, but in all lentiviruses, and indeed in other virus
families as well.
To examine the conservation in this region at a global level among all HIV-1 sequences, we used
the QuickAlign tool, available at the Los Alamos National Laboratory website. This software allows us to
examine conservation at the amino acid level. Entering our region of interest at the gp120-gp41
junction, 5’-GTGGTGGAGAGAGAAAAGAGAGCAGTG-3’, in which nucleotides 5 through 21
(TGGAGAGAGAAAAGAGA) correspond to the miR-4644 sequence, gives the translated sequence
Val-Val-Glu-Arg-Glu-Lys-Arg-Ala-Val, or VVEREKRAV.
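
The translation of this 27-nt region can be reproduced with a few lines of code. This is a minimal sketch using the standard genetic code; the helper name `translate` and the compact table construction are illustrative assumptions and are not the QuickAlign implementation.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(nt):
    """Translate from the first base; a trailing partial codon is ignored."""
    nt = nt.upper().replace("U", "T")
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt) - 2, 3))

region = "GTGGTGGAGAGAGAAAAGAGAGCAGTG"  # gp120-gp41 junction region from the text
mir_17 = "TGGAGAGAGAAAAGAGA"            # first 17 nt of miR-4644, written as DNA

print(translate(region))         # VVEREKRAV
print(region.find(mir_17) + 1)   # 1-based start of the miR-4644 match: 5
```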
It is thus a fascinating consequence of this sequence that, while miR-4644 normally functions as a
non-coding RNA molecule, its sequence, if translated in the right context, can yield the same
R-E-K-R motif that is recognized and cleaved by furin. A mature miR-4644 sequence
embedded in a coding sequence could therefore have two functions: as a miRNA sequence if
transcribed (though perhaps not functional without proper processing), and, if translated, as a
functional cleavage site for a cellular protease.
By using the QuickAlign bioinformatics tool, we are able to create a snapshot of any region of
interest showing the most common amino acids at each position. This shows us relative
conservation of amino acid information in any region. This information is displayed as a
pictogram in which the most common amino acid at any given position is the largest, with less
frequent amino acids depicted as smaller and smaller characters. We therefore used QuickAlign
to examine this region in an alignment of all full genome HIV-1 sequences, and produced this
image:

(QuickAlign output, Los Alamos National Laboratory, reprinted as a screen shot.)
In the above image we have scrutinized the region of the HIV genome encoding the furin
cleavage site, R-E-K-R, by examining an alignment of all known HIV sequences in the Los
Alamos database and tallying the number of occurrences of every amino acid at each position
above. The results are indicated as an image, where the larger the size of the letter, the more
frequent the occurrence of that amino acid at that position. Relative frequency, or probability, is
indicated on the y-axis and the amino acid position relative to the start (‘V’) is on the x-axis.
We can see from this image that the V-V-E-R-E-K-R-A-V sequence is highly conserved, as
indicated by the size of those letters being much larger than the others at those same positions.
In fact, the frequency of each amino acid in the ‘R-E-K-R’ motif above is 98.1%, 92.3%, 98.0%,
and 99.3%, respectively. Total number of sequences tallied in this alignment = 4,632.
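
The tally behind such a pictogram amounts to counting residues column by column. The sketch below illustrates the idea only: the four aligned peptides are hypothetical toy data, not sequences from the Los Alamos alignment, and the function name `column_frequencies` is an assumption made for illustration.

```python
from collections import Counter

def column_frequencies(aligned):
    """Per-position residue frequencies, most common residue first."""
    n = len(aligned)
    return [
        {aa: count / n for aa, count in Counter(col).most_common()}
        for col in zip(*aligned)
    ]

# Hypothetical toy alignment (NOT real database sequences).
toy_alignment = ["VVEREKRAV", "VVQREKRAV", "VVEREKRAV", "VVEKEKRAV"]

for position, freqs in enumerate(column_frequencies(toy_alignment), start=1):
    top_residue, top_freq = next(iter(freqs.items()))
    print(position, top_residue, f"{top_freq:.0%}")
```

In a pictogram, the letter height at each position is scaled to these frequencies, so a residue present in nearly every sequence dominates its column.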

34 Matching Strains have a synonymous substitution at the nucleotide level


We have identified a 17-nt sequence present in the env gene of 34 HIV-1 isolates which matches
nucleotides 1-17 of the 23-nt mature human cellular miR-4644 sequence with 100% identity.
This miR-4644 sequence is present in the same location in all of the
34 HIV-1 isolates identified.
As we have stated, in an effort to understand why these 34 isolates are different from the rest of
the HIV-1 genomes in the NCBI database, we examined an alignment of all known sequences,
which we presented in Figure 4.3. We also looked at alignments of miR-4644 and HIV-1 isolate
strains at the nucleotide level to determine the exact difference in those strains which contained
the miR-4644 versus those strains which did not. We present here a diagram, showing how miR-
4644 lines up with one of the 34 ‘matching’ sequences, given here as accession #JN681247 from
South Africa, and a randomly selected ‘non-matching’ sequence, #M62320 from Uganda.
The first 17 nt of the miR-4644 sequence are shown on top, aligned with the representative
matching and non-matching sequences:
miR-4644:                        TGGAGAGAGAAAAGAGA
Matching Strain JN681247:    …TGGTGGAGAGAGAAAAGAGAGCA…
Non-Matching Strain M62320:  …TGGTGGAGAGAGAAAAAAGAGCA…
It can be readily seen that the difference between the matching and non-matching sequences is a
single nucleotide change at position #14 of the miR-4644 sequence. In miR-4644 and in all
matching strains, this position contains a ‘G’, while non-matching strains typically have an ‘A’ at
this position. In fact, the frequency of an ‘A’ at this position is 97.98%, according to Los
Alamos.
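
The position of the difference can be checked by sliding the 17-nt miR-4644 sequence along each fragment and recording mismatches. This is a minimal sketch; the function name `best_window` is hypothetical, and the two fragments are those shown in the alignment above.

```python
MIR_4644_17 = "TGGAGAGAGAAAAGAGA"  # first 17 nt of miR-4644, written as DNA

def best_window(mir, seq):
    """Slide mir along seq and return (offset, mismatch_positions) for the
    window with the fewest mismatches. Positions are 1-based within mir."""
    best = None
    for offset in range(len(seq) - len(mir) + 1):
        window = seq[offset:offset + len(mir)]
        mismatches = [i + 1 for i, (a, b) in enumerate(zip(mir, window)) if a != b]
        if best is None or len(mismatches) < len(best[1]):
            best = (offset, mismatches)
    return best

matching = "TGGTGGAGAGAGAAAAGAGAGCA"      # fragment of JN681247
non_matching = "TGGTGGAGAGAGAAAAAAGAGCA"  # fragment of M62320

print(best_window(MIR_4644_17, matching))      # (3, [])
print(best_window(MIR_4644_17, non_matching))  # (3, [14])
```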
Although these 34 matching sequences differ at the nucleotide level, they still encode the
functional R-E-K-R site at the amino acid level. To understand how the 34 sequences that we
identified can diverge at a nucleotide level and maintain conservation (and therefore
functionality) at the amino acid level, we looked more closely at the nucleotide sequence in the
position of divergence, nucleotide 14.
In a standard consensus HIV-1 alignment, available from the Los Alamos National Laboratory, the
nucleotide at the position in question is an ‘A’, which lies in the third position of the codon
‘AAA’. This codon translates to Lysine (K), as expected; it is the ‘K’ in the ‘R-E-K-R’
motif. However, our matching HIV-1 isolate sequences instead contain a ‘G’ at this
position, which changes the codon to ‘AAG’, which also
translates to Lysine (K). So essentially, in these 34 sequences from 34 different HIV-1 isolates,
the ‘AAA’ codon has been changed to ‘AAG’, a synonymous mutation that alters the nucleotide
sequence without changing the amino acid that it encodes.
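
The synonymous nature of this change can be verified directly against the standard genetic code. The sketch below is illustrative only; the helper name `is_synonymous` is an assumption. It also shows that, of the three possible third-position changes to ‘AAA’, only the change to ‘G’ preserves Lysine.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def is_synonymous(codon_a, codon_b):
    """True when both codons encode the same amino acid."""
    return CODON_TABLE[codon_a] == CODON_TABLE[codon_b]

# Every third-position substitution of the consensus Lys codon AAA.
for base in "TCG":
    variant = "AA" + base
    print(variant, CODON_TABLE[variant], is_synonymous("AAA", variant))
# AAT N False
# AAC N False
# AAG K True
```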
Therefore, this one nucleotide difference here is enough to create an miR-4644 sequence, yet
maintains the codon identity and translational fidelity necessary to retain the proper amino acid
motif (REKR) that still allows for the gp120-gp41 polyprotein to be cleaved by furin.
Another interesting aspect about the miR-4644 sequence we have identified is that there is such a
significant number of matches (34) within HIV-1 envelope sequences. We have now been able
to determine that:
1) all of the 34 matching sequences are in the same exact region of the HIV-1 genome,
at the gp120-gp41 junction;
2) all of these matches convert an ‘AAA’ Lys codon to an ‘AAG’ Lys codon; and
3) this substitution changes this region to an miR-4644 sequence without disrupting its
role as a cleavage site for furin and thus the maturation of an infectious particle.

Therefore, the change of the dominant ‘A’ at this position for the relatively rare ‘G’ at this
position in essence ‘creates’ a miR-4644 sequence embedded in the coding region of env,
without disrupting the Env protein itself.
Reverse transcriptase is known to have a strong G->A transition mutation bias. Therefore, while
both ‘AAA’ and ‘AAG’ codons code for Lysine, one would expect most ‘AAG’ codons
ultimately to revert to ‘AAA’. Indeed, almost all Lys codons in HIV-1 are ‘AAA’
rather than the ‘AAG’ that we see here, and yet the 34
isolates we have identified retain the ‘AAG’ codon. The persistence of these ‘AAG’
codons in our 34 matches even in the face of this mutational pressure may imply a functional role
for these sequences, namely, to maintain a miR-4644 sequence within the coding region.
We speculate that the presence of such a large number of HIV-1 isolates containing the miR-
4644 sequence suggests a functional role for this miRNA. To explore this idea, we would next
have to conduct functional lab experiments to determine what functions are mediated by miR-
4644 in both normal and infected cells, which proteins it can target or regulate, and whether
these proteins may play a role in HIV infection.

Searching for the ‘REKR’ amino acid motif in other miRNAs – ‘translating’ miR-4644
miRNAs are by definition non-coding RNA molecules. However, this miR-4644 is embedded
within a coding sequence; it co-locates with the envelope coding region of 34 different HIV-1
strains at the gp120-gp41 border. As we have discussed, the cleavage of gp160 into gp120 and
gp41 during viral envelope processing is effected by the cellular protease furin, which recognizes
the amino acid motif R-E-K-R and cleaves directly after that sequence. As we have shown
above, in order for miR-4644 to co-exist in this region, its sequence must not disturb the coding
sequence cleavage motif R-E-K-R. We wanted to investigate whether there are other miRNAs
among those we have identified which could code for the same motif and thus potentially reside
within the same region.
If we visualize the miR-4644 nucleotide sequence within the envelope gene
sequence of one of those 34 HIV-1 isolates, for example #JN681247 Clade C from South Africa,
where it occupies the final 17 nucleotides before the gp120-gp41 boundary (marked ‘|’),
it would look something like this:

<-------------- gp120 -------------->|<----- gp41 ------>
                    <--- miR-4644 -->
..AGGCAAAAAGGAGAGTGGTGGAGAGAGAAAAGAGA|GCAGTGGGAATAGGAGC..

Translated, the sequence would become:

     A  K  R  R  V  V  E  R  E  K  R | A  V  G  I  G

We next examined the possible translated products of all 15 of the miRNAs that we are
investigating. In order to examine all possibilities, we also looked at possible translation
products of the miRNAs in all three reading frames.
miR-4644, of course, contains the amino acid motif REKR in frame position 3, which is where it
lines up with the gp120 coding region. However, we found another REKR motif by
frameshifting two frames back to frame 1. Thus, there are two possible REKR motifs present in
the sequence of miR-4644.

Here is an illustrated example using miR-4644:


miR-4644 Sequence: UGGAGAGAGAAAAGAGACAGAAG

Translating Frame 1: UGG|AGA|GAG|AAA|AGA|GAC|AGA|AG
Gives: W R E K R D R

Translating Frame 2: U|GGA|GAG|AGA|AAA|GAG|ACA|GAA|G
Gives: G E R K E T E

Translating Frame 3: UG|GAG|AGA|GAA|AAG|AGA|CAG|AAG
Gives: E R E K R Q K
So the three possible amino acid sequences generated by the miR-4644 nucleotide sequence are
WREKRDR, GERKETE, and EREKRQK. These results are listed below.
We examined all theoretical translational products of the other 14 miRNAs and found no other
REKR sequences in any of the three possible reading frames. In addition, we examined all three
reverse frames and also found no REKR sequences.
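
The three forward-frame translations of miR-4644 can be reproduced as follows. This is a minimal sketch under the standard genetic code; the helper names `translate` and `three_frames` are illustrative assumptions, and the reverse frames mentioned above are not covered here.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(nt):
    """Translate from the first base; a trailing partial codon is ignored."""
    nt = nt.upper().replace("U", "T")
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt) - 2, 3))

def three_frames(nt):
    """Translations of the three forward reading frames (frames 1, 2, 3)."""
    return [translate(nt[offset:]) for offset in range(3)]

mir_4644 = "UGGAGAGAGAAAAGAGACAGAAG"
frames = three_frames(mir_4644)
print(frames)  # ['WREKRDR', 'GERKETE', 'EREKRQK']

# Frames (1-based) whose translation contains the furin motif R-E-K-R.
print([i + 1 for i, peptide in enumerate(frames) if "REKR" in peptide])  # [1, 3]
```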
The possible translation products are listed as follows in Table 4.4:

Table 4.4: Potential Translation Products of miRNAs Identified

     miRNA          Frame 1    Frame 2    Frame 3
 1)  miR-4644       WREKRDR    GERKETE    EREKRQK
 2)  miR-6763       LGSGWG     WGVAGE     GEWLG
 3)  miR-4652       RGLVNRT    GDWLIEL    GTG**N
 4)  miR-6774-3p    SCPSCPQ    RVPLVH     VSLLST
 5)  miR-6875-3p    ILPALAP    FFLPWLH    SSCPGS
 6)  miR-195        *QHRNIG    SSTEIL     AAQKYW
 7)  miR-548ah-3p   QKLQLLL    KNCSYFC    KTAVTF
 8)  miR-548am-3p   QKLQLLL    KNCSYFC    KTAVTF
 9)  miR-548av-3p   KTAVTF     KLQLLL     NCSYFC
10)  miR-5197       QWHKLIL    NGTNSFL    MAQTHS*
11)  miR-6124       GKRKGE     GKGRGR     EKEGGG
12)  miR-6766       RVGADLIE   GWEQILL    GGSRSY*
13)  miR-7151       DPSLPVL    IHLCLYW    SISACIG
14)  miR-7156       LFSNWLS    CSQTGCQ    VLKLAVR
15)  miR-586        YALXFLG    MHCXF*V    CIVXFRS

Thus, it would appear that miR-4644 represents a unique finding in our investigation: of the 15
miRNAs we have identified, it is the only sequence that could encode the furin cleavage motif
R-E-K-R, and thus the only one that could be embedded in a coding sequence requiring this
motif, such as a furin cleavage site region.

Proposed Regulatory Pathway for miR-4644, Furin, and gp160
It is widely known that a single miRNA can target and therefore regulate many mRNAs; it is
also known that any given mRNA may be regulated by a multitude of miRNAs. Thus, the
regulatory relationships between miRNAs and the mRNAs they affect are complex.
Any given miRNA and an mRNA which it targets are likely to be separately expressed; that is,
they are transcribed from different loci as separate transcription events. While this
scenario is by far the most common, there exists a second scenario in which an mRNA and the
miRNA which regulates it can be transcribed sequentially from the same locus, due to
readthrough. In a third scenario which has been described, it is possible for a miRNA and its
mRNA target to be generated from a single RNA following a splicing reaction from a single
transcription event.
We propose that we have discovered a novel pathway in addition to those already known.
Here, we present a fourth scenario: an embedded microRNA, co-expressed with a coding gene,
may downregulate another protein which regulates the product of the gene in which it is embedded. We will now discuss
briefly these four different ways in which a miRNA might be expressed in relation to its target
mRNA.

The simplest scenario, which we will call canonical pathway #1, calls for a miRNA and the
mRNA it targets to be transcribed from different loci. In this pathway, the miRNA and mRNA
are expressed independently from each other, are not connected at the level of initial expression,
and are not necessarily expressed at the same time. The figure below illustrates this relationship:

Canonical Pathway #1
miRNA ‘Y’ Regulates Gene ‘X’, and is Transcribed from a Different Promoter:

[Diagram: at the DNA level, Gene ‘X’ and miRNA ‘Y’ each lie downstream of their own TATA promoter. After transcription, miRNA ‘Y’ targets the 3’UTR of the Gene ‘X’ RNA, resulting in downregulation of Gene ‘X’.]

In the second scenario, canonical pathway #2, a gene and the miRNA which regulates it may be
co-transcribed from the same promoter. Their expression is connected and interdependent. The
amount of miRNA expressed in this scenario may therefore depend on the amount of
readthrough from the promoter through the first gene. This readthrough efficiency creates a
possible mechanism for turning miRNA expression on and off; it also creates a way in which the
expression can be fine-tuned, meaning that the amount of readthrough determines the amount of
miRNA produced.

Co-expression of a miRNA and its mRNA target therefore creates a sort of feedback mechanism
in which the gene being transcribed is ultimately regulated by the miRNA with which it is co-
transcribed.
This scenario is depicted in the following illustration, in which gene ‘X’ and miRNA ‘Y’ are
co-expressed from the same promoter, with miRNA ‘Y’ targeting and regulating expression of
gene ‘X’:

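Canonical Pathway #2
miRNA ‘Y’ Regulates Gene ‘X’, and is Co-Transcribed from the Same Promoter:

[Diagram: at the DNA level, Gene ‘X’ and miRNA ‘Y’ lie in sequence downstream of a single TATA promoter. After transcription, miRNA ‘Y’ targets the 3’UTR of the Gene ‘X’ RNA, resulting in downregulation of Gene ‘X’.]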

There is a third scenario we will call canonical pathway #3 in which the miRNA is embedded
within another gene, either in an intronic (called a mirtron) [107, 108] or intragenic region [154],
and for which the co-expression of the gene and miRNA could result in downregulation of the
gene by the miRNA within the intron.
This type of regulation highlights a different pathway in which the expression of the miRNA and
of its mRNA target are connected and therefore interdependent; however, in this case an additional factor
is that the regulating miRNA lies within a host gene.
This pathway is illustrated below, where miRNA ‘Y’ is a mirtron, represented by an
intron within gene ‘X’:

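Canonical Pathway #3
miRNA ‘Y’ Regulates Gene ‘X’, is Transcribed and Spliced from Within Gene ‘X’:

[Diagram: at the DNA level, a TATA promoter is followed by Exon 1 of Gene ‘X’, the mirtron encoding miRNA ‘Y’, and Exon 2 of Gene ‘X’. After transcription and splicing, the Gene ‘X’ RNA (Exon 1, Exon 2 and 3’UTR) and miRNA ‘Y’ are produced; miRNA ‘Y’ targets the 3’UTR of the Gene ‘X’ RNA, resulting in downregulation of Gene ‘X’.]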

As we progress from scenario #1, the simplest case, to the progressively more complex scenarios
#2 and #3, we get closer to our model based on miR-4644: a miRNA located within a
coding gene that regulates that gene’s product. In our case, the end product would be gp160
expression and ultimately cleavage and envelope maturation. However, miR-4644 does not
mediate the cleavage of gp160 directly; rather, it is predicted to act indirectly, by
downregulating the protein which mediates the cleavage event, Furin.
Our identification of an miR-4644 sequence embedded within the coding region of the HIV
envelope gene thus represents a new pathway: one in which a miRNA embedded within a gene
regulates that gene not directly by targeting its mRNA, but indirectly by targeting the mRNA of
the protein that processes that gene product. We therefore propose that this is a novel
regulatory pathway which has not yet been described.
This regulatory pathway is illustrated in Figure 4.21 and Figure 4.22.

Figure 4.21: Expression of miR-4644 may Downregulate Furin in a
Self-Regulatory Loop and Contribute to HIV Latency

[Diagram: HIV DNA (gp120, miR-4644, gp41) is transcribed and translated to give the HIV gp160 polyprotein (gp120, gp41), which is cleaved by Furin, and is also transcribed to give miR-4644. The Furin mRNA (open reading frame, 3’UTR) carries a miR-4644 target site in its 3’UTR; when miR-4644 binds, translation of the Furin protein is blocked, and cleavage of gp160 is blocked in turn.]

Figure 4.22: miR-4644 / Furin / gp160 Proposed Regulatory Pathway
This figure shows the proposed miR-4644 / Furin / gp160 regulatory pathway in a series of steps
highlighting each molecule involved. HIV DNA with embedded miR-4644, on the left, is transcribed
to make both gp160 mRNA and miR-4644. The translation and subsequent processing of
gp160, which includes cleavage at the REKR amino acid motif by Furin, creates a functional viral
glycoprotein. On the right, the furin gene is transcribed to give rise to the furin mRNA, whose
3’UTR contains a miR-4644 target site with the nucleotide sequence ‘CUCUCCA’, which is
complementary to the miR-4644 seed sequence ‘UGGAGAG’. Translation of the furin mRNA
yields the competent protease, which is able to cleave gp160 to yield a viable glycoprotein. miR-4644
may target furin, downregulating its expression and thus hampering downstream maturation of the
envelope glycoprotein.
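
The complementarity between the seed sequence and the target site quoted above can be checked by taking the reverse complement of the seed. This is a minimal sketch: the helper name `reverse_complement_rna` is illustrative, the seed is taken as nucleotides 1-7 as written in the caption, and the short UTR fragment in the usage line is a hypothetical string, not the furin 3’UTR.

```python
# Watson-Crick complement for RNA bases.
COMPLEMENT = str.maketrans("AUGC", "UACG")

def reverse_complement_rna(rna):
    """Return the reverse complement of an RNA sequence, read 5' to 3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]

mature_mir_4644 = "UGGAGAGAGAAAAGAGACAGAAG"
seed = mature_mir_4644[:7]                 # 'UGGAGAG', as given in the caption
target_site = reverse_complement_rna(seed)
print(seed, target_site)                   # UGGAGAG CUCUCCA

# Hypothetical UTR fragment, used only to show how a site search would look.
hypothetical_utr = "AAGCUCUCCAUU"
print(target_site in hypothetical_utr)     # True
```

A sequence match of this kind identifies only a candidate target site; functional targeting would need to be confirmed experimentally.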

In our proposed pathway above, when the miR-4644 sequence is transcribed, it targets the furin
mRNA, which would downregulate translation of the active furin protein product, which in turn
would not be available to cleave the gp160 polyprotein to create the mature viral glycoprotein.
Thus, the embedded miR-4644, by acting on furin, which acts on gp160, may self-regulate the
gene that it lies within. Presumably, by downregulating furin which ultimately inhibits viral
maturation, a virally encoded and embedded miR-4644 could be acting to promote viral latency.
The difference between the other examples that are currently known and our pathway is that the
miR-4644 is contained within the coding region of the gene where it is co-transcribed.
Furthermore, miR-4644 does not target the gp160 mRNA directly, but rather targets another
mRNA, furin, which in turn regulates gp160 maturation.
There are two important distinctions: one has to do with the placement of the miR-4644
microRNA in relation to the gene that it may regulate (miR-4644 is located within the gene that
it regulates), and the other has to do with the indirect manner in which the gene is ultimately
modulated (miR-4644 -> furin -> gp160). In the other co-transcription scenarios that we have
discussed, the miRNA is embedded in an intragenic region of the gene, but not within the coding
part of the gene itself, as in our discovery. Additionally, this may represent the discovery of a
miRNA acting not directly on its host gene’s mRNA, but indirectly by targeting the mRNA of a
critical downstream regulating enzyme. These important differences could represent a
novel form of miRNA regulation.
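The seed pairing quoted in the figure legend above can be checked directly. The following is a minimal sketch, assuming only the two sequences given in the text (the miR-4644 seed and the furin 3’UTR site); the function name is our own and is purely illustrative.

```python
# Watson-Crick complements for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq.upper()))


MIR_4644_SEED = "UGGAGAG"  # seed sequence quoted in the text
FURIN_SITE = "CUCUCCA"     # furin 3'UTR site quoted in the text

# The site a seed can pair with is the reverse complement of the seed.
expected_site = reverse_complement_rna(MIR_4644_SEED)
print(expected_site, expected_site == FURIN_SITE)
```

A perfect seed complement of this kind is only a necessary first check; it does not by itself establish functional targeting.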

4.4 Conclusion:

Our Proposed Generalized Regulatory Pathway


We propose a novel generalized regulatory pathway for this ‘indirect regulation’ by a miRNA acting on its
downstream regulatory target.
Figure 4.23 lays the groundwork for our proposed pathway with two stages of
normal and well-known regulation within cells. The first stage (1) in Figure 4.23 shows a
protease gene ‘Y’, which, when expressed as a functional protein, is able to cleave the
polyprotein product of gene ‘Z’, yielding an active form of protein ‘Z’.
The second stage (2) in Figure 4.23 adds another layer of regulation onto the production of active
protein ‘Z’. In this scenario, there is a microRNA labelled miR-‘X’, which, when expressed,
targets and inhibits the expression of gene ‘Y’. With no protease ‘Y’ being expressed, there is
no cleavage of proprotein ‘Z’, and therefore protein ‘Z’ remains in an inactive state. Therefore,
miR-‘X’ in this case is able to regulate the expression of active protein ‘Z’ by targeting protein
Z’s ‘regulator’ (protease ‘Y’).
Our proposed pathway is presented in the third stage (3) of Figure 4.23, and builds on the second
scenario. In our proposed pathway, miR-‘X’ still regulates ‘Y’, which in turn regulates ‘Z’;
however, in this case miR-‘X’ is embedded within gene ‘Z’, the gene
which it ultimately downregulates.
In this scenario, miR-‘X’ is embedded in the coding region of gene ‘Z’; therefore, expression of
gene ‘Z’ implies that miR-‘X’ and gene ‘Z’ are co-transcribed. Gene ‘Y’ is independently
transcribed and subsequently targeted by the nascent miR-‘X’. This inhibition of protein ‘Y’
results in an uncleaved, and inactive, form of protein ‘Z’ being produced, as in the previous
scenario. The difference is that the miRNA which regulates the gene is part of the gene itself.
We present this novel and previously undescribed pathway in an effort to provide additional
insights into the regulation of viral replication and pathogenesis by new microRNA pathways,
and to inspire further research into the nature of global miRNA regulation.
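The logic of the three stages can be written out as a small truth-table model. This is only an illustrative sketch under a strong simplifying assumption that is ours, not part of the proposed pathway: every component is treated as strictly on or off, whereas miRNA-mediated repression in cells is typically partial. All function names are hypothetical.

```python
def active_z(z_transcribed: bool, y_transcribed: bool, mir_x_present: bool) -> bool:
    """Strict on/off model of whether active protein Z is produced."""
    # miR-X blocks translation of the Y mRNA, so no protease Y is made.
    y_protein = y_transcribed and not mir_x_present
    # Proprotein Z becomes active only if protease Y is there to cleave it.
    return z_transcribed and y_protein


def stage1(z: bool, y: bool) -> bool:
    """Stage 1: no miRNA involved."""
    return active_z(z, y, mir_x_present=False)


def stage2(z: bool, y: bool, mir_x: bool) -> bool:
    """Stage 2: miR-X is expressed independently of gene Z."""
    return active_z(z, y, mir_x_present=mir_x)


def stage3(z: bool, y: bool) -> bool:
    """Stage 3: miR-X is embedded in gene Z, so it is present whenever Z is transcribed."""
    return active_z(z, y, mir_x_present=z)


print(stage1(True, True))        # protease available, Z is cleaved
print(stage2(True, True, True))  # miR-X silences Y, Z stays inactive
print(stage3(True, True))        # transcribing Z supplies miR-X, Z stays inactive
```

In this strict model, stage 3 never yields active Z, which is the negative feedback loop described in the text taken to its extreme; a quantitative model with partial repression would instead show reduced, not abolished, levels of active Z.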

Figure 4.23: Our Proposed General Regulatory Pathway
[Figure: three-panel schematic. Panel 1: Gene Y protease normally cleaves polyprotein Z to produce the active form of Z. Panel 2: MicroRNA X targets gene Y, which disrupts production of active Z. Panel 3: MicroRNA X embedded in gene Z DNA can still disrupt production of active Z.]


Figure 4.23 shows the proposed generalized regulatory pathway presented as a sequence of steps.
Steps 1 and 2 outline generalized and previously known regulatory mechanisms, while the final
image Step 3 shows our proposed pathway.
#1) In this well-known mechanism, protease ‘Y’ is expressed and cleaves proprotein ‘Z’ to generate
an active form of protein Z.
#2) microRNA ‘X’ adds another layer of regulation to step #1. Here, miR-‘X’ is expressed and
subsequently downregulates its target, protease ‘Y’. With protease ‘Y’ not available for cleavage of
proprotein ‘Z’, the active form of ‘Z’ is effectively downregulated.
#3) This image represents our proposed pathway, which builds on #2. In this mechanism, miR-‘X’ is
embedded within gene ‘Z’, the gene which it ultimately downregulates. When miR-X and Z mRNA
are co-expressed, miR-X acts on its target, protease ‘Y’, which limits or blocks translation into
active protease. Thus, Z is again effectively downregulated through lack of available protease Y.
This represents a self-regulatory, negative feedback loop using a downstream regulatory molecule.

4.5 Materials and Methods:
Source of Data: In previous research, each of the 2,588 mature human miRNA sequences
contained in the miRBase database (http://mirbase.org/) was used as a separate query, utilizing
the mature length version of each miRNA. The BLAST (Basic Local Alignment Search Tool)
[71] program (http://blast.ncbi.nlm.nih.gov/) was used to search the entire HIV-1 database
(HIV taxid: 11676) at the National Center for Biotechnology
Information (NCBI), and selected queries were also searched against the Los Alamos HIV databases
(http://www.hiv.lanl.gov/content/sequence/BASIC_BLAST/basic_blast.html). All full-length
and partial HIV-1 genome sequences, representative of all HIV-1 clades and strains, were used
for the analyses. These sequences have been identified by the International Committee on
Taxonomy of Viruses (ICTV) and are available in the global public databases. The outputs from
the database searches were examined and the best matches from all microRNA query searches
were selected based on the length of the match, percentage of identity of match, lack of gaps or
deletions, and inclusion of the seed sequence. Clade or subtype was determined from individual
entries in the NCBI and Los Alamos databases.
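The selection criteria listed above (length of match, percentage identity, absence of gaps, and inclusion of the seed) can be expressed as a simple filter. The sketch below is illustrative only: the record fields, the cut-off values, and the example hits are hypothetical and are not the thresholds or data used in this study.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    isolate: str         # hypothetical isolate label
    query_len: int       # length of the mature miRNA query
    align_len: int       # length of the aligned region
    identity_pct: float  # percent identity over the aligned region
    gaps: int            # number of gaps or deletions in the alignment
    query_start: int     # 1-based miRNA position where the alignment begins


def covers_seed(hit: Hit) -> bool:
    """True if an ungapped alignment spans miRNA positions 2-8 (the seed)."""
    query_end = hit.query_start + hit.align_len - 1
    return hit.query_start <= 2 and query_end >= 8


def keep(hit: Hit, min_identity: float = 100.0, min_fraction: float = 0.8) -> bool:
    """Apply the four criteria; both thresholds are assumed values."""
    return (
        hit.gaps == 0
        and hit.identity_pct >= min_identity
        and hit.align_len / hit.query_len >= min_fraction
        and covers_seed(hit)
    )


def rank(hits: list[Hit]) -> list[Hit]:
    """Keep passing hits, longest and most identical first."""
    passing = [h for h in hits if keep(h)]
    return sorted(passing, key=lambda h: (-h.align_len, -h.identity_pct))


# Hypothetical example hits, for illustration only.
hits = [
    Hit("isolate_A", 19, 19, 100.0, 0, 1),
    Hit("isolate_B", 19, 16, 100.0, 0, 4),  # misses the seed
    Hit("isolate_C", 19, 18, 94.4, 1, 1),   # contains a gap
    Hit("isolate_D", 19, 17, 100.0, 0, 2),
]
print([h.isolate for h in rank(hits)])
```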

Clustal Analyses and Mapping of newly identified Sequences: The Clustal algorithm was used
for multiple sequence alignments [72, 73] (http://www.ebi.ac.uk/Tools/msa/clustalw2/). We
used the 34 HIV isolates matching hsa-miR-4644, as identified by our BLAST searches
and shown in Table 4.1, to generate alignments using the Clustal algorithm. The target regions
of the alignment were then mapped to the HXB2 strain gene map from the Los Alamos National
Laboratory HIV genome database (http://www.hiv.lanl.gov/), because this is one of the
most complete reference sequence data maps available for HIV-1. Results from the Clustal
algorithm were then checked against the Los Alamos HIV Compendium
(http://www.hiv.lanl.gov/content/sequence/HIV/COMPENDIUM/compendium.html) to verify that
the alignments from both sources were in agreement.

For phylogenetic analysis we used the TreeDyn software program in the construction of a
sequence-based relational tree using the alignment data generated by the Clustal algorithm
(http://www.treedyn.org).

The QuickAlign tool
(http://www.hiv.lanl.gov/content/sequence/QUICK_ALIGNv2/QuickAlign.html) was used to align
gp160 amino acid and nucleotide regions of interest, including surrounding regions, to the
LANL database web alignment and to calculate frequency by position; the result was depicted as a
weblogo. QuickAlign was also used to determine variant count as a percentage by position in
alignments. miRNA targeting sites were determined by using TargetScan
(http://www.targetscan.org/), PITA
(https://genie.weizmann.ac.il/pubs/mir07/mir07_prediction.html), and miRanda
(http://www.microrna.org/microrna/home.do). Minimum free energy of hybridization between
predicted miRNA seed and target pairings was further examined using RNAhybrid
(http://bibiserv.techfak.uni-bielefeld.de/rnahybrid/).
Acknowledgement
We thank Zisu Mao and Jan M.C. Chan for technical assistance with our proteomics studies
using two-dimensional gel electrophoresis and mass spectrometry, respectively. We thank Meng
Li for technical support.

 
 

 
 
 

 
 
 
 

 
 
 

 
 
 
 

 
 
 

 
 

Chapter 5

Discovery of Mature Cellular miR-6763 Sequence within the LTR of
Several HIV-1 Isolates Represents a Duplicated Form of the Sp1
Transcription Factor Binding Site

5.1 Introduction:
In Chapter Three we have identified and investigated the presence of 15 mature cellular miRNA
sequences that are present in various global HIV-1 strains. These miRNA sequences are not
ubiquitous; rather, they are contained in only one, a few, or up to a few dozen different isolates at
most. Their existence is, as yet, novel, and the nature and mechanism of their possible roles in
HIV-1 infection and pathogenesis remain unexplained. As part of our investigation, we have attempted
to identify the method by which these miRNAs came to be present in the HIV-1 genome. The
mechanisms of their biogenesis appear to be unique to each miRNA sequence found and
continue to reveal fascinating insights as to their origin.
One of the mature cellular miRNA sequences which we have identified is miR-6763. We
decided to investigate this miRNA in more detail for several reasons. First, we discovered that
the miR-6763 sequence was found in HIV-1 isolates with 100% homology across the
entire miRNA domain; the entire miR-6763 sequence from beginning to end was present in the
viral genome. In this respect miR-6763 is unique in our study: of the miRNAs identified, it
has the greatest homology to any HIV-1 sequence. Second, miR-6763
matched with perfect homology to not just one but three different HIV-1 strains.
Furthermore, these strains were from geographically diverse locations; the matches were from
Canada, Spain, and Germany. Finally, the observation that this sequence appears in three
geographically discrete viruses suggests a rare, repeated event.
Our findings show that each of the three matching strains contains a duplicated Sp1 transcription
factor binding site sequence domain in its 3’LTR, and this duplication makes these strains unique
in this respect. Additionally, the arrangement of two Sp1 sequences in tandem creates a unique
nucleotide sequence which contains the exact sequence for a mature cellular miR-6763. Further
investigation reveals that the mature cellular miR-6763 is predicted in silico to target and bind
the CD4 3’UTR, presumably downregulating its expression. CD4 downregulation is known to
facilitate viral replication in vitro and in vivo.
5.2 Background:

miR-6763
In 2012, Ladewig et al. [109] published a paper in which they described their search for novel
miRNAs using previously isolated human small RNA data sets. Most miRNAs identified up to
that point were so-called ‘canonical’, meaning that they were derived from the well-known
stepwise process of pri-miRNA cleavage by Drosha in the nucleus followed by further Dicer
processing and Ago loading in the cytoplasm. Less known but still documented at that time was
the ‘mirtron’. Mirtrons are introns which, through the splicing process, generate the pre-miRNA
substrates which are subsequently processed by Dicer and therefore represent a sort of pre-
miRNA ‘mimic’. Ladewig’s group was able to show that many previously unidentified
mammalian miRNAs were generated through this pathway. In their study, 240 human splicing-
derived miRNAs were identified, one of which was miR-6763, which is present and annotated in
the miRNA sequence repository miRBase.
miR-6763-5p has a seed sequence of 5’-UGGGGAG-3’. It shares the same seed sequence with
another miRNA known as miR-3150a-3p, and is therefore a member of the same microRNA
family.
miR-6763-5p is listed in the miRBase database, but its functions are largely unknown. By
searching current literature, we have found that miR-6763-5p has been implicated, via the
observation of differential expression, in conditions such as intracranial aneurysm [155] and
lipopolysaccharide (LPS)-induced periodontitis [156]. miR-3150a-3p has also been shown to be
dysregulated in Alzheimer’s disease patients [157].

Sp1
Sp1 is a cellular transcription factor also known as specificity protein 1, or alternatively,
stimulatory protein 1 (http://www.ncbi.nlm.nih.gov/protein/NP_001238754) and is encoded by
the SP1 gene
(http://genome.ucsc.edu/cgibin/hgTracks?db=hg19&lastVirtModeType=default&lastVirtModeExtraState=&virtModeType=default&virtMode=0&nonVirtPosition=&position=chr12%3A53773960%2D53810230&hgsid=474762649_TajkH6E538U5tZYTnVjFE0KcPPqc). It was the first
sequence-specific transcription factor identified.
Transcription factors are cellular proteins which have DNA binding activity. This DNA binding
activity is sequence specific and provides transcription factors with the ability to modulate
transcription. Transcription factors can associate with other proteins while bound to the DNA
promoter and these associations may drive or inhibit transcription from the promoter.
The Sp1 transcription factor is a zinc finger DNA binding protein which is 785 amino acids in
length. Sp1 is known to bind to GC-rich motifs of many promoters. These GC-rich motifs may
vary in actual sequence.

Transcription factors such as Sp1 are able to regulate transcription by either activation or
repression. Whether Sp1 activates or represses transcription depends on many factors including
conditions in the cell or post-translational modifications of the Sp1 molecule such as
phosphorylation, acetylation, or glycosylation.
The term ‘zinc finger’ refers to the shape of the amino acid motif which looks like a protrusion
from the overall protein. The zinc finger is stabilized by a zinc ion which interacts with the
amino acids of the motif to maintain the structure. The zinc finger motif gives the protein in
which it resides DNA binding activity, and is a common DNA binding motif among proteins.
Sp1 has three zinc finger domains of 25 amino acids each, and these three domains bind directly
to DNA to mediate gene transcription. Sp1 is considered to be of zinc finger type Cys2/His2.
The Sp1-I binding site in the HIV LTR is given by Los Alamos as the 11 nucleotide sequence
5’-TGGGGAGTGGC-3’.
 

Figure 5.1: Sp1 Transcription Factor. NMR structure of the Sp1 DNA-binding motif. By Emw - Own work, CC BY-SA 3.0,
https://commons.wikimedia.org/w/index.php?curid=9444646
 

Sp1 Sites in the HIV-LTR
The HIV LTR promoter contains three Sp1 binding sites, known as Sp1-I, Sp1-II, and Sp1-III.
The Sp1-I site is closest to the TATA region of the promoter, while the Sp1-II and Sp1-III sites
are successively further upstream. All three sites are considered GC-rich, but the actual
sequences in each are distinct.
Sp1 binding in the HIV LTR is known to enhance transcription from the viral promoter. Sp1
appears to bind indirectly with Tat, and the interaction of Tat with the Sp1 transcription factor
enhances initiation of transcription by RNA Pol II in the viral LTR. Vpr has also been shown to
interact with the Sp1 protein while bound to the Sp1 site in the HIV LTR. This cooperative
binding serves to trans-activate LTR-directed transcription [158].
The canonical Sp1 binding site given in the literature is shown as
5’-(G/T)GGGCGG(G/A)(G/A)(C/T)-3’
In our study, we have used the HIV-1 reference strain HXB2 LTR Sp1 binding site sequences for
examination. All three differ slightly in sequence. While the Sp1-II site conforms to the above
consensus sequence, both Sp1-I and Sp1-III show some variation. The Sp1 binding site which is
most relevant to our investigation and which we examined most closely is the Sp1-I site, which
has a sequence given as
5’-TGGGGAGTGGC-3’
The HXB2 Sp1-I binding site lies 18nt upstream from the promoter start at position #9483, and
all three Sp1 sites are oriented consecutively in tandem.
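The deviation of the Sp1-I site from the canonical consensus can be made explicit with a short comparison. This is a minimal sketch that uses only the consensus and the Sp1-I sequence quoted above; the position-by-position encoding and the function name are our own illustrative choices, and the consensus-conforming example string is constructed from the consensus itself rather than taken from any isolate.

```python
import re

# Canonical consensus 5'-(G/T)GGGCGG(G/A)(G/A)(C/T)-3', one entry per position.
SP1_CONSENSUS = ["GT", "G", "G", "G", "C", "G", "G", "GA", "GA", "CT"]
CONSENSUS_RE = re.compile("".join(f"[{allowed}]" for allowed in SP1_CONSENSUS))


def min_mismatches(site: str) -> int:
    """Fewest consensus mismatches over all 10-nt windows of the site."""
    width = len(SP1_CONSENSUS)
    return min(
        sum(base not in allowed
            for base, allowed in zip(site[i:i + width], SP1_CONSENSUS))
        for i in range(len(site) - width + 1)
    )


SP1_I = "TGGGGAGTGGC"  # HXB2 Sp1-I site as given in the text

print(bool(CONSENSUS_RE.search(SP1_I)))  # does any window match the consensus exactly?
print(min_mismatches(SP1_I))             # how far is the best window from the consensus?
```

Under this encoding the Sp1-I site contains no exact consensus match, and its best-aligned window differs from the consensus at two positions, in line with the statement that Sp1-I shows some variation.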

5.3 Results:

miR-6763, Identified in HIV Genomic Sequences


Of the 15 miRNAs in our study that we have found to have significant homology with one or
more HIV-1 strains, one of our best ‘matches’ is human cellular miR-6763. HIV-1 sequences
having significant homology to the human cellular miR-6763 are shown in the figure below:


Figure 5.2: Three HIV Isolates from Different Countries Contain miR-6763 Sequence
Matches for miR-6763 were found among three different HIV strains. Country of origin,
alignment, viral genomic region, degree of homology and position in reference strain HXB2 are
given. All three matches show 100% homology with miR-6763. Position numbers of match for
each strain are also given under alignment.

The mature length miR-6763 is 19nt long and matches perfectly and completely across its entire
19nt length to three different HIV-1 isolates: HDM003V09-5 from Canada, isolate NP625 from
Spain, and isolate HAN-2 from Germany. The fact that this miRNA matched with 100%
homology across its entire domain, plus the fact that it matched not just one but three different
strains from different countries prompted us to investigate the nature of these relationships
further.
We can see from the figure above that all three strains contain an exact miR-6763 sequence, and
each strain contains the sequence in its LTR region. Furthermore, when we mapped the miR-
6763 matching strains using a standardized HXB2 reference strain for alignment, we found that
the miR-6763 mapped to exactly the same HXB2 position, #9482-9500, in each strain.

Alignment of Matching Strains Shows Insertions of Sp1 Binding Site
These data show that, out of the entirety of HIV-1 sequence data present in the global
databases, these three and only these three strains contain this particular miR-6763 sequence.
This prompted the question: what makes these three isolates different from other HIV-1 strains?
In order to answer this question, we performed another sequence alignment. We aligned the
three matching HIV-1 isolates (HDM003V09-5, NP625, and HAN-2) at the region in question,
#9482-9500, along with the mature cellular miR-6763-5p sequence. To this alignment we also
added the HXB2 HIV sequence because HXB2 represents a reference, or ‘typical’, HIV genome.
The alignment is shown here:


Figure 5.3: Alignment of Target Strains with HXB2 Reveals Sp1-I Duplications
Alignment is given showing HIV reference strain HXB2 with three strains of interest
(HDM003V09-5, NP625, and HAN-2) and matching microRNA miR-6763. Sp1-I region is shown as
well as flanking sequences and downstream TATA promoter start. The duplicated Sp1-I binding
site can be easily visualized against the HXB2 sequence. Also evident is that miR-6763 maintains 100%
homology through the tandem Sp1-I sites of these three HIV isolates.

One of the advantages to using HXB2 as our reference strain is that HXB2’s sequence has been
studied very thoroughly and many sequence features have been identified. We have indicated
several of these features in our alignment: the Sp1-I transcription factor binding site, a TATA
box, as well as the upstream Sp1-II transcription binding site.

From examining the alignment between HXB2 and our target strains, it can be seen that the
common element among these sequences is an insertion, relative to HXB2, in the three matching
strains (HDM003V09-5, NP625, and HAN-2) of the sequence 5’-TGGGGAGTGGC-3’. It can
also be readily seen that this sequence is exactly the same sequence as the Sp1-I binding site
directly preceding it. By examining all three matching strains, we see the same duplication of
the Sp1-I binding site sequence. Thus, the difference between a standard HIV-1 strain, such as
HXB2, and our ‘matching’ strains is a duplication of the Sp1-I transcription factor binding
site.

When we examine the alignment of miR-6763 with the three matching strains, we can see the
perfect match across miR-6763’s entire length. However, when comparing HXB2 with the
miR-6763 alignment, we can see that miR-6763 is not a complete match with HXB2: it
matches HXB2 at positions 1-12 only, and at positions 13-19 there is no match. Therefore,
miR-6763 does not have a significant matching sequence in HXB2. It is the presence of the
insertion of a duplicate Sp1-I binding site sequence in the three target strains that in essence
‘creates’ the match for miR-6763.
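The way the duplication extends the match from 12 to 19 positions can be illustrated with a short sketch. Two inputs here are assumptions rather than data from this chapter: the 19-nt miR-6763-5p sequence is given as we recall it from miRBase and should be verified against the database, and the single ‘C’ placed upstream of the Sp1-I site is inferred from the statement that positions 1-12 match HXB2. Real flanking LTR sequence is deliberately omitted.

```python
SP1_I = "TGGGGAGTGGC"                # Sp1-I site as given in the text
MIR_6763_5P = "CUGGGGAGUGGCUGGGGAG"  # assumed mature sequence; verify in miRBase


def longest_matching_prefix(query: str, target: str) -> int:
    """Length of the longest prefix of query found anywhere in target."""
    for length in range(len(query), 0, -1):
        if query[:length] in target:
            return length
    return 0


mir_dna = MIR_6763_5P.replace("U", "T")  # compare in DNA alphabet

single_site = "C" + SP1_I          # wild-type arrangement (assumed upstream C)
tandem_site = "C" + SP1_I + SP1_I  # duplicated arrangement

print(len(mir_dna))                                   # length of the miRNA
print(longest_matching_prefix(mir_dna, single_site))  # match with one Sp1-I site
print(longest_matching_prefix(mir_dna, tandem_site))  # match with two Sp1-I sites
```

With these inputs, a single Sp1-I site supports a match over the first 12 positions of the miRNA only, while the tandem arrangement supports the full 19-nt match, since the remaining 7 positions are supplied by the start of the second Sp1-I copy.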
The fact that these three different strains are separated so far geographically yet contain the same
insertion seems to suggest that this duplication is the result of three independent events.
A schematic showing the Sp1-I insertion in our three matching strains vs. the map of a wild-type
HIV isolate is given in Figure 5.4.

Figure 5.4: Sp1 Site Duplication Reveals a Mature miR-6763 Sequence
[Figure: schematic of the HIV-1 genome flanked by LTRs. WT: a single Sp1 site upstream of the TATA box. Duplication (HIV isolates HDM003V09-5, NP625, and HAN-2): two tandem Sp1-I sites upstream of the TATA box, spanned by the miR-6763 sequence.]


Schematic diagram of a wild type HIV genome is shown against a diagram of one of our three
matching sequence genomes (HDM003V09-5, NP625, and HAN-2). Sp1-I duplication is shown on
the lower portion of the figure in the 3’LTR. miR-6763 sequence is given and can be seen to run
through both Sp1-I sites. These three HIV strains are the only isolates known to contain a
duplicated Sp1-I site in the LTR.

Duplication of the Sp1-I site is unique to these three HIV isolates
The 11nt Sp1-I site, 5’-TGGGGAGTGGC-3’, is extremely well conserved in HIV-1 and appears
in virtually all HIV LTRs. We propose here that the presence of a duplicated Sp1-I site is unique
to the three strains we have identified, HDM003V09-5, NP625, and HAN-2.
We have serendipitously identified these three strains containing this tandem, double Sp1-I
binding site by searching all HIV-1 sequences for homology to the mature cellular miR-6763
sequence. If the occurrence of two tandem Sp1-I sites is indeed limited to only the three strains
that we have found – by searching for the miR-6763 sequence – then a search using a double
Sp1-I sequence should yield the same results. These searches were performed.
When using as a query sequence a single HXB2 Sp1-I sequence of 5’-TGGGGAGTGGC-3’ and
searching all HIV-1 sequence databases, we indeed found numerous matches, all of which reside
in the LTR region. This is consistent with what we would expect, that virtually all HIV LTRs
contain an Sp1-I site and most adhere to this exact sequence. When we next performed a search
using a tandem duplicated Sp1-I sequence,
5’-TGGGGAGTGGCTGGGGAGTGGC-3’, we found only the three strains which we have
previously identified, confirming that these are the only three strains carrying this particular
duplication.
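The two database queries described above amount to asking how many consecutive copies of the Sp1-I site a sequence carries. The following sketch shows that screen on local strings rather than on the public databases; the LTR fragments are invented purely for illustration and do not correspond to any real isolate.

```python
import re

SP1_I = "TGGGGAGTGGC"  # single Sp1-I site as given in the text
TANDEM_QUERY = "TGGGGAGTGGCTGGGGAGTGGC"  # tandem query as given in the text


def max_tandem_copies(seq: str, unit: str = SP1_I) -> int:
    """Largest number of back-to-back copies of unit found in seq."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


# Hypothetical fragments: flanking bases are arbitrary placeholders.
fragments = {
    "single_site": "AAGC" + SP1_I + "TATAAG",
    "tandem_site": "AAGC" + SP1_I * 2 + "TATAAG",
    "no_site": "AAGCTTATAAG",
}

print(SP1_I * 2 == TANDEM_QUERY)
for name, seq in fragments.items():
    print(name, max_tandem_copies(seq))
```

A sequence returned by the single-site query but not by the tandem query would score 1 here, while the three isolates described in the text would be expected to score 2.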
Another commonly used reference and experimental HIV-1 strain, pNL-4, was also searched and
was verified to have the Sp1-I binding site. pNL-4 indeed contains the exact same 11nt sequence
with no duplication, as HXB2 does, and both strains show identical sequences immediately
downstream of this Sp1 site. This then implies that the Sp1-I sites present in the three target
isolates we have identified contain exact duplications of the same singular Sp1-I sequence
present in pNL-4. Since in vitro experiments using live virus commonly employ pNL-4, this
finding may be useful in further study of the effects of a tandem duplicated Sp1-I site on viral
transcription.

miR-6763 is Predicted to Target CD4 in silico
We have performed extensive analysis of the CD4 3’UTR and possible target sites for all of the
miRNA sequences that we have identified in our study. This information as a whole can be
found in detail in Chapter 6.
For purposes of this study, we wanted to investigate the potential for interaction between
miR-6763 and CD4. Using the TargetScan software algorithm, we were able to test for the
possibility that miR-6763 may target the CD4 mRNA. Indeed, we found three target sites in the
CD4 3’UTR for miR-6763, and these are shown below in Figure 5.5:

Figure 5.5: Target Sites for miR-6763 in the 3’UTR of CD4
[Figure: map of the human CD4 3’UTR (length 1481 nt) with a position axis running from 250 to 1.5K, marking the miR-6763 target sites.]


miR-6763 target sites are shown graphically in Figure 5.5. The CD4 mRNA contains three target
sites in its 3’UTR for miR-6763, at positions #32, #269 and #1390. Multiple target sites in the
3’UTR for a single miRNA have been shown to cooperatively enhance binding and subsequent
downregulation of the target mRNA. (Original figure adapted from TargetScan, Nam et al., 2014
[134]).
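The basic idea behind locating candidate target sites can be sketched as a scan of a 3’UTR for perfect complements of the miRNA seed. This is not the TargetScan algorithm, which applies additional site types and scoring; it is a simplified illustration. The seed is the one quoted in this chapter, while the toy UTR string is invented and is not the CD4 3’UTR.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_match_sites(utr: str, seed: str) -> list[int]:
    """Return 1-based start positions of perfect seed complements in a UTR."""
    site = "".join(RNA_COMPLEMENT[base] for base in reversed(seed.upper()))
    utr = utr.upper().replace("T", "U")  # accept DNA or RNA input
    return [
        i + 1
        for i in range(len(utr) - len(site) + 1)
        if utr[i:i + len(site)] == site
    ]


MIR_6763_SEED = "UGGGGAG"  # seed sequence quoted in the text

# Hypothetical toy UTR containing two seed-complementary sites.
toy_utr = "AAAA" + "CUCCCCA" + "GGGUUU" + "CUCCCCA" + "AA"

print(seed_match_sites(toy_utr, MIR_6763_SEED))
```

Applied to the actual CD4 3’UTR sequence, a scan of this kind would list candidate positions that could then be compared with the sites reported by TargetScan.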

We hypothesize that a virally encoded miR-6763 could target and possibly downregulate CD4
expression via miRNA-mediated repression. The implications of this finding are several. The
notion that these viruses, via expressing a virally encoded miRNA which is a cellular miRNA
homologue, may potentially downregulate CD4, would add another level of regulation to those
already employed by HIV-1 during viral infection. Such a finding would thus expose another
weapon in the viral arsenal against the cell.

5.4 Discussion:
In this investigation we have shown that a human cellular miRNA, miR-6763, exhibits 100%
homology with three different HIV-1 strains in the same region of the 3’LTR. We can extract
several insights from this finding:
1)   There is a duplication event of the Sp1-I binding site.
2)   Only three viruses in the database of thousands have this duplication.
3)   HIV can therefore duplicate a transcription factor binding site precisely.
4)   There may be a selective viral advantage to this duplication.
5)   This duplication has not been previously described.
6)   The duplication event ‘creates’ a cellular miRNA mimic in the HIV genome.
7)   The miR-6763 viral sequence could downregulate CD4, aiding viral replication.
1) There is a duplication event of the Sp1-I binding site – we have found three HIV-1 isolates
from different countries which have a tandem, repeated, Sp1-I transcription factor binding site in
their 3’ LTR. Because the 11nt Sp1-I sequence is repeated without any additional flanking sequence,
the duplication appears to be confined to the Sp1-I site itself.
2) Only three HIV-1 strains have this duplication – in the entire NCBI database of 400,000
genomic and subgenomic sequences, these three strains, and these three strains only, contain this
duplication of the Sp1-I binding site. Consequently, by virtue of this duplication, these strains
now match the sequence for the mature miR-6763 perfectly; that is, they contain an exact human
cellular mature miR-6763 sequence.
3) HIV can therefore duplicate a transcription factor binding site precisely - the duplication of
the Sp1 site in these three strains is precise in nature; it is exactly the 11-nucleotide length Sp1-I
binding site which is duplicated – no more or less, and it lies directly in tandem with the original
Sp1-I binding site as a repeat. It shows that the virus in nature can duplicate a transcription
factor binding site exactly.
4) There may be a selective viral advantage to this duplication – in normal cells, the Sp1
transcription factor binds to an Sp1 binding site, which enhances transcription. Presumably, an
additional Sp1 site would result in enhanced transcription, which could possibly create a more
active and pathogenic virus. In addition, this ‘extra’ Sp1 binding site is found in the 3’ LTR,
suggesting up-regulated transcription from the 3’ LTR, which could result in enhanced
oncogenic potential of any virus containing the extra Sp1 binding site.
5) This duplication has not been previously described – the discovery of three geographically
distinct HIV-1 strains with this novel feature of an extra binding site for the transcription factor
Sp1 has not been previously reported to our knowledge.
6) The duplication event ‘creates’ a cellular miRNA mimic in the HIV genome – the three
strains with the matching miR-6763 sequence are different from all other HIV-1 isolates in the
NCBI database by virtue of this duplicated Sp1-I sequence. This viral miR-6763 sequence is a
100% homologue to the cellular mature miR-6763 which suggests a possible functional role for
this viral sequence in post-transcriptional microRNA regulation.

7) The viral miR-6763 sequence is predicted to downregulate CD4, aiding viral replication – in
silico analysis suggests that miR-6763 can bind to three target sites in the CD4 3’UTR,
potentially mediating miRNA-directed downregulation of CD4 expression in the infected cell.
The downregulation of CD4 in cells infected by HIV-1 has been well documented and has been
shown to facilitate viral infection, replication and release of mature virion particles. The
hypothesis that these viral strains could potentially harness and harbor a cellular miRNA which
in turn enhances viral survival holds many intriguing implications.
Thus we have two important results from this study. First, we have identified three HIV-1
strains which contain a cellular miRNA homologue, suggesting a possible functional role in the
virus infectious cycle. Second, by examining the matching strains closely, we have determined
that each of the isolates contains a tandem, duplicated Sp1-I transcription factor binding site,
which suggests the possibility of enhanced transcription and virus activity. We believe that this
finding of not only a new viral miRNA with potential functional roles in HIV-1 pathogenesis, but
also the identification of HIV-1 strains containing a repeated Sp1-I transcription factor binding
site, could have far reaching implications in the understanding and development of treatment for
HIV-1 related disease.

Chapter 6

Identification of Human MicroRNA Sequences Within HIV-1
Genomes Offers Novel Mechanism for CD4 T-Cell Receptor
Downregulation

6.1 Abstract:
The discovery that the CD4 molecule was a natural receptor for the human immunodeficiency
virus and that CD4+ T-cells were the primary target for HIV-1 was a key early finding in AIDS
research. It followed, naturally, that CD4 would become the subject of intense investigation and
a target for therapeutic intervention.
While much has been learned about the role and function of the CD4 molecule in cellular
homeostasis as well as HIV-1 infection, there remains much to be learned regarding the
mechanisms by which CD4 is regulated and the multiple effects of CD4 on HIV-1 pathogenesis.
Downregulation of CD4 appears to be critical for the propagation of HIV-1 infection, and
therefore therapeutic targeting of the CD4 molecule as a means of limiting viral pathogenesis
remains a promising, if elusive, goal.
In previous research, we have identified 20 HIV-1 strains which contain viral homologues to 15
cellular mature miRNAs. In this investigation we have examined the CD4 3’UTR in detail to
determine possible targeting activity by those miRNAs which we have identified. We will show
that several miRNAs, including miR-195, miR-4644, and miR-6763, are predicted to bind and
target the CD4 3’UTR in silico. We also show that two previously identified and validated HIV-
encoded vmiRNAs, hiv1-miR-TAR-5p and hiv1-miR-H1, are predicted to target the CD4
mRNA. miRNA-directed downregulation of CD4 by virally encoded miRNAs would usher in a
new dynamic and offer important insights into the understanding of viral regulation of infection.

6.2 Introduction:
CD4 Receptor
CD4+ T-cells represent a major target for HIV infection, and the CD4 molecule is the primary
receptor and mediator of the initial binding between the HIV envelope protein and the
susceptible cell [159, 160].
The CD4 molecule is a glycoprotein of approximately 55-60 kDa molecular weight and is
expressed in many cells, including immune cells such as monocytes, macrophages, dendritic
cells and T-helper cells. CD4 is a cell surface marker on mature T helper lymphocytes and plays
an important role in the process of T-cell activation as well as immune recognition [161]. CD4
was discovered initially in the 1970s as an immunological marker first known as the T4 protein,
and later grouped into a cluster of differentiation, or CD, group 4. As with many other cell
surface receptors, CD4 is a member of the immunoglobulin superfamily, and is a type I integral
membrane glycoprotein found chiefly on the cell surface of major histocompatibility complex
(MHC) class II-restricted T-lymphocytes. The molecule is composed of an N-terminal region containing
four immunoglobulin domains (D1 to D4) which are present on the extracellular surface of the
cell, as well as a membrane-spanning region, and a short, charged cytoplasmic tail region [162].
The intracellular domain of the CD4 molecule contains two cysteine residues which allow it to
interact with the Src protein tyrosine kinase p56, also known as p56Lck [163].
CD4 acts as a co-receptor, together with the T cell receptor (TCR), in the recognition of a target
antigen on an antigen presenting cell (APC). By facilitating the interaction between T-cells and
antigen-presenting cells [164] CD4 mediates the start of a signal cascade which is critical for the
immunogenic response of CD4+ T-cells [165, 166].
CD4 is a critical part of the immune response due to its role as a ligand for major
histocompatibility complex (MHC) class II molecules [164]. By binding MHC class II bearing
cells, CD4 provides a way of recognizing antigens and activating an immune response.
During this antigenic recognition, the physiological role of CD4 is to stabilize interactions
between T cells and antigen-presenting cells [164]. By displaying a processed peptide fragment
bound to an MHC class II molecule, an antigen-presenting cell is able to bind to the T cell
receptor and CD4 co-receptor of the CD4+ T-cell whereby a signal is intracellularly transduced,
which is critical for antigen responsiveness of the CD4+ T cell [165, 166]. CD4 acts as an
adhesion molecule in this interaction, keeping the cells in close contact.
The T cell bearing the CD4 molecule is known as a T-helper cell and is a critical component of
the immune system. One of its main functions is to activate other immune cells such as B cells
and cytotoxic T cells (CTLs). When CD4+ T-helper cells are depleted, such as happens in HIV
infection, an immunodeficiency results and the patient becomes vulnerable to numerous
opportunistic infections [159, 160].

CD4 Receptor is the primary receptor for HIV during infection of T-cells
Very early in the history of acquired immunodeficiency syndrome (AIDS) disease, it was
observed that patients exhibited a selective depletion of their CD4+ subpopulation of T
lymphocytes [167]. CD4 was found to be the primary receptor for HIV-1 soon after the virus
had been isolated from AIDS patients. This finding in vivo correlated with the observation that
there was a specific tropism of HIV-1 for the same CD4+ T lymphocytes in vitro [160].
Viral entry of HIV-1 into susceptible cells occurs in multiple steps, and is initiated by the viral
envelope protein gp120 binding to the CD4 molecule. Upon binding CD4, gp120 undergoes a
conformational change which ultimately leads to the exposure of the viral gp41
molecule, which is the transmembrane portion of the HIV-1 envelope glycoprotein. Exposure of
gp41 drives fusion of the cellular and viral membranes, and this
fusion of membranes is followed by the release of the viral core into the cytoplasm of the cell [168].

CD4 requires a secondary receptor


While the primary receptor for the HIV-1 envelope glycoprotein is the CD4 molecule, it was
found that this glycoprotein-CD4 interaction was not sufficient for successful viral infection. It
was shown that the expression of human CD4 on rodent cells allowed those cells to bind virus,
but neither fusion nor infection occurred [169]. It was also found that although HIV-1 can bind
to the D4 subdomain of CD4 which is present on the surface of certain human cells in the brain
and skin, no fusion occurs. Furthermore, some CD4+ cell lines were found to be resistant to
HIV infection. Taken together, this suggested that another key component necessary for viral
fusion and entry was missing [170].
Indeed, in order for the virus to successfully enter the cell, it is also necessary for the viral
envelope to bind to a co-receptor on the cell surface, and this requirement of a co-receptor is
unique to lentiviruses. It is this secondary binding of the cellular co-receptor by the HIV
envelope protein that triggers fusion of the viral and cellular membranes and allows the viral
contents to be released into the cell.
Co-receptors for HIV-1 are chemokine receptors. Chemokines are small proteins that act as
chemoattractant cytokines during the inflammatory response. Because chemokine receptors are
present on many different types of cells, almost any cell expressing the CD4 molecule on its
surface can be infected by HIV.

Most HIV-1 isolates can be classified by their preference, or tropism, for one of two types of
cells. While HIV-1 normally grows well in CD4+ activated PBMC cultures, some can also
multiply in macrophage cultures, and these isolates are known as macrophage, or M-, tropic. M-
tropic viruses are generally not able to replicate in T-cell lines. Another group of viruses grow
well in PBMC cells and in T-cell lines, but do not normally grow in macrophage cultures, and
are considered T-tropic. This tropism for two types of cells is attributed to two distinct co-
receptors on those cells.
The two co-receptors were eventually identified as CXCR4 and CCR5. CXCR4, which was
identified first, is a receptor for class CXC chemokines, and is a transmembrane glycoprotein of
approximately 46 kDa. CXCR4 is a receptor that signals through interactions with G proteins,
and is closely related to the receptor for another chemokine, interleukin-8 (IL-8).
The CXCR4 molecule was shown to be responsible for permissiveness of infection by T-tropic
strains, and this finding suggested that another co-receptor molecule may confer infectability to
cells by M-tropic strains.
This chemokine receptor was indeed found and is known as CCR5. CCR5 gives cells which
express CD4 the ability to be infected by M-tropic strains of HIV. HIV-1 strains that bind the
co-receptor CXCR4 are often called X4 strains, and those which use the CCR5 co-receptor are
known as R5 strains. The cell tropism of any given HIV-1 strain is constrained by the co-
receptor to which it can bind.

Background on CD4 downregulation


Despite the fact that CD4+ T-cells are the primary target and the CD4 molecule itself is the
primary receptor for HIV, we know that expression of CD4 is effectively downregulated post-
HIV infection [160, 171]. While the observation of a virus downregulating its own receptor may
seem ironic at first, extensive investigation has shown this downregulation to enhance viral
replication and viral pathogenesis. The mechanisms by which this downregulation occurs are
several and will be discussed here.
The sequestering of the CD4 molecule is not a phenomenon unique to HIV; many enveloped
viruses downregulate their cell surface receptors [159, 160]. The initial reasoning for this is
straightforward: accumulating envelope glycoproteins at the cell surface may interact with the
receptor, which can interfere with virion assembly and may impede incorporation of envelope
into the budding virus particle. Excessive surface concentration of receptor may also induce
formation of syncytia with neighboring cells, limiting virus production.
However, other enveloped viruses that downregulate their cell surface receptor do so primarily
via one mechanism, which is the binding up of the receptor in the endoplasmic reticulum (ER)
before it can be transported to the surface. In the case of HIV, there are three separate genes
which can act in different ways to downregulate CD4: env, as well as the lentiviral auxiliary
genes nef and vpu.

Nef
Nef is a small, 27 kDa protein which is translated from multiply spliced early transcripts and produced in
abundance early in infection. Lack of normal nef function has been shown to reduce viral
pathogenesis; for example, some HIV-infected patients who do not progress to disease in the
absence of treatment, the so-called long-term nonprogressors (LTNPs), have been shown to have
mutations in the nef region of the viral genome [172]. Specifically, Nef is known to modulate
HIV infection by downregulating the amount of CD4 and the amount of class I major
histocompatibility complex (MHC-I) molecules present at the cell surface.
The Nef protein is post-translationally myristoylated, and this myristoylation serves to localize
the Nef protein to the inner surface of the cell membrane. By attaching to the plasma membrane,
Nef is able to interact with other membrane-bound proteins. Nef serves to remove the CD4
molecule from the cell surface by inducing its incorporation into early endosomes through
clathrin-dependent endocytosis. It accomplishes this in several steps. First, Nef is able to
dissociate the CD4 molecule from its internal kinase p56-Lck “anchor” by a mechanism which is
poorly understood. It is subsequently able to associate directly with the CD4 molecule
through a domain in the CD4 cytoplasmic tail.
After binding to and mobilizing CD4 at the cell surface, Nef then interacts with clathrin adaptor
complexes, such as AP-1, AP-2 and AP-3. This association of Nef/CD4 with AP-2 guides the
CD4 to clathrin coated pits and directs CD4 from the plasma membrane to the early endosome.
The internalization of CD4 eventually leads to its incorporation into lysosomes and degradation
[173].
Nef is also able to direct CD4 directly from the trans-Golgi network (TGN) to endosomes by
again complexing with the CD4 cytoplasmic tail and subsequent association with the AP-1
molecule. It is the AP-1 adaptor complex which guides the CD4 directly to the endosome
without ever reaching the cell surface. In addition, another adaptor complex, AP-3, may serve a
similar function, but in this case guiding the CD4 molecule via Nef directly from the TGN to the
lysosome.
Another related function of Nef in the enhancement of viral infection is the reduction of major
histocompatibility complex I (MHC-I) molecules at the cell surface. In a manner similar to that
by which Nef regulates CD4, Nef appears to interact with the MHC I molecule, also at the level
of the trans-Golgi network (TGN), which leads to its transport to lysosomes for degradation.
Thus, the nascent MHC-I molecules never reach the cell surface and their concentration is
decreased. Since the MHC-I molecule is an antigen-presenting molecule which interacts on the
cell surface with other CD8+ T-cells involved in cellular immunity, the sequestering of the
MHC-I inhibits this process, and the virus is able to evade the immune surveillance of the CD8+
cytotoxic T-cell (CTL). Therefore, the inhibition of MHC-I surface expression by Nef may be a
potent method for HIV-1 to avoid CTL response and continue to replicate in the host [173].

Vpu
The HIV-1 vpu gene encodes a small protein of 81 amino acids. The Vpu protein has an
N-terminal hydrophobic region of 27 amino acids which constitutes a membrane-spanning
region, and a C-terminal domain of 54 amino acids which forms a cytoplasmic tail.
Vpu has two distinct biological functions. First, it is required for virus maturation and aids in the
release of budding virions from the host cell. The second function of Vpu is the degradation of
the CD4 molecule. These two functions are distinct and separated into two domains of the Vpu
protein: the N-terminal region of Vpu is involved in the untethering and release of virus
particles from the plasma membrane, and the cytoplasmic region targets the CD4 molecule. Vpu
forms specific interactions with CD4 in the endoplasmic reticulum (ER) and acts as an
adapter molecule, linking CD4 to an E3 ubiquitin ligase complex known as SCFβ-TrCP.
This ultimately delivers the CD4 receptor to the proteasome degradation pathway and
eliminates the CD4 molecule [174].
The Vpu protein is able to bind the cytoplasmic tail of the CD4 molecule when CD4 is
complexed to the envelope protein gp160 in the ER, and by doing so, causes the dissociation of
CD4 from the gp160 protein, leaving the envelope precursor free to continue its processing and
eventual transport to the cell surface. The Vpu protein complexed to CD4 then binds β-TrCP,
which is a component of the E3 ubiquitin ligase complex. This interaction serves to
recruit the remaining components of the ligase complex. The CD4 is subsequently ubiquitinated
while bound to Vpu, and this ubiquitination is what causes CD4 receptor transport from the ER
to the proteasome for subsequent degradation [175].
As alluded to above, the HIV-1 envelope precursor protein also plays a role in limiting CD4
distribution to the cell surface. During HIV infection, both CD4 and the envelope precursor
gp160 are synthesized and therefore co-located in the ER. This inevitably leads to gp160 binding
to the nascent CD4 molecule and trapping the CD4 receptor in the ER. It is in this way that the
Env protein also serves to reduce the surface concentration of CD4; by binding CD4, the Env
precursor prevents its processing and trafficking to the cell membrane [160, 171].
Thus, it appears that Env, Vpu, and Nef all participate in downregulation of CD4. While Nef
reduces CD4 surface concentration in the early stages of infection, Env and Vpu downregulation
of CD4 occurs in the later stages [168].
This redundancy of regulation of CD4 by three distinct HIV-1 genes suggests an underlying
importance of CD4 downregulation as it relates to HIV infection.

Viral Enhancement by CD4 Sequestration
Reducing the amount of CD4 receptor at the cell surface is proposed to enhance viral replication
and release in several ways. First, limiting CD4 at the surface prevents its binding to nascent
Envelope glycoproteins which are also accumulating at the surface in preparation for viral
assembly; such binding would otherwise prevent Env incorporation into budding virions. Secondly, excessive
CD4 at the cell surface may cause new progeny HIV-1 virions to bind this receptor and prevent
the release of these viral particles.
A third model suggests that without suppression of CD4 molecules at the surface, CD4 will be
incorporated into the viral envelope along with the Envelope glycoprotein. This will lead to the
inevitable binding up of Envelope to the CD4 embedded in its own viral membrane, preventing it
from interacting with other cell surface CD4 receptors and reducing the infectivity of these
particles [173].
Finally, limiting CD4 concentration at the cell surface prevents or reduces infection by additional
HIV-1 virus particles. Superinfection by HIV, by introducing multiple proviral genomes into the
host DNA, has been shown to be toxic to infected cells [175].

6.2 Results:
Target Sites in CD4 for miR-195, miR-4644, and miR-6763
The studies reviewed here so far have described downregulation of CD4 expression at the protein
level. There are also, however, some reports of reduced CD4 expression through a reduction in
transcription of CD4 mRNA [171, 176].
Studies have also shown a decrease in the translation of CD4 mRNA post-infection [177, 178].
The fact that there is a decreased level of CD4 on the T-cell surface, combined with the
observation that decreased expression of CD4 may occur via reduced CD4 mRNA translation,
prompted the question of whether this reduced translation of CD4 mRNA might be regulated by
a miRNA-mediated mechanism.
Based on this information, we conducted a search for target sites within the CD4 mRNA 3’ UTR
using the TargetScan software. Previously, in chapters 2, 4, and 5, we have investigated miR-
195, miR-4644, and miR-6763 in detail. By comparing the sequences of all 15 miRNAs we have
identified, we found that those same three, miR-195, miR-4644 and miR-6763 indeed had target
sites within the CD4 mRNA.
In addition, we have identified target sites in the CD4 3’UTR for two HIV-encoded miRNAs,
hiv1-miR-TAR-5p and hiv1-miR-H1.

By utilizing bioinformatics software and other metagenomics tools, we are able to identify
possible target site sequences for most given miRNAs. We can also determine which species,
besides human, may contain the target site sequence in their target mRNA. Conservation of
target sites between species lends support to the idea that the target may be evolutionarily
conserved and therefore more likely to be a legitimate target site in vivo. The following figure
shows our results from the examination of possible target sites for miR-195 in the CD4 mRNA:

Figure 6.1 – miR-195 Target Sites in the CD4 3’UTR

(TargetScan Screenshot Version 7.0, Lewis, 2005 [62]).

This snapshot from the TargetScan program shows the miR-195 target site within the 3’ UTR of
the CD4 mRNA at position #1379. It also shows conservation among the human, chimp, and
rhesus mRNAs, but diminished conservation in the corresponding non-primate mRNAs.
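
The conservation comparison described above can be illustrated with a minimal Python sketch. It assumes a small set of pre-aligned 3'UTR windows; the sequences, the site columns, and the helper name site_conservation are hypothetical illustrations, not the actual CD4 alignment and not TargetScan's own conservation scoring. The sketch simply asks whether each species carries the same nucleotides as the human sequence at the site columns.

```python
def site_conservation(aligned, reference, start, end):
    """Report, per species, whether the aligned columns [start, end)
    are identical to those of the reference species."""
    ref_site = aligned[reference][start:end]
    return {species: seq[start:end] == ref_site for species, seq in aligned.items()}


# Hypothetical toy alignment (equal-length rows, '-' marks a gap).
# These are NOT real CD4 3'UTR sequences.
toy_alignment = {
    "human":  "GGAUGAGUCCAACC",
    "chimp":  "GGAUGAGUCCAACC",
    "rhesus": "GGCUGAGUCCAAUC",
    "mouse":  "GGAUGACUCCA-CC",
}

# Columns 3..9 (0-based, end-exclusive 10) hold the 7-nt toy site.
conservation = site_conservation(toy_alignment, "human", 3, 10)
for species, conserved in conservation.items():
    print(f"{species}: {'conserved' if conserved else 'not conserved'}")
```

In this toy alignment the site is intact in the three primate rows and disrupted in the mouse row, the same kind of primate-restricted pattern reported for the predicted sites in this chapter.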

We were also able to identify a target site for miR-4644, shown here:

Figure 6.2 – miR-4644 Target Sites in the CD4 3’UTR

(TargetScan Screenshot Version 7.0, Lewis, 2005 [62]).

We can see here that miR-4644 has a target site within the human CD4 mRNA 3’ UTR at position
#1422. As with miR-195, conservation of this target site extends through human, chimp,
and rhesus but does not extend to non-primates.
The following image shows the target site for our third miRNA, miR-6763:

Figure 6.3 – miR-6763 Target Sites in the CD4 3’UTR

(TargetScan Screenshot Version 7.0, Lewis, 2005 [62]).


This screen shot shows a target site in the CD4 3’UTR for miR-6763 at position #32-38. There
are actually three different target sites for miR-6763; these lie at positions #32-38, #269-275,
and positions #1390-1396. In general, multiple target sites in a single mRNA are thought to act
in a cooperative fashion to aid in binding of the miRNA and subsequent regulation. The
observation of three predicted target sites for this miRNA may also lend support to the idea that
these sites are legitimate target sites, although final verification would have to come from further
experimental validation. The target sites predicted for miR-6763 only occur in primates and do
not extend to other mammals.
In searching for target sites for the other miRNAs that we have identified in this study, we found
less conclusive evidence for possible target sites for four other miRNAs: miR-7151, miR-6124,
miR-5197, and miR-548av. While the TargetScan software showed possible target sites in the
CD4 mRNA for these miRNAs, they were predicted with a much lower confidence than the three
described above.

Identification of other miRNA targets in CD4 mRNA


After finding target sites for three of the 15 identified miRNAs in the CD4 mRNA, we decided to
investigate whether any other putative HIV-encoded miRNAs might also be capable of targeting
CD4.
While the existence of HIV-encoded miRNAs remains controversial, there are three HIV
miRNAs listed in the official database of microRNAs, miRBase [104]: miR-H1-5p, miR-N367-
3p, and miR-TAR-5p/3p. There has also been a fourth report of an HIV-encoded miRNA, hiv1-
miR-H3. We decided to investigate these four miRNA sequences to see if there was any
evidence of a possible target site for these purported miRNAs in the CD4 mRNA.
Because the TargetScan software only reveals target sites between known miRNAs and its
database of 3’ UTRs, it was not possible to use the TargetScan algorithm to answer this question.
However, we were able to use the BLAST algorithm to see if any homology exists between these
HIV-encoded miRNAs and the CD4 3’ UTR. Because the BLAST algorithm searches not only
the subject sequence but also identifies homology based on the reverse complement of the
subject sequence, we could search for reverse complementarity on the target strand. Finding
homology with the reverse complement in the target sequence implies that the given miRNA
could hypothetically base pair with its target’s positive (+) sense strand.
We therefore performed reverse complementarity homology searches using the four putative
HIV-encoded sequences, hiv1-miR-H1, miR-N367-3p, miR-TAR-5p/3p and hiv1-miR-H3 as
query sequences, and defined the CD4 3’UTR as the search space.
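
The reverse-complement search described above can be sketched in a few lines of Python. This is a simplified illustration that uses exact string matching rather than the BLAST algorithm used in this study; the function names and the toy 3'UTR fragment are hypothetical, and only the seed sequence (5’-CUCUCUG-3’, reported in this chapter for hiv1-miR-TAR-5p) is taken from the text.

```python
def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (A<->U, G<->C)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna.upper()))


def find_seed_sites(seed, utr):
    """Return 1-based (start, end) positions in `utr` that could base pair
    with `seed`, i.e. exact occurrences of the seed's reverse complement."""
    target = reverse_complement(seed)
    utr = utr.upper().replace("T", "U")  # accept DNA-style input as well
    sites = []
    start = utr.find(target)
    while start != -1:
        sites.append((start + 1, start + len(target)))
        start = utr.find(target, start + 1)
    return sites


# Seed reported in this chapter for hiv1-miR-TAR-5p.
seed = "CUCUCUG"

# Hypothetical toy fragment, NOT the real CD4 3'UTR sequence.
toy_utr = "AAGG" + "CAGAGAG" + "UUCCAUG" + "CAGAGAG" + "AA"

sites = find_seed_sites(seed, toy_utr)
print(reverse_complement(seed), sites)
```

Exact matching only recovers perfect seed complementarity; a BLAST search, as used here, can additionally report extended or imperfect alignments around the seed.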
By examining the search results, we discovered that there exists a target sequence for the HIV-
encoded sequence hiv1-miR-TAR-5p in the CD4 mRNA. The seed sequence of the HIV miR-
TAR-5p RNA, 5’-CUCUCUG-3’, is perfectly homologous to position #699-705 of the CD4
3’UTR reverse (-) strand, and therefore can potentially bind and downregulate translation of the
CD4 mRNA. This target site is categorized as a 7mer-m8 site, and is present in the human CD4
3’UTR as well as the chimpanzee CD4 3’UTR.
We also discovered that there is a target sequence in the CD4 mRNA for another HIV-encoded
miRNA, hiv1-miR-H1. In this instance, the seed sequence for hiv1-miR-H1, 5’-CAGGGAG-3’,
matches perfectly to position #930-938 of the CD4 3’UTR reverse (-) strand, and so can also
potentially downregulate expression of CD4. Also of note in this case is the observation that
hiv1-miR-H1 matches not only through its seed sequence, positions #2-8, but actually matches
from positions #2-10, creating a stronger association between this potential miRNA and the CD4
target. This target site is also a 7mer-m8 site, and is conserved among human, chimpanzee and
rhesus monkey. This match is shown here:


Figure 6.4: HIV-Encoded vmiRNA Target Sites in the CD4 3’UTR


This figure shows target sites for two HIV-encoded miRNAs which are listed in the miRBase
database, hiv1-miR-TAR-5p and hiv1-miR-H1. vmiRNA name and sequence are given along with
the sequence specificity of the match (‘alignment’), number of identities, and match type.
A 7mer-m8 target site means that there is a seed match at positions 2-7 of the miRNA plus a match
at position 8, and is considered a stronger target site due to the extra matching at position 8.
Matching sequences corresponding to hiv1-miR-TAR-5p and hiv1-miR-H1 are given in green and
blue, respectively, while CD4 sequences are given in violet. Positions of match between miRNAs
and CD4 target sequence are also shown. Seed match is defined as positions 2-8; hiv1-miR-H1
match extends from 2-10.
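
The 7mer-m8 definition given in this caption can be made concrete with a short Python sketch. The 7mer-m8 rule follows the caption; the other three categories (8mer, 7mer-A1, 6mer) follow the commonly used TargetScan site-type convention and are included here as an assumption for context. The miRNA and 3'UTR sequences and the function names are hypothetical toys, not sequences from this study.

```python
def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (A<->U, G<->C)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


def classify_sites(mirna, utr):
    """Find every match to miRNA positions 2-7 in `utr` and label its type.

    Returns (1-based start of the 6-nt core match, site type) tuples.
    """
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    core = reverse_complement(mirna[1:7])  # pairs with miRNA positions 2-7
    m8 = reverse_complement(mirna[7])      # pairs with miRNA position 8
    results = []
    i = utr.find(core)
    while i != -1:
        has_m8 = i > 0 and utr[i - 1] == m8               # extra match at position 8
        has_a1 = i + 6 < len(utr) and utr[i + 6] == "A"   # A opposite position 1
        if has_m8 and has_a1:
            kind = "8mer"
        elif has_m8:
            kind = "7mer-m8"
        elif has_a1:
            kind = "7mer-A1"
        else:
            kind = "6mer"
        results.append((i + 1, kind))
        i = utr.find(core, i + 1)
    return results


# Hypothetical toy miRNA and 3'UTR, constructed to contain one site of each type.
toy_mirna = "UGGACUCAUCG"
toy_utr = "CC" + "UGAGUCCA" + "GG" + "CGAGUCCA" + "GG" + "UGAGUCCG" + "C" + "CGAGUCCC"

site_types = classify_sites(toy_mirna, toy_utr)
print(site_types)
```

Under this convention, both HIV-encoded sites described above would fall into the 7mer-m8 branch, since each is reported as matching miRNA positions 2-8.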

No further homology was found between any of the putative HIV-encoded miRNAs that are
listed in public databases and the CD4 mRNA 3’ UTR.
Our results are shown illustratively in Figure 6.5:

[Figure 6.5 schematic: human CD4 3'UTR, length 1481 nt, with sites marked for miR-6763 (32, 269, 1390), hiv1-miR-TAR (699), hiv1-miR-H1 (930), miR-195 (1379), and miR-4644 (1422).]
 
Figure 6.5: Multiple Target Sites in the 3’UTR of CD4
This figure shows the ~1500nt long CD4 3’UTR schematically, along with target sites identified in
this study. Three miRNAs (miR-195, miR-4644, and miR-6763) have target sites in the CD4
3’UTR: miR-195 has a target site at position #1379; miR-4644 at #1422; and miR-6763 has three
target sites at positions #32, #269, and #1390.

Additionally, two HIV-encoded vmiRNAs are predicted to have target sites: hiv1-miR-TAR
(position #699) and hiv1-miR-H1 (#930). (Original figure adapted from TargetScan, Nam et al.,
2014 [134]).
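
As a small worked example, the site positions listed in this caption can be placed on a single coordinate axis to see how the predicted sites are spaced along the 3'UTR. The positions and the 1481 nt length are taken from Figure 6.5; the variable and function names are illustrative only, and the sketch makes no claim about whether neighboring sites interact.

```python
UTR_LENGTH = 1481  # nt, as given in the Figure 6.5 schematic

# (miRNA, start position in the CD4 3'UTR) as reported in this chapter.
predicted_sites = [
    ("miR-6763", 32),
    ("miR-6763", 269),
    ("hiv1-miR-TAR", 699),
    ("hiv1-miR-H1", 930),
    ("miR-195", 1379),
    ("miR-6763", 1390),
    ("miR-4644", 1422),
]


def adjacent_gaps(sites):
    """Return (upstream miRNA, downstream miRNA, distance in nt)
    for each pair of neighboring sites along the UTR."""
    ordered = sorted(sites, key=lambda site: site[1])
    return [(a[0], b[0], b[1] - a[1]) for a, b in zip(ordered, ordered[1:])]


gaps = adjacent_gaps(predicted_sites)
closest = min(gaps, key=lambda gap: gap[2])

for upstream, downstream, distance in gaps:
    print(f"{upstream} -> {downstream}: {distance} nt")
print("closest pair:", closest)
```

On these coordinates the closest pair is the miR-195 site (#1379) and the third miR-6763 site (#1390), whose start positions lie 11 nt apart near the 3' end of the UTR.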

6.3 Discussion:
The downregulation of CD4 was observed early on in the history of AIDS, and has been
extensively studied as noted above. The methods employed by HIV-1 to limit the cell surface
expression of CD4, whether mediated by Nef, Env, or Vpu, are those of protein-protein
interaction. Nef acts by internalizing CD4 at the surface, or by directing nascent CD4 to
lysosomes before its transport to the cell surface; Env reduces CD4 expression by binding and
trapping it in the ER; and Vpu is able to interact with CD4 complexed with Env in the ER and
subsequently mediate its ubiquitination and degradation by the proteasome.
Methods possibly employed by HIV-1 to reduce CD4 at the mRNA level are less understood.
Some studies suggest that transcription of the CD4 mRNA is impeded post-infection. Others
have also suggested that translation of CD4 mRNA is affected. Our study investigated the
possibility that CD4 expression could be disrupted at the mRNA level via miRNA-directed
repression of translation.
In our investigation, we have found that three of the HIV-encoded miRNA sequences we identified are
predicted to target the CD4 mRNA and could thereby downregulate its expression at the level of
translation. This would represent a new modality in the nature of CD4 downregulation: that it can
be reduced at the level of translation via a miRNA mechanism.
The finding that these three miRNAs (miR-195, miR-4644, and miR-6763) can target CD4
mRNA represents an additional interesting find in an already evolving story. We identified these
three miRNAs as a subset of 15 miRNAs which were identified on the basis of the homology of
the mature cellular miRNA sequence with a particular HIV-1 isolate genome or genomes. This
finding in itself suggests a mimicry of sorts by a virus incorporating or emulating a cellular
homologue, presumably for its own benefit. As we investigated further, we identified possible
functions for these viral miRNA mimics: titration of a natural cellular target was suggested for
miR-195; miR-6763 may arise from the LTR promoter on the basis of an additionally introduced
Sp-I site; and miR-4644 may possibly act by downregulating furin, which ultimately slows the
maturation of the virus by hindering cleavage of gp160.

However, what we are showing here is another possible role for these miRNAs: that in addition
to the specific role they may be playing in the progression of cellular infection, they may be also
acting in concert to reduce the level of CD4 expression by direct miRNA targeting of its 3’UTR.
The idea of a viral gene product performing multiple roles is not new. Indeed, in our review of
Vpu we mentioned the fact that it serves both to facilitate the release of budding virions from the
cell surface, and also to reduce CD4 expression by targeting it in the ER. In the relatively new
field of miRNA regulation, it is also widely known that a single miRNA can target multiple
mRNA transcripts, thus also acting in a pleiotropic manner. What we are showing in this study
is the observation of multiple effects produced by predicted HIV-encoded miRNA sequences at
the level of mRNA regulation, which to our knowledge has not been described before.
Finally, our finding of two HIV-encoded, previously described miRNA sequences that may
target CD4 may be the most interesting observation of all. While their functions in vivo and
even their existence have been the subject of some controversy, both hiv1-miR-TAR-5p and
hiv1-miR-H1 are nevertheless acknowledged in the miRBase database of all known miRNAs. This
means that in addition to the three miRNAs that we have identified, there are two other known
virally encoded miRNAs which can also possibly target CD4. We believe that this further
supports the hypothesis that HIV-1 virally encoded miRNAs exist and function in vivo.
The miRNA sequences we have identified are homologous to fully-processed, mature miRNAs;
they lack the flanking sequences that are seen in normal cellular pri- or pre-miRNA substrates
and therefore do not appear to be processed in the same way. They may be processed in some
other way to produce mature miRNAs or they may somehow perform their functions embedded
in a longer nucleotide chain. However, hiv1-miR-TAR has been shown to exist as a double
stranded RNA hairpin structure which can and does serve as a substrate for a cellular miRNA
processing pathway. Therefore, this miRNA sequence has an additional feature in that it very
well can be processed into a mature miRNA by the existing cellular pathway and loaded onto the
RISC complex. This lends further support to the possible connection between virally encoded
miRNAs and the natural CD4 target of the virus.
While more work needs to be done to experimentally validate these findings, our observations
show a promising and possible new level of regulation in the delicate balance between virus and
host cell during HIV infection. The further elucidation of these regulatory steps may bring
possible new strategies and approaches for therapeutic intervention in an effort to further combat
the AIDS pandemic.
 
 

 
 
 

References

1. Chalfie M, Sulston J. Developmental genetics of the mechanosensory neurons of
Caenorhabditis elegans. Dev Biol. 1981;82(2):358-70. PubMed PMID: 7227647.

2. Ambros V, Horvitz HR. Heterochronic mutants of the nematode Caenorhabditis elegans.
Science. 1984;226(4673):409-16. PubMed PMID: 6494891.

3. Lee RC, Feinbaum RL, Ambros V. The C. elegans heterochronic gene lin-4 encodes
small RNAs with antisense complementarity to lin-14. Cell. 1993;75(5):843-54. PubMed PMID:
8252621.

4. He L, Hannon GJ. MicroRNAs: small RNAs with a big role in gene regulation. Nat Rev
Genet. 2004;5(7):522-31. doi: 10.1038/nrg1379. PubMed PMID: 15211354.

5. Wightman B, Burglin TR, Gatto J, Arasu P, Ruvkun G. Negative regulatory sequences in
the lin-14 3'-untranslated region are necessary to generate a temporal switch during
Caenorhabditis elegans development. Genes & development. 1991;5(10):1813-24. PubMed
PMID: 1916264.

6. Lagos-Quintana M, Rauhut R, Lendeckel W, Tuschl T. Identification of novel genes
coding for small expressed RNAs. Science. 2001;294(5543):853-8. doi:
10.1126/science.1064921. PubMed PMID: 11679670.

7. Lee Y, Kim M, Han J, Yeom KH, Lee S, Baek SH, et al. MicroRNA genes are
transcribed by RNA polymerase II. EMBO J. 2004;23(20):4051-60. doi:
10.1038/sj.emboj.7600385. PubMed PMID: 15372072; PubMed Central PMCID:
PMC524334.

8. Du T, Zamore PD. microPrimer: the biogenesis and function of microRNA.
Development. 2005;132(21):4645-52. doi: 10.1242/dev.02070. PubMed PMID: 16224044.

9. Murchison EP, Hannon GJ. miRNAs on the move: miRNA biogenesis and the RNAi
machinery. Curr Opin Cell Biol. 2004;16(3):223-9. doi: 10.1016/j.ceb.2004.04.003. PubMed
PMID: 15145345.

10. Lund E, Dahlberg JE. Substrate selectivity of exportin 5 and Dicer in the biogenesis of
microRNAs. Cold Spring Harb Symp Quant Biol. 2006;71:59-66. doi: 10.1101/sqb.2006.71.050.
PubMed PMID: 17381281.

11. Khvorova A, Reynolds A, Jayasena SD. Functional siRNAs and miRNAs exhibit strand
bias. Cell. 2003;115(2):209-16. PubMed PMID: 14567918.

12. Carmell MA, Xuan Z, Zhang MQ, Hannon GJ. The Argonaute family: tentacles that
reach into RNAi, developmental control, stem cell maintenance, and tumorigenesis. Genes &
development. 2002;16(21):2733-42. doi: 10.1101/gad.1026102. PubMed PMID: 12414724.

13. Filipowicz W, Bhattacharyya SN, Sonenberg N. Mechanisms of post-transcriptional
regulation by microRNAs: are the answers in sight? Nat Rev Genet. 2008;9(2):102-14. doi:
10.1038/nrg2290. PubMed PMID: 18197166.

14. Pillai RS, Bhattacharyya SN, Artus CG, Zoller T, Cougot N, Basyuk E, et al. Inhibition
of translational initiation by Let-7 MicroRNA in human cells. Science. 2005;309(5740):1573-6.
doi: 10.1126/science.1115079. PubMed PMID: 16081698.

15. Humphreys DT, Westman BJ, Martin DI, Preiss T. MicroRNAs control translation
initiation by inhibiting eukaryotic initiation factor 4E/cap and poly(A) tail function. Proceedings
of the National Academy of Sciences of the United States of America. 2005;102(47):16961-6.
doi: 10.1073/pnas.0506482102. PubMed PMID: 16287976; PubMed Central PMCID:
PMC1287990.

16. Chendrimada TP, Finn KJ, Ji X, Baillat D, Gregory RI, Liebhaber SA, et al. MicroRNA
silencing through RISC recruitment of eIF6. Nature. 2007;447(7146):823-8. doi:
10.1038/nature05841. PubMed PMID: 17507929.

17. Petersen CP, Bordeleau ME, Pelletier J, Sharp PA. Short RNAs repress translation after
initiation in mammalian cells. Molecular cell. 2006;21(4):533-42. doi:
10.1016/j.molcel.2006.01.031. PubMed PMID: 16483934.

18. Behm-Ansmant I, Rehwinkel J, Doerks T, Stark A, Bork P, Izaurralde E. mRNA
degradation by miRNAs and GW182 requires both CCR4:NOT deadenylase and DCP1:DCP2
decapping complexes. Genes & development. 2006;20(14):1885-98. doi: 10.1101/gad.1424106.
PubMed PMID: 16815998; PubMed Central PMCID: PMC1522082.

19. Nottrott S, Simard MJ, Richter JD. Human let-7a miRNA blocks protein production on
actively translating polyribosomes. Nat Struct Mol Biol. 2006;13(12):1108-14. doi:
10.1038/nsmb1173. PubMed PMID: 17128272.

20. Vasudevan S, Steitz JA. AU-rich-element-mediated upregulation of translation by FXR1
and Argonaute 2. Cell. 2007;128(6):1105-18. doi: 10.1016/j.cell.2007.01.038. PubMed PMID:
17382880; PubMed Central PMCID: PMC3430382.

21. Sandberg R, Neilson JR, Sarma A, Sharp PA, Burge CB. Proliferating cells express
mRNAs with shortened 3' untranslated regions and fewer microRNA target sites. Science.
2008;320(5883):1643-7. doi: 10.1126/science.1155390. PubMed PMID: 18566288; PubMed
Central PMCID: PMC2587246.

22. Cheng C, Bhardwaj N, Gerstein M. The relationship between the evolution of microRNA
targets and the length of their UTRs. BMC genomics. 2009;10:431. doi:
10.1186/1471-2164-10-431. PubMed PMID: 19751524; PubMed Central PMCID: PMC2758905.

23. Sunkar R, Zhu JK. Novel and stress-regulated microRNAs and other small RNAs from
Arabidopsis. Plant Cell. 2004;16(8):2001-19. doi: 10.1105/tpc.104.022830. PubMed PMID:
15258262; PubMed Central PMCID: PMC519194.

24. Jopling CL, Yi M, Lancaster AM, Lemon SM, Sarnow P. Modulation of hepatitis C virus
RNA abundance by a liver-specific MicroRNA. Science. 2005;309(5740):1577-81. Epub
2005/09/06. doi: 10.1126/science.1113329. PubMed PMID: 16141076.

25. Pasquinelli AE. MicroRNAs and their targets: recognition, regulation and an emerging
reciprocal relationship. Nat Rev Genet. 2012;13(4):271-82. doi: 10.1038/nrg3162. PubMed
PMID: 22411466.

26. Houzet L, Klase Z, Yeung ML, Wu A, Le SY, Quinones M, et al. The extent of sequence
complementarity correlates with the potency of cellular miRNA-mediated restriction of HIV-1.
Nucleic acids research. 2012;40(22):11684-96. doi: 10.1093/nar/gks912. PubMed PMID:
23042677; PubMed Central PMCID: PMC3526334.

27. Siomi H, Siomi MC. Posttranscriptional regulation of microRNA biogenesis in animals.
Molecular cell. 2010;38(3):323-32. Epub 2010/05/18. doi: 10.1016/j.molcel.2010.03.013.
PubMed PMID: 20471939.

28. Calin GA, Croce CM. MicroRNA signatures in human cancers. Nature reviews Cancer.
2006;6(11):857-66. Epub 2006/10/25. doi: 10.1038/nrc1997. PubMed PMID: 17060945.

29. Dai Y, Huang YS, Tang M, Lv TY, Hu CX, Tan YH, et al. Microarray analysis of
microRNA expression in peripheral blood cells of systemic lupus erythematosus patients. Lupus.
2007;16(12):939-46. Epub 2007/11/29. doi: 10.1177/0961203307084158. PubMed PMID:
18042587.

30. Bartel DP. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell.
2004;116(2):281-97. Epub 2004/01/28. PubMed PMID: 14744438.

31. Doench JG, Sharp PA. Specificity of microRNA target selection in translational
repression. Genes & development. 2004;18(5):504-11. Epub 2004/03/12. doi:
10.1101/gad.1184404. PubMed PMID: 15014042; PubMed Central PMCID: PMC374233.

32. Kloosterman WP, Plasterk RH. The diverse functions of microRNAs in animal
development and disease. Developmental cell. 2006;11(4):441-50. Epub 2006/10/03. doi:
10.1016/j.devcel.2006.09.009. PubMed PMID: 17011485.

33. Wang X, Ye L, Hou W, Zhou Y, Wang YJ, Metzger DS, et al. Cellular microRNA
expression correlates with susceptibility of monocytes/macrophages to HIV-1 infection. Blood.
2009;113(3):671-4. Epub 2008/11/19. doi: 10.1182/blood-2008-09-175000. PubMed PMID:
19015395; PubMed Central PMCID: PMC2628373.

34. Pfeffer S, Sewer A, Lagos-Quintana M, Sheridan R, Sander C, Grasser FA, et al.
Identification of microRNAs of the herpesvirus family. Nature methods. 2005;2(4):269-76. Epub
2005/03/23. doi: 10.1038/nmeth746. PubMed PMID: 15782219.

35. Kincaid RP, Burke JM, Sullivan CS. RNA virus microRNA that mimics a B-cell
oncomiR. Proceedings of the National Academy of Sciences of the United States of America.
2012;109(8):3077-82. Epub 2012/02/07. doi: 10.1073/pnas.1116107109. PubMed PMID:
22308400; PubMed Central PMCID: PMC3286953.

36. Hariharan M, Scaria V, Pillai B, Brahmachari SK. Targets for human encoded
microRNAs in HIV genes. Biochemical and biophysical research communications.
2005;337(4):1214-8. Epub 2005/10/21. doi: 10.1016/j.bbrc.2005.09.183. PubMed PMID:
16236258.

37. Ahluwalia JK, Khan SZ, Soni K, Rawat P, Gupta A, Hariharan M, et al. Human cellular
microRNA hsa-miR-29a interferes with viral nef protein expression and HIV-1 replication.
Retrovirology. 2008;5:117. Epub 2008/12/24. doi: 10.1186/1742-4690-5-117. PubMed PMID:
19102781; PubMed Central PMCID: PMC2635386.

38. Yeung ML, Bennasser Y, Myers TG, Jiang G, Benkirane M, Jeang KT. Changes in
microRNA expression profiles in HIV-1-transfected human cells. Retrovirology. 2005;2:81.
Epub 2005/12/31. doi: 10.1186/1742-4690-2-81. PubMed PMID: 16381609; PubMed Central
PMCID: PMC1352379.

39. Houzet L, Jeang KT. MicroRNAs and human retroviruses. Biochimica et biophysica acta.
2011. Epub 2011/06/07. doi: 10.1016/j.bbagrm.2011.05.009. PubMed PMID: 21640212;
PubMed Central PMCID: PMC3177989.

40. Gupta A, Nagilla P, Le HS, Bunney C, Zych C, Thalamuthu A, et al. Comparative
expression profile of miRNA and mRNA in primary peripheral blood mononuclear cells infected
with human immunodeficiency virus (HIV-1). PloS one. 2011;6(7):e22730. Epub 2011/08/11.
doi: 10.1371/journal.pone.0022730. PubMed PMID: 21829495; PubMed Central PMCID:
PMC3145673.

41. Pilakka-Kanthikeel S, Saiyed ZM, Napuri J, Nair MP. MicroRNA: implications in HIV, a
brief overview. Journal of neurovirology. 2011;17(5):416-23. Epub 2011/07/26. doi:
10.1007/s13365-011-0046-1. PubMed PMID: 21786074.

42. Bennasser Y, Le SY, Yeung ML, Jeang KT. HIV-1 encoded candidate micro-RNAs and
their cellular targets. Retrovirology. 2004;1:43. Epub 2004/12/17. doi: 10.1186/1742-4690-1-43.
PubMed PMID: 15601472; PubMed Central PMCID: PMC544590.

43. Lin J, Cullen BR. Analysis of the interaction of primate retroviruses with the human
RNA interference machinery. Journal of virology. 2007;81(22):12218-26. Epub 2007/09/15. doi:
10.1128/JVI.01390-07. PubMed PMID: 17855543; PubMed Central PMCID: PMC2169020.

44. Omoto S, Fujii YR. Regulation of human immunodeficiency virus 1 transcription by nef
microRNA. The Journal of general virology. 2005;86(Pt 3):751-5. Epub 2005/02/22. doi:
10.1099/vir.0.80449-0. PubMed PMID: 15722536.

45. Kaul D, Ahlawat A, Gupta SD. HIV-1 genome-encoded hiv1-mir-H1 impairs cellular
responses to infection. Molecular and cellular biochemistry. 2009;323(1-2):143-8. Epub
2008/12/17. doi: 10.1007/s11010-008-9973-4. PubMed PMID: 19082544.

46. Klase Z, Winograd R, Davis J, Carpio L, Hildreth R, Heydarian M, et al. HIV-1 TAR
miRNA protects against apoptosis by altering cellular gene expression. Retrovirology.
2009;6:18. Epub 2009/02/18. doi: 10.1186/1742-4690-6-18. PubMed PMID: 19220914; PubMed
Central PMCID: PMC2654423.

47. Ouellet DL, Plante I, Landry P, Barat C, Janelle ME, Flamand L, et al. Identification of
functional microRNAs released through asymmetrical processing of HIV-1 TAR element.
Nucleic acids research. 2008;36(7):2353-65. Epub 2008/02/27. doi: 10.1093/nar/gkn076.
PubMed PMID: 18299284; PubMed Central PMCID: PMC2367715.

48. Yeung ML, Bennasser Y, Watashi K, Le SY, Houzet L, Jeang KT. Pyrosequencing of
small non-coding RNAs in HIV-1 infected cells: evidence for the processing of a viral-cellular
double-stranded RNA hybrid. Nucleic acids research. 2009;37(19):6575-86. Epub 2009/09/05.
doi: 10.1093/nar/gkp707. PubMed PMID: 19729508; PubMed Central PMCID: PMC2770672.

49. Schopman NC, Willemsen M, Liu YP, Bradley T, van Kampen A, Baas F, et al. Deep
sequencing of virus-infected cells reveals HIV-encoded small RNAs. Nucleic acids research.
2012;40(1):414-27. Epub 2011/09/14. doi: 10.1093/nar/gkr719. PubMed PMID: 21911362;
PubMed Central PMCID: PMC3245934.

50. Hakim ST, Alsayari M, McLean DC, Saleem S, Addanki KC, Aggarwal M, et al. A large
number of the human microRNAs target lentiviruses, retroviruses, and endogenous retroviruses.
Biochemical and biophysical research communications. 2008;369(2):357-62. Epub 2008/02/20.
doi: 10.1016/j.bbrc.2008.02.025. PubMed PMID: 18282469.

51. Rasheed S, Yan JS, Lau A, Chan AS. HIV replication enhances production of free fatty
acids, low density lipoproteins and many key proteins involved in lipid metabolism: a proteomics
study. PloS one. 2008;3(8):e3003. Epub 2008/08/21. doi: 10.1371/journal.pone.0003003.
PubMed PMID: 18714345; PubMed Central PMCID: PMC2500163.

52. Rasheed S, Yan JS, Hussain A, Lai B. Proteomic characterization of HIV-modulated
membrane receptors, kinases and signaling proteins involved in novel angiogenic pathways.
Journal of translational medicine. 2009;7:75. Epub 2009/08/29. doi: 10.1186/1479-5876-7-75.
PubMed PMID: 19712456; PubMed Central PMCID: PMC2754444.

53. Gong TW, Besirli CG, Lomax MI. MACF1 gene structure: a hybrid of plectin and
dystrophin. Mammalian genome : official journal of the International Mammalian Genome
Society. 2001;12(11):852-61. Epub 2002/02/15. PubMed PMID: 11845288.

54. Wang W, Guo J, Yu D, Vorster PJ, Chen W, Wu Y. A Dichotomy in Cortical Actin and
Chemotactic Actin Activity between Human Memory and Naive T Cells Contributes to Their
Differential Susceptibility to HIV-1 Infection. The Journal of biological chemistry.
2012;287(42):35455-69. Epub 2012/08/11. doi: 10.1074/jbc.M112.362400. PubMed PMID:
22879601; PubMed Central PMCID: PMC3471682.

55. Boeras DI, Hraber PT, Hurlston M, Evans-Strickfaden T, Bhattacharya T, Giorgi EE, et
al. Role of donor genital tract HIV-1 diversity in the transmission bottleneck. Proceedings of the
National Academy of Sciences of the United States of America. 2011;108(46):E1156-63. Epub
2011/11/09. doi: 10.1073/pnas.1103764108. PubMed PMID: 22065783; PubMed Central
PMCID: PMC3219102.

56. Flavin RJ, Smyth PC, Laios A, O'Toole SA, Barrett C, Finn SP, et al. Potentially
important microRNA cluster on chromosome 17p13.1 in primary peritoneal carcinoma. Modern
pathology : an official journal of the United States and Canadian Academy of Pathology, Inc.
2009;22(2):197-205. Epub 2008/08/05. doi: 10.1038/modpathol.2008.135. PubMed PMID:
18677302.

57. Sun G, Rossi JJ. MicroRNAs and their potential involvement in HIV infection. Trends in
pharmacological sciences. 2011;32(11):675-81. Epub 2011/08/25. doi:
10.1016/j.tips.2011.07.003. PubMed PMID: 21862142; PubMed Central PMCID: PMC3200488.

58. Li D, Zhao Y, Liu C, Chen X, Qi Y, Jiang Y, et al. Analysis of MiR-195 and MiR-497
expression, regulation and role in breast cancer. Clinical cancer research : an official journal of
the American Association for Cancer Research. 2011;17(7):1722-30. Epub 2011/02/26. doi:
10.1158/1078-0432.CCR-10-1800. PubMed PMID: 21350001.

59. Zhu H, Yang Y, Wang Y, Li J, Schiller PW, Peng T. MicroRNA-195 promotes palmitate-
induced apoptosis in cardiomyocytes by down-regulating Sirt1. Cardiovascular research.
2011;92(1):75-84. Epub 2011/05/31. doi: 10.1093/cvr/cvr145. PubMed PMID: 21622680.

60. Zhu HC, Wang LM, Wang M, Song B, Tan S, Teng JF, et al. MicroRNA-195
downregulates Alzheimer's disease amyloid-beta production by targeting BACE1. Brain research
bulletin. 2012. Epub 2012/06/23. doi: 10.1016/j.brainresbull.2012.05.018. PubMed PMID:
22721728.

61. Chen H, Untiveros GM, McKee LA, Perez J, Li J, Antin PB, et al. Micro-RNA-195 and -
451 regulate the LKB1/AMPK signaling axis by targeting MO25. PloS one. 2012;7(7):e41574.
Epub 2012/07/31. doi: 10.1371/journal.pone.0041574. PubMed PMID: 22844503; PubMed
Central PMCID: PMC3402395.

62. Lewis BP, Burge CB, Bartel DP. Conserved seed pairing, often flanked by adenosines,
indicates that thousands of human genes are microRNA targets. Cell. 2005;120(1):15-20. Epub
2005/01/18. doi: 10.1016/j.cell.2004.12.035. PubMed PMID: 15652477.

63. Bennasser Y, Yeung ML, Jeang KT. HIV-1 TAR RNA subverts RNA interference in
transfected cells through sequestration of TAR RNA-binding protein, TRBP. The Journal of
biological chemistry. 2006;281(38):27674-8. Epub 2006/08/05. doi: 10.1074/jbc.C600072200.
PubMed PMID: 16887810.

64. Pantophlet R, Burton DR. GP120: target for neutralizing HIV-1 antibodies. Annual
review of immunology. 2006;24:739-69. Epub 2006/03/23. doi:
10.1146/annurev.immunol.24.021605.090557. PubMed PMID: 16551265.

65. Cheng-Mayer C, Quiroga M, Tung JW, Dina D, Levy JA. Viral determinants of human
immunodeficiency virus type 1 T-cell or macrophage tropism, cytopathogenicity, and CD4
antigen modulation. Journal of virology. 1990;64(9):4390-8. Epub 1990/09/01. PubMed PMID:
2384920; PubMed Central PMCID: PMC247907.

66. Morikita T, Maeda Y, Fujii S, Matsushita S, Obaru K, Takatsuki K. The V1/V2 region of
human immunodeficiency virus type 1 modulates the sensitivity to neutralization by soluble CD4
and cellular tropism. AIDS research and human retroviruses. 1997;13(15):1291-9. Epub
1997/10/27. PubMed PMID: 9339846.

67. van Gils MJ, Bunnik EM, Boeser-Nunnink BD, Burger JA, Terlouw-Klein M, Verwer N,
et al. Longer V1V2 region with increased number of potential N-linked glycosylation sites in the
HIV-1 envelope glycoprotein protects against HIV-specific neutralizing antibodies. Journal of
virology. 2011;85(14):6986-95. Epub 2011/05/20. doi: 10.1128/JVI.00268-11. PubMed PMID:
21593147; PubMed Central PMCID: PMC3126602.

68. Rasheed S, Norman GL, Gill PS, Meyer PR, Cheng L, Levine AM. Virus-neutralizing
activity, serologic heterogeneity, and retrovirus isolation from homosexual men in the Los
Angeles area. Virology. 1986;150(1):1-9. Epub 1986/04/15. PubMed PMID: 3006329.

69. Meissner EG, Coffield VM, Su L. Thymic pathogenicity of an HIV-1 envelope is
associated with increased CXCR4 binding efficiency and V5-gp41-dependent activity, but not
V1/V2-associated CD4 binding efficiency and viral entry. Virology. 2005;336(2):184-97. Epub
2005/05/17. doi: 10.1016/j.virol.2005.03.032. PubMed PMID: 15892960.

70. Antonov AV, Dietmann S, Wong P, Lutter D, Mewes HW. GeneSet2miRNA: finding the
signature of cooperative miRNA activities in the gene lists. Nucleic acids research. 2009;37(Web
Server issue):W323-8. Epub 2009/05/08. doi: 10.1093/nar/gkp313. PubMed PMID: 19420064;
PubMed Central PMCID: PMC2703952.

71. Karlin S, Altschul SF. Methods for assessing the statistical significance of molecular
sequence features by using general scoring schemes. Proceedings of the National Academy of
Sciences of the United States of America. 1990;87(6):2264-8. Epub 1990/03/01. PubMed PMID:
2315319; PubMed Central PMCID: PMC53667.

72. Higgins DG, Sharp PM. Fast and sensitive multiple sequence alignments on a
microcomputer. Computer applications in the biosciences : CABIOS. 1989;5(2):151-3. Epub
1989/04/01. PubMed PMID: 2720464.

73. Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, et al.
Clustal W and Clustal X version 2.0. Bioinformatics. 2007;23(21):2947-8. Epub 2007/09/12. doi:
10.1093/bioinformatics/btm404. PubMed PMID: 17846036.

74. Kim VN. MicroRNA biogenesis: coordinated cropping and dicing. Nat Rev Mol Cell
Biol. 2005;6(5):376-85. doi: 10.1038/nrm1644. PubMed PMID: 15852042.

75. Zamore PD, Haley B. Ribo-gnome: the big world of small RNAs. Science.
2005;309(5740):1519-24. doi: 10.1126/science.1111444. PubMed PMID: 16141061.

76. Grey F, Meyers H, White EA, Spector DH, Nelson J. A human cytomegalovirus-encoded
microRNA regulates expression of multiple viral genes involved in replication. PLoS Pathog.
2007;3(11):e163. doi: 10.1371/journal.ppat.0030163. PubMed PMID: 17983268; PubMed
Central PMCID: PMC2048532.

77. Knight SW, Bass BL. A role for the RNase III enzyme DCR-1 in RNA interference and
germ line development in Caenorhabditis elegans. Science. 2001;293(5538):2269-71. doi:
10.1126/science.1062039. PubMed PMID: 11486053; PubMed Central PMCID:
PMC1855227.

78. Kozomara A, Griffiths-Jones S. miRBase: integrating microRNA annotation and
deep-sequencing data. Nucleic acids research. 2011;39(Database issue):D152-7. doi:
10.1093/nar/gkq1027. PubMed PMID: 21037258; PubMed Central PMCID: PMC3013655.

79. Friedman RC, Farh KK, Burge CB, Bartel DP. Most mammalian mRNAs are conserved
targets of microRNAs. Genome Res. 2009;19(1):92-105. doi: 10.1101/gr.082701.108. PubMed
PMID: 18955434; PubMed Central PMCID: PMC2612969.

80. Younger ST, Corey DR. Transcriptional gene silencing in mammalian cells by miRNA
mimics that target gene promoters. Nucleic acids research. 2011;39(13):5682-91. doi:
10.1093/nar/gkr155. PubMed PMID: 21427083; PubMed Central PMCID: PMC3141263.

81. Barbato C, Arisi I, Frizzo ME, Brandi R, Da Sacco L, Masotti A. Computational
challenges in miRNA target predictions: to be or not to be a true target? J Biomed Biotechnol.
2009;2009:803069. doi: 10.1155/2009/803069. PubMed PMID: 19551154; PubMed Central
PMCID: PMC2699446.

82. Sashital DG, Doudna JA. Structural insights into RNA interference. Curr Opin Struct
Biol. 2010;20(1):90-7. doi: 10.1016/j.sbi.2009.12.001. PubMed PMID: 20053548; PubMed
Central PMCID: PMC2855239.

83. Davis BN, Hata A. Regulation of MicroRNA Biogenesis: A miRiad of mechanisms. Cell
Commun Signal. 2009;7:18. doi: 10.1186/1478-811X-7-18. PubMed PMID: 19664273; PubMed
Central PMCID: PMC3224893.

84. Sayed D, Abdellatif M. MicroRNAs in development and disease. Physiol Rev.
2011;91(3):827-87. doi: 10.1152/physrev.00006.2010. PubMed PMID: 21742789.

85. Kumar A. RNA interference: a multifaceted innate antiviral defense. Retrovirology.
2008;5:17. doi: 10.1186/1742-4690-5-17. PubMed PMID: 18241347; PubMed Central PMCID:
PMC2259359.

86. Muller S, Imler JL. Dicing with viruses: microRNAs as antiviral factors. Immunity.
2007;27(1):1-3. doi: 10.1016/j.immuni.2007.07.003. PubMed PMID: 17663977.

87. Whisnant AW, Bogerd HP, Flores O, Ho P, Powers JG, Sharova N, et al. In-depth
analysis of the interaction of HIV-1 with cellular microRNA biogenesis and effector
mechanisms. MBio. 2013;4(2):e000193. doi: 10.1128/mBio.00193-13. PubMed PMID:
23592263; PubMed Central PMCID: PMC3634607.

88. Chiang K, Liu H, Rice AP. miR-132 enhances HIV-1 replication. Virology.
2013;438(1):1-4. doi: 10.1016/j.virol.2012.12.016. PubMed PMID: 23357732; PubMed Central
PMCID: PMC3594373.

89. Kapoor R, Arora S, Ponia SS, Kumar B, Maddika S, Banerjea AC. The miRNA miR-34a
enhances HIV-1 replication by targeting PNUTS/PPP1R10, which negatively regulates HIV-1
transcriptional complex formation. Biochem J. 2015;470(3):293-302. doi: 10.1042/BJ20150700.
PubMed PMID: 26188041.

90. Quaranta MT, Olivetta E, Sanchez M, Spinello I, Paolillo R, Arenaccio C, et al. miR-
146a controls CXCR4 expression in a pathway that involves PLZF and can be used to inhibit
HIV-1 infection of CD4(+) T lymphocytes. Virology. 2015;478:27-38. doi:
10.1016/j.virol.2015.01.016. PubMed PMID: 25705792.

91. Orecchini E, Doria M, Michienzi A, Giuliani E, Vassena L, Ciafre SA, et al. The HIV-1
Tat protein modulates CD4 expression in human T cells through the induction of miR-222. RNA
Biol. 2014;11(4):334-8. doi: 10.4161/rna.28372. PubMed PMID: 24717285; PubMed Central
PMCID: PMC4075518.

92. Patel P, Ansari MY, Bapat S, Thakar M, Gangakhedkar R, Jameel S. The microRNA
miR-29a is associated with human immunodeficiency virus latency. Retrovirology. 2014;11:108.
doi: 10.1186/s12977-014-0108-6. PubMed PMID: 25486977; PubMed Central PMCID:
PMC4269869.

93. Wang P, Qu X, Zhou X, Shen Y, Ji H, Fu Z, et al. Two cellular microRNAs, miR-196b
and miR-1290, contribute to HIV-1 latency. Virology. 2015;486:228-38. doi:
10.1016/j.virol.2015.09.016. PubMed PMID: 26469550.

94. Ruelas DS, Chan JK, Oh E, Heidersbach AJ, Hebbeler AM, Chavez L, et al.
MicroRNA-155 Reinforces HIV Latency. The Journal of biological chemistry.
2015;290(22):13736-48. doi: 10.1074/jbc.M115.641837. PubMed PMID: 25873391; PubMed
Central PMCID: PMC4447952.

95. Chiang K, Rice AP. MicroRNA-mediated restriction of HIV-1 in resting CD4+ T cells
and monocytes. Viruses. 2012;4(9):1390-409. doi: 10.3390/v4091390. PubMed PMID:
23170164; PubMed Central PMCID: PMC3499811.

96. Zhou Y, Sun L, Wang X, Liang H, Ye L, Zhou L, et al. Short Communication: HIV-1
Infection Suppresses Circulating Viral Restriction microRNAs. AIDS research and human
retroviruses. 2016;32(4):386-9. doi: 10.1089/AID.2015.0253. PubMed PMID: 26607272;
PubMed Central PMCID: PMC4817567.

97. Chen AK, Sengupta P, Waki K, Van Engelenburg SB, Ochiya T, Ablan SD, et al.
MicroRNA binding to the HIV-1 Gag protein inhibits Gag assembly and virus production.
Proceedings of the National Academy of Sciences of the United States of America.
2014;111(26):E2676-83. doi: 10.1073/pnas.1408037111. PubMed PMID: 24938790; PubMed
Central PMCID: PMC4084429.

98. Casey Klockow L, Sharifi HJ, Wen X, Flagg M, Furuya AK, Nekorchuk M, et al. The
HIV-1 protein Vpr targets the endoribonuclease Dicer for proteasomal degradation to boost
macrophage infection. Virology. 2013;444(1-2):191-202. doi: 10.1016/j.virol.2013.06.010.
PubMed PMID: 23849790; PubMed Central PMCID: PMC3755019.

99. Flor TB, Blom B. Pathogens Use and Abuse MicroRNAs to Deceive the Immune System.
Int J Mol Sci. 2016;17(4). doi: 10.3390/ijms17040538. PubMed PMID: 27070595; PubMed
Central PMCID: PMC4848994.

100. Lanford RE, Hildebrandt-Eriksen ES, Petri A, Persson R, Lindow M, Munk ME, et al.
Therapeutic silencing of microRNA-122 in primates with chronic hepatitis C virus infection.
Science. 2010;327(5962):198-201. doi: 10.1126/science.1178178. PubMed PMID: 19965718;
PubMed Central PMCID: PMC3436126.

101. van der Ree MH, van der Meer AJ, de Bruijne J, Maan R, van Vliet A, Welzel TM, et al.
Long-term safety and efficacy of microRNA-targeted therapy in chronic hepatitis C patients.
Antiviral Res. 2014;111:53-9. doi: 10.1016/j.antiviral.2014.08.015. PubMed PMID: 25218783.

102. Holland B, Wong J, Li M, Rasheed S. Identification of human microRNA-like sequences
embedded within the protein-encoding genes of the human immunodeficiency virus. PloS one.
2013;8(3):e58586. doi: 10.1371/journal.pone.0058586. PubMed PMID: 23520522; PubMed
Central PMCID: PMC3592801.

103. Bernard MA, Zhao H, Yue SC, Anandaiah A, Koziel H, Tachado SD. Novel HIV-1
miRNAs stimulate TNFalpha release in human macrophages via TLR8 signaling pathway. PloS
one. 2014;9(9):e106006. doi: 10.1371/journal.pone.0106006. PubMed PMID: 25191859;
PubMed Central PMCID: PMC4156304.

104. Kozomara A, Griffiths-Jones S. miRBase: annotating high confidence microRNAs using
deep sequencing data. Nucleic acids research. 2014;42(Database issue):D68-73. doi:
10.1093/nar/gkt1181. PubMed PMID: 24275495; PubMed Central PMCID: PMC3965103.

105. Zhang Y, Fan M, Geng G, Liu B, Huang Z, Luo H, et al. A novel HIV-1-encoded
microRNA enhances its viral replication by targeting the TATA box region. Retrovirology.
2014;11:23. doi: 10.1186/1742-4690-11-23. PubMed PMID: 24620741; PubMed Central
PMCID: PMC4007588.

106. Omoto S, Ito M, Tsutsumi Y, Ichikawa Y, Okuyama H, Brisibe EA, et al. HIV-1 nef
suppression by virally encoded microRNA. Retrovirology. 2004;1:44. Epub 2004/12/17. doi:
10.1186/1742-4690-1-44. PubMed PMID: 15601474; PubMed Central PMCID: PMC544868.

107. Ruby JG, Jan CH, Bartel DP. Intronic microRNA precursors that bypass Drosha
processing. Nature. 2007;448(7149):83-6. doi: 10.1038/nature05983. PubMed PMID: 17589500;
PubMed Central PMCID: PMC2475599.

108. Berezikov E, Chung WJ, Willis J, Cuppen E, Lai EC. Mammalian mirtron genes.
Molecular cell. 2007;28(2):328-36. doi: 10.1016/j.molcel.2007.09.028. PubMed PMID:
17964270; PubMed Central PMCID: PMC2763384.

109. Ladewig E, Okamura K, Flynt AS, Westholm JO, Lai EC. Discovery of hundreds of
mirtrons in mouse and human small RNA data. Genome Res. 2012;22(9):1634-45. doi:
10.1101/gr.133553.111. PubMed PMID: 22955976; PubMed Central PMCID:
PMC3431481.

110. Harwig A, Jongejan A, van Kampen AH, Berkhout B, Das AT. Tat-dependent production
of an HIV-1 TAR-encoded miRNA-like small RNA. Nucleic acids research.
2016;44(9):4340-53. doi: 10.1093/nar/gkw167. PubMed PMID: 26984525; PubMed Central
PMCID: PMC4872094.

111. He JF, Luo YM, Wan XH, Jiang D. Biogenesis of MiRNA-195 and its role in biogenesis,
the cell cycle, and apoptosis. J Biochem Mol Toxicol. 2011;25(6):404-8. doi: 10.1002/jbt.20396.
PubMed PMID: 22190509.

112. Grimson A, Farh KK, Johnston WK, Garrett-Engele P, Lim LP, Bartel DP. MicroRNA
targeting specificity in mammals: determinants beyond seed pairing. Molecular cell.
2007;27(1):91-105. doi: 10.1016/j.molcel.2007.06.017. PubMed PMID: 17612493; PubMed
Central PMCID: PMC3800283.

113. Yedavalli VS, Neuveut C, Chi YH, Kleiman L, Jeang KT. Requirement of DDX3 DEAD
box RNA helicase for HIV-1 Rev-RRE export function. Cell. 2004;119(3):381-92. doi:
10.1016/j.cell.2004.09.029. PubMed PMID: 15507209.

114. Shen CJ, Jia YH, Tian RR, Ding M, Zhang C, Wang JH. Translation of Pur-alpha is
targeted by cellular miRNAs to modulate the differentiation-dependent susceptibility of
monocytes to HIV-1 infection. FASEB J. 2012;26(11):4755-64. doi: 10.1096/fj.12-209023.
PubMed PMID: 22835829.

115. Suzuki K, Ahlenstiel C, Marks K, Kelleher AD. Promoter Targeting RNAs: Unexpected
Contributors to the Control of HIV-1 Transcription. Mol Ther Nucleic Acids. 2015;4:e222. doi:
10.1038/mtna.2014.67. PubMed PMID: 25625613; PubMed Central PMCID:
PMC4345301.

116. Zhang Y, Fan M, Zhang X, Huang F, Wu K, Zhang J, et al. Cellular microRNAs
up-regulate transcription via interaction with promoter TATA-box motifs. RNA.
2014;20(12):1878-89. doi: 10.1261/rna.045633.114. PubMed PMID: 25336585; PubMed Central
PMCID: PMC4238354.

117. Shi J, Duan Z, Sun J, Wu M, Wang B, Zhang J, et al. Identification and validation of a
novel microRNA-like molecule derived from a cytoplasmic RNA virus antigenome by
bioinformatics and experimental approaches. Virol J. 2014;11:121. doi:
10.1186/1743-422X-11-121. PubMed PMID: 24981144; PubMed Central PMCID: PMC4087238.

118. Kato K, Senoki T, Takaku H. Inhibition of HIV-1 replication by RNA with a microRNA-
like function. Int J Mol Med. 2013;31(1):252-8. doi: 10.3892/ijmm.2012.1170. PubMed PMID:
23128354.

119. Zhang T, Cheng T, Wei L, Cai Y, Yeo AE, Han J, et al. Efficient inhibition of HIV-1
replication by an artificial polycistronic miRNA construct. Virol J. 2012;9:118. doi:
10.1186/1743-422X-9-118. PubMed PMID: 22709537; PubMed Central PMCID:
PMC3416660.

120. Narayanan A, Iordanskiy S, Das R, Van Duyne R, Santos S, Jaworski E, et al. Exosomes
derived from HIV-1-infected cells contain trans-activation response element RNA. The Journal
of biological chemistry. 2013;288(27):20014-33. doi: 10.1074/jbc.M112.438895. PubMed
PMID: 23661700; PubMed Central PMCID: PMC3707700.

121. Roth WW, Huang MB, Addae Konadu K, Powell MD, Bond VC. Micro RNA in
Exosomes from HIV-Infected Macrophages. Int J Environ Res Public Health.
2016;13(1):ijerph13010032. doi: 10.3390/ijerph13010032. PubMed PMID: 26703692; PubMed
Central PMCID: PMC4730423.

122. Aqil M, Naqvi AR, Mallik S, Bandyopadhyay S, Maulik U, Jameel S. The HIV Nef
protein modulates cellular and exosomal miRNA profiles in human monocytic cells. J Extracell
Vesicles. 2014;3. doi: 10.3402/jev.v3.23129. PubMed PMID: 24678387; PubMed Central
PMCID: PMC3967016.

123. Zhang B, Pan X, Cobb GP, Anderson TA. microRNAs as oncogenes and tumor
suppressors. Dev Biol. 2007;302(1):1-12. doi: 10.1016/j.ydbio.2006.08.028. PubMed PMID:
16989803.

124. Calin GA, Sevignani C, Dumitru CD, Hyslop T, Noch E, Yendamuri S, et al. Human
microRNA genes are frequently located at fragile sites and genomic regions involved in cancers.
Proceedings of the National Academy of Sciences of the United States of America.
2004;101(9):2999-3004. doi: 10.1073/pnas.0307323101. PubMed PMID: 14973191; PubMed
Central PMCID: PMC365734.

125. Singh R, Saini N. Downregulation of BCL2 by miRNAs augments drug-induced
apoptosis--a combined computational and experimental approach. J Cell Sci. 2012;125(Pt
6):1568-78. doi: 10.1242/jcs.095976. PubMed PMID: 22328513.

126. Wang Y, Zhang X, Zou C, Kung HF, Lin MC, Dress A, et al. miR-195 inhibits tumor
growth and angiogenesis through modulating IRS1 in breast cancer. Biomed Pharmacother.
2016;80:95-101. doi: 10.1016/j.biopha.2016.03.007. PubMed PMID: 27133044.

127. Singh R, Yadav V, Kumar S, Saini N. MicroRNA-195 inhibits proliferation, invasion and
metastasis in breast cancer cells by targeting FASN, HMGCR, ACACA and CYP27B1. Sci Rep.
2015;5:17454. doi: 10.1038/srep17454. PubMed PMID: 26632252; PubMed Central PMCID:
PMC4668367.

128. Zhang X, Tao T, Liu C, Guan H, Huang Y, Xu B, et al. Downregulation of miR-195
promotes prostate cancer progression by targeting HMGA1. Oncol Rep. 2016. doi:
10.3892/or.2016.4797. PubMed PMID: 27175617.

129. Wang M, Zhang J, Tong L, Ma X, Qiu X. MiR-195 is a key negative regulator of
hepatocellular carcinoma metastasis by targeting FGF2 and VEGFA. Int J Clin Exp Pathol.
2015;8(11):14110-20. PubMed PMID: 26823724; PubMed Central PMCID: PMC4713510.

130. Dong-Xu W, Jia L, Su-Juan Z. MicroRNA-185 is a novel tumor suppressor by negatively
modulating the Wnt/beta-catenin pathway in human colorectal cancer. Indian J Cancer. 2015;52
Suppl 3:E182-5. doi: 10.4103/0019-509X.186576. PubMed PMID: 27453420.

131. Tang H, Liu P, Yang L, Xie X, Ye F, Wu M, et al. miR-185 suppresses tumor
proliferation by directly targeting E2F6 and DNMT1 and indirectly upregulating BRCA1 in
triple-negative breast cancer. Mol Cancer Ther. 2014;13(12):3185-97. doi:
10.1158/1535-7163.MCT-14-0243. PubMed PMID: 25319390.

132. Qu F, Cui X, Hong Y, Wang J, Li Y, Chen L, et al. MicroRNA-185 suppresses
proliferation, invasion, migration, and tumorigenicity of human prostate cancer cells through
targeting androgen receptor. Molecular and cellular biochemistry. 2013;377(1-2):121-30. doi:
10.1007/s11010-013-1576-z. PubMed PMID: 23417242.

133. Li S, Ma Y, Hou X, Liu Y, Li K, Xu S, et al. MiR-185 acts as a tumor suppressor by
targeting AKT1 in non-small cell lung cancer cells. Int J Clin Exp Pathol. 2015;8(9):11854-62.
PubMed PMID: 26617940; PubMed Central PMCID: PMC4637756.

134. Nam JW, Rissland OS, Koppstein D, Abreu-Goodger C, Jan CH, Agarwal V, et al.
Global analyses of the effect of different cellular contexts on microRNA targeting. Molecular
cell. 2014;53(6):1031-43. doi: 10.1016/j.molcel.2014.02.013. PubMed PMID: 24631284;
PubMed Central PMCID: PMC4062300.

135. Persson H, Kvist A, Rego N, Staaf J, Vallon-Christersson J, Luts L, et al. Identification of
new microRNAs in paired normal and tumor breast tissue suggests a dual role for the
ERBB2/Her2 gene. Cancer Res. 2011;71(1):78-86. doi: 10.1158/0008-5472.CAN-10-1869.
PubMed PMID: 21199797.

136. Madhavan B, Yue S, Galli U, Rana S, Gross W, Muller M, et al. Combined evaluation of
a panel of protein and miRNA serum-exosome biomarkers for pancreatic cancer diagnosis
increases sensitivity and specificity. Int J Cancer. 2015;136(11):2616-27. doi: 10.1002/ijc.29324.
PubMed PMID: 25388097.

137. Shen Y, Pan Y, Xu L, Chen L, Liu L, Chen H, et al. Identifying microRNA-mRNA
regulatory network in gemcitabine-resistant cells derived from human pancreatic cancer cells.
Tumour Biol. 2015;36(6):4525-34. doi: 10.1007/s13277-015-3097-8. PubMed PMID: 25722110.

138. Fields BN, Knipe DM, Howley PM. Fields virology. 5th ed. Philadelphia: Wolters
Kluwer Health/Lippincott Williams & Wilkins; 2007.

139. Checkley MA, Luttge BG, Freed EO. HIV-1 envelope glycoprotein biosynthesis,
trafficking, and incorporation. Journal of molecular biology. 2011;410(4):582-608. doi:
10.1016/j.jmb.2011.04.042. PubMed PMID: 21762802; PubMed Central PMCID:
PMC3139147.

140. McCune JM, Rabin LB, Feinberg MB, Lieberman M, Kosek JC, Reyes GR, et al.
Endoproteolytic cleavage of gp160 is required for the activation of human immunodeficiency
virus. Cell. 1988;53(1):55-67. PubMed PMID: 2450679.

141. Moulard M, Hallenberger S, Garten W, Klenk HD. Processing and routage of HIV
glycoproteins by furin to the cell surface. Virus Res. 1999;60(1):55-65. PubMed PMID:
10225274.

142. Moulard M, Decroly E. Maturation of HIV envelope glycoprotein precursors by cellular
endoproteases. Biochimica et biophysica acta. 2000;1469(3):121-32. PubMed PMID: 11063880.

143. Molloy SS, Anderson ED, Jean F, Thomas G. Bi-cycling the furin pathway: from TGN
localization to pathogen activation and embryogenesis. Trends Cell Biol. 1999;9(1):28-35.
PubMed PMID: 10087614.

144. Guilhaudis L, Jacobs A, Caffrey M. Solution structure of the HIV gp120 C5 domain. Eur
J Biochem. 2002;269(19):4860-7. PubMed PMID: 12354117.

145. Leis J, Baltimore D, Bishop JM, Coffin J, Fleissner E, Goff SP, et al. Standardized and
simplified nomenclature for proteins common to all retroviruses. Journal of virology.
1988;62(5):1808-9. PubMed PMID: 3357211; PubMed Central PMCID: PMC253234.

146. Hallenberger S, Bosch V, Angliker H, Shaw E, Klenk HD, Garten W. Inhibition of furin-
mediated cleavage activation of HIV-1 glycoprotein gp160. Nature. 1992;360(6402):358-61. doi:
10.1038/360358a0. PubMed PMID: 1360148.

147. Decroly E, Vandenbranden M, Ruysschaert JM, Cogniaux J, Jacob GS, Howard SC, et al.
The convertases furin and PC1 can both cleave the human immunodeficiency virus (HIV)-1
envelope glycoprotein gp160 into gp120 (HIV-1 SU) and gp41 (HIV-I TM). The Journal of
biological chemistry. 1994;269(16):12240-7. PubMed PMID: 8163529.

148. Molloy SS, Thomas L, VanSlyke JK, Stenberg PE, Thomas G. Intracellular trafficking
and activation of the furin proprotein convertase: localization to the TGN and recycling from the
cell surface. EMBO J. 1994;13(1):18-33. PubMed PMID: 7508380; PubMed Central PMCID:
PMC394775.

149. Nakayama K. Furin: a mammalian subtilisin/Kex2p-like endoprotease involved in
processing of a wide variety of precursor proteins. Biochem J. 1997;327(Pt 3):625-35. PubMed
PMID: 9599222; PubMed Central PMCID: PMC1218878.

150. Garten W, Hallenberger S, Ortmann D, Schafer W, Vey M, Angliker H, et al. Processing
of viral glycoproteins by the subtilisin-like endoprotease furin and its inhibition by specific
peptidylchloroalkylketones. Biochimie. 1994;76(3-4):217-25. PubMed PMID: 7819326.

151. Ohnishi Y, Shioda T, Nakayama K, Iwata S, Gotoh B, Hamaguchi M, et al. A furin-defective
cell line is able to process correctly the gp160 of human immunodeficiency virus type 1.
Journal of virology. 1994;68(6):4075-9. PubMed PMID: 8189547; PubMed Central PMCID:
PMC236921.

152. Fenouillet E, Gluckman JC. Immunological analysis of human immunodeficiency virus
type 1 envelope glycoprotein proteolytic cleavage. Virology. 1992;187(2):825-8. PubMed
PMID: 1372142.

153. Klenk HD, Rott R. The molecular biology of influenza virus pathogenicity. Adv Virus
Res. 1988;34:247-81. PubMed PMID: 3046255.

154. Hinske LC, Galante PA, Kuo WP, Ohno-Machado L. A potential role for intragenic
miRNAs on their hosts' interactome. BMC genomics. 2010;11:533. doi: 10.1186/1471-2164-11-533.
PubMed PMID: 20920310; PubMed Central PMCID: PMC3091682.

155. Li H, Wang W, Zhang L, Lan Q, Wang J, Cao Y, et al. Identification of a Long
Noncoding RNA-Associated Competing Endogenous RNA Network in Intracranial Aneurysm.
World Neurosurg. 2016. doi: 10.1016/j.wneu.2016.10.016. PubMed PMID: 27751926.

156. Du A, Zhao S, Wan L, Liu T, Peng Z, Zhou Z, et al. MicroRNA expression profile of
human periodontal ligament cells under the influence of Porphyromonas gingivalis LPS. J Cell
Mol Med. 2016;20(7):1329-38. doi: 10.1111/jcmm.12819. PubMed PMID: 26987780; PubMed
Central PMCID: PMC4929301.

157. Backes C, Haas J, Leidinger P, Frese K, Grossmann T, Ruprecht K, et al. miFRame:
analysis and visualization of miRNA sequencing data in neurological disorders. Journal of
translational medicine. 2015;13:224. doi: 10.1186/s12967-015-0594-x. PubMed PMID:
26169944; PubMed Central PMCID: PMC4501052.

158. Wang L, Mukherjee S, Jia F, Narayan O, Zhao LJ. Interaction of virion protein Vpr of
human immunodeficiency virus type 1 with cellular transcription factor Sp1 and trans-activation
of viral long terminal repeat. The Journal of biological chemistry. 1995;270(43):25564-9.
PubMed PMID: 7592727.

159. Dalgleish AG, Beverley PC, Clapham PR, Crawford DH, Greaves MF, Weiss RA. The
CD4 (T4) antigen is an essential component of the receptor for the AIDS retrovirus. Nature.
1984;312(5996):763-7. PubMed PMID: 6096719.

160. Klatzmann D, Champagne E, Chamaret S, Gruest J, Guetard D, Hercend T, et al.
T-lymphocyte T4 molecule behaves as the receptor for human retrovirus LAV. Nature.
1984;312(5996):767-8. PubMed PMID: 6083454.

161. Sweet RW, Truneh A, Hendrickson WA. CD4: its structure, role in immune function and
AIDS pathogenesis, and potential as a pharmacological target. Curr Opin Biotechnol.
1991;2(4):622-33. PubMed PMID: 1367682.

162. Maddon PJ, Littman DR, Godfrey M, Maddon DE, Chess L, Axel R. The isolation and
nucleotide sequence of a cDNA encoding the T cell surface protein T4: a new member of the
immunoglobulin gene family. Cell. 1985;42(1):93-104. PubMed PMID: 2990730.

163. Shaw AS, Amrein KE, Hammond C, Stern DF, Sefton BM, Rose JK. The lck tyrosine
protein kinase interacts with the cytoplasmic tail of the CD4 glycoprotein through its unique
amino-terminal domain. Cell. 1989;59(4):627-36. PubMed PMID: 2582490.

164. Doyle C, Strominger JL. Interaction between CD4 and class II MHC molecules mediates
cell adhesion. Nature. 1987;330(6145):256-9. doi: 10.1038/330256a0. PubMed PMID: 2823150.

165. Cammarota G, Scheirle A, Takacs B, Doran DM, Knorr R, Bannwarth W, et al.
Identification of a CD4 binding site on the beta 2 domain of HLA-DR molecules. Nature.
1992;356(6372):799-801. doi: 10.1038/356799a0. PubMed PMID: 1574119.

166. Gay D, Maddon P, Sekaly R, Talle MA, Godfrey M, Long E, et al. Functional interaction
between human T-cell protein CD4 and the major histocompatibility complex HLA-DR antigen.
Nature. 1987;328(6131):626-9. doi: 10.1038/328626a0. PubMed PMID: 3112582.

167. Gottlieb MS, Schroff R, Schanker HM, Weisman JD, Fan PT, Wolf RA, et al.
Pneumocystis carinii pneumonia and mucosal candidiasis in previously healthy homosexual
men: evidence of a new acquired cellular immunodeficiency. N Engl J Med.
1981;305(24):1425-31. doi: 10.1056/NEJM198112103052401. PubMed PMID: 6272109.

168. Levy JA. HIV and the pathogenesis of AIDS. Washington, DC: ASM Press; 2007.

169. Maddon PJ, Dalgleish AG, McDougal JS, Clapham PR, Weiss RA, Axel R. The T4 gene
encodes the AIDS virus receptor and is expressed in the immune system and the brain. Cell.
1986;47(3):333-48. PubMed PMID: 3094962.

170. Chesebro B, Buller R, Portis J, Wehrly K. Failure of human immunodeficiency virus
entry and infection in CD4-positive human brain and skin cells. Journal of virology.
1990;64(1):215-21. PubMed PMID: 2293663; PubMed Central PMCID: PMC249089.

171. Stevenson M, Zhang XH, Volsky DJ. Downregulation of cell surface molecules during
noncytopathic infection of T cells with human immunodeficiency virus. Journal of virology.
1987;61(12):3741-8. PubMed PMID: 3500327; PubMed Central PMCID: PMC255987.

172. Deacon NJ, Tsykin A, Solomon A, Smith K, Ludford-Menting M, Hooker DJ, et al.
Genomic structure of an attenuated quasi species of HIV-1 from a blood transfusion donor and
recipients. Science. 1995;270(5238):988-91. PubMed PMID: 7481804.

173. Levesque K, Finzi A, Binette J, Cohen EA. Role of CD4 receptor down-regulation during
HIV-1 infection. Curr HIV Res. 2004;2(1):51-9. PubMed PMID: 15053340.

174. Flint SJ, Racaniello VR, Enquist LW, Skalka AM. Principles of virology, Volume 2:
pathogenesis and control. ASM Press; 2009.

175. Lindwasser OW, Chaudhuri R, Bonifacino JS. Mechanisms of CD4 downregulation by
the Nef and Vpu proteins of primate immunodeficiency viruses. Curr Mol Med.
2007;7(2):171-84. PubMed PMID: 17346169.

176. Salmon P, Olivier R, Riviere Y, Brisson E, Gluckman JC, Kieny MP, et al. Loss of CD4
membrane expression and CD4 mRNA during acute human immunodeficiency virus replication.
J Exp Med. 1988;168(6):1953-69. PubMed PMID: 3264318; PubMed Central PMCID:
PMC2189155.

177. Hoxie JA, Alpers JD, Rackowski JL, Huebner K, Haggarty BS, Cedarbaum AJ, et al.
Alterations in T4 (CD4) protein and mRNA synthesis in cells infected with HIV. Science.
1986;234(4780):1123-7. PubMed PMID: 3095925.

178. Yuille MA, Hugunin M, John P, Peer L, Sacks LV, Poiesz BJ, et al. HIV-1 infection
abolishes CD4 biosynthesis but not CD4 mRNA. J Acquir Immune Defic Syndr.
1988;1(2):131-7. PubMed PMID: 3265152.
