


Homology in Proteins of Model Organism Drosophila melanogaster Helps Further Understand Disease Mechanisms of Myeloid Leukemia in Homo sapiens
Elizabeth Kim
Lab Partner: Ashley Braxton
BIOL 230W, Section #902
TAs: Alex Campbell, Ravi Patel
12 October 2015




Humans have evolved from simple eukaryotes into very complex organisms. Despite the evolutionary differences seen in life today, genetic makeup has not strayed far between species. Humans and monkeys, for example, are strikingly similar phenotypically and share many homologies in anatomy and behavior (1). Scientists study the similarities and differences among species to determine physiology and function and so better understand biological processes. Thanks to techniques such as the polymerase chain reaction (PCR) and the manipulation of plasmids, researchers have discovered that many organisms share similar protein domains, sections of a protein that serve a specific function and can evolve independently of the rest of the protein. With this information, scientists have compiled cDNA libraries, large collections of DNA sequences reverse transcribed from messenger RNA (mRNA). Sequences from these libraries can be searched to quickly determine whether a protein domain has a homolog in another organism (1). For example, researchers have found that humans and fission yeast have homologous Cdc14 phosphatases, proteins that help regulate cyclin-dependent kinase (CDK) activity so that cells can leave the mitotic stage (2).

Many human diseases result from proteins that misfold or malfunction because of mutations in the DNA. To find a cure, researchers must discover the source of the problem, but performing such experiments on humans would be unethical. It is difficult to safeguard a participant’s health, and the human life span is far too long for humans to be a practical model in the lab. A suitable model organism should be highly homologous to humans, with protein domains that belong to the same protein families. The fruit fly Drosophila melanogaster is an excellent model organism because it has been used in the field of genetics for over 100 years. Through these studies, its genes have been mapped according to chromosome location and function. Time is also saved because of its very short life cycle, which allows for as many trials as necessary. Most importantly, Drosophila shares a great amount of homology with humans, which is crucial for the study of human diseases. Researchers have found that about 75% of human disease-causing genes have a functional homolog in Drosophila (3). For example, the functional homolog of the human amyloid precursor protein (APP) in fruit flies is APP-like (APPL), and flies deficient in APPL show abnormal behavior. In humans, APP is known to be involved in Alzheimer’s disease (3). Identifying similar mechanisms in diseases of the model organism will hopefully bring researchers a step closer to creating treatment options for diseases that currently lack a cure, such as Parkinson’s disease, epilepsy, and cancer (1).

It should be noted that model organisms are not perfect and complications may arise. Although these organisms have proteins homologous to those of humans, a treatment that works in the animal may not be processed the same way in the human body. The differences between species make it difficult to translate findings directly into a cure for human diseases.

In this experiment, a clone from a Drosophila melanogaster cDNA library was sequenced and the protein it encodes was identified. The sequence was then checked for homology with human proteins, the function of any human homolog was determined, and the way the function of the protein in Drosophila can help explain human disease was considered. The overarching goal was to better understand human diseases by comparing the human genome database to a cDNA library sequence of Drosophila melanogaster (1). Given the high level of homology between Drosophila and human disease genes, it was hypothesized that the sequenced cDNA would encode a protein homologous to a human protein involved in a disease mechanism. Such a finding would allow scientists to study the specific protein further and eventually work toward a cure in humans.

Materials and Methods

All procedures were followed according to the lab manual (1).

Colony Picking

Two colonies were picked from an E. coli culture plate. It was important to practice sterile technique because contamination would result in the growth of extraneous bacteria and a lower yield of the Drosophila cDNA. The Drosophila cDNA had been inserted into the plasmid pNB40, which contains an ampicillin resistance gene. The E. coli were grown in medium containing ampicillin so that only cells carrying the plasmid, and therefore the cDNA, could grow.

Plasmid Isolation

The QuickLyse Miniprep plasmid DNA purification system was used to isolate the plasmid DNA from the E. coli host cells. The cells were suspended in lysis solution, which released the plasmid from the cells. The plasmid was collected on the membrane of a spin column and washed with isopropanol buffer. Elution then released the plasmid from the membrane. Isolation was crucial because it separated the plasmid carrying the Drosophila cDNA from other bacterial material and prepared it for the polymerase chain reaction.

Polymerase Chain Reaction

Once the plasmid was isolated, PCR was used to amplify the cDNA insert using primers and Taq polymerase. The primers annealed to the SP6 and T7 promoter sequences located at either end of the cDNA insert. Exponential amplification was achieved through repeated cycles of denaturation, annealing, and elongation of the DNA, driven by heating and cooling. The products were many copies of the cDNA insert, enough to be visualized by gel electrophoresis.
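The exponential amplification described above can be illustrated with a short calculation. This is a minimal sketch, not part of the lab protocol: the function name `pcr_copies` and the `efficiency` parameter are assumptions introduced here, and real reactions plateau well before the theoretical yield.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Theoretical number of template copies after a given number of PCR cycles.

    efficiency is the fraction of templates copied in each cycle
    (1.0 means perfect doubling); it is a simplifying assumption.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    # Each cycle multiplies the number of copies by (1 + efficiency).
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Compare perfect doubling with a hypothetical 90% efficiency.
    for n in (1, 10, 30):
        print(n, pcr_copies(1, n), pcr_copies(1, n, efficiency=0.9))
```

With perfect doubling, a single template would yield 2^n copies after n cycles, which is why a few dozen cycles are enough to produce a visible band.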

Gel Electrophoresis

An agarose gel was made to serve as the medium through which the DNA travels. Each plasmid sample, its respective PCR mix, a negative control, and a DNA ladder were pipetted into separate wells. Gel electrophoresis gave a visual representation of the plasmid DNA and was used to determine whether the isolation was successful.

DNA Sequencing and Identification

The plasmid DNA samples were taken to the Nucleic Acid Facility on campus, where they were sequenced by the dideoxy method. In this method, the DNA is copied in the presence of dideoxynucleotides. Because dideoxynucleotides lack the 3’ OH group, the polymerase cannot add further nucleotides after one is incorporated, and synthesis terminates. Termination creates fragments of different lengths, which are then separated by size. The migration patterns across the gel are recorded and read from 5’ to 3’. MEGA was used to edit the trace file by removing any overlaps. This step prepared the sequence to be examined and compared to homologous human proteins. CDART was used to determine which conserved protein domains the cDNA sequence contained. BLAST was then used to identify the edited sequence and to determine whether the cDNA encodes a protein of the same family as a human protein.
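The chain-termination idea behind the dideoxy method can be sketched in a few lines. This is a toy model under stated assumptions, not the facility's procedure: the template sequence is made up, every possible terminated fragment is assumed to be present, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesized_strand(template_3to5):
    """New strand (written 5' to 3') copied from a template written 3' to 5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5)


def terminated_fragments(new_strand):
    """Every prefix of the new strand; each ends where a dideoxynucleotide was added."""
    return [new_strand[:i] for i in range(1, len(new_strand) + 1)]


def read_sequence(fragments):
    """Order fragments by length, as the gel does, and read each terminal base."""
    return "".join(fragment[-1] for fragment in sorted(fragments, key=len))


if __name__ == "__main__":
    template = "TACGGT"  # made-up template, written 3' to 5'
    fragments = terminated_fragments(synthesized_strand(template))
    print(read_sequence(fragments))
```

Sorting by length stands in for separation on the gel: the shortest fragment ends at the first base of the new strand, so reading terminal bases from shortest to longest recovers the sequence from 5’ to 3’.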


Colony Growth Observations

Plasmid A: small amount of opaque growth at bottom of tube. Plasmid B: small amount of opaque growth at bottom of tube (4). Both cultures grew successfully and could be used in PCR.



Figure 1. Gel Electrophoresis Photo Under UV Light (wells numbered 1 to 10)

Figure 1 Legend:

Well 1- DNA Ladder
Well 2- Negative Control (of other partners)
Well 3- 4A PCR
Well 4- 4A DNA
Well 5- 4B PCR
Well 6- 4B DNA
Well 7- Random PCR
Well 8- 3A DNA
Well 9- 3B DNA
Well 10- Missing PCR

Ethidium bromide was added to the agarose gel to allow the DNA to fluoresce under UV light. Bands that traveled the farthest (to the left) contain fewer base pairs and therefore have a lower molecular weight. Conversely, bands that traveled the least contain the most base pairs and are the heaviest DNA fragments. Multiple bands in a lane may mean that more than one plasmid DNA was isolated at once or that contamination by E. coli or other material occurred.

Note: The PCR master mix tubes for partners 3A and 3B went missing. The contents may have evaporated, leading the tubes to be discarded. A band was present in the negative control.



Figure 2. Measurement of DNA Ladder


This figure shows the average number of base pairs in each band of the DNA ladder, used as a standard against which to compare the bands of plasmid cDNA.
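One common way to use a ladder as a standard is to fit a line to log10(size) against migration distance and read unknown bands off that line. The sketch below assumes this log-linear relationship; the ladder distances and sizes are invented for illustration and are not the measurements from this gel, and the function names are hypothetical.

```python
import math


def fit_ladder(distances_mm, sizes_bp):
    """Least-squares line through (distance, log10(size)); returns (slope, intercept)."""
    logs = [math.log10(size) for size in sizes_bp]
    n = len(distances_mm)
    mean_x = sum(distances_mm) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in distances_mm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances_mm, logs))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_size_bp(distance_mm, slope, intercept):
    """Estimated fragment size for a band at the given migration distance."""
    return 10 ** (slope * distance_mm + intercept)


if __name__ == "__main__":
    # Invented ladder: migration distance (mm) and known size (bp) of each band.
    ladder_distances = [10.0, 20.0, 30.0]
    ladder_sizes = [10000.0, 1000.0, 100.0]
    slope, intercept = fit_ladder(ladder_distances, ladder_sizes)
    print(estimate_size_bp(25.0, slope, intercept))
```

The negative slope reflects the pattern described for Figure 1: bands that travel farther contain fewer base pairs.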

Table 2. Molecular Size and Weight of cDNA Segments

DNA Segment    Plasmid 3A    Plasmid 3B    Random PCR
Band Size

This table shows the average number of base pairs in each band observed after running gel electrophoresis.


Sequencing Results

Plasmids A and B from the group were sent for sequencing, and trace files were successfully generated for both samples. Plasmid A was used in this analysis. MEGA was used to trim Plasmid A’s sequence, which was then identified using BLAST.



Table 3. Drosophila melanogaster mRNA Results of Plasmid A Trimmed Sequence (5)

Name of mRNA Sequence    Drosophila melanogaster Myelodysplasia/myeloid leukemia factor (Mlf), transcript variant H, mRNA
Request ID
Query Length
Protein ID

Table 3 lists information about the mRNA sequence, such as the request ID, accession number, and length of the sequence. This sequence was the first hit to appear on the list of BLAST results. The protein ID is the protein domain that corresponded to this sequence in the CDART analysis.

Table 5. Protein Identification of Homologues (5)

                     Drosophila melanogaster    Homo sapiens
Name of Protein      Myelodysplasia/myeloid leukemia factor, isoform H    Myeloid leukemia factor 1 isoform 4
Protein Domain(s)    Mlf1Ip Superfamily: myelodysplasia-myeloid leukemia factor 1-interacting protein    Mlf1Ip Superfamily: myelodysplasia-myeloid leukemia factor 1-interacting protein
Request ID

Table 5 shows the Homo sapiens protein with the highest level of homology to the Drosophila protein, along with the protein domain they share. The Homo sapiens homolog had an identity of 49%, the highest value among the BLAST results.



According to the data recovered in this experiment, the hypothesis was supported. A human protein involved in a disease mechanism was homologous to a protein domain in Drosophila melanogaster. The only protein domain shared by both species is responsible for regulation of transcription and serves many functions in Drosophila, while mutations in this protein domain in Homo sapiens have been linked to myeloid leukemia (5). If further studies were conducted on this specific protein domain, researchers could become more familiar with how the disease works and eventually create a treatment option.

To reach this conclusion, plasmid DNA containing Drosophila cDNA was extracted from E. coli colonies and amplified using PCR. Some PCR master mix samples were lost during the experiment, and those that remained did not correlate with the other samples. Although cDNA amplification could not be confirmed through gel electrophoresis, the original plasmid DNA samples were sequenced and produced trace files. This report focuses on plasmid A’s DNA sequence, which matched Drosophila melanogaster Myelodysplasia/myeloid leukemia factor (Mlf), transcript variant H, mRNA. The only protein domain found in this sequence was Mlf1Ip Superfamily: myelodysplasia-myeloid leukemia factor 1-interacting protein. This protein family has a conserved central region and consists of transcriptional repressors located in the nucleus and cytoplasm. Mlf1Ip is also regulated by the DREF transcription factor motif, which is responsible for the regulation of rapid cell growth in Drosophila (5). Additional research using immunohistochemical analysis found that rat C6 and F98 glioblastoma tumor models expressed a large amount of Mlf1Ip, suggesting that this protein may play a role in cancers (6).

The experiment was not perfect and had plenty of room for error. The negative control in the gel showed a band, which indicates contamination and makes the samples less reliable. Because the gel was shared with another group, the negative control came from the other group’s samples, making it difficult to conclude whether plasmid A, the focus of this report, was affected. This error may have resulted from improper sterile technique, including failure to change micropipette tips after every use, contact with human material such as saliva or skin cells, and leaving samples exposed to open air for too long. These mistakes would also result in poor BLAST hits when analyzing the cDNA sequence. To improve the experiment, it would be beneficial to repeat the steps and carefully follow sterile technique. Although it is not always practical, working in an isolated environment with controlled variables would minimize contamination and address these issues.

The human homolog to Mlf1Ip was found through BLAST analysis and corresponded to Myeloid leukemia factor 1 isoform 4, which contains the same protein domain as the Drosophila sequence, Mlf1Ip. This domain had 47% identity, the percentage of identical amino acids between the human and Drosophila proteins (1). The similarity between the two species was not high, making Drosophila a less reliable model organism for this protein. There was homology in function, but Homo sapiens Mlf1Ip specifically regulates the fate of hematopoietic cells that reside in red bone marrow. This suggests that the human homolog is involved in the disease mechanism of acute myeloid leukemia (5). Acute myeloid leukemia is a form of cancer in which abnormal white blood cells overgrow in the bone marrow and interfere with the growth of normal blood cells (7).
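The identity value discussed above is the share of aligned positions at which the two proteins carry the same amino acid. A minimal sketch of that calculation follows, assuming the two sequences have already been aligned to equal length with "-" marking gaps. The toy sequences are invented and are not the Mlf sequences from this report, and the function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns with identical residues.

    Columns containing a gap ("-") count toward the alignment length
    but never as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Invented, pre-aligned toy peptides.
    print(percent_identity("MKT-LV", "MRTALV"))
```

Dividing by the full alignment length, gaps included, is one convention for reporting identity; other conventions divide by the length of the shorter sequence, so values from different tools are not always directly comparable.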

By the end of this experiment, the Drosophila cDNA sequence that was extracted and isolated led to the identification of a homologous protein domain, Mlf1Ip Superfamily: myelodysplasia-myeloid leukemia factor 1-interacting protein, in Homo sapiens. This protein domain is involved in the human disease myeloid leukemia. Manipulating the Drosophila melanogaster homolog will allow researchers to better understand the disease by comparing its function with that of the protein domain in humans. This finding suggests that many other proteins in Drosophila are homologous to human proteins, opening a new realm of possibilities for furthering knowledge of human physiology and of other diseases for which a cure has yet to be found. With knowledge of the homologous proteins and their functions in both the model organism Drosophila melanogaster and Homo sapiens, further research could contribute to great advances in finding a cure for myeloid leukemia. Further studies could focus on the function of Mlf1Ip to determine whether this domain can be manipulated to stop the disease. The next step would be to follow a similar protocol in a model organism with higher homology of protein domains.



(1) Penn State Biology Department. “Biology 230 Laboratory Manual.” Lab handbook. The Pennsylvania State University, PA. 2014.

(2) Vázquez-Novelle, M. Dolores, et al. "Functional Homology among Human and Fission Yeast Cdc14 Phosphatases." Journal of Biological Chemistry 280.32 (2005): 29144-50. Web.

(3) Pandey, UB, and CD Nichols. "Human Disease Models in Drosophila Melanogaster and the Role of the Fly in Therapeutic Drug Discovery." Pharmacological Reviews 63.2 (2011): 411-36. Web.

(4) Ashley Braxton: Penn State Biology Department. “Biology 230 Laboratory Manual.” Lab handbook. The Pennsylvania State University, PA. 2014.

(5) "National Library of Medicine." National Center for Biotechnology Information. N.p., n.d. Web. 17 Oct. 2015. <>.

(6) Hanissian, S. H., et al. "Regulation of myeloid leukemia factor-1 interacting protein (MLF1IP) expression in glioblastoma." NCBI (2005).

(7) "Acute myeloid leukemia." Wikipedia. Web. 12 Oct. 2015.