INTRODUCTION

Analyzing how genes and proteins have evolved in different organisms can help with the understanding of human diseases. In this lab, Drosophila (fruit flies) were used as a model for understanding human disease by sequencing their cDNA. Complementary DNA (cDNA) is synthesized from a messenger RNA template by the enzymes reverse transcriptase and DNA polymerase. The rationale behind using Drosophila is that the organism has been studied for more than a century. Over the years, the physiological effects of mutations and the genes underlying different phenotypes have been mapped to specific locations on the Drosophila chromosomes. Protein homology means that proteins are derived from a common ancestor, and humans and Drosophila share a great deal of homology in their DNA sequences. Drosophila also has a short life cycle and is easy to breed and culture, which makes it a good organism to study. Furthermore, Drosophila has been a very good model for certain diseases, such as type 2 diabetes, cancers, metabolic disorders, neurological disorders, and heart-related diseases (lab manual).

In this lab, the proteins encoded by specific genes were examined. Proteins are polymers of amino acids that fold into specific conformations. The primary structure of a protein is the amino acid sequence of the polypeptide chain, held together by covalent peptide bonds. The secondary structure refers to the alpha helices and beta strands or sheets, which are stabilized by hydrogen bonds. The tertiary structure refers to the three-dimensional structure formed when the alpha helices and beta sheets fold and are held together by interactions among the R groups. Lastly, the quaternary structure is the three-dimensional structure of a multi-subunit protein (http://onlinelibrary.wiley.com/doi/10.1002/prot.21473/abstract;jsessionid=6D17421FBD5A3102301962054E518B09.d01t01). Proteins belong to protein families, which are groups of evolutionarily related proteins descended from a common ancestor. Members of a family usually have similar structures and functions.
A protein domain is a conserved part of a protein sequence that can function independently from the rest of the chain. Protein domains can be compared between humans and Drosophila to see which domains are conserved. Many researchers use Drosophila as a model to look for similarities with humans in protein domains and families. For example, a study by Perin, Johnston, Ozcelik, Jahn, Francke and Sudhof examined a synaptic vesicle protein, synaptotagmin, in Drosophila and humans. They studied its structural and functional conservation in evolution and found many similarities between the two (http://www.jbc.org/content/266/1/615.short).

The research approach in this lab was to take the sequenced cDNA and run it through BLAST on the National Center for Biotechnology Information (NCBI) database to find the conserved protein domains. The sequence was then run through BLAST against the human genome to see whether any protein domains were conserved. The lab had several goals: to familiarize students with laboratory methods for the study of DNA, with bioinformatics databases, and with the limitations of bioinformatics; to understand strategies for identifying an unknown protein and the value of model organisms; and to examine protein domains and their functions, the importance of evolution in protein domains and families, and the relationship between the structure of proteins and their functions.

The overall research question in this lab was: what potential models for understanding human disease emerge when we search the human genome database with cDNA library sequences from Drosophila? From there, other questions emerged. Which human proteins show homology with cDNA sequences prepared from a Drosophila cDNA library? What is the function of each protein in humans? Which Drosophila protein does each cDNA sequence correspond to, and what is the role of that protein in Drosophila? How might studying the control and function of this protein in Drosophila contribute to our understanding of the mechanisms controlling human diseases that involve these proteins? The hypothesis is that there will be homologous proteins between Drosophila and humans that can lead to a better understanding of human diseases.

MATERIALS AND METHODS

The first step of this lab was to collect two colonies from the bacterial plate to grow in liquid culture. Two glass culture tubes containing 3 ml of Luria Broth with ampicillin were obtained. Using forceps that had been cleaned with an alcohol swab, a toothpick was used to collect bacteria from a colony on a petri dish. It was important to touch only one colony of bacteria so that the DNA sequencing would not be affected by a mixed sample. The toothpick was placed into a tube containing the Luria Broth and ampicillin. The steps were repeated so that there were two glass culture tubes with bacteria. Next, the plasmid DNA was isolated. The cultures were obtained and mixed by flicking the tubes to resuspend the settled cells. Then, 1.5 ml of the E. coli cell suspension was pipetted into a QuickLyse Lysis tube labeled specifically for each group. These steps were done with both cultures. Both tubes were centrifuged for one minute at 13,000 rpm at room temperature.
The supernatant was poured off into a waste beaker, and 400 microliters of ice-cold Complete Lysis Solution was added to the bacterial cells. The tube was vortexed for 45 seconds until the pellet was completely resuspended. It was then incubated for 5 minutes at room temperature, and the lysate was poured into a QuickLyse spin column and centrifuged for one minute at 13,000 rpm. Next, 400 microliters of Buffer QLW was added to the top of the QuickLyse spin column as a wash, and the column was centrifuged again for one minute at 13,000 rpm. The flow-through liquid was discarded, and the tube was centrifuged again for one minute at 13,000 rpm. After centrifugation, the spin column was transferred to a new labeled 1.5 ml centrifuge tube and the waste was discarded. Then 50 microliters of Buffer QLE was added to the center of the QuickLyse spin column, which was centrifuged again for one minute at 13,000 rpm. Finally, the column and cap were discarded, and the 50 microliters of liquid was kept for PCR and DNA sequencing because it contained the plasmid DNA.

The next section of this lab was PCR amplification of the plasmid DNA. First, 24 microliters of PCR master mix was added to each of three 0.2 ml PCR tubes, and 1 microliter of plasmid DNA was added to the tubes labeled with the sample identifiers. One microliter of sterile water was added to the PCR tube labeled negative. The samples were then placed in the PCR machine and the PLASMIDPCR program was selected. When the program finished, the samples were removed and stored.
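The PCR setup above can be written out as a short calculation to check the volumes. This is only a sketch based on the volumes stated in the text; the tube labels are hypothetical placeholders for the group identifiers, and no allowance for pipetting loss is included.

```python
# Volumes stated in the methods, in microliters.
MASTER_MIX_UL = 24.0
TEMPLATE_UL = 1.0

# Hypothetical tube labels mapped to what goes in as the 1 uL "template".
reactions = {
    "sample_A": "plasmid DNA",
    "sample_B": "plasmid DNA",
    "negative": "sterile water",
}

def total_volume_ul():
    """Final volume of a single PCR tube."""
    return MASTER_MIX_UL + TEMPLATE_UL

def master_mix_needed_ul(n_tubes):
    """Master mix required for n_tubes reactions (no overage assumed)."""
    return MASTER_MIX_UL * n_tubes

for tube, template in reactions.items():
    print(f"{tube}: {MASTER_MIX_UL} uL master mix + {TEMPLATE_UL} uL {template}")

print(f"Volume per tube: {total_volume_ul()} uL")
print(f"Master mix for all tubes: {master_mix_needed_ul(len(reactions))} uL")
```

The negative control receives water in place of plasmid DNA so that every tube has the same final volume.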

The following week was the gel electrophoresis part of the lab. The gel apparatus was assembled, and 250 mg of agarose was placed in a 125 ml Erlenmeyer flask. Next, 25 ml of 1X TAE buffer was measured, poured into the flask, and swirled. The flask was microwaved for 40 seconds, and once no particles of agarose remained, the gel was cooled for 2 minutes. Next, one microliter of ethidium bromide was added to the gel. On a piece of parafilm, for each of the two plasmid DNA samples, 2 microliters of plasmid DNA, 8 microliters of water, and 2 microliters of 6X loading dye were mixed. Then 5 microliters of the DNA ladder was added to the first well, and 4 microliters of 6X loading buffer was added to each PCR tube. The wells were loaded with the samples. For our lab group, the order was: well 1, DNA ladder; well 2, plasmid DNA A without dye (not used); well 3, plasmid DNA A with dye and water; well 4, plasmid DNA B without dye (not used); well 5, plasmid DNA B with dye and water; well 6, PCR A with dye; well 7, PCR B with dye; and well 8, PCR negative control. With all the samples loaded, the electrodes were connected to the power supply, which was set to 100 volts. When the bromophenol blue had migrated halfway down the gel, the gel was removed and photographed on a UV box.

Finally, the plasmids were sequenced, and the National Center for Biotechnology Information (NCBI) database and CDART were used to analyze the cDNA. After the gel was run, the plasmid DNA was sent for DNA sequencing. It was sequenced by the Nucleic Acid Facility using the dideoxy method. In this reaction, a single strand of DNA was copied by DNA polymerase. The reactions contained a small amount of fluorescently labeled dideoxynucleotides, which terminated the growing strand. The products of the reaction were fragments of different lengths, each ending in a labeled dideoxynucleotide.
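The chain-termination idea can be illustrated with a small simulation. This is a simplified sketch, not the facility's actual procedure: it ignores the primer, assumes termination happens at least once at every position, and uses a made-up eight-base template purely for illustration.

```python
# Watson-Crick base pairing used by DNA polymerase when copying the template.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def synthesized_strand(template_3to5):
    """Return the new strand (written 5'->3') copied from a template written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5)

def terminated_fragments(new_strand):
    """Every chain-terminated product: each prefix of the new strand,
    ending in the dideoxynucleotide that stopped synthesis."""
    return [new_strand[:i] for i in range(1, len(new_strand) + 1)]

def read_sequence(fragments):
    """Shorter fragments migrate faster, so reading the labeled 3' end of each
    fragment from shortest to longest gives the sequence 5'->3'."""
    return "".join(fragment[-1] for fragment in sorted(fragments, key=len))

# Hypothetical template, written 3'->5'.
template = "TACGGATC"
fragments = terminated_fragments(synthesized_strand(template))
print(fragments)
print(read_sequence(fragments))
```

Sorting by length stands in for separation on the gel: the order in which fragments pass the detector depends only on how many nucleotides they contain.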
As the labeled DNA fragments migrated past the bottom of the gel, a fluorescence detector recorded the nucleotide at the 3’ end of each fragment, and the data were collected and read from 5’ to 3’.

Finally, the sequenced DNA was run through BLAST on the National Center for Biotechnology Information (NCBI) database. The human genome was searched for homologous proteins associated with human disease, and these were compared to the Drosophila proteins. This was done by first trimming the cDNA sequence in MEGA. The sequence was then entered into the database and run through BLAST. The gene and protein accession numbers were recorded, and the protein accession number was used in CDART to obtain the protein domains. Next, the gene was run through BLAST against the human genome, and the protein and gene accession numbers were recorded. That protein accession number was used in CDART to look up the protein domains in humans so that they could be compared to those in Drosophila.
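The final comparison step, checking which domains appear in both the Drosophila protein and its human match, amounts to a set intersection. The sketch below assumes the domain names have already been copied by hand from the CDART results; it does not query NCBI, the record layout is hypothetical, and the domain labels are simplified from the proteins identified in this lab.

```python
# Each record pairs a Drosophila protein with the domains listed for it and
# for its best human match. Layout and labels are illustrative assumptions.
hits = [
    {
        "fly_protein": "cytochrome c oxidase IV",
        "fly_domains": {"cytochrome c oxidase subunit IV"},
        "human_domains": {"cytochrome c oxidase subunit IV"},
    },
    {
        "fly_protein": "cyclin-G1 isoform C",
        "fly_domains": {"cyclin superfamily"},
        "human_domains": {"cyclin superfamily"},
    },
]

def conserved_domains(fly_domains, human_domains):
    """Return the domain names present in both species, sorted alphabetically."""
    return sorted(set(fly_domains) & set(human_domains))

for hit in hits:
    shared = conserved_domains(hit["fly_domains"], hit["human_domains"])
    status = ", ".join(shared) if shared else "no shared domains"
    print(f'{hit["fly_protein"]}: {status}')
```

Matching on names only works if the same label is used for both species, so in practice the domain identifiers reported by the database would be a safer key than free-text names.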

DISCUSSION

Two proteins were examined in this lab. The first protein found in Drosophila was cytochrome c oxidase IV. The protein domain found in Drosophila was cytochrome c oxidase subunit IV (COX4I1), and it is conserved in humans. Cytochrome c oxidase is an enzyme and a transmembrane protein found in bacteria and in mitochondria. It receives electrons from cytochrome c molecules and is the last enzyme in the electron transport chain. It also reduces one molecule of oxygen to two molecules of water. This integral membrane protein has 13 protein subunits and several metal prosthetic sites (http://www.sciencemag.org/content/269/5227/1069). A genetic mutation in cytochrome c oxidase can lead to fatal metabolic disorders, which can appear in early childhood. It can also affect tissues in the brain, heart, and muscle. Disorders caused by mutations in cytochrome c oxidase are among the most severe mitochondrial diseases (http://www.biomed.cas.cz/physiolres/pdf/53%20Suppl%201/53_S213.pdf).

For the second Drosophila DNA sequence, the protein found was cyclin-G1 isoform C, with a domain from the cyclin superfamily. This domain was conserved between Drosophila and humans. In humans, this protein is encoded by the CCNG1 gene. Cyclin-dependent kinases help regulate the cell cycle in eukaryotic cells once a cyclin binds to the kinase. Cyclin-G1 lacks the protein-destabilizing sequence that is present in other cyclins. The differences among the types of cyclins come from their primary structure. All cyclins share a conserved region of about 100 amino acids that makes up the cyclin box, and they all have a similar tertiary structure with 5 alpha helices (http://www.ncbi.nlm.nih.gov/sites/entrez?Db=gene&Cmd=ShowDetailView&TermToSearch=900). A mutation in a cyclin will affect the cell cycle, and cyclins have also been shown to be involved in Alzheimer’s disease (http://genesdev.cshlp.org/content/9/10/1149.refs.html).