
Gene Identification and Amino Acid Sequencing of DNA Plasmid #C

Summary

BLAST (Basic Local Alignment Search Tool) is used to identify the gene carried by
DNA Plasmid #C and the amino acid sequence derived from it. The gene is identified as the
Kruppel-like factor 4 (KLF4) gene, and the amino acid sequence of the KLF4 protein is obtained
by translating the DNA sequence. The findings are discussed, and the role of KLF4 in forming
the skin barrier is summarized. Finally, limitations in selecting the correct gene and protein
are noted.

Introduction

The objective of the experiment is to analyze the sequence of the plasmid to identify its
gene source and to translate the sequenced DNA into an amino acid sequence. BLAST (Basic
Local Alignment Search Tool) is used for both steps: the blastn tool identifies the gene, and
the blastx tool translates the gene into an amino acid sequence.

The DNA sequencing approach used in the experiment is dye-terminator sequencing.
This method uses fluorescently labeled dideoxynucleotide chain terminators in a single
reaction for each DNA sample (NC STATE UNIVERSITY). Strand synthesis terminates when DNA
polymerase incorporates a labeled dideoxynucleotide, generating millions of strands of
different lengths, each ending in a 3’ fluorescent dideoxy terminator (NC STATE
UNIVERSITY). During capillary electrophoresis, a laser beam passes through the detection
cell, and a scanner detects the dye fluorescence (NC STATE UNIVERSITY). The signals are then
converted into a trace chromatogram (NC STATE UNIVERSITY), which can be viewed in a
program called 4Peaks.

DNA is pivotal for vital functions in life: growth, reproduction, and health (Seladi-
Schulman). It contains the information for forming proteins, which are the major workers in an
organism (Seladi-Schulman). The central dogma of molecular biology explains the flow of
genetic information to make a functional product through transcription and translation processes
(“What Is the ‘Central Dogma’?”). In other words, it refers to the flow of information from DNA
to RNA to protein (“What Is the ‘Central Dogma’?”). This experiment explores the central
dogma of molecular biology by examining how a DNA sequence is translated into an amino acid
sequence, and thus into a protein.
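
The flow from DNA to RNA to protein can be illustrated with a minimal Python sketch. It assumes the standard genetic code and uses a 21-base fragment copied from the sequence in Figure 2, beginning at an ATG; the function names `transcribe` and `translate` are illustrative and are not part of BLAST or the lab manual.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Return the mRNA matching the coding strand (T replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate an mRNA codon by codon, stopping at the first stop codon."""
    dna = mrna.upper().replace("U", "T")  # the table above is keyed by DNA codons
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # X marks an unreadable codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# 21 bases copied from the Figure 2 sequence, starting at an ATG.
fragment = "ATGGCTGTCAGCGACGCTCTG"
mrna = transcribe(fragment)
print(mrna)
print(translate(mrna))
```

Each three-base codon maps to one amino acid, so the 21-base fragment yields seven residues in one-letter code.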

Materials and Methods

Please refer to the lab manual.

Results and Discussion


Figure 1: Chromatogram of the sequencing result viewed in 4Peaks (bases 150 to 280)

Figure 1 shows the nucleotide sequence of the plasmid DNA as displayed in 4Peaks. The
sequence is derived from the fluorescent dye method. Each peak represents one nitrogenous
base: adenine, thymine, guanine, and cytosine are shown in green, red, black, and blue,
respectively.
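
As a rough illustration of how a chromatogram is read, the sketch below calls one base per peak by picking the dye channel with the strongest signal. The color-to-base mapping follows the description above; the intensity values are made up for illustration and are not taken from the actual trace, and real base-calling software is considerably more sophisticated.

```python
# Dye colors as displayed in 4Peaks, taken from the description of Figure 1.
COLOR_TO_BASE = {"green": "A", "red": "T", "black": "G", "blue": "C"}


def call_base(intensities):
    """Return the base whose dye channel has the strongest signal at one peak."""
    strongest_color = max(intensities, key=intensities.get)
    return COLOR_TO_BASE[strongest_color]


def call_sequence(peaks):
    """Call one base per peak and join the calls into a sequence string."""
    return "".join(call_base(peak) for peak in peaks)


# Hypothetical intensities for three peaks (illustrative only, not real data).
toy_peaks = [
    {"green": 910, "red": 40, "black": 25, "blue": 30},
    {"green": 35, "red": 20, "black": 15, "blue": 870},
    {"green": 20, "red": 15, "black": 940, "blue": 45},
]
print(call_sequence(toy_peaks))
```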

TAAAAAAGCGAATGACTCTCTGGCTACTAGAGACCACTGCTTACTGGCTTATCGAAATTAATACGA
CTCACTATAGGGAGACCCAAGCTGGCTAGCGTTTAAACTTAAGCTTGGTACCGAGCTCGGATCCACT
AGTCCAGTGTGGTGGAATTCACTAGTGATTATTCATGGCTGTCAGCGACGCTCTGCTCCCGTCCTTCT
CCACGTTCGCGTCCGGCCCGGCGGGAAGGGAGAAGACACTGCGTCCAGCAGGTGCCCCGACTAACC
GTTGGCGTGAGGAACTCTCTCACATGAAGCGACTTCCCCCACTTCCCGGCCGCCCCTACGACCTGGC
GGCGACGGTGGCCACAGACCTGGAGAGTGGCGGAGCTGGTGCAGCTTGCAGCAGTAACAACCCGG
CCCTCCTAGCCCGGAGGGAGACCGAGGAGTTCTTTTATCTCCTGGACCTAGACTTTATCCTTTCCAA
CTCGCTAACCCACCAGGAATCGGTGGCCGCCACCGTGACCACCTCGGCGTCAGCTTCATCCTCGTCT
TCCCCAGCGAGCAGCGGCCCTGCCAGCGCGCCCTCCACCTGCAGCTTCAGCTATCCGATCCGGGCC
GGGGGTGACCCGGGCGTGGCTGCCAGCAACACAGGTGGAGGGCTCCTCTACAGCCGAGAATCTGCG
CCACCTCCCACGGCCCCCTTCAACCTGGCGGACATCAATGACGTGAGCCCCTCGGGCGGCTTCGTGG
CTGAGCTCCTGCGGCCGGAGTTGGACCCAGTATACATTCCGCCACAGCAGCCTCAGCCGCCAGGTG
GCGGGCTGATGGGCAAGTTTGTGCTGAAGGCGTCTCTGACCACCCCTGGCAGCGAGTACAGCAGCC
TTTCGGTCATCAGTGTTAGCAAAGGAAGCCCAGACGGCAGCCCACCCCCCGTGGTTAGTGGCGCCC
TACAGCGGTGGCCCGCCGCGCATGTGCCCCAAGATTACGCAGAGGCGATCCCGGATCCCCTGTGGC
AACAACGAC

Figure 2: The derived nucleotide sequence (including vector backbone)

The above figure shows the nucleotide sequence obtained from 4Peaks. The insert is 817
bases long, excluding the vector backbone. The vector backbone is the sequence before and
after the highlighted region, and the nucleotide sequence of the DNA Plasmid #C insert is
highlighted in yellow.
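
A base count like the one above can be checked with a short script. The sketch below joins a line-wrapped sequence, validates its characters, and reports its length and base composition; the helper names are illustrative, and the example input is only the first 20 bases of the Figure 2 sequence, not the full insert.

```python
def clean_sequence(raw_text):
    """Join wrapped lines, drop whitespace, and upper-case a pasted sequence."""
    seq = "".join(raw_text.split()).upper()
    unexpected = set(seq) - set("ACGTN")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq


def base_counts(seq):
    """Count how often each base (and the ambiguity code N) occurs."""
    return {base: seq.count(base) for base in "ACGTN"}


# First 20 bases of the Figure 2 sequence, wrapped over two lines on purpose.
wrapped = """TAAAAAAGCG
AATGACTCTC"""
seq = clean_sequence(wrapped)
print(len(seq), base_counts(seq))
```

Running the same functions on only the highlighted insert would give the length to compare against the value reported here.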

Figure 3: Sequences producing significant alignments for Plasmid DNA #C on BLAST (Basic
Local Alignment Search Tool)

The above figure lists the candidate genes for Plasmid DNA #C. All of the most probable
matches are from Mus musculus, the house mouse. The Mus musculus Kruppel-like factor 4 gene,
Mus musculus mammary gland, and Mus musculus gut-enriched Kruppel-like factor all share the
same Query Cover of 100% and Percent Identity of 98.41%. Because the Mus musculus
Kruppel-like factor 4 gene appeared most frequently in the search results, it was chosen.

Figure 4: The Most Probable Matching Gene – Mus musculus Kruppel-like factor 4 mRNA

Figure 4 shows the alignment of the base sequence of Plasmid DNA #C with the base
sequence of the Klf4 gene of Mus musculus. The Percent Identity is 98%, so the most likely
source of the DNA is the house mouse.
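
Percent Identity is the share of alignment columns in which the query and the matched sequence carry the same base. The following is a minimal sketch of that calculation for two already-aligned strings, with `-` marking a gap; the function name and the toy alignments are illustrative and do not reproduce the actual BLAST alignment.

```python
def percent_identity(aligned_query, aligned_subject):
    """Percent of alignment columns where both sequences carry the same base."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    matches = sum(
        q == s and q != "-"  # a gap column never counts as a match
        for q, s in zip(aligned_query.upper(), aligned_subject.upper())
    )
    return 100.0 * matches / len(aligned_query)


# Toy alignments: one mismatch, then one gap, each over ten columns.
print(round(percent_identity("ATGGCTGTCA", "ATGGCTGTCT"), 2))
print(round(percent_identity("ATGG-TGTCA", "ATGGCTGTCA"), 2))
```

Both mismatches and gaps lower the value, which is why a match can cover 100% of the query and still fall short of 100% identity.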

The Klf4 gene of Mus musculus encodes a protein that belongs to the Kruppel family of
transcription factors (Weizmann Institute of Science). The gene regulates diverse cellular
processes, including cell growth, differentiation, and proliferation (Ghaleb and Yang). KLF4
is highly expressed in the skin and plays an essential role in developing the barrier
function of the skin (Ghaleb and Yang). Mice lacking this gene die shortly after birth
because of fluid evaporation resulting from a defective epidermal barrier (Weizmann
Institute of Science).

Figure 5: Possible Proteins Translated from Plasmid DNA #C

The same DNA sequence of Plasmid #C was submitted to the blastx tool, which translates
the nucleotide query into amino acid sequences and compares them with known proteins.
Figure 5 shows the possible proteins derived from Plasmid DNA #C; the Kruppel-like factor 4
protein of Mus musculus was selected as the most probable protein.
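
Because the reading frame of a query is not known in advance, blastx translates it in all six reading frames (three on each strand) before searching. The sketch below illustrates six-frame translation, assuming the standard genetic code; it is not the blastx implementation, the function names are illustrative, and the input is a 21-base fragment copied from the Figure 2 sequence.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate_frame(dna, offset):
    """Translate every complete codon from `offset`; stop codons appear as '*'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")  # X marks an unreadable codon
        for i in range(offset, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate frames +1..+3 on the given strand and -1..-3 on its complement."""
    opposite = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate_frame(dna, offset)
        frames[f"-{offset + 1}"] = translate_frame(opposite, offset)
    return frames


fragment = "ATGGCTGTCAGCGACGCTCTG"  # copied from the Figure 2 sequence
for name, protein in six_frame_translation(fragment).items():
    print(name, protein)
```

Only one of the six frames is expected to give a long stretch free of stop codons that matches a known protein, which is how the correct frame is recognized.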

Figure 6: Amino Acid Sequence of Protein Klf4 of Mus musculus

Figure 6 represents the amino acid sequence of the Kruppel-like factor 4 protein of a
house mouse.

A limitation of both the gene and the protein identification is that several candidates
share the same statistics, including Percent Identity and Query Cover. Although all of them
come from the same source, Mus musculus, there could be slight differences in their structure
and function that cannot be distinguished by these percentages alone. Thus, the choice of
gene and protein was not backed by a valid reason and was essentially arbitrary. Examples
are the Mus musculus mammary gland gene and the Mus musculus gut-enriched Kruppel-like
factor gene (Figure 3), and the Klf4 isoform protein and the unnamed protein product of Mus
musculus (Figure 5), which share the same statistics as the gene and protein chosen for
this report.
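
The tie described above can be made explicit by grouping hits on their statistics. The sketch below uses only the three gene hits and the values reported for Figure 3. The function names are illustrative, and other columns that BLAST reports, such as the E-value or bit score, could be appended to the grouping key as extra criteria if they were recorded.

```python
from collections import defaultdict

# Hit names and statistics as reported in the text for Figure 3:
# (name, query cover in %, percent identity in %).
hits = [
    ("Mus musculus Kruppel-like factor 4 gene", 100.0, 98.41),
    ("Mus musculus mammary gland", 100.0, 98.41),
    ("Mus musculus gut-enriched Kruppel-like factor", 100.0, 98.41),
]


def group_ties(hit_list):
    """Group hit names by their (query cover, percent identity) pair."""
    groups = defaultdict(list)
    for name, query_cover, identity in hit_list:
        groups[(query_cover, identity)].append(name)
    return dict(groups)


def top_candidates(hit_list):
    """Return every hit that shares the best (query cover, percent identity)."""
    groups = group_ties(hit_list)
    return groups[max(groups)]


for name in top_candidates(hits):
    print(name)
```

When more than one name is returned, the statistics alone cannot single out one candidate, which is the limitation noted above.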

Conclusion

The objective of the experiment was to analyze the sequence of the plasmid to identify
its gene source and to translate the sequenced DNA into an amino acid sequence. The gene in
Plasmid DNA #C was found to be the Kruppel-like factor 4 gene of Mus musculus, and the
translated amino acid sequence was matched to the Kruppel-like factor 4 protein. The KLF4
gene is essential in many cellular functions, including developing the barrier function of
the skin. A limitation is that the gene and protein selected may not be the correct
representation of the plasmid, because other candidates shared the same statistics.

References

BLAST. “BLAST: Basic Local Alignment Search Tool.” Nih.gov, 2018,
blast.ncbi.nlm.nih.gov/Blast.cgi.

Ghaleb, Amr M., and Vincent W. Yang. “Krüppel-like Factor 4 (KLF4): What We Currently
Know.” Gene, vol. 611, May 2017, pp. 27–37, 10.1016/j.gene.2017.02.025.

NC STATE UNIVERSITY. “Instructional Videos : Genomic Sciences Laboratory.”
Research.ncsu.edu, research.ncsu.edu/gsl/news/instructional-videos/.

Seladi-Schulman, Jill. “DNA Explained and Explored.” Healthline, Healthline Media, 14 Aug.
2019, www.healthline.com/health/what-is-dna.

Weizmann Institute of Science. “KLF4 Gene - GeneCards | KLF4 Protein | KLF4 Antibody.”
Www.genecards.org, www.genecards.org/cgi-bin/carddisp.pl?gene=KLF4.

“What Is the ‘Central Dogma’?” Yourgenome, 25 Jan. 2016,
www.yourgenome.org/facts/what-is-the-central-dogma.
