
1.9.

DNA NOTATION

Imagine you have to write down a DNA sequence on a piece of paper.


Which of these statements is true?

I write the DNA sequence from 5’->3’ if it is the forward strand


I write the DNA sequence from 3’->5’ if it is the reverse strand
I write both strands of the DNA sequence, the forward as 5’->3’ and the
reverse directly underneath as 3’->5’

Yes, you would write the sequence from 5’->3’ if it is the forward or
the reverse strand.

You may find 1.7 Grammatical rules for DNA sequence representation useful.

Which is the reverse complement of this 4-letter DNA string?


ACTG

GTCA
TGAC

CAGT

Remember that finding the reverse complement is a two-step process.

Correct! This is the reverse complement. This is a two-step process.

Let’s write the question sequence again:

ACTG

The reverse sequence is GTCA and the complement of this is CAGT.

You may find 1.8 Representing the reverse strand useful.
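
The two steps described above (reverse, then complement) can be sketched in Python. This is a minimal illustration, not part of the course material; the function name `reverse_complement` and the `COMPLEMENT` table are names chosen here, and the sketch assumes the input contains only the letters A, C, G and T.

```python
# Minimal sketch: reverse complement of a DNA string (A, C, G, T only).
# The names COMPLEMENT and reverse_complement are illustrative assumptions.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of seq, written 5'->3'."""
    reversed_seq = seq.upper()[::-1]  # step 1: reverse the sequence
    # step 2: replace every base with its complementary base
    return "".join(COMPLEMENT[base] for base in reversed_seq)


print(reverse_complement("ACTG"))  # CAGT
```

The order of the two steps does not matter: complementing first and reversing afterwards gives the same result.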

1.17.
For a given organism:

Each amino acid is encoded by only one codon

Each codon can encode only one amino acid

Each amino acid is encoded by up to two different codons.

This is correct!

You may find 1.10 From DNA to protein useful.
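
The relationship between codons and amino acids can be explored with a small lookup table. The sketch below uses only a short excerpt of the standard genetic code (DNA codons for methionine, tryptophan, leucine and lysine); the names `CODON_TABLE` and `codons_for` are illustrative assumptions. Each codon appears once as a key, so it maps to exactly one amino acid, while grouping by amino acid shows how many codons each one has in this excerpt.

```python
from collections import defaultdict

# Excerpt of the standard genetic code: DNA codon -> one-letter amino acid.
# Only four amino acids are listed; this is not the full 64-codon table.
CODON_TABLE = {
    "ATG": "M",
    "TGG": "W",
    "TTA": "L", "TTG": "L", "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L",
    "AAA": "K", "AAG": "K",
}

# Invert the table: amino acid -> list of codons that encode it.
codons_for = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    codons_for[amino_acid].append(codon)

for amino_acid, codons in sorted(codons_for.items()):
    print(amino_acid, len(codons), codons)
```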

Select all statements that are true.

Select all the answers you think are correct.

the FASTA format is suitable for DNA and protein sequences

the FASTA format is exclusive for DNA sequences


the FASTA format is exclusive for protein sequences

the FASTA format is suitable for RNA sequences

This is correct!

Although not commonly used, RNA sequences can also be stored in FASTA format.

Select all statements that are true about a Genbank entry.

Select all the answers you think are correct.


A Genbank entry has a section dedicated to sequence data.

A typical Genbank entry has a section dedicated to sequence data in FASTA format.

A Genbank entry includes the date of submission.

A Genbank entry may or may not include a reference to published work.

2.7.
Which of these is NOT relevant for the process of homology annotation?
Choose one option

BLAST results
protein 3D structure
sequence accession number
amino acid sequence conservation

An accession number or identifier is not related to protein function.

Which of these values given as results of a BLAST search will change if the
size of the database is altered?

percentage identity
E-value
score
The percentage of coverage of a BLAST result depends on the query and the
subject; it does not depend on the size or nature of the database.

The E-value depends, among other factors, on the size of the database.

You may find 2.4 Use of BLAST (Basic Local Alignment Search
Tool) useful.
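
To see why the E-value changes with database size, a rough sketch can use the standard relation between a bit score S' and the expected number of chance hits, E = m × n × 2^(-S'), where m is the query length and n is the total length of the database. The function name `expected_hits` and the numbers below are assumptions made for illustration; real BLAST programs apply effective-length corrections, so reported values will differ.

```python
def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E-value from a bit score: E = m * n * 2**(-S').

    Simplified: ignores the effective-length corrections BLAST applies.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


# Same alignment (same bit score, same query), two hypothetical database sizes.
small_db = expected_hits(50.0, 300, 1_000_000)
large_db = expected_hits(50.0, 300, 2_000_000)

print(small_db, large_db)
print(large_db / small_db)  # doubling the database doubles the E-value
```

The bit score and the percentage identity are properties of the alignment itself, so they stay the same in both calls; only the E-value scales with the database.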

In the BLAST submission page, which of the following can be used for
entering the query sequence?

Select all the answers you think are correct.

paste a FASTA sequence


upload a FASTA sequence file
the protein name
the protein accession number
TEST

1. You have downloaded this FASTA file but you are unable to use it
for downstream analysis. By looking at the file, can you indicate
why that is the case?

NP_476772.1 alpha-Tubulin at 84B [Drosophila melanogaster]

MRECISIHVGQAGVQIGNACWELYCLEHGIQPDGQMPSDKTVGGGDDSFNTFFSETGAGKHVPRAVFVDL

EPTVVDEVRTGTYRQLFHPEQLITGKEDAANNYARGHYTIGKEIVDLVLDRIRKLADQCTGLQGFLIFHS

FGGGTGSGFTSLLMERLSVDYGKKSKLEFAIYPAPQVSTAVVEPYNSILTTHTTLEHSDCAFMVDNEAIY

DICRRNLDIERPTYTNLNRLIGQIVSSITASLRFDGALNVDLTEFQTNLVPYPRIHFPLVTYAPVISAEK

AYHEQLSVAEITNACFEPANQMVKCDPRHGKYMACCMLYRGDVVPKDVNAAIATIKTKRTIQFVDWCPTG

FKVGINYQPPTVVPGGDLAKVQRAVCMLSNTTAIAEAWARLDHKFDLMYAKRAFVHWYVGEGMEEGEFSE

AREDLAALEKDYEEVGMDSGDGEGEGAEEY

the sequence is too small


the sequence is a protein and FASTA files are only for nucleotides
the “#” symbol is missing from the first line

the “>” symbol is missing from the first line

CORRECT: The “>” symbol is needed to indicate the start of a
FASTA sequence. It is the first character in the header of the
sequence.

You may find 1.16 EMBL, Genbank and FASTA file comparison
- investigate and discuss useful.
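
A quick way to catch this kind of problem before downstream analysis is to check the header line programmatically. The sketch below is a rough check only, not a full FASTA validator; the function name `looks_like_fasta` is an assumption, and the example uses just the first sequence line of the entry above.

```python
def looks_like_fasta(text: str) -> bool:
    """Rough check: the first non-empty line must start with '>' and
    must be followed by at least one sequence line."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return (
        len(lines) >= 2
        and lines[0].startswith(">")
        and not lines[1].startswith(">")
    )


broken = (
    "NP_476772.1 alpha-Tubulin at 84B [Drosophila melanogaster]\n"
    "MRECISIHVGQAGVQIGNACWELYCLEHGIQPDGQMPSDKTVGGGDDSFNTFFSETGAGKHVPRAVFVDL\n"
)
fixed = ">" + broken  # prepend the missing header symbol

print(looks_like_fasta(broken))  # False
print(looks_like_fasta(fixed))   # True
```
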
Here is a short nucleotide sequence:

ATCGTGATCG

Which option represents the reverse complement of the sequence shown above?

GCTAGTGCTA
CGATCACGAT
TAGCACTAGC
AUCGUGAUCG

CORRECT: This is the reverse complement

You may find 1.7 Grammatical rules for DNA sequence representation useful.

When comparing the GenBank and EMBL sequence data formats, the
“COMMENT” field in the GenBank format corresponds to which field
in the EMBL format?

XX
OC

CC

None of the above

CORRECT: CC corresponds to the Comment section

You may find 1.13 The GenBank file format useful.

One of your colleagues wants to investigate the beta-galactosidase
enzyme in Clostridium difficile. Your colleague needs to retrieve the
DNA sequence. They only have access to the protein sequence of an E.
coli beta-galactosidase. You recommend that they:
Use the E. coli beta-galactosidase protein sequence to search Google.
Type in “beta-galactosidase clostridium” in the search box in Uniprot
Use the E. coli beta-galactosidase protein sequence as a query in BLASTp
against nonredundant protein database
Use the E. coli beta-galactosidase protein sequence as a query in
tBLASTn against a defined database restricted to Clostridium species.

CORRECT: A tBLASTn search uses a protein query (which is the
sequence that we have available) and searches a translated
nucleotide database. Although the subject will be shown as an
amino-acid sequence, the DNA sequence will be readily
accessible from the BLAST results.

You may find 2.3 BLAST, a tool for homology annotation useful.

When inspecting results from a BLASTp search, the best match to a
given query tends to be:

The subject with the highest score


The subject with the highest E-value
The subject with similar length as the query
All of the above

You have used the entry for E. coli BamE protein from Uniprot as a
resource and you need to add a citation. Which one is the most
appropriate:

www.uniprot.org
The UniProt Consortium, UniProt: the universal protein
knowledgebase, Nucleic Acids Res. 45: D158-D169 (2017)
http://www.uniprot.org/uniprot/P0A937
You do not need to add a citation

CORRECT: This is a publication featuring the development and
scope of Uniprot. What is more, this is Uniprot’s recommended
citation.
You may find 2.13 A note on Versions and Citation useful.
