
9/18/2018

DNA SEQUENCING

REVIEW
• DNA EXTRACTION
• POLYMERASE CHAIN REACTION
• AGAROSE GEL ELECTROPHORESIS

SEQUENCING
• The process of determining the precise order of nucleotides within a DNA molecule.
• The DNA sequencing method was developed by Frederick Sanger and colleagues in 1977.
• Known as the "dideoxy" or chain-termination method.

Frederick Sanger
• (born 13 August 1918, died 19 November 2013) was a British biochemist who twice received the Nobel Prize in Chemistry.
• He lost his religious faith and described himself as an agnostic. In an interview published in the Times newspaper in 2000, Sanger is quoted as saying: "My father was a committed Quaker and I was brought up as a Quaker, and for them truth is very important. I drifted away from those beliefs - one is obviously looking for truth but one needs some evidence for it. Even if I wanted to believe in God I would find it very difficult. I would need to see proof."


CHROMATOGRAM

CHROMATOGRAM COLORS
Black   G
Green   A
Red     T
Blue    C
Pink    N (unknown)

SEQUENCING
• Where can I do sequencing?
MACROGEN: http://www.macrogen.com/eng/
1st BASE: http://www.base-asia.com/

SEQUENCES ARE RETURNED FROM THE LAB AS FASTA FILES
• What is a FASTA file?
The FASTA file is a simple text file returned either directly from the sequencing lab or from a program like FinchTV or Phred. A FASTA file represents the change from a chromatogram to an actual list of bases indicated by the individual peaks in the chromatogram. It consists of a greater-than sign (>) followed by a unique ID for each sequence, then followed by the corresponding nucleotide bases.
Note: There cannot be a space between the > and the ID. You will run into problems with your FASTA if there is a space here.

Sample FASTA file
>559036A1_A01_420799_A1_015 TIME: Fri Mar 4 09:42:15 2005
gggggcgcggtctagcatgcaagtcgagcggcagcggtagcaatat
cgcctagancggcggactggtgagtaacacgtgggrratctaccttt
gggatggggatagcctgtggaaacacaggataataccgaatacgtt
gactggattacggtccagccaggaaaggtccgtttggaccgcccga
agatgagcccgcggctgattagctagttxgcggggtaatggcccacc
aagycgatgatcagtaggcgnnntgagagggtgtacgcccacattg
ggactgagatacggcccaa
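The FASTA layout described above (a ">" line carrying the ID, then the bases) can be read with a few lines of Python. This is only a minimal sketch, not the parser used by FinchTV, Phred, or any sequencing lab. The function name parse_fasta is made up for illustration. The choice to reject a space after ">" is an assumption that mirrors the note above.

```python
from typing import Iterator, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (sequence_id, description, sequence) for each FASTA record."""
    seq_id, description, chunks = None, "", []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if seq_id is not None:
                yield seq_id, description, "".join(chunks)
            header = line[1:]
            if not header or header[0].isspace():
                raise ValueError("no space is allowed between '>' and the ID")
            parts = header.split(None, 1)
            seq_id = parts[0]
            description = parts[1] if len(parts) > 1 else ""
            chunks = []
        elif seq_id is None:
            raise ValueError("sequence data found before any '>' header")
        else:
            chunks.append(line)  # sequence lines are simply concatenated
    if seq_id is not None:
        yield seq_id, description, "".join(chunks)


# Shortened version of the sample file: header plus its first and last lines.
sample = """>559036A1_A01_420799_A1_015 TIME: Fri Mar 4 09:42:15 2005
gggggcgcggtctagcatgcaagtcgagcggcagcggtagcaatat
ggactgagatacggcccaa
"""
records = list(parse_fasta(sample))
for seq_id, description, sequence in records:
    print(seq_id, len(sequence))
```

A header written as "> 559036A1..." (with a space) raises a ValueError here, which is one way to catch the problem early.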


SEQUENCING
• What software can you use to clean the sequence (sequence alignment editors)?
1. Chromas Lite
2. BioEdit
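Before opening a read in an editor, it can help to see how many base calls are not a plain A, C, G, or T. The sketch below only counts those characters. It does not reproduce what Chromas Lite or BioEdit do, and the function name ambiguity_report is hypothetical. The example read is the second line of the sample FASTA file.

```python
from collections import Counter


def ambiguity_report(sequence: str) -> Counter:
    """Count every character that is not an unambiguous base call (A, C, G, T)."""
    return Counter(base for base in sequence.upper() if base not in "ACGT")


# Second line of the sample FASTA file shown earlier in the slides.
read = "cgcctagancggcggactggtgagtaacacgtgggrratctaccttt"
report = ambiguity_report(read)
fraction = sum(report.values()) / len(read)
print(dict(report), f"{fraction:.1%} of calls are ambiguous")
```

Positions flagged this way still need to be checked against the chromatogram peaks by eye.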

REVIEW
• DNA EXTRACTION
• POLYMERASE CHAIN REACTION
• AGAROSE GEL ELECTROPHORESIS
• SEQUENCING

MOLECULAR BIOINFORMATICS

Bioinformatics in the study of Molecular Biology
• Molecular data are used to analyze different aspects of an organism.
• These data can be used for research on drug design and the identification of individual species.
• The data gathered can be used to establish databases for future reference and for comparison with newly acquired data.
• These databases are available over the Internet.

BIOINFORMATICS
Store, Manipulate, Analyze
1. Nucleic Acid Databases
- Fully automated gene sequences
- Whole genomes, clones, probes, ESTs (expressed sequence tags), STSs, RNAs
- Genomes by species
- Human/mouse homology
2. Protein Databases
- Polypeptide chain sequences
- Protein patterns and motifs
- Protein 3D structure coordinates
3. Information Databases
- Genetic diseases and known genes
- Taxonomy
- Phylogeny


Bioinformatics: Online and Software-Based Applications

What are the available resources for Molecular Bioinformatics?
• consolidated data from different molecular procedures
• online protocols and tools for performing different molecular procedures
• guided, easy-to-follow instructions for beginners and advanced learners
• an updated list of trending and newly developed tools for molecular analysis

Software applications and databases
• www.ncbi.nlm.nih.gov
• www.geneinfinity.org
• molbiol-tools.ca
• www.protocol-online.org/prot/Molecular_Biology.index.html
• http://biologylabs.utah.edu/jorgensen/wayned/ape/

Online Database: www.ncbi.nlm.nih.gov
NATIONAL CENTER FOR BIOTECHNOLOGY INFORMATION

National Center for Biotechnology Information (NCBI)
• It was established in response to developments in science and technology.
• Its main aim is to establish databases for the biomedical and research data involved in computational molecular biology.
• It was established under the National Institutes of Health (NIH), which has the largest biomedical research facility in the world.


NCBI Active Databases
• GenBank - An annotated collection of all publicly available nucleotide and amino acid sequences.
• EST database - A collection of expressed sequence tags, or short, single-pass sequence reads from mRNA (cDNA).
• GSS database - A database of genome survey sequences, or short, single-pass genomic sequences.
• HomoloGene - A gene homology tool that compares nucleotide sequences between pairs of organisms in order to identify putative orthologs.

NCBI Active Databases
• HTG database - A collection of high-throughput genome sequences from large-scale genome sequencing centers, including unfinished and finished sequences.
• SNP database - A central repository for both single-base nucleotide substitutions and short deletion and insertion polymorphisms.
• RefSeq - A database of non-redundant reference sequence standards, including genomic DNA contigs, mRNAs, and proteins for known genes. Multiple collaborations, both within NCBI and with external groups, support NCBI's data-gathering efforts.

NCBI Active Databases
• STS database - A database of sequence tagged sites, or short sequences that are operationally unique in the genome.
• UniSTS - A unified, non-redundant view of sequence tagged sites (STSs).
• UniGene - A collection of ESTs and full-length mRNA sequences organized into clusters, each representing a unique known or putative human gene annotated with mapping and expression information and cross-references to other sources.

Basic Local Alignment Search Tool (BLAST)

BLAST
• Finds regions of local similarity between sequences.
• The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches.
• BLAST can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families.
• Utilizes the GenBank database system.

NCBI
• Systems biology comes under this category, including reaction fluxes and variable concentrations of metabolites.
• Multi-agent-based modelling approaches capture cellular events such as signaling, transcription and reaction dynamics.

http://www.ncbi.nlm.nih.gov/
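To make "regions of local similarity" concrete, the sketch below aligns two short sequences with the Smith-Waterman dynamic programming algorithm. This is not how BLAST works internally. BLAST uses a faster heuristic search, and Smith-Waterman is swapped in here only because it is the simplest exact way to show a local alignment. The scoring values (match 2, mismatch -1, gap -2) and the two toy sequences are arbitrary assumptions.

```python
from typing import Tuple


def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1,
                   gap: int = -2) -> Tuple[int, str, str]:
    """Return (best local score, aligned part of a, aligned part of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, hence the floor at 0.
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best-scoring cell until the score drops to 0.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


result = smith_waterman("ttttgattacagggg", "ccgattacacc")
print(result)  # (14, 'gattaca', 'gattaca')
```

Only the shared stretch "gattaca" is reported. The unrelated flanking bases are ignored, which is the sense of "local" in the BLAST name.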


Basic Local Alignment Search Tool
Go to the BLAST website: http://blast.ncbi.nlm.nih.gov/

WHAT YOU HAVE
EACH OF YOU HAS A FILE OF SAMPLE SEQUENCES FOR THIS ACTIVITY.
PLEASE OPEN THE FILE AND SELECT ONE.

COPY AND PASTE THE SEQUENCE FROM YOUR NOTEPAD INTO THE "ENTER QUERY SEQUENCE" BOX.


IN THE "CHOOSE SEARCH SET" SECTION, SELECT "NUCLEOTIDE COLLECTION".

CLICK BLAST.

BLAST REPORT PAGE
• The report page consists of three major sections
– Graphical overview
– Scores and Statistics
– The alignment for each database sequence match

BLAST GRAPHICAL OVERVIEW
COLOR KEY FOR ALIGNMENTS
• Red: most related (similar bases)
• Green, Pink and Blue: moderately related
• Black: unrelated
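The color key above can be written as a small lookup. The score cut-offs used here (40, 50, 80, 200) are the ranges shown in the legend of the classic NCBI graphical overview. They are not stated in these slides, so treat them as an assumption and check the legend on your own report page. The function name color_for_score is hypothetical.

```python
def color_for_score(score: float) -> str:
    """Map an alignment score to the color band of the graphical overview.

    Cut-offs are assumed from the classic NCBI legend, not from the slides.
    """
    if score >= 200:
        return "red"      # most related
    if score >= 80:
        return "pink"     # moderately related
    if score >= 50:
        return "green"    # moderately related
    if score >= 40:
        return "blue"     # moderately related
    return "black"        # lowest-scoring band


for s in (250, 100, 60, 45, 10):
    print(s, color_for_score(s))
```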


BLAST SCORES AND STATISTICS
• MAX SCORE OR SCORE
– Measures sequence similarity
– The higher the bit score, the more significant the match
– The list of hits is arranged from highest to lowest score
BLAST REPORT PAGE
[Figure: report page showing the HEADER and STATISTICS sections]

BLAST SCORES AND STATISTICS
• E-VALUE (EXPECTATION VALUE)
– Represents the number of alignments you would expect to find by chance with a score at least as good as that of the alignment you are looking at.
– The lower the E-value, or the closer it is to "0", the more "significant" the match is.
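The link between the bit score and the E-value can be sketched with the standard relation E = m × n × 2^(-S'), where S' is the bit score, m the query length and n the total database length. The numbers below are made up for illustration. BLAST itself uses "effective" lengths, so the values it reports will differ somewhat from this rough estimate.

```python
def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E-value: chance hits expected at or above this bit score."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical search: a 500-base query against a database of one billion bases.
for bits in (30, 50, 100):
    print(bits, expected_hits(bits, 500, 1_000_000_000))
```

Each additional bit halves the E-value, and a larger database raises it for the same score.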

WHAT IS GENBANK?
• It is a genetic sequence database, an
annotated collection of all publicly available
DNA sequences.
• It is designed to provide and encourage access within the scientific community to the most up-to-date and comprehensive DNA sequence information (a "LIBRARY").


GO TO GENBANK WEBPAGE

TYPE IN THE SEARCH BAR: “Methylobacterium radiotolerans”


Search the following in GENBANK:


1. Methylobacterium radiotolerans
2. mecA gene
3. Staphylococcus aureus
4. AF101418
5. X79286
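The searches listed above are meant to be typed into the GenBank web page. As an optional alternative that is not part of this activity, the same queries can be expressed as NCBI E-utilities URLs. The sketch below only builds the URLs and does not contact NCBI. The helper names and the retmax value are hypothetical, and "nuccore" is assumed to be the database wanted for nucleotide records.

```python
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term: str, db: str = "nuccore", retmax: int = 5) -> str:
    """URL that searches a database for a free-text term (organism, gene)."""
    return f"{EUTILS}/esearch.fcgi?" + urlencode(
        {"db": db, "term": term, "retmax": retmax})


def efetch_fasta_url(accession: str, db: str = "nuccore") -> str:
    """URL that returns one record, by accession number, in FASTA format."""
    return f"{EUTILS}/efetch.fcgi?" + urlencode(
        {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"})


for term in ("Methylobacterium radiotolerans", "mecA gene", "Staphylococcus aureus"):
    print(esearch_url(term))
for accession in ("AF101418", "X79286"):
    print(efetch_fasta_url(accession))
```

Pasting one of the printed efetch URLs into a browser should return the record as a FASTA file.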
