1. 6
2. 5
3. 8582 - 8253 - 1 = 328
4 . 1328 -
1 +
1) +
(8689 - 8582 + 1) + (26872 - 26812 + 1) + (28696 - 28589 + 1) + (30617 - 30581 + 1)
= 802
↓
1
5. 36166 - 35306 = 860
6. (31086 - 30618 + 1) + (35306 - 34510 + 1) = 1266
1266 / 3 for amino acid codons = 422
422 - 1 for stop = 421
7. Look vertically; names should ideally match up, as long as the choices match up with one sequence of the question's seq.
Transitions: purine <-> purine (A <-> G), pyrimidine <-> pyrimidine (C <-> T)
Transversions (4): purine <-> pyrimidine
CG: 3 hydrogen bonds
AT: 2 hydrogen bonds
Paralogs: result of duplication (e.g. duplication before speciation)
8. Orthologs
9. Inparalogs
Speciation
mi
10
ens
·
M 3) to ortholog
black widow -> co-orthologs (A and B in diff. species)
11. 2
12. I
13. C, D, E
14. B
71 %
JaveragSee
100
:
:
15 : irens + Jerome
enetwanepe :
54 !
16 .
L
17 .
(
18 .
E
!
21. B
22. C + B
Homo sapiens + Chordata + deuterostomes
23 .
pyr
(7:
AG pur
o 00 . 0 ..
:
or ......
.
18(5) 90
:
M
:
: 2
Transversions 21 D
:
-
-
0 A 92
24 6 .
25 .
Edge I
Go from
longest
:
0 3 + 0 24
.
.
0 2+0
. .
2
=
0 .
45
will end at
up
edge z .
sign te
Template
ddC
BIO 312 - Bioinformatics
Final Exam Study Guide
Casey Ngai
Updated Dec 1, 2021
Given a GenBank-formatted gene
annotation of gDNA, calculate from the
provided coordinates:
- The number of introns and exons
- The size of the 5' and 3' UTRs
- The length of the CDS
- The length of the encoded protein
(added by Dr. Rest, 12/9)
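The four calculations listed above can be sketched in a few lines. This is a minimal illustration, not the course's official method: it assumes a plus-strand gene, 1-based inclusive coordinates (as in GenBank `join(...)` features), and segments sorted by position. The function names and the coordinates are hypothetical, invented only for the example.

```python
def seg_len(start, end):
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def summarize_gene(mrna_exons, cds_segments):
    """Summarize a plus-strand gene from sorted (start, end) pairs."""
    cds_start = cds_segments[0][0]
    cds_end = cds_segments[-1][1]
    # 5' UTR: exon bases that lie before the first CDS base.
    utr5 = sum(seg_len(s, min(e, cds_start - 1))
               for s, e in mrna_exons if s < cds_start)
    # 3' UTR: exon bases that lie after the last CDS base.
    utr3 = sum(seg_len(max(s, cds_end + 1), e)
               for s, e in mrna_exons if e > cds_end)
    cds_len = sum(seg_len(s, e) for s, e in cds_segments)
    return {
        "exons": len(mrna_exons),
        "introns": len(mrna_exons) - 1,
        "utr5": utr5,
        "utr3": utr3,
        "cds_len": cds_len,
        # The stop codon is part of the CDS but encodes no amino acid.
        "protein_len": cds_len // 3 - 1,
    }


# Hypothetical coordinates, invented for illustration only.
mrna = [(101, 200), (301, 450), (601, 900)]
cds = [(151, 200), (301, 450), (601, 700)]
summary = summarize_gene(mrna, cds)
print(summary)
```

An intron length follows the same inclusive-coordinate logic: next exon start minus previous exon end minus 1.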
Possible Exam Question
-
Explicitly Stated for Exam
- Solving these problems:
Iz
mRNA
co
↑
-'AAGI
-
-
d
A
-
A
G
CTGICL
3' ATGACCGTA 5' mRNA
5' TACTGGCAT 3' template
Explicitly Stated Question for Exam
-
& I 90
Sum: 150
150 / 2 = 75
⑧
ne
*
*
&
75
-
E
q ↑
81
E
-
&
82
S
--
Explicitly Stated
- Know how to solve:
  - Needleman-Wunsch
  - Nussinov-Jacobson
Needleman-Wunsch
⑧ &
0P
⑪
-
&
O ·
00
segl+ 2 -
CTACT-
TTCT
-
&
8
Explicitly Stated for Exam
- How to solve:
“May ask on the exam”
G and U can bind.
Nussinov-Jacobson
-D
-
....
I
-
- i
0
-
7 I
- W ↑
G- Cl
E
CGAACA
12 =
1907 6 -
d
S
-A
· -
#
Explicitly Stated
I
**** a
264/8642
see
seneAYy.AM
134 516
1 2 ㄧ
789
ㄍ
1011
Ns
(odous(
76 G strand
6 coding
mRNA.MG/tAA*Targetci GGtt-3itmplatGn71''tee1l5t
12
Anti-codous TACCC.TN
CDS 1 non
-
1
cuding
※
I
醐 3
-
3
m
AUGTHGAU
1 codon -> 3 nucleotides.
↓
pn.in
RNA Primerase
ribosome
AT: 2 hydrogen bonds
CG: 3 hydrogen bonds
Wobble: G to U
· 20,000 protein-coding genes in human genome.
· 20 amino acids
· 3 stop codons
Central Dogma
Stop codons: TAA, TAG, TGA (DNA); UAA, UAG, UGA (RNA)
Coding sequence (CDS): open reading frame
In a random set of codons, a stop codon appears about every 64/3 = 21 codons.
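Because the three stop codons end a coding sequence, an open reading frame can be located by starting at an ATG and reading codon by codon until a stop appears. The sketch below assumes a DNA string on the coding strand; `first_orf` and the example sequence are hypothetical names and data chosen for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(dna):
    """Return the first ATG...stop open reading frame found, or None."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Read in steps of 3 from the start codon, staying in frame.
        for i in range(start, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        # No in-frame stop: try the next ATG.
        start = dna.find("ATG", start + 1)
    return None


example = "CCATGACCGTATAAGG"  # hypothetical sequence
orf = first_orf(example)
print(orf)
print(len(orf) // 3 - 1)  # amino acids encoded, stop codon excluded
```

Since 3 of the 64 possible codons are stops, a random reading frame is expected to hit one roughly every 64/3 codons, which is why long stop-free stretches suggest a real CDS.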
Intron: look at mRNA
5' UTR: from mRNA to CDS
3' UTR:
cp: copy
wc: # of words / # of lines
grep: FASTA
Bedtools: Swiss army knife for genomic analysis
Genome tools: similar to Bedtools
EMBOSS: sequence alignment
Homologs: gene copies that share a common ancestor
· Paralogs: result from duplication
Xenologs:
e
use BLAST to find
· nucleotide blast: search nucleotide database w/ nucleotide query
  - blastn, megablast, discont. megablast
· blastx: search protein database w/ translated nucleotide query.
· A blast h
:
BLOSUM :
HSPs: high-scoring segment pairs (alignment between query and subject).
· Total Score: sum of HSPs.
· Query coverage:
Max identity: highest % identity, aligned seq to same subject seq.
* E-value: the lower the E-value, the better the alignment
Identity: residues identical
↳ simplest way of scoring an alignment
Rule: almost always insert gaps in groups of 3 in protein-coding nucleotide sequences.
Transitions: purine <-> purine (A <-> G), pyrimidine <-> pyrimidine (C <-> T)
Transversions (4): purine <-> pyrimidine
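The transition/transversion distinction can be checked mechanically for each aligned column. The sketch below is an illustration only: it assumes DNA sequences containing just A, C, G, T (plus "-" for gaps, which are skipped), and the function names and example sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(a, b):
    """Classify one aligned pair of bases."""
    a, b = a.upper(), b.upper()
    if a == b:
        return "identical"
    # Same chemical class on both sides means a transition.
    same_class = ({a, b} <= PURINES) or ({a, b} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


def count_substitutions(seq1, seq2):
    """Count transitions and transversions between two aligned sequences."""
    counts = {"transition": 0, "transversion": 0}
    for a, b in zip(seq1, seq2):
        if "-" in (a, b):
            continue  # gap columns are not substitutions
        kind = classify(a, b)
        if kind in counts:
            counts[kind] += 1
    return counts


counts = count_substitutions("AGCT", "GGAT")  # hypothetical sequences
print(counts)
```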
Alignment
alignment
PAM: % Accepted Mutations
-> PAM-1: 1 accepted mutation per 100 residues
-> PAM-200:
BLOSUM: BLOCKS Substitution Matrix
-> 30% identity groups: distantly related
-> 100% identity groups: closely related seq. are emphasized
Newick Format example:
((Lizard, Frog), (Human, Dog));
ASCII formatted graphical
tree example
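To see how a Newick string maps onto a text drawing, here is a minimal sketch that parses the nesting of the example above and prints an indented tree. It is an illustration under stated assumptions: it handles taxon names and parentheses only (no branch lengths, no internal node labels), and `parse_newick` and `ascii_tree` are hypothetical helper names, not functions from any phylogenetics package.

```python
def parse_newick(text):
    """Parse a simple Newick string (names only) into nested lists."""
    text = text.strip().rstrip(";").replace(" ", "")
    pos = 0

    def node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1                      # skip "("
            children = [node()]
            while text[pos] == ",":
                pos += 1                  # skip ","
                children.append(node())
            pos += 1                      # skip ")"
            return children
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        return text[start:pos]            # a leaf name

    return node()


def ascii_tree(tree, indent=""):
    """Return the lines of an indented text drawing of the parsed tree."""
    if isinstance(tree, str):
        return [indent + "+-- " + tree]
    lines = [indent + "+--"]
    for child in tree:
        lines.extend(ascii_tree(child, indent + "|   "))
    return lines


tree = parse_newick("((Lizard, Frog), (Human, Dog));")
print("\n".join(ascii_tree(tree)))
```

Each pair of parentheses becomes one internal node, so Lizard and Frog share a parent, as do Human and Dog.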
With replacement: some columns (in an alignment) can be repeated.
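Sampling alignment columns with replacement, so that some columns repeat and others drop out, can be sketched as follows. Reading this note as the resampling step of a phylogenetic bootstrap is an assumption; the function name, the toy alignment, and the fixed random seed are all hypothetical choices made for the example.

```python
import random


def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement, keeping the width."""
    width = len(alignment[0])
    # Each pick is independent, so the same column can be chosen twice.
    picks = [rng.randrange(width) for _ in range(width)]
    replicate = ["".join(row[i] for i in picks) for row in alignment]
    return replicate, picks


alignment = ["ACGT", "ACGA", "TCGA"]  # hypothetical aligned sequences
replicate, picks = bootstrap_columns(alignment, random.Random(0))
print(picks)
print(replicate)
```

Every row is rebuilt from the same column picks, so each resampled column still lines up homologous positions across sequences.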
Family: group of proteins that share a domain sequence/structure
· Domains are classified by: related by common ancestry (homology).
Sanger sequencing:
Microarray: visualized as heatmap
Disadvantages
Needleman-Wunsch: performs global alignment on two sequences.
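A global alignment of two sequences can be computed by filling a score matrix and tracing back from the bottom-right corner. The sketch below is one common textbook formulation, not necessarily the scoring used on the exam: the values match = +1, mismatch = -1, gap = -1 are assumptions, and when several alignments tie, the traceback simply prefers the diagonal move.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align a[i-1], b[j-1]
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a
    # Traceback from the bottom-right corner to the top-left corner.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


best, top, bottom = needleman_wunsch("CTACT", "TTCT")  # example sequences
print(best)
print(top)
print(bottom)
```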
Alignment
↳irwe
Hamming Distance
D = n / N
n = # of sites with differences
N = # of nucleotides
t
differences
for distance
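The distance calculation is a count of mismatched sites divided by the alignment length. A minimal sketch, assuming two already-aligned sequences of equal length; the function name and the example sequences are hypothetical.

```python
def hamming_distance(seq1, seq2):
    """Return (n, D): number of differing sites n and proportion D = n / N."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n = sum(1 for a, b in zip(seq1, seq2) if a != b)
    return n, n / len(seq1)


n_diff, dist = hamming_distance("ACGTACGT", "ACGAACGA")  # hypothetical sequences
print(n_diff, dist)
```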
RNA Folding -> allowed pairs: GC, AU, GU
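With these three allowed pairings, a Nussinov-style dynamic program finds the maximum number of nested base pairs in an RNA sequence. The sketch below is one common formulation, offered as an illustration rather than the exact exam procedure: it only counts pairs (no traceback), and the `min_loop` parameter (default 0, meaning adjacent bases may pair) is an assumption that can be raised to force a minimum hairpin loop size.

```python
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def nussinov(rna, min_loop=0):
    """Return the maximum number of nested base pairs in rna."""
    n = len(rna)
    if n == 0:
        return 0
    # best[i][j] = max pairs in rna[i..j]; cells with i >= j stay 0.
    best = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            score = best[i + 1][j]                     # i left unpaired
            score = max(score, best[i][j - 1])         # j left unpaired
            if (rna[i], rna[j]) in PAIRS and span > min_loop:
                score = max(score, best[i + 1][j - 1] + 1)  # i pairs with j
            for k in range(i + 1, j):                  # split into two parts
                score = max(score, best[i][k] + best[k + 1][j])
            best[i][j] = score
    return best[0][n - 1]


print(nussinov("GGGAAACCC"))  # hypothetical hairpin-like sequence
```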