CS262 Lecture 1 Notes
Computational Genomics / Biology for CS262 / Sequence Alignment
Scribed by Hong T. Lam
January 6, 2004
The Goal of Genomics
Study Organisms at the DNA Level
• Read a complete genome, such as human DNA. (DNA Sequencing & Assembly)
• Identify parts, such as the genes encoded by the DNA sequence. (Gene Finding)
• Figure out the connections between parts, such as how genes interact with each other.
• Gene Expression: The process by which the genetic code is converted into structures present and functioning in the cell. Expressed genes are transcribed into different types of RNA, of which mRNA is the only type that is translated into proteins. Gene expression provides information about how a gene functions and how it differs from other genes. DNA microarrays can be used to compare gene expression in different populations of cells, since different cells have different gene expression patterns and levels. (Microarrays & Regulation)

Study Evolution at the DNA Level
• Compare whole genomes from multiple organisms. (Large-Scale Comparative Genomics)
• Quantify the evolution of biological sequences. (Phylogeny & Evolution)
• Uncover the evolutionary tree.
The Role of CS in Biology

Computer science plays an essential role in biology. Biology is becoming an information science: the shift to high-throughput technologies has led to an explosion of genomic data, and analyzing that data requires computational methods.

Basic Computational Methods for Analysis of Biological Sequences
• Sequence Alignment Algorithms
• Dynamic Programming
• Hidden Markov Models
Hong T. Lam
Page 1
1/10/2004
Genomics Applications Using Basic Computational Methods
• DNA Sequencing: The process of determining the exact order of the long string of bases (A, T, C, G) that makes up the DNA of an organism. The genomes of several organisms, including human, have been completely sequenced.
• Comparison of DNA and proteins across organisms
• Discovery of genes, promoters, and regulatory sites
Paradigms in Biology
There are two paradigms in biology.
• Molecular Paradigm (Central Dogma)
DNA → RNA → polypeptide

DNA is transcribed into RNA (rRNA, tRNA, snRNA, mRNA) through a process known as transcription. mRNA is then translated into polypeptides through a process called translation, and the polypeptides fold into 3-D protein structures. An organism consists of many different types of proteins.

• Evolution Paradigm: All organisms originate from a common ancestor and are connected by an evolutionary tree.
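
As a minimal sketch of the molecular paradigm above (DNA → RNA → polypeptide), the following Python models transcription as replacing T with U on the coding strand, and translation as reading the mRNA three bases at a time. The three-base codon and the standard genetic code table (NCBI translation table 1) are assumptions brought in from outside this part of the notes. The function names and the example sequence are hypothetical, and folding into a 3-D structure is not modeled.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), listed for DNA codons
# with bases ordered T, C, A, G at each position; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each RNA codon (U in place of T) to a one-letter amino acid code.
CODON_TABLE = {
    "".join(codon).replace("T", "U"): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA matching the DNA coding strand (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from position 0 and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


dna = "ATGGCCTGGTAA"           # hypothetical example sequence
mrna = transcribe(dna)
print(mrna, translate(mrna))   # AUGGCCUGGUAA MAW
```

The sketch always starts reading at position 0; a real gene finder would first have to locate the start codon and the correct reading frame.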
Basic Biology for CS262
Structures of Biomolecules
• The cell is composed of DNA in the nucleus and proteins in the cytoplasm, all of which are enclosed in a lipid membrane.
• The nucleic acids (DNA and RNA) form the genetic material of all living organisms. In eukaryotic cells, DNA is found mainly in the nucleus, while RNA is also abundant in the cytoplasm.
• A nucleotide has three components.
  ◦ Sugar (ribose in RNA, deoxyribose in DNA)
  ◦ Phosphoric acid (phosphate group)
  ◦ Nitrogen base
    ▪ Adenine (A)
    ▪ Guanine (G)
    ▪ Cytosine (C)
    ▪ Thymine (T) in DNA, or Uracil (U) in RNA

Two nucleotides are linked together by a bond between the phosphate group on the 5' carbon atom of one nucleotide and the 3' carbon atom of the sugar of the other nucleotide.
• Nucleic acids are linear, unbranched polymers of nucleotides. While RNA is single-stranded, DNA consists of two strands that run antiparallel, that is, in opposite directions to each other. The strands are joined together by pairing of the nitrogenous bases (Watson & Crick base pairs). By convention, DNA and RNA sequences are read from the 5' end to the 3' end. The labels 5' and 3' refer to the numbering of the carbon atoms on the sugar ring.

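[Figure: base pairing in DNA and RNA. In DNA, A pairs with T (A=T) and G pairs with C (G=C). In RNA, T is replaced by U (T → U), so the DNA sequence ACTG corresponds to the RNA sequence ACUG.]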
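
The pairing rules and the antiparallel orientation described above can be sketched in a few lines of Python. This is only an illustration: the function name and the example strand are hypothetical, and the code assumes the input contains only the letters A, C, G and T.

```python
# Watson-Crick pairs for DNA: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, also written from its 5' end to its 3' end.

    Because the two strands are antiparallel, the partner is the
    complement of the input read in reverse order.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


top = "ACTG"                      # hypothetical example strand, 5' to 3'
bottom = reverse_complement(top)
print(top, bottom)                # ACTG CAGT
```

Applying the function twice returns the original strand, which is a quick sanity check that the pairing table is symmetric.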
• Three nucleotides of an mRNA strand form a codon that specifies one amino acid. This makes sense because a codon made from only one or two nucleotides would not produce enough combinations (codons) to code for all 20 standard amino acids.

  ◦ 1 nucleotide: 4 possible codons
  ◦ 2 nucleotides: 4 × 4 = 16 possible codons
  ◦ 3 nucleotides: 4 × 4 × 4 = 64 possible codons for 20 amino acids

Since a three-nucleotide codon produces 64 possible combinations and there are only 20 standard amino acids, the genetic code must be redundant (degenerate): several different codons specify the same amino acid. The parsimony principle, that the simplest solution is often right, argues against a four-nucleotide codon.
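
The counting argument above is easy to check numerically. The short Python sketch below uses hypothetical helper names, and assumes four bases and 20 amino acids as stated in the notes. It computes the number of distinct codons for each codon length and finds the shortest length that can cover all amino acids.

```python
NUM_BASES = 4          # A, C, G, U
NUM_AMINO_ACIDS = 20   # amino acids to be encoded


def codon_count(length: int) -> int:
    """Number of distinct codons of the given length."""
    return NUM_BASES ** length


def min_codon_length(targets: int) -> int:
    """Smallest codon length with at least `targets` distinct codons."""
    length = 1
    while codon_count(length) < targets:
        length += 1
    return length


for n in range(1, 5):
    # Prints 4, 16, 64, 256 and whether each count covers 20 amino acids.
    print(n, codon_count(n), codon_count(n) >= NUM_AMINO_ACIDS)

print(min_codon_length(NUM_AMINO_ACIDS))  # 3
```

A length of 4 would also cover 20 amino acids, but with 256 codons it is far more than needed, which is the parsimony point made in the text.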

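• Two amino acids form a dipeptide. The carboxyl group of one amino acid joins the amino group of the other through a peptide bond, releasing a molecule of water.

Two amino acids:

     R                  R
     |                  |
H2N--C--COOH   +   H2N--C--COOH
     |                  |
     H                  H

Dipeptide:

     R  O      R
     |  ||     |
H2N--C--C--NH--C--COOH
     |         |
     H         H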