
Sequence alteration at the gene level: mutations and polymorphisms.

A.M.R. Taylor

Institute of Cancer & Genomic Sciences


Eukaryotic gene structure

1. Exons are coding sequences for proteins. Introns (between the exons) are transcribed into RNA but spliced out of the mRNA.

2. The number and size of exons and introns vary greatly between genes. Most of the DNA in genes is non-coding.

3. Exons are small, ~200 base pairs on average. Some exons correspond to a functional domain of a protein. E.g. in the globin gene (3 exons), exon 2 includes the heme-binding segment of the polypeptide.

4. The number of introns varies from 0 (histone genes) to more than 75 in the dystrophin gene (Duchenne muscular dystrophy).

5. Transcription starts at the transcription start site, upstream of the coding sequence in exon 1, and goes through to the last exon.

6. RNA processing enzymes trim the 3' end of the primary transcript, remove introns and splice together the exons to give mature mRNA, which codes for the protein.

Post-transcriptional processing

• poly-A 'tail' (5'-AAAAAA...AAAAAA-3') added to 3' end
• 7mG 'cap' (7-methylguanosine) added to 5' end
• splicing of heterogeneous nuclear RNA (hnRNA)
  - up to 90% of transcript is removed
  - exons are retained ("expressed")
  - introns are removed ("intervening")
  - 10-20 exons per 'gene'
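The processing steps above can be sketched on a toy transcript. This is only an illustration: the sequence, the exon coordinates and the tail length are made up, and the 5' cap is a chemical modification rather than a base, so it is not modelled here.

```python
def splice(pre_mrna, exons):
    """Join exon segments; exons are (start, end) pairs, 0-based, end-exclusive."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def add_poly_a(mrna, tail_length=20):
    """Append a poly-A tail to the 3' end (tail_length is an arbitrary choice)."""
    return mrna + "A" * tail_length


# Toy primary transcript: exon 1 + intron + exon 2 (made-up sequence).
exon1, intron, exon2 = "AUGGCC", "GUAAGUUUUCAG", "GAAUAA"
pre_mrna = exon1 + intron + exon2
exons = [(0, 6), (18, 24)]  # hypothetical exon coordinates

spliced = splice(pre_mrna, exons)
mature = add_poly_a(spliced, tail_length=10)
removed_fraction = 1 - len(spliced) / len(pre_mrna)

print(spliced)
print(mature)
print(f"{removed_fraction:.0%} of the transcript removed by splicing")
```

In this toy case half of the transcript is intron; in real genes the removed fraction can be much higher, as the slide notes.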
SIZE OF DNA

Number of base pairs in haploid human genome: 3 × 10^9

(Human mitochondrial DNA is 16 kb)

Average size of a gene: 20 kb

Coding DNA: <2% of genome

Non-coding introns: ~26% of genome

Other elements

Total number of genes: ~22,000


Mutation

How do mutations arise?

1. Copying errors during DNA replication

2. Spontaneous depurination

3. Exposure to background ionising radiation


TYPES OF MUTATION

The major consequences of mutations are restricted to the small fraction of the DNA that is coding DNA.

1. Silent mutations (synonymous)

No change in protein sequence/function.

2. Non-synonymous mutations

Alter the protein product.

Natural selection acts against deleterious mutations in coding DNA.
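The distinction between silent and non-synonymous changes can be sketched by translating the codon before and after a substitution. The function name and the "stop-loss" label are illustrative choices, not taken from the lecture; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_substitution(ref_codon, alt_codon):
    """Classify a codon change by its effect on the encoded amino acid."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "stop-loss"
    return "missense"


print(classify_substitution("CTA", "CTG"))  # same amino acid before and after
print(classify_substitution("GAG", "GTG"))  # amino acid changes
print(classify_substitution("TGG", "TGA"))  # codon becomes a stop codon
```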
Types of Mutation
1. MUTATIONS INVOLVING ONE OR A FEW NUCLEOTIDES

a. Missense mutation

Single base change results in a single amino acid change.
Usually will affect protein function.
May give partial or complete loss of activity.

b. Nonsense mutation

Occurs by single base substitution.

Normal codon replaced by termination codon (TAA, TAG, TGA) to give a truncated protein.

c. Frameshift mutation

Frameshifts occur by insertion or deletion of a number of bases that is not a multiple of three, e.g. one or two bases.

This shifts the reading frame and usually results in a termination codon (TAA, TAG, TGA) downstream of the insertion/deletion.
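A minimal sketch of how a single-base deletion shifts the reading frame and brings a stop codon into frame. The short coding sequence is invented for illustration and does not correspond to any real gene.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate from the first base until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete_base(seq, index):
    """Return seq with the base at a 0-based index removed."""
    return seq[:index] + seq[index + 1:]


reference = "ATGGCTGAATCGTTAGCCTGA"  # made-up coding sequence, 7 codons
mutant = delete_base(reference, 5)   # 1 bp deletion in the second codon

print(translate(reference))  # full-length product
print(translate(mutant))     # altered residues, then a premature stop
```

After the deletion every downstream codon is read in a new frame, so the amino acids change from the deletion onward and the product ends early.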
Examples of missense mutations

ATR exon 18 forward sequence
c.3477G>T base substitution alters amino acid: missense p.Met1159Ile

SETX 66268 forward sequence
c.713G>C (p.Trp238Ser)
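The link between the cDNA position (c.) and the protein position (p.) in these two examples can be checked with a little arithmetic. In this sketch the helper names are illustrative. The reference codons ATG and TGG are inferred from the fact that Met and Trp each have only one codon.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_of(cdna_pos):
    """Return (codon number, position within codon), both 1-based."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


def substitute(codon, pos_in_codon, alt_base):
    """Replace the base at a 1-based position within a codon."""
    return codon[:pos_in_codon - 1] + alt_base + codon[pos_in_codon:]


# ATR c.3477G>T: Met codon ATG, third base changed to T
codon_number, pos = codon_of(3477)
atr_mutant = substitute("ATG", pos, "T")
print(codon_number, pos, atr_mutant, CODON_TABLE[atr_mutant])

# SETX c.713G>C: Trp codon TGG, second base changed to C
codon_number2, pos2 = codon_of(713)
setx_mutant = substitute("TGG", pos2, "C")
print(codon_number2, pos2, setx_mutant, CODON_TABLE[setx_mutant])
```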
ATM gene mutation

[Figure: sequence traces from a normal control and patient A-T210.
Genomic DNA, exon 10 (reverse): c.1290_1291delTG mutation causing frameshift.
Genomic DNA, exon 34 / intron 34: c.5177+5G>A mutation causing exon 34 skip.
cDNA: exon 33 followed by exon 34 in the normal, and by exons 34 + 35 in A-T210.]
Consequences of mutation

[Figure: Western blot for ATM and aprataxin. Bands shown: ATM, aprataxin, actin. Lanes 1-14 include normal controls, a known A-T sample and samples labelled A to K.]
SICKLE CELL ANAEMIA

Single base change

GAG -> GTG

Glutamic acid -> valine

CYSTIC FIBROSIS

Deletion of a phenylalanine residue (ΔF508)

This accounts for the majority of mutations in Caucasians (freq. 0.68)
d. Transcriptional mutations

Mutation affects the promoter region of the gene.

e. Abolition of splice site

Can lead to insertion of intron sequence into mRNA, resulting in a non-functional protein.

f. Creation of a new splice site

May lead to insertion of intron sequence, resulting in truncation of the protein.
2. LARGER MUTATIONS

a. Loss of complete exons

Leads to loss of protein function

b. Large intron insertions

Leads to loss of protein function

c. Chromosomal mutations

Additional chromosomes, e.g. trisomy 21 (T21)

Chromosome deletions
[Figure: ATM 5762ins137. A 137 bp insertion between exon 40 and exon 41, shown for patients AT149-1, AT149-2 and AT40-3.]
Nucleotide changes in cancer: e.g. Ras oncogenes

• Ras oncogenes "activated" in 10-30% of human cancers
• Always involve point mutations at specific sites
• e.g. G->T transversion, typical of carcinogen-induced mutations in smokers or workers with occupational carcinogen exposure
• Result: deregulated activation of signalling pathway -> uncontrolled proliferation

H-Ras gene mutation in EJ bladder cancer

Codon                10   11   12   13   14   15
wt amino acid   ...  gly  ala  gly  gly  val  gly ...
wt DNA          ...  GGC  GCC  GGC  GGT  GTG  GGC ...
EJ bladder cancer              GTC
                               val

Ras G12V
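The term "transversion" used for the Ras codon 12 change can be made concrete with a short sketch: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, and any purine/pyrimidine swap is a transversion. The function name is an illustrative choice.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


# Ras codon 12 in EJ bladder cancer: GGC -> GTC, i.e. G -> T at the second base
print(substitution_class("G", "T"))
# For comparison, a purine-to-purine change
print(substitution_class("A", "G"))
```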
Colon Cancer: Morphological and Genetic changes

Morphological stages, in order:
Normal epithelium -> Hyperproliferative epithelium -> Adenoma class I (small adenoma, <1.0 cm) -> Adenoma class II (large adenoma, >1.0 cm) -> Adenoma class III (large adenoma + foci of carcinoma) -> CARCINOMA -> METASTASIS

Genetic changes accumulated along this sequence, in order:
Chromosome 5 gene alteration; Ras gene mutation; Chromosome 18 allele loss; Chromosome 17 allele loss; Other chromosome loss
Other ‘structure’ in DNA
Our DNA carries repeated sequences of different sorts.

• Simple mononucleotide repeats e.g. AAAAAAA

• Dinucleotide repeats e.g. AGAGAGAGAG or CACACACA

• More complex repeat structures

• The number of repeats at any particular locus is inherited

• Strand 'slippage' during replication can give a different number of repeats in the daughter strand.

• May escape proofreading and require repair by mismatch repair machinery

• Large repeats are called satellites; shorter repeat regions are called microsatellites
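Simple repeats of the kind listed above can be located with a back-referencing regular expression. This is a rough sketch, not a production repeat finder: the example sequence is invented and the `min_copies` threshold is an arbitrary assumption.

```python
import re


def find_repeats(seq, unit_len, min_copies=4):
    """Return (start, unit, copies) for tandem repeats of units of a given length."""
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)) // unit_len)
        for m in pattern.finditer(seq)
    ]


# Made-up sequence containing a (CA)5 dinucleotide repeat and an A7 run
seq = "GGT" + "CA" * 5 + "TT" + "A" * 7 + "GC"

print(find_repeats(seq, unit_len=2))  # dinucleotide repeats
print(find_repeats(seq, unit_len=1))  # mononucleotide runs
```

Comparing the copy number found at one locus in two samples is the basis of the allele sizing discussed in the following slides.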
Microsatellite repeats: polymorphisms in vitro (PCR)

[Figure: gel of PCR products from eight individuals. Alleles: 1,2  3,4  1,2  2,2  1,1  2,2  1,4  2,2]

Slippage in vitro
The polymerase chain reaction

This is a method for synthesising multiple copies of a unique piece of DNA, i.e. amplifying from a small to a large amount.

The amplified piece might represent one particular allele of a gene.

The method allows synthesis of enough DNA to run on a gel and visualise.
POLYMERASE CHAIN REACTION

Enzymatic amplification of a fragment of DNA.

Requirements

1. Two short oligonucleotides (primers) that hybridise to opposite strands of the target sequence.

2. An enzyme that synthesises the DNA strand (Taq polymerase).

3. Repeated cycles of heat denaturation, hybridisation of primers and synthesis of DNA.
Result

2, 4, 8, 16, 32 ... 2^30 copies after 30 rounds of replication.

2^30 ≈ 10^9 copies.

The DNA can be run out on a gel and will form a distinct crisp band, as all DNA fragments are of the same length.

If the amplified region carries a length polymorphism and the individual is heterozygous, two bands of different size will be seen, one from each chromosome.
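The doubling arithmetic behind the copy numbers quoted above can be checked directly. The `efficiency` parameter is an added assumption for illustration (1.0 means perfect doubling every cycle, which real reactions only approximate).

```python
def pcr_copies(cycles, start_copies=1, efficiency=1.0):
    """Copies after a number of cycles; efficiency=1.0 means perfect doubling."""
    return start_copies * (1 + efficiency) ** cycles


for n in (1, 2, 3, 4, 5, 30):
    print(n, int(pcr_copies(n)))

# 2^30 is just over a billion, i.e. roughly 10^9
print(f"{pcr_copies(30):.2e}")
```

With a lower assumed efficiency, for example `pcr_copies(30, efficiency=0.9)`, the yield falls well short of 2^30, which is why the figure on the slide is best read as an upper bound.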
Slippage in vivo

This slippage can also occur in vivo.

It is repaired by 'mismatch repair', which can insert missing bases and remove incorrectly added bases.

In vivo a defect in mismatch repair will:

• Result in a different number of repeats being copied, called microsatellite instability.

• Involve changes in thousands of microsatellite sequences across the genome.
Microsatellite instability (MSI)

• Laboratory marker of a defect in mismatch repair.
• Altered length of microsatellites in different genes in tumour cells, compared with germline DNA from the same individual.
• The detection of MSI in tumour DNA is indicative of genetic instability resulting from mutation of a DNA mismatch repair gene.
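The tumour-versus-germline comparison described above can be sketched as a search for allele lengths that appear in the tumour but not in the germline. All fragment lengths below are invented, and the names `markerB` and `markerC` are hypothetical placeholders. No clinical threshold is implied.

```python
def unstable_markers(germline, tumour):
    """Markers where the tumour shows allele lengths absent from the germline."""
    return sorted(
        marker
        for marker, alleles in germline.items()
        if set(tumour.get(marker, ())) - set(alleles)
    )


# Hypothetical PCR fragment lengths (bp) for each marker
germline = {"BAT25": [124, 124], "markerB": [150, 154], "markerC": [201, 205]}
tumour = {"BAT25": [124, 118], "markerB": [150, 154], "markerC": [201, 205, 197]}

shifted = unstable_markers(germline, tumour)
print(shifted)
print(f"{len(shifted)} of {len(germline)} markers show novel alleles in the tumour")
```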
HEREDITARY NON-POLYPOSIS COLORECTAL CANCER (HNPCC)

~90% of tumours show microsatellite instability.

This is an indication of a germline mutation in a mismatch repair gene.

HNPCC is caused by mutations in mismatch repair genes: MSH2, MLH1 and, to a lesser extent, MSH6, PMS1 and PMS2.
2nd hit - somatic mutation
BAT 25 microsatellite sequence on chromosome 4q12
Because of errors by the polymerase, the products of the
reaction show a normal distribution of lengths grouped around the
correct length
Mutator phenotype hypothesis

Postulates that mismatch repair defects lead to mutation in other genes (carrying microsatellite repeats), including those that might contribute to tumour development, e.g. genes controlling cell growth.

The increased mutation rate is then the cause of accelerated tumourigenesis.
DNA SEQUENCE POLYMORPHISMS

• Strictly, a sequence polymorphism is the existence of two or more genetic variants in the population.

• More loosely, a polymorphism is a normal non-pathogenic variant with no consequence for the individual in terms of human illness.

• These are inherited in a Mendelian fashion and they can allow us to make distinctions between individuals.
DNA SEQUENCE POLYMORPHISM

Some sequence changes in the DNA have no deleterious effect. They are neutral.

(A change can alter 1 base in a codon and still result in the same amino acid. E.g. ATM 2685A>G alters the codon from CTA to CTG, and the amino acid is leucine in both cases.)

They may arise as inherited or de novo sequence changes.

Polymorphisms are very common throughout the genome. Any two human genomes differ at about 0.1% of bases.
SNPs - single nucleotide polymorphisms

Any fully comprehensive atlas of the human genome has to identify all the sites that may vary between individuals.

• Single nucleotide polymorphisms (SNPs) are common DNA sequence variations among individuals.

• There are ~10 million SNPs, one every 300 bases on average.

It is important to map these as they may affect disease genes or produce variation in drug responses etc.
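The two SNP figures quoted above are consistent with the genome size given earlier in the lecture (3 × 10^9 bp), as a quick calculation shows. Variable names are illustrative.

```python
genome_size_bp = 3_000_000_000   # haploid human genome, from the lecture
snp_spacing_bp = 300             # one SNP every ~300 bases on average

expected_snps = genome_size_bp // snp_spacing_bp
print(f"{expected_snps:,} SNPs expected")  # about 10 million
```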
1. Polymorphisms may influence response to treatment or therapeutic toxicity.

Deficiency in dihydropyrimidine dehydrogenase, associated with the polymorphism DPYD*2A, can result in toxicity with fluorouracil treatment.

Testing for the polymorphism can help with appropriate dosage adjustment.

Glaire et al, J. Pathol. 2017, 241:226


2. Modifiers of breast cancer risk

Pathogenic mutations of BRCA1 and BRCA2 are associated with high risk of breast cancer.

Penetrance estimates vary between studies:

• 40-87% for BRCA1 mutation carriers and
• 27-84% for BRCA2 mutation carriers

One explanation is that other genetic factors may modify breast cancer risks.

A role for some polymorphisms?


BRCA1 mutation carriers have increased and variable risks of breast and ovarian cancer.

To identify modifiers of breast and ovarian cancer risk in this population, a multi-stage SNP association study of 14,351 BRCA1 mutation carriers was performed.

Loci at 1q32 and TCF7L2 at 10q25.3 were associated with increased breast cancer risk.

PLOS Genetics 9; e1003212


RAD51 is an important component of double-stranded DNA repair mechanisms that interacts with both BRCA1 and BRCA2.

A single-nucleotide polymorphism (SNP) in the 5' untranslated region (UTR) of RAD51, 135G>C, has been suggested as a possible modifier of breast cancer risk in BRCA1 and BRCA2 mutation carriers.

Evidence was found of an increased breast cancer risk in CC homozygotes but not in heterozygotes.

RAD51 135G>C may modify the risk of breast cancer in BRCA2 mutation carriers by altering the expression of RAD51.

AJHG 81, 1186 (2007)


The Cancer Genome Project

Project Summary

All cancers occur due to abnormalities in DNA sequence.

The Cancer Genome Project will use the human genome sequence and high-throughput mutation detection techniques to identify somatically acquired sequence variants/mutations and hence identify genes critical in the development of human cancers.

Establish genetic signatures of different cancers, infer defective biochemical pathways and, from these, possible treatments.

Using SNPs to establish the association of particular cancers with particular genetic loci.

Extent of mutation in cancer - information from genomic screening
Genome-wide association studies

4398 breast cancer cases and 4316 controls were typed for 227,000 SNPs.

5 novel independent loci showed strong association with breast cancer.
4 contain plausible genes: FGFR2, TNRC9, MAP3K1 and LSP1.

(Easton et al, Nature 447; 1087-1093 (2007))
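At the level of a single SNP, a case-control comparison of this kind reduces to a 2 × 2 table of allele counts. The sketch below computes an odds ratio and a chi-square test with 1 degree of freedom. The counts are invented for illustration and are not taken from the study, which used its own statistical methods and corrections for testing many SNPs.

```python
import math


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Odds ratio, chi-square (1 df) and p-value for a 2x2 allele count table."""
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    chi2 = (
        n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
        / (
            (case_alt + case_ref)
            * (ctrl_alt + ctrl_ref)
            * (case_alt + ctrl_alt)
            * (case_ref + ctrl_ref)
        )
    )
    # Upper tail of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Hypothetical allele counts for one SNP: (risk allele, other allele)
odds_ratio, chi2, p_value = allelic_association(30, 70, 20, 80)
print(f"OR = {odds_ratio:.2f}, chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

Because hundreds of thousands of SNPs are tested in one study, a far stricter significance threshold than the usual 0.05 is needed before a locus is reported.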


Iñigo Martincorena and Peter J. Campbell, Science 2015;349:1483-1489

Fig. 1 Spectrum of somatic mutations in cancer genomes.

Top - Mutation burden in different tumours

Bottom - signature sequence (from mutational processes across different genes) associated with tumours
Landscape of somatic mutations in 560 breast cancer whole-genome sequences

Serena Nik-Zainal et al.

Nature 534, 47-54 (02 June 2016)

Bioinformatic analysis to identify different 'mutational signatures' that might then say something about the pathogenesis of the tumour and possible targets for treatment.

299 driver genes in 33 cancers (TCGA)

Cell 173, 371-385 (2018)


TRIPLET REPEAT MUTATIONS

Presence of a low number of these triplet repeats is harmless, but expansion of triplet repeats can cause a variety of human disorders.

1. Genes which show a modest expansion of (CAG)n in coding sequence.

E.g. Huntington's disease

2. Genes which show very large expansions of non-coding repeat sequence.

E.g. Fragile X site A (CGG)n
Friedreich ataxia (GAA)n
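Counting the repeat number n in a (CAG)n tract can be sketched with a regular expression that finds the longest uninterrupted run of the unit. The two sequences are made up, and no disease threshold is implied by the numbers used.

```python
import re


def longest_repeat_run(seq, unit):
    """Number of copies in the longest uninterrupted tandem run of unit."""
    runs = re.findall(r"(?:%s)+" % re.escape(unit), seq)
    return max((len(run) // len(unit) for run in runs), default=0)


# Made-up alleles: the same flanking sequence with different repeat numbers
short_allele = "ATG" + "CAG" * 6 + "CCGCAGTT"
long_allele = "ATG" + "CAG" * 15 + "CCGCAGTT"

print(longest_repeat_run(short_allele, "CAG"))
print(longest_repeat_run(long_allele, "CAG"))
```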
Variations in copy number of sequence elements - copy number variants (CNV)

Variation in the number of deleted or duplicated versions of segments of the genome, resulting in a range in the number of copies among individuals.

These segments can be large and include parts of genes and other functional elements in the genome.

CNVs may play a role in 'complex' or 'polygenic' diseases.

(Variations in copy number of globin genes can cause different disorders of haemoglobin.)
Concept of “genomic individuality” or genetic
variation at the DNA level.

•Mutation,
•SNPs,
•Microsatellites,
•Triplet repeats
•Copy number variants
2008 Nature Genetics 40;316-321

Multiple newly identified loci associated with prostate cancer susceptibility.

Aim: to identify common alleles associated with prostate cancer risk.

DNA from 1854 prostate cancer patients and 1894 controls was analysed for 541,129 SNPs.

Identified 7 loci associated with prostate cancer.
