
GENESPRING GX

Analysis of SNP Arrays


Experiment Creation and Experiment Grouping
Paired Analysis and Analysis Against Reference

Identification of variation requires comparison to either a reference DNA source, a reference dataset or a reference genome sequence.

Paired Analysis: Here the control and the test DNA are from the same individual.

Analysis against a reference: The control is generated from a pool of individuals. All the test samples are then compared against a common, pooled control, also known as the “reference”.
Reference Creation

HapMap samples are processed and packaged as the Standard Reference.

A Custom Reference can be created by going to Tools > Create Custom Reference.

Custom Reference creation is usually recommended when there are 30-40 samples, though it can be created even with a small number of samples.
Paired Analysis

Keywords in Paired Analysis:

Parameters: Condition, Group

One parameter value: Normal

Create an Interpretation using Condition and Group for Paired Analysis to run.
Batch Effect

An entity is marked for correction if it obtains a p-value below the specified threshold.

Batches containing samples below the threshold level are ignored.

Each batch is T-tested against a pool of all remaining batches.


Correction for each flagged entity is performed using a ‘reference batch’.

In case of only 2 batches, one batch is taken as a reference.



The reference batch will show no correction.

With multiple batches, the batch with the least or no correction is chosen as the reference.
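
The rule for picking the reference batch can be sketched in a few lines of Python. This is only an illustration, not GeneSpring GX's implementation: the function name `choose_reference_batch` and its input (a count of corrected entities per batch) are assumptions made for the example, and the counts are invented.

```python
def choose_reference_batch(corrections_per_batch):
    """Return the batch with the fewest corrected entities.

    corrections_per_batch: dict mapping a batch name to the number of
    entities flagged for correction in that batch (hypothetical input).
    Ties are broken alphabetically so the result is deterministic.
    """
    if not corrections_per_batch:
        raise ValueError("at least one batch is required")
    # min() returns the first minimal item, so iterating over sorted
    # names gives the alphabetical tie-break.
    return min(sorted(corrections_per_batch), key=corrections_per_batch.get)


# Invented counts, for illustration only.
counts = {"batch_A": 120, "batch_B": 0, "batch_C": 37}
reference = choose_reference_batch(counts)  # the batch with no correction
```

With only two batches the same function still returns one of them, which matches the statement that one batch is taken as the reference.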
CN Analysis: Only for Affymetrix Arrays

Log Ratio
Mean Log Ratio (MLR)
CN
Allele Specific Copy Number (ASCN)
Parent Specific Copy Number (PSCN)
LOH
1. Paired Analysis CN computation: “Condition-Type” Interpretation
2. Each tumor is paired against the Normal of its group
3. All Normals are compared against the reference

All samples against reference comparison

Only one set of CN Analysis results can be stored.
Log ratio:
Against Reference: Sample/Reference

Paired Analysis:
For disease samples: Disease/Normal
For Normal samples: Sample/Reference

---For SNP probes, the intensity used is the average of the individual A and B intensities.
---Ratios are transformed by logarithm to base 2.
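
The log ratio described above can be written out as a short Python sketch. The function names and the intensity values are assumptions chosen for illustration; the only logic taken from the slide is the averaging of the A and B intensities and the base-2 logarithm of the test/control ratio.

```python
import math


def snp_intensity(a_intensity, b_intensity):
    """Average of the individual A and B allele intensities of a SNP probe."""
    return (a_intensity + b_intensity) / 2.0


def log_ratio(test_intensity, control_intensity):
    """log2(test / control).

    The control is the reference (analysis against reference, and Normal
    samples in paired analysis) or the paired Normal (disease samples).
    """
    if test_intensity <= 0 or control_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(test_intensity / control_intensity)


# Invented intensities, for illustration only.
sample = snp_intensity(1800.0, 1200.0)      # 1500.0
reference = snp_intensity(1100.0, 900.0)    # 1000.0
lr = log_ratio(sample, reference)           # log2(1.5), a gain relative to the control
```

A log ratio of 0 means the test and control intensities are equal; positive values suggest gain and negative values suggest loss.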
ASCN

ASCN: Assigning allele calls to the SNP using Fawkes Algorithm

Total CN      Allele A CN      Allele B CN
AAB (3)       AA (2)           B (1)

Example: A SNP with CN=3 and a genotype call of AAB will have an ASCN of 2 for allele A and 1 for allele B.
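
The AAB example can be reproduced with a small Python sketch. Note that this is not the Fawkes algorithm itself, which produces the allele call; it only shows the counting step once a call is available. The function name `allele_specific_cn` is an assumption made for the example.

```python
def allele_specific_cn(genotype_call):
    """Split a genotype call such as "AAB" into per-allele copy numbers."""
    call = genotype_call.upper()
    if not call or set(call) - {"A", "B"}:
        raise ValueError("genotype call must contain only A and B")
    return {"total": len(call), "A": call.count("A"), "B": call.count("B")}


ascn = allele_specific_cn("AAB")  # total CN 3: two copies of A, one of B
```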
PSCN
Genotype       Max-Min
AB or 1,1      0
AA or 2,0      2

For each CN segment, PSCN identifies the max and min component and measures allelic imbalance.

PSCN is the contribution of each parent to an offspring’s CN.


PSCN: CN=3
Genotype       Max-Min (allelic imbalance)
AAA or 3,0     0
AAB or 2,1     1
Max-Min is a measure of allelic imbalance
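
As a rough illustration of the Max-Min measure, the following Python sketch takes the two allele-specific copy numbers of a segment and returns their difference. The function name is an assumption, and the sketch covers only the arithmetic, not how GeneSpring GX derives the two components for a segment.

```python
def allelic_imbalance(cn_first, cn_second):
    """Max-Min of the two allele-specific copy numbers of a segment."""
    if cn_first < 0 or cn_second < 0:
        raise ValueError("copy numbers cannot be negative")
    major = max(cn_first, cn_second)  # the "max" component
    minor = min(cn_first, cn_second)  # the "min" component
    return major - minor


balanced = allelic_imbalance(1, 1)    # AB  or 1,1: no imbalance
loss_of_b = allelic_imbalance(2, 0)   # AA  or 2,0
gain_of_a = allelic_imbalance(2, 1)   # AAB or 2,1
```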
Common Genomic Variant Region
The Common Genomic Variant Region workflow link identifies regions of the genome that are significantly amplified or deleted across a set of samples. The method is commonly called GISTIC, which stands for “Genomic Identification of Significant Targets In Cancer”.

GISTIC independently aggregates the identified regions of CN amplifications and CN deletions, identifying regions of focal and broad aberrations.

Runs in 2 modes: Coarse mode and Fine mode


GISTIC

GISTIC uses the biological genome to perform the “find overlapping genes” function.
How do I visualize the results of my CN analysis?

1. Genome Browser

2. Filtering and exporting lists to examine the values

3. Heat Map
Filters:

Filter by Region

Filter by Copy Neutral LOH

Filter by Parent Specific Copy Number

Filter by known CNVs


After CN Analysis, a set of filters can be used to narrow down to the entities of interest.

Any entity list can be exported using right click > entity list > export list. All associated values can be exported, for any interpretation.
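
To make the "Filter by Copy Neutral LOH" option listed above concrete: copy-neutral LOH refers to regions that show loss of heterozygosity while the total copy number stays at the normal value of 2. The Python sketch below applies that definition to a list of entities. The record layout (`id`, `cn`, `loh`) and the values are hypothetical and do not reflect GeneSpring GX's data model.

```python
def is_copy_neutral_loh(copy_number, loh_call, normal_cn=2):
    """True when an entity shows LOH without a change in total copy number."""
    return copy_number == normal_cn and bool(loh_call)


# Hypothetical entities, for illustration only.
entities = [
    {"id": "seg_1", "cn": 2, "loh": True},   # copy-neutral LOH
    {"id": "seg_2", "cn": 1, "loh": True},   # LOH caused by a deletion
    {"id": "seg_3", "cn": 2, "loh": False},  # normal
]
copy_neutral = [e["id"] for e in entities if is_copy_neutral_loh(e["cn"], e["loh"])]
```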
Genome Browser: A Powerful Visualization Aid
Multiple viewing options
-Profile plot
-Scatter plot
-Histogram
Multiple types of data tracks
-expression
-CN/Copy Number Confidence
-ASCN/PSCN
-LOH
Facilitates Integrative Analysis
-Importing different data types
-Merging of different tracks
New genome builds can be added, or existing builds can be edited.
Genome Browser Tracks

1. Experimental data tracks:


Experiment-related data specific to samples, such as CN, expression and allele frequency.

2. Annotation tracks:
Gene track, Transcript track and CpG islands
Updating GB data provides the annotation tracks, which contain the CpG island, transcript and gene-level tracks.
Genome Browser: what do we need to view in GB?

Chromosome Number

Start position

Stop position

Organism-specific genome builds have to be downloaded.


Genome Browser – Bringing in the data

Drag and Drop:

Experiments: You can choose the samples to view

Entity lists

Annotation tracks
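[Figure: Genome Browser window. Labeled elements: experiment samples, re-ordering of tracks, drop-down for changing chromosome, chromosome selector, different GB tabs]

[Figure: track display styles. Panels: profile plot, scatter plot, histogram plot, filled profile plot]

[Figure: data tracks and gene track. Spreadsheet label: (start+stop)/2]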
Genome Browser: Importing and Managing Tracks

Tracks have to be in BED format if using

Using Advanced Import or the drag/drop option, tracks in .txt, .tsv or .csv formats can be imported.

Additional organisms or new builds for an existing organism can also be added.
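
Because tracks may need to be supplied in BED format, it can help to see which fields a BED record carries. In standard BED, the first three tab-separated columns are the chromosome, a 0-based start and an end position, with an optional name in the fourth column. The Python sketch below parses one such line; the function name and the example coordinates are hypothetical, and it says nothing about how GeneSpring GX itself reads the file.

```python
def parse_bed_line(line):
    """Parse the chromosome, start, end and optional name of a BED record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("a BED record needs at least chrom, start and end")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    if start < 0 or end < start:
        raise ValueError("start must be >= 0 and end must be >= start")
    name = fields[3] if len(fields) > 3 else None
    return {"chrom": chrom, "start": start, "end": end, "name": name}


# Hypothetical record, for illustration only.
record = parse_bed_line("chr6\t1000\t5000\tregion_1\n")
```

These three positional fields correspond to the chromosome number, start position and stop position that the Genome Browser needs in order to place an entity.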
ASCN-homozygous stretch

No PSCN due to
homozygosity
ASCN-heterozygous stretch

PSCN-allelic imbalance
Case Study
Introduction
Prostate Cancer is the most common cancer in men.

Primary tumors are thought to be composed of multiple genetically distinct cancer cell clones.

Both the primary and the metastatic prostate cancers are heterogeneous in nature, posing therapeutic challenges.

The present case study was undertaken with the aim of identifying newer components contributing to prostate cancer.
Datasets Used
Expression:
GSE6919
24 metastatic samples from 4 patients and 18 normal samples

Genotyping:
GSE14996
58 metastatic locations from 14 patients and 16 subject-paired non-cancerous samples
Validation and Extension of the Study
Analysis workflow
Expression:                              Genotyping:
T-test                                   Standard Reference
FC 2.0                                   Copy Number computation
p-value: 0.05                            Genome Browser
Differentially expressed: 441 entities
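
The expression filter in this workflow (fold change of 2.0 and p-value of 0.05) can be sketched in Python as follows. The sketch assumes signed fold changes, where a negative value means down-regulation, and assumes that values exactly at a cutoff pass; the function name, the probe names and the numbers are invented for illustration and are not taken from the GSE6919 results.

```python
def passes_filter(fold_change, p_value, fc_cutoff=2.0, p_cutoff=0.05):
    """Keep entities that change at least fc_cutoff-fold in either
    direction and have a p-value at or below p_cutoff."""
    return abs(fold_change) >= fc_cutoff and p_value <= p_cutoff


# Invented (name, signed fold change, p-value) records.
entities = [
    ("probe_1", 2.4, 0.010),    # up-regulated and significant
    ("probe_2", -2.15, 0.030),  # down-regulated and significant
    ("probe_3", 1.3, 0.001),    # significant, but the change is too small
    ("probe_4", -3.0, 0.200),   # large change, but not significant
]
selected = [name for name, fc, p in entities if passes_filter(fc, p)]
```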
PCA

[Figure: PCA plot with samples labeled Normal and Metastatic]

QC using PCA shows separation of the Normal and the Metastatic samples of GSE6919.
Validation:

1. ERG-TMPRSS2 fusion

2. Chr. 6 aberration pattern

Extension of study:

1. PLAGL1 deletion as a possible candidate for prostate cancer.

2. Possible epigenetic silencing of TCF21 in prostate cancer.
The ERG-TMPRSS2 fusion, as reported in the literature, is shown to occur here in at least 50% of the patients.
Published data

Chr. 6 deletion

Validated in GX11
Deletion of PLAGL1

2.15-fold down-regulation of PLAGL1 in metastasis
PLAGL1

Candidate tumor suppressor gene with anti-proliferative activities
Zinc finger protein with transactivation and DNA-binding activity
Presence of splice variants, which allow differential regulation of apoptosis induction and cell cycle arrest
Frequently deleted in many solid tumors: breast, ovarian and renal cell carcinomas
Also known as LOT or “Lost On Transformation”
PLAGL1 network analysis
First-order expansion of the PLAGL1 network
Down-regulation of TCF21
No deletion of TCF21

TCF21: CN=2
Conclusions
Using GX11, we could validate the presence of the ERG-TMPRSS2 fusion in several of the metastatic prostate cancer samples.
Integrative analysis using expression and genotyping data has identified PLAGL1, a candidate tumor suppressor (ts) gene, as having a possible role in prostate cancer.
PLAGL1 deletion, though present in a small percentage of the population, is an early event, occurring at a pre-metastatic stage.
Down-regulation of TCF21, another ts gene, is also observed here. TCF21 is known to be frequently silenced epigenetically in head and neck cancer. Consistent with this, TCF21 did not show any deletion in the samples examined.
