Genespring GX: Analysis of SNP Arrays


Experiment Creation and Experiment Grouping
Paired Analysis and Analysis Against Reference
Identification of variation requires comparison to a reference DNA source, a reference dataset or a reference genome sequence.
Paired Analysis: the control and the test DNA are from the same individual.
Analysis against a reference: the control is generated from a pool of individuals. All the test samples are then compared against a common, pooled control, also known as the “reference”.
Reference Creation
HapMap samples are processed and packaged as the Standard Reference.
A Custom Reference can be created by going to Tools > Create Custom Reference.
Custom Reference creation is usually recommended when there are 30-40 samples, though it can be created even with a small number of samples.
Paired Analysis
Key words in Paired Analysis:
Parameters: Condition, Group
One parameter value: Normal
Create an Interpretation using Condition and Group for Paired Analysis to run.
Batch Effect
An entity is marked for correction if it obtains a p-value below the specified threshold.
Batches containing samples below the threshold level are ignored.
Each batch is t-tested against a pool of all remaining batches.
Correction for each flagged entity is performed using a ‘reference batch’.
In case of only 2 batches, one batch is taken as the reference.
With multiple batches, the batch with the least or no correction is chosen as the reference.
The reference batch will show no correction.
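The "each batch against a pool of all remaining batches" comparison can be sketched for a single entity as follows. This is only an illustration: it computes Welch's t statistic with the Python standard library, the function names and intensity values are hypothetical, and the slides do not say which t-test variant GeneSpring GX uses. Converting the statistic to a p-value (for example with a statistics package) is left out.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two samples (unequal variances allowed)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

def batch_vs_rest(batches):
    """For one entity, compare each batch against the pool of all other batches."""
    stats = {}
    for name, values in batches.items():
        rest = [v for other, vals in batches.items() if other != name for v in vals]
        stats[name] = welch_t(values, rest)
    return stats

# Made-up signal values for one entity across three batches.
batches = {
    "batch1": [1.0, 1.2, 0.9],
    "batch2": [1.1, 1.0, 1.3],
    "batch3": [2.0, 2.2, 2.1],
}
t_stats = batch_vs_rest(batches)
most_shifted = max(t_stats, key=lambda name: abs(t_stats[name]))
print(most_shifted, round(t_stats[most_shifted], 2))
```

In this toy data, batch3 stands out from the pooled remainder, so it would be the batch most likely to be flagged for this entity.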
CN Analysis: Only for Affymetrix Arrays
Log Ratio
Mean Log Ratio (MLR)
CN
Allele Specific Copy Number (ASCN)
Parent Specific Copy Number (PSCN)
LOH
1. Paired Analysis CN computation: “Condition-Type” Interpretation
2. Each tumor is paired against the Normal of its group
3. All Normals are compared against the reference
All samples against reference comparison
Only one set of CN Analysis results can be stored.
Log ratio:
Against Reference: Sample/Reference
Paired Analysis:
For disease samples: Disease/Normal
For Normal samples: Sample/Reference
For SNP probes, the intensity used is the average of the individual A and B intensities.
Ratios are transformed by logarithm to base 2.
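The log ratio arithmetic described here can be sketched in a few lines. The function names and intensity values are illustrative assumptions, not GeneSpring GX functions or real data.

```python
from math import log2

def snp_intensity(a_intensity, b_intensity):
    """Intensity of a SNP probe: the average of the A and B allele intensities."""
    return (a_intensity + b_intensity) / 2.0

def log_ratio(sample_intensity, control_intensity):
    """Base-2 log of sample/control (control = paired Normal or the reference)."""
    return log2(sample_intensity / control_intensity)

# Made-up intensities for one SNP probe.
disease = snp_intensity(1800.0, 1400.0)  # 1600.0
normal = snp_intensity(900.0, 700.0)     # 800.0
lr = log_ratio(disease, normal)          # log2(2.0) = 1.0
print(lr)
```

A log ratio of 0 means the sample and control intensities are equal; positive values indicate higher intensity in the sample, negative values lower.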
ASCN
ASCN: Assigning allele calls to the SNP using the Fawkes Algorithm
Total CN      Allele A CN   Allele B CN
AAB (3)       AA (2)        B (1)
Example: a SNP with CN=3 and a genotype call of AAB will have an ASCN of 2 for allele A and 1 for allele B.
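The AAB example amounts to counting alleles in the genotype call. The sketch below shows only that counting step; it is not the Fawkes algorithm, which makes the allele calls in the first place, and the function name is hypothetical.

```python
from collections import Counter

def allele_specific_cn(genotype_call):
    """Split a genotype call such as 'AAB' into per-allele copy numbers."""
    counts = Counter(genotype_call.upper())
    return {"total": len(genotype_call), "A": counts["A"], "B": counts["B"]}

ascn = allele_specific_cn("AAB")
print(ascn)  # {'total': 3, 'A': 2, 'B': 1}
```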
PSCN
The contribution of each parent to an offspring’s CN is the PSCN.
For each CN segment, PSCN identifies the max and min component and measures allelic imbalance.
Genotype   Max,Min   Max-Min
AB         1,1       0
AA         2,0       2
PSCN: CN=3
Genotype   Max,Min   Max-Min (allelic imbalance)
AAA        3,0       3
AAB        2,1       1
Max-Min is a measure of allelic imbalance.
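The Max-Min measure can be written out directly from the two allele copy numbers. This is a minimal sketch with a hypothetical function name; it covers only the final subtraction, not how GeneSpring GX segments the genome or assigns the max and min components.

```python
def max_min_imbalance(allele_a_cn, allele_b_cn):
    """Return (max, min, max - min) for a pair of allele copy numbers."""
    hi = max(allele_a_cn, allele_b_cn)
    lo = min(allele_a_cn, allele_b_cn)
    return hi, lo, hi - lo

print(max_min_imbalance(1, 1))  # AB:  (1, 1, 0), balanced
print(max_min_imbalance(2, 0))  # AA:  (2, 0, 2)
print(max_min_imbalance(2, 1))  # AAB: (2, 1, 1)
```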
Common Genomic Variant Region
The Common Genomic Variant Region workflow link identifies regions of the genome that are significantly amplified or deleted across a set of samples. The method is commonly called GISTIC, short for “Genomic Identification of Significant Targets In Cancer”.
GISTIC independently aggregates the identified regions of CN amplifications and CN deletions, identifying regions of focal and broad aberrations.
Runs in 2 modes: Coarse mode and Fine mode
GISTIC uses the biological genome to perform the “find overlapping genes” function.
How do I visualize the results of my CN analysis?
1. Genome Browser
2. Filtering and exporting lists to examine the values
3. Heat Map
Filters:
Filter by Region
Filter by Copy Neutral LOH
Filter by Parent Specific Copy Number
Filter by known CNVs
After CN Analysis, a set of filters can be used to narrow down to the entities of interest.
Any entity list can be exported via right-click > entity list > export list. All associated values can be exported, for any interpretation.
Genome Browser-A Powerful Visualization Aid
Genome Browser
Multiple viewing options
-Profile plot
-Scatter plot
-Histogram
Multiple types of data tracks
-expression
-CN/Copy Number Confidence
-ASCN/PSCN
-LOH
Facilitates Integrative Analysis
-Importing different data types
-Merging of different tracks
New Genome Builds can be added, or the existing builds can be edited.
Genome Browser Tracks
1. Experimental data tracks:
Experiment-related data specific to samples, such as CN, expression, allele frequency, etc.
2. Annotation tracks:
Gene track, Transcript track and CpG islands
Updating GB data provides the annotation tracks, which contain the CpG island tracks, the transcript and the gene level tracks.
Genome Browser: what do we need to view in GB?
Chromosome Number
Start position
Stop position
Organism specific Genome Builds have to be downloaded

Genome Browser: Bringing in the data
Drag and Drop:
Experiments: You can choose the samples to view
Entity lists
Annotation tracks
[Screenshot: Genome Browser window. Callouts: experiment samples, re-ordering tracks, drop-down for changing chromosome, chromosome selector, different GB tabs (profile plot, scatter plot, histogram plot, filled profile plot), data tracks, gene track, (start+stop)/2, spreadsheet]
Genome Browser: Importing and Managing Tracks
Tracks have to be in BED format if using
Using Advanced import or the drag/drop option, tracks in .txt, .tsv or .csv formats can be imported.
Additional organisms, or new builds for an existing organism, can also be added.
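For tracks that need to be in BED format, a region list can be converted with a short script. The sketch below assumes the standard BED convention (tab-separated chromosome, 0-based start, end-exclusive stop, optional name) and assumes the input coordinates are 1-based and inclusive; the function name and the coordinates are hypothetical, and the slides do not state which BED columns GeneSpring GX requires.

```python
def to_bed_line(chrom, start_1based, end_1based, name=None):
    """Format one region as a BED line (0-based start, end-exclusive stop)."""
    fields = [chrom, str(start_1based - 1), str(end_1based)]
    if name is not None:
        fields.append(name)
    return "\t".join(fields)

# Made-up regions: (chromosome, start, stop, label).
regions = [("chr6", 1001, 2000, "segment_1"), ("chr6", 5001, 7500, "segment_2")]
bed_text = "\n".join(to_bed_line(*region) for region in regions)
print(bed_text)
```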
ASCN-homozygous stretch: no PSCN due to homozygosity
ASCN-heterozygous stretch: PSCN-allelic imbalance
Case Study
Introduction
Prostate Cancer is the most common cancer in men.
Primary tumors are thought to be composed of multiple genetically distinct cancer cell clones.
Both the primary and the metastatic prostate cancers are heterogeneous in nature, posing therapeutic challenges.
The present case study was undertaken with the aim of identifying newer components contributing to prostate cancer.
Datasets Used
Expression:
GSE6919
24 metastatic samples from 4 patients and 18 normal samples
Genotyping:
GSE14996
58 metastatic locations from 14 patients and 16 subject-paired non-cancerous samples
Validation and Extension of the Study
Analysis workflow
Expression:
T-test
FC 2.0
p-value: 0.05
Differentially expressed: 441 entities
Genotyping:
Standard Reference
Copy Number computation
Genome Browser
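The expression filter in this workflow (fold change of 2.0, p-value of 0.05) can be sketched as below. The sketch assumes linear-scale, positive group means and precomputed t-test p-values, and it treats both cutoffs as inclusive, which is an assumption; the function name, entity names and numbers are made up for illustration.

```python
def passes_filter(mean_normal, mean_metastatic, p_value, fc_cutoff=2.0, p_cutoff=0.05):
    """True if the entity changes at least fc_cutoff-fold (up or down) with p <= p_cutoff."""
    fold_change = max(mean_normal, mean_metastatic) / min(mean_normal, mean_metastatic)
    return fold_change >= fc_cutoff and p_value <= p_cutoff

# Made-up entities: (name, mean in Normal, mean in Metastatic, t-test p-value).
entities = [
    ("entity_a", 430.0, 200.0, 0.01),   # 2.15-fold down, significant
    ("entity_b", 100.0, 150.0, 0.001),  # significant but only 1.5-fold
    ("entity_c", 100.0, 400.0, 0.20),   # 4-fold but not significant
]
selected = [name for name, normal, met, p in entities if passes_filter(normal, met, p)]
print(selected)  # ['entity_a']
```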
[Figure: PCA plot of the Normal and Metastatic samples]
QC using PCA shows separation of the Normal and the Metastatic samples of GSE6919.
Validation:
1. ERG-TMPRSS2 fusion
2. Chr. 6 aberration pattern
Extension of study:
1. PLAGL1 deletion as a possible candidate for prostate cancer.
2. Possible epigenetic silencing of TCF21 in prostate cancer.
The ERG-TMPRSS2 fusion, as reported in the literature, is shown to occur here in at least 50% of the patients.
Published data
Chr. 6 deletion
Validated in GX11
Deletion of PLAGL1
2.15-fold down-regulation of PLAGL1 in metastasis
PLAGL1
Candidate tumor suppressor gene with anti-proliferative activities
Zinc finger protein with transactivation and DNA binding activity
Presence of splice variants which allow differential regulation of apoptosis induction and cell cycle arrest
Frequently deleted in many solid tumors: breast, ovarian and renal cell carcinomas
Also known as LOT or “Lost On Transformation”
PLAGL1 network analysis
First order expansion of the PLAGL1 network
Down-regulation of TCF21
No deletion of TCF21
TCF21: CN=2
Conclusions
Using GX11, we could validate the presence of the ERG-TMPRSS2 fusion in several of the metastatic prostate cancer samples.
Integrative analysis using expression and genotyping data has identified PLAGL1, a candidate tumor suppressor (ts) gene, as having a possible role in prostate cancer.
PLAGL1 deletion, though present in a small percentage of the population, is an early event, occurring at a pre-metastatic stage.
Down-regulation of TCF21, another ts gene, is also observed here. TCF21 is known to be frequently silenced epigenetically in head and neck cancer. Consistent with this, TCF21 did not show any deletion in the samples examined.
