
Next Generation Sequencing and BeadArray Technology

-- Illumina Complete Solutions
均泰生物科技有限公司, General Manager 彭英哲, 2008.07.10
© 2008 Illumina, Inc. Illumina, Making Sense Out of Life, Sentrix, GoldenGate, DASL, Oligator, Infinium, BeadArray, Array of Arrays, BeadXpress, VeraCode, IntelliHyb, iSelect, CSPro, and Solexa are registered trademarks or trademarks of Illumina Inc.

Illumina Genome Analyzer
Powered by Solexa® Sequencing


Genome Biology is Complex
[Figure labels: Methylation, Regulation, Reference or de novo sequence, Variation, Genes, Expression]
Molecular > cellular > organism > environmental > populations
Need sequence-level information for DNA and RNA, epigenetic changes, and protein interaction

Transformational Technology Requirements – Our Guiding Principles
COST: To sequence initially at 1/100th current costs and dramatically reduce labor requirements
THROUGHPUT: Initially sequence in excess of 1Gb per experiment. Develop a highly scalable technology
ACCURACY: To generate high accuracy raw data with reliable and informative quality metrics
EASE OF USE: To simplify and automate sample preparation in addition to sequencing to make technology broadly accessible
COMPLETENESS: To sequence whole genomes, genes, regions, RNA populations, tags, interactome
DATA ACCESS: Compatible with assembly and annotation tools
UTILITY: Unlimited applications

2006: State of the Art Sequencing
PRODUCTION
– Rooms of equipment
– Subcloning > picking > prepping
– 35 FTEs
– 3-4 weeks
SEQUENCING
– 74x Capillary Sequencers
– 10 FTEs
– 15-40 runs per day
– 1-2Mb per instrument per day
– 120Mb total capacity per day

2007: Enabling a New Era in Genome Analysis
PRODUCTION
– 1x Cluster Station
– 1 FTE
– 1 day
SEQUENCING
– 1x Genome Analyzer
– Same FTE as above
– 1 run per 3 days
– 1Gb per instrument per run
– >300Mb per day
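The per-day figures on the 2006 and 2007 slides can be checked with a few lines of arithmetic. The sketch below is only a worked example: the function name is illustrative, and it assumes each capillary instrument contributes its quoted 1-2 Mb once per day.

```python
# Figures come from the 2006 and 2007 slides; the function name is illustrative.
def daily_output_mb(instruments: int, mb_per_run: float, runs_per_day: float) -> float:
    """Total megabases produced per day by a group of identical instruments."""
    return instruments * mb_per_run * runs_per_day

# 2006: 74 capillary sequencers at 1-2 Mb per instrument per day.
capillary_low = daily_output_mb(74, 1.0, 1.0)
capillary_high = daily_output_mb(74, 2.0, 1.0)

# 2007: one Genome Analyzer, 1 Gb (1000 Mb) per run, one run per 3 days.
genome_analyzer = daily_output_mb(1, 1000.0, 1.0 / 3.0)

print(f"Capillary fleet: {capillary_low:.0f}-{capillary_high:.0f} Mb/day")
print(f"Genome Analyzer: {genome_analyzer:.0f} Mb/day")
```

The quoted 120Mb total capacity falls inside the 74-148 Mb range for the capillary fleet, and one Genome Analyzer run every 3 days works out to roughly 333 Mb per day, consistent with the ">300Mb per day" claim.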

Solexa Sequencing
DNA (0.1-1.0 ug) > Sample preparation > Single molecule array > Cluster growth > Sequencing > Image acquisition > Base calling

THE ILLUMINA GENOME ANALYSIS SYSTEM

Illumina Genome Analyzer Advantages
Single and paired end read support
– Simple 1 work-day sample prep process
– Not limited to 25 bp reads for paired reads (2x35bp+ reads currently)
Simple, automated workflow (even for paired reads)
– Automated parallel processing of multiple samples
– Low input material required (1/100th of competing systems)
Highest accuracy data among next generation systems
– Highest output of perfect reads
– Phred-compatible quality scoring validated by major centers
Open platform for genetic analysis
– Broad range of published applications, protocols, and kits
Highly scalable platform
– Highest current output of any system
– Extensible technology base

Flow cell
8 channels
Key to the simplified workflow
Surface of flow cell coated with a lawn of oligo pairs
Clonal clusters are generated in a contained environment (no clean rooms needed)
Sequencing is also performed in the flow cell on the generated clusters

Cluster station
Aspirates DNA samples into flow cell
Automates the formation of amplified clonal clusters from the DNA single molecules
[Figure labels: Flow cell (clamped into place), DNA libraries]

Solexa Sequencing Technology
DNA (0.1-1.0 ug) > Sample preparation > Single molecule array > Cluster growth > Sequencing > Image acquisition > Base calling

Current status
3-4 Gb high quality paired read data per flowcell
60 million reads per flowcell
Paired 35-50 base reads
Raw read accuracy: >99.5% (35 bases)
Phred quality metrics: 90% >Q20, 50% >Q30
Consensus accuracy: >99.999% (20-30x input depth)
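The Q20 and Q30 thresholds quoted above follow the standard Phred definition, Q = -10 * log10(P), where P is the probability that a base call is wrong. A minimal sketch of the conversion (the function names are illustrative, not part of any Illumina software):

```python
import math

def phred_to_error_probability(q: float) -> float:
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)

def error_probability_to_phred(p: float) -> float:
    """Phred score corresponding to error probability p."""
    return -10 * math.log10(p)

for q in (20, 30):
    p = phred_to_error_probability(q)
    print(f"Q{q}: error probability {p:.4f}, accuracy {100 * (1 - p):.1f}%")
```

Under this definition a Q20 base has a 1 in 100 chance of being wrong and a Q30 base a 1 in 1000 chance.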

Re-sequencing Regions
[Figure labels: Known sequence, Sequence reads, New sequence; >99%]

Genome Analyzer Workflow – Paired End
Workflow steps shown: Harvest 0.1-1 µg DNA, Paired end Library Prep, Cluster Generation, Sequencing by Synthesis, Run, DONE
Timeline: Mon Tues Wed Thurs Fri Sat Sun
Illumina’s solution is simple, fast, automated, and has few steps
Go from sample harvest to data acquisition in under a week, even for paired end reads

Two Illumina Paired Read (PE) Approaches
[Figure labels: Read 1, Read 2]
“Conventional” Approach
– Cis-priming: 2 reads on the SAME strand
– Requires a more involved sample prep (e.g. jumping libraries)
– Large insert separation
Illumina-Specific Approach
– 1 read on EACH strand
– Unique to clonal clusters
– Straightforward sample prep
– 100 to 600 base separation

Solexa Paired End Sequencing
Overview of automated process on the flowcell
Steps shown: Cluster amplification, Linearize DNA (1st cut), Sequence 1st strand (Read 1, normal sequencing process), Strand re-synthesis, Linearize (2nd cut), Sequence 2nd strand (Read 2)
Because we amplify on the flowcell surface, we can resynthesize the DNA in the cluster, and regenerate both strands again

Advantages of Illumina-Specific PE Approach
Very simple library preparation method:
– Does not require creation of complex constructs (e.g., Jumping Libraries)
– Significantly faster library generation (1 day vs. 3+ days)
Small amounts of starting material required:
– Can use as little as 100ng of starting material
– Can use method with limited samples (e.g., tumor biopsies)
Can generate reads of ≥ 35bp
– The conventional method (the only method used by competitors) can only generate reads of 25bp
Simplest approach for generating majority of paired read data in any project
– For most projects, short insert reads are 75% of all reads
– Even for de novo sequencing projects, majority of reads will be from shorter inserts

Re-sequencing with Paired Ends
[Figure labels: Known sequence, Sequence reads, New sequence; 95 to >99%]

Applications

Open Platform for Genomic Analysis
– DNAse1 HS Sites
– Genomic DNA
– mRNA Tags
– Small RNA
– ChIP-SEQ
– Full Length cDNA Sequencing
– Future Applications

Genome Analyzer applications
Applications
Whole genome
Targeted genome
– Long PCR
– Pull-down
Epigenome
– Bisulfite
– Restriction
– Restriction + bisulfite
– Antibody pulldown
– Ab. + bisulfite
Whole transcriptome
– All RNA
– RNA minus rRNA, tRNA
– PolyA
– ncRNA
Gene expression (DGE)
Small RNA
RISC RNA products
Protein:DNA
Protein:RNA
Common workflow: DNA Fragments + Adapters > Library Generation > Cluster Generation > Sequencing

DNA Sequencing Applications

Sequencing Bacterial Genomes
Streptococcus suis P1/7 (Sanger)
2Mb genome, 41.3% GC, causes meningitis
3.3M reads (26bp), aligned to reference
97.34% coverage (97.06% expected), 24x depth, even distribution
No errors at read depth 3 or greater
de novo assembly without paired reads (EBI): 207 contigs
Expect 100% coverage with paired reads
[Plot: # ERRORS vs. DEPTH]

High Quality Customer Generated Data
Broad Data Presented at Cold Spring Harbor
Find polymorphisms: high sensitivity and specificity
M. tuberculosis: Finished strains: H37Rv and F11
1 lane Solexa F11 H37Rv, 763 SNPs
– Found 98% of SNPs
– Missing 2% of SNPs in regions >80%GC
– NO false positives
– Also found 95% of indels
Courtesy of the Broad Institute (David Jaffe)

Targeted Genomic Re-sequencing
141Kb region, 15 pooled Long PCRs, 10Kb amplicons
99.8% of region covered (>5x) with 0.74M single reads
100% (251/251) SNP calls agree with known genotypes
5 discrepancies with genotype data but SBS calls corroborated by capillary sequencing of same sample
40% (154) novel SNPs called with high confidence
100% of novel calls examined are confirmed by capillary data
Paired reads expected to give 100% coverage and detect other types of genetic variation

X Chromosome Sequencing and SNPs
COVERAGES
Female sample flow sorted to ~90% purity
155Mb, 90% sequence-able with 25-mers
3.2Gb, 17x in single reads and 10x in paired reads
Similar results with multiple alignment methods (ELAND, SSAHA, MAPASS)
SNP VARIATION
117,418 SNPs called
Compared to 13,605 genotypes (HapMap 550 panel)
Concordance 94.7% total, 99.97% homozygous, 85% heterozygous calls
Depth is important to sample both heterozygous alleles reliably

X Chromosome Sequencing: Structural Variants
STRUCTURAL VARIATION
90M paired reads (2 x 25-30 bases, 160 bp apart) aligned uniquely to X
460 pairs show anomalous gap sizes supporting 23 events (>2 pairs / event)
600 pairs show anomalous orientation supporting 83 events (>2 pairs / event)
[Figure labels: anomalous gap size; inverted compared to reference]
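The idea of flagging read pairs whose gap size or orientation disagrees with the library design can be sketched as follows. This is a simplified illustration, not the analysis used for the slide: the function name, the strand encoding, and the 60 bp tolerance are assumptions, and only the 160 bp expected separation comes from the slide.

```python
def classify_pair(pos1: int, strand1: str, pos2: int, strand2: str,
                  expected_gap: int = 160, tolerance: int = 60) -> str:
    """Label an aligned read pair as normal, anomalous_gap or anomalous_orientation.

    Positions are alignment start coordinates on the reference. A normal pair
    is assumed to have the left read on '+' and the right read on '-'.
    The tolerance value is hypothetical and would be tuned per library.
    """
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pos1, strand1), (pos2, strand2)]
    )
    if (left_strand, right_strand) != ("+", "-"):
        return "anomalous_orientation"
    if abs((right_pos - left_pos) - expected_gap) > tolerance:
        return "anomalous_gap"
    return "normal"

print(classify_pair(1000, "+", 1160, "-"))
print(classify_pair(1000, "+", 5000, "-"))
print(classify_pair(1000, "+", 1160, "+"))
```

As on the slide, a real caller would then require more than 2 anomalous pairs at the same locus before reporting a structural event.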

11 to 0.) Mean Gap Spacing: 51.25 (across all 19 breeds). University of Alberta. etc.5% (not including outgroup).3% (with outgroup) Average MAF: 0. 0. University of Missouri. USDA-ARS US Meat Animal Research Center Over 54.22 (within breeds) Goals of Initial Research – Map QTLs and aid in selective breeding of cattle – Gene discovery for better meat and milk production and quality – Discover elements responsible for disease.000 robust SNPs per assay (including 23K+ from the Genome Analyzer. 12K+ from Bovine HapMap. growth. and development – Assist modeling of human disease 36 .5Kb Average Call Rate: 99.Bovine SNP Developed in collaboration with: USDA-ARS Beltsville Agricultural Research Center: Bovine Functional Genomics Laboratory and Animal Improvement Programs Laboratory. 99.

RNA Applications

Digital Gene Expression Tag Profiling
6M tags, 31.5k unique tags
24% (7,477 of 31,496) differentially expressed (>2-fold change at p=0.95)
[Scatter plots of Log2 counts: Brain to Brain, UHR to UHR, Brain to UHR]
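The ">2-fold change" criterion on log2 tag counts can be illustrated with a small calculation. This is a sketch only: the function names and the pseudocount of 1 are assumptions, and the significance test behind the slide's p value is not modeled here.

```python
import math

def log2_fold_change(count_a: int, count_b: int, total_a: int, total_b: int,
                     pseudocount: float = 1.0) -> float:
    """Log2 ratio of library-size-normalized tag counts (sample A over sample B)."""
    norm_a = (count_a + pseudocount) / total_a
    norm_b = (count_b + pseudocount) / total_b
    return math.log2(norm_a / norm_b)

def exceeds_two_fold(lfc: float) -> bool:
    """True when the change is more than 2-fold in either direction."""
    return abs(lfc) > 1.0

# Hypothetical tag seen 399 times in brain and 99 times in UHR, 6M tags per library.
lfc = log2_fold_change(399, 99, 6_000_000, 6_000_000)
print(f"log2 fold change: {lfc:.2f}, more than 2-fold: {exceeds_two_fold(lfc)}")

# Fraction reported on the slide: 7,477 of 31,496 unique tags.
print(f"{7477 / 31496:.1%} of unique tags")
```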

Features and Benefits of Solexa Technology for Gene Expression
Completely Open System gives Ability to Study Any Organism
– Study any transcript from any gene in any organism
– Does not require any a priori knowledge of the organism
Digital Gene Expression Data gives Robust Quantitative Data
– Data is more robust and persistent than microarray data
Unparalleled Depth of Coverage and Tunable Coverage gives Unlimited Dynamic Range with High Confidence of Rare and Null Expression Levels
– At least 25X more coverage than SAGE
Genome Wide Data gives Orthogonal Genome Wide Validation of Microarray Data
– Truly orthogonal method that interrogates the whole genome
– Genome-wide validation vs single gene validation with TaqMan

RNA sequencing

A typical gene structure

Full Transcriptome Sample Prep Process
– Pull-out Poly-A containing mRNA
– Fragment purified mRNA
– Random Priming → cDNA
– Make 2nd Strand cDNA
– Ligate Sequencing Adapters
– Size Select on Gel
– Enrich with PCR
– Grow Clusters
– Sequenced by Genome Analyzer

Read alignment distribution
20 to 30 M reads per sample, data based on human brain sample and UHR sample
Among the 50 M covered genome bases (uniquely mapped):
– 83% located in reference seq genes: 47% in exons (17x) and 36% in introns (2.5x)
– 17% located outside of ref seq genes (5x), indicative of novel genes

Human Body Map Project
Tissues:
• Brain
• Heart
• Skeletal Muscle
• Liver
• Lymph Node
• Breast
• Colon
• Testis
• Adipose
Cell Lines:
• UHR
• MCF-7
• BT-484
• HME
• MB-435
• T47-D

Spliced variation detection
In 15 human tissues and cell lines we can validate more than 2/3 of all previously known splice junctions in the human transcriptome
We detect more than 5,000 new splicing events in the human transcriptome that have never before been seen in EST data

Regulation Applications

Illumina’s Portfolio for MicroRNA Analysis
Digital Gene Expression: Discovery and profiling
– Any species
– Rare transcripts
– Array validation
BeadArray™: Profiling
– High throughput, cost effective for human and mouse
Feed panel content pipeline with novel miRNAs from DGE studies

Small RNA Construct
– Purify small RNA (18-36 nts)
– Add 5’-RNA Adaptor
– Add 3’-RNA Adaptor
[Construct labels: PCR Primer 1, Sequencing Primer, miRNA, PCR Primer 2]

Examples of mouse Small RNA not found in Sanger miRNA db
TAG Sequence                   Counts
AAGGTAGATAGAACAGGTCTTG         1226
TACGTCATCGTCGTCATCGTTAT        700
AGAATTGTGGCTGGACATCTGT         347
GGCTGGTCCGAAGGTAGTGAGTT        154
TGTAGGGATGGAAGCCATGAAT         48
CCACGAGGACGAGACGTAGCG          48
TGCAGGTCGTCTTGCAGGGCTTCTCG     15
Annotations listed on the slide (row assignment lost in extraction): Rfam:RF00426 (snoACA45); Rfam:RF00019 (Y); Rfam:RF00251 (mir-219); Rfam:RF00012 (U3); Rfam:RF00246 (mir-135); mouse chr14, corresponding human -> hsa-miR-598; maps to Mus musculus retrotransposon-like 1 (Rtl1)

Small RNA Discovery
Actual Sequence: AAGGTAGATAGAACAGGTCTTG
The genomic context of this sequence on chromosome 15 is:
CTGGAGACTAAGAAAATAGAGTCCTTGAAATCAAGCTGACTCTGC
TTTTAGCCTCCTAAATGAAAAGGTAGATAGAACAGGTCTTGTTTG
CAAAATAAATTCAAGACCTACTTATCTACCAACAGCA
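Placing a sequenced tag in its genomic context amounts to a string search on both strands. The sketch below uses the tag and the chromosome 15 context quoted on the slide; the helper names are illustrative, and a real pipeline would use an aligner rather than exact matching.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]

def locate_tag(context: str, tag: str):
    """Return (0-based index, strand) of the first exact match, or None."""
    index = context.find(tag)
    if index >= 0:
        return index, "+"
    index = context.find(reverse_complement(tag))
    if index >= 0:
        return index, "-"
    return None

# Sequences copied from the slide.
tag = "AAGGTAGATAGAACAGGTCTTG"
context = (
    "CTGGAGACTAAGAAAATAGAGTCCTTGAAATCAAGCTGACTCTGC"
    "TTTTAGCCTCCTAAATGAAAAGGTAGATAGAACAGGTCTTGTTTG"
    "CAAAATAAATTCAAGACCTACTTATCTACCAACAGCA"
)

hit = locate_tag(context, tag)
print(hit)
```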

miRNA Detection
2M miRNA tags:
– 85% match >200 known entries
– 8% match rRNA, tRNA etc.
– 7% novel putative miRNAs, e.g. novel sequence TTAATATCGGACAACCATTGT is part of putative miRNA structure and maps to highly conserved element on chr 14 that has no previously known features.

Differentially Expressed miRNA tags
3.8 MILLION CALLS FOR ND13 AND 2.9 MILLION FOR ND13/MEIS1
BCCA

Genome Positions Actively Regulated
ChIP-Seq: Discovering Transcription Factor Binding Sites Using ChIP
– Cross link DNA-Protein in cell
– Purify chromatin with Ab specific to transcription factor & recover trapped DNA
– Generate 6M tags by Solexa sequencing
– Identify clustered alignments in human genome (ENCODE)
– Results in high resolution genome-wide dataset
[Figure tracks: NRSF, Enriched by ChIP-Seq, Array Data, Mock ChIP]
Wold et al., Science 1141319, 31 May 2007

ChIP-SEQ: High Resolution Occupancy Maps
Determining binding site location:
– Place ChIP-SEQ reads on genome
– Identify tag peak
– Identify nearest NRSE to the peak
At Least 10-fold Better Resolution than ChIP-ChIP
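The three steps above (place reads, find the tag peak, find the nearest motif site) can be sketched with simple binning. This is an illustrative toy, not the published method: the function names, the 50 bp bin size, and the example coordinates are all assumptions.

```python
from collections import Counter

def find_tag_peak(read_starts, bin_size: int = 50):
    """Return (peak_center, tag_count) for the bin holding the most read starts."""
    counts = Counter(pos // bin_size for pos in read_starts)
    # Highest count wins; ties go to the leftmost bin.
    best_bin, tag_count = max(counts.items(), key=lambda item: (item[1], -item[0]))
    return best_bin * bin_size + bin_size // 2, tag_count

def nearest_site(peak_center: int, site_positions):
    """Return the candidate site (e.g. an NRSE motif position) closest to the peak."""
    return min(site_positions, key=lambda site: abs(site - peak_center))

# Hypothetical read start coordinates and motif positions.
reads = [120, 130, 140, 145, 900, 2000]
peak, depth = find_tag_peak(reads)
print(peak, depth, nearest_site(peak, [100, 1000]))
```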

ChIP-Seq versus ChIP-chip
Significantly less starting material (down to 10ng)
No cross-hybridization
– ChIP-chip shows a significant amount of cross-hybridization
– Especially in complex genomes, such as those from mammals
Accessibility to regions of the genome that are inaccessible to microarrays because of lack of hybridization
Better sensitivity
Higher resolution
Lower cost

Me-DIP (Histone Methylation ChIP)

The ENCODE project
Sequencing as a powerful methodology
[Figure labels: SEQ, ChIP-PET]

BeadArray Technology

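Single Nucleotide Polymorphism (SNP)
SNP (genetic diversity)
– Within a population, the base variation (mutation) arose long enough ago that carriers make up more than 1% of the whole population, so it is called a polymorphism.
– Ordinary variation (mutation): carriers make up less than 1% of the population.
– SNPs occur in the human genome at a frequency of 1/300 to 1/1000.
– The content of these sites often determines individual differences, affecting outward traits such as height and skin color as well as intelligence, physical strength, constitution, cancer occurrence, and the metabolic response to various drugs.
– Because SNP sites occur at high frequency in the human genome, they are also good genome markers and can be used for genome-wide Linkage Disequilibrium analysis.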

SNP Genotyping
Genome scans for Linkage Analysis: 1,000 to 5,000 markers
Genome scans for Association Studies: < 100,000 markers
Candidate Gene LD mapping: 100 to 1,000 markers per cM
High density SNP mapping: requires more efficient technologies and computational capability to analyze or recognize patterns from hundreds of thousands of SNPs.
Currently resource intensive to find/characterize SNPs: regional and gene specific SNPs will probably come first.


Disease Study Evolution
Columns: Disease Type | Study Design | Product Criteria
Complex Disease – Complex SNP
– Disease type: Many genes; Environmental factors; Epigenetic factors; Very small gene effects
– Study design: Case/Control, Cohort
– Product criteria: > 1M Markers; CNV/Methylation; SNPs in genes; MAF <> 0.05
Common Disease – Common SNP
– Disease type: Multiple genes; Smaller gene effects; Environmental factors
– Study design: Case/Control Study
– Product criteria: 100-650K SNPs; TagSNPs; MAF >0.05; CNV
Mendelian Disease
– Disease type: Single gene; Large gene effect
– Study design: Family based / Linkage
– Product criteria: Small number of SNPs; SNP Information content

Genome-Wide Association Studies
SNP panel of ~100,000-600,000 markers
Genotype patients with SNP panel
Test every SNP for association in study population
– Patient n=450, 25% maf
– Control n=450, 10% maf
– P-value 0.0000005
Association indicates that a nearby candidate gene may be underlying cause
Computational analysis identifies nearby genes and determines structure
Identify causative SNP by fine mapping or sequencing

Whole Genome Association Study Design for Genetic Studies
Advantages
– Detects smaller gene effects in complex disease traits
– Association defines relatively small region (1 – few genes)
– Does not require a priori knowledge for genes involved in disease
– Can test for interactions of many genes
Disadvantages
– Requires thousands of samples to find significant association
– Extremely large datasets are generated (i.e. 2000 samples X 317k loci = 634 million genotypes)
– Results may be difficult to interpret
– Studies are expensive (2000 samples X $660/sample = $1.3M)
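The dataset size and cost quoted under Disadvantages are simple products, reproduced here as a worked example. The variable names are illustrative; the inputs are the slide's own figures.

```python
# Inputs taken from the slide.
samples = 2000
loci_per_sample = 317_000        # 317k loci
cost_per_sample_usd = 660

total_genotypes = samples * loci_per_sample
total_cost_usd = samples * cost_per_sample_usd

print(f"{total_genotypes / 1e6:.0f} million genotypes")
print(f"${total_cost_usd / 1e6:.2f}M total genotyping cost")
```

The products are 634 million genotypes and $1.32M, which the slide rounds to $1.3M.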

BeadArray: Microwell Fabrication
Optical fiber: strand, core, cladding; acid etch
Silicon wafer: photoresist, plasma etching, cleaning
3 µm beads in wells

Array Matrix and BeadChip Formats
Sentrix Array Matrix
– 1.4mm diameter
– ~50,000 Beads
– 50,000/30 = 1666 types
Sentrix BeadChip
– 15.75 mm, 1.8 mm
– ~900,000 Beads
– 900,000/30 = 30,000 types (genes)

More Information
[Plot: Features / cm² (x 10^6) vs. center-to-center feature distance (µm), comparing BeadArray, Photolithographic, Ink Jet, and Spotted arrays]

Bead Design
Genotyping (GoldenGate Assay) and DASL Assay (GEX for FFPE)
– Address 23 b: 23-base address code
Expression (Direct Hyb) and Whole Genome Genotyping (Infinium Assay)
– Address 23 b + Probe 50 b: 50-base gene-specific probe linked to 23-base address code

Array Formats
Sentrix Array Matrix
BeadChips

BeadStation

Feature: Benefit
– Dual confocal laser scanner: Two-color detection
– Broad dynamic range: More reliable data
– 2 fluor / μm2 detection: Highest sensitivity
– BeadScan Settings: Flexibility
– <1 micron resolution: Highest quality data
– <90 mins / SAM, 25-90 mins / BeadChip: Rapid results
– Compact bench-top design: Saves lab space

Only Illumina Offers the Complete Whole Genome Association Workflow
APPLICATION: WGA Study, Case/Control, Cohort (100K – 650K SNPs) | Focused Replication (7K – 60K SNPs) | Fine Mapping, 2nd Replication (60b – 1500K SNPs) | Resequencing, Discovery
PRODUCT: 1M Duo, Human 610, CNV 370, Human 510S | iSelect | GoldenGate VeraCode | Solexa
ASSAY: Infinium | Infinium | GoldenGate | Clonal Single Molecule Array

High-Throughput Genotyping and Copy Number Analysis


Two Tools for Genotyping Success
Infinium Assay (Single Base Extension, or Primer Extension)
– For standard SNP projects
– Multiplexed from 6,000 to 1,000,000
– Application: Disease Association Study; SNP-CGH Assay; Candidate Gene Mapping; Copy Number Variants Assay
GoldenGate® Assay (Allele Specific Primer Extension)
– For custom and standard SNP projects
– Multiplexed from 96 to 1536 (and multiples thereof)
– Application: Standard Panel Assay – Linkage, MHC; Custom Panel Assay – Candidate Gene Mapping, Fine Mapping

Illumina Infinium Assay SNP Offering
PRODUCT
– iSelect (12x 20000-60,800)
– Linkage-12
– NS-12
– CNV370 Quad
– Human610 Quad
– 510S Exon-Centric
– Human 1M Duo

From 7K to 1M SNPs – One Assay: Infinium
GENOMIC DNA 750ng > WGA > FRAGMENT DNA > HYBRIDIZATION (UNLABELED DNA) > ALLELE DETECTION THROUGH SINGLE BASE EXTENSION (ddNTP)
Genotype calls: TT, TC, CC

Allele Detection Through Single-Base Extension
[Figure labels: BEAD, 50mer OLIGO SEQUENCE, DNA SAMPLE, Polymerase, ~18x]
1. SELECTIVITY
2. SPECIFICITY


Effects of Copy Number Variation on gene expression
– Additional gene copy
– Increase of distance from regulatory element
– New regulatory element
– Gene interruption
[Diagram labels: REG, GENE]

Copy Number and LOH Analysis
Illumina Whole-Genome Genotyping Portfolio: Copy and allelic level discrimination, High resolution
TECHNIQUE: Banding; M-FISH/SKY; (Array) CGH; SNP-array; Digital karyotyping; Interphase
DETECTION: Normal diploid genome; Aneuploidy; Reciprocal translocation; Unbalanced translocation; Amplification (distributed insertions); Heterogeneity (cell-to-cell variability); LOH (both paternal or maternal)
[Table: +/– detection matrix not recoverable from the extracted text]
Modified from Albertson, D. G. et al. Chromosome aberrations in solid tumors. Nature Genet. 34, 369-376 (2003).
Speicher, M. R. and Carter, N. P. The new cytogenetics: blurring the boundaries with molecular biology. Nat Rev Genet. 2005 Oct;6(10):782-92.

Illumina Offers the Allelic Ratio: Why is this Unique and Significant?
ALLELIC RATIO/ALLELIC FRACTION
Extremely precise and robust “contamination” measurement
High signal to noise ratios (~9.0)
Low noise (stdev ~0.04)
Allows estimation of percentage contamination with normal cells


From 96, 384-1536 SNPs – GoldenGate Assay
GENOMIC DNA 250ng > Activate DNA > Add Oligos > Extend, Ligate > PCR Amplification (Cy3, Cy5 primers) > Hybridize to Array Matrix or BeadChip
Genotype calls: TT, TC, CC

Applications
– Linkage Analysis – Family Case Study (6,008 SNP)
– MHC Panel (2320 SNP, 16-96 samples)
– Mouse Linkage (1500 SNP, 16-96 samples)
– Cancer Panel (1536 SNP, 16-96 samples)
– Methylation (Cancer I panel, 1536 CpG site, 96 sample/assay)
– MicroRNA Assay
– Custom SNP Genotyping (96-1536 SNP)

Illumina Custom SNP Genotyping Projects
Human, Soy Bean, Sheep, Mouse, Corn, Cattle, Rat, Canine, Citrus, Pine, Horse, Wheat, Chicken, Swine, Honey Bee, Barley, and more…!

Canine SNP Collaboration
Developed in collaboration with UC Davis
Over 22,000 validated SNP probes derived from the CanFam2.0 assembly completed by the Broad Institute
Average SNP density: 8 SNPs per Mb
Average Call Rate: 99.81%
Average MAF: 0.27 (across all 9 breeds), 0.18 to 0.25 (within breeds)
Goals of Initial Research
– To develop a canine panel with SNP coverage uniformly distributed across the dog genome, with an emphasis on selecting markers broadly informative across breed populations
– Established cohorts for epidemiology
– Centralized informatics (e.g., MGI-like)
– Affordable & reliable high density SNP typing

Equine SNP Collaboration
Developed in collaboration with: International Equine Genome Mapping Workshop and the Morris Animal Foundation's Equine Genome Consortium
Currently in development!
Over 50,000 SNPs
Content is derived from data generated by the Broad Institute’s Equine Genome Sequencing Project.
Goals of Initial Research
– Identify genes and mutations that contribute to heritable diseases such as musculoskeletal disease, laminitis, recurrent airway obstruction, and bone disease
– Offer new diagnostic and therapeutic approaches to reduce animal suffering
– Promote equine health and welfare
– Manage breeding programs

Illumina Custom SNP Genotyping Projects
Species    # of SNPs    Assay          Platform
Human      Various      Infinium/GG    BeadStation
Cat        96           GoldenGate     BeadXpress
Wheat      1536         GoldenGate     BeadStation
Corn       16,704       Infinium       BeadStation
Pine       7,600        Infinium       BeadStation
Bovine     62K novel    Sequencing     Genome Analyzer
Approx. Genome Size values shown on the slide (row assignment lost in extraction): 3,000 Mb (three species), 2,700 Mb, 17,000 Mb, 20,000 Mb

High-Throughput DNA Methylation Profiling Platform from Illumina

DNA is packed into Chromatin

What is DNA Methylation?
Methylation is an enzyme-mediated chemical modification that adds methyl (CH3) groups at selected sites on proteins, DNA and RNA.
DNA methylation is the only known natural modification of DNA, which only occurs in cytosine-guanine (CpG) dinucleotides.
In humans and most mammals, 70-80% of all CpG sites in DNA are methylated.
– However, this methylation occurs primarily in areas where CpG density is low, or at repeat DNA sites, such as Alu elements.
Most CpG islands (where CpG density is high) are unmethylated.
– Exceptions include imprinted genes and X-chromosome genes in females (X-chromosome silencing).
DNA methylation plays a critical role in the regulation of gene expression.
Aberrant DNA methylation is associated with
– Cancer (silencing of tumor suppressor genes)
– Neurological disorders (Rett syndrome)
– Autoimmune diseases

Genomic imprinting
The phenomenon of parent-of-origin gene expression. The expression of a gene depends upon the parent who passed on the gene.
Imprinting is regulated by DNA methylation and chromatin structure.
Genomic imprinting plays a critical role in fetal growth and development.
Prader-Willi syndrome and Angelman syndrome:
– Due to deletion of the same part of chromosome 15.
– When the deletion involves the chromosome 15 that came from the father, the child has Prader-Willi syndrome.
– When the deletion involves the chromosome 15 that came from the mother, the child has Angelman syndrome.

Panthera tigris (Tiger), Panthera leo (Lioness): Tigon

Panthera leo (Lion), Panthera tigris (Tigress): Liger

Prader-Willi syndrome (PWS)

Angelman syndrome (AS)

http://www-ermm.cbcu.cam.ac.uk

Genetics versus Epigenetics


DNA Methylation changes in cancer
Normal cell vs. cancer cell: Cancer-associated CpG Island Hypermethylation
[Figure legend: Methylated CpG, Unmethylated CpG]

Illumina’s Methylation Assay
Treat genome with bisulfite
– Converts unmethylated cytosine to uracil
– Methylcytosine remains unchanged
“Genotype” the converted DNA at CpG sites using the GoldenGate assay
– Multiplexes to 1536 sites analyzed simultaneously
– Probe design assumes same methylation status for adjacent CpG sites (Verified by bisulfite sequencing)
A specific CpG site is assayed, not a general region defined by restriction sites or general fragmentation
Animated ppt slides on detailed assay scheme and workflow are available on our website: www.illumina.com/methylation
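The bisulfite step turns a methylation difference into a sequence difference that a genotyping assay can read. The sketch below simulates that idea on a short string; the function names and the toy sequence are illustrative, and uracil is written as T because it is copied as thymine during amplification.

```python
def bisulfite_convert(seq: str, methylated_positions: set) -> str:
    """Simulate bisulfite treatment followed by amplification.

    Unmethylated C is converted (C -> U, read as T); methylated C is unchanged.
    Positions are 0-based indices into seq.
    """
    converted = []
    for index, base in enumerate(seq):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)

def call_methylation(converted_base: str) -> str:
    """Interpret the base observed at a CpG cytosine after conversion."""
    return "methylated" if converted_base == "C" else "unmethylated"

# Toy sequence with CpG cytosines at positions 1 and 4; only position 1 is methylated.
converted = bisulfite_convert("ACGTCG", {1})
print(converted, call_methylation(converted[1]), call_methylation(converted[4]))
```

Reading C or T at the CpG cytosine is then equivalent to a two-allele genotype call, which is why a SNP genotyping assay can be reused for methylation.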

Illumina Methylation products
GoldenGate Methylation (Cancer Panel I, Custom Design)
– Assay: GoldenGate
– Platform: BeadArray
– Content: Custom or Cancer
– CpG Sites: up to 1,536
– Sample Throughput: 96 / SAM
VeraCode Methylation (Custom Design)
– Assay: GoldenGate
– Platform: BeadXpress
– Content: Custom
– CpG Sites: 48-384
– Sample Throughput: 80 / hr (@ 96-plex)
Infinium Methylation Products (HumanMethylation27)
– Assay: Infinium
– Platform: BeadArray
– Content: Genome-wide
– CpG Sites: 27,578
– Sample Throughput: 12 / BeadChip
Other rows on the slide (cell assignment lost in extraction): Reproducibility and Inter-Product Concordance (r2 values shown: 0.98, 0.98, 0.98, 0.97, 0.81), Internal Controls, FFPE Compatible, Analysis Software (BeadStudio), Integrate with Expression

MicroRNA Expression Profiling Product
Human
– 735 human mature miRNA sequences are targeted
Mouse
– 380 mouse mature miRNA sequences are targeted
Content
– Sanger miRBase v9.1 (February 2007 Release) and recent literature
– Novel content discovered using sequencing technology to be included in fixed panels
– Regular updates to provide the most current content

Whole Genome GEX BeadChips (Human, Mouse, Rat etc)
HumanRef-8, MouseRef-8: 1 sample (24k transcripts), Total: 8 samples/slide
Human-6, Mouse-6: 1 sample (48k transcripts), Total: 6 samples/slide

A Complete Offering in Expression
WHOLE GENOME ARRAYS: Human WG-6, Mouse WG-6, Human Ref-8, Mouse Ref-8, Rat Ref-12
CATALOG FOCUSED ARRAYS: Human Sampler, Cancer Panel
CUSTOM FOCUSED ARRAYS: Custom Arrays, FFPE EXPRESSION (DASL), Custom DAPs
[Axis: GENES, from 10,000’s down to 100’s]