(12) United States Patent
Fan et al.
(10) Patent No.: US 8,296,076 B2
(45) Date of Patent: Oct. 23, 2012

(54) NONINVASIVE DIAGNOSIS OF FETAL ANEUOPLOIDY BY SEQUENCING
(75) Inventors: Hei-Mun Christina Fan, Fremont, CA (US); Stephen R. Quake, Stanford, CA (US)
(73) Assignee: The Board of Trustees of the Leland Stanford Junior University, Palo Alto, CA (US)
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.
(21) Appl. No.: 13/452,083
(22) Filed: Apr. 20, 2012
(65) Prior Publication Data: US 2012/0208710 A1, Aug. 16, 2012

Related U.S. Application Data
Application No. 12/696,509, filed on Jan. 29, 2010, now Pat. No. 8,195,415, which is a division of application No. 12/560,708, filed on Sep. 16, 2009.
(60) Provisional application No. 61/098,758, filed on Sep. 20, 2008.

(51) Int. Cl.: G01N 33/48 (2006.01)
(52) U.S. Cl.: 702/20
(58) Field of Classification Search: 702/20, 702/182-185. See application file for complete search history.

Primary Examiner: Edward Raymond
(74) Attorney, Agent, or Firm: David J. Aston; Peters Verny, LLP

(57) ABSTRACT
Disclosed is a method to achieve digital quantification of DNA (i.e., counting differences between identical sequences) using direct shotgun sequencing followed by mapping to the chromosome of origin and enumeration of fragments per chromosome. The preferred method uses massively parallel sequencing, which can produce tens of millions of short sequence tags in a single run and enables a sampling that can be statistically evaluated. By counting the number of sequence tags mapped to a predefined window in each chromosome, the over- or under-representation of any chromosome in maternal plasma DNA contributed by an aneuploid fetus can be detected. This method does not require the differentiation of fetal versus maternal DNA. The median count of autosomal values is used as a normalization constant to account for differences in total number of sequence tags and is used for comparison between samples and between chromosomes.

17 Drawing Sheets
DRAWING SHEETS (U.S. Patent, Oct. 23, 2012, US 8,296,076 B2)

[Sheet 1 of 17, FIG. 1A: sequence tag density relative to the corresponding value of gDNA control, plotted by chromosome.]
[Sheet 2 of 17, FIG. 1B: chromosome 21; sequence tag density of chromosome 21 relative to the median value of disomy 21 cases. Series: trisomy 21 fetuses, disomy 21 fetuses, adult male plasma DNA.]
[Sheet 3 of 17 (figure label not legible): percentage of maternal cell-free DNA that originates from the fetus (%) versus gestational age (weeks). Series: normal, T13, T18 and T21 male fetuses, each estimated from different chromosomes.]
[Sheet 4 of 17, FIG. 3: normalized frequency versus size of sequenced fragment (bp); fraction of chromosome Y sequences (%) versus length of sequenced DNA fragment (bp).]
[Sheet 5 of 17 (figure label not legible): panels for randomly sheared genomic DNA and for maternal plasma DNA from a male pregnancy, plotted against distance from TSS (bp).]
[Sheet 6 of 17, FIG. 5A: mean sequence tag density.]
[Sheet 7 of 17, FIG. 5B: mean chromosomal sequence tag density versus GC content of chromosome (%).]
[Sheet 8 of 17, FIG. 5C: standard deviation of chromosomal sequence tag density versus GC content of chromosome (%).]
[Sheet 9 of 17, FIG. 6: chromosome X; % difference of chrX sequence tag density relative to that of female pregnancies. Series: adult male plasma DNA, male fetuses, female fetuses.]
[Sheet 10 of 17, FIG. 7: fetal DNA % estimated from chrX versus fetal DNA % estimated from chrY. Series: normal, T13, T18 and T21 male fetuses.]
[Sheet 11 of 17 (figure label not legible): size distribution of sequenced DNA fragments from a cell-free plasma DNA sample; x-axis: size of sequenced fragment (bp).]
[Sheet 12 of 17, FIG. 9 (panel labels partly legible): number of 50 kb bins versus number of reads per 50 kb.]
[Sheet 13 of 17, FIG. 10: sequence tag density relative to the corresponding value of gDNA control, plotted by chromosome. Series: plasma DNA from women bearing T21, T18, T13 and normal male fetuses, and from a normal adult male.]
[Sheet 14 of 17, FIG. 11: GC content (%).]
[Sheet 15 of 17 (figure label not legible): representative t statistic by chromosome for individual patient samples, each labeled with karyotype and a count in thousands.]
[Sheet 16 of 17, FIG. 13: minimum fetal DNA % at which over- or under-representation of a chromosome could be detected with 99.9% CI, versus number of reads.]
[Sheet 17 of 17, FIG. 14: log10 (minimum fetal DNA %) versus log10 (number of reads); fitted line y = -0.48x + 2.90, R2 = 0.80.]

NONINVASIVE DIAGNOSIS OF FETAL ANEUOPLOIDY BY SEQUENCING

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority from U.S. Provisional Patent Application No. 61/098,758, filed on Sep. 20, 2008, and from U.S. Utility patent application Ser. No. 12/696,509, filed Jan. 29, 2010, now U.S. Pat. No. 8,195,415, which is a divisional of U.S. application Ser. No. 12/560,708, filed Sep. 16, 2009, both of which are hereby incorporated by reference in their entirety.

STATEMENT OF GOVERNMENTAL SUPPORT

This invention was made with Government support under contract DP OD000251 awarded by the National Institutes of Health. The Government has certain rights in this invention.

REFERENCE TO SEQUENCE LISTING, COMPUTER PROGRAM, OR COMPACT DISK

Applicants submit herewith a sequence listing in an ASCII text file (@815_63_S seq. listtnt), as provided in EFS Legal Framework Notice 20 May 2010, part F-1. The file was created Apr. 19, 2012 and contains 2,695 bytes.
Applicants incorporate the contents of the sequence listing by reference in its entirety.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to the field of molecular diagnostics, and more particularly to the field of prenatal genetic diagnosis.

2. Related Art

Presented below is background information on certain aspects of the present invention as they may relate to technical features referred to in the detailed description, but not necessarily described in detail. That is, certain components of the present invention may be described in greater detail in the materials discussed below. The discussion below should not be construed as an admission as to the relevance of the information to the claimed invention or the prior art effect of the material described.

Fetal aneuploidy and other chromosomal aberrations affect 9 out of 1000 live births (1). The gold standard for diagnosing chromosomal abnormalities is karyotyping of fetal cells obtained via invasive procedures such as chorionic villus sampling and amniocentesis. These procedures impose small but potentially significant risks to both the fetus and the mother (2). Non-invasive screening of fetal aneuploidy using maternal serum markers and ultrasound is available but has limited reliability (3-5). There is therefore a desire to develop non-invasive genetic tests for fetal chromosomal abnormalities.

Since the discovery of intact fetal cells in maternal blood, there has been intense interest in trying to use them as a diagnostic window into fetal genetics (6-9).
While this has not yet moved into practical application (10), the later discovery that significant amounts of cell-free fetal nucleic acids also exist in maternal circulation has led to the development of new non-invasive prenatal genetic tests for a variety of traits (11, 12). However, measuring aneuploidy remains challenging due to the high background of maternal DNA; fetal DNA often constitutes <10% of total DNA in maternal cell-free plasma (13).

Recently developed methods for aneuploidy detection focus on allelic variation between the mother and the fetus. Lo et al. demonstrated that allelic ratios of placental specific mRNA in maternal plasma could be used to detect trisomy 21 in certain populations (14). Similarly, they also showed the use of allelic ratios of imprinted genes in maternal plasma DNA to diagnose trisomy 18 (15). Dhallan et al. used fetal specific alleles in maternal plasma DNA to detect trisomy 21 (16). However, these methods are limited to specific populations because they depend on the presence of genetic polymorphisms at specific loci. We and others argued that it should be possible in principle to use digital PCR to create a universal, polymorphism independent test for fetal aneuploidy using maternal plasma DNA (17-19).

An alternative method to achieve digital quantification of DNA is direct shotgun sequencing followed by mapping to the chromosome of origin and enumeration of fragments per chromosome. Recent advances in DNA sequencing technology allow massively parallel sequencing (20), producing tens of millions of short sequence tags in a single run and enabling a deeper sampling than can be achieved by digital PCR. As is known in the art, the term "sequence tag" refers to a relatively short (e.g., 15-100 nucleotide) nucleic acid sequence that can be used to identify a certain larger sequence, e.g., be mapped to a chromosome or genomic region or gene.
These can be ESTs or expressed sequence tags obtained from mRNA.

Specific Patents and Publications

Science 309:1476 (2 Sep. 2005) News Focus "An Earlier Look at Baby's Genes" describes attempts to develop tests for Down Syndrome using maternal blood. Early attempts to detect Down Syndrome using fetal cells from maternal blood were called "just modestly encouraging." The report also describes work by Dennis Lo to detect the Rh gene in a fetus where it is absent in the mother. Other mutations passed on from the father have reportedly been detected as well, such as cystic fibrosis, beta-thalassemia, a type of dwarfism and Huntington's disease. However, these results have not always been reproducible.

Venter et al., "The sequence of the human genome," Science, 2001 Feb. 16; 291(5507):1304-51 discloses the sequence of the human genome, which information is publicly available from NCBI. Another reference genomic sequence is a current NCBI build as obtained from the UCSC genome gateway.

Wheeler et al., "The complete genome of an individual by massively parallel DNA sequencing," Nature, 2008 Apr. 17; 452(7189):872-6 discloses the DNA sequence of a diploid genome of a single individual, James D. Watson, sequenced to 7.4-fold redundancy in two months using massively parallel sequencing in picoliter-size reaction vessels. Comparison of the sequence to the reference genome led to the identification of 3.3 million single nucleotide polymorphisms, of which 10,654 cause amino-acid substitution within the coding sequence.

Quake et al., US 2007/0202525 entitled "Non-invasive fetal genetic screening by digital analysis," published Aug. 30, 2007, discloses a process in which maternal blood containing fetal DNA is diluted to a nominal value of approximately 0.5 genome equivalent of DNA per reaction sample.

Chiu et al., "Noninvasive prenatal diagnosis of fetal chromosomal aneuploidy by massively parallel genomic DNA sequencing of DNA in maternal plasma," Proc. Natl. Acad. Sci., 105(51):20458-20463 (Dec. 23, 2008) discloses a method for determining fetal aneuploidy using massively parallel sequencing. [...] was made by calculating a "z score." Z scores were compared with reference values, from a population restricted to euploid male fetuses. The authors noted in passing that G/C content affected the coefficient of variation.

Lo et al., "Diagnosing Fetal Chromosomal Aneuploidy Using Massively Parallel Genomic Sequencing," US 2009/0029377, published Jan. 29, 2009, discloses a method in which respective amounts of a clinically-relevant chromosome and of background chromosomes are determined from results of massively parallel sequencing. It was found that the percentage representation of sequences mapped to chromosome 21 is higher in a pregnant woman carrying a trisomy 21 fetus when compared with a pregnant woman carrying a normal fetus. For the four pregnant women each carrying a euploid fetus, a mean of 1.345% of their plasma DNA sequences were aligned to chromosome 21.

Lo et al., "Determining a Nucleic Acid Sequence Imbalance," US 2009/0087847, published Apr. 2, 2009, discloses a method for determining whether a nucleic acid sequence imbalance exists, such as an aneuploidy, the method comprising deriving a first cutoff value from an average concentration of a reference nucleic acid sequence in each of a plurality of reactions, wherein the reference nucleic acid sequence is either the clinically relevant nucleic acid sequence or the background nucleic acid sequence; comparing the parameter to the first cutoff value; and based on the comparison, determining a classification of whether a nucleic acid sequence imbalance exists.

BRIEF SUMMARY OF THE INVENTION

The following brief summary is not intended to include all features and aspects of the present invention, nor does it imply that the invention must include all features and aspects discussed in this summary.

The present invention comprises a method for analyzing a maternal sample, e.g., from peripheral blood. It is not invasive into the fetal space, as is amniocentesis or chorionic villi sampling. In the preferred method, fetal DNA which is present in the maternal plasma is used. The fetal DNA is in one aspect of the invention enriched due to the bias in the method towards shorter DNA fragments, which tend to be fetal DNA. The method is independent of any sequence difference between the maternal and fetal genome. The DNA obtained, preferably from a peripheral blood draw, is a mixture of fetal and maternal DNA. The DNA obtained is at least partially sequenced, in a method which gives a large number of short reads. These short reads act as sequence tags in that a significant fraction of the reads are sufficiently unique to be mapped to specific chromosomes or chromosomal locations known to exist in the human genome. They are mapped exactly, or may be mapped with one mismatch, as in the examples below. By counting the number of sequence tags mapped to each chromosome (1-22, X and Y), the over- or under-representation of any chromosome or chromosome portion in the mixed DNA contributed by an aneuploid fetus can be detected. This method does not require the sequence differentiation of fetal versus maternal DNA, because the summed contribution of both maternal and fetal sequences in a particular chromosome or chromosome portion will be different as between an intact, diploid chromosome and an aberrant chromosome, i.e., with an extra copy, missing portion or the like. In other words, the method does not rely on a priori sequence information that would distinguish fetal DNA from maternal DNA. The abnormal distribution of a fetal chromosome or portion of a chromosome (i.e., a gross deletion or insertion) may be determined in the present method by enumeration of sequence tags as mapped to different chromosomes.
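The counting step just described can be sketched as follows. This is a minimal illustration only, assuming the tags have already been mapped and are available as (chromosome, position) pairs; the function names, the "chr1" to "chrY" naming and the input layout are hypothetical and are not taken from the disclosure.

```python
from collections import Counter

# Assumed naming convention for this sketch: "chr1".."chr22", "chrX", "chrY".
CHROMOSOMES = [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY"]


def count_tags_per_chromosome(mapped_tags):
    """Count mapped sequence tags per chromosome.

    mapped_tags: iterable of (chromosome, position) pairs, one per tag.
    Tags on unrecognized chromosome names are ignored.
    """
    counts = Counter({chrom: 0 for chrom in CHROMOSOMES})
    for chrom, _position in mapped_tags:
        if chrom in counts:
            counts[chrom] += 1
    return counts


def representation(counts):
    """Fraction of all counted tags that falls on each chromosome."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mapped tags")
    return {chrom: n / total for chrom, n in counts.items()}
```

In such a sketch, a chromosome present in an extra fetal copy would be expected to show a slightly larger fraction than in a euploid sample; raw fractions are not directly comparable across samples without a normalization step.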
The median count of autosomal values (i.e., number of sequence tags per autosome) is used as a normalization constant to account for differences in total number of sequence tags and is used for comparison between samples and between chromosomes. The term "chromosome portion" is used herein to denote either an entire chromosome or a significant fragment of a chromosome. For example, moderate Down syndrome has been associated with partial trisomy 21q22.2→qter. By analyzing sequence tag density in predefined subsections of chromosomes (e.g., 10 to 100 kb windows), a normalization constant can be calculated, and chromosomal subsections quantified (e.g., 21q22.2). With large enough sequence tag counts, the present method can be applied to arbitrarily small fractions of fetal DNA. It has been demonstrated to be accurate down to 6% fetal DNA concentration. Exemplified below is the successful use of shotgun sequencing and mapping of DNA to detect fetal trisomy 21 (Down syndrome), trisomy 18 (Edwards syndrome), and trisomy 13 (Patau syndrome), carried out non-invasively using cell-free fetal DNA in maternal plasma. This forms the basis of a universal, polymorphism-independent non-invasive diagnostic test for fetal aneuploidy.
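One way to picture the windowing and median normalization described above is the sketch below. It is an illustration under stated assumptions, not the disclosed implementation: the 50 kb window size is simply one value inside the 10 to 100 kb range given in the text, the per-chromosome density is taken here as the median tag count over windows, windows with no tags are simply absent, and all function names are hypothetical.

```python
from collections import defaultdict
from statistics import median

WINDOW_SIZE = 50_000  # bp; assumed value within the 10 to 100 kb example range


def window_counts(mapped_tags, window_size=WINDOW_SIZE):
    """Count tags in contiguous, non-overlapping windows.

    Returns {chromosome: {window_index: tag_count}}. Only windows that
    received at least one tag appear in the result.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for chrom, position in mapped_tags:
        counts[chrom][position // window_size] += 1
    return counts


def chromosome_density(counts):
    """Median tags-per-window for each chromosome (an assumed summary)."""
    return {chrom: median(windows.values()) for chrom, windows in counts.items()}


def normalize(density, autosomes):
    """Divide each chromosome's density by the median of the autosomal values."""
    constant = median(density[c] for c in autosomes if c in density)
    return {chrom: value / constant for chrom, value in density.items()}
```

After normalization, values near 1 indicate a chromosome represented like the typical autosome in the same sample, which is what allows comparison between samples sequenced to different depths.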
The sequence data also allowed us to characterize plasma DNA in unprecedented detail, suggesting that it is enriched for nucleosome bound fragments. The method may also be employed so that the sequence data obtained may be further analyzed to obtain information regarding polymorphisms and mutations.

Thus, the present invention comprises, in certain aspects, a method of testing for an abnormal distribution of a specified chromosome portion in a mixed sample of normally and abnormally distributed chromosome portions obtained from a single subject, such as a mixture of fetal and maternal DNA in a maternal plasma sample. One carries out sequence determinations on the DNA fragments in the sample, obtaining sequences from multiple chromosome portions of the mixed sample to obtain a number of sequence tags of sufficient length of determined sequence to be assigned to a chromosome location within a genome and of sufficient number to reflect abnormal distribution. Using a reference sequence, one assigns the sequence tags to their corresponding chromosomes, including at least the specified chromosome, by comparing the sequence to reference genomic sequence. Often there will be on the order of millions of short sequence tags that are assigned to certain chromosomes, and, importantly, certain positions along the chromosomes. One then may determine a first number of sequence tags mapped to at least one normally distributed chromosome portion and a second number of sequence tags mapped to the specified chromosome portion, both chromosomes being in one mixed sample. The present method also involves correcting for nonuniform distribution of sequence tags to different chromosomal portions.
This is explained in detail below, where a number of windows of defined length are created along a chromosome, the windows being on the order of kilobases in length, whereby a number of sequence tags will fall into many of the windows and the windows covering each entire chromosome in question, with exceptions for non-informative regions, e.g., centromere regions and repetitive regions. Various average numbers, i.e., median values, are calculated for different windows and compared. By counting sequence tags within a series of predefined windows of equal lengths along different chromosomes, more robust and statistically significant results may be obtained. The present method also involves calculating a differential between the first number and the second number which is determinative of whether or not the abnormal distribution exists.

In certain aspects, the present invention may comprise a computer programmed to analyze sequence data obtained from a mixture of maternal and fetal chromosomal DNA. Each autosome (chr. 1-22) is computationally segmented into contiguous, non-overlapping windows. (A sliding window could also be used.) Each window is of sufficient length to contain a significant number of reads (sequence tags, having about 20-100 bp of sequence) and yet still have a number of windows per chromosome. Typically, a window will be between 10 kb and 100 kb, more typically between 40 and 60 kb. There would, then, for example, accordingly be approximately between 3,000 and 100,000 windows per chromosome. Windows may vary widely in the number of sequence tags that they contain, based on location (e.g.,
near a centromere or repeating region) or G/C content, as explained below. The median (i.e., middle value in the set) count per window for each chromosome is selected; then the median of the autosomal values is used to account for differences in total number of sequence tags obtained for different chromosomes and to distinguish interchromosomal variation from sequencing bias from aneuploidy. This mapping method may also be applied to discern partial deletions or insertions in a chromosome. The present method also provides a method for correcting for bias resulting from G/C content. For example, the Solexa sequencing method was found to produce more sequence tags from fragments with increased G/C content. This bias is corrected by assigning a weight to each sequence tag based on the G/C content of a window in which the read falls. The window for G/C calculation is preferably smaller than the window for sequence tag density calculation.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a scatter plot graph showing sequence tag densities from eighteen samples, having five different genotypes, as indicated in the figure legend. Fetal aneuploidy is detectable by the over-representation of the affected chromosome in maternal blood. FIG. 1A shows sequence tag density relative to the corresponding value of genomic DNA control; chromosomes are ordered by increasing G/C content. The samples shown, as indicated, are plasma from a woman bearing a T21 fetus; plasma from a woman bearing a T18 fetus; plasma from a normal adult male; plasma from a woman bearing a normal fetus; plasma from a woman bearing a T13 fetus. Sequence tag densities vary more with increasing chromosomal G/C content. FIG. 1B is a detail from FIG. 1A, showing chromosome 21 sequence tag density relative to the median chromosome 21 sequence tag density of the normal cases. Note that the values of 3 disomy 21 cases overlap at 1.0. The dashed line represents the upper boundary of the 99% confidence interval constructed from all disomy 21 samples.
The chromosomes are listed in FIG. 1A in order of G/C content, from low to high. This figure suggests that one would prefer to use as a reference chromosome in the mixed sample one with a mid level of G/C content, as it can be seen that the data there are more tightly grouped. That is, chromosomes 18, 8, 2, 7, 12, 21 (except in suspected Down syndrome), 14, 9, and 11 may be used as the nominal diploid chromosome if looking for a trisomy. FIG. 1B represents an enlargement of the chromosome 21 data.

FIG. 2 is a scatter plot graph showing fetal DNA fraction and gestational age. The fraction of fetal DNA in maternal plasma correlates with gestational age. Fetal DNA fraction was estimated by three different ways: 1. From the additional amount of chromosomes 13, 18, and 21 sequences for T13, T18, and T21 cases respectively. 2. From the depletion in amount of chromosome X sequences for male cases. 3. From the amount of chromosome Y sequences present for male cases. The horizontal dashed line represents the estimated minimum fetal DNA fraction required for the detection of aneuploidy. For each sample, the values of fetal DNA fraction calculated from the data of different chromosomes were averaged. There is a statistically significant correlation between the average fetal DNA fraction and gestational age (p=0.0051). The dashed line represents the simple linear regression line between the average fetal DNA fraction and gestational age. The R2 value represents the square of the correlation coefficient. FIG. 2 suggests that the present method may be employed at a very early stage of pregnancy. The data were obtained from the 10-week stage and later because that is the earliest stage at which chorionic villi sampling is done. (Amniocentesis is done later.) From the level of the confidence interval, one would expect to obtain meaningful data as early as 4 weeks gestational age, or possibly earlier.
FIG. 3 is a histogram showing size distribution of maternal and fetal DNA in maternal plasma. It shows the size distribution of total and chromosome Y specific fragments obtained from 454 sequencing of maternal plasma DNA from a normal male pregnancy. The distribution is normalized to sum to 1. The numbers of total reads and reads mapped to the Y-chromosome are 144992 and 178 respectively. Inset: Cumulative fetal DNA fraction as a function of sequenced fragment size. The error bars correspond to the standard error of the fraction estimated assuming the error of the counts of sequenced fragments follow Poisson statistics.

FIG. 4 is a pair of line graphs showing distribution of sequence tags around transcription start sites (TSS) of RefSeq genes on all autosomes and chromosome X from plasma DNA sample of a normal male pregnancy (top, FIG. 4A) and randomly sheared genomic DNA control (bottom, FIG. 4B). The number of tags within each 5 bp window was counted within ±1000 bp region around each TSS, taking into account the strand each sequence tag mapped to. The counts from all transcription start sites for each 5 bp window were summed and normalized to the median count among the 400 windows. A moving average was used to smooth the data. A peak in the sense strand represents the beginning of a nucleosome, while a peak in the anti-sense strand represents the end of a nucleosome. In the plasma DNA sample shown here, five well-positioned nucleosomes are observed downstream of transcription start sites and are represented as grey ovals. The number below within each oval represents the distance in base pairs between adjacent peaks in the sense and anti-sense strands, corresponding to the size of the inferred nucleosome. No obvious pattern is observed for the genomic DNA control.

FIG.
5A is a scatter plot graph showing the mean sequence tag density for each chromosome of all samples, including cell-free plasma DNA from pregnant women and male donor, as well as genomic DNA control from male donor. Exceptions are chromosomes 13, 18 and 21, where cell-free DNA samples from women carrying aneuploid fetuses are excluded. The error bars represent standard deviation. The chromosomes are ordered by their G/C content. G/C content of each chromosome relative to the genome-wide value (41%) is also plotted. FIG. 5B is a scatter plot of mean sequence tag density for each chromosome versus G/C content of the chromosome. The correlation coefficient is 0.927, and the correlation is statistically significant (p<10^-9). FIG. 5C is a scatter plot of the standard deviation of sequence tag density of each chromosome versus G/C content of the chromosome. The correlation coefficient between standard deviation of sequence tag density and the absolute deviation of chromosomal G/C content from the genome-wide G/C content is 0.963, and the correlation is statistically significant (p<10^-12).

FIG. 6 is a scatter plot graph showing percent difference of chromosome X sequence tag density of all samples as compared to the median chromosome X sequence tag density of all female pregnancies. All male pregnancies show under-representation of chromosome X.

FIG. 7 is a scatter plot graph showing a comparison of the estimation of fetal DNA fraction for cell-free DNA samples from 12 male pregnancies using sequencing data from chromosomes X and Y. The dashed line represents a simple linear regression line, with slope of 0.85. The R2 value represents the square of the correlation coefficient. There is a statistically significant correlation between fetal DNA fraction estimated from chromosomes X and Y (p=0.0015).

FIG. 8 is a line graph showing length distribution of sequenced fragments from maternal cell-free plasma DNA sample of a normal male pregnancy at bp resolution.
Sequencing was done on the 454/Roche platform. Reads that have at least 90% mapping to the human genome with greater than or equal to 90% accuracy are retained, totaling 144992 reads. Y-axis represents the number of reads obtained. The median length is 177 bp while the mean length is 180 bp.

FIG. 9 is a schematic illustrating how sequence tag distribution is used to detect the over- and under-representation of any chromosome, i.e., a trisomy (over-representation) or a missing chromosome (typically an X or Y chromosome, since missing autosomes are generally lethal). As shown in left panels A and C, one first plots the number of reads obtained from a window that is mapped to a chromosome coordinate that represents the position of the read along the chromosome. That is, chromosome 1 (panel A) can be seen to have about 2.8×10^8 bp. It would have this number divided by 50 kb windows. These values are replotted (panels B and D) to show the distribution of the number of sequence tags/50 kb window. The term "bin" is equivalent to a window. From this analysis, one can determine a median number of reads M for each chromosome, which, for purposes of illustration, may be observed along the x axis at the approximate center of the distribution and may be said to be higher if there are more sequence tags attributable to that chromosome. For chromosome 1, illustrated in panels A and B, one obtains a median M1. By taking the median M of all 22 autosomes, one obtains a normalization constant N that can be used to correct for differences in sequences obtained in different runs, as can be seen in Table 1. Thus, the normalized sequence tag density for chromosome 1 would be M1/N; for chromosome 22 it would be M22/N. Close examination of panel A, for example, would show that towards the zero end of the chromosome, this procedure obtained about 175 reads per 50 kb window. In the middle, near the centromere, there were no reads, because this portion of the chromosome is ill defined in the human genome library.
That is, in the left panels (A and C), one plots the distribution of reads per chromosome coordinate, i.e., chromosomal position in terms of number of reads within each 50 kb overlapping sliding window. Then, one determines the distribution of the number of sequence tags for each 50 kb window, and obtains a median number of sequence tags per chromosome for all autosomes and chromosome X (examples of chr 1 [top] and chr 22 [bottom] are illustrated here). These results are referred to as M. The median of the 22 values of M (from all autosomes, chromosomes 1 through 22) is used as the normalization constant N. The normalized sequence tag density of each chromosome is M/N (e.g., chr 1: M1/N; chr 22: M22/N). Such normalization is necessary to compare different patient samples since the total number of sequence tags (thus, the sequence tag density) for each patient sample is different (the total number of sequence tags fluctuates between ~8 to ~12 million). The analysis thus flows from frequency of reads per coordinate (A and C) to # reads per window (B and D) to combination of all chromosomes.

FIG. 10 is a scatter plot graph showing data from different samples, as in FIG. 1, except that bias for G/C sampling has been eliminated.

FIG. 11 is a scatter plot graph showing the weight given to different sequence samples according to percentage of G/C content, with lower weight given to samples with a higher G/C content. G/C content ranges from about 30% to about 70%; weight can range over a factor of about 3.
FIG. 12 is a scatter plot graph which illustrates results of selected patients as indicated on the x axis, and, for each patient, a distribution of chromosome representation on the Y axis, as deviating from a representative t statistic, indicated as [...]

FIG. 13 is a scatter plot graph showing the minimum fetal DNA percentage of which over- or under-representation of a chromosome could be detected with a 99.9% confidence level for chromosomes 21, 18, 13 and Chr. X, and a value for all other chromosomes.

FIG. 14 is a scatter plot graph showing a linear relationship between log 10 of minimum fetal DNA percentage that is needed versus log 10 of the number of reads required.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Overview

Definitions

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are described. Generally, nomenclatures utilized in connection with, and techniques of, cell and molecular biology and chemistry are those well known and commonly used in the art. Certain experimental techniques, not specifically defined, are generally performed according to conventional methods well known in the art and as described in various general and more specific references that are cited and discussed throughout the present specification. For purposes of clarity, the following terms are defined below.

"Sequence tag density" means the normalized value of sequence tags for a defined window of a sequence on a chromosome (in a preferred embodiment the window is about 50 kb), where the sequence tag density is used for comparing different samples and for subsequent analysis.
A "sequence tag" is a DNA sequence of sufficient length that it may be assigned specifically to one of chromosomes 1-22, X or Y. It does not necessarily need to be, but may be, non-repetitive within a single chromosome. A certain, small degree of mismatch (0-1) may be allowed to account for minor polymorphisms that may exist between the reference genome and the individual genomes (maternal and fetal) being mapped. The value of the sequence tag density is normalized within a sample. This can be done by counting the number of tags falling within each window on a chromosome; obtaining a median value of the total sequence tag count for each chromosome; obtaining a median value of all of the autosomal values; and using this value as a normalization constant to account for the differences in total number of sequence tags obtained for different samples. A sequence tag density as calculated in this way would ideally be about 1 for a disomic chromosome. As further described below, sequence tag densities can vary according to sequencing artifacts, most notably G/C bias; this is corrected as described. This method does not require the use of an external standard, but, rather, provides an internal reference, derived from all of the sequence tags (genomic sequences), which may be, for example, a single chromosome or a calculated value from all autosomes.

"T21" means trisomy 21.

"T18" means trisomy 18.
"T13" means trisomy 13.

"Aneuploidy" is used in a general sense to mean the presence or absence of an entire chromosome, as well as the presence of partial chromosomal duplications or deletions of kilobase or greater size, as opposed to genetic mutations or polymorphisms where sequence differences exist.

"Massively parallel sequencing" means techniques for sequencing millions of fragments of nucleic acids, e.g., using attachment of randomly fragmented genomic DNA to a planar, optically transparent surface and solid phase amplification to create a high density sequencing flow cell with millions of clusters, each containing 1,000 copies of template per sq. cm. These templates are sequenced using four-color DNA sequencing-by-synthesis technology. See, products offered by Illumina, Inc., San Diego, Calif. In the present work, sequences were obtained, as described below, with an Illumina/Solexa 1G Genome Analyzer. The Solexa/Illumina method referred to below relies on the attachment of randomly fragmented genomic DNA to a planar, optically transparent surface. In the present case, the plasma DNA does not need to be sheared. Attached DNA fragments are extended and bridge amplified to create an ultra-high density sequencing flow cell with ≥50 million clusters, each containing 1,000 copies of the same template. These templates are sequenced using a robust four-color DNA sequencing-by-synthesis technology that employs reversible terminators with removable fluorescent dyes.
This novel approach ensures high accuracy and true base-by-base sequencing, eliminating sequence-context specific errors and enabling sequencing through homopolymers and repetitive sequences.

High-sensitivity fluorescence detection is achieved using laser excitation and total internal reflection optics. Short sequence reads are aligned against a reference genome and genetic differences are called using specially developed data analysis pipeline software.

Copies of the protocol for whole genome sequencing using Solexa technology may be found at BioTechniques® Protocol Guide 2007, published December 2006: p 29, www(dot)biotechniques.com/default.asp?page=protocol&subsection=article_display&id=112978. Solexa's oligonucleotide adapters are ligated onto the fragments, yielding a fully-representative genomic library of DNA templates without cloning. Single molecule clonal amplification involves six steps: template hybridization, template amplification, linearization, blocking 3' ends, denaturation and primer hybridization. Solexa's Sequencing-by-Synthesis utilizes four proprietary nucleotides possessing reversible fluorophore and termination properties. Each sequencing cycle occurs in the presence of all four nucleotides.

The presently used sequencing is preferably carried out without a preamplification or cloning step, but may be combined with amplification-based methods in a microfluidic chip having reaction chambers for both PCR and microscopic template-based sequencing. Only about 30 bp of random sequence information are needed to identify a sequence as belonging to a specific human chromosome. Longer sequences can uniquely identify more particular targets.
In the present case, a large number of 25 bp reads were obtained, and due to the large number of reads obtained, the 50% specificity enabled sufficient sequence tag representation.

Further description of a massively parallel sequencing method, which employed the below referenced 454 method, is found in Rogers and Venter, "Genomics: Massively parallel sequencing," Nature, 437, 326-327 (15 Sep. 2005). As described there, Rothberg and colleagues (Margulies, M. et al., Nature 437, 376-380 (2005)) have developed a highly parallel system capable of sequencing 25 million bases in a four-hour period, about 100 times faster than the current state-of-the-art Sanger sequencing and capillary-based electrophoresis platform. The method could potentially allow one individual to prepare and sequence an entire genome in a few days. The complexity of the system lies primarily in the sample preparation and in the microfabricated, massively parallel platform, which contains 1.6 million picoliter-sized reactors in a 6.4-cm² slide. Sample preparation starts with fragmentation of the genomic DNA, followed by the attachment of adaptor sequences to the ends of the DNA pieces. The adaptors allow the DNA fragments to bind to tiny beads (around 28 μm in diameter). This is done under conditions that allow only one piece of DNA to bind to each bead. The beads are encased in droplets of oil that contain all of the reactants needed to amplify the DNA using a standard tool called the polymerase chain reaction. The oil droplets form part of an emulsion so that each bead is kept apart from its neighbor, ensuring the amplification is uncontaminated. Each bead ends up with roughly 10 million copies of its initial DNA fragment.
To perform the sequencing reaction, the DNA-template-carrying beads are loaded into the picoliter reactor wells, each well having space for just one bead. The technique uses a sequencing-by-synthesis method developed by Uhlen and colleagues, in which DNA complementary to each template strand is synthesized. The nucleotide bases used for sequencing release a chemical group as the base forms a bond with the growing DNA chain, and this group drives a light-emitting reaction in the presence of specific enzymes and luciferin. Sequential washes of each of the four possible nucleotides are run over the plate, and a detector senses which of the wells emit light with each wash to determine the sequence of the growing strand. This method has been adopted commercially by 454 Life Sciences.

Further examples of massively parallel sequencing are given in US 20070224613 by Strathmann, published Sep. 27, 2007, entitled "Massively Multiplexed Sequencing." Also, for a further description of massively parallel sequencing, see US 2003/0022207 to Balasubramanian, et al., published Jan. 30, 2003, entitled "Arrayed polynucleotides and their use in genome analysis."

General Description of Method and Materials

Overview

Non-invasive prenatal diagnosis of aneuploidy has been a challenging problem because fetal DNA constitutes a small percentage of total DNA in maternal blood (13) and intact fetal cells are even rarer (6, 7, 9, 31, 32). We showed in this study the successful development of a truly universal, polymorphism-independent non-invasive test for fetal aneuploidy. By directly sequencing maternal plasma DNA, we could detect fetal trisomy 21 as early as the 14th week of gestation. Using cell-free DNA instead of intact cells allows one to avoid complexities associated with microchimerism and foreign cells that might have colonized the mother; these cells occur at such low numbers that their contribution to the cell-free DNA is negligible (33, 34).
Furthermore, there is evidence that cell-free fetal DNA clears from the blood to undetectable levels within a few hours of delivery and therefore is not carried forward from one pregnancy to the next (35-37).

Rare forms of aneuploidy caused by unbalanced translocations and partial duplication of a chromosome are in principle detectable by the approach of shotgun sequencing, since the density of sequence tags in the triplicated region of the chromosome would be higher than the rest of the chromosome. Detecting incomplete aneuploidy caused by mosaicism is also possible in principle but may be more challenging, since it depends not only on the concentration of fetal DNA in maternal plasma but also the degree of fetal mosaicism. Further studies are required to determine the effectiveness of shotgun sequencing in detecting these rare forms of aneuploidy.

The present method is applicable to large chromosomal deletions, such as 5p-Syndrome (five p minus), also known as Cat Cry Syndrome or Cri du Chat Syndrome. 5p-Syndrome is characterized at birth by a high-pitched cry, low birth weight, poor muscle tone, microcephaly, and potential medical complications. Similarly amenable disorders addressed by the present methods are p-, monosomy 9P, otherwise known as Alfi's Syndrome or 9P-, 22q11.2 deletion syndrome, Emanuel Syndrome, also known in the medical literature as the Supernumerary Der(22) Syndrome, trisomy 22, Unbalanced 11/22 Translocation or partial trisomy 11/22, Microdeletion and Microduplication at 16p11.2, which is associated with autism, and other deletions or imbalances, including those that are presently unknown.

An advantage of using direct sequencing to measure aneuploidy non-invasively is that it is able to make full use of the sample, while PCR based methods analyze only a few targeted sequences. In this study, we obtained on average 5 million reads per sample in a single run, of which ~65,000 mapped to chromosome 21.
Since those 5 million reads represent only a portion of one human genome, in principle less than one genomic equivalent of DNA is sufficient for the detection of aneuploidy using direct sequencing. In practice, a larger amount of DNA was used since there is sample loss during sequencing library preparation, but it may be possible to further reduce the amount of blood required for analysis.

Mapping shotgun sequence information (i.e., sequence information from a fragment whose physical genomic position is unknown) can be done in a number of ways, which involve alignment of the obtained sequence with a matching sequence in a reference genome. See, Li et al., "Mapping short DNA sequencing reads and calling variants using mapping quality score," Genome Res., 2008 Aug. 19. [Epub ahead of print].

We observed that certain chromosomes have large variations in the counts of sequenced fragments (from sample to sample), and that this depends strongly on the G/C content (FIG. 1A). It is unclear at this point whether this stems from PCR artifacts during sequencing library preparation or cluster generation, the sequencing process itself, or whether it is a true biological effect relating to chromatin structure. We strongly suspect that it is an artifact since we also observe G/C bias on genomic DNA control, and such bias on the Solexa sequencing platform has recently been reported (38, 39). It has a practical consequence since the sensitivity to aneuploidy detection will vary from chromosome to chromosome; fortunately the most common human aneuploidies (such as 13, 18, and 21) have low variation and therefore high detection sensitivity. Both this problem and the sample volume limitations may possibly be resolved by the use of single molecule sequencing technologies, which do not require the use of PCR for library preparation (40).

Plasma DNA samples used in this study were obtained about 15 to 30 minutes after amniocentesis or chorionic villus sampling.
Since these invasive procedures disrupt the interface between the placenta and maternal circulation, there have been discussions whether the amount of fetal DNA in maternal blood might increase following invasive procedures. Neither of the studies to date has observed a significant effect (41, 42).

Our results support this conclusion, since using the digital PCR assay we estimated that fetal DNA constituted less than or equal to 10% of total cell-free DNA in the majority of our maternal plasma samples. This is within the range of previously reported values in maternal plasma samples obtained prior to invasive procedures (13). It would be valuable to have a direct measurement addressing this point in a future study.

The average fetal DNA fraction estimated from sequencing data is higher than the values estimated from digital PCR data by an average factor of two (p [...]

[...] sequencing libraries were analyzed with DNA 1000 Kit on the 2100 Bioanalyzer (Agilent) and quantified with microfluidic digital PCR (Fluidigm). The libraries were then sequenced using the Solexa 1G Genome Analyzer according to manufacturer's instructions.

Cell-free plasma DNA from a pregnant woman carrying a normal male fetus was also sequenced on the 454/Roche platform. Fragments of DNA extracted from 5.6 ml of cell-free plasma (equivalent to ~4.9 ng of DNA) were used for sequencing library preparation. The sequencing library was prepared according to manufacturer's protocol, except that no nebulization was performed on the sample and quantification was done with microfluidic digital PCR instead of capillary electrophoresis. The library was then sequenced on the 454 Genome Sequencer FLX System according to manufacturer's instructions.

Electropherograms of Solexa sequencing libraries were prepared from cell-free plasma DNA obtained from 18 pregnant women and 1 male donor. Solexa library prepared from sonicated whole blood genomic DNA from the male donor was also examined. For libraries prepared from cell-free DNA,
all had peaks at average 261 bp (range: 256-264 bp). The actual peak size of DNA fragments in plasma DNA is 168 bp (after removal of Solexa universal adaptor (92 bp)). This corresponds to the size of a chromatosome.

Example 4

Data Analysis

Shotgun Sequence Analysis

Solexa sequencing produced 36 to 50 bp reads. The first 25 bp of each read was mapped to the human genome build 36 (hg18) using ELAND from the Solexa data analysis pipeline. The reads that were uniquely mapped to the human genome having at most 1 mismatch were retained for analysis. To compare the coverage of the different chromosomes, a sliding window of 50 kb was applied across each chromosome, except in regions of assembly gaps and microsatellites, and the number of sequence tags falling within each window was counted and the median value was chosen to be the representative of the chromosome. Because the total number of sequence tags for each sample was different, for each sample, we normalized the sequence tag density of each chromosome (except chromosome Y) to the median sequence tag density among autosomes. The normalized values were used for comparison among samples in subsequent analysis. We estimated fetal DNA fraction from chromosome 21 for T21 cases, chromosome 18 from T18 cases, chromosome 13 from T13 case, and chromosomes X and Y for male pregnancies. For chromosome 21, 18, and 13, fetal DNA fraction was estimated as 2*(x-1), where x was the ratio of the over-represented chromosome sequence tag density of each trisomy case to the median chromosome sequence tag density of all the disomy cases. For chromosome X, fetal DNA was estimated as 2*(1-x), where x was the ratio of chromosome X sequence tag [...]

[...] 21 patients (solid circles) are P12, 6, 7, 14, 17, 20, 52 and 53. 57 and 59 have trisomy 18 (open diamonds) and P64 has trisomy 13 (star). This method may be presented by the following three step process:

Step 1: Calculate a t statistic for each chromosome relative to all other chromosomes in a sample.
Each t statistic tells the value of each chromosome median relative to other chromosomes, taking into account the number of reads mapped to each chromosome (since the variation of the median scales with the number of reads). As described above, the present analyses yielded about 5 million reads per sample. Although one may obtain 3-10 million reads per sample, these are short reads, typically only about 20-100 bp, so one has actually only sequenced, for example, about 300 million of the 3 billion bp in the human genome. Thus, statistical methods are used where one has a small sample and the standard deviation of the population (3 billion, or 47 million for chromosome 21) is unknown and it is desired to estimate it from the sample number of reads in order to determine the significance of a numerical variation. One way to do this is by calculating Student's t-distribution, which may be used in place of a normal distribution expected from a larger sample. The t-statistic is the value obtained when the t-distribution is calculated. The formula used for this calculation is given below. Using the methods presented here, other t-tests can be used.

Step 2: Calculate the average t statistic matrix by averaging the values from all samples with disomic chromosomes. Each patient sample data is placed in a t matrix, where the row is chr1 to chr22, and the column is also chr1 to chr22. Each cell represents the t value when comparing the chromosomes in the corresponding row and column (i.e., position (2,1) in the matrix is the t-value when testing chr2 and chr1); the diagonal of the matrix is 0 and the matrix is symmetric. The number of reads mapping to a chromosome is compared individually to each of chr1-22.

Step 3: Subtract the average t statistic matrix from the t statistic matrix of each patient sample. For each chromosome, the median of the difference in t statistic is selected as the representative value.

The t statistic for 99% confidence for large number of samples is 3.09.
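The three steps above can be sketched in code. The following is a minimal illustration in Python with NumPy, not the inventors' implementation. It assumes that each chromosome of a sample is represented by an array of per-window sequence tag counts and that each chromosome pair is compared with the unequal-variance two-sample t statistic. The function names `t_matrix`, `representative_t` and `non_disomic`, and the choice to leave the diagonal out of the median, are assumptions made for this example. Because the sketch keeps the sign of each comparison, entry (i, j) is the negative of entry (j, i).

```python
import numpy as np

T_CUTOFF = 3.09  # cut-off quoted in the text


def t_matrix(windows):
    """Step 1: pairwise t statistics between the chromosomes of one sample.

    windows: sequence of 1-D arrays, one per chromosome, holding the
    sequence tag count of every window on that chromosome.
    Entry (i, j) compares chromosome i against chromosome j.
    """
    means = np.array([np.mean(w) for w in windows])
    # s^2 / n for each chromosome (sample variance over number of windows)
    var_of_mean = np.array([np.var(w, ddof=1) / len(w) for w in windows])
    diff = means[:, None] - means[None, :]
    denom = np.sqrt(var_of_mean[:, None] + var_of_mean[None, :])
    return diff / denom


def representative_t(sample, disomic_references):
    """Steps 2 and 3: per-chromosome median of (sample matrix - average reference matrix)."""
    # Step 2: average t matrix over samples assumed to be disomic
    reference = np.mean([t_matrix(ref) for ref in disomic_references], axis=0)
    # Step 3: subtract the average matrix, then take the row-wise median
    delta = t_matrix(sample) - reference
    np.fill_diagonal(delta, np.nan)  # assumption: ignore self-comparisons
    return np.nanmedian(delta, axis=1)


def non_disomic(sample, disomic_references, cutoff=T_CUTOFF):
    """Flag chromosomes whose representative t lies outside [-cutoff, cutoff]."""
    return np.abs(representative_t(sample, disomic_references)) > cutoff
```

Subtracting the average matrix of disomic samples removes chromosome-to-chromosome differences that are shared by all samples, so only sample-specific deviations remain in the representative values.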
Any chromosome with a representative t statistic outside -3.09 to 3.09 is determined as non-disomy.

Example 10
Calculation of Required Number of Sequence Reads After G/C Bias Correction

In this example, a method is presented that was used to calculate the minimum concentration of fetal DNA in a sample that would be needed to detect an aneuploidy, based on a certain number of reads obtained for that chromosome (except chromosome Y). FIG. 13 and FIG. 14 show results obtained from 19 patient plasma DNA samples, 1 donor plasma DNA sample, and duplicate runs of a donor gDNA sample. It is estimated in FIG. 13 that the minimum fetal DNA % of which over-representation of chr21 can be detected at the best sampling rate (~70 k reads mapped to chr21) is ~6% (indicated by solid lines in FIG. 13). The lines are drawn between about 0.7×10^5 reads and 6% fetal DNA concentration. It can be expected that with higher numbers of reads (not exemplified here) the needed fetal DNA percentage will drop, probably to about 4%.

In FIG. 14, the data from FIG. 13 are presented in a logarithmic scale. This shows that the minimum required fetal DNA concentration scales linearly with the number of reads in a square root relationship (slope of -0.5). These calculations were carried out as follows:

For large n (n > 30), the t statistic is

$$t = \frac{\bar{y}_2 - \bar{y}_1}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

where $\bar{y}_2 - \bar{y}_1$ is the difference in means (or amount of over- or under-representation of a particular chromosome) to be measured; s is the standard deviation of the number of reads per 50 kb in a particular chromosome; n is the number of samples (i.e., the number of 50 kb windows per chromosome). Since the number of 50 kb windows per chromosome is fixed, $n_1 = n_2$. If we assume that $s_1 \approx s_2$,

$$\bar{y}_2 - \bar{y}_1 \approx t\sqrt{\frac{2 s_1^2}{n_1}},$$

which is the half width of the confidence interval at the confidence level governed by the value of t. Thus,

$$\frac{\bar{y}_2}{\bar{y}_1} - 1 \approx \frac{t\sqrt{2 s_1^2 / n_1}}{\bar{y}_1}.$$

For every chromosome in every sample, we can calculate the value

$$\frac{t\sqrt{2 s_1^2 / n_1}}{\bar{y}_1},$$

which corresponds to the minimum over- or under-representation $\left(\bar{y}_2/\bar{y}_1 - 1\right)$ that can be resolved with confidence level governed by the value of t.
Note that $2\left(\bar{y}_2/\bar{y}_1 - 1\right) \times 100\%$ corresponds to the minimum fetal DNA % of which any over- or under-representation of chromosomes can be detected. We expect the number of reads mapped to each chromosome to play a role in determining the standard deviation $s_1$, since according to the Poisson distribution, the standard deviation equals the square root of the mean. By plotting $2\left(\bar{y}_2/\bar{y}_1 - 1\right) \times 100\%$ vs. number of reads mapped to each chromosome in all the samples, we can evaluate the minimum fetal DNA % of which any over- or under-representation of chromosomes can be detected given the current sampling rate.

After correction of G/C bias, the number of reads per 50 kb window for all chromosomes (except chromosome Y) is normally distributed. However, we observed outliers in some chromosomes (e.g., a sub-region in chromosome 9 has near zero representation; a sub-region in chromosome 20 near the centromere has unusually high representation) that affect the calculation of the standard deviation and the mean. We therefore chose to calculate the confidence interval of the median instead of the mean to avoid the effect of outliers in the calculation of the confidence interval. We do not expect the confidence interval of the median and the mean to be very different if the small number of outliers has been removed. The 99.9% confidence interval of the median for each chromosome is estimated from bootstrapping 5000 samples from the 50 kb read distribution data using the percentile method. The half width of the confidence interval is estimated as 0.5*confidence interval. We plot 2*(half width of confidence interval of median)/median*100% vs. number of reads mapped to each chromosome for all samples.

Bootstrap resampling and other computer-implemented calculations described here were carried out in MATLAB, available from The Mathworks, Natick, Mass.
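The bootstrap step described above can be illustrated as follows. This is a hedged sketch in Python with NumPy rather than the MATLAB code referred to in the text; the function name `min_detectable_fetal_percent`, its parameters, and the fixed random seed are assumptions made for the example. It resamples the per-window read counts of one chromosome with replacement, takes the percentile confidence interval of the median, and reports 2*(half width of confidence interval of median)/median*100%.

```python
import numpy as np


def min_detectable_fetal_percent(window_counts, n_boot=5000,
                                 confidence=0.999, seed=0):
    """Estimate the minimum detectable fetal DNA % for one chromosome.

    window_counts: per-window (e.g. 50 kb) read counts after G/C correction.
    n_boot: number of bootstrap resamples (5000 in the text).
    confidence: confidence level of the interval (99.9% in the text).
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(window_counts, dtype=float)

    # Bootstrap: resample the windows with replacement, keep each median
    boot_medians = np.empty(n_boot)
    for b in range(n_boot):
        resample = rng.choice(counts, size=counts.size, replace=True)
        boot_medians[b] = np.median(resample)

    # Percentile method: take the interval directly from the bootstrap quantiles
    alpha = 1.0 - confidence
    lower, upper = np.quantile(boot_medians, [alpha / 2.0, 1.0 - alpha / 2.0])
    half_width = 0.5 * (upper - lower)

    # 2 * (half width of CI of median) / median * 100%
    return 2.0 * half_width / np.median(counts) * 100.0
```

Calling the function once per chromosome and per sample, and plotting the result against the number of reads mapped to that chromosome, would give a plot of the kind described for FIG. 13.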
CONCLUSION

The above specific description is meant to exemplify and illustrate the invention and should not be seen as limiting the scope of the invention, which is defined by the literal and equivalent scope of the appended claims. Any patents or publications mentioned in this specification are intended to convey details of methods and materials useful in carrying out certain aspects of the invention which may not be explicitly set out but which would be understood by workers in the field. Such patents or publications are hereby incorporated by reference to the same extent as if each was specifically and individually incorporated by reference, as needed for the purpose of describing and enabling the method or material referred to.

REFERENCES

1. Cunningham F, et al. (2002) in Williams Obstetrics (McGraw-Hill Professional, New York), p. 942.
2. (2007) ACOG Practice Bulletin No. 88, December 2007. Invasive prenatal testing for aneuploidy. Obstet Gynecol, 110: 1459-1467.
3. Wapner R, et al. (2003) First-trimester screening for trisomies 21 and 18. N Engl J Med, 349: 1405-1413.
4. Alfirevic Z, Neilson J P (2004) Antenatal screening for Down's syndrome. Bmj, 329: 811-812.
5. Malone F D, et al. (2005) First-trimester or second-trimester screening, or both, for Down's syndrome. N Engl J Med, 353: 2001-2011.
6. Herzenberg L A, et al. (1979) Fetal cells in the blood of pregnant women: detection and enrichment by fluorescence-activated cell sorting. Proc Natl Acad Sci USA, 76: 1453-1455.
7. Bianchi D W, et al. (1990) Isolation of fetal DNA from nucleated erythrocytes in maternal blood. Proc Natl Acad Sci USA, 87: 3279-3283.
8. Cheung M C, Goldberg J D, Kan Y W (1996) Prenatal diagnosis of sickle cell anaemia and thalassaemia by analysis of fetal cells in maternal blood. Nat Genet, 14: 264-268.
9. Bianchi D W, et al. (1997) PCR quantitation of fetal cells in maternal blood in normal and aneuploid pregnancies. Am J Hum Genet, 61: 822-829.
10. Bianchi D W, et al.
(2002) Fetal gender and aneuploidy detection using fetal cells in maternal blood: analysis of NIFTY I data. National Institute of Child Health and Development Fetal Cell Isolation Study. Prenat Diagn, 22: 609-615.
11. Lo Y M, et al. (1997) Presence of fetal DNA in maternal plasma and serum. Lancet, 350: 485-487.
12. Dennis Lo Y M, Chiu R W (2007) Prenatal diagnosis: progress through plasma nucleic acids. Nat Rev Genet, 8: 71-77.
13. Lo Y M, et al. (1998) Quantitative analysis of fetal DNA in maternal plasma and serum: implications for noninvasive prenatal diagnosis. Am J Hum Genet, 62: 768-775.
14. Lo Y M, et al. (2007) Plasma placental RNA allelic ratio permits noninvasive prenatal chromosomal aneuploidy detection. Nat Med, 13: 218-223.
15. Tong Y K, et al. (2006) Noninvasive prenatal detection of fetal trisomy 18 by epigenetic allelic ratio analysis in maternal plasma: Theoretical and empirical considerations. Clin Chem, 52: 2194-2202.
16. Dhallan R, et al. (2007) A non-invasive test for prenatal diagnosis based on fetal DNA present in maternal blood: a preliminary study. Lancet, 369: 474-481.
17. Fan H C, Quake S R (2007) Detection of aneuploidy with digital polymerase chain reaction. Anal Chem, 79: 7576-7579.
18. Lo Y M, et al. (2007) Digital PCR for the molecular detection of fetal chromosomal aneuploidy. Proc Natl Acad Sci USA, 104: 13116-13121.
19. Quake S R, Fan H C (2006) Non-invasive fetal genetic screening by digital analysis. USA Provisional Patent Application No. 60/764,420.
20. Mardis E R (2008) Next-Generation DNA Sequencing Methods. Annu Rev Genomics Hum Genet, 9: 387-402.
20. Lander E S, et al. (2001) Initial sequencing and analysis of the human genome. Nature, 409: 860-921.
21. Chan K C, et al. (2004) Size distributions of maternal and fetal DNA in maternal plasma. Clin Chem, 50: 88-92.
22. Li Y, et al. (2004) Size separation of circulatory DNA in maternal plasma permits ready detection of fetal DNA polymorphisms. Clin Chem, 50: 1002-1011.
23.
Cooper G, Hausman R (2007) in The cell: a molecular approach (Sinauer Associates, Inc, Sunderland), p. 168.
24. Jahr S, et al. (2001) DNA fragments in the blood plasma of cancer patients: quantitations and evidence for their origin from apoptotic and necrotic cells. Cancer Res, 61: 1659-1665.
25. Giacona M B, et al. (1998) Cell-free DNA in human blood plasma: length measurements in patients with pancreatic cancer and healthy controls. Pancreas, 17: 89-97.
26. Schones D E, et al. (2008) Dynamic regulation of nucleosome positioning in the human genome. Cell, 132: 887-898.
27. Ozsolak F, Song J S, Liu X S, Fisher D E (2007) High-throughput mapping of the chromatin structure of human promoters. Nat Biotechnol, 25: 244-248.
28. Yuan G C, et al. (2005) Genome-scale identification of nucleosome positions in S. cerevisiae. Science, 309: 626-630.
29. Lee W, et al. (2007) A high-resolution atlas of nucleosome occupancy in yeast. Nat Genet, 39: 1235-1244.
30. Sohda S, et al. (1997) The proportion of fetal nucleated red blood cells in maternal blood: estimation by FACS analysis. Prenat Diagn, 17: 743-752.
31. Hamada H, et al. (1993) Fetal nucleated cells in maternal peripheral blood: frequency and relationship to gestational age. Hum Genet, 91: 427-432.
32. Nelson J L (2008) Your cells are my cells. Sci Am, 298: 64-71.
33. Khosrotehrani K, Bianchi D W (2003) Fetal cell microchimerism: helpful or harmful to the parous woman? Curr Opin Obstet Gynecol, 15: 195-199.
34. Lo Y M, et al. (1999) Rapid clearance of fetal DNA from maternal plasma. Am J Hum Genet, 64: 218-224.
35. Smid M, et al. (2003) No evidence of fetal DNA persistence in maternal plasma after pregnancy. Hum Genet, 112: 617-618.
36. Rijnders R J, Christiaens G C, Soussan A A, van der Schoot C E (2004) Cell-free fetal DNA is not present in plasma of nonpregnant mothers. Clin Chem, 50: 679-681; author reply 681.
37. Hillier L W, et al. (2008) Whole-genome sequencing and variant discovery in C. elegans.
Nat Methods, 5: 183-188.
38. Dohm J C, Lottaz C, Borodina T, Himmelbauer H (2008) Substantial biases in ultra-short read data sets from high-throughput DNA sequencing. Nucleic Acids Res.
39. Harris T D, et al. (2008) Single-molecule DNA sequencing of a viral genome. Science, 320: 106-109.
40. Samura O, et al. (2003) Cell-free fetal DNA in maternal circulation after amniocentesis. Clin Chem, 49: 1193-1195.
41. Lo Y M, et al. (1999) Increased fetal DNA concentrations in the plasma of pregnant women carrying fetuses with trisomy 21. Clin Chem, 45: 1747-1751.
42. Segal E, et al. (2006) A genomic code for nucleosome positioning. Nature, 442: 772-778.

SEQUENCE LISTING
[SEQ ID NOS: 1-9; the nucleotide sequences are not legible in this copy.]

What is claimed is:

1. A method of testing for an abnormal distribution of a chromosome in a sample comprising a mixture of maternal and fetal DNA, comprising the steps of:
(a) obtaining maternal and fetal DNA from said sample;
(b) sequencing predefined subsequences of the maternal and fetal DNA to obtain a plurality of sequence tags aligning to the predefined subsequences, wherein said sequence tags are of sufficient length to be assigned to a specific predefined subsequence, wherein the predefined subsequences are from a plurality of different chromosomes, and wherein said plurality of different chromosomes comprise at least one first chromosome suspected of having an abnormal distribution in said sample and at least one second chromosome presumed to be normally distributed in said sample;
(c) assigning the plurality of sequence tags to their corresponding predetermined subsequences;
(d) determining a number of sequence tags aligning to the predetermined subsequences of said first chromosome and a number of sequence tags to the predetermined subsequences of the second chromosome; and
(e) comparing the numbers from step (d) to determine the presence or absence of an abnormal distribution of said first chromosome.

2. The method of claim 1 wherein the sample is a maternal serum or plasma sample, wherein the abnormal distribution of said first chromosome is a fetal aneuploidy, and wherein said second chromosome is a euploid chromosome.

3. The method of claim 2 wherein the sequencing comprises massively parallel sequencing of the predefined subsequences.

4. The method of claim 3 wherein said massively parallel sequencing comprises attaching DNA fragments to an optically transparent surface, conducting solid phase amplification of the attached DNA fragments to create a high density sequencing flow cell with millions of DNA clusters, and sequencing the DNA clusters by a four-color DNA sequencing-by-synthesis method employing reversible terminators with removable fluorescent dyes.

5. The method of claim 2 wherein the fetal aneuploidy is an aneuploidy of a chromosome selected from the group consisting of chromosome 13, chromosome 18 and chromosome 21.

6. The method of claim 2 wherein the step of assigning sequence tags to corresponding chromosome portions allows one mismatch.

7. The method of claim 2 wherein the length of the sequence tags is from about 25 bp to about 100 bp in length.

8. The method of claim 2 wherein the DNA is genomic DNA.

9. The method of claim 2 wherein said sequencing comprises selectively sequencing nucleic acid molecules comprising the predefined sequences.

10. The method of claim 9 wherein said sequencing comprises the use of a sequencing array.
11. The method of claim 10 wherein said selected predefined subsequences of the genomic DNA are rendered single stranded and captured under hybridizing conditions by single-stranded probes [...] separated on an array.

12. The method of claim 2 further comprising determination of fetal DNA fraction of the DNA obtained from the maternal serum or plasma sample.

13. The method of claim 12 wherein the fetal DNA fraction is determined by digital PCR.

14. A method of testing for an abnormal distribution of a chromosome in a sample comprising a mixture of maternal and fetal DNA, comprising the steps of:
(a) obtaining maternal and fetal DNA from said sample;
(b) sequencing predefined subsequences of the maternal and fetal DNA to obtain a plurality of sequence tags aligning to the predefined subsequences, wherein said sequence tags are of sufficient length to be assigned to a specific predefined subsequence, wherein the predefined subsequences are from a plurality of different chromosomes, and wherein said plurality of different chromosomes comprise at least one first chromosome suspected of having an abnormal distribution in said sample and at least one second chromosome presumed to be normally distributed in said sample;
(c) assigning the plurality of sequence tags to their corresponding predetermined subsequences;
(d) determining a relative number of sequence tags aligning to the predetermined subsequences of said first chromosome and the predetermined subsequences of said second chromosome;
(e) determining a weight for correcting for G/C bias and applying the weight to the numbers of sequence tags determined in step (d) to obtain a corrected number of sequence tags assigned to the predefined subsequences of the first chromosome and a corrected number of sequence tags assigned to the predefined subsequences of the second chromosome; and
(f) comparing the corrected number of sequence tags aligning to the predetermined subsequences of said first chromosome to the corrected number of sequence tags aligning to the predetermined subsequences of said second chromosome to determine the presence or absence of an abnormal distribution of said first chromosome.

15. The method of claim 14 wherein the sample is a maternal serum or plasma sample, wherein the abnormal distribution of said first chromosome is a fetal aneuploidy, and wherein said second chromosome is a euploid chromosome.

UNITED STATES PATENT AND TRADEMARK OFFICE
CERTIFICATE OF CORRECTION

PATENT NO.: 8,296,076 B2
APPLICATION NO.: 13/452,083
DATED: October 23, 2012
INVENTOR(S): Hei-Mun Christina Fan and Stephen R. Quake

It is certified that error appears in the above-identified patent and that said Letters Patent is hereby corrected as shown below:

Title Page, Item (54), in the title, line 2, replace "ANEUOPLOIDY" with --ANEUPLOIDY--.

Signed and Sealed this Twenty-fifth Day of December, 2012
David J. Kappos
Director of the United States Patent and Trademark Office

CERTIFICATE OF CORRECTION

It is certified that error appears in the above-identified patent and that said Letters Patent is hereby corrected as shown below:

Title Page, Item (54), line 2, and at Column 1, line 2, in the title, replace "ANEUOPLOIDY" with --ANEUPLOIDY--.

This certificate supersedes the Certificate of Correction issued December 25, 2012.

Signed and Sealed this Nineteenth Day of February, 2013
Teresa Stanek Rea
Acting Director of the United States Patent and Trademark Office
