Python Data Science Essentials
Alberto Boschetti, Luca Massaron

Chinese simplified language edition published by China Machine Press. ISBN 978-7-111-54434-0.

Alberto Boschetti, Luca Massaron: Python Data Science Essentials (ISBN: 978-1-78528-042-9).
Copyright © 2015 Packt Publishing. First published in the English language under the title "Python Data Science Essentials".
All rights reserved.
Chinese simplified language edition published by China Machine Press.
Copyright © 2016 by China Machine Press.
Copyright registration number: 01-2015-7673.
Preface

"A journey of a thousand miles begins with a single step."
- Laozi (604 BC - 531 BC)

Data science is a relatively new knowledge domain that requires the successful integration of linear algebra, statistical modelling, visualization, computational linguistics, graph analysis, machine learning, business intelligence, and data storage and retrieval.

The Python programming language, having conquered the scientific community during the last decade, is now an indispensable tool for the data science practitioner and a must-have tool for every aspiring data scientist. Python offers a fast, reliable, cross-platform, mature environment for data analysis, machine learning, and algorithmic problem solving. Whatever stopped you before from mastering Python for data science applications will be overcome by our step-by-step, example-oriented approach, which applies the most straightforward and effective Python tools to both demonstrative and real-world datasets.

Leveraging your existing knowledge of Python syntax and constructs, this book starts by introducing the process of setting up your essential data science toolbox. It then guides you through all the data munging and preprocessing phases, explaining the core activities related to transforming, fixing, exploring, and processing data. After that, we demonstrate advanced data science operations in order to enhance critical information, set up an experimental pipeline for variable and hypothesis selection, optimize hyper-parameters, and use cross-validation and testing effectively.

Finally, we complete the overview by presenting the main machine learning algorithms, graph analysis techniques, and the visualization instruments that make it easier to present your results.

In this walkthrough, which is structured as a data science project, you will always be accompanied by clear code and simplified examples that help you understand the underlying mechanics, together with real-world datasets and hints dictated by experience. We are sure that you are ready to take the first step of a long and rewarding journey.

What this book covers

Chapter 1, First Steps, introduces the IPython Notebook and demonstrates how you can access the data used in the tutorials.
Chapter 2, Data Munging, explains how to load the data to be analyzed, applying alternative techniques when the data is too big for the computer to handle. It introduces all the key data manipulation and transformation techniques.

Chapter 3, The Data Science Pipeline, offers advanced explorative and manipulative techniques, enabling sophisticated data operations to create and reduce predictive features, spot anomalous cases, and apply validation techniques.

Chapter 4, Machine Learning, guides you through the most important learning algorithms available in the Scikit-learn library, demonstrating practical applications and pointing out the key values to check and the parameters to tune in order to get the best out of each technique.

Chapter 5, Social Network Analysis, elaborates the practical and effective skills required to handle data that represents social relations or interactions.

Chapter 6, Visualization, completes the data science overview with basic and intermediate graphical representations. They are indispensable if you want to visually represent complex data structures and machine learning processes and results.

What you need for this book

Python and all the data science tools mentioned in the book, from IPython to Scikit-learn, are free of charge and can be downloaded from the Internet. To run the code that accompanies the book, you need a computer running Windows, Linux, or Mac OS. The book introduces, step by step, the process of installing the Python interpreter and all the tools and data that you need to run the examples.

Who this book is for

This book builds on the core skills that you already have, enabling you to become an efficient data science practitioner. It therefore assumes that you know the basics of programming and statistics.

The code examples do not require a mastery of Python, but we assume that you know at least the basics of Python scripting, the list and dictionary data structures, and how class objects work. Before starting, you can quickly acquire such skills by spending a few hours on the online courses suggested in the first chapter. No advanced data science concepts are necessary, because we provide the information that is essential to understand the core concepts used by the examples.

In summary, this book is for the following readers:
- Novice and aspiring data scientists with limited Python experience and a working knowledge of data analysis, but no specific expertise in data science algorithms
- Data analysts who are proficient in statistical modelling with R or MATLAB and who would like to use Python to perform data science operations
- Developers and programmers who intend to expand their knowledge and learn about data manipulation and machine learning

The example code for this book can be downloaded from http://www.hzbook.com.
Contents

Chapter 1 First Steps
1.1 Introducing data science and Python
1.2 Installing Python
1.3 Scientific distributions
1.4 Introducing IPython
1.5 Summary

Chapter 2 Data Munging
2.1 The data science process
2.2 Data loading and preprocessing with pandas
2.3 Working with categorical and textual data
2.4 Data processing with NumPy
2.5 Creating NumPy arrays
2.6 NumPy fast operations and computations
2.7 Summary

Chapter 3 The Data Science Pipeline
3.1 Introducing EDA
3.2 Feature creation
3.3 Dimensionality reduction
3.4 Detection and treatment of outliers
3.5 Scoring functions
3.6 Testing and validating
3.7 Cross-validation
3.8 Hyper-parameter optimization
3.9 Feature selection
3.10 Summary

Chapter 4 Machine Learning
4.1 Linear and logistic regression
4.2 Naive Bayes
4.3 The k-Nearest Neighbors
4.4 Advanced nonlinear algorithms
4.5 Ensemble strategies
4.6 A peek into Natural Language Processing (NLP)
4.7 An overview of unsupervised learning
4.8 Summary

Chapter 5 Social Network Analysis
5.1 Introduction to graph theory
5.2 Graph algorithms
5.3 Graph loading, dumping, and sampling
5.4 Summary

Chapter 6 Visualization
6.1 Introducing the basics of matplotlib
6.2 Selected graphical examples with pandas
6.3 Advanced data learning representation
6.4 Summary

Chapter 1 First Steps

Whether you are an eager learner of data science or a well-grounded data science practitioner, you can take advantage of this essential introduction to Python for data science. You will get the most out of it if you already have some experience in basic coding, either writing general-purpose programs in Python or working in a data analysis language such as MATLAB or R.
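The book delves directly into Python for data science, providing a straight and fast route to solving data science problems with Python and its powerful data analysis and machine learning packages. The code examples do not require you to master Python. However, they assume that you know at least the basics of Python scripting, data structures such as lists and dictionaries, and how class objects work. If you do not feel confident about these subjects, we suggest that you take an online tutorial before reading the book, such as the Code Academy course at http://www.codecademy.com/en/tracks/python/ or Google's Python class at https://developers.google.com/edu/python/. Both courses are free, and in a few hours of study they provide all the building blocks you need to enjoy this book to the fullest.

In any case, do not be intimidated by these starting requirements: mastering Python for data science applications is not as arduous as you may think. We only assume some basic knowledge because our intention is to go straight to the point of doing data science, without explaining too much about the general aspects of the language.

Are you ready, then? Let's start! In this short introductory chapter, we work out the basics and go through the following topics:
- How to set up a Python data science toolbox
- Using IPython
- An overview of the data that we are going to study in this book

1.1 Introducing data science and Python

Data science is a relatively new knowledge domain, though its core components have been studied and researched for many years by the computer science community. These components include linear algebra, statistical modelling, visualization, computational linguistics, graph analysis, machine learning, business intelligence, and data storage and retrieval.

Because the domain is new, its frontier is still somewhat blurred and dynamic. Because it is made up of so many disciplines, there are also different profiles of data scientists, depending on their competencies and areas of expertise.

In such a situation, what is the best tool of the trade that you can learn and use effectively in your career as a data scientist? We believe that it is Python, and we intend to provide all the essential information you need for a fast start.

Other tools, such as R and MATLAB, provide data scientists with specialized instruments for specific problems in statistical analysis and matrix manipulation. However, only Python completes your data scientist skill set. This multipurpose language is suitable for development and production alike, and it is easy to learn and grasp whatever your background or experience.

Created in 1991 as a general-purpose, interpreted, object-oriented language, Python has slowly and steadily conquered the scientific community and grown into a mature ecosystem of specialized packages for data processing and analysis. It allows countless fast experiments, easy development of theories, and prompt deployment of scientific applications.

At present, the characteristics that make Python an indispensable data science tool are as follows:
- Python easily integrates different tools and offers a truly unifying ground for different languages (Java, C, Fortran, and even language primitives), data strategies, and learning algorithms, which can be fitted together to help data scientists forge powerful new solutions.
- It offers a large, mature system of packages for data analysis and machine learning. You will find everything that you may need in the course of a data analysis, and sometimes even more.
- It is very versatile. Whatever your programming background or style (object-oriented or procedural), you will enjoy programming with Python.
- It is cross-platform. Your solutions will work smoothly on Windows, Linux, and Mac OS systems, so you do not have to worry about portability.
- Although interpreted, it is undoubtedly fast compared with other mainstream data analysis languages such as R and MATLAB (though it is not comparable to C, Java, and the newly emerged Julia language). It can be made even faster with some easy tricks that we explain in this book.
- It can work with big data in memory because of its minimal memory footprint and excellent memory management. The garbage collector will often save the day when you load, transform, dice, slice, save, or discard data during the many iterations of data wrangling.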
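- It is very simple to learn and use. After you grasp the basics, there is no better way to learn more than to start coding immediately.

1.2 Installing Python

First of all, let's introduce the settings you need in order to create a fully working data science environment in which to test the examples and experiment with the code that we provide.

Python is an open source, object-oriented, cross-platform programming language that is very concise compared with its direct competitors (for instance, C++ and Java). It allows you to build a working software prototype in a very short time. Is that the only reason it became the most used language in the data scientist's toolbox? No. It is also a general-purpose language, and it is very flexible thanks to the large variety of available packages that solve a wide spectrum of problems.

1.2.1 Python 2 or Python 3?

There are two main branches of Python: 2 and 3. Although Python 3 is the newest, Python 2 is still the most used version in the scientific area, since a few libraries will not run otherwise (see http://py3readiness.org for a compatibility overview). In fact, code developed for Python 2 will not work if you run it with a Python 3 interpreter: major changes were made in the newest version, and they affected compatibility. Python 3 is not backward compatible with Python 2.

In this book, in order to address a larger audience of readers and practitioners, we adopt the Python 2 syntax for all our examples (at the time of writing, the latest release is 2.7.8). Since the differences amount to minor changes, advanced users of Python 3 are encouraged to adapt and optimize the code for their favored version.

1.2.2 Step-by-step installation

Novice data scientists who have never used Python (and so do not have it installed on their machines) first need to download the installer from the main website of the project, https://www.python.org/downloads/, and then install it on their local machine.

Tip: This section gives you full control over what is installed on your machine, which is very useful when you have to set up single machines for different data science tasks. Be warned, however, that a step-by-step installation takes time and effort. Installing a ready-made scientific distribution lessens the burden and may be well suited to first starting and learning, because it saves time and sometimes trouble, though it puts a large number of packages on your computer all at once (and we will not use most of them). If you want to start immediately with an easy installation procedure, skip this part and proceed to section 1.3, Scientific distributions.

Python is a multiplatform programming language, so you will find installers for machines that run Windows or Unix-like operating systems. Remember that some Linux distributions (such as Ubuntu) already have Python 2 packaged in their repositories, which makes the installation even easier.

1) To open a Python shell, type python in the terminal or click on the Python icon.
2) To test the installation, run the following code in the Python interactive shell, or REPL (read-eval-print loop):

    >>> import sys
    >>> print sys.version_info

3) If a syntax error is raised, you are running Python 3 instead of Python 2. If you get no error and the reported version has the attribute major=2, then you are running the right version of Python and are ready to move forward.

To keep the notation clear: when a command is to be given in the terminal command line, we prefix it with $>. When it is meant for the Python REPL, it is preceded by >>>.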
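The version check in step 2 relies on the Python 2 print statement, so it fails outright under Python 3. As a complementary sketch (not part of the original text), the check below runs unchanged under either branch; the helper name python_major_version is an illustrative assumption.

```python
import sys


def python_major_version():
    """Return the major version (2 or 3) of the running interpreter."""
    # Indexing works on every Python 2 and 3 release; the named
    # attribute sys.version_info.major needs Python 2.7 or later.
    return sys.version_info[0]


if python_major_version() == 2:
    print("Python 2 detected: the book's examples run as printed.")
else:
    print("Python %d detected: print statements need parentheses."
          % python_major_version())
```

Calling print with a single parenthesized argument is valid in both branches, which is why the sketch does not itself raise the syntax error described in step 3.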
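1.2.3 A glance at the essential Python packages

We mentioned that the two most relevant Python characteristics are its ability to integrate with other languages and its mature package system, which is well embodied by PyPI (the Python Package Index, https://pypi.python.org/pypi), a common repository for the majority of Python packages.

The packages that we introduce here are strongly analytical and together form a complete data science toolbox, made up of highly optimized functions with efficient memory use that are ready for scripting operations with optimal performance. How to install them is explained in the following section.

Partially inspired by similar tools in the R and MATLAB environments, we will explore how a few selected Python commands allow you to handle data efficiently and then explore, transform, experiment, and learn from it without writing too much code or reinventing the wheel.

1. NumPy

NumPy, which is Travis Oliphant's creation, is the true analytical workhorse of the Python language. It provides multidimensional arrays, along with a large set of functions for performing mathematical operations on them. Arrays are blocks of data arranged along multiple dimensions, which implement mathematical vectors and matrices. Arrays are useful not just for storing data but also for fast matrix operations (vectorization), which are indispensable when you wish to solve ad hoc data science problems.
- Website: http://www.numpy.org/
- Version at the time of print: 1.9.1
- Suggested install command: pip install numpy

As a convention largely adopted by the Python community, NumPy is imported with the alias np:

    import numpy as np

We follow this convention throughout the book.

2. SciPy

An original project by Travis Oliphant, Pearu Peterson, and Eric Jones, SciPy completes NumPy's functionality, offering a larger variety of scientific algorithms for linear algebra, sparse matrices, signal and image processing, optimization, fast Fourier transformation, and much more.
- Website: http://www.scipy.org/
- Version at the time of print: 0.14.0
- Suggested install command: pip install scipy

3. pandas

The pandas package deals with everything that NumPy and SciPy cannot do. Thanks to its specific data structures, DataFrames and Series, pandas allows you to handle complex tables of data of different types (something that NumPy's arrays cannot do) and time series. Thanks to Wes McKinney's creation, you can easily and smoothly load data from a variety of sources. You can then slice, dice, handle missing elements, add, rename, aggregate, reshape, and finally visualize the data at will.
- Website: http://pandas.pydata.org/
- Version at the time of print: 0.15.2
- Suggested install command: pip install pandas

Conventionally, pandas is imported with the alias pd:

    import pandas as pd

4. Scikit-learn

Started as part of the SciKits (SciPy Toolkits), Scikit-learn is the core of data science operations in Python. It offers all that you may need in terms of data preprocessing, supervised and unsupervised learning, model selection, validation, and error metrics. We talk about this package at length throughout the book. Scikit-learn started in 2007 as a Google Summer of Code project by David Cournapeau. Since 2013, it has been taken over by researchers at INRIA (the French Institute for Research in Computer Science and Automation).
- Website: http://scikit-learn.org/stable/
- Version at the time of print: 0.15.2
- Suggested install command: pip install scikit-learn

Note that the imported module is named sklearn.

5. IPython

A scientific approach requires the fast experimentation of different hypotheses in a reproducible fashion.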
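IPython was created by Fernando Perez to address the need for an interactive Python command shell (available in a terminal, in a web browser, and through an application interface), with graphical integration, customizable commands, rich history (in the JSON format), and computational parallelism for enhanced performance. IPython is our favored choice throughout this book, and we use it to illustrate operations on scripts and data, together with their results, clearly and effectively.
- Website: http://ipython.org/
- Version at the time of print: 2.3
- Suggested install command: pip install "ipython[notebook]"

6. Matplotlib

Originally developed by John Hunter, matplotlib is the library that contains all the building blocks required to create quality plots from arrays and to visualize them interactively. All the MATLAB-like plotting frameworks are found inside the pylab module.
- Website: http://matplotlib.org/
- Version at the time of print: 1.4.2
- Suggested install command: pip install matplotlib

You can import just what you need for your visualization purposes:

    import matplotlib.pyplot as plt

7. Statsmodels

Previously part of SciKits, Statsmodels was conceived as a complement to SciPy's statistical functions. It features generalized linear models, discrete choice models, time series analysis, and a series of descriptive statistics, as well as parametric and nonparametric tests.
- Website: http://statsmodels.sourceforge.net/
- Version at the time of print: 0.6.0
- Suggested install command: pip install statsmodels

8. Beautiful Soup

Beautiful Soup, a creation of Leonard Richardson, is a great tool for scraping data out of HTML and XML files retrieved from the Internet. It works well even on "tag soups" (hence the name), which are collections of malformed, contradictory, and incorrect tags. After choosing your parser (the HTML parser included in Python's standard library works fine), you can use Beautiful Soup to navigate through the objects in the page and extract text, tables, and any other information that you find useful.
- Website: http://www.crummy.com/software/BeautifulSoup/
- Version at the time of print: 4.3.2
- Suggested install command: pip install beautifulsoup4

Note that the imported module is named bs4.

9. NetworkX

Developed by the Los Alamos National Laboratory, NetworkX is a package specialized in the creation, manipulation, analysis, and graphical representation of real-life network data (it can easily operate on graphs made up of a million nodes and edges). Besides specialized data structures for graphs and fine visualization methods (2D and 3D), it provides many standard graph measures and algorithms, such as shortest path, centrality, components, communities, clustering, and PageRank. We use this package frequently in Chapter 5, Social Network Analysis.
- Website: https://networkx.github.io/
- Version at the time of print: 1.9.1
- Suggested install command: pip install networkx

Conventionally, NetworkX is imported with the alias nx:

    import networkx as nx

10. NLTK

The Natural Language Toolkit (NLTK) provides access to corpora and lexical resources and to a complete suite of functions for statistical Natural Language Processing (NLP), ranging from tokenizers to part-of-speech taggers and from tree models to named-entity recognition. The package was initially created by Steven Bird and Edward Loper as an NLP teaching infrastructure for the CIS-530 course at the University of Pennsylvania.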
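It is a fantastic tool that you can use to prototype and build NLP systems.
- Website: http://www.nltk.org/
- Version at the time of print: 3.0
- Suggested install command: pip install nltk

11. Gensim

Gensim, programmed by Radim Řehůřek, is an open source package suitable for the analysis of large textual collections with the help of parallel, distributable online algorithms. Among its advanced functionalities, it implements Latent Semantic Analysis (LSA), topic modelling by Latent Dirichlet Allocation (LDA), and Google's word2vec, a powerful algorithm that transforms text into vector features that can be used in supervised and unsupervised machine learning.
- Website: http://radimrehurek.com/gensim/
- Version at the time of print: 0.10.3
- Suggested install command: pip install gensim

12. PyPy

PyPy is not a package; it is an alternative implementation of Python 2.7.8 that supports most of the commonly used Python standard packages (unfortunately, NumPy is currently not fully supported). As an advantage, it offers enhanced speed and memory handling. It is therefore very useful for heavy-duty operations on large chunks of data, and it should be part of your big data handling strategies.
- Website: http://pypy.org/
- Version at the time of print: 2.4.0
- Download page: http://pypy.org/download.html

1.2.4 The installation of packages

Python does not come bundled with all you need, unless you take a specific premade distribution. To install the packages you need, you can use either pip or easy_install. Both tools run in the command line and make installing, upgrading, and removing Python packages a breeze. To check which tools are installed on your local machine, run the following command:

    $> pip

Alternatively, you can run:

    $> easy_install

If both commands end with an error, you need to install one of them. We recommend pip because it is thought of as an improvement over easy_install. Moreover, packages installed by pip can be uninstalled, and if a package installation fails, pip leaves your system clean.

To install pip, follow the instructions given at https://pip.pypa.io/en/latest/installing.html.

The most recent versions of Python should already have pip installed by default, so you may already have it on your system. If not, the safest way is to download the get-pip.py script from https://bootstrap.pypa.io/get-pip.py and then run it as follows:

    $> python get-pip.py

The script also installs the setup tools from https://pypi.python.org/pypi/setuptools/, which contain easy_install.

You are now ready to install the packages you need in order to run the examples in this book. To install a generic package <pk>, run:

    $> pip install <pk>

Alternatively, you can run:

    $> easy_install <pk>

After this, the package <pk> and all its dependencies are downloaded and installed. If you are not sure whether a library has been installed, just try to import a module inside it. If the Python interpreter raises an ImportError, you can conclude that the package has not been installed.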
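The import test just described can also be run for several modules at once. The following is only a sketch using the standard importlib module; the helper name is_installed and the list of module names are illustrative assumptions, not part of the original text.

```python
import importlib


def is_installed(module_name):
    """Return True if module_name can be imported, False otherwise."""
    try:
        # Note: this really imports the module, with any side effects
        # that the import may have.
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


# Module names (not package names): sklearn lives in Scikit-learn.
for name in ("numpy", "scipy", "pandas", "sklearn"):
    status = "installed" if is_installed(name) else "missing"
    print("%s: %s" % (name, status))
```

The check is deliberately simple: it only tells you whether the import succeeds, not which version is present.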
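For example, this is what happens when the NumPy library has been installed:

    >>> import numpy

This is what happens if it is not installed:

    >>> import numpy
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: No module named numpy

In the latter case, install it first through pip or easy_install.

Tip: Take care not to confuse packages with modules. With pip you install a package; in Python you import a module. Sometimes the package and the module have the same name, but in many cases they do not match. For example, the sklearn module is included in the package named Scikit-learn.

Finally, to search and browse the Python packages available, take a look at https://pypi.python.org/.

1.2.5 Package upgrades

More often than not, you will find yourself having to upgrade a package, either because the new version is required by a dependency or because it has additional features that you would like to use. First, check the version of the library that you have installed by looking at its __version__ attribute, as shown here for numpy:

    >>> import numpy
    >>> numpy.__version__ # 2 underscores before and after
    '1.9.0'

Now, if you want to update it to a newer release, say version 1.9.1, run the following command from the command line:

    $> pip install -U numpy==1.9.1

Alternatively (though we do not recommend it), you can run:

    $> easy_install --upgrade numpy==1.9.1

Finally, if you are interested in upgrading to the latest available version, simply run:

    $> pip install -U numpy

Alternatively, you can run:

    $> easy_install --upgrade numpy

1.3 Scientific distributions

As you have read so far, creating a working environment is a time-consuming operation for a data scientist. You first need to install Python, and then you install, one by one, all the libraries that you need (and sometimes the installation procedures do not go as smoothly as you had hoped).

If you want to save time and effort and be sure to have a fully working Python environment that is ready to use, you can download, install, and use a scientific Python distribution. Apart from Python, these distributions include a variety of preinstalled packages, and sometimes additional tools and an IDE. A few of them are very well known among data scientists, and in the sections that follow you will find the key features of each.

We suggest that you first download and install a scientific distribution, such as Anaconda (which is the most complete one). After practicing the examples in the book, you can decide to uninstall the distribution and set up Python alone, accompanied by just the packages you need for your projects.

1.3.1 Anaconda

Anaconda (https://store.continuum.io/cshop/anaconda/) is a Python distribution offered by Continuum Analytics that includes nearly 200 packages, among them NumPy, SciPy, pandas, IPython, Matplotlib, Scikit-learn, and NLTK. It is a cross-platform distribution that can be installed on machines alongside other existing Python distributions and versions, and its base version is free. Add-ons that contain advanced features are charged separately. Anaconda introduces conda, a binary package manager, as a command-line tool to manage your package installations. As stated on its website, Anaconda's goal is to provide an enterprise-ready Python distribution for large-scale processing, predictive analytics, and scientific computing.

1.3.2 Enthought Canopy

Enthought Canopy (https://www.enthought.com/products/canopy/) is a Python distribution by Enthought, Inc. It includes more than 70 preinstalled packages.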
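Among them are NumPy, SciPy, Matplotlib, IPython, and pandas. This distribution is targeted at engineers, data scientists, quantitative and data analysts, and enterprises. Its base version, named Canopy Express, is free; if you need advanced features, you have to buy the full version. It is a multiplatform distribution, and its command-line install tool is canopy_cli.

1.3.3 PythonXY

PythonXY (https://code.google.com/p/pythonxy/) is a free, open source Python distribution maintained by the community. It includes a number of packages, among them NumPy, SciPy, NetworkX, IPython, and Scikit-learn. It also includes Spyder, an interactive development environment inspired by the MATLAB IDE. It works only on Microsoft Windows, and its command-line installation tool is pip.

1.3.4 WinPython

WinPython (http://winpython.sourceforge.net/) is also a free, open source Python distribution maintained by the community. It is designed for scientists and includes many packages, such as NumPy, SciPy, Matplotlib, and IPython. It also includes Spyder as an IDE. It is free and portable: you can put WinPython into any directory, or even onto a USB flash drive. It works only on Microsoft Windows, and its command-line tool is the WinPython Package Manager (WPPM).

1.4 Introducing IPython

IPython is a special tool for interactive tasks. It contains special commands that help developers better understand the code that they are currently writing. These are the commands:
- <object>? and <object>??: These print a detailed description of <object> (?? is even more verbose)
- %<function>: This calls the special <magic function>

Let's demonstrate these commands with an example. We first start the interactive console with the ipython command, as shown here:

    $> ipython
    Python 2.7.6 (default, Sep 9 2014, 15:04:36)
    Type "copyright", "credits" or "license" for more information.
    IPython 2.3.1 -- An enhanced Interactive Python.
    ?         -> Introduction and overview of IPython's features.
    %quickref -> Quick reference.
    help      -> Python's own help system.
    object?   -> Details about 'object', use 'object??' for extra details.
    In [1]: obj1 = range(10)

In the first line of code, which IPython marks as [1], we create a list of 10 numbers (from 0 to 9) and assign it to an object named obj1.

    In [2]: obj1?
    Type: list
    String form: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    Length: 10
    Docstring:
    list() -> new empty list
    list(iterable) -> new list initialized from iterable's items
    In [3]: %timeit x=100
    10000000 loops, best of 3: 23.4 ns per loop
    In [4]: %quickref

In the line numbered [2], we inspect the obj1 object using the IPython command ?. IPython introspects the object and prints its details (obj1 is a list that contains the values [0, 1, 2, ..., 9] and has 10 elements), followed by some general documentation on lists. This example does not need it, but for complex objects, using ?? instead of ? gives a more verbose output.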
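Outside IPython, roughly the same details that obj1? reports can be collected with plain Python. The sketch below is only an approximation of that output, not IPython's actual implementation; the function name describe and the dictionary keys are illustrative assumptions.

```python
import inspect


def describe(obj):
    """Collect roughly the fields that IPython prints for `obj?`."""
    info = {
        "type": type(obj).__name__,
        "string_form": repr(obj),
        "docstring": inspect.getdoc(obj),
    }
    # Only sized objects (lists, strings, dicts, ...) have a length.
    if hasattr(obj, "__len__"):
        info["length"] = len(obj)
    return info


obj1 = list(range(10))  # list() keeps this a real list on Python 3 too
for field, value in sorted(describe(obj1).items()):
    print("%s: %s" % (field, value))
```

Wrapping range(10) in list() is a small adaptation: under Python 2, as in the session above, range already returns a list.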
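In line [3], we apply the magic function timeit to a Python assignment (x=100). The timeit function runs the instruction many times, records the computational time needed to execute it, and finally prints the average time taken.

We complete the overview by running the helper function quickref, as shown in line [4], which lists all the IPython special functions.

As you noticed, each time we use IPython we have an input cell and, optionally, an output cell, if there is something to be printed on stdout. Each input is numbered so that it can be referenced inside the IPython environment itself. For our purposes, we do not need such references in the code of the book, so we report inputs and outputs without their numbers. We use the generic In: and Out: notations to point out the input and output cells. Just copy the commands after In: into your own IPython cell and expect the output reported after the following Out:.

The basic notations are therefore:
- The In: command
- The Out: output (wherever it is present and useful to report in the book)

If we expect you to operate directly on the Python console, we use the following form:

    >>> command

Wherever necessary, command-line input and output are written as follows:

    $> command

Moreover, to run a bash command in the IPython console, prefix it with an exclamation mark (!):

    In: !ls
    Applications Google Drive Public Desktop Develop Pictures env temp
    In: !pwd
    /Users/mycomputer

1.4.1 IPython Notebook

The main goal of the IPython Notebook is easy storytelling. Storytelling is essential in data science because you must be able to do the following:
- See intermediate (debugging) results for each step of the algorithm you are developing
- Run only some sections (or cells) of the code
- Store intermediate results and be able to version them
- Present your work (as a combination of text, code, and images)

IPython implements all of these actions.

1) To launch the IPython Notebook, run the following command:

    $> ipython notebook

2) A web browser window opens on your desktop, backed by an IPython server instance.

[Figure: the IPython Notebook main window]

3) Then click on New Notebook. A new window opens.

[Figure: a new, untitled notebook]

This is the web application that you use to compose your story. It is very similar to a Python IDE, with the bottom section (where you write the code) composed of cells. A cell can be either a piece of text (possibly formatted with a markup language) or a piece of code. In the second case, you can run the code, and any output (the standard output) is placed under the cell. The following is a very simple example:

    In: import random
        a = random.randint(0, 100)
        a
    Out: 16
    In: a*2
    Out: 32

In the first cell, denoted by In:, we import the random module, assign a random value between 0 and 100 to the variable a, and print the value. When the cell is run, the output, denoted by Out:, is the random number. In the next cell, we just print double the value of a.

As you can see, this is a great tool for debugging and for deciding which parameter is best for a given operation. Now, what happens if we run the code in the first cell again? Will the output of the second cell change, since a is now different? Actually, no. Each cell is independent and autonomous.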
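In fact, after we run the code in the first cell again, we end up in this inconsistent state: the first cell shows a new value of a, while the second cell still shows the result computed from the old one.

    In: import random
        a = random.randint(0, 100)
        a

Note: The number in square brackets next to the first cell has also changed (from 1 to 3), since it is now the third command executed since the notebook started. Because each cell is autonomous, these numbers tell us the order in which the cells were executed.

IPython is a simple, flexible, and powerful tool. However, as the preceding example shows, when you update a variable that is used later in your Notebook, remember to run all the cells that follow the updated code so that the state is consistent.

When you save an IPython Notebook, the resulting .ipynb file is JSON formatted, and it contains all the cells and their content, plus the output. This makes things easier, because you do not need to run the code to see the notebook (in fact, you do not even need to have Python and its toolkits installed). This is very handy, especially when the output features pictures and the code contains very time-consuming routines. A downside is that the JSON-structured file format cannot easily be read by humans, since it contains images, code, text, and so on.

Now, let's discuss a data science related example (do not worry about understanding it completely):

    In: %matplotlib inline
        import matplotlib.pyplot as plt
        from sklearn import datasets
        from sklearn.feature_selection import SelectKBest, f_regression
        from sklearn.linear_model import LinearRegression
        from sklearn.svm import SVR
        from sklearn.ensemble import RandomForestRegressor

In this first cell, some Python modules are imported. Then we run the following code:

    In: boston_dataset = datasets.load_boston()
        X_full = boston_dataset.data
        Y = boston_dataset.target
        print X_full.shape
        print Y.shape
    Out: (506, 13)
         (506,)

In this cell, the dataset is loaded and its shape is shown. The dataset contains 506 house values for homes sold in the suburbs of Boston, along with their respective data arranged in columns. Each column of the data represents a feature. A feature is a characteristic property of the observation, and machine learning uses features to build models that can turn them into predictions. If you come from a statistical background, you can think of features as variables (values that vary with respect to the observations).

To see a complete description of the dataset, use print boston_dataset.DESCR.

After loading the observations and their features, and in order to demonstrate how IPython can effectively support the development of data science solutions, we perform some transformations and analysis on the dataset. We use classes such as SelectKBest and methods such as .get_support() and .fit(). Do not worry if these are not clear to you now; they are all covered extensively later in the book. Try to run the following code:

    In: selector = SelectKBest(f_regression, k=1)
        selector.fit(X_full, Y)
        X = X_full[:, selector.get_support()]
        print X.shape
    Out: (506, 1)

In this cell, we select one feature (the most discriminative one) by means of the SelectKBest class, which is fitted to the data using the .fit() method.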
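To make the idea of keeping the single most discriminative feature concrete, here is a minimal NumPy-only sketch on synthetic data. It is not Scikit-learn's implementation: it ranks features by their squared Pearson correlation with the target, which, to our understanding, orders them the same way as the F statistic computed by f_regression. The function name select_best_feature and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def select_best_feature(X, y):
    """Return the index of the column most correlated with y, plus scores."""
    Xc = X - X.mean(axis=0)          # center each feature
    yc = y - y.mean()                # center the target
    # Squared Pearson correlation between every column and the target.
    r2 = (Xc.T @ yc) ** 2 / ((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return int(np.argmax(r2)), r2


# Synthetic data: only column 2 actually drives the target.
rng = np.random.RandomState(101)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 2] + rng.normal(scale=0.5, size=200)

best, scores = select_best_feature(X, y)
X_selected = X[:, [best]]            # keep a single column, as with k=1
print(best, X_selected.shape)
```

Indexing with the list [best] keeps the result two-dimensional, which matches the (506, 1) shape printed in the cell above. The @ operator requires Python 3.5 or later.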
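We thus reduce the dataset to a vector by indexing all the rows and the selected feature, which is retrieved by the .get_support() method.

Since the target value is a vector, we can now try to see whether there is a linear relation between the input (the feature) and the output (the house value). When there is a linear relationship between two variables, the output constantly reacts to changes in the input by the same proportional amount and direction.

    In: plt.scatter(X, Y, color='black')
        plt.show()

[Figure: scatterplot of the selected feature X against the house value Y]

In our example, as X increases, Y decreases. However, this does not happen at a constant rate: the rate of change is intense up to a certain value of X, and then it decreases and becomes constant. This is a condition of nonlinearity, and we can visualize it further using a regression model. This model hypothesizes that the relationship between X and Y is linear, in the form y = a + bX. Its a and b parameters are estimated according to a certain criterion.

In the next cell, we create a regressor (a simple linear regression with feature normalization), train it, and finally plot the best linear relation between the input and the output (that is, the linear model of the regressor):

    In: regressor = LinearRegression(normalize=True)
        regressor.fit(X, Y)
        plt.scatter(X, Y, color='black')
        plt.plot(X, regressor.predict(X), color='blue', linewidth=3)
        plt.show()

[Figure: the same scatterplot with the fitted regression line]

Clearly, the linear model is an approximation that does not work well. We have two possible roads to follow at this point: we can transform the variables in order to make their relationship linear, or we can use a nonlinear model. Support Vector Machine (SVM) is a class of models that can easily handle nonlinearities. Random Forests is another model that solves similar problems automatically. Let's see them in action in IPython:

    In: regressor = SVR()
        regressor.fit(X, Y)
        plt.scatter(X, Y, color='black')
        plt.scatter(X, regressor.predict(X), color='blue', linewidth=3)
        plt.show()

[Figure: scatterplot with the SVR predictions overlaid]

    In: regressor = RandomForestRegressor()
        regressor.fit(X, Y)
        plt.scatter(X, Y, color='black')
        plt.scatter(X, regressor.predict(X), color='blue', linewidth=3)
        plt.show()

[Figure: scatterplot with the Random Forest predictions overlaid]

In these last two cells, we repeat the same procedure with two nonlinear approaches: an SVM regressor and a Random Forest regressor.

This demonstrative code addresses the nonlinearity problem. At this point, it is very easy to change the selected feature, the regressor, the number of features used to train the model, and so on, simply by modifying the relevant cells. Everything can be done interactively, and according to the results we see, we can decide what should be kept or changed and what is to be done next.

1.4.2 Datasets and code used in the book

As we progress through the concepts presented in this book, and in order to facilitate understanding, learning, and memorizing, we illustrate practical and effective data science applications of Python on various explicative datasets. You will always be able to immediately replicate, modify, and experiment with the proposed instructions and scripts on the data that we use.

As for the code in this book, we limit our discussion to the most essential commands, in order to inspire you from the beginning of your data science journey with Python to do more with less by leveraging key functions from the packages presented earlier.

Given our previous introduction, we present the code to be run interactively, as it appears on an IPython console or Notebook.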
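All the presented code is offered as Notebooks, which are available on the publisher's website (www.hzbook.com). As for the data, we provide different examples of datasets.

1. Scikit-learn toy datasets

The Scikit-learn toy datasets are embedded in the Scikit-learn package. They can be loaded directly into Python with an import command and do not require any download from an external Internet repository. Examples are the Iris, Boston, and Digits datasets, to name the principal ones mentioned in countless publications and books, along with a few other classics for classification and regression. They are structured in a dictionary-like object that, besides the features and target variables, offers a complete description and contextualization of the data itself.

For instance, to load the Iris dataset, enter the following commands:

    In: from sklearn import datasets
    In: iris = datasets.load_iris()

After loading, we can explore the data description and understand how the features and targets are stored. Basically, all Scikit-learn datasets present the following attributes:
- .DESCR: This provides a general description of the dataset
- .data: This contains all the features
- .feature_names: This reports the names of the features
- .target: This contains the target values, expressed as values or numbered classes
- .target_names: This reports the names of the classes in the target
- .shape: This can be applied to both .data and .target; it reports the number of observations (the first value) and the number of features (the second value, if present)

Now, let's try them out (no output is reported here, but the print commands will provide you with plenty of information):

    In: print iris.DESCR
    In: print iris.data
    In: print iris.data.shape
    In: print iris.feature_names
    In: print iris.target
    In: print iris.target.shape
    In: print iris.target_names

You should now know something more about the dataset: how many examples and variables are present, and what their names are.

Notice that the main data structures enclosed in the iris object are the two arrays, data and target:

    In: print type(iris.data)
    Out: <type 'numpy.ndarray'>

iris.data holds the numeric values of the variables named sepal length, sepal width, petal length, and petal width, arranged in a matrix of shape (150, 4), where 150 is the number of observations and 4 is the number of features. The order of the variables is the order presented in iris.feature_names.

iris.target is a vector of integer values, where each number represents a distinct class (refer to the content of target_names: each class name is related to its index number, so setosa, which is the zero element of the list, is represented as 0 in the target vector).

The Iris flower dataset was first used in 1936 by Ronald Fisher, one of the fathers of modern statistical analysis, to demonstrate the functionality of linear discriminant analysis on a small set of empirically verifiable examples (each of the 150 data points represents an iris flower). The examples are arranged into three balanced species classes (each class consists of one-third of the examples) and are described by four metric variables that, when combined, are able to separate the classes.

The advantage of using such a dataset is that it is very easy to load, handle, and explore for different purposes, from supervised learning to graphical representation. Modelling takes almost no time on any computer, whatever its specifications. Moreover, the relationship between the classes and the role of the explicative variables are well known. The task is therefore challenging, but not arduous.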
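For example, let's observe how easily the classes can be separated when at least two of the four available variables are combined, using a scatterplot matrix.

Scatterplot matrices are arranged in a matrix format whose columns and rows are the dataset variables. Each element of the matrix contains a single scatterplot whose x values are determined by the row variable of the matrix and whose y values are determined by the column variable. The diagonal elements of the matrix may contain a distribution histogram or some other univariate representation of the variable that is found in both that row and that column.

The pandas library offers an off-the-shelf function to quickly build scatterplot matrices and start exploring the relationships and distributions of the quantitative variables in a dataset:

    In: import pandas as pd
        import numpy as np
    In: colors = list()
    In: palette = {0: "red", 1: "green", 2: "blue"}
    In: for c in np.nditer(iris.target): colors.append(palette[int(c)])
        # using the palette dictionary, we convert
        # each numeric class into a color string
    In: dataframe = pd.DataFrame(iris.data, columns=iris.feature_names)
    In: scatterplot = pd.scatter_matrix(dataframe, alpha=0.3, figsize=(10, 10), diagonal='hist', color=colors, marker='o', grid=True)

[Figure: scatterplot matrix of the four Iris features (sepal length, sepal width, petal length, petal width, in cm), with points colored by class]

We encourage you to experiment a lot with this dataset and with similar ones before you work on more complex real data, because focusing on an accessible, nontrivial data problem helps you build your foundations in data science quickly.

After a while, however, toy datasets, though useful and interesting for learning, will start to limit the variety of experiments that you can carry out. In order to progress, you need access to complex and realistic data science problems, and so we have to resort to some external data.

2. The mldata.org public repository

The second type of example dataset that we present can be downloaded directly from the machine learning dataset repository or from the LIBSVM data website. Unlike the previous datasets, these require access to the Internet.

First of all, mldata.org is a public repository of machine learning datasets that is hosted by the TU Berlin University and supported by PASCAL (Pattern Analysis, Statistical Modelling, and Computational Learning), a network funded by the European Union.

For example, suppose you need all the data on earthquakes since 1972, as reported by the United States Geological Survey, in order to analyze it and search for predictive patterns. You will find the data, along with a detailed description, at http://mldata.org/repository/data/viewslug/global-earthquakes/.

Note that the directory that contains the dataset is global-earthquakes; you can obtain the data directly using the following commands:

    In: from sklearn.datasets import fetch_mldata
    In: earthquakes = fetch_mldata('global-earthquakes')
    In: print earthquakes.data
    In: print earthquakes.data.shape
    Out: (59209L, 4L)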
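As in the case of the Scikit-learn toy datasets, the object obtained is a complex dictionary-like structure, where the predictive variables are in earthquakes.data and the target to be predicted is in earthquakes.target. Since this is real data, you have quite a lot of examples and just a few variables available.

3. LIBSVM data examples

LIBSVM Data (http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/) is a page that gathers data from many other collections. It offers different regression, binary classification, and multilabel classification datasets stored in the LIBSVM format. This repository is quite interesting if you wish to experiment with the support vector machine algorithm.

To load a dataset, first go to the page where the data is displayed. In this case, visit http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a1a and take down the address. Then you can proceed with a direct download:

    In: import urllib2
    In: target_page = 'http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a1a'
    In: a2a = urllib2.urlopen(target_page)
    In: from sklearn.datasets import load_svmlight_file
    In: X_train, y_train = load_svmlight_file(a2a)
    In: print X_train.shape, y_train.shape
    Out: (2265, 119) (2265L,)

In return, you get two objects: a set of training examples in a sparse matrix format and an array of responses.

4. Loading data directly from CSV or text files

Sometimes you may have to download the datasets directly from their repository using a web browser or a wget command.

If you have already downloaded the data and unpacked it (if necessary) into your working directory, the simplest way to load it and start working is offered by NumPy and pandas, with their respective loadtxt and read_csv functions.

For instance, if you intend to analyze the Boston housing data using the version at http://mldata.org/repository/data/viewslug/regression-datasets-housing/, you first have to download the regression-datasets-housing.csv file into your local directory.

Since the variables in the dataset are all numeric (13 continuous and one binary), the fastest way to load it and start using it is the NumPy function loadtxt, which loads all the data directly into an array.

Real-life datasets often contain mixed types of variables, which can be handled by pandas.read_table or pandas.read_csv; the data can then be extracted with the values attribute. If your data is already numeric, loadtxt can save a lot of memory, since it does not require any in-memory duplication.

    In: housing = np.loadtxt('regression-datasets-housing.csv', delimiter=',')
    In: print type(housing)
    Out: <type 'numpy.ndarray'>
    In: print housing.shape
    Out: (506L, 14L)

By default, the loadtxt function expects whitespace as the separator between the values in a file. If the separator is a comma (,) or a semicolon (;), you have to state it explicitly using the delimiter parameter.

    >>> import numpy as np
    >>> type(np.loadtxt)
    <type 'function'>
    >>> help(np.loadtxt)
    Help on function loadtxt in module numpy.lib.npyio.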
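The role of the delimiter parameter can be tried without downloading anything by reading from an in-memory text buffer instead of a file. This is only a sketch: the sample values are made up for illustration and are not taken from the housing file.

```python
import io

import numpy as np

# A made-up, comma-separated sample standing in for a CSV file on disk.
csv_text = u"1.5,2.0,3.25\n4.0,5.5,6.0\n"

# Without delimiter=',' loadtxt would split on whitespace and fail to
# parse a whole line such as "1.5,2.0,3.25" as a single number.
data = np.loadtxt(io.StringIO(csv_text), delimiter=",")

print(type(data))
print(data.shape)
```

Because loadtxt accepts any file-like object as well as a filename, the same call works unchanged once the buffer is replaced by the path to a real file.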
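Another important default parameter is dtype, which is set to float. This means that loadtxt converts all the loaded data to floating point numbers. If you need a different type (for example, int), you have to declare it beforehand. For instance, to convert the numeric data to int, use the following code:

In: housing_int = np.loadtxt('regression-datasets-housing.csv', delimiter=',', dtype=int)

Printing the first three elements of the first row of the housing and housing_int arrays shows the difference:

In: print housing[0,:3], '\n', housing_int[0,:3]
Out: [ 6.32000000e-03 1.80000000e+01 2.31000000e+00]
[ 0 18 2]

Frequently, though not in our example, the first line of a data file is a textual header that contains the names of the variables. In this case, the skiprows parameter tells loadtxt how many lines to skip before it starts reading the data. Since the header is on row 0 (in Python, counting always starts from 0), skiprows=1 avoids the error that would otherwise prevent the data from loading.

The situation is slightly different for the Iris dataset, which is available at http://mldata.org/repository/data/viewslug/datasets-uci-iris/. This dataset has a qualitative target variable, class, which is a string naming the iris species, that is, a categorical variable. If you used loadtxt here, you would get a value error, because all the elements of an array must be of the same type: the class variable is a string, whereas the other variables are floating point values. How should we proceed?

The pandas library offers the solution with its DataFrame data structure, which easily handles datasets in matrix form (rows by columns) made up of different types of variables. First download the datasets-uci-iris.csv file and save it in your local directory. Using the pandas read_csv function is then straightforward:

In: iris_filename = 'datasets-uci-iris.csv'
In: iris = pd.read_csv(iris_filename, sep=',', decimal='.', header=None,
        names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'target'])
In: print type(iris)
Out: <class 'pandas.core.frame.DataFrame'>

Apart from the filename, you can specify the separator (sep), the way decimal points are expressed (decimal), whether there is a header (here, header=None; normally, if the file has a header, header=0), and the names of the variables (as a list; otherwise pandas provides automatic names).

Note that the names we defined are single words (with underscores instead of spaces). This lets us extract a single variable later by calling it as we would an attribute; for instance, iris.sepal_length extracts the sepal length data.

If you now need to convert the pandas DataFrame into a pair of NumPy arrays that contain the data and the target values, a couple of commands suffice:

In: iris_data = iris.values[:,:4]
In: iris_target, iris_target_labels = pd.factorize(iris.target)
In: print iris_data.shape, iris_target.shape
Out: (150L, 4L) (150L,)

5. Scikit-learn sample generators

As a last learning resource, Scikit-learn also makes it possible to quickly create synthetic datasets for regression, binary and multilabel classification, cluster analysis, and dimensionality reduction.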
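The main advantage of synthetic data is that it is created instantly in the working memory of your Python console. You can therefore build bigger data examples without long downloads from the Internet (and without saving a lot of files on your disk). For example, you may need to work on a classification problem with a million examples:

In: from sklearn import datasets # We just import the "datasets" module
In: X,y = datasets.make_classification(n_samples=10**6, n_features=10, random_state=101)
In: print X.shape, y.shape
Out: (1000000L, 10L) (1000000L,)

After importing just the datasets module, we use the make_classification command to ask for one million examples (the n_samples parameter) and 10 useful features (n_features). The random_state is set to 101, which guarantees that the same dataset can be replicated at a different time and on a different machine. For instance, you can type the following command:

$> datasets.make_classification(1, n_features=4, random_state=101)

It will always give you the following output:

(array([[-3.31994186, -2.39469384, -2.35882002, 1.40145585]]), array([0]))

Whatever the computer and the specific situation, random_state makes the results deterministic and your experiments perfectly replicable. Setting the random_state parameter to a specific integer (101 here, but it may be any number you prefer or find useful) makes it easy to reproduce the same dataset on your machine, on different operating systems, and on different machines.

By the way, did it take too long? On a machine with an i3-2330M CPU @ 2.20GHz, the timing is as follows:

In: %timeit X,y = datasets.make_classification(n_samples=10**5, n_features=10, random_state=101)
Out: 1 loops, best of 3: 2.17 s per loop

If the timing looks reasonable on your machine too, and you have set up and tested everything up to this point, we can start our data science journey.

Summary

In this short introductory chapter, we installed everything that we will use throughout the book, from Python packages to examples, either directly or through a scientific distribution. We also introduced IPython and showed how to access the data used in the tutorials. In the next chapter, we will give an overview of the data science pipeline and explore the key tools for handling and preparing data before you apply any learning algorithm and set up your hypothesis experimentation schedule.

Chapter 2  Data Munging

We are now getting into the action. In this chapter, you will learn how to munge data. What does that mean? "Munge" is a technical term coined about half a century ago by students at the Massachusetts Institute of Technology (MIT). Munging means changing a piece of original data, in a series of well-specified and reversible steps, into a completely different and hopefully more useful one. Data scientists use other, almost synonymous, terms for the same activity, such as "data wrangling" and "data preparation".

In this chapter, the following topics are covered:
• The data science process (so that you know what is going on and what comes next)
• Uploading data from a file
• Selecting the data you need
• Cleaning up missing or wrong data
• Adding, inserting, and deleting data
• Grouping and transforming data to obtain new, meaningful information
• Obtaining a dataset matrix or an array to feed into the data science pipeline

2.1 The data science process

Although every data science project is different, for illustrative purposes we can partition them into a series of reduced and simplified phases. The process starts with obtaining the data, which covers a range of possibilities: simply uploading the data, assembling it from RDBMS or NoSQL repositories, generating it synthetically, or scraping it from Web APIs or HTML pages.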
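Obtaining data is a critical part of the data scientist's work, especially when facing novel challenges, but we will only touch on it briefly by offering the basic tools to get your data (even when it is too big) into your computer's memory from a text file on your hard disk or on the Web, or from tables in a relational database (RDBMS).

Then comes the data munging phase. Data inevitably arrives in a form that is unsuitable for analysis and experimentation. With a handful of basic Python data structures and commands, you have to address all the problematic data so that the next phases of the project receive a typical data matrix, with observations in rows and variables in columns. Although it is less rewarding, data munging lays the foundation for every complex and sophisticated value-added analysis that you may have in mind.

Once the data matrix you will work on is completely defined, a new phase opens up, in which you observe your data and develop and test your hypotheses in a recurring loop. For instance, you explore your variables graphically. With the help of descriptive statistics and your domain knowledge, you figure out how to create new variables. You deal with redundant and unexpected information (outliers, first of all) and select the most meaningful variables and the most effective parameters to be tested by a selection of machine learning algorithms.

This phase is structured as a pipeline, in which your data is processed through a series of steps, after which a model is created. Depending on the results, you may realize that you have to iterate and start again from data munging, or from somewhere else in the pipeline, until you reach something meaningful for your ends.

From our experience in the field, we can assure you that, however promising your plans were when you started analyzing the data, your final solution will be very different from the idea you first envisioned. The experimental results rule: they drive the data munging, the model optimization, and the number of iterations needed before the project reaches a satisfying end. Consequently, to be a successful data scientist, it is not enough to provide theoretically sound solutions. You must be able to prototype a large number of possible solutions quickly in order to find the best path to take. Our purpose is to help you accelerate as much as possible by using the code snippets provided in this book in your data science process.

The results of your project are represented by an error or optimization measure, which you will have chosen carefully to represent the business objective. Apart from the error measure, your achievements can also be conveyed by an interpretable insight, which has to be described verbally or visually to the sponsors of the project or to other data scientists. Being able to communicate results suitably with tables, charts, and plots is therefore of paramount importance.

2.2 Data loading and preprocessing with pandas

In the previous chapter, we discussed where to find useful datasets and examined the basic import commands of the Python packages. In this section, with your toolbox ready, you will learn how to load, manipulate, preprocess, and polish data with pandas and NumPy.

2.2.1 Fast and easy data loading

Let's start with a CSV file and pandas. The pandas library offers the most accessible and complete function for loading tabular data from a file (or a URL). By default, it stores the data in a specialized pandas data structure, indexes each row, separates the variables by a custom delimiter, infers the right data type for each column, converts the data (if necessary), and parses dates, missing values, and erroneous values.

In: import pandas as pd
    iris_filename = 'datasets-uci-iris.csv'
    iris = pd.read_csv(iris_filename, sep=',', decimal='.', header=None,
        names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'target'])

You can specify the name of the file, the character used as the separator (sep), the character used as the decimal mark (decimal), whether there is a header (header), and the variable names (names, given as a list). The settings sep=',' and decimal='.' are the default values and are redundant here. For European-style CSV files, however, it is important to set both, because in many countries the separator and the decimal mark differ from the defaults.

Tip: if the dataset is not available locally, you can download it from the Internet as follows:

import urllib2
url = "http://aima.cs.berkeley.edu/data/iris.csv"
set1 = urllib2.Request(url)
iris_p = urllib2.urlopen(set1)
iris_other = pd.read_csv(iris_p, sep=',', decimal='.', header=None,
    names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'target'])
iris_other.head()

The resulting object, named iris, is a pandas DataFrame.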
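The read_csv parameters just described can be tried on an in-memory sample. This is only a sketch under stated assumptions: it targets Python 3 with a recent pandas (the book's listings are Python 2), and csv_text is a hypothetical stand-in for the first three rows of datasets-uci-iris.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for the first three rows of datasets-uci-iris.csv.
csv_text = (
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "4.7,3.2,1.3,0.2,Iris-setosa\n"
)
column_names = ["sepal_length", "sepal_width",
                "petal_length", "petal_width", "target"]

# header=None: the file has no header row, so names supplies the column labels.
iris_sample = pd.read_csv(io.StringIO(csv_text), sep=",", decimal=".",
                          header=None, names=column_names)

print(type(iris_sample))
print(iris_sample.dtypes)
```

The four measurement columns are inferred as floating point numbers, while the target column is kept as text.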
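A DataFrame is more than a simple Python list or dictionary, and we will explore some of its features in the sections that follow. To get an idea of its content, you can print the first (or the last) rows with the following commands:

In: iris.head()
Out:
   sepal_length  sepal_width  petal_length  petal_width       target
0           5.1          3.5           1.4          0.2  Iris-setosa
1           4.9          3.0           1.4          0.2  Iris-setosa
2           4.7          3.2           1.3          0.2  Iris-setosa
3           4.6          3.1           1.5          0.2  Iris-setosa
4           5.0          3.6           1.4          0.2  Iris-setosa

In: iris.tail()
[...]

Called without arguments, these functions print five lines. To get a different number of rows, pass that number as an argument:

In: iris.head(2)

The preceding command prints just two lines. To get the names of the columns, use the columns attribute:

In: iris.columns
Out: Index([u'sepal_length', u'sepal_width', u'petal_length', u'petal_width', u'target'], dtype='object')

The resulting object is an interesting one. It looks like a Python list, but it is actually a pandas Index. As the name suggests, it indexes the names of the columns. To extract the target column, for example, you can do the following:

In: Y = iris['target']
    Y
Out:
0         Iris-setosa
1         Iris-setosa
2         Iris-setosa
3         Iris-setosa
...
149    Iris-virginica
Name: target, Length: 150, dtype: object

The type of the object Y is a pandas Series. For now, think of it as a one-dimensional array with axis labels; we will investigate it in depth later. We have just seen that a pandas Index acts like a dictionary index over the columns of the table. Note that you can also get several columns at once by passing a list of their names:

In: X = iris[['sepal_length', 'sepal_width']]
    X
Out:
     sepal_length  sepal_width
0             5.1          3.5
1             4.9          3.0
2             4.7          3.2
...
147           6.5          3.0
148           6.2          3.4
149           5.9          3.0
[150 rows x 2 columns]

In this case, the result is a pandas DataFrame. Why do the results differ when the same operation is used? In the first case, we asked for a single column, so the output is a one-dimensional vector (that is, a pandas Series). In the second case, we asked for several columns and obtained a matrix-like result (and matrices are mapped to pandas DataFrames). A novice reader can spot the difference by looking at the heading of the output: if the columns are labeled, the object is a pandas DataFrame; if the result is a vector with no heading, it is a pandas Series.

So far, we have covered some common steps of the data science process: after loading the dataset, you usually separate the features from the target labels. In a classification problem, the target labels are the ordinal numbers or text strings that indicate the class associated with each set of features.

The next steps require an idea of how large the problem is, so you need to know the size of the dataset. Typically, each observation is a row and each feature is a column. To obtain the dimensions of the dataset, use the shape attribute, which is available on both pandas DataFrames and Series:

In: iris.shape
Out: (150, 5)
In: Y.shape
Out: (150,)

The resulting object is a tuple that contains the size of the matrix or array in each dimension. Note that a pandas Series follows the same format (a tuple with only one element).

2.2.2 Dealing with problematic data

You should now be more confident with the basics of the process and ready to face more problematic datasets, since messy data is very common in reality. So let's see what happens when the CSV file contains a header, some missing values, and dates.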
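To make the example realistic, imagine a travel agency that records the temperatures of three popular destinations and which of the three destinations the user picks:

Date,Temperature_city_1,Temperature_city_2,Temperature_city_3,Which_destination
20140910,80,32,40,1
20140911,100,50,36,2
20140912,102,55,46,1
20140913,60,20,35,3
20140914,60,,32,3
20140915,,57,42,2

In this case, all the numbers are integers and the file has a header row. As a first attempt, we load the dataset with the following commands:

In: import pandas as pd
In: fake_dataset = pd.read_csv('a_loading_example_1.csv', sep=',')
    fake_dataset
Out:
       Date  Temperature_city_1  Temperature_city_2  Temperature_city_3  Which_destination
0  20140910                  80                  32                  40                  1
1  20140911                 100                  50                  36                  2
2  20140912                 102                  55                  46                  1
3  20140913                  60                  20                  35                  3
4  20140914                  60                 NaN                  32                  3
5  20140915                 NaN                  57                  42                  2

pandas automatically takes the column names from the first data row. A first problem is evident: the dates in the Date column have been parsed as integers. To have them parsed as dates, use the parse_dates parameter:

In: fake_dataset = pd.read_csv('a_loading_example_1.csv', parse_dates=[0])
    fake_dataset
Out:
        Date  Temperature_city_1  Temperature_city_2  Temperature_city_3  Which_destination
0 2014-09-10                  80                  32                  40                  1
1 2014-09-11                 100                  50                  36                  2
2 2014-09-12                 102                  55                  46                  1
3 2014-09-13                  60                  20                  35                  3
4 2014-09-14                  60                 NaN                  32                  3
5 2014-09-15                 NaN                  57                  42                  2

Next, to get rid of the missing values, which are marked as NaN, you can replace them with a more meaningful number (50 degrees Fahrenheit, for example) by executing the following command:

In: fake_dataset.fillna(50)
Out:
        Date  Temperature_city_1  Temperature_city_2  Temperature_city_3  Which_destination
0 2014-09-10                  80                  32                  40                  1
1 2014-09-11                 100                  50                  36                  2
2 2014-09-12                 102                  55                  46                  1
3 2014-09-13                  60                  20                  35                  3
4 2014-09-14                  60                  50                  32                  3
5 2014-09-15                  50                  57                  42                  2

After this, the missing data is gone. Handling missing data can require different approaches. As an alternative to the previous command, the values can be replaced by a negative constant, which marks them as different from the others (and leaves the guess to the learning algorithm):

In: fake_dataset.fillna(-1)

Note: this method fills the missing data only in the returned view of the data (that is, it does not modify the original DataFrame). To actually change the original, use the inplace=True argument.

NaN values can also be replaced by the mean or the median of the column, as a way to minimize the guessing error:

In: fake_dataset.fillna(fake_dataset.mean(axis=0))

The mean method calculates the mean along the specified axis. Note that axis=0 means that the calculation spans the rows, so the means are computed column-wise. Conversely, axis=1 spans the columns, so the results are row-wise. The axis parameter works the same way in all the other methods that require it, both in pandas and in NumPy, on which pandas is built.

The median method is analogous to mean, but it computes the median value, which is useful when the mean is not very representative because the data is too skewed.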
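The three replacement strategies discussed above (a constant, the column mean, and the column median) can be compared side by side. The following is a sketch only, assuming Python 3 and a recent pandas; csv_text reuses the last three rows of the travel agency listing, and the variable names are illustrative rather than taken from the book.

```python
import io

import pandas as pd

# The last three rows of the travel agency example, including the two gaps.
csv_text = (
    "Date,Temperature_city_1,Temperature_city_2,Temperature_city_3,Which_destination\n"
    "20140913,60,20,35,3\n"
    "20140914,60,,32,3\n"
    "20140915,,57,42,2\n"
)
dataset = pd.read_csv(io.StringIO(csv_text))
temperatures = dataset[["Temperature_city_1", "Temperature_city_2",
                        "Temperature_city_3"]]

# fillna returns a new object; temperatures itself keeps its NaN values.
filled_constant = temperatures.fillna(50)
filled_mean = temperatures.fillna(temperatures.mean(axis=0))
filled_median = temperatures.fillna(temperatures.median(axis=0))

print(filled_constant)
print(filled_mean)
print(filled_median)
```

Because fillna is called without inplace=True, the original temperatures frame still contains its two missing values after all three calls.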
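Another possible problem with real-world datasets is loading a file that contains errors or bad lines. In this case, the default behavior of the read_csv function is to stop and raise an exception. A possible workaround, which is not always feasible, is to ignore the offending line. In many cases, the only consequence of this choice is that the machine learning algorithm is trained without that observation. Suppose you have a badly formatted dataset and you want to load all the good lines and ignore the badly formatted ones. This is what the error_bad_lines option does:

Val1,Val2,Val3
0,0,0
1,1,1
2,2,2,2
3,3,3

In: bad_dataset = pd.read_csv('a_loading_example_2.csv', error_bad_lines=False)
    bad_dataset
Skipping line 4: expected 3 fields, saw 4
Out:
   Val1  Val2  Val3
0     0     0     0
1     1     1     1
2     3     3     3

2.2.3 Dealing with big datasets

If the dataset you want to load is too big to fit in memory, you will probably use a batch machine learning algorithm, which works on only a piece of the data at a time. A batch approach also makes sense if you just need a sample of the data (for example, to take a peek at it). Thanks to Python, you can load the data in chunks. This operation is also called data streaming, since the dataset flows into a DataFrame or some other data structure as a continuous stream, whereas in all the previous cases the dataset was fully loaded into memory in a single step.

With pandas, there are two ways to chunk and load a file. The first way is to load the dataset in chunks of the same size: each chunk is a piece of the dataset that contains all the columns and a limited number of rows, no more than the number set in the function call (the chunksize parameter). Note that, in this case, the output of the read_csv function is not a pandas DataFrame but an iterator-like object. To get the results in memory, you need to iterate over that object:

In: import pandas as pd
In: iris_chunks = pd.read_csv(iris_filename, header=None,
        names=['C1', 'C2', 'C3', 'C4', 'C5'], chunksize=10)
In: for chunk in iris_chunks:
        print chunk.shape
        print chunk
Out: (10, 5)
    C1   C2   C3   C4           C5
0  5.1  3.5  1.4  0.2  Iris-setosa
1  4.9  3.0  1.4  0.2  Iris-setosa
2  4.7  3.2  1.3  0.2  Iris-setosa
3  4.6  3.1  1.5  0.2  Iris-setosa
4  5.0  3.6  1.4  0.2  Iris-setosa
5  5.4  3.9  1.7  0.4  Iris-setosa
6  4.6  3.4  1.4  0.3  Iris-setosa
7  5.0  3.4  1.5  0.2  Iris-setosa
8  4.4  2.9  1.4  0.2  Iris-setosa
9  4.9  3.1  1.5  0.1  Iris-setosa
...

There will be 14 other pieces like this one, each of shape (10, 5). The second way to load a big dataset is to ask explicitly for an iterator over it.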
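Both loading styles of this section, fixed-size chunks and an explicit iterator, can be sketched on a small in-memory file. The assumptions here are Python 3 and a recent pandas; csv_text is a hypothetical 25-row, two-column dataset used only so that the chunk sizes are easy to verify.

```python
import io

import pandas as pd

# Hypothetical 25-row, two-column CSV used only to make chunk sizes visible.
csv_text = "".join("{0},{1}\n".format(i, i * i) for i in range(25))

# Style 1: chunksize yields DataFrames of at most 10 rows each.
chunk_shapes = [
    chunk.shape
    for chunk in pd.read_csv(io.StringIO(csv_text), header=None,
                             names=["C1", "C2"], chunksize=10)
]

# Style 2: iterator=True lets the caller pick the size of every piece.
reader = pd.read_csv(io.StringIO(csv_text), header=None,
                     names=["C1", "C2"], iterator=True)
first_piece = reader.get_chunk(3)
second_piece = reader.get_chunk(2)

print(chunk_shapes)
print(first_piece.shape, second_piece.shape)
```

The last chunk in the first style is shorter than the others because 25 is not a multiple of 10; in the second style, each get_chunk call continues from where the previous one stopped.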
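With an iterator, you can decide dynamically the length (that is, the number of rows) of each piece of the pandas DataFrame:

In: iris_iterator = pd.read_csv(iris_filename, header=None,
        names=['C1', 'C2', 'C3', 'C4', 'C5'], iterator=True)
In: print iris_iterator.get_chunk(10).shape
(10, 5)
In: print iris_iterator.get_chunk(20).shape
(20, 5)
In: piece = iris_iterator.get_chunk(2)
    piece
Out:
    C1   C2   C3   C4           C5
0  4.8  3.1  1.6  0.2  Iris-setosa
1  5.4  3.4  1.5  0.4  Iris-setosa

In this example, we first defined the iterator. We then retrieved a piece of 10 rows, followed by a piece of 20 further rows, and finally the two rows that are printed at the end.

Besides pandas, you can also use the csv package, which offers two functions for iterating over small chunks of data from files: reader and DictReader. Let's illustrate them, starting with the import of the csv package:

In: import csv

The reader function loads the data from disk into Python lists, whereas DictReader turns the data into dictionaries. Both functions work by iterating over the rows of the file being read. reader returns exactly what it reads, stripped of the line terminator and split into a list by the separator (a comma by default, but this can be changed). DictReader maps the list data into a dictionary whose keys are defined by the first row (if a header is present) or by the fieldnames parameter (a list of strings with the column names).

Reading the data as native lists is not a limitation. For instance, it makes it easier to speed up the code with a fast Python implementation such as PyPy. Moreover, lists can always be converted into NumPy ndarrays (a data structure that we are about to introduce), and data read into JSON-style dictionaries can easily be turned into a DataFrame.

Here is a simple example that uses the csv package. Let's pretend that the datasets-uci-iris.csv file downloaded from http://mldata.org/ is a huge file that cannot be fully loaded into memory. (In reality, we are only making believe: as we saw at the beginning of this chapter, the file contains just 150 examples, and the CSV has no header row.) Our only choice is therefore to load it in chunks. Let's first run an experiment:

In: with open(iris_filename, 'rb') as data_stream:
        for n, row in enumerate(csv.DictReader(data_stream,
                fieldnames=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'target'],
                dialect='excel')):
            if n == 0:
                print n, row
            else:
                break
Out: 0 {'petal_length': '1.4', 'sepal_length': '5.1', 'petal_width': '0.2', 'sepal_width': '3.5', 'target': 'Iris-setosa'}

What does the preceding code do? First, it opens a read-binary connection to the file and aliases it as data_stream. The with statement ensures that the file is closed once the commands in the indented block have been executed. The code then iterates (for...in...) over the enumeration of a csv.DictReader call, which wraps the flow of data coming from data_stream. Since the file has no header row, fieldnames provides the names of the fields, and dialect specifies that the file is a standard comma-separated CSV.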
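The DictReader experiment above can be reproduced without the file on disk. This sketch assumes Python 3, where the csv module expects text rather than a binary stream (the 'rb' mode in the listing is a Python 2 convention); csv_text is a hypothetical two-row stand-in for datasets-uci-iris.csv.

```python
import csv
import io

# Hypothetical two-row stand-in for the header-less datasets-uci-iris.csv.
csv_text = "5.1,3.5,1.4,0.2,Iris-setosa\n4.9,3.0,1.4,0.2,Iris-setosa\n"
field_names = ["sepal_length", "sepal_width",
               "petal_length", "petal_width", "target"]

first_row = None
# In Python 3 the csv module reads text streams opened with newline="".
with io.StringIO(csv_text, newline="") as data_stream:
    reader = csv.DictReader(data_stream, fieldnames=field_names,
                            dialect="excel")
    for n, row in enumerate(reader):
        if n == 0:
            first_row = dict(row)  # keep only the first record
        else:
            break

print(first_row)
```

Every value arrives as a string, so numeric fields still have to be converted (for example, with float) before any computation.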
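Inside the loop of the csv.DictReader example, the row is printed if it is the first one read; otherwise, the loop is stopped by a break command. The print command shows row number 0 and a dictionary, so every piece of data in the row can be recalled just by using the key that bears the variable's name.

The same code can be made to work with the csv.reader function, as follows:

In: with open(iris_filename, 'rb') as data_stream:
        for n, row in enumerate(csv.reader(data_stream, dialect='excel')):
            if n == 0:
                print row
            else:
                break
Out: ['5.1', '3.5', '1.4', '0.2', 'Iris-setosa']

The code is now even more straightforward and the output is simpler: a list that contains the row values in sequence. Based on this second piece of code, we can create a generator that can be called from a for loop. It retrieves data from the file on the fly, in blocks whose size is defined by the batch parameter of the function:

In: def batch_read(filename, batch=5):
        # open the data stream
        with open(filename, 'rb') as data_stream:
            # reset the batch
            batch_output = list()
            # iterate over the file
            for n, row in enumerate(csv.reader(data_stream, dialect='excel')):
                # if the batch is of the right size
                if n > 0 and n % batch == 0:
                    # yield back the batch as an ndarray
                    yield(np.array(batch_output))
                    # reset the batch and restart
                    batch_output = list()
                # otherwise add the row to the batch
                batch_output.append(row)
            # when the loop is over, yield what's left
            yield(np.array(batch_output))

As in the previous example, the data is drawn out by the csv.reader function wrapped in the enumerate function, which pairs each extracted list of data with its example number (starting from zero). Depending on the example number, the data list is either appended to the batch list or the batch is returned to the main program by the yield statement. The process is repeated until the entire file has been read and returned in batches.

In: import numpy as np
    for batch_input in batch_read(iris_filename, batch=3):
        print batch_input
        break
Out: [['5.1' '3.5' '1.4' '0.2' 'Iris-setosa']
 ['4.9' '3.0' '1.4' '0.2' 'Iris-setosa']
 ['4.7' '3.2' '1.3' '0.2' 'Iris-setosa']]

Such a function provides the basic functionality needed for learning with stochastic gradient descent. We will come back to this piece of code later in the book and expand it with more advanced examples.

2.2.4 Accessing other data formats

So far, we have worked only with CSV files. The pandas library offers similar functionality (and functions) for loading MS Excel, HDFS, SQL, JSON, HTML, and Stata datasets. Since these formats are not used in every data science project, learning how to load and handle each of them is left to you; refer to the detailed documentation available on the pandas website. A basic example of how to load an SQL table is available in the code that accompanies the book.

Finally, a pandas DataFrame can be created by merging Series or other list-like data. Note that scalars are transformed into lists, as follows:

In: import pandas as pd
In: my_own_dataset = pd.DataFrame({'Col1': range(5), 'Col2': [1.0]*5, 'Col3': 1.0, 'Col4': 'Hello World!'})
    my_own_dataset
Out:
   Col1  Col2  Col3          Col4
0     0     1     1  Hello World!
1     1     1     1  Hello World!
2     2     1     1  Hello World!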
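3     3     1     1  Hello World!
4     4     1     1  Hello World!

Put simply, for each of the columns that you want to stack together, you provide its name (as the dictionary key) and its values (as the dictionary value for that key). As the preceding example shows, Col2 and Col3 are created in two different ways but produce the same column of values. In this way, a very simple function call creates a pandas DataFrame that contains several types of data.

In this process, make sure that you do not mix lists of different sizes; otherwise an exception is raised, as shown here:

In: my_wrong_own_dataset = pd.DataFrame({'Col1': range(5), 'Col2': 'string', 'Col3': range(2)})
...
ValueError: arrays must all be same length

To check the type of the data in each column, inspect the dtypes attribute:

In: my_own_dataset.dtypes
Col1      int64
Col2    float64
Col3    float64
Col4     object
dtype: object

This attribute is very handy if you want to see whether a piece of data is categorical, integer, or floating point, and what its precision is. Sometimes processing can be sped up by rounding floats to integers, by casting double-precision floats to single-precision floats, or by using only a single type of data. The following example shows how to cast the type; it also serves as a general example of how to reassign the data of a column:

In: my_own_dataset['Col1'] = my_own_dataset['Col1'].astype(float)
    my_own_dataset.dtypes
Out:
Col1    float64
Col2    float64
Col3    float64
Col4     object
dtype: object

2.2.5 Data preprocessing

We are now able to import a dataset, even a big and problematic one. Next, we need to learn the basic preprocessing routines that make it suitable for the following data science step.

First, if you need to apply a function to a limited selection of rows, you can create a mask. A mask is a series of Boolean values (that is, True or False) that tells whether each row is selected or not. For example, suppose we want to select all the rows of the Iris dataset that have a sepal length greater than 6:

In: mask_feature = iris['sepal_length'] > 6.0
In: mask_feature
0    False
...

In this simple example, we can see immediately which observations are True and which are False, that is, which ones satisfy the selection query.

Now let's use a selection mask in another example. We want to replace the Iris-virginica target label with the label New label. This takes the following two lines of code:

In: mask_target = iris['target'] == 'Iris-virginica'
In: iris.loc[mask_target, 'target'] = 'New label'

All the occurrences of Iris-virginica are now replaced by New label. The loc() method is explained later; for now, think of it as a way to access the data of the matrix through row and column indexes.

To see the new list of labels in the target column, we can use the unique() method. This method is very handy for an initial evaluation of a dataset:

In: iris['target'].unique()
Out: array(['Iris-setosa', 'Iris-versicolor', 'New label'], dtype=object)

To see some statistics about each feature, you can group the rows accordingly (a mask can be applied as well):

In: grouped_targets_mean = iris.groupby(['target']).mean()
    grouped_targets_mean
Out:
                 sepal_length  sepal_width  petal_length  petal_width
target
Iris-setosa             5.006        3.418         1.464        0.244
Iris-versicolor         5.936        2.770         4.260        1.326
New label               6.588        2.974         5.552        2.026

In: grouped_targets_var = iris.groupby(['target']).var()
    grouped_targets_var
Out:
                 sepal_length  sepal_width  petal_length  petal_width
target
Iris-setosa          0.124249     0.145180      0.030106     0.011494
Iris-versicolor      0.266433     0.098469      0.220816     0.039106
New label            0.404343     0.104004      0.304588     0.075433

Later, if you need to sort the observations by the values of a feature, you can use the sort() method, as follows:

In: iris.sort_index(by='sepal_length').head()
Out:
    sepal_length  sepal_width  petal_length  petal_width       target
13           4.3          3.0           1.1          0.1  Iris-setosa
42           4.4          3.2           1.3          0.2  Iris-setosa
38           4.4          3.0           1.3          0.2  Iris-setosa
...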
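The masking, relabeling, grouping, and sorting steps of this section can be combined in one short sketch. It assumes Python 3 and a recent pandas, where sorting by column values is done with sort_values; the four-row iris_sample frame and its numbers are hypothetical and serve only to keep the results easy to check by hand.

```python
import pandas as pd

# Hypothetical four-row sample with one feature and the target label.
iris_sample = pd.DataFrame({
    "sepal_length": [5.0, 7.0, 6.5, 5.5],
    "target": ["Iris-setosa", "Iris-virginica",
               "Iris-virginica", "Iris-setosa"],
})

# A Boolean mask marks the rows whose sepal length exceeds 6.
long_sepal_mask = iris_sample["sepal_length"] > 6.0
long_sepal_rows = iris_sample[long_sepal_mask]

# A second mask drives an in-place relabeling through loc.
virginica_mask = iris_sample["target"] == "Iris-virginica"
iris_sample.loc[virginica_mask, "target"] = "New label"

# Per-label mean, then the observations ordered by sepal length.
mean_by_target = iris_sample.groupby("target")["sepal_length"].mean()
sorted_sample = iris_sample.sort_values(by="sepal_length")

print(long_sepal_rows)
print(mean_by_target)
print(sorted_sample)
```

Indexing the frame with the mask returns only the selected rows, while loc with the same kind of mask modifies the original frame directly.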
