Data Mining and Data Warehousing

Data mining is a process that uses various techniques to discover patterns or knowledge in data, using classification, machine learning algorithms and other methods. It means extracting or mining knowledge from large amounts of data, e.g. fraud detection, stock prediction etc.

Q) What is not data mining?
1) Searching for a keyword on Google.
2) Searching for a phone number.
3) Running an SQL query on a database.

Q) What is data mining?
1) Finding people with similar behaviour, or grouping together documents returned by a search engine according to their context.
2) Finding that the chances of getting cancer are higher for persons living near a power line.

Data mining as a step in the process of knowledge discovery from data (KDD):
1) Data cleaning: It is done to remove noise and inconsistent data.
2) Data integration: Here multiple data sources like flat files, databases, data cubes etc. are combined.
3) Data selection: Here data relevant to the analysis task are retrieved from the database.
4) Data transformation: Here we consolidate data into forms appropriate for mining by performing summary or aggregation operations.
5) Data mining: It is the essential process where intelligent methods are applied in order to extract data patterns.
6) Pattern evaluation: To identify the truly interesting patterns representing knowledge, based on some interestingness measures.
7) Knowledge presentation: Here visualisation and knowledge representation techniques are used to present the mined knowledge to the user.

Traditional techniques are unsuitable due to:
- Enormity of data
- High dimensionality of data
- Heterogeneous, distributed nature of data

Forms of preprocessing include data cleaning and data transformation, carried out using different data handling techniques.

I] Data cleaning:
- Real-world data tend to be incomplete, noisy and inconsistent.
- Data cleaning attempts to fill in missing values and smooth out noise while identifying outliers and correcting inconsistencies in the data.
Basic methods for data cleaning are categorised according to:
A) Noisy data
B) Missing values
C) Inconsistent data

A) Noisy data:
Noise is a random error or variance in a measured variable. Given a numeric attribute such as price, we can smooth out the data to remove the noise, as explained below.

a) Binning:
Binning methods smooth a sorted data value by consulting its neighbourhood, i.e. the values around it. The sorted values are distributed into a number of buckets or bins. Because each value is smoothed using only its own bin, binning performs local smoothing, as shown in the example below.

E.g. Perform smoothing by binning on the sorted data given below:
4, 8, 15, 21, 21, 24, 25, 28, 34

Step 1: Partition into equi-depth bins
Bin 1: 4, 8, 15
Bin 2: 21, 21, 24
Bin 3: 25, 28, 34

Step 2: Smoothing by bin means
Bin 1: 9, 9, 9
Bin 2: 22, 22, 22
Bin 3: 29, 29, 29

Step 3: Smoothing by bin boundaries
Bin 1: 4, 4, 15
Bin 2: 21, 21, 24
Bin 3: 25, 25, 34

Step 4: Smoothing by bin median
Bin 1: 8, 8, 8
Bin 2: 21, 21, 21
Bin 3: 28, 28, 28

Problem: Perform data smoothing using binning on the following sorted data, with three bins of equal depth; smooth by bin means and bin boundaries.
Price (in dollars): 4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34

Step 1: Partition into equi-depth bins
Bin 1: 4, 8, 9, 15
Bin 2: 21, 21, 24, 25
Bin 3: 26, 28, 29, 34

Step 2: Smoothing by bin means (rounded)
Bin 1: 9, 9, 9, 9
Bin 2: 23, 23, 23, 23
Bin 3: 29, 29, 29, 29

Step 3: Smoothing by bin boundaries
Bin 1: 4, 4, 4, 15
Bin 2: 21, 21, 25, 25
Bin 3: 26, 26, 26, 34

Type 2 problem: Perform equal-width binning on the following sorted data with 3 bins.
Price (in Rs): 5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215

a) Equal-width binning: the bins have equal width $w$, where
$w = \dfrac{\max - \min}{\text{no. of bins}} = \dfrac{215 - 5}{3} = 70$

Step 1: Find the range of each bin
Bin 1: upper limit $\min + w = 5 + 70 = 75$, range [5, 75]
Bin 2: upper limit $\min + 2w = 5 + 140 = 145$, range [76, 145]
Bin 3: upper limit $\min + 3w = 5 + 210 = 215$, range [146, 215]

Step 2: Distribute the values into the bins
Bin 1: 5, 10, 11, 13, 15, 35, 50, 55, 72
Bin 2: 92
Bin 3: 204, 215

Smoothing by bin mean:
Bin 1: 266/9 ≈ 29.6, rounded to 30, so every value in the bin becomes 30
Bin 2: 92
Bin 3: (204 + 215)/2 = 209.5

B) Missing values:
To fill in missing values there are different methods. Some of them are as follows:
1) Ignore the tuple.
2) Fill in the missing value manually.
3) Use the attribute mean.
4) Use a global constant to fill in the missing value.
5) Use the attribute mean for samples belonging to the same class.

C) Inconsistent data:
- There may be inconsistency in the data recorded for some transactions, which can be corrected manually using external references.
- This may be coupled with routines designed to help correct the inconsistent use of codes.
- Knowledge engineering tools may be used to detect violations of the known data constraints.
- There may also be inconsistency due to data integration, where a given attribute can have different names in different databases.

II] Data integration:
It combines data from multiple databases, data cubes and files.

Issues during data integration:
- Redundancy: An attribute may be redundant if it can be derived from another table, e.g. annual revenue can be derived from other data. Such redundancies are detected by correlation analysis.
- Schema integration can be tricky: How can equivalent real-world entities from multiple data sources be matched up? This is referred to as the entity identification problem, e.g. how can the data analyst or the computer be sure that customer_id in one database and a customer number in another database refer to the same entity? Metadata can be used to help avoid errors in schema integration.
- Detection and resolution of data value conflicts: For the same real-world entity, attribute values from different sources may differ because of differences in representation, scaling or encoding.

Handling all the above issues during data integration can improve the accuracy and speed of the data mining process.

Problems on correlation coefficient:
1) For the given values of attributes X and Y, calculate the correlation coefficient (Pearson's product
moment coefficient) and also find whether the two variables are positively or negatively correlated.

Consider the sample data set of (X, Y) values given in the table.

Step 1: Create a table and find the mean of X and of Y, the deviations from the means and their products.
$M_X = \dfrac{\sum x}{N}$, $M_Y = \dfrac{\sum y}{N}$

Step 2: Calculate the standard deviations $\sigma_X$ and $\sigma_Y$.

Step 3: Calculate the correlation coefficient:
$\mathrm{corr}(X, Y) = \dfrac{\sum (x - M_X)(y - M_Y)}{(n - 1)\,\sigma_X\,\sigma_Y}$

Since corr(X, Y) ≈ 0.89, which is positive, the variables X and Y are positively correlated.

2) For the following data, find whether the attributes A and B are correlated. The table has the columns A, B, $A - M_A$, $B - M_B$ and $(A - M_A)(B - M_B)$; the means and the standard deviations $\sigma_A$ and $\sigma_B$ are computed first and then corr(A, B) is found as above.

III] Data transformation:
Here data are transformed or consolidated into forms appropriate for mining. It involves the following:
A) Smoothing: Here noise is removed from the data, using techniques like binning, regression etc.
B) Aggregation: Here summary or aggregation operations are applied to the data, e.g. daily sales data may be aggregated to compute monthly totals. This step is typically used in constructing a data cube.
C) Generalisation: Here low-level (primitive) data are replaced by higher-level concepts, e.g. a street attribute can be generalised to higher-level concepts like city or country.
D) Attribute construction or feature construction: Here new attributes are constructed from the given set of attributes and added to help the mining process.
E) Normalisation: Here attribute data are scaled so that they fall within a small specified range. The types are:
1) Min-max normalisation
2) Z-score normalisation
3) Normalisation by decimal scaling

* Problems on min-max normalisation
$v' = \dfrac{v - \min_A}{\max_A - \min_A}\,(\text{new\_max}_A - \text{new\_min}_A) + \text{new\_min}_A$

1) Suppose that the minimum and maximum values for the attribute income are Rs 12000 and Rs 98000, and we would like to map income to the range [0.0, 1.0]. By min-max normalisation, a value of Rs 73600 for income is transformed to what?
$v' = \dfrac{73600 - 12000}{98000 - 12000}\,(1 - 0) + 0 = 0.716$
Rs 73600 is transformed to 0.716.

2) Use min-max normalisation to normalise the data 200, 300, 400, 600, 1000 onto the range [0, 1].
For 300: $v' = \dfrac{300 - 200}{1000 - 200}\,(1 - 0) + 0 = 0.125$

v  : 200   300    400   600   1000
v' : 0     0.125  0.25  0.5   1

* Problems on Z-score normalisation
$v' = \dfrac{v - \bar{A}}{\sigma_A}$, where the mean is $\bar{A} = \dfrac{\sum x_i}{n}$, the standard deviation is $\sigma_A = \sqrt{\dfrac{\sum (x_i - \bar{A})^2}{n}}$ and the variance is $\sigma_A^2$.

1) Use the following methods to normalise the group of data 200, 300, 400, 600, 1000:
a) Z-score normalisation
b) Min-max normalisation

Step 1: Mean
$\bar{A} = \dfrac{200 + 300 + 400 + 600 + 1000}{5} = 500$

Step 2: Standard deviation
$\sigma_A = \sqrt{\dfrac{(200-500)^2 + (300-500)^2 + (400-500)^2 + (600-500)^2 + (1000-500)^2}{5}} = 282.84$

Step 3: Z-scores
$v'_1 = \dfrac{200 - 500}{282.84} = -1.06$, and similarly for the other values:

v  : 200    300     400     600    1000
v' : -1.06  -0.707  -0.354  0.354  1.77

* Normalisation by decimal scaling
Here normalisation is done by moving the decimal point of the values of attribute A. The number of decimal places moved depends on the maximum absolute value of A.
$v' = \dfrac{v}{10^j}$, where $j$ is the smallest integer such that $\max(|v'|) < 1$.

E.g. for v = 35, j = 2, so $v' = \dfrac{35}{10^2} = 0.35$.

Problem: Suppose the recorded values of A range from -986 to 917. Normalise them by decimal scaling.
The maximum absolute value is 986, so j = 3. Then -986 normalises to -0.986 and 917 normalises to 0.917.
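The three normalisation methods worked through above can be checked with a short script. This is a minimal sketch, not a reference implementation: the function names are made up for illustration, the z-score uses the population standard deviation (division by n) as in the worked example, and the values -986 and 917 are used only as sample input for decimal scaling.

```python
import math


def min_max(values, new_min=0.0, new_max=1.0):
    """Min-max normalisation of a list onto [new_min, new_max]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def z_score(values):
    """Z-score normalisation using the population standard deviation (divide by n)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


def decimal_scaling(values):
    """Divide by 10^j, where j is the smallest integer giving max(|v'|) < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


data = [200, 300, 400, 600, 1000]
print(min_max(data))                          # [0.0, 0.125, 0.25, 0.5, 1.0]
print([round(z, 2) for z in z_score(data)])   # [-1.06, -0.71, -0.35, 0.35, 1.77]
print(decimal_scaling([-986, 917]))           # [-0.986, 0.917]
```

Note that `min_max` divides by `max - min`, so it assumes the attribute takes at least two distinct values.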
Problem: Suppose that the data for analysis include the attribute age. The age values, given in increasing order, are:
13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 30, 33, 33, 35, 35, 35, 36, 40, 45, 46, 52, 70
a) Smooth the data by bin means and by bin medians, using bins of depth 3.
b) Use min-max normalisation to transform the value 40 onto the range [0.0, 1.0].

Step 1: Partition into bins of depth 3
Bin 1: 13, 15, 16
Bin 2: 16, 19, 20
Bin 3: 20, 21, 22
Bin 4: 22, 25, 25
Bin 5: 30, 33, 33
Bin 6: 35, 35, 35
Bin 7: 36, 40, 45
Bin 8: 46, 52, 70

Step 2: Smoothing by bin means
Bin 1: 14.67, 14.67, 14.67
Bin 2: 18.33, 18.33, 18.33
Bin 3: 21, 21, 21
Bin 4: 24, 24, 24
Bin 5: 32, 32, 32
Bin 6: 35, 35, 35
Bin 7: 40.33, 40.33, 40.33
Bin 8: 56, 56, 56

Step 3: Smoothing by bin median
Bin 1: 15, 15, 15
Bin 2: 19, 19, 19
Bin 3: 21, 21, 21
Bin 4: 25, 25, 25
Bin 5: 33, 33, 33
Bin 6: 35, 35, 35
Bin 7: 40, 40, 40
Bin 8: 52, 52, 52

Min-max normalisation of the value 40:
$v' = \dfrac{v - \min_A}{\max_A - \min_A}\,(\text{new\_max}_A - \text{new\_min}_A) + \text{new\_min}_A = \dfrac{40 - 13}{70 - 13}\,(1 - 0) + 0 = 0.47$

Data Reduction:
These are the techniques that can be applied to obtain a reduced representation of a data set, i.e. one that is smaller in volume yet closely maintains the integrity of the original data. Mining on the reduced data is therefore more efficient while producing the same analytical results.

Strategies for data reduction are:
1) Data cube aggregation
2) Dimension reduction
3) Data compression
4) Numerosity reduction
5) Discretization and concept hierarchy generation

1] Data cube aggregation:
- It is a process in which information is gathered and expressed in a summary form for purposes such as statistical analysis, as in OLAP (online analytical processing).
- E.g. suppose the data consist of sales per quarter, but we are interested in the annual sales (total per year) rather than the total per quarter. The data can be aggregated so that the resulting data summarise the total sales per year instead of per quarter. The resulting data set is smaller in volume, without loss of the information necessary for the analysis.
- Data cubes provide fast access to precomputed, summarised data.
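The quarter-to-year aggregation described above can be sketched in a few lines. The sales figures, the years and the function name `roll_up_to_year` are hypothetical and serve only to illustrate the idea; they do not come from the notes.

```python
from collections import defaultdict

# Hypothetical quarterly sales records: (year, quarter, amount).
quarterly_sales = [
    (2021, "Q1", 224), (2021, "Q2", 408), (2021, "Q3", 350), (2021, "Q4", 586),
    (2022, "Q1", 310), (2022, "Q2", 390), (2022, "Q3", 420), (2022, "Q4", 610),
]


def roll_up_to_year(records):
    """Aggregate per-quarter amounts into one total per year."""
    totals = defaultdict(int)
    for year, _quarter, amount in records:
        totals[year] += amount
    return dict(totals)


annual_sales = roll_up_to_year(quarterly_sales)
print(annual_sales)  # {2021: 1568, 2022: 1730}
```

Eight quarterly rows shrink to two annual rows, which is the reduction in volume the section refers to; the yearly totals needed for the analysis are preserved.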
2] Dimension reduction:
- It reduces the data set size by removing irrelevant attributes or dimensions from the data.
- Methods of attribute subset selection are applied to find a minimum set of attributes such that the resulting probability distribution of the data classes is as close as possible to the original distribution obtained using all attributes.

There are 3 standard approaches:
1) Embedded approach: Feature selection occurs naturally as part of the data mining algorithm. During its operation, the algorithm itself decides which attributes to use and which to ignore.
2) Filter approach: Here features are selected before the data mining algorithm is run, using some approach that is independent of the data mining task.
3) Wrapper approach: These methods use the target data mining algorithm as a black box to find the best subset of attributes, typically without enumerating all possible subsets.

The best and worst attributes are typically determined using tests of statistical significance, which assume that the attributes are independent of one another.

Basic heuristic methods of attribute subset selection are:
a) Stepwise forward selection: Here the procedure starts with an empty set of attributes. The best of the original attributes is determined and added to the set. At each subsequent step, the best of the remaining attributes is added to the set.
b) Stepwise backward elimination: The procedure starts with the full set of attributes. At each step it removes the worst attribute remaining in the set.

3] Data compression:
- Here data encoding or transformations are applied to obtain a reduced or compressed representation of the original data.
- If the original data can be reconstructed from the compressed data without any loss of information, the compression technique is called lossless.
- If we can reconstruct only an approximation of the original data, the technique is called lossy, e.g. principal component analysis (PCA).

4] Numerosity reduction:
Such techniques are applied to reduce the data volume by choosing alternative, smaller forms of data representation. These techniques may be parametric or non-parametric.
- For parametric methods, a model is used to estimate the data, so that only the model parameters need to be stored instead of the actual data.
- Non-parametric methods store reduced representations of the data; examples include histograms.

5] Discretization, concept hierarchy generation and binarization:
- Discretization techniques can be used to reduce the number of values for a given continuous attribute by dividing the range of the attribute into intervals. Interval labels can then be used to replace the actual data values, reducing the number of values for the attribute.
- Many of these methods are recursive, and a large amount of time is spent on sorting the data at each step. They also provide a hierarchical or multiresolution partitioning of the attribute values.
- A concept hierarchy for a given numeric attribute defines a discretization of the attribute. It can be used to reduce the data by collecting low-level concepts and replacing them with higher-level concepts.

1) Qualitative attributes: They describe a feature of an object without giving an actual size or quantity. They are typically used to represent categories, as opposed to measurable quantities.
a) Nominal (label): Each value represents some kind of category, code or state; the values are categorical, e.g. hair colour, occupation, marital status.
b) Ordinal: The values have a meaningful order (ranking) among them, but the magnitude between successive values is not known, e.g. grade, size.
c) Binary: Nominal attributes with only two categories or states.
   i) Symmetric: Both outcomes are equally important.
   ii) Asymmetric
(Boolean attribute): The outcomes are not equally important.

Discrete: An attribute that has only a finite or countably infinite set of values. It is sometimes represented by integer values. Binary attributes are a special case of discrete attributes.

2) Numeric attributes: A measurable quantity, represented in integer or real values.
a) Interval-scaled
b) Ratio-scaled: Has an inherent zero point, so we can speak of a value as being a multiple of another value.

Continuous: e.g. height, weight.

Data Quality:
* Measurement and data collection problems:
1) Outliers
2) Missing values
3) Inconsistent data/values
4) Duplicate data

* Problems involved in measurement error:
1) Noise: The random component of a measurement error. It involves the distortion of a value, as shown below.
[Figure: a time series, and the same time series with noise]
2) Precision: It is the closeness of repeated measurements to each other. It is often measured by the standard deviation of a set of values and is also referred to as repeatability: a precise measurement gives consistent results when repeated.
3) Accuracy: It is the closeness of measurements to the true value.
4) Bias: It is measured by taking the difference between the mean of the set of values and the known value of the quantity being measured. Common data biases include selection bias and overfitting/underfitting.

Measures of data quality are accuracy, completeness, consistency, timeliness, believability and interpretability.

Unit 2: Summary Statistics

Summary statistics are quantities, such as the mean and the standard deviation, that capture various characteristics of a set of values.
E.g. the average household income, or the fraction of college students who complete an undergraduate degree in 4 years.

I] Measure of central tendency:
It is a value that represents a typical or central entry of the data set. The most common measures of central tendency are:

1) Mean (average): It is the sum of all data entries divided by the number of entries.
Population mean: $\mu = \dfrac{\sum x}{N}$
Sample mean: $\bar{x} = \dfrac{\sum x}{n}$

2) Median: The value that lies in the middle of the data when the data set is ordered. If the data set has an odd number of entries, the median is the middle entry. If it has an even number of entries, the median is obtained by adding the two middle entries and dividing by 2.

3) Mode: The data entry that occurs with the greatest frequency. A data set may have one mode, more than one mode or no mode. If no entry is repeated, the data set has no mode.

Outliers are often the greatest or least values of a data set, but more precisely they are values that are very different from the pattern established by the rest of the data, and they affect the mean. When outliers are present, it is best to use the median as the measure of central tendency.

II] Measure of variation:
1) Range: The difference between the maximum and minimum data entries in the set.
2) Standard deviation: It measures the variability and consistency of the sample or population. In most real-world applications, consistency is a great advantage in statistical analysis.
$\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$
3) Variance: It is preferred as a measure of spread. It measures the dispersion of a set of data points around their mean. It is given as
$\sigma^2 = \dfrac{\sum (x - \mu)^2}{N}$ (population) and $s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$ (sample).

Problem 1: Find the mean, median and mode of the given sample data, and identify any outlier.

Problem 2: Find the absolute frequency and the relative frequency of each value in the given data set. The relative frequency of a value is its absolute frequency divided by the total number of entries.
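The measures defined above can be computed with Python's standard `statistics` module and `collections.Counter`. This is a small sketch on a hypothetical sample: the data values are invented for illustration (60 plays the role of an outlier) and are not the data of the two problems.

```python
import statistics
from collections import Counter

# Hypothetical sample; 60 is far from the rest of the data (an outlier).
sample = [4, 5, 5, 6, 7, 8, 60]

mean = statistics.mean(sample)             # pulled upwards by the outlier
median = statistics.median(sample)         # middle value of the sorted data
modes = statistics.multimode(sample)       # list of the most frequent value(s)
value_range = max(sample) - min(sample)    # maximum minus minimum

pop_variance = statistics.pvariance(sample)    # divides by N
pop_std = statistics.pstdev(sample)            # square root of the population variance
sample_variance = statistics.variance(sample)  # divides by n - 1

# Absolute and relative frequency of each distinct value.
absolute = Counter(sample)
relative = {value: count / len(sample) for value, count in sorted(absolute.items())}

print(median, modes, value_range)  # 6 [5] 56
print(round(mean, 2))              # 13.57
```

The mean (about 13.57) lies above every entry except the outlier, while the median stays at 6, which is why the median is preferred when outliers are present.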
a) Percentile:
The p-th percentile of a data set is the value such that p percent of the values in the data set are less than or equal to it.

How to calculate a percentile:
Step 1: Sort the data in ascending order.
Step 2: Calculate $i = \dfrac{p}{100}\,n$, where p is the particular percentile to be calculated and n is the number of data values.
Step 3: If i is an integer, the p-th percentile is the mean of the data values in positions i and i + 1. If i is not an integer, round i up to the next integer; the p-th percentile is the value at that position.

Problem: Use the given sorted set of stock prices (in rupees) and find the required percentiles with the steps above.

4] Measure of similarity and dissimilarity:
- Similarity and dissimilarity are important because they are used by a number of data mining algorithms, such as clustering, classification and anomaly detection.
- In many cases the initial data set is not needed once the similarities or dissimilarities have been computed.
- The term proximity is used to refer to either similarity or dissimilarity.
- We first measure proximity between objects having only one simple attribute and then consider proximity measures for objects with multiple attributes. There we use measures like correlation and Euclidean distance, which are useful for dense data such as time series or 2D points.

1) Simple matching coefficient:
$\mathrm{SMC} = \dfrac{\text{no. of matching attribute values}}{\text{no. of attributes}} = \dfrac{f_{11} + f_{00}}{f_{01} + f_{10} + f_{11} + f_{00}}$
where
$f_{00}$ = no. of attributes where x is 0 and y is 0
$f_{01}$ = no. of attributes where x is 0 and y is 1
$f_{10}$ = no. of attributes where x is 1 and y is 0
$f_{11}$ = no. of attributes where x is 1 and y is 1

2) Jaccard coefficient:
$J = \dfrac{\text{no. of 11 matches}}{\text{no. of attributes not involved in 00 matches}} = \dfrac{f_{11}}{f_{01} + f_{10} + f_{11}}$

Problem: For the given binary vectors X and Y, compute SMC and J.

3) Cosine
4) Covariance

Problem 1: For the table that describes economic growth (X) and rate of return (Y), use the covariance formula to determine whether economic growth and rate of return have a positive or a negative relationship.

Step 1: Create a table of the deviations from the means and their products.
x     y     x - x̄     y - ȳ    (x - x̄)(y - ȳ)
2.1   8     -0.95     -3       2.85
2.5   12    -0.55     1        -0.55
4.0   14    0.95      3        2.85
3.6   10    0.55      -1       -0.55

$\bar{x} = 3.05$, $\bar{y} = 11$, $\sum (x - \bar{x})(y - \bar{y}) = 4.6$

Step 2: Covariance
$\mathrm{Cov}(x, y) = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{n - 1} = \dfrac{4.6}{3} = 1.53$
The covariance is positive, so economic growth and rate of return move in the same direction.

Step 3: Standard deviations
$\sigma_x = 0.896$, $\sigma_y = 2.58$

Step 4: Correlation coefficient
$r_{xy} = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x\,\sigma_y} = \dfrac{1.53}{0.896 \times 2.58} = 0.66$
Since the correlation coefficient is a positive value, economic growth and rate of return are positively correlated.

Problem 2: For the given data pairs of attributes A and B, find the covariance and the correlation coefficient. Create a table with the columns A, B, $A_i - \bar{A}$, $B_i - \bar{B}$ and $(A_i - \bar{A})(B_i - \bar{B})$, compute $\sigma_A$ and $\sigma_B$, and then
$r_{AB} = \dfrac{\mathrm{Cov}(A, B)}{\sigma_A\,\sigma_B}$

Data Cuboid:
- In data warehousing, data cubes are n-dimensional. The cuboids of a cube form a lattice.
- The base cuboid contains all the dimensions. It can return the total value for any combination of the n dimensions.
- The apex cuboid, or 0-dimensional cuboid, refers to the case where the group-by is empty. It is the most generalised of the cuboids and is often denoted as ALL.
- If we start at the apex cuboid and explore downward in the lattice, this is equivalent to drilling down within the data cube.
- If we start at the base cuboid and explore upward, this is akin to rolling up.
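The lattice of cuboids can be enumerated mechanically: every subset of the dimensions is one group-by, so n dimensions give 2^n cuboids. The sketch below assumes three illustrative dimension names (city, item, year); the function name `cuboid_lattice` is made up for this example.

```python
from itertools import combinations


def cuboid_lattice(dimensions):
    """Group all cuboids by level: level k holds every k-dimensional group-by."""
    return {
        k: list(combinations(dimensions, k))
        for k in range(len(dimensions) + 1)
    }


# Illustrative dimension names, assumed for this example.
lattice = cuboid_lattice(["city", "item", "year"])
for level, cuboids in lattice.items():
    print(f"{level}-D:", cuboids)

total = sum(len(cuboids) for cuboids in lattice.values())
print(total)  # 8
```

Level 0 holds the single empty group-by (the apex cuboid, ALL) and the highest level holds the single group-by over all dimensions (the base cuboid). Reading the levels from 0 upward corresponds to drilling down; reading them in reverse corresponds to rolling up.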
What are dimensions?

1] 0-dimensional:
A point is a 0D object. It has no length, no width and no height; it has no size and tells us only about a location.

2] 1-dimensional:
A line segment is a 1D object. It has only one dimension, its length.

3] 2-dimensional:
2D shapes or objects are flat plane figures that have 2 dimensions, length and width. Square, rectangle, circle and triangle are examples of 2D shapes.

4] 3-dimensional:
3D shapes are solid and have 3 dimensions: length, width and height. Cube and cuboid are examples of 3D shapes.

Problem 1: Consider a 3D data cube containing the dimensions city, item and year. Draw the lattice of cuboids for this cube.
Since we have 3 dimensions, the total no. of cuboids is $2^3 = 8$.
0-D (apex) cuboid: all
1-D cuboids: (city), (item), (year)
2-D cuboids: (city, item), (city, year), (item, year)
3-D (base) cuboid: (city, item, year)

Problem 2: Consider a data cube with the dimensions time, location, supplier and item. Find the total no. of cuboids and draw the lattice.
With 4 dimensions the total no. of cuboids is $2^4 = 16$; the base cuboid is (time, item, location, supplier).

Unit 3: Classification

- Classification is a form of data analysis that extracts models (classifiers) describing important data classes.
- It classifies data into 2 or more categories based on class labels.
- Data classification includes 2 steps:
1) Learning phase
2) Classification phase

[Figure: classification process. Model construction: training data → classification algorithm → model. The model is then used to assign class labels to unknown values.]

* The Euclidean distance measure is generalised by the Minkowski distance measure:
$d(x, y) = \left(\sum_{k=1}^{n} |x_k - y_k|^{\lambda}\right)^{1/\lambda}$
The following are the 3 common examples of Minkowski distance:

1) Manhattan or city block distance ($L_1$ norm, where $\lambda = 1$):
$d(x, y) = \sum_{k=1}^{n} |x_k - y_k|$

2) Euclidean distance ($L_2$ norm, where $\lambda = 2$):
$d(x, y) = \left(\sum_{k=1}^{n} |x_k - y_k|^2\right)^{1/2}$

3) Supremum distance ($L_{\max}$ or $L_\infty$ norm, Chebyshev distance, where $\lambda \to \infty$):
$d(x, y) = \max_{k} |x_k - y_k|$

Problem:
Two objects x and y are represented by the attribute values x = (1, 6, 2, 5, 3) and y = (3, 5, 2, 6, 6).
i) Compute the Manhattan distance between x and y.
ii) Compute the Euclidean distance between x and y.
iii) Compute the supremum distance between x and y.
iv) Compute the Minkowski distance between x and y.

i) Manhattan distance:
$d(x, y) = \sum |x_k - y_k| = |1-3| + |6-5| + |2-2| + |5-6| + |3-6| = 2 + 1 + 0 + 1 + 3 = 7$

ii) Euclidean distance:
$d(x, y) = \sqrt{(1-3)^2 + (6-5)^2 + (2-2)^2 + (5-6)^2 + (3-6)^2} = \sqrt{4 + 1 + 0 + 1 + 9} = \sqrt{15} = 3.87$

iii) Supremum distance:
$|1-3| = 2$, $|6-5| = 1$, $|2-2| = 0$, $|5-6| = 1$, $|3-6| = 3$
The supremum distance is the largest of these differences, i.e. 3.

iv) Minkowski distance:
$d(x, y) = \left(\sum |x_k - y_k|^{\lambda}\right)^{1/\lambda}$, evaluated for the required value of $\lambda$.

OLAP operations:
[Figure: data cube with a location dimension (e.g. Chicago) and an item type dimension (e.g. mobile, laptop), used to illustrate OLAP operations such as roll-up]
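The distance calculations in the problem on x and y can be verified with a few lines of code. This is a minimal sketch: the function names and the parameter name `order` (the Minkowski exponent) are chosen for illustration, and the value `order=3` in the last line is an assumed example, not a value taken from the problem.

```python
def minkowski(x, y, order):
    """Minkowski distance: (sum of |x_k - y_k|^order)^(1/order)."""
    return sum(abs(a - b) ** order for a, b in zip(x, y)) ** (1 / order)


def supremum(x, y):
    """Supremum (Chebyshev) distance: the largest coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))


x = (1, 6, 2, 5, 3)
y = (3, 5, 2, 6, 6)

print(minkowski(x, y, 1))            # Manhattan: 7.0
print(round(minkowski(x, y, 2), 2))  # Euclidean: 3.87
print(supremum(x, y))                # Supremum: 3
print(round(minkowski(x, y, 3), 2))  # assumed order 3: 3.33
```

As the order grows, the largest coordinate difference dominates the sum, so the Minkowski distance approaches the supremum distance of 3.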
