New Horizon Institute of Technology & Management

Q1. What is the relationship between Data Warehousing and Data Replication? Which form of replication is better suited for data warehousing?

- Data warehouses are carefully designed databases that hold integrated data from other systems for secondary use. The volume of data is very large and they are complex.
- If you only need data from one system, but can't impact the performance of that system, then it is suggested to take a copy as a replicated data store; a data warehouse would be overkill.
- A replicated data store is a database that holds schemas from other systems, but doesn't truly integrate the data. This means it is typically in a format similar to what the source system had.
- The value of a replicated data store is that it provides a single source for resources to go to in order to access data from any system without negatively impacting the performance of that system.
- Data replication is simply a method for creating copies of data in a distributed environment.
- Replication technology can be used to capture changes to source data.
- Replication can be synchronous or asynchronous.

Synchronous Replication:
- Synchronous replication is used for creating replicas in real time.
- In synchronous replication, data is written to the primary storage and the replica, and the two should always remain in sync.

Asynchronous Replication:
- It is used for creating time-delayed replicas.
- In asynchronous replication, data is written to the primary storage first and then copied to the replica.

- Synchronous replication is best suited for a data warehouse, as it creates replicas in real time.
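The difference between the two replication modes can be sketched in a few lines of Python. This is only a minimal illustration, assuming in-memory dictionaries stand in for the primary storage and the replica; the class names `SyncReplicator` and `AsyncReplicator` are hypothetical and do not come from any replication product.

```python
from collections import deque


class SyncReplicator:
    """Writes every change to the primary and the replica in the same call."""

    def __init__(self):
        self.primary = {}
        self.replica = {}

    def write(self, key, value):
        self.primary[key] = value
        self.replica[key] = value  # replica is updated before write() returns


class AsyncReplicator:
    """Writes to the primary first; the replica catches up later."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = deque()  # changes not yet copied to the replica

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))

    def flush(self):
        """Copy all pending changes to the replica (the delayed step)."""
        while self.pending:
            key, value = self.pending.popleft()
            self.replica[key] = value


sync = SyncReplicator()
sync.write("order_1", 250)
print(sync.replica == sync.primary)  # True

lazy = AsyncReplicator()
lazy.write("order_1", 250)
print(lazy.replica == lazy.primary)  # False until flush() runs
lazy.flush()
print(lazy.replica == lazy.primary)  # True
```

In the synchronous case the replica never lags behind the primary, while in the asynchronous case it is stale until the pending changes are copied across.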
Q. Explain Data Warehouse architecture in detail.

- The data in a data warehouse comes from the operational systems of the organization as well as from other external sources. These are collectively referred to as source systems.
- The data extracted from the source systems is stored in an area called the data staging area, where the data is cleaned, transformed, combined and de-duplicated to prepare the data for use in the data warehouse.
- The data staging area is generally a collection of machines where simple activities like sorting and sequential processing take place. The staging area does not provide any query or presentation services. As soon as a system provides query or presentation services, it is categorized as a presentation server.
- The three different kinds of systems that are required for a data warehouse are:
  (i) Source Systems
  (ii) Data Staging Area
  (iii) Presentation Servers
- The data travels from source systems to presentation servers via the data staging area.

[Figure: Architecture of a data warehouse (labels recoverable from the diagram: summarized data, detailed data, warehouse manager, archive/backup, access)]

Operational Source:
The sources of data for the data warehouse are:
- Data from the mainframe systems in the traditional network and hierarchical formats.
- Data can also come from relational DBMSs like Oracle and Informix.
- In addition to this internal data, operational data also includes external data obtained from commercial databases and databases associated with suppliers and customers.
Load Manager:
- The load manager performs all the operations associated with the extraction and loading of data into the data warehouse.
- These operations include simple transformations of the data to prepare the data for entry into the warehouse.

Warehouse Manager:
- The warehouse manager performs all the operations associated with the management of data in the warehouse.
- The operations performed by the warehouse manager include:
  - Analysis of data to ensure consistency
  - Creation of views on the base tables
  - Denormalization
  - Generation of aggregations
  - Backing up and archiving of data

Query Manager:
- The query manager performs all operations associated with the management of user queries.
- This component is usually constructed using vendor end-user access tools, data warehousing monitoring tools, database facilities and custom-built programs.

Detailed Data:
- This area of the warehouse stores all the detailed data in the database schema. In the majority of cases detailed data is not stored online but aggregated to the next level of detail.
- The detailed data is added regularly to the warehouse to supplement the aggregated data.

Lightly and Highly Summarized Data:
- This area stores all the predefined lightly and highly summarized data generated by the warehouse manager.
- The purpose of the summarized information is to speed up query performance.

Archive and Back up Data:
- The detailed and summarized data are stored for the purpose of archiving and back up.
- The data is transferred to storage archives such as magnetic tapes or optical disks.

Meta Data:
- The data warehouse also stores all the metadata (data about data) definitions used by all processes in the warehouse.
- It is used for a variety of purposes, including:
  - The extraction and loading process
  - As part of the query management process
- The structure of the metadata will differ in each process because the purpose is different.

End User Access Tools:
- The main purpose of a data warehouse is to provide information to the business managers for strategic decision-making.
- These users interact with the warehouse using end user access tools.

DATA WAREHOUSE vs DATA MART

Data Warehouse | Data Mart
Independent of specific DSS application | Specific DSS application
It is centralized and enterprise wide | It is decentralized by user area
Well planned | Possibly not planned
Data is historical, detailed and summarized | The data consists of some history, detailed and summarized
Consists of multiple subjects | Consists of a single subject of concern to the user
Implementation takes months to years | [...]
Generally size is from 100 GB to 1 TB | Generally size is less than 100 GB

Data Warehouse design strategy

(i) The Top Down Approach: The Dependent Data Mart Structure
- The data flow in the top down OLAP environment begins with data extraction from the operational data sources.
- This data is loaded into the staging area, validated and consolidated to ensure a level of accuracy, and then transferred to the Operational Data Store (ODS).
- The ODS stage is sometimes skipped if it is a replication of the operational databases.
- Data is also loaded into the data warehouse in a parallel process to avoid extracting it from the ODS.

(ii) The Bottom-up Approach: The Data Warehouse Bus Structure
- This architecture makes the data warehouse more of a virtual reality than a physical reality. All data marts could be located on one server or on different servers across the enterprise, while the data warehouse would be a virtual entity, being nothing more than the sum total of all the data marts.
- In this context even the cubes constructed by OLAP tools could be considered as data marts. In both cases the shared dimensions can be used as the conformed dimensions.

[Figure: Bottom-up approach]

(iii) Hybrid Approach
- The hybrid approach aims to harness the speed and user orientation of the bottom-up approach together with the integration of the top-down approach.
- The hybrid approach begins with an Entity Relationship diagram of the data marts and a gradual extension of the data marts to extend the enterprise model in a consistent linear fashion.
- These data marts are developed using the star schema or dimensional models.

(iv) Federated Approach
- This is a hub-and-spoke architecture often described as the "architecture of architectures". It recommends an integration of heterogeneous data warehouses, data marts and packaged applications that already exist in the enterprise.
- The goal is to integrate existing analytic structures wherever possible, to define the "highest value" metrics, dimensions and measures, and to share and reuse them within existing analytic structures.

(v) A Practical Approach
The steps in the practical approach are:
1. The first step is to do the planning and defining.
2. An architecture is created for a complete warehouse.
3. The data content is conformed and standardized.
4. Consider the series of supermarts one at a time and implement the data warehouse.
5. In this practical approach, the organization's needs are determined first. The key to this approach is that planning is done first at the enterprise level. The requirements are gathered at the overall level.

(1) Star Schema
(2) Snowflake Schema

(1) Star Schema:
- Star schema is the most popular schema design for a data warehouse.
- Every dimension table is related to one or more fact tables, and every entry in a dimension table has its own unique identifier.
- The unique identifiers (primary keys) from all the dimension tables make up the composite primary key in the fact table.
- The fact table also contains facts. For example, a combination of store-id, date-key and product-id as the composite key gives the amount of a certain product sold on a given day.

Pattern Evaluation:
- The pattern evaluation segment commonly employs interestingness measures that cooperate with the data mining modules to focus the search towards interesting patterns. It might use an interestingness threshold to filter out discovered patterns.
- On the other hand, the pattern evaluation module might be integrated with the mining module, depending on the implementation of the data mining techniques used.
- For efficient data mining, it is highly recommended to push the evaluation of pattern interestingness as far as possible into the mining procedure, to confine the search to only interesting patterns.

Graphical User Interface:
- The graphical user interface (GUI) module communicates between the data mining system and the user.
- This module helps the user to easily and efficiently use the system without knowing the complexity of the process.
- This module cooperates with the data mining system when the user specifies a query or a task and displays the results.

Knowledge Base:
- The knowledge base is helpful in the entire process of data mining. It might be helpful to guide the search or to evaluate the interestingness of the result patterns.
- The knowledge base may even contain user views and data from user experiences that might be helpful in the data mining process.
- The data mining engine may receive inputs from the knowledge base to make the result more accurate and reliable.
- The pattern assessment module regularly interacts with the knowledge base to get inputs and also to update it.
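The threshold-based filtering done during pattern evaluation can be pictured with a short Python sketch. It is a hedged illustration only: the function name `evaluate_patterns`, the `score` field and the sample rules are all invented for this example, and a real system would compute the interestingness score with a measure such as support or confidence.

```python
def evaluate_patterns(patterns, min_interest):
    """Keep only the patterns whose interestingness score meets the threshold."""
    return [p for p in patterns if p["score"] >= min_interest]


# Invented sample output of a mining step: each pattern carries a score.
discovered = [
    {"rule": "bread -> butter", "score": 0.82},
    {"rule": "milk -> batteries", "score": 0.12},
    {"rule": "tea -> sugar", "score": 0.65},
]

interesting = evaluate_patterns(discovered, min_interest=0.5)
print([p["rule"] for p in interesting])  # ['bread -> butter', 'tea -> sugar']
```

Pushing this check into the mining step itself, as the notes recommend, would mean discarding low-scoring candidates as soon as they are generated instead of filtering them afterwards.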
KDD process

- Knowledge discovery in databases (KDD) is the process of searching for hidden knowledge in the massive amounts of data that we are technically capable of generating and storing.
- The basic task of KDD is to extract knowledge (or information) from lower-level data (databases).
- It is the non-trivial (significant) process of identifying valid, novel, potentially useful and ultimately understandable patterns in data.
- The goal is to distinguish, from unprocessed data, something that may not be obvious but is valuable or enlightening in its discovery.
- The overall process of finding and interpreting patterns from data involves the repeated application of the following steps:

[A] DATA CLEANING:
- Removal of noise, inconsistent data and outliers.

[B] DATA INTEGRATION:
- Data from various sources such as databases, data warehouses and transactional data are integrated; multiple data sources may be combined into a single data format.

[C] DATA SELECTION:
- Data relevant to the analysis task is retrieved from the database.
- Collecting only the necessary information for the model.
- Finding useful features to represent the data, depending on the goal of the task.

[D] DATA TRANSFORMATION:
- Data are transformed and consolidated into forms appropriate for mining by performing summary or aggregation operations.
- By using transformation methods, an invariant representation for the data is found.
the clube | eX DATA MEANING ¢ - An _essenhal process are applied bo extrack = Deciding which model and be appropriate i i, where inte igen’ methods clake patterns parareler may New Horizon Institute of Technology & Management Pace No: Date jem lS egIrern EvawwaTzon: To identify tre _troly _inkeveshing.__palteras representing knowledge based __on inkeveskiag, _meusores “KNOWLEDGE PRES ENTATION: _ave used fo present __mined _roowledge bo Users. can abe yn form of gvaphs_. cher ls oa! Pe A iHevent Sslees Winn. 7||O Data preprocessing ee ke mining lech nic,ve dable format, Rea rtd data is often incomplete inconsistent lacking in_certuin hehavioys oy tren cls Land is _(rkely “ fe contain _muny ervors . @Deola pre procesing is used #8 oy proven method! of resolving such issues. Daln preprocessing prepares lL bu for rthev processing «_ 1 @® Datu preprocessing is_vsed in _clulubase -cviven applications | “such ets customer __jLand _wle-bused applic —|@Jn Machine Learning rocesses, cluta _prepvocessing. |b _cvihcol be _ente. dalasel ia a form that [could be interpreted and parsed by the cilyonithm. © daka gee Hyvesgh e series of : + PLE processing : ‘ au \uktonship management neoral nehwowks). Kens i [AS ota CLEANING : SE Sate Dolan is cleaned cleansed throogh _ eae a te ‘ |] I and knowledge —representution _bechaives luton inks an uadershu- 4 \ 1 | rage No. Dae New Horizon Institute of Technology & Management * filing in missing values ov dlelebing rons wilh missing clara) smoothing the _Aeisy _duta ow resolving the “inconsistencies in the -claba . Smoothing || data is _particulavly _impovtant for Mt clahasels, | since machines cannot make vse _of _duta they connot interpret. Dala can be cleaned by dividing I ik into ecqyal size segments that ave thus smoothed || Coinning) , by fitting ik bo @ lineay ov _mulhple meqvession funchion Cregvessian), or by grouping it | inky clusters of simiday lake Celostering) Oulu “in consis |. 
ag, | fences can occ” doe te human errors (the informahibn |] wos stored in a wrong field). Puplicated yerlves || should be vemoved. Payough deduplicahan be avoid ving the dala object an advantage Chorus) pos . GIOATA INTEGRATION: Datu with different vepresentubion ave pot __ together and conics within the dal ave resolved noisy CS MATA TRANS FORMATION: | Daks is normalized and generalized Movmalization isa process that _ensoves that no dulu is yvedundunt ik is all shoved jin a single phace , and all the dependencies ave logical. DATA REDUCTION: When the) volume of dala is huge. clulu bases ean Kecamelir slousere . ) coslyusto accel iandlis he vehallene ging, fe properly stove, Pala yeduchton step aims | to present a yedveed -vepresentiion of the _ lake ina _dlula warehouse Theve ave metheds fe yecluce claka Fox various example, once aq a, © New Horizon Institute of Technology & Management 3° No: — Date : gebset of _velevant attributes is chosen for ibs Significan- Ce_, anything below e given level is discarded. Encoding mechanitms can be Used to veduce the size of clatu Jas well. Tf all oviginul luke can be after compression , the operahion is labelled Loss lei TF come eluba is sloskuithen. ii culled a alse be used , for yecovered as loss. neduction . Agave gabion can | example , to condense countless _transuchons into a | siagle weekly ov monthly wilve , significantly vedveing Whe nember of cluka —obyecks. tt Z 1. Mery ES DATA DIS CRE TIAZTION: Data could also be cliscretized to replace yaw valves with interval levels. This step involves the reduchion number of “ values of a _continuovs,_ allyibute of « the vange of altibute _iatervals by divid FJ DATA SAMPLING : J Sometimes j lve fe “hme, sluvageisiov memory. cunshainb, | & clutuset ,—praxtebedt is foo big ov too complex to be wovked swith Sampling. techniques can be used be select — stork with jos a subset lof the cluliset_, provided Haat “it hes approsci mately | the sume _propeybies of Ike oviginul _ one. 
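A few of the preprocessing steps above (filling in missing values, smoothing by binning, deduplication and sampling) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not a reference implementation: the function names are hypothetical, missing values are represented by `None`, smoothing uses equal-size bins replaced by their bin means, and the price values are invented sample data.

```python
import random


def fill_missing(values):
    """Replace None entries with the mean of the known values."""
    known = [v for v in values if v is not None]
    fill = sum(known) / len(known)
    return [fill if v is None else v for v in values]


def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into equal-size bins and
    replace every value by the mean of its bin (binning)."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        bin_mean = sum(bin_values) / len(bin_values)
        smoothed.extend([bin_mean] * len(bin_values))
    return smoothed


def deduplicate(records):
    """Drop exact duplicates while keeping the first-seen order."""
    seen = set()
    unique = []
    for record in records:
        if record not in seen:
            seen.add(record)
            unique.append(record)
    return unique


def sample_subset(records, k, seed=0):
    """Pick k records at random (without replacement), reproducibly."""
    return random.Random(seed).sample(records, k)


prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]  # invented sample data
print(smooth_by_bin_means(prices, bin_size=3))
# [9.0, 9.0, 9.0, 22.0, 22.0, 22.0, 29.0, 29.0, 29.0]
print(fill_missing([4, None, 8]))             # [4, 6.0, 8]
print(deduplicate([("a", 1), ("b", 2), ("a", 1)]))  # [('a', 1), ('b', 2)]
print(len(sample_subset(prices, k=4)))        # 4
```

Regression-based and clustering-based smoothing follow the same idea but replace the bin mean with a fitted value or a cluster representative.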
Q. Explain data transformation with its steps of normalization with a suitable example.

In data transformation, the data are transformed in ways that are ideal for mining the data. The data transformation involves the following steps:

[A] Smoothing:
It is a process that is used to remove noise from the dataset using some algorithms. It allows important features present in the dataset to be highlighted and helps in predicting the patterns. When collecting data, it can be manipulated to eliminate or reduce any variance or any other form of noise.

[B] Aggregation:
Data collection or aggregation is the method of storing and presenting data in a summary format. The data may be obtained from multiple data sources, which are integrated into a single data analysis description.

[C] Discretization:
It is a process of transforming continuous data into a set of small intervals.

[D] Attribute Construction:
New attributes are created from the given set of attributes and applied to assist the mining process. This simplifies the original data and makes the mining more efficient.

[E] Generalization:
It converts low-level data attributes to high-level data attributes using a concept hierarchy.

[F] Normalization:
Data normalization involves converting all data variables into a given range. Techniques that are used for normalization are:

(i) Min-Max Normalization
- This transforms the original data linearly.
- Suppose that min_A is the minimum and max_A is the maximum of an attribute A. Then

$$v' = \frac{v - \min_A}{\max_A - \min_A}\,(\text{new\_max}_A - \text{new\_min}_A) + \text{new\_min}_A$$

where v is the value of the data, min_A is the minimum, max_A is the maximum, new_min_A is the new minimum and new_max_A is the new maximum.

(ii) Z-score Normalization (zero-mean normalization)
- In this technique, values are normalized based on the mean and standard deviation of the data A.

$$v' = \frac{v - \bar{A}}{\sigma_A}$$

where $\bar{A}$ is the mean, v' and v are the new and old entries in the data respectively, and $\sigma_A$ is the standard deviation.

(iii) Decimal Scaling Method
- It normalizes by moving the decimal point of the values of the data.
- To normalize the data by this technique, we divide each value of the data by a power of 10 chosen from the maximum absolute value of the data.

$$v'_i = \frac{v_i}{10^{j}}$$

where v_i is the data value and j is the smallest integer such that max(|v'_i|) < 1.

Example:
Suppose the data for analysis includes the attribute age. The age values for the data tuples, in increasing order, are:
13, 15, 16, 16, ..., 35, 41, 44, 53, 62, 64, 72

(i) Use min-max normalization to transform the value 45 for age onto the range [0, 1].
v = 45
new_max_A = 1
new_min_A = 0
min_A = 13
max_A = 72
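The three normalization techniques can be checked numerically with a short Python sketch. The function names are hypothetical, the z-score here assumes the population standard deviation (the notes do not say which form is intended), and the age example assumes the target range [0, 1] with v = 45, min_A = 13 and max_A = 72 as they appear above. The values passed to `z_score` and `decimal_scaling` are invented sample data.

```python
from statistics import mean, pstdev


def min_max(v, min_a, max_a, new_min=0.0, new_max=1.0):
    """Linearly map v from [min_a, max_a] onto [new_min, new_max]."""
    return (v - min_a) / (max_a - min_a) * (new_max - new_min) + new_min


def z_score(v, values):
    """Normalize v using the mean and (population) standard deviation of values."""
    return (v - mean(values)) / pstdev(values)


def decimal_scaling(values):
    """Divide by 10**j, where j is the smallest integer with max(|v'|) < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / (10 ** j) >= 1:
        j += 1
    return [v / (10 ** j) for v in values], j


# Age example from the notes: (45 - 13) / (72 - 13) = 32 / 59
print(round(min_max(45, min_a=13, max_a=72), 4))  # 0.5424

# Invented sample data with mean 5 and standard deviation 2
print(z_score(9, [2, 4, 4, 4, 5, 5, 7, 9]))       # 2.0

# Invented sample data: the largest absolute value is 986, so j = 3
scaled, j = decimal_scaling([-986, 917])
print(scaled, j)                                   # [-0.986, 0.917] 3
```

Min-max keeps the relative spacing of the original values, z-score is useful when the true minimum and maximum are unknown, and decimal scaling only shifts the decimal point.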
