The decision tree built with the CART algorithm looks like this:

[Figure: decision tree. Node labels recoverable from the scan: "Cholesterol <= 200", "Disease: Yes", "Disease: No", "Disease: Yes".]

Fig 4.2: Predicting Disease based on Age and Cholesterol Levels

The decision tree in this illustration begins at the top, which evaluates the "Age" condition. If a patient is under the age threshold tested there, we proceed to the left branch and examine the "Cholesterol <= 200" condition. The forecast is "Yes Disease" if the cholesterol level is less than or equal to 200, and it is "No Disease" if the patient's level is more than 200. However, if the patient is not under that age, we switch to the right branch, where "No Disease" is predicted.

4.3 CHAID

4.3.1 Chi-Square Automatic Interaction Detection

CHAID (Chi-Square Automatic Interaction Detection) is a statistical method used to analyse the interaction between different categories of variables. It is particularly useful when working with categorical variables.

© Department of Distance & Continuing Education, Campus of Open Learning, School of Open Learning, University of Delhi

Introduction to Business Analytics

[Figure: CHAID tree. Age Group at the root, then Gender (Male, Female), then Purchase Frequency (Low, Medium, High), with leaf nodes "Satisfied" and "Not Satisfied". Caption: Determining Satisfaction Levels of Customers.]

This flowchart shows how CHAID divides the data into subgroups, resulting in a hierarchical structure. It enables us to identify the most important predictor factors and to visualise, in an orderly way, the links between the variables and their effects on the target variable (Customer Satisfaction).

Age Group is the first variable on the flowchart, and it has two branches: "Young" and "Middle-aged." We further investigate the Gender variable within the "Young" branch, resulting in branches for "Male" and "Female." The Purchase Frequency variable is next examined for each gender subgroup, yielding three branches: "Low," "Medium," and "High." We finally arrive at the leaf nodes, which represent the customer satisfaction outcome and are either "Satisfied" or "Not Satisfied."
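The way CHAID ranks candidate predictors can be illustrated with a small chi-square calculation. The sketch below is a simplification and not the full CHAID procedure (which also merges categories and adjusts p-values): it only compares two predictors against the target on 2x2 tables. The counts, the predictor tables, and the helper names `chi_square_statistic` and `p_value_df1` are all hypothetical and chosen for illustration.

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat

def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    Only valid for 2x2 tables (df = 1), which is all this sketch uses.
    """
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical counts: rows are predictor categories,
# columns are (Satisfied, Not Satisfied).
tables = {
    "Age Group": [[30, 10],   # Young
                  [20, 40]],  # Middle-aged
    "Gender":    [[26, 24],   # Male
                  [24, 26]],  # Female
}

results = {name: chi_square_statistic(t) for name, t in tables.items()}
p_values = {name: p_value_df1(stat) for name, stat in results.items()}

# The predictor with the strongest association (smallest p-value) is split first.
best_predictor = min(p_values, key=p_values.get)

for name in tables:
    print(f"{name}: chi2 = {results[name]:.3f}, p = {p_values[name]:.5f}")
print("First split on:", best_predictor)
```

With these made-up counts, Age Group is far more strongly associated with satisfaction than Gender, which mirrors a tree that places Age Group at the root.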
4.3.2 Bonferroni Correction

The Bonferroni correction is a statistical method used to adjust the significance levels (p-values) when conducting multiple hypothesis tests at the same time. It helps control the overall chance of falsely claiming a significant result by making the criteria for significance more strict.

To apply the Bonferroni correction, we divide the desired significance level (usually denoted as α) by the number of tests being performed (denoted as m). This adjusted significance level, denoted as α' or α_B, becomes the new threshold for determining statistical significance.

Mathematically, the Bonferroni correction can be represented as:

α' = α / m

For example, suppose we are conducting 10 hypothesis tests, and we want a significance level of 0.05 (α = 0.05). By applying the Bonferroni correction, we divide α by 10, resulting in an adjusted significance level of:

α' = 0.05 / 10 = 0.005

Now, when we assess the p-values obtained from each test, we compare them against the adjusted significance level (α') instead of the original α. If a p-value is less than or equal to α', we consider the result to be statistically significant.

Let's consider an example. Suppose we have conducted 10 independent hypothesis tests, and we obtain p-values of 0.02, 0.07, 0.01, 0.03, 0.04, 0.09, 0.06, 0.08, 0.05, and 0.02. Using the Bonferroni correction with α of 0.05 and m = 10, the adjusted significance level becomes α' = 0.05 / 10 = 0.005.
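The worked example can be checked with a few lines of Python. This is a minimal sketch using the ten p-values and α = 0.05 from the text; the helper name `significant` is an assumption introduced here for illustration.

```python
alpha = 0.05
p_values = [0.02, 0.07, 0.01, 0.03, 0.04, 0.09, 0.06, 0.08, 0.05, 0.02]

m = len(p_values)            # number of tests performed
alpha_adjusted = alpha / m   # Bonferroni-adjusted threshold

def significant(p_values, threshold):
    """Return (test number, p-value) pairs with p-value <= threshold."""
    return [(i, p) for i, p in enumerate(p_values, start=1) if p <= threshold]

uncorrected = significant(p_values, alpha)           # compared against alpha
corrected = significant(p_values, alpha_adjusted)    # compared against alpha / m

print(f"Adjusted significance level: {alpha_adjusted:.3f}")
print("Significant without correction:", uncorrected)
print("Significant with Bonferroni correction:", corrected)
```

Since the smallest p-value in the list is 0.01, which is larger than 0.005, no test remains significant after the correction, even though several fall at or below the original 0.05 threshold.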
(A) Quality of Clustering

• Stability: Stability measures the consistency of cluster assignments under changes such as different initializations or subsets of the data. A stable clustering is less prone to variations and demonstrates robustness.

• Domain-specific Measures: Depending on the application domain, measures specific to the problem can be used. For example, in customer segmentation, the silhouette coefficient can be used to evaluate the effectiveness of the clustering in capturing meaningful customer groups.

(B) Determining the Optimal Number of Clusters in K-means clustering

Determining the optimal number of clusters, K, in K-means clustering is a crucial step in clustering analysis. Selecting the appropriate number of clusters is important for extracting meaningful information from the data. Several methods are commonly used to determine the optimal number of clusters:

• Elbow Method: The elbow method involves plotting the within-cluster sum of squares (WCSS) as a function of the number of clusters. The plot resembles an arm, and the optimal number of clusters is identified at the "elbow" point, where the rate of decrease in WCSS slows down significantly. In the figure below, it is clear that k=3 is the optimal number of clusters.

Fig 4.10 Elbow Method
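The elbow method can be sketched in plain Python. The example below is illustrative only: the twelve 2-D points are made-up data with three obvious groups, the K-means routine is a bare-bones Lloyd's algorithm with a deterministic farthest-point initialization (an assumption, not something the text prescribes), and the "largest bend" rule in `elbow_k` is one simple heuristic for locating the elbow numerically instead of reading it off a plot.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_point_init(points, k):
    """Deterministic start: first point, then repeatedly the point
    farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        )
    return centers

def kmeans_wcss(points, k, max_iter=100):
    """Run Lloyd's K-means and return the within-cluster sum of squares."""
    centers = farthest_point_init(points, k)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return sum(squared_distance(p, centers[a]) for p, a in zip(points, assignment))

def elbow_k(wcss_by_k):
    """Pick the k where the drop in WCSS shrinks the most (largest bend)."""
    ks = sorted(wcss_by_k)
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        bend = (wcss_by_k[prev_k] - wcss_by_k[k]) - (wcss_by_k[k] - wcss_by_k[next_k])
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k

# Hypothetical data: three well-separated groups of four points each.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (20, 0), (20, 1), (21, 0), (21, 1),
]

wcss_by_k = {k: kmeans_wcss(points, k) for k in range(1, 7)}
for k, wcss in wcss_by_k.items():
    print(f"k = {k}: WCSS = {wcss:.3f}")
print("Elbow at k =", elbow_k(wcss_by_k))
```

On this toy data the WCSS falls steeply up to three clusters and then flattens, so the heuristic selects k = 3. On real data the elbow is often less sharp, and plotting WCSS against k remains the usual way to judge it.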
