Assignment-4: Bayes Theorem
Submitted by Vivekanand Aelgani (E21SOEPO04i)

Q1. Given the following facts, find the probability of a dangerous fire when there is smoke outside:

* The probability of dangerous fires is rare (1%),
* but smoke is fairly common (10%) due to barbecues,
* and 90% of dangerous fires make smoke.

Solution. Given:

* Probability of a dangerous fire: P(F) = 0.01
* Probability of smoke: P(S) = 0.1
* Probability of smoke given a dangerous fire: P(S|F) = 0.9

Required to find: the probability of a dangerous fire when there is smoke, P(F|S).

By Bayes' theorem:

P(F|S) = P(S|F) * P(F) / P(S) = (0.9 * 0.01) / 0.1 = 0.09

So P(F|S) = 9%.

Q2. Consider the data set shown in Table 1.

Table 1:

Record   A   B   C   Class
  1      0   0   0     +
  2      0   0   1     -
  3      0   1   1     -
  4      0   1   1     -
  5      0   0   1     +
  6      1   0   1     +
  7      1   0   1     -
  8      1   0   1     -
  9      1   1   1     +
 10      1   0   1     +

(a) Calculate the conditional probabilities P(A=1|-), P(B=1|-), P(C=1|-), P(A=1|+), P(B=1|+), and P(C=1|+).

Solution. Each class covers 5 of the 10 records, so P(+) = P(-) = 0.5. Counting from Table 1:

P(A=1|-) = 2/5 = 0.4
P(B=1|-) = 2/5 = 0.4
P(C=1|-) = 5/5 = 1.0
P(A=1|+) = 3/5 = 0.6
P(B=1|+) = 1/5 = 0.2
P(C=1|+) = 4/5 = 0.8

The same values follow from Bayes' rule; for example, C=1 occurs in 9 of the 10 records, of which 5 are negative, so

P(C=1|-) = P(-|C=1) * P(C=1) / P(-) = (0.5555... * 0.9) / 0.5 = 1.0

(b) Use the estimated conditional probabilities to predict the class label of a record with A=0 and B=1 using the naive Bayes approach.

Solution.

P(+|A=0,B=1) ∝ P(A=0|+) * P(B=1|+) * P(+) = 0.4 * 0.2 * 0.5 = 0.04
P(-|A=0,B=1) ∝ P(A=0|-) * P(B=1|-) * P(-) = 0.6 * 0.4 * 0.5 = 0.12

Since P(-|A=0,B=1) > P(+|A=0,B=1), the predicted class label is - (negative).

Q3. Load the breast cancer dataset from sklearn.datasets and perform the necessary preprocessing (scaling, label encoding, etc.). Train a Naive Bayes classification model and evaluate its performance (accuracy, AUC, precision, recall) in 70%-30% and 10-fold setups.

from sklearn.datasets import load_breast_cancer
data = load_breast_cancer()
print(data.keys())

dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename', 'data_module'])

import pandas as pd

# Read the data into a DataFrame, using the feature names as column names
df = pd.DataFrame(data.data, columns=data.feature_names)
# Add a target column
df['target'] = data.target
# Show the first five rows
df.head()

[Output: the first five rows of the DataFrame — 30 feature columns (mean and worst radius, texture, perimeter, area, smoothness, compactness, concavity, symmetry, etc.) plus the target column; 5 rows x 31 columns.]
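As a quick cross-check, the hand calculations in Q1 and Q2 can be reproduced with a short script. The record list below encodes Table 1, and the variable and function names are my own:

```python
# Q1: Bayes' rule for P(Fire | Smoke)
p_fire = 0.01              # P(F): dangerous fires are rare
p_smoke = 0.10             # P(S): smoke is fairly common
p_smoke_given_fire = 0.90  # P(S|F): dangerous fires usually make smoke
p_fire_given_smoke = p_smoke_given_fire * p_fire / p_smoke
print(round(p_fire_given_smoke, 2))  # 0.09, i.e. 9%

# Q2: the records of Table 1 as (A, B, C, class) tuples
records = [
    (0, 0, 0, '+'), (0, 0, 1, '-'), (0, 1, 1, '-'), (0, 1, 1, '-'),
    (0, 0, 1, '+'), (1, 0, 1, '+'), (1, 0, 1, '-'), (1, 0, 1, '-'),
    (1, 1, 1, '+'), (1, 0, 1, '+'),
]

def cond(attr, value, label):
    """Estimate P(attr = value | class = label) by counting."""
    in_class = [r for r in records if r[3] == label]
    return sum(r[attr] == value for r in in_class) / len(in_class)

print(cond(0, 1, '-'), cond(1, 1, '-'), cond(2, 1, '-'))  # 0.4 0.4 1.0
print(cond(0, 1, '+'), cond(1, 1, '+'), cond(2, 1, '+'))  # 0.6 0.2 0.8

# Naive Bayes scores for a record with A=0, B=1 (both class priors are 0.5)
score_pos = cond(0, 0, '+') * cond(1, 1, '+') * 0.5  # 0.4 * 0.2 * 0.5 = 0.04
score_neg = cond(0, 0, '-') * cond(1, 1, '-') * 0.5  # 0.6 * 0.4 * 0.5 = 0.12
print('-' if score_neg > score_pos else '+')  # predicts '-'
```

The counting estimates and the final prediction agree with the worked answers above.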
As there are no nominal columns, we don't need to apply label encoding.

# Store the feature data
X = data.data
# Store the target data
y = data.target

# Split the data using scikit-learn's train_test_split
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=53)

# Feature scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score
classifier = GaussianNB()
classifier.fit(X_train, y_train)

# 10-fold cross validation
NB_training_score = cross_val_score(classifier, X_train, y_train, cv=10)

y_pred = classifier.predict(X_test)

from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
cm

array([[ 55,   2],
       [  9, 105]], dtype=int64)

from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test, y_pred)
accuracy

0.935672514619883

Accuracy is about 94%.

from sklearn.metrics import classification_report, roc_curve, roc_auc_score, auc
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           0       0.86      0.96      0.91        57
           1       0.98      0.92      0.95       114

    accuracy                           0.94       171
   macro avg       0.92      0.94      0.93       171
weighted avg       0.94      0.94      0.94       171

Alternatively, via sklearn.metrics:

Accuracy

from sklearn import metrics
accuracy = metrics.accuracy_score(y_test, y_pred)
accuracy

0.935672514619883

Precision

precision = metrics.precision_score(y_test, y_pred)
precision

0.9813084112149533
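All of these metrics can also be derived directly from the four confusion-matrix counts, which is useful since scikit-learn has no dedicated specificity function. A minimal sketch, using the counts from the confusion matrix above:

```python
# Confusion matrix from the 70/30 split (rows = true class, cols = predicted):
#   [[55,   2],
#    [ 9, 105]]
tn, fp = 55, 2    # true class 0 (malignant)
fn, tp = 9, 105   # true class 1 (benign)

accuracy    = (tp + tn) / (tp + tn + fp + fn)
precision   = tp / (tp + fp)
recall      = tp / (tp + fn)      # sensitivity
specificity = tn / (tn + fp)      # recall of the negative class
f1          = 2 * precision * recall / (precision + recall)

print(round(accuracy, 4), round(precision, 4), round(recall, 4),
      round(specificity, 4), round(f1, 4))
# 0.9357 0.9813 0.9211 0.9649 0.9502
```

The same specificity value can be obtained from scikit-learn with metrics.recall_score(y_test, y_pred, pos_label=0).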
Sensitivity (Recall)

sensitivity_recall = metrics.recall_score(y_test, y_pred)
sensitivity_recall

0.9210526315789473

Specificity

# Specificity is the recall of the negative class
specificity = metrics.recall_score(y_test, y_pred, pos_label=0)
specificity

0.9649122807017544

F1-Score

f1_score = metrics.f1_score(y_test, y_pred)
f1_score

0.9502262443438914

AUC value

false_positive_rate, true_positive_rate, thresholds = roc_curve(y_test, y_pred)
NB_roc_auc = auc(false_positive_rate, true_positive_rate)
print(round(NB_roc_auc, 2))

0.94
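The assignment also asks for the metrics in the 10-fold setup, but the 10-fold scores computed with cross_val_score above are never reported. One way to report all four metrics across 10 folds is scikit-learn's cross_validate with a list of scorers; this is a sketch, and wrapping the scaler in a pipeline is my addition so that each fold is scaled using only its own training portion:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_validate

X, y = load_breast_cancer(return_X_y=True)

# Scaling happens inside each fold, avoiding leakage from the held-out part
model = make_pipeline(StandardScaler(), GaussianNB())
scores = cross_validate(model, X, y, cv=10,
                        scoring=['accuracy', 'roc_auc', 'precision', 'recall'])

# Mean of each metric over the 10 folds
for name in ['test_accuracy', 'test_roc_auc', 'test_precision', 'test_recall']:
    print(name, round(scores[name].mean(), 3))
```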
