Uploaded by Pedro Gutierrez Leido


a) What is the entropy of this collection of training examples with respect to the positive class?

Answer: There are four positive examples and five negative examples, so P(+) = 4/9 and P(−) = 5/9. The entropy of the training examples is:

−(4/9) log2(4/9) − (5/9) log2(5/9) = 0.9911

b) What are the information gains of a1 and a2 relative to these training examples?

Answer: For attribute a1, the corresponding counts and probabilities are:

a1   T   F
+    3   1
−    1   4

The entropy for a1 is given by:

(4/9)[−(3/4) log2(3/4) − (1/4) log2(1/4)] + (5/9)[−(1/5) log2(1/5) − (4/5) log2(4/5)] = 0.7616

Therefore, the information gain for a1 is 0.9911 − 0.7616 = 0.2294. For attribute a2, the corresponding counts and probabilities are:

a2   T   F
+    2   2
−    3   2

The entropy for a2 is given by:

(5/9)[−(2/5) log2(2/5) − (3/5) log2(3/5)] + (4/9)[−(2/4) log2(2/4) − (2/4) log2(2/4)] = 0.9839

Therefore, the information gain for a2 is 0.9911 − 0.9839 = 0.0072.

c) For a3, which is a continuous attribute, compute the information gain for every possible split.

Answer: The candidate split points, with the entropy of the resulting split and the corresponding information gain, are:

Split point   Entropy   Info Gain
2.0           0.8484    0.1427
3.5           0.9885    0.0026
4.5           0.9183    0.0728
5.5           0.9839    0.0072
6.5           0.9728    0.0183
7.5           0.8889    0.1022
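The entropy and information-gain arithmetic above can be checked with a few lines of Python. This is only a verification sketch (the helper names are mine, not part of the solution); the class counts per branch are taken from the tables in parts (a)–(c).

```python
from math import log2

def entropy(counts):
    """Entropy of a list of class counts, e.g. [4, 5] for 4 '+' and 5 '-'."""
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c > 0)

def info_gain(parent, partitions):
    """Information gain = parent entropy minus weighted child entropies."""
    n = sum(parent)
    weighted = sum(sum(p) / n * entropy(p) for p in partitions)
    return entropy(parent) - weighted

parent = [4, 5]                                       # 4 positive, 5 negative
print(round(entropy(parent), 4))                      # 0.9911
print(round(info_gain(parent, [[3, 1], [1, 4]]), 4))  # a1: 0.2294
print(round(info_gain(parent, [[2, 3], [2, 2]]), 4))  # a2: 0.0072
print(round(info_gain(parent, [[1, 0], [3, 5]]), 4))  # a3 split at 2.0: 0.1427
```

The last line treats the continuous split a3 ≤ 2.0 as a two-way partition (one positive example on the left, the remaining three positives and five negatives on the right), reproducing the first row of the split-point table.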

The best split for a3 occurs at the split point 2.0, where the information gain (0.1427) is largest.

d) What is the best split (among a1, a2, and a3) according to the information gain?

Answer: According to the information gain, a1 produces the best split.

e) What is the best split (between a1 and a2) according to the classification error rate?

Answer: For attribute a1, the error rate is 2/9; for attribute a2, the error rate is 4/9. Therefore, according to the error rate, a1 produces the best split.

f) What is the best split (between a1 and a2) according to the Gini index?

Answer: For attribute a1, the Gini index is given by:

(4/9)[1 − (3/4)² − (1/4)²] + (5/9)[1 − (1/5)² − (4/5)²] = 0.3444
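Both the classification error rates from part (e) and the Gini indices from part (f) follow from the same contingency counts; a small verification sketch (helper names are mine, counts as in the tables above):

```python
def error_rate(partitions):
    """Classification error of a split: each branch predicts its majority class."""
    total = sum(sum(p) for p in partitions)
    return sum(min(p) for p in partitions) / total

def gini_split(partitions):
    """Weighted Gini index of a split; each p is [n_pos, n_neg] for one branch."""
    total = sum(sum(p) for p in partitions)
    weighted = 0.0
    for p in partitions:
        n = sum(p)
        weighted += n / total * (1 - sum((c / n) ** 2 for c in p))
    return weighted

a1 = [[3, 1], [1, 4]]   # a1 = T -> (3+, 1-), a1 = F -> (1+, 4-)
a2 = [[2, 3], [2, 2]]   # a2 = T -> (2+, 3-), a2 = F -> (2+, 2-)
print(error_rate(a1), error_rate(a2))                      # 2/9 and 4/9
print(round(gini_split(a1), 4), round(gini_split(a2), 4))  # 0.3444 and 0.4889
```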

Similarly, for attribute a2, the Gini index is given by:

(5/9)[1 − (2/5)² − (3/5)²] + (4/9)[1 − (2/4)² − (2/4)²] = 0.4889

Since 0.3444 < 0.4889, a1 produces the better split.

5. Consider the following data set for a binary class problem.

A   B   Class Label
T   F   +
T   T   +
T   T   +
T   F   −
T   T   +
F   F   −
F   F   −
F   F   −
T   T   −
T   F   −

a) Calculate the information gain when splitting on A and B. Which attribute would the decision tree induction algorithm choose?

Answer: The contingency tables after splitting on attributes A and B are:

     A=T   A=F          B=T   B=F
+     4     0      +     3     1
−     3     3      −     1     5

The overall entropy before splitting is:

Eorig = −0.4 log2 0.4 − 0.6 log2 0.6 = 0.9710

The information gain after splitting on A is:

E(A=T) = −(4/7) log2(4/7) − (3/7) log2(3/7) = 0.9852
E(A=F) = −(3/3) log2(3/3) − (0/3) log2(0/3) = 0
Δ = Eorig − (7/10) E(A=T) − (3/10) E(A=F) = 0.2813

Similarly, the information gain after splitting on B is:

Δ = Eorig − (4/10) E(B=T) − (6/10) E(B=F) = 0.2565

Therefore, attribute A will be chosen to split the node.

b) Calculate the gain in the Gini index when splitting on A and B. Which attribute would the decision tree induction algorithm choose?

Answer: The overall Gini index before splitting is:

Gorig = 1 − 0.4² − 0.6² = 0.48

The gain in the Gini index after splitting on A is:

G(A=T) = 1 − (4/7)² − (3/7)² = 0.4898
G(A=F) = 1 − (3/3)² − (0/3)² = 0
Gorig − (7/10) G(A=T) − (3/10) G(A=F) = 0.1371

Similarly, we can compute the gain after splitting on B, which is given by:

Gorig − (4/10) G(B=T) − (6/10) G(B=F) = 0.1633

Therefore, attribute B will be chosen to split the node.
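The gains in parts (a) and (b) can be recomputed directly from the raw table. This is a sketch (function names are mine); the ten rows are transcribed from the problem statement:

```python
from math import log2

# (A, B, class) rows transcribed from the exercise's table
data = [
    ("T", "F", "+"), ("T", "T", "+"), ("T", "T", "+"), ("T", "F", "-"),
    ("T", "T", "+"), ("F", "F", "-"), ("F", "F", "-"), ("F", "F", "-"),
    ("T", "T", "-"), ("T", "F", "-"),
]

def counts(rows):
    """Class counts [n_pos, n_neg] for a list of rows."""
    pos = sum(1 for r in rows if r[2] == "+")
    return [pos, len(rows) - pos]

def entropy(c):
    n = sum(c)
    return -sum(x / n * log2(x / n) for x in c if x > 0)

def gini(c):
    n = sum(c)
    return 1 - sum((x / n) ** 2 for x in c)

def gain(rows, attr_idx, impurity):
    """Impurity gain of splitting on the attribute at column attr_idx."""
    child = 0.0
    for v in ("T", "F"):
        subset = [r for r in rows if r[attr_idx] == v]
        child += len(subset) / len(rows) * impurity(counts(subset))
    return impurity(counts(rows)) - child

print("info gain:", round(gain(data, 0, entropy), 4), round(gain(data, 1, entropy), 4))
print("gini gain:", round(gain(data, 0, gini), 4), round(gain(data, 1, gini), 4))
```

The script confirms the disagreement: the information gain is larger for A (0.2813) while the Gini gain is larger for B (0.1633). Without intermediate rounding, the information gain for B comes out as 0.2564 rather than the 0.2565 shown above; the ranking is unaffected.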

c) Figure 4.13 shows that entropy and the Gini index are both monotonically increasing on the range [0, 0.5] and both monotonically decreasing on the range [0.5, 1]. Is it possible that information gain and the gain in the Gini index favor different attributes? Explain.

Answer: Yes. Even though these measures have a similar range and monotonic behavior, their respective gains, Δ, which are scaled differences of the measures, do not necessarily behave in the same way, as illustrated by the results in parts (a) and (b).

6. Consider the following set of training examples.

X   Y   Z   No. of Class C1 Examples   No. of Class C2 Examples
0   0   0              5                          40
0   0   1              0                          15
0   1   0             10                           5
0   1   1             45                           0
1   0   0             10                           5
1   0   1             25                           0
1   1   0              5                          20
1   1   1              0                          15

a) Compute a two-level decision tree using the greedy approach described in this chapter. Use the classification error rate as the criterion for splitting. What is the overall error rate of the induced tree?

Answer:

Splitting attribute at level 1. To determine the test condition at the root node, we need to compute the error rates for attributes X, Y, and Z. For attribute X, the corresponding counts are:

X    0    1
C1  60   40
C2  60   40

Therefore, the error rate using attribute X is (60 + 40)/200 = 0.5. For attribute Y, the corresponding counts are:

Y    0    1
C1  40   60
C2  60   40

Therefore, the error rate using attribute Y is (40 + 40)/200 = 0.4. For attribute Z, the corresponding counts are:

Z    0    1
C1  30   70
C2  70   30

Therefore, the error rate using attribute Z is (30 + 30)/200 = 0.3. Since Z gives the lowest error rate, it is chosen as the splitting attribute at level 1.

Splitting attribute at level 2. After splitting on attribute Z, the subsequent test condition may involve either attribute X or attribute Y. This depends on the training examples distributed to the Z = 0 and Z = 1 child nodes. For Z = 0, the corresponding counts for attributes X and Y are the same, as shown in the tables below.

X    0    1        Y    0    1
C1  15   15        C1  15   15
C2  45   25        C2  45   25

The error rate in both cases (X and Y) is (15 + 15)/100 = 0.3.

For Z = 1, the corresponding counts for attributes X and Y are shown in the tables below.

X    0    1        Y    0    1
C1  45   25        C1  25   45
C2  15   15        C2  15   15

Although the counts are somewhat different, their error rates remain the same: (15 + 15)/100 = 0.3. The corresponding two-level decision tree is shown below.

              Z
          0 /   \ 1
      X or Y     X or Y
      0 /  \ 1   0 /  \ 1
     C2     C2  C1     C1

The overall error rate of the induced tree is (15 + 15 + 15 + 15)/200 = 0.3.

b) Repeat part (a) using X as the first splitting attribute and then choose the best remaining attribute for splitting at each of the two successor nodes. What is the error rate of the induced tree?

Answer: After choosing attribute X to be the first splitting attribute, the subsequent test condition may involve either attribute Y or attribute Z. For X = 0, the corresponding counts for attributes Y and Z are shown in the tables below.

Y    0    1        Z    0    1
C1   5   55        C1  15   45
C2  55    5        C2  45   15

The error rates using attributes Y and Z are 10/120 and 30/120, respectively. Since attribute Y leads to the smaller error rate, it provides the better split.

For X = 1, the corresponding counts for attributes Y and Z are shown in the tables below.

Y    0    1        Z    0    1
C1  35    5        C1  15   25
C2   5   35        C2  25   15

The error rates using attributes Y and Z are 10/80 and 30/80, respectively. Since attribute Y leads to the smaller error rate, it provides the better split. The corresponding two-level decision tree is shown below.

              X
          0 /   \ 1
           Y     Y
       0 /  \ 1 0 /  \ 1
      C2     C1 C1    C2

The overall error rate of the induced tree is (10 + 10)/200 = 0.1.

c) Compare the results of parts (a) and (b). Comment on the suitability of the greedy heuristic used for splitting attribute selection.

Answer: From the preceding results, the error rate for part (a) is significantly larger than that for part (b). This example shows that a greedy heuristic does not always produce an optimal solution.
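Both two-level constructions can be replayed with a short script (a verification sketch, not part of the original solution; the counts are transcribed from the table in the problem statement and the helper names are mine):

```python
rows = [  # (X, Y, Z, C1 count, C2 count)
    (0, 0, 0,  5, 40), (0, 0, 1,  0, 15), (0, 1, 0, 10,  5), (0, 1, 1, 45,  0),
    (1, 0, 0, 10,  5), (1, 0, 1, 25,  0), (1, 1, 0,  5, 20), (1, 1, 1,  0, 15),
]
TOTAL = sum(r[3] + r[4] for r in rows)  # 200 examples

def split_error(subset, attr):
    """Misclassified examples if each branch of `attr` predicts its majority class."""
    err = 0
    for v in (0, 1):
        branch = [r for r in subset if r[attr] == v]
        c1 = sum(r[3] for r in branch)
        c2 = sum(r[4] for r in branch)
        err += min(c1, c2)
    return err

def two_level_errors(root):
    """Total errors of a two-level tree rooted at `root` (0=X, 1=Y, 2=Z),
    with the best remaining attribute chosen greedily at each child."""
    err = 0
    for v in (0, 1):
        branch = [r for r in rows if r[root] == v]
        others = [a for a in (0, 1, 2) if a != root]
        err += min(split_error(branch, a) for a in others)
    return err

# Level-1 errors: X -> 100, Y -> 80, Z -> 60, so the greedy approach picks Z ...
print([split_error(rows, a) for a in (0, 1, 2)])
# ... yet the Z-rooted tree is worse overall than the X-rooted one.
print(two_level_errors(2) / TOTAL)  # 0.3
print(two_level_errors(0) / TOTAL)  # 0.1
```

The output mirrors part (c): Z minimizes the level-1 error rate, but the resulting tree (error 0.3) is beaten by the greedily suboptimal choice of X at the root (error 0.1).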
