This document discusses building classifiers using the J48 algorithm in WEKA. It explains that classification recognizes patterns to determine what group an item belongs to. It then discusses running the J48 classifier on the glass dataset, analyzing the results including accuracy, confusion matrix, and effects of pruning. It also includes an activity asking the reader to analyze results from running J48 on other datasets.
• Classification recognizes patterns that describe the group to which an item belongs by examining existing items.
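As a minimal illustration of that idea, the sketch below classifies a new item by comparing it with existing, already-labelled items (a 1-nearest-neighbour toy in plain Python). The customer features and labels are invented for illustration only; they are not from the slides or from any WEKA dataset.

```python
import math

# Hypothetical labelled customers: (monthly_calls, complaints) -> group.
labelled = [
    ((120, 0), "stays"),
    ((110, 1), "stays"),
    ((15, 4), "leaves"),
    ((20, 5), "leaves"),
]

def classify(item):
    """Assign the new item the label of the closest existing item."""
    _, label = min(labelled, key=lambda pair: math.dist(pair[0], item))
    return label

print(classify((18, 3)))   # low-usage, complaining customer -> "leaves"
print(classify((115, 0)))  # heavy, satisfied user -> "stays"
```

Real classifiers such as J48 learn more compact descriptions (a decision tree) instead of storing every example, but the goal is the same: use existing items to determine the group of a new one.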
• For example, businesses such as credit card or telephone companies worry about the loss of steady customers.
• Classification helps discover the characteristics of customers who are likely to leave.

Problem
• Classify the 'Type' attribute of the glass.arff dataset.

Running J48
• The Classify panel lists the different types of classifier available in WEKA.
• Click Start to run the chosen classifier.
• The output reports the number of instances and attributes, the number of leaves, and the size of the tree.
• Overall accuracy = 66.8%.

Confusion matrix
• The matrix shows the seven different classes, one row and one column per class.
• Diagonal elements are the correctly classified instances.
• The diagonal elements sum to 143, the number of correctly classified instances (143 of 214 instances = the 66.8% accuracy shown above).
• Every off-diagonal element is a misclassification; together they account for the remaining 33.2%.

Open configuration panel of J48
• Click on the text box next to the Choose button to open the J48 configuration panel.
• Click the unpruned parameter and set it to True to build an unpruned tree (J48 prunes by default).
• Many algorithms attempt to "prune", or simplify, their results.
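The confusion-matrix bookkeeping above (accuracy = sum of the diagonal divided by the total number of instances) can be sketched in a few lines of Python. The 3-class matrix here is a made-up example, not the 7-class glass results; the final line checks the glass figures quoted above (143 correct of 214).

```python
def accuracy_from_confusion(matrix):
    """Accuracy = sum of diagonal entries (correct) / sum of all entries (total)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical 3-class matrix (rows = actual class, columns = predicted class).
m = [
    [50, 3, 2],
    [4, 40, 6],
    [1, 5, 30],
]
print(f"{accuracy_from_confusion(m):.1%}")  # diagonal 120 of 141 instances

# The glass run: 143 correctly classified of 214 instances.
print(f"{143 / 214:.1%}")  # 66.8%
```

Every off-diagonal entry is a specific kind of mistake (actual class of the row, predicted class of the column), which is why the matrix is more informative than the single accuracy number.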
• Pruning produces fewer, more easily interpreted results.
• More importantly, pruning can be used as a tool to correct for potential overfitting.
• After clicking OK, click Start to run again.
• Accuracy is now 67.2%, a slightly better result on this dataset than the pruned tree's 66.8%.

Change minNumObj
• Set minNumObj = 15 so that no leaf holds fewer than 15 instances, avoiding small leaves.
• In a leaf label such as (5.0/1.0), 5.0 is the number of instances reaching the leaf and 1.0 the number of those that are misclassified.
• Correctly classified instances decrease.
• The number of leaves and the size of the tree also decrease: only 8 leaves remain.

Visualize the tree
• Right-click the result entry and select the Visualize tree option.
• This draws the same decision tree that appears in the text output.

More information about classifier J48
• Right-click the result entry and select the More option.
• J48 is the Java implementation of C4.8, the last public release (revision 8) of the C4.5 decision-tree learner; hence the name J48.

Activity
• Open the glass dataset, go to the Classify panel, choose the J48 tree classifier, and run it (with default parameters).
1. Use the confusion matrix to determine how many headlamps instances were misclassified as build wind float. Answer: 3
2. Open the labor dataset, go to the Classify panel, and run the J48 classifier (with default parameters). What is the percentage of correctly classified instances? Answer: 73.6842%
3. Now turn pruning off in the J48 configuration panel by setting unpruned to True and run it again. What is the percentage of correctly classified instances now? Answer: 78.9474%
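The effect of a minimum-leaf-size constraint like minNumObj can be illustrated with a toy tree in plain Python. This is a simplified post-hoc sketch, collapsing any subtree with too few instances into a majority-class leaf; real J48 enforces the constraint while the tree is being built, and the tree below is invented, not the glass.arff tree.

```python
from collections import Counter

def leaf(label, n):
    return {"label": label, "count": n}

def node(children):
    return {"children": children}

def count(tree):
    """Total training instances under this (sub)tree."""
    if "children" in tree:
        return sum(count(c) for c in tree["children"])
    return tree["count"]

def count_leaves(tree):
    if "children" in tree:
        return sum(count_leaves(c) for c in tree["children"])
    return 1

def leaves_of(tree):
    if "children" in tree:
        return [lf for c in tree["children"] for lf in leaves_of(c)]
    return [tree]

def prune(tree, min_obj):
    """Replace any subtree holding fewer than min_obj instances by a majority leaf."""
    if "children" not in tree:
        return tree
    if count(tree) < min_obj:
        votes = Counter()
        for lf in leaves_of(tree):
            votes[lf["label"]] += lf["count"]
        majority = votes.most_common(1)[0][0]
        return leaf(majority, count(tree))
    return node([prune(c, min_obj) for c in tree["children"]])

# A small subtree (5 "A" + 1 "B" instances) next to a big leaf (20 "A").
t = node([node([leaf("A", 5), leaf("B", 1)]), leaf("A", 20)])
print(count_leaves(t))             # 3 leaves before pruning
print(count_leaves(prune(t, 15)))  # 2 leaves: the small subtree collapsed
```

As with minNumObj = 15 on the glass data, forcing larger leaves shrinks the tree (fewer leaves, smaller size) and may lower training-set accuracy, in exchange for a simpler tree that is less likely to overfit.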