Itb 10BM60092 Term Paper

Data Mining with WEKA

Published by: Swarnabha Ray on Apr 19, 2012
ITB term paper
Swarnabha Shankar Ray [10BM60092]
Data mining is indispensable in today's world, where information plays an important role in shaping business success. But data mining isn't solely the domain of big companies and expensive software. WEKA, a free software package, is capable of performing most data mining activities.
WEKA (Waikato Environment for Knowledge Analysis) is a product of the University of Waikato (New Zealand) and was first implemented in its modern form in 1997. It uses the GNU General Public License (GPL). The software is written in the Java™ language and contains a GUI for interacting with data files and producing visual results (think tables and curves).
WEKA's preferred method for loading data is the Attribute-Relation File Format (ARFF), where we can define the type of data being loaded, then supply the data itself.
When we start WEKA, the GUI chooser pops up as shown in the figure. It lets us choose four ways to work with WEKA and our data. The four ways are:
Explorer
Experimenter
Knowledge Flow
Simple CLI
We click on the Explorer tab. The sample data used in this case is car data available from 
in .arff format.
An Attribute-Relation File Format (ARFF) file has two distinct sections. The first section is the Header information, which is followed by the Data information. The Header of the ARFF file contains the name of the relation, a list of the attributes (the columns in the data), and their types.
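The two-section layout can be illustrated with a minimal Python sketch that separates the Header (relation name and attribute declarations) from the Data rows. This is a simplified reader written for illustration, not WEKA's own loader: the function name `parse_arff` and the toy relation in `SAMPLE` are hypothetical, and the sketch assumes unquoted names and comma-separated values with no sparse or quoted entries.

```python
import re

def parse_arff(text):
    """Split ARFF text into relation name, attributes, and data rows.

    Simplified sketch: assumes unquoted names and values only.
    """
    relation = None
    attributes = []   # (name, type) pairs in declaration order
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and % comments
        if in_data:
            # Data section: one comma-separated instance per line
            rows.append([v.strip() for v in line.split(",")])
            continue
        lower = line.lower()  # ARFF keywords are case-insensitive
        if lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            nominal = re.fullmatch(r"\{(.*)\}", spec.strip())
            if nominal:
                # Nominal attribute: keep the list of allowed values
                attr_type = [v.strip() for v in nominal.group(1).split(",")]
            else:
                # Any other type (e.g. numeric) is kept as a lowercase string
                attr_type = spec.strip().lower()
            attributes.append((name, attr_type))
        elif lower.startswith("@data"):
            in_data = True  # everything after this line is data
    return relation, attributes, rows

# Hypothetical toy file, used only to exercise the parser.
SAMPLE = """\
% comment
@relation toy
@attribute size {small,med,big}
@attribute weight numeric
@data
small,1.5
big,7
"""

relation, attributes, rows = parse_arff(SAMPLE)
print(relation, attributes, rows)
```

Values are returned as strings; converting numeric columns is left out to keep the sketch short.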
Data File
% Title: Car Evaluation Database
% Creator: Marko Bohanec
%
% Relevant Information Paragraph:
% Car Evaluation Database was derived from a simple hierarchical
% decision model. The model evaluates cars according to the
% following concept structure:
%
% CAR               car acceptability
%   PRICE           overall price
%     buying        buying price
%     maint         price of the maintenance
%   TECH            technical characteristics
%     COMFORT       comfort
%       doors       number of doors
%       persons     capacity in terms of persons to carry
%       lug_boot    the size of luggage boot
%     safety        estimated safety of the car
%
% Number of Instances: 1728
% Number of Attributes: 6
%
% Attribute Values:
% buying    v-high, high, med, low
% maint     v-high, high, med, low
% doors     2, 3, 4, 5-more
% persons   2, 4, more
% lug_boot  small, med, big
% safety    low, med, high

@relation car
@attribute buying {vhigh,high,med,low}
@attribute maint {vhigh,high,med,low}
@attribute doors {2,3,4,5more}
@attribute persons {2,4,more}
@attribute lug_boot {small,med,big}
@attribute safety {low,med,high}
@attribute class {unacc,acc,good,vgood}
@data
sample data
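The figures in the header can be cross-checked with a little arithmetic. The sketch below, in Python, multiplies the number of values of each of the six input attributes; the value sets are copied from the @attribute declarations, with the class attribute left out. The product matches the stated number of instances, which is consistent with the data set holding each attribute combination once, although the sketch itself does not verify that.

```python
from math import prod

# Nominal value sets copied from the @attribute declarations (class excluded).
attributes = {
    "buying": ["vhigh", "high", "med", "low"],
    "maint": ["vhigh", "high", "med", "low"],
    "doors": ["2", "3", "4", "5more"],
    "persons": ["2", "4", "more"],
    "lug_boot": ["small", "med", "big"],
    "safety": ["low", "med", "high"],
}

# Size of the attribute space: 4 * 4 * 4 * 3 * 3 * 3
combinations = prod(len(values) for values in attributes.values())
print(combinations)  # 1728
```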

