net/publication/280612446
1 author:
Oussama Ahmia
Université Bretagne Sud
All content following this page was uploaded by Oussama Ahmia on 03 August 2015.
BUILDING A CLUSTERER
BATCH:
A clusterer is built in much the same way as a classifier, but the
method buildClassifier(Instances) is replaced by buildClusterer(Instances).
The following code snippet shows how to build a SimpleKMeans clusterer (how to generate the
option string using the GUI is explained at the end of this document):
import weka.clusterers.SimpleKMeans;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
// The DataSource class is not limited to ARFF files (3.5.5 and newer).
// It can also read CSV files and other formats (basically all file formats
// that Weka can import via its converters using the GUI).
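A minimal sketch of the batch case, under a few assumptions: Weka 3.7 or newer is on the classpath (DenseInstance and the ArrayList-based Instances constructor are not available in 3.5/3.6), and the option string "-N 2 -S 10" (two clusters, seed 10) is only an illustrative choice. The class name BatchKMeansSketch and the helpers build and toyData are hypothetical names introduced for this example. When a file path is passed on the command line it is read through DataSource; otherwise a small in-memory data set is used.

```java
import java.util.ArrayList;

import weka.clusterers.SimpleKMeans;
import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instances;
import weka.core.Utils;
import weka.core.converters.ConverterUtils.DataSource;

public class BatchKMeansSketch {

    // Builds a SimpleKMeans clusterer in one batch from a command-line style option string.
    static SimpleKMeans build(Instances data, String optionString) throws Exception {
        SimpleKMeans kmeans = new SimpleKMeans();
        kmeans.setOptions(Utils.splitOptions(optionString));
        kmeans.buildClusterer(data); // the clusterer counterpart of buildClassifier(Instances)
        return kmeans;
    }

    // Hypothetical toy data: two numeric attributes, two well-separated groups of points.
    static Instances toyData() {
        ArrayList<Attribute> attributes = new ArrayList<>();
        attributes.add(new Attribute("x"));
        attributes.add(new Attribute("y"));
        Instances data = new Instances("toy", attributes, 0);
        double[][] points = {
            {0.0, 0.0}, {0.0, 1.0}, {1.0, 0.0},
            {10.0, 10.0}, {10.0, 11.0}, {11.0, 10.0}
        };
        for (double[] point : points) {
            data.add(new DenseInstance(1.0, point));
        }
        return data;
    }

    public static void main(String[] args) throws Exception {
        // With an argument, load any format DataSource supports; otherwise use the toy data.
        Instances data = args.length > 0
                ? new DataSource(args[0]).getDataSet()
                : toyData();
        SimpleKMeans kmeans = build(data, "-N 2 -S 10"); // assumed options: 2 clusters, seed 10
        System.out.println(kmeans); // prints Weka's textual summary of the clustering
    }
}
```

No class index is set on the data, since clustering does not use a class attribute.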
INCREMENTAL:
Clusterers implementing the weka.clusterers.UpdateableClusterer interface can be trained
incrementally (available since version 3.5.4). This conserves memory, since the data does not have to be
loaded into memory all at once, which is useful when the data set is too large to fit in memory.
1. Call buildClusterer(Instances) with the structure of the dataset (may not contain any
actual data rows, only the structure is important).
2. Subsequently call the updateClusterer(Instance) method to feed the clusterer
new weka.core.Instance objects, one by one.
3. Call updateFinished() after all Instance objects have been processed, for the clusterer to
perform additional computations.
import java.io.File;
import weka.clusterers.Cobweb;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ArffLoader;
// Load the ARFF file; only the structure (header) of the training set is read here
ArffLoader loader = new ArffLoader();
loader.setFile(new File("/some/where/data.arff"));
Instances structure = loader.getStructure();
String options = "-A 1.0 -C 0.0028 -S 42";
// Create the Cobweb clusterer and initialize it with the structure only
Cobweb cw = new Cobweb();
cw.setOptions(weka.core.Utils.splitOptions(options));
cw.buildClusterer(structure);
Instance current;
// This assumes that the ARFF file actually contains data rows
while ((current = loader.getNextInstance(structure)) != null) {
    cw.updateClusterer(current);
}
// Signal that all instances have been processed
cw.updateFinished();
CLUSTERER OPTIONS:
The easiest method (in my personal opinion) to set clusterer options (the method also works for classifiers) is to
follow these steps:
3. Delete the class name from the beginning of the option string obtained in step 2, for example
"weka.clusterers.SimpleKMeans", and take care to put "\" before each ( " ) if the option string contains any.
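Step 3 can also be done in code rather than by hand. The sketch below uses only the Java standard library; the class OptionStringSketch and the method stripClassName are hypothetical names, and the two configuration strings are illustrative examples of what a copied string may look like, not values taken from a real session. It assumes the class name is the first whitespace-delimited token of the copied string.

```java
public class OptionStringSketch {

    // Removes the leading class name (the first whitespace-delimited token)
    // from a configuration string, leaving only the options.
    static String stripClassName(String configuration) {
        String trimmed = configuration.trim();
        int firstSpace = trimmed.indexOf(' ');
        // A string with no space holds only a class name, so no options remain.
        return firstSpace < 0 ? "" : trimmed.substring(firstSpace + 1).trim();
    }

    public static void main(String[] args) {
        // Illustrative string without nested quotes.
        String simple = "weka.clusterers.SimpleKMeans -N 2 -S 10";
        System.out.println(stripClassName(simple));

        // Illustrative string with nested quotes: inside a Java string literal
        // each double quote must be preceded by a backslash.
        String quoted =
            "weka.clusterers.SimpleKMeans -N 2 -A \"weka.core.EuclideanDistance -R first-last\" -S 10";
        System.out.println(stripClassName(quoted));
    }
}
```

The resulting string can then be passed to weka.core.Utils.splitOptions and setOptions, as in the Cobweb example earlier in this document.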