HAI C-09 Tuesday 27-10-2020
Data Mining: Practical Machine Learning Tools and Techniques
I. H. Witten, E. Frank, M. A. Hall, and C. J. Pal
ISBN: 978-0-12-804291-5
Creating ARFF files
More Data Mining with Weka
Class 5 – Lesson 1
Simple neural networks
Ian H. Witten
Department of Computer Science
University of Waikato
New Zealand
weka.waikato.ac.nz
Lesson 5.1: Simple neural networks
Is it faster? … yes
– Ionosphere.arff: Classify > functions > VotedPerceptron
– Ionosphere.arff: Classify > functions > SMO
Lesson 5.1: Simple neural networks
History of the Perceptron
1957: Basic perceptron algorithm
– Derived from theories about how the brain works
– “A perceiving and recognizing automaton”
– Rosenblatt, “Principles of neurodynamics: Perceptrons and the theory of brain mechanisms”
1970: Suddenly went out of fashion
– Minsky and Papert, “Perceptrons”
1986: Returned, rebranded “connectionism”
– Rumelhart and McClelland, “Parallel distributed processing”
– Some claim that artificial neural networks mirror brain function
Multilayer perceptrons
– Nonlinear decision boundaries
Lesson 5.1: Simple neural
networks
Basic Perceptron algorithm: linear decision boundary
– Like classification-by-regression
– Works with numeric attributes
– Iterative algorithm, order dependent
My MSc thesis (1971) describes a simple improvement!
– Still not impressed, sorry
Modern improvements (1999):
– get more complex boundaries using the “kernel trick”
– more sophisticated strategy with multiple weight vectors and voting
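As a concrete illustration, here is a minimal sketch of the classic perceptron update rule described above (not Weka's VotedPerceptron implementation; the toy data and function names are made up):

```python
import numpy as np

def train_perceptron(X, y, epochs=10):
    """Basic perceptron: numeric attributes, labels in {-1, +1}.
    Iterative and order dependent: weights change instance by instance."""
    Xb = np.hstack([np.ones((len(X), 1)), X])   # prepend a bias input
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(Xb, y):
            if yi * (w @ xi) <= 0:              # misclassified instance
                w += yi * xi                    # nudge the boundary toward it
    return w

def predict(X, w):
    Xb = np.hstack([np.ones((len(X), 1)), X])
    return np.sign(Xb @ w)

# Tiny linearly separable toy set
X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w = train_perceptron(X, y)
```

Because the updates happen instance by instance, presenting the instances in a different order can produce a different (but still separating) weight vector — the order dependence noted above.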
Course text
Section 4.6 Linear classification using the Perceptron
More Data Mining with Weka
Class 5 – Lesson 2
Multilayer Perceptrons
Ian H. Witten
Department of Computer Science
University of Waikato
New Zealand
weka.waikato.ac.nz
Lesson 5.2: Multilayer Perceptrons
Multilayer Perceptron: sigmoid function
Network of sigmoid perceptrons
Input layer, hidden layer(s), and output layer
Each connection has a weight (a number)
Each node performs a weighted sum of its inputs
– and usually thresholds the result with a sigmoid function
– nodes are often called “neurons”
[Figure: network diagram showing input, hidden, and output layers]
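The weighted-sum-then-sigmoid behaviour of each node can be sketched as a forward pass through one hidden layer (illustrative only; the weights here are random, not trained):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, W_hid, b_hid, W_out, b_out):
    """Each node: weighted sum of its inputs, squashed by the sigmoid."""
    h = sigmoid(W_hid @ x + b_hid)     # hidden-layer activations
    return sigmoid(W_out @ h + b_out)  # output-layer activations

rng = np.random.default_rng(0)
x = np.array([0.2, -0.5, 1.0])                 # one input instance
W_hid, b_hid = rng.normal(size=(4, 3)), rng.normal(size=4)
W_out, b_out = rng.normal(size=(2, 4)), rng.normal(size=2)
out = mlp_forward(x, W_hid, b_hid, W_out, b_out)
```

Every connection weight is one entry of `W_hid` or `W_out`, and the sigmoid keeps each node's output between 0 and 1.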
MultilayerPerceptron performance
Numeric weather data (weather.arff): 79%!
(J48, NaiveBayes both 64%, SMO 57%, IBk 79%)
On real problems does quite well – but slow
Parameters
– hiddenLayers: set GUI to true and try 5, 10, 20
– learningRate, momentum
Makes multiple passes (“epochs”) through the data
Training continues until
– error on the validation set consistently increases
– or training time is exceeded
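The stopping rule above can be sketched as a training loop with early stopping (a sketch with hypothetical helper callbacks; Weka's actual MultilayerPerceptron options differ in detail):

```python
def train_with_early_stopping(step, val_error, max_epochs=500, patience=3):
    """Stop when the validation error rises `patience` epochs in a row
    (consistently increasing), or when the epoch budget — a stand-in
    for a training-time limit — runs out."""
    best, bad = float("inf"), 0
    for epoch in range(max_epochs):
        step()                        # one pass ("epoch") over the training data
        e = val_error()               # error on a held-out validation set
        if e < best:
            best, bad = e, 0
        else:
            bad += 1
            if bad >= patience:       # error has consistently increased
                break
    return epoch + 1, best

# Demo with a canned error sequence instead of a real network
errors = iter([0.50, 0.40, 0.30, 0.35, 0.36, 0.37])
epochs_run, best = train_with_early_stopping(
    lambda: None, lambda: next(errors), max_epochs=100, patience=3)
```

Here training stops after the sixth epoch, keeping the best validation error seen (0.30).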
Lesson 5.2: Multilayer Perceptrons
Create your own network structure!
Selecting nodes
– click to select
– right-click in empty space to deselect
Creating/deleting nodes
– click in empty space to create
– right-click (with no node selected) to delete
Creating/deleting connections
– with a node selected, click on another to connect to it
– … and another, and another
– right-click to delete connection
Can set parameters here too
Lesson 5.2: Multilayer Perceptrons
Are they any good?
Experimenter with 6 datasets
– Iris, breast-cancer, credit-g, diabetes, glass, ionosphere
9 algorithms
– MultilayerPerceptron, ZeroR, OneR, J48, NaiveBayes, IBk, SMO, AdaBoostM1, VotedPerceptron
MultilayerPerceptron wins on 2 datasets
Other wins:
– SMO on 2 datasets
– J48 on 1 dataset
– IBk on 1 dataset
But … 10–2000 times slower than other methods
Lesson 5.2: Multilayer Perceptrons
Multilayer Perceptrons implement arbitrary decision boundaries
– given two (or more) hidden layers that are large enough
– and are trained properly
Training by backpropagation
– iterative algorithm based on gradient descent
In practice??
– Quite good performance, but extremely slow
– Still not impressed, sorry
– Might be a lot more impressive on more complex datasets
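To make the backpropagation step concrete, here is a sketch of the chain-rule gradients for a one-hidden-layer sigmoid network and squared error (a minimal illustration, not Weka's implementation; weights and inputs are arbitrary):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, W2):
    h = sigmoid(W1 @ x)        # hidden-layer activations
    o = sigmoid(W2 @ h)        # output-layer activations
    return h, o

def backprop(x, t, W1, W2):
    """Gradients of E = 0.5 * sum((o - t)^2) via the chain rule."""
    h, o = forward(x, W1, W2)
    delta_o = (o - t) * o * (1 - o)            # output-layer error signal
    delta_h = (W2.T @ delta_o) * h * (1 - h)   # propagated back to hidden layer
    return np.outer(delta_h, x), np.outer(delta_o, h)

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(3, 2)), rng.normal(size=(1, 3))
x, t = np.array([0.5, -0.2]), np.array([1.0])
g1, g2 = backprop(x, t, W1, W2)   # gradient-descent step: W -= lr * g
```

Gradient descent then repeatedly subtracts a small multiple (the learning rate) of these gradients from the weights — the iterative training the slide refers to.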
Course text
Section 4.6 Linear classification using the Perceptron
More Data Mining with Weka
Class 1 – Lesson 2
Exploring the Experimenter
Ian H. Witten
Department of Computer Science
University of Waikato
New Zealand
weka.waikato.ac.nz
Lesson 1.2: Exploring the Experimenter
Class 1 Exploring Weka’s interfaces; working with big data
Lesson 1.1 Introduction
– Trying out classifiers/filters
– Performance comparisons
– Graphical interface
– Command-line interface
Lesson 1.2: Exploring the Experimenter
Use the Experimenter for …
– Evaluation results
Basic assumption: training and test sets produced by …
Lesson 1.2: Exploring the Experimenter
Evaluate J48 on segment-challenge (Data Mining with Weka, Lesson 2.3)
With segment-challenge.arff and J48 (trees>J48) …
Set percentage split to 90%
Run it: 96.7% accuracy
[More options] Repeat with seeds 2, 3, 4, 5, 6, 7, 8, 9, 10
Results: 0.967, 0.940, 0.940, 0.967, 0.953, 0.967, 0.920, 0.947, 0.933, 0.947
Lesson 1.2: Exploring the Experimenter
Evaluate J48 on segment-challenge (Data Mining with Weka, Lesson 2.3)
Sample x_1, …, x_n: the ten accuracies 0.967, 0.940, 0.940, 0.967, 0.953, 0.967, 0.920, 0.947, 0.933, 0.947
Mean: \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
Variance: \sigma^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2
Standard deviation: \sigma
Here \bar{x} = 0.949, \sigma = 0.018
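These two statistics are easy to check in a few lines of Python (accuracy values as read off the slide; small differences from the slide's 0.949 and 0.018 come from rounding):

```python
import math
import statistics

accuracies = [0.967, 0.940, 0.940, 0.967, 0.953, 0.967,
              0.920, 0.947, 0.933, 0.947]

mean = statistics.mean(accuracies)
# Sample variance: divide by n - 1, as in the formula above
variance = sum((x - mean) ** 2 for x in accuracies) / (len(accuracies) - 1)
stdev = math.sqrt(variance)
```

The n - 1 denominator matches `statistics.stdev`, which also computes the sample (not population) standard deviation.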
Lesson 1.2: Exploring the Experimenter
10-fold cross-validation (Data Mining with Weka, Lesson 2.5)
Stratified cross-validation
– Ensure that each fold has the right proportion of each class value
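One simple way to build stratified folds is to deal the instances of each class out round-robin (a sketch of the idea, not Weka's internal code):

```python
from collections import defaultdict

def stratified_folds(labels, k=10):
    """Deal the instances of each class round-robin across k folds,
    so every fold gets roughly the right proportion of each class value."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for members in by_class.values():
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)   # round-robin assignment
    return folds

labels = ["yes"] * 30 + ["no"] * 20      # a 60/40 class split
folds = stratified_folds(labels, k=10)
```

With 30 "yes" and 20 "no" instances, every fold ends up with exactly 3 "yes" and 2 "no" — the same 60/40 proportion as the full dataset.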
Lesson 1.2: Exploring the Experimenter
Setup panel
– click New
– note defaults: 10-fold cross-validation, repeated 10 times
– under Datasets, click Add new, open segment-challenge.arff
– under Algorithms, click Add new, open trees>J48
Run panel
– click Start
Analyse panel
– click Experiment
– select Show std. deviations
– click Perform test
Repeat with 90% split, segment-challenge.arff, J48
Save the results in Clase_Experiemnter.cvs
Lesson 1.2: Exploring the Experimenter
To get detailed results
– Run panel
– Analyse panel
Lesson 1.2: Exploring the Experimenter
Open Experimenter
Setup, Run, Analyse panels
Evaluate one classifier on one dataset
… using cross-validation, repeated 10 times
… using percentage split, repeated 10 times
Examine spreadsheet output
Analyse panel to get mean and standard deviation
Other options on Setup and Run panels
Course text
Chapter 13 The Experimenter
More Data Mining with Weka
Class 1 – Lesson 3
Comparing classifiers
Ian H. Witten
Department of Computer Science
University of Waikato
New Zealand
weka.waikato.ac.nz
Lesson 1.3: Comparing classifiers
Is J48 better than (a) ZeroR and (b) OneR on the Iris data?
v significantly better
* significantly worse
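The "significantly better/worse" markers come from a significance test on paired per-run results. As a rough sketch, here is a plain paired t-test (note the Experimenter actually uses a corrected variant of this test to account for overlapping training sets; the accuracy numbers below are hypothetical):

```python
import math
import statistics

def paired_t(a, b):
    """Paired t statistic over per-run accuracy differences."""
    d = [x - y for x, y in zip(a, b)]
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(len(d)))

# Hypothetical per-run accuracies from 10 repetitions of an experiment
j48   = [0.96, 0.94, 0.95, 0.97, 0.93, 0.96, 0.95, 0.94, 0.96, 0.95]
zeror = [0.33, 0.34, 0.33, 0.33, 0.34, 0.33, 0.33, 0.34, 0.33, 0.33]

t = paired_t(j48, zeror)
# Two-sided 5% critical value of Student's t for df = 9 is 2.262
j48_significantly_better = t > 2.262
```

If t exceeds the critical value, the Analyse panel would flag the comparison with "v" (significantly better); a t below -2.262 would earn a "*".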