Weka
The HIV Data Management and Data Mining Workshop
http://www.cs.waikato.ac.nz/ml/weka/
Why use Weka?
• ARFF-format:
@RELATION stanford_data
@ATTRIBUTE Subtype { "D", "K", "F1", "C", "A1", "B", "A2", "G", ""}
@ATTRIBUTE Bootstrap numeric
@ATTRIBUTE eNFV { "n", "y"}
@ATTRIBUTE SeqId { "NDK_198301", "SE554_199808", "SE474_199808", …,
"KDR_pat152_200101", "IL210_200104"}
@ATTRIBUTE PR2 { "Q", "K", "R", "H", "*", "E"}
…
@ATTRIBUTE PR99 { "F", "N", "Y", "L"}
@DATA
"D", "38.0", "n", "NDK_198301", "Q", "I", "T", "L", "W", "Q", "R", "P", "L", "V", "T", …, "F"
"F1", "50.0", "n", "CDC7944_199501", "Q", "I", "T", "L", "W", "Q", "R", "P", "L", "V", …, "F"
Importing a data set (4)
• CSV format (can be exported from Excel):
eNFV,subtype,bootstrap,seqid,PR2,PR3,PR4, …,PR97,PR98,PR99
n,D,44,NDK_198301,Q,I,T,…,L,N,F
n,C,95,BRP2139_200208,Q,I,P,…,L,N,F
y,C,97,TCDD13_200103, ,I,T,…,L,N,F
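The relationship between the two import formats can be illustrated with a small conversion sketch. It is not how Weka itself reads CSV files; it is a plain-Python illustration under stated assumptions: the function name `csv_to_arff` and the `numeric_columns` parameter are hypothetical, the sample is a shortened version of the rows above (only the first five columns), nominal values are collected from the data, and blank cells are written as `?`, the ARFF missing-value marker.

```python
import csv
import io


def csv_to_arff(csv_text, relation, numeric_columns=()):
    """Convert CSV text (with a header row) to ARFF text.

    Assumption: columns named in numeric_columns are declared numeric;
    every other column becomes a nominal attribute whose allowed values
    are collected from the data. Blank cells are written as '?'.
    """
    rows = [r for r in csv.reader(io.StringIO(csv_text)) if r]
    header = [name.strip() for name in rows[0]]
    data = [[cell.strip() for cell in row] for row in rows[1:]]

    lines = [f"@RELATION {relation}", ""]
    for i, name in enumerate(header):
        if name in numeric_columns:
            lines.append(f"@ATTRIBUTE {name} numeric")
        else:
            # Nominal attribute: list every non-blank value seen in this column.
            values = sorted({row[i] for row in data if row[i]})
            quoted = ", ".join(f'"{v}"' for v in values)
            lines.append(f"@ATTRIBUTE {name} {{{quoted}}}")

    lines += ["", "@DATA"]
    for row in data:
        cells = []
        for name, cell in zip(header, row):
            if not cell:
                cells.append("?")          # missing value
            elif name in numeric_columns:
                cells.append(cell)         # numeric values are left unquoted
            else:
                cells.append(f'"{cell}"')  # nominal values are quoted
        lines.append(",".join(cells))
    return "\n".join(lines) + "\n"


# Shortened sample: only the first five columns of the CSV shown above.
sample = """eNFV,subtype,bootstrap,seqid,PR2
n,D,44,NDK_198301,Q
y,C,97,TCDD13_200103,
"""

arff = csv_to_arff(sample, "stanford_data", numeric_columns={"bootstrap"})
print(arff)
```

Because the nominal value lists are derived from whatever rows are present, a file converted this way only declares the values it actually contains; a hand-written ARFF header can list values that do not occur in the data.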
Preprocess-tab
The Preprocess tab gives information about the loaded data and allows you to preprocess it further.
The Preprocess tab allows you to remove selected attributes.
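What removing attributes does to the data can be sketched outside the GUI. The snippet below is a plain-Python illustration, not Weka code; the helper name `remove_attributes` is hypothetical and the rows are shortened from the CSV example. In Weka itself the equivalent operation is the Remove button on this tab or the `weka.filters.unsupervised.attribute.Remove` filter.

```python
def remove_attributes(header, rows, to_remove):
    """Return a copy of the table without the named attributes (columns)."""
    # Indices of the columns that survive the removal, in original order.
    keep = [i for i, name in enumerate(header) if name not in to_remove]
    new_header = [header[i] for i in keep]
    new_rows = [[row[i] for i in keep] for row in rows]
    return new_header, new_rows


# Shortened rows from the CSV example (first five columns only).
header = ["eNFV", "subtype", "bootstrap", "seqid", "PR2"]
rows = [
    ["n", "D", "44", "NDK_198301", "Q"],
    ["y", "C", "97", "TCDD13_200103", ""],
]

# Identifier-like attributes such as seqid are typical candidates for removal
# before mining, since they are unique per instance.
new_header, new_rows = remove_attributes(header, rows, {"seqid", "bootstrap"})
print(new_header)
print(new_rows)
```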
The Preprocess tab gives information about the attribute selected from the list.
The Preprocess tab visualizes the selected attribute with colours according to the selected class attribute (click update!).
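The numbers behind such a class-coloured histogram are simply counts of each attribute value, split by class. A minimal plain-Python sketch of that tabulation follows; the function name `value_counts_by_class` is hypothetical, and the three records reuse the `subtype` and `eNFV` values from the CSV rows shown earlier. It illustrates the idea only and is not Weka's implementation.

```python
from collections import Counter, defaultdict


def value_counts_by_class(records, attribute, class_attribute):
    """Count, for each value of `attribute`, how often each class occurs."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record[attribute]][record[class_attribute]] += 1
    return counts


# subtype and eNFV values taken from the three CSV rows shown earlier.
records = [
    {"subtype": "D", "eNFV": "n"},
    {"subtype": "C", "eNFV": "n"},
    {"subtype": "C", "eNFV": "y"},
]

counts = value_counts_by_class(records, attribute="subtype", class_attribute="eNFV")

# One text "bar" per attribute value; each class gets its own segment,
# which is what the colours encode in the GUI.
for value in sorted(counts):
    segments = "  ".join(f"{cls}:{'#' * n}" for cls, n in sorted(counts[value].items()))
    print(f"{value:>3} | {segments}")
```

Changing `class_attribute` corresponds to picking a different class attribute in the GUI: the bars stay the same height, but the split within each bar changes.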