# Czech Technical University in Prague

## Department of Theoretical Computer Science

European Social Fund, Prague & EU: We invest in your future

## Seminar: Rapid Miner Beginner's Guide

Jan Černý, FIT, Czech Technical University in Prague

## Installing Rapid Miner

https://rapidminer.svn.sourceforge.net/svnroot/rapidminer/Vega/

Add tools.jar (from your local JDK) as a project dependency. Run the Ant build script. Run Rapid Miner with the class RapidMinerGUI.java in the package com.rapidminer.gui.

## Creating an experiment

Rapid Miner uses nested graphs to describe the knowledge flow process. This process can contain data loading, preprocessing, modeling with different types of algorithms, performance measurement, report generation and so on. In this example we will learn step by step how to create a knowledge flow that reads data and performs a cross-validation to test the quality of our model.

## Operators

Knowledge flows consist of Operators, each of which has a given number of inputs and outputs with type checking. Each Operator also has its own attributes, which can be set when you select the given operator.
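The idea of operators with type-checked inputs and outputs can be sketched in a few lines of Python. This is only an illustration and not Rapid Miner's actual API: the `Operator` class, the `connect` function, the port type names and the parameter names are all assumptions made up for this example.

```python
class Operator:
    """A node in a knowledge flow with typed input and output ports."""

    def __init__(self, name, input_types, output_types, **parameters):
        self.name = name
        self.input_types = list(input_types)
        self.output_types = list(output_types)
        # Attributes of the operator that the user can set.
        self.parameters = dict(parameters)


def connect(source, out_port, target, in_port):
    """Return a connection tuple, or raise TypeError if the port types differ."""
    produced = source.output_types[out_port]
    expected = target.input_types[in_port]
    if produced != expected:
        raise TypeError(
            f"{source.name} produces {produced}, but {target.name} expects {expected}"
        )
    return (source.name, out_port, target.name, in_port)


# Hypothetical operators and type names, chosen only for illustration.
reader = Operator("Read ARFF", [], ["ExampleSet"], data_file="data.arff")
svm = Operator("SVM", ["ExampleSet"], ["Model"], kernel_type="dot")

link = connect(reader, 0, svm, 0)
```

Connecting the model output of `svm` to an input that expects an example set would raise a `TypeError` in this sketch, which mirrors the kind of mismatch the type checking is meant to catch.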

## Learning a simple model

Let's construct a simple knowledge flow that will learn our model on all the data and return it as output.
This flow reads data from an ARFF file and passes it to the Support Vector Machine model. The model is then sent to the output (the right side), where we can view it in the report view.

Notice the red SVM input and the error messages in the problems dialog.

Most of the time you can just use the fixes that Rapid Miner suggests and they will work fine. The first error tells us that SVM cannot handle polynominal output attributes and offers us 3 fixes:
1) Convert them to numerical, which is useful if the attribute values have a defined distance to each other (say, a student's mark from the finite set A to F: we know that A is closer to B than to C, and so on).
2) Classification by Regression, which uses one regression SVM model for each output (with this option we can use a regression model to solve a classification task).
3) Polynominal by Binominal Classification, which uses a binominal SVM classifier for each class (each classifier separates 2 classes: "my class" and "others").
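Fixes 1 and 3 above can be illustrated with a small Python sketch. It is not how Rapid Miner implements them internally; the function names and the example data are hypothetical and serve only to show what each transformation does to the labels.

```python
# Hypothetical ordered scale of student marks, as in fix 1.
GRADES = ["A", "B", "C", "D", "E", "F"]


def to_numerical(values, ordered_categories):
    """Fix 1: map each nominal value to its position on an ordered scale."""
    position = {category: index for index, category in enumerate(ordered_categories)}
    return [position[value] for value in values]


def one_vs_rest_labels(labels):
    """Fix 3: build one binary ("my class" vs. "others") label vector per class."""
    classes = sorted(set(labels))
    return {cls: [label == cls for label in labels] for cls in classes}


marks = ["A", "C", "B", "F"]
numeric_marks = to_numerical(marks, GRADES)

# Made-up class labels with three distinct values.
species = ["setosa", "virginica", "setosa", "versicolor"]
binary_tasks = one_vs_rest_labels(species)
```

With the numerical encoding, the distance between A and B (1) is smaller than between A and C (2), which is the property fix 1 relies on. With the one-vs-rest labels, a separate binary classifier can be trained on each entry of `binary_tasks`.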

We add the label from the available fixes (the label identifies the output attribute; in our case the output attribute is named class in the ARFF files), select Polynominal by Binominal Classification, and see what happens to the knowledge flow.

The knowledge flow changed: one operator was added to set the role of the attribute class to label, and one nested operator was added to perform Polynominal by Binominal Classification. Inside that operator is the logic behind the creation of the binominal classifiers, in our case the SVM operator (you can view it by double-clicking). Nested Operators are identified by a symbol in the bottom right corner.

Note: you may see the same errors as shown here, but this is a bug in Rapid Miner and the knowledge flow will work normally.

## Results - model

Now we can switch to the results view and look at the model that was created.

Here we can see the model description, but we also want to know its quality (i.e. its error). For that we need to modify the knowledge flow further.

## Performance of model

Let's measure the performance of our model using 10-fold cross-validation. Add the X-Validation operator and plug it in instead of the Polynominal by Binominal Classification (PBC) operator. Then cut the PBC operator and paste it inside the learning part of the validation.

As you can see, the validation operator has 2 parts inside and divides the data automatically. The first part is executed when the model is learned, and the training data are passed to its input. After the model is learned, the testing part is executed and the model is tested on the test data.
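How a k-fold validation divides the data can be sketched as follows. This is a minimal illustration that assumes simple contiguous folds without shuffling; the sampling that Rapid Miner actually uses may differ, and the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_examples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_examples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_examples // k + (1 if i < n_examples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# 25 examples split into 10 folds; the numbers are arbitrary.
folds = list(k_fold_indices(n_examples=25, k=10))
```

Each example lands in the test part of exactly one fold and in the training part of all the others, so every example is used for testing once.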

The testing part uses the Apply Model operator, which gets the output of the given model on the given data, followed by the Performance operator, which computes various statistics on the output of the model. We are mainly interested in classification accuracy, but you can select any other available measure.
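The two statistics discussed here, accuracy and the confusion matrix, can be computed from actual and predicted labels as in the following sketch. The labels are made up for illustration and the function names are hypothetical; this is not the Performance operator's implementation.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


def accuracy(actual, predicted):
    """Fraction of examples whose predicted label equals the actual label."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Hypothetical labels, made up for illustration only.
actual = ["yes", "yes", "no", "no", "no"]
predicted = ["yes", "no", "no", "no", "yes"]

matrix = confusion_matrix(actual, predicted)
acc = accuracy(actual, predicted)
```

The diagonal entries of the matrix, such as `matrix[("no", "no")]`, count the correct predictions, and accuracy is their sum divided by the number of examples.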

## Results - accuracy

Now we see the accuracy of our model, including the confusion matrix.

## Loop

Now we are going to modify our knowledge flow: to make the experiment statistically significant, we need to repeat it a large number of times. Go to the top level of the knowledge flow, insert a Loop operator in place of the X-Validation operator, and cut and paste the validation operator inside the loop. You can see that the last line coming from the loop is doubled. That indicates that it carries multiple values, so we can average the accuracy values from the different runs using the Average operator.
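Averaging the accuracies of repeated runs amounts to the following computation. The accuracy values are invented for illustration, and the standard deviation is an addition of this sketch (the slides only mention the average); it is included because the spread across runs indicates how stable the result is.

```python
from statistics import mean, stdev


def summarize_runs(accuracies):
    """Return the mean accuracy and the sample standard deviation over runs."""
    average = mean(accuracies)
    # The sample standard deviation needs at least two runs.
    spread = stdev(accuracies) if len(accuracies) > 1 else 0.0
    return average, spread


# Hypothetical accuracies from five repetitions of the validation.
run_accuracies = [0.90, 0.92, 0.88, 0.94, 0.91]
average_accuracy, accuracy_stdev = summarize_runs(run_accuracies)
```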
