2. Visualize the data using the ‘Distributions’ node in the ‘Visualize’
group.
3. Deal with missing values through the ‘Impute’ node in the ‘Data’ group.
1) Use ‘Average / Most Frequent’ for all numeric variables.
2) Use ‘Random Values’ for all categorical variables.
3) Create a new ‘Data Table’ named ‘Raw Data’ to view the imputed data.
4. Split the data into a ‘train’ set and a ‘test’ set (60:40) using the ‘Data Sampler’ node in
the ‘Data’ group.
5. Build a decision tree model on the ‘train’ set from step 4 using the ‘Tree’ node in
the ‘Model’ group. You can inspect the tree model by adding the ‘Tree Viewer’ node in
the ‘Visualize’ group.
6. Make predictions using the ‘Predictions’ node in the ‘Evaluate’ group. Use the
tree model built in step 5 as the model and the ‘test’ set as the test data.
7. Check the prediction results by double-clicking the ‘Predictions’ node. Save the
results in ‘.xlsx’ format using the ‘Save Data’ node.
8. Compare models using the ‘Test and Score’ node in the ‘Evaluate’ group. Use this
node to compare the results of the ‘SVM’ and ‘Logistic Regression’ models.
9. Save the entire workflow on the canvas as a flow (.ows) file. You can reuse this flow next
time.
10. Upload the flow (.ows) file, the Excel (.xlsx) file, and screenshots of the ‘Predictions’,
‘Tree Viewer’ and ‘Test and Score’ nodes to Moodle as your assignment submission.
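
For reference, the logic behind steps 3 and 4 can be sketched in plain Python. This is only an illustration of what the ‘Impute’ and ‘Data Sampler’ nodes do conceptually, not how Orange implements them. The function names (impute, split), the toy rows, and the column names are hypothetical, and the sketch assumes that ‘Random Values’ means drawing from the values already observed in that column.

```python
import random


def impute(rows, numeric_cols, categorical_cols, seed=0):
    """Return a copy of rows with missing (None) values filled in.

    Numeric columns get the column mean; categorical columns get a
    value drawn at random from the observed values (an assumption
    about what 'Random Values' does).
    """
    rng = random.Random(seed)
    filled = [dict(r) for r in rows]  # copy so the raw data is untouched
    for col in numeric_cols:
        observed = [r[col] for r in rows if r[col] is not None]
        mean = sum(observed) / len(observed)
        for r in filled:
            if r[col] is None:
                r[col] = mean
    for col in categorical_cols:
        observed = [r[col] for r in rows if r[col] is not None]
        for r in filled:
            if r[col] is None:
                r[col] = rng.choice(observed)
    return filled


def split(rows, train_fraction=0.6, seed=0):
    """Shuffle the rows and split them into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy data; None marks a missing value.
rows = [
    {"age": 25, "income": 40.0, "city": "A"},
    {"age": None, "income": 55.0, "city": "B"},
    {"age": 35, "income": None, "city": None},
    {"age": 45, "income": 70.0, "city": "A"},
    {"age": 30, "income": 60.0, "city": "B"},
]

clean = impute(rows, ["age", "income"], ["city"])
train, test = split(clean)  # 60:40 split, as in step 4
print(len(train), len(test))
```

Fixing the seed makes the split reproducible, which is comparable to ticking a replicable sampling option so that the ‘train’ and ‘test’ sets stay the same between runs.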