
BSAN 67900 – Business Intelligence and Data Analysis

Class Activity 2

First Name:
Last Name:

Download the ‘Health.arff’ data set and answer the following questions. The data set has
seven attributes related to health condition. They were measured over 20 days (dayID),
four times per day (sequenceID). The attributes hrv and hr indicate heart rate variability
(the variation in time between consecutive heartbeats) and heart rate, respectively.

1) Which attribute should be selected as the dependent variable for this data set?

2) A) Which attributes are strongly related to the dependent variable? How did you decide?
(Select attributes tab -> InfoGainAttributeEval -> Ranker)

B) Take a screenshot of Weka with your name visible (typed in Notepad) and attach it here.
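
For reference, the ranking in question 2A is based on information gain: how much the entropy of the dependent variable drops once an attribute's value is known. The following is only a minimal sketch of that idea, not Weka's implementation. The attribute names and values are hypothetical toy data, not taken from Health.arff, and the sketch assumes the attributes are already nominal (Weka discretizes numeric attributes before computing this score).

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def info_gain(values, labels):
    """Class entropy minus the expected class entropy after splitting on `values`."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data (assumption): NOT the contents of Health.arff.
labels = ["good", "good", "bad", "bad"]
attributes = {
    "attr_a": ["low", "low", "high", "high"],  # separates the two classes exactly
    "attr_b": ["x", "y", "x", "y"],            # tells us nothing about the class
}

# Rank attributes from most to least informative, as a ranker would.
ranking = sorted(
    attributes,
    key=lambda name: info_gain(attributes[name], labels),
    reverse=True,
)
for name in ranking:
    print(name, round(info_gain(attributes[name], labels), 3))
```

An attribute that splits the classes cleanly scores close to the full class entropy, while an attribute unrelated to the class scores close to zero. The same reading applies to the scores Weka reports.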

3) A) Train a classifier using Logistic (Classify tab -> Choose -> functions -> Logistic).
How well does the classifier perform? Which attributes would you remove to improve its
performance? Remove them and retrain the Logistic classifier. Does the performance
improve?

B) Take a screenshot of Weka with your name visible (typed in Notepad) and attach it here.

4) A) Train a classifier using J48 (Classify tab -> Choose -> trees -> J48) and compare
its results with those of the Logistic classifier. Which one performs better?

B) Take a screenshot of Weka with your name visible (typed in Notepad) and attach it here.
