Lab Manual Computer Science & Engineering
Heni.R.Vyas-190305105729
Practical: 1
Aim: - Perform preprocessing on a dataset. Apply various filters and discuss
the effect of each filter applied.
A. Handle missing values
B. Handle Infrequent Nominal values.
C. Derive an attribute from the existing attribute.
About dataset: -
We are using the Weather dataset from Kaggle for this task.
Data source: https://www.kaggle.com/c/weather
Data Dictionary: -
pclass      weather
outlook     Sunny, Rainy, Overcast
Humidity    Humidity
Windy       Windy
Play        Play
Open the Weather dataset in the Weka tool:
Task A: Handle missing values
Missing value: - Missing data are values that are not recorded in a dataset. A single
value may be missing from a single cell, or an entire observation (row) may be missing.
Missing data can occur in both continuous and categorical variables.
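The idea of filling in missing values can be sketched in Python. This is only an illustration: the manual does not show which Weka filter was applied, so the sketch assumes mean imputation for numeric attributes and mode imputation for nominal attributes (the strategy used by Weka's ReplaceMissingValues filter). The function name impute_missing and the sample rows are hypothetical and are not taken from the Kaggle data.

```python
from collections import Counter
from statistics import mean


def impute_missing(rows, numeric_cols, nominal_cols):
    """Return a copy of rows where None cells are replaced by the
    column mean (numeric columns) or column mode (nominal columns)."""
    fill = {}
    for col in numeric_cols:
        observed = [r[col] for r in rows if r[col] is not None]
        fill[col] = mean(observed)
    for col in nominal_cols:
        observed = [r[col] for r in rows if r[col] is not None]
        fill[col] = Counter(observed).most_common(1)[0][0]
    return [
        {k: (fill[k] if v is None and k in fill else v) for k, v in r.items()}
        for r in rows
    ]


# Hypothetical sample rows (assumption, for illustration only).
rows = [
    {"outlook": "sunny", "humidity": 85},
    {"outlook": None, "humidity": 90},
    {"outlook": "sunny", "humidity": None},
    {"outlook": "rainy", "humidity": 80},
]
cleaned = impute_missing(rows, numeric_cols=["humidity"], nominal_cols=["outlook"])
```

The original rows are left unchanged, so the effect of the filter can be compared before and after, as the aim of the practical asks.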
Step 2: Derive an attribute from the existing attributes using the AddExpression filter.
Step 3: After the AddExpression filter is applied, a new attribute named
temperature+humidity is created. It is derived from the temperature and humidity
attributes of the dataset.
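The derivation in Steps 2 and 3 can be mimicked outside Weka with a short Python sketch. The helper add_expression and the sample values are hypothetical; only the attribute names temperature, humidity and temperature+humidity come from the steps above.

```python
def add_expression(rows, name, expression):
    """Return a copy of rows with one extra attribute computed per row."""
    return [{**row, name: expression(row)} for row in rows]


# Hypothetical sample values (assumption, for illustration only).
rows = [
    {"temperature": 85, "humidity": 85},
    {"temperature": 70, "humidity": 96},
]
derived = add_expression(
    rows,
    "temperature+humidity",
    lambda r: r["temperature"] + r["humidity"],
)
```

Each row keeps its original attributes and gains the derived one, which matches what is seen in the attribute list after the filter is applied.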
Practical: 2
Aim: - Perform binning on a dataset.
Binning: Data binning, also called bucketing, is a data preprocessing method used to
reduce the effects of small observation errors. The original data values are divided into
small intervals known as bins, and each value is then replaced by a representative value
calculated for its bin.
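The definition above can be illustrated with a minimal Python sketch. It assumes equal-width bins and uses the bin mean as the representative value; the manual does not state which binning scheme or representative value was used, so both choices, the function name and the sample values are assumptions.

```python
from statistics import mean


def equal_width_bin_means(values, n_bins):
    """Split the value range into n_bins equal-width intervals and
    replace every value by the mean of the values in its bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so there is nothing to smooth.
        return list(values)

    def bin_index(v):
        # The maximum value is placed in the last bin.
        return min(int((v - lo) / width), n_bins - 1)

    bins = {}
    for v in values:
        bins.setdefault(bin_index(v), []).append(v)
    means = {i: mean(members) for i, members in bins.items()}
    return [means[bin_index(v)] for v in values]


# Hypothetical humidity readings (assumption, for illustration only).
humidity = [65, 70, 75, 80, 85, 90, 95]
smoothed = equal_width_bin_means(humidity, n_bins=3)
```

With three bins the range 65 to 95 is cut into intervals of width 10, so nearby readings collapse onto a shared value and small measurement differences disappear.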
Result of Step 1.
Visualize results of Step 1.
Tree view:
Result of applying the J48 classifier with percentage split = 89% on the play column.
Visualize Result of Step 2.
Trees:
Result of applying the J48 classifier with percentage split = 55% on the "windy" column.
Visualize Result of Step 3.
Tree:
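The percentage splits used above (89% and 55%) decide how many instances are used to train the tree and how many are held out for testing. The Python sketch below shows the idea only. The shuffle seed, the rounding rule and the instance count of 14 are assumptions, and the function percentage_split is hypothetical rather than part of Weka.

```python
import random


def percentage_split(rows, train_percent, seed=1):
    """Shuffle rows reproducibly, then return (train, test) where train
    holds roughly train_percent percent of the rows."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_percent / 100)
    return shuffled[:cut], shuffled[cut:]


# Stand-in instance ids; 14 instances is an assumption for illustration.
instances = list(range(14))
train_89, test_89 = percentage_split(instances, 89)
train_55, test_55 = percentage_split(instances, 55)
```

With so few instances, an 89% split leaves very few rows for testing, so the reported accuracy can change noticeably from one split to another.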