PRACTICAL – 1
Aim: Perform the following using the Python Pandas and Matplotlib libraries on the given
dataset:
i. Deal with missing values in the data either by deleting records or using mean/median/mode
imputation.
ii. Detect whether outliers exist and plot the data distribution using box plots, scatter plots and
histograms from the matplotlib library.
iii. Create and display the correlation matrix of all features of the data.
iv. Record and Analyse Observations.
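Steps i to iii can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used in this practical: the helper names (impute_missing, iqr_outlier_mask, plot_distributions), the column names and the toy values are all hypothetical, and the 1.5 x IQR rule is one common convention for flagging outliers rather than a method prescribed by the aim.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def impute_missing(df, strategy="median"):
    """Fill missing values in numeric columns with the column mean, median or mode."""
    filled = df.copy()
    numeric_cols = filled.select_dtypes(include="number").columns
    if strategy == "mean":
        fill_values = filled[numeric_cols].mean()
    elif strategy == "median":
        fill_values = filled[numeric_cols].median()
    elif strategy == "mode":
        # mode() can return several rows on ties; take the first one
        fill_values = filled[numeric_cols].mode().iloc[0]
    else:
        raise ValueError("strategy must be 'mean', 'median' or 'mode'")
    filled[numeric_cols] = filled[numeric_cols].fillna(fill_values)
    return filled


def iqr_outlier_mask(series, k=1.5):
    """Flag values lying more than k * IQR below Q1 or above Q3."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


def plot_distributions(df, x_col, y_col):
    """Draw a box plot and histogram of x_col and a scatter plot of x_col vs y_col."""
    fig, axes = plt.subplots(1, 3, figsize=(15, 4))
    axes[0].boxplot(df[x_col].dropna().to_numpy())
    axes[0].set_title(f"Box plot of {x_col}")
    axes[1].scatter(df[x_col], df[y_col], s=10)
    axes[1].set_xlabel(x_col)
    axes[1].set_ylabel(y_col)
    axes[1].set_title(f"{x_col} vs {y_col}")
    axes[2].hist(df[x_col].dropna().to_numpy(), bins=20)
    axes[2].set_title(f"Histogram of {x_col}")
    fig.tight_layout()
    return fig


# Toy data with hypothetical column names, standing in for the real dataset.
toy = pd.DataFrame({
    "load": [1.0, 2.0, np.nan, 4.0, 100.0],
    "usage": [10.0, 20.0, 30.0, np.nan, 50.0],
})

dropped = toy.dropna()                       # option 1: delete incomplete records
clean = impute_missing(toy, "median")        # option 2: impute instead of deleting
outliers = iqr_outlier_mask(clean["load"])   # boolean mask of suspected outliers
corr = clean.corr()                          # Pearson correlation matrix of all features
```

Calling plot_distributions(clean, "load", "usage") returns a Matplotlib figure that can be saved with fig.savefig(...). Whether to delete or impute depends on how many records are affected; deletion is simplest when only a few rows are incomplete.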
Dataset description:
The dataset was downloaded from http://www.cs.toronto.edu/~delve/data/comp-activ/desc.html.
The dataset name is Computer Activity Dataset.
The dataset has 13 attributes and 8192 samples. The samples are of
numerical type.
Does dataset have a target attribute? YES
Did it have missing values and how did you deal with them? NO
Did you perform any data transformation tasks? NO
Did you perform any other data wrangling tasks? NO
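The facts reported above (sample count, attribute count, data types and absence of missing values) can be checked with a few Pandas calls. The sketch below is only an illustration: the summarize helper and the column names are hypothetical, and loading the actual file is omitted because the report does not state the file format used.

```python
import pandas as pd


def summarize(df):
    """Return the basic facts reported in a dataset description."""
    return {
        "n_samples": len(df),
        "n_attributes": df.shape[1],
        "n_missing": int(df.isna().sum().sum()),
        "all_numeric": all(pd.api.types.is_numeric_dtype(t) for t in df.dtypes),
    }


# Stand-in frame with hypothetical column names; replace it with the loaded dataset.
sample = pd.DataFrame({
    "feature_a": [1, 2, 3],
    "feature_b": [0.5, 1.5, 2.5],
    "target": [90, 85, 70],
})
summary = summarize(sample)
print(summary)
```

On the real dataset, the description above would correspond to n_samples of 8192, n_attributes of 13, n_missing of 0 and all_numeric being True.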
DEPSTAR (CE-2) 1
CE473: Machine Learning 18DCE125