IT-462 EDA

Lab 5

Instructions:

● The assignment should be submitted as an IPython Notebook as well as
a PDF.
● The submitted files must be named in the format:
“STUDENTID_Labno.ipynb” and
“STUDENTID_Labno.pdf”

Tasks:

● Calculate summary statistics for numerical variables. Compute
measures like mean, median, standard deviation, etc.
● Check for null values and impute them based on the distribution of
the affected feature.
● Explore the distribution of categorical variables like Sex, Age, Higher
using frequency tables or bar plots.
● Analyze the relationship between study time and final grades. Plot
scatter plots or box plots to visualize how grades vary with study time.
● Investigate whether study time differs between genders by plotting
appropriate plots.
● Compute correlations between variables such as study time and final
grades (G1, G2, G3). Use correlation matrices or heatmaps to visualize
the correlation structure.
● Identify variables that are strongly correlated with student
performance and explore potential causal relationships.
● Explore the impact of Parental Status (Pstatus) on student
performance (Grades) by plotting appropriate plots.
● Apply Label Encoding to categorical features.
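
The tasks above can be sketched with pandas roughly as follows. This is only a starting point under stated assumptions, not a reference solution: the handout does not name the dataset file, so the block builds a tiny made-up sample, and the column names (sex, age, higher, Pstatus, studytime, G1, G2, G3) are assumed from the variables mentioned in the tasks. Label encoding is done here with pandas category codes as a stand-in; scikit-learn's LabelEncoder is a common alternative.

```python
import numpy as np
import pandas as pd

# Tiny made-up sample (hypothetical values); replace with the real lab data.
df = pd.DataFrame({
    "sex": ["F", "M", "F", "M", "F", "M", "F", "M"],
    "age": [15, 16, 17, 15, 18, 16, 17, 16],
    "higher": ["yes", "yes", "no", "yes", "yes", None, "yes", "no"],
    "Pstatus": ["T", "A", "T", "T", "A", "T", "T", "A"],
    "studytime": [2, 1, 3, 2, 4, 1, np.nan, 2],
    "G1": [10, 8, 14, 11, 16, 7, 13, 9],
    "G2": [11, 7, 15, 10, 17, 8, 13, 9],
    "G3": [11, 6, 15, 10, 18, 8, 14, 9],
})

# Summary statistics for numerical columns (count, mean, std, quartiles, ...).
summary = df.describe()
medians = df.median(numeric_only=True)

# Null values per column, counted before any imputation.
null_counts = df.isna().sum()

# Impute: median for a numeric feature, most frequent value for a categorical one.
# Which statistic is appropriate depends on the feature's distribution.
df["studytime"] = df["studytime"].fillna(df["studytime"].median())
df["higher"] = df["higher"].fillna(df["higher"].mode()[0])

# Frequency table for a categorical variable.
freq = df["sex"].value_counts()

# Final grade by study time, and study time by gender (the numbers behind box plots).
g3_by_studytime = df.groupby("studytime")["G3"].describe()
studytime_by_sex = df.groupby("sex")["studytime"].describe()

# Correlation matrix for study time and grades.
corr = df[["studytime", "G1", "G2", "G3"]].corr()

# Final grade by parental status.
g3_by_pstatus = df.groupby("Pstatus")["G3"].agg(["mean", "median", "count"])

# Label encoding on a copy: each category is mapped to an integer code.
encoded = df.copy()
for col in ["sex", "higher", "Pstatus"]:
    encoded[col] = encoded[col].astype("category").cat.codes

print(summary)
print(null_counts)
print(corr)
```

For the plotting tasks, pandas can draw directly from the same frame when matplotlib is installed, for example df.boxplot(column="G3", by="studytime") for grades against study time, or freq.plot(kind="bar") for a frequency bar plot.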
