Master of Science
By
A.LAHARI
Registration Number: 2139465
submitted to
RAJESH.R
CHRIST UNIVERSITY
Bangalore, Karnataka-560029
Attribute                    Variables
Gender                       Male, Female
Race/Ethnicity               Group A, Group B, Group C, Group D, Group E
Parental level of education  Associate's degree, Bachelor's degree, High school,
                             Master's degree, Some college, Some high school
Lunch                        standard, free/reduced
Test preparation course      None, Completed
Math score                   0-100
Reading score                0-100
Writing score                0-100
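The attribute table above can double as a simple validity check. The sketch below is only an illustration under stated assumptions: the column names (such as "math score") and the three sample rows are hypothetical stand-ins, not values taken from "StudentPerformance.csv". It flags categories that are not listed in the table, missing scores, and scores outside 0-100.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for "StudentPerformance.csv".
# Column names are assumed from the attribute table, not confirmed by the file.
SAMPLE_CSV = """gender,lunch,test preparation course,math score,reading score,writing score
female,standard,none,72,72,74
male,free/reduced,completed,69,,88
female,standard,none,105,95,93
"""

# Allowed categories per the attribute table (compared case-insensitively).
ALLOWED = {
    "gender": {"male", "female"},
    "lunch": {"standard", "free/reduced"},
    "test preparation course": {"none", "completed"},
}
SCORE_COLUMNS = ["math score", "reading score", "writing score"]


def find_problems(df):
    """Return a list of (row index, column, reason) tuples."""
    problems = []
    for col, allowed in ALLOWED.items():
        bad = ~df[col].astype(str).str.lower().isin(allowed)
        for idx in df.index[bad.to_numpy()]:
            problems.append((int(idx), col, "unexpected category"))
    for col in SCORE_COLUMNS:
        missing = df[col].isna()
        for idx in df.index[missing.to_numpy()]:
            problems.append((int(idx), col, "missing"))
        # NaN is excluded here so a missing score is reported only once.
        out_of_range = df[col].notna() & ~df[col].between(0, 100)
        for idx in df.index[out_of_range.to_numpy()]:
            problems.append((int(idx), col, "outside 0-100"))
    return problems


# Only empty cells count as missing, so the category "none" stays a string.
df = pd.read_csv(io.StringIO(SAMPLE_CSV), keep_default_na=False, na_values=[""])
problems = find_problems(df)
for problem in problems:
    print(problem)
```

Passing `keep_default_na=False` is a deliberate choice here: the table lists "None" as a legitimate category, and some pandas versions would otherwise read that string as a missing value.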
The above-mentioned dataset is taken from GitHub, a well-known website for
hosting code and its developer communities. I chose this dataset because it
meets the size and data-quality requirements of my assignment in this subject.
It is also one of the popular datasets on the website, and I believe operations
on it will be smooth and flexible enough to apply all the concepts learnt in
class.
The dataset taken for this assignment can be found by clicking here, and it is
stored under the file name “StudentPerformance.csv”.
Basic operations
1. What are the libraries used on the dataset?
Ans. The libraries required for EDA are imported at the start of the analysis.
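The list of imports itself is not reproduced in the answer above, so the following is only a minimal sketch of a typical EDA setup, assuming pandas and NumPy are the core libraries (plotting libraries such as Matplotlib or seaborn are often added as well). The sample rows and column names are hypothetical stand-ins for "StudentPerformance.csv".

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample rows; with the real file one would instead call
# pd.read_csv("StudentPerformance.csv").
SAMPLE_CSV = """gender,math score,reading score,writing score
female,72,72,74
male,69,90,88
female,90,95,93
male,47,57,44
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# First look at the data: size, leading rows, column types, summary statistics.
print(df.shape)
print(df.head())
print(df.dtypes)
print(df.describe())

# Per-subject mean with pandas, overall mean across all scores with NumPy.
score_cols = ["math score", "reading score", "writing score"]
mean_scores = df[score_cols].mean()
overall_mean = np.mean(df[score_cols].to_numpy())
print(mean_scores)
print(overall_mean)
```

These few calls are usually enough to confirm that the file loaded as expected before moving on to plots or group comparisons.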
Conclusion:
EDA is primarily used to see what data can reveal beyond the formal modeling or
hypothesis testing task, and it provides a better understanding of the dataset's
variables and the relationships between them. It can also help determine whether
the statistical techniques being considered for data analysis are appropriate.
Exploratory Data Analysis is valuable to data science projects because it
increases confidence that future results will be valid and correctly interpreted.
By performing the above operations on my dataset, I can clearly analyse and
visualize the data and detect where it is incorrect or has missing parts. EDA
helps us perform our tasks efficiently, and it is also easy to use.