
Computer project MAS291

References:
1. Statistics with Python:
https://www.kaggle.com/code/nayanagal/hands-on-exercises-on-stats-and-probability
2. Statistics with R: Google, the CRAN R website
Exercises on Using R for Statistics and Hypothesis Testing, Dr. Wenjia Wang, School of
Computing Sciences, University of East Anglia (UEA)

Use Excel to perform inferential statistics on the data to obtain useful information.
More specifically, you are required to apply all of the following techniques to your data:
1. Test a hypothesis and construct a confidence interval for the mean of a population.
2. Test a hypothesis and construct a confidence interval for the proportion of a population.
3. Test a hypothesis and construct a confidence interval for the difference in means of
two populations.
4. Test a hypothesis and construct a confidence interval for the difference in proportions
of two populations.
5. Regression analysis
For regression analysis, you are expected to complete the following steps:
5a. Identify two random variables X and Y in your data (for example, X = height, Y = weight).
5b. Construct a scatter plot for the data. Do you observe a linear relationship?
5c. Compute the sample correlation coefficient R.
5d. Find the equation of the estimated regression line, and use it to predict a future
value for Y.
5e. Test the significance of regression.
In this project, you are required to work in a group and present your work to the class. Each
group will look for secondary data on housing, finance, health, … or any topic of your choice,
for example:
1. See if there is a difference in test scores between students who have taken a gap year and
those who have not.
2. Compare the GPAs of male and female students.
3. See if there is a difference in grades between students who take part in extracurricular
activities and those who do not.
4. The impact of sleep deprivation on health.
…
Each group will then use Excel, R, or Python to perform inferential statistics on the data to
obtain useful information.
More specifically, you are required to apply all of the following techniques to your data:
1. Organize, summarize, and visualize the data using descriptive statistics.
2. Test a hypothesis and construct a confidence interval for the mean of a population.
3. Test a hypothesis and construct a confidence interval for the proportion of a
population.
4. Test a hypothesis and construct a confidence interval for the difference in means of 2
populations.
5. Test a hypothesis and construct a confidence interval for the difference in proportions
of 2 populations.
6. Regression analysis:
For Regression analysis, you are expected to complete the following steps:
6.1. Identify 2 random variables X and Y in your data (e.g., X = height, Y = weight).
6.2. Compute the sample correlation coefficient R.
6.3. Find the equation of the estimated regression line, and use it to predict a future
value for Y.
6.4. Test the significance of regression.
7. Comment on your obtained results.
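
As a starting point for techniques 2 to 5, the following Python sketch shows one possible way to compute a test statistic, a two-sided p-value, and a 95% confidence interval for each case. It is only an illustration under stated assumptions: it uses large-sample z procedures (for small samples a t-distribution would normally replace the normal quantile), the function names are hypothetical, and the summary statistics are made up rather than taken from any real dataset.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def z_inference(estimate, null_value, se_test, se_ci, alpha=0.05):
    """Return (z0, two-sided p-value, (1 - alpha) confidence interval)."""
    z0 = (estimate - null_value) / se_test
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z0)))
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    ci = (estimate - z_crit * se_ci, estimate + z_crit * se_ci)
    return z0, p_value, ci


def mean_test(n, x_bar, s, mu0):
    """H0: mu = mu0, large sample, sample sd s used in place of sigma."""
    se = s / sqrt(n)
    return z_inference(x_bar, mu0, se, se)


def proportion_test(x, n, p0):
    """H0: p = p0. The test uses p0 in the standard error, the CI uses p_hat."""
    p_hat = x / n
    se_test = sqrt(p0 * (1 - p0) / n)
    se_ci = sqrt(p_hat * (1 - p_hat) / n)
    return z_inference(p_hat, p0, se_test, se_ci)


def mean_difference_test(n1, x_bar1, s1, n2, x_bar2, s2):
    """H0: mu1 - mu2 = 0 for two independent large samples."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    return z_inference(x_bar1 - x_bar2, 0.0, se, se)


def proportion_difference_test(x1, n1, x2, n2):
    """H0: p1 - p2 = 0. The test pools the proportions, the CI does not."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se_test = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    se_ci = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return z_inference(p1 - p2, 0.0, se_test, se_ci)


# Made-up summary statistics, for illustration only.
results = {
    "mean": mean_test(n=64, x_bar=72.5, s=8.0, mu0=70.0),
    "proportion": proportion_test(x=60, n=100, p0=0.5),
    "difference in means": mean_difference_test(50, 75.0, 10.0, 50, 71.0, 10.0),
    "difference in proportions": proportion_difference_test(45, 100, 30, 100),
}

for name, (z0, p_value, ci) in results.items():
    print(f"{name}: z0 = {z0:.3f}, p-value = {p_value:.4f}, "
          f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

In each case the null hypothesis is rejected at the 5% level when the p-value is below 0.05; the confidence interval is reported alongside so that the size of the effect can be discussed when commenting on the results.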
Sources for collecting secondary data: Internet, Google, https://www.kaggle.com/
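
Once a dataset has been collected, the regression steps 6.1 to 6.4 could be sketched in Python roughly as follows. This is a minimal illustration, not a prescribed method: the height and weight values are made up, the function names are hypothetical, and the p-value for the test of H0: slope = 0 is obtained by numerically integrating the Student t density with n - 2 degrees of freedom so that only the standard library is needed.

```python
from math import gamma, pi, sqrt


def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    # Note: math.gamma can overflow for very large df; fine for small samples.
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_two_sided_p(t0, df, steps=2000):
    """Two-sided p-value P(|T| > |t0|), using Simpson's rule on [0, |t0|]."""
    b = abs(t0)
    if b == 0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # approximates P(0 < T < |t0|)
    return max(0.0, 1 - 2 * area)


def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x, with a t-test on the slope."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / sqrt(sxx * syy)        # sample correlation coefficient
    sse = syy - slope * sxy          # error sum of squares
    sigma2_hat = sse / (n - 2)       # estimate of the error variance
    t0 = slope / sqrt(sigma2_hat / sxx)
    p_value = t_two_sided_p(t0, n - 2)
    return slope, intercept, r, t0, p_value


# Made-up data, for illustration only: X = height (cm), Y = weight (kg).
heights = [150, 155, 160, 165, 170, 175, 180, 185]
weights = [50, 54, 59, 62, 68, 71, 77, 80]

slope, intercept, r, t0, p_value = simple_linear_regression(heights, weights)
predicted = intercept + slope * 172  # predicted weight for a height of 172 cm

print(f"R = {r:.4f}")
print(f"Estimated line: y = {intercept:.3f} + {slope:.4f} x")
print(f"Predicted Y at x = 172: {predicted:.2f}")
print(f"t0 = {t0:.2f}, p-value = {p_value:.6f}")
```

The regression is judged significant at the 5% level when the p-value is below 0.05. With SciPy installed, `scipy.stats.linregress` reports the slope, intercept, correlation coefficient, and the p-value for the same two-sided test, and can be used to cross-check a hand-written version like this one.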
