MATH141 – Project
[Your Name]
1 589055811.docx
Introduction
This project tests whether there is a significant difference in salaries between
American League and National League players. To achieve this objective, the project
uses the Baseball-2005 dataset, which includes summary measures for all teams from major
Five number summary and descriptive statistics (mean, median, std deviation) for all
major league teams.
Column        Mean   Std. dev.  Median  Min   Max    Q1    Q3
Salary - mil  73.06  34.233     66.2    29.7  208.3  48.6  87.8
Part II – Probabilities
Calculate the sample mean and std deviation of salary-mil for American League.
American League: n = 14, mean = 75.478571, std. dev. = 45.929548
<a> calculate the probability that an American League team randomly selected
[0.4178858]
<b> calculate the probability that an American League team randomly selected
[ 0.40976698 ]
Calculate the sample mean and std deviation of salary-mil for National League.
<a> calculate the probability that a National League team randomly selected
[0.33062769]
<b> calculate the probability that an National League team randomly selected
would have a salary-mil less than $60M
[ 0.29822723 ]
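The probability in part <b> can be checked from summary statistics under a normal model. The sketch below assumes that the National League salaries are approximately normal with mean 70.94375 and standard deviation 20.667848 (the n = 16 group in the Z-interval output later in this report). The variable names are illustrative and not part of the original analysis.

```python
from statistics import NormalDist

# Assumed National League summary statistics (the n = 16 group in the report's output).
nl_mean = 70.94375
nl_sd = 20.667848

# Normal model for a randomly selected team's salary, in $ millions.
salary = NormalDist(mu=nl_mean, sigma=nl_sd)

# P(salary < 60): area under the normal curve to the left of 60.
p_below_60 = salary.cdf(60)
print(round(p_below_60, 4))
```

Under these assumptions the result agrees with the reported value to about four decimal places.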
Part III – Confidence Intervals
Calculate a 95% confidence interval for National League salary-mil
[60.816691, 81.070809]
Output
One sample Z summary confidence interval:
μ : Mean of population
Standard deviation = 20.667848
μ: n = 16, sample mean = 70.94375, std. err. = 5.166962, L. limit = 60.816691, U. limit = 81.070809
Calculate a 95% confidence interval for American League salary-mil
[51.419645, 99.537497]
Output
One sample Z summary confidence interval:
μ : Mean of population
Standard deviation = 45.929548
μ: n = 14, sample mean = 75.478571, std. err. = 12.275188, L. limit = 51.419645, U. limit = 99.537497
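Both Z intervals follow from the summary statistics alone: mean ± z × (standard deviation / √n). The sketch below is a minimal illustration of that arithmetic, using the n, mean, and standard deviation values printed in the two outputs. The n = 14 figures match the American League summary in Part II. The function name z_interval is hypothetical and does not come from the software that produced the output.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(mean, sd, n, level=0.95):
    """Return (lower, upper) for a one-sample Z interval from summary statistics."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    margin = z * sd / sqrt(n)                  # z times the standard error
    return mean - margin, mean + margin

# Summary statistics copied from the two outputs in this report.
ci_n16 = z_interval(70.94375, 20.667848, 16)
ci_n14 = z_interval(75.478571, 45.929548, 14)
print(ci_n16)
print(ci_n14)
```

The wider interval for the n = 14 group reflects its much larger standard deviation.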
Part IV – Hypothesis Tests
One Sample Tests
Collect a sample of 6 teams; 3 American League and 3 National League
Team Sample(Salary -mil)
New York Yankees 208.3
Arizona 62.3
Baltimore 73.9
Pittsburgh 38.1
Washington 48.6
Detroit 61.9
Calculate the mean and std deviation of this sample of 6 teams
[ Mean=82.18333 and std. deviation=63.01106 ]
Column n Mean Variance Std. dev.
Sample(Salary -mil) 6 82.183333 3970.3937 63.01106
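The sample statistics above can be reproduced directly from the six salaries in the table. This is a small sketch using Python's standard library, whose variance and stdev functions use the n - 1 (sample) denominator. The variable names are illustrative.

```python
from statistics import mean, stdev, variance

# Salaries ($ millions) of the six sampled teams, as listed in the table.
sample = [208.3, 62.3, 73.9, 38.1, 48.6, 61.9]

sample_mean = mean(sample)
sample_var = variance(sample)   # sample variance (divides by n - 1)
sample_sd = stdev(sample)       # square root of the sample variance

print(round(sample_mean, 5), round(sample_var, 4), round(sample_sd, 5))
```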
Conduct one sample hypothesis test comparing sample mean versus the population
than 0.05, and conclude that there is insufficient evidence to reject the claim that the
Output
One sample T hypothesis test:
μ : Mean of variable
H0 : μ = 73.06
HA : μ ≠ 73.06
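The test statistic for these hypotheses is t = (sample mean - 73.06) / (s / √n) with n - 1 = 5 degrees of freedom. The sketch below computes it from the sample statistics reported above. It compares the statistic with the two-sided 5% critical value for 5 degrees of freedom (about 2.571, taken from a standard t table) rather than computing a p-value, which is an assumption made to keep the example within the standard library.

```python
from math import sqrt

n = 6
sample_mean = 82.183333   # from the six-team sample
sample_sd = 63.01106
mu0 = 73.06               # hypothesized population mean under H0

std_err = sample_sd / sqrt(n)
t_stat = (sample_mean - mu0) / std_err

# Two-sided 5% critical value for df = 5, taken from a standard t table.
t_crit = 2.571
reject_h0 = abs(t_stat) > t_crit

print(round(t_stat, 4), reject_h0)
```

Because the statistic is far below the critical value, H0 is not rejected at the 5% level.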
Conduct two sample hypothesis test comparing sample means from the American
value = 0.2989 is greater than 0.05, and conclude that there is no significant difference in
Output
Two sample T hypothesis test:
Prepare a 95% confidence interval for the difference in the two league means:
meanAL - meanNL
[ -130.997, 261.064]
Output
Two sample T confidence interval:
μ1 : Mean of salary AL
μ2 : Mean of salary NL
μ1 - μ2 : Difference between two means
(without pooled variances)
95% confidence interval results:
Difference  Sample Diff.  Std. Err.  DF         L. Limit    U. Limit
μ1 - μ2     65.033333     47.448159  2.0891154  -130.99721  261.06388
Because the interval contains zero, the difference is not statistically significant at the
5% level. The sample provides no evidence of a difference in mean salaries between the
National League and the American League.
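The sample difference, standard error, and degrees of freedom in the output can be reproduced with the unpooled (Welch) formulas. The sketch below assumes the league split of the six sampled teams is New York Yankees, Baltimore, and Detroit in the American League, and Arizona, Pittsburgh, and Washington in the National League. The report does not state this split, but it reproduces the reported difference. The interval limits also need the t critical value for about 2.09 degrees of freedom, which this sketch leaves to statistical software.

```python
from math import sqrt
from statistics import mean, variance

# Assumed league split of the six sampled teams (salaries in $ millions).
al = [208.3, 73.9, 61.9]   # New York Yankees, Baltimore, Detroit
nl = [62.3, 38.1, 48.6]    # Arizona, Pittsburgh, Washington

diff = mean(al) - mean(nl)

# Per-group variance of the mean: s^2 / n.
va = variance(al) / len(al)
vb = variance(nl) / len(nl)

# Unpooled standard error and Welch-Satterthwaite degrees of freedom.
std_err = sqrt(va + vb)
df = (va + vb) ** 2 / (va ** 2 / (len(al) - 1) + vb ** 2 / (len(nl) - 1))

print(round(diff, 4), round(std_err, 4), round(df, 4))
```

With only three teams per group and one very large salary, the standard error is large relative to the difference, which is why the interval is so wide.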
Conclusion
In summary, this project found no significant difference in team salaries between the
National and American Leagues. The histogram indicated that team salaries are skewed
to the right. Comparing the mean of a random sample of six teams with the population
mean showed no significant difference between the two. Comparing the mean salaries of
the 3 American League teams and the 3 National League teams showed no significant
difference between the two random samples. The confidence interval confirmed that
there was no significant difference in mean salary between National League and
American League teams in the random sample of 6 teams. From the analysis I learned
that a Z-test is used when dealing with population statistics and t-tests when dealing with