
Yony Zuniga Fall 2013

Math 1040 Term Project

The following data were compiled from a list of 1340 baseball players. The players are grouped by primary position played, and a pie chart shows the proportion of players at each position.

Position     D     1     3     2     S     C     O   Grand Total
Players      8   139   145   148   154   254   492          1340

[Pie chart: proportion of players at each primary position (O, C, S, 2, 3, 1, D)]

The next data set was collected from the above population using random sampling.

Position     1     2     3     C     O     S   Grand Total
Players      1     2     5     6    15     5            34

[Pie chart: proportion of players at each primary position in the random sample (O, C, S, 3, 2, 1)]

[Bar chart: Total Count by Primary Position Played (1, 2, 3, C, O, S)]
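One way to compare the sample with the population is to convert the counts in the two frequency tables above into proportions. The sketch below does only that; the dictionary and function names are illustrative and not part of the original project.

```python
# Counts copied from the population table and the random-sample table.
population = {"D": 8, "1": 139, "3": 145, "2": 148, "S": 154, "C": 254, "O": 492}
random_sample = {"1": 1, "2": 2, "3": 5, "C": 6, "O": 15, "S": 5}


def proportions(counts):
    """Convert a frequency table into proportions that sum to 1."""
    total = sum(counts.values())
    return {position: count / total for position, count in counts.items()}


pop_share = proportions(population)
sample_share = proportions(random_sample)

# List positions from most to least common in the population.
for position in sorted(population, key=population.get, reverse=True):
    print(f"{position}: population {pop_share[position]:.3f}, "
          f"sample {sample_share.get(position, 0.0):.3f}")
```

Positions missing from the sample (here D) are reported with a sample proportion of 0.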

This sample was chosen using simple random sampling. First, a random number was generated for each player. Next, the numbers were ranked from smallest to largest. Finally, the first 34 players were chosen. We created a frequency table to show how many players in our sample of 34 play each position. Comparing the sample with the population shows that they are very similar: in both the population statistics and our sample statistics, outfielders have the most players, followed by catchers and then shortstops (tied with position 3 in the sample). For this particular statistic, our random sample is a good representation of the population.

The following data was collected from the above population using convenience sampling.

Position     2     3     C     D     O     S   Grand Total
Players      2     3     7     1    16     4            33

[Pie chart: proportion of players at each primary position in the convenience sample (O, C, S, 3, 2, D)]

[Bar chart: Total Count by Primary Position Played (2, 3, C, D, O, S)]

This sample was chosen using convenience sampling. First, the data was arranged alphabetically by the first name of each player. Then the first 33 players were chosen. From our sample of 33 players, we created a frequency table to show how many players play each position.
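The two selection procedures described so far can be sketched in a few lines. This is a minimal illustration, not the tool used for the project: the roster below is made up because the real list of 1340 players is not reproduced here, and the field names `first_name` and `position` are assumptions.

```python
import random
from collections import Counter


def simple_random_sample(players, n, seed=None):
    """Attach a random number to each player, rank by it, keep the first n."""
    rng = random.Random(seed)
    keyed = [(rng.random(), player) for player in players]
    keyed.sort(key=lambda pair: pair[0])  # rank from smallest to largest
    return [player for _, player in keyed[:n]]


def convenience_sample(players, n):
    """Sort alphabetically by first name and keep the first n players."""
    return sorted(players, key=lambda player: player["first_name"])[:n]


def position_counts(sample):
    """Build a frequency table of primary positions."""
    return Counter(player["position"] for player in sample)


# Hypothetical roster standing in for the real player list.
roster = [
    {"first_name": f"Player{i:04d}", "position": "OCS123D"[i % 7]}
    for i in range(1340)
]

print(position_counts(simple_random_sample(roster, 34, seed=1)))
print(position_counts(convenience_sample(roster, 33)))  # 33 matches the table total
```

Passing a `seed` makes the random sample repeatable, which helps when the frequency table has to be regenerated later.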

The sample in this case is also very similar to the population. In both the population data and our sample data, outfielders have the most players, followed by catchers and then shortstops. For this particular statistic, I think our convenience sample is a good representation of the population.

The population of baseball players referenced above was found to have a mean on base percentage of 0.336 with a standard deviation of 0.034. A random sample of 30 players was then generated from the population of 1340 baseball players. The sample has a mean on base percentage of 0.344 with a standard deviation of 0.032. The boxplot and frequency histogram below show the distribution of the sample values.

[Box plot: Random Sample, On Base Percentage]

[Frequency histogram: Random Sample, On Base Percentage. Vertical axis: Frequency. Horizontal axis: On Base Percentage, in bins 0.250 - 0.279, 0.280 - 0.309, 0.310 - 0.339, 0.340 - 0.369, 0.370 - 0.399, 0.400 - 0.429]

Another sample of 30 players was chosen from the population of 1340 baseball players, this time using a systematic sampling method. The sample has a mean on base percentage of 0.344 with a standard deviation of 0.030. The boxplot and frequency histogram below show the distribution of the sample values.

[Box plot: Systematic Sample, On Base Percentage]

[Frequency histogram: Systematic Sample, On Base Percentage. Vertical axis: Frequency. Horizontal axis: On Base Percentage, in bins 0.250 - 0.279, 0.280 - 0.309, 0.310 - 0.339, 0.340 - 0.369, 0.370 - 0.399, 0.400 - 0.429]
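The report does not say exactly how the systematic sample was drawn. A common approach, assumed here, is to pick a random starting point and then take every k-th player, where k is the population size divided by the sample size. The function name and the stand-in list of indices are illustrative only.

```python
import random


def systematic_sample(players, n, seed=None):
    """Pick a random start, then take every k-th player until n are chosen."""
    k = len(players) // n            # sampling interval; 1340 // 30 == 44
    rng = random.Random(seed)
    start = rng.randrange(k)         # random start within the first interval
    return [players[start + i * k] for i in range(n)]


# Stand-in for the real list: players are represented by their row index.
indices = list(range(1340))
chosen = systematic_sample(indices, 30, seed=1)
print(chosen[:5])
```

Because the start is below k and only n steps are taken, the last index chosen never runs past the end of the list.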
Both of the sampling methods used above generated a mean and standard deviation of on base percentage similar to the population mean and standard deviation. Based on the box plots and frequency histograms, the on base percentages appear to be approximately normally distributed. The box plots for both sampling methods have a similar shape. Both of the frequency histograms start low, rise to a clear high point, and then fall again. It appears that the random sample and systematic sample both yielded results similar to what we would expect to see in the population.

1. On page two a random sample of 34 baseball players was generated. 15 of those players listed Outfielder as their primary position played. The resulting 95% confidence interval estimate of the proportion of Outfielders in the population of baseball players is 0.280 < p < 0.602.

2. On page three a convenience sample of 33 baseball players was generated. 16 of those players listed Outfielder as their primary position played. The resulting 95% confidence interval estimate of the proportion of Outfielders in the population of baseball players is 0.314 < p < 0.655.

3. On page four a random sample of 30 baseball players was generated. This sample of players showed a mean on base percentage of 0.344 with a standard deviation of 0.032. The resulting 95% confidence interval estimate of the mean on base percentage of the population of baseball players is 0.332 < μ < 0.356.

4. On page five a systematic sample of 30 baseball players was generated. This sample of players showed a mean on base percentage of 0.344 with a standard deviation of 0.030. The resulting 95% confidence interval estimate of the mean on base percentage of the population of baseball players is 0.333 < μ < 0.355.
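The four intervals can be checked with the usual textbook formulas. The sketch below assumes the normal-approximation interval for a proportion and the t interval with 29 degrees of freedom for a mean; the report does not state which formulas were used. Under these assumptions the code reproduces the convenience-sample interval and both mean intervals to three decimals. For the random sample of positions it gives roughly 0.274 < p < 0.608, slightly wider than the reported interval, so that one may have been computed or rounded differently.

```python
from math import sqrt
from statistics import NormalDist

Z_95 = NormalDist().inv_cdf(0.975)   # two-sided 95% critical value, about 1.96
T_95_DF29 = 2.045                    # t critical value for 29 df, taken from a t table


def proportion_interval(successes, n, z=Z_95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def mean_interval(mean, sd, n, t=T_95_DF29):
    """t confidence interval for a population mean from summary statistics."""
    margin = t * sd / sqrt(n)
    return mean - margin, mean + margin


intervals = {
    "random sample, Outfielders": proportion_interval(15, 34),
    "convenience sample, Outfielders": proportion_interval(16, 33),
    "random sample, mean OBP": mean_interval(0.344, 0.032, 30),
    "systematic sample, mean OBP": mean_interval(0.344, 0.030, 30),
}
for label, (low, high) in intervals.items():
    print(f"{label}: {low:.3f} to {high:.3f}")
```

The default `T_95_DF29` only applies to samples of 30; a different sample size needs a different critical value.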

These confidence intervals indicate that we are 95% confident that the population proportion (1 and 2) and the population mean (3 and 4) fall within the above noted ranges. In all four cases the actual population parameter (492/1340 ≈ 0.367 for the proportion of Outfielders and 0.336 for the mean on base percentage) falls within the confidence interval.

Part V

Given that the population mean was 0.336 and the population standard deviation was 0.034, we took a sample of 30 players and calculated a sample mean of 0.344 with a sample standard deviation of 0.032. We tested the claim that the population mean is not equal to 0.336, so the null hypothesis is that the population mean equals 0.336. Calculating with the known population standard deviation, we got a test statistic of z = 1.288. Since it was a two-tailed test at the 0.05 significance level, the critical values are z = ±1.96. Our test statistic of 1.288 falls between the critical values, so we fail to reject the null hypothesis: there is not enough evidence to support the claim that the population mean on base percentage differs from 0.336.

Part VI

This project has helped us understand the concepts we have learned from the beginning of the semester until now. These concepts apply in many situations because they help us understand the statistics we meet in our daily lives. We have developed statistical reasoning from the skills learned in this class and will apply it to any situation in life that calls for statistics.
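As a check on the Part V hypothesis test, the test statistic and critical value can be recomputed from the summary numbers. This is a minimal sketch of a two-tailed z test with a known population standard deviation; the function name and the extra p-value are additions for illustration and do not appear in the original project.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed z test for a mean when the population SD is known."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    critical = NormalDist().inv_cdf(1 - alpha / 2)    # about 1.96 for alpha = 0.05
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, critical, p_value


z, critical, p_value = z_test_mean(0.344, 0.336, 0.034, 30)
reject = abs(z) > critical

print(f"z = {z:.3f}, critical values = +/-{critical:.2f}, p-value = {p_value:.3f}")
print("reject H0" if reject else "fail to reject H0")
```

The statistic comes out at about 1.29, inside the critical values, which agrees with the decision to fail to reject the null hypothesis.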