Skittlesproject
Math 1040-012
Term project-Skittles data
For this project, everyone in the class was asked to buy a 2.17 oz bag of
Skittles and count the number of candies of each color in the bag. The class
data was compiled, and we used the compiled data to complete the different
parts of this statistics assignment.
For the first part of the project we were asked to determine the
proportion of each color within the overall sample gathered by the class. To
do this, we created a Pie Chart and a Pareto Chart representing the numbers
of each color of candy. We compared the class data to our own personal data
and noted any similarities or differences.
For the next portion of the project we used the Skittles data to
calculate the mean, standard deviation, and 5-number summary. We then
used these results to make a frequency histogram and a box plot.
The last part of the project involved confidence intervals and
hypothesis tests. We found three different confidence intervals, one each for
the population proportion, mean, and standard deviation, and wrote an
analysis of what each confidence interval meant.
Color     Number   Proportion
Red          295        0.207
Orange       291        0.204
Yellow       282        0.198
Green        294        0.206
Purple       265        0.186
Total       1427        1
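As a quick check of the class proportions, the calculation can be redone in a few lines of Python. This is only a sketch: the counts are the class counts listed above, and the variable names are ours, not part of the assignment.

```python
# Class counts by color, copied from the class data above.
counts = {"Red": 295, "Orange": 291, "Yellow": 282, "Green": 294, "Purple": 265}

# Total number of candies in the class sample.
total = sum(counts.values())

# Proportion of each color, rounded to three decimals as in the report.
proportions = {color: round(n / total, 3) for color, n in counts.items()}

for color, n in counts.items():
    print(f"{color:<7} {n:>5} {proportions[color]:.3f}")
print(f"{'Total':<7} {total:>5}")
```

Because each proportion is rounded separately, the rounded values may add up to slightly more or less than 1.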
Color     Number   Proportion     Percentage
Green         19   .3015873016    30.16%
Red           11   .1746031746    17.46%
Orange        11   .1746031746    17.46%
Yellow        12   .1904761905    19.05%
Purple        10   .1587301587    15.87%
These graphs do represent what I expected to see. I thought that each color
of candy would be equally represented in each bag, and the sample data
seems to suggest that is the case.
With my sample data, all colors were approximately equally represented, with
the exception of green candies. My bag contained noticeably more green
candies than any other color. The green candies made up .301587, or 30.16%, of the bag.
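The comparison between my bag and the class data can also be sketched in Python. The counts below are the ones reported above; the helper name `proportions` and the other variable names are our own choices for illustration.

```python
# Counts reported above: the whole class sample and my own bag.
class_counts = {"Red": 295, "Orange": 291, "Yellow": 282, "Green": 294, "Purple": 265}
my_counts = {"Red": 11, "Orange": 11, "Yellow": 12, "Green": 19, "Purple": 10}


def proportions(counts):
    """Return each color's share of the total count."""
    total = sum(counts.values())
    return {color: n / total for color, n in counts.items()}


class_p = proportions(class_counts)
my_p = proportions(my_counts)

# Difference between my bag and the class sample for each color.
diffs = {color: my_p[color] - class_p[color] for color in my_counts}

# Color whose share in my bag differs most from the class share.
largest = max(diffs, key=lambda color: abs(diffs[color]))

for color, d in diffs.items():
    print(f"{color:<7} mine={my_p[color]:.4f} class={class_p[color]:.4f} diff={d:+.4f}")
print("Largest difference:", largest)
```

With these counts, green is the only color whose share in my bag is far from its share in the class sample.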
Using the total number of candies in each bag in the class sample, we were
asked to calculate the mean, standard deviation, and 5-number summary.
Those results are as follows:
Mean: 59.5
Sample standard deviation: 1.98
Five number summary: Min=55, Q1=58, Median=60, Q3=61, Max=63
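The same kind of summary can be computed with Python's standard `statistics` module. The individual bag totals are not listed in this report, so the list below is hypothetical placeholder data, not the class data. The quartile method is also an assumption: textbooks and calculators use slightly different quartile conventions, so results can differ a little from hand calculations.

```python
import statistics

# HYPOTHETICAL bag totals for illustration only (not the class data).
bag_totals = [56, 58, 59, 59, 60, 61, 61, 62, 64]

mean = statistics.mean(bag_totals)
sample_sd = statistics.stdev(bag_totals)  # sample standard deviation (n - 1)

# Quartiles using the "inclusive" method; other conventions exist.
q1, median, q3 = statistics.quantiles(bag_totals, n=4, method="inclusive")

five_number_summary = (min(bag_totals), q1, median, q3, max(bag_totals))

print("Mean:", mean)
print("Sample standard deviation:", round(sample_sd, 2))
print("Five-number summary:", five_number_summary)
```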
(The histogram was drawn by hand on the printout because I could not figure
out how to make it in Excel.)
You would use histograms, boxplots, stem and leaf plots, and
scatterplots for quantitative data. For categorical data you should use a
pie chart or a Pareto chart.
When it comes to calculations, mean and median only make sense for
quantitative data. The mean is the average of the values in a sample, so it
only makes sense when applied to quantitative data. The median represents
the middle value of the ordered data and, once again, only makes sense when
applied to quantitative data. The best measure of central tendency to apply
to categorical data is the mode. When looking at the colors of candy in a
Skittles bag, you may not be able to find the average color or the median
color, but you can establish which color occurs most often.
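As a small illustration, the mode of the class color data can be found directly from the counts. This sketch uses the class counts reported earlier; the variable names are ours.

```python
# Class counts by color, as reported earlier in the project.
counts = {"Red": 295, "Orange": 291, "Yellow": 282, "Green": 294, "Purple": 265}

# The mode of categorical data is the category with the highest count.
mode_color = max(counts, key=counts.get)

print("Most common color:", mode_color, "with", counts[mode_color], "candies")
```

No mean or median color is computed here because the colors have no numeric value or natural order.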
99% Confidence Interval estimate for the population proportion of yellow candies
x = 282
n = 1427
z-value for 99% CI = 2.575
p̂ = 282/1427 = 0.198
99% Confidence Interval Estimate: (0.171, 0.225)
Confidence intervals for a population proportion are used to estimate, with
the specified degree of confidence, the proportion of a characteristic found
within a population. In relation to the Skittles, we are 99% confident that
the true proportion of yellow candies among all Skittles falls between
0.171 and 0.225.
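The interval for the proportion of yellow candies can be reproduced with a short Python sketch. It assumes the usual normal-approximation interval, p̂ ± z·sqrt(p̂(1 − p̂)/n), and gets the critical value from the standard library instead of a table; the variable names are ours.

```python
from math import sqrt
from statistics import NormalDist

x = 282             # yellow candies in the class sample
n = 1427            # total candies in the class sample
confidence = 0.99

p_hat = x / n

# Two-tailed critical value for the chosen confidence level (about 2.576).
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error for a proportion (normal approximation).
margin = z * sqrt(p_hat * (1 - p_hat) / n)

lower, upper = p_hat - margin, p_hat + margin
print(f"p_hat = {p_hat:.4f}, margin = {margin:.4f}")
print(f"99% CI: ({lower:.3f}, {upper:.3f})")
```

Because this sketch does not round p̂ to 0.198 before computing the interval, the lower limit may differ from the hand calculation in the third decimal place.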
95% Confidence Interval estimate for the population mean number
of skittles per bag
n = 24
s = 1.978
Sample mean = 59.458
t-value for 95% CI with 23 degrees of freedom = 2.069
Margin of error: E = 2.069(1.978/√24) = 2.069(0.404) = 0.835
59.458 + 0.835 = 60.293
59.458 - 0.835 = 58.623
95% Confidence Interval Estimate: (58.623, 60.293)
Confidence interval estimates of the population mean use sample data to
construct an interval that, with the specified degree of confidence, contains
the mean of the population. In this case, we are 95% confident that the
population mean number of Skittles per bag is between 58.623 and 60.293.
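The arithmetic for the mean interval can be checked with the sketch below. Python's standard library has no t distribution, so the critical value 2.069 is entered by hand from the calculation above (a t-table value for 23 degrees of freedom); the variable names are ours.

```python
from math import sqrt

n = 24            # number of bags in the class sample
s = 1.978         # sample standard deviation
x_bar = 59.458    # sample mean
t_crit = 2.069    # t-table value for 95% confidence, 23 degrees of freedom

# Standard error of the mean and margin of error.
standard_error = s / sqrt(n)
margin = t_crit * standard_error

lower, upper = x_bar - margin, x_bar + margin
print(f"standard error = {standard_error:.3f}, margin = {margin:.3f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```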
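Claim: p = 0.20 (20% of all Skittles are red)
Null (H0): p = 0.20
Alternative (H1): p ≠ 0.20
Test statistic:
\[ z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} = \frac{0.2067274 - 0.20}{\sqrt{0.20(0.80)/1427}} = 0.6353298386 \approx 0.64 \]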
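P-value = 2(1 - 0.7389) = 0.5222 (0.7389 is the area to the left of z = 0.64, and the test is two-tailed)
Critical values = ±1.96 (0.05 significance level)
Fail to reject H0. The P-value is greater than 0.05. There is not sufficient evidence to
warrant rejection of the claim that p = 0.20.
This hypothesis test tells us that, at the 0.05 significance level, the sample data are
consistent with the claim that 20% of all Skittles are red. Failing to reject H0 does not
prove that the claim is true.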
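The test of the red proportion can be sketched in Python as a two-tailed one-proportion z test. The significance level of 0.05 is our assumption, inferred from the critical value of 1.96; the count of 295 red candies comes from the class data, and the variable names are ours.

```python
from math import sqrt
from statistics import NormalDist

x = 295        # red candies in the class sample
n = 1427       # total candies in the class sample
p0 = 0.20      # claimed proportion of red candies
alpha = 0.05   # ASSUMED significance level (matches critical value 1.96)

p_hat = x / n

# One-proportion z test statistic.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Two-tailed P-value: area in both tails beyond |z|.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject_h0 = p_value < alpha
print(f"z = {z:.4f}, P-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

The P-value here is computed from the unrounded z, so it can differ slightly from a value read off a table at z = 0.64.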
Use a 0.01 significance level to test the claim that the mean number of candies in
a bag of Skittles is 55.
Claim: μ = 55
Null (H0): μ = 55
Alternative (H1): μ ≠ 55
Test statistic:
\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{59.458333 - 55}{1.977683464/\sqrt{24}} = \frac{4.458333}{0.4036929466} \approx 11.04 \]
P-value < 0.0001
Critical values = ±2.807 (t distribution, 23 degrees of freedom, 0.01 significance level, two-tailed)
Reject H0. There is sufficient evidence to warrant rejection of the claim that the mean
number of candies in each bag equals 55.
This hypothesis test tells us that, at the 0.01 significance level, the sample data give
strong evidence that the mean number of Skittles per bag is not 55.
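The test statistic for the mean can be recomputed with the sketch below. Since the standard library has no t distribution, the critical value 2.807 is entered by hand; it is the standard t-table value for 23 degrees of freedom at the 0.01 level (two-tailed) and is our addition, as are the variable names.

```python
from math import sqrt

n = 24                 # number of bags in the class sample
x_bar = 59.458333      # sample mean number of candies per bag
s = 1.977683464        # sample standard deviation
mu0 = 55               # claimed population mean
t_crit = 2.807         # t-table value: df = 23, alpha = 0.01, two-tailed

# One-sample t test statistic.
t = (x_bar - mu0) / (s / sqrt(n))

# Reject H0 when the statistic falls beyond either critical value.
reject_h0 = abs(t) > t_crit
print(f"t = {t:.2f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

The statistic is far beyond the critical value, so the decision does not depend on small rounding differences.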