Math Final Project

Riley Meyers
MATH 1040
5/14/2020
Skittles Statistical analysis
Link to data
https://docs.google.com/spreadsheets/d/1A0mfYxA_V-OutOE_T3ASpst0P_Nt9hqNZ_OXbwkwaS8/edit#gid=0
In this project, the 2020 class of Math 1040 bought bags of Skittles at random and counted
the number of Skittles of each color in each bag. We can treat the 31 bags as independent as
long as there are more than 620 bags of Skittles (31/.05) to choose from, so that our sample is
no more than 5% of the population. Because our sample size is greater than 30 (n = 31), we can
also assume that the sampling distribution of the sample mean is approximately normal. We will
go through a statistical analysis of our Skittles data, including a confidence interval for the
proportion of yellow Skittles and a confidence interval for the mean number of Skittles in a
bag. Of our two confidence intervals, one is built from a single data point (my own bag) and one
from the whole class's data points. These are two very different processes, and both can be used
to draw conclusions from our data.
Confidence interval estimates:
The purpose of a confidence interval is to find a range of values that we are confident, at a
chosen level, contains the true value of a population parameter. For example, we can say that we
are 90% confident that the average man consumes between 1500 and 2000 calories every day
(hypothetical). As the confidence level increases, so does the width of our interval. For
example, we might be 99% confident that the average man consumes between 100 and 4000 calories
every day. While that is a true statement, it does not provide much information about the
population.
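To illustrate why a higher confidence level gives a wider interval, here is a minimal Python sketch that looks up the two-sided critical value z* for a few confidence levels. The helper name z_critical is made up for this example; it uses only the standard library's statistics.NormalDist.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z* of the standard normal for a confidence level."""
    tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)


# The critical value, and so the margin of error, grows with the confidence level.
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {z_critical(level):.3f}")
```

Since the margin of error is z* times the standard error, a larger z* stretches the interval even though the data have not changed.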
Confidence interval for yellow skittles in bag-
We are trying to estimate the true proportion of yellow Skittles in a bag. My best guess is the
sample proportion, Phat = .145. However, due to sampling variability, this estimate is unlikely
to be exactly correct, so we will create a z interval that we are 99% confident contains the true
proportion. I will only be using my own bag to make this interval, so we must be cautious about
assuming that it holds for all bags of Skittles.
PHAT from my data = 9 yellow Skittles in a bag of 62 Skittles: 9/62 = .145
Phat +/- Z * sqrt((Phat*(1-Phat))/n) = .145 +/- 2.575 * sqrt((.145*.855)/62): Lower bound .0299, Upper bound .2604
Thus I am 99% confident that the interval from .0299 to .2604 captures the true proportion of
yellow Skittles in a bag.
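The interval above can be checked with a short Python sketch. The function name proportion_z_interval is hypothetical, and the critical value is taken from statistics.NormalDist rather than a table, so it is about 2.576 instead of the rounded 2.575.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_interval(successes, n, confidence):
    """One-proportion z interval: p_hat +/- z* * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# 9 yellow Skittles out of 62 in one bag, 99% confidence (values from the text)
lower, upper = proportion_z_interval(9, 62, 0.99)
print(f"99% CI for the proportion of yellow: ({lower:.4f}, {upper:.4f})")
```

With the unrounded Phat of 9/62, the bounds agree with the .0299 and .2604 reported above to four decimal places.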
Confidence interval for how many skittles in a bag:
We are trying to estimate the mean number of Skittles in a bag. My best guess is the mean of the
class data, 60.55. However, due to sampling variability, this estimate is unlikely to be exactly
correct, so we will create a t interval that we are 95% confident contains the true mean. I will
be using the class data for this interval, so we can rely on the conditions stated in the intro
above.
Mean of the data (x): 60.55
Standard deviation (s): 3.812
Sample size (n): 31
X +/- T (s/sqrt(n)) = 60.55 +/- 2.042 (3.812/sqrt(31)) = Lower 59.1506 Upper 61.9462
(T = 2.042 is the critical t value for 95% confidence with n - 1 = 30 degrees of freedom.)
Thus we are 95% confident that the true mean number of Skittles in a bag is between 59.1506 and
61.9462.
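As a rough check of the t interval, the sketch below recomputes the bounds from the rounded summary statistics. The critical value 2.042 is hard-coded from a standard t table (95% two-sided, 30 degrees of freedom) because the Python standard library has no t distribution; the function and constant names are made up for this example.

```python
from math import sqrt

# Assumption: t-table value for 95% two-sided confidence with df = 30
T_STAR_95_DF30 = 2.042


def t_interval(mean, sd, n, t_star):
    """One-sample t interval: mean +/- t* * sd / sqrt(n)."""
    margin = t_star * sd / sqrt(n)
    return mean - margin, mean + margin


# Class summary statistics from the text: mean 60.55, s 3.812, n 31
lower, upper = t_interval(60.55, 3.812, 31, T_STAR_95_DF30)
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```

Because the inputs are rounded, the result matches the reported bounds only to about two decimal places.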
Hypothesis testing
A hypothesis test is used to decide whether sample data provide strong enough evidence to
reject a claim about a population (the null hypothesis).
Hypothesis testing for proportion of red skittles in a bag:
The claim that we will be testing is that 20% of the Skittles in a bag are red. I will only be
using my own bag for this test, so we must be cautious about assuming that the conclusion holds
for all bags of Skittles. Using my Phat of 13/62 = .21, we will be testing whether the true
proportion is higher than .2.
H null: P=.2 H alt: P>.2

Using 1-PropZTest, z = 0.19 and our P-value is .4245 (the area to the right of z; the area to
the left is .5755).
Since the P-value is greater than our level of significance (A=.05), we do not reject the null
hypothesis because there is not significant evidence to support the alternative hypothesis.
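The one-proportion z test can be sketched in Python as follows. The function name one_prop_z_test_greater is hypothetical, and this is not the calculator's internal routine, just the textbook formula with the standard error computed under the null hypothesis. For the right-tailed alternative P > .2, the P-value is the area to the right of z; the area to the left is 1 minus that value.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_test_greater(successes, n, p0):
    """Right-tailed one-proportion z test of H0: p = p0 against Ha: p > p0."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # standard error uses p0
    p_value = 1 - NormalDist().cdf(z)            # area to the right of z
    return z, p_value


# 13 red Skittles out of 62, null proportion 0.20, alpha = 0.05 (from the text)
z, p_value = one_prop_z_test_greater(13, 62, 0.20)
alpha = 0.05
print(f"z = {z:.3f}, P-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Either tail area is far above .05 here, so the decision not to reject the null hypothesis does not depend on which one is read off.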
Hypothesis testing for the number of skittles in a bag:
We will be using the class data to test whether the mean number of Skittles in a bag is 55.
With the conditions stated in the introduction met, we can rely on a t test.
H null: Mean=55 H Alt: Mean does not = 55
T value: 8.1063565
P value: <.000001
Since the P-value is much smaller than our level of significance (a=.01), we reject the null
hypothesis because there is strong evidence that the mean number of Skittles in a bag does not
equal 55.
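The t statistic can be reproduced from the summary statistics with the sketch below. Since the Python standard library has no t distribution, it does not compute the P-value. Instead it compares |t| with the two-sided critical value 2.750 for a = .01 and 30 degrees of freedom, which is hard-coded from a standard t table. The names are made up for this example.

```python
from math import sqrt

# Assumption: t-table value for a two-sided test at alpha = 0.01 with df = 30
T_CRIT_01_DF30 = 2.750


def one_sample_t(mean, sd, n, mu0):
    """t statistic for H0: mu = mu0, computed from summary statistics."""
    return (mean - mu0) / (sd / sqrt(n))


# Class summary statistics from the text: mean 60.55, s 3.812, n 31, mu0 55
t_stat = one_sample_t(60.55, 3.812, 31, 55)
print(f"t = {t_stat:.3f}")
print("reject H0" if abs(t_stat) > T_CRIT_01_DF30 else "fail to reject H0")
```

A t statistic of about 8.1 is far beyond the critical value, which is consistent with the very small P-value reported above.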
