
MINI PAPER

Candy Exercise
Simple Data Analysis
by Bob Mitchell
While brainstorming ideas for the Kansas City AQC
division booth activity, the Statistics Division officers
wanted to identify a hands-on activity in which exhibit
hall visitors could participate in data generation and
analysis. The goal was to demonstrate the power of basic
statistical tools and Statistical Thinking in a fun,
entertaining manner, something that could be further
developed as a teacher lesson in the Virtual Academy
module of our redesigned website. The Virtual Academy
is an online e-learning tool for basic statistics training,
targeted at K-12 students. It is currently undergoing
dramatic changes to incorporate lots of motion,
animation, sounds, bright colors, etc., tools to tickle the
senses of Gen Y future Statistics Division members.
Borrowing from the basic statistics module taught in Six
Sigma BB and GB training, we decided to demonstrate
hypothesis testing, signal-to-noise ratios, and basic
quality tools to develop our process knowledge of candy
packaging. For example, well-known class exercises used in
some Six Sigma training involve studying bag fill weight
and color variation within and between Mars Inc.'s M&M
candy varieties (plain, peanut, almond, crispy, peanut
butter), or bag fill and flavor variation of Mars Inc.'s
Skittles (original fruit, tropical, wild berry, sour, mint).
One of the Statistics Division officers' spouses owns a
candy store. Hearing about our search for a booth
activity, this spouse suggested that we examine the
hypothesis that, of the five Willy Wonka Runts flavors
(cherry, strawberry, orange, banana, watermelon), people
like the banana pieces least. This candy store owner
orders in bulk at the end of each month. It is her
observation that customers tend to remove the banana
pieces before their purchase, so that at the end of the
month she has a disproportionate amount of yellow Runts
remaining in the bin.
Two different hypotheses were developed for our Kansas
City AQC booth activity:
Null hypothesis Ho#1:
People like all Runts flavors equally;
No preference: Red = Pink = Orange = Yellow = Green

Null hypothesis Ho#2:
Bag fill process (by weight) is stable;
Short-term: Bag 1 = Bag 2 = Bag 3
Long-term: Lot 1 = Lot 2 = Lot 3
The ASQ Inspection Division provided the calibrated
electronic scale.
Exercise details:
1. Booth visitor selects and weighs a bag of Runts candies
2. Booth visitor opens the bag, sorts the candy by color
(flavor)
3. Booth visitor tastes each flavor
4. Record bag weight and color count by box# and bag.
5. Record individual flavor preference.
6. Recorded nominal empty bag weight
7. Recorded nominal candy weight by color
Data:
The Runts exercise data were analyzed in Minitab.
In order to promote discovery and learning we are
providing this Minitab workbook for download
from the Statistics Division website
http://www.asqstatdiv.org/documents/special/runts.xls.
We invite and encourage you to analyze the data and
offer your insights by posting your analysis and
conclusions to the Runts Discussion Page that
we created on the Statistics Division website
http://www.asqstatdiv.org/discussiongroups.htm.
Sample data set:

Bag    Bag
Full   Empty  Red  Blue  Orange  Yellow  Pink  Green  Total  Preference  Box
60.1   1.4      7    10       6       2     9      9     43  orange        1
55.7   1.4      3     4       7      10    13      8     45  yellow        1
57.8   *        7     2      11      10    16      2     48  red           1
60.5   *        4     3      11      12    15      4     49  orange        1
56.0   *        4     2      14       5    14      6     45  red           1
53.1   *       11     6      11       4    11      1     44  orange        1
59.4   *       11     2       6       9    11      5     44  red           1
57.9   *        7     4      15       6    11      5     48  red           1
58.0   *        4     5      10       9    15      5     48  yellow        1
57.1   *        4     5      13      10    17      0     49  red           1


ASQ STATISTICS DIVISION NEWSLETTER, Vol. 21, No. 3


Summary data:

                    Ind     Box1  Box2  Box3  Box4  Flavor  Flavor
Color   Flavor      Weight  Freq  Freq  Freq  Freq  Total   Preference
Red     Cherry      1.24     140   195   195   174    704     19
Blue    Raspberry   1.20     102   219   198   145    664     11
Orange  Orange      1.10     223   157   216   239    835     14
Yellow  Banana      1.30     154    96    98    98    446     25
Pink    Strawberry  1.07     274   197   177   242    890     13
Green   Watermelon  1.80     124   126   107   107    464      6

NOTE 1: We have a 6th color (flavor): blue raspberry.
Wonka recently introduced Chewy Runts candies, and
launched a new flavor with the chewy product line.
NOTE 2: Unlike M&Ms and Skittles, Runts candies have
distinct shapes. Each color/flavor/shape has its own
weight distribution.

Rule of Thumb: When p-value is low, Ho must go.

Data Analysis:
Flavor Preference
Use Chi-Square to test Independence

Chi-Square Test: Flavor Sum, Flavor Preference
Expected counts are printed below observed counts

Flavor        Sum       Preference     Total
Cherry        704       19 (Obs)         723
              707.45    15.55 (Exp)
Raspberry     664       11 (Obs)         675
              660.48    14.52 (Exp)
Orange        835       14 (Obs)         849
              830.74    18.26 (Exp)
Banana        446       25 (Obs)         471
              460.87    10.13 (Exp)
Strawberry    890       13 (Obs)         903
              883.58    19.42 (Exp)
Watermelon    464       6 (Obs)          470
              459.89    10.11 (Exp)
Total         4003      88              4091

Chi-Sq = 0.017 + 0.764 +
         0.019 + 0.853 +
         0.022 + 0.995 +
         0.480 + 21.820 +
         0.047 + 2.125 +
         0.037 + 1.671 = 28.849
DF = 5, P-Value = 0.000

P-value = 0.000 (below 0.05); therefore we reject the null
hypothesis that there is no flavor preference. In fact, the
data suggest that people might actually prefer the banana
flavor (Observed > Expected)!

This is the opposite of the store owner's casual
observation. I am reminded of a quote from Don Wheeler:
"All data out of context are meaningless." The store owner
stocks the original (hard) Runts candy; the AQC booth
activity used the chewy Runts variety. While consumers may
prefer the banana flavor, feedback suggests that the
banana-flavored pieces in the original Runts are much
harder than the other flavors. Is it this hardness
characteristic that people dislike? (I sense another
study.) Green (watermelon) appears to be the least
preferred variety of chewy Runts.

[Figure: Analysis of Means for Color Preference. Y-axis:
Color/Total; X-axis: Color subgroup (Yellow, Red, Orange,
Pink, Blue, Green). Center line P = 0.1667; decision limits
2.6SL = 0.2689 and -2.6SL = 0.06448. Overall 0.05
probability level used.]

Interesting side point: Though this study indicates that
people may prefer the chewy banana flavor, it has the
lowest frequency from Wonka.
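The chi-square figures above can be rechecked with a few lines of Python. This is not the Minitab procedure itself, only a minimal re-computation that assumes the observed counts from the summary table and takes each expected count as row total times column total divided by the grand total. The helper names (chi_square_independence, chi2_sf_df5) are illustrative, and the tail-probability formula is a closed form that holds only for 5 degrees of freedom.

```python
import math

# Observed counts from the summary table: (pieces counted, preference votes).
observed = {
    "Cherry": (704, 19),
    "Raspberry": (664, 11),
    "Orange": (835, 14),
    "Banana": (446, 25),
    "Strawberry": (890, 13),
    "Watermelon": (464, 6),
}


def chi_square_independence(table):
    """Return (statistic, degrees of freedom, expected counts) for an r x c table."""
    rows = list(table.values())
    n_cols = len(rows[0])
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(r[j] for r in rows) for j in range(n_cols)]
    grand_total = sum(row_totals)
    expected = {}
    stat = 0.0
    for (name, obs_row), row_total in zip(table.items(), row_totals):
        # Expected count = row total * column total / grand total.
        exp_row = tuple(row_total * ct / grand_total for ct in col_totals)
        expected[name] = exp_row
        stat += sum((o - e) ** 2 / e for o, e in zip(obs_row, exp_row))
    df = (len(rows) - 1) * (n_cols - 1)
    return stat, df, expected


def chi2_sf_df5(x):
    """Upper-tail probability of a chi-square variable with exactly 5 df."""
    chi = math.sqrt(x)
    tail = math.erfc(chi / math.sqrt(2))
    tail += math.sqrt(2 / math.pi) * math.exp(-x / 2) * (chi + chi ** 3 / 3)
    return tail


stat, df, expected = chi_square_independence(observed)
p_value = chi2_sf_df5(stat)

print(f"Chi-Sq = {stat:.3f}, DF = {df}, P-Value = {p_value:.6f}")
for flavor, (exp_sum, exp_pref) in expected.items():
    print(f"{flavor:<11} expected sum {exp_sum:8.2f}   expected preference {exp_pref:6.2f}")
```

Run as a script, it prints the statistic, the degrees of freedom, the p-value and the expected counts per flavor, which can be compared line by line with the table above. Most of the statistic comes from the single banana preference cell.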

Bag Fill Consistency

Use ANOVA to test differences by box (22 bags per box)

One-way ANOVA: Bag Wt. (Lot 3014TL2) versus Box

Analysis of Variance for Bag Wt.
Source   DF       SS     MS      F      P
Box       3     8.58   2.86   1.02  0.388
Error    84   235.73   2.81
Total    87   244.31

                           Individual 95% CIs For Mean
                           Based on Pooled StDev
Box   N    Mean    StDev   -----+---------+---------+---------+-
1     22   56.886  1.890              (-----------*-----------)
2     22   56.945  1.676               (-----------*-----------)
3     22   56.255  1.796    (-----------*----------)
4     22   56.336  1.272     (-----------*-----------)
                           -----+---------+---------+---------+-
Pooled StDev = 1.675          55.80     56.40     57.00     57.60

P-value > 0.05; cannot reject the null hypothesis that bag
weights are the same.

[Figure: Boxplots of Bag Wt. by Box (means are indicated by
solid circles). Y-axis: Bag Wt., Lot 3014TL2; X-axis: Box.]

The bag fill weights appear statistically equivalent; but is
the process variation stable over time?

[Figure: I and MR Chart for Bag Weight. Individual Value:
UCL = 61.81, Mean = 56.61, LCL = 51.41. Moving Range:
UCL = 6.388, R = 1.955, LCL = 0. X-axis: Subgroup.]

Another look at the same data, but segregated by box,
shows a somewhat different story:

[Figure: I and MR Chart for Bag Weight by Box. Limits
shown: Individual Value UCL = 60.10, Mean = 56.34,
LCL = 52.57; Moving Range UCL = 4.621, R = 1.414, LCL = 0.
X-axis: Subgroup.]
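Two of the results reported for bag weight can be rechecked from the printed summary numbers alone. The sketch below assumes only the per-box N, mean and standard deviation from the ANOVA output, plus the center lines printed on the I and MR charts. The function names are illustrative, and 2.66 and 3.267 are the standard individuals-chart constants for moving ranges of two consecutive points. It recomputes the F ratio and the control limits; it does not compute the ANOVA p-value.

```python
import math

# Per-box summary statistics from the one-way ANOVA output:
# box -> (number of bags, mean bag weight, standard deviation).
boxes = {
    1: (22, 56.886, 1.890),
    2: (22, 56.945, 1.676),
    3: (22, 56.255, 1.796),
    4: (22, 56.336, 1.272),
}


def anova_from_summary(groups):
    """One-way ANOVA table rebuilt from group sizes, means and standard deviations."""
    n_total = sum(n for n, _, _ in groups.values())
    grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
    ss_within = sum((n - 1) * s ** 2 for n, _, s in groups.values())
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "F": ms_between / ms_within,
        "pooled_sd": math.sqrt(ms_within),
    }


def imr_limits(mean, mr_bar):
    """Control limits for an individuals and moving range chart (moving range of 2)."""
    return {
        "I_UCL": mean + 2.66 * mr_bar,  # 2.66 = 3 / d2, with d2 = 1.128
        "I_LCL": mean - 2.66 * mr_bar,
        "MR_UCL": 3.267 * mr_bar,       # D4 for subgroups of size 2
        "MR_LCL": 0.0,
    }


anova = anova_from_summary(boxes)
overall = imr_limits(mean=56.61, mr_bar=1.955)  # chart of all bags
by_box = imr_limits(mean=56.34, mr_bar=1.414)   # limits shown on the by-box chart

print({k: round(v, 3) for k, v in anova.items()})
print({k: round(v, 3) for k, v in overall.items()})
print({k: round(v, 3) for k, v in by_box.items()})
```

Because the inputs are rounded printed values, the recomputed sums of squares and limits can differ from the Minitab output in the last digit.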

The caution here is that we do not know the order of bag
fill at the candy manufacturer. The time series order
presented in the control charts is the order of booth
visitors. Control limits that define the true natural
variation of the bag fill process may very well be quite
different; but our observation tells us that the bag
weight is rather consistent, bag to bag.

We theorize that total bag weight is affected by the
flavor (shape, weight) distribution. So what is the
relative color/flavor distribution by box?

[Figure: P Chart of Color Distribution, Stratified by Box:
Red. Y-axis: Color/Total; X-axis: Packet. Limits shown:
UCL = 0.3462, P = 0.1731, LCL = 3.47E-05.]
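The limits on a P chart follow from the average proportion and the subgroup size. As a hedged illustration, the sketch below takes the Box 4 color frequencies from the summary table as the source of the average proportion for each color, and assumes a packet of 43 pieces. That subgroup size is an assumption chosen for illustration, not a figure stated in the article. The function name p_chart_limits is likewise illustrative.

```python
import math

# Box 4 color frequencies from the summary table.
box4_counts = {
    "Red": 174, "Blue": 145, "Orange": 239,
    "Yellow": 98, "Pink": 242, "Green": 107,
}


def p_chart_limits(p_bar, n):
    """Three-sigma P chart limits for average proportion p_bar and subgroup size n."""
    sigma = math.sqrt(p_bar * (1.0 - p_bar) / n)
    lcl = max(0.0, p_bar - 3.0 * sigma)  # a proportion cannot fall below 0
    ucl = min(1.0, p_bar + 3.0 * sigma)  # or rise above 1
    return lcl, ucl


total_pieces = sum(box4_counts.values())
p_bar = {color: count / total_pieces for color, count in box4_counts.items()}

ASSUMED_PACKET_SIZE = 43  # hypothetical subgroup size, not given in the article

for color, p in p_bar.items():
    lcl, ucl = p_chart_limits(p, ASSUMED_PACKET_SIZE)
    print(f"{color:<7} P = {p:.4f}   LCL = {lcl:.5f}   UCL = {ucl:.4f}")
```

Since packets hold different numbers of pieces, the limits on a real P chart step up and down from packet to packet; a single assumed packet size gives only one pair of limits per color.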

[Figure: P Chart of Color Distribution, Stratified by Box:
Blue. Y-axis: Color/Total; X-axis: Packet. Limits shown:
UCL = 0.3050, P = 0.1443, LCL = 0.]

[Figure: P Chart of Color Distribution, Stratified by Box:
Pink. Y-axis: Color/Total; X-axis: Packet. Limits shown:
UCL = 0.4364, P = 0.2408, LCL = 0.04519.]

[Figure: P Chart of Color Distribution, Stratified by Box:
Green. Y-axis: Color/Total; X-axis: Packet. Limits shown:
UCL = 0.2476, P = 0.1065, LCL = 0.]

[Figure: P Chart of Color Distribution, Stratified by Box:
Orange. Y-axis: Color/Total; X-axis: Packet. Limits shown:
UCL = 0.4326, P = 0.2378, LCL = 0.04304.]

[Figure: P Chart of Color Distribution, Stratified by Box:
Yellow. Y-axis: Color/Total; X-axis: Packet. Limits shown:
UCL = 0.2332, P = 0.09751, LCL = 0.]

[Figure: Control Chart of Average Weight per Piece
(Stratified by Box). Y-axis: Total Weight; X-axis: Bag #.
Limits shown: UCL = 1.387, Mean = 1.235, LCL = 1.082.]

The chart above, Average Weight per Piece, gives us
some insight that bag fill variation is controlled by the
relative frequencies of the Runts colors.
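One way to see how the color mix can move the average weight per piece is to weight each color's individual weight by how often that color appears. The sketch below uses only the summary table (individual weight per color and frequency per box). It treats each listed individual weight as a fixed nominal value, which is a simplifying assumption, since the article notes that each color has its own weight distribution. The function name is illustrative.

```python
# Nominal individual piece weight by color, from the summary table.
ind_weight = {
    "Red": 1.24, "Blue": 1.20, "Orange": 1.10,
    "Yellow": 1.30, "Pink": 1.07, "Green": 1.80,
}

# Color frequencies by box, from the summary table.
box_freq = {
    "Box1": {"Red": 140, "Blue": 102, "Orange": 223, "Yellow": 154, "Pink": 274, "Green": 124},
    "Box2": {"Red": 195, "Blue": 219, "Orange": 157, "Yellow": 96, "Pink": 197, "Green": 126},
    "Box3": {"Red": 195, "Blue": 198, "Orange": 216, "Yellow": 98, "Pink": 177, "Green": 107},
    "Box4": {"Red": 174, "Blue": 145, "Orange": 239, "Yellow": 98, "Pink": 242, "Green": 107},
}


def mean_piece_weight(freq, weights):
    """Frequency-weighted average piece weight implied by a color mix."""
    pieces = sum(freq.values())
    candy_weight = sum(count * weights[color] for color, count in freq.items())
    return candy_weight / pieces


implied = {box: mean_piece_weight(freq, ind_weight) for box, freq in box_freq.items()}

for box, value in implied.items():
    pieces = sum(box_freq[box].values())
    print(f"{box}: {pieces} pieces, implied average weight per piece = {value:.4f}")
```

A box with relatively more of the heavier pieces gets a higher implied average even if every bag holds about the same number of pieces, which is the effect the paragraph above describes.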

Conclusion
Our intention for the AQC booth activity was to
demonstrate the applicability of basic statistics and quality
tools in everyday activities. Our hope is that this brief,
fun look at passive data analysis will motivate the use of
graphical displays of data towards deeper process
understanding and discovery. Our plan is to continue
development of the Virtual Academy to present statistics
in a fun and stimulating learning environment for K-12
education.


To be certain, we could have launched into more
sophisticated statistical data analysis, but that was not our
goal. Again, you are invited to download the dataset and
post your analysis and conclusions on the Discussion
Page we have provided on the Statistics Division website
(www.asqstatdiv.org).
