Wheeler’s SPC Toolkit
Articles from Quality Digest columns 1996-97

What Are Shewhart's Charts?
One day my friend David Chambers found a graph summarizing the "daily percentage of defective pairs" on the office wall of the president of a shoe company. Intrigued, David asked the president why he had this graph on the wall. The president condescendingly replied that he had the chart on the wall so he could tell how the plant was doing. David immediately responded with, "Tell me how you're doing." The president paused, looked at the chart on the wall, and then said, "Well, some days are better than others!" Even though the president displayed his data in a suitable graphic format, and even though he felt that these data were important enough to require their posting each day, he did not have a formal way to analyze these values and interpret them. Data must be filtered in some manner to make them intelligible. This filtration may be based upon a person's experience plus presuppositions and assumptions, or it may be more formalized and less subjective, but there will always be some method of analysis. Of course, inadequate experience, flawed assumptions or inappropriate presuppositions can result in incorrect interpretations. However, in the absence of a formal and standardized approach to interpreting data, most managers use the seat-of-the-pants approach. Walter Shewhart developed a simple and effective way to define the voice of the process; he called it a control chart. A control chart begins with a time-series graph. A central line is added as a visual reference for detecting shifts or trends, and control limits (computed from the data) are placed equidistant on either side of the central line. Thus, a control chart is simply a time series with three horizontal lines added. The key to the effectiveness of the control chart is the way in which these limits are computed from the data. The control chart shown below consists of a sequence of single values.
In other situations, the control chart may be based upon a time series of average values, ranges or some other function of the raw data. While there are several different types of control charts, they are all interpreted in the same way, and they all reveal different aspects of the voice of the process. Control charts also characterize the behavior of the time series. Occasionally you will encounter a time series that is well-behaved; such time series are predictable, consistent and stable over time. More commonly, time series are not well-behaved; they are unpredictable, inconsistent and change over time. The lines on a control chart provide reference points for use in deciding which type of behavior is displayed by any given time series. Shewhart wrote that a process "will be said to be in control when, through the use of past experience, we can predict, at least within limits, how the process will behave in the future." Thus, the essence of statistical control is predictability, and the opposite is also true. A process that does not display a reasonable degree of statistical control is unpredictable. This distinction between predictability and unpredictability is important because prediction is the essence of doing business. Predictability is a great asset for any process because it makes the manager's job that much easier. When the process is unpredictable, the time series will be unpredictable, and this unpredictability will repeatedly undermine all of our best efforts. Shewhart's terminology of "controlled variation" and "uncontrolled variation" must be understood in the context of predictable and unpredictable, rather than in the sense of being
able to exert control. The user does not get to "set the limits." We should talk about "predictable processes" and "unpredictable processes." The control chart shows a time series that remains within the computed limits, with neither an obvious trend nor any long sequences of points above or below the central line. Thus, this process appears to be predictable. Unless the process is changed in some fundamental way, the plant will continue to produce anywhere from 7 percent to 30 percent defectives each day, with a daily average of about 19 percent defective. Predictable performance is not necessarily the same as desirable performance. Notice how the control chart has helped interpret the data. First, the chart is used to characterize the behavior of the data: are they predictable or not? Second, the control chart allows the manager to predict what to expect in the future: the voice of the process! Finally, notice the difference between the shoe company president's interpretation of these data and the interpretation based on the control chart. Some days only appeared to be better than others! In truth, both the "good" days and the "bad" days came from the same process. Looking for differences between the "good" days and the "bad" days will simply be a waste of time.

Myths About Shewhart's Charts
The control charts described in many current technical articles bear little, if any, resemblance to the control chart technique described in Walter Shewhart's writings. Part of this problem can be attributed to novices teaching neophytes, while part is due to the failure to read Shewhart's writings carefully. Therefore, to help the reader differentiate control chart myths from foundations, this column will focus on both. This month, I will discuss four myths about Shewhart's charts. Next month, I will discuss four foundations of Shewhart's charts. Myth One: Data must be normally distributed before they can be placed on a control chart. While the control chart constants were created under the assumption of normally distributed data, the control chart technique is essentially insensitive to this assumption. This insensitivity is what makes the control chart robust enough to work in the real world as a procedure for inductive inference. In August, this column showed the robustness of three-sigma limits with a graphic showing some very nonnormal curves. The data don't have to be normally distributed before you can place them on a control chart. The computations are essentially unaffected by the degree of normality of the data. Just because the data display a reasonable degree of statistical control, doesn't mean that they will follow a normal distribution. The normality of the data is neither a prerequisite nor a consequence of statistical control. Myth Two: Control charts work because of the central limit theorem. The central limit theorem applies to subgroup averages (e.g., as the subgroup size increases, the histogram of the subgroup averages will, in the limit, become more "normal," regardless of how the individual measurements are distributed). Because many statistical techniques utilize the central limit theorem, it's only natural to assume that it's the basis of the control chart. However, this isn't the case. 
The central limit theorem describes the behavior of subgroup averages, but it doesn't describe the behavior of the measures of dispersion. Moreover, there isn't a need for the finesse of the central limit theorem when working with Shewhart's charts, where three-sigma limits filter out 99 percent to 100 percent of the probable noise, leaving only the potential signals outside the limits. Because of the conservative nature of the three-sigma limits, the central limit theorem is irrelevant to Shewhart's charts. Undoubtedly, this myth has been one of the greatest barriers to the effective use of control charts with management and process-industry data. When data are obtained one value per time period, it's logical to use subgroups with a size of one. However, if you believe this myth to be true, you'll feel compelled to average something to make use of the central limit theorem, and the rationality of the data analysis will be sacrificed to superstition. Myth Three: Observations must be independent; data with autocorrelation are inappropriate for control charts. Again, we have an artificial barrier based on theoretical assumptions, which ignores the nature of real data and the robustness of the control chart. All data derived from production processes will display some level of autocorrelation. Shewhart uses autocorrelated data in the control chart as early as page 20 of his first book. He writes that assignable causes of variation were found and removed, and that new data were then collected. The new data showed that the process had been improved. Remember, the purpose of analysis is insight rather than numbers. The control chart isn't concerned with probability models. Rather, it's concerned with using data for making decisions in the real world. Control charts have worked with autocorrelated data for more than 60 years. Myth Four: Data must be in control before you can plot them on a control chart. This myth could have only come from computing limits incorrectly. Among the blunders that have been made in the name of this myth are censoring data prior to charting them and using limits that aren't three-sigma limits. Needless to say, these and other manipulations are unnecessary. The purpose of Shewhart's charts is to detect lack of control. If a control chart can't detect lack of control, why use it?

Foundations of Shewhart's Charts
Last month, I described four myths relating to Shewhart's charts. This month I will discuss four foundations of the charts. Foundation One: Shewhart's charts always use three-sigma limits. Regardless of the type of chart you're using, the limits depend on the same principle. The data will be used to determine the amount of variation that is likely to be background noise, and the limits will be placed three estimated standard deviations on either side of the central line. Three-sigma limits are action limits; they dictate when action can be taken on a process. They are not probability limits. While they have a basis in probability theory, three-sigma limits were chosen because they provided reasonable action limits. They strike an economical balance between the two possible errors you can make in interpreting data from a continuing process. Three-sigma limits neither result in too many false alarms nor miss too many signals. In addition, they are unaffected by data nonnormality, even when the subgroup size is one. Foundation Two: Computing three-sigma control limits requires the use of an average dispersion statistic. Computing several dispersion statistics and then using their average or median makes the computation more stable. This use of the subgroup variation will provide measures of dispersion that are much less sensitive to a lack of control than most other approaches. The choice of dispersion statistic is unimportant: ranges, standard deviations or root mean square deviations may be used. If the proper approach is used, different statistics will yield similar results. If the wrong approach is used, different statistics will yield similar incorrect results. Foundation Three: The conceptual foundation of Shewhart's control charts is the notion of rational sampling and rational subgrouping. How the data are collected, how they are arranged into subgroups and how these subgroups are charted must be based on the context of the
data, the sources of data variation, the questions to be answered by the charts and how the knowledge gained will be used. Failure to consider these factors when placing data on a control chart can result in nonsensical control charts. The effective use of Shewhart's charts requires an understanding of rational sampling and rational subgrouping. Foundation Four: Control charts are effective only to the extent that the organization can use the knowledge. Knowledge gathering is in vain without an organization that can disseminate and use this knowledge. As long as there are internal obstacles that prevent an organization from utilizing SPC charts, nothing will happen. This is why so many of W. Edwards Deming's 14 points bear directly upon this one foundation and why SPC alone isn't enough. On the other hand, any program designed to increase organizational effectiveness and efficiency that does not use SPC is doomed to fail. This fourth foundation of Shewhart's charts is only implicit in Shewhart's work-there was always the assumption that organizations behave in a rational manner. However, Deming came to see that this wasn't the case. Simply giving people effective methods for collecting, organizing and analyzing data wasn't enough. In the absence of such methods, businesses had come to be run by the emotional interpretation of the visible figures-a universal "My mind is made up, don't bother me with the facts" syndrome. While Deming's 14 points do not constitute the whole of his philosophy, they are a profound starting point. They are not a set of techniques, a list of instructions nor a checklist. They are a vision of just what can be involved in using SPC. And they ultimately lead to radically different and improved ways of organizing businesses and working with people. However, a deep understanding is required before these 14 points can be used to accomplish the total transformation. 
The need is not to adopt the 14 points individually or collectively, but rather to create a new environment conducive to their principles.

The New Definition of Trouble
Product 1411 was in trouble. After batches 1 through 29 ran without a problem, Batch 30 failed in the customer's plant. Batch 31 worked, but the next three batches failed. When the preliminary investigation exonerated the customer's process as the problem source, the pressure shifted to the supplier. The supplier had a five-step process spanning three plants. Product 1411 started as a fluid produced by Step A in Plant One. The fluid was then shipped 400 miles to Plant Two, where Step B was performed. From there, it was shipped to Plant Three, where steps C, D and E were completed. The supplier knew that at no time had any marginal material been passed on to the next step; all of the material supplied to Step E had been within specification. Therefore, the investigation focused on Step E as the likely problem source. While they investigated Step E, batches 35 through 52 were produced. Of these 18 batches, 10 worked, seven failed in the customer's plant and one was recycled because it did not meet internal specifications. After much effort, the investigators became convinced that the problem did not originate in Step E. They then widened the investigation to include Step D. While Step D was scrutinized, batches 53 through 65 were produced and shipped to the customer. Of these 13 batches, only two worked. After Batch 65, they decided to look at Step C. Finally, seven months after the problem arose, steps C and D were also eliminated as the problem source. By this time, they had produced 75 batches. However, only one of the last 10 batches had worked in the customer's plant.
Extending the investigation to Step B required the involvement of personnel from Plant Two. Even with the expanding number of people and the mounting number of man-hours, nothing was found at Step B. By now, eight months had passed, and only 18 of the last 54 batches had worked. So the team returned to Plant One and Step A. As the investigators looked over the records for Step A, they found that the levels of "Compound M" had increased, beginning with Batch 30. The specification for Compound M was "not more than 1,500 ppm." This value was based upon the safe-handling requirements. Because the level of Compound M was fixed at Step A, and because safety requirements prohibited shipping any liquid that was out-of-spec, Compound M was not even measured at the subsequent steps. Batches 1 through 29 averaged 78 ppm of Compound M, with levels ranging from 0 to 167 ppm. All of these batches worked. Batches 30 through 51 averaged 191 ppm of Compound M, with levels ranging from 0 to 346 ppm. Of these 22 batches, only the 10 with less than 200 ppm worked. Batches 53 through 84 averaged 412 ppm of Compound M, with levels ranging from 0 to 969 ppm. Of these 32 batches, only the eight with less than 200 ppm worked. Dozens of people had worked for eight months to discover something that a simple control chart at the first production step could have revealed immediately: Plant One had ceased to produce a consistent product with Batch 30. This deterioration was not noticed because the specification was so much greater than the actual levels. They are still trying to assess the damage done to the customer of Step E because of the performance degradation at Step A. Conformance to specifications is no longer the definition of world-class quality. Specifications seldom take into account customer needs. Shewhart's charts, on the other hand, are the voice of the process.
They define when a process is in trouble, and they will warn you so that you can take action to avoid even greater trouble. Statistical control is an achievement attainable only by the persistent and effective use of Shewhart's charts. And those who can't use Shewhart's charts are doomed.

Charts for Rare Events
Your imagination is the only limitation on the use of Shewhart's charts. They are such a basic tool for analyzing and understanding data that they can be used in all kinds of situations; the key is to see how best to apply them. This column will illustrate how to use Shewhart's charts to track rare events. Department 16 has occasional spills. The most recent spill was on July 13. Spills are not desirable, and everything possible is done to prevent them; yet they have historically averaged about one spill every seven months. Of course, with this average, whenever they have a spill, the count for that month is 690 percent of the average. When dealing with very small numbers, such as the counts of rare events, a one-unit change can result in a huge percentage difference. Counts of rare events would commonly be placed on a c-chart. (While the c-chart is a chart for individual values, and while most count data may be charted using an XmR chart, the XmR chart requires an average count that exceeds 1.0. The c-chart does not suffer this restriction.) The average count is found by dividing the total number of spills in a given time period by the
length of that time period. During the past 55 months, a total of eight spills occurred, which gives an average count of:

c-bar = 8 spills / 55 months = 0.145 spills per month

This average will be the central line for the c-chart, and the upper control limit will be computed according to the formula:

UCLc = c-bar + 3 x sqrt(c-bar) = 0.145 + 3 x sqrt(0.145) = 1.289

The c-chart is shown in Figure 1. In spite of the fact that a single spill is 690 percent of the average, the c-chart does not show any out-of-control points. This is not a problem with the charts but rather a problem with the data. Counts of rare events are inherently insensitive and weak. No matter how these counts are analyzed, there is nothing to discover here. Yet there are other ways to characterize the spills. Instead of counting the number of spills each year, they could measure the number of days between the spills. The first spill was on Feb. 23, Year One. The second spill was on Jan. 11, Year Two. The elapsed time between these two spills was 322 days. One spill in 322 days is equivalent to a spill rate of 1.13 spills per year. The third spill was on Sept. 15, Year Two. This is 247 days after the second spill. One spill in 247 days is equivalent to a spill rate of 1.48 spills per year. Continuing in this manner, the remaining five spills are converted into instantaneous spill rates of 1.24, 1.61, 1.64, 2.12 and 3.17 spills per year. These seven spill rates are used to compute six moving ranges and are placed on an XmR chart in Figure 2. The average spill rate is 1.77 spills per year, and the average moving range is 0.42. These two values result in an upper natural process limit of:

UNPL = 1.77 + 2.66 x 0.42 = 2.89

The lower natural process limit will be:

LNPL = 1.77 - 2.66 x 0.42 = 0.65

The upper control limit for the moving ranges will be:

UCL = 3.268 x 0.42 = 1.37

This XmR chart for the spill rates is shown in Figure 2.
The last spill results in a point that is above the upper natural process limit, which suggests that there has been an increase in the spill rate. This signal should be investigated, yet it is missed by the c-chart. In general, counts are weaker than measurements. Counts of rare events are no exception. When possible, it will always be more satisfactory to measure the activity than to merely count events. And, as shown in this example, the times between undesirable rare events are best charted as rates.
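The arithmetic in this column can be checked with a short script. The following is only a sketch in Python: the function and variable names are illustrative and do not come from the column, while the spill totals, the seven spill rates and the scaling factors 2.66 and 3.268 are the ones quoted above.

```python
import math

# Figures quoted in the column.
TOTAL_SPILLS = 8
MONTHS_OBSERVED = 55
SPILL_RATES = [1.13, 1.48, 1.24, 1.61, 1.64, 2.12, 3.17]  # spills per year


def c_chart_limits(total_count, periods):
    """Return (central line, upper three-sigma limit) for a c-chart."""
    c_bar = total_count / periods
    return c_bar, c_bar + 3 * math.sqrt(c_bar)


def xmr_limits(values):
    """Return XmR chart limits using the 2.66 and 3.268 scaling factors."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = sum(values) / len(values)
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        "x_bar": x_bar,
        "mr_bar": mr_bar,
        "unpl": x_bar + 2.66 * mr_bar,   # upper natural process limit
        "lnpl": x_bar - 2.66 * mr_bar,   # lower natural process limit
        "ucl_mr": 3.268 * mr_bar,        # upper limit for the moving ranges
    }


c_bar, c_ucl = c_chart_limits(TOTAL_SPILLS, MONTHS_OBSERVED)
limits = xmr_limits(SPILL_RATES)

# Rates that fall outside the natural process limits are potential signals.
signals = [r for r in SPILL_RATES if not limits["lnpl"] <= r <= limits["unpl"]]

print(f"c-chart: central line {c_bar:.3f}, UCL {c_ucl:.3f}")
print(f"XmR: average {limits['x_bar']:.2f}, UNPL {limits['unpl']:.2f}, "
      f"LNPL {limits['lnpl']:.2f}, mR UCL {limits['ucl_mr']:.2f}")
print("rates outside the natural process limits:", signals)
```

A monthly count of one spill stays below the c-chart limit, while the last spill rate is the only value flagged on the XmR chart, which matches the reading of Figures 1 and 2 given in the text.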

Why Three-Sigma Limits?
The key to Walter Shewhart's choice of three-sigma limits lies in the title of his first book, Economic Control of Quality of Manufactured Product, where he emphasizes the economics of decisions. For example, Shewhart writes: "As indicated the method of attack is to establish
limits of variability such that, when [a value] is found outside these limits, looking for an assignable cause is worthwhile." Here Shewhart makes a fundamental distinction-some processes are predictable while others are not. He shows that by examining the data produced by a process, we can determine the predictability of a process. If the data show that a process has been predictable in the past, it's reasonable to expect that it will remain predictable in the future. When a process is predictable, it's said to display common-cause, or chance-cause variation. When a process is unpredictable, it's said to display assignable-cause variation. Therefore, the ability to distinguish between a predictable process and an unpredictable one depends upon your ability to distinguish between common-cause and assignable-cause variation. What's the difference? Shewhart writes that a predictable process can be thought of as the outcome of "a large number of chance causes in which no cause produces a predominating effect." When a cause does produce a predominating effect, it becomes an "assignable" cause. Thus, if we denote the predominating effect of any assignable cause as a signal, then the collective effects of the many common causes can be likened to background noise, and the job of separating the two types of variations is similar to separating signals from noise. In separating signals from noise, you can make two mistakes. The first mistake occurs when you interpret noise as a signal (i.e., attribute common-cause variation to an assignable cause). The second mistake occurs when you miss a signal (i.e., when we attribute assignable-cause variation to common causes). Both mistakes are costly. The trick is to avoid the losses caused by these mistakes. You can avoid making the first mistake if you consider variation to be noise. But, in doing this, your losses from the second mistake will increase. 
In a similar manner, you can avoid making the second mistake if you consider each value a signal indicator. But, in doing this, your losses from the first mistake will increase. In our world, when using historical data, it's impossible to avoid both mistakes completely. So, given that both mistakes will be made occasionally, what can we do? Shewhart realized it's possible to regulate the frequencies of both mistakes to minimize economic loss. Subsequently, he developed a control chart with three-sigma limits. Three-sigma limits filter out nearly all probable noise (the common-cause variation) and isolate the potential signals (the assignable-cause variation). How is it possible that three-sigma limits filter out virtually all probable noise? While there are certain mathematical inequalities that guarantee most data sets will have at least 95 percent of the values within three standard deviations of the average, a better rule of practice is the Empirical Rule, which states that about 99 percent to 100 percent of the data will be located within three standard deviations, either above or below the average. Figure 1 displays six theoretical distributions to illustrate the Empirical Rule's appropriateness. It shows the area within three standard deviations of the mean. No matter how skewed or "heavy tailed" the distribution may be, virtually all of the area under the distribution curve will fall within three standard deviation units of the mean. When applied to homogeneous data sets, the Empirical Rule suggests that no matter how the data "behave," virtually all of the data will fall within three standard deviation units of the average. Because data that display statistical control are, by definition, reasonably homogeneous, the Empirical Rule explains why the control chart will yield very few instances of noise interpreted as a signal.
Figure 1 also shows that three-sigma limits will indeed filter out nearly all common-cause variation displayed by predictable processes. Three-sigma limits allow you to detect the process changes that are large enough to be economically important, while filtering out almost all common-cause variation. These limits allow you to strike a balance between the losses associated with interpreting noise as a signal and attributing assignable-cause variation to common causes.
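The Empirical Rule can be illustrated numerically. The sketch below, in Python, computes the exact area within three standard deviations of the mean for three textbook distributions. These three (normal, uniform and exponential) are chosen here for illustration and are not necessarily the six curves in the column's Figure 1; the function names are likewise illustrative.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def coverage_normal():
    # Standard normal: mean 0, standard deviation 1.
    return normal_cdf(3.0) - normal_cdf(-3.0)


def coverage_uniform():
    # Uniform on [0, 1]: mean 0.5, standard deviation 1/sqrt(12).
    sd = 1.0 / math.sqrt(12.0)
    low, high = 0.5 - 3.0 * sd, 0.5 + 3.0 * sd
    # The density is 1 on [0, 1], so the area is the overlap length.
    return min(high, 1.0) - max(low, 0.0)


def coverage_exponential():
    # Exponential with rate 1: mean 1, standard deviation 1.
    # The interval is [-2, 4]; nothing lies below 0, so the area is P(X <= 4).
    return 1.0 - math.exp(-4.0)


coverage = {
    "normal": coverage_normal(),
    "uniform": coverage_uniform(),
    "exponential": coverage_exponential(),
}

for name, area in coverage.items():
    print(f"{name:12s} {100 * area:.1f} percent within three sigma")
```

Even the strongly skewed exponential keeps roughly 98 percent of its area inside the three-sigma interval, and the symmetric cases come in above 99 percent, which is the sense in which the limits filter out nearly all of the routine variation.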

What About Charts for Count Data?
Some data consist of counts rather than measurements. With count data, it has been traditional to use a theoretical approach for constructing control limits rather than the empirical approach used with measurement data. The charts obtained by this theoretical approach have traditionally been known as "attribute charts." There are certain advantages and disadvantages of these charts. Count data differ from measurement data in two ways. First, count data possess a certain irreducible discreteness that measurement data do not. Second, every count must have a known "area of opportunity" to be well-defined. With measurement data, the discreteness of the values is a matter of choice. This is not the case with count data, which are based on the occurrence of discrete events (the so-called attributes). Count data always consist of integral values. This inherent discreteness is, therefore, a characteristic of the data and can be used in establishing control charts. The area of opportunity for any given count defines the criteria by which the count must be interpreted. Before two counts may be compared, they must have corresponding (i.e., equally sized) areas of opportunity. If the areas of opportunity are not equally sized, then the counts must be converted into rates before they can be compared effectively. The conversion from counts to rates is accomplished by dividing each count by its own area of opportunity. These two distinctive characteristics of count data have been used to justify different approaches for calculating the control limits of attribute charts. Hence, four control charts are commonly associated with count data: the np-chart, the p-chart, the c-chart and the u-chart. However, all four charts are for individual values. The only difference between an XmR chart and an np-chart, p-chart, c-chart or u-chart is the way they measure dispersion.
For any given set of count data, the X-chart and the four types of charts mentioned previously will show the same running records and central lines. The only difference between these charts will be the method used to compute the distance from the central line to the control limits. The np-, p-, c- and u-charts all assume that the dispersion is a function of the location. That is, they assume that SD(X) is a function of MEAN(X). Applying this relationship between the parameters of a theoretical probability distribution must be justified by establishing a set of conditions. When the conditions are satisfied, the probability model is likely to approximate the behavior of the counts when the process displays a reasonable degree of statistical control. Yet, deciding which probability model is appropriate requires judgment that most students of statistics do not possess. For example, the conditions for using a binomial probability model may be stated as:

Binomial Condition 1: The area of opportunity for the count Y must consist of n distinct items.
Binomial Condition 2: Each of the n distinct items must be classified as possessing, or not possessing, some attribute. This attribute is usually a type of nonconformance to specifications.
Binomial Condition 3: Let p denote the probability that an item has the attribute being counted. The value of p must be the same for all n items in any one sample. While the chart checks if p changes from sample to sample, the value of p must be constant within each sample. Under conditions that are considered to be a state of statistical control, it must be reasonable to assume that the value of p is the same for every sample.
Binomial Condition 4: The likelihood of an item possessing the attribute will not be affected by whether the preceding item possessed the attribute. (This implies, for example, that nonconforming items do not naturally occur in clusters, and counts are independent of each other.)
If these four conditions apply to your data, then you may use the binomial model to compute an estimate of SD(X) directly from your estimate of MEAN(X). Or, you could simply place the counts (or proportions) on an XmR chart and estimate the dispersion from the moving range
chart. You will obtain essentially the same chart either way. Unlike attribute charts, XmR charts assume nothing about the relationship between the location and dispersion. They measure the location directly with the average, and they measure the dispersion directly with the moving ranges. Thus, while the np-, p-, c- and u-charts use theoretical limits, the XmR chart uses empirical limits. The only advantage of theoretical limits is that they include a larger number of degrees of freedom, which means that they stabilize more quickly. If the theory is correct, and you use an XmR chart, the empirical limits will be similar to the theoretical limits. However, if the theory is wrong, the theoretical limits will be wrong, and the empirical limits will still be correct. You can't go far wrong using an XmR chart with count data, and it is generally easier to work with empirical limits than to verify the conditions for a theoretical model.
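The comparison between theoretical and empirical limits can be sketched in a few lines of Python. The counts and the sample size of 100 below are invented purely for illustration and do not come from the column. The np-chart limits use the standard binomial relationship SD = sqrt(n p (1 - p)), and the XmR limits use the 2.66 scaling factor that appears in the rare-events column. Both lower limits are floored at zero on the assumption that a negative count has no meaning.

```python
import math


def np_chart_limits(counts, sample_size):
    """Theoretical (binomial) three-sigma limits for an np-chart."""
    np_bar = sum(counts) / len(counts)
    p_bar = np_bar / sample_size
    half_width = 3 * math.sqrt(np_bar * (1 - p_bar))
    return max(0.0, np_bar - half_width), np_bar, np_bar + half_width


def xmr_limits_for_counts(counts):
    """Empirical limits for the same counts, from the average moving range."""
    moving_ranges = [abs(b - a) for a, b in zip(counts, counts[1:])]
    x_bar = sum(counts) / len(counts)
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    half_width = 2.66 * mr_bar
    return max(0.0, x_bar - half_width), x_bar, x_bar + half_width


# Hypothetical data: nonconforming items found in ten samples of 100 items.
SAMPLE_SIZE = 100
counts = [7, 5, 9, 6, 8, 4, 7, 10, 6, 8]

theoretical = np_chart_limits(counts, SAMPLE_SIZE)
empirical = xmr_limits_for_counts(counts)

print("np-chart (lower, central, upper):", [round(v, 2) for v in theoretical])
print("XmR chart (lower, central, upper):", [round(v, 2) for v in empirical])
```

For counts like these, which behave roughly as the binomial model expects, the two sets of limits land close together. If the binomial conditions did not hold, only the np-chart limits would be affected, because the XmR limits never use the model.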

Which Chart Should I Use?
In the previous columns, we saw examples of the basic control chart for individual values. This chart is recommended whenever you obtain data one value per time period, or one value per shipment. The second major type of control chart is used when the data have been arranged into subgroups. Here we are typically concerned with data where several values are obtained in a short period of time. For example, an auto plant in Portugal received shipments from a supplier in Germany. The part was a piece of wire for connecting the horn buttons to steering wheels. These wires were supposed to be 100 mm long. Every time they received a shipment, the Portuguese selected five wires and measured the lengths. The data for the first four shipments and a table of control chart constants are shown below:

A "subgroup" should consist of a set of measurements which, in the user's judgment, represent essentially the same set of conditions. The concept here is that while each subgroup should be more or less homogeneous, the control chart will examine the data to see if there are differences from one subgroup to another. In this example, each set of five measurements came from one shipment. The pieces of wire in each shipment were made in the same short production run and under essentially the same conditions. Therefore it is logical to make each shipment a subgroup. With subgrouped data, we plot the subgroup averages and subgroup ranges. Therefore, we must begin by computing averages and ranges for each subgroup. For each shipment, the average of the five values will be the subgroup average. The range of a subgroup will be the difference between the maximum value and the minimum value in that subgroup. For the first shipment, the maximum value is 115.2, while the minimum value is 112.0. Thus the subgroup range is: 115.2 - 112.0 = 3.2 units. After the average and range have been computed for each subgroup, these values are plotted in two running records. Conventionally, the averages are plotted on the upper running record and the ranges are plotted on the lower running record. The limits for this average and range chart are computed from the data according to the following steps:
Wheeler’s SPC Toolkit 1996-97 Pag 9 di 36

The average of the subgroup averages is obtained. This value is called the grand average. Here the grand average is 112.45. This value will be the central line for the upper portion of the chart. The average of the subgroup ranges, called the average range, is also obtained. Here the average range is 4.725. This value will be the central line for the lower portion of the chart. The control limits for the average and range chart are computed using the grand average and the average range.

The upper control limit for the average chart will be:
Grand Average + (A2 times Average Range) = 112.45 + (0.577 x 4.725) = 115.2

The lower control limit for the average chart is:
Grand Average - (A2 times Average Range) = 112.45 - (0.577 x 4.725) = 109.7

The upper control limit for the range chart is:
D4 times Average Range = 2.114 x 4.725 = 10.0

where A2 and D4 are the appropriate control chart constants for a given subgroup size. They are those values which allow us to convert the grand average and the average range into control limits.
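The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `xbar_r_limits` and the dictionary keys are invented here, and the constants 0.577 and 2.114 are the A2 and D4 values quoted above for this example. Only the three limits given in the column are computed.

```python
def xbar_r_limits(grand_average, average_range, a2, d4):
    """Return central lines and limits for an average and range chart.

    a2 and d4 are the control chart constants for the subgroup size in use.
    """
    half_width = a2 * average_range
    return {
        "average_ucl": grand_average + half_width,
        "average_center": grand_average,
        "average_lcl": grand_average - half_width,
        "range_ucl": d4 * average_range,
        "range_center": average_range,
    }


# Values quoted in the column for the four shipments of wire.
limits = xbar_r_limits(grand_average=112.45, average_range=4.725,
                       a2=0.577, d4=2.114)
for name, value in limits.items():
    print(f"{name}: {value:.1f}")
```

Keeping the constants as arguments, rather than hard-coding them, makes it harder to reuse the wrong constants when the subgroup size changes.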

As may be seen on the average and range chart, one average and one range fall outside their limits. Shipment Two has a lower average and a greater range than the other shipments. Due to the way the data were arranged into subgroups, the average chart characterizes each shipment's location -- the average length of the wires in that shipment -- while the range chart characterizes each shipment's consistency -- the dispersion of the lengths of the wires in each shipment. Clearly, the four shipments have different locations and dispersion. Moreover, since the target is 100 mm, all shipments were far above the target. Based on this record of inconsistency, both within and between the shipments, the German supplier was dropped. The other suppliers were much more consistent in the product they delivered. This example serves to introduce the second of the two major types of control charts -- charts for subgrouped data. When several values are collected under essentially the same conditions, it is logical to place these values in subgroups and use an average and range chart. The key to effective average and range charts is to have subgroups that are internally homogeneous. This is, of course, a judgment made by the user. It is the means by which users get to bring their process knowledge to bear upon the chart. When the data are collected in such a way that each value may differ from the others, it is logical to place the data on a chart for individual values. This commonly occurs when the values are obtained individually.

While there are other types of control charts, they are all special cases of the two charts above. They are either charts for subgrouped data, or charts for individual values. Once you have learned how to use an average and range chart and a chart for individual values, you can work with virtually any type of data, in any type of situation.

When Do I Recalculate My Limits?
Correct limits allow the user to separate probable noise from potential signals. Of all the questions about Shewhart's charts, this is perhaps the most frequently asked question. While there is no simple answer, there are some useful guidelines. The first guideline for computing limits for Shewhart's charts is: You get no credit for computing the right number -- only for taking the right action. Without the follow-through of taking the right action, the computation of the right number is meaningless. Now, this is contrary to everyone's experience with arithmetic. Early on we are trained to "find the right number." Thus, when people are introduced to Shewhart's charts, this natural anxiety will surface in the form of questions about how to get the "right limits." While there are definite rules for computing limits, and right and wrong ways of computing such limits, the real power of Shewhart's charts lies in the organization's ability to use them to understand and improve their processes. This use of Shewhart's charts -- as an aid for making decisions -- is the true focal point of the charts. But it is so easy to miss and so hard to teach. The second guideline for computing limits for Shewhart's charts is: The purpose of the limits is to adequately reflect the voice of the process. As long as the limits are computed in the correct way and reflect the voice of the process, then they are "correct limits." (Notice that the definite article is missing -- they are just "correct limits," not "the correct limits.") Correct limits allow the user to separate probable noise from potential signals. Shewhart's charts are a tool for filtering out the probable noise. They have been proven to work in more than 70 years of practice. Shewhart deliberately chose three-sigma limits. He wanted limits wide enough to filter out the bulk of the probable noise so that people wouldn't waste time interpreting noise as signals. 
He also wanted limits narrow enough to detect the probable signals so that people wouldn't miss signals of economic importance. In years of practice he found that three-sigma limits provided a satisfactory balance between these two mistakes. Therefore, in the spirit of striking a balance between the two mistakes above, the time to recompute the limits for Shewhart's charts comes when, in your best judgment, they no longer adequately reflect the voice of the process. The third guideline for computing limits for Shewhart's charts is: Use the proper formulas for the computations. The proper formulas for the limits are well-known and widely published. Nevertheless, novices continually think that they know better and invent shortcuts that are wrong. The proper formulas for average and range charts will always use an average or median dispersion statistic in the computations. No formula that uses a single measure of dispersion is correct. The proper formula for X-charts (charts for individual values) will always use an average moving range or a median moving range. Within these three guidelines lies considerable latitude for computing limits. As Shewhart said, it is mostly a matter of "human judgment" about the way the process behaves, about the way the data are collected and about the chart's purpose. Computations and revisions of limits that heed these three guidelines will work. Calculations that ignore these guidelines won't.

So, in considering the recalculation of limits, ask yourself:
- Do the limits need to be revised in order for you to take the proper action on the process?
- Do the limits need to be revised to adequately reflect the voice of the process?
- Were the current limits computed using the proper formulas?

So, if the process shifts to a new location and you don't think there will be a change in dispersion, then you could use the former measure of dispersion, in conjunction with the new measure of location, to obtain limits in a timely manner. It is all a matter of judgment. Remember, Shewhart's charts are intended as aids for making decisions, and as long as the limits appropriately reflect what the process can do, or can be made to do, then they are the right limits. This principle is seen in the questions used by Perry Regier of Dow Chemical Co.:
1. Do the data display a distinctly different kind of behavior than in the past?
2. Is the reason for this change in behavior known?
3. Is the new process behavior desirable?
4. Is it intended and expected that the new behavior will continue?

If the answer to all four questions is yes, then it is appropriate to revise the limits based on data collected since the change in the process. If the answer to question 1 is no, then there should be no need for new limits. If the answer to question 2 is no, then you should look for the assignable cause instead of tinkering with the limits. If the answer to question 3 is no, then why aren't you working to remove the detrimental assignable cause instead of tinkering with the limits? If the answer to question 4 is no, then you should again be looking for the assignable cause instead of tinkering with the limits. The objective is to discover what the process can do, or can be made to do.

Finally, how many data are needed to compute limits? Useful limits may be computed with small amounts of data. Shewhart suggested that as few as two subgroups of size four would be sufficient to start computing limits. The limits begin to solidify when 15 to 20 individual values are used in the computation. When fewer data are available, the limits should be considered "temporary limits." Such limits would be subject to revision as additional data become available. When more than 50 data are used in computing limits, there will be little point in further revisions of the limits. So stop worrying about the details of computing limits for Shewhart's charts and get busy using them to understand and improve your processes.

Description or Analysis?
Analysis discourages inappropriate actions by filtering out the noise before potential signals are identified. The supervisor of Department 17 has just been asked to write a report -- the in-process inventory of his department was at an all-time high of 2,800 pounds last month. He had to explain this value at the next management meeting, so he began by analyzing the numbers. First he looked at the current value of the inventory. The value of 2,800 pounds was 42 percent above the "plan value." It was also 12 percent above the value for the same month last year. There was no joy to be found in its current value.


Next he looked at the year-to-date average of the in-process inventory for Department 17. The value was 2,160 pounds, which was 9.6 percent above the plan and 5.9 percent above the year-to-date value for the same month last year -- two more bad values. Then the supervisor compared the percentage differences with the percentage changes in other departments. He prepared a bar graph for all the measures listed on the monthly report and discovered that the 42-percent value was the greatest percentage difference on the report. No luck here, either. In fact, having the greatest percentage difference, he realized that other managers would start the meeting by asking for his report. No matter how he packaged the numbers, the story looked bad. While he was required to explain these values, he had no idea what to say. So he made up something that sounded plausible and which shifted the blame to forces beyond his control. He hoped no one would quiz him too closely on the findings in his report. Sound familiar? It ought to -- this little drama is acted out thousands of times each day. Of course, there are two problems with this "write a report" approach. The first is that these reports are usually works of fiction whose sole purpose is to enable some manager to pretend that something is being done about a perceived problem. The second is that the approach is based upon the assumption that the current value of the in-process inventory is actually a signal. But is it a signal -- or is it just noise? How can you know? Before you can detect signals within the data, you must first filter out the probable noise. And to filter out noise, you must start with past data. In short, the supervisor, with his limited comparisons, could not fully understand the current values, and he suffered the consequences of his ignorance. The traditional analysis is nothing more than a collection of descriptive statistics. These days, most statistical analyses are little more than description. 
Bar graphs comparing unlike measures, pie charts showing proportions and rudimentary comparisons like those in the story above are more descriptive than anything else. Descriptive measures are concerned with how much or how many. They provide no insight into why there are so many, or why there is so much. Because analysis focuses on answering "why" questions, we must analyze data in the context of the question and begin to separate the potential signals from the probable noise. Managers beginning the analysis process should start by looking at a measure in a time series plot, which should include methods for filtering out routine variation. So what would the story have been for Department 17 if the manager had analyzed the values of the in-process inventory? Some of the past monthly in-process inventory values are seen on the X-chart in Figure 1. The limits on this chart define how large or small a single monthly value must be before its deviation from the historical average can be measured. Here, a monthly value in excess of 3,160 would be a signal that the amount of in-process inventory had risen. Likewise, a monthly value below 850 would signal a fall. In either case, you would be justified in looking for the cause of such movements. The July value of 2,800 is not a signal. There is no evidence of any real change in the in-process inventory. This means that asking for an explanation of July's value was futile. There was nothing to explain. Department 17 had 2,800 because it was averaging 2,004, and the routine variation caused about half of the values to fall between 2,004 and 3,160. There is no other explanation for the value of 2,800. Anything else is pure fiction. Some may feel disconcerted when they see limits that go from 850 to 3,160. Surely we can hold the in-process inventory more steadily than that! But that is precisely what cannot be done. At least it cannot be done unless some fundamental changes are made in the underlying process. 
The natural process limits are the voice of the process. They define what the process will deliver as long as it continues to operate as consistently as possible. The way to calculate these limits was discussed in the January 1996 "SPC Toolkit."
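The reading of the Department 17 chart can be written as a simple comparison against the natural process limits. The sketch below is illustrative only: the function name `classify` and its labels are invented here, and the limits of 850 and 3,160 are the ones quoted above.

```python
def classify(value, lower_limit, upper_limit):
    """Label a single value as routine variation or a potential signal."""
    if value > upper_limit:
        return "signal: above upper limit"
    if value < lower_limit:
        return "signal: below lower limit"
    return "routine variation"


# Natural process limits quoted for Department 17 (pounds of inventory).
LOWER, UPPER = 850, 3160

july = classify(2800, LOWER, UPPER)
print("July value of 2,800:", july)
```

A value that is merely "high" relative to a plan or to last year never enters this comparison; only the limits computed from the process's own history do.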

When a process displays a reasonable degree of statistical control, it's operating as consistently as possible. The process doesn't care whether you like the natural process limits, and it certainly doesn't know what the specifications may be (specifications should be thought of as the voice of the customer, which is distinctly different from the voice of the process). Therefore, if you are not pleased with the amount of variation shown by the natural process limits, then you must change the underlying process, rather than setting arbitrary goals, asking for reports, jawboning the workers or looking for alternative ways for computing the limits. Mere description encourages inappropriate actions. It makes routine variation look like signals that need attention. In this case, there were no signals in the data, yet traditional ways of viewing the data didn't reveal this absence of signals. Analysis discourages inappropriate actions by filtering out the noise before potential signals are identified. The difference is profound.

Better Graphics
You have 30 seconds to communicate the content of your data to your audience: after 30 seconds, their eyes glaze over and you have lost their attention. So, how do you beat the 30-second rule? The only reliable way is with a graph. But not all graphs are created equal. Howard Wainer offers an interesting example of this in the Summer 1996 issue of Chance when he uses a graphic from the Bureau of the Census' Social Indicators III. Figure 1 shows a facsimile of this graph. This bar graph attempts to show two things at once: that the total number of elementary schools has gone down over the period shown, while the number of private elementary schools has grown slightly. Figure 1 is not a particularly bad graphic. It is legible, even if the vertical scale is a bit larger than it needs to be. However, by placing both measures on the same graph, Figure 1 compresses the time series for the private schools excessively. One of the principles of good graphics requires that when the data change, the graphic should also change. By placing both public and private schools on the same graph, the difference in magnitude between the two time series makes it impossible to fully comprehend the private school time series. Figure 2 shows what Figure 1 obscures: a jump in the number of private schools between 1950 and 1960. Of course, the connected nature of the plot in Figure 2 also helps because it draws the eye the way the mind wants to go. The bars of Figure 1 do not do this.


Adding data to Figure 2 will increase insight. By using more points over the same period, the nature of the changes will be better understood. Once you have plotted the data in an effective graph, you can see the need to explain certain data characteristics. While a graph cannot distinguish between an accidental relationship and a cause-and-effect relationship, it can be the springboard for asking interesting questions. For example, the baby boom hit elementary schools in 1952. Thus, the trend in Figure 3 lags behind the increases in the sizes of the elementary school cohorts. So, the baby boom as a possible explanation is not convincing. Another possible explanation is the 1954 Supreme Court decision Brown v. Board of Education of Topeka, which declared segregated public schools to be illegal. During the following decade, many private schools were started. This trend continued until the mid-1960s, when two things happened: the baby boomers were moving on to secondary schools, and the 1964 Civil Rights Act was passed. Figure 3 does not prove anything, but it certainly does support some interesting speculation.


Better graphics communicate the interesting parts of the data more directly. That means that when the data change, the graph shows that change. When the data are presented as a time series, it is better to use connected points than to use a bar chart. The connected points draw the eye the way the mind wants to go, while the bar chart doesn't do this. The scale should be sufficient to avoid excessive compression of the graphic. The graphic should "fill up" the graph. Only rarely will multiple measures be appropriate on a single graph. Decoration should be avoided. If you denote the points and lines needed to show the data as "data ink" and denote all other lines, tick-marks, labels and decoration as "nondata ink," better graphics will always have a high ratio of data ink to nondata ink. Nowadays, people can easily obtain highly decorated graphs at the push of a button. Yet only when they understand that the purpose of a graph is to inform rather than to decorate will they begin to produce better graphics. Experience, practice and good guidance will all help. To this end, I recommend Edward Tufte's book, The Visual Display of Quantitative Information (Graphics Press, Cheshire, Connecticut).

Global Warming?
Global warming is a theory in search of supporting data. However, in the search for supporting data, we should avoid misinterpreting our data. In the July 4, 1996, issue of Nature, Santer et al. use data such as those in Figure 1 as evidence of global warming. The values shown represent the annual average air temperatures between 5,000 feet and 30,000 feet at the midlatitude of the Southern Hemisphere. The zero line on the graph represents the normal temperature, and the values plotted are the deviation from the norm for each year.


The data of Figure 1 show a clear upward trend between 1963 and 1986. However, when we fit a regression line to data, we are imposing our view upon the data. If we know of some cause-and-effect mechanism that relates one variable to another, then regression lines are appropriate. But does the year cause the trend shown in Figure 1? While regression equations are useful in showing relationships, these relationships may be either causal or casual. At most, the relationship in Figure 1 is casual. But is the trend in Figure 1 real? Or is it just noise? This question can be answered in two ways: Check for internal evidence of a trend with a control chart, and check for external evidence of a trend by adding more data as they become available. Figure 2 shows these 24 data on an X-chart. The values for 1963, 1978, 1980 and 1986 all appear to differ from the norm. Hence, the "cool" year of 1963 combined with the "warm" years of 1977 through 1986 do suggest a possible trend. So there is some internal evidence for a trend in these data.

The limits can be adjusted for this possible trend in the following manner. Compute the average for the first half of the data. Years 1963 through 1974 had an average of -0.74° C. Plot this average vs. the midpoint of this period of time -- halfway between 1968 and 1969. Compute the average for the last half of the data. Years 1975 through 1986 had an average of 4.55° C. Plot this value vs. the point halfway between 1980 and 1981. Connect these two points to establish a trend line. The distance from the central line to the limits in Figure 2 was found by multiplying the average moving range by the scaling factor of 2.660. The average moving range is 2.663° C. Thus, limits will be placed on either side of the trend line at a distance of: 2.660 x 2.663° C = 7.08° C (see Figure 3).
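The trend-line construction described above amounts to a line through two half-averages with parallel limits on either side. The following Python sketch uses only the figures quoted in the text; the function name `trend_limits` and the representation of the midpoints as 1968.5 and 1980.5 are choices made here for illustration.

```python
def trend_limits(first_half_avg, first_mid, second_half_avg, second_mid,
                 average_moving_range, scaling=2.660):
    """Build a trend line through two half-averages.

    Returns the slope per year, the half-width of the limits, and a
    function giving (lower limit, trend value, upper limit) for a year.
    """
    slope = (second_half_avg - first_half_avg) / (second_mid - first_mid)
    half_width = scaling * average_moving_range

    def at(year):
        center = first_half_avg + slope * (year - first_mid)
        return center - half_width, center, center + half_width

    return slope, half_width, at


# Half-averages and average moving range quoted in the column.
slope, half_width, at = trend_limits(-0.74, 1968.5, 4.55, 1980.5, 2.663)

print(f"slope per year: {slope:.3f}")
print(f"half-width of limits: {half_width:.2f}")
lower, center, upper = at(1986)
print(f"1986: {lower:.2f} to {upper:.2f} around {center:.2f}")
```

The limits keep the same width as in the chart without a trend; only the central line is tilted.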

So the internal evidence is consistent with a trend for these data. But what about the external evidence? Professor Patrick Michaels of the University of Virginia added six prior years and eight following years to the data of Figure 1 (see Figure 4).


So, if the data for 1963 through 1986 are evidence of global warming, then the subsequent data show that we solved the problem of global warming in 1991. However, if the interpretation of the data for 1963 through 1986 is merely wishful thinking, then we may still have some work to do. The 38 values are placed on an X-chart in Figure 5. Once again, the central line is taken to be zero in order to detect deviations from the norm.

So while 1963 was cooler than the norm, and while 1977 through 1990 were detectably warmer than the norm, there is no evidence in these data to support the extrapolation of the trend line shown in Figure 1. Obviously, there are cycles in the global climate, and any substantial evidence for global warming will require a much longer baseline. The first principle for understanding data is: No data have meaning apart from their context. We cannot selectively use portions of the data to make our point and ignore other portions that contradict it. The second principle is: While all data contain noise, some data may contain signals. Therefore, before you can detect a signal, you must first filter out the noise. While there are some signals in these data, there is no evidence of a sustained trend.

Good Limits From Bad Data (Part I)
There are right and wrong ways of computing limits. Many technical journals and much software use the wrong methods. Charles, from the home office, was pushing the plant manager to start using control charts. The plant manager didn't know where to start, so he asked what he should be plotting. Charles responded that he might want to start with the data they were already collecting in the plant. To start, they checked the log sheet for batch weights -- a page where the mix operators had written down the weight of each batch they produced. Charles began to plot the batch weights on a piece of graph paper. After filling up the first page, he computed limits for an XmR chart. Of course, the chart was out-of-control and the process was unpredictable. Even though every batch was weighed and the operators wrote down each weight, the log did not enable them to produce batches with consistent weights. Unpredictable weights meant that the formulation was changing in unpredictable ways, which translated into a sense of fatalism downstream.

How could Charles determine that the process was unpredictable when he was using the data from the unpredictable process to compute the limits? The answer has to do with the way the limits are computed. There are right and wrong ways of computing limits. This column illustrates this difference for the XmR chart. The first 20 batch weights, in time order, were: 920 925 830 855 905 925 945 915 940 940 910 860 865 985 970 940 975 1,000 1,035 1,040

The central line for the X chart is commonly the average of the individual values. For these 20 values, the average is 934. (Alternate choices for the central line are a median for the individual values or, occasionally, when we are interested in detecting deviations from a norm, a target or nominal value.) Both of the correct methods for computing limits for the XmR chart begin with the computation of the moving ranges. Moving ranges are the differences between successive values. By convention, they are always non-negative. For the 20 data above, the 19 moving ranges are: 5 95 25 50 20 20 30 25 0 30 50 5 120 15 30 35 25 35 5

Correct Method 1: The most common method of computing limits for XmR charts is to use the average moving range, which is commonly denoted by R or mR written with an overbar. The limits for the X chart will be found by multiplying the average moving range by the scaling factor of 2.660, and then adding and subtracting this product from the central line. For these data, the average moving range is 32.63, so multiplying by 2.660 gives 86.8, and the limits for the individual values are placed at: 934 ± 86.8 = 847.2 to 1,020.8. The upper limit for the moving range chart is found by multiplying the average moving range by the scaling factor 3.268. For these data, this limit is 106.6. Figure 1 shows the XmR chart for these 20 batch weights. (Notice that the chart in Figure 1 shows three separate signals of unpredictable variation, even though the data from the unpredictable process were used to compute the limits.)
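Correct Method 1 can be sketched in Python as follows. The function name `xmr_limits` is invented for this illustration. The weights are entered in production order, which is assumed here to be the order whose successive differences give the 19 moving ranges quoted in the column (they sum to 620).

```python
X_SCALING = 2.660    # converts the average moving range into X chart limits
MR_SCALING = 3.268   # converts the average moving range into the mR upper limit


def xmr_limits(values):
    """Limits for an XmR chart based on the average moving range."""
    # Moving ranges: absolute differences between successive values.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    central_line = sum(values) / len(values)
    average_mr = sum(moving_ranges) / len(moving_ranges)
    half_width = X_SCALING * average_mr
    return {
        "central_line": central_line,
        "average_mr": average_mr,
        "x_lower": central_line - half_width,
        "x_upper": central_line + half_width,
        "mr_upper": MR_SCALING * average_mr,
    }, moving_ranges


# The 20 batch weights, in the assumed production order.
weights = [920, 925, 830, 855, 905, 925, 945, 915, 940, 940,
           910, 860, 865, 985, 970, 940, 975, 1000, 1035, 1040]

limits, moving_ranges = xmr_limits(weights)
for name, value in limits.items():
    print(f"{name}: {value:.1f}")
```

Because each moving range compares only neighboring values, a shift in the process inflates at most a few of them, which is why the resulting limits remain usable even when the data contain signals.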

Correct Method 2: The other correct method of computing limits for a chart for individual values is to use the median moving range, which is commonly denoted by R or mR written with a tilde. The limits for the X chart may be found by multiplying the median moving range by the scaling factor of 3.145, and then adding and subtracting this product from the central line. For these 19 moving ranges, the median moving range is 30. Multiplying by the scaling factor of 3.145 gives 94.4, and the limits for the X chart are placed at: 934 ± 94.4 = 839.6 to 1,028.4. The upper limit for the mR chart is: 3.865 x 30 = 116.0. These limits are slightly wider than those in Figure 1. However, the same points that fell outside the limits in Figure 1 would still be outside the limits based upon the median moving range. There is no practical difference between these two correctly computed sets of limits.

An incorrect method: A common, but incorrect, method for computing limits for an X chart is to use some measure of dispersion that is computed using all of the data. For example, the 20 data could be entered into a statistical calculator, or typed into a spreadsheet, and the standard deviation computed. The common symbol for this statistic is the lowercase letter "s." For these data: s = 56.68. This number is then erroneously multiplied by 3.0, and the product is added to and subtracted from the central line to obtain incorrect limits for the X chart: 934 ± 170.0 = 764 to 1,104. Figure 2 shows these limits. Notice that the chart in Figure 2 fails to detect the signals buried in these data. It is this failure to detect the signals which are clearly indicated by the other computational methods that makes this approach incorrect. Note that it is the methodology of computing a measure of dispersion, rather than the choice of dispersion statistic, that is the key to the right and wrong ways of obtaining limits. If we used the range of all 20 data (1,040 - 830 = 210), we would obtain incorrect limits of: 934 ± (3)(210)/3.735 = 934 ± 168.7 = 765.3 to 1,102.7, which are essentially the same as in Figure 2.
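The difference between the two approaches can be checked directly. The sketch below, with a helper name `outside` invented for illustration, computes the global standard deviation with Python's `statistics.stdev` and compares which batch weights fall outside the incorrect three-s limits and outside the average-moving-range limits quoted earlier (847.2 to 1,020.8).

```python
from statistics import mean, stdev

# The 20 batch weights; their order does not matter for this comparison.
weights = [920, 925, 830, 855, 905, 925, 945, 915, 940, 940,
           910, 860, 865, 985, 970, 940, 975, 1000, 1035, 1040]

center = mean(weights)


def outside(values, lower, upper):
    """Return the values that fall outside the given limits."""
    return [v for v in values if v < lower or v > upper]


# Incorrect: three times the global standard deviation of all 20 values.
s = stdev(weights)
wrong_limits = (center - 3 * s, center + 3 * s)

# Correct: limits from the average moving range, as quoted in the column.
right_limits = (847.2, 1020.8)

flagged_wrong = outside(weights, *wrong_limits)
flagged_right = outside(weights, *right_limits)

print("global s:", round(s, 2))
print("outside 3s limits:", flagged_wrong)
print("outside moving-range limits:", flagged_right)
```

The global standard deviation absorbs the very shifts we are trying to detect, so the limits built from it are wide enough to hide every one of the flagged weights.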

Conclusion
The right ways of computing limits will allow us to detect the signals within the data in spite of the fact that we used the data containing the signals in our computations. They are always based upon either an average or median dispersion statistic. The wrong ways of computing limits will inevitably result in inflated limits when signals are present within the data, and thus they will tend to hide the very signals for which we are looking. The wrong ways tend to be based upon a single measure of dispersion that was computed using all of the data. This distinction between the right and wrong ways of computing limits has not been made clear in most books about SPC, but it was there in Shewhart's first book. Many recent articles and software packages actually use the wrong methods. I can only assume it is because novices have been teaching neophytes for so many years that the teaching of SPC is out of control.

Good Limits From Bad Data (Part II)
Continuing the theme from last month, this column will illustrate the difference between the right and wrong ways of computing limits for average charts. We begin with a data set consisting of k = 6 subgroups of size n = 4:

Subgroup      1      2      3      4      5      6
Values        4      0      8      6      3      8
              5      2      4      9      2      7
              5      1      3      9      0      9
              4      5      7      7      3      9
Averages    4.5    2.0    5.5   7.75    2.0   8.25
Ranges        1      5      5      3      3      2

The central line for the average chart (also known as an X-bar chart) is commonly taken to be the grand average. For these data, the grand average is 5.00.

Average charts done right
The most common method of computing limits for average charts is to use the average range. The limits for the average chart may be found by multiplying the average range by the scaling factor A2, and then adding and subtracting this product from the central line. For these data, the average range is 3.167 and the value of A2 for subgroup size n = 4 is 0.729, therefore the product is 2.31, and the limits for the average chart are: 5.00 ± 2.31 = 2.69 to 7.31. Instead of using A2 times the average range, you may use any one of several alternatives as long as you use the correct scaling factors. Some appropriate substitutions are: A4 times the median range (2.27), A1 times the average RMS deviation (2.44) and A3 times the average standard deviation (2.44). While there are other valid alternatives, the four above are the ones most commonly used. Tables of the scaling factors are found in most textbooks on SPC. No matter which of the computational alternatives is used, the chart looks the same: Subgroups 2 and 5 have averages below the lower limit, and subgroups 4 and 6 have averages above the upper limit. Thus, all of the correct ways of computing limits for an average chart allow you to obtain good limits from bad data -- that is, we are able to detect the lack of statistical control even though we are using the out-of-control data to compute the limits. Of course, this property is subject to the requirement that the subgrouping is rational -- that each subgroup is logically homogeneous. As Shewhart observed, the issue of subgrouping is essentially a matter of judgment. Based on the context for the data, we must be able to argue that the values collected together within any one subgroup can be thought of as having been collected under essentially the same conditions. For more on this topic, see this column in the April 1996 issue of Quality Digest.
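The average-range computation can be reproduced with a short Python sketch. The six subgroups are read from the data table at the start of this column, and 0.729 is the A2 value quoted above for n = 4; the variable names are chosen here for illustration.

```python
# Six subgroups of size four, read from the data table in this column.
subgroups = [
    [4, 5, 5, 4],
    [0, 2, 1, 5],
    [8, 4, 3, 7],
    [6, 9, 9, 7],
    [3, 2, 0, 3],
    [8, 7, 9, 9],
]
A2 = 0.729  # scaling factor for subgroups of size n = 4

averages = [sum(g) / len(g) for g in subgroups]
ranges = [max(g) - min(g) for g in subgroups]

grand_average = sum(averages) / len(averages)
average_range = sum(ranges) / len(ranges)

lower = grand_average - A2 * average_range
upper = grand_average + A2 * average_range

# Subgroup numbers (1-based) whose averages fall outside the limits.
signals = [i for i, a in enumerate(averages, start=1)
           if a < lower or a > upper]

print(f"grand average: {grand_average:.2f}, average range: {average_range:.3f}")
print(f"limits: {lower:.2f} to {upper:.2f}")
print("subgroups outside limits:", signals)
```

Each range is computed inside one subgroup, so differences between subgroups never inflate the measure of dispersion. That is what lets the limits expose those differences.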

Average charts done wrong
Perhaps the most common mistake made in computing limits for an average chart is the use of a single measure of dispersion computed using all of the data. If all 24 values were entered into a spreadsheet or statistical calculator, and the standard deviation computed, we would get the value: s = 2.904. When this global measure of dispersion is used (inappropriately) to compute limits for an average chart, it is divided by the square root of the subgroup size and multiplied by 3.0. This would result in a value of 4.356, which would yield incorrect limits for the average chart of: 5.00 ± 4.36 = 0.64 to 9.36. This method of computing limits for the average chart is wrong because it results in limits that do not detect the signals contained in the data. This approach gives you bad limits from bad data because the computation of a single measure of dispersion using all of the data makes an implicit assumption that the data are globally homogeneous. In short, this computation assumes that there is no possibility of any signals within the data, and so it makes sure that you do not find any signals.

Average charts done very wrong
The second most common mistake in computing limits for an average chart is the use of a single measure of dispersion computed using all of the subgroup averages. If the six subgroup averages were typed into a spreadsheet or entered in a statistical calculator, and the standard deviation computed, we would get the value: s = 2.706. Because this is the standard deviation of the subgroup averages, it is (inappropriately) multiplied by 3.0 and used to construct incorrect limits for the average chart of: 5.00 ± (3.0) (2.706) = 5.00 ± 8.12 = -3.12 to 13.12. This method of computing limits does not just bury the signals, it obliterates them. Once again, we get bad limits from bad data. The calculation of the standard deviation of the subgroup averages implicitly assumes that the subgroup averages are globally homogeneous; that is, they do not differ except for noise. Because this method assumes that there is no possibility of any signals within the data, it makes sure that you do not find any signals.

Conclusion
The only way to get good limits from bad data is to use the correct computational approaches. These correct methods all rely upon either an average dispersion statistic or a median dispersion statistic and the appropriate scaling factor. The wrong methods tend to rely upon a single measure of dispersion computed on a single pass, using either all the data or all the subgroup averages. The distinction between the right and wrong ways of computing limits was first made by Shewhart on page 302 of his book, Economic Control of Quality of Manufactured Product (ASQC Quality Press). It should not be an issue some 65 years later. The fact that it is an issue suggests that many people who think they know about control charts didn't get the message.

How can you tell the difference? You can use any out-of-control data set to evaluate software packages -- they should agree with the limits computed by hand. Or you can look at the formulas or computations used: If the limits are based upon any measure of dispersion that doesn't have a bar above it, then the limits are wrong. So, if you now catch someone using any other way than one of the scaling factor approaches, then you know, beyond any doubt, that in addition to being wrong, they are also either ignorant or dishonest.
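
To see the three computations side by side, here is a minimal Python sketch. The 24 values are an illustrative data set, constructed to be consistent with the summary statistics quoted above (grand average 5.00, s = 2.904 for all values, s = 2.706 for the subgroup averages); they are an assumption and are not taken from the column's own table. The function names are hypothetical, and A2 = 0.729 is the conventional scaling factor for average charts with subgroups of size n = 4.

```python
from statistics import mean, stdev

# Illustrative data (an assumption, not the column's table): k = 6 subgroups, n = 4.
SUBGROUPS = [
    [4, 5, 5, 4],
    [0, 2, 1, 5],
    [8, 4, 3, 7],
    [6, 9, 9, 7],
    [3, 2, 0, 3],
    [8, 9, 9, 7],
]
A2 = 0.729  # conventional average-chart scaling factor for n = 4


def right_limits(groups):
    """Right way: scale the average within-subgroup range."""
    grand_average = mean(mean(g) for g in groups)
    average_range = mean(max(g) - min(g) for g in groups)
    half_width = A2 * average_range
    return grand_average - half_width, grand_average + half_width


def wrong_limits(groups):
    """Wrong way: one standard deviation computed from all of the values."""
    values = [x for g in groups for x in g]
    half_width = 3.0 * stdev(values) / len(groups[0]) ** 0.5
    return mean(values) - half_width, mean(values) + half_width


def very_wrong_limits(groups):
    """Very wrong way: standard deviation of the subgroup averages."""
    averages = [mean(g) for g in groups]
    half_width = 3.0 * stdev(averages)
    return mean(averages) - half_width, mean(averages) + half_width


def out_of_limits(groups, limits):
    """Return the 1-based numbers of subgroups whose averages fall outside the limits."""
    lower, upper = limits
    return [i for i, g in enumerate(groups, start=1) if not lower <= mean(g) <= upper]


for name, method in [("right", right_limits),
                     ("wrong", wrong_limits),
                     ("very wrong", very_wrong_limits)]:
    lower, upper = method(SUBGROUPS)
    flagged = out_of_limits(SUBGROUPS, (lower, upper))
    print(f"{name:>10}: {lower:6.2f} to {upper:6.2f}  signals: {flagged}")
```

With these illustrative values, only the scaling-factor method flags subgroups 2, 4, 5 and 6; the two single-pass methods report no signals at all.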

Good Limits From Bad Data (Part III)
When you use rational sampling and rational subgrouping, you will have powerful charts. In March and April, this column illustrated the difference between the right and wrong ways of computing control chart limits. Now I would like to discuss how you can make the charts work for you. The calculation of control limits is not the end of the exercise, but rather the beginning. The chief advantage of control charts is the way they enable people to reliably separate potential signals from the probable noise that is common in all types of data. This ability to characterize the behavior of a process as predictable or unpredictable, and thereby to know when to intervene and when not to intervene, is the real outcome of the use of Shewhart's charts. The computations are part of the techniques, but the real objective is insight, not numbers. To this end, you will need to organize your data appropriately in order to gain the insights. This appropriate organization of the data has been called rational sampling and rational subgrouping. First, you must know the context for the data. This involves the particulars of how the data were obtained, as well as some appreciation for the process or operations represented by the data. Rational sampling involves collecting data in such a way that the interesting characteristics of the process are evident in the data. For example, if you are interested in evaluating the impact of a new policy on the operations of a single office, you will need to collect data that pertain to that office, rather than to a whole region. Rational subgrouping has to do with how the data are organized for charting purposes. This is closely linked to the correct ways of computing limits. With average and range charts (X-bar and R charts), there will be k subgroups of data. The right way to compute limits for these charts involves the computation of some measure of dispersion within each subgroup (such as the range for each subgroup).
These k measures then combine into an average measure of dispersion (such as the average range) or a median measure of dispersion (such as a median range), and this combined measure of dispersion is then used to compute the limits. The objective of the control chart is to separate the probable noise from the potential signals. The variation within the subgroups will be used to set up the limits, which we shall use as our filters. Therefore, we will want the variation within the subgroups to represent the probable noise, i.e., we want each subgroup to be logically homogeneous. Shewhart said that we should organize the data into subgroups based upon our judgment that the data within any one subgroup were collected under essentially the same conditions. In order to have a meaningful subgrouping, you must take the context of the data into account as you create the subgroups. You have to actively and intelligently organize the data into subgroups in order to have effective average and range charts. When you place two or more values together in a single subgroup, you are making a judgment that, for your purposes, these data only differ due to background noise. If they have the potential to differ due to some signal, then they do not belong in the same subgroup. This is why the average chart looks for differences between the subgroups while the range chart checks for consistency within the subgroups. This difference between the charts is inherent in the structure of the computations -- ignore it at your own risk. But what if every value has the potential to be different from its neighbors, such as happens with monthly or weekly values? With periodically collected data, the chart of preference is the chart for individual values and a moving range (the XmR chart). Here, each point is allowed to sink or swim on its own. The moving range approach to computing limits uses short-term variation to set long-term limits. 
In this sense, it is like the average chart, where we use the variation within the subgroups to set the limits for the variation between the subgroups.

While the right ways of computing limits will allow you to get good limits from bad data, the chart will be no better than your organization of the data. When you use rational sampling and rational subgrouping, you will have powerful charts. If you organize your data poorly, you can end up with weak charts that obscure the signals. Until you have the opportunity to develop subgrouping skills, it is good to remember that it is hard to mess up the subgrouping on an XmR chart.
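
For readers who want to try an XmR chart, the following Python sketch shows one common way to carry out the moving range computation. The scaling factors 2.66 and 3.27 are the conventional constants for two-point moving ranges and are not given in this column; the function name and the ten monthly values are hypothetical.

```python
from statistics import mean


def xmr_limits(values):
    """Compute XmR chart lines from the average two-point moving range."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    average_moving_range = mean(moving_ranges)
    return {
        "center": center,
        "lower": center - 2.66 * average_moving_range,   # lower limit for individual values
        "upper": center + 2.66 * average_moving_range,   # upper limit for individual values
        "range_upper": 3.27 * average_moving_range,      # upper limit for moving ranges
    }


# Hypothetical monthly values, one value per period (subgroup size n = 1).
monthly = [10, 12, 11, 13, 12, 10, 11, 25, 12, 11]
limits = xmr_limits(monthly)
outside = [i for i, x in enumerate(monthly, start=1)
           if not limits["lower"] <= x <= limits["upper"]]

print(f"limits: {limits['lower']:.2f} to {limits['upper']:.2f}")
print("points outside the limits:", outside)
```

Because each value is compared only with its immediate neighbor, the short-term variation sets the limits, and the one unusual month still falls outside them even though it was used in the computation.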

How Much Data Do I Need?
The relationship between degrees of freedom and the coefficient of variation is the key to answering the question of how much data you need. "How much data do I need to use when I compute limits?" Statisticians are asked this question more than any other question. This column will help you learn how to answer this question for yourself. Implicit in this question is an intuitive understanding that, as more data are used in any computation, the results of that computation become more reliable. But just how much more reliable? When, as more data become available, is it worthwhile to recompute limits? When is it not worthwhile? To answer these questions, we must quantify the two concepts implicit in the intuitive understanding: The amount of data used in the computation will be quantified by something called "degrees of freedom," and the amount of uncertainty in the results will be quantified by the "coefficient of variation." The relationship between degrees of freedom and the coefficient of variation is the key to answering the question of how much data you need. The terminology "degrees of freedom" cannot be explained without using higher mathematics, so the reader is advised to simply use it as a label that quantifies the amount of data utilized by a given computation.

The effective degrees of freedom for a set of control limits will depend on the amount of data used and the computational approach used. For average and range charts (X-bar and R charts), where the control limits are based upon the average range for k subgroups of size n, the degrees of freedom for the limits will be: d.f. ≈ 0.9 k (n - 1). For example, in April's column, limits were computed using k = 6 subgroups of size n = 4. Those limits could be said to possess: 0.9 (6) (3) = 16.2 degrees of freedom. For average and standard deviation charts (X-bar and s charts), where the control limits are based on the average standard deviation for k subgroups of size n, the degrees of freedom for the limits will be: d.f. ≈ k (n - 1) - 0.2 (k - 1). In my April column, if I had used the average standard deviation to obtain limits, I would have had: (6) (3) - 0.2 (5) = 17 degrees of freedom. As will be shown below, the difference between 16 d.f. and 17 d.f. is of no practical importance. For XmR charts, with k subgroups of size n = 1, and limits based on the average moving range, the degrees of freedom for the limits will be: d.f. ≈ 0.62 (k - 1). In my March column, I computed limits for an XmR chart using 20 data. Those limits possessed: 0.62 (19) = 11.8 degrees of freedom. The better SPC textbooks give tables of degrees of freedom for these and other computational approaches. However, notice that the formulas are all functions of n and k, the amount of data available. Thus, the question of "How much data do I need?" is really a question of "How many degrees of freedom do I need?" And to answer this, we need to quantify the uncertainty of our results, which we shall do using the coefficient of variation.
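
The three approximations for degrees of freedom can be written as small Python functions, which makes it easy to check the worked figures. This is only a sketch; the function names are hypothetical.

```python
def df_average_range(k, n):
    """Approximate d.f. for limits based on the average range of k subgroups of size n."""
    return 0.9 * k * (n - 1)


def df_average_std(k, n):
    """Approximate d.f. for limits based on the average standard deviation."""
    return k * (n - 1) - 0.2 * (k - 1)


def df_xmr(k):
    """Approximate d.f. for XmR limits based on the average moving range of k values."""
    return 0.62 * (k - 1)


print(round(df_average_range(6, 4), 1))  # April example: k = 6, n = 4
print(round(df_average_std(6, 4), 1))    # same data, average standard deviation
print(round(df_xmr(20), 1))              # March example: 20 individual values
```

The three calls reproduce the 16.2, 17 and 11.8 degrees of freedom quoted in the text.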

Control limits are statistics. Thus, even when working with a predictable process, different data sets will yield different sets of control limits. The differences in these limits will tend to be small, but they will still differ. We can see this variation in limits by looking at the variation in the average ranges. For example, consider repeatedly collecting data from a predictable process and computing limits. If we use k = 5 subgroups of size n = 5, we will have 18 d.f. for the average range. Twenty such average ranges are shown in the top histogram of Figure 1.

If we use k = 20 subgroups of size n = 5, we will have 72 d.f. for the average range. Twenty such average ranges are shown in the bottom histogram of Figure 1. Notice that, as the number of degrees of freedom increases, the histogram of the average ranges becomes more concentrated. The variation of the statistics decreases as the degrees of freedom increase. A traditional measure of just how much variation is present in any measure is the coefficient of variation, which is defined as:

CV = (standard deviation of measure) / (mean of measure)

Examining Figure 1, we can see that as the degrees of freedom go up, the coefficient of variation for the average range goes down. This relationship holds for all those statistics that we use to estimate the standard deviation of the data. In fact, there is a simple equation that shows the relationship. For any estimate of the standard deviation of X:

CV = 1 / sq.rt.(2 d.f.)

This relationship is shown in Figure 2. So just what can you learn from Figure 2? The curve shows that when you have very few degrees of freedom -- say less than 10 -- each additional degree of freedom that you have in
your computations results in a dramatic reduction in the coefficient of variation for your limits. Since degrees of freedom are directly related to the number of data used, Figure 2 suggests that when we have fewer than 10 d.f., we will want to revise and update our limits as additional data become available. The curve in Figure 2 also shows that there is a diminishing return associated with using more data in computing limits. Limits based upon 8 d.f. will have half of the variation of limits based upon 2 d.f., and limits based upon 32 d.f. will have half of the uncertainty of limits based upon 8 d.f. Each 50-percent reduction in variation for the limits requires a four-fold increase in degrees of freedom. As may be seen from the curve, this diminishing return begins around 10 degrees of freedom, and by the time you have 30 to 40 d.f., your limits will have solidified. So, if you have fewer than 10 degrees of freedom, consider the limits to be soft, and recompute the limits as additional data become available. With Shewhart's charts, 10 degrees of freedom require about 15 to 24 data. You may compute limits using fewer data, but you should understand that such limits are soft. (While I have occasionally computed limits using as few as two data, the softest limits I have ever published were based on four data!) When you have fewer than 10 d.f. for your limits, you can still say that points which are comfortably outside the limits are potential signals. Likewise, points comfortably inside the limits are probable noise. With fewer than 10 d.f., only those points close to the limits are uncertain. Thus, with an appreciation of the curve in Figure 2, you no longer must be a slave to someone's arbitrary guideline about how much data you need. Now you can use whatever amount of data may be available. You know that with fewer than 10 d.f., your limits are soft, and with more than 30 d.f., your limits are fairly solid. 
After all, the important thing is not the limits but the insight into the process behavior that they facilitate. The objective is not to get the "right" limits but rather to take the appropriate actions on the process. So use the amount of data the world gives you, and get on with the job of separating potential signals from probable noise.
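
The diminishing return described above is easy to tabulate from the relationship between the coefficient of variation and the degrees of freedom. The short Python sketch below does so; the function name and the particular d.f. values chosen are illustrative.

```python
import math


def coefficient_of_variation(degrees_of_freedom):
    """CV of an estimate of the standard deviation: 1 / sqrt(2 * d.f.)."""
    return 1.0 / math.sqrt(2.0 * degrees_of_freedom)


for df in (2, 8, 10, 18, 32, 72):
    print(f"{df:3d} d.f. -> CV = {coefficient_of_variation(df):.3f}")
```

Going from 2 to 8 to 32 d.f. halves the coefficient of variation each time, which is the four-fold rule stated in the text.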

Five Ways to Use Shewhart's Charts
The only limitation on the use of Shewhart's charts is your imagination. And the way to stimulate your imagination is to begin using this powerful technique yourself. The many different ways of using control charts in both service and manufacturing applications may be summarized under five major headings. These five categories are arranged below in order of increasing sophistication. 1. Report card charts. These charts are kept for the files. They may occasionally be used for information about how things are going or for verification that something has or has not occurred, but they are not used in real time for operating or improving the processes and systems present. This is a valid but weak usage of control charts. 2. Process adjustment charts. Some product characteristics may be plotted on a control chart and used in a feedback loop for making process adjustments, or some input characteristic may be tracked and used in a feed-forward loop for the same purpose. In many cases, these process adjustment charts will result in substantially more consistent operations than was the case prior to the use of control charts. (This assumes that someone will know how to properly adjust the process. In some cases, such knowledge can only be gained by some of the following uses of control charts.) However, once this initial improvement has been achieved, process adjustment charts simply strive to preserve the new status quo.

The potential for dynamic and continual improvement is missing from this usage of the charts. Unfortunately, this seems to be the only usage considered in most of the articles recently published in trade journals. 3. Process trial charts. These charts analyze the data from simple experiments performed upon the process. This short-term usage of control charts is a simple and easy-to-understand alternative to the use of ANOVA and other statistical techniques. This usage is often found in conjunction with the next category. 4. Extended monitoring charts. This is the use of multiple control charts to simultaneously track several related characteristics in order to discover just which charts provide the best predictors of process or product performance. This usage will generally involve a project team with a specific mission. It is one of the preliminary steps for both the effective utilization of control charts and the effective use of process experiments. Without the constancy of purpose evidenced by extended monitoring and without the process stability obtained by getting the process into statistical control, it is doubtful that designed experiments will be of any long-term benefit. 5. The use of control charts for continual improvement. It is rare to begin with this usage of the charts. In many cases, progress to this last category comes only after extended monitoring and, possibly, process trials run. The control chart becomes a powerful tool for continual improvement only as those involved with the process learn how to use the chart to identify and remove assignable causes of uncontrolled variation. Every out-of-control point is an opportunity. But these opportunities can be utilized only by those who have prepared themselves in advance. SPC is ultimately a way of thinking, with the charts acting as a catalyst for this thought process. Lloyd Nelson calls a control chart a "when to fix it" chart. Ed Halteman calls it a "has a change occurred" chart. 
Sophronia Ward calls it a "process behavior" chart. All of these alternative names emphasize the interaction between the user and the chart, which is the secret of how the simple control chart can be the catalyst for continual improvement. The only limitation on the use of Shewhart's charts is your imagination. And the way to stimulate your imagination is to begin using this powerful technique yourself.

Three Types of Action
Shewhart's charts are the one tool that will facilitate the continual improvement of both process and product. Prior to the Industrial Revolution, manufacturing consisted of making things by hand. Each part was custom-made to fit in with the other parts in each assembly, with the result that every product was unique and expensive. As early as 1793, Eli Whitney had the idea of the interchangeability of parts. While this idea was revolutionary, it was also difficult to implement. The problem was how to make the parts interchangeable. Try as one might, the parts would not turn out to be identical. Therefore, manufacturers had to be content with making them similar. Specifications were developed to define how similar the parts had to be in order to fit, and all variation was classified as either permissible (within the specifications) or excessive (outside the specifications). The specification was a guide for defining the difference between a "good" part and a "bad" part. But it did not tell manufacturers how to make good parts, nor did it help them discover
why bad parts were being produced. All they could do with specifications is sort the good stuff from the bad stuff at the end of the production line. Thus, manufacturing became an endless cycle of fabrication, inspection and rework, with some good product escaping every now and then. Of course, the customer needed more good product than was leaking out of the manufacturing process, and so the manufacturers began to write deviations from the specifications in order to get more good stuff to ship. And this is the origin of the perpetual argument about how good the parts must be. Manufacturers seek relaxed specifications, customers demand tighter specifications, and the engineers are caught in the middle. This conflict obscured the original and fundamental issue -- how to manufacture parts with as little variation as possible. The original ideal had been to make parts that were essentially identical. But how can we do this? A state of virtually uniform product can be achieved only through the careful study of the sources of variation in the process, and through action by management to reduce -- or to eliminate entirely -- sources of extraneous variation. Shewhart's charts provide a way to do just this. Shewhart's control charts allow you to characterize a given process as being predictable or unpredictable. A predictable process operates as consistently as possible; an unpredictable process does not. And this distinction is the beginning of the journey of continual improvement. When your process is unpredictable, it will display excessive variation that can be attributed to assignable causes. By the very way the charts are set up, it will be worthwhile to look for any assignable cause of unpredictable process changes. As the charts guide you to those points in space and time that are connected with the unpredictable process changes, they help you discover ways to improve your process, often with little or no capital expense. 
On the other hand, when your process is predictable, it will be a waste of time to look for assignable causes of excessive variation. There is no evidence of the presence of assignable causes, and looking for such will simply be a waste of time and effort. When a process is already operating as consistently as possible, the only way to improve it will be to change it in some fundamental manner. A predictable process operates up to its potential; an unpredictable process does not. Are you getting the most out of your process? Shewhart's charts give you the means of identifying the voice of the process. This is distinctly different from specifications, which are, at best, the voice of the customer. Thus, we need to distinguish between three different types of action: 1. Specifications are for taking action on the product -- to separate the good stuff from the bad stuff after the fact. 2. Shewhart's charts are for taking action on the process -- to look for assignable causes when they are present, with an eye toward process improvement, and to refrain from looking for assignable causes when they are absent. 3. Actions to align the two voices are desirable -- While this has been tried in the past, the lack of a well-defined voice of the process has made alignment difficult to achieve. These three types of action have different objectives. All are valid, all are reasonable, but the first and the third are strictly concerned with maintaining the status quo. Shewhart's charts are the one tool that will facilitate the continual improvement of both process and product.

What Is a Rational Subgroup?

In the April column, I outlined three ways to compute the limits for an average chart: the right way, a wrong way and a very wrong way. Several readers wrote that they were using the very wrong way and that they were happy with this method. I have seen dozens of examples given in attempts to justify the incorrect ways of computing limits. In every case, the problem was a failure to subgroup the data in a rational manner. We compute limits for an average chart based upon the average range. The average range is the average amount of variation within the subgroups. Thus, the limits on an average chart depend upon the amount of variation inside the subgroups. You must organize the data into subgroups in such a way that this computation makes sense. We want to collect into each subgroup a set of values that were collected under essentially the same conditions. For example, some asthma patients measure their peak exhalation flow rates four times each day: morning before and after medication, and evening before and after medication. The data for one patient is shown in Figure 1.

Now think about what happens when we make each column in the table into a subgroup of size 4. Within each subgroup, we would have the four scores from a single day, and from one subgroup to the next, we would have the day-to-day variation. But the four scores for a single day are collected under different conditions! The variation within a subgroup is more than just background variation -- it includes both the medication effects and the morning-to-evening swings of the patient. These effects will make the ranges larger than they need to be to characterize the day-to-day variation. As a result of this subgrouping, the limits will be far too wide, and the averages and ranges will hug the central lines. This mistake is called stratification. What if we made each row of the table into a subgroup of size 5? Now the different conditions would no longer be contained within the subgroups. But what about the variation inside these subgroups? With this arrangement of the data, the day-to-day variation would be within each subgroup. Because the variation within the subgroups is used to construct the limits, this subgrouping will result in limits that make allowance for the day-to-day variation, but do not make any allowance for the variation morning to evening, or before and after medication. This average chart will be "out of control." But did we really need to prove that there is a difference morning to evening and pre-medication to post-medication? Unless we are trying to document these differences, this is an inappropriate subgrouping. So we must avoid the two errors of stratification and inappropriate subgrouping. Two conditions are required for any subgrouping to be rational: Each subgroup must be logically homogeneous, and the variation within the subgroups must be the proper yardstick for setting limits on the routine variation between subgroups. 
When the values within the subgroups are not collected under essentially the same conditions, you have failed to satisfy the homogeneity condition. When the variation from subgroup to subgroup represents sources of variation that are not present within the subgroups, and when these sources of variation from subgroup to subgroup are known to be larger than the sources of variation within the subgroups, then you have failed the yardstick criterion.

In either case, the computations will break down because you will have failed to create rational subgroups. The remedy is not to change the computations, but to change the subgrouping into one that is appropriate for your data. While the data in the table do constitute a time series, they are not easily arranged into rational subgroups because each value is collected under different conditions. In other words, our logical subgroup size is n = 1. You will learn more about the data in the table by plotting them as a time series of 20 values than you ever will by subgrouping them and using an average and range chart. At the same time, you should resist the temptation to turn this time series of 20 values into an XmR chart. The fact that this time series is a mixture of values collected under different conditions will contaminate the moving ranges and make the limits meaningless. There is more to rational subgrouping than can be presented in this column. However, the two principles above should get you started down the right road.
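
The two subgrouping errors can be illustrated numerically. The Python sketch below uses a hypothetical 4-by-5 table of peak flow scores (these are invented values, not the data of Figure 1) and computes average-chart limits both ways. The function names are hypothetical, and 0.729 and 0.577 are the conventional scaling factors for subgroups of size 4 and 5.

```python
from statistics import mean

# Hypothetical peak flow scores: 4 measurement conditions (rows) x 5 days (columns).
SCORES = [
    [300, 310, 305, 295, 300],  # morning, before medication
    [380, 390, 385, 375, 380],  # morning, after medication
    [340, 350, 345, 335, 340],  # evening, before medication
    [420, 430, 425, 415, 420],  # evening, after medication
]
A2 = {4: 0.729, 5: 0.577}  # conventional average-chart scaling factors


def average_chart(subgroups):
    """Return (averages, lower limit, upper limit) based on the average range."""
    n = len(subgroups[0])
    averages = [mean(g) for g in subgroups]
    average_range = mean(max(g) - min(g) for g in subgroups)
    center = mean(averages)
    half_width = A2[n] * average_range
    return averages, center - half_width, center + half_width


def count_outside(subgroups):
    """Count the subgroup averages that fall outside the computed limits."""
    averages, lower, upper = average_chart(subgroups)
    return sum(1 for a in averages if not lower <= a <= upper)


by_day = [list(column) for column in zip(*SCORES)]  # each day is a subgroup of size 4
by_condition = SCORES                               # each condition is a subgroup of size 5

print("subgroups = days:      ", count_outside(by_day), "averages outside the limits")
print("subgroups = conditions:", count_outside(by_condition), "averages outside the limits")
```

With these invented scores, subgrouping by day gives limits so wide that every average hugs the central line (stratification), while subgrouping by condition puts all four averages outside the limits, which only confirms differences that were already known.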

Collecting Good Count Data
Obtaining good count data is a mixture of planning and common sense. Counts are simple. But obtaining the count is only half the job. In addition, you also must know the area of opportunity for that count. In fact, the area of opportunity defines the count. And just what is the area of opportunity for a count? It depends on what is being counted, how it is being counted and what possible restrictions there might be upon the count. Let's begin with the problem of tracking complaints. How do you count them? Do you count the number of complaints received each month? Or do you count the number of customers who complained? You will need careful instruction before you can begin to collect useful count data. A certain pediatrics unit reported the number of concerns on a monthly basis. The values for one period of 21 months were, respectively, 20, 22, 9, 12, 13, 20, 8, 23, 16, 11, 14, 9, 11, 3, 5, 7, 3, 2, 1, 7 and 6. But even though you know the counts, you don't know the whole story because you don't know the context for the counts. Before anyone can make sense of these counts, certain questions must be answered. For instance, how is "concern" defined? Are these customer complaints or internally generated counts? Where is the border between a concern and a nonconcern? Why does the number of concerns drop? And what about the rumor that the hospital administrator is using these numbers to challenge the orthopedics unit to improve? If you don't know the area of opportunity for a count, you don't know how to interpret that count. If the area of opportunity changes over time, then the counts will not be comparable. To obtain comparable values when the area of opportunity changes, you must divide each count by its area of opportunity. Let's assume that "concerns" is just an antiseptic term for complaints. You could characterize the area of opportunity for these complaints in several ways: by the number of office visits, procedures performed or hours worked by primary caregivers.
The area of opportunity will determine the ways you can use the counts to understand your process. And what constitutes a complaint? Does a complaint about a chilly reception room count?

Don't despair. You can collect useful count data. The essence of the count data problem is twofold: What should you include in your count, and what area of opportunity would you use to adjust the counts to make them comparable? Begin with a written description of what to include in your count. What is the threshold for inclusion? Give examples; the more specific the better. Next, what is an appropriate area of opportunity for your count? You must choose an area that can be measured or counted, and that bears some clear relationship to the count. The test here is rationality. Find some logical connection between the size of the area of opportunity and the size of the count. Any one count may have several possible ways to characterize the area of opportunity, and for this reason alone, you must make an initial choice. Say you track sales generated through your Web site. The number of Web site orders divided by the number of Web site visits would be a proportion based upon counts. But you might also want to know the proportion of sales that came from these Web site orders. This would require a ratio of measurements. There is no simple formula for obtaining good count data. It's basically a mixture of planning and common sense, with some thought given to adjusting for variable areas of opportunity. Of course, there is always the problem of counting events that include different degrees of severity. Many times this problem is addressed by creating categories for the different degrees and then awarding different numbers of "demerits" for each category. While it may be helpful to create such categories, you should resist the temptation of adding up the demerits. An example of the absurdities that can result if you do comes from the University of Texas, where, in the 1960s, the campus police could issue tickets to students. The tickets came in three flavors: minor, major and flagrant. Minor violations included such things as jaywalking and littering. 
Four minor violations would get you expelled. Major violations included parking in a faculty space or hitting a pedestrian with your car. Two major violations would get you expelled. And then there was the flagrant category. The only infraction listed for a flagrant citation was moving a campus police barricade. So, if you had to make a choice between hitting a barricade or a jaywalker, you chose the pedestrian every time -- you got two of them for each barricade.
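
Returning to the adjustment for a changing area of opportunity, a minimal Python sketch is given here. The six counts are the first six monthly values quoted earlier for the pediatrics unit; the numbers of office visits are hypothetical, as is the function name.

```python
def rates(counts, areas, per=100):
    """Divide each count by its area of opportunity, expressed per `per` units."""
    if len(counts) != len(areas):
        raise ValueError("each count needs its own area of opportunity")
    if any(a <= 0 for a in areas):
        raise ValueError("areas of opportunity must be positive")
    return [per * c / a for c, a in zip(counts, areas)]


concerns = [20, 22, 9, 12, 13, 20]        # first six monthly counts from the text
visits = [400, 440, 180, 300, 260, 500]   # hypothetical office visits per month

for month, rate in enumerate(rates(concerns, visits), start=1):
    print(f"month {month}: {rate:.1f} concerns per 100 visits")
```

In this made-up case the raw count falls from 20 to 9 between months 1 and 3 while the rate stays at 5 per 100 visits, so the drop in the count would reflect fewer visits rather than fewer concerns per visit.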

The Four Possibilities for Any Process
Every unpredictable process is subject to the effects of assignable causes.
Successful quality control requires making a clear distinction between product and process. Products may be characterized by conformance to specifications. Processes may be characterized by predictability. When combined, these two classification systems yield four possibilities for any process:
1. Conforming and predictable -- the ideal state
2. Nonconforming and predictable -- the threshold state
3. Conforming yet unpredictable -- the brink of chaos
4. Nonconforming and unpredictable -- the state of chaos
The ideal state occurs when a process is predictable and produces a 100-percent conforming product. Such predictability in a process results from using Shewhart's charts to identify assignable causes in order to remove their effects. Product conformity results from having natural process limits that fall within the specification limits.
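The two-way classification above can be written as a small lookup. This is a minimal sketch rather than a prescribed procedure: the function names, the specification limits and the measurements are all hypothetical, and the "predictable" answer is assumed to come from a control chart rather than from this code.

```python
def is_conforming(measurements, lower_spec, upper_spec):
    """True when every measured item falls within the specification limits."""
    return all(lower_spec <= x <= upper_spec for x in measurements)


def classify_state(conforming, predictable):
    """Map the two yes/no answers onto the four states named in the text."""
    if predictable:
        return "ideal state" if conforming else "threshold state"
    return "brink of chaos" if conforming else "state of chaos"


# Hypothetical data: specifications of 9.0 to 11.0, and a chart that
# (by assumption) shows the process to be unpredictable.
recent_product = [9.8, 10.1, 10.4, 9.9, 10.2]
state = classify_state(is_conforming(recent_product, 9.0, 11.0), predictable=False)
print(state)
```

With these assumed inputs the product conforms but the process is unpredictable, so the sketch reports the brink of chaos.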

How can a process achieve the ideal state? Only by satisfying four conditions:
1. The process must remain inherently stable over time.
2. The process must operate in a stable and consistent manner.
3. The process average must be set at the proper level.
4. The natural process spread must not exceed the product's specified tolerance.
Not satisfying any one of these conditions increases the risk of shipping a nonconforming product. When a process fulfills these four conditions, then a consistently conforming product results. The only way to determine that these four conditions apply to your process and subsequently are established and maintained day after day is by using Shewhart's charts.

The threshold state occurs when a process is predictable but produces some nonconforming product. Sorting out nonconforming product is always imperfect and often very costly. The ultimate solution requires a change in either the process or the specifications. If the nonconformity occurs because of an incorrectly set process average, then adjusting the process aim should help. Here Shewhart's charts can determine when to make adjustments. If the nonconformity occurs because the process's natural variation exceeds the specified tolerance, a reduction in the process variation may work. However, because a predictable process performs as consistently as possible, reducing the process variation will require a fundamental process change, which in turn will require evaluation. As a final resort, the specifications themselves could change, with customer approval. Here, too, Shewhart's charts will prove invaluable. They are essential not only in getting any process into the threshold state, but they also are critical in any attempt to move from the threshold to the ideal state.

In the third state, the brink of chaos, processes are unpredictable even though they currently produce a 100-percent conforming product.
While product conformity will lead to benign neglect, process unpredictability will result in periodic rude awakenings. The change from a 100-percent conforming product to some nonconforming product can come at any time and without the slightest warning. Every unpredictable process is subject to the effects of assignable causes, the trouble source for any process. The only way to overcome the unpredictability of a process on the brink of chaos is to eliminate the effects of these assignable causes. This will require the use of Shewhart's charts. The state of chaos exists when an unpredictable process produces some nonconforming product. The process's unpredictable nature will make some days look better than others but will also prevent effective elimination of the nonconforming product. Efforts to correct the problem ultimately will be foiled by the random process changes resulting from assignable causes. Needed process modifications will produce only short-term successes because the assignable causes continue to change the process. With unnecessary modifications, a fortuitous shift by assignable causes may mislead. As a result, companies despair of ever operating the process rationally and begin to speak in terms of magic and art. The only way to move a process out of chaos is to eliminate the effects of assignable causes. This requires the use of Shewhart's charts; no other approach will work consistently. All processes belong to one of these four states, although processes may move from one state to another. In fact, entropy acts on every process, causing it to move toward deterioration and
decay, wear and tear, breakdowns and failures. Because of entropy, every process will naturally and inevitably migrate toward the state of chaos. The only way to overcome this migration is by continually repairing entropy's effects. Because processes in the state of chaos obviously require change, chaos managers inevitably are appointed to drag the process back to the brink of chaos, erroneously considered the "out-of-trouble" state in most operations. Once the process returns to the brink of chaos, the chaos managers leave to work on other problems. As soon as their backs are turned, the process begins to move back down the entropy slide toward chaos. New technologies, process upgrades and other magic bullets can never overcome this cycle of despair. Technologies may change, but the benign neglect that inevitably occurs when the process teeters on the brink of chaos will allow entropy to drag the process back down to the state of chaos. Thus, focusing solely on conformance to specifications will condemn an organization to cycle forever between the two states. Entropy places a process in the cycle of despair, and assignable causes doom it to stay there. Thus, it is important to identify both the effects of entropy and the presence of assignable causes. Shewhart's charts will consistently and reliably provide the necessary information in a clear and understandable form. The traditional chaos-manager approach focuses on conformance to specifications but doesn't attempt to characterize or understand the behavior of a process. Therefore, about the best this approach can achieve is to get the process to operate on the brink of chaos some of the time. This explains why any process operated without Shewhart's charts is doomed to operate in the state of chaos.

Analyzing Data
The problem is not in knowing how to manipulate numbers but rather in not knowing how to interpret them. From the beginning of our education, we have all learned that "two plus two is equal to four." The very definiteness of this phrase summarizes the unequivocal nature of arithmetic. This phrase is used to characterize that which is inevitable, solid and beyond argument. It is the first item in our educational catechism, which is beyond dispute. This bit of arithmetic has been elevated to a cliché for the following reasons. During the years when we were learning our sums and our multiplication tables, we were also learning to spell and to write. This means that we had to learn about irregular spellings. We had to learn to use irregular verbs. And we had to learn to cope with many of the idiosyncrasies of language. In contrast to this, we learned that there are no irregular spellings in arithmetic. Whether you multiply three by two or multiply two by three, the result is always six. Addition, subtraction, multiplication and division contain no irony; they contain no hyperbole. The multiplication tables contain no sarcasm. As a result, we receive a subliminal message: Numbers are concrete, regular and precise, but words are inconstant, vague and changing. The contrast between the regularity (and for some, the sterility) of mathematics and the complexity (and richness) of language leaves us all with an inherent belief that numbers possess some native objectivity that words do not possess. Hence, when we want to indicate a solid and dependable truth, we are prone to recall the first rule in the mathematical catechism: Two plus two is equal to four. Because of this subliminal belief, we feel that we have some sort of control over those things we can measure. If we can express it in numbers, then we have made it objective, and we therefore know that with which we are dealing. 
Moreover, due to all the uncertainty we routinely must deal with, this ability to quantify things is so reassuring, so comforting, that we
gladly embrace measurements as being solid, real and easy to understand. Hence, today we have gone beyond measuring the physical world. We have gone beyond the accounting of wealth. Now we are trying to measure everything. If we can quantify it, then we can deal with it "scientifically." So now we "measure" attitudes, we measure satisfaction, and we measure performance. And once we have measured these things, we feel that we know them objectively, definitively and concretely. But, having obtained these measurements, how do you analyze them? Do the normal rules of arithmetic apply? Unfortunately, all of our mathematical education has not prepared us to properly analyze such measurements. Our very first lessons taught us that two numbers which are not the same are different. So when the numbers differ, we conclude that the things being measured are also different. That this is not so is a fact that seems to have escaped the attention of almost everyone. And when we think the things are different, we tend to rank them and publish a list. For example, a recent article in my local newspaper reported that Nashville and Knoxville were, respectively, the 25th and 27th "most violent cities in the country." This ranking was based on the number of crimes against persons reported to the FBI by the local law enforcement agencies. But just what is entailed in such numbers? Is purse snatching a burglary (a crime against property) or a robbery (a crime against a person)? Is domestic violence reported as an assault or as disturbing the peace? These and other crimes are reported differently in different cities. Finally, even if the crimes were categorized and reported the same way, would the crime rates make the proper comparison? The incorporated portion of Nashville includes all of Davidson County and consists of urban, suburban and rural areas. 
In contrast, only half the population of greater Knoxville lives within the city limits -- the rest live in the unincorporated portions of Knox County. Therefore Knoxville contains a much higher proportion of urban environments than does Nashville. If crime rates are higher in an urban setting, then dividing the number of reported crimes by the city's population will artificially inflate Knoxville's rate compared to that of Nashville. Considerations such as these can raise more than a reasonable doubt about the appropriateness of most of the published rankings we hear about every day. Many comparisons made by those who compile lists are virtually meaningless. The only thing that is worse than the compilation of such rankings is the use of these rankings for business decisions. The problem here is not a problem of arithmetic. It is not a problem of not knowing how to manipulate numbers but rather of not knowing how to interpret them. All the arithmetic, all the algebra, all the geometry, all the trigonometry and all the calculus you have ever had was taught in the world of pure numbers. This world is one where lines have no width, planes have no thickness and points have no dimensions at all. While things work out very nicely in this world of pure numbers, we do not live there. Numbers are not exact in the world in which we live. They always contain variation. As noted above, there is variation in the way numbers are generated. There is variation in the way numbers are collected. There is variation in the way numbers are analyzed. And finally, even if none of the above existed, there would still be variation in the measurement process itself. Thus, without some understanding of all this variation, it is impossible to interpret the numbers of this world. If a manufacturer applies two film coatings to a surface, and if each coating is two microns thick, will the combined thickness of the two coatings be exactly four microns thick?
If we measure with sufficient care and precision, the combined thickness is virtually certain to be some other value than four microns. Thus, when we add one thing that is characterized by the value 2.0 to another thing characterized by the value 2.0, we end up with something which is only equal to four on the average.
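A small simulation makes the film-coating point concrete. It is a sketch under stated assumptions only: each coating is taken to vary normally around its nominal 2.0 microns with a standard deviation of 0.05 microns, a figure invented here for illustration, and the function name is hypothetical.

```python
import random
import statistics


def combined_thicknesses(n, nominal=2.0, sigma=0.05, seed=1):
    """Simulate n surfaces, each given two coatings of the same nominal thickness.

    The normal model and the value of sigma are assumptions made for
    illustration; they do not come from the text.
    """
    rng = random.Random(seed)
    return [rng.gauss(nominal, sigma) + rng.gauss(nominal, sigma) for _ in range(n)]


totals = combined_thicknesses(10_000)
exactly_four = sum(t == 4.0 for t in totals)

print(f"average combined thickness: {statistics.mean(totals):.3f} microns")
print(f"smallest and largest: {min(totals):.3f} to {max(totals):.3f} microns")
print(f"surfaces measuring exactly 4.0: {exactly_four} of {len(totals)}")
```

Under these assumptions the average settles very close to four microns, while the individual totals scatter on either side of it.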

What we see here is not a breakdown in the rules of arithmetic but a shift in what we are doing with numbers. Rather than working with pure numbers, we are now using numbers to characterize something in this world. When we do this, we encounter the problem of variation. In every measurement, and in every count, there is some element of variation. This variation is connected both to the process of obtaining the number and to the variation in the characteristic being quantified. This variation tends to "fuzz" the numbers and undermine all simple attempts to analyze and interpret the numbers. So how, then, should we proceed? How can we use numbers? When we work with numbers in this world, we must first make allowances for the variation that is inherent in those numbers. This is exactly what Shewhart's charts do -- they filter out the routine variation so that we can spot any exceptional values which may be present. (One way of doing this was described in this column last month.) This filtering, this separation of all numbers into "probable noise" and "potential signals," is at the very heart of making sense of data. While it is not good to miss a signal, it is equally bad to interpret noise as if it were a signal. The real trick is to strike an economic balance between these two mistakes, and this is exactly what Shewhart's charts do. They filter out virtually all of the probable noise, so that anything left over may be considered a potential signal. Whether or not you acknowledge variation, it is present in all of the numbers with which you deal each day. If you choose to learn about variation, it will change the way you interpret all data. You will still detect those signals that are of economic importance, but you will not be derailed by noise. If you choose to ignore variation, then for you, two plus two will still be equal to four, and you will continue to be misled by noise.
You will also tend to reveal your choice by the way you talk and by the mistakes you make when you interpret data. Two plus two is only equal to four on the average. The sooner you understand this, the sooner you can begin to use numbers effectively.
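The separation into probable noise and potential signals can be sketched with one common form of Shewhart's chart, the chart for individual values and moving ranges. The passage above does not give the formula, so treat the following as an assumption: the limits are placed at the average plus and minus 2.66 times the average moving range, the usual scaling for this kind of chart. The data and function names are hypothetical.

```python
import statistics


def xmr_limits(values):
    """Return (lower limit, central line, upper limit) for individual values.

    Assumes the customary individuals-chart computation: the average of the
    values plus or minus 2.66 times the average moving range.
    """
    if len(values) < 2:
        raise ValueError("need at least two values to form a moving range")
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = statistics.mean(values)
    half_width = 2.66 * statistics.mean(moving_ranges)
    return center - half_width, center, center + half_width


def split_noise_and_signals(values):
    """Separate values inside the limits (noise) from those outside (signals)."""
    lower, _, upper = xmr_limits(values)
    noise = [v for v in values if lower <= v <= upper]
    signals = [v for v in values if v < lower or v > upper]
    return noise, signals


# Hypothetical daily counts with one exceptional day.
data = [10, 12, 11, 13, 10, 12, 11, 25, 12, 10, 11, 13]
noise, signals = split_noise_and_signals(data)
print("limits:", tuple(round(x, 1) for x in xmr_limits(data)))
print("potential signals:", signals)
```

Only the one exceptional day falls outside the limits here; the day-to-day differences among the remaining values are treated as routine variation and left alone.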
