
Pocket guide to statistical analysis techniques

Pocket guide to statistical analysis techniques for use with tightening tools
Contents
1. Introduction
2. Basic statistics
   2.1 Variation
   2.2 Distribution
   2.3 Histogram
   2.4 Mean value
   2.5 Standard deviation
   2.6 Estimation of a normal distribution
       Sample mean and standard deviation
3. Accuracy requirements
   3.1 Mean shift and combined scatter
       Example
4. Understanding processes
5. Capability
   5.1 Cp
   5.2 Cpk
   5.3 When is a process capable?
   5.4 Machine capability indices
   5.5 What else is there to think about?
6. Control charts
   6.1 X-bar charts
   6.2 The subgroup
   6.3 Alarms
   6.4 Range charts
   6.5 Control charts conclusion
Summary
Appendix
   A1. Example of statistics calculation
   A2. Example of capability calculation
   A3. Example of control chart calculation
   A4. Analysis of assembly tool performance ISO 5393 calculation

1. Introduction
The purpose of this guide is to explain the basics of statistics and how statistics can be used in production. You will learn that with the aid of statistics we can compare tools with each other, we can tell whether a tool is good enough for a specific application, and by using Statistical Process Control (SPC) we can see how a production process develops over time. Our hope is that you, after reading this guide, will have a general knowledge and understanding of the potential of using statistics as a tool in production.


2. Basic statistics
2.1 Variation

Understanding statistics is largely about understanding variation. Variation is present everywhere, in nature as well as in industrial processes. In industrial processes, even a slight deviation from the target value, a dimension for instance, may have a strong influence on the functionality of the finished product. This means it is important to understand, and in some cases control, variation.

There are two different kinds of variation. Random variations are predictable, always present and have many contributing causes. Examples of random variations are small variations in hole diameter, inconsistent friction, operator influence and variations in air pressure. It is hard to isolate any one of these causes. Such variations are tackled by improving the process. Random variations are natural and depend on the process and its environment. They are also called common causes.

Systematic variations are sporadic and isolated. They are not predictable, but it is often easy to pinpoint the cause. They are tackled by controlling the process. Systematic variation has a determined cause and can often be identified and eliminated. Examples are machine adjustments, wear of tools and human error. They are also called special causes.

A great deal of importance has been placed on the use of statistical analysis techniques to control the quality of the assembly process. The traditional method of using these techniques is to analyze what has already occurred and, when a problem is identified, to adjust the process accordingly. It is now becoming increasingly common to use statistical techniques to predict how the process will perform in the future, to identify systematic variations and to adjust the process before we end up with faulty products.
Figure 1. Variations in air pressure and operator influence are examples of random variations.

Figure 2. Human errors like missing washers and using wrong screws are examples of systematic variations.


2.2 Distribution

Consider a tightening process where we measure the torque applied to a bolt. As you know, we would not achieve the same readings for all tightenings. Suppose we collect enough readings to create a plot of the frequency (the number of times a particular reading occurred) against the actual torque readings. The result would be a plot similar to the one in figure 3 below. In statistical analysis this curve is known as a distribution. There are many different types of distribution, but the one that best describes this example (and others like it) is called the Normal or Gaussian distribution. A normal distribution is always symmetrical and determined by the mean and the standard deviation. A normal distribution only occurs when random variations affect the result.
2.3 Histogram

A histogram is created by dividing the results into categories (for example, all results between 20 and 21 Nm) and counting the number of results in each category. Plotting these counts as bars makes it possible to visualize the distribution with a fairly limited number of results.
Figure 3. Histogram.

2.4 Mean value

Figure 4. Normal distributions can be found everywhere. The height of people is one example. Another example can be if you try to cut off sticks to the same length.

A normal distribution can be found everywhere, both in nature and in industrial processes. If we have a big sample of measurements, e.g. 1000 tightenings with one tool, we can make a histogram, and the more tightenings we have, the smoother the curve becomes. If we were to measure the height of all Swedish men, we would obtain an average (mean value) of about 1.80 m. The mean value is the most common value in a normal distribution; there are not that many men who are really tall or really short. Another example is cutting sticks to length. The target value is 20.00 cm and this would probably also be the mean value. However, some parts end up at 19.90 cm and others at 20.10 cm, which is due to the natural variation of the process and is perfectly normal.


2.5 Standard deviation

If a tool is used for a very large number of tightenings at a set torque of e.g. 30 Nm, it is unlikely that every single tightening will reach this torque value exactly. This will be the case even if the tool is run on the same screw joint, such as a test fixture. Random factors, such as material wear and different handling of the tool, may cause the applied torque to exceed or fall below the intended torque. The readings are said to deviate from the mean, and we measure this with what is known as the standard deviation. It is not essential to fully understand the formula, which is presented later, but it is helpful if you know how to calculate it, and it is crucial that you understand what it is. The standard deviation is the amount by which each reading is most likely to deviate from the average.

What is the practical use of the standard deviation? We have already said that the mean tells us the average value of the distribution (all the different tightenings), while the standard deviation indicates the scatter. We can use it to estimate how many of our values will fall within a certain range. The standard deviation may be more accurately described as a calculation of how far a known percentage of the distribution lies from the mean. The Greek letter σ (sigma) is used to symbolize the standard deviation of a distribution. For a business or manufacturing process, the σ value indicates how well that process is performing. A low value indicates that most of the values are close to the target. A high value indicates that the spread is big and that the values deviate more from the target value.

If you have 20 values of a population, you are able to group them as shown in the figure. We make the assumption that they belong to a normal distribution. This is in fact the area within which you will get the next tightening; there is a 100% probability of it falling within the entire range. It is mathematically proven that 68% of the data lies within ±1σ of the mean, 95% within ±2σ, and 99.7% within ±3σ. It is an important characteristic of the normal distribution that the standard deviation is symmetrical around the mean and always covers the same percentage of the distribution. This is a mathematical law.

Figure 5. We always know how many percent of our values we will have within a certain range.
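These percentages can be checked informally with a small simulation. The sketch below is not from the guide; the mean (30 Nm) and σ (1 Nm) are made-up figures, and the script simply counts how many simulated readings fall within ±1σ, ±2σ and ±3σ.

```python
# Illustrative sketch (not from the guide): checking the 68 / 95 / 99.7 % rule
# by simulating torque readings from a normal distribution with made-up
# parameters: mean 30 Nm, standard deviation 1 Nm.
import random

random.seed(1)
mean, sigma, n = 30.0, 1.0, 100_000
readings = [random.gauss(mean, sigma) for _ in range(n)]

for k in (1, 2, 3):
    within = sum(1 for x in readings if abs(x - mean) <= k * sigma)
    print(f"within +/- {k} sigma: {100 * within / n:.1f} %")
# The printed percentages come out close to 68.3 %, 95.4 % and 99.7 %.
```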


This now brings us to something very useful. Now that we know what percentage of the values will end up within a certain boundary, we can predict how the process will behave in the future. Do you remember the discussion about random and systematic variation? We said that for a normal distribution all systematic variations are eliminated and only random variation is present. We now also know that 99.7% of all values lie within 6σ (i.e. ±3σ). This enables us to make an important assumption: even though 0.3% of all tightenings will fall outside the 6σ limits for a normal distribution, we assume that any tightening outside these limits happens because of systematic variation in the process. This means that something new has entered the process and it is no longer under control. To make things clearer: as long as we have tightenings within the 6σ limits, the process is only affected by random variations and is under control. When we have tightenings outside the 6σ limits, the process is affected by systematic variation and is not under control. When this happens, something new and strange has started to affect the tightening process, and we need to find the reason for it and eliminate it. The following graphs show a comparison of two different normal distributions.

Figure 6. The first picture shows two curves with the same average but different deviations. The second picture shows two curves with the same deviation but different averages.


2.6 Estimation of a normal distribution

When we talk about measurements or readings on an application, we can calculate an average and a standard deviation. If we were to measure an infinite number of tightenings, we would know for sure that we had the true value of the mean and the standard deviation. This is the population mean and the population standard deviation. But in reality this is not possible, and we have to rely on a limited number of tightenings. In statistics we talk about a sample; in the tightening business we talk about a subgroup or a batch. This means that we cannot really know for sure that our calculations (mean and standard deviation) are correct, since they are only based on a limited number of tightenings. In fact, what we have is an estimate of the real values. The more tightenings we have on which to base our calculations, the more sure we can be that we are close to the population mean and standard deviation.

We say that the average value of the distribution is the population mean (μ) and the scatter is represented by the population standard deviation (σ). The population mean is calculated by

μ = (Σ xi) / n

i.e. the sum of all tightenings divided by the total number of tightenings (n). The population standard deviation is calculated by

σ = √( Σ (xi − μ)² / n )

where:
xi is the value of each individual occurrence, the i:th measurement of the variable x
n is the total number of occurrences in the population
Σ xi is the value of all occurrences added together (the sum)
Σ (xi − μ)² is the sum of all values of (xi − μ)²

Figure 7. It is impossible to measure the entire population. We have to rely on a limited number of values, a sample or a batch.

We take the value of each individual occurrence minus μ, the mean, and square this new value. Then we add all these new values together and divide by the number of tightenings. Finally, we take the square root of this total, as we have (Nm)² and need Nm, and we get the population standard deviation. The square and the root only exist because we want to get rid of the positive and negative deviations from the mean. However, in practice it is very rare that we can measure every occurrence of the data; n would then have to be infinite, which of course is impossible. Instead we use a representative sample to estimate the mean and standard deviation of the population.
Sample mean and standard deviation

We calculate the sample mean (x̄) in the same way as the population mean (μ):

x̄ = (Σ xi) / n

The calculation for the sample standard deviation (s) differs slightly from the population standard deviation (σ):

s = √( Σ (xi − x̄)² / (n − 1) )

where:
xi is the value of each individual occurrence in the sample
n is the total number of occurrences in the sample
Σ xi is the sum of the values of all occurrences in the sample
Σ (xi − x̄)² is the sum of all values of (xi − x̄)²

The use of (n − 1) instead of (n) gives a more accurate estimate of the population standard deviation, σ, and is particularly important when small sample sizes are used. So remember that we can never use the total population in our calculations; that is impossible. We have to use smaller samples and calculate estimates of the real average and the real standard deviation. Thus, the sample mean (x̄) is an estimate of the population mean (μ), and the sample standard deviation (s) is an estimate of the population standard deviation (σ).
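As a minimal sketch of the two sample statistics defined above (the torque values and function names below are invented for illustration):

```python
# Sketch of the sample mean and sample standard deviation (n - 1 divisor).
# The torque values are made up for illustration.
import math

def sample_mean(values):
    return sum(values) / len(values)

def sample_std(values):
    m = sample_mean(values)
    # Sum of squared deviations from the mean, divided by (n - 1),
    # then the square root to get back from Nm^2 to Nm.
    return math.sqrt(sum((x - m) ** 2 for x in values) / (len(values) - 1))

torques = [30.1, 29.8, 30.3, 29.9, 30.0, 30.2, 29.7, 30.1]
print(round(sample_mean(torques), 3), round(sample_std(torques), 3))
```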

3. Accuracy requirements
In a tightening application there are often accuracy requirements on the tools. Accuracy requirements are written as a target torque ± a maximum acceptable deviation from the target, for example ±10%. The accuracy of a tool is often calculated as half of the natural variation (3σ) divided by the target value. This makes it possible to compare different tools at a certain target value, without relating them to a particular application (tolerances). As you will notice in the next chapter, the accuracy calculations are similar to some capability calculations: in accuracy calculations we compare the natural variation to the mean value, in capability calculations we compare the natural variation to the tolerance demands of the application. If the accuracy requirement is 40 Nm ±10%, we have to check that 3s is within 10% of the target, i.e. that 100 × 3s/Ave is less than 10%. Assume that we test the tool and obtain a mean value of 40 Nm and a standard deviation of 1.2 Nm. We then calculate the accuracy: 100 × (3 × 1.2 / 40) = 9%. We now see that the tool is accurate enough to do the job.
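A small sketch of this accuracy check, using the example's figures (the function name is just for illustration):

```python
# Sketch of the accuracy check described above: the tool meets a +/- 10 %
# requirement if 3s is within 10 % of the target torque.
# Figures match the example: mean 40 Nm, s = 1.2 Nm.
def accuracy_percent(mean, s):
    return 100 * 3 * s / mean

mean, s, requirement = 40.0, 1.2, 10.0
acc = accuracy_percent(mean, s)
print(f"accuracy = {acc:.1f} % -> {'OK' if acc <= requirement else 'not OK'}")
# accuracy = 9.0 % -> OK
```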
3.1 Mean shift and combined scatter

Mean shift is what occurs when you run a tool on both hard and soft joints. You will most probably get two different mean values, typically a higher value on the hard joint, and two different distributions. The difference between these two mean values is the mean shift. We want to find the limits (comparable to those of the normal distribution) within which the probability of getting a torque is 99.7%, whether on the hard or the soft joint. This is the combined scatter and corresponds to 6σ of the normal distribution. Once we have the combined scatter we can relate it to the combined average. This gives us something that is often referred to as the accuracy. Written as a formula it looks like this:

Accuracy = 100 × 0.5 × ((Ave_hard + 3σ_hard) − (Ave_soft − 3σ_soft)) / Ave

where Ave = (Ave_soft + Ave_hard)/2 (the combined average).

Figure 8. The mean shift is the difference between the mean values of the hard and the soft joints.

Figure 9. Combined average and combined scatter.


This is normally true, but we cannot know for sure that the distribution will look like this. We can, for example, have a negative mean shift. We need to check which of the limits are the outermost. Adjusted, the formula looks like this:

Accuracy = 100 × 0.5 × Deviation / Ave

where
Deviation = max(Ave_hard + 3σ_hard, Ave_soft + 3σ_soft) − min(Ave_soft − 3σ_soft, Ave_hard − 3σ_hard)
Ave = (Ave_soft + Ave_hard)/2 (the combined average)
Example:

Tests on a hard joint (30 degrees) and a soft joint (800 degrees) produced the following data.
Hard joint: Ave = 61 Nm and σ = 1.2 Nm
Soft joint: Ave = 60.2 Nm and σ = 1.0 Nm

Deviation = max(61 + 3 × 1.2, 60.2 + 3 × 1.0) − min(61 − 3 × 1.2, 60.2 − 3 × 1.0) = 64.6 − 57.2 = 7.4 Nm
Ave = (61 + 60.2)/2 = 60.6 Nm
Accuracy = 100 × 0.5 × 7.4 / 60.6 = 6.1%

It is hard to give a single estimate of the accuracy of a tool because:
- the accuracy differs between hard, soft and combined applications;
- the accuracy differs depending on whether the tool is used high up in its torque range or in the lower part.
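Assuming the adjusted formula above, the example can be reproduced with a short sketch (the function name is illustrative, not from the guide):

```python
# Sketch of the combined-scatter accuracy formula, using the hard/soft
# joint figures from the example (61 / 60.2 Nm, s = 1.2 / 1.0 Nm).
def combined_accuracy(ave_hard, s_hard, ave_soft, s_soft):
    deviation = (max(ave_hard + 3 * s_hard, ave_soft + 3 * s_soft)
                 - min(ave_hard - 3 * s_hard, ave_soft - 3 * s_soft))
    ave = (ave_hard + ave_soft) / 2          # combined average
    return 100 * 0.5 * deviation / ave

print(round(combined_accuracy(61.0, 1.2, 60.2, 1.0), 1))   # about 6.1
```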


4. Understanding processes
Every organization produces something, whether it be products or activities, and this is done in many different ways. But what all organizations have in common is that the way they work can be described as methods and activities. A process is simply a structured set of activities designed to produce a specified output for a particular customer or market. It has a beginning, an end, and clearly identified inputs and outputs. A process is therefore a structure for action, for how work is done. Within the quality area, the process concept is defined as a set of activities, which are repeated in time, for the purpose of creating value for a customer. As you now understand, the process approach implies adopting the customer point of view. Processes also have performance dimensions, such as cost, time, output quality and customer satisfaction. Bear in mind that all of these dimensions can be measured and improved.

Figure 10. A process is a set of activities designed to produce an output for a customer or a market.

In a modern car plant, the production line is a typical operative process, i.e. it creates value for the person buying the car. Along the line, the cars are assembled with different kinds of nutrunners, all with different functionality, performance and reliability. In the assembly process there are a lot of things that affect the outcome of the tightening: the operators, the screws, the holes and many other things. All this contributes to the total process variation for each application. Remember the discussion about variation in chapter 2. The dimensions with which we measure the performance of the nutrunners are torque and sometimes angle. By using statistics, we can analyze the performance of the process (the tightenings) and we can monitor, control and improve the assembly process. This means, in the long run, more accurate tightenings, better and safer cars and better value for the customers.

Figure 11. Industrial production is an operative process. A lot of things contribute to the process variation.


5. Capability
Earlier in this pocket guide we talked about statistics and accuracy. The accuracy of a tool tells us something about its performance, but this is not enough. The important aspect for our customers is how the tool performs in an application, on the production line. So, somehow we have to relate the accuracy of the tool to the application. Every joint has a target value, but also a tolerance that is acceptable for the customer. By relating the mean and the standard deviation to the target value and the tolerance limits of an application, we can tell how a tool is performing where it really matters: in its application. This is possible thanks to different capability indices. There are many different capability indices, some of them quite simple and some more intricate. This pocket guide deals with the most commonly used ones, the ones our customers use. We know from before that a normal distribution is defined by its mean and its standard deviation. We also remember our assumption that all values, when the process is under control, are within the 6σ limits, although only 99.7% really are. This 6σ is called the process natural variation.
5.1 Cp

The first, and most commonly used, capability index is called Cp. The formula for Cp is:

Cp = tolerance interval / 6σ = (HI − LO) / 6σ

If you look at the formula, you can see that it simply relates the tolerance interval (HI − LO) to the process natural variation. If we have a tool with a big spread, and an application with very high demands (narrow tolerance limits), we get a low Cp value. Conversely, if we have a tool with a very small spread (small σ), but very wide tolerance limits, we get a high Cp. Of course this is what we want, because the smaller the variation in relation to the tolerance limits, the lower the risk of tightenings outside the tolerances. The Cp requirements vary. The most common is that Cp has to be greater than 1.33, which means that 6 times the standard deviation covers no more than 75% of the tolerance interval.

Figure 12. When calculating Cp, the tolerance interval is related to the 6σ.


But is this enough for us to tell if the tool is good or bad for a specific application? Do we need something more? Yes. The Cp does not consider whether the mean of the distribution is close to the target value or not. This index does not guarantee that the distribution lies in the middle of the tolerance interval. In the picture below you can see the same tool on the same application, but before and after torque adjustment. In both cases we would have the same Cp. If we are off target, it is possible that the tightenings are outside one of the tolerance limits, even if the scatter is small in relation to the tolerance interval (high Cp). So we need something more that also relates the distribution to the target value.

Figure 13. High Cp does not guarantee that we are close to the target value.

5.2 Cpk

The Cpk also relates the mean of the distribution to the target value of the application. The way to do this is to divide the distribution and the application into two parts and make one calculation for each side. The formula looks like this:

Cpk = min [(HI − AVE) / 3σ , (AVE − LO) / 3σ]

First we relate the difference between the upper tolerance limit and the average to half the natural variation (3σ). Then we make another calculation, relating the difference between the average and the lower tolerance limit to 3σ. We now have two potentially different values, and the LOWER of the two is the Cpk. If you think this is difficult, just take a few minutes to think about it. If the average is higher than the target value, then the difference between the upper tolerance limit and the average is smaller than the difference between the average and the lower tolerance limit. If this is the case, the upper calculation will give us the Cpk, because we are closer to the upper tolerance limit. What happens to the Cpk if we are right on target? Well, in this case we are as close to the upper tolerance limit as to the lower, and both calculations give the same result. In this case, we can also see that the Cpk has the same value as the Cp.

Figure 14. When calculating Cpk, the target value is also considered.
Figure 15. The relation between Cp and Cpk:
- Cp bad, Cpk bad: process not capable. Change tool or adjust for good accuracy.
- Cp good, Cpk bad: process capable, but the average needs to be adjusted.
- Cp bad, Cpk good: not possible.
- Cp good, Cpk good: process capable and well adjusted.
Now we have introduced the Cp and the Cpk. By studying the formulas it is easy to see that Cp only relates the tolerance interval to the process 6σ, while Cpk also considers the target value. We want both Cp and Cpk to be higher than 1.33. If our average is right on target, Cp and Cpk are the same. The more off target we are, the bigger the difference between Cp and Cpk. Obviously Cpk can never be higher than Cp.
5.3 When is a process capable?

The question of "how good is capable?" has still not been definitively answered. Since Cp was first used, a Cp value of 1.33 has become the most commonly accepted criterion as a lower boundary. The Cpk requirements vary; the most common is that Cpk has to be greater than 1.33. A process that has a Cpk lower than 1.00 is never capable. It is very important that you understand why we use both the Cp and the Cpk. If we only use the Cp, we do not know whether we are on target or not. If we only use the Cpk, we cannot know whether a good or bad Cpk value is due to the centering of the process or to the spread. So we have to use both. Together they give us a very good indication of how well a specific tool is performing in a specific application. They are also the perfect way to compare different tools.

Look at the following dartboards:

The first dartboard shows a poorly centered process, but with a low spread (high accuracy). In this case the Cp is high and the Cpk low. On the second dartboard, the darts are spread randomly around the bull's eye, but the spread is quite large relative to the tolerances. Cp is probably not so good, but since the mean value is on target, the Cpk has the same value as the Cp. The third dartboard shows a well centered process with high accuracy. This means that both the Cp and the Cpk are high; the process is capable.
An example:

Figure 16. Dartboard 1: High Cp and low Cpk. Dartboard 2: Low Cp and low Cpk. Dartboard 3: High Cp and high Cpk

A joint should be tightened to 70 Nm ± 10%. A tool is tested and we get an average of 71 Nm and a σ of 1.2 Nm.

Cp = (77 − 63) / (6 × 1.2) = 1.95
Cpk = min [(77 − 71) / (3 × 1.2) , (71 − 63) / (3 × 1.2)] = min [1.67 , 2.22] = 1.67

Both the Cp and Cpk values are greater than 1.33 and the process is capable and does not need to be adjusted.
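A minimal sketch of the Cp and Cpk formulas, checked against this example (the function names are illustrative; note that the unrounded Cp is 1.94):

```python
# Sketch of the Cp and Cpk formulas, checked against the example above
# (70 Nm +/- 10 %, mean 71 Nm, sigma 1.2 Nm).
def cp(hi, lo, sigma):
    return (hi - lo) / (6 * sigma)

def cpk(hi, lo, mean, sigma):
    return min((hi - mean) / (3 * sigma), (mean - lo) / (3 * sigma))

hi, lo, mean, sigma = 77.0, 63.0, 71.0, 1.2
print(round(cp(hi, lo, sigma), 2), round(cpk(hi, lo, mean, sigma), 2))
# 1.94 1.67  (the guide rounds Cp to 1.95)
```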


5.4 Machine capability indices

As you now know, Cp and Cpk are process capability indices. Everything that affects the process affects these indices. But if we take away all variation affecting the assembly process, except the variation in the tool itself, we get what are called Machine Capability indices. This must be done under very controlled circumstances, preferably in a tool crib. The tests should be carried out on the same joint and by the same operator (or even better, place the tool in a fixture in order to get rid of all the operator influence). The calculations are the same for Cm as for Cp, and the same for Cmk as for Cpk. So remember, Cp and Cpk determine whether the process is capable. The Cm and Cmk determine whether the machine (tool) is capable.
5.5 What else is there to think about?

When you analyze the capability of a tool, the sample size is of great importance in order to obtain reliable mean and standard deviation calculations. A sample size of at least 25 is strongly recommended. And remember that if someone says something like "I have a tool that can always live up to a Cpk demand of 2.0", there are two alternatives: 1. He does not know what he is talking about, because it is meaningless to talk about capability indices without relating the tool performance to an application with customer demands (tolerance limits). 2. He knows what he is talking about and is trying to make the tool look better than it really is.


6. Control charts
We have talked about statistics and accuracy, about processes and capability. Now we are going to learn about control charts. Statistics, tool performance and the production environment (process variation) are important elements in understanding this. The control chart is an important tool within Statistical Process Control. The idea is to repeatedly collect a number of observations (samples) from the process at a certain interval. With the help of these observations (measurements) we calculate some kind of quality indicator and plot it in a diagram. The indicators normally used in the tightening industry are the subgroup mean and/or the subgroup range. Do you remember the difference between special and random variation? If not, do go back and read that section again, because it is very important. If the plotted quality indicator is within the 6σ limits, we say that the process is under statistical control; only random variation affects the tightenings. When we use these limits in control charts, they are called control limits. We also have an ideal level, a target value marked between the control limits, and of course it should be the same as our target value in the assembly process. If some special variation enters the process, it can affect the tightenings in different ways: it can affect the average of the tightenings, the spread, or both.

We have the following requirements on a control chart:
- It should be possible to quickly detect systematic changes in the process, enabling us to find sources of variation.
- It should be easy to use.
- The chance of getting a false alarm should be very small (if we use the 6σ limits as control limits, the chance is 0.3%).
- It should be possible to know when the change started to affect the process.
- It should prove that the process has been under control.
- It should be motivating and constantly bring attention to variations in the process and to quality related issues.


6.1 X-bar charts

First we introduce a control chart for controlling the average level of a certain quantity. It can be the diameter of a bolt, or the torque applied to a joint. It is called the x̄-chart (X-bar chart), and when using it we plot the average of the observations (measurements) in the diagram. At pre-defined intervals we collect a number of measurements, a subgroup, from the process. We then calculate the mean for each subgroup and use this value as our quality indicator. We know that a tightening application can be described as a normal distribution, and that the mean and the standard deviation help us to do that. We also know that all processes vary over time due to different kinds of variation, e.g. material differences, operator influence etc. The 6σ limit makes it possible to tell whether the process variation is due to random or special causes, so the control limits are normally based on the 6σ limits, the natural variation of the process. The procedure for plotting these charts is straightforward: the relevant variable (in our case torque or angle) is measured at regular intervals (maybe once every hour or once a day), and typically a group of 5 consecutive readings is taken each time. When the control limits are set, the x̄ values from each group of readings can be plotted on the chart. When the assembly process is under control (only random variation affects the tightenings), the subgroup averages will spread randomly around the overall mean.


Figure 17. We collect a number of measurements, a subgroup from the process, and plot the averages into the diagram.


6.2 The subgroup

Assume that the quality variable we want to control (in our case the tightening torque) has the average μ and standard deviation σ when the process is under control. Remember that our quality indicator is the subgroup mean, x̄. Ideally the individual measurements and the subgroup averages have the same mean value (see picture). But we can also see that the spread of the individual measurements (σ) is bigger than the spread of the subgroup averages, which in fact is σ/√n, where n is the number of measurements in each subgroup. So the chance of detecting a deviation from μ is greater when we study subgroups instead of individual measurements. The control limits are therefore normally set to (the subgroup 6σ limits):

UCL = μ + 3σ/√n
LCL = μ − 3σ/√n
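A minimal sketch of these subgroup limits, using the figures from the control chart example in appendix A3 (in practice μ and σ are estimated from production data; the function name is illustrative):

```python
# Sketch of the subgroup control limits: UCL/LCL = mu +/- 3*sigma/sqrt(n).
# mu and sigma would in practice be estimated from production data;
# the figures below are taken from the appendix A3 example.
import math

def xbar_limits(mu, sigma, n):
    margin = 3 * sigma / math.sqrt(n)
    return mu - margin, mu + margin          # (LCL, UCL)

lcl, ucl = xbar_limits(mu=15.275, sigma=0.165, n=5)
print(round(lcl, 2), round(ucl, 2))          # about 15.05 and 15.5
```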

Figure 18. The spread between individual measurements is bigger than between subgroup averages.

In practice, μ and σ are not known exactly; they are estimated from sample data, i.e. by x̄ and s, as described in section 2.6.

But how big does the subgroup need to be? If you look at the picture below, you see that as we increase the size of the subgroup (n), the standard deviation of the subgroup mean does not decrease much once we go above 4 or 5. This explains why 4, 5 or 6 are very common choices of subgroup size. Historically, a subgroup of 5 is a very common choice.

Figure 19. Using a subgroup size of 5 is very common in the industry.


6.3 Alarms

Now to the good stuff: what happens if something non-random starts affecting the tightenings? What if the quality of the screws suddenly deteriorates? Maybe it will affect the mean of the subgroups. Maybe it will affect the spread within the subgroups. Maybe the torque applied to the joints will slowly decrease. All of this can now be detected. The beauty of control charts is that the quality engineer, or quite often the operator, can pick up potential problems at an early stage, before we get tightenings outside the tolerances and before faulty assemblies are made. The easiest way to detect that something non-random has started to affect the process is when we get values outside the control limits. This is an ALARM and we have to find out what has happened immediately, before we get tightenings outside the tolerance limits! In the figure to the left you can see what a control chart CAN look like when special variation starts affecting the assembly process. The first two cases show trend alarms; production can continue during the investigation. The fourth case is when the overall mean starts to deviate from the target value. We have to find out why this has happened, but maybe an adjustment of the tool is enough; this depends on the reason for the change.
6.4 Range charts

Figure 20. Examples of what control charts can look like when systematic variation has entered the process.

To control the spread in the process we can use either the standard deviation or the range within the subgroups. The range (R) is the difference between the biggest and the smallest value of each subgroup. The standard deviation is of course based on all values within the subgroup, whereas the range is only based on two. This means that the s-chart is more reliable and gives us more information about the spread. However, the range is easier to calculate and even though we now have very good tools, which calculate everything for us, the R-chart is still the most popular chart to use.


The range R helps us to estimate the spread within the subgroup. This can be done with the aid of different divisors, which can be found in manuals for statistical process control. If we use the average range R̄ as the centerline, the control limits for the control chart will be:

UCL = D4 × R̄
LCL = D3 × R̄

The R-chart indicates how the spread within the subgroups develops. It makes it possible to detect when a systematic change in the process affects the subgroup spread.
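A small sketch of the R-chart limits (the D3 and D4 values below are the commonly tabulated constants for a subgroup size of 5, taken from standard SPC tables; the example ranges are the subgroup ranges of the appendix A3 data):

```python
# Sketch of the R-chart limits: UCL = D4 * Rbar, LCL = D3 * Rbar.
# D3 and D4 depend on the subgroup size; the values below are the commonly
# tabulated constants for subgroups of 5 (from standard SPC tables).
D3, D4 = 0.0, 2.114

def r_chart_limits(r_bar):
    return D3 * r_bar, D4 * r_bar            # (LCL, UCL)

ranges = [0.5, 0.5, 0.2, 0.3]                 # subgroup ranges, appendix A3 data
r_bar = sum(ranges) / len(ranges)             # centerline
print(round(r_bar, 3), r_chart_limits(r_bar))
```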
6.5 Control chart conclusion

The control limits should be based on a large and reliable number of tightenings and they should be re-calculated, using the actual production results, at regular intervals in order to obtain reliable charts. This chapter is only intended as an introduction to Process Control charts and does not cover all aspects of these charts.

Summary
This guide explains the basics of statistics, such as the distribution, the mean value and the standard deviation. It also describes how these can be related to an application through capability calculations. The process can be monitored and controlled by using SPC, and this is also described and explained. This pocket guide does not cover all aspects of statistics or its full potential; it is an introduction to the subject, and if there is a need for further study we recommend the specialist literature. The different product offerings that Atlas Copco can supply to help customers utilise the potential of statistics in production are not explained in this guide. If you need to discuss Atlas Copco's product range, please contact your local Atlas Copco sales representative.


Appendix
A1. Example of basic statistics calculation

The following example will help you to understand the basics of statistics. In this example we compare the torque levels of two different tools. You might obtain the torque values shown below. The target torque is 10 Nm.
Atlas Copco tool: 10, 10.1, 10.2, 9.7, 10.0, 10.2, 10.1, 9.7, 9.8, 10.2
Other tool: 10, 11, 9, 8, 12, 10, 9, 12, 8, 11

Which one of these tools is the most accurate? To answer this, we first calculate the mean value of the two series. The mean value gives us an average of all the values received from the different tightenings, and we use the symbol x̄. The mean value is calculated by adding all tightening data, x, and dividing by the number of tightenings, n:

x̄ = (Σ xi) / n

Mean value
Atlas Copco tool: 10
Other tool: 10


Both tools have a mean value of 10. If one tool had a mean value of 15, we would know that that tool is not as good as the one hitting the target torque. Do both tools have the same accuracy? Accuracy tells us how accurate a tool is, i.e. how well it hits the target. It is the degree to which an indicated value matches the actual value of a measured variable. How do we see the difference? Let us look at the range of the values of the two tools. The range, R, tells us between which values we have received our tightenings, and is calculated as the difference between the highest and the lowest value: R = xmax − xmin.

Range, R
Atlas Copco tool: 0.5
Other tool: 4

With the Atlas Copco tool, our tightening values differ by 0.5 Nm between the highest and the lowest value, while the other tool has a range of 4 Nm. But if you perform 1000 tightenings with the Atlas Copco tool and get one value totally out of the ordinary, e.g. 5, the range for the Atlas Copco tool becomes 5.2. Suddenly the Atlas Copco tool looks like the bad one. We need a measure that reduces the influence of that one tightening.


The standard deviation is a statistical index of variability which describes the scatter: it tells us the typical difference between the value of a specific variable and the mean (which ideally equals the process set point). Let us calculate the deviation from the mean for each value received and sum them up.
Atlas Copco tool (x̄ = 10):
Torque: 10, 10.1, 10.2, 9.7, 10.0, 10.2, 10.1, 9.7, 9.8, 10.2
Deviation (xi − x̄): 0, 0.1, 0.2, −0.3, 0, 0.2, 0.1, −0.3, −0.2, 0.2; sum = 0

Other tool (x̄ = 10):
Torque: 10, 11, 9, 8, 12, 10, 9, 12, 8, 11
Deviation (xi − x̄): 0, 1, −1, −2, 2, 0, −1, 2, −2, 1; sum = 0

The result is 0 for both tools. What is it that causes a problem in this case? We have both positive and negative values. We need to take away the minus, to get the absolute values of each deviation. To mathematically take away the minus, we can square each value.


Atlas Copco tool (x̄ = 10):
Torque: 10, 10.1, 10.2, 9.7, 10.0, 10.2, 10.1, 9.7, 9.8, 10.2
(xi − x̄): 0, 0.1, 0.2, −0.3, 0, 0.2, 0.1, −0.3, −0.2, 0.2; Σ(xi − x̄) = 0
(xi − x̄)²: 0, 0.01, 0.04, 0.09, 0, 0.04, 0.01, 0.09, 0.04, 0.04; Σ(xi − x̄)² = 0.36

Other tool (x̄ = 10):
Torque: 10, 11, 9, 8, 12, 10, 9, 12, 8, 11
(xi − x̄): 0, 1, −1, −2, 2, 0, −1, 2, −2, 1; Σ(xi − x̄) = 0
(xi − x̄)²: 0, 1, 1, 4, 4, 0, 1, 4, 4, 1; Σ(xi − x̄)² = 20

Now we have a value in Nm² to compare with. But what does this value tell us? It tells us something about the deviation, but it also depends on the number of tightenings. What we do is divide this value by the number of tightenings minus one (n − 1) to get an average. Finally, we take the square root of the result to get the value back to Nm.

Sample standard deviation
Atlas Copco tool: √(0.36 / 9) = 0.2 Nm
Other tool: √(20 / 9) ≈ 1.5 Nm

What we have now done is to calculate the sample standard deviation. The standard deviation is a way of measuring how well the tool performs, how close we are to the expected value. Now we can see a clear difference: the Atlas Copco tool has a standard deviation of 0.2 Nm, while the other tool has a standard deviation of about 1.5 Nm. So what this example tells us is that although both tools have the same mean value, the first tool is more accurate. Its tightenings are closer to the target value, and the standard deviation is a way for us to prove this.
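The whole comparison can be reproduced with a short script; this is only a sketch, but it returns the same mean, range and sample standard deviation figures as the tables above (0.2 Nm and about 1.5 Nm).

```python
# Sketch reproducing the comparison above: same mean, different spread.
import math

def mean(vals):
    return sum(vals) / len(vals)

def sample_std(vals):
    m = mean(vals)
    return math.sqrt(sum((x - m) ** 2 for x in vals) / (len(vals) - 1))

atlas = [10, 10.1, 10.2, 9.7, 10.0, 10.2, 10.1, 9.7, 9.8, 10.2]
other = [10, 11, 9, 8, 12, 10, 9, 12, 8, 11]

for name, vals in (("Atlas Copco tool", atlas), ("Other tool", other)):
    print(name, "mean:", round(mean(vals), 2),
          "range:", round(max(vals) - min(vals), 2),
          "s:", round(sample_std(vals), 2))
```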


A2. Example of capability calculation

We know that the capability of a tool describes how the tool performs in a specific application. So what we do when calculating capability indices is to relate the tool's accuracy (mean value and standard deviation) to the demands of the application (target value and tolerance limits). Let us assume that we have an application with a target value of 15 Nm and tolerances of ±8%. This means that the upper tolerance limit is 16.2 Nm and the lower limit is 13.8 Nm. We have collected 20 tightening results from one tool on the production line:
15.4 15.6 15.4 15.1 15.1 15.5 15.0 15.3 15.2 15.1 15.5 15.3 15.4 15.3 15.3 15.1 15.2 15.4 15.1 15.2


It is now easy to calculate the mean value and the standard deviation of the sample:

x̄ = (Σ xi) / n = 15.275 Nm
s = 0.165 Nm

It is now easy to calculate Cp and Cpk:

Cp = (HI − LO) / 6s = (16.2 − 13.8) / (6 × 0.165) = 2.42
Cpk = min [(HI − AVE) / 3s , (AVE − LO) / 3s] = min [(16.2 − 15.275) / (3 × 0.165) , (15.275 − 13.8) / (3 × 0.165)] = min [1.87 , 2.98] = 1.87

Both the Cp and Cpk values are greater than 1.33, so the process is capable and does not necessarily need to be adjusted, even though the average is slightly off target.
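A minimal sketch of this capability calculation, using the 20 collected results and the ±8% tolerances:

```python
# Sketch of the capability calculation above, using the 20 collected
# tightening results and the 15 Nm +/- 8 % tolerances.
import math

results = [15.4, 15.6, 15.4, 15.1, 15.1, 15.5, 15.0, 15.3, 15.2, 15.1,
           15.5, 15.3, 15.4, 15.3, 15.3, 15.1, 15.2, 15.4, 15.1, 15.2]
hi, lo = 15 * 1.08, 15 * 0.92                # 16.2 and 13.8 Nm

ave = sum(results) / len(results)
s = math.sqrt(sum((x - ave) ** 2 for x in results) / (len(results) - 1))

cp = (hi - lo) / (6 * s)
cpk = min((hi - ave) / (3 * s), (ave - lo) / (3 * s))
print(round(ave, 3), round(s, 3), round(cp, 2), round(cpk, 2))
# 15.275 0.165 2.42 1.87
```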
A3. Example of control chart calculation

Now we want to create a control chart from the same tightenings as in the previous example. Let us assume that we are starting up a production process after it has been stopped for some time. Then we do not really know the mean value μ and the standard deviation σ. In order to calculate the control limits for the control chart, the calculations must be based on a reliable number of tightenings. A good rule of thumb is to collect at least 20-25 subgroups before calculating the control limits for a control chart. The reason is that at least 20 subgroups are needed for us to be able to say whether the process is under control or not. However, in this example we have simplified things and collected only 4 subgroups.


Let us assume that we have collected these results on 4 different occasions. We have set the subgroup size to 5, so we have collected 5 results on each occasion:

Day 1: 15.4, 15.6, 15.4, 15.1, 15.1
Day 2: 15.5, 15.0, 15.3, 15.2, 15.1
Day 3: 15.5, 15.3, 15.4, 15.3, 15.3
Day 4: 15.1, 15.2, 15.4, 15.1, 15.2


The first thing we do is to calculate the average for every subgroup:

x̄1 = 15.32
x̄2 = 15.22
x̄3 = 15.36
x̄4 = 15.20

When the production process is under control, the target value is the same as the overall average value. It is easy to calculate the overall average: 15.275. We know from before that the control limits are based on the natural variation between the subgroup averages:

UCL = overall average + 3s/√n = 15.275 + (3 × 0.165 / √5) = 15.275 + 0.22 = 15.50
LCL = overall average − 3s/√n = 15.275 − (3 × 0.165 / √5) = 15.275 − 0.22 = 15.05

Now we can create our control chart. We use the overall average as the centerline and also mark the control limits in the chart. Now we can plot the subgroup averages in the chart. As we can see, they are all within the control limits and the production is under control (even though some individual tightening values are outside the limits; remember that the limits are based on the variation between the subgroup averages, not the individual tightenings). From now on it is easy to plot a new subgroup average in the chart every day. As long as the plotted values are spread randomly around the centerline, the process is under control.
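A small sketch of the control limit calculation for this example (it estimates s from all 20 individual values, as in the previous appendix example):

```python
# Sketch of the control-limit calculation above, based on the four
# subgroups collected in this example.
import math

subgroups = [
    [15.4, 15.6, 15.4, 15.1, 15.1],          # day 1
    [15.5, 15.0, 15.3, 15.2, 15.1],          # day 2
    [15.5, 15.3, 15.4, 15.3, 15.3],          # day 3
    [15.1, 15.2, 15.4, 15.1, 15.2],          # day 4
]
n = len(subgroups[0])

averages = [sum(g) / n for g in subgroups]
grand_mean = sum(averages) / len(averages)

all_values = [x for g in subgroups for x in g]
s = math.sqrt(sum((x - grand_mean) ** 2 for x in all_values) / (len(all_values) - 1))

ucl = grand_mean + 3 * s / math.sqrt(n)
lcl = grand_mean - 3 * s / math.sqrt(n)
print([round(a, 2) for a in averages], round(grand_mean, 3),
      round(lcl, 2), round(ucl, 2))
# [15.32, 15.22, 15.36, 15.2] 15.275 15.05 15.5
```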

Figure 21. The process is under control when the subgroup averages spread randomly around the overall mean.


A4. Analysis of assembly tool performance ISO 5393 Calculation

To allow us to assess the performance of different tools and to compare one tool with another, there is an international standard, ISO 5393, which sets out a basic test procedure and analysis of results. Based on this, many motor vehicle manufacturers have developed their own certification standards. As an example, we will assume we have tested a tool according to the procedure stated in ISO 5393. On a hard joint, with the tool set at its highest torque setting, the following results were obtained (in Nm).
31.5 33.2 32.6 33.7 31.4 32.5 33.1 31.2 33.5 32.6 33.1 31.0 32.3 33.2 32.4 31.5 33.5 33.3 31.5 32.6 31.3 33.7 33.0 31.8 33.0

We now calculate the values required to analyze the tool's tightening accuracy as described in ISO 5393, for the data on the hard joint at the highest torque setting.
Mean torque (x̄) = (31.5 + 33.2 + 32.6 + 33.7 + ... + 33.0) / 25 = 32.5 Nm
Range = 33.7 − 31.0 = 2.7 Nm
Standard deviation (s) = 0.863 Nm
Six sigma (6s) torque scatter = 6 × 0.863 = 5.18 Nm
6s scatter as a percentage of mean torque = (5.18 / 32.5) × 100 = 15.93%


Now let us assume that for the same tool we calculated the following values for the data collected at the other torque settings and joint conditions described in ISO 5393:
- Higher torque setting, soft joint: mean 31.95 Nm, standard deviation 0.795 Nm.
- Lower torque setting, hard joint: mean 23.72 Nm, standard deviation 0.892 Nm.
- Lower torque setting, soft joint: mean 22.87 Nm, standard deviation 0.801 Nm.
We can now make the following calculations for the higher torque setting

a = mean hard joint + 3s hard joint = 32.50 + (3 × 0.863) = 35.09
b = mean soft joint + 3s soft joint = 31.95 + (3 × 0.795) = 34.34
c = mean hard joint − 3s hard joint = 32.50 − (3 × 0.863) = 29.91
d = mean soft joint − 3s soft joint = 31.95 − (3 × 0.795) = 29.56

Combined mean torque = (35.09 + 29.56) / 2 = 32.33 Nm
Mean shift = 32.50 − 31.95 = 0.55 Nm
Combined torque scatter = 35.09 − 29.56 = 5.53 Nm
Combined torque scatter as a % of combined mean = (5.53 / 32.33) × 100 = 17.1%
Lower torque setting

a = mean hard joint + 3s hard joint = 23.72 + (3 × 0.892) = 26.40
b = mean soft joint + 3s soft joint = 22.87 + (3 × 0.801) = 25.27
c = mean hard joint − 3s hard joint = 23.72 − (3 × 0.892) = 21.04
d = mean soft joint − 3s soft joint = 22.87 − (3 × 0.801) = 20.47

Combined mean torque = (26.40 + 20.47) / 2 = 23.44 Nm
Mean shift = 23.72 − 22.87 = 0.85 Nm
Combined torque scatter = 26.40 − 20.47 = 5.93 Nm
Combined torque scatter as a % of combined mean = (5.93 / 23.44) × 100 = 25.3%

The tool capability is 25.3%, as the greatest torque scatter was at the lower torque setting. This particular tool will tighten 99.7% of all practical joints to within 13% of its pre-set torque value (i.e. 99.7% of results will fall within 3s of the mean).
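A small sketch of this ISO 5393-style combined scatter calculation (the function name is illustrative, not from the standard); it reproduces both the higher and the lower torque setting results:

```python
# Sketch of the combined-scatter calculation above, applied to both
# torque settings of the example. The function name is illustrative.
def combined_scatter_percent(mean_hard, s_hard, mean_soft, s_soft):
    upper = max(mean_hard + 3 * s_hard, mean_soft + 3 * s_soft)
    lower = min(mean_hard - 3 * s_hard, mean_soft - 3 * s_soft)
    combined_mean = (upper + lower) / 2
    scatter = upper - lower
    return 100 * scatter / combined_mean

print(round(combined_scatter_percent(32.50, 0.863, 31.95, 0.795), 1))  # about 17.1
print(round(combined_scatter_percent(23.72, 0.892, 22.87, 0.801), 1))  # about 25.3
```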


Atlas Copco Pocket Guides


Air line distribution: 9833 1266 01
Air motors: 9833 9067 01
Drilling with hand-held machines: 9833 8554 01
Grinding: 9833 8641 01
Percussive tools: 9833 1003 01
Pulse tools: 9833 1225 01
Riveting technique: 9833 1124 01
Screwdriving: 9833 1007 01
Statistical analysis technique: 9833 8637 01
The art of ergonomics: 9833 8587 01
Tightening technique: 9833 8648 01
Vibrations in grinders: 9833 9017 01

