
SIMPLE COMPARATIVE EXPERIMENTS AND REVIEW OF SIMPLE STATISTICAL PRINCIPLES

School of Chemical and Bio Engineering


AAiT
STATISTICAL TECHNIQUES
What Do Engineers Do?
 An engineer is someone who solves problems of interest to
society with the efficient application of scientific principles by:
• Refining existing products
• Designing new products or processes
STATISTICAL TECHNIQUES
 The field of statistics deals with the collection, presentation,
analysis, and use of data to:
• Make decisions
• Solve problems
• Design products and processes
 Specifically, statistical techniques can be a powerful
aid
 in designing new products and systems,
 improving existing designs, and
 designing, developing, and improving production processes.
 It is the science of learning information from data
STATISTICAL TECHNIQUES-VARIABILITY
Statistical methods are used to help us describe
and understand variability.
 By variability, we mean that successive
observations of a system or phenomenon do not
produce exactly the same result.
 Statistics gives us a framework for describing this
variability and for learning about which potential
sources of variability are the most important or
which have the greatest impact on the response
variable
STATISTICAL TECHNIQUES-VARIABILITY
 Example:- suppose that an engineer is designing a
nylon connector to be used in an automotive engine
application. The engineer is considering establishing
the design specification on wall thickness at 3/32 inch
but is somewhat uncertain about the effect of this
decision on the connector pull-off force. Eight prototype
units are produced and their pull-off forces measured,
resulting in the following data (in pounds): 12.6, 12.9,
13.4, 12.3, 13.6, 13.5, 12.6, 13.1.
STATISTICAL TECHNIQUES-VARIABILITY
 As we anticipated, not all of the prototypes have the same
pull-off force.
 We say that there is variability in the pull-off force
measurements.
 Because the pull-off force measurements exhibit
variability, we consider the pull-off force to be a random
variable.
 A convenient way to think of a random variable, say X, that represents a measurement, is by using the model

X = µ + ε

where µ is a constant and ε is a random disturbance.
STATISTICAL TECHNIQUES-VARIABILITY
 The µ constant remains the same with every measurement, but small changes in the environment, test equipment, differences in the individual parts themselves, and so forth change the value of ε.
 If there were no disturbances, ε would always equal zero and X would always be equal to the constant µ.
 However, this never happens in the real world, so the
actual measurements X exhibit variability.
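The model X = µ + ε above can be illustrated with a short simulation; the value of µ and the standard deviation of the disturbance below are assumed purely for illustration.

```python
import random

# Sketch: simulate the measurement model X = mu + epsilon.
# mu = 13.0 lb and a disturbance sd of 0.5 lb are ASSUMED values.
random.seed(42)          # fixed seed for reproducibility
mu = 13.0                # the constant part of the model

def measure():
    epsilon = random.gauss(0.0, 0.5)   # random disturbance
    return mu + epsilon

sample = [measure() for _ in range(8)]
# Successive observations differ, i.e., the system exhibits variability.
print(min(sample) != max(sample))   # True
```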

We often need to describe, quantify and ultimately reduce variability.
STATISTICAL TECHNIQUES
 The field of statistical inference consists of those
methods used to make decisions or to draw
conclusions about a population.
 Statistical inference is concerned with making
decisions about a population based on the
information contained in a random sample from that
population
 Statistical inference may be divided into two major
areas:
 Parameter estimation and
 Hypothesis testing.
STATISTICAL TECHNIQUES

Parameter Estimation
 As an example of a parameter estimation problem, suppose that a
structural engineer is analyzing the tensile strength of a
component used in an automobile chassis.
 Since variability in tensile strength is naturally present between
the individual components because of differences in raw material
batches, manufacturing processes, and measurement procedures
(for example), the engineer is interested in estimating the mean
tensile strength of the components.
 In practice, the engineer will use sample data to compute a
number that is in some sense a reasonable value (or guess) of the
true mean. This number is called a point estimate.
i.e. The sample mean is a point estimator of the population mean
and the sample variance is a point estimator of the population
variance.
STATISTICAL TECHNIQUES
Hypothesis Testing
 A statistical hypothesis is a statement about the parameters of one or more populations.
 A procedure leading to a decision about a particular hypothesis is called a test of a hypothesis.
 Now consider a situation in which two different reaction temperatures can be used in a chemical process, say t1 and t2.
 The engineer speculates that t2 results in higher yields than does t1.
 In this case, the hypothesis would be that the mean yield using temperature t2 is greater than the mean yield using temperature t1; statistical hypothesis testing is a framework for solving problems of this type.
 Notice that there is no emphasis on estimating yields; instead, the focus is on drawing conclusions about a stated hypothesis.
 Notice that there is no emphasis on estimating yields; instead,
the focus is on drawing conclusions about a stated hypothesis
STATISTICAL TECHNIQUES-VARIABILITY
Simple comparative experiment
 Example:- recall that an engineer is designing a nylon connector to be used in an automotive engine application, is considering establishing the design specification on wall thickness at 3/32 inch, and is somewhat uncertain about the effect of this decision on the connector pull-off force.
 Eight prototype units were produced and their pull-off forces measured, resulting in the following data (in pounds): 12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1.
STATISTICAL TECHNIQUES-DATA
COMPARISON
Approach 1
 From testing the prototypes, the engineer knows that the average pull-off force is 13.0 pounds.
 He might want to know, for example, whether the mean pull-off force of the 3/32-inch design exceeds the typical maximum load expected to be encountered in this application, say 12.75 pounds.
 One approach would be to test the hypothesis that the mean strength exceeds 12.75 pounds.
 This is called a single-sample hypothesis testing problem.
STATISTICAL TECHNIQUES-DATA
COMPARISON
 However, he thinks that this may be too low for the
intended application, so he decides to consider an
alternative design with a greater wall thickness, 1/8
inch.
 Eight prototypes of this design are built, and the
observed pull-off force measurements are 12.9, 13.7,
12.8, 13.9, 14.2, 13.2, 13.5, and 13.1. The average is 13.4.
STATISTICAL TECHNIQUES-DATA
COMPARISON
Approach 2
 In this simple comparative experiment, the engineer is interested in determining if there is any difference between the 3/32-inch and 1/8-inch designs.
 Another approach would be to compare the mean pull-off force for the 3/32-inch design to the mean pull-off force for the 1/8-inch design using statistical hypothesis testing.
 The averages (13.0 versus 13.4 pounds) give the impression that increasing the wall thickness has led to an increase in pull-off force.
 Clearly, this is an analytic study; it is also an example of a two-sample hypothesis testing problem.
STATISTICAL TECHNIQUES-DATA
COMPARISON
 However, there are some obvious questions to ask. For
instance, how do we know that another sample of
prototypes will not give different results?
 Is a sample of eight prototypes adequate to give reliable
results?
 What risks are associated with the decision to conclude that increasing the wall thickness increases the strength?
 Is it possible that the apparent increase in pull-off force
observed in the thicker prototypes is only due to the inherent
variability in the system and that increasing the thickness of
the part (and its cost) really has no effect on the pull-off
force?
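A minimal sketch of how this two-sample comparison can be quantified: the pooled t statistic computed from the two sets of eight pull-off forces given above. The resulting statistic would be compared against a t table with n1 + n2 - 2 = 14 degrees of freedom.

```python
# Pooled two-sample t statistic for the 3/32-inch vs 1/8-inch designs,
# using the pull-off force data given in these notes.
d1 = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]   # 3/32 inch
d2 = [12.9, 13.7, 12.8, 13.9, 14.2, 13.2, 13.5, 13.1]   # 1/8 inch

def mean_var(y):
    n = len(y)
    ybar = sum(y) / n
    s2 = sum((yi - ybar) ** 2 for yi in y) / (n - 1)    # divisor n - 1
    return ybar, s2

(m1, v1), (m2, v2) = mean_var(d1), mean_var(d2)
n1, n2 = len(d1), len(d2)

sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)   # pooled variance
t0 = (m2 - m1) / (sp2 * (1 / n1 + 1 / n2)) ** 0.5       # test statistic

print(round(m1, 2), round(m2, 2), round(t0, 2))   # 13.0 13.41 1.69
```

Whether t0 = 1.69 is large enough to reject equality of means depends on the chosen significance level; that is exactly the kind of question the hypothesis-testing framework reviewed below answers.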
Review of basic statistical principles
PROBABILITY DISTRIBUTIONS
 The probability structure of a random variable, say y, is described
by its probability distribution.
 If y is discrete, we often call the probability distribution of y, say p(y), the probability function of y.
 If y is continuous, the probability distribution of y, say f(y), is
often called the probability density function for y.
[Figure: discrete and continuous probability distributions]
PROBABILITY DISTRIBUTIONS
 The properties of probability distributions may be summarized quantitatively as follows:

y discrete:    µ = E(y) = Σ y·p(y),     σ² = V(y) = Σ (y − µ)²·p(y)
y continuous:  µ = E(y) = ∫ y·f(y) dy,  σ² = V(y) = ∫ (y − µ)²·f(y) dy
SAMPLING AND SAMPLING
DISTRIBUTIONS
Random Samples, Sample Mean, and Sample
Variance
 The objective of statistical inference is to draw
conclusions about a population using a sample from
that population.
Most of the methods that we will study assume that
random samples are used.
 That is, if the population contains N elements and a sample of n of them is to be selected, and if each of the N!/[(N − n)! n!] possible samples has an equal probability of being chosen, then the procedure employed is called random sampling.
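The count of possible samples and the act of drawing one can be sketched with the standard library; the population below is a made-up illustration.

```python
import math
import random

# Sketch: with N population elements, there are N! / ((N - n)! n!)
# equally likely samples of size n; random.sample draws one of them.
N, n = 10, 3
num_samples = math.factorial(N) // (math.factorial(N - n) * math.factorial(n))
print(num_samples)               # 120 (same as math.comb(10, 3))

population = list(range(N))      # hypothetical population of 10 elements
random.seed(1)
s = random.sample(population, n) # one simple random sample, without replacement
print(len(s))                    # 3
```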
SAMPLING AND SAMPLING
DISTRIBUTIONS
 Statistical inference makes considerable use of quantities
computed from the observations in the sample.
 We define a statistic as any function of the observations
in a sample that does not contain unknown parameters.
 For example, suppose that y1, y2, ..., yn represents a sample. Then the sample mean is

ȳ = (1/n) Σ yi

and the sample variance is

S² = Σ (yi − ȳ)² / (n − 1)

 These quantities are measures of the central tendency and dispersion of the sample, respectively.
 The sample standard deviation S is also used as a measure of dispersion, and n − 1 is called the degrees of freedom.
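The sample mean and sample variance can be computed directly from the eight pull-off forces given earlier in these notes:

```python
# Sample mean, variance, and standard deviation of the pull-off data;
# note the divisor n - 1 (the degrees of freedom) in the variance.
y = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]
n = len(y)

ybar = sum(y) / n                          # sample mean
ss = sum((yi - ybar) ** 2 for yi in y)     # sum of squares about the mean
s2 = ss / (n - 1)                          # sample variance
s = s2 ** 0.5                              # sample standard deviation

print(round(ybar, 4))   # 13.0
print(round(s2, 4))     # 0.2286
```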
SAMPLING AND SAMPLING
DISTRIBUTIONS
Properties of the Sample Mean and Variance
 The sample mean ȳ is a point estimator of the population mean µ, and the sample variance S² is a point estimator of the population variance σ².

 In general, an estimator of an unknown parameter is a statistic that corresponds to that parameter.
 A particular numerical value of an estimator, computed from sample data, is called an estimate.
SAMPLING AND SAMPLING
DISTRIBUTIONS
 There are several properties required of good point estimators. Two of the most important are the following:
 The point estimator should be unbiased. That is, the long-run average or expected value of the point estimator should be the parameter that is being estimated.
 An unbiased estimator should have minimum variance. This property states that the minimum variance point estimator has a variance that is smaller than the variance of any other estimator of that parameter.
DEGREES OF FREEDOM
 The quantity n - 1 is called the number of degrees of
freedom of the sum of squares.

 This is a very general result; that is, if y is a random variable with variance σ² and SS has ν degrees of freedom, then

E(SS/ν) = σ²
THE NORMAL AND OTHER SAMPLING
DISTRIBUTIONS
 Often we are able to determine the probability
distribution of a particular statistic if we know
the probability distribution of the population
from which the sample was drawn.
 The probability distribution of a statistic is called a
sampling distribution.

Normal distribution
THE NORMAL SAMPLING DISTRIBUTIONS
 One of the most important sampling distributions is the
normal distribution.
 If Y is a normal random variable, the probability distribution of Y is

f(y) = (1/(σ√(2π))) e^(−(y − µ)²/(2σ²)),   −∞ < y < ∞

where −∞ < µ < ∞ is the mean and σ² > 0 is the variance.
 Many important sampling distributions may also be
defined in terms of normal random variables.
 We often use the notation Y ~ N(µ, σ²) to denote that Y is distributed normally with mean µ and variance σ².
STANDARD NORMAL DISTRIBUTION
 An important special case of the normal distribution is the standard normal distribution; i.e., µ = 0 and σ² = 1.
 We see that if y ~ N(µ, σ²), the random variable

Z = (y − µ)/σ   (standardizing the normal random variable y)

follows the standard normal distribution, denoted Z ~ N(0, 1).
 The cumulative standard normal distribution is tabulated in the Appendix of most statistics texts.
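Standardization, and the cumulative standard normal distribution Φ, can be sketched without tables: Φ can be written in terms of the error function available in the standard library. The µ and σ values below are assumed for illustration.

```python
import math

# Standardizing: if y ~ N(mu, sigma^2), then z = (y - mu)/sigma ~ N(0, 1).
# Phi(z), the standard normal CDF, expressed via the error function.
def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = 13.0, 0.5        # ASSUMED values, for illustration only
y = 13.75
z = (y - mu) / sigma         # standardized value
print(round(z, 2))           # 1.5
print(round(phi(1.96), 3))   # 0.975 (the familiar 97.5th percentile)
```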
CHI-SQUARE DISTRIBUTION
Chi-square
 An important sampling distribution that can be
defined in terms of normal random variables is the
chi-square or X2 distribution.
 If Z1, Z2, ..., Zk are normally and independently distributed random variables with mean 0 and variance 1, abbreviated NID(0, 1), then the random variable

χ²k = Z1² + Z2² + ... + Zk²

follows the chi-square distribution with k degrees of freedom. The density function of chi-square is

f(x) = x^(k/2 − 1) e^(−x/2) / (2^(k/2) Γ(k/2)),   x > 0
CHI-SQUARE DISTRIBUTION
 The distribution is asymmetric, or skewed, with mean k and variance 2k, respectively.
 Percentage points of the chi-square distribution are given in tables.
[Figure: several chi-square distributions]
T DISTRIBUTION

 If Z and χ²k are independent standard normal and chi-square random variables, respectively, the random variable

tk = Z / √(χ²k / k)

follows the t distribution with k degrees of freedom, denoted tk. The density function of t is

f(t) = [Γ((k + 1)/2) / (√(kπ) Γ(k/2))] · [1 + t²/k]^(−(k + 1)/2),   −∞ < t < ∞

and the mean and variance of t are µ = 0 and σ² = k/(k − 2) for k > 2, respectively.
T DISTRIBUTION

 Note that if k = ∞, the t distribution becomes the standard normal distribution.
 The percentage points of the t distribution are given in tables.
[Figure: several t distributions]
T DISTRIBUTION

 If y1, y2, ..., yn is a random sample from the N(µ, σ²) distribution, then the quantity

t = (ȳ − µ) / (S/√n)

is distributed as t with n − 1 degrees of freedom.
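This quantity can be computed directly for the connector example: using the eight pull-off forces and the value µ0 = 12.75 lb from the single-sample problem stated earlier, the statistic below would be compared against a t table with 7 degrees of freedom.

```python
# One-sample t statistic t0 = (ybar - mu0) / (S / sqrt(n)) for the
# pull-off data, with mu0 = 12.75 lb taken from these notes.
y = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]
n = len(y)
mu0 = 12.75

ybar = sum(y) / n
s = (sum((yi - ybar) ** 2 for yi in y) / (n - 1)) ** 0.5
t0 = (ybar - mu0) / (s / n ** 0.5)   # t with n - 1 = 7 degrees of freedom

print(round(t0, 2))   # 1.48 (compare against a t table with 7 d.f.)
```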
F DISTRIBUTION
 If χ²u and χ²v are two independent chi-square random variables with u and v degrees of freedom, respectively, then the ratio

Fu,v = (χ²u / u) / (χ²v / v)

follows the F distribution with u numerator degrees of freedom and v denominator degrees of freedom.
F DISTRIBUTION
 If x is an F random variable with u numerator and v denominator degrees of freedom, then the probability distribution of x is

h(x) = [Γ((u + v)/2) (u/v)^(u/2) / (Γ(u/2) Γ(v/2))] · x^(u/2 − 1) / [1 + (u/v)x]^((u + v)/2),   0 < x < ∞
F DISTRIBUTION
 This distribution is very important in the statistical analysis of designed experiments.
 Percentage points of the F distribution are given in tables.
[Figure: several F distributions]
HYPOTHESIS TESTING
Inferences About the Differences in Means and Variances: Randomized Designs
The important parametric tests are:
(1) Z-test;
(2) t-test;
(3) Chi-square test
(4) F-test.
All these tests are based on the assumption of
normality i.e., the source of data is considered to
be normally distributed
HYPOTHESIS TESTING
How can data from a simple comparative experiment be analyzed using hypothesis testing and confidence interval procedures for comparing two treatment means in a completely randomized experimental design?
HYPOTHESIS TESTING
 Statisticians have developed several tests of
hypotheses (also known as the tests of significance)
for the purpose of testing of hypotheses which can
be classified as:
(a) Parametric tests or standard tests of hypotheses; and
(b) Non-parametric tests or distribution-free test of
hypotheses.
 Parametric tests usually assume certain properties
of the parent population from which we draw
samples. Assumptions like observations come from
a normal population, sample size is large,
assumptions about the population parameters like
mean, variance, etc., must hold good before
parametric tests can be used.
HYPOTHESIS TESTING
 But there are situations when the researcher
cannot or does not want to make such
assumptions. In such situations we use statistical
methods for testing hypotheses which are called
non-parametric tests because such tests do not
depend on any assumption about the parameters
of the parent population.
HYPOTHESIS TESTING
 Statistical Hypothesis: A statistical hypothesis is a
statement either about the parameters of a
probability distribution or the parameters of a model.

 H0 is called the null hypothesis, and H1 is called the alternative hypothesis.
HYPOTHESIS TESTING
 For example, suppose that we are interested in the burning rate
of a solid propellant used to power aircrew escape systems. Now
burning rate is a random variable that can be described by a
probability distribution. Suppose that our interest focuses on the
mean burning rate (a parameter of this distribution). Specifically,
we are interested in deciding whether or not the mean burning
rate is 50 centimeters per second. We may express this formally as

H0: µ = 50 centimeters per second
H1: µ ≠ 50 centimeters per second
HYPOTHESIS TESTING
 In some situations, we may wish to formulate a one-sided alternative hypothesis, as in

H0: µ = 50 centimeters per second
H1: µ > 50 centimeters per second   (or H1: µ < 50 centimeters per second)

It is important to remember that hypotheses are always statements about the population or distribution under study, not statements about the sample.
The value of the population parameter specified in the
null hypothesis (50 centimeters per second in the above
example) is usually determined in one of three ways.
HYPOTHESIS TESTING
 First: It may result from past experience or knowledge of
the process, or even from previous tests or experiments. The
objective of hypothesis testing then is usually to determine
whether the parameter value has changed.

 Second: this value may be determined from some theory or model regarding the process under study. Here the objective of hypothesis testing is to verify the theory or model.
 Third: this situation arises when the value of the population parameter results from external considerations, such as design or engineering specifications, or from contractual obligations. In this situation, the usual objective of hypothesis testing is conformance testing.
HYPOTHESIS TESTING
 To illustrate the general concepts, consider the
propellant burning rate problem introduced earlier. The
null hypothesis is that the mean burning rate is 50
centimeters per second, and the alternate is that it is
not equal to 50 centimeters per second. That is, we wish to test

H0: µ = 50 centimeters per second   versus   H1: µ ≠ 50 centimeters per second
HYPOTHESIS TESTING
 The sample mean can take on many different values.
 Suppose that if 48.5<X<51.5 we will not reject the null hypothesis Ho, and if either
X<48.5 or X >51.5, we will reject the null hypothesis Ho in favor of the alternative
hypothesis H1.

 The values of X that are less than 48.5 and greater than 51.5 constitute the
critical region for the test, while all values that are in the interval form a region
for which we will fail to reject the null hypothesis. By convention, this is usually
called the acceptance region. The boundaries between the critical regions and
the acceptance region are called the critical values. In our example the critical
values are 48.5 and 51.5.
 It is customary to state conclusions relative to the null hypothesis H0. Therefore,
we reject H0 in favor of H1 if the test statistic falls in the critical region and fail to
reject H0 otherwise.
HYPOTHESIS TESTING
 This decision procedure can lead to either of two wrong
conclusions.
 For example, the true mean burning rate of the propellant
could be equal to 50 centimeters per second. However, for
the randomly selected propellant specimens that are tested,
we could observe a value of the test statistic X that falls
into the critical region.
 We would then reject the null hypothesis H0 in favor of the
alternate when, in fact, H0 is really true. This type of wrong
conclusion is called a type I error.
 Now suppose that the true mean burning rate is different
from 50 centimeters per second, yet the sample mean X
falls in the acceptance region. In this case we would fail to
reject H0 when it is false. This type of wrong conclusion is
called a type II error.
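Both error probabilities can be computed for the propellant example using the acceptance region 48.5 to 51.5 given above. The values σ = 2.5 cm/s and n = 10 below are assumptions for illustration; they are not stated in these notes.

```python
import math

# Sketch: type I and type II error probabilities for the acceptance
# region 48.5 < xbar < 51.5, with ASSUMED sigma = 2.5 and n = 10.
def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

sigma, n = 2.5, 10           # assumed values, for illustration only
se = sigma / math.sqrt(n)    # standard error of the sample mean

# alpha = P(reject H0 | true mu = 50): both tails outside the region
alpha = phi((48.5 - 50) / se) + (1 - phi((51.5 - 50) / se))

# beta = P(fail to reject H0 | true mu = 52): mass inside the region
beta = phi((51.5 - 52) / se) - phi((48.5 - 52) / se)

print(round(alpha, 4))   # ~0.0578
print(round(beta, 4))    # ~0.2635
```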
HYPOTHESIS TESTING

 Because our decision is based on random variables, probabilities can be associated with the type I and type II errors in Table 9-1.
 The probability of making a type I error is denoted by the Greek letter α. That is,

α = P(type I error) = P(reject H0 | H0 is true)
HYPOTHESIS TESTING
 In evaluating a hypothesis-testing procedure, it is also important to examine the probability of a type II error, which we will denote by β:

β = P(type II error) = P(fail to reject H0 | H0 is false)
HYPOTHESIS TESTING
 Generally, the analyst controls the type I error
probability when he or she selects the critical
values.
 Thus, it is usually easy for the analyst to set the type
I error probability at (or near) any desired value.
 Since the analyst can directly control the probability
of wrongly rejecting H0, we always think of rejection
of the null hypothesis H0 as a strong conclusion.
HYPOTHESIS TESTING
TESTS ON THE MEAN OF A NORMAL
DISTRIBUTION, VARIANCE KNOWN
TESTS ON THE MEAN OF A NORMAL
DISTRIBUTION, VARIANCE UNKNOWN
HYPOTHESIS TESTS ON THE VARIANCE
AND STANDARD DEVIATION OF A
NORMAL POPULATION
INFERENCES ABOUT THE DIFFERENCES IN MEANS: PAIRED COMPARISON DESIGNS

INFERENCE FOR A DIFFERENCE IN MEANS OF TWO NORMAL DISTRIBUTIONS, VARIANCES KNOWN – Z test

INFERENCE FOR THE DIFFERENCE IN MEANS OF TWO NORMAL DISTRIBUTIONS, VARIANCES UNKNOWN – t test
PROCEDURE FOR HYPOTHESIS
TESTING
 To test a hypothesis means to tell (on the basis of the
data the researcher has collected) whether or not the
hypothesis seems to be valid.

 In hypothesis testing the main question is: whether to accept the null hypothesis or not to accept the null hypothesis?
 Procedure for hypothesis testing refers to all those steps that we undertake for making a choice between the two actions, i.e., rejection and acceptance of a null hypothesis.
PROCEDURE FOR HYPOTHESIS
TESTING
PROCEDURE
 1. From the problem context, identify the parameter
of interest.
 2. State the null hypothesis, H0.
 3. Specify an appropriate alternative hypothesis, H1.
 4. Choose a significance level α.

Steps 1–4 should be completed prior to examination of the sample data.
PROCEDURE FOR HYPOTHESIS
TESTING
 5. Determine an appropriate test statistic to be used
(such as Z test or t test).
 6. State the rejection region for the statistic i.e.
specify the location of the critical region (two-tailed,
upper-tailed, or lower-tailed).
 7. Compute any necessary sample quantities,
substitute these into the equation for the test
statistic, and compute that value.
 8. Decide whether or not H0 should be rejected and
report that in the problem context.
(Specify the criteria for rejection (typically, the value of α, or the P-value at which rejection should occur).)
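The eight steps can be sketched end-to-end for a one-sample Z test. Everything in the block below (the data, µ0, σ, and α) is hypothetical, chosen only to walk through the procedure.

```python
import math

# Hedged walkthrough of steps 1-8 with HYPOTHETICAL illustration values.
def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Steps 1-4: parameter of interest is the mean mu;
# H0: mu = 100, H1: mu != 100; significance level alpha = 0.05.
mu0, sigma, alpha = 100.0, 4.0, 0.05

# Steps 5-6: test statistic Z0 = (ybar - mu0)/(sigma/sqrt(n));
# two-tailed rejection region |Z0| > z_{alpha/2} = 1.96.
y = [103.2, 98.7, 101.5, 104.1, 99.8, 102.6]   # hypothetical sample
n = len(y)
ybar = sum(y) / n

# Step 7: compute the value of the statistic.
z0 = (ybar - mu0) / (sigma / math.sqrt(n))

# Step 8: decide and report in the problem context.
p_value = 2 * (1 - phi(abs(z0)))
reject = abs(z0) > 1.96
print(round(z0, 2), round(p_value, 3), reject)
```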
PROCEDURE FOR HYPOTHESIS
TESTING
 (i) Making a formal statement: The step consists in
making a formal statement of the null hypothesis (H0)
and also of the alternative hypothesis (Ha). This means
that hypotheses should be clearly stated, considering
the nature of the research problem.
 Example:- Mr. Mohan of the Civil Engineering Department wants to test the load bearing capacity of an old bridge, which must be more than 10 tons. In that case he can state his hypotheses as under:

H0: µ = 10 tons
Ha: µ > 10 tons
PROCEDURE FOR HYPOTHESIS
TESTING
 (ii) Selecting a significance level: The hypotheses are tested
on a pre-determined level of significance and as such the
same should be specified. Generally, in practice, either 5%
level or 1% level is adopted for the purpose.
 The factors that affect the level of significance are:
(a) the magnitude of the difference between sample
means;
(b) the size of the samples;
(c) the variability of measurements within samples; and
(d) whether the hypothesis is directional or non-
directional (A directional hypothesis is one which
predicts the direction of the difference between, say,
means).
 In brief, the level of significance must be adequate in the
context of the purpose and nature of enquiry.
PROCEDURE FOR HYPOTHESIS
TESTING
 (iii) Deciding the distribution to use: After deciding the
level of significance, the next step in hypothesis
testing is to determine the appropriate sampling
distribution.
 The choice generally remains between normal
distribution and the t-distribution.
 The rules for selecting the correct distribution are
similar to those which we have stated earlier in the
context of estimation.
PROCEDURE FOR HYPOTHESIS
TESTING
 (iv) Selecting a random sample and computing an
appropriate value:
 Another step is to select a random sample(s) and
compute an appropriate value from the sample data
concerning the test statistic utilizing the relevant
distribution.
 In other words, draw a sample to furnish empirical
data.
PROCEDURE FOR HYPOTHESIS
TESTING

 (v) Calculation of the probability: One has then to calculate the probability that the sample result would diverge as widely as it has from expectations, if the null hypothesis were in fact true.
 (vi) Comparing the probability: Yet another step consists in comparing the probability thus calculated with the specified value for α, the significance level.
INFERENCES ABOUT THE VARIANCES OF NORMAL
DISTRIBUTIONS
P VALUE
 One way to report the results of a hypothesis test is to state that the null hypothesis was or was not rejected at a specified α value or level of significance.
 For example, in the propellant problem above, we can say
that H0:µ=50 was rejected at the 0.05 level of significance.
This statement of conclusions is often inadequate because it
gives the decision maker no idea about whether the computed
value of the test statistic was just barely in the rejection region
or whether it was very far into this region.
 Furthermore, stating the results this way imposes the
predefined level of significance on other users of the
information.
 This approach may be unsatisfactory because some decision makers might be uncomfortable with the risks implied by α = 0.05.
P VALUE
 To avoid these difficulties the P-value approach has
been adopted widely in practice.
 The P-value is the probability that the test statistic will
take on a value that is at least as extreme as the
observed value of the statistic when the null hypothesis
H0 is true.
 Thus, a P-value conveys much information about the
weight of evidence against H0, and so a decision maker
can draw a conclusion at any specified level of
significance.
The P-value is the smallest level of significance
that would lead to rejection of the null hypothesis
H0 with the given data.
P VALUE
H0 would be rejected at any level of significance α ≥ P-value.

 It is customary to call the test statistic (and the data) significant when the null hypothesis H0 is rejected; therefore, we may think of the P-value as the smallest level at which the data are significant.
 Once the P-value is known, the decision maker can
determine how significant the data are without the data
analyst formally imposing a preselected level of significance
 It is not always easy to compute the exact P-value for a test.
However, most modern computer programs for statistical
analysis report P-values, and they can be obtained on some
hand-held calculators.
P VALUE
 Finally, if the P-value approach is used, the final step of the hypothesis-testing procedure can be modified.
 Specifically, it is not necessary to state explicitly the critical region.
 For the foregoing normal distribution tests it is relatively easy to compute the P-value. If Z0 is the computed value of the test statistic, the P-value is

P = 2[1 − Φ(|Z0|)]   for a two-tailed test (H1: µ ≠ µ0)
P = 1 − Φ(Z0)        for an upper-tailed test (H1: µ > µ0)
P = Φ(Z0)            for a lower-tailed test (H1: µ < µ0)

Here, Φ(z) is the standard normal cumulative distribution function. Recall that Φ(z) = P(Z ≤ z), where Z is N(0, 1).
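These three P-value formulas can be sketched directly, with Φ written via the standard-library error function:

```python
import math

# P-value formulas for the normal-distribution tests, with
# Phi(z) = 0.5 * (1 + erf(z / sqrt(2))).
def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_value(z0, alternative):
    if alternative == "two-sided":     # H1: mu != mu0
        return 2.0 * (1.0 - phi(abs(z0)))
    if alternative == "greater":       # H1: mu > mu0
        return 1.0 - phi(z0)
    return phi(z0)                     # H1: mu < mu0

print(round(p_value(1.96, "two-sided"), 3))   # 0.05
print(round(p_value(1.645, "greater"), 3))    # 0.05
```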
P VALUE
 If the level of significance α is greater than or equal to the P-value, then there is strong evidence to conclude that H0 is not true (reject the null hypothesis).
If α ≥ P-value --------------- Reject H0
If α < P-value --------------- Fail to reject H0
CONFIDENCE INTERVAL
 Although hypothesis testing is a useful procedure, it
sometimes does not tell the entire story.
 It is often preferable to provide an interval within
which the value of the parameter or parameters in
question would be expected to lie.
 These interval statements are called confidence
intervals.
 An interval estimate for a population parameter is
called a confidence interval.
CONFIDENCE INTERVAL
A confidence interval estimate for µ is an interval of
the form l< µ<u, where the endpoints l and u are
computed from the sample data.
The end-points or bounds l and u are called the lower- and upper-confidence limits, respectively, and 1 − α is called the confidence coefficient.
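When σ is known, the endpoints are l, u = ȳ ∓ z_{α/2}·σ/√n. A minimal sketch, with illustrative (assumed) numbers:

```python
import math

# Sketch: a 95% confidence interval on mu with sigma known.
# ybar, sigma, and n below are ASSUMED numbers for illustration.
ybar, sigma, n = 13.0, 0.5, 8
z_alpha2 = 1.96                    # z_{0.025} for a 95% interval

half = z_alpha2 * sigma / math.sqrt(n)   # half-width of the interval
l, u = ybar - half, ybar + half
print(round(l, 3), round(u, 3))    # 12.654 13.346
```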
EXAMPLE 2.1
 The breaking strength of a fiber is required to be at least 150 psi. Past experience has indicated that the standard deviation of breaking strength is σ = 3 psi. A random sample of four specimens is tested, and the results are Y1 = 145, Y2 = 153, Y3 = 150, and Y4 = 147.
(a) State the hypotheses that you think should be tested in this experiment.
(b) Test these hypotheses using α = 0.05. What are your conclusions?
(c) Find the P-value for the test in part (b).
(d) Construct a 95 percent confidence interval on the
mean breaking strength.
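A hedged sketch of one way to work Example 2.1: taking H0: µ = 150 versus H1: µ < 150 (one reasonable setup, since the concern is strength falling below the requirement), with σ = 3 known.

```python
import math

# Sketch of Example 2.1: one-sample Z test with sigma known.
# The choice H1: mu < 150 is one reasonable setup, not the only one.
def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

y = [145, 153, 150, 147]
sigma, mu0, n = 3.0, 150.0, len(y)

ybar = sum(y) / n                            # 148.75
z0 = (ybar - mu0) / (sigma / math.sqrt(n))   # test statistic
p = phi(z0)                                  # lower-tailed P-value

half = 1.96 * sigma / math.sqrt(n)           # 95% CI half-width
print(round(z0, 3), round(p, 3))             # -0.833 0.202
print(round(ybar - half, 2), round(ybar + half, 2))   # 145.81 151.69
```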
EXAMPLE 2.2
 The viscosity of a liquid detergent is supposed to
average 800 centistokes at 25°C. A random sample of
16 batches of detergent is collected, and the average
viscosity is 812. Suppose we know that the standard
deviation of viscosity is σ = 25 centistokes.
(a) State the hypotheses that should be tested.
(b) Test these hypotheses using α = 0.05. What are your
conclusions?
(c) What is the P-value for the test?
(d) Find a 95 percent confidence interval on the mean.
EXAMPLE 2-5
 The shelf life of a carbonated beverage is
of interest. Ten bottles are randomly
selected and tested, and the following
results are obtained:
 (a) We would like to demonstrate that the
mean shelf life exceeds 120 days. Set up
appropriate hypotheses for investigating
this claim.
 (b) Test these hypotheses using α = 0.01.
What are your conclusions?
 (c) Find the P-value for the test in part
(b).
 (d) Construct a 99 percent confidence interval on the mean shelf life.
EXAMPLE 2-7
 The time to repair an electronic instrument is a
normally distributed random variable measured in
hours. The repair times for 16 such instruments chosen
at random are as follows:
(a) You wish to know if the mean repair
time exceeds 225 hours. Set up
appropriate hypotheses for investigating
this issue.
(b) Test the hypotheses you formulated in
part (a). What are your conclusions? Use α = 0.05.
(c) Find the P-value for the test.
(d) Construct a 95 percent confidence
interval on mean repair time
EXAMPLE 2.9
 Two machines are used for filling plastic bottles with a net volume of 16.0
ounces. The filling processes can be assumed to be normal, with standard
deviations of σ1 = 0.015 and σ2 = 0.018. The quality engineering department
suspects that both machines fill to the same net volume, whether or not this
volume is 16.0 ounces. An experiment is performed by taking a random
sample from the output of each machine.

(a) State the hypotheses that should


be tested in this experiment.
(b) Test these hypotheses using α = 0.05. What are your conclusions?
(c) Find the P-value for this test.
(d) Find a 95 percent confidence interval on the difference in mean fill volume for the two machines.
EXAMPLE
 The following are the burning times of chemical flares of
two different formulations. The design engineers are
interested in both the means and variance of the burning
times.
(a) Test the hypotheses that the two variances are equal. Use α = 0.05.
(b) Using the results of (a), test the hypotheses that the mean burning times are equal. Use α = 0.05. What is the P-value for this test?
EXAMPLE
 A new filtering device is installed in a chemical unit.
Before its installation, a random sample yielded the following information about the percentage of impurity: ȳ1 = 12.5, S1² = 101.17, and n1 = 8. After installation, a random sample yielded ȳ2 = 10.2, S2² = 94.73, n2 = 9.
(a) Can you conclude that the two variances are equal? Use α = 0.05.
(b) Has the filtering device reduced the percentage of impurity significantly? Use α = 0.05.
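The test statistics for this example can be sketched from the summary numbers given above; the resulting F0 and t0 would then be compared against F and t tables (the decisions themselves depend on the tabulated critical values).

```python
# Sketch for the filtering-device example: F statistic for equality of
# variances, then the pooled t statistic for the difference in means.
ybar1, s1sq, n1 = 12.5, 101.17, 8
ybar2, s2sq, n2 = 10.2, 94.73, 9

f0 = s1sq / s2sq                                           # F, 7 and 8 d.f.
sp2 = ((n1 - 1) * s1sq + (n2 - 1) * s2sq) / (n1 + n2 - 2)  # pooled variance
t0 = (ybar1 - ybar2) / (sp2 * (1 / n1 + 1 / n2)) ** 0.5    # t, 15 d.f.

print(round(f0, 3))   # 1.068 (compare with an F table, 7 and 8 d.f.)
print(round(t0, 3))   # 0.479 (compare with a t table, 15 d.f.)
```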
Understanding Mechanistic & Empirical Models

• A mechanistic model is built from our underlying knowledge of the basic physical
mechanism that relates several variables.
Example: Ohm’s Law

1-3 Mechanistic & Empirical Models

Current = voltage/resistance
I = E/R
I = E/R + ε   (adding a term ε for random disturbance)
• The form of the function is known.

© John Wiley & Sons, Inc. Applied Statistics and Probability for Engineers, by Montgomery and Runger.
Mechanistic and Empirical Models

An empirical model is built from our engineering and scientific knowledge of the phenomenon, but is not directly developed from our theoretical or first-principles understanding of the underlying mechanism.
The form of the function is not known a priori.
An Example of an Empirical Model
• We are interested in the number average molecular weight (Mn) of a polymer. Now we know that Mn is related to the viscosity of the material (V), and it also depends on the amount of catalyst (C) and the temperature (T) in the polymerization reactor when the material is manufactured. The relationship between Mn and these variables is

Mn = f(V, C, T)

say, where the form of the function f is unknown.

• We estimate the model from experimental data to be of the following form, where the b’s are unknown parameters:

Mn = b0 + b1V + b2C + b3T

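Estimating the b's by least squares can be sketched in a deliberately simplified form: one predictor only (the full model above has V, C, and T), with made-up (x, y) data purely for illustration.

```python
# Simplified sketch of least-squares estimation with ONE predictor;
# the x and y values are made up purely for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]      # roughly y = 2x

n = len(xs)
xbar = sum(xs) / n
ybar = sum(ys) / n

sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
sxx = sum((x - xbar) ** 2 for x in xs)

b1 = sxy / sxx            # slope estimate
b0 = ybar - b1 * xbar     # intercept estimate
print(round(b0, 3), round(b1, 3))   # 0.05 1.99
```

With several predictors, the same least-squares idea leads to the normal equations solved by any regression routine.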
Another Example of an Empirical Model
• In a semiconductor manufacturing plant, the finished semiconductor is wire-bonded to a frame. In an observational study, the variables recorded were:
• Pull strength to break the bond (y)
• Wire length (x1)
• Die height (x2)
• The data recorded are shown on the next slide.
Table 1-2 Wire Bond Pull Strength Data

Empirical Model That Was Developed

In general, this type of empirical model is called a regression model.

The estimated regression relationship is given by:

Pull strength = 2.26 + 2.74(wire length) + 0.0125(die height)
