
DOE-Exercise PILOT PLANT (Frac Fac 2^(4-1))

Organic synthesis of semi-carbazone from glyoxylic acid in a pilot plant

Background
The organic synthesis of semi-carbazone from glyoxylic acid is a key step in the synthesis of
azuracil (a cytostatic anti-cancer drug). The objective of this study was to investigate the
best operating conditions for a pilot plant synthesizing semi-carbazone. A fractional factorial
design in four factors was constructed and three responses were measured (we use two
here). The aims of this experimental protocol were to obtain a high yield of semi-carbazone
and high purity. Two center points have been added to the original design.

Objective
This exercise demonstrates what is possible with a fractional factorial design (res IV) and
which findings may need a follow-up. In the exercise you will:
• Investigate how to detect and solve problems with significant but confounded
interactions and square effects using the Analysis Wizard and its tools.
• Interpret and communicate a possible result.
• Understand the difference between the presentation tools Contour, Sweet Spot and
Design Space plots.

Data

Copyright Sartorius Stedim Data Analytics AB, 20-04-20 Page 1 (8)


Tasks
Task 1
Set up the investigation in MODDE and choose a fractional factorial design of resolution IV.
This means that the two-factor interactions will be confounded. Produce a list showing which
interactions are confounded with each other. The factor precision (left at default values) will
not be used in this investigation.

Task 2
Use the Analysis wizard to work through the responses.
Use the “Interaction test” and “Square test” functions to see if the model(s) need interaction
and/or square terms.

Task 3
Show graphically the part of the experimental space that should be chosen for a series of
verifying experiments in the pilot plant (specify levels for the variables). Goal: High Yield and
High Purity. Consider Addition Time as a factor that contributes to higher cost in the
production.
Hint: Use and compare the Contour, Sweet Spot and Design Space plots on the Home tab.

Task 4
Which method is commonly used to separate confounding effects between two-factor
interactions?



Solutions to Pilot Plant

Task 1
On the Design tab, click Confoundings to show the list of interactions that are confounded.
Below, we can see the confounding pattern. The problem is that we cannot be sure which of
the confounded interaction terms is important when we get a significant coefficient.
(Note: a model including all confounded terms cannot be fitted with MLR, since the
confounded terms are perfectly (1:1) correlated in the current design.)
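
The confounding pattern can also be derived by hand. The following is a minimal sketch, not MODDE output. It assumes the half fraction is built with the generator D = A*B*C, and the letters A to D are placeholders, since the mapping to the four real factors depends on the order in which they were entered.

```python
from itertools import combinations, product

FACTORS = ["A", "B", "C", "D"]  # placeholder labels, not the real factor names


def half_fraction():
    """2^(4-1) design: full factorial in A, B, C with D = A*B*C (assumed generator)."""
    return [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]


def interaction_column(design, i, j):
    """Element-wise product of two factor columns."""
    return tuple(run[i] * run[j] for run in design)


def confounded_pairs(design):
    """Group two-factor interactions whose columns are identical."""
    groups = {}
    for i, j in combinations(range(4), 2):
        name = FACTORS[i] + "*" + FACTORS[j]
        groups.setdefault(interaction_column(design, i, j), []).append(name)
    return sorted(groups.values())


aliases = confounded_pairs(half_fraction())
for group in aliases:
    print(" = ".join(group))
```

Each printed line is one pair of interactions that share the same design column, which is why only one term from each pair can be fitted.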

Task 2
Response 1 (Yield)



The summary plot indicates that this is a bad model. Why?
Note that Model validity seems OK despite a missing interaction. With only two
replicates, the model validity test is quite unreliable.

In this case we have a linear model to start with and it is often true that interaction terms have
to be added to produce better models.
Use the Interaction test in the wizard:



The test shows that there is an interaction between factors “Addition Time” and
“Temperature” (low probability that the term is equal to zero). This interaction is confounded
with the interaction between “Stirring” and “Water”. From a statistical perspective it does not
matter which interaction we select; the other will automatically become unavailable in the
dialog. The selection should nevertheless be motivated, either by process knowledge or by other
evidence (e.g. large main effects for the factors involved). The choice of term will have a
profound influence on the model interpretation.
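
Why the choice cannot be settled statistically can be illustrated with a small sketch. The response below is synthetic and noise-free (it is not the pilot plant data), the generator D = A*B*C is an assumption, and A to D are placeholder letters. The response is built from two aliased interactions with made-up effects of 3 and 2.

```python
from itertools import product

# Half fraction with D = A*B*C (assumed generator; letters are placeholders).
design = [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]

# Synthetic response for illustration only: hypothetical true effects.
TRUE_AB, TRUE_CD = 3.0, 2.0
y = [50.0 + TRUE_AB * a * b + TRUE_CD * c * d for a, b, c, d in design]


def coefficient(column, response):
    """Least-squares coefficient for one column of an orthogonal +/-1 design."""
    return sum(x * r for x, r in zip(column, response)) / sum(x * x for x in column)


ab = [a * b for a, b, c, d in design]
cd = [c * d for a, b, c, d in design]

coef_ab = coefficient(ab, y)  # estimate if A*B is put in the model
coef_cd = coefficient(cd, y)  # estimate if C*D is put in the model instead
print(coef_ab, coef_cd)
```

Both estimates equal the sum of the two underlying effects, so the data cannot tell us how that sum is split between the two interactions.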

Although they have low contribution to the model, the linear terms of Stirring and Water are
kept in the model, but can also be removed. It is a good procedure to document all linear
contributions and later use a fully tuned model for predictions.

Response 2 (Purity)



The histogram indicates two main groups of data. This is typically caused by one dominant
factor (the two groups correspond to its two levels).
The square test is highlighted, indicating that one or several factors have a non-linear
influence on the response.

The square test gives a list of possible square terms to add to the model.

Here we have chosen to add a square term in temperature to the model. That is an educated
guess: in chemistry, temperature often has a non-linear effect on results. From a theoretical
point of view, however, any of the factors (one or several) may cause the non-linearity. To
sort this out we have to augment the design with new experiments so that there are three
levels for the factors (RSM design).
We have chosen to add one square term. This improves the model. The two linear terms
AddT and Stirr are kept in the model but can also be excluded.
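
The reason the square term cannot be assigned to a specific factor can be seen from the design columns. A minimal sketch, again assuming the generator D = A*B*C and placeholder letters, with two center points as in this exercise:

```python
from itertools import product

# Eight corner runs of the half fraction (assumed generator D = A*B*C).
corners = [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]
# Two center points, as in this exercise (10 runs in total).
center_points = [(0, 0, 0, 0)] * 2
design = corners + center_points

# One squared column per factor.
square_columns = [tuple(run[k] ** 2 for run in design) for k in range(4)]

# With only the levels -1, 0, +1 at corners and center, every squared
# column is 1 on the corners and 0 at the center.
all_identical = len(set(square_columns)) == 1
print(square_columns[0])
print(all_identical)
```

Since all four squared columns coincide, the design can detect curvature but cannot say which factor causes it.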



Task 3
We can produce contour plots with addition time and temperature factors on the axes.
Amount of water added is set to its center level (because it has a negative effect on Yield and
a positive one on Purity). Stirring is also set to the center level because it has a small effect on
both responses.

A sweet spot plot shows where the criteria for the two responses are fulfilled.



The Design Space plot shows how to set factor levels to achieve safe results (the plot shows
the probability of failure). The default limit is set to 1%. Note: When creating the Design Space
plot, the Confidence uncertainty interval was used; this is reasonable given the small size
(10 experiments) of the experimental design.

Comparing the Sweet Spot plot and the Design Space plot, it is obvious that the allowable
factor ranges are smaller in the Design Space plot. In order to identify a region for verifying
experiments we compare the Contour plot and the Design Space plot. The Contour plot shows
the dynamics and average levels of the predictions. The Design Space plot gives the risk of
failing to comply with the specifications. For temperature the range between 35 and
55 seems to be the most interesting, and for addition time the range above 1.9 h. Note that
the Yield model has an unresolved interaction confounding and the Purity model an
unresolved square confounding.
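
The difference between the two plots can be illustrated with a simplified sketch. It is not MODDE's actual calculation. It assumes a single response with a lower specification limit and a normally distributed prediction uncertainty, and all numbers (limit, standard error, predicted means) are hypothetical.

```python
from statistics import NormalDist

LOWER_LIMIT = 90.0   # hypothetical specification: response must be >= 90
STD_ERROR = 2.0      # hypothetical uncertainty of the prediction
FAILURE_LIMIT = 0.01  # 1% acceptance limit, as in the Design Space plot


def in_sweet_spot(predicted_mean):
    """Sweet spot logic: only the predicted mean is compared to the limit."""
    return predicted_mean >= LOWER_LIMIT


def probability_of_failure(predicted_mean):
    """Chance of falling below the limit, assuming normal uncertainty."""
    return NormalDist(predicted_mean, STD_ERROR).cdf(LOWER_LIMIT)


# Two hypothetical factor settings: one just inside the limit, one well inside.
edge_mean, safe_mean = 90.5, 96.0
p_edge = probability_of_failure(edge_mean)
p_safe = probability_of_failure(safe_mean)

for mean, p in ((edge_mean, p_edge), (safe_mean, p_safe)):
    print(mean, in_sweet_spot(mean), p <= FAILURE_LIMIT)
```

Both settings lie in the sweet spot, but only the second one also satisfies the 1% failure limit. This is the mechanism behind the smaller allowable region in the Design Space plot.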

Task 4
One common method used to unconfound two-factor interactions is called FOLD-OVER. It
is also possible to use D-Optimal functionality to augment designs in a more targeted way
(i.e. add a few experiments to resolve a specific confounding).
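
A minimal sketch of a fold-over follows, assuming the generator D = A*B*C and placeholder letters A to D. The variant shown reverses the sign of one factor only. For this particular generator, reversing all signs at once would reproduce the same eight runs and resolve nothing, which the last check illustrates.

```python
from itertools import product

# Original half fraction (assumed generator D = A*B*C).
original = [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]

# Fold-over on factor A: repeat every run with the sign of A reversed.
folded = [(-a, b, c, d) for a, b, c, d in original]
combined = original + folded


def interaction(design, i, j):
    """Element-wise product of two factor columns."""
    return [run[i] * run[j] for run in design]


# A*B and C*D share one column before the fold-over, but not afterwards.
aliased_before = interaction(original, 0, 1) == interaction(original, 2, 3)
aliased_after = interaction(combined, 0, 1) == interaction(combined, 2, 3)

# The 16 combined runs form the full 2^4 factorial.
is_full_factorial = len(set(combined)) == 16

# Reversing all signs gives the same fraction again under this generator.
mirror = [tuple(-x for x in run) for run in original]
mirror_is_same_fraction = set(mirror) == set(original)

print(aliased_before, aliased_after, is_full_factorial, mirror_is_same_fraction)
```

The fold-over doubles the number of corner runs. A D-Optimal augmentation aims to resolve one specific confounding with fewer additional experiments.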

Conclusions
In order to accomplish high yield and high purity, a factor combination of addition time 1.9 h,
water 137.5 ml/mol and temperature 45 °C looks appropriate. This setpoint should be verified
with additional experiments or, if possible, with model-resolving experiments. The last factor,
stirring time, may be set at a convenient level. Unfortunately, it is in the most expensive region
(high temperature and long addition time) that we predict a result within specifications with
high confidence. If the conclusions had been based on the Sweet Spot plot, the decision could
well have been quite different, motivated by the wish for a fast and economical process. A factor
combination of addition time 1.3 h, water 137.5 ml/mol and temperature 30 °C will probably
fail in > 30% of the attempts. Taking probability analysis into account may give remarkably
different conclusions.
Original literature reference: J-C Vallejos, Diss. IPSOI, Marseille 1978.

