
2018-11

Introduction to Randomistic Optimization

Hugo Hernandez
ForsChem Research, 050030 Medellin, Colombia
hugo.hernandez@forschem.org

doi: 10.13140/RG.2.2.30110.18246

Abstract

Optimization problems are ubiquitous in many different fields of science and engineering, as
well as in our daily lives. And so are uncertainty and randomness. Mathematically speaking, an
optimization problem can only be solved if it is deterministic. Thus, the solution of any
optimization problem involving any type of uncertainty or randomness requires the correct
transformation of the problem into a solvable deterministic optimization problem. The purpose
of this report is to show that it is possible to formulate a general type of optimization problem
(with or without uncertainty or randomness), denoted as a randomistic optimization problem.
It is also shown that such a randomistic problem can successfully be translated into a solvable
deterministic optimization problem. That general formulation can be regarded as the
randomistic optimization approach, which uses the concept of randomistic variables for
integrating both the deterministic and the stochastic worlds. The present work is intended to
provide only an introduction to the topic, explaining some fundamental concepts and
presenting a selection of examples to illustrate the basics of the method. One of the examples
is of particular interest because it shows how to estimate the parameters of a model based on
Taguchi’s robust signal-to-noise maximization approach. For the particular case of linear
models, robust regression provides different values of the parameters compared to the least-
squares minimization regression.

Keywords

Chance-constrained, Deterministic, Optimization, Probability distributions, Randomistic, Randomness, Regression, Robust, Taguchi, Uncertainty


1. Introduction

Randomistic variables are variables that can be measured several times under identical
conditions.[1] Thus, randomistic variables can show either a deterministic (the result is always
the same) or a random (the result may be different) behavior. Different possible values for the
outcome of a randomistic variable are the result of an incomplete set of controlled
conditions, that is, of not controlling relevant factors in the measurement. Such
uncontrolled relevant factors can be interpreted as missing information. Each uncontrolled
relevant factor provides one additional dimension of randomness. From that point of view, a
deterministic variable can be considered as a randomistic variable with zero dimensions of
randomness.[2]

On the other hand, randomistic variables can be characterized by means of the raw moments
of the probability distribution of all possible outcomes, where the moments are themselves
randomistic variables. The moments can be considered as a fingerprint of the probability distribution.
Normally, the moments and their variations present a deterministic behavior if the effect of all
uncontrolled factors was successfully captured in the data. Otherwise, moving or drifting
random behavior may appear,[3] leading to random moments and variations. This is the case,
for example, when the uncontrolled conditions are changing with time. In that case, the
characterization of the randomistic variable is incomplete.

As can be seen, randomistic behavior is the more general condition of any system under
observation. In fact, the randomistic generalization merges the predictable behavior of
deterministic variables with the stochastic behavior of random variables. In particular,
randomistic optimization is presented in this report as an application example of such
integration, considering both conventional deterministic optimization as well as all types of
stochastic optimization problems.

2. Randomistic Optimization Problem Formulation

An optimization problem, in general, consists in finding the best values of a certain set of
decision variables that optimize (maximize or minimize) a certain objective function. When
the function to be optimized is a function of random variables, then the optimization problem
is ambiguous. This can be seen in Figure 1, where two different values of the decision variable
($x_1$ and $x_2$) may each lead to a wide range of possible values of the objective function ($f(X)$).
If the objective function must be maximized, what is the best value for the decision variable
between the two alternatives presented? It can be observed that for $x_1$ it is possible to reach
the highest values of the objective function. However, the worst values can also be obtained,
and on average, the results achieved for $x_2$ seem better.


Figure 1. Sample of results obtained when evaluating a random objective function ($f(X)$) for
two different values of the decision variable ($x_1$ and $x_2$).

2.1. Randomistic Objective Functions

Because of the inherent variability of a random objective function, it is not possible to
satisfactorily solve an optimization problem for such a random function. Instead, the solution
of the optimization problem requires a deterministic function. Several different approaches
are used to resolve this issue, including:[4] considering the expected or average value of the
random objective function, considering the variance (or the standard deviation) of the random
objective function, or considering the maximum or minimum value of the random objective
function.

In general, it is possible to generalize the requirement on the objective function by expressing
it as a function of the moment operators ($M_k$) of randomistic variables ($X$), which are defined as:[1]

$$M_k(X) = E(X^k)$$
(2.1)

where $E(\cdot)$ is the expected value operator and $k \in \mathbb{R}$.


This definition covers all previous approaches, including the expected value ($M_1(X)$), the
variance ($M_2(X) - M_1^2(X)$), and the maximum or minimum value of the distribution, assuming
that they can be expressed (for real positive numbers)§ as:

$$\max(X) = \lim_{k \to \infty}\left(M_k(X)\right)^{1/k}$$
(2.2)
$$\min(X) = \lim_{k \to -\infty}\left(M_k(X)\right)^{1/k}$$
(2.3)

where $k \in \mathbb{R}$. Particularly, a fair estimate of the maximum and minimum values of the
distribution can be obtained by selecting a sufficiently large finite value of $|k|$.

Thus, the generalized objective function can be expressed as:

$$F = f\left(M_{\vec{k}}(\vec{X})\right)$$
(2.4)

Please notice the following:

- Several different random variables can be involved in the determination of the objective function, represented by the vector $\vec{X}$.
- Different values of $k$ can be considered in the function. The set of values of $k$ considered is denoted by the vector $\vec{k}$.
- $k$ can take any real value; thus, it is not limited to natural numbers.
- Any nonlinear function of the moments can be used as objective function. This also allows considering desirability functions used for solving multi-objective optimization problems.

§ When only real negative values are involved, the following expressions can be used, which are also
functions of the moments of $X$:
$$\max(X) = -\lim_{k \to -\infty}\left(M_k(-X)\right)^{1/k}$$
$$\min(X) = -\lim_{k \to \infty}\left(M_k(-X)\right)^{1/k}$$
On the other hand, if both positive and negative real values are possible, then the following expressions
must be used:
$$\max(X) = \begin{cases} \lim_{k \to \infty}\left(M_k(X \mid X > 0)\right)^{1/k}, & P(X > 0) > 0 \\ -\lim_{k \to -\infty}\left(M_k(-X \mid X < 0)\right)^{1/k}, & P(X > 0) = 0 \end{cases}$$
$$\min(X) = \begin{cases} -\lim_{k \to \infty}\left(M_k(-X \mid X < 0)\right)^{1/k}, & P(X < 0) > 0 \\ \lim_{k \to -\infty}\left(M_k(X \mid X > 0)\right)^{1/k}, & P(X < 0) = 0 \end{cases}$$


- If all randomistic variables are deterministic, $\vec{X} = \vec{x}$, then $M_k(X_i) = E(X_i^k) = x_i^k$. Thus, $f(M_{\vec{k}}(\vec{X})) = f(\vec{x})$, which corresponds to the conventional deterministic optimization problem.
- If all randomistic variables are assumed to be fixed, that is, if the effect of all uncontrolled factors is assumed to be successfully captured by their distributions, then the moments of the variables can be described by deterministic values.
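
To make the moment operators concrete, the following short numerical sketch (in Python; an illustration added here, with all names and values purely hypothetical) estimates Eq. (2.1) from a sample of realizations and approximates Eq. (2.2) and (2.3) with a large finite $|k|$:

import numpy as np

def moment(x, k):
    # Sample estimate of the moment operator M_k(X) = E(X^k), Eq. (2.1)
    return np.mean(np.asarray(x, dtype=float) ** k)

rng = np.random.default_rng(0)
x = rng.uniform(1.0, 3.0, size=100000)    # a positive-valued randomistic variable

mean_x = moment(x, 1)                     # expected value, M_1(X)
var_x = moment(x, 2) - moment(x, 1)**2    # variance, M_2(X) - M_1(X)^2
k = 200.0                                 # sufficiently large |k|
max_x = moment(x, k)**(1.0/k)             # Eq. (2.2): tends to max(X) as k grows
min_x = moment(x, -k)**(-1.0/k)           # Eq. (2.3): tends to min(X)
print(mean_x, var_x, max_x, min_x)        # approx. 2.0, 0.33, ~3, ~1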

2.2. Randomistic Decision Variables

The objective function for a randomistic optimization problem is a function of the moments of
a set of randomistic variables. Therefore, it is expected that those randomistic variables
represent the set of decision variables. However, since randomistic variables do not necessarily
have a unique value, the whole probability distribution of each randomistic variable should be
considered as a decision variable. In general, since the probability distribution of a randomistic
variable is characterized by the moments of the distribution, it is the (deterministic) moments
that are ultimately the decision variables of the optimization problem. Now, there is an
infinite number of moments for a single randomistic variable, so this would seem an
intractable problem. However, there are two possible approaches:

1) Non-parametric approach: Only the moments involved in the determination of the objective function (and/or constraints) are considered as decision variables.

2) Parametric approach: A certain parametric probability distribution function is assumed for each randomistic variable (e.g. normal, uniform, etc.), and then the finite set of parameters of the distributions becomes the decision variables of the optimization problem. Since the moments can be expressed as a function of the parameters of the assumed distribution, the objective function becomes a function of the distribution parameters.

In this work, the set of randomistic decision variables (whether it is composed of distribution
moments, distribution parameters, or a combination of both) will be denoted as $\vec{\phi}$. Therefore,
the objective function will be:

$$F = f(\vec{\phi})$$
(2.5)

In the case of a deterministic variable $x$, the decision variable can be expressed as the first
moment or expected value of $x$, and thus, $\phi = M_1(x) = E(x) = x$.
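
As a brief sketch of the two approaches (an added illustration; the numerical values are hypothetical): in the non-parametric approach the required moments are themselves the decision variables, whereas in the parametric approach an assumed distribution maps its parameters to those moments. For an assumed normal variable, the standard raw-moment relations give:

def normal_raw_moments(mu, sigma):
    # First two raw moments of an assumed normal randomistic variable
    m1 = mu
    m2 = mu**2 + sigma**2
    return m1, m2

# Non-parametric approach: phi collects only the moments entering f and g
phi_nonparametric = {"M1": 2.0, "M2": 4.49}

# Parametric approach: phi collects the distribution parameters instead,
# and the moments needed by f and g are derived from them
phi_parametric = (2.0, 0.7)                    # (mu, sigma)
m1, m2 = normal_raw_moments(*phi_parametric)   # M1 = 2.0, M2 = 4.49
print(m2 - m1**2)                              # recovers the variance sigma**2 = 0.49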


2.3. Randomistic Constraints

Finally, it is important to consider the constraints imposed on the optimization problem. Three
general types of constraints are possible, depending on the nature of the restriction:

- Constraints on the possible values taken by the decision variables.

These constraints can be expressed in general as follows:

$$\phi_i \in \Phi_i$$
(2.6)

where $\phi_i$ is the $i$-th element of the vector $\vec{\phi}$, and $\Phi_i$ represents a set of possible
values for $\phi_i$. $\Phi_i$ can be either a finite set of values (equality constraints), or a finite set
of ranges of values (inequality constraints), or a combination of both.

- Constraints on deterministic functions of the decision variables.

It may be possible that certain deterministic functions of the decision variables should
meet some predefined conditions. Those conditions can be either equality (Eq. 2.7) or
inequality constraints (Eq. 2.8).

$$g_{eq}(\vec{\phi}) = 0$$
(2.7)
$$g_{ineq}(\vec{\phi}) \le 0$$
(2.8)

The right-hand side of these equations is set to zero because any other possible value
can always be subtracted from the left side, resulting in a suitable function of the
decision variables. Similarly, inequality constraint functions greater than or equal to
zero can be multiplied by $-1$, yielding a constraint function less than or equal to zero.

- Constraints on random functions of the randomistic variables.

It is also possible that the predefined conditions involve random functions of the
randomistic variables $\vec{X}$. In that case, the conditions are expressed on the probability of
such a random function of $\vec{X}$ satisfying certain events. In general, this type of constraint
can be expressed as:

$$P\left(h(\vec{X}) \le h^*\right) \ge p_{min}$$
(2.9)

where $P(\cdot)$ denotes the probability of an event, the event in this case being $h(\vec{X}) \le h^*$,
with $h(\vec{X})$ a random function of the randomistic variables $\vec{X}$, and $p_{min}$ is the minimum


probability required for that event. The probability of such an event can be expressed in
terms of the cumulative probability distribution of the resulting function $h(\vec{X})$ as:

$$P\left(h(\vec{X}) \le h^*\right) = F_h(h^*) = \int_{-\infty}^{h^*} \rho_h(\eta)\, d\eta$$
(2.10)

where $\rho_h$ represents the probability density function of $h(\vec{X})$ and $\eta$ is a particular
realization of $h(\vec{X})$. Using the change of variable theorem,[5] it is possible to express
$\rho_h$ as a function of the probability distributions of the randomistic variables $\vec{X}$.
Furthermore, since those probability distributions can be determined from the decision
variables (either from the distribution parameters, or from the distribution moments [6]),
then ultimately Eq. (2.9) becomes:

$$g(\vec{\phi}) = p_{min} - P\left(h(\vec{X}) \le h^*\right) \le 0$$
(2.11)

whose form is equivalent to that of constraint (2.8). This means that constraints on
random functions of the randomistic variables $\vec{X}$ can be represented by constraints on
deterministic functions of the decision variables $\vec{\phi}$.

All these types of constraints can be represented by a single type of randomistic constraint,
given in a general form by:

$$p_{min} \le P\left(h_{min} \le h(\vec{X}) \le h_{max}\right) \le p_{max}$$
(2.12)

where $h_{min}$ and $h_{max}$ are the required boundaries on the randomistic function $h(\vec{X})$ of
randomistic variables $\vec{X}$, such that $h_{min} \le h_{max}$. On the other hand, $p_{min}$ and $p_{max}$
are the permitted boundaries on the probability of such an event, where $0 \le p_{min} \le p_{max} \le 1$.

Please notice that Eq. (2.12) represents a deterministic constraint on the event when
$p_{min} = 1$ or $p_{max} = 0$. Furthermore, if $p_{min} = p_{max}$, the event can be
represented by an equality constraint. If $p_{min} < p_{max}$, but $p_{min} = 0$ or $p_{max} = 1$, then the
event is represented by a single inequality constraint. Otherwise, the event is represented by two
inequality constraints.

The probability of the event considered in Eq. (2.12) can be expressed in terms of the
cumulative probability distribution of $h(\vec{X})$ as:

$$P\left(h_{min} \le h(\vec{X}) \le h_{max}\right) = F_h\left(h_{max}^+\right) - F_h\left(h_{min}^-\right)$$
(2.13)


where $F_h$ represents the cumulative probability distribution function of $h(\vec{X})$, and the
superscripts in $h_{max}^+$ and $h_{min}^-$ indicate that the function approaches $h_{max}$ from the right and
$h_{min}$ from the left, respectively.

Again, considering that the cumulative distributions can be determined from the values of the
decision variables $\vec{\phi}$, Eq. (2.12) can be expressed as two deterministic inequality
constraints:

$$g_1(\vec{\phi}) = p_{min} - F_h\left(h_{max}^+\right) + F_h\left(h_{min}^-\right) \le 0$$
(2.14)
$$g_2(\vec{\phi}) = F_h\left(h_{max}^+\right) - F_h\left(h_{min}^-\right) - p_{max} \le 0$$
(2.15)

If $p_{min} = p_{max}$, then $g_1(\vec{\phi}) = -g_2(\vec{\phi})$, and therefore the two
inequality constraints are represented by the single equality constraint:

$$g(\vec{\phi}) = 0$$
(2.16)

where $g(\vec{\phi}) = g_1(\vec{\phi})$ or $g(\vec{\phi}) = g_2(\vec{\phi})$.

Thus, any general randomistic constraint given by Eq. (2.12) can be represented by
deterministic constraints such as those presented in Eq. (2.7) and (2.8).

The main conclusion is that a general randomistic optimization problem can ultimately be
expressed as a conventional deterministic optimization problem. Therefore, any method used
for solving a conventional deterministic optimization problem can be used for solving any
general randomistic optimization problem.
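
The reduction of a randomistic constraint to deterministic inequalities can be sketched numerically as follows (assuming, for illustration only, that $h(\vec{X})$ is normal with parameters determined by $\vec{\phi}$; the function and parameter names are hypothetical):

from scipy.stats import norm

def randomistic_constraint(phi, h_min, h_max, p_min, p_max):
    # phi = (mu, sigma) determines the assumed normal distribution of h(X).
    # For a continuous distribution the left/right limits of F_h coincide.
    mu, sigma = phi
    p_event = norm.cdf(h_max, mu, sigma) - norm.cdf(h_min, mu, sigma)  # Eq. (2.13)
    g1 = p_min - p_event    # Eq. (2.14): feasible when g1 <= 0
    g2 = p_event - p_max    # Eq. (2.15): feasible when g2 <= 0
    return g1, g2

g1, g2 = randomistic_constraint(phi=(5.0, 1.0), h_min=3.0, h_max=8.0,
                                p_min=0.90, p_max=1.00)
print(g1 <= 0.0 and g2 <= 0.0)    # True: constraint (2.12) is satisfied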

3. Examples

3.1. Deterministic optimization problem: Equivalence with the Randomistic formulation

Let us consider the following linear problem involving two independent deterministic variables
$x_1$ and $x_2$ as decision variables (Problem 7.4 in [7]), written here with its numerical
coefficients in generic form:

$$\max_{x_1, x_2} f(x_1, x_2) = c_1 x_1 + c_2 x_2$$
$$\text{subject to: } a_{i1} x_1 + a_{i2} x_2 \le b_i, \quad i = 1, \dots, m$$
$$x_1 \ge 0, \quad x_2 \ge 0$$
(3.1)


Problem (3.1) can be expressed as an equivalent randomistic optimization problem considering
the randomistic variables $X_1$ and $X_2$, described by the following probability density functions:

$$\rho_{X_1}(x) = \delta(x - \phi_1)$$
(3.2)
$$\rho_{X_2}(x) = \delta(x - \phi_2)$$
(3.3)

where $\rho_X$ represents the probability density function of a continuous** variable $X$, $x$
represents any possible realization of $X$, $\delta$ is Dirac's delta function,[8] and $\phi$ is a deterministic
variable.

According to this definition, the moments of the randomistic variables are:

$$M_k(X_i) = \int_{-\infty}^{\infty} x^k\, \delta(x - \phi_i)\, dx = \phi_i^k$$
(3.4)

and particularly, the decision variables for this problem are:

$$\phi_1 = M_1(X_1), \quad \phi_2 = M_1(X_2)$$
(3.5)

Furthermore, the corresponding cumulative probability functions for these randomistic
variables are given by the step function:

$$F_{X_i}(x) = \begin{cases} 0, & x < \phi_i \\ 1, & x \ge \phi_i \end{cases}$$
(3.6)

Considering the minimization problem as the standard form for randomistic optimization problems,
the deterministic optimization problem (3.1) becomes:

$$\min_{\phi_1, \phi_2} -\left(c_1 M_1(X_1) + c_2 M_1(X_2)\right)$$
$$\text{subject to: } P\left(a_{i1} X_1 + a_{i2} X_2 \le b_i\right) = 1, \quad i = 1, \dots, m$$
$$P(X_1 \ge 0) = 1, \quad P(X_2 \ge 0) = 1$$
(3.7)

** Continuous refers in this context also to the continuous approximation, valid for discrete variables
with a low relative measurement resolution.[1]


The first constraint in problem (3.7) can be expressed in terms of cumulative probability
functions as:

$$P(Y_1 \le b_1) = \int_{-\infty}^{b_1} \rho_{Y_1}(y)\, dy = 1$$
(3.8)

where

$$Y_1 = a_{11} X_1 + a_{12} X_2$$
(3.9)

and from the change of variable theorem [5] the probability density of $Y_1$ can be expressed as a
function of $\rho_{X_1}$ and $\rho_{X_2}$:

$$\rho_{Y_1}(y) = \int_{-\infty}^{\infty} \rho_{X_1}\!\left(\frac{y - a_{12} x_2}{a_{11}}\right)\frac{1}{|a_{11}|}\,\rho_{X_2}(x_2)\, dx_2 = \delta\!\left(y - a_{11}\phi_1 - a_{12}\phi_2\right)$$
(3.10)

Thus, using the result in Eq. (3.10), Eq. (3.8) can be written as:

$$\int_{-\infty}^{b_1} \delta\!\left(y - a_{11}\phi_1 - a_{12}\phi_2\right) dy = H^-\!\left(b_1 - a_{11}\phi_1 - a_{12}\phi_2\right) = 1$$
(3.11)

where $H^-$ represents a modified version of the Heaviside step function approaching from the
left, which corresponds to:

$$H^-(z) = \begin{cases} 0, & z < 0 \\ 1, & z \ge 0 \end{cases}$$
(3.12)

Therefore, Eq. (3.11) is equivalent to:

$$a_{11}\phi_1 + a_{12}\phi_2 \le b_1$$
(3.13)

Following a similar procedure for all other constraints in problem (3.7), a restatement of the
problem is possible:

$$\min_{\phi_1, \phi_2} -\left(c_1\phi_1 + c_2\phi_2\right)$$
$$\text{subject to: } a_{i1}\phi_1 + a_{i2}\phi_2 \le b_i, \quad i = 1, \dots, m$$
$$\phi_1 \ge 0, \quad \phi_2 \ge 0$$
(3.14)


which is identical to problem (3.1), only with a change in the notation of the decision variables.
The solution is therefore the same as that of problem (3.1): $\phi_1 = x_1^*$ and $\phi_2 = x_2^*$.
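
Since problem (3.14) is an ordinary linear program in $\vec{\phi}$, any standard LP solver applies. A minimal sketch (the coefficients below are purely illustrative and do not reproduce the data of Problem 7.4 in [7]):

import numpy as np
from scipy.optimize import linprog

# Illustrative coefficients for the generic form of problem (3.14):
# max c1*phi1 + c2*phi2  ->  min -(c1*phi1 + c2*phi2)
c = np.array([-3.0, -2.0])
A_ub = np.array([[1.0, 1.0],     # a11*phi1 + a12*phi2 <= b1
                 [2.0, 1.0]])    # a21*phi1 + a22*phi2 <= b2
b_ub = np.array([4.0, 5.0])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0.0, None), (0.0, None)])
print(res.x)    # optimal (phi1, phi2) = (1.0, 3.0) for these coefficients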

3.2. Random response variables: Statistical models as objective functions

As a first step towards incorporating uncertainty in an optimization problem, let us consider an
objective function related to a certain response variable, whose model was obtained by
statistical analysis. In this case, the response variable model is composed of a deterministic
prediction term and a random error correction term. As an example, let us analyze the
maximization of the product yield in a pharmaceutical reaction. A uniform design of
experiments is used to obtain experimental data, which is later used to fit a statistical model
as a function of the molar ratio of reactants (cyclopentanone/formaldehyde) and the reaction
temperature.[9] After modeling the response variable, a linear equation of the following form
is obtained (Example 4 in [10]):

$$y = b_0 + b_1 x_1 + b_2 x_2 + e$$
(3.15)

where $y$ is the product yield, $x_1 \in [x_{1,min}, x_{1,max}]$ is the molar ratio of reactants
(cyclopentanone/formaldehyde), $x_2 \in [x_{2,min}, x_{2,max}]$ is the reaction temperature (°C), and $e$ is a
random error with the following properties:

$$M_1(e) = E(e) = 0$$
(3.16)
$$M_2(e) - M_1^2(e) = Var(e) = \sigma_e^2$$
(3.17)
Given that the product yield is also a random variable, the objective function should be
formulated in terms of one or more of its moments. If we are interested in finding the
maximum average product yield, then the corresponding randomistic optimization problem is:

$$\max M_1(y)$$
$$\text{subject to: } P(x_{1,min} \le x_1 \le x_{1,max}) = 1$$
$$P(x_{2,min} \le x_2 \le x_{2,max}) = 1$$
(3.18)

where $M_1(y) = b_0 + b_1 x_1 + b_2 x_2$. Since $x_1$ and $x_2$ are in this case deterministic, problem (3.18) becomes:

$$\max_{x_1, x_2} b_0 + b_1 x_1 + b_2 x_2$$
$$\text{subject to: } x_{1,min} \le x_1 \le x_{1,max}$$
$$x_{2,min} \le x_2 \le x_{2,max}$$
(3.19)


The optimum set of decision variables lies in this case at a vertex of the experimental region,
determined by the signs of the fitted coefficients $b_1$ and $b_2$.

Please notice that when the objective function corresponds to the expected value of a
statistical model (where there is a zero-mean error term), the variance of the error term
has no effect on the result of the optimization.
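
A minimal sketch of problem (3.19) (with hypothetical fitted coefficients and hypothetical experimental ranges; the actual values of [10] are not reproduced here) also shows that the error variance does not appear anywhere in the solution:

from scipy.optimize import linprog

# Hypothetical fitted model: E(yield) = b0 + b1*x1 + b2*x2
b0, b1, b2 = 40.0, 5.0, 0.2
res = linprog(c=[-b1, -b2],             # maximize the average yield
              bounds=[(1.0, 3.0),       # hypothetical molar ratio range
                      (40.0, 80.0)])    # hypothetical temperature range (degC)
x1_opt, x2_opt = res.x
print(x1_opt, x2_opt, b0 + b1*x1_opt + b2*x2_opt)   # vertex of the box: (3, 80)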

3.3. Random decision variables

When the decision variables cannot be set exactly at their desired values, their uncertainty
should be considered in the optimization problem. The case study will be the manufacturing of
coins (based on the example presented in Section 5.3 of [2]). During the manufacturing of
coins, the diameter ($D$) and thickness ($h$) of the coin are considered random variables. Thus,
the total surface area ($S$) and total volume ($V$) of the coin will also fluctuate from coin to coin. In
this problem, the manufacturer expects to design a new coin such that the average surface
area of the coin is as large as possible, with a maximum average volume $V_{max}$ and an
average diameter to average thickness ratio between $r_{min}$ and $r_{max}$. The average diameter that
the manufacturer can handle is in the range from $D_{min}$ to $D_{max}$, with a size-independent
coefficient of variation $c_D$. The average thickness that the manufacturer can handle is in the
range from $h_{min}$ to $h_{max}$, with a size-independent coefficient of variation $c_h$. The
randomistic optimization problem can be formulated as follows (non-parametric approach):

$$\max M_1(S) = \frac{\pi}{2} M_2(D) + \pi M_1(D)\, M_1(h)$$
$$\text{subject to: } M_1(V) = \frac{\pi}{4} M_2(D)\, M_1(h) \le V_{max}$$
$$r_{min} \le \frac{M_1(D)}{M_1(h)} \le r_{max}$$
$$D_{min} \le M_1(D) \le D_{max}, \quad \sqrt{M_2(D) - M_1^2(D)} = c_D\, M_1(D)$$
$$h_{min} \le M_1(h) \le h_{max}, \quad \sqrt{M_2(h) - M_1^2(h)} = c_h\, M_1(h)$$
(3.20)

Considering that

$$M_2(X) = Var(X) + M_1^2(X)$$
(3.21)

and setting $\phi_1 = M_1(D)$, $\phi_2 = M_2(D)$, $\phi_3 = M_1(h)$ and $\phi_4 = M_2(h)$, problem (3.20) can be
reformulated as:


$$\max_{\phi_1, \phi_3} \frac{\pi}{2}\left(1 + c_D^2\right)\phi_1^2 + \pi\, \phi_1 \phi_3$$
$$\text{subject to: } \frac{\pi}{4}\left(1 + c_D^2\right)\phi_1^2\, \phi_3 \le V_{max}$$
$$r_{min} \le \frac{\phi_1}{\phi_3} \le r_{max}, \quad D_{min} \le \phi_1 \le D_{max}, \quad h_{min} \le \phi_3 \le h_{max}$$
$$\phi_2 = \left(1 + c_D^2\right)\phi_1^2, \quad \phi_4 = \left(1 + c_h^2\right)\phi_3^2$$
(3.22)

Considering a finite resolution in $\phi_1$ and $\phi_3$, the solution to this problem is obtained
numerically: the optimal average diameter and average thickness (with the corresponding
second moments given by Eq. 3.22) determine the ratio of average diameter to average
thickness, the average surface area, and the average volume of the optimal coin.
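
A sketch of problem (3.22) as a nonlinear program (all numerical specifications below are hypothetical placeholders for the values of the original example in [2]):

import numpy as np
from scipy.optimize import minimize

cD, ch = 0.01, 0.02            # hypothetical coefficients of variation of D and h
V_max = 1000.0                 # hypothetical maximum average volume (mm^3)
r_min, r_max = 10.0, 20.0      # hypothetical limits on M1(D)/M1(h)
bounds = [(15.0, 30.0),        # hypothetical range of M1(D) (mm)
          (1.0, 3.0)]          # hypothetical range of M1(h) (mm)

def neg_avg_surface(phi):
    mD, mh = phi
    M2D = (1.0 + cD**2) * mD**2                   # Eq. (3.21) with Var(D) = (cD*mD)^2
    return -(np.pi/2.0 * M2D + np.pi * mD * mh)   # -M1(S) from problem (3.22)

constraints = [
    {"type": "ineq", "fun": lambda p: V_max - np.pi/4.0*(1.0 + cD**2)*p[0]**2*p[1]},
    {"type": "ineq", "fun": lambda p: p[0]/p[1] - r_min},
    {"type": "ineq", "fun": lambda p: r_max - p[0]/p[1]},
]
res = minimize(neg_avg_surface, x0=[20.0, 2.0], bounds=bounds,
               constraints=constraints)
print(res.x)    # optimal average diameter and average thickness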

3.4. Random constraints

So far, all constraints considered in the previous examples have been deterministic, since the
probabilities of the events have been set equal to either 0 or 1. There are certain situations
where the constraints are random. Let us consider the following example (based on Example
5.3 in [11]): A chemical manufacturer uses two different solvents in a certain production
process. The total daily consumption of solvents in the plant is $T$ tons. Solvent 1 is a
conventional solvent with a cost of $c_1$ \$/ton, whereas solvent 2 is a by-product obtained from
a different manufacturer, which pays $c_2$ \$/ton for its disposal. However, the second solvent
has a randomly fluctuating limited availability (a uniform random distribution between 4 and 10
ton/day). For that reason, during the formulation of the final product, the plant manager stated
that the highest tolerable risk of a daily shortage in solvent 2 is 10%. In addition, there are certain
storage capacity restrictions ($S_1$ and $S_2$) on the daily amount of each solvent that can be used in
the process. All these conditions can be expressed as the following randomistic optimization
problem for obtaining the best solvent mix formulation:


$$\min M_1(C) = c_1\, M_1(X_1) - c_2\, M_1(X_2)$$
$$\text{subject to: } P(X_1 \le S_1) = 1, \quad P(X_2 \le S_2) = 1$$
$$P(X_1 + X_2 = T) = 1$$
$$P(X_2 > A) \le 0.1$$
$$P(X_1 \ge 0) = 1, \quad P(X_2 \ge 0) = 1$$
(3.23)

where $C$ is the daily cost function for the solvent mixture, $X_1$ is the daily requirement of solvent 1
in the formulation (ton/day), $X_2$ is the daily requirement of solvent 2 in the formulation
(ton/day), and $A$ is the uniform random availability of solvent 2 from the other manufacturer
(ton/day). The first pair of constraints represents the plant storage restrictions.

By setting $\phi_1 = M_1(X_1)$ and $\phi_2 = M_1(X_2)$, and considering that $X_1$ and $X_2$ are deterministic
variables, problem (3.23) can be expressed as:

$$\min_{\phi_1, \phi_2} c_1\phi_1 - c_2\phi_2$$
$$\text{subject to: } \phi_1 \le S_1, \quad \phi_2 \le S_2$$
$$\phi_1 + \phi_2 = T$$
$$F_A(\phi_2) \le 0.1$$
$$\phi_1 \ge 0, \quad \phi_2 \ge 0$$
(3.24)

where $F_A$ represents the cumulative probability function of the random daily availability of
solvent 2. Recalling that $A$ is a uniform random variable between 4 and 10 ton/day,
then:

$$F_A(\phi_2) = \begin{cases} 0, & \phi_2 < 4 \\ \displaystyle\int_4^{\phi_2} \frac{da}{6} = \frac{\phi_2 - 4}{6}, & 4 \le \phi_2 \le 10 \\ 1, & \phi_2 > 10 \end{cases}$$
(3.25)

Therefore, $F_A(\phi_2) = 0$ for any value of $\phi_2 < 4$, and $F_A(\phi_2) \le 0.1$ for $\phi_2 \le 4.6$.†† Thus, problem (3.24)
becomes:

†† Since, from Eq. (3.25), $F_A(\phi_2) = (\phi_2 - 4)/6$ in the relevant range, the condition $F_A(\phi_2) \le 0.1$ implies $\phi_2 \le 4.6$.


$$\min_{\phi_1, \phi_2} c_1\phi_1 - c_2\phi_2$$
$$\text{subject to: } \phi_1 \le S_1, \quad \phi_2 \le S_2$$
$$\phi_1 + \phi_2 = T$$
$$0 \le \phi_2 \le 4.6, \quad \phi_1 \ge 0$$
(3.26)

Since solvent 2 reduces the daily cost, the solution to this problem (for non-binding storage
capacities) is: $\phi_2 = 4.6$ ton/day of solvent 2 and $\phi_1 = T - 4.6$ ton/day of solvent 1.
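
A sketch of the resolution of problem (3.26) (the total requirement $T$ and the cost figures are hypothetical stand-ins for the data of Example 5.3 in [11]):

# Chance constraint: P(shortage) = P(A < phi2) = F_A(phi2) <= 0.10
a, b = 4.0, 10.0                # uniform availability of solvent 2 (ton/day)
risk = 0.10
phi2_max = a + risk * (b - a)   # inverse of F_A(x) = (x - a)/(b - a): 4.6 ton/day

T = 12.0                        # hypothetical total daily requirement (ton/day)
c1, c2 = 80.0, 20.0             # hypothetical solvent 1 cost and disposal credit ($/ton)

phi2 = phi2_max                 # solvent 2 lowers the cost, so use as much as allowed
phi1 = T - phi2
daily_cost = c1*phi1 - c2*phi2
print(phi1, phi2, daily_cost)   # 7.4, 4.6, 500.0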

3.5. Second moments in the objective function: Robust (Taguchi) optimization

Robust optimization consists in finding the best set of decision variables that minimizes the
effect of uncertain conditions (noise) on the performance of the system.[12] Genichi
Taguchi,[13] considered the father of robust design, proposed solving the problem of robust
optimization by using as objective function the "signal-to-noise (SN) ratio" function ($SN$), defined
as (in decibel units):

$$SN = 10 \log_{10}\left(\frac{\left(E\left(\frac{\partial y}{\partial x}\right)\right)^2}{\sigma_e^2}\right)$$
(3.27)

where $E\left(\frac{\partial y}{\partial x}\right)$ represents the expected value of the local sensitivity of the measurement
($y$) to a certain signal factor ($x$), and $\sigma_e$ is the variability (standard deviation) of the
measurement error or noise ($e$), with $E(e) = 0$. Thus, the SN ratio is a ratio of sensitivity to
variability.

Let us now consider that the measurement is mathematically represented by the
following model:

$$y = f(x, \vec{\beta}) + e$$
(3.28)

where $f$ represents any arbitrary nonlinear function of the signal factor $x$ and the model
parameters $\vec{\beta}$ (independent of $x$ and $e$), and $\sigma_e^2 = Var(e)$. Then, the SN ratio function
becomes:

$$SN = 10 \log_{10}\left(\frac{\left(E\left(\frac{\partial f(x,\vec{\beta})}{\partial x}\right)\right)^2}{Var\left(y - f(x,\vec{\beta})\right)}\right)$$
(3.29)


In a parameter estimation problem, a sample of data for $x$ and $y$ ($x_i$ and $y_i$, respectively) is used
to find the best values of the parameters $\vec{\beta}$ that optimize a performance function. The
traditional least-squares method minimizes the sum of squared residuals $\sum_{i=1}^{n} e_i^2$. In robust
parameter estimation, $SN$ will be used as the objective function to maximize.

For this example, a linear model is considered, where:

$$f(x, \vec{\beta}) = \beta_0 + \beta_1 x$$
(3.30)

with

$$\frac{\partial f}{\partial x} = \beta_1$$
(3.31)

Thus, $SN$ becomes:

$$SN = 10 \log_{10}\left(\frac{\beta_1^2}{Var\left(y - \beta_0 - \beta_1 x\right)}\right) = 10 \log_{10}\left(\frac{\beta_1^2}{Var(y) - 2\beta_1 Cov(x, y) + \beta_1^2 Var(x)}\right)$$
(3.32)

since the parameters $\vec{\beta}$ are independent of $x$ and $e$.

Furthermore, $Var(x)$, $Var(y)$ and $Cov(x, y)$ are estimated from the data sample:

$$Var(x) \approx \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}$$
(3.33)
$$Var(y) \approx \frac{\sum_{i=1}^{n}(y_i - \bar{y})^2}{n - 1}$$
(3.34)
$$Cov(x, y) \approx \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
(3.35)

where $n$ is the sample size, and $\bar{x}$ and $\bar{y}$ represent the sample averages.

Thus, the corresponding randomistic optimization problem is:

$$\max_{\beta_0, \beta_1} SN = 10 \log_{10}\left(\frac{(n-1)\,\beta_1^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2 - 2\beta_1 \sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) + \beta_1^2 \sum_{i=1}^{n}(x_i - \bar{x})^2}\right)$$
$$\text{subject to: } \bar{y} - \beta_0 - \beta_1 \bar{x} = 0$$
(3.36)

The constraint in problem (3.36) basically indicates that the average value of the residuals in
the sample is $\bar{e} = 0$.


Problem (3.36) can be solved analytically by setting $\frac{\partial SN}{\partial \beta_1} = 0$ and verifying that $\frac{\partial^2 SN}{\partial \beta_1^2} < 0$. The derivative
with respect to $\beta_0$ is not considered because $\beta_0$ has no effect on the SN ratio function, although it
must satisfy the constraint. Thus,

$$\frac{\partial SN}{\partial \beta_1} = \frac{20}{\ln 10}\left(\frac{1}{\beta_1} - \frac{\beta_1 \sum_{i=1}^{n}(x_i - \bar{x})^2 - \sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(y_i - \bar{y})^2 - 2\beta_1 \sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) + \beta_1^2 \sum_{i=1}^{n}(x_i - \bar{x})^2}\right) = 0$$
(3.37)

and therefore:

$$\beta_1 = \frac{\sum_{i=1}^{n}(y_i - \bar{y})^2}{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}$$
(3.38)
$$\beta_0 = \bar{y} - \beta_1 \bar{x} = \bar{y} - \frac{\sum_{i=1}^{n}(y_i - \bar{y})^2}{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}\,\bar{x}$$
(3.39)
On the other hand, the second derivative evaluated at this point,

$$\frac{\partial^2 SN}{\partial \beta_1^2} = -\frac{20}{\ln 10}\,\frac{\left(\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})\right)^4}{\left(\sum_{i=1}^{n}(y_i - \bar{y})^2\right)^2\left(\sum_{i=1}^{n}(x_i - \bar{x})^2\sum_{i=1}^{n}(y_i - \bar{y})^2 - \left(\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})\right)^2\right)}$$
(3.40)

is negative as long as

$$\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2 > \left(\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})\right)^2$$
(3.41)

Now, since $r^2 = \frac{\left(\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})\right)^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}$ is Pearson's correlation coefficient squared, and
$0 \le r^2 \le 1$ (with $r^2 < 1$ for any sample that is not perfectly correlated), condition (3.41) is satisfied.
Therefore, the parameters obtained in Eq. (3.38) and (3.39) represent a minimum in $-SN$, or
equivalently, a maximum in the SN ratio function $SN$.‡‡

Please notice that the robust linear regression parameters are different from the least-squares
linear regression parameters,§§ since for the latter:

‡‡ Note that the robust regression approach (as well as any other regression method) is only reliable if
$n$ is large. Particularly, robust regression may be useful when non-linear transformations of the original
data are used.
§§ In fact, the robust slope $\beta_1$ corresponds to the reciprocal of the least-squares slope estimate obtained
when the $x$ and $y$ data are interchanged.


$$\beta_1^{LS} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$$
(3.42)
$$\beta_0^{LS} = \bar{y} - \beta_1^{LS}\,\bar{x} = \bar{y} - \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\bar{x}$$
(3.43)
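
A sketch comparing the two estimators (the data below are synthetic; loading the Table 1 values, in decimal logarithms, into x and y instead would reproduce the fits of Eq. 3.44 and 3.45):

import numpy as np

def linear_fits(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    dx, dy = x - x.mean(), y - y.mean()
    Sxx, Syy, Sxy = (dx*dx).sum(), (dy*dy).sum(), (dx*dy).sum()
    b1_ls = Sxy / Sxx                      # Eq. (3.42): least-squares slope
    b1_rob = Syy / Sxy                     # Eq. (3.38): robust (SN-maximizing) slope
    b0_ls = y.mean() - b1_ls * x.mean()    # Eq. (3.43)
    b0_rob = y.mean() - b1_rob * x.mean()  # Eq. (3.39)
    return (b0_ls, b1_ls), (b0_rob, b1_rob)

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 2.0, 200)                 # synthetic signal factor
y = 0.5 + 0.8*x + rng.normal(0.0, 0.3, 200)     # synthetic noisy measurement
ls, robust = linear_fits(x, y)
print("least squares:", ls)    # slope close to 0.8
print("robust:", robust)       # slope = least-squares slope / r^2 (slightly larger)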

As a numerical example, let us consider the solubility data at room temperature for different
non-charged solutes in two solvents, tetrahydrofuran (THF) and 2-methyl tetrahydrofuran (2-
MeTHF), presented in Table 1.[14]

Table 1. Solubility data of different non-charged solutes at room temperature in THF and 2-
MeTHF. Values in mg/ml approximately obtained*** from Figure 4 reported in [14].
Solubility of different non-charged solutes (mg/ml) in:
THF 2-MeTHF THF 2-MeTHF THF 2-MeTHF THF 2-MeTHF THF 2-MeTHF
0.10 0.11 2.98 0.63 11.37 5.88 25.80 3.82 48.71 21.48
0.16 0.10 3.03 0.66 11.75 2.76 28.52 2.92 49.54 12.52
0.17 0.19 3.35 6.73 11.75 13.95 29.00 0.17 51.22 4.49
0.30 0.82 3.53 0.26 12.15 4.03 29.00 34.91 51.22 22.07
0.42 0.10 3.83 2.62 12.15 4.37 29.49 5.88 51.22 56.74
0.54 0.25 3.96 5.73 12.57 5.73 29.49 14.33 52.96 22.67
0.61 0.15 4.17 1.57 12.57 9.56 29.99 13.22 54.76 29.69
0.68 0.52 4.46 2.69 13.21 5.00 31.01 2.62 55.69 16.85
0.68 0.63 4.46 3.82 13.44 1.84 32.61 6.21 56.63 8.35
0.74 0.63 4.84 2.00 13.66 2.76 33.72 4.74 57.58 15.54
0.89 0.11 5.09 0.50 13.89 2.48 34.29 17.31 58.55 20.91
0.91 0.15 5.09 1.70 14.37 8.82 34.86 13.58 58.55 92.22
1.02 0.14 5.45 3.25 14.86 6.38 34.86 24.58 61.57 11.24
1.06 0.26 6.02 1.57 15.11 25.25 35.45 4.26 63.66 42.17
1.19 0.20 6.12 4.03 15.62 4.37 35.45 9.31 64.73 24.58
1.23 0.50 6.77 1.75 15.88 2.76 35.45 12.52 66.94 64.94
1.25 0.26 7.00 2.23 15.88 6.21 37.28 17.31 68.07 15.96
1.31 0.15 7.24 1.95 15.88 7.30 37.90 14.72 68.07 25.95
1.38 0.42 7.36 1.57 16.15 3.82 38.54 10.37 69.21 0.10
1.48 0.24 7.48 2.84 16.98 6.92 39.19 13.22 70.38 14.33
1.50 0.59 7.48 3.16 17.27 10.37 40.53 4.87 71.57 26.66
1.75 0.32 7.87 9.56 18.16 7.70 41.91 11.86 72.78 70.41
1.78 0.38 8.00 1.75 18.46 3.43 41.91 25.95 74.00 19.28
1.81 0.43 8.14 0.47 19.74 7.50 42.61 9.56 76.52 45.73
1.87 0.28 8.85 3.25 20.08 8.58 42.61 8.35 83.19 9.82
2.48 0.66 8.99 1.79 20.41 5.88 42.61 35.87 83.19 42.17
2.48 3.25 9.30 0.99 20.41 8.13 43.33 17.78 84.60 17.31
2.57 0.56 9.30 2.76 21.11 3.16 47.11 8.58 86.03 25.95
2.61 0.38 9.46 4.03 22.19 2.76 47.11 11.86 86.03 48.26
2.84 0.58 9.94 5.14 23.73 3.72 47.91 9.31 91.98 21.48
2.84 0.87 10.81 3.08 24.13 6.21 47.91 26.66 95.11 42.17

***
ForsChem Deuterium XL v.1.0 was used for extracting data from the reported figure.


A linear regression analysis between the decimal logarithm of the solubility of non-charged
solutes in 2-MeTHF ($y = \log_{10} S_{2\text{-MeTHF}}$) and the decimal logarithm of the solubility in THF
($x = \log_{10} S_{THF}$) yields the following results:

- Using least-squares minimization (Eq. 3.42 and 3.43):†††

$$\log_{10} S_{2\text{-MeTHF}} = \beta_0^{LS} + \beta_1^{LS} \log_{10} S_{THF}$$
(3.44)

- Using robust SN ratio maximization (Eq. 3.38 and 3.39):

$$\log_{10} S_{2\text{-MeTHF}} = \beta_0 + \beta_1 \log_{10} S_{THF}$$
(3.45)

where the robust slope $\beta_1$ equals the least-squares slope $\beta_1^{LS}$ divided by $r^2$, and, by
construction, attains a higher SN ratio.

The data sample along with both fitted models is presented in Figure 2 (decimal logarithm
scale) and Figure 3 (original scale).

Figure 2. Solubility (mg/ml) of different solutes at room temperature for THF and 2-MeTHF in
decimal logarithm scale. Blue dots: Data sample. Dashed purple line: Least-squares fit (Eq.
3.44). Solid green line: Robust fit (Eq. 3.45)

††† The model obtained in [14] is slightly different, probably because of errors in the extraction of the
data from the corresponding graph.


Interestingly, the variance of the residuals between the solubility in 2-MeTHF in the data sample
and the solubility predicted from the fitted models, evaluated in the original (mg/ml) scale of the
data, was lower for the robust estimation than for the least-squares minimization, even though
the least-squares fit minimizes the residual variance in the logarithmic scale.

Figure 3. Solubility (mg/ml) of different solutes at room temperature for THF and 2-MeTHF in
original scale. Blue dots: Data sample. Dashed purple line: Least-squares fit (Eq. 3.44). Solid
green line: Robust fit (Eq. 3.45)

3.6. Randomness everywhere

The following example (based on Example 6.7 in [15]) considers the optimal design of a
cylindrical tank with diameter $D$ and height $H$. The volume of the tank is given by:

$$V = \frac{\pi D^2 H}{4}$$
(3.46)

The nominal volume of the tanks is $V_{nom}$. The tank is considered defective if the absolute
deviation in volume is greater than or equal to $\Delta V$.

The materials cost ($C_M$) of the tank is:

$$C_M = c_B\left(\pi D H\right) + c_L\left(\frac{\pi D^2}{4}\right)$$
(3.47)

where $c_B$ is the cost per unit area of the body, and $c_L$ is the cost per unit area of the lid.

The manufacturing or processing cost ($C_P$) of the tank is given by:

$$C_P = c_I + \frac{c_D}{Var(D)} + \frac{c_H}{Var(H)}$$
(3.48)

where $c_I$ is the indirect manufacturing cost, $c_D$ is a cost factor associated with the manufacturing
precision in the tank diameter, and $c_H$ is a cost factor associated with the manufacturing precision
in the tank height. Thus, the manufacturing cost is inversely proportional to the variance in the
dimensions considered in the process.

The total cost ($C_T$) of each tank is therefore:

$$C_T = C_M + C_P = c_B\left(\pi D H\right) + c_L\left(\frac{\pi D^2}{4}\right) + c_I + \frac{c_D}{Var(D)} + \frac{c_H}{Var(H)}$$
(3.49)

The cost factors $c_B$, $c_L$, $c_I$, $c_D$ and $c_H$ have changed historically following almost normal
distributions with the parameters given in Table 2.

The goal of this design is minimizing the cost of the tank while guaranteeing that the maximum
proportion of defective (out-of-specification) tanks is $p_{def}$. It is also desired that the standard
deviation of the total cost be less than or equal to $\sigma_{C,max}$.

The corresponding randomistic optimization problem can be stated as follows:

$$\min_{\mu_D, \sigma_D, \mu_H, \sigma_H} M_1(C_T) = \pi M_1(c_B)\,\mu_D \mu_H + \frac{\pi}{4} M_1(c_L)\left(\mu_D^2 + \sigma_D^2\right) + M_1(c_I) + \frac{M_1(c_D)}{\sigma_D^2} + \frac{M_1(c_H)}{\sigma_H^2}$$
$$\text{subject to: } P\left(|V - V_{nom}| \ge \Delta V\right) \le p_{def}$$
$$\sqrt{M_2(C_T) - M_1^2(C_T)} \le \sigma_{C,max}$$
$$P(D > 0) \ge 1 - \epsilon, \quad P(H > 0) \ge 1 - \epsilon$$
(3.50)

Assuming that the tank diameter and the tank height behave as normal distributions
(parametric approach), then:

$$D = \mu_D + \sigma_D z_D$$
(3.51)
$$H = \mu_H + \sigma_H z_H$$
(3.52)


where $\mu_D = M_1(D)$, $\sigma_D^2 = Var(D)$, $\mu_H = M_1(H)$ and $\sigma_H^2 = Var(H)$. In addition, $z_D$ and $z_H$
represent Type I standard normal random variables.

Table 2. Parameters of normal distributions describing the historical behavior of different cost
factors for tank manufacturing
Cost factor Unit Average Standard deviation

The tank volume can also be approximated by the following normal distribution (see Example
5.3 in [2]):

$$V \approx \mu_V + \sigma_V z_V$$
(3.53)

where

$$\mu_V = \frac{\pi}{4}\left(\mu_D^2 + \sigma_D^2\right)\mu_H$$
(3.54)
$$\sigma_V = \frac{\pi}{4}\sqrt{\left(\mu_D^4 + 6\mu_D^2\sigma_D^2 + 3\sigma_D^4\right)\left(\mu_H^2 + \sigma_H^2\right) - \left(\mu_D^2 + \sigma_D^2\right)^2\mu_H^2}$$
(3.55)

Thus, problem (3.50) can be restated as follows (replacing the moments of the cost factors by
their values from Table 2, denoted here with overbars for the averages):

$$\min_{\mu_D, \sigma_D, \mu_H, \sigma_H} \pi \bar{c}_B\,\mu_D \mu_H + \frac{\pi}{4}\bar{c}_L\left(\mu_D^2 + \sigma_D^2\right) + \bar{c}_I + \frac{\bar{c}_D}{\sigma_D^2} + \frac{\bar{c}_H}{\sigma_H^2}$$
$$\text{subject to: } \Phi\left(\frac{V_{nom} - \Delta V - \mu_V}{\sigma_V}\right) + 1 - \Phi\left(\frac{V_{nom} + \Delta V - \mu_V}{\sigma_V}\right) \le p_{def}$$
$$\sqrt{Var(C_T)} \le \sigma_{C,max}$$
$$\mu_V = \frac{\pi}{4}\left(\mu_D^2 + \sigma_D^2\right)\mu_H$$
$$\sigma_V = \frac{\pi}{4}\sqrt{\left(\mu_D^4 + 6\mu_D^2\sigma_D^2 + 3\sigma_D^4\right)\left(\mu_H^2 + \sigma_H^2\right) - \left(\mu_D^2 + \sigma_D^2\right)^2\mu_H^2}$$
$$\Phi\left(\frac{\mu_D}{\sigma_D}\right) \ge 1 - \epsilon, \quad \Phi\left(\frac{\mu_H}{\sigma_H}\right) \ge 1 - \epsilon$$
(3.56)


where $Var(C_T)$ is evaluated from the standard deviations of the cost factors in Table 2, and $\Phi$
represents the cumulative probability function of the standard normal distribution, calculated as:

$$\Phi(z) = \int_{-\infty}^{z} \frac{e^{-t^2/2}}{\sqrt{2\pi}}\,dt = \frac{1}{2}\left(1 + \mathrm{erf}\left(\frac{z}{\sqrt{2}}\right)\right)$$
(3.57)

where $\mathrm{erf}$ denotes the error function. Since normal distributions span from $-\infty$ to $\infty$ (and
therefore $\Phi(z) = 1$ only when $z \to \infty$), a tolerance value $\epsilon$ is used in the last two
constraints. Setting a small value of $\epsilon$ and solving problem (3.56), the optimal parameters
are found:

$$\mu_D^*, \quad \sigma_D^*, \quad \mu_H^*, \quad \sigma_H^*$$
(3.58)

The average total cost of the tank and its standard deviation, as well as the average tank
volume and its standard deviation, then follow directly from Eq. (3.49) and Eq. (3.53)-(3.55).
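
The defect-probability constraint of problem (3.56) can be sketched numerically by combining Eq. (3.53)-(3.55) with Eq. (3.57) (the design point, nominal volume and tolerance below are hypothetical):

import math

def std_normal_cdf(z):
    # Eq. (3.57): standard normal cumulative probability via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def defect_probability(muD, sD, muH, sH, V_nom, dV):
    # Normal approximation of the tank volume, Eq. (3.53)-(3.55)
    muV = math.pi/4.0 * (muD**2 + sD**2) * muH
    E_D4 = muD**4 + 6.0*muD**2*sD**2 + 3.0*sD**4    # fourth raw moment of normal D
    var_V = (math.pi/4.0)**2 * (E_D4*(muH**2 + sH**2) - ((muD**2 + sD**2)*muH)**2)
    sV = math.sqrt(var_V)
    # P(|V - V_nom| >= dV) under the normal approximation
    return (std_normal_cdf((V_nom - dV - muV)/sV)
            + 1.0 - std_normal_cdf((V_nom + dV - muV)/sV))

# Hypothetical design point (units: m and m^3)
print(defect_probability(muD=1.0, sD=0.002, muH=1.3, sH=0.003,
                         V_nom=1.021, dV=0.02))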

4. Conclusion

Deterministic optimization problems along with stochastic optimization problems can be
represented by a single type of problem, denoted as the randomistic optimization problem.
The decision variables involved are either the moments of the probability distribution of
randomistic variables (non-parametric formulation) or the parameters of previously defined or
assumed probability distribution functions (parametric formulation). The constraints are
expressed as a range of probability for the occurrence of certain pre-established events.
Assuming a complete characterization of the randomistic variables, the decision variables as
well as the objective function and the constraints become deterministic. Therefore, the
randomistic optimization problem can be represented by a corresponding deterministic
optimization problem and be solved by any suitable method of deterministic optimization.
Thus, the key for solving any randomistic optimization problem lies in the correct interpretation
of the problem and its translation into a suitable deterministic optimization problem.


Acknowledgments

The author gratefully acknowledges Prof. Dr. Silvia Ochoa (Universidad de Antioquia,
Colombia) and Prof. Jaime Aguirre (Universidad Nacional de Colombia), for useful discussions
and for reviewing the manuscript.

This research did not receive any specific grant from funding agencies in the public,
commercial, or not-for-profit sectors.

References

[1] Hernandez, H. (2018). The Realm of Randomistic Variables. ForsChem Research Reports
2018-10. doi: 10.13140/RG.2.2.29034.16326.

[2] Hernandez, H. (2018). Multidimensional Randomness, Standard Random Variables and
Variance Algebra. ForsChem Research Reports 2018-02. doi: 10.13140/RG.2.2.11902.48966.

[3] Hernandez, H. (2018). On the Behavior of Dynamic Random Variables. ForsChem Research
Reports 2018-09. doi: 10.13140/RG.2.2.20135.19366.

[4] Sahinidis, N. V. (2004). Optimization under Uncertainty: State-of-the-art and Opportunities.
Computers & Chemical Engineering, 28(6-7), 971-983.

[5] Hernandez, H. (2017). Multivariate Probability Theory: Determination of Probability Density
Functions. ForsChem Research Reports 2017-13. doi: 10.13140/RG.2.2.28214.60481.

[6] John, V., Angelov, I., Öncül, A. A., & Thévenin, D. (2007). Techniques for the Reconstruction
of a Distribution from a Finite Number of its Moments. Chemical Engineering Science, 62(11),
2890-2904.

[7] Edgar, T. F., Himmelblau, D. M., & Lasdon, L. S. (2001). Optimization of Chemical Processes.
2nd ed. Singapore: McGraw-Hill.

[8] Dirac, P.A.M. (1958). The Principles of Quantum Mechanics. 4th Revised Edition. New York:
Oxford University Press.

[9] Liang, Y. Z., Fang, K. T., & Xu, Q. S. (2001). Uniform Design and its Applications in Chemistry
and Chemical Engineering. Chemometrics and Intelligent Laboratory Systems, 58(1), 43-57.


[10] Hernandez, H. (2018). Statistical Modeling and Analysis of Experiments without ANOVA.
ForsChem Research Reports 2018-05. doi: 10.13140/RG.2.2.21499.00803.

[11] Diwekar, U. (2008). Introduction to Applied Optimization. 2nd Edition. New York: Springer.

[12] Beyer, H. G., & Sendhoff, B. (2007). Robust Optimization – A Comprehensive Survey.
Computer Methods in Applied Mechanics and Engineering, 196(33-34), 3190-3218.

[13] Taguchi, G., Chowdhury, S., & Wu, Y. (2005). Taguchi's Quality Engineering Handbook.
Hoboken, NJ: John Wiley & Sons.

[14] Qiu, J., & Albrecht, J. (2018). Solubility Correlations of Common Organic Solvents. Organic
Process Research & Development, 22(7), 829-835.

[15] Dutta, S. (2016). Optimization in Chemical Engineering. Cambridge: Cambridge University Press.
