
FRAMEWORK FOR JOINT DATA RECONCILIATION AND

PARAMETER ESTIMATION

JOE YEN YEN


(B.Eng.(Hons.), NUS)

A THESIS SUBMITTED

FOR THE DEGREE OF MASTER OF ENGINEERING

DEPARTMENT OF ELECTRICAL
AND COMPUTER ENGINEERING

NATIONAL UNIVERSITY OF SINGAPORE

2004
ACKNOWLEDGMENTS

I would like to express my gratitude to my supervisors Dr Arthur Tay and A/Professor

Ho Weng Khuen of the NUS ECE Department and Professor Ching Chi Bun of the

Institute of Chemical and Engineering Sciences (ICES), for their advice and guidance,

and for having confidence in me in carrying out this research.

The support of ICES in financing the research and providing the research environment is

gratefully acknowledged. The guidance and assistance of Dr Liu Jun of ICES are also

appreciated.

Most of the research work was done in the Process Systems Engineering (PSE) Group of the Department of Chemical Engineering at the University of Sydney. The guidance of Professor Jose Romagnoli and Dr David Wang was invaluable in directing and improving the quality of this work. I would also like to thank them for accommodating me with such hospitality that my stay in the group was not only fruitful, but also enjoyable. Thanks also go to the other members of the PSE Group for sharing their research knowledge and experience and for making me feel part of the group.

My friend Zhao Sumin has seen to my well-being in Sydney, and most importantly, has been a like-minded confidante from whom I drew inspiration in carrying out my research work. I would like to dedicate this thesis to her; with her persistent encouragement, I feel that the effort in completing this thesis is partly hers.

TABLE OF CONTENTS

ACKNOWLEDGMENTS............................................................................................................................. I
TABLE OF CONTENTS.............................................................................................................................II
SUMMARY ................................................................................................................................................ IV
LIST OF TABLES ..................................................................................................................................... VI
LIST OF FIGURES ..................................................................................................................................VII
CHAPTER 1: INTRODUCTION ................................................................................................................1
1.1. MOTIVATION ........................................................................................................................................1
1.2. CONTRIBUTION .....................................................................................................................................3
1.3. THESIS ORGANIZATION ........................................................................................................................4
CHAPTER 2: THEORY & LITERATURE REVIEW..............................................................................6
2.1. INTRODUCTION ......................................................................................................................................6
2.2. DATA RECONCILIATION (DR)...............................................................................................................6
2.3. JOINT DATA RECONCILIATION – PARAMETER ESTIMATION (DRPE) ..................................................11
2.4. ROBUST ESTIMATION .........................................................................................................................15
2.5. PARTIALLY ADAPTIVE ESTIMATION ...................................................................................................22
2.6. CONCLUSION ......................................................................................................................................24
CHAPTER 3: JOINT DRPE BASED ON THE GENERALIZED T DISTRIBUTION .......................26
3.1. INTRODUCTION ...................................................................................................................................26
3.2. THE GENERALIZED T (GT) DISTRIBUTION..........................................................................................26
3.3. ROBUSTNESS OF THE GT DISTRIBUTION .............................................................................................29
3.4. PARTIALLY ADAPTIVE GT-BASED ESTIMATOR...................................................................................30
3.5. THE GENERAL ALGORITHM ................................................................................................................32
3.6. CONCLUSION ......................................................................................................................................35
CHAPTER 4: CASE STUDY .....................................................................................................................36
4.1. INTRODUCTION ...................................................................................................................................36
4.2. THE GENERAL-PURPOSE CHEMICAL PLANT .......................................................................................36
4.3. VARIABLES AND MEASUREMENTS ......................................................................................................40
4.4. SYSTEM DECOMPOSITION: VARIABLE CLASSIFICATION .....................................................................43
4.4.1. Derivation of Reduced Equations ..............................................................................................43
4.4.2. Classification of Measurements .................................................................................................45
4.5. DATA GENERATION ............................................................................................................................47
4.5.1. Simulation Data .........................................................................................................................47
4.5.2. Real Data ...................................................................................................................................48
4.6. METHODS COMPARED ........................................................................................................................48
4.7. PERFORMANCE MEASURES .................................................................................................................50
4.8. CONCLUSION ......................................................................................................................................51
CHAPTER 5: RESULTS & DISCUSSION ..............................................................................................53
5.1. INTRODUCTION ...................................................................................................................................53
5.2. PARTIAL ADAPTIVENESS & EFFICIENCY .............................................................................................53
5.3. EFFECT OF OUTLIERS ON EFFICIENCY .................................................................................................64
5.4. EFFECT OF DATA SIZE ON EFFICIENCY ...............................................................................................67
5.5. ESTIMATION OF ADAPTIVE ESTIMATOR PARAMETERS: PRELIMINARY ESTIMATORS AND ITERATIONS
..................................................................................................................................................................71
5.6. REAL DATA APPLICATION ..................................................................................................................76
5.7. CONCLUSION ......................................................................................................................................79

CHAPTER 6: CONCLUSION ...................................................................................................................80
6.1. FINDINGS ............................................................................................................................................80
6.2. FUTURE WORKS .................................................................................................................................82
REFERENCES ............................................................................................................................................83
AUTHOR’S PUBLICATIONS...................................................................................................................85
APPENDIX A ..............................................................................................................................................86
APPENDIX B...............................................................................................................................................95

SUMMARY

Objective knowledge about a process is essential for process monitoring, optimization,

identification and other general management planning. Since measurements of process states always contain some type of error, it is necessary to correct these measurement data to obtain more accurate information about the process. Data reconciliation is such an error-correction procedure: it utilizes estimation theory and the conservation laws within the process to improve the accuracy of the measurement data and to estimate the values of unmeasured variables, such that reliable and complete information about the process is obtained.

Conventional data reconciliation, and other procedures that involve estimation, have

relied on the assumption that observation errors are normally distributed. The inevitable

presence of gross errors and outliers violates this assumption. In addition, the actual

underlying distribution is not known exactly and may not be normal. Various robust

approaches such as the M-estimators have been proposed, but most assume, a priori, yet other forms of distribution, albeit with thicker tails than the normal distribution's, in order to suppress gross errors / outliers. To address the issue of the suitability of the assumed distribution to the actual one, a posteriori estimation of the actual distribution, based on non-parametric methods such as kernel, wavelet and elliptical basis functions, has been proposed. However, these fully adaptive methods are complex and computationally demanding. An alternative is to strike a balance between the simplicity of the parametric approach and the flexibility of the non-parametric approach, i.e. by adopting a generalized objective function that covers a wide variety of distributions. The parameters of the generalized distribution can be estimated a posteriori to ensure its suitability to the data.

This thesis proposes the use of a generalized distribution, namely the Generalized T (GT)

distribution, in the joint estimation of process states and model parameters. The desirable

properties of the GT-based estimator are its robustness, simplicity, flexibility and

efficiency for the wide range of commonly encountered distributions (including Box-Tiao

and t-distributions) that belong to the GT distribution family. To achieve estimation

efficiency, the parameters of the GT distribution are adapted from the data through

preliminary estimation. The strategy is applied to data from both the virtual version and a

trial run of a chemical engineering pilot plant. The results confirm the robustness and

efficiency of the estimator.

LIST OF TABLES

Table 5.1. MSE of Measurements..................................................................................... 57


Table 5.2. Estimated parameters of the partially adaptive estimators used to generate
Figure 5.5 .................................................................................................................. 63
Table 5.3. MSE of Measurements with Outliers............................................................... 64
Table 5.4. Reconciled Data and Estimated Parameter Values Using Different DRPE
Methods..................................................................................................................... 78

LIST OF FIGURES

Figure 2.1. Plots of Influence Function for Weighted Least Square Estimator (dashed
line) and the Robust Estimator based on Contaminated Normal Distribution (solid line)21
Figure 2.2 Partially Adaptive Estimation Scheme............................................................ 24
Figure 3.1. Plot of GT density functions for various settings of distribution parameters p
and q.......................................................................................................................... 27
Figure 3.2. GT Distribution Family Tree, Depicting the Relationships among Some
Special Cases of the GT Distribution........................................................................ 28
Figure 3.3. Plots of Influence Function for GT-based Estimator with different parameter
settings ...................................................................................................................... 30
Figure 3.4. General Algorithm for Joint DRPE using partially adaptive GT-based
estimator.................................................................................................................... 33
Figure 4.1. Flow Diagram of the General Purpose Plant for Application Case Study ..... 37
Figure 4.2. Simulink Model of the General Purpose Plant in Figure 4.1. ........................ 38
Figure 4.3. Configuration of the General Purpose Plant for Trial Run............................. 39
Figure 4.4. Reactor 1 Configuration Details and Measurements...................................... 39
Figure 4.5. Reactor 2 Configuration Details and Measurements...................................... 40
Figure 5.1. MSE Comparison of GT-based with Weighted Least Squares and
Contaminated Normal Estimators............................................................................. 57
Figure 5.2. Percentage of Relative MSE........................................................................... 58
Figure 5.3. Comparison of the Accuracy of Estimates for the overall heat transfer
coefficient of Reactor 1 cooling coil......................................................................... 60
Figure 5.4. Comparison of the Accuracy of Estimates for the overall heat transfer
coefficient of Reactor 2 cooling coil......................................................................... 61
Figure 5.5. Adaptation to Data: Fitting the Relative Frequency of Residuals with GT,
Contaminated Normal, and Normal distributions..................................................... 62
Figure 5.6. MSE of Variable Estimates for Data with Outliers ........................................ 66
Figure 5.7. Comparison of MSE with and without outliers for GT and Contaminated
Normal Estimators .................................................................................................... 66
Figure 5.8. MSE Results of WLS, Contaminated Normal and GT-based estimators for
Different Data Sizes.................................................................................................. 68
Figure 5.9. Improvement in MSE Efficiency when Data Size is increased...................... 70
Figure 5.10. Iterative Joint DRPE with Preliminary Estimation ...................................... 72
Figure 5.11. Final MSE Comparison for GT-based DRPE method Using GT, Median and
WLS as preliminary estimators................................................................................. 74
Figure 5.12. MSE throughout iterations ........................................................................... 75
Figure 5.13. Scaled Histogram of Data and Density Plots of GT, Contaminated Normal
and Normal Distributions.......................................................................................... 77

CHAPTER 1: INTRODUCTION

1.1. Motivation

The continuously increasing demand for higher product quality and stricter compliance to

environmental and safety regulations requires the performance of a process to be

continuously improved through process modifications (Romagnoli and Sanchez, 2000).

Decision making associated with these process modifications requires accurate and

objective knowledge of the process state. This knowledge of process state is obtained

from interpretation of data generated by the process control systems. The modern-day

Distributed Control System (DCS) is capable of high-frequency sampling, resulting in

vast amounts of data to be interpreted, be it for the purpose of process monitoring,

optimization or other general management planning. Since measurement data always

contain some type of error, it is necessary to correct their values in order to obtain

accurate information about the process.

Data reconciliation (DR) is such an error-correction procedure that improves the accuracy

of measurement data, and estimates the values of unmeasured variables, such that reliable

and complete information about the process is obtained. It makes use of conservation

equations and other system/model equations to correct the measurement data, i.e. by

adjusting the measurements such that the adjusted data are consistent with respect to the

equations. The conventional data reconciliation approach is the least squares

minimization, whereby the (squared) adjustments to the measurements are minimized, while at the same time constraining the adjusted measurements to satisfy the system/model equations. The least squares method is simple and reasonably efficient; in fact, it is the best linear unbiased estimator, the most efficient in terms of minimum variance, and also the maximum likelihood estimator when the measurement errors are distributed according to the Normal (Gaussian) distribution.

However, measurement error is made up of random and gross errors. Gross errors are

often present in the measurements and these large deviations are not accounted for in the

normal distribution. In this case, the least squares method can produce heavily biased

estimates. Attempts to deal with gross errors can be grouped in two classes. The first

includes methods that still keep the least squares approach, but incorporate additional

statistical tests to the residuals of either the constraints (which can be done pre-

reconciliation) or the measurements (which must be done post-reconciliation). The drawback of these approaches is that a separate gross-error processing step is needed. Also, and most importantly, normality is still assumed for the data, while the data may not be best represented by the Normal distribution. Furthermore, the statistical tests are theoretically valid only for linear system/model equations, which is a rather severe restriction in chemical processes where most relationships are nonlinear.

The second class of gross-error handling approaches comprises the more recent

approaches to suppress gross error, i.e. by making use of the so-called robust estimators.

These estimators can suppress gross error while performing reconciliation, so there is no

need for a separate procedure to remove the gross errors. Most of these approaches are

based on the concept of statistical robustness, and they can be further grouped as either parametric or non-parametric approaches. The parametric approach either represents the

data with a certain distribution that has thicker tails to account for gross errors, or uses a

certain form of estimator that does not assume normality and gives small weights for

largely deviating observations. The non-parametric group consists of estimators that do

not assume any fixed form of distributions, but adjust their forms to the data distribution

through non-parametric density estimation instead. The resulting estimator will be

efficient as it is fitted to the data. However, these fully flexible estimators are sensitive to the data size available for the preliminary fitting and often do not perform well for the data sizes encountered in practice (Butler et al., 1990).

A strategy is proposed to improve the efficiency of the parametric estimation by allowing

the parameters of the estimator to vary to suit the data. This is called partially adaptive

estimation. In this thesis, a robust partially adaptive data reconciliation procedure using

the generalized-T (GT) distribution will be studied and applied to the virtual version of a

chemical engineering pilot plant. The strategy is extended to the joint DRPE, which gives

both parameter and variable estimates that are consistent with the system/model

equations.

1.2. Contribution

In this thesis, a robust and efficient strategy for joint data reconciliation and parameter

estimation is studied. The strategy makes use of the generalized T (GT) distribution, a

robust and versatile general distribution family originally proposed in the field of statistics by McDonald and Newey (1988). The GT distribution was first used in data reconciliation by Wang and Romagnoli (2003) in a comparison case study.

In the present work, the strategy is extended to incorporate parameter estimation in the

joint data reconciliation and parameter estimation scheme. The properties of the GT-

based partially adaptive estimators are comprehensively studied through various

simulation cases.

A comprehensive literature review of the data reconciliation and joint data reconciliation

– parameter estimation, and the technical aspects associated with them is conducted.

As an application case study, the virtual version of a real lab-scale general-purpose

chemical plant is developed in Matlab/Simulink. Besides simulation studies, steady-state

experimental data are also obtained from a trial run of the pilot plant. Full system

decomposition based on formal transformation methods and symbolic manipulation is

then conducted on the plant to facilitate accurate and complete estimation of the process

states and parameters by the joint data reconciliation – parameter estimation procedure.

1.3. Thesis Organization

This thesis is organized as follows. The theory on relevant topics in data reconciliation

and parameter estimation, along with existing works in the literature, is given in Chapter

2. Chapter 3 starts with an introduction to the Generalized T (GT) distribution, and goes

on to describe the proposed GT-based joint data reconciliation – parameter estimation

strategy. Chapter 4 describes the application case study and gives an overview of some of

the settings used in the case studies in Chapter 5, where the results of the case studies are

presented and discussed in detail. The thesis is then concluded with Chapter 6.

CHAPTER 2: THEORY & LITERATURE REVIEW

2.1. Introduction

Data reconciliation aims to improve the accuracy of measurement data by enforcing data consistency. It uses estimation theory and subjects the optimization to model balance

equations. Two important estimation criteria are robustness and efficiency. In this

chapter, an introduction to data reconciliation is provided along with its important

aspects, including the incorporation of robustness and variable classification. Relevant to

the estimation criteria, the concept of robustness in statistical sense and an approach to

improve estimation efficiency is presented. A section is also devoted to joint data

reconciliation and parameter estimation, a strategy that combines the two estimation

procedures to simultaneously obtain consistent variable and parameter estimates.

2.2. Data Reconciliation (DR)

Measurements always contain some form of errors. Errors can be classified into random

error and gross error. Random errors are caused by natural fluctuations and variability

inherent in the process; they occur randomly and are typically small in magnitude. Gross

errors, on the other hand, are large in magnitude but occur less frequently; their

occurrences can be attributed to incorrect calibration or malfunction of instruments,

process leaks, and other unnatural causes.

In order to obtain objective knowledge of the actual state of the process, accurate data

must be used, requiring erroneous measurements to be corrected before use. Data reconciliation is such an error-correction technique that makes use of simple, well-known

and indubitable process relationships that should be satisfied regardless of the

measurement accuracy, i.e. the multicomponent mass and energy balances (Romagnoli

and Sanchez, 2000).

The presence of errors in the measurement of process variables gives rise to discrepancies

in the mass and energy balances. Data reconciliation adjusts or reconciles the

measurements to obtain estimates of the corresponding process variables that are more

accurate and consistent with the process mass and energy balances. The adjustment of

measurements is such that certain optimality regarding the characteristic of the error is

achieved. Mathematically, the general data reconciliation translates into the following

constrained optimization problem:

(reconciled variables) = arg min (optimality criteria)

subject to

(mass and energy balances; variable bounds)

To illustrate more clearly, the data reconciliation problem using the weighted least

squares method will be formulated in the following.

Denote as

y, an (m×1) vector of measurements;
x, an (m×1) vector of corresponding true values of the variables with measurements y;
u, a (p×1) vector of unmeasured variables;
and g(x, u) = 0, the multicomponent mass and energy balance equations of the process; the data reconciliation problem for the weighted least squares estimator can then be formulated as:

$$\min_{x,\,u}\ (y-x)^T \Psi^{-1} (y-x)$$
$$\text{s.t.}\quad g(x,u)=0,\qquad x_L \le x \le x_U,\qquad u_L \le u \le u_U \tag{2.1}$$

where $\Psi$ is the measurement error covariance matrix, $x_L$ and $u_L$ are the lower bounds on $x$ and $u$, respectively, and $x_U$ and $u_U$ the upper bounds on $x$ and $u$, respectively.
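To make the formulation concrete, the following is a minimal sketch of problem (2.1) for a toy splitter balance, solved with a general-purpose SQP routine; the stream values, noise levels and the single balance equation are hypothetical, chosen only for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Toy balance: stream 1 + stream 2 = stream 3 (hypothetical values).
y = np.array([10.2, 5.1, 14.6])                        # measurements
Psi_inv = np.diag(1.0 / np.array([0.2, 0.1, 0.3])**2)  # inverse error covariance

def objective(x):
    r = y - x
    return r @ Psi_inv @ r                             # (y - x)^T Psi^-1 (y - x)

cons = [{"type": "eq", "fun": lambda x: x[0] + x[1] - x[2]}]  # g(x) = 0
bounds = [(0.0, None)] * 3                             # flows kept non-negative

res = minimize(objective, x0=y, method="SLSQP", constraints=cons, bounds=bounds)
print("reconciled flows:", res.x)                      # consistent with the balance
```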

Three features are observed from the above formulation:

(1) Firstly, the objective function of the optimization is the square of the adjustments

made to the measurements y, weighted by the inverse of the error covariance matrix.

This corresponds to the weighted least square estimator used in the problem. The

objective function of the data reconciliation optimization problem is in fact

determined by the estimator applied to the problem. The choice of estimator is, in

turn, usually dependent on the assumption regarding the error characteristics. For

example, the use of weighted least square reflects the assumption that the error is

small relative to its standard deviation such that the measurement must lie somewhere

within very few standard deviations from the true value of the variable. In fact, if the

weighted least squares is chosen based on maximum likelihood considerations, the error is assumed to follow a multivariate normal distribution with mean zero and covariance $\Psi$. To demonstrate this, consider the likelihood function of the multivariate normal distribution
$$f(\varepsilon) = (2\pi)^{-m/2} \det(\Psi)^{-1/2} \exp\!\left(-\tfrac{1}{2}\,\varepsilon^T \Psi^{-1} \varepsilon\right), \tag{2.2}$$
where $\varepsilon = y - x$; the maximum of $f(\varepsilon)$ is obtained by minimizing $\varepsilon^T \Psi^{-1} \varepsilon$, i.e. the weighted least squares of the adjustments. As will be discussed in Section 2.4 on

robustness, the adequacy of the assumption regarding the error and hence, the choice

of estimator plays an important role in ensuring the accuracy of the reconciled data in

all situations.

(2) Secondly, the constraint g(x,u) comprises the mass and energy balances of the

process. Together with the form of the objective function, the constraint equation

determines the difficulty of the DR optimization. If only total mass balances are

considered and the weighted least square objective function is used, the optimization

problem will have a quadratic objective function with linear constraints. This kind of optimization can be solved analytically, i.e. a closed-form solution can be obtained. However, as using merely mass balances limits the reconciliation to only flow measurements, component and energy balances are usually also considered. This results in a non-linear optimization problem for which an analytical solution usually does not exist. Several optimization methods have been proposed for this case, including the QR

orthogonal factorization for bilinear systems (Crowe, 1986), successive linearization

(Swartz, 1989), and nonlinear numerical optimization methods such as sequential

quadratic programming (Gill et al., 1986; Tjoa and Biegler, 1991). In this thesis, the

sequential quadratic programming (SQP) is used, as it is not restricted to linear or

bilinear systems, and is flexible in terms of the form of objective function. Although

convexity is essential to guarantee convergence to the true optimum, the algorithm

also converges to satisfactory solutions for many non-convex problems, as will be

demonstrated by the estimation results in Chapter 5.

(3) Thirdly, the optimization problem involves measured and unmeasured variables, x

and u, respectively. A very important procedure must be carried out before formulating the

data reconciliation problem. This procedure is called variable classification. Given the

knowledge of which variables are measured, the balance equations can be analysed to

identify which measured variables are redundant or non-redundant, and which

unmeasured variables are determinable or undeterminable. The value of a

determinable variable can be determined from the value of measured variables

through the model equations, whereas an undeterminable variable is not involved in

such equations, and hence its value cannot be determined from the values of other

variables. A redundant variable is a variable whose value can still be determined from

measurements of other variables through the model equations, even if its own

measurement is deleted. On the contrary, if the measurement of a non-redundant

variable is deleted, its value becomes undeterminable.

In the actual optimization step of data reconciliation, the decision variables will

consist of only the measured variables, as adjustments can only be made to

measurements. The calculation of determinable unmeasured variables is performed

using the reconciled values of the measured variables, and hence, this calculation can

only be carried out after the optimization is completed. Therefore it can be said that

the problem in equation (2.1) is decomposed into two steps, the optimization to obtain

reconciled measurements, and the calculation of determinable variables.

Various methods have been proposed for this variable classification / problem

decomposition (Romagnoli and Stephanopoulos, 1980; Sanchez et al., 1992; Crowe,

1989; Joris and Kalitventzeff, 1987). However, any formal method is restricted to

linear or linearized model equations.
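For the linear(ized) case, the classification rules above reduce to simple linear algebra. The sketch below follows the projection idea underlying the cited formal methods; the constraint matrices and the exact tests are this sketch's own illustrative assumptions, not a reproduction of any one cited algorithm.

```python
# Minimal sketch of linear variable classification, assuming constraints of
# the form A_x x + A_u u = 0; the matrices below are illustrative only.
import numpy as np
from scipy.linalg import null_space

A_x = np.array([[1.0, -1.0,  0.0],
                [0.0,  1.0, -1.0]])      # columns: measured variables
A_u = np.array([[1.0],
                [0.0]])                  # columns: unmeasured variables

# u_j is determinable iff every null-space vector of A_u has zero j-th entry.
N = null_space(A_u)
determinable = [np.allclose(N[j, :], 0.0) for j in range(A_u.shape[1])]

# Rows of Q2 span the left null space of A_u; projecting the constraints with
# Q2^T eliminates u and yields the reduced (redundancy) equations in the
# measured variables only. A measured x_j appearing in none of them is
# non-redundant.
Q, _ = np.linalg.qr(A_u, mode="complete")
Q2 = Q[:, np.linalg.matrix_rank(A_u):]
G = Q2.T @ A_x
redundant = [not np.allclose(G[:, j], 0.0) for j in range(A_x.shape[1])]

print("unmeasured determinable:", determinable)   # [True]
print("measured redundant:     ", redundant)      # [False, True, True]
```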

2.3. Joint Data Reconciliation – Parameter Estimation (DRPE)

Parameters of a process are often not known and have to be estimated from the

measurements of the process variables. These estimated parameters are often important

for design, evaluation, optimization and control of the process. As measurements of

process variables are corrupted by errors, the measurements are usually reconciled first

before being used for parameter estimation. This results in two separate estimations with

two sets of variable estimates, i.e. the reconciled data satisfying the process constraints

and the data fitted with the model parameter estimate. It is most likely that these two sets

of data are not similar, albeit representing the same physical quantities. In this thesis, the

two estimation steps corresponding to data reconciliation and parameter estimation are

merged into a single joint data reconciliation – parameter estimation (DRPE) step.

The problem formulation, taking the weighted least squares objective function as an

example, is as follows.

$$\min_{x,\,u,\,\theta}\ (y-x)^T \Psi^{-1} (y-x)$$
$$\text{s.t.}\quad g(x,u,\theta)=0,\qquad x_L \le x \le x_U,\qquad u_L \le u \le u_U,\qquad \theta_L \le \theta \le \theta_U \tag{2.3}$$

Here $\theta$ is the vector of model parameters to be estimated, while the meanings of the other symbols are as in equation (2.1). It should, however, be noted that the vector of measurements y may now contain non-redundant measured variables, which are involved in the equations for estimating $\theta$ that are now included in g. Because all measurements, both redundant and non-redundant, are subject to the constraints, which now include the data reconciliation process constraints and the parameter estimation model equations, the resulting estimates of both variables and model parameters are consistent with the whole set of constraints.
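As a hedged illustration of formulation (2.3), the sketch below reconciles two replicate measurement sets of three variables that share one unknown parameter $\theta$ through a hypothetical model equation $x_3 = \theta(x_1 - x_2)$; all numbers are made up.

```python
import numpy as np
from scipy.optimize import minimize

# Two replicate steady-state measurement sets of (x1, x2, x3) sharing one
# unknown parameter theta through x3 = theta * (x1 - x2); values hypothetical.
Y = np.array([[80.0, 60.3, 41.1],
              [79.5, 59.8, 40.2]])
w = 1.0 / np.array([0.5, 0.5, 1.0]) ** 2          # diagonal Psi^{-1}

def objective(z):
    X = z[:6].reshape(2, 3)                       # theta = z[6] does not enter
    return np.sum(w * (Y - X) ** 2)               # the objective, only g

def g(z):
    X, theta = z[:6].reshape(2, 3), z[6]
    return X[:, 2] - theta * (X[:, 0] - X[:, 1])  # one balance per data set

res = minimize(objective, x0=np.append(Y.ravel(), 2.0), method="SLSQP",
               constraints=[{"type": "eq", "fun": g}],
               bounds=[(None, None)] * 6 + [(0.0, 10.0)])   # theta bounds
print("reconciled X:", res.x[:6].reshape(2, 3))
print("estimated theta:", res.x[6])
```

Because both replicate sets must satisfy their balance with a common $\theta$, the problem is genuinely overdetermined and the measurements are actually adjusted.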

The joint data reconciliation – parameter estimation is also a general formulation of the

error-in-variables method (EVM) in parameter estimation, where there is no distinction

between independent and dependent variables and all variables are subject to

measurement errors. Main aspects of the joint DRPE include the general algorithm for the

solution strategy, the optimization strategy and the robustness of the estimation. The optimization and robustness issues are similar to those in data reconciliation and therefore will not be discussed here; the optimization strategy is discussed in Section 2.2, while the estimation robustness is discussed in Section 2.4.

In the error-in-variables method (EVM), the need for an efficient general algorithm for the

solution strategy arises due to the large optimization problem that results from the

aggregation of independent and dependent variables in the estimation. From the point of

view of data reconciliation, the computation complexity is due to the addition of non-

redundant variables and the unknown model parameters to be estimated. The general

algorithm can be distinguished into three main approaches (Romagnoli and Sanchez,

2000):

(1) Simultaneous solution methods

This is the most straightforward approach, i.e. solving the joint estimation of

variables and model parameters simultaneously. This approach relies on an efficient optimization method that is able to handle large-scale problems involving large numbers of decision variables: with p model parameters to be estimated and N sets of measurements of m variables, the number of decision variables will be (p + Nm).

(2) Nested EVM

Reilly and Patino-Leal (1981) were the first to propose the idea of nested EVM, i.e.

decoupling the parameter estimation problem from the data reconciliation problem,

where the data reconciliation problem is optimized at each iteration of the parameter

estimation problem. While they used successive linearization for the constraints, Kim et al. (1990) later replaced the linearization with the more general non-linear programming. The algorithm due to Kim et al. is as follows (Romagnoli and Sanchez,

2000):

Step 1:
At the first iteration, set $x = y$ and $\theta = \theta_0$ (the initial guess of the parameter).

Step 2:
Find the minimum of the function for $x$ and $\theta$ by solving the bilevel problem
$$\min_{\theta}\ \big(y - x^*(\theta)\big)^T \Psi^{-1} \big(y - x^*(\theta)\big), \qquad \theta_L \le \theta \le \theta_U,$$
where $x^*(\theta)$ solves the inner data reconciliation problem
$$\min_{x}\ (y-x)^T \Psi^{-1} (y-x) \quad \text{s.t.}\quad g(x,\theta)=0,\quad x_L \le x \le x_U.$$

The optimization size is reduced to the same order as the number of parameters to be estimated; a minimal sketch of this nested scheme is given after this list.

(3) Two-stage EVM

Valko and Vajda (1987) proposed an algorithm where the data reconciliation and parameter estimation steps are essentially also decoupled. By linearizing the constraints and solving the weighted least squares optimization analytically, they utilize the analytical solution to manipulate the problem such that it can be re-

formulated into two optimization stages: the first stage to optimize for the model

parameters, keeping the variable estimates fixed at the results from the previous iteration,

and the second to optimize for the variable estimates, keeping the parameter values

fixed at the values obtained from the preceding step. The resulting algorithm is

compact yet flexible. However, the ability to decouple the problem into two stages

depends heavily on whether the optimization problem can be solved analytically.

Therefore, it is restricted to only a few very simple estimators such as the weighted

least squares.
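As promised under the nested EVM above, here is a minimal sketch of the decoupling idea, reusing the hypothetical two-replicate example from earlier: the inner problem reconciles x for a fixed $\theta$, and the outer search runs over $\theta$ alone, so its dimension equals the number of parameters. The use of a bounded scalar search for the single-parameter case is this sketch's own choice.

```python
import numpy as np
from scipy.optimize import minimize, minimize_scalar

# Same hypothetical model as before: x3 = theta * (x1 - x2), two replicate sets.
Y = np.array([[80.0, 60.3, 41.1],
              [79.5, 59.8, 40.2]])
w = 1.0 / np.array([0.5, 0.5, 1.0]) ** 2          # diagonal Psi^{-1}

def inner(theta):
    """Inner data reconciliation at fixed theta; returns the optimal WLS cost."""
    def obj(xf):
        return np.sum(w * (Y - xf.reshape(2, 3)) ** 2)
    def g(xf):
        X = xf.reshape(2, 3)
        return X[:, 2] - theta * (X[:, 0] - X[:, 1])
    return minimize(obj, Y.ravel(), method="SLSQP",
                    constraints=[{"type": "eq", "fun": g}]).fun

# Outer problem over theta only (dimension = number of parameters).
outer = minimize_scalar(inner, bounds=(0.0, 10.0), method="bounded")
print("theta via nested EVM:", outer.x)
```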

2.4. Robust Estimation

The conventional and most prevalent form of estimator is the weighted least squares

formulation. It has been shown in Section 2.2 that if maximum likelihood estimation is considered, the weighted least squares estimates are the maximum likelihood estimates

when the measurement errors follow the multivariate normal (Gaussian) distribution in

equation (2.2). However, the normality assumption is rather restrictive; it assumes that a

measurement will lie within a small range around the true variable value, that is, the error

consists of natural variability of the measurement process. The presence of gross errors,

whose magnitudes are considerably large compared to the standard deviation of the

assumed normal distribution, presents the risk of the weighted least squares estimates

becoming heavily biased. Besides, the natural variability of the measurement process

may also be better characterized by distributions other than normal.

The usual approach to deal with departure from normality is by detecting the presence of

gross errors through statistical tests. Some typical gross error detection schemes are the

global test, nodal test, and measurement test (Serth and Heenan, 1986; Narasimhan and

Mah, 1987). The global and nodal tests are based on residuals of model constraints, while

the measurement test is based on Neyman-Pearson hypothesis testing using the residuals

between the measurements and the estimates (Wang, 2003). The limitations of these

complementary tests are as follows. Firstly, the statistical tests are still based on the

assumption that the error is normally distributed; they detect deviations from the normal distribution. When the empirical character of the error differs significantly from the normal distribution, the results of the statistical tests might be very misleading. The second limitation is the restriction of the tests to linear or linearized constraint equations.

In a practical setting, the model equations are usually nonlinear and linearization will

introduce some approximation errors that will confound the statistical gross error tests.

An alternative approach to deal with gross errors is to reformulate the objective function

to take into account the presence of gross errors from the beginning, such that gross

errors can be suppressed while performing the estimation. In this case, there is no need

for a separate procedure such as the previously mentioned statistical tests to recover from

gross errors. A seminal work in this direction is the maximum likelihood estimator based on the contaminated normal density, proposed by Tjoa and Biegler

(1991). Instead of assuming purely normal error distribution, Tjoa and Biegler combined

two normal distributions: a narrow Gaussian with the same standard deviation as that of

the weighted least squares method, and a wide Gaussian with considerably larger

standard deviation to represent gross errors. The density function of the contaminated

normal distribution can be expressed as

$$f(u) = (1-p)\,\frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{u^2}{2\sigma^2}\right) + p\,\frac{1}{\sqrt{2\pi}\,b\sigma}\exp\!\left(-\frac{u^2}{2b^2\sigma^2}\right) \tag{2.4}$$

where u is the measurement residual, p the probability of gross errors, b the ratio of the

standard deviation of the wide Gaussian to the narrow one, and σ the standard deviation

of the narrow Gaussian. As illustrated in Figure 2.1, this distribution has heavier tails

than the uncontaminated normal distribution, which means that it recognizes the

possibility of gross error occurring (i.e. with a probability of p as in equation 2.4). In their

paper, Tjoa and Biegler showed that the estimator is able to detect gross errors and to

recover from them in most of the cases studied.
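A small sketch of the density (2.4) and the corresponding M-estimator objective $\rho(u) = -\log f(u)$ follows; the parameter values are illustrative only.

```python
import numpy as np

def f_cn(u, sigma=1.0, p=0.05, b=10.0):
    """Contaminated normal density of equation (2.4)."""
    narrow = np.exp(-u ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)
    wide = np.exp(-u ** 2 / (2 * b ** 2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * b * sigma)
    return (1 - p) * narrow + p * wide

def rho_cn(u, **kw):
    return -np.log(f_cn(u, **kw))   # objective used in place of the WLS square

print(rho_cn(np.array([0.5, 5.0, 50.0])))   # grows slowly for large residuals
```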

To study the robustness of an estimator, a unifying theoretical framework has been

proposed by Huber (1981) and Hampel et al. (1986). The analysis based on Hampel et al.’s

influence function (IF) will be adopted here. To simplify presentation, the derivation of

formulae will be omitted here; details can be found in Hampel et al (1986). The influence

function aims to describe the behaviour of an estimator in the neighbourhood of the

parametric distribution assumed by the estimator. If the residual u is drawn from a

distribution with density f(u), and if T[f(u)] is the unbiased estimate corresponding to u,

then the influence function of a residual u0 is given by

$$\mathrm{IF} = \psi(u_0) = \lim_{t \to 0} \frac{T[(1-t)f + t\,\delta(u-u_0)] - T[f]}{t} \tag{2.5}$$

where $\delta(u - u_0)$ is the delta function centred about $u_0$. The following properties are

desirable for the influence function (McDonald and Newey, 1988):

(1) The influence function should be bounded, so that no single residual can dominate and distort the estimation.

(2) The influence function should be descending to very small values as the residuals

get large, so that a single large deviating residual has negligible effect on the

estimation.

(3) The influence function should be continuous in the residuals, such that grouping

or rounding of data has minimal effect on the estimation.

The robustness of the contaminated normal estimator above and a few other robust

estimators will be studied using the influence function in the following discussion. Before

that, however, it is appropriate to introduce the class of estimators called M-estimators.

M-estimators are a slight generalization of maximum likelihood estimators proposed by

Huber (1964). The maximum likelihood estimators have the following objective function

form:

$$\min_{x,\theta}\ -\sum \log f(u)$$

where x and θ are the variables and model parameters to be estimated, u the residuals,

and f (u ) the density function of u. This is also equivalent to solving

$$\sum f'/f = 0,$$
where $f' = \partial f / \partial u$. Huber generalized the estimation problem to

$$\min_{x,\theta}\ \sum \rho(u),$$
or equivalently,
$$\sum \psi(u) = 0,$$
where $\rho$ and $\psi$ are not necessarily of the form $-\log f(u)$ and $-f'/f$, respectively.

The reason for introducing M-estimators here is that most of the robust estimators that

have been proposed in the literature, including those to be discussed here, fall

under this class. Furthermore, the influence function of the M-estimators is proportional

to the first derivative of the objective function, i.e.

$$\mathrm{IF} \propto \partial\rho/\partial u = \psi(u)$$

or for maximum likelihood estimators:

$$\mathrm{IF} \propto -\,\partial \ln f(u)/\partial u\,.$$

In the following discussions, the terms influence function and $\psi$ function will be used interchangeably.

Having introduced the concept of influence function and shown the simple derivation of

the influence function for M-estimators, the robustness of several M-estimators can now

be analysed. It is useful to first examine the influence function of the weighted least

squares estimator, to explain why it is not robust to outliers and to compare it with robust

estimators. Since $\rho = -\log f(u)$, the $\rho$ function for the weighted least squares estimator is $\rho = u^T \Psi^{-1} u$. For simplicity but without loss of generality, let $u$ be a scalar and $\Psi = \sigma$, so that $\rho = u^2/\sigma$. Taking the derivative of $\rho$, the influence function will be proportional to the magnitude of the residual, i.e.
$$\mathrm{IF} \propto \partial\!\left(\frac{u^2}{\sigma}\right)\!\Big/\partial u = \frac{2}{\sigma}\,u.$$

Consequently, a large outlying measurement will have a proportionally large influence on the estimation; this means that a gross error can bias the final estimates. The dashed line in Figure 2.1 is the plot of this influence function; it is seen that the function is not bounded and increases without limit.

For comparison, the influence function of the contaminated normal estimator is also

plotted in Figure 2.1. This function can be expressed as

$$\mathrm{IF} \propto \frac{(1-p)\exp\!\left(-\dfrac{u^2}{2\sigma^2}\right) + \dfrac{p}{b^3}\exp\!\left(-\dfrac{u^2}{2b^2\sigma^2}\right)}{(1-p)\exp\!\left(-\dfrac{u^2}{2\sigma^2}\right) + \dfrac{p}{b}\exp\!\left(-\dfrac{u^2}{2b^2\sigma^2}\right)}\; u = w(u)\,u = \psi(u)$$

For very large values of $u$, the weighting function $w(u)$ can be approximated by $w(u) \approx 1/b^2$, whereas for small values of $u$, $w(u) \approx 1$. This means that large residuals are suppressed by weighting them down, while for small residuals the estimator behaves like a weighted least squares estimator, as can also be observed from Figure 2.1.
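These two limits are easy to verify numerically; a quick sketch with illustrative parameter values:

```python
import numpy as np

def w_cn(u, sigma=1.0, p=0.05, b=10.0):
    """Weighting function of the contaminated normal influence function."""
    e1 = np.exp(-u ** 2 / (2 * sigma ** 2))
    e2 = np.exp(-u ** 2 / (2 * b ** 2 * sigma ** 2))
    return ((1 - p) * e1 + (p / b ** 3) * e2) / ((1 - p) * e1 + (p / b) * e2)

print(w_cn(0.1), w_cn(30.0), 1 / 10.0 ** 2)   # ~1, ~0.01, 0.01
```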

[Figure: influence function plots, IF value vs. normalised residual $u \in [-5, 5]$.]
Figure 2.1. Plots of Influence Function for Weighted Least Square Estimator (dashed
line) and the Robust Estimator based on Contaminated Normal Distribution (solid
line)

2.5. Partially Adaptive Estimation

Besides robustness, an important goal of any estimation procedure is the estimation

efficiency. Fully adaptive estimators are, from a theoretical point of view, ideal in that

respect. The concept of adaptive estimation was proposed by Stein in 1956, who

suggested the incorporation of preliminary testing of data to determine the most likely

distribution that they are drawn from, and the use of a set of estimators corresponding to

the set of likely distributions. The idea is to use the estimator that is best suited for the

data. This is desirable in practice for obvious reasons. However, Bickel (1982) states in

his seminal work on adaptive estimation:

‘The difficulty of nonparametric estimation of score functions suggests that a more practical goal is partial adaptation, the construction of estimates which are (i) always $\sqrt{n}$-consistent, and (ii) efficient over a large parametric subfamily of F [the space of distributions]. Our results indicate that . . . this goal should be achievable by using a one-step Newton approximation to the maximum likelihood estimate for the parametric subfamily by starting with an estimate which is $\sqrt{n}$-consistent for all of F.’

In other words, partially adaptive estimators can be constructed by adopting a family of

distributions that have a general form which depends on a number of parameters, and

each member of the family can be derived by a particular set of values of the parameters.

For example, Potscher and Prucha (1986) used a family of t-distributions, and McDonald

and Newey (1988) used a generalized t-distribution, the idea of which will be adopted in

this thesis. Adopting the family of distribution is the first step; the most important step is

to then estimate or adapt the distribution parameters to the data, in order to characterize

the data better to improve estimation efficiency. Of course, the partially adaptive

estimators should also be reasonably robust.

A rather similar yet different approach to the partially adaptive concept above is the

tuning of the estimator parameters to the data by means other than statistical criteria. A

good example is the joint data reconciliation and parameter estimation strategy by Arora

and Biegler (2001), who use the Akaike Information Criterion (AIC) to tune the

parameters of their redescending estimator. The difference is that the redescending

estimator is constructed based on robustness criteria (the influence function) and there is no statistical distribution corresponding to the influence function. Therefore, no statement about the efficiency of the estimator for any distribution can be made. However, it is a

practical approach that has been shown in the paper to perform well for all the cases

considered.

The motivation for partially adaptive estimation is to include the information about the

error characteristics in the estimation. This inclusion of prior information also corresponds to the formulation of a posterior density, as in Bayes estimation. As stated above, this inclusion of prior information can be achieved through preliminary testing of the data, followed by selection of the most suitable estimator according to the data characteristics obtained from the preliminary testing. As such, the selected estimator can be said to have been adapted to the data. The general steps of this scheme are illustrated in Figure 2.2 below.

[Figure: flow scheme — data → preliminary testing of data → data characteristics → adaptation (determination of the most suitable estimator) → adapted estimator → final estimation.]

Figure 2.2 Partially Adaptive Estimation Scheme

2.6. Conclusion

It is noticed that three main issues prevail in the general joint data reconciliation –

parameter estimation scheme. The first two are none other than the estimation criteria: it

is essential for the estimator to be robust against outliers, and it is desirable to attain

higher estimation efficiency. The analysis of statistical robustness by means of the

influence function facilitates the identification and construction of robust estimators.

Meanwhile, estimation efficiency can be achieved by partially adaptive estimation where

the estimator is constructed such that it is best suited to the data. The third issue is that of

variable classification. Parameter estimation involves unknown model parameters, which

means that some non-redundant measurements may be present. The identification of non-

redundant measurements is essential as such measurements may bias the estimation if

gross error is present in them.

CHAPTER 3: JOINT DRPE BASED ON THE GENERALIZED T

DISTRIBUTION

3.1. Introduction

The Generalized T (GT) distribution defines a general family of statistical distributions comprising many important special cases such as the power exponential and the t distributions. It has the potential to be adaptive and robust, and hence, in this thesis it is

proposed as an estimator for the joint data reconciliation – parameter estimation strategy.

An introduction to the GT distribution is presented in this chapter, along with its

robustness and adaptiveness properties. At the same time, techniques used in the strategy

proposed in this thesis are also described when the relevant topic is being discussed.

These techniques are then summarized in the last section of this chapter, along with the

outline of the algorithm for the proposed strategy.

3.2. The Generalized T (GT) Distribution

The Generalized T (GT) distribution has the following density function:

p (3.1)
f GT (u;σ , p, q) = q +1 p
; -∞<u <∞
⎛ |u| ⎞ p

2σq1/ p B(1 p , q)⎜⎜1 + ⎟


p ⎟
⎝ qσ ⎠

It is symmetric about zero and unimodal. The density function is characterized by the

distribution parameters { p, q,σ } : p and q determine the shape of the distribution,

while σ determines the scale of the distribution spread. Figure 3.1 plots the density

function for several settings of p and q.

[Figure: four panels of GT density plots, $f_{GT}$ vs. residual $u \in [-6, 6]$, with $\sigma = \sqrt{2}$ throughout.]

Figure 3.1. Plot of GT density functions for various settings of distribution parameters p
and q
(a) various p; q = 0.5; (b) various p; q = 50;
(c) various q; p = 1; (d) various q; p = 5.

From Figure 3.1, the role of the distribution parameters p and q can be observed as follows. The parameter p determines the type of shape of the distribution in the area around its center of symmetry. Depending on this type of shape, the thickness of the distribution tail also varies. For lower values of p, the density around the center of symmetry has a narrower shape (Figure 3.1 a or b), and hence the distribution has thicker tails. For the same type of shape, i.e. for the same value of p, varying q varies the thickness of the tails (Figure 3.1 a and b; Figure 3.1 c or d).

It is also apparent from Figure 3.1 that the GT distribution has high flexibility to assume

various distribution shapes. In fact, the Generalized T defines a very general family of

density functions and combines two general forms that include as special cases most of

the stochastic specifications encountered in practice. Some of the more commonly known

special cases, along with the particular values of { p, q,σ } for each case, are depicted in the

GT family tree in Figure 3.2.

[Figure: family tree. From the GT distribution: $q \to \infty$ gives the power exponential (Box-Tiao) distribution; $p = 2$ gives the t-distribution with df $= 2q$. From the power exponential: $p = 1$ gives the double exponential (Laplacian); $p = 2$, $\sigma = \alpha\sqrt{2}$ gives the normal. From the t-distribution: $q = 1/2$ gives the Cauchy; $q \to \infty$ (with $p = 2$, $\sigma = \alpha\sqrt{2}$) also recovers the normal.]

Figure 3.2. GT Distribution Family Tree, Depicting the Relationships among Some Special Cases of the GT Distribution (McDonald and Newey, 1988)
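The density (3.1) and two of the family-tree limits can be checked numerically; the following sketch compares a large-q GT against the normal and Laplacian densities (the tolerances and the choice $q = 10^6$ as a stand-in for $q \to \infty$ are this sketch's own).

```python
import numpy as np
from scipy.special import betaln
from scipy.stats import norm, laplace

def f_gt(u, sigma, p, q):
    """GT density of equation (3.1); betaln avoids gamma-function overflow."""
    c = p / (2 * sigma * q ** (1 / p) * np.exp(betaln(1 / p, q)))
    return c / (1 + np.abs(u) ** p / (q * sigma ** p)) ** (q + 1 / p)

u = np.linspace(-3.0, 3.0, 7)
# p = 2, q -> infinity, sigma = alpha * sqrt(2): normal with std alpha (= 1 here)
print(np.allclose(f_gt(u, np.sqrt(2), 2, 1e6), norm.pdf(u), atol=1e-4))  # True
# p = 1, q -> infinity: double exponential (Laplacian)
print(np.allclose(f_gt(u, 1.0, 1, 1e6), laplace.pdf(u), atol=1e-4))      # True
```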

3.3. Robustness of the GT Distribution

The influence function (IF) of the GT-based estimator can be obtained as:

$$\psi_{GT}(u;\sigma,p,q) = \frac{(pq+1)\,\operatorname{sign}(u)\,|u|^{p-1}}{q\sigma^{p} + |u|^{p}} \tag{3.2}$$

Figure 3.3 shows the influence function for several different sets of values of $\{p, q, \sigma\}$; these sets of values correspond to those in Figure 3.1. It shows that the IF is generally bounded and actually descending as the residuals get large. However, it is also observed that as p increases, the IF for large residuals increases, and as q increases, the IF becomes less bounded. In fact, notice that the special case where $q \to \infty$, with $p = 2$, $\sigma = \alpha\sqrt{2}$ (Figure 3.2; $\alpha$ is the standard deviation) is none other than the

normal distribution. To ensure that the GT-based estimator is insensitive to large

residuals, therefore, upper bounds must be imposed on the values that p and q can take.

Finally, it is also noted that for any given plant, the effect of p and q on the influence

function will be the same, because the variations from one plant to another (or from one

measurement to another) will be taken into account by the value of $\sigma$. The expression of the GT distribution (equation 3.1) further affirms this, i.e. the estimation residual (measurement error) is always scaled by $\sigma$.
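A short numerical sketch of (3.2) confirms the bounded, redescending behaviour for a moderate parameter setting (the values used are illustrative):

```python
import numpy as np

def psi_gt(u, sigma, p, q):
    """GT influence function of equation (3.2)."""
    return ((p * q + 1) * np.sign(u) * np.abs(u) ** (p - 1)
            / (q * sigma ** p + np.abs(u) ** p))

u = np.array([0.5, 2.0, 10.0, 100.0])
print(psi_gt(u, np.sqrt(2), 1.25, 0.5))   # values decay as |u| grows
```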

[Figure: four panels of influence function plots, IF vs. residual $u \in [-6, 6]$, for the parameter settings of Figure 3.1.]

Figure 3.3. Plots of Influence Function for GT-based Estimator with different parameter
settings

3.4. Partially Adaptive GT-based Estimator

For the GT-based approach, the distributional parameters { p, q, σ } can be estimated using

the maximum likelihood estimator (McDonald and Newey, 1988):

$$\{\hat{p}, \hat{q}, \hat{\sigma}\} = \arg\max_{\{p,q,\sigma\}} \sum \log f_{GT}(u_0;\, p, q, \sigma) \tag{3.3}$$

where $u_0$ are the residuals from the preliminary estimates. The values of $\{p, q, \sigma\}$ are then obtained as the parameters of the GT member from which the data are most likely sampled.

The GT estimator using such estimated values of { p, q, σ } has been shown to be

asymptotically efficient among all estimators, when the error distribution is within the

GT family. Rigorous mathematical proof can be found in McDonald and Newey (1988).

Nothing can be said about the efficiency in the case of non-GT distributed errors, and

some loss of efficiency is possible. However, since the GT family includes a wide range

of commonly encountered distributions, the fact that it is asymptotically efficient for all

distributions within this wide range makes its application very appealing.

In the current work, the maximum likelihood estimator in equation 3.3 is used to

estimate { p, q, σ } . To obtain the preliminary residuals u0, a preliminary estimation is

necessary. The choice of preliminary estimator is not limited to robust estimators. As will

be shown in Chapter 5, regardless of the robustness of the preliminary estimator, the final

estimates always converge to highly similar values. It is found that the crucial

components in this (robust and) partially adaptive estimation scheme are the use of

iteration and the robustness of the range of GT distributions considered. The iteration

scheme will be discussed in Section 3.5 about the general algorithm. The robustness of

the range of the GT distributions, on the other hand, depends on the range of values that p

and q can assume, i.e. the bounds of p and q. This will also be discussed in Section 3.5.

In this thesis, the median of the data set is taken as the preliminary estimate. Taking the median as the estimate corresponds to the use of a robust L-estimator (Albuquerque and Biegler, 1996; Hampel et al., 1986). It is feasible in the current work as only the steady-state case is considered, i.e. the true values of the variables are assumed to

be constant over the time horizon considered. This method of estimating the distribution

parameters is simple, robust and computationally more convenient, as compared to

performing a full preliminary DRPE to obtain u0. It will therefore give a good initial

estimate in the iterative scheme. Moreover, since the asymptotic distribution of the

estimates depend only on the limit of the distribution parameters, and not on the

particular way by which they are estimated (McDonald and Newey, 1988), this approach

is justifiable. For cases where steady state cannot be assumed, the full preliminary DRPE

must be performed to obtain u0.

3.5. The General Algorithm

Figure 3.4 outlines the general algorithm steps for the joint data reconciliation –

parameter estimation (DRPE) based on the partially adaptive GT-based estimator.

Data → Preliminary Estimation → Preliminary Residuals u0

Adaptation: estimation of the estimator parameters
    {p, q, σ} = arg max_{p,q,σ} log f_GT(u0; p, q, σ)

Final Estimation:
    {params, vars} = arg max f(u) = arg min_{params,vars} −log f_GT(u | p, q, σ)
    s.t. model equations

Figure 3.4. General Algorithm for Joint DRPE using partially adaptive GT-based
estimator

There are three main steps in the algorithm:

1. Preliminary estimation

The preliminary estimation is necessary in the partially adaptive scheme, to provide

the initial residuals from which prior information regarding the data is first obtained.

As discussed in the previous section, the median of the data is used as the preliminary
estimate. Taking the median provides estimates that do not necessarily satisfy the model
equations; however, the constraint residuals are usually small. Moreover, compared to
performing a full data reconciliation – parameter estimation, taking the median in a
steady-state case is much more computationally efficient. The median estimates also run
less risk of being biased by errors in the parameter estimates, because the median
imposes no constraints and is therefore unaffected by constraints that contain unknown
parameters.

2. Adaptation: Estimation of the Estimator Parameters

Using the preliminary residuals obtained from the previous step, the GT parameters
{p, q, σ} are estimated as the maximum likelihood estimates of the GT objective
function (equation 3.3). Bounds on the GT parameter values are imposed as the only
constraints in this optimization; they are set as p ≤ 5, q ≤ 50 and σ ≤ 50. These values
are chosen based on the analysis of the GT influence function, i.e. p and q are the
parameters that result in a reasonably robust GT distribution without sacrificing the
efficiency of the estimation. The meaning of reasonably robust has been discussed in
Section 3.3 regarding the influence function, i.e. upper bounds are necessary to ensure
sufficiently effective suppression of large outliers. Not sacrificing the efficiency of the
estimation, on the other hand, means that the range of values of p and q within these
bounds must sufficiently span the shapes of distributions that might be encountered.
(A MATLAB sketch of this bounded optimization is given after these steps.)

3. Final Estimation: Joint DRPE using the adapted estimator parameters

The joint data reconciliation – parameter estimation is then performed using the
adapted GT estimator, i.e. the GT with the estimated {p, q, σ}. The term “final” in
“final estimation” is used loosely here, only to differentiate this DRPE step from the
other two estimation steps in the framework, i.e. the preliminary estimation and the
estimation of the GT parameters. When an improvement in the efficiency of the
preliminary estimates is considered necessary, iterations can be performed. In this
case, the estimates from the joint DRPE are used to obtain the residuals for the next
estimation of the GT parameters. This will be further discussed and investigated in
Chapter 5.
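To make the adaptation step concrete, the following is a minimal MATLAB sketch of the
bounded maximum likelihood optimization in equation 3.3, using fmincon from the
Optimization Toolbox. The GT density follows McDonald and Newey (1988); the starting
point and the lower bounds are illustrative assumptions, while the upper bounds are those
stated in step 2.

function theta = fit_gt_params(u0)
% Adaptation step: maximum likelihood estimation of {p, q, sigma}
% from the preliminary residuals u0 (equation 3.3).
nll = @(th) -sum(log(gt_pdf(u0, th(1), th(2), th(3))));  % negative log-likelihood
th0 = [2, 10, std(u0)];   % starting point (assumed)
lb  = [1, 0.5, 1e-3];     % lower bounds (assumed)
ub  = [5, 50, 50];        % upper bounds p <= 5, q <= 50, sigma <= 50
theta = fmincon(nll, th0, [], [], [], [], lb, ub);
end

function f = gt_pdf(u, p, q, sigma)
% GT density, following McDonald and Newey (1988)
f = p ./ (2*sigma*q^(1/p)*beta(1/p, q) ...
    .* (1 + abs(u).^p./(q*sigma^p)).^(q + 1/p));
end

The returned theta = {p, q, σ} then defines the adapted GT objective used in the final
estimation step.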

3.6. Conclusion

An estimator based on the Generalized T (GT) distribution is potentially both robust and
efficient. Its robustness stems from the flexibility to adjust the tail weight to produce
thicker-tailed distributions. The flexibility to adjust its shape also makes it potentially efficient, as it can

adapt to data profiles that resemble the GT family of distributions. Furthermore, the GT

family of distributions is wide-ranging and includes many important distributions as

special cases. Considering these beneficial properties, the GT-based estimator for a joint

data reconciliation – parameter estimation strategy is formulated. The algorithm consists

of three main steps: the preliminary estimation and the estimation of the GT parameters

to adapt the estimator to the data, and the joint DRPE itself. The performance of this

strategy is investigated in a case study in Chapter 4, and the results are discussed in

Chapter 5.

CHAPTER 4: CASE STUDY

4.1. Introduction

To study the performance of the GT-based joint data reconciliation – parameter

estimation strategy, it is applied to a general-purpose chemical engineering plant. This

chapter serves to present preparatory information for the studies in Chapter 5. The

description of the plant and its virtual version is given in the first section of this chapter,

followed by the decomposition of the model equations to classify the variables. An

overview of the data generation and the DRPE strategies compared in Chapter 5 is also

given.

4.2. The General-Purpose Chemical Plant

To study the performance of the proposed strategy, a case study of the pilot-scale setting

of a general-purpose plant containing two CSTRs, a mixer and a number of heat

exchangers (Figure 4.1) is performed. Material feed from the feed tank is heated before

being fed to the first reactor and the mixer. The effluent from the first reactor is then

mixed with the material feed in the mixer and fed to the second reactor. The effluent from

the second reactor is, in turn, fed back to the feed tank and the cycle continues. Each

reactor has a steam jacket and a cooling coil for temperature control.


Figure 4.1. Flow Diagram of the General Purpose Plant for Application Case Study

Associated with the pilot-scale plant, a virtual environment that mimics the plant

behaviour has been developed within the Matlab/Simulink framework and will be used to

generate simulation data. Since the equations governing the operations of the reactors,

heat exchangers, mixer and feedtank are nonlinear differential algebraic equations, the

simulation for these units is implemented using Matlab S-functions. There are

altogether 26 equations, consisting of 18 nonlinear differential equations and 8 nonlinear

algebraic equations. The model equations and descriptions can be found in Appendix A,

while the overview diagram of the simulation model is shown in Figure 4.2 below.

[Simulink block diagram of the plant model: REACTOR1, REACTOR2, MIXER, FEEDTANK and
HEAT_EX1–HEAT_EX3 blocks, with their associated flow, level and temperature control loops
(FC1–FC9, ITC1–ITC3), splitters, and steam and cooling-water connections.]

Figure 4.2. Simulink Model of the General Purpose Plant in Figure 4.1.

Besides simulation data, a data set is also collected from the trial run of the real plant.

The configuration for this trial run is shown in Figure 4.3. The simulation data studied in

Chapter 5 are also generated by running the Simulink model according to this

configuration. Appendix B lists the full set of steady-state equations according to this

configuration.

[Plant configuration diagram for the trial run; measured points shown in text boxes
include Vrx1, Trx1, Trx2, Vft, Tft, F2/T2, F5/T5, F7/T7, F9/T9, F12/T12 and T13a, around
HEAT EX 1, HEAT EX 2 and HEAT EX 3.]

Figure 4.3. Configuration of the General Purpose Plant for Trial Run
Measurements are indicated in the text boxes.

[Reactor 1 configuration detail; measured points include Fcin, Tcin, Tc1, T2, Trx1, Vrx1,
F2, T13, T5 and F5.]

Figure 4.4. Reactor 1 Configuration Details and Measurements

[Reactor 2 configuration detail; measured points include Fcin, Tcin, Tc2, T9, Trx2, F9,
Vrx2, T12 and F12.]

Figure 4.5. Reactor 2 Configuration Details and Measurements

4.3. Variables and Measurements

The variables consist of the volume flow rates and temperatures of streams and the
temperatures and levels of vessels; however, as the current study is performed for the
steady-state case, only flow rates and temperatures will be considered. Following the
configuration in Figures 4.3, 4.4 and 4.5, the steady-state model equations involve a total
of 34 variables, consisting of 13 flow variables and 21 temperature variables. Of the 13
flow variables, 7 are measured, while 12 of the 21 temperature variables are measured.
The locations of the measurements are indicated in the text boxes in Figures 4.3, 4.4 and
4.5. The measurements are listed below.

The measured flow rates are:

(i) Feed flow to reactor 1, F2

(ii) Effluent flow from reactor 1, F5

(iii) Feed flow to mixer, before mixing with reactor 1 effluent, F7

(iv) Output flow of mixer / Feed flow to reactor 2, F9

(v) Effluent flow of reactor 2, F12

(vi) Cooling water flow to reactor 1, Fc1

(vii) Cooling water flow to reactor 2, Fc2

The measured temperatures are:

(i) Feed temperature of reactor 1, T2

(ii) Effluent temperature of reactor 1, T5

(iii) Feed temperature to mixer, before mixing with reactor 1 effluent, T7

(iv) Output temperature of mixer / Feed temperature of reactor 2, T9

(v) Effluent temperature of reactor 2, T12

(vi) Reactor 1 vessel temperature, Trx1

(vii) Reactor 2 vessel temperature, Trx2

(viii) Inlet cooling water temperature, Tcin

(ix) Reactor 1 outlet cooling water temperature, Tc1

(x) Reactor 2 outlet cooling water temperature, Tc2

(xi) Temperature of feed tank, Tft

(xii) Temperature of stream out of heat exchanger 3, T13a

The measured levels are:

(i) Reactor 1 level, Vrx1

(ii) Reactor 2 level, Vrx2

(iii) Feed tank level, Vft

The unmeasured flow rates are:

(i) Output flow of feedtank

(ii) Flow of cooling water in heat exchanger 3

(iii) Flow of heating steam into heat exchanger 2

(iv) Flow of condensate out of heat exchanger 2

(v) Flow of heating steam into heat exchanger 1

(vi) Flow of condensate out of heat exchanger 1

The unmeasured temperatures are:

(i) Output temperature of feedtank

(ii) Temperature of cooling water into heat exchanger 3

(iii) Temperature of cooling water out of heat exchanger 3

(iv) Temperature of material into heat exchanger 2

(v) Temperature of steam into heat exchanger 2

(vi) Temperature of condensate out of heat exchanger 2

(vii) Temperature of material into heat exchanger 1

(viii) Temperature of steam into heat exchanger 1

(ix) Temperature of condensate out of heat exchanger 1

The unmeasured level is:

(i) Mixer level

4.4. System Decomposition: Variable Classification

For reasons discussed in Chapter 2, it is beneficial to first simplify the equations to the
reduced form that contains only measured variables, before imposing them as constraints
in the DRPE procedure. Next, it is important to identify the non-redundant measurements
among the measurements included in the reduced equations. There may also be non-redundant
measurements not included in the reduced equations. The importance of identifying
non-redundant measurements is also discussed in Chapter 2. Finally, from the set of
unmeasured variables, the determinable variables can be distinguished from the
undeterminable ones. These steps are described in the following subsections. The final
classification is summarized in Table 4.1.

4.4.1. Derivation of Reduced Equations

The reduced set of equations in this case is straightforward to obtain by inspection of
the complete set of steady-state equations, because all of the variables around most units
are measured, so equations involving only measured variables and those involving
unmeasured ones are clearly distinguishable. It is therefore not necessary to use formal
variable classification methods such as the ones mentioned in Chapter 2. The reduced
equations are as follows.

Reactor 1

Mass Balance: F2 − F5 = 0    (4.1)

Energy Balance: F2 T2 − F5 T5 − (Uc1 Ac1 / ρCp)(Trx1 − Tc1) = 0    (4.2)

Cooling Coil Energy Balance: Fc1 (Tcin − Tc1) + (Uc1 Ac1 / ρc Cpc)(Trx1 − Tc1) = 0    (4.3)

Mixer

Mass Balance: F5 + F7 − F9 = 0    (4.4)

Energy Balance: F5 T5 + F7 T7 − F9 T9 = 0    (4.5)

Reactor 2

Mass Balance: F9 − F12 = 0    (4.6)

Energy Balance: F9 T9 − F12 T12 − (Uc2 Ac2 / ρCp)(Trx2 − Tc2) = 0    (4.7)

Cooling Coil Energy Balance: Fc2 (Tcin − Tc2) + (Uc2 Ac2 / ρc Cpc)(Trx2 − Tc2) = 0    (4.8)

F denotes flow rates and T denotes temperatures. The indices refer to the same entities
as listed in Section 4.3. Uc is the overall heat transfer coefficient of the cooling
coil, while Ac is the effective cooling area. In the present work, the model parameters
to be estimated are the products of the overall heat transfer coefficient (Uc) and the
effective cooling area (Ac) for the two reactors, i.e. Uc1Ac1 and Uc2Ac2.
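For illustration, the reduced model can be coded directly as a residual function and
imposed as the equality constraints of the DRPE optimization. A minimal MATLAB sketch,
in which the ordering of x, the function name and the struct of known property groups
(ρCp for the process fluid, ρcCpc for the coolant) are assumptions:

function g = reduced_model(x, theta, prop)
% x     : [F2 F5 F7 F9 F12 Fc1 Fc2 T2 T5 T7 T9 T12 Trx1 Trx2 Tcin Tc1 Tc2]
% theta : [Uc1Ac1, Uc2Ac2], the model parameters to be estimated
% prop  : struct with known property groups rhoCp and rhocCpc
F2=x(1); F5=x(2); F7=x(3); F9=x(4); F12=x(5); Fc1=x(6); Fc2=x(7);
T2=x(8); T5=x(9); T7=x(10); T9=x(11); T12=x(12);
Trx1=x(13); Trx2=x(14); Tcin=x(15); Tc1=x(16); Tc2=x(17);
g = [ F2 - F5;                                                  % (4.1)
      F2*T2 - F5*T5 - theta(1)/prop.rhoCp*(Trx1 - Tc1);         % (4.2)
      Fc1*(Tcin - Tc1) + theta(1)/prop.rhocCpc*(Trx1 - Tc1);    % (4.3)
      F5 + F7 - F9;                                             % (4.4)
      F5*T5 + F7*T7 - F9*T9;                                    % (4.5)
      F9 - F12;                                                 % (4.6)
      F9*T9 - F12*T12 - theta(2)/prop.rhoCp*(Trx2 - Tc2);       % (4.7)
      Fc2*(Tcin - Tc2) + theta(2)/prop.rhocCpc*(Trx2 - Tc2) ];  % (4.8)
end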

4.4.2. Classification of Measurements

Following the definition of redundant measurements, it can be deduced that redundant

measurements can only appear in the reduced set of equations. However, as there are

reduced equations with unknown model parameters to be estimated, some measurements

in the reduced equations may be non-redundant. The formal classification procedures can

of course be performed to determine these non-redundant measurements, but as the

number of variables is not too large, the classification can be done using symbolic

manipulation. This way, linearization can be avoided.

Firstly, all measured variables involved in reduced equations without unknown
parameters are redundant. This means that only the variables that appear solely in
equations with unknown parameters need to be tested for redundancy. To test a variable
for redundancy, it is treated as unmeasured; if its value can then be determined from the
other measurements through symbolic manipulation of the other reduced equations, it is
redundant, and vice versa. Alternatively, if the unknown parameters in the reduced
equations can be eliminated through algebraic manipulation of the equations that contain
them, then the measured variables that are eliminated along with them are non-redundant.
For example, from the set of reduced equations above, if equations 4.2 and 4.3 are summed
(scaling them by ρCp and ρcCpc, respectively, so that the terms in the unknown parameter
Uc1Ac1 cancel exactly), the unknown parameter is eliminated. However, the measured
variable Trx1 is also eliminated in the process, and it does not appear in the other
equations (4.1 and 4.4 to 4.8); the measurement of Trx1 is therefore non-redundant. The
same manipulation can be performed on equations 4.7 and 4.8, leading to the conclusion
that Trx2 is non-redundant. Since the remaining equations (other than 4.2, 4.3, 4.7 and
4.8) do not contain any unknown parameters, the measurements of all the other measured
variables in those equations are redundant.

Nevertheless, since the classification methods based on the linearized and bilinear
models are more formal, these two methods are also applied in this thesis to classify the
variables in the reduced set of equations above. The results are then used as a check on
those obtained using symbolic manipulation. It is found that the classifications from
symbolic manipulation, orthogonal transformation of the linearized model and orthogonal
transformation of the bilinear model agree with one another.

Finally, it is also straightforward to deduce that the measurements of variables that do not

appear in the reduced set of equations are non-redundant. As such, referring to the

complete set of steady-state equations, T13a is classified as non-redundant. The results of

the variable classification are summarized in Table 4.1.

                Measured Variables                         Unmeasured Variables
                Redundant              Non-redundant       Determinable   Undeterminable

Flow rates      F2, F5, F7, F9,        None                F13            F1, F6, Fc3
                F12, Fc1, Fc2

Temperatures    T2, T5, T7, T9,        Trx1, Trx2, T13a    T13            T1, T6, T14, T15,
                T12, Tcin, Tc1, Tc2                                       T16, T17, Tcin3, Tc3

Table 4.1. Result of Variable Classification for the General Purpose Chemical Plant

4.5. Data Generation

4.5.1. Simulation Data

Simulation data are generated by random number generators in Matlab, for several
different distributions, namely the Normal, Laplace, uniform and t distributions. Errors
drawn from these distributions can act as outliers when their magnitudes are large. From
an optimization point of view, when an outlier is detected by the robust estimator, the
corresponding variable is effectively regarded as unmeasured. Consequently, depending on
the structure of the constraint equations, certain variables become indeterminable when
regarded as unmeasured; these are precisely the non-redundant measurements. This causes
difficulty in the optimization, as no unique solution is then available. Such situations
are avoided in the data generation for this case study.

4.5.2. Real Data

A data set was collected from a trial run of the pilot plant. The data collection setup
consists of an OPC™ (Object Linking and Embedding for Process Control) server connection
to the DCS and a Microsoft® Excel interface connected to the OPC™ server. A Matlab®
interface is then developed to sample the data from Excel at fixed sampling intervals.
The joint data reconciliation – parameter estimation is then performed on this data set.
As mentioned before, the plant configuration and the locations of the measurements during
the trial run are the same as those of the simulation model used to generate the
simulation data. Therefore the reduced set of equations and the classification of
variables in Section 4.4 also apply to this data set.

4.6. Methods Compared

Three DRPE methods are compared: the weighted least squares (WLS), the contaminated
Normal estimator and the GT-based estimator. The weighted least squares is chosen as a
comparison because it is the best representative of non-robust methods, being the most
efficient if the conventional Normal assumption is satisfied. It will be shown that
deviations from the Normal assumption result in a deterioration of the performance of
WLS. The contaminated Normal estimator is chosen because of its robustness, and also
because it is typically used with pre-assigned parameter values, namely the amount and
magnitude of the Normal contamination. In practice, the amount and magnitude of
contamination are unknown, so these assigned values are just heuristics. However, it
will be shown that by estimating the parameter values from the data, i.e. by partially
adaptive estimation, the efficiency of the contaminated Normal estimator can be improved.
Nevertheless, due to its pre-assigned distribution shape, the profiles of data that this
estimator can adapt to are rather limited. The GT distribution, on the other hand,
accommodates many distributions within its family, covering many different distribution
profiles, and therefore has a greater advantage in the partially adaptive scheme.

The detailed configurations of the methods are as follows:

(i) Weighted Least Squares (WLS)

The objective function of the WLS is:

ρ = (y − x)^T Ψ^{-1} (y − x)

where y is the measurement, x is the estimate and Ψ is the covariance matrix of the

measurement errors.

The covariance matrix is estimated from the data set using the following formula:

cov(yi, yj) = (1/(r − 1)) Σ_{k=1}^{r} (yik − ȳi)(yjk − ȳj)    (4.1)

where

ȳi = (1/r) Σ_{k=1}^{r} yik    (4.2)

r is the number of sampling points in the data set and yi is the measurement of the
i-th variable (a one-line MATLAB computation of this matrix is sketched after this list).

(ii) Estimator based on the contaminated Normal distribution

The parameters p, b and σ of the contaminated Normal are estimated from

preliminary residuals. The preliminary residuals and the distribution parameters are

estimated in the same way as they are for the partially adaptive GT-based method,

i.e. as described in Chapter 3.

(iii) Partially adaptive GT-based method

The configuration is the same as set forth in Chapter 3, i.e. the GT-based estimator

with distribution parameters p, q and σ estimated from preliminary residuals.
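In MATLAB, the sample covariance in equation 4.1 (already normalized by r − 1) is
available as a built-in one-liner; a sketch, assuming Y is an r-by-m matrix with one
measured variable per column:

Psi = cov(Y);   % m-by-m covariance matrix, normalized by r-1 by default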

4.7. Performance Measures

Two measures are used to quantify the efficiency of the DRPE methods: the mean
square error (MSE) and the percentage discrepancy of the model parameter estimates
(%-discrepancy). The MSE is calculated as:

MSE = (1/(mK)) Σ_{j=1}^{K} Σ_{i=1}^{m} (x̂i,j − xi,j)² / σi²    (4.3)

where m is the number of measured variables and K is the number of data sets used for
the DRPE (m = 17, K = 20×100 in this study). x̂i,j and xi,j are the reconciled estimate
and the actual value of the variable, respectively, while σi is the standard deviation
of the Gaussian noise on sensor i.

The %-discrepancy is calculated as:

% discrepancy = 100% × |parameter estimate − actual value| / actual value    (4.4)

More explanation on these measures will be given in Chapter 5, in relation to the cases

studied.
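For concreteness, a minimal MATLAB sketch of the two measures; the array names and shapes
are assumptions (Xhat and Xtrue are m-by-K matrices of reconciled estimates and actual
values, sigma is an m-by-1 vector of noise standard deviations, theta_hat and theta_true
hold the parameter estimates and actual values):

[m, K] = size(Xhat);
MSE = sum(sum(((Xhat - Xtrue).^2) ./ (sigma.^2 * ones(1, K)))) / (m*K);  % equation (4.3)
pct_discrepancy = 100 * abs(theta_hat - theta_true) ./ theta_true;       % equation (4.4)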

4.8. Conclusion

A general-purpose chemical plant is used to generate data to investigate the performance
of the GT-based joint DRPE strategy. Based on the location of measurements, the
variables are classified as redundant/non-redundant and determinable/undeterminable
accordingly. Both real and simulation data are used; for the simulation data, different
error distributions with and without outliers are generated. Two other methods, the
weighted least squares and the contaminated Normal estimators, are used for DRPE as
comparisons. The weighted least squares method represents non-adaptive, non-robust
methods, while the contaminated Normal method represents an alternative partially
adaptive method. The results of the comparison are presented in Chapter 5.

CHAPTER 5: RESULTS & DISCUSSION

5.1. Introduction

Several case studies are conducted based on the data generated from the general-purpose
plant described in Chapter 4. Firstly, data with various error distributions are input to
the joint DRPE strategy to study the efficiency gained from its adaptiveness. Secondly,
the data with various error distributions are contaminated with outliers to investigate
the robustness of the DRPE strategies. With regard to the partially adaptive scheme, the
effects of data size and the choice of preliminary estimator are then studied.

5.2. Partial Adaptiveness & Efficiency

To investigate the efficiency of the partially adaptive GT-based DRPE strategy, the

strategy is used to reconcile variables and estimate model parameters for data with

various error distributions. For each distribution, 10 sets of 100 data samples each are

generated.

The generated distributions are:

(i) Normal (Gaussian) distribution

(denote the standard deviation of this distribution as σ)

(ii) Laplace distribution,

a. With standard deviation σ

b. With standard deviation 3σ

c. With standard deviation 5σ

(iii) Uniform distribution

a. With range from -3σ to +3σ

b. With range from -5σ to +5σ

(iv) t distribution

a. With 1 degree of freedom (dof)

b. With 2 degrees of freedom

c. With 3 degrees of freedom

These are some of the more common distributions that might represent the empirical
character of actual physical measurement processes. The Normal distribution is the most
commonly encountered; its density function is bell-shaped, and 99.73% of the errors occur
within −3σ to +3σ of the true value of the variable, where σ is the standard deviation.
The Laplace distribution is the double-sided exponential distribution; it has thicker
tails than the Normal, which means that the probability of large errors occurring is
higher. The t distribution also has thicker tails than the Normal, especially for small
degrees of freedom. The uniform distribution differs from the other three in that it does
not assume that smaller errors are more likely to occur than larger ones; instead, the
errors lie within a certain range, with every value in that range equally likely to occur.
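A minimal MATLAB sketch of how such noise sequences can be generated; the Laplace draw
uses the inverse-CDF method, trnd requires the Statistics Toolbox, and the numerical
values are illustrative:

n = 100;  sigma = 0.5;                   % illustrative values
e_norm = sigma*randn(n,1);               % Normal with standard deviation sigma
b = 3*sigma/sqrt(2);                     % Laplace scale giving standard deviation 3*sigma
u = rand(n,1) - 0.5;
e_lap  = -b*sign(u).*log(1 - 2*abs(u));  % Laplace via inverse CDF
e_unif = -3*sigma + 6*sigma*rand(n,1);   % uniform on [-3*sigma, 3*sigma]
e_t    = trnd(1, n, 1);                  % t with 1 degree of freedom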

As measures of efficiency, the mean squared error (MSE) is used for the estimates of

variables and the percent of absolute discrepancy is used for the estimates of model

parameters. The mean squared error (MSE) is calculated as:

MSE = (1/(mK)) Σ_{j=1}^{K} Σ_{i=1}^{m} (x̂i,j − xi,j)² / σi²    (5.1)

where m is the number of measured variables and K is the number of data samples used
for the DRPE (i.e. m = 17 variables and K = 20 sets × 100 samples). x̂i,j and xi,j are
the reconciled estimate and the actual value of the variable, respectively, while σi is
the standard deviation of the Normal (Gaussian) noise on sensor i. Meanwhile, the
%-discrepancy is calculated as:

% discrepancy = 100% × |parameter estimate − actual value| / actual value    (5.2)

The MSE can be thought of as an empirical measure of the variance of the estimates; it
is therefore suitable for measuring efficiency, since estimation efficiency is usually
defined in terms of variance (e.g. the Cramer–Rao lower bound for the asymptotic
variance). The same value of the “standard deviation” σi, that is, the standard deviation
used to generate the Normal noise, is used in calculating the MSE for all noise
distributions, instead of the actual or estimated standard deviation of each respective
distribution. The purpose is to establish a common basis for visualizing and comparing
the performance of the methods across noise distributions with different spreads.

Figure 5.1 illustrates the MSE of the resulting variable estimates. It is observed that
the weighted least squares (WLS) method performs efficiently for Normal error compared to
the other two methods. This is expected, as it is theoretically the most asymptotically
efficient estimator for Normal error. However, for the other error distributions, the WLS
becomes less and less efficient in comparison to the other two methods, and in comparison
to its own performance for Normal error. The MSE of the WLS estimates is highly sensitive
to the spread of the error distributions. The most obvious manifestation is in the case
of t errors with one and two degrees of freedom, as these distributions have considerably
thicker tails than the others (from Table 5.1, the MSE of the measurements for t errors
with one and two degrees of freedom are the highest, far higher than for the other error
distributions). For further illustration, consider the plot of the percentage of relative
MSE, i.e. the MSE of the variable estimates as a percentage of the MSE of the
measurements, in Figure 5.2. The relative MSEs for the WLS estimates are fairly similar
across the different error distributions, whereas the MSEs of the measurements are vastly
different, as seen in Table 5.1. This means that the error in the WLS estimates is
proportional to the measurement error magnitudes, regardless of how large those
magnitudes are. This sensitivity of the WLS estimates to the error magnitudes can be
explained by recalling the influence function of the WLS estimator described in
Chapter 2: the influence function is, indeed, linear in the magnitude of the error.

[Bar chart: MSE of variable estimates for the WLS, contaminated Normal and GT estimators,
across Normal, Laplace (σ, 3σ, 5σ), uniform (±3σ, ±5σ) and t (1, 2, 3 dof) error
distributions.]
Figure 5.1. MSE Comparison of GT-based with Weighted Least Squares and
Contaminated Normal Estimators

Table 5.1. MSE of Measurements

Error Distribution      MSE of Measurements
Normal, σ               0.9962
Laplace, σ              1.0145
Laplace, 3σ             8.0945
Laplace, 5σ             22.3552
Uniform, [-3σ, 3σ]      2.7740
Uniform, [-5σ, 5σ]      7.4459
t, dof = 1              1323.8090
t, dof = 2              41.4721
t, dof = 3              9.5355

[Bar chart: MSE of variable estimates as a percentage of the MSE of the measurements, for
the WLS, contaminated Normal and GT estimators across the same error distributions.]

Figure 5.2. Percentage of Relative MSE

Turning the discussion to the contaminated Normal estimates, the MSE results in Figure
5.1 show that this estimator performs more efficiently than the WLS in most cases,
particularly in the case of t errors. The percentage of relative MSE in Figure 5.2 also
shows that the larger the measurement errors, the smaller the relative MSE of this
estimator. This reflects the descending trend of its influence function for large errors.
An exception to this efficient performance is the case of uniform error, where the
performance is merely comparable to that of the WLS estimator. This can be explained by
the behaviour of the influence function around zero, i.e. for small residuals.
Theoretically, uniformly distributed errors occur with uniform probability around the
true values; residuals should therefore have uniform influence regardless of their
proximity to the true values, as long as they are within the range. The contaminated
Normal estimator, however, behaves very similarly to the WLS for small residuals, i.e.
the smaller the residual, the smaller the influence. The other distributions (except the
uniform) have kurtosis that can be more or less matched to that of the Normal
distribution by adjusting its spread, which is why the partially adaptive contaminated
Normal estimator performs with considerable efficiency for those distributions.

On the other hand, the partially adaptive GT-based estimator performs well for all cases
of error distribution; its efficiency is comparable to or better than that of the
partially adaptive contaminated Normal estimator (Figures 5.1 and 5.2). This can be
attributed to its adaptability to many different error profiles, and also to its
robustness against large errors. The most evident example of how this adaptability yields
more efficient estimates is the MSE performance in the case of the uniform distribution.
As discussed before, the Normal and contaminated Normal estimators are not able to follow
closely the kurtosis of the uniform distribution. This is not the case for the GT
distribution, as the uniform distribution is actually one of its limiting cases, i.e.
with the distribution parameters p and q very large (p → ∞, q → ∞).

Observation of the relative MSE in Figure 5.2 also reveals the robustness of the GT-based
estimator: it suppresses large errors and is therefore rather insensitive to the
magnitude of these errors.

It is also noticed that the generated error distributions are all limiting cases in the
GT family tree presented in Chapter 3. This confirms the assertion in Chapter 3 that
although the bounds placed on the GT parameters p and q exclude such limiting cases as
the uniform (p → ∞, q → ∞) and the Normal and Laplace distributions (q → ∞), the
efficiency of the estimates is not much compromised.

Figures 5.3 and 5.4 show the percentage of the parameter estimation errors for the overall

heat transfer coefficients of Reactor 1 and Reactor 2 cooling coils, respectively. As

expected, these results mirror the MSE performance in Figure 5.1. This is because the

variables are the source of information for the value of the parameter, and hence, the

accuracy of the variable estimates will be reflected in the accuracy of the parameter

estimates.

[Bar chart: %-discrepancy of the Uc1Ac1 estimates for the WLS, contaminated Normal and GT
estimators across the error distributions.]
Figure 5.3. Comparison of the Accuracy of Estimates for the overall heat transfer
coefficient of Reactor 1 cooling coil

[Bar chart: %-discrepancy of the Uc2Ac2 estimates for the same three estimators across
the error distributions.]
Figure 5.4. Comparison of the Accuracy of Estimates for the overall heat transfer
coefficient of Reactor 2 cooling coil

Figure 5.5 is presented to help visualize the preceding discussion on how the GT-based
estimator is flexible enough to adapt to the different error profiles, while the
contaminated Normal and the weighted least squares (which assumes Normal error) may not
be as flexible. The gray vertical bars in the figure are the histogram of the measurement
errors, scaled so that their total area is one; it is therefore a kind of relative
frequency plot. The distribution parameters of the GT (i.e. {p, q, σ}), contaminated
Normal (i.e. {p, b, σ}) and weighted least squares (i.e. σ) used to generate the plots in
Figure 5.5 are listed in Table 5.2. These parameters are the results of the estimation of
the estimator parameters in the partially adaptive scheme. Along with these parameters,
the “true” underlying GT parameters of the distributions from which the errors are
generated are also included as a comparison. These “true” parameter values are, of
course, the ones shown in the GT family tree in Chapter 3 for each respective
distribution.

[Four panels overlaying the fitted GT, contaminated Normal and Normal (WLS) densities on
the scaled residual histograms: (a) Normal, σ = 0.5; (b) Laplace, std dev = 3σ = 1.5;
(c) Uniform, [-5σ, 5σ] = [-2.5, 2.5]; (d) t, degree of freedom = 1.]

Figure 5.5. Adaptation to Data: Fitting the Relative Frequency of Residuals with GT,
Contaminated Normal, and Normal distributions

62
Table 5.2. Estimated parameters of the partially adaptive estimators used to generate
Figure 5.5
                    GT                                    Cont. Normal    WLS
                    Estimated        “True”

(a) Normal,         p = 1.8983       p = 2                p = 0.05        σ = 0.4977
    σ = 0.5         q = 10.0478      q → ∞                b = 5
                    σ = 0.6431       σ = 0.5×√2 = 0.7071  σ = 0.4577

(b) Laplace,        p = 1.1866       p = 1                p = 0.05        σ = 1.2819
    std dev =       q = 47.1412      q → ∞                b = 5
    3σ = 1.5        σ = 1.1367       (σ = 1.5/√2          σ = 1.0929
                                       = 1.0607)

(c) Uniform,        p = 5            p → ∞                p = 0.05        σ = 1.4891
    [-5σ, 5σ] =     q = 50           q → ∞                b = 5
    [-2.5, 2.5]     σ = 2.4701       (σ = range/2 = 2.5)  σ = 1.466

(d) t,              p = 1.9022       p = 2                p = 0.3322      σ = 3.7444
    dof = 1         q = 0.8315       q = 0.5              b = 5.3714
                    σ = 1.9043                            σ = 1.1638

5.3. Effect of Outliers on Efficiency

To investigate the effect of outliers on the efficiency of the partially adaptive
methods, 5% of the data in Section 5.2 above are replaced with contaminating errors of
fixed size and random sign. The magnitude of this contamination is 10σ, where σ is the
standard deviation of the Normal error before contamination with outliers. The MSE of the
resulting measurements are shown in Table 5.3. Compared with Table 5.1, the MSE is now
generally larger than for the measurements without outliers.
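A minimal MATLAB sketch of this contamination step for one sensor's error sequence e
(names assumed):

n = numel(e);                          % e: clean error sequence for one sensor
k = round(0.05*n);                     % 5% of the samples
idx = randperm(n, k);                  % randomly chosen positions
e(idx) = 10*sigma*sign(randn(k,1));    % randomly signed outliers of magnitude 10*sigma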

Table 5.3. MSE of Measurements with Outliers


Error Distribution,
with outliers MSE of Measurements
(5% + 10σ)
Normal, σ 5.3616

Laplace, σ 5.3861

Laplace, 3σ 12.1171

Laplace, 5σ 25.5046

Uniform, [-3σ,3σ] 7.0368

Uniform, [-5σ,5σ] 11.4773

t, dof = 1 1980.0869

t, dof = 2 29.0566

t, dof = 3 14.4804

Figure 5.6 charts the MSE of the resulting variable estimates. Compared to Figure 5.1,
the performance of the weighted least squares deteriorates considerably here, even for
Normal error. On the other hand, the contaminated Normal and GT estimators are able to
maintain MSEs similar to those in Figure 5.1. This is due to both the robustness and the
adaptability of these estimators: the adaptability allows the change in data profile to
be accommodated by a change in the estimator parameters, while the robustness ensures
that this adaptation to the data profile does not compromise the estimation efficiency.

Figure 5.7 compares the performance of the contaminated Normal and GT estimators for data
with and without outliers. It is observed that the GT estimator is generally less
sensitive to outliers than the contaminated Normal estimator. The exception is for errors
with the uniform distribution, as this distribution originally has very thin tails
compared to the others. When outliers are present in the data, the GT has to adapt by
assuming a PDF with thicker tails. This makes the GT estimator robust to the outliers;
the trade-off, however, is a decrease in efficiency for the data within the
uncontaminated range. Nevertheless, the performance of the GT estimator in the uniform
error cases is still the best of the three estimators, which means that the trade-offs in
efficiency made by the other two estimators are greater than that of the GT.

[Bar chart: MSE of variable estimates for data with outliers, for the WLS, contaminated
Normal and GT estimators across the error distributions.]
Figure 5.6. MSE of Variable Estimates for Data with Outliers

[Two panels (contaminated Normal and GT): MSE with and without outliers for each of the
nine error distributions.]
Figure 5.7. Comparison of MSE with and without outliers for GT and Contaminated
Normal Estimators

5.4. Effect of Data Size on Efficiency

Since the efficiency of the partially adaptive estimators is achieved through their
adaptability to the data, it is natural to ask what effect the data size has on the
estimation efficiency. Intuitively, the more data samples are used, the more accurate the
estimates will be. For partially adaptive estimators in particular, where it is important
to follow the data profile closely, more data samples mean that a closer fit is possible,
while fewer data samples raise the question of whether the partially adaptive estimators
are still viable compared with non-adaptive estimators. To investigate this, the joint
DRPE is performed using three different data sizes, i.e. data sets with 20, 50 and 100
samples. The MSE results for the weighted least squares, contaminated Normal and GT
estimators are shown in Figure 5.8.

[Three bar charts: MSE of the WLS, contaminated Normal and GT estimators for data sizes
ns = 100, 50 and 20, across the error distributions.]
Figure 5.8. MSE Results of WLS, Contaminated Normal and GT-based estimators for
Different Data Sizes

Although the scale of the MSE differs between the methods, the same pattern is observed
when comparing the MSE across the different data sizes: the relative scale of the MSE
from one data size to another looks similar for each of the three methods. To show this,
Figure 5.9 is plotted. Each vertical bar in the figure is the difference between the MSE
for the larger data size and the MSE for the smaller data size, expressed as a percentage
of the MSE for the smaller data size; it therefore illustrates the improvement in
efficiency when the data size is increased. The improvement for each error case is seen
to be comparable among the three DRPE methods. This shows that when the data size is
small, the efficiency of the partially adaptive estimators (here, the GT and the
contaminated Normal) decreases, but so does that of the non-adaptive estimator (here, the
weighted least squares). It can therefore be concluded that the partially adaptive
estimators are in fact still quite as viable as other estimators, even for relatively
small data sizes (ns = 20).

[Two bar charts: percentage improvement in MSE from ns = 50 to ns = 100 and from ns = 20
to ns = 100, for the WLS, contaminated Normal and GT estimators across the error
distributions.]
Figure 5.9. Improvement in MSE Efficiency when Data Size is increased

5.5. Estimation of Adaptive Estimator Parameters: Preliminary Estimators and

Iterations

A main feature of the partially adaptive estimation is the preliminary estimation, which
extracts information about the data from the data themselves. At the preliminary
estimation stage itself, however, that information is not yet available, so the most
suitable estimator for the data is not known. The best option at this point is to first
employ a pre-assigned estimator to produce the preliminary residuals. The efficiency of
such pre-assigned preliminary estimators may not be the best; however, by means of
iterative estimation of the adaptation parameters, the efficiency can be improved. This
preliminary estimation – iteration scheme is summarized in the flowchart in Figure 5.10.

Data → Preliminary Estimation → Preliminary Residuals {u0}

Adaptation: estimation of the estimator parameters
    {p, q, σ} = arg max_{p,q,σ} log f_GT(u0; p, q, σ)

GT-based Joint DRPE:
    {params, vars} = arg max f(u) = arg min_{params,vars} −log f_GT(u | p, q, σ)
    s.t. model equations
    → Reconciled Data, Model Parameters

Residuals = (Measurement Data) − (Reconciled Data); the adaptation and DRPE steps are
repeated with the new residuals until the maximum number of iterations is reached, after
which the reconciled data and model parameters are returned.

Figure 5.10. Iterative Joint DRPE with Preliminary Estimation
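A minimal MATLAB sketch of this loop, assuming the fit_gt_params routine sketched in
Chapter 3 and a hypothetical gt_drpe solver that minimizes −Σ log f_GT(y − x) subject to
the model equations (for example with fmincon, using the reduced model as a nonlinear
equality constraint):

x = median(Y, 2);                     % preliminary estimates: median over samples (Y is m-by-r)
for it = 1:maxIter
    U = Y - x*ones(1, size(Y, 2));    % residuals against the current estimates
    theta = fit_gt_params(U(:));      % adaptation: GT parameters {p, q, sigma}
    [x, params] = gt_drpe(Y, theta);  % joint DRPE with the adapted GT (hypothetical solver)
end

In practice the loop can be terminated early once the MSE, or the estimates themselves,
stop changing; as discussed below, convergence is typically reached within a few
iterations.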

The joint DRPE is performed using two different preliminary estimators – the median and
the weighted least squares estimator – to study the role and effect of the preliminary
estimation. The results for a set of trial runs for each error distribution are shown in
the final MSE comparison in Figure 5.11, and the MSE from one iteration to the next, for
one particular run of each error distribution, is plotted in Figure 5.12.

The results lead to the conclusion that the choice of the preliminary estimator is not
really important. From Figure 5.12, it is seen that regardless of the preliminary
estimator, the final MSEs all converge to the same value. The question arises as to how
the unsatisfactory residuals produced by the weighted least squares, especially in the
case of non-Normal and large errors, can be improved through iterations. To answer this
question, the evolution of the estimated GT parameters is investigated. If the residuals
are satisfactory, the estimated GT parameters will be representative of the underlying
error distribution. If the residuals are not satisfactory, the estimated GT parameters
are likely to be far from those of the actual error distribution. This is confirmed by
observing the GT parameters estimated from the weighted least squares residuals in the
case of t noise, where these parameters are far from reflecting the true distribution.
However, it is also observed that many of the GT parameters take the values of their
respective bounds. Another run of the joint DRPE with the weighted least squares as
preliminary estimator is then performed, but with the GT parameter bounds for the maximum
likelihood optimization relaxed. It is found that the final MSE is no longer as good as
in the previous run with stricter bounds; in fact, the final estimates of the variables
are biased. It can therefore be concluded that the GT parameter bounds play an important
role in estimating the GT parameters. The bounds should be set such that, for all
possible parameter settings, the resulting GT distribution is sufficiently robust.

Finally, it is also noted that not many iterations are necessary before the MSE
converges. In this study the maximum number of iterations is set at ten, but in all runs
the MSE converges in fewer than ten iterations. A drastic improvement usually occurs at
the second or third iteration, after which the MSE is effectively stable.

[Bar chart: final MSE of the GT-based DRPE when GT, median and WLS are used as
preliminary estimators, across the error distributions.]

Figure 5.11. Final MSE Comparison for GT-based DRPE method Using GT, Median and
WLS as preliminary estimators

[Four panels (Normal, Laplace, uniform and t errors): MSE versus iteration number (0–10)
for the GT, median and WLS preliminary estimators.]

Figure 5.12. MSE throughout iterations

5.6. Real Data Application

A data set is collected from a trial run of the general-purpose pilot plant. The three
DRPE methods – GT, contaminated Normal and weighted least squares – are then applied to
reconcile the data and estimate the overall heat transfer coefficients of reactors 1 and
2, as in the simulation case. The profiles of the data are shown in the (scaled)
histograms in Figure 5.13, and the estimates of the variables and parameters are
summarized in Table 5.4. There is no basis for objectively assessing the accuracy of
these estimates because, unlike in the simulation case, the actual variable and parameter
values are unknown. However, from the plots of the GT, contaminated Normal and Normal
distributions superimposed on the data histograms in Figure 5.13, the flexibility of the
GT distribution is apparently higher than that of the other two distributions, while the
flexibility of the contaminated Normal distribution is higher than that of the Normal
distribution. This is apparent from the fit of the three distributions to the data, where
the GT appears to give the closest fit and the Normal distribution the least close fit.

[Scaled histograms of the measured variables (flows F2, F5, F7, F9, F12, Fc1, Fc2 and
temperatures T2, T5, T7, T9, T12, Trx1, Trx2, Tcin, Tc1, Tc2) with the fitted GT,
contaminated Normal and Normal densities overlaid.]

Figure 5.13. Scaled Histogram of Data and Density Plots of GT, Contaminated Normal
and Normal Distributions

Table 5.4. Reconciled Data and Estimated Parameter Values Using Different DRPE
Methods
                                           GT        Contaminated   Weighted
                                                     Normal         Least Squares
Variables
Feed flow to reactor 1, F2                 0.6908    0.6906         0.6901
Effluent flow from reactor 1, F5           0.6908    0.6906         0.6901
Feed 2 flow, F7                            1.2728    1.2731         1.2734
Mixer output flow, F9                      1.9635    1.9637         1.9636
Reactor 1 effluent temperature, T5         33.5002   33.4714        33.4869
Feed 2 temperature, T7                     36.8941   36.9167        36.9281
Mixer output temperature, T9               35.7002   35.7050        35.7186
Reactor 1 cooling water flow, Fc1          0.8760    0.8776         0.8636
Inlet cooling water temperature, Tcin      22.0000   22.0004        22.0050
Reactor 1 outlet cooling water             27.5987   27.6095        27.6231
temperature, Tc1
Reactor 1 temperature, Trx1                33.4914   33.4319        33.4320
Reactor 1 feed temperature, T2             40.6000   40.5989        40.5173
Reactor 2 effluent flow, F12               1.9635    1.9637         1.9636
Reactor 2 effluent temperature, T12        33.3999   33.4014        33.4001
Reactor 2 temperature, Trx2                33.4026   33.3781        33.3790
Reactor 2 outlet cooling water             31.3000   31.3001        31.3016
temperature, Tc2
Reactor 2 cooling water flow, Fc2          0.4857    0.4864         0.4897
Parameters
Heat transfer coefficient for              0.8323    0.8454         0.8353
Reactor 1 cooling coil, Uc1Ac1
Heat transfer coefficient for              2.1481    2.1770         2.1915
Reactor 2 cooling coil, Uc2Ac2

5.7. Conclusion

The partially adaptive methods, i.e. the estimators based on the contaminated Normal and
GT distributions, are more efficient than the non-adaptive weighted least squares
estimator for many error distributions other than the Normal. The GT-based method,
however, performs better than the contaminated Normal in almost all cases, as the GT is a
more general distribution. The robustness of both the GT and contaminated Normal methods
is also demonstrated, while the weighted least squares method is conspicuous in its
inability to handle outliers. It is also shown that even for small data sizes, the
partially adaptive GT-based method still consistently outperforms the other two methods
despite its decrease in efficiency. Finally, it is shown that the robustness of the GT
makes it possible to accommodate less-than-efficient and less-than-robust preliminary
estimators, through iterations of the GT-based estimation.

CHAPTER 6: CONCLUSION

6.1. Findings

The criteria for a good estimator are robustness and efficiency. Parametric estimators
constructed on a particular assumption about the error distribution are asymptotically
efficient when that assumption is true, but only under this limited condition. By
partially adapting the parameters of these estimators to the data, however, the
efficiency can be improved. The Generalized T (GT) distribution is a good candidate for
this partially adaptive scheme because it can be made robust to outliers and is flexible
enough to cover a wide range of important distributions. The GT distribution is dictated
by the distribution parameters {p, q, σ}; by estimating these parameters from the data, a
partially adaptive GT-based estimator can be constructed. Robustness is achieved by
constraining the values that {p, q, σ} can take, to ensure that the resulting GT has
sufficiently thick tails.

Data are generated from the virtual version of a general-purpose chemical engineering
plant as a case study for joint data reconciliation – parameter estimation (DRPE) using
the robust and partially adaptive GT-based estimator. The GT-based estimator is the most
efficient of the methods compared, i.e. the weighted least squares and the contaminated
Normal methods. This shows that adaptiveness to the data is advantageous, and the GT
distribution exploits this advantage very well compared to the less flexible contaminated
Normal estimator, as the GT is a more general distribution.

The robustness of the GT-based strategy is demonstrated by its performance when the data
are contaminated with outliers: the GT-based estimator is able to suppress the outliers
while making only a small trade-off in efficiency, compared with the weighted least
squares and the contaminated Normal methods.

Constraining the GT parameters {p, q, σ} to be robust proves to be important and
beneficial. Firstly, although the constrained values exclude special cases such as the
Laplace, uniform, t and Normal distributions, the loss in efficiency is quite negligible,
as the estimator still performs best for errors with these distributions. Secondly,
because the constrained {p, q, σ} result in a robust GT distribution, any preliminary
estimation method can be used to reach the same final efficiency by iterating the joint
DRPE with the GT estimator, regardless of the robustness and efficiency of the
preliminary estimator. This is a very desirable property because at the preliminary stage
virtually no information about the data is yet known.

In conclusion, the robust and partially adaptive GT-based estimator is a viable choice
for performing data reconciliation and parameter estimation. It is also noted that
although the efficacy of the GT-based DRPE strategy is demonstrated for a chemical
process case study in this thesis, the strategy is essentially an estimator, and
consequently has generic use in general estimation problems. It is therefore applicable
not only when a physical model of the process or system is available, but also with other
empirical and approximation models. The usual precaution that the model inaccuracy must
be negligible compared to the measurement inaccuracies holds, of course; but this also
holds for other estimators and is not a restriction specific to the GT-based estimator.

6.2. Future Works

In this work the GT-based DRPE strategy has been applied successfully to various data
profiles, in both simulation and real data. These applications have covered quite general
situations, which makes the GT-based strategy a viable choice for a wide range of
situations. However, recalling that the GT is a symmetric distribution, a limitation may
arise when the error is not symmetrically distributed, in which case the adapted GT may
not be efficient. It is possible to further generalize the GT distribution to include
non-symmetric distributions.

Another useful extension would be to explore the application of the GT-based strategy to
the dynamic operation of the system. This poses challenges in the efficient optimization
of the differential algebraic model, which are not encountered in the steady-state case.

REFERENCES

Albuquerque, J. S., Biegler, L.T. (1996). Data Reconciliation and Gross-Error Detection
for Dynamic Systems. AIChE J., Vol. 42, No. 10, pp. 2841-2856.

Arora, N., Biegler, L.T. (2001). Redescending Estimators for Data Reconciliation and
Parameter Estimation. Comp. Chem. Eng., Vol. 25, pp. 1585-1599.

Bickel, P.J. (1982). On adaptive estimation, Annals of Statistics 10, 647-671.

Butler, R.J., McDonald, J.B., Nelson, R.D., White, S.B. (1990). Robust and Partially
Adaptive Estimation of Regression Models. Rev. Econ. Stat., Vol 72, No. 2, pp.321-
327.

Crowe, C.M. (1989). Observability and redundancy of process data for steady state
reconciliation. Chemical Engineering Science. 44, 2909-2917.

Crowe, C.M. (1986). Reconciliation of process flow rates by matrix projection. Part II:
The non-linear case. AIChE.J. 32, 616-623.

Gill, P., Murray, W.A., Saunders, M.A., and Wright, M.H. (1986). User Guide for
NPSOL (Version 4.0); a FORTRAN program for Nonlinear Programming, Technical
Report SOL 86-2. Stanford University, Department of Operation Research, Stanford,
CA.

Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J., Stahel, W.A. (1986). Robust Statistics:
The Approach Based on Influence Function. Wiley, New York.

Huber, P.J. (1981). Robust Statistics. Wiley, New York.

Huber, P.J. (1964). Robust Estimation of a Location Parameter. Annals of Mathematical
Statistics, 35, 73-101.

Joris, P., and Kalitventzeff, B. (1987). Process measurements analysis and validation.
   Proc. CEF '87: Use Comput. Chem. Eng., Italy, pp. 41-46.

Kim, I.W., Liebman, M.J., and Edgar, T.F. (1990). Robust error-in-variable estimation
   using non-linear programming techniques. AIChE J., Vol. 36, pp. 985-993.

McDonald, J.B., Newey, W.K. (1988). Partially Adaptive Estimation of Regression
   Models via the Generalized T Distribution. Econometric Theory, Vol. 4, pp. 428-457.

Narasimhan, S., and Mah, R.S.H. (1987). Generalized Likelihood Ratio Method for
   Gross Error Identification. AIChE J., Vol. 33, p. 1514.

Potscher, B.M. and Prucha, I.R. (1986). A Class of Partially Adaptive One-Step M-
Estimators for the Non-linear Regression Model with Dependent Observations.
Journal of Econometrics, 32, 219-251.

Reilly, P.M., and Patino-Leal, H. (1981). A Bayesian study of the error-in-variable
   models. Technometrics, Vol. 23, pp. 221-231.

Romagnoli, J. and Sanchez, M. (2000). Data Processing and Reconciliation for Chemical
Process Operations. Academic Press.

Romagnoli, J., and Stephanopoulos, G. (1980). On the rectification of measurement
   errors for complex chemical plants. Chemical Engineering Science, Vol. 35, pp.
   1067-1081.

Sanchez, M., Bandoni, A., and Romagnoli, J. (1992). PLADAT – A package for process
variable classification and plant data reconciliation. Computers and Chemical
Engineering, S16, 499-506.

Serth, R., and Heenan, W. (1986). Gross Error Detection and Data Reconciliation in
   Steam-Metering Systems. AIChE J., Vol. 32, p. 733.

Stein, C. (1956). Efficient nonparametric testing and estimation. In: Proceedings of the
third Berkeley symposium on mathematical statistics and probability, Vol. I
(University of California Press, Berkeley, CA).

Swartz, C.L.E. (1989). Data reconciliation for generalized flowsheet applications. 197th
National Meeting, American Chemical Society. Dallas, TX.

Tjoa, I., and Biegler, L. (1991). Simultaneous strategies for data reconciliation and gross
   error detection of nonlinear systems. Computers and Chemical Engineering, Vol. 15,
   p. 679.

Valko, P., Vajda, S. (1987). An Extended Marquardt-type Procedure for Fitting Error-in-
   Variable Models. Comp. Chem. Eng., Vol. 11, pp. 37-43.

Wang, D., Romagnoli, J.A. (2003). A Framework for Robust Data Reconciliation Based
on a Generalized Objective Function. Ind.Eng.Chem.Res.,Vol.42, No.13, pp. 3075-
3084.

AUTHOR’S PUBLICATIONS

• Joe, Y.Y., Wang, D., Romagnoli, J. and Tay, A. (2004). Robust and Efficient
Joint Data Reconciliation and Parameter Estimation Using a Generalized
Objective Function. Proceedings of the 7th Dynamics and Control of Process
Systems (DYCOPS7). Boston, USA. Accepted.

• Joe, Y.Y., Wang, D., Tay, A., Ho, W.K., Ching, C.B., and Romagnoli, J. (2004).
A Robust Strategy for Joint Data Reconciliation and Parameter Estimation.
Proceedings of the 14th European Symposium on Computer-Aided Process
Engineering. Elsevier, Lisbon, Portugal.

• Joe, Y.Y., Xu, H., Dong, Z.Y., Ng, H.H., and Tay, A. (2004). Searching Oligo
Sets of Human Chromosome 12 using Evolutionary Strategies, International
Journal of Systems Sciences, Accepted.

• Joe, Y.Y., Xu, H., Dong, Z.Y., Ng, H.H., and Tay, A. (2003). Searching Oligo
Sets of Human Chromosome 12 using Evolutionary Strategies, Proceedings of
the 2003 Congress on Evolutionary Computation, IEEE Press, Canberra,
Australia.

APPENDIX A

Description of the Matlab/ Simulink Model of the General-Purpose Pilot Plant

GENERAL DESCRIPTION
The plant is a pilot-scale setup containing two CSTRs, a mixer, a feed tank and a number
of heat exchangers. Material from the feed tank is heated before being fed to the first
reactor and the mixer. The effluent from the first reactor is then mixed with the material
feed in the mixer and fed to the second reactor. The effluent from the second reactor is,
in turn, fed back to the feed tank, and the cycle continues.

RUNNING THE MODEL


The plant model is in twocstr3.mdl (excluding pumps).

UNITS
NOTE: Operating values (initial conditions, parameters) used in the current model are
still those taken from the references, not values from the pilot plant.

1. REACTOR

[Schematic: CSTR with feed stream (Fin, Tin, Cain), cooling water in/out (Tc), steam
supply (Ps), vessel contents (V, T, Ca) and product stream (F, T, Ca).]

Reaction inside the reactor: A → B

1.1. Model I/O and Parameters

Matlab S-function: s_reactor1.m

[Block diagram: REACTOR1 S-function with inputs Fin, Tin, Cain, Fcin, Tcin, Fsin, F
and outputs F, T, Ca, V, Tc, Ts, Fs.]

1.1.1. Input:
Fin : volume flowrate of input stream (vol/time)
Tin : temperature of input stream
Cain : concentration of component A in the input stream (mole/vol)
Fcin : volume flowrate of incoming cooling stream
Tcin : temperature of incoming cooling stream
Fsin : volume flowrate of incoming heating stream
F : volume flowrate of output stream (as controlled by output flow controller)

1.1.2. Output:
F : volume flowrate of output stream
T : temperature of output stream = (assumed) temperature inside reactor vessel
Ca : concentration of A in the output stream = (assumed) concentration inside reactor vessel
V : volume of content of reaction vessel
Tc: Temperature of outgoing cooling stream
Ts: Temperature of steam jacket
Fs: Flowrate of input steam to steam jacket

1.1.3. Parameters
[Luyben, 1989]:
NOTE:
• The parameters are currently hard-coded in the S-function; to simulate two
reactors with different parameters, two copies of this S-function are needed, each
with its own set of parameters.
Symbol        Description                                 Nominal Value Used in Simulation
rho, ρ        Mass density of stream                      50 lbm/ft3
Cp            Specific heat of stream                     0.75 Btu/lbm.R
alpha, α      Temperature constant for Arrhenius          7.08E10 /h = 1.9667e+007 /s
              expression
E             Activation energy for reaction              30,000 Btu/lb.mol
R             Molar gas constant                          1.99 Btu/lb.mol.R
lambda, λ     Reaction energy                             -30,000 Btu/lb.mol
Uc            Heat transfer coefficient of reactor        150 btu/h.ft2.R = 0.0417 btu/s.ft2.R
              cooling coil
Ac            Area of reactor cooling coil                250 ft2
rhoc, ρc      Mass density of cooling stream              62.5 lbm/ft3
Cpc           Specific heat of cooling stream             1 Btu/lbm.R
Vc            Volume of reactor cooling coil              3.85 ft3
hos           Heat transfer coefficient of reactor        1000 btu/h.ft2.R = 0.2778 btu/s.ft2.R
              heating jacket
Aos           Area of reactor heating jacket              56.5 ft2
Ms            Molar mass of heating stream                18 lbm/lb.mol
Avp           Constant for steam jacket vapor pressure    -8744.4 R
Bvp           Constant for steam jacket vapor pressure    15.70
Hs_hc, Hs-hc  Vapor enthalpy minus condensation enthalpy  939 Btu/lbm

1.2. Model Equations


[Luyben, 1989]
1.2.1. Mass Balance
$\frac{dV}{dt} = F_{in} - F$

1.2.2. Component Balance


$\frac{d(VC_a)}{dt} = F_{in}C_{a,in} - FC_a - \alpha \exp\left(-\frac{E}{RT}\right)VC_a$
$C_b = (\rho - M_aC_a)/M_b$

1.2.3. Energy Balance


a. Reaction Vessel
$\frac{d(VT)}{dt} = F_{in}T_{in} - FT - \frac{\lambda\alpha}{\rho C_p}\exp\left(-\frac{E}{RT}\right)VC_a - \frac{Q_c}{\rho C_p} - \frac{Q_j}{\rho C_p}$
$Q_c = U_cA_c(T - T_c)$
$Q_j = -h_{os}A_{os}(T_s - T)$

b. Cooling Coil
$\frac{dT_c}{dt} = \frac{F_c}{V_c}(T_{c,in} - T_c) + \frac{Q_c}{\rho_cC_{pc}V_c}$
$Q_c = U_cA_c(T - T_c)$

c. Steam Jacket
Total continuity: $V_j\frac{d\rho_s}{dt} = w_s - w_c$
Perfect gas law: $\rho_s = \frac{MP_j}{RT_s}$
Simple vapor-pressure equation: $P_j = \exp\left(\frac{A_{vp}}{T_s} + B_{vp}\right)$
Valve flow: $w_s = C_{vs}X_s\sqrt{P_{j,in} - P_j}$
Heat transfer: $Q_j = -h_{os}A_{os}(T_s - T)$
Energy equation, ignoring internal energy change and assuming steady state, i.e. $w_s = w_c$: $w_c = -\frac{Q_j}{H_s - h_c}$
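
For orientation, the vessel, component and cooling-coil balances above can be collected
into a single right-hand-side function; the following MATLAB sketch is illustrative (the
names reactor_rhs and par are assumed here, and the steam-jacket states are omitted for
brevity).

    % Illustrative sketch of the reactor balances above (steam jacket
    % omitted); parameter names follow the table in Section 1.1.3.
    function dx = reactor_rhs(x, u, par)
        V = x(1); T = x(2); Ca = x(3); Tc = x(4);    % states
        Fin = u(1); Tin = u(2); Cain = u(3);         % inputs
        Fc  = u(4); Tcin = u(5); F = u(6);
        k    = par.alpha*exp(-par.E/(par.R*T));      % Arrhenius rate
        Qc   = par.Uc*par.Ac*(T - Tc);               % coil heat duty
        dV   = Fin - F;                              % mass balance
        dVCa = Fin*Cain - F*Ca - k*V*Ca;             % component balance
        dVT  = Fin*Tin - F*T ...                     % energy balance
               - par.lambda*k*V*Ca/(par.rho*par.Cp) ...
               - Qc/(par.rho*par.Cp);
        dTc  = (Fc/par.Vc)*(Tcin - Tc) ...           % cooling coil
               + Qc/(par.rhoc*par.Cpc*par.Vc);
        dCa  = (dVCa - Ca*dV)/V;                     % expand d(V*Ca)/dt
        dT   = (dVT  - T*dV)/V;                      % expand d(V*T)/dt
        dx   = [dV; dT; dCa; dTc];
    end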

2. HEAT EXCHANGER

Matlab S-function: s_heatex_btu.m

[Block diagram: HEAT_EX1 S-function with ports In1, In2 and Out1, Out2.]

In1 = [Fin1, Tin1, Cain1]^T
In2 = [Fin2, Tin2, Cain2]^T
Out1 = [F1, T1, Ca1]^T
Out2 = [F2, T2, Ca2]^T

2.1. Model I/O and Parameters


2.1.1. Input:
Fin1/2: Volume flowrate of input stream 1/2
Tin1/2: Temperature of input stream 1/2
Cain1/2: Concentration of component A in input stream 1/2

2.1.2. Output:
F1/2: Volume flowrate of output stream 1/2
T1/2: Temperature of output stream 1/2
Ca1/2: Concentration of component A in output stream 1/2

2.1.3. Parameters:
[Luyben, 1989; Malleswararao, 1992; Holman, 1972]
Symbol   Description                                Nominal Value Used in Simulation
ρ1       Mass density of stream 1                   50 lbm/ft3
ρ2       Mass density of stream 2                   50 lbm/ft3
V1       Volume of stream 1 in the heat exchanger   3.85 ft3
V2       Volume of stream 2 in the heat exchanger   3.85 ft3
Cp1      Specific heat of stream 1                  0.75 btu/lbm.R
Cp2      Specific heat of stream 2                  0.75 btu/lbm.R
U        Heat transfer coefficient                  1000 btu/h.R
A        Area of heat exchange                      20 ft2

2.2. Model Equations


2.2.1. Mass Balance
$F_1 = F_{in1}$
$F_2 = F_{in2}$

2.2.2. Component Balance


$C_{a1} = C_{a,in1}$
$C_{a2} = C_{a,in2}$

2.2.3. Energy Balance


[Holman, 1972; ASHRAE, 1993]
$\frac{dT_1}{dt} = \frac{1}{\rho_1V_1}\left[F_1\rho_1(T_{in1} - T_1) - Q_t/C_{p1}\right]$
$\frac{dT_2}{dt} = \frac{1}{\rho_2V_2}\left[F_2\rho_2(T_{in2} - T_2) + Q_t/C_{p2}\right]$
$Q_t = \varepsilon Q_{max}$
where:
$\varepsilon = \begin{cases} \dfrac{1 - \exp(-N(1 - C))}{1 - C\exp(-N(1 - C))}, & C \neq 1 \\ \dfrac{N}{N + 1}, & C = 1 \end{cases}$
$N = UA_h/C_{min}$
$C = C_{min}/C_{max}$
$C_{min} = \min\{F_1\rho_1C_{p1}, F_2\rho_2C_{p2}\}$
$C_{max} = \max\{F_1\rho_1C_{p1}, F_2\rho_2C_{p2}\}$
$Q_{max} = C_{min}(T_h - T_c)$
$T_h = \max\{T_{in1}, T_{in2}\}$
$T_c = \min\{T_{in1}, T_{in2}\}$
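
These effectiveness-NTU relations map directly into code; the following MATLAB
sketch (the function name hx_duty and the parameter struct par are illustrative) computes
the transferred duty Qt from the inlet conditions.

    % Illustrative sketch of the effectiveness-NTU relations above.
    function Qt = hx_duty(F1, Tin1, F2, Tin2, par)
        C1   = F1*par.rho1*par.Cp1;            % capacity rate, stream 1
        C2   = F2*par.rho2*par.Cp2;            % capacity rate, stream 2
        Cmin = min(C1, C2);  Cmax = max(C1, C2);
        N    = par.U*par.A/Cmin;               % number of transfer units
        C    = Cmin/Cmax;
        if abs(C - 1) > eps                    % effectiveness
            eff = (1 - exp(-N*(1 - C)))/(1 - C*exp(-N*(1 - C)));
        else
            eff = N/(N + 1);
        end
        Qmax = Cmin*(max(Tin1, Tin2) - min(Tin1, Tin2));
        Qt   = eff*Qmax;                       % transferred heat duty
    end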

3. MIXER / FEED TANK

Matlab S-function: s_mixer_btu_cl.m

[Block diagram: MIXER S-function with inlet streams (F1, T1, Ca1) and (F2, T2, Ca2),
outlet stream (F, T, Ca), tank volume V and outlet-flow input F.]

3.1. Model I/O and Parameters


3.1.1. Input:
Fin1/2: Volume flowrate of input stream 1/2
Tin1/2: Temperature of input stream 1/2
Cain1/2: Concentration of component A in input stream 1/2
F : Volume flowrate of output stream (as controlled by output flow
controller)

3.1.2. Output
F: Volume flowrate of output stream
T: Temperature of output stream
Ca: Concentration of component A in output stream
V: Volume of tank

3.1.3. Parameter
None

3.2. Model Equations


3.2.1. Mass Balance
$\frac{dV}{dt} = F_{in1} + F_{in2} - F$

3.2.2. Component Balance


$\frac{d(VC_a)}{dt} = F_1C_{a1} + F_2C_{a2} - FC_a$

3.2.3. Energy Balance

$\frac{d(VT)}{dt} = F_1T_{in1} + F_2T_{in2} - FT$
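
A one-to-one MATLAB sketch of these mixer balances follows (the name mixer_rhs is
illustrative); the product-rule expansion converts d(V·Ca)/dt and d(V·T)/dt into state
derivatives.

    % Illustrative sketch of the mixer / feed tank balances above.
    function dx = mixer_rhs(x, u)
        V  = x(1); T = x(2); Ca = x(3);        % states
        F1 = u(1); T1 = u(2); Ca1 = u(3);      % inlet stream 1
        F2 = u(4); T2 = u(5); Ca2 = u(6);      % inlet stream 2
        F  = u(7);                             % outlet flow (controller)
        dV   = F1 + F2 - F;                    % mass balance
        dVCa = F1*Ca1 + F2*Ca2 - F*Ca;         % component balance
        dVT  = F1*T1 + F2*T2 - F*T;            % energy balance
        dx   = [dV; (dVT - T*dV)/V; (dVCa - Ca*dV)/V];
    end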

4. SPLITTER

Matlab S-function: not implemented in s-function

[Block diagram: splitter with inlet In and outlets Out1, Out2.]

In = [Fin, Tin, Cain, Cbin]^T
Out1 = [F1, T1, Ca1, Cb1]^T
Out2 = [F2, T2, Ca2, Cb2]^T
Parameters: [eta]^T

4.1. Model I/O and Parameters


4.1.1. Input:
Fin: Volume flowrate of input stream
Tin: Temperature of input stream
Cain: Concentration of component A in input stream
Cbin: Concentration of component B in input stream

4.1.2. Output
F1/2: Volume flowrate of output stream 1/2
T1/2: Temperature of output stream 1/2
Ca1/2: Concentration of component A in output stream 1/2

4.1.3. Parameters
Symbol   Description                     Nominal Value Used in Simulation
eta, η   Split fraction ratio (F1/Fin)   0.4

4.2. Model Equations


4.2.1. Mass Balance
F1 = ηFin
F2 = Fin − F1

4.2.2. Component Balance


$C_{a1} = C_{a2} = C_{a,in}$

4.2.3. Energy Balance
$T_1 = T_2 = T_{in}$

5. MIXING JUNCTION
Matlab S-function: threeway.m
[Block diagram: three-way mixing junction with inlets In1, In2 and outlet Out.]

In1 = [Fin1, Tin1, Cain1]^T
In2 = [Fin2, Tin2, Cain2]^T
Out = [F, T, Ca]^T

5.1. Model I/O


5.1.1. Input:
Fin1/2: Volume flowrate of input stream 1/2
Tin1/2: Temperature of input stream 1/2
Cain1/2: Concentration of component A in input stream 1/2

5.1.2. Output
F: Volume flowrate of output stream
T: Temperature of output stream
Ca: Concentration of component A in output stream

5.2. Model Equations


5.2.1. Mass Balance
$F = F_{in1} + F_{in2}$

5.2.2. Component Balance


$F_1C_{a1} + F_2C_{a2} - FC_a = 0$

5.2.3. Energy Balance


$F_1T_{in1} + F_2T_{in2} - FT = 0$

REFERENCES

ASHRAE (1993). Toolkit for HVAC System Energy Calculations, 1993.

De Nevers, Noel (1991). Fluid Mechanics for Chemical Engineers. McGraw-Hill.

Holman, J. P. (1972). Heat Transfer. McGraw-Hill.

Luyben, William, L (1989). Process modeling, simulation, and control for chemical
engineers. McGraw-Hill.

Malleswararao, Y.S.N., Chidambaram, M. (1992). Non-linear controllers for a heat
   exchanger. J. Proc. Cont., Vol. 2, No. 1, pp. 17-21.

APPENDIX B

Steady-State Equations of the General Purpose Pilot Plant

Reactor 1
Mass Balance: $F_2 - F_5 = 0$
Energy Balance: $F_2T_2 - F_5T_5 - \frac{U_{c1}A_{c1}}{\rho C_p}(T_{rx1} - T_{c1}) = 0$
Cooling Coil Energy Balance: $F_{c1}(T_{c,in} - T_{c1}) + \frac{U_{c1}A_{c1}}{\rho_cC_{pc}}(T_{rx1} - T_{c1}) = 0$

Mixer
Mass Balance: $F_5 + F_7 - F_9 = 0$
Energy Balance: $F_5T_5 + F_7T_7 - F_9T_9 = 0$

Reactor 2
Mass Balance: $F_9 - F_{12} = 0$
Energy Balance: $F_9T_9 - F_{12}T_{12} - \frac{U_{c2}A_{c2}}{\rho C_p}(T_{rx2} - T_{c2}) = 0$
Cooling Coil Energy Balance: $F_{c2}(T_{c,in} - T_{c2}) + \frac{U_{c2}A_{c2}}{\rho_cC_{pc}}(T_{rx2} - T_{c2}) = 0$

Feed Tank
Mass Balance: $F_{12} - F_{13} = 0$
Energy Balance: $F_{12}T_{12} - F_{13}T_{13} = 0$

Heat Exchanger 2
Energy Balance:
$F_7\rho(T_{14} - T_7) - Q_t/C_p = 0$
$F_6\rho_{hx}(T_6 - T_{15}) + Q_t/C_{p,hx} = 0$
where $Q_t$ is a function of $T_7$, $T_{14}$, $T_6$ and $T_{15}$.

Heat Exchanger 1
Energy Balance:
$F_2\rho(T_{16} - T_2) - Q_t/C_p = 0$
$F_1\rho_{hx}(T_1 - T_{17}) + Q_t/C_{p,hx} = 0$
where $Q_t$ is a function of $T_2$, $T_{16}$, $T_1$ and $T_{17}$.

Heat Exchanger 3
Energy Balance:
$F_{c3}(T_{c,in3} - T_{c3}) + \frac{U_{c3}A_{c3}}{\rho_cC_{pc}}(T_{13a} - T_{c3}) = 0$
$F_{13}(T_{13} - T_{13a}) - \frac{U_{c3}A_{c3}}{\rho_cC_{pc}}(T_{13a} - T_{c3}) = 0$
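
For use in the reconciliation, these steady-state balances are stacked into an equality-
constraint vector g(z) = 0; a partial MATLAB sketch is shown below (the name
plant_constraints, the ordering of z and the grouped parameters in par are all illustrative,
and only Reactor 1 and the Mixer are included).

    % Partial illustrative sketch: steady-state balances above stacked
    % as equality constraints g(z) = 0 for the joint DRPE problem.
    function g = plant_constraints(z, par)
        F2 = z(1); F5 = z(2); F7 = z(3); F9 = z(4);      % flows
        T2 = z(5); T5 = z(6); T7 = z(7); T9 = z(8);      % temperatures
        Trx1 = z(9); Tc1 = z(10); Fc1 = z(11);           % reactor 1
        g = [ F2 - F5;                                   % R1 mass
              F2*T2 - F5*T5 ...
                - par.UcAc1/(par.rho*par.Cp)*(Trx1 - Tc1);    % R1 energy
              Fc1*(par.Tcin - Tc1) ...
                + par.UcAc1/(par.rhoc*par.Cpc)*(Trx1 - Tc1);  % R1 coil
              F5 + F7 - F9;                              % mixer mass
              F5*T5 + F7*T7 - F9*T9 ];                   % mixer energy
    end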
