
Chemometrics and Intelligent Laboratory Systems 124 (2013) 32–42


Multi-block regression based on combinations of orthogonalisation, PLS-regression and canonical correlation analysis

Tormod Næs a,b,⁎, Oliver Tomic a, Nils Kristian Afseth a, Vegard Segtnan a, Ingrid Måge a

a Nofima, Oslovegen 1, 1430 Ås, Norway
b Department of Food Science, University of Copenhagen, Faculty of Science, Denmark

Article history:
Received 12 December 2012
Received in revised form 7 March 2013
Accepted 10 March 2013
Available online 18 March 2013

Keywords:
Multi-block
Regression
Collinearity
Invariance
Designed experiments
PLS regression

Abstract

This paper reviews the basic ideas underlying two new multi-block techniques, the SO-PLS and the PO-PLS regression methods. A discussion is given about how the two methods are related to each other and to standard regression and analysis of variance (ANOVA). In particular, the relation between SO-PLS and Type I ANOVA is underlined. It is shown how the sums of squares can be split according to the blocks introduced, and also how two-way ANOVA applied to cross-validated residuals can be utilised for testing the significance of the blocks. Different ways of interpreting the results for both methods are considered and illustrated by examples. Relations to other proposed methods and ideas are discussed.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

Exploring relationships between several large data sets is important in many areas of modern science. Some typical examples can be found in process modelling [17,18], in consumer science [26] and when relating data in the so-called omics area [12,31]. In some cases one is interested in understanding the relation between the data sets without any particular ordering of them, while in other cases one is interested in studying blocks of data with a predictive direction, here called multi-block regression (MBR). Note that the latter is closely related to the area of so-called data fusion [13,29].

Using regular least squares (LS) regression is often impossible in MBR since the independent variables are frequently highly collinear. In addition, one is also typically interested in exploring relations among the variables within each of the blocks, which is not possible using the LS principle. The multi-block partial least squares (MB-PLS) regression method [33] is an example of an MBR method which both solves the collinearity problem and provides additional plotting tools for better understanding of the different blocks and their contributions to the full model. In this paper we will focus on two alternative MBR methodologies which have some additional properties that are highly relevant in practical applications. The two methods are the SO-PLS (sequential and orthogonalised partial least squares) regression method and the PO-PLS (parallel and orthogonalised partial least squares) regression method. We refer to [18,28] for a description of the former and [23,24] for a description of the latter.

One of the main advantages of these methods as compared to MB-PLS regression is that they are invariant to the relative weighting of the blocks, which is an important feature if the data blocks have very different measurement units. Another advantage is that they handle situations with different underlying dimensionality (rank) of the blocks, which for instance means that they are suitable also for situations where one of the blocks is a design matrix while the other ones are highly multivariate and collinear. A third advantage is that they can be used for explicitly identifying spaces of additional and common variability of the input blocks.

The present paper will give an overview of the SO-PLS and PO-PLS methods with focus on the underlying ideas and the relationships between them. New features of the methods will also be presented and discussed. One of the main points here is to explore the relationships between SO-PLS and PO-PLS on one side and classical regression analysis and analysis of variance (ANOVA) on the other. An important aspect of this is to show how interactions between the blocks can be defined and incorporated in a similar way as done traditionally in polynomial regression and ANOVA. A third important aspect is to discuss some new ways in which the results can be interpreted. The methodology presented will be illustrated by an example from near infrared (NIR) and Raman spectroscopy calibrations of chemical constituents in food samples.

⁎ Corresponding author at: Nofima, Oslovegen 1, 1430 Ås, Norway. Tel.: +47 91352032.
E-mail addresses: tormod.naes@nofima.no, tormnaes@online.no (T. Næs).

http://dx.doi.org/10.1016/j.chemolab.2013.03.006

2. Methodology

In the present paper we will concentrate on finding relations between one output data block Y (dimension N by K) and two or several input data blocks, X1, X2, .., XB, having the same number of rows (N). We will here start with a presentation of the basic methodology using only two input blocks, X1 and X2, but later on discuss how this can be generalised to several blocks. Special focus will be on situations where the number of variables is large and there is a need for other methods than standard LS regression. This will, however, fit in as a special case since PLS regression with the maximum number of components is identical to LS if it exists. For an overview of PLS regression and some of its useful numerical and statistical properties for prediction and interpretation we refer to [22]. In all cases, the matrices are centred prior to analysis, which is common practice when using partial least squares (PLS) regression.

In the following we will describe the two mentioned methods separately. The first method considered, SO-PLS, is developed for modelling and analysing the additional or incremental contributions of new data blocks, while the PO-PLS method is developed for splitting the information in the blocks into so-called common and unique components or subspaces.

2.1. The SO-PLS method

The model for the original SO-PLS method [17,18] is the standard linear model, i.e.

Y = X_1 B_1 + X_2 B_2 + E    (1)

with Y having either one or several columns.

The first step of the SO-PLS method is to fit the matrix Y to X1 by PLS regression. The X2 is then orthogonalised with respect to the PLS scores T_{X_1} of X1, and the original Y matrix is fitted to the orthogonalised X2orth, again using PLS regression. Note that the deflated Y can also be used in the regression without changing the results. The PLS scores T_{X_1} and T_{X_2}^{orth} from X1 and X2orth are then finally used as independent variables in a LS regression with Y representing the dependent variables. As can be seen, the SO-PLS method is simply a sequential use of PLS regression in combination with orthogonalisation which focuses on the additional contribution of a new block. The number of components to incorporate in each of the two PLS regression models is determined by standard empirical validation methods, as will be discussed below.

Note that since 1^T (I - P_{T_{X_1}}) X_2 = 0, with P_{T_{X_1}} being the projection operator onto T_{X_1}, the orthogonalised X2orth is also centred, which means that PLS can be used in the second step without a new centring.

After fitting, the predictive model can be written as

\hat{Y} = T_{X_1} \hat{Q}_{X_1} + T_{X_2}^{orth} \hat{Q}_{X_2} = X_1 \hat{V}_{X_1} \hat{Q}_{X_1} + X_2^{orth} \hat{V}_{X_2} \hat{Q}_{X_2} = X_1 \hat{B}_1 + X_2^{orth} \hat{B}_2^*    (2)

where \hat{V}_{X_1} and \hat{V}_{X_2} are the weight matrices needed to compute the PLS scores [22] and the \hat{Q}s are the regression coefficients for the scores. The PLS loadings for the two models can be obtained by projecting X1 and X2orth onto their respective score matrices T_{X_1} and T_{X_2}^{orth}. The Bs are the regression coefficients for the two matrices in the units defined by the orthogonalisation process. For a transformation back to original units we refer to Section 3.1.

Note that the orthogonalisation determines an order of the blocks in the regression analysis. Since, however, each step is designed to describe as much as possible of the variation that is not already modelled, the differences between the orders from a prediction point of view will generally be moderate. This has not yet been proven mathematically, but is confirmed empirically. In some cases, for instance when a design matrix is involved [17], the order is quite obvious. When in doubt or when no order is preferred to another, one can try both and use this as an additional interpretation tool. Note that if the blocks are already orthogonal (as in some designed experiments), orthogonalisation does not make a change, and SO-PLS with the maximum number of components will give the same regression coefficients as standard LS regression based on the concatenated input matrix.

Since the estimation is based on fitting each block separately, the SO-PLS method is non-iterative, it is invariant to the relative scale of the blocks and it can explicitly handle different dimensionality (rank) of the blocks. This latter aspect means that the method is particularly suitable for combining for instance design variables and highly multi-collinear spectral data, as discussed in e.g. [18].

Note that the SO-PLS method can be used for situations both with and without an experimental relation between the blocks. An example of the former is when X1 represents an experimental process design and X2 represents measurements taken later on in the process. In this case, the X2 is influenced directly by X1, but also possibly by other phenomena. Orthogonalisation can in such cases be considered a way of "filtering out" the part of X2 that is due to X1 before interpreting the effect of X2. An example of a situation with no experimental relation between the input blocks is when X1 represents the design of an experiment in a pilot plant, and X2 represents measurements of the raw material properties.

2.2. The PO-PLS method

As opposed to SO-PLS regression, which focuses on the additional contribution of a block, the main aim of PO-PLS is to estimate common (i.e. overlapping) and unique subspaces across multiple data blocks without imposing any order [23,24]. In order to achieve this, a combination of PLS regression, orthogonalisation and Generalized Canonical Correlation Analysis (GCA; [5]) is used.

The model is the same as above, but now the focus is on splitting the contribution of (X1, X2) into a subspace that is common for both X1 and X2 and subspaces of X1 and X2 that are orthogonal to the common space. As for SO-PLS, this will be achieved by estimating component scores representing the three subspaces and using these score matrices as independent blocks in a regression model with Y representing the dependent variables.

In more detail, the PO-PLS is defined as follows:

1. In order to reduce dimensionality and to filter out as much as possible of the noise in the X-data before the use of GCA, the first step of PO-PLS is to do separate PLS regressions for each of the input blocks X1 and X2 and use only the relevant scores (defined by cross-validation, see e.g. [22]) for the rest of the estimation. Alternatively, one could use a regularised GCA approach in point 2 below [8,32], but this is not pursued here.

2. The GCA is then used to explore the correlation structure between the reduced subspaces for the two blocks. Subspace directions with a sufficiently high canonical correlation are defined as common, and denoted by T_C. More precisely, the T_C is defined by taking the averages of the linear combinations (GCA scores) of the two blocks X1 and X2 that have a correlation close to 1. If the correlation is exactly equal to 1, the two contributions are identical to each other. In practice, however, there will always be some noise left after the PLS and a lower value than 1 may be accepted for the purpose (see below).

3. Next, the PLS scores of X1 and X2 obtained in step 1 are orthogonalised with respect to the common components T_C, and the information left is regarded as unique for each block.

4. The orthogonalised PLS scores from 3 are finally restructured into a set of unique scores, T_{X_1 U}^{Orth} and T_{X_2 U}^{Orth}, by applying new PLS regressions with Y as response and the orthogonalised scores (treated independently) as predictors.
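The sequential logic of SO-PLS (PLS of Y on X1, orthogonalisation of X2 with respect to the X1 scores, PLS of Y on the residual block, and a final LS regression of Y on the combined scores) can be sketched in a few lines of numpy. This is an illustrative sketch, not the authors' implementation: the helper `pls_scores` is a minimal SVD-based PLS2 score extractor introduced only for this example, and the component counts `a1` and `a2` are fixed here rather than chosen by cross-validation as the paper prescribes.

```python
import numpy as np

def pls_scores(X, Y, n_comp):
    """Minimal PLS2 score extraction: per component, the weight is the
    dominant left singular vector of X'Y, and X is deflated. Illustrative only."""
    X = X.copy()
    T = []
    for _ in range(n_comp):
        w = np.linalg.svd(X.T @ Y, full_matrices=False)[0][:, 0]
        t = X @ w
        T.append(t)
        p = X.T @ t / (t @ t)        # X-loadings for deflation
        X = X - np.outer(t, p)       # deflate X
    return np.column_stack(T)

def so_pls(X1, X2, Y, a1, a2):
    # centre all blocks, as is common practice before PLS
    X1 = X1 - X1.mean(axis=0)
    X2 = X2 - X2.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    T1 = pls_scores(X1, Y, a1)                        # step 1: PLS of Y on X1
    C = np.linalg.lstsq(T1, X2, rcond=None)[0]
    X2_orth = X2 - T1 @ C                             # step 2: orthogonalise X2 w.r.t. T1
    T2 = pls_scores(X2_orth, Y, a2)                   # step 3: PLS of Y on X2_orth
    T = np.hstack([T1, T2])
    Q = np.linalg.lstsq(T, Y, rcond=None)[0]          # step 4: LS of Y on all scores
    return T @ Q, T1, X2_orth
```

Because X2_orth is orthogonal to T1 by construction, the final LS step splits the fitted sums of squares block by block, which is what links SO-PLS to Type I ANOVA later in the paper.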

The final prediction equation can then be written as

\hat{Y} = T_C \hat{Q}_C + T_{X_1 U}^{Orth} \hat{Q}_{X_1 U} + T_{X_2 U}^{Orth} \hat{Q}_{X_2 U}    (3)

The \hat{Q}s are, as above, regression coefficients for the scores.

Contrary to regular PCA and PLS regression, the scores in PO-PLS are scaled to unit variance, and not the loadings. The reason for this is that the common scores may represent several predictor blocks, but explain a different amount of variance in each of them. The variance is therefore most naturally retained in the individual block loadings instead of the scores. Note that this way of organising the data is also necessary for making the average score process in step 2 meaningful. The loadings for block i can be calculated using the simple formulae P_{Ci} = X_i^T T_C (common) and P_{Ui} = X_i^T T_{X_i U}^{Orth} (unique). Below we will discuss how to determine the dimensionality of the common space and also the unique subspaces.

Note that since GCA is used for obtaining the common space and PLS regression is run for each of the two orthogonalised blocks separately, the PO-PLS is also invariant to the relative scaling of the blocks. As can be seen, it also explicitly handles situations with different dimensionality or rank of the blocks. Note that the two orthogonalised X1 and X2 blocks are not orthogonal to each other, and therefore a full LS regression is needed to obtain Eq. (3).

For two blocks of variables, one can say that there is an exact or near collinearity between the blocks if there exist matrices C1 and C2 such that X1C1 spans the same or almost the same space as X2C2. As can be seen, this is closely related to the common space concept for PO-PLS. The common space definition above is thus an operationalisation of the concept of collinearity between blocks.

2.3. Comparing PO-PLS with SO-PLS

The space spanned by the common space from PO-PLS and the unique part of X1 represents a low-dimensional decomposition of X1 itself. Note that this is generally not the same as the X1 decomposition obtained by SO-PLS, since the two are obtained in different ways. Since X2orth is obtained as orthogonal to the PLS components of X1, while the unique (i.e. orthogonalised) X2 space in PO-PLS is obtained as orthogonal to the common space only (and not orthogonal to the unique X1 space), these two spaces are also generally different from each other. This means that all subspaces extracted by the two methods are different from each other, although there may be similarities for certain types of data. Note that the same consideration can be made for the opposite order of the blocks.

The decision about which method to choose will thus in practice depend on what aspects of the data set one is primarily interested in. If the focus is on understanding common variability, which is typical in spectroscopy when more than one instrument is available, the natural choice is the PO-PLS method. When the focus is on additional information, the natural choice is SO-PLS. The latter aspect is highly relevant when one is interested in, for instance, what can be gained by introducing an additional measurement principle. It can also be highly relevant in industrial process modelling when the focus is on investigating, for instance, how much raw material variation adds to the process settings in describing the end product quality. But of course, as always, one should also look at the prediction ability of the two methods when deciding which one to use.

3. Properties of the SO-PLS and PO-PLS methods

3.1. Back-fitting to original units

Since X2orth in model (2) is not the same as the original X2 matrix, it may sometimes be more natural to back-transform the model to original units before interpretation. For SO-PLS with two blocks, the process goes as follows:

\hat{Y} = X_1 \hat{V}_{X_1} \hat{Q}_{X_1} + X_2^{orth} \hat{V}_{X_2} \hat{Q}_{X_2}
        = X_1 \hat{V}_{X_1} \hat{Q}_{X_1} + (I - T_{X_1} (T_{X_1}^T T_{X_1})^{-1} T_{X_1}^T) X_2 \hat{V}_{X_2} \hat{Q}_{X_2}
        = X_1 \hat{V}_{X_1} (\hat{Q}_{X_1} - (T_{X_1}^T T_{X_1})^{-1} T_{X_1}^T X_2 \hat{V}_{X_2} \hat{Q}_{X_2}) + X_2 \hat{V}_{X_2} \hat{Q}_{X_2}
        = X_1 \hat{B}_1 + X_2 \hat{B}_2.    (4)

Note that the \hat{B}_2 in Eq. (4) is the same as the \hat{B}_2^* in Eq. (2). For PO-PLS, back-fitting to original units is only possible if the correlation coefficients for the common components are equal to one. If they are slightly lower, which will be the case in most applications, the common space is not exactly a subspace of either X1 or X2, and T_C can therefore not be substituted by X1 or X2 in Eq. (3).

3.2. Validation

As for regular regression, validation serves two purposes: determining the number of components to use for prediction and assessing the quality of the predictor obtained, usually measured by the root mean square error of prediction (RMSEP). In this paper we will only consider cross-validation, but other options such as prediction testing or bootstrapping can also be used. It is important to note that orthogonalisation can be seen as a linear prediction process, which holds equally well for regression fitting as for cross-validation.

For the SO-PLS, one has essentially two options for selecting the optimal number of components, as described in [28]: the global and the incremental estimation of components. The global approach is based on simultaneously determining the number of components in X1 and X2 based on the prediction ability of the joint predictor. One typically sets an upper limit for the number of components in each block and then searches through all possible combinations (with a restriction on the correlations between the GCA components for the blocks). In some cases one may see that different choices of components will give similar prediction results. This indicates that there is redundancy between the blocks, and one has a choice of where to put the overlapping information.

The incremental option is based on first determining the number of components in X1 and then in X2, keeping the number of components in X1 fixed. The prediction ability of the final model can then be determined by cross-validation or prediction testing the normal way, using the number of components already selected. The incremental approach fits better to the philosophy of additional fitting of data blocks in SO-PLS, but the global approach will generally give better cross-validation results.

For PO-PLS, one usually uses an incremental approach. To determine the number of common components in PO-PLS, three different aspects should be considered: the canonical correlation coefficient itself, the amount of explained variance of each X-block, and the amount of explained variance of Y. A valid common component should have a correlation coefficient close to one, and at the same time explain significant amounts of variance in the X and Y data blocks. The actual limit for the correlation coefficients depends on the noise level in the data. In our experience, the correlation should usually be above 0.90 or 0.95, at least for the type of spectroscopic data considered here.

Note that the process of finding the optimal number of components searches through many possibilities. If the model is to be used for prediction, it is therefore good practice to test the prediction ability of the method using an independent data set.

3.3. Interpretations

For multi-block regression, both prediction ability and interpretation of the contributions of the different blocks are important. Concerning interpretation, one is interested both in interpreting the variability within each block and how this variability relates to Y and the other blocks. This information can for SO-PLS and PO-PLS be obtained in various ways, as discussed in [18,23,24] and in [28].

One method that was proposed in [19] is to interpret the individual PLS models directly. For SO-PLS, this is typically done by projecting the block matrices, here X1 and X2, onto the corresponding PLS components, as also described above. The scores and loadings can then be interpreted in scatter plots the usual way [22]. The PO-PLS models can be interpreted in the same manner, obtaining loadings by projecting both X1 and X2 onto the common and unique scores. The common components will thus have loadings from both blocks, which can be used to interpret the features in the two blocks that are related to the common space.

An alternative method for interpreting SO-PLS solutions was suggested in [28]. This is based on the so-called principal components of prediction method (PCP, [20]) after back-fitting to original units of the input blocks (see above). The PCP method was originally developed as an alternative to standard interpretation of PLS models and is based on post-interpretation of an already fitted PLS model. The first step is to use PCA on the predicted Y-values, \hat{Y}, which means that the method is only useful for multivariate Y. The scores of this PCA are by definition linear combinations of \hat{Y}, which again represent linear combinations of X1 and X2. The PCP components are therefore linear combinations of the X1 and X2 blocks themselves. The coefficients are essentially regression coefficients of the PCA scores of \hat{Y} onto X1 and X2. The coefficients are therefore most easily computed by regressing the scores from \hat{Y} directly onto (X1, X2) (with a perfect fit, by definition). As a result, the PCP method provides scores and loadings from the PCA of \hat{Y} and X-loadings obtained by regression.

A third possibility for interpretation of SO-PLS, which is inspired by the ASCA method [15,30], is to calculate both the contributions T_{X_1} \hat{Q}_{X_1} and T_{X_2}^{orth} \hat{Q}_{X_2} to \hat{Y} and then use a PCA for each of them. This is identical to the original ASCA if the two input blocks are orthogonal to each other. A variant of this that will be illustrated here is to project the T_{X_1} \hat{Q}_{X_1} onto the first few principal components of \hat{Y}. In this way one can directly interpret both how the first part of the model relates to the predicted Y-values and also see how the second part T_{X_2}^{orth} \hat{Q}_{X_2} adds to the prediction ability. A similar approach can be used for PO-PLS.

3.4. Type I and Type III sums of squares

The general idea behind the SO-PLS has much in common with the idea underlying the calculation of Type I SS's in ANOVA and regression, i.e. to first fit a variable and sequentially evaluate the additional contributions of the next variable introduced to the model.

If we denote the total sums of squares by SS_{Tot} = diag(Y^T Y) and the sums of squares of the contributions of X1 and X2orth in SO-PLS by

SS_{X_1} = diag((X_1 \hat{B}_1)^T (X_1 \hat{B}_1))    (5)

and

SS_{X_2}^{orth} = diag((X_2^{orth} \hat{B}_2^*)^T (X_2^{orth} \hat{B}_2^*))    (6)

the following relation holds by orthogonality:

SS_{Tot} = SS_{X_1} + SS_{X_2}^{orth} + SS_E.    (7)

As usual, the SS_E is the residual sums of squares. In Eqs. (5) and (6) the regression coefficients \hat{B}_1 and \hat{B}_2^* are the regression coefficients defined in Eq. (2). As can be seen, this is identical to the formula that is used for standard Type I ANOVA. Since PLS regression with the maximum number of components is identical to LS, the SO-PLS method is therefore a direct generalisation of standard Type I LS fitting of blocks of variables to a response matrix.

The Type III SS idea of considering the additional contribution of a variable (or a block of variables) when all others are present in the model can also easily be accommodated within the framework presented here. For the SO-PLS, this corresponds to switching the order of the blocks and considering the effect of the latter block for both orders. For the PO-PLS, only the Type III SS can be computed, due to lack of orthogonality. For the X1 block, the Type III sum of squares is thus obtained by simply comparing the SS's based on fitting the full model and only the two terms T_C and T_{X_2 U}^{Orth}. The same procedure can be used for X2 and the common space contributions, as will be demonstrated below.

3.5. Statistical testing — ANOVA

Since degrees of freedom (DF's) are a much more complex concept for PLS regression than for LS regression, testing the importance of the different blocks in SO-PLS and PO-PLS is not an obvious task. In the following we will, however, propose to use the CV-ANOVA (cross-validated ANOVA) method, based on comparing prediction abilities of two or several predictors, for this purpose [6,14]. The CV-ANOVA is based on computing the squared predicted residuals (or the absolute residual values) for all objects and methods to be compared using cross-validation and then comparing their averages over objects using 2-way ANOVA. One of the "ways" in the ANOVA table thus represents the different methods tested and the other "way" corresponds to the objects to be predicted. In the Type I SO-PLS case, the different "prediction methods" to be compared correspond to the model with no input block (i.e. predictions based on averages only), the model with only X1 in it and the model with both matrices X1 and X2 in it. For the Type III PO-PLS, the "prediction methods" to be compared are the prediction methods based on the full model and the different sub-models keeping one of the blocks out. The CV-ANOVA tests will in both cases be statistical tests of the importance of bringing in new blocks of data. In the CV-ANOVA table one can, in addition to p-values, also present the RMSEPs (or the SS's) after each of the blocks in order to highlight the size of the additional contributions of the blocks.

Note that the CV-ANOVA is different from regular ANOVA testing in the sense that it does not compare the SS's with the SSE. Instead it uses the residuals from the actual model as the "errors". This distinction is comparable to the distinction between standard forward variable selection and forward selection using Type I ANOVA methods. In the former, the actual effects are considered using the corresponding residuals as the errors, while in regular Type I ANOVA effects are tested against the error obtained when all systematic effects are in the model.

3.6. Extension to several blocks

The above procedures were discussed for two blocks of input data X1 and X2, but both methods can quite easily be extended to any number of blocks. For SO-PLS, one simply repeats the same procedure, using orthogonalisation and simple PLS regression for each new block. The orthogonalisation in this case is done with respect to all the blocks that have already been fitted. Again, the order of the blocks fitted must be determined, and the same comments as given above apply.

For PO-PLS, however, the complexity increases when more blocks are added. If more than two blocks are present there are common components at several levels; components can be common for all blocks together, or just for a subset of blocks. This represents a combinatorial problem, but in principle there is nothing unique which cannot be

handled in the same way and context as with two blocks. The problem is that more decisions have to be made about the number of components and about in which order to incorporate the matrices. For an application with three input blocks we refer to [24].

3.7. Incorporating interactions in SO-PLS regression

[27] proposed to incorporate interactions in SO-PLS by adding an extra matrix to model (1), consisting of element-wise products of linear combinations of X1 and X2. This means that each column of X1C1 is multiplied by each column of X2C2, where the C matrices represent linear combinations of the two input blocks. If we denote this column-wise multiplication by *, the model (1) can then be extended to

Y = X_1 B_1 + X_2 B_2 + (X_1 C_1 * X_2 C_2) B_3 + E.    (8)

The last matrix X1C1 * X2C2 is then considered an extra input block X3, and the standard SO-PLS procedure with three input blocks can be used. This means that the interaction term is fitted by PLS after having been orthogonalised with respect to both X1 and X2. The interaction matrix should be centred before it is used in the modelling. Note again the similarity with the Type I philosophy of incorporating main effects (i.e. X1 and X2) before interactions.

A particularly interesting aspect of this way of defining block interactions is that it preserves the invariance with respect to the relative weighting of the blocks. Another important aspect is that it comprises several of the most interesting possibilities. First of all, it comprises direct multiplication of the original variables and direct multiplication of subsets of the variables after some type of variable selection. Secondly, it covers multiplication of principal components of X1 with principal components of X2. This latter aspect is particularly important when the number of variables is very high. It may also in such cases be more natural to think of interactions as something which relates to the underlying dimensions as determined by PCA rather than between the manifest variables themselves. Another important aspect is that it is a direct generalisation of what is done in standard polynomial regression and ANOVA (or ANCOVA) methodology for incorporating interactions. In other words, the above interaction proposal can be considered a direct generalisation of standard linear model methodology to situations with large and highly collinear blocks of data. Interpretation of the interactions can be done as in [27], using line plots of the regression coefficients for the interactions, or by simply plotting scores and loadings as proposed above. The significance of the interactions can be tested by the CV-ANOVA method.

Non-linear effects are also possible to incorporate in the SO-PLS method. The simplest way of doing this is to incorporate, for instance, squares of the individual variables as new variables in the blocks. Note that this is essentially identical to what is done by [3] for incorporating quadratic effects in regular PLS regression. It is also an extension of standard polynomial regression methodology. Another possibility, which is also a generalisation of polynomial regression, is to add an extra block of, for instance, quadratic terms, i.e. adding extra blocks X1^2 and X2^2 based on

meat and fish [1]. To achieve this, an emulsion model system was created to mimic real foods. The current data set consists of 69 emulsions where the contents of whey proteins, water and fats are varied according to a mixture design. In addition, the fat composition in these samples is varied by including mixtures of 5 different vegetable and marine oils according to another mixture design. Thus, the 69 samples in the data set exhibit variation both related to the main ingredients (i.e. proteins, water, and fats) and the fatty acid composition. The samples were analysed using two different spectroscopic techniques, namely near-infrared (NIR) spectroscopy [9] and Raman spectroscopy [7].

The example highlights a generic challenge in quantitative spectroscopy that is not frequently discussed: what is the optimal way to present or calculate variables for predictions? The nature of the spectroscopic technique in question should ideally guide this choice. For Raman spectroscopy, a technique with a high ability to provide detailed chemical information from minor components in foods, it is natural to present the variables relative to the total amount of these minor components in the sample, like fat content in the present example. For NIR, on the other hand, predictive ability is often restricted to major components, like fat, water and protein contents. Thus, describing major components relative to the total content of the sample seems like a natural choice. But, when describing minor components relative to total sample contents, one runs a considerable risk of unwanted inter-correlations between the minor and major components in the sample, thus potentially reducing the robustness and usefulness of the obtained calibration.

According to these general points, fatty acid information could in this study be expressed in two different ways: 1) fatty acid in percentage of total sample weight, and 2) fatty acid in percentage of total fat content. In this study, one specific fatty acid feature, namely the contents of polyunsaturated fatty acids (PUFA), is expressed both in percentage of total sample weight (Y1 — PUFA% emul) and in percentage of total fat content (Y2 — %PUFA). The hypothesis is that the calibration results for the two spectroscopic methods would be highly dependent on which of these is used. For NIR the first alternative is expected to be the better one, while for Raman the second approach is expected to be superior.

The number of variables in the NIR spectrum used was equal to 301 (1400 nm–2000 nm) and the number of variables in the Raman spectrum used was equal to 1096 (675 cm−1–1770 cm−1). This means that using LS regression is impossible in this case. According to experience and underlying theory, the main information in the spectra is usually to be found in the first relatively few principal components of the spectra. For both types of spectroscopy, linear PLS regression is the standard methodology used for calibration.

4.2. Pre-treatment of data

Although both responses describe % PUFA, they are not directly comparable, since the bases for calculating the percentages are different (wrt. total sample weight versus wrt. fat content). The Y1 varies between 0.3% and 11.6%, while Y2 varies between 2.2% and 61.6%. The responses
quadratic terms. The advantage of this is that one can explicitly assess are therefore divided by their own standard deviation before analysis.
the importance of adding non-linearities using the ANOVA type of After model fitting, the prediction results are transformed back to orig-
tests above. Non-linear blocks can be incorporated before or after the inal units. The correlation between the two variables is equal to 0.73.
interactions. The X1 and X2 data are not standardised, which is the common practice
in spectroscopy for interpretation purposes. The spectroscopic data
4. Illustration of the methodology were, however, pre-processed in standard ways; MSC [10] was used
for the NIR data and baseline correction and subsequent SNV [2] were
4.1. The data set and problem considered used for the Raman data.
The scores from the two PLS models obtained for X1 and X2orth
The data set originates from a spectroscopic study where the main were investigated for outliers. No tendency of outlying samples was
aim was to investigate the potential of using different spectroscopy detected and therefore all 69 samples were used. Further interpreta-
techniques to quantify the fat composition in food products like tion of these models will not be attempted here.
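The PCA-based construction of a block-interaction matrix described above (products of principal-component scores of X1 and X2) can be sketched as follows. This is an illustrative numpy sketch, not the authors' implementation; all function names are assumptions.

```python
import numpy as np

def pca_scores(X, a):
    """Scores of the first a principal components of a centred block."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :a] * s[:a]

def pc_interaction_block(X1, X2, a1, a2):
    """Interaction block built from all pairwise products of the leading
    principal-component scores of X1 and X2 (illustrative names)."""
    T1, T2 = pca_scores(X1, a1), pca_scores(X2, a2)
    cols = [T1[:, i] * T2[:, j] for i in range(a1) for j in range(a2)]
    Xint = np.column_stack(cols)
    return Xint - Xint.mean(axis=0)  # centre before use in the model
```

Because the construction works on scores rather than on raw columns, it stays manageable even when each block contains hundreds of highly collinear variables.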

Fig. 1. Måge plot. The plot to the right represents a magnified view of the bottom area. The RMSEP is obtained as the square root of the average of the mean squares of the two responses after back-transformation to original units. The symbols given in the plot (for instance _6_5) indicate the number of components in the first and second blocks respectively.
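The combined error measure described in the caption can be computed directly. The sketch below (numpy, illustrative names) assumes the cross-validated residuals are in standardised units and that the per-response standard deviations used for the scaling are available:

```python
import numpy as np

def combined_rmsep(residuals_scaled, sds):
    """Combined RMSEP as described for the Måge plot: square root of the
    average of the per-response MSEPs after back-transformation to
    original units.

    residuals_scaled: (n_samples, n_responses) CV residuals in
    standardised units; sds: standard deviations used for scaling."""
    res_orig = residuals_scaled * np.asarray(sds)  # back-transform
    msep = np.mean(res_orig ** 2, axis=0)          # MSEP per response
    return float(np.sqrt(msep.mean()))
```

Evaluating this quantity over a grid of component numbers for the two blocks gives the points plotted in Fig. 1.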

4.3. Results from the SO-PLS method

The two data blocks were fitted in both orders according to the SO-PLS procedure described above. The prediction results were almost identical for the two orders. In this paper we will therefore concentrate on the results using NIR as X1 and Raman as X2. The prediction results are presented in Fig. 1 using the so-called Måge plot [28]. In this case the RMSEP is based on the sum of the MSEPs of the two responses after back-transformation of the responses to original units. It should be mentioned that other possibilities could have been used, either using the average before back-transformation or looking at the two responses separately. For the purpose of illustrating the methodology we confine ourselves to one solution here.

For obtaining a better view of the best possibilities, a magnification is made for the lower section of the curve. Since the prediction ability is here quite stable as a function of the total number of components, searching for the absolute minimum is a reasonable strategy. As can be seen, using 5 and 7 components in the two blocks gives the best results, but several other options give very similar results. It should, however, be mentioned that some users of this type of spectroscopy may prefer to stop component selection at a lower number of components in order to get a more stable model in the long run [25], but such an approach is not pursued here.

The explained variances based on cross-validation for this optimal number of components, after fitting of X1 and after fitting of both X1 and X2, are shown in Fig. 2. This figure shows that the NIR block explains more than 80% of the variance of Y1, but only 50% of Y2. When the Raman block is added, however, one can see that both explained variances reach a very high level (about 98%). This shows that NIR is, as expected, better at predicting Y1 than Y2, and that Raman spectroscopy is needed in order to predict Y2 well. As can be seen, Raman also improves the prediction of Y1. The CV-ANOVA table based on absolute values of the prediction residuals for this analysis is given in Table 1.

Fig. 2. Prediction ability of the SO-PLS method. Explained variance (based on cross-validation) for the two response values after the first and after the second block (5, 7 PLS components respectively).

Table 1
CV-ANOVA table (Type I) for both response variables (based on absolute values of the cross-validated residuals) using the SO-PLS method. The RMSEP values represent the prediction ability before and after each step when adding the data blocks.

Source    | Y1: RMSEP | Y1: p-value | Y2: RMSEP | Y2: p-value
Mean      | 2.79      |             | 15.75     |
Adding X1 | 1.37      | <0.0001     | 11.10     | 0.001
Adding X2 | 0.69      | <0.0001     | 2.23      | <0.0001

It is clear from this table that both steps of adding spectral variables improve prediction ability significantly. The table also presents the RMSEPs after each block and thus indicates the size (in a similar way as Fig. 2) of the incremental improvements of adding extra measurements. For the other order of the spectroscopic methods, it is clear that Raman is better than NIR spectroscopy used alone, but also in this case, using both spectroscopic methods improves the prediction ability.

The principal components of the predicted Y-values (after back-transformation to original units) are presented in Fig. 3. As can be seen, the first axis is primarily a Y2 axis with some contribution from Y1, while the second axis is totally dominated by Y1. This is to be expected from the very different units of the two Y variables. The score plot is very similar to the design of the experiment (not shown), which is also to be expected because of the very high prediction ability.

Fig. 3. PCA of predicted Y-values (based on the SO-PLS method).

The PCP loadings for the two components and for the two input blocks are presented in Fig. 4. This plot can then be used for interpreting the two axes in Fig. 3, i.e. for interpreting which parts of the NIR and Raman spectra are important for predicting the two responses. The Raman loadings for the horizontal component axis (solid line) show a signature frequently encountered for polyunsaturated fatty acids. The two major positive bands (at 1263 cm−1 and at 1658 cm−1) are both related to vibrations of unsaturated carbon–carbon chains, whereas the two major negative bands (at 1304 cm−1 and at 1439 cm−1) are related to vibrations of saturated carbon–carbon chains. The corresponding NIR loadings reveal a clear band at 1706 nm, directly linked to vibrations of polyunsaturated carbon–carbon chains (or fatty acids), and a broad band related to water vibrations. Thus, the X-loadings corresponding to the first axis show a logical link towards the signature expected for polyunsaturated fatty acids, and thus Y2 (or %PUFA) (see also [1]).

For the second component (the vertical one in Fig. 3) the X-loadings (dotted line) clearly include additional information from other main components (i.e. protein, water and total fat contents). This should also be expected, as the second component mainly explains the variation of Y1, where the content of PUFA is expressed as percentage of total sample weight. The Raman loadings of the second component resemble the signature of a protein Raman spectrum, whereas the NIR loadings reveal features first of all related to fat contents and water [1].

The projections of the NIR predictions onto the principal components of the predicted Y-values obtained by using both input blocks are presented in Fig. 5. The samples are linked with a line in order to highlight the direction of improvement obtained when adding Raman spectroscopy. As can be seen, the additional contribution obtained when incorporating the Raman spectra corresponds mainly to a horizontal shift in the plot. This corresponds well with the fact that this axis is, according to Fig. 3, most strongly linked to Y2 (see also Fig. 2). The interpretation of this effect also corresponds well with the loadings in Fig. 4 as described above.

For comparison, the explained variances (from CV) of Y1 and Y2 were 90.1% and 97% when calculated by MB-PLS, i.e. the concatenated PLS regression (with both blocks scaled to have the same sum of squares). As can be seen, the results are good, but not as good as the corresponding results obtained by the SO-PLS method. It should, however, be mentioned that the SO-PLS searches through more possibilities when selecting the number of components, and a fairer comparison would be to compare the methods on an independent data set.

4.4. Results from the PO-PLS method

In order to reduce dimensionality and noise before the canonical correlation analysis, the first step of the PO-PLS modelling is to estimate separate PLS models for each predictor block. In this case this resulted in a 10-component model for X1 (NIR), explaining 91.5% of the Y-variation (95.9% and 87.1% for the two responses, Y1 and Y2 respectively), and a 5-component model for X2 (Raman) with an explained variance equal to 92% (87.1% and 96.5% for the two responses). The corresponding cross-validation values for the two responses are 87.1% and 75.7% for NIR, and 83.1% and 95.3% for Raman.
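The sequential SO-PLS fitting used in Section 4.3 can be sketched as follows. This is an illustrative numpy sketch, not the authors' code: a compact SVD-based PLS2 score extraction is used for brevity, and all names are assumptions.

```python
import numpy as np

def pls_scores(X, Y, a):
    """Compact PLS2 score extraction: at each step the X-weight is the
    dominant left singular vector of X'Y, and both X and Y are deflated
    on the resulting score."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    scores = []
    for _ in range(a):
        w = np.linalg.svd(X.T @ Y, full_matrices=False)[0][:, 0]
        t = X @ w
        scores.append(t)
        X = X - np.outer(t, t @ X) / (t @ t)  # deflate X on t
        Y = Y - np.outer(t, t @ Y) / (t @ t)  # deflate Y on t
    return np.column_stack(scores)

def so_pls(X1, X2, Y, a1, a2):
    """SO-PLS sketch: PLS on X1, orthogonalisation of X2 with respect to
    the X1 scores, PLS of the orthogonalised X2 on the Y-residuals, and
    finally LS regression of Y on all scores."""
    Yc = Y - Y.mean(axis=0)
    T1 = pls_scores(X1, Y, a1)
    # Residuals of Y and X2 after the first block's scores
    E = Yc - T1 @ np.linalg.lstsq(T1, Yc, rcond=None)[0]
    X2c = X2 - X2.mean(axis=0)
    X2o = X2c - T1 @ np.linalg.lstsq(T1, X2c, rcond=None)[0]
    T2 = pls_scores(X2o, E, a2)
    T = np.column_stack([T1, T2])
    B = np.linalg.lstsq(T, Yc, rcond=None)[0]
    return Y.mean(axis=0) + T @ B, T1, T2, X2o
```

The orthogonalisation guarantees that the X1 scores and the orthogonalised X2 block are exactly orthogonal, which is what makes the sequential sums of squares additive in the Type I sense discussed earlier.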

Fig. 4. The loadings for the two first PCP components based on the SO-PLS solution. The solid line represents the first component and the dotted line the second. The units for the two horizontal spectroscopic axes are wavelength (nm) and Raman shift (cm−1).
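The projection used for Fig. 5 (a PCA of the predictions from the full model, with the single-block predictions projected onto the same components) can be sketched as follows; the function names are illustrative:

```python
import numpy as np

def pcp_projection(Yhat_full, Yhat_block):
    """PCA of the predictions from the full model, then projection of the
    single-block predictions onto the same principal components."""
    mu = Yhat_full.mean(axis=0)
    U, s, Vt = np.linalg.svd(Yhat_full - mu, full_matrices=False)
    scores_full = U * s                      # PCP scores, full model
    scores_block = (Yhat_block - mu) @ Vt.T  # block predictions, same axes
    return scores_full, scores_block
```

Linking each sample's block score to its full-model score (as in Fig. 5) then shows along which PCP axis the extra block improves the predictions.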

Fig. 5. Projections of Ŷ = X1B̂1 onto the two principal components of Ŷ = X1B̂1 + X2B̂2. The numbers represent the predictions with both input blocks.

The next step is to define the common space based on the canonical correlation coefficients and the explained variances in X1, X2 and Y. The number of common components is here decided based on Fig. 6, where all relevant parameters are plotted together. The first canonical component has correlation coefficient 0.99, and explains a major part of the variation in Y1 and Y2 (black bars), and also a significant amount of the variation of both X1 and X2 (grey bars). The first component is therefore definitely a valid common component. The second component also has a high correlation coefficient (0.97), but it explains no variation in X1. This component should therefore not be regarded as common, since it is practically not present in one of the blocks. The third component has correlation coefficient 0.95, and explains significant parts of X1, X2 and Y1. It is therefore here determined to be a common component. Components 4 and 5 have too low correlation coefficients and variances to be considered common. The conclusion is therefore that there are two overlapping dimensions between X1 and X2, where the first one is very important for both Y-variables, while the last mostly describes Y1.

The X1 and X2 blocks are then orthogonalised with respect to the two common scores, and new PLS models are fitted to represent the unique parts of each block. Cross-validation indicates that there are 7 unique components in X1 and 3 unique components in X2. With these choices of the number of components (2, 7, 3), the full PO-PLS model explains 98% of the variation in Y, and 95% when validated by CV (both obtained by averaging the explained variances over the two responses). As can be seen, this is substantially better than can be obtained by any of the spectroscopic methods used separately.

The ANOVA table (Type III) for the PO-PLS model is given in Table 2. Sums of squares were in all cases calculated for each of the responses separately. The CV-ANOVA was, as above, based on absolute values of prediction residuals. The results show that the common space is clearly the most important contribution. The unique part of X1 (NIR) contributes significantly to explaining the remaining variation in Y1, and X2 (Raman) to explaining the remaining variation of Y2, which is to be expected from the above discussion of the properties of the two spectroscopic methods.

Table 2
CV-ANOVA table (Type III) for the PO-PLS method.

Source    | Y1: Type III SS | Y1: p-value | Y2: Type III SS | Y2: p-value
Model     | 66.87           |             | 65.95           |
Common    | 56.96           | <1e−005     | 51.68           | <1e−005
Unique X1 | 5.14            | <0.01       | 2.08            | <0.5
Unique X2 | 2.20            | <0.10       | 2.24            | <0.01
Residual  | 1.13            |             | 2.05            |
Total     | 68              |             | 68              |

The scores and loadings for the two common components are given in Fig. 7. The score plot is coloured (shades ranging from grey to black) according to the value of Y1. As can be seen, there is clearly a diagonal pattern, i.e. the variation in Y1 is caused by both common components. When colouring the scores according to Y2, the pattern is mainly along the first common component (not shown). This means that the common space is related to both response variables.

The common components' loadings contain information about common features in X1 and X2, and should therefore be interpreted together. While the X1 loadings are smooth with few distinct features in them, the X2 loadings are much noisier with many sharp peaks, which could be expected due to the nature of the two spectroscopic techniques used. The X1 and X2 loadings reveal the same signatures as the first component of the SO-PLS PCP-loadings (only with opposite signs), and there is thus a clear and logical link to the expected spectral signatures of polyunsaturated fatty acids. The first common component (solid line) is clearly related to the polyunsaturated fatty acid content in the samples, and thus Y2. In the same way, the second common component (dotted line) seems to be more related to the main components (i.e. protein, water and total fat) of the samples, in resemblance to the SO-PLS results. The second common PO-PLS component thus reveals a natural link to Y1, where the content of PUFA is expressed as percentage of total sample weight.
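The orthogonalisation of the blocks with respect to the common scores, used above before fitting the unique-part PLS models, amounts to a single projection step. A numpy sketch (illustrative, not the authors' code):

```python
import numpy as np

def orthogonalise(X, T):
    """Remove the common-score subspace from a (centred) block:
    X_orth = X - T (T'T)^(-1) T' X, so that T' X_orth = 0."""
    P = T @ np.linalg.lstsq(T, X, rcond=None)[0]  # projection onto span(T)
    return X - P
```

After this step, any component extracted from the orthogonalised block is by construction uncorrelated with the common scores, which is what separates the unique from the common contributions in the Type III sums of squares.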

Fig. 6. Candidates for common components. Correlation coefficients (×100) are plotted as dots with a connecting line. Explained variance for the two Y-variables is given by black bars and explained variance for each X-block by grey bars.
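The canonical correlation coefficients plotted in Fig. 6 can be obtained from orthonormal bases of the two score sets. The numpy sketch below is illustrative (it includes a deterministic toy example in which the two blocks share exactly one direction):

```python
import numpy as np

def canonical_correlations(Ta, Tb):
    """Canonical correlation coefficients between two score sets, computed
    as the singular values of Qa'Qb, where Qa and Qb are orthonormal
    bases for the two column spaces."""
    Qa, _ = np.linalg.qr(Ta)
    Qb, _ = np.linalg.qr(Tb)
    return np.linalg.svd(Qa.T @ Qb, compute_uv=False)

# Toy example: one exactly shared direction gives a first coefficient of 1;
# the remaining directions are mutually orthogonal, so the second is 0.
c = np.array([1.0, 1.0, 1.0, 1.0])
r1 = np.array([1.0, -1.0, 0.0, 0.0])
r2 = np.array([0.0, 0.0, 1.0, -1.0])
rho = canonical_correlations(np.column_stack([c, r1]),
                             np.column_stack([c, r2]))
```

In PO-PLS, coefficients close to 1 that also carry substantial explained variance in both X-blocks (as judged from Fig. 6) are the candidates for common components.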

[Fig. 7 panels: common scores; NIR loadings; Raman loadings.]

Fig. 7. Scores and loadings for the common components in PO-PLS. The score plot is coloured according to Y1 (light grey — low value, black — high value). The loading plots correspond to the first component (solid) and the second component (dotted).

Regression coefficients for the unique blocks are given in Fig. 8. Only the regression coefficients of the significant blocks are shown, i.e. only the regression coefficients for X1 when predicting Y1, and the coefficients from X2 when predicting Y2. Note that the main part of the Y-variation is already modelled by the common components, and the unique contributions only play a smaller role (see Table 2). The regression coefficients do therefore not show very clear, distinct features. For X1 the main spectral features are found, as expected, around 1700 nm, where bands related to polyunsaturated carbon–carbon chains are found. For X2 the corresponding main variation is found around the C=C stretching bands around 1660 cm−1. Both contributions probably adjust for spectral shifts due to different degrees of unsaturation in the samples.
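Assuming mutually orthogonal, centred common and unique score sets, the final PO-PLS predictor can be assembled by one joint least-squares step. The numpy sketch below is an assumption-laden illustration (names and the LS formulation are ours, not the authors' exact estimator):

```python
import numpy as np

def po_pls_predict(Tc, U1, U2, Y):
    """Regress Y jointly on common scores Tc and unique scores U1, U2
    (all assumed centred). Because the three score sets are mutually
    orthogonal, each block's contribution can also be read off
    separately from the corresponding coefficients."""
    Z = np.column_stack([Tc, U1, U2])
    B, *_ = np.linalg.lstsq(Z, Y - Y.mean(axis=0), rcond=None)
    return Y.mean(axis=0) + Z @ B
```

The orthogonality is what allows the Type III decomposition in Table 2: removing, say, the unique X1 scores changes only the X1 part of the fit.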

Fig. 8. Regression coefficients for the unique blocks of the PO-PLS model. The unique part of X1 is significant only for Y1, and only the Y1 coefficients are therefore shown in the left plot. Likewise, the unique part of X2 is significant only for Y2, and only the Y2 coefficients are shown in the right plot.

5. Relations to other multi-block methods

The methodology presented here is, in addition to being a generalisation of certain aspects of standard ANOVA, related to a number of alternative methods discussed in the literature. The closest alternative is MB-PLS or concatenated PLS [33]. The MB-PLS regression is, however, not invariant to the scaling of the blocks, and it does not approach different dimensionality and incremental contributions explicitly. It does focus on common variability among the blocks, but in a different and less explicit way than proposed here. It is shown in [17] that MB-PLS is sometimes less easy to interpret when one of the blocks is a design matrix.

The SO-PLS also has a relation to the method proposed in [21]. In that paper, the focus is on process modelling and the possibility of filtering data prior to process modelling. The filter proposed orthogonalises the signal relative to some instrumental variables of no interest, in this way obtaining data which are more relevant for the purpose. This filter is identical to the orthogonalisation step in SO-PLS.

Another important relation is to the ASCA method [15,16] and to the closely related ANOVA-PCA of [11]. The ASCA method is developed for designed experiments with multivariate output. The ASCA estimates each of the effect matrices (for each factor, assumed to be orthogonal to each other) separately and then uses a PCA on each. When the design is a fully balanced factorial design, the SO-PLS estimation with the maximum number of components in the PLS regressions is identical to ASCA. The orthogonalisation becomes unnecessary (redundant) and the factor estimates are obtained directly with LS regression. The SO-PLS can in this sense be considered a direct generalisation of ASCA to situations where the data blocks do not represent different factors in a designed experiment and where the input variables are highly collinear. Note also that the idea presented above of projecting each of the block predictions onto a PCA of the predicted Y-values stems from analogies with what is done for ASCA. The ANOVA-PCA is very similar to ASCA, with the exception that the residuals are added to the effects prior to PCA, thus also obtaining information about the uncertainty in the plots.

The paper by [4] also has some similarities with the approach taken here. It is based on a generalisation of redundancy analysis to several blocks. The problem with this approach is, however, that it does not handle highly collinear data. For instance, it cannot be used in situations with more variables than samples (as was the case in the example used here) without prior compression of the data. This method is not invariant to the relative scaling of the blocks.

6. Conclusion

This paper has presented and discussed two different methods for handling multi-block regression analysis. The methods are both based on orthogonalisation and PLS-regression, they are both invariant to the relative weighting of the blocks, and they handle explicitly different underlying dimensionality of the blocks. The two methods SO-PLS and PO-PLS focus on incremental and joint/unique variability of the blocks respectively, and therefore supplement each other for interpretation. Both methods handle highly collinear data, and the SO-PLS can also incorporate interactions among the blocks. Validation, interpretation and relations to standard ANOVA are discussed, and it was shown that in some sense the SO-PLS regression can be considered a direct generalisation of Type I ANCOVA to situations with high collinearity in some of the blocks.

The spectroscopic example has first of all served as an illustration of how to use the interpretation tools discussed above. From a spectroscopic point of view we have obtained, using the SO-PLS method, information about the additional information in the Raman spectra needed to predict the response values better. The PO-PLS has clearly demonstrated that the common information in NIR and Raman spectroscopy is linked to both variables in the Y-block. Another important result is that both spectroscopies used together give substantially better results for both responses than each of the methods used separately. All the results obtained are very natural from a spectroscopic point of view and correspond to the indicated hypotheses.

Acknowledgments

We would like to thank the project "Data integration", financed by the Research Council of Norway (NFR), for financial support. We would also like to thank Rasmus Bro and Age Smilde for discussions that led to the projection approach for interpretation. We would also like to thank the referees for many important comments that led to an improvement of the paper.

References

[1] N.K. Afseth, V.H. Segtnan, B.J. Marquardt, J.P. Wold, Raman and near-infrared spectroscopy for quantification of fat composition in a complex food model system, Applied Spectroscopy 59 (11) (2005) 1324–1332.
[2] R.J. Barnes, M.S. Dhanoa, S.J. Lister, Standard normal variate transformation and detrending of near infrared diffuse reflectance, Applied Spectroscopy 43 (1989) 772–777.
[3] A. Berglund, S. Wold, INLR, implicit non-linear latent variable regression, Journal of Chemometrics 11 (1996) 141–156.
[4] S. Bougeard, E.M. Qannari, N. Rose, Multi-block redundancy analysis: interpretation tools and application in epidemiology, Journal of Chemometrics 25 (2011) 467–475.
[5] J.D. Carroll, Generalisation of canonical analysis to three or more sets of variables, Proceedings of the 76th Convention of the American Psychological Association, vol. 3, 1968, pp. 227–228.
[6] H.R. Cederkvist, A.H. Aastveit, T. Næs, A comparison of methods for testing differences in predictive ability, Journal of Chemometrics 19 (2005) 500–509.
[7] R.S. Das, Y.K. Agrawal, Raman spectroscopy: recent advancements, techniques and applications, Vibrational Spectroscopy 40 (2) (2011) 163–176.
[8] T. Dahl, T. Næs, A bridge between Tucker-1 and generalised canonical correlation, Computational Statistics and Data Analysis 50 (11) (2006) 3086–3098.
[9] A.E. Esteve, H. Charles, A tutorial on near infrared spectroscopy and its calibration, Critical Reviews in Analytical Chemistry 40 (4) (2010) 246–260.
[10] P. Geladi, D. McDougall, H. Martens, Linearisation and scatter correction for near infrared reflectance spectra of meat, Applied Spectroscopy 39 (1985) 491–500.
[11] P. de B. Harrington, N.E. Vieira, J. Espinoza, J.K. Nien, R. Romero, A.L. Yergey, Analysis of variance — principal component analysis: a soft tool for proteomic discovery, Analytica Chimica Acta 544 (2005) 118–127.
[12] S. Hassani, H. Martens, M. Qannari, M. Hanafi, G.I. Borge, A. Kohler, Analysis of -omics data: graphical interpretation and validation tools in multi-block methods, Chemometrics and Intelligent Laboratory Systems 104 (2010) 140–153.
[13] R.V. Haware, P.R. Wright, K.R. Morris, M.L. Hamad, Data fusion of Fourier transform infrared spectra and powder X-ray diffraction patterns for pharmaceutical mixtures, Journal of Pharmaceutical and Biomedical Analysis 56 (2011) 944–949.
[14] U.G. Indahl, T. Næs, Evaluation of alternative spectral feature extraction methods of textural images for multivariate modelling, Journal of Chemometrics 12 (4) (1998) 261–278.
[15] J.J. Jansen, H.C.J. Hoefsloot, J. van der Greef, M.E. Timmerman, J.A. Westerhuis, A.K. Smilde, ASCA: analysis of multivariate data obtained from an experimental design, Journal of Chemometrics 19 (9) (2005) 469–481.
[16] J.J. Jansen, R. Bro, H.C.J. Hoefsloot, F.W.J. van den Berg, J.A. Westerhuis, A.K. Smilde, PARAFASCA: ASCA combined with PARAFAC for the analysis of metabolic fingerprinting data, Journal of Chemometrics 22 (2008) 114–121.
[17] K. Jørgensen, V. Segtnan, K. Thyholt, T. Næs, A comparison of methods for analysing regression models with both spectral and designed variables, Journal of Chemometrics 18 (10) (2004) 451–464.
[18] K. Jørgensen, B.-H. Mevik, T. Næs, Combining designed experiments with several blocks of spectroscopic data, Chemometrics and Intelligent Laboratory Systems 88 (2) (2007) 143–212.
[19] K. Jørgensen, T. Næs, The use of LS-PLS for improved understanding, monitoring and prediction of cheese, Chemometrics and Intelligent Laboratory Systems 93 (2008) 11–19.
[20] Ø. Langsrud, T. Næs, Optimised score plot by principal components of prediction, Chemometrics and Intelligent Laboratory Systems 68 (2003) 61–74.
[21] B. Li, E. Martin, J. Morris, Process performance monitoring in the presence of confounding variation, Chemometrics and Intelligent Laboratory Systems 94 (2008) 104–111.
[22] H. Martens, T. Næs, Multivariate Calibration, Wiley, Chichester, UK, 1989.
[23] I. Måge, B.-H. Mevik, T. Næs, Regression models with process variables and parallel blocks of raw material measurements, Journal of Chemometrics 22 (2008) 443–456.
[24] I. Måge, E. Menichelli, T. Næs, Preference mapping by PO-PLS: separating common and unique information in several data blocks, Food Quality and Preference 24 (2012) 8–16.
[25] T. Næs, T. Isaksson, T. Fearn, T. Davies, A User-friendly Guide to Multivariate Calibration and Classification, NIR Publications, Chichester, UK, 2002.

[26] T. Næs, P. Brockhoff, O. Tomic, Statistics for Sensory and Consumer Science, Wiley, Chichester, UK, 2010.
[27] T. Næs, I. Måge, V.H. Segtnan, Incorporating interactions in multi-block SO-PLS regression, Journal of Chemometrics 25 (2011) 601–609.
[28] T. Næs, O. Tomic, B.-H. Mevik, H. Martens, Path modelling by sequential PLS regression, Journal of Chemometrics 25 (2011) 28–40.
[29] P. Odman, C.L. Johansen, L. Olsson, K.V. Gernaey, A.E. Lantz, Sensor combination and chemometric variable selection for online monitoring of Streptomyces coelicolor fed-batch cultivations, Applied Microbiology and Biotechnology 86 (2010) 1745–1759.
[30] A.K. Smilde, J.J. Jansen, H.C.J. Hoefsloot, R.-J.A.N. Lamers, J. van der Greef, M.E. Timmerman, ANOVA-simultaneous component analysis (ASCA): a new tool for analyzing designed metabolomics data, Bioinformatics 21 (2005) 3043–3048.
[31] A.K. Smilde, M.J. van der Werf, S. Bijlsma, B.J.C. van der Werff-van der Vat, R.H. Jellema, Fusion of mass spectrometry-based metabolomics data, Analytical Chemistry 77 (20) (2005) 6729–6736.
[32] A. Tenenhaus, M. Tenenhaus, Regularized generalised canonical correlation analysis, Psychometrika 76 (2) (2011) 257–284.
[33] J.A. Westerhuis, T. Kourti, J.F. MacGregor, Analysis of multiblock and hierarchical PCA and PLS models, Journal of Chemometrics 12 (1998) 301–321.
