
Journal of Advanced Research (2012) 3, 53–63

Cairo University

Journal of Advanced Research

ORIGINAL ARTICLE

Application of artificial neural networks for response surface modeling in HPLC method development
Mohamed A. Korany *, Hoda Mahgoub, Ossama T. Fahmy, Hadir M. Maher

Department of Pharmaceutical Analytical Chemistry, Faculty of Pharmacy, University of Alexandria, Alexandria 21521, Egypt

Received 31 October 2010; revised 23 March 2011; accepted 2 April 2011


Available online 12 May 2011

KEYWORDS
Optimization;
HPLC;
Artificial neural network;
Multiple regression analysis;
Method development

Abstract
This paper discusses the usefulness of artificial neural networks (ANNs) for response surface modeling in HPLC method development. In this study, the combined effect of pH and mobile phase composition on the reversed-phase liquid chromatographic behavior of a mixture of salbutamol (SAL) and guaiphenesin (GUA), combination I, and a mixture of ascorbic acid (ASC), paracetamol (PAR) and guaiphenesin (GUA), combination II, was investigated. The results were compared with those produced using multiple regression (REG) analysis. To examine the respective predictive power of the regression model and the neural network model, experimental and predicted response factor values, mean of squares error (MSE), average error percentage (Er%), and coefficients of correlation (r) were compared. It was clear that the best networks were able to predict the experimental responses more accurately than the multiple regression analysis.

© 2011 Cairo University. Production and hosting by Elsevier B.V. All rights reserved.

Introduction

The use of artificial intelligence and artificial neural networks (ANNs) is a very rapidly developing field in many areas of science and technology [1].

The most important aspect of method development in liquid chromatography is the achievement of sufficient resolution in a reasonable analysis time. This goal can be achieved by adjusting accessible chromatographic factors to give the desired response. A mathematical description of such a goal is called an optimization.

The methods usually focus on the optimization of the mobile phase composition, i.e. on the ratio of water and organic solvents (modifiers). Optimization of pH may lead to better selectivity. The degree of ionization of solutes, stationary phase and mobile phase additives may be affected by the pH. It is clear, however, that if the full power of eluent composition is to be realized, efficient strategies for multifactor chromatographic optimization must be developed [2].

Retention mapping methods are useful optimization tools because the global optimum can be found. Retention mapping is designed to completely describe or 'map' the chromatographic behavior of solutes in the design space by a response surface, which shows the relationship between a response, such as the capacity factor of a solute or the separation factor between two solutes, and several input variables, such as the components of the mobile phase.

* Corresponding author. Tel.: +20 3 4871317; fax: +20 3 4873273.
E-mail address: makorany@yahoo.com (M.A. Korany).
2090-1232 © 2011 Cairo University. Production and hosting by Elsevier B.V. All rights reserved.
Peer review under responsibility of Cairo University.
doi:10.1016/j.jare.2011.04.001
The response factor of every solute in the sample can be predicted, rather than performing many separations and simply choosing the best one obtained [2].

Neural network methodology has found rapidly increasing application in many areas of prediction both within and outside science [3–7]. The main purpose of this study was to present the usefulness of ANNs for response surface modeling in HPLC optimization [8–10].

In this study, the combined effect of pH and mobile phase composition on the reversed-phase liquid chromatographic behavior of a mixture of salbutamol (SAL) and guaiphenesin (GUA), combination I, and a mixture of ascorbic acid (ASC), paracetamol (PAR) and guaiphenesin (GUA), combination II, was investigated. The effects of these factors were examined where they provided acceptable retention and resolution. The data predicted using ANN were compared to those calculated on the basis of multiple regression (REG) [11].

Theory

Neural computing

The output (Oj) of an individual neuron is calculated by summing the input values (Oi) multiplied by their corresponding weights (Wij) (Eq. (1)) and converting the sum (Xj) to an output (Oj) by a transform function. The most common transform function is a sigmoidal one [2,12]:

  Xj = Σi (Oi · Wij)   (1)

  Oj = [2 / (1 + e^(−Xj))] − 1   (2)

where O is the output of a neuron, i denotes the index of the neurons that feed neuron j, and Wij is the weight of the connection.

In an ANN, the neurons are usually organized in layers. There is always one input and one output layer. Furthermore, the network usually contains at least one hidden layer. The use of hidden layers confers on ANNs the ability to describe nonlinear systems [12,13].

An ANN attempts to learn the relationships between the input and output data sets in the following way: during the training phase, input/output data pairs, called training data, are introduced into the neural network. The difference between the actual output values of the network and the training output values is then calculated. This difference is an error value, which is decreased during the training by modifying the weight values of the connections. Training is continued iteratively until the error value has reached the predetermined training goal.

There are several algorithms available for training ANNs [14]. One quite commonly used algorithm is back-propagation, which is a supervised learning algorithm (both input and output data pairs are used in the training). The neural network used in this work is of the feed-forward, back-propagation type. Each neuron in the input layer is connected to each neuron in the hidden layer, and each neuron in the hidden layer is connected to each neuron in the output layer, which produces the output vector. Information from various sets of inputs is fed forward through the ANN to optimize the weights between neurons, or to 'train' them. The error in prediction is then back-propagated through the system and the weights of the inter-unit connections are changed to minimize the error in the prediction. This process is continued with multiple training sets until the error value is minimized across many sets.

The error of the network, expressed as the mean squared error (MSE) of the network, is defined as the squared difference between the target values (T) and the outputs (O) of the output neurons:

  MSE = [Σ(k=1..p) Σ(l=1..m) (Okl − Tkl)²] / (p·m)   (3)

where p is the number of training sets and m is the number of output neurons of the network. During training, neural techniques need to have some way of evaluating their own performance. Since they are learning to associate the inputs with outputs, evaluating the performance of the network from the training data may not produce the best results. If a network is left to train for too long, it will over-train and will lose the ability to generalize. Thus test data, rather than training data, are used to measure the performance of a trained model. Thus, three types of data set are used: training data (to train the network), test data (to monitor the neural network performance during training) and validation data (to measure the performance of a trained application), each with a corresponding error.

Table 1  Training and testing data used for the prediction of the capacity factor (K') of salbutamol (SAL) and of guaiphenesin (GUA).(a)

Methanol (%)   pH    K' (SAL)   K' (GUA)
30             3.1   0.667       3.611
35             3.1   0.611       2.444
40             3.1   0.444       1.556
25             3.1   1.000       5.500
20             3.1   1.611       9.389
18             3.1   1.889      12.556
40             3.5   0.778       1.556
40             4.1   1.111       1.556
40             5.0   1.222       1.556
40             6.0   1.333       1.556
18             3.8   6.722      12.389
20             5.8   6.778       9.278
24             3.6   2.778       6.611
30             5.2   2.222       3.500
27             4.5   2.778       4.667
25             4.6   3.611       5.556
34             4.6   1.278       2.667
34             3.7   1.222       2.722
36             4.1   1.167       2.000
38             5.7   1.444       1.889
38             3.7   0.833       1.833
42             5.3   1.111       1.333
24.0(b)        4.0   3.882       6.492
35.0(b)        3.5   0.722       2.468
30.0(b)        3.3   0.758       3.560
30.0(b)        5.5   2.504       3.500
20.0(b)        3.5   1.661       9.380

(a) Factor levels used in HPLC separation and the obtained capacity factors.
(b) Testing data.
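The forward pass of Eqs. (1) and (2) and the error measure of Eq. (3) can be sketched in a few lines of Python. This is an illustrative translation only, not the Matlab code used in this work, and the toy input and weight values are hypothetical:

```python
import numpy as np

def transform(x):
    # Bipolar sigmoid of Eq. (2): maps any real input into (-1, 1)
    return 2.0 / (1.0 + np.exp(-x)) - 1.0

def neuron_output(inputs, weights):
    # Eq. (1): weighted sum X_j of the inputs feeding neuron j,
    # then Eq. (2): squash the sum with the transform function
    return transform(np.dot(inputs, weights))

def mse(outputs, targets):
    # Eq. (3): squared errors summed over p training sets and
    # m output neurons, divided by p*m
    outputs, targets = np.asarray(outputs), np.asarray(targets)
    p, m = targets.shape
    return np.sum((outputs - targets) ** 2) / (p * m)

# Toy neuron with two inputs (e.g. scaled pH and methanol%)
o_i = np.array([0.5, -0.2])   # hypothetical input values
w_ij = np.array([0.8, 0.4])   # hypothetical connection weights
print(round(neuron_output(o_i, w_ij), 4))
```

Because Eq. (2) maps any weighted sum into (−1, 1), inputs and targets are normally scaled into that range before training.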
Multiple regression analysis

A response surface, based on multiple regression analysis, was used to illustrate the relation between different experimental variables [14]. A response surface can simultaneously represent two independent variables and one dependent variable when the mathematical relationship between the variables is known, or can be assumed.

In this study, the independent variables were the pH and the methanol percentage in the mobile phases for both combinations I and II, while the dependent variable was the capacity factor or the separation factor for combinations I and II, respectively. Experimental data were fitted to a polynomial mathematical model of the general form:

  Y = b0 + b1·p + b2·m + b3·p·m + b4·p² + b5·m²   (4)

where b0–b5 are estimates of the model parameters, p and m stand for the independent variables and Y is the dependent variable. Using this model, the dependent variable can be predicted at any value of the independent variables.

Experimental

Instrumentation

The chromatographic system consisted of an S 1121 solvent delivery system (Sykam GmbH, Germany), an S 3210 variable-wavelength UV–VIS detector (Sykam GmbH, Germany) and an S 5111 Rheodyne manual injector valve bracket fitted with a 20 µl sample loop. HPLC separations were performed on a ThermoHypersil stainless-steel C-18 analytical column (250 × 4.6 mm) packed with 5 µm diameter particles. Data were processed using the EZChrom Chromatography Data System, version 6.8 (Scientific Software Inc., CA, USA) on an IBM-compatible PC connected to a printer. The elution was performed at a flow rate of 1.5 or 1 ml min⁻¹ for combinations I and II, respectively. The absorbance was monitored at 275 or 225 nm for combinations I and II, respectively. Mixtures of methanol and 0.01 M sodium dihydrogen phosphate aqueous solution, adjusted to the required pH by the addition of orthophosphoric acid or sodium hydroxide, were used as the mobile phases for both combinations.

Materials and reagents

Standards of SAL, GUA, ASC and PAR were kindly supplied by Pharco Pharmaceuticals Co. (Alex, Egypt). All the solvents used for the preparation of the mobile phase were HPLC grade, and the mixtures were filtered through a 0.45 µm membrane filter and degassed before use.

Bronchovent syrup was obtained from Pharco Pharmaceuticals Co. (Alex, Egypt), labelled to contain 2 mg SAL and 50 mg GUA per 5 ml of syrup. G.C. Mol effervescent sachets were obtained from Pharco Pharmaceuticals Co. (Alex, Egypt), labelled to contain 250 mg ASC, 100 mg GUA and 325 mg PAR per sachet.

Table 2  Training and testing data used for the prediction of the separation factors (a) between ascorbic acid (ASC) and paracetamol (PAR), a1, and between paracetamol (PAR) and guaiphenesin (GUA), a2.(a)

Methanol (%)   pH    a1 (ASC/PAR)   a2 (PAR/GUA)
60             6.1    3.667          1.545
50             6.1    4.000          2.583
40             6.1    4.667          3.643
30             6.1    6.667          5.450
20             6.1   11.667          7.857
50             3.3    1.300          2.385
50             4.1    1.444          2.385
50             5.1    1.857          2.385
50             6.8   16.250          1.455
70             6.5   11.000          1.400
80             6.5   10.000          1.170
90             6.5    7.000          1.175
88             3.3    0.800          1.175
88             4.1    0.889          1.175
88             6.1    2.667          1.175
88             4.7    1.000          1.176
88             5.4    1.067          1.175
88             5.8    1.404          1.179
40             3.3    1.400          3.643
40             4.1    1.556          3.645
40             5.1    2.000          3.655
40             6.8   17.500          3.643
60.0(b)        4.5    1.375          1.545
35.0(b)        6.1    5.333          4.750
30.0(b)        5.5    3.333          5.452
90.0(b)        6.1    2.333          1.143
20.0(b)        3.3    3.500          7.857

(a) Factor levels used in HPLC separation and the obtained separation factors.
(b) Testing data.

Table 3  Multiple regression results for the prediction of K' of salbutamol (SAL) and guaiphenesin (GUA).

Dependent variable: K' (SAL)        No. of experiments: 22
  r: 0.829    r²: 0.687    Adjusted r²: 0.654
  F = 20.856    dF = 2, 19    p = 0.000016
  Standard error of estimate (SE): 1.025

Dependent variable: K' (GUA)        No. of experiments: 22
  r: 0.942    r²: 0.887    Adjusted r²: 0.875
  F = 74.446    dF = 2, 19    p = 0.000001
  Standard error of estimate (SE): 1.260

Table 4  Multiple regression results for the prediction of the separation factors between ascorbic acid (ASC) and paracetamol (PAR), a1, and between paracetamol (PAR) and guaiphenesin (GUA), a2.

Dependent variable: a1        No. of experiments: 22
  r: 0.771    r²: 0.594    Adjusted r²: 0.552
  F = 13.917    dF = 2, 19    p = 0.00019
  Standard error of estimate (SE): 1.939

Dependent variable: a2        No. of experiments: 22
  r: 0.875    r²: 0.765    Adjusted r²: 0.741
  F = 30.987    dF = 2, 19    p = 0.000001
  Standard error of estimate (SE): 0.857
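As an illustration, the quadratic model of Eq. (4) can be fitted by ordinary least squares. The rows below are a subset of Table 1 (K' of SAL), and the column order of the design matrix mirrors Eq. (4); this is a sketch of the model-fitting step, not the STATISTICA computation itself:

```python
import numpy as np

# A few (methanol%, pH, K'(SAL)) rows taken from Table 1
data = np.array([
    [30, 3.1, 0.667],
    [35, 3.1, 0.611],
    [40, 3.1, 0.444],
    [25, 3.1, 1.000],
    [20, 3.1, 1.611],
    [40, 5.0, 1.222],
    [18, 3.8, 6.722],
    [30, 5.2, 2.222],
])
p, m, y = data[:, 0], data[:, 1], data[:, 2]

# Design matrix for Eq. (4): Y = b0 + b1*p + b2*m + b3*p*m + b4*p^2 + b5*m^2
X = np.column_stack([np.ones_like(p), p, m, p * m, p**2, m**2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

def predict(p_new, m_new):
    # Evaluate the fitted polynomial at any factor setting in the design space
    return (b[0] + b[1]*p_new + b[2]*m_new + b[3]*p_new*m_new
            + b[4]*p_new**2 + b[5]*m_new**2)

print(round(predict(30, 3.1), 3))
```

Fitting all 22 training points of Table 1 in this way is what produces the regression statistics summarized in Table 3.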
Solutions

Preparation of stock and standard solutions

About 10 mg of SAL and 250 mg of GUA (for combination I) or 25 mg of ASC, 10 mg of GUA and 32.5 mg of PAR (for combination II) reference materials were accurately weighed, dissolved in methanol and diluted to 25 ml with the same solvent to form stock solutions. Working standard solutions were prepared by dilution of a 0.2 or 0.4 ml volume of stock solution for combinations I and II, respectively, to 10 ml with the mobile phase used for each chromatographic run.

Sample preparation

For combination I, 0.2 ml of the syrup was accurately transferred to a 10 ml volumetric flask and diluted to volume with the mobile phase used for each chromatographic run. For combination II, the content of one effervescent sachet was accurately transferred into a beaker containing 100 ml of water and left for 5 min until no effervescence was detected; the clear solution was then quantitatively transferred to a 250 ml volumetric flask and completed to volume with methanol. A 0.4 ml portion of this stock solution was further diluted to 10 ml using the mobile phase used for each chromatographic run.

Data analysis

ANN simulator software

MS-Windows based Matlab software, version 6, release 12, 2000 (The MathWorks Inc.) was used. Calculations were performed on an IBM-compatible PC.

Training data

A neural network with a back-propagation training algorithm was used to model the data.

Fig. 1  Effect of the number of hidden neurons and number of cycles during training on the MSE, in the prediction of the capacity factor (K') for combination I. (a) 3D surface plot and (b) 3D contour plot.
For combination I, the behaviour of the capacity factor (K') of SAL and of GUA in response to changes in pH (3.1–6.0) and mobile phase composition (18–42% methanol) was emulated using a network with two inputs (pH and methanol%), one hidden layer and two outputs (K' of SAL and K' of GUA). For combination II, the behaviour of the separation factors (a) between ASC and PAR and between PAR and GUA in response to changes in pH (3.3–6.8) and mobile phase composition (20–90% methanol) was emulated using a network with two inputs (pH and methanol%), one hidden layer and two outputs (a between ASC and PAR, and a between PAR and GUA). Training data are listed in Tables 1 and 2 for combinations I and II, respectively.

Neural networks were trained using different numbers of neurons (2–20) in the hidden layer and training cycles (150–500) for both combinations I and II. At the start of a training run, the weights were initialized with random values. During training, modifications of the weights were made by back-propagation of the error until the error value for each input/output data pair in the training data reached the predetermined error level. While the network was being optimized, the testing data (Tables 1 and 2 for combinations I and II, respectively) were fed into the network to evaluate the trained net.

Multiple regression analysis

Multiple regression analysis (quadratic) was carried out using STATISTICA software, release 5.0, 1995 (StatSoft Inc., USA). Chromatographic experiments were performed in the pH range of 3.1–6.0 or 3.3–6.8 and at methanol percentages of 18–42% or 20–90% for combinations I and II, respectively. From these experimental data (Tables 1 and 2), model fitting gave the equations for the relationship between the responses (K' or a for combinations I and II, respectively) and the pH and mobile phase composition.

Fig. 2  Effect of the number of hidden neurons and number of cycles during training on the MSE, in the prediction of the separation factor (a), combination II. (a) 3D surface plot and (b) 3D contour plot.
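The training scheme just described (feed-forward network, bipolar sigmoid transform, error back-propagation, test-set monitoring) can be sketched as follows. This is a minimal gradient-descent illustration, not the Matlab routine actually used; the learning rate, weight scales and the assumption that inputs and targets are scaled into (−1, 1) are all choices made for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    # Bipolar sigmoid transform of Eq. (2)
    return 2.0 / (1.0 + np.exp(-x)) - 1.0

def sigmoid_deriv(o):
    # Derivative of Eq. (2), written in terms of its output o
    return 0.5 * (1.0 - o * o)

def train(X, T, X_test, T_test, n_hidden=12, cycles=350, lr=0.05):
    """Feed-forward net (inputs -> n_hidden -> outputs) trained by
    back-propagation; returns the weights and the test-set MSE."""
    W1 = rng.normal(scale=0.5, size=(X.shape[1], n_hidden))
    W2 = rng.normal(scale=0.5, size=(n_hidden, T.shape[1]))
    for _ in range(cycles):
        H = sigmoid(X @ W1)                       # hidden-layer outputs
        O = sigmoid(H @ W2)                       # output-layer outputs
        d_out = (O - T) * sigmoid_deriv(O)        # back-propagated error
        d_hid = (d_out @ W2.T) * sigmoid_deriv(H)
        W2 -= lr * H.T @ d_out                    # weight updates
        W1 -= lr * X.T @ d_hid
    O_test = sigmoid(sigmoid(X_test @ W1) @ W2)
    return (W1, W2), np.mean((O_test - T_test) ** 2)

# Hypothetical scaled (pH, methanol%) inputs and two scaled responses
X = np.array([[0.2, -0.1], [0.5, 0.3], [-0.4, 0.2], [0.1, 0.6]])
T = np.array([[0.3, -0.2], [0.1, 0.4], [-0.3, 0.1], [0.2, 0.0]])
weights, test_mse = train(X, T, X, T, n_hidden=4, cycles=200)
print(test_mse)
```

Over-training shows up when the test MSE starts rising while the training error still falls, which is the criterion used in this study to stop at 350 or 250 cycles.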
For combination I,

  K'(SAL) = −3.538 − 0.552p + 6.688m + 0.012p² − 0.079pm − 0.377m²   (5)

  K'(GUA) = 36.938 − 1.83p + 0.178m + 0.023p² + 0.01pm − 0.068m²   (6)

For combination II,

  a1 (ASC and PAR) = 41.944 + 0.028p − 19.469m + 0.001p² − 0.029pm + 2.411m²   (7)

  a2 (PAR and GUA) = 13.193 − 0.317p − 0.094m + 0.002p² + 0pm + 0.014m²   (8)

where p = methanol% and m = pH.

Results of the multiple regression analysis for both combinations are summarized in Tables 3 and 4.

Results and discussion

Network topologies

The properties of the training data determine the number of input and output neurons. In this study, the number of factors (pH and methanol%) forced the number of input neurons to be two in both combinations. The number of responses, K' of SAL and of GUA for combination I or a (ASC and PAR) and a (PAR and GUA) for combination II, likewise forced the number of output neurons to be two.

Fig. 3  Response surfaces for the multifactor effect of pH and methanol% on the capacity factor (K') of (a) salbutamol (SAL) and (b) guaiphenesin (GUA), generated by ANN with 12 hidden neurons and 350 training cycles.

Fig. 4  Response surfaces for the multifactor effect of pH and methanol% on the separation factor (a) between ascorbic acid and paracetamol (a1) and (b) between paracetamol and guaiphenesin (a2), generated by ANN with 14 hidden neurons and 250 training cycles.
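The fitted REG models (Eqs. (5)–(8)) are straightforward to transcribe as functions, which is how response surfaces such as those of Figs. 5 and 6 are generated. As a check, the values returned below agree, to rounding, with the 'Predicted by REG' columns of Tables 5 and 6:

```python
# p = methanol%, m = pH, as in Eqs. (5)-(8)

def k_sal(p, m):   # Eq. (5)
    return -3.538 - 0.552*p + 6.688*m + 0.012*p**2 - 0.079*p*m - 0.377*m**2

def k_gua(p, m):   # Eq. (6)
    return 36.938 - 1.83*p + 0.178*m + 0.023*p**2 + 0.01*p*m - 0.068*m**2

def a1(p, m):      # Eq. (7)
    return 41.944 + 0.028*p - 19.469*m + 0.001*p**2 - 0.029*p*m + 2.411*m**2

def a2(p, m):      # Eq. (8); the pm term has a zero coefficient
    return 13.193 - 0.317*p - 0.094*m + 0.002*p**2 + 0.014*m**2

print(round(k_sal(24, 4.2), 3))  # 3.602, the REG value in Table 5
print(round(a1(44, 6.1), 3))     # 8.281, the REG value in Table 6
```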
Fig. 5  Response surfaces for the multifactor effect of pH and methanol% on the capacity factor (K') of (a) salbutamol (SAL) and (b) guaiphenesin (GUA), generated by the REG model.

Fig. 6  Response surfaces for the multifactor effect of pH and methanol% on the separation factor (a) between ascorbic acid and paracetamol (a1) and (b) between paracetamol and guaiphenesin (a2), generated by the REG model.

The number of connections in the network depends upon the number of neurons in the hidden layer. In the training phase, the information from the training data is transformed into the weight values of the connections. Therefore, the number of connections might have a significant effect on the network performance. Since there are no theoretical principles for choosing the proper network topology, several structures were tested.

One problem in constructing the ANN was to find the optimal number of hidden neurons. Another problem was over-fitting or over-training, evident as an increase in the test error. Neural networks were trained using different numbers of hidden neurons (2–20) and training cycles (150–500) for each combination. Neurons were added to the hidden layer two at a time, and the networks were trained and tested after each addition. Since test-set error is usually a better measure of performance than training error, once the network had been optimized, test data were fed through the network to evaluate the trained network. After the addition of the 12th or the 14th hidden neuron for combinations I and II, respectively, it became evident that more hidden neurons did not improve the generalization ability of the network (Figs. 1 and 2).

Training of the networks

To compare the predictive power of the neural network structures, the MSE was calculated for each model (with given numbers of hidden neurons and training cycles). The performance of the network on the testing data gives a reasonable estimate of the network's prediction ability. The lowest testing MSE was obtained with 12 or 14 hidden neurons and 350 or 250 training cycles for combinations I and II, respectively (Figs. 1 and 2). After 350 or 250 cycles, extra training made the prediction ability worse and the test error began to increase. This effect is called over-training or over-fitting.
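The topology search described above, training at each (hidden neurons, cycles) grid point and keeping the setting with the lowest test-set MSE, reduces to a small loop. The scoring function here is a deliberately artificial stand-in whose minimum is planted at 12 neurons and 350 cycles, the optimum reported for combination I; in practice it would train and evaluate a real network:

```python
import itertools

def select_topology(train_and_test_mse, hidden_range, cycle_range):
    """Return (test MSE, n_hidden, cycles) for the grid point with the
    lowest test-set MSE, mirroring the 2-20 neuron / 150-500 cycle grid."""
    best = None
    for n_hidden, cycles in itertools.product(hidden_range, cycle_range):
        score = train_and_test_mse(n_hidden, cycles)
        if best is None or score < best[0]:
            best = (score, n_hidden, cycles)
    return best

# Stand-in scorer for illustration only: its minimum is placed at
# 12 hidden neurons and 350 cycles, the optimum found for combination I.
def dummy_mse(n_hidden, cycles):
    return (n_hidden - 12) ** 2 + ((cycles - 350) / 50.0) ** 2

best = select_topology(dummy_mse, range(2, 21, 2), range(150, 501, 50))
print(best)  # -> (0.0, 12, 350)
```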
The combined effect of pH and methanol% on the capacity factors or separation factors for combinations I and II, respectively, generated by the best ANN model, is presented in Figs. 3 and 4.

Multiple regression analysis

Eqs. (5) and (6) were used to predict K' of SAL and GUA, respectively, at any selected value of pH and methanol%. Eqs. (7) and (8) could likewise be used to predict a (ASC and PAR) and a (PAR and GUA), respectively, at any selected value of pH and methanol%. Predicted response surfaces drawn from the fitted equations are shown in Figs. 5 and 6 for combinations I and II, respectively.

Method validation

In studying the generalization ability of the neural networks, five additional experiments were performed (see Tables 5 and 6 for combinations I and II, respectively). At these experimental points, the factor levels of the input variables were chosen so that they were within the range of the original training data (interpolation). The generalization ability was studied by consulting the network with test data and observing the output values. The output values are hence predicted by the network. This operation is called interrogating or querying the model.

The average error percentage (Er%) is used to examine the generalization ability, or for method validation, of neural networks (the best network has the smallest Er%). Er% is calculated according to Eq. (9):

  Er% = Σ(i=1..n) |1 − (Oi/Ti)| × 100/n   (9)

where n is the number of experimental points, Ti is the measured (target) capacity factor or separation factor for combinations I and II, respectively, and Oi denotes the value predicted by the model for a drug.

Comparison of the best network and the regression model

To compare the predictive power of the regression model with that of the neural network model, we compared experimental and predicted response factor values, the mean of squares error (MSE), the average error percentage (Er%) and the squared coefficients of correlation (r²).

Table 5  Method validation for the prediction of K' of salbutamol (SAL) and guaiphenesin (GUA).

Methanol (%)  pH    Measured          Predicted by ANN(a)   Predicted by REG
                    SAL     GUA       SAL     GUA           SAL     GUA
24.0          4.2   4.100   6.456     3.954   6.680         3.602   6.819
38.0          3.5   0.778   1.833     0.848   2.042         1.097   1.727
35.0          3.3   0.941   2.229     0.830   2.650         0.682   2.062
40.0          5.5   1.350   1.541     1.243   1.495         1.582   1.657
35.0          3.5   1.109   2.568     1.141   2.669         0.954   2.075
r(b)                                  0.989   0.997         0.966   0.992
r²                                    0.978   0.994         0.932   0.983
Er%(c)                                0.070   0.051         0.223   0.115

(a) ANN with 12 hidden neurons and 350 training cycles.
(b) Coefficient of correlation.
(c) Relative percentage error.

Table 6  Method validation for the prediction of the separation factors between ascorbic acid (ASC) and paracetamol (PAR), a1, and between paracetamol (PAR) and guaiphenesin (GUA), a2.

Methanol (%)  pH    Measured          Predicted by ANN(a)   Predicted by REG
                    a1      a2        a1      a2            a1      a2
70.0          4.7   1.375   1.273     1.542   1.218         1.018   0.670
44.0          6.1   4.333   3.000     5.500   3.016         8.281   3.065
25.0          6.1   8.667   6.500     8.822   6.489         9.798   6.466
30.0          5.5   2.667   6.450     3.556   4.437         4.752   5.389
90.0          4.1   0.875   1.171     0.962   1.178         2.569   0.713
r(b)                                  0.893   0.915         0.900   0.810
r²                                    0.596   0.837         0.953   0.910
Er%(c)                                0.168   0.048         0.804   0.181

(a) ANN with 14 hidden neurons and 250 training cycles.
(b) Coefficient of correlation.
(c) Relative percentage error.
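Eq. (9) and the correlation comparison can be reproduced directly; the data below are the measured K' (SAL) values and the corresponding ANN and REG predictions from Table 5:

```python
import numpy as np

def er_percent(targets, outputs):
    # Eq. (9): mean absolute relative deviation, expressed as a percentage
    t = np.asarray(targets, dtype=float)
    o = np.asarray(outputs, dtype=float)
    return np.sum(np.abs(1.0 - o / t)) * 100.0 / len(t)

# Measured K'(SAL) and the ANN / REG predictions from Table 5
measured = np.array([4.100, 0.778, 0.941, 1.350, 1.109])
ann      = np.array([3.954, 0.848, 0.830, 1.243, 1.141])
reg      = np.array([3.602, 1.097, 0.682, 1.582, 0.954])

r_ann = np.corrcoef(measured, ann)[0, 1]   # coefficient of correlation
r_reg = np.corrcoef(measured, reg)[0, 1]
print(er_percent(measured, ann), er_percent(measured, reg))
print(r_ann, r_reg)
```

By either measure the ANN predictions track the measured values more closely than the REG model, in line with the r, r² and Er% rows of Table 5.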
Fig. 7  Capacity factors of (a) salbutamol (K' SAL) and (b) guaiphenesin (K' GUA): experimental values, artificial neural network estimates (ANN) and regression model estimates (REG).

In Fig. 7, the experimental K' values of SAL and of GUA are compared with those predicted by ANN and those calculated by the regression models (Eqs. (5) and (6)). The ANN values were closer to the experimental values than the REG values. Fig. 8 similarly compares the experimental a1 (ASC and PAR) and a2 (PAR and GUA) with those predicted by ANN and those calculated by the regression models (Eqs. (7) and (8)). Again, the ANN values were closer to the experimental values than the REG values.

The closeness of the data predicted by ANN compared with REG is also illustrated by the validation graphs shown in Figs. 7a', b' and 8a', b', where the former show little scatter around the experimental values compared with the REG model.

In this sense, ANNs offer a superior alternative to classical statistical methods. Classical "response surface modeling" (RSM) requires the specification of polynomial functions, such as linear, first-order interaction, or second-order (quadratic), to undergo the regression. The number of terms in the polynomial is limited by the number of experimental design points. Moreover, selection of the appropriate polynomial equation can be extremely laborious because each response variable requires its own polynomial equation. The ANN methodology provides a real alternative to the polynomial regression method as a means to identify nonlinear relationships. Using ANNs, more complex relationships, especially nonlinear ones, may be investigated without complicated equations.

ANN analysis is quite flexible concerning the amount and form of the training data, which makes it possible to use more informal experimental designs than with statistical approaches. It is also presumed that neural network models might generalize better than regression models generated with the multiple regression technique, since regression analyses are dependent on pre-determined statistical significance levels. This means that less significant terms are not included in the models. The application of ANN is a totally different method, in which all possible data are used, making the models more accurate.

A possible explanation may be that in the regression model, each solute has its own model. The neural network, however, constructs one model for all solutes at all design points used for training. In this way the information is obtained more completely, as the peak sequence in the different chromatograms can contribute to the model.

Conclusion

Neural networks proved to be a very powerful tool in HPLC method development.
The combined effect of pH and mobile phase composition on the reversed-phase liquid chromatographic behavior of a mixture of salbutamol (SAL) and guaiphenesin (GUA), combination I, and a mixture of ascorbic acid (ASC), paracetamol (PAR) and guaiphenesin (GUA), combination II, was investigated. The results showed that it is possible to predict response factors more accurately using neural networks than using regression models. An ANN method was successfully applied to chromatographic separations for modeling and process optimization. Moreover, neural network models might have better predictive powers than regression models: regression analyses are dependent on pre-determined statistical significance levels, and less significant terms are usually not included in the model. With ANN methods, potentially all data are used, making the models more accurate.

Fig. 8  Separation factors between (a) ascorbic acid and paracetamol (a1) and (b) paracetamol and guaiphenesin (a2): experimental values, artificial neural network estimates (ANN) and regression model estimates (REG).

References

[1] Murtoniemi E, Yliruusi J, Kinnunen P, Merkku P, Leiviskä K. The advantages by the use of neural networks in modelling the fluidized bed granulation process. Int J Pharm 1994;108(2):155–64.
[2] Agatonovic-Kustrin S, Zecevic M, Zivanovic LJ, Tucker IG. Application of artificial neural networks in HPLC method development. J Pharm Biomed Anal 1998;17(1):69–76.
[3] Boti VI, Sakkas VA, Albanis TA. An experimental design approach employing artificial neural networks for the determination of potential endocrine disruptors in food using matrix solid-phase dispersion. J Chromatogr A 2009;1216(9):1296–304.
[4] Piroonratana T, Wongseree W, Assawamakin A, Paulkhaolarn N, Kanjanakorn C, Sirikong M, et al. Classification of haemoglobin typing chromatograms by neural networks and decision trees for thalassaemia screening. Chemometr Intell Lab Syst 2009;99(2):101–10.
[5] Khanmohammadi M, Garmarudi AB, Ghasemi K, Garrigues S, de la Guardia M. Artificial neural network for quantitative determination of total protein in yogurt by infrared spectrometry. Microchem J 2009;91(1):47–52.
[6] Torrecilla JS, Mena ML, Yáñez-Sedeño P, García J. Field determination of phenolic compounds in olive oil mill wastewater by artificial neural network. Biochem Eng J 2008;38(2):171–9.
[7] Faur C, Cougnaud A, Dreyfus G, Le Cloirec P. Modelling the breakthrough of activated carbon filters by pesticides in surface waters with static and recurrent neural networks. Chem Eng J 2008;145(1):7–15.
[8] Webb R, Doble P, Dawson M. Optimisation of HPLC gradient separations using artificial neural networks (ANNs): application to benzodiazepines in post-mortem samples. J Chromatogr B 2009;877(7):615–20.
[9] Tran ATK, Hyne RV, Pablo F, Day WR, Doble P. Optimisation of the separation of herbicides by linear gradient high performance liquid chromatography utilising artificial neural networks. Talanta 2007;71(3):1268–75.
[10] Novotná K, Havliš J, Havel J. Optimisation of high performance liquid chromatography separation of neuroprotective peptides: fractional experimental designs combined with artificial neural networks. J Chromatogr A 2005;1096(1–2):50–7.
[11] Miller JN, Miller JC. Statistics and chemometrics for analytical chemistry. 4th ed. Prentice Hall; 2000.
[12] Freeman JA, Skapura DM. Neural networks: algorithms, applications, and programming techniques. Houston: Addison-Wesley; 1991.
[13] Dayhoff JE. Neural network architectures: an introduction. New York: Van Nostrand Reinhold; 1990.
[14] Lisboa PGJ. Neural networks: current applications. London: Chapman & Hall; 1992.