
Wat. Res. Vol. 31, No. 6, pp. 1471-1481, 1997
Pergamon. © 1997 Elsevier Science Ltd
All rights reserved. Printed in Great Britain
PII: S0043-1354(96)00395-8    0043-1354/97 $17.00 + 0.00

PREDICTING HENRY'S LAW CONSTANT AND THE EFFECT
OF TEMPERATURE ON HENRY'S LAW CONSTANT

N. NIRMALAKHANDAN¹*, R. A. BRENNAN¹ and R. E. SPEECE²
¹Civil, Agricultural and Geological Engineering Department, New Mexico State University, NM 88003,
U.S.A. and ²Civil and Environmental Engineering Department, Vanderbilt University, TN 37235, U.S.A.

(First received January 1996; accepted in revised form November 1996)

Abstract--Air-water partitioning data for a wide range of organic chemicals are used to validate a
three-variable quantitative structure-activity relationship (QSAR) model for Henry's law constant, H.
The predictive ability of the basic model, developed from a training set of 180 chemicals, is now
demonstrated on 105 new chemicals of similar molecular structure. This basic model is then extended
to cover additional chemicals of diverse molecular features. The predictive ability of the final model
is demonstrated on a new testing set of 70 chemicals featuring multiple structural components and
polyfunctional groups. Spanning over 10 orders of magnitude, the log H values predicted by the QSAR
model for 462 compounds are found to agree with the reported experimental values with r² > 0.95 at
p = 0.0005. A new QSAR model for estimating H as a function of temperature, T, is also proposed. The
predictive ability of this H-T model is demonstrated using experimental data for 18 chemicals over a
temperature range of 10-55°C. © 1997 Elsevier Science Ltd.

Key words--Henry's law, QSAR model, Henry's law constant, air-water partitioning

*Author to whom all correspondence should be addressed
[Fax: +1 505 646 6049; e-mail: nkhandan@nmsu.edu].

INTRODUCTION

When a solute is introduced into an air-water binary system, it distributes itself between the two
phases so as to re-establish thermodynamic equilibrium. This distribution phenomenon has been
modeled using Henry's law. The modern thermodynamic theory of Henry's law, its experimental
determination, and its significance in air-water partitioning have been well documented in the
literature and textbooks. For dilute solutions of nonelectrolytes, a practical and "strict" form of
Henry's law can be formulated by equating the chemical potential of the solute in the liquid phase
to that in the gas phase, leading to the following expression (Pierotti, 1963):

\[ \ln\left(\frac{p_2}{x_2}\right) = \frac{\bar{G}_c}{RT} + \frac{\bar{G}_i}{RT} + \ln\left(\frac{RT}{v_1}\right) \tag{1} \]

Here, p₂ is the partial pressure of the solute in the gas phase, and x₂ is the mole fraction of the
solute in the liquid phase; R is the ideal gas law constant; T is the absolute temperature; and v₁ is
the molar volume of the solvent. The first two terms on the right-hand side of the equation quantify
the partial molar Gibbs free energy for cavity formation and interaction, respectively, and the last
term is a measure of the pure solvent effect. The ratio (p₂/x₂) has been termed the Henry's law
constant, H, which can be seen to be a constant at a given temperature. The above formulation is
generally valid for x₂ < 0.03 and total pressure < 2 atm.

In essence, H can be considered as the ratio of the gas phase content of the solute to the liquid
phase content. This basic concept of partitioning has been defined and interpreted in the science and
engineering literature in several different forms, causing considerable confusion and discussion (see
Letters to the Editor, 1987). For example, in certain atmospheric chemistry and physical chemistry
applications, H is considered to be the inverse of the above ratio. One of the more common forms of
H, as used in the environmental engineering field, is obtained by expressing p₂ and x₂ as
concentrations (mass/volume), yielding a non-dimensional form of Henry's law constant. In this paper,
H is reported in this non-dimensional concentration ratio form.

Partitioning of organic solutes in air-water binary systems has been recognized as an important
phenomenon in many applications. Chemical process engineering, environmental remediation, and fate
and transport of organic chemicals in the ecosphere are some areas where H data are crucial inputs.
With new environmental regulations, the need for reliable H data has become even more critical. Under
the Clean Air Act Amendments of 1990, assessing and controlling air emissions of more than 350
synthetic organic chemicals (SOCs) has emerged as a pressing issue. The National Emission Standards
for Hazardous Air Pollutants under this Act require profound reductions in process and fugitive
emissions

of SOCs from many chemical processes, treatment facilities and storage operations. Essential steps
involved in meeting this challenge include calculations of mass balance and mass transfer at
air-water interfaces, process modifications, solvent substitution etc., where H data at different
temperatures are key inputs.

In spite of the importance of H in several such applications, experimental data are not readily
available for many chemicals. In a critical review of H data published in the literature, Mackay and
Shiu (1981) reported only 40 experimentally measured values for 167 organic chemicals of
environmental concern. Additional experimental data are becoming available, but at a fraction of the
rate at which new chemicals are being introduced. Even for the few common chemicals for which
experimentally determined H data exist, considerable differences can be noted among the reported data
from different sources. In a comparison of experimental H data for 65 common organic solvents from
different sources, we found the average factor of variability between different data sources to be
typically 2.56 (Nirmalakhandan, 1988).

Because of the large number of chemicals in current use, the discrepancies among the reported
experimental data, and the severe analytical limitations on the experimental determination of H
values, researchers, regulators and end users are interested in developing and using rapid estimation
methods for H. Two approaches have emerged for the estimation of H values. In the first approach, H
is estimated from a ratio of the vapor pressure to the aqueous solubility of the solute (s-vp model).
The rationale behind this approach and its validity have been well discussed in the literature
(Mackay and Shiu, 1981; Burkhard et al., 1985; Ashworth et al., 1986). Errors in aqueous solubility
and vapor pressure data will propagate into the H values estimated by this model. Burkhard et al.
(1985) have, for example, evaluated such errors in the prediction of H values of PCB congeners, and
concluded that the average factor of error, AFE, could range from 3 to 4, with a worst-case maximum
of around 5. (AFE is defined as the ratio of the predicted H value to the experimental value; if the
ratio is less than 1, then the reciprocal is used.) We found this method resulted in AFE values of
2-3 for the more common solvents (Nirmalakhandan, 1988). However, whenever reliable experimentally
measured s-vp data are available, this approach will yield excellent results (Meylan and Howard,
1991).

In the second approach, quantitative structure-activity relationship (QSAR) techniques are used to
develop models to predict H (QSAR models). This technique is based on the premise that the
physical/chemical/biological properties and activities of organic molecules are a function of their
molecular structures. To apply this technique, experimentally measured H data for a selected number
of chemicals and their corresponding molecular descriptors are assembled to form a "training" set.
Then, statistical techniques are applied to develop a QSAR model relating the experimental H data to
the descriptors. To validate the resulting QSAR model, a "testing" set of chemicals with known H
values can be used to verify the H values predicted by the model. Given the variability in the
reported experimental H data and the uncertainty and the data requirements of the s-vp model, we feel
that if H could be predicted for "new" chemicals (i.e. chemicals not included in formulating the
models) with an AFE less than 3, then the QSAR model could be considered excellent, and acceptable
when AFE is less than 4.

The QSAR model offers certain advantages over the s-vp model. It does not normally require
experimental inputs, in that the model parameters are calculated or estimated directly from the
molecular structure of the solute. This provides rapid estimation of H values for "new" chemicals. In
addition, a robust and validated QSAR model for H can be used to reconcile and corroborate
contradictory inter-laboratory data. When used with the s-vp model, the QSAR model can aid in
establishing realistic values not only for H, but also for s and vp. The objective of this paper is
to present a QSAR model for H, and document its predictive ability.

RELATED WORK

Hine and Mookerjee (1975) were among the first to report a structure-property relationship to fit H
data. (In their study, the reciprocal of H was reported.) They proposed two schemes to relate
molecular features to H data: a bond contribution scheme, and a group contribution scheme. In the
first scheme, H data of 263 chemicals were analyzed by a least-squares method to establish 34 bond
contribution factors. In the second scheme, data of 212 chemicals were analyzed to establish 49
fragment contribution factors. The model based on bond contributions fitted the log H data with a
standard deviation, SD, of 0.41, while the model based on group contributions fitted with SD = 0.15,
albeit for a smaller number of chemicals. The H values used in these regressions agreed with the
fitted values with an AFE of 2.61 and 1.28 for the two schemes, respectively. (Several chemicals were
excluded from the regression analysis due to very high errors.) The predictive ability of this
approach, however, was not evaluated for "new" chemicals.

In our earlier work (Nirmalakhandan and Speece, 1988), we used a subset of the compilation of Hine
and Mookerjee (1975) to develop a QSAR model that fitted the H data of 180 chemicals ranging over
seven log units, with r² = 0.98. We then demonstrated the predictive ability of this model by
comparing the H values predicted for 20 "new" chemicals against the reported H values with
r² = 0.98. This study also
included results of several tests that further validated the statistical robustness of the model.

Meylan and Howard (1991) have recalculated and expanded the bond contribution factors of Hine and
Mookerjee (1975) to 59. Their model fitted the training set of 345 chemicals with r² = 0.94 and SD of
0.45. To improve the quality of fit, they derived correction factors for certain chemical classes
which increased the r² to 0.97 and decreased the SD to 0.34. (Since they did not publish the
predicted values, we could not determine the corresponding AFE.) They demonstrated the predictive
ability of their approach on a testing set of 74 chemicals of complex structures. The predicted
values agreed well with the experimental values with r² = 0.965 and SD = 0.46; based on the predicted
values reported by them, we found the AFE for this testing set to be 4.39. Meylan and Howard (1991)
compared their approach against our earlier model (Nirmalakhandan and Speece, 1988) and concluded the
two to be similar in overall accuracy.

Russell et al. (1992) used a subset of 63 chemicals

Table 1. Prediction of H for chemicals of "similar" structures (log H values)

No  Chemical                        Found  Predicted    Error

Aliphatic hydrocarbons
 1  Methane                          1.46      1.39      0.07
 2  2-Methylbutane                   1.75      1.68      0.07
 3  2,3-Dimethylbutane               1.72      1.84     -0.12
 4  2-Methylhexane                   2.15      1.89      0.26
 5  3-Methylhexane                   1.99      1.88      0.11
 6  2,2-Dimethylpentane              2.11      1.99      0.12
 7  2,3-Dimethylpentane              1.85      1.93     -0.08
 8  3,3-Dimethylpentane              1.88      1.96     -0.08
 9  3-Methylheptane                  2.18      1.98      0.20
10  2,3,4-Trimethylpentane           1.88      2.10     -0.22
11  n-Nonane                         2.30      2.03      0.27
12  2,2,5-Trimethylhexane            2.15      2.26     -0.11
13  n-Decane                         2.32      2.14      0.18
14  Cyclopropane                     0.55      0.65     -0.10
15  n-Propylcyclopentane             1.56      1.20      0.36
16  n-Pentylcyclopentane             1.87      1.41      0.46
17  cis-1,2-Dimethylcyclohexane      1.16      0.97      0.19
18  trans-1,4-Dimethylcyclohexane    1.55      1.26      0.29
19  Cyclohepta-1,3,5-triene         -0.73     -0.32     -0.41
20  2-Methylpent-1-ene               1.08      1.33     -0.25
21  Heptene                          1.22      1.38     -0.16
22  (E)-Hept-2-ene                   1.23      1.38     -0.15
23  Non-1-ene                        1.51      1.59     -0.08

Halogenated hydrocarbons
 1  1,1,1,2-Tetrachloroethane       -0.94     -1.13      0.19
 2  2-Chlorobutane                   0.00     -0.02      0.02
 3  2-Chloro-2-methylpropane         0.80      0.11      0.69
 4  1,4-Dichlorobutane              -1.70     -0.44     -1.26
 5  1-Chlorohexane                   0.00      0.12     -0.12
 6  1-Chloroheptane                  0.21      0.22     -0.01
 7  1-Chloroprop-2-ene              -0.42     -0.64      0.22
 8  1-Bromo-2-methylpropane         -0.02     -0.33      0.31
 9  2-Bromo-2-methylpropane         -0.62     -0.11     -0.51
10  1-Bromo-3-methylbutane           0.15     -0.24      0.39
11  1-Bromopentane                  -0.07     -0.30      0.23
12  1-Bromohexane                    0.13     -0.19      0.32
13  1-Bromoheptane                   0.25     -0.09      0.34
14  1-Bromooctane                    0.38      0.01      0.37
15  1-Iodopentane                   -0.10     -0.15      0.05
16  1-Iodohexane                     0.06     -0.05      0.11
17  1-Iodoheptane                    0.20      0.06      0.14

Esters/acids
 1  n-Pentylacetate                 -1.84     -1.71     -0.13
 2  Methyltrimethylacetate          -1.76     -1.66     -0.10
 3  n-Pentylpropanoate              -1.55     -1.74      0.19
 4  i-Butylpropionate               -1.17     -1.67      0.50
 5  Ethylhexanoate                  -1.64     -1.63     -0.01
 6  Isobutylisobutanoate            -1.24     -1.51      0.27
 7  Hexylformate                    -1.08     -1.61      0.53
 8  Amyl formate                    -1.26     -1.84      0.58
 9  Methyloctanoate                 -1.50     -2.07      0.57
10  Ethylbenzoate                   -2.67     -2.73      0.06
11  Pentanoic acid                  -4.52     -4.73      0.21
12  Hexanoic acid                   -4.56     -4.62      0.06

Aliphatic alcohols
 1  3-Methylbutan-1-ol              -4.47     -4.05     -0.42
 2  2-Methylpropan-2-ol             -3.28     -3.26     -0.02
 3  Pentan-3-ol                     -3.19     -3.28      0.09
 4  3-Methylbutan-1-ol              -3.24     -3.23     -0.01
 5  Nonan-1-ol                      -2.85     -2.88      0.03
 6  Decan-1-ol                      -2.67     -2.77      0.10
 7  Prop-2-en-1-ol                  -3.69     -3.94      0.25

Aromatics
 1  Isopropylbenzene                -0.22     -0.36      0.14
 2  1,2,3-Trimethylbenzene          -0.89     -0.31     -0.58
 3  1,3,5-Trimethylbenzene          -0.66     -0.30     -0.36
 4  1,4-Diethylbenzene              -0.69     -0.29     -0.40
 5  Allylbenzene                    -0.55     -0.85      0.30
 6  2-Ethyltoluene                  -0.76     -0.37     -0.39
 7  4-Ethyltoluene                  -0.70     -0.37     -0.33
 8  Isobutylbenzene                  0.12     -0.24      0.36
 9  sec-Butylbenzene                -0.33      0.61     -0.94
10  4-Isopropyltoluene              -0.50     -0.21     -0.29
11  n-Pentylbenzene                 -0.17     -0.20      0.03
12  n-Hexylbenzene                  -0.03     -0.09      0.06
13  Styrene                         -0.91     -0.96      0.05

Halogenated aromatics
 1  Fluorobenzene                   -0.59     -1.09      0.50
 2  Benzotrifluoride                -0.18     -1.68      1.50
 3  1,2,4-Trichlorobenzene          -0.82     -1.61      0.79
 4  1,3,5-Trichlorobenzene          -0.57     -1.61      1.04
 5  1,2,3,5-Tetrachlorobenzene      -1.19     -1.90      0.71
 6  1,2,4,5-Tetrachlorobenzene      -0.98     -1.90      0.92
 7  2-Chlorotoluene                 -0.84     -0.88      0.04
 8  4-Bromotoluene                  -1.02     -1.11      0.09
 9  Iodobenzene                     -1.28     -0.97     -0.31

Polyaromatic hydrocarbons
 1  1,3-Dimethylnaphthalene         -1.81     -2.67      0.86
 2  1,4-Dimethylnaphthalene         -2.07     -2.68      0.61
 3  2,3-Dimethylnaphthalene         -2.04     -2.52      0.48
 4  2,6-Dimethylnaphthalene         -1.93     -2.67      0.74
 5  1-Ethylnaphthalene              -1.76     -2.74      0.98
 6  Indane                          -1.07     -1.16      0.09

Phenols
 1  2,3-Dimethylphenol              -4.52     -5.06      0.54
 2  2,4-Dimethylphenol              -4.41     -5.06      0.65
 3  2,5-Dimethylphenol              -4.34     -5.06      0.72
 4  2,6-Dimethylphenol              -3.86     -5.06      1.20
 5  3,4-Dimethylphenol              -4.77     -5.06      0.29
 6  3,5-Dimethylphenol              -4.60     -5.06      0.46
 7  3-Ethylphenol                   -4.59     -5.13      0.54
 8  4-Ethylphenol                   -4.50     -5.13      0.63
 9  4-n-Propylphenol                -4.33     -5.03      0.70
10  4-tert-Butylphenol              -4.34     -4.78      0.44
11  2-Fluorophenol                  -3.88     -5.72      1.84
12  4-Fluorophenol                  -4.54     -5.72      1.18
13  2-Chlorophenol                  -3.34     -5.65      2.31
14  3-Chlorophenol                  -4.85     -5.65      0.80
15  4-Chlorophenol                  -5.16     -5.65      0.49
16  4-Chloro-3-methylphenol         -4.98     -5.50      0.52
17  4-Bromophenol                   -5.23     -5.87      0.64
18  2-Iodophenol                    -4.55     -5.59      1.04
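
The Error column of Table 1 is a difference in log H (found minus predicted), whereas the text judges models by the average factor of error (AFE). The conversion between the two can be sketched as follows. This is a minimal sketch, not the authors' procedure: the function names are hypothetical, and the AFE is assumed to be the arithmetic mean of the per-chemical factors, since the averaging rule is not spelled out in the text. The sample values are taken from Table 1.

```python
def factor_of_error(log_h_found: float, log_h_predicted: float) -> float:
    """Ratio of predicted to found H, inverted when below 1, so the result is >= 1."""
    return 10.0 ** abs(log_h_predicted - log_h_found)


def average_factor_of_error(pairs):
    """Arithmetic mean of per-chemical factors (assumed averaging rule)."""
    factors = [factor_of_error(found, predicted) for found, predicted in pairs]
    return sum(factors) / len(factors)


# (log H found, log H predicted) for a few Table 1 entries
sample = {
    "Methane": (1.46, 1.39),
    "Cyclopropane": (0.55, 0.65),
    "2-Chlorophenol": (-3.34, -5.65),
}

for name, (found, predicted) in sample.items():
    print(f"{name}: factor of error = {factor_of_error(found, predicted):.2f}")

print(f"Mean over sample = {average_factor_of_error(sample.values()):.2f}")
```

Because the factor grows exponentially with the log error, a single outlier such as 2-chlorophenol dominates an arithmetic mean, which is consistent with the text reporting AFE values after excluding the worst-predicted chemicals.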
[Figure 1: scatter plot of predicted log H against log H found from literature (axes from -8 to 4),
with the line of perfect prediction. Plotted groups: A, 23 aliphatic hydrocarbons; B, 17 halogenated
HCs; C, 12 esters/acids; D, 7 alcohols; E, 13 aromatics; F, 9 halogenated aromatics; G, 6
polyaromatic HCs; H, 18 phenols.]
Fig. 1. Comparison between reported and predicted log H values for 105 organic chemicals of "similar"
structures.

from the compilation of Hine and Mookerjee (1975), and reported another model to estimate H with
r² = 0.978 and SD = 0.375. The model parameters included five variables: number of heavy atoms, two
different measures of charged partial surface area and two different measures of atomic charge. They
used an external testing set of seven chemicals to demonstrate the predictive ability of their model.
Based on the comparative values reported by them, we found the agreement between the experimental and
predicted values to be good with r² = 0.92 and AFE = 2.6.

Abraham et al. (1994) reported a model based on the linear solvation energy relationship (LSER)
approach. In this model, H data in the form of gas solubility were correlated for 408 chemicals with
a high degree of fit using five solvatochromic parameters: excess molar refraction,
dipolarity/polarizability, effective hydrogen-bond acidity, effective hydrogen-bond basicity, and
McGowan characteristic volume. This study did not demonstrate the predictive ability of the model on
"new" chemicals. However, the model parameters in this model enable one to evaluate, at a molecular
level, the various factors involved in air-water partitioning.

End users of structure-property models would often prefer to apply these models to predict H values
for chemicals for which experimental data are not readily available. Unless the predictive ability of
such models is demonstrated on "new" chemicals with diverse molecular features that were not used in
the original model development, end users would be reluctant to apply them confidently to new
chemicals. In this paper, we present additional data to demonstrate the predictive ability of the
basic model reported by us earlier (Nirmalakhandan and Speece, 1988) and an extension to this model
which enables H values to be estimated for a wider range of chemicals. In addition, a new QSAR model
is presented that enables, for the first time, H values to be estimated at different temperatures,
directly from the molecular structures of the chemicals.

BASIC MODEL FOR H

The basic model proposed by us (Nirmalakhandan and Speece, 1988) was derived from equation (1) in
which the energy terms associated with the cavity formation and the solute-solvent interactions were
modeled using a combination of a polarizability parameter, Φ, the molecular connectivity index, ¹χᵛ,
and a hydrogen bonding index, I. This approach yielded the following three-variable model for the
non-dimensional concentration form of H at 25°C:

\[ \log H = 1.29 + 1.005\,\Phi - 0.468\,{}^{1}\chi^{v} - 1.258\,I \tag{2} \]

n = 180; r = 0.99; r² = 0.98; SE = 0.262.

The polarizability parameter, Φ, was derived from

Table 2. Contributions to polarizability parameter

Atom        Contribution    Group/bond     Contribution
Carbon          0.577       Aldehyde          -1.192
Hydrogen       -0.12        Ketone            -1.433
Oxygen         -0.825       Amine             -3.106
Hydroxyl       -3.701       Nitro             -2.223
Chlorine       -0.187       Pyridine          -2.511
Bromine        -0.222
Iodine          0.407       Cyclic            -0.952
Fluorine       -0.57        Double bond       -0.859
Sulfur         -0.535       Triple bond       -0.109
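
As a worked illustration of equation (2), the sketch below sums the Table 2 contributions to obtain the polarizability parameter and then evaluates log H. It is only a sketch under stated assumptions: the function and dictionary names are hypothetical; the connectivity index is supplied by the caller rather than computed, because the modified algorithm is described in the Appendix and not reproduced here; and the connectivity values used for methane (0, no bonds between non-hydrogen atoms) and cyclopropane (1.5, the unmodified Kier and Hall value) are assumptions. Only hydrocarbons are shown, since the text does not detail how functional groups are counted.

```python
# Atomic and group/bond contributions to the polarizability parameter (Table 2).
CONTRIBUTIONS = {
    "C": 0.577, "H": -0.12, "O": -0.825, "OH": -3.701,
    "Cl": -0.187, "Br": -0.222, "I": 0.407, "F": -0.57, "S": -0.535,
    "aldehyde": -1.192, "ketone": -1.433, "amine": -3.106,
    "nitro": -2.223, "pyridine": -2.511,
    "cyclic": -0.952, "double_bond": -0.859, "triple_bond": -0.109,
}


def polarizability(counts):
    """Sum of contribution * count over the atoms and features present."""
    return sum(CONTRIBUTIONS[key] * n for key, n in counts.items())


def log_h(phi, chi_1v, h_bond_index):
    """Equation (2): non-dimensional log H at 25 C."""
    return 1.29 + 1.005 * phi - 0.468 * chi_1v - 1.258 * h_bond_index


# Methane, CH4: one carbon, four hydrogens; connectivity index assumed to be 0.
methane = log_h(polarizability({"C": 1, "H": 4}), chi_1v=0.0, h_bond_index=0)

# Cyclopropane, C3H6: ring contribution added; connectivity index assumed to be 1.5.
cyclopropane = log_h(
    polarizability({"C": 3, "H": 6, "cyclic": 1}), chi_1v=1.5, h_bond_index=0
)

print(f"methane: {methane:.2f}, cyclopropane: {cyclopropane:.2f}")
```

Under these assumptions the sketch gives 1.39 for methane and 0.65 for cyclopropane, matching the predicted values listed for those two chemicals in Table 1.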
Table 3. Fitting of H data to chemicals with "new" structures (log H values)

No  Chemical                        Found  Calculated   Error

Aldehydes
 1  Formaldehyde                    -2.02     -2.66      0.64
 2  Acetaldehyde                    -2.57     -2.56     -0.01
 3  Propionaldehyde                 -2.52     -2.47     -0.05
 4  Butyraldehyde                   -2.33     -2.37      0.04
 5  Isobutyraldehyde                -2.10     -2.31      0.21
 6  Pentanal                        -2.22     -2.26      0.04
 7  Hexanal                         -2.06     -2.16      0.10
 8  Heptanal                        -1.96     -2.05      0.09
 9  Octanal                         -1.68     -1.95      0.27
10  Nonanal                         -1.52     -1.84      0.32
11  (E)-But-2-enal                  -3.10     -2.82     -0.28
12  (E)-Hex-2-enal                  -2.70     -2.62     -0.08
13  (E)-Oct-2-enal                  -2.52     -2.41     -0.11

Ketones
 1  Propanone                       -2.79     -2.71     -0.08
 2  Butanone                        -2.72     -2.61     -0.11
 3  Pentan-2-one                    -2.58     -2.46     -0.12
 4  Pentan-3-one                    -2.50     -2.49     -0.01
 5  3-Methylbutan-2-one             -2.38     -2.41      0.03
 6  Hexan-2-one                     -2.41     -2.36     -0.05
 7  4-Methylpentan-2-one            -2.24     -2.29      0.05
 8  Heptan-2-one                    -2.23     -2.25      0.02
 9  Heptan-4-one                    -2.14     -2.28      0.14
10  Octan-2-one                     -2.11     -2.15      0.04
11  Nonan-2-one                     -1.83     -2.04      0.21
12  Nonan-5-one                     -1.94     -2.07      0.13
13  Decan-2-one                     -1.72     -2.53      0.81
14  Undecan-2-one                   -1.58     -1.84      0.26
15  3,3-Dimethylbutan-2-one         -2.28     -2.29      0.01
16  2,4-Dimethylpentan-3-one        -2.01     -2.17      0.16

Amines
 1  Methylamine                     -3.34     -3.14     -0.20
 2  Ethylamine                      -3.30     -3.06     -0.24
 3  n-Propylamine                   -3.22     -2.95     -0.27
 4  n-Butylamine                    -3.11     -2.84     -0.27
 5  n-Pentylamine                   -3.00     -2.74     -0.26
 6  n-Hexylamine                    -2.90     -2.64     -0.26
 7  n-Heptylamine                   -2.78     -2.53     -0.25
 8  n-Octylamine                    -2.68     -2.42     -0.26
 9  Dimethylamine                   -3.15     -3.13     -0.02
10  Di-n-propylamine                -2.68     -2.76      0.08
11  Di-isopropylamine               -2.36     -2.65      0.29
12  Di-n-butylamine                 -2.38     -2.55      0.17
13  Trimethylamine                  -2.35     -3.07      0.72
14  Triethylamine                   -2.36     -2.91      0.55
15  N-Methylaniline                 -3.44     -3.83      0.39
16  N,N-Dimethylaniline             -2.53     -3.78      1.25
17  2,6-Dimethylaniline             -3.82     -3.55     -0.27

Nitro compounds
 1  Acetonitrile                    -2.85     -1.85     -1.00
 2  1-Cyanobutane                   -2.67     -1.57     -1.10
 3  1-Cyanopropane                  -2.58     -1.68     -0.90
 4  Nitromethane                    -2.95     -2.26     -0.69
 5  Nitroethane                     -2.72     -2.18     -0.54
 6  1-Nitropropane                  -2.45     -2.07     -0.38
 7  2-Nitropropane                  -2.30     -2.01     -0.29
 8  1-Nitrobutane                   -2.27     -1.96     -0.31
 9  1-Nitropentane                  -2.07     -1.85     -0.22
10  Benzonitrile                    -3.09     -2.57     -0.52
11  Nitrobenzene                    -3.02     -3.91      0.89
12  2-Nitrotoluene                  -2.63     -3.76      1.13
13  3-Nitrotoluene                  -2.53     -3.76      1.23

Pyridines
 1  Pyridine                        -3.44     -3.65      0.21
 2  2-Methylpyridine                -3.40     -3.51      0.11
 3  3-Methylpyridine                -3.50     -3.51      0.01
 4  4-Methylpyridine                -3.62     -3.51     -0.11
 5  2,3-Dimethylpyridine            -3.54     -3.37     -0.17
 6  2,4-Dimethylpyridine            -3.57     -3.36     -0.21
 7  2,5-Dimethylpyridine            -3.46     -3.36     -0.10
 8  2,6-Dimethylpyridine            -3.37     -3.37      0.00
 9  3,4-Dimethylpyridine            -3.83     -3.36     -0.47
10  3,5-Dimethylpyridine            -3.55     -3.36     -0.19
11  2-Ethylpyridine                 -3.18     -3.43      0.25
12  3-Ethylpyridine                 -3.37     -3.43      0.06
13  4-Ethylpyridine                 -3.47     -3.43     -0.04
14  2-Chloropyridine                -3.22     -4.57      1.35
15  3-Chloropyridine                -2.94     -4.56      1.62
16  3-Formylpyridine                -5.21     -4.97     -0.24
17  4-Formylpyridine                -5.14     -4.97     -0.17
18  3-Acetylpyridine                -6.06     -4.83     -1.23
19  4-Acetylpyridine                -5.59     -4.83     -0.76

Sulfonated compounds
 1  Dimethylsulfide                 -1.18     -1.21      0.03
 2  Diethylsulfide                  -1.07     -0.86     -0.21
 3  Dipropylsulfide                 -0.94     -0.65     -0.29
 4  Di-isopropylsulfide             -0.88     -0.46     -0.42
 5  Methylthiobenzene               -2.00     -1.75     -0.25
 6  Phenylmethylsulfide             -2.00     -1.75     -0.25
 7  Methylethylsulfide              -1.10     -1.04     -0.06
 8  Diethyldisulfide                -1.07     -0.75     -0.32
 9  Dimethyldisulfide               -1.31     -1.06     -0.25

atomic and structural contributions based on an approach similar to that proposed by Ketelaar
(Horvath, 1982). The molecular connectivity index, ¹χᵛ, was calculated based on the approach proposed
by Kier and Hall (1976, 1986) and modified according to Nirmalakhandan (1988). A sample calculation
illustrating this modified algorithm is shown in the Appendix. The hydrogen bonding index, I, is
assigned a value of 1 for the chemicals that contain an electronegative element attached directly
to a carbon atom holding a hydrogen atom. Acetylenic compounds and aromatic compounds with partially
substituted hydrogen atoms were also assigned 1. For all the other chemicals, I was assigned a value
of 0.

VALIDATION OF THE BASIC MODEL FOR H

In our earlier study, the basic model was validated using 20 chemicals that were not included in the
model development. In this paper, we now use H data from the compilation of Abraham et al. (1994) for
additional test chemicals that were not used in the development of our basic model.

Prediction of H data for chemicals with "similar" structures

As a first step in the validation process, we selected chemicals that are similar in molecular
features to those used in developing the basic model. From the compilation of Abraham et al. (1994),
we found 105 chemicals from eight different congeneric families meeting this criterion (Table 1). The
reported H values agreed well with those predicted by our basic model with r² = 0.953 as shown in
Fig. 1. In this set, the predictions for phenolic and fluorinated compounds are poor. For example, in
the case of 2-chlorophenol, the predictive error is high at 2.31; the reported experimental value is
-3.34 while that predicted by the QSAR model is -5.65, which is more in line with the other similar
chemicals. (Using the other (s-vp) model, a value of -0.43 was obtained.) If 2-chlorophenol and
2-fluorophenol are excluded, the overall AFE for the remainder of the chemicals is 3.52, with 70% of
the predictions of AFE < 3.0. This agreement supports the rationale of our approach, the validity of
the basic model, and the significance of the model parameters in air-water partitioning.

Fitting of H data for chemicals with "new" structures

Continuing the approach, we now extend the basic model's applicability to chemical families that
contain atomic and structural features that were completely absent in the original training set.
Chemical families that were not represented are aldehydes, ketones, pyridines, amines, sulfonated and
nitro compounds. Such chemicals were not included in our basic model development due to lack of
experimental data for a sufficient number of chemicals. Abraham et al. (1994) have now reported data
for 87 such chemicals. Using these new data as a training set, optimized contributions to Φ were
first established for aldehydes, ketones, pyridines, amines and nitro compounds (Table 2). Using
these Φ values in the same basic model that was developed earlier, log H values were then calculated
for these 87 chemicals. The reported and the calculated log H values for this training set are listed
in Table 3, from which the quality of the agreement between the two can be seen to be satisfactory,
with overall AFE = 3.57 and 80% of the predictions of AFE < 3.0.

Prediction of H data for chemicals with "mixed" structures

To demonstrate the predictive ability of this final model, we formulated a testing set of chemicals
that contained combinations of molecular features which were represented in the training set
individually. For example, 4-methylbenzaldehyde consists of a combination of an aromatic ring and an
aldehyde group, whereas these two structural features were represented in the training set
separately. Similarly, methylcyclopropylketone has a combination of a cycle and a ketone structure.
Data for a total of 72 such chemicals were available from Abraham et al. (1994) to test the
predictive ability of the final model. These chemicals are listed in Table 4, and the comparison
between the reported and predicted log H values is illustrated in Fig. 3. The agreement between the
two is satisfactory, with r² = 0.881. Again, the predictions for the phenolic compounds
(2-nitrophenol, 2-methoxyphenol, 3-cyanophenol and 4-cyanophenol) are poor. In the case of
4-nitroaniline, the reported experimental value was -7.54 while the predicted value is -6.04. Using
the s-vp model, a value of -5.06 can be calculated, which may be reasonable. With the exclusion of
these chemicals,

Table 4. Prediction of H for chemicals with "mixed" structures (log H values)

No  Chemical                        Found  Predicted    Error
 1  Benzaldehyde                    -2.95     -3.25      0.30
 2  4-Methylbenzaldehyde            -3.13     -3.11     -0.02
 3  3-Hydroxybenzaldehyde           -6.97     -7.87      0.90
 4  4-Hydroxybenzaldehyde           -6.48     -7.87      1.39
 5  Acetophenone                    -3.36     -3.12     -0.24
 6  4-Methylacetophenone            -3.45     -2.97     -0.48
 7  Cyclopentanone                  -3.45     -3.25     -0.20
 8  Cyclohexanone                   -3.60     -3.14     -0.46
 9  Methylcyclopropylketone         -3.38     -3.20     -0.18
10  Methylcyclohexylketone          -2.86     -2.89      0.03
11  4-Methoxyacetophenone           -3.23     -3.51      0.28
12  N,N-Dimethylformamide           -5.73     -5.73      0.00
13  2-Chloroaniline                 -3.60     -4.13      0.53
14  3-Chloroaniline                 -4.27     -4.12     -0.15
15  4-Chloroaniline                 -4.33     -4.12     -0.21
16  2-Methoxyaniline                -4.49     -4.57      0.08
17  3-Methoxyaniline                -5.35     -4.57     -0.78
18  4-Methoxyaniline                -5.49     -4.57     -0.92
19  2-Nitroaniline                  -5.41     -6.05      0.64
20  3-Nitroaniline                  -6.49     -6.04     -0.45
21  4-Nitroaniline                  -7.54     -6.04     -1.50
22  o-Toluidine                     -4.06     -3.69     -0.37
23  p-Toluidine                     -4.09     -3.69     -0.40
24  Cyclohexylamine                 -3.37     -3.48      0.11
25  1-Naphthylamine                 -5.34     -6.06      0.72
26  2-Naphthylamine                 -5.48     -6.05      0.57
27  Benzamide                       -8.07     -7.31     -0.76
28  Cyclopentanol                   -4.03     -4.04      0.01
29  Cycloheptanol                   -4.02     -3.82     -0.20
30  2-Methoxyethanol                -4.96     -4.57     -0.39
31  2-Ethoxyethanol                 -4.91     -4.27     -0.64
32  2-Propoxyethanol                -4.70     -4.16     -0.54
33  2-Butoxyethanol                 -4.59     -4.05     -0.54
34  2-Methoxyphenol                 -4.09     -6.10      2.01
35  3-Methoxyphenol                 -5.62     -6.09      0.47
36  1-Naphthol                      -5.63     -6.57      0.94
37  2-Naphthol                      -5.95     -6.62      0.67
38  Benzylalcohol                   -4.86     -5.23      0.37
39  2-Phenylethanol                 -4.98     -5.12      0.14
40  3-Phenylpropanol                -5.08     -5.02     -0.06
41  3-Cyanophenol                   -7.08     -8.91      1.83
42  4-Cyanophenol                   -7.46     -8.91      1.45
43  2-Nitrophenol                   -3.36     -7.57      4.21
44  3-Nitrophenol                   -7.06     -7.57      0.51
45  4-Nitrophenol                   -7.81     -7.57     -0.24
46  3-Cyanopyridine                 -4.95     -5.48      0.53
47  4-Cyanopyridine                 -4.42     -5.48      1.06
48  Quinoline                       -4.20     -4.92      0.72
49  2-Methylpyrazine                -4.04     -3.89     -0.15
50  2-Ethylpyrazine                 -4.00     -3.83     -0.17
51  2-Isobutylpyrazine              -3.70     -3.55     -0.15
52  N-Methylpiperidine              -2.85     -2.74     -0.11
53  Diethylether                    -1.17     -0.62     -0.55
54  Methylpropylether               -1.22     -0.57     -0.65
55  Di-n-propylether                -0.85     -0.40     -0.45
56  Di-isopropylether               -0.39     -0.30     -0.09
57  Di-n-butylether                 -0.61     -0.20     -0.41
58  Dimethylether                   -1.40     -0.74     -0.66
59  Methylethylether                -1.54     -0.67     -0.87
60  Methyl-tert-butylether          -1.62     -0.33     -1.29
61  Methylphenylether               -1.80     -1.47     -0.33
62  Ethylphenylether                -1.63     -1.41     -0.22
63  Divinylether                    -0.15     -0.70      0.55
64  Methanethiol                    -0.99     -0.86     -0.13
65  Ethanethiol                     -0.95     -0.68     -0.27
66  Propanethiol                    -0.78     -0.58     -0.20
67  Butanethiol                     -0.73     -0.47     -0.26
68  Thiophenol                      -1.87     -1.40     -0.47
69  Morpholine                      -5.26     -4.40     -0.86
70  N-Methylmorpholine              -4.64     -4.62     -0.02
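
The agreement statistic quoted for each set, r² between reported and predicted log H, can be computed along the following lines. This is a minimal sketch, assuming r² denotes the squared Pearson correlation coefficient (the text does not say which definition was used). The function name is hypothetical, and the data are only the first six rows of Table 4, so the printed value describes that small sample and not the full testing set.

```python
def r_squared(found, predicted):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(found)
    mean_f = sum(found) / n
    mean_p = sum(predicted) / n
    cov = sum((f - mean_f) * (p - mean_p) for f, p in zip(found, predicted))
    var_f = sum((f - mean_f) ** 2 for f in found)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_f * var_p)


# First six rows of Table 4: log H found and log H predicted.
found = [-2.95, -3.13, -6.97, -6.48, -3.36, -3.45]
predicted = [-3.25, -3.11, -7.87, -7.87, -3.12, -2.97]

print(f"r^2 for the six-row sample = {r_squared(found, predicted):.3f}")
```

Note that a high correlation does not by itself imply small errors: a set of predictions offset from the line of perfect prediction by a constant would still give r² = 1, which is why the text reports AFE alongside r².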
[Figure 2: scatter plot of predicted log H against log H found from literature; axes from -10 to 0.]
Fig. 2. Comparison between reported and predicted log H values for 70 chemicals with "mixed"
structures.

[Figure 3: scatter plot of QSAR model log H against log H found from literature. Plotted sets:
Training set 1, n = 180; Training set 2, n = 87; Testing set 1, n = 20; Testing set 2, n = 105;
Testing set 3, n = 70. One outlying point is labeled 2-Nitrophenol.]
Fig. 3. Comparison of log H values from literature vs. QSAR model for chemicals in training and testing
sets.

[Figure 4: notched box plots for Training set 1, Training set 2, Testing set 1, Testing set 2 and
Testing set 3.]
Fig. 4. Notched box plots of predictive errors in log H values for chemicals in training and testing sets.
Table 5. Chemicals used in deriving equations (4) and (5)

                                                       Parameter A      Parameter B
No.  Chemical                   Code   ⁰χ     ¹χᵛ      Exp.   Calc.     Exp.   Calc.
 1   Methylene chloride         A      2.70   1.60     10.2   11.7      3645   3932
 2   Chloroform                 B      3.57   1.96     11.8   12.5      4046   4088
 3   2-Chlorobutane             C      4.28   2.34     15.1   12.2      4499   4055
 4   1,2-Dichloropropane        D      4.28   2.44     12.4   11.0      4333   3869
 5   1,3-Dichloropropane        E      4.12   2.60      9.9    8.2      3917   3421
 6   1,2,3-Trichloropropane     F      4.99   3.07      7.4    7.7      3477   3372
 7   1,4-Dichlorobutane         G      4.82   3.10      6.6    6.3      3128   3156
 8   1-Chlorobutane             H      4.12   2.50     11.3    9.3      3482   3607
 9   Toluene                    I      4.38   2.41     11.3   11.9      3751   4020
10   Chlorobenzene              J      4.38   2.47      9.6   11.2      3466   3908
11   o-Chlorotoluene            M      5.30   2.89     10.0   11.6      3545   4000
12   Ethylenedichloride         N      3.41   2.10      8.9    9.9      3539   3677
13   Carbon tetrachloride       O      4.50   2.26     15.0   14.4      4438   4412
14   Benzene                    P      3.46   2.00     11.8   11.4      3964   3910
15   s-Tetrachloroethane        Q      5.15   2.95      7.7   10.0      3547   3747
16   1,1-Dichloroethylene       R      3.20   1.48     15.9   16.1      4618   4628
17   1,1,2-Trichloroethane      S      4.28   2.51      9.0   10.2      3690   3739
18   1,1,1-Trichloroethane      T      4.50   2.20     14.5   15.1      4375   4523
19   Trichloroethylene          U      4.07   2.07     14.7   14.1      4647   4357
20   Tetrachloroethylene        V      5.00   2.51     15.5   14.3      4735   4421
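
The "Calc." columns of Table 5 can be approximately reproduced from the two connectivity indexes using the coefficients reported for equations (4) and (5). The following Python sketch is illustrative only: the function names and the choice of sample rows are assumptions made here, and small differences from the tabulated values are to be expected because the published coefficients and indexes are rounded.

```python
def estimate_a(chi0: float, chi1v: float) -> float:
    """Parameter A from equation (4), using the published coefficients."""
    return 14.97 + 5.78 * chi0 - 11.79 * chi1v


def estimate_b(chi0: float, chi1v: float) -> float:
    """Parameter B from equation (5), using the published coefficients."""
    return 4346 + 947 * chi0 - 1855 * chi1v


# Connectivity indexes copied from Table 5 as (chi0, chi1v) pairs.
TABLE5_SAMPLE = {
    "Methylene chloride": (2.70, 1.60),
    "Benzene": (3.46, 2.00),
    "Carbon tetrachloride": (4.50, 2.26),
}

if __name__ == "__main__":
    for name, (chi0, chi1v) in TABLE5_SAMPLE.items():
        a = estimate_a(chi0, chi1v)
        b = estimate_b(chi0, chi1v)
        print(f"{name}: A = {a:.1f}, B = {b:.0f}")
```

For methylene chloride this gives A close to the tabulated 11.7 and B within a few units of the tabulated 3932.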

the overall AFE is 3.81, with 60% of the predictions of AFE < 3.0.

The log H values determined from the QSAR model for the entire data set of 462 chemicals are compared against the reported values in Fig. 3. These data include the original training set of 180 that was used to develop the basic model in Meylan and Howard (1991) (Training set 1), the testing set of 20 used in Meylan and Howard (1991) (Testing set 1), the testing set listed in Table 1 with "similar" structures (Testing set 2), the training set with "new" structures listed in Table 3 (Training set 2), and the testing set with "mixed" structures listed in Table 4 (Testing set 3). The distributions of the errors for these five data sets are illustrated in Fig. 4 by notched box plots, where the 10th, 25th, 50th, 75th and 90th percentiles are indicated by the horizontal lines; the 95% confidence interval around the median is represented by the notch; and values below the 10th and above the 90th percentile are plotted as points. These results support the claim that the proposed QSAR model can be used to predict H values for a range of chemicals similar to those analyzed here, typically with an AFE less than 4.

MODELING THE EFFECT OF TEMPERATURE ON HENRY'S CONSTANT

Equation (1) was adapted to model the effect of temperature on H. Assuming that the variations of Go, G~ and, v~ are negligible over a small range of temperatures as in environmental applications, the following form of equation (1) can be deduced:

$$\ln H = A - \frac{B}{T} + C \ln T \qquad (3)$$

[Figure 5: vertical axis logarithmic, 0.01 to 1.00; horizontal axis: temperature [°C], 0 to 30. Lines represent QSAR predictions.]
Fig. 5. Comparison between QSAR-predicted and experimental (Leighton and Calo, 1981) H data. A: carbon tetrachloride; B: o-chlorotoluene; C: chlorobenzene; D: 1,2-dichloropropane; E: 1,3-dichloropropane.

Table 6. Comparison between reported (Gossett, 1985) and predicted H values at various temperatures
(dimensionless Henry's law constant at the temperature shown)

cis-1,2-Dichloroethylene [⁰χ = 3.15; ¹χᵛ = 1.64; A = 13.9; B = 4287]
                     10.3°C   17.5°C   24.8°C   34.6°C    AFE
  Observed            0.07     0.11     0.17     0.22
  Predicted           0.28     0.41     0.59     0.94
  Factor of error     3.83     3.71     3.55     4.34     3.86

Vinyl chloride [⁰χ = 2.28; ¹χᵛ = 1.06; A = 15.7; B = 4539]
                     10.3°C   17.5°C   24.8°C   34.6°C    AFE
  Observed            0.63     0.81     1.14     1.42
  Predicted           0.71     1.05     1.54     2.51
  Factor of error     1.12     1.30     1.36     1.77     1.39

1,1-Dichloroethane [⁰χ = 3.57; ¹χᵛ = 1.88; A = 13.5; B = 4239]
                      9.6°C   17.5°C   24.8°C   34.6°C    AFE
  Observed            0.11     0.16     0.23     0.32
  Predicted           0.22     0.33     0.47     0.73
  Factor of error     2.03     2.00     2.03     2.29     2.09

Chloroethane [⁰χ = 2.70; ¹χᵛ = 1.50; A = 12.9; B = 4120]
                     10.3°C   17.5°C   24.8°C   34.6°C    AFE
  Observed            0.28     0.35     0.45     0.61
  Predicted           0.20     0.28     0.40     0.62
  Factor of error     1.43     1.26     1.14     1.01     1.21

Dichloromethane [⁰χ = 2.70; ¹χᵛ = 1.60; A = 11.7; B = 3935]
                      9.6°C   17.5°C   24.8°C   34.6°C    AFE
  Observed            0.05     0.05     0.09     0.13
  Predicted           0.11     0.16     0.23     0.35
  Factor of error     2.26     2.97     2.54     2.69     2.62

Chloromethane [⁰χ = 2.0; ¹χᵛ = 1.13; A = 13.2; B = 4144]
                     10.3°C   17.5°C   24.8°C   34.6°C    AFE
  Observed            0.17     0.24     0.36     0.49
  Predicted           0.25     0.35     0.50     0.78
  Factor of error     1.46     1.44     1.39     1.59     1.47

Overall AFE = 2.10
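
The predicted rows and error measures in Table 6 can be approximated with a short calculation. The Python sketch below is an illustration under stated assumptions, not the authors' procedure: it uses equation (3) with the C ln T term dropped, takes T in kelvin (converted from °C by adding 273.15), and treats the factor of error as the larger of predicted/observed and observed/predicted, with AFE as its arithmetic mean. The last two definitions are inferred from the tabulated values; the function names are hypothetical.

```python
import math


def henry_constant(a: float, b: float, temp_c: float) -> float:
    """Dimensionless H from ln H = A - B/T, with T in kelvin (assumed)."""
    temp_k = temp_c + 273.15
    return math.exp(a - b / temp_k)


def factor_of_error(predicted: float, observed: float) -> float:
    """Ratio of the two values, arranged so the result is never below 1."""
    ratio = predicted / observed
    return max(ratio, 1.0 / ratio)


def average_factor_of_error(predicted, observed) -> float:
    """Arithmetic mean of the per-point factors of error."""
    factors = [factor_of_error(p, o) for p, o in zip(predicted, observed)]
    return sum(factors) / len(factors)


# cis-1,2-Dichloroethylene, values copied from Table 6.
TEMPS_C = [10.3, 17.5, 24.8, 34.6]
OBSERVED = [0.07, 0.11, 0.17, 0.22]
A, B = 13.9, 4287

if __name__ == "__main__":
    predicted = [henry_constant(A, B, t) for t in TEMPS_C]
    for t, h in zip(TEMPS_C, predicted):
        print(f"{t:5.1f} °C: H = {h:.2f}")
    print(f"AFE = {average_factor_of_error(predicted, OBSERVED):.2f}")
```

Because A and B are quoted to only a few significant figures, the values computed this way differ from the tabulated predictions by a few percent.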

Several workers have proposed empirical relationships of a similar form to model the H-T relationship, using experimentally measured H data at different temperatures with corresponding A and B values for the tested chemicals, but the practical utility of such models is severely limited to only those few chemicals tested. In this study, we have used such experimental data to derive QSAR models for A and B, so that they could be estimated for similar untested chemicals directly from their molecular structures. This enables the H-T relationship to be predicted for a wider range of chemicals.

A data set reported by Leighton and Calo (1981) for 20 chemicals was used as a training set to develop QSAR models for the parameters in equation (3) (original H data reported by Leighton and Calo (1981) were in the atm/mole fraction form; they were converted to the non-dimensional concentration ratio form in this study). Because of the limited availability of experimental data and the marginal contribution of the third term, C, in equation (3), only the first two terms, A and B, were modeled. This data set contained measured H values for benzene, toluene, and 18 chlorinated aliphatics over a temperature range of 1-27°C, and the A and B values calculated from them by Leighton and Calo (1981). The following equations were derived in this study to model A and B in terms of molecular connectivity indexes:

$$A = 14.97 + 5.78\,{}^{0}\chi - 11.79\,{}^{1}\chi^{v} \qquad (4)$$

n = 20; r = 0.879; r² = 0.772; RMS = 1.46

$$B = 4346 + 947\,{}^{0}\chi - 1855\,{}^{1}\chi^{v} \qquad (5)$$

n = 20; r = 0.813; r² = 0.661; RMS = 298.

The chemicals used, their χ values, and the comparison between the reported and calculated A and B values are shown in Table 5. Using A and B values calculated from these equations, H values were then estimated for five selected chemicals over a temperature range of 1-25°C. As seen in Fig. 5, these calculated values agree well with the measured H values. The agreement between the measured and calculated H values for the 20 chemicals ranging in temperature from 1-27°C was satisfactory, with an overall r² = 0.86 for the 170 data points and an overall AFE of 1.74.

To demonstrate the predictive ability of this H-T model, data sets reported by Gossett (1985) and Hulscher et al. (1992) were used as testing sets. Gossett's data set contained H data for six "new" chemicals over a temperature range of 10-35°C. These measured values are compared with the H values predicted at various temperatures in Table 6. The agreement can be seen to be good, with an AFE of 2.10 for this set. The comparison between the observed values of Hulscher et al. (1992) and those predicted over a temperature range of 10-55°C for 12 "new" chemicals is shown in Table 7. While the three chlorinated biphenyls in this set are
Table 7. Comparison between reported (Hulscher et al.) and QSAR-predicted H values at various temperatures
(dimensionless Henry's law constant at the temperature shown)

                       14.8°C        20.1°C        22.1°C        24.2°C        34.8°C        50.5°C        AFE
1,2,3,4-Tetrachlorobenzene [⁰χ = 7.15; ¹χᵛ = 3.92; A = 10.08; B = 3845]
  Observed          2.03 × 10^-2  2.14 × 10^-2  2.78 × 10^-2  2.88 × 10^-2  5.01 × 10^-2  1.03 × 10^-1
  Predicted         3.76 × 10^-2  4.78 × 10^-2  5.23 × 10^-2  5.73 × 10^-2  8.95 × 10^-2  1.64 × 10^-1
  Factor of error   1.85          2.24          1.88          1.99          1.79          1.59          1.89
Pentachlorobenzene [⁰χ = 8.07; ¹χᵛ = 4.57; A = 7.73; B = 3511]
  Observed          1.57 × 10^-2  2.03 × 10^-2  2.78 × 10^-2  2.71 × 10^-2  4.86 × 10^-2  1.03 × 10^-1
  Predicted         1.15 × 10^-2  1.43 × 10^-2  1.56 × 10^-2  1.69 × 10^-2  2.54 × 10^-2  4.42 × 10^-2
  Factor of error   1.36          1.42          1.79          1.60          1.91          2.33          1.74
Hexachlorobenzene [⁰χ = 9.0; ¹χᵛ = 5.08; A = 7.10; B = 3446]
  Observed          9.89 × 10^-3  1.24 × 10^-2  1.91 × 10^-2  2.12 × 10^-2  3.46 × 10^-2  8.10 × 10^-2
  Predicted         7.63 × 10^-3  9.48 × 10^-3  1.03 × 10^-2  1.11 × 10^-2  1.66 × 10^-2  2.86 × 10^-2
  Factor of error   1.30          1.30          1.86          1.90          2.08          2.83          1.88

                       10.4°C        20.0°C        30.1°C        34.9°C        42.1°C        47.9°C        48.4°C        AFE
2,5-Dichlorobiphenyl [⁰χ = 8.62; ¹χᵛ = 5.09; A = 4.78; B = 3067]
  Observed          6.85 × 10^-3  1.22 × 10^-2  2.32 × 10^-2  3.22 × 10^-2  4.71 × 10^-2  6.22 × 10^-2
  Predicted         2.38 × 10^-3  3.39 × 10^-3  4.81 × 10^-3  5.63 × 10^-3  7.07 × 10^-3  8.43 × 10^-3
  Factor of error   2.88          3.59          4.82          5.72          6.66          7.37                        5.17
2,4,4'-Trichlorobiphenyl [⁰χ = 9.54; ¹χᵛ = 5.60; A = 4.09; B = 2992]
  Observed          3.70 × 10^-3  8.73 × 10^-3  1.89 × 10^-2  1.97 × 10^-2  2.71 × 10^-2  4.53 × 10^-2  4.59 × 10^-2
  Predicted         1.55 × 10^-3  2.19 × 10^-3  3.07 × 10^-3  3.58 × 10^-3  4.47 × 10^-3  5.31 × 10^-3  5.39 × 10^-3
  Factor of error   2.39          3.99          6.14          5.50          6.06          8.54          8.51          5.44
2,2',5,5'-Tetrachlorobiphenyl [⁰χ = 10.46; ¹χᵛ = 6.12; A = 3.27; B = 2899]
  Observed          3.66 × 10^-3  6.75 × 10^-3  1.49 × 10^-2  1.52 × 10^-2  2.63 × 10^-2  4.11 × 10^-2  4.53 × 10^-2
  Predicted         9.53 × 10^-4  1.33 × 10^-3  1.85 × 10^-3  2.15 × 10^-3  2.67 × 10^-3  3.15 × 10^-3  3.20 × 10^-3
  Factor of error   3.84          5.07          8.03          7.07          9.86          13.03         14.17         7.82

                       10.0°C        20.0°C        35.0°C        40.1°C        45.0°C        55.0°C        AFE
Fluoranthene [⁰χ = 8.77; ¹χᵛ = 5.56; A = 0.11; B = 2337]
  Observed          1.11 × 10^-4  2.64 × 10^-4  6.39 × 10^-4  9.17 × 10^-4  2.22 × 10^-3  2.29 × 10^-3
  Predicted         2.88 × 10^-4  3.82 × 10^-4  5.64 × 10^-4  6.38 × 10^-4  7.16 × 10^-4  8.96 × 10^-4
  Factor of error   2.60          1.45          1.13          1.44          3.10          2.56          2.05
Benzo(b)fluoranthene [⁰χ = 10.85; ¹χᵛ = 6.90; A = -3.67; B = 1821]
  Observed          1.07 × 10^-5  2.10 × 10^-5  4.66 × 10^-5  5.82 × 10^-5  7.89 × 10^-5  1.36 × 10^-4
  Predicted         4.09 × 10^-5  5.10 × 10^-5  6.90 × 10^-5  7.59 × 10^-5  8.31 × 10^-5  9.89 × 10^-5
  Factor of error   3.84          2.43          1.48          1.31          1.05          1.38          1.91
Benzo(k)fluoranthene [⁰χ = 10.93; ¹χᵛ = 6.97; A = -4.03; B = 1767]
  Observed          9.38 × 10^-6  1.77 × 10^-5  4.19 × 10^-5  5.32 × 10^-5  7.51 × 10^-5  1.48 × 10^-4
  Predicted         3.45 × 10^-5  4.26 × 10^-5  5.72 × 10^-5  6.28 × 10^-5  6.85 × 10^-5  8.12 × 10^-5
  Factor of error   3.67          2.41          1.36          1.18          1.10          1.83          1.92
Benzo(a)pyrene [⁰χ = 10.93; ¹χᵛ = 6.97; A = -4.03; B = 1767]
  Observed          9.38 × 10^-6  1.40 × 10^-5  2.90 × 10^-5  3.55 × 10^-5  4.17 × 10^-5  8.79 × 10^-5
  Predicted         3.45 × 10^-5  4.26 × 10^-5  5.72 × 10^-5  6.28 × 10^-5  6.85 × 10^-5  8.12 × 10^-5
  Factor of error   3.67          3.04          1.97          1.77          1.64          1.08          2.20
Benzo(ghi)perylene [⁰χ = 11.93; ¹χᵛ = 7.72; A = -7.09; B = 1323]
  Observed          8.10 × 10^-6  1.11 × 10^-5  2.04 × 10^-5  2.08 × 10^-5  2.50 × 10^-5  3.20 × 10^-5
  Predicted         7.74 × 10^-6  9.08 × 10^-6  1.13 × 10^-5  1.21 × 10^-5  1.30 × 10^-5  1.47 × 10^-5
  Factor of error   1.05          1.22          1.80          1.71          1.93          2.18          1.65
Indeno(1,2,3-cd)pyrene [⁰χ = 11.93; ¹χᵛ = 7.72; A = -7.09; B = 1323]
  Observed          7.67 × 10^-6  1.19 × 10^-5  2.23 × 10^-5  2.35 × 10^-5  2.92 × 10^-5  3.86 × 10^-5
  Predicted         7.74 × 10^-6  9.08 × 10^-6  1.13 × 10^-5  1.21 × 10^-5  1.30 × 10^-5  1.47 × 10^-5
  Factor of error   1.01          1.31          1.97          1.94          2.26          2.63          1.85

Overall AFE = 2.96

not well predicted, the overall AFE is satisfactory at 2.96.

CONCLUSIONS

A QSAR model for predicting Henry's constant values directly from the molecular structure of chemicals has been developed. The quality of the predictions made by this model was shown to be satisfactory for a wide variety of chemicals that were not included in the model development. The overall fit of this model for 462 chemicals, spanning an H range of 10 orders of magnitude, was found to be good, with r² exceeding 0.95. A QSAR model for predicting H as a function of temperature was also developed using the limited number of data reported in the literature. This H-T model predicted H values for 18 "new" chemicals over 10-55°C fairly well. It should be noted that the shortcomings of the H-T model stem from two sources. The first is the inadequacy of the QSAR models, equations (4) and (5). The second is the uncertainty in the values of A and B used in developing these two equations; these values were in turn calculated by regression analysis of the experimental H-T data and thus contain some built-in uncertainties. Additional experimental H-T data for a wider range of chemicals would enable this model to be strengthened. Based on the validations demonstrated in this study, the proposed models could be expected to predict acceptable H data directly from the molecular structures for chemicals similar to those included in the training sets.

Acknowledgements: This work was supported in part by US Air Force Office of Scientific Research under an Award for Science and Engineering Research Training (ASERT), Grant F49620-94-1-0366; Project Manager, Dr W. Kozumbo.

REFERENCES

Abraham M. H., Andonian-Haftvan J., Whiting G. S., Leo A. and Taft S. (1994) Hydrogen bonding 34: the factors that influence the solubility of gases and vapors in water at 298 K, and a new method for its determination. J. Chem. Soc. Perkin Trans. 2, 1777-1791.
Ashworth R. A., Howe G. B., Mullins M. E. and Rogers T. N. (1986) Air-water partitioning coefficients of organics in dilute aqueous solutions. Presented at AIChE Summer National Meeting, Boston, MA.
Burkhard L. P., Armstrong D. E. and Andren A. W. (1985) Henry's law constants for the polychlorinated biphenyls. Environ. Sci. Technol. 19, 590-596.
Gossett J. M. (1985) Air Force Engineering and Services Center Report ESL TR-85-38.
Hine J. and Mookerjee P. K. (1975) The intrinsic hydrophilic character of organic compounds: correlations in terms of structural contributions. J. Org. Chem. 40, 292-298.
Horvath A. (1982) Halogenated Hydrocarbons. Marcel Dekker, New York.
Hulscher T. E. M., van der Velde L. E. and Bruggeman W. A. (1992) Temperature dependence of Henry's law constants for selected chlorobenzenes, polychlorinated biphenyls, and polycyclic aromatic hydrocarbons. Environ. Toxicol. Chem. 11, 1595-1603.
Kier L. B. and Hall L. H. (1976) Molecular Connectivity in Chemistry and Drug Design. Academic Press, New York.
Kier L. B. and Hall L. H. (1986) Molecular Connectivity in Structure-Activity Analysis. Research Studies, London.
Leighton D. T. and Calo J. M. (1981) Distribution coefficients of chlorinated hydrocarbons in dilute air-water systems for groundwater contamination applications. J. Chem. Eng. Data 6, 87-96.
Letters to Editor (1987) Environ. Sci. Technol. 21, 828.
Mackay D. and Shiu W. Y. (1981) A critical review of Henry's law constants for chemicals of environmental concern. J. Phys. Chem. Ref. Data 10(4), 1175-1199.
Meylan W. M. and Howard P. H. (1991) Bond contribution method for estimating Henry's law constant. Environ. Toxicol. Chem. 10, 1283-1293.
Nirmalakhandan N. (1988) Prediction of aqueous solubility and Henry's constant from molecular structures. PhD dissertation, Drexel University, Philadelphia, PA.
Nirmalakhandan N. and Speece R. E. (1988) QSAR Model for predicting Henry's constant. Environ. Sci. Technol. 22, 1349-1361.
Pierotti R. A. (1963) The solubility of gases in liquids. J. Phys. Chem. 67, 1840-1845.
Russell C. J., Dixon S. L. and Jurs P. C. (1992) Computer-assisted study of the relationship between molecular structure and Henry's law constant. Anal. Chem. 64, 1350-1355.

APPENDIX

Algorithm for calculation of molecular connectivity indexes:

$${}^{1}\chi^{v} = \sum_{i=1}^{n} \left(v_1 v_2\right)_i^{-0.5}$$

where n is the number of subgraphs containing single bonds, and v1 and v2 are the valence values at the terminal points of each bond. For carbon atoms, values are equal to the number of bonds at each atom. For hetero atoms, the valence values are given in Nirmalakhandan (1988).

Example calculation for ethyl acetate, CH3COOCH2CH3.

Step 1: Draw hydrogen-suppressed molecular skeleton and label each node
[Skeleton drawing with nodes labelled 1 to 6.]

Step 2: For each node, assign valence nodal values
  Node ID             1     2     3     4     5     6
  Atom                C     C     O     C     C     O
  Valence value, v    1     4     6     2     1     6

Step 3: Identify subgraphs containing one bond each
  Subgraph ID         1     2     3     4     5
  Terminal node IDs   1,2   2,3   3,4   4,5   2,6

Step 4: For each subgraph, calculate v1·v2, where v1 and v2 are the terminal valence values
  Subgraph ID         1     2     3     4     5
  v1·v2               4     24    12    2     24

Step 5: Calculate the contribution from each subgraph
  Subgraph ID         1     2     3     4     5
  (v1·v2)^-0.5        0.50  0.20  0.29  0.71  0.20

Step 6: Determine the summation of the contributions from subgraphs to ¹χᵛ, giving ¹χᵛ = Σ (v1·v2)^-0.5 = 1.90 for ethyl acetate.
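
The six steps above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and data layout are assumptions made here, while the valence values and bond list are copied from Steps 2 and 3 of the ethyl acetate example.

```python
def first_order_valence_index(valence: dict, bonds: list) -> float:
    """Sum (v1 * v2) ** -0.5 over all single-bond subgraphs (Steps 4 to 6)."""
    return sum((valence[a] * valence[b]) ** -0.5 for a, b in bonds)


# Step 2: valence value assigned to each labelled node of ethyl acetate.
VALENCE = {1: 1, 2: 4, 3: 6, 4: 2, 5: 1, 6: 6}

# Step 3: terminal node IDs of each one-bond subgraph.
BONDS = [(1, 2), (2, 3), (3, 4), (4, 5), (2, 6)]

if __name__ == "__main__":
    chi1v = first_order_valence_index(VALENCE, BONDS)
    print(f"1-chi-v for ethyl acetate = {chi1v:.2f}")
```

Keeping the valence table and the bond list as separate inputs mirrors the appendix: for another molecule only those two structures would change, with hetero-atom valence values taken from the cited source.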
