10th International Symposium on Process Systems Engineering - PSE2009

Rita Maria de Brito Alves, Claudio Augusto Oller do Nascimento and Evaristo
Chalbaud Biscaia Jr. (Editors)
© 2009 Elsevier B.V. All rights reserved.

Estimating the Normal Boiling Point of Organic Compounds Based on Elements and Chemical Bonds
Xia Li, Xiang Shu-Guang and Jia Xiao-Ping
The Hi-Tech Institute for Petroleum and Chemical Industry, Qingdao University of
Science and Technology, Qingdao, 266042, Shandong, China

Abstract
A new method for estimating the normal boiling point of pure organic compounds from
their chemical structure is proposed, based on elements and chemical bonds. The method
accounts for the contributions of the elements, the chemical bonds and their
interactions in the molecule. Using reliable experimental data for 4,060 compounds,
7 correlation equations were obtained by regression analysis. The best correlation
gives a mean absolute deviation of 15.16 K, which is much lower than that of the
commonly used group contribution methods for alkynes, chlorine derivatives and iodine
derivatives, and especially for alkanes, bromine derivatives, aromatic hydrocarbons,
alicyclic hydrocarbons and organic sulphides. Compared with other group contribution
methods, the proposed method shows significant improvements in accuracy and in the
ability to distinguish among isomers.

Keywords: normal boiling point, elements, chemical bonds, group-contribution techniques

1. Introduction
The boiling point of an organic compound is important in the design of new organic
compounds and in some industrial processes, since it is one of the factors that
primarily control solubility and vapour pressure [1-3].
Despite the enormous amount of available boiling point data, there are very few
useful general means of quantitatively relating the boiling point of a compound to its
chemical structure [4]. Most of the work on predicting the phase transition
temperatures of compounds has focused on boiling point estimation.
Recently, simple group contribution methods were proposed to predict boiling points.
It was found that the boiling temperature can be estimated simply by using molecular
fragment values. Most of these methods were developed from small data sets and
can be used to predict only certain classes of organic compounds [5].
Drawing on the experience gained in the development of previous group
contribution methods, this paper proposes a new method, based on elements and
chemical bonds, for estimating the normal boiling point of pure organic compounds
from chemical structure.

2. Physical Property Data Set


A database containing boiling points, molecular weights, molecular descriptors, etc. was
compiled from Yaws' Handbook of Thermodynamic and Physical Properties of Chemical
Compounds [6]. The data set for the normal boiling points (measured at 1 atm) consists
of 4,060 compounds, all with reliable experimental values. The compounds considered in
this study were substituted aromatic compounds, including heterocyclic compounds. The
substituents were non-hydrogen-bonding and single-hydrogen-bonding groups: methyl (CH3),
methylene (CH2), cyano (CN), oxygen (O), hydroxyl (OH), halogens, nitro (NO2),
aldehyde (CHO), amide (CONH2), keto (CO), carboxyl (COOH), sulfur (S), sulfoxide (SO),
mercapto (SH), thiocyanate (SCN) and isothiocyanate (NCS).

3. Development of the new method


3.1 Molecular Descriptors
Elements and chemical bonds are the basic contribution units in this method. The
elements are C, H, O, F, Cl, Br, I, N and S. Chemical bonds are first divided into
cyclic and non-cyclic bonds and then, according to the atoms they join, into C-C, C-H,
C-O, cyclical C-C, cyclical C-O, and so on. Because the chemical bond in benzene differs
from both C-C and C=C, it is treated as a special bond type. The structural groups of
the new method (9 elements, 29 chemical bonds and 32 adjacent chemical bond
interactions) are given in Tables 2 to 4.
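As a minimal sketch of how such a description might be stored, a molecule can be written as counts of each descriptor. The dictionary layout and the helper name `molecular_weight` are illustrative assumptions, not part of the original method. The counts are those listed in Table 7 for 1,3-dihydroxy-4-methylbenzene, and the atomic weights are standard values rounded to three decimals.

```python
# Rounded standard atomic weights (g/mol) for the nine elements used by the method.
ATOMIC_WEIGHT = {
    "C": 12.011, "H": 1.008, "O": 15.999, "F": 18.998, "Cl": 35.45,
    "Br": 79.904, "I": 126.904, "N": 14.007, "S": 32.06,
}

# Descriptor counts for 1,3-dihydroxy-4-methylbenzene (C7H8O2), as in Table 7.
# The three-level layout (elements / bonds / adjacent) is a hypothetical choice.
descriptors = {
    "elements": {"C": 7, "H": 8, "O": 2},
    "bonds": {"C-C": 1, "C-H": 6, "C-O": 2, "O-H": 2, "benzene": 1},
    "adjacent": {"CH3": 1, "CRO(ortho)": 1, "ORO(para)": 1},
}


def molecular_weight(element_counts):
    """Sum atomic weights over the element counts of one molecule."""
    return sum(ATOMIC_WEIGHT[el] * n for el, n in element_counts.items())


M = molecular_weight(descriptors["elements"])
print(f"M = {M:.3f} g/mol")
```

The computed molecular weight agrees with the value M = 124.139 used in Table 7.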

3.2 Statistical Analysis


The statistical analysis of the data was performed using Matlab 7.0. The correlation
coefficient and the standard error were used to measure the quality of the equations
developed. In selecting the best model, however, emphasis was placed on reducing
the standard error.
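The two measures could be computed as in the following sketch. The paper does not state its exact definitions, so the Pearson form of the correlation coefficient and the degrees-of-freedom correction in the standard error are assumptions, and the function names and sample values are purely illustrative.

```python
import math


def pearson_r(y_obs, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_obs)
    mean_o = sum(y_obs) / n
    mean_p = sum(y_pred) / n
    cov = sum((a - mean_o) * (b - mean_p) for a, b in zip(y_obs, y_pred))
    var_o = sum((a - mean_o) ** 2 for a in y_obs)
    var_p = sum((b - mean_p) ** 2 for b in y_pred)
    return cov / math.sqrt(var_o * var_p)


def standard_error(y_obs, y_pred, n_params=0):
    """Root of the residual sum of squares divided by (n - n_params).

    n_params is the number of fitted parameters (assumed convention).
    """
    sse = sum((a - b) ** 2 for a, b in zip(y_obs, y_pred))
    return math.sqrt(sse / (len(y_obs) - n_params))


# Illustrative values only (K); they are not taken from the paper's data set.
tb_exp = [350.0, 360.0, 370.0, 380.0]
tb_pred = [351.0, 359.0, 372.0, 379.0]
print(pearson_r(tb_exp, tb_pred), standard_error(tb_exp, tb_pred, n_params=2))
```

Two models with similar correlation coefficients can still differ noticeably in standard error, which is consistent with the emphasis placed on the latter when selecting the best model.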

3.3 Regression algorithm


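Drawing on the experience gained in the development of the previous method, six
regression equations were employed. In these equations M is the molecular weight,
nj is the number of occurrences of group j, ΔTbj is the contribution of group j, and
m, k and l are correlation coefficients.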

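$$T_b^{k} = m + \sum_j n_j \,\Delta T_{bj} \qquad (1)$$

$$T_b = m + \frac{\sum_j n_j \,\Delta T_{bj}}{M^{k}} \qquad (2)$$

$$T_b = m + \frac{\sum_j n_j \,\Delta T_{bj}}{M^{k} + l} \qquad (3)$$

$$T_b = m + \frac{l + \sum_j n_j \,\Delta T_{bj}}{M^{k}} \qquad (4)$$

$$T_b = m + k \left( \sum_j n_j \,\Delta T_{bj} \right)^{l} \qquad (5)$$

$$T_b^{l} = m + \frac{\sum_j n_j \,\Delta T_{bj}}{M^{k}} \qquad (6)$$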
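The six forms can be evaluated with a small helper, sketched below. It assumes the equations read as follows, with S the sum of nj ΔTbj over all groups: (1) Tb^k = m + S; (2) Tb = m + S/M^k; (3) Tb = m + S/(M^k + l); (4) Tb = m + (l + S)/M^k; (5) Tb = m + k S^l; (6) Tb^l = m + S/M^k. The function name `boiling_point` and its argument order are hypothetical. Only form (2) is checked against the paper's worked example for 1,3-dihydroxy-4-methylbenzene (Table 7), so the other branches should be read as a transcription of the assumed forms.

```python
def boiling_point(form, S, M, m, k, l=None):
    """Return Tb (K) for correlation `form` (1-6) under the assumed equation forms.

    S: sum of n_j * dTb_j over all groups; M: molecular weight;
    m, k, l: correlation coefficients (l is unused by forms 1 and 2).
    """
    if form == 1:  # Tb**k = m + S
        return (m + S) ** (1.0 / k)
    if form == 2:  # Tb = m + S / M**k
        return m + S / M ** k
    if form == 3:  # Tb = m + S / (M**k + l)
        return m + S / (M ** k + l)
    if form == 4:  # Tb = m + (l + S) / M**k
        return m + (l + S) / M ** k
    if form == 5:  # Tb = m + k * S**l
        return m + k * S ** l
    if form == 6:  # Tb**l = m + S / M**k
        return (m + S / M ** k) ** (1.0 / l)
    raise ValueError(f"unknown correlation form: {form}")


# Correlation 2 with the group sum and molecular weight from Table 7.
tb = boiling_point(2, S=9682.146, M=124.139, m=10.892, k=0.604)
print(f"Tb = {tb:.2f} K")
```

With the Table 7 inputs, form (2) returns about 537.21 K, the tabulated Tbcal.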

3.4 Correlation coefficient and group contributions


The correlation coefficients m, k and l are listed in Table 1. The contributions of the
elements, the chemical bonds and the adjacent chemical bond interactions are shown in
Tables 2, 3 and 4, respectively.
Table 1 Correlation coefficients for the six regression equations
Correlation m k l
Correlation 1 100.638 2.425 -
Correlation 2 10.892 0.604 -
Correlation 3 71.215 0.616 3.569
Correlation 4 37.069 0.619 489.408
Correlation 5 -25.668 27.548 0.397
Correlation 6 9.556 0.690 0.739
Table 2 Contributions of the elements for the six regression equations
Element correlation 1 correlation 2 correlation 3 correlation 4 correlation 5 correlation 6
C 240320 782.910 839.850 841.670 114.640 222.400
H 28212 53.126 56.257 53.593 13.620 13.896
O 237280 756.010 823.740 817.720 115.180 216.640
F 95650 208.790 280.010 210.410 45.101 54.531
Cl -155650 -109.020 -146.970 -126.550 -76.313 -16.131
Br 422880 1970.100 2032.800 2092.900 202.960 591.470
I 676750 3128.200 3223.100 3316.900 324.490 938.540
N 780110 1456.100 1619.800 1535.200 371.080 373.950
S 72825 351.300 379.940 402.480 36.894 103.750
Table 3 Contributions of the chemical bonds for the six regression equations
Chemical bond correlation 1 correlation 2 correlation 3 correlation 4 correlation 5 correlation 6
C-C 18652 48.224 52.788 52.958 8.469 12.609
C-O -20133 46.858 33.214 45.012 -9.801 16.598
O-O 2458 47.524 24.630 38.489 0.039 12.299
O-H 718760 863.370 1069.400 933.160 343.120 197.610
C-H -24453 -28.265 -31.964 -28.521 -11.603 -6.073
C-F -134440 215.030 58.425 231.510 -60.153 97.711
C-Cl 615500 1952.800 2062.100 2085.700 298.710 559.620
C-Br 422880 1970.100 2032.800 2092.900 202.960 591.470
C-I 676750 3128.200 3223.100 3316.900 324.490 938.540
C-S 374130 934.160 1022.900 997.440 179.510 256.850
S-S 739810 1876.200 1998.800 1903.300 345.920 521.080
S-H 485890 992.620 1092.000 1019.700 232.760 267.250
C-N -232150 -310.240 -364.760 -332.700 -111.490 -67.245
N-H 198200 272.680 326.360 303.670 96.577 61.182
N-N -330090 -361.230 -400.540 -348.010 -156.670 -72.967
cyclical C-C 52679 82.322 92.760 84.927 25.001 20.488
cyclical C-O 90786 185.950 211.660 207.020 43.115 48.322
cyclical C-S 389500 1004.500 1101.000 1063.100 185.530 276.910
cyclical C-N -36751 -39.512 -0.233 -8.676 -15.713 -4.859
C=C 60230 -3.186 4.808 7.272 32.892 -6.126
C=O 457480 614.510 729.390 663.240 218.650 145.840
C=S 186130 1085.300 1070.300 1198.800 97.746 327.880
C=N -201780 -273.430 -253.300 -253.890 -90.481 -60.664
N=O 55093 187.760 197.250 197.880 26.707 53.176
cyclical C=C -20851 -51.780 -49.945 -51.298 -8.349 -13.837
cyclical C=N 106730 134.760 176.370 143.740 49.635 31.476
C≡C 149130 25.339 46.749 17.972 75.249 -3.479
C≡N 305480 150.140 310.530 208.650 151.050 24.377
benzene 591560 684.240 824.250 703.750 282.020 154.100
Table 4 Contributions of the adjacent chemical bond interactions for the six regression equations
Adjacent chemical bond interaction correlation 1 correlation 2 correlation 3 correlation 4 correlation 5 correlation 6
CH3 -104880 -158.130 -189.040 -170.840 -46.805 -36.446
CH2 11360 31.982 35.531 36.574 5.416 8.847
CC-CC2 31033 26.431 30.312 15.806 14.014 6.647
CC-CC3 80859 110.570 126.710 104.160 36.607 27.832
C2C-CC2 127440 193.840 224.360 196.320 57.775 48.234
C2C-CC3 217360 315.360 372.390 324.710 98.957 76.099
C3C-CC3 336220 459.980 552.920 477.440 153.490 107.650
C2C=CC 30005 63.936 66.662 73.000 12.549 16.169
C2C=CC2 224030 283.310 355.600 310.510 103.170 64.649
H2C=CC -101830 -151.200 -177.810 -140.520 -45.310 -36.194
H2C=CC2 -46540 -87.804 -102.000 -75.157 -20.609 -21.663
CRC (ortho) 31212 46.820 61.002 49.555 13.634 10.994
CRC (para) 20961 38.531 46.535 40.560 9.224 9.819
CRO (ortho) -126000 -98.122 -125.810 -110.520 -61.269 -19.466
CRO (para) 132790 180.530 222.410 191.660 62.335 41.568
ORO (ortho) 19101 19.490 16.704 14.063 7.779 5.701
ORO (para) 221950 137.670 195.000 164.990 105.510 26.854
CRN (ortho) 30320 221.410 239.060 230.780 13.143 62.553
CRN (para) 64651 237.440 286.990 258.320 28.161 60.186
CRS (ortho) -166220 -126.090 -129.820 -107.540 -79.427 -24.883
CRS (para) -129230 -96.662 -92.418 -77.653 -61.246 -18.962
NRN (ortho) 777570 886.490 1094.600 964.550 367.910 198.390
NRN (para) 1017700 1042.800 1253.700 1137.600 483.930 225.640
CHO 6964 -19.275 -3.241 -15.799 6.983 -4.160
COO -20083 -6.329 -41.545 -35.664 -11.736 3.467
NH2 -13300 -105.410 -64.020 -86.257 -3.547 -23.710
NO2 27546 93.881 98.624 98.941 13.353 26.588
triatomic ring -167720 -353.690 -390.680 -274.480 -67.937 -94.181
four-membered ring -131290 -338.550 -362.300 -276.710 -52.184 -91.562
five-membered ring 3812 -126.410 -129.950 -124.380 6.280 -37.263
six-membered ring 57043 19.234 37.358 21.224 29.240 1.195
seven-membered ring 30570 -91.893 -64.870 -79.766 20.197 -28.648
Note: R: ring structure

3.5 Results and discussion


The results above are compared with those of Joback and Reid (JR) [7], Devotta
and Pendyala (DP) [8], Constantinou and Gani (CG) [9], and Marrero-Morejon and
Pardillo (MP) [10]. All of these methods require only knowledge of the molecular
structure and are therefore comparable. The probability of a prediction failure (an
extreme deviation between the experimental and estimated values) was chosen as the
most important criterion for the reliability of a model.
The first correlation proposed here gives a mean absolute deviation of 15.88 K for
4,060 compounds (second correlation: 15.89 K (4,060); third correlation: 15.16 K
(4,060); fourth correlation: 15.34 K (4,060); fifth correlation: 15.78 K (4,060);
sixth correlation: 16.47 K (4,060); JR: 33.45 K (4,060); DP: 33.37 K (4,060);
CG: 16.43 K (3,893); MP: 17.02 K (3,549)). The proposed method thus combines the
lowest deviation with the broadest range of applicability.
To test the predictive capability of the method, experimental normal boiling
temperatures for 10 compounds not in the database used for regression were compared
with the predicted values (Table 5). The results show that the proposed method is
significantly more accurate than the other methods.
To test the ability of the method to differentiate between isomers, experimental
and estimated values of Tb are compared in Table 6 for the heptane isomers. The
results show that the proposed method is able to distinguish among isomers.
Table 5 Comparison of the methods for 10 compounds not used in the regression
Method N AAE (K)
Correlation 1 10 2.78
Correlation 2 10 1.91
Correlation 3 10 1.74
Correlation 4 10 1.84
Correlation 5 10 2.74
Correlation 6 10 1.77
JR 10 5.83
DP 10 5.90
CG 10 3.96
MP 10 3.17

Table 6 Comparison of experimental and estimated boiling points for the heptane isomers
Compound name Tbexp (K) Tbcal (K) AAE (K) AAPE (%)
heptane 371.58 371.01 -0.57 0.15
3-methylhexane 365 365.05 0.05 0.01
2-methylhexane 363.2 362.18 -1.02 0.28
3,3-dimethylpentane 359.21 363.52 4.31 1.20
2,2-dimethylpentane 352.34 355.92 3.58 1.02
2,4-dimethylpentane 353.64 356 2.36 0.67
2,3-dimethylpentane 362.93 362.16 -0.77 0.21
3-ethylpentane 366.62 367.89 1.27 0.35
2,2,3-trimethylbutane 354.03 356.77 2.74 0.77
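The per-compound deviations in Table 6 can be condensed into the two summary measures defined in the list of symbols. The sketch below assumes AAE is the mean of |Tbcal - Tbexp| and AAPE is the mean of 100 |Tbcal - Tbexp| / Tbexp; the function names are illustrative. The data are copied from Table 6.

```python
# (name, Tb_exp in K, Tb_cal in K) for the heptane isomers, copied from Table 6.
HEPTANE_ISOMERS = [
    ("heptane", 371.58, 371.01),
    ("3-methylhexane", 365.00, 365.05),
    ("2-methylhexane", 363.20, 362.18),
    ("3,3-dimethylpentane", 359.21, 363.52),
    ("2,2-dimethylpentane", 352.34, 355.92),
    ("2,4-dimethylpentane", 353.64, 356.00),
    ("2,3-dimethylpentane", 362.93, 362.16),
    ("3-ethylpentane", 366.62, 367.89),
    ("2,2,3-trimethylbutane", 354.03, 356.77),
]


def mean_abs_error(rows):
    """Mean of |Tb_cal - Tb_exp| over all rows, in K."""
    return sum(abs(cal - exp) for _, exp, cal in rows) / len(rows)


def mean_abs_pct_error(rows):
    """Mean of 100 * |Tb_cal - Tb_exp| / Tb_exp over all rows, in percent."""
    return 100.0 * sum(abs(cal - exp) / exp for _, exp, cal in rows) / len(rows)


aae = mean_abs_error(HEPTANE_ISOMERS)
aape = mean_abs_pct_error(HEPTANE_ISOMERS)
print(f"AAE = {aae:.2f} K, AAPE = {aape:.2f} %")
```

For the nine isomers this gives roughly 1.85 K and 0.52 %, and every isomer receives a distinct estimate.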

3.6 Examples
To illustrate the application of the proposed method (using correlation 2), the
estimation of Tb for 1,3-dihydroxy-4-methylbenzene is given in Table 7.
Table 7 Estimation result for 1,3-dihydroxy-4-methylbenzene
Molecular descriptor ni ΔTbi
C 7 782.910
H 8 53.126
O 2 756.010
C-C 1 48.224
C-H 6 -28.265
C-O 2 46.858
O-H 2 863.370
Benzene 1 684.240
CH3 1 -158.130
CRO(ortho) 1 -98.122
ORO(para) 1 137.670
M = 124.139, Σ ni ΔTbi = 9682.146, Tbcal = 537.21 K, Tbexp = 543.15 K, AAE = 5.94 K, AAPE = 1.09 %
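The arithmetic behind Table 7 can be reproduced with the short sketch below. It assumes correlation 2 has the form Tb = m + (Σ ni ΔTbi) / M^k, with m = 10.892 and k = 0.604 from Table 1. The CH3 contribution is taken with the negative sign listed in Table 4 (-158.130), which is the value that reproduces the tabulated sum. Variable names are illustrative.

```python
# Correlation 2 contributions for the descriptors of this molecule (Tables 2-4).
CONTRIB_2 = {
    "C": 782.910, "H": 53.126, "O": 756.010,
    "C-C": 48.224, "C-H": -28.265, "C-O": 46.858, "O-H": 863.370,
    "benzene": 684.240,
    "CH3": -158.130,  # sign as in Table 4
    "CRO(ortho)": -98.122, "ORO(para)": 137.670,
}

# Descriptor counts n_i for 1,3-dihydroxy-4-methylbenzene (Table 7).
COUNTS = {
    "C": 7, "H": 8, "O": 2,
    "C-C": 1, "C-H": 6, "C-O": 2, "O-H": 2,
    "benzene": 1, "CH3": 1, "CRO(ortho)": 1, "ORO(para)": 1,
}

M_COEF, K_COEF = 10.892, 0.604  # m and k for correlation 2 (Table 1)
MOLAR_MASS = 124.139            # M from Table 7
TB_EXP = 543.15                 # experimental value from Table 7 (K)

# Sum of n_i * dTb_i, then the assumed correlation 2 form Tb = m + sum / M**k.
group_sum = sum(n * CONTRIB_2[name] for name, n in COUNTS.items())
tb_cal = M_COEF + group_sum / MOLAR_MASS ** K_COEF

print(f"sum = {group_sum:.3f}")
print(f"Tb_cal = {tb_cal:.2f} K, deviation = {TB_EXP - tb_cal:.2f} K")
```

The sum comes to 9682.146 and the estimate to about 537.21 K, matching the tabulated values and leaving a deviation of about 5.94 K from the experimental boiling point.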

4. Conclusions
A new method for estimating the normal boiling point of pure organic compounds, based
on elements and chemical bonds, was developed. The predictions are based exclusively
on the molecular structure of the compound. For 10 compounds not in the database used
for regression, the first correlation proposed in this paper gives a mean absolute
deviation of 2.78 K (second correlation: 1.91 K, third correlation: 1.74 K, fourth
correlation: 1.84 K, fifth correlation: 2.74 K, sixth correlation: 1.77 K, JR: 5.83 K,
DP: 5.90 K, CG: 3.96 K, MP: 3.17 K). In most cases the new method was considerably
more accurate than previous methods, and it was able to distinguish among isomers.

List of symbols
AAE mean absolute deviation (K)
AAPE mean relative deviation (%)
m, k, l correlation coefficients
M molecular weight
N number of compounds
ni number of occurrences of group i in the compound
nj number of occurrences of group j in the compound
Tb normal boiling point of the compound (K)
Tbexp experimental normal boiling temperature (K)
Tbcal predicted normal boiling temperature (K)
ΔTb contribution of elements or chemical bonds (K)
ΔTbj contribution of group j in the compound (K)

References
[1] Pardillo-Fontdevila E., Gonzalez-Rubio R., 1997, A group-interaction contribution approach
for the estimation of physico-chemical properties of branched isomers, Chem. Eng. Commun.
163:245.
[2] Reid R.C., Prausnitz J.M., Poling B.E., 1987, The Properties of Gases and Liquids, 4th ed.,
McGraw-Hill, New York.
[3] Lohninger H., 1993, Evaluation of neural networks based on radial basis functions and their
application to the prediction of boiling points from structural parameters, J. Chem. Inf.
Comput. Sci. 33:736.
[4] Cordes W., Rarey J., 2002, A new method for the estimation of the normal boiling point of
non-electrolyte organic compounds, Fluid Phase Equilib. 201:409.
[5] Nannoolal Y., Rarey J., Ramjugernath D., Cordes W., 2004, Estimation of pure component
properties, Part 1: Estimation of the normal boiling point of non-electrolyte organic
compounds via group contributions and group interactions, Fluid Phase Equilib. 226:45.
[6] Yaws C.L., 2003, Yaws' Handbook of Thermodynamic and Physical Properties of Chemical
Compounds, Knovel.
[7] Joback K.G., Reid R.C., 1987, Estimation of pure-component properties from
group-contributions, Chem. Eng. Commun. 57:233.
[8] Devotta S., Pendyala V.R., 1992, Modified Joback group contribution method for normal
boiling point of aliphatic halogenated compounds, Ind. Eng. Chem. Res. 31:2042.
[9] Constantinou L., Gani R., 1994, New group contribution method for estimating properties of
pure compounds, AIChE J. 40:1697.
[10] Marrero-Morejon J., Pardillo-Fontdevila E., 1999, Estimation of pure compound properties
using group-interaction contributions, AIChE J. 45:615.
