Exploratory study of the use of infrared spectroscopy for the detection of eating disorders in

gingival crevicular fluid

Abstract

Introduction

Vibrational spectroscopic techniques can be applied in different fields due to their versatility,

simplicity and low-cost per analysis. Within these techniques, mid-infrared (MIR) spectroscopy

is one of the most explored as it is non-destructive, enables the determination of several

parameters in the same analysis, is environmentally friendly and rapid, avoids sample preparation, has

a low cost per analysis and can be applied in-situ [1]. This technique was first applied in the

agro-food sector but currently it has been extended to the health sector because of the

abovementioned characteristics. In fact, the number of published articles is rapidly increasing

within the health area [2, 3]. MIR spectroscopy measures the fundamental vibrations within

4000 to 400 cm-1. It is considered a fingerprint technique and for this reason it is applied in the

identification and characterization of different types of samples [1, 3, 4]. The final spectrum is

complex, reflecting several absorption bands that can be weak and overlapping, and therefore

it is essential to apply chemometric tools to extract useful information. The need for

chemometric tools is pointed out as the major drawback of the technique.

This technique has already been explored in the analysis of gingival crevicular fluid (GCF), but there

are only two works described in the literature [5, 6]. In 2010, Xiang et al. [6] collected GCF
from several patients to assess if this technique was able to discriminate between healthy

patients and patients with periodontitis. The GCF samples were dried and then analysed. The

results obtained, 93% of correct predictions for the validation set through linear discriminant

analysis, demonstrated that this technique was very accurate for the discrimination of patients

with and without periodontitis. Moreover, the authors suggested that the molecular

components responsible for this discrimination were lipids, proteins and DNA. However, the

results might have been better had the authors optimized the spectral region and the pre-

processing technique. A few years later, in 2013, Xiang et al. [5] investigated the capacity of

infrared spectroscopy to discriminate between patients with and without diabetes mellitus.

Several GCF samples collected from different sites in each patient were analysed. The

classification of the samples was accomplished by linear discriminant analysis and an overall

accuracy of 87% of correct classifications for the validation set was obtained. An algorithm was

applied for the selection of the best spectral regions which included the molecular vibrations

of proteins, glycogen, oligosaccharides and glycolipids. Therefore, the authors attributed the

discrimination of patients with and without diabetes mellitus to differences in the content of

these compounds in the GCF. Again, the authors did not test different pre-processing

techniques, which could have further improved the results obtained. Both these works reveal the

potential of infrared spectroscopy to detect, through a non-invasive and rapid approach,

chemical differences in the composition of GCF.

Therefore, this manuscript explores the use of mid-infrared spectroscopy to detect

eating disorders through GCF. Moreover, this work also explores the potential of this technique to

discriminate patients, sampling site, medication intake, type of eating disorder, the presence

of other pathologies and vomiting induction. To the best of our knowledge, this is the first

time mid-infrared spectroscopy has been used for this purpose.


Material and methods

Patients/Subjects

Gingival crevicular fluid collection procedure

(discuss dp, dv, mv, mp)

For each patient/subject, four strip perio-papers were collected.

(Perhaps it would be better to add a table with the patients' information.)

Mid-infrared spectral acquisition

The mid-infrared spectra of the strip perio-papers were acquired on a PerkinElmer

Spectrum BX FTIR System spectrophotometer (Waltham, USA) with a DTGS detector and a PIKE

Technologies GladiATR accessory. The spectra were acquired in diffuse reflectance mode from

4000 to 600 cm-1, with a resolution of 4 cm-1 and 32 co-added scans. Each strip perio-paper

was analysed on both sides at the bottom and compressed with a pressure of 150 N cm-2. Thus,

a total of 224 (28 x 8) spectra were obtained. The ATR crystal was cleaned and a background

spectrum was acquired between patients.

Multivariate data analysis

MIR spectra were modelled with the help of chemometric tools, namely, principal component

analysis (PCA) to assist outlier detection and partial least squares discriminant analysis (PLSDA)
to develop discrimination models [7, 8]. The spectra were mean centred before any

data analysis. For the PLSDA, the spectral data were randomly divided into two data sets, one for

calibration (70%) and the other for validation (30%). This division was performed ensuring the

same proportion of patients’ classes in both sets aiming to avoid unbalanced classes [9]. The

optimization of the PLSDA models was performed through the selection of the optimal number

of latent variables, best spectral region and best pre-processing technique (using only the

calibration set). The optimal number of latent variables (LV) was estimated through a leave-one-

sample-out cross-validation procedure using only the calibration set. The assessment of the

best spectral region involved dividing the MIR spectra into five different regions and testing all

these regions individually and in combination. The different regions were the following: from

3982 to 2652 cm-1 (region 1), from 2650 to 1862 cm-1 (region 2), from 1860 to 1182 cm-1

(region 3), from 1180 to 922 cm-1 (region 4) and from 920 to 620 cm-1 (region 5). The selection

of the best pre-processing technique was achieved through testing different techniques

individually and in combination, namely standard normal variate (SNV) and Savitzky-Golay

filter (with different filter widths, polynomial orders and first and second derivatives). After

model optimization, the validation set was used to test the accuracy of the optimized models.

This was performed through the projection of the validation set and the results were arranged

in the form of confusion matrices. The confusion matrices express the percentages of correct

predictions for each patient class and the total percentage of correct predictions was obtained

by adding the diagonal elements of the confusion matrix [9]. The regression coefficient vectors

of the PLSDA models for eating disorders and medication were analyzed to understand which

specific wavenumbers were more important and to relate them with possible compounds

present in the gingival crevicular fluid.
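The class-balanced 70/30 division described above can be sketched in plain Python. This is only an illustration of the idea, not the routine used in the study (which relied on Matlab and PLS Toolbox). The function name `stratified_split`, the seed and the class counts are assumptions: the counts assume 18 subjects with ED and 10 without, each contributing 8 spectra.

```python
import random
from collections import defaultdict


def stratified_split(labels, calibration_fraction=0.7, seed=0):
    """Return (calibration_idx, validation_idx) with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    calibration, validation = [], []
    for indices in by_class.values():
        # Shuffle within each class, then cut each class at the same fraction.
        rng.shuffle(indices)
        n_cal = round(calibration_fraction * len(indices))
        calibration.extend(indices[:n_cal])
        validation.extend(indices[n_cal:])
    return sorted(calibration), sorted(validation)


# Assumed class sizes (illustrative): 18 x 8 spectra with ED, 10 x 8 without.
labels = ["ED"] * 144 + ["no ED"] * 80
cal_idx, val_idx = stratified_split(labels)
print(len(cal_idx), len(val_idx))
```

Because the cut is made inside each class, both sets keep roughly the same share of each class regardless of the random seed.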

Matlab version 8.6 (MathWorks, Natick, USA) and PLS Toolbox version 8.2.1 (Eigenvector

Research Inc., Wenatchee, USA) software were used to perform all chemometric analyses.
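
Testing the five spectral regions "individually and in combination" means going through every non-empty subset of the regions. The sketch below, in plain Python, enumerates those subsets and looks up the region of a given wavenumber. The region bounds are taken from the text; the helper names `region_combinations` and `region_of` are hypothetical.

```python
from itertools import combinations

# Spectral regions as defined in the text (upper and lower bounds in cm-1).
REGIONS = {
    1: (3982, 2652),
    2: (2650, 1862),
    3: (1860, 1182),
    4: (1180, 922),
    5: (920, 620),
}


def region_combinations(region_ids):
    """All non-empty subsets of the regions, from single regions to all of them."""
    ids = sorted(region_ids)
    return [combo
            for size in range(1, len(ids) + 1)
            for combo in combinations(ids, size)]


def region_of(wavenumber):
    """Return the region number containing a wavenumber, or None if outside all."""
    for region_id, (high, low) in REGIONS.items():
        if low <= wavenumber <= high:
            return region_id
    return None


combos = region_combinations(REGIONS)
print(len(combos))       # number of region combinations to test
print(region_of(862))    # region containing the 862 cm-1 absorption
```

Five regions give 2^5 - 1 = 31 non-empty subsets, so each pre-processing option has to be evaluated on 31 region selections.
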
Results and discussion

The raw spectra of the gingival crevicular fluid are shown in Figure 1. As mentioned above, a PCA

was performed to assist outlier detection and no outliers were identified. After this, several

PLSDA models were developed to verify whether GCF spectra contain information related to the

patient/subject, sampling site, presence of eating disorders, presence of other pathologies,

medication intake and vomit induction. For all these discrimination models, different pre-

processing techniques and different spectral regions were tested individually and in

combination to improve the models' predictive capacity. Moreover, testing different spectral

regions made it possible to identify which were most appropriate for each

discrimination and thereby to ascertain the compounds that may be responsible.

The PLSDA models were built considering three different strategies: total spectral data

(strategy one), the mean of both sides of each strip perio-paper (strategy two) and the mean

of all the spectra collected from each patient/subject (strategy three) for each discrimination

model. Therefore, for each discrimination model (e.g. patient/subject, sampling site, presence

of eating disorders, presence of other pathologies, medication intake and vomit induction),

three total percentages of correct predictions were obtained, one per strategy. However, it

should be noted that for each discrimination model and strategy more than 150 (31 spectral

region combinations x 5 pre-processing combinations) PLSDA models were developed.
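
The three strategies differ only in how the spectra are grouped and averaged before modelling. A minimal Python sketch follows, assuming each spectrum is stored with hypothetical `patient`, `strip` and `side` identifiers; the field names and the toy three-point "spectra" are illustrative and not taken from the study.

```python
from collections import defaultdict


def mean_spectrum(spectra):
    """Point-by-point mean of equally long spectra."""
    return [sum(values) / len(values) for values in zip(*spectra)]


def average_by(records, key):
    """Group records by key(record) and average the spectra in each group."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record["spectrum"])
    return {group: mean_spectrum(spectra) for group, spectra in groups.items()}


# Toy data: one patient, two strips, two sides per strip.
records = [
    {"patient": "P01", "strip": "A", "side": 1, "spectrum": [1.0, 2.0, 3.0]},
    {"patient": "P01", "strip": "A", "side": 2, "spectrum": [3.0, 4.0, 5.0]},
    {"patient": "P01", "strip": "B", "side": 1, "spectrum": [0.0, 0.0, 0.0]},
    {"patient": "P01", "strip": "B", "side": 2, "spectrum": [2.0, 2.0, 2.0]},
]

# Strategy one uses `records` as they are (all spectra).
# Strategy two: mean of both sides of each strip.
strategy_two = average_by(records, key=lambda r: (r["patient"], r["strip"]))
# Strategy three: mean of all spectra of each patient/subject.
strategy_three = average_by(records, key=lambda r: r["patient"])
```

Averaging reduces the number of objects available for modelling (by a factor of two and eight, respectively), which is worth keeping in mind when comparing the strategies.
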

Patient/subject
First, GCF spectra were modelled through PLSDA to verify whether MIR spectra contained

specific information related to each patient/subject. The results obtained (a total of correct

predictions below 20% for all the strategies) indicated that it was not possible to discriminate

these patients through GCF. With these results it was not possible to determine which was the

best spectral region and pre-processing technique. We believe that the GCF composition of

each patient varies (perhaps add a reference) but these variations may be so

marginal that MIR spectroscopy is not sensitive enough to detect them.

Sampling site

The GCF spectra were also modelled against the sampling site. It was important to

understand whether GCF is influenced by the sampling site (here it would be better to elaborate

a little more on whether or not variation in GCF composition is to be expected). Again, the

PLSDA results revealed that it was not possible to discriminate the sampling site (a total of

correct predictions below 30% for all the strategies, except for the mean of all spectra collected

for each patient, which was not performed) or to determine the best spectral region(s) and

pre-processing technique. (as before, try to justify these results and, if possible,

add references)

Vomit induction

Some of the subjects/patients sampled had induced vomiting practices. Therefore,

several PLSDA models were developed using all the strategies, considering patients with

induced vomiting practices in one group and patients without vomiting practices in another

group. The best results, 75% of correct predictions (with 13 LV), were obtained when strategy

one was used. The best spectral region was 3982 to 2652 cm-1 (region 1) and the best

pre-processing was SNV followed by a Savitzky-Golay filter (15-point filter width, second-order

polynomial and second derivative). The percentages of correct predictions obtained with

the other strategies were lower (approximately 65%). This was not expected,

since averaging the spectra should slightly smooth the random variations present in the spectra of

strip perio-papers. (I am not sure whether this is true; I need to think about it further.)

Thus, the results suggest that MIR spectra are capable of detecting the chemical differences in

the composition of GCF between these two groups of patients/subjects. However, further

studies (including a larger number of patients) are needed to confirm the robustness of this

technique.
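
The pre-processing combination reported above (SNV followed by a Savitzky-Golay second derivative with a 15-point window and a second-order polynomial) can be sketched in plain Python as follows. This is an illustrative implementation, not the PLS Toolbox routine used in the study. Dropping the edge points and defaulting `spacing` to 1 (index units) are assumptions of this sketch.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale a spectrum by its own mean and SD."""
    centre, spread = mean(spectrum), stdev(spectrum)
    return [(value - centre) / spread for value in spectrum]


def savgol_second_derivative(spectrum, window=15, spacing=1.0):
    """Second derivative from a quadratic least-squares fit in a moving window.

    Edge points without a full window are dropped, so the output is
    shorter than the input by window - 1 points.
    """
    if window % 2 == 0 or window < 3:
        raise ValueError("window must be an odd integer >= 3")
    half = window // 2
    offsets = range(-half, half + 1)
    mean_square = sum(i * i for i in offsets) / window
    # Quadratic basis made orthogonal to the constant and linear terms.
    weights = [i * i - mean_square for i in offsets]
    norm = sum(w * w for w in weights)
    result = []
    for centre in range(half, len(spectrum) - half):
        segment = spectrum[centre - half:centre + half + 1]
        curvature = sum(w * y for w, y in zip(weights, segment)) / norm
        # The second derivative of a*x^2 is 2a; rescale to the sampling interval.
        result.append(2.0 * curvature / spacing ** 2)
    return result


def preprocess(spectrum, window=15, spacing=1.0):
    """SNV followed by the Savitzky-Golay second derivative."""
    return savgol_second_derivative(snv(spectrum), window, spacing)


# Sanity check on a synthetic parabola, whose second derivative is constant.
parabola = [float(x * x) for x in range(30)]
print(savgol_second_derivative(parabola)[:3])
```

The second derivative removes constant and linear baseline contributions, which is one reason it is often paired with SNV for spectra with overlapping bands.
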

Eating disorders

The primary objective of this manuscript was to assess whether it is possible to discriminate

patients with and without eating disorders (ED) through the GCF spectra. Again, several PLSDA

models were tested aiming to find the best spectral region and pre-processing technique. This was

done for all the strategies. The best results were obtained when selecting the spectral region

from 920 to 620 cm-1 (region 5) and applying SNV followed by a Savitzky-

Golay filter (15-point filter width, second-order polynomial and second derivative) in all the

strategies. The raw and pre-processed spectra obtained in the spectral region from 920 to

620 cm-1 are depicted in Figure x. The raw spectra are very similar between the two groups of

patients but the pre-processed spectra show some differences.


Figure x - Mean of raw (a) and pre-processed (b) spectra using the best spectral region for the

discrimination of eating disorders.

A total of 80.1% of correct predictions was obtained using 6 LV and adopting strategy three.

Strategy two yielded a total of 77.3% of correct predictions with 8 LV while strategy one

yielded a total of 76.0% of correct predictions with 10 LV. The results demonstrated that

averaging the spectra increases the accuracy of the PLSDA models and reduces the number of

LV. This was expected since averaging the spectra should slightly smooth the random variations

present in the spectra of strip perio-papers. Table x shows the confusion matrix obtained

through strategy three (the confusion matrices obtained using the other strategies are very

similar and for that reason are not shown). It can be seen that the worst predictions

involved the group of patients/subjects without ED. In fact, slightly more than one third of this

group was incorrectly classified as having ED.

Table x. Confusion matrix for the discrimination of patients/subjects with and without ED

based on the GCF spectra through strategy three (80.1 % of correct predictions and 6 LV).

Strategy three (%)
                              Subjects group (real)
Subjects group (predicted)    With ED    Without ED    Total
With ED                       50.1       16.6          64.7
Without ED                    3.3        30.0          35.3
Total                         56.6       43.4          100
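
As a worked check of how the total percentage of correct predictions follows from a confusion matrix, the short Python sketch below sums the diagonal of the strategy-three matrix. The cell values are copied from the table above; the function names are hypothetical and the marginal totals are deliberately not used.

```python
# Percentages from the strategy-three table: rows are predicted classes,
# columns are real classes, both in the order (with ED, without ED).
confusion = [
    [50.1, 16.6],
    [3.3, 30.0],
]


def total_correct(matrix):
    """Sum of the diagonal, i.e. the total percentage of correct predictions."""
    return sum(matrix[i][i] for i in range(len(matrix)))


def misclassified_share(matrix, real_class):
    """Share of one real class (a column) that was assigned to another class."""
    column = [row[real_class] for row in matrix]
    return (sum(column) - column[real_class]) / sum(column)


print(round(total_correct(confusion), 1))
print(round(misclassified_share(confusion, 1), 3))
```

The same two functions can be applied to the matrices of the other strategies to compare where the errors concentrate.
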

Thus, it is important to explore the regression coefficient vectors to understand which spectral

absorptions showed the highest contribution and thereby attempt to establish a possible

relation with the compounds present in the GCF. Figure x shows the regression coefficient

vector of the PLSDA model adopting strategy three; the wavenumbers that showed the

highest contribution were 862, 836, 684, 654, 642 and 635 cm-1. (explore the compounds that

could be changing in GCF and relate them with the possible absorptions)

[Figure x: regression coefficient plot with labelled peaks at 862, 836, 684, 654, 642 and 635 cm-1]

Type of eating disorders

Among the patients/subjects with ED, ten presented anorexia nervosa of the restrictive type, four

anorexia nervosa of the purgative type, one bulimia nervosa of the restrictive type and three

bulimia nervosa of the purgative type. Thus, PLSDA models using the three strategies were

developed considering five different groups (the four abovementioned plus the group of

patients/subjects without ED). The total percentage of correct predictions was below 50% in all

the strategies. The spectral region and pre-processing technique that yielded the best results

were the same as for discriminating the presence of ED. Although the results obtained were not

good, it seems that GCF spectra contain some information regarding the ED type. Further studies

with a larger number of samples for each type of ED may yield better results.

Presence of other pathologies

(Pedro, do you think it is worth discussing the other pathologies?)

The results indicate that it is not possible to identify other pathologies (e.g. osteoporosis,

type I diabetes, depression, borderline personality, hypercholesterolaemia). This is probably

because we have few samples for each type of pathology (we only have one sample for each)

(perhaps due to the small number of samples and/or the similarity of the disorders in terms

of the chemical compounds present in the gingival crevicular fluid).


Medication

The results indicate that it is possible to identify whether patients are medicated (70-80% of

correct predictions). (add figures of the spectra in the region used to highlight the

differences between patients with and without medication)

Figure x - Mean of raw (a) and pre-processed (b) spectra using the best spectral region for the

discrimination of medication intake.

PLSDA results

Regression coefficient analysis

Conclusion
References

1. dos Santos, C.A.T., R.N. Páscoa, and J.A. Lopes, A review on the application of
vibrational spectroscopy in the wine industry: From soil to bottle. TrAC Trends in
Analytical Chemistry, 2017.
2. Diem, M., et al., A decade of vibrational micro-spectroscopy of human cells and tissue
(1994-2004). Analyst, 2004. 129(10): p. 880-885.
3. Siqueira, L.F.S. and K.M.G. Lima, A decade (2004-2014) of FTIR prostate cancer
spectroscopy studies: An overview of recent advancements. TrAC Trends in Analytical
Chemistry, 2016. 82: p. 208-221.
4. Stuart, B., Infrared Spectroscopy: Fundamentals and Applications. 2004, Chichester:
Wiley. p. 46-47.
5. Xiang, X.M., et al., Diabetes-Associated Periodontitis Molecular Features in Infrared
Spectra of Gingival Crevicular Fluid. Journal of Periodontology, 2013. 84(12): p. 1792-
1800.
6. Xiang, X.M., et al., Periodontitis-specific molecular signatures in gingival crevicular
fluid. Journal of Periodontal Research, 2010. 45(3): p. 345-352.
7. Naes, T., et al., Interpreting PCR and PLS solutions. A User-Friendly Guide to
Multivariate Calibration and Classification, 2004. 1: p. 39-54.
8. Barker, M. and W. Rayens, Partial least squares for discrimination. Journal of
Chemometrics, 2003. 17(3): p. 166-173.
9. Páscoa, R., et al., Exploratory study on vineyards soil mapping by visible/near-infrared
spectroscopy of grapevine leaves. Computers and Electronics in Agriculture, 2016. 127:
p. 15-25.

Tables

ED

Table x. Confusion matrix for the discrimination of patients/subjects with and without ED

based on the GCF spectra through strategy one (76.0 % of correct predictions and 10 LV).

Strategy one (%)
                              Subjects group (real)
Subjects group (predicted)    With ED    Without ED    Total
With ED                       49.2       14.9          64.1
Without ED                    9.1        26.8          35.9
Total                         58.3       41.7          100

Table x. Confusion matrix for the discrimination of patients/subjects with and without ED

based on the GCF spectra through strategy two (77.3 % of correct predictions and 8 LV).

Strategy two (%)
                              Subjects group (real)
Subjects group (predicted)    With ED    Without ED    Total
With ED                       49.3       15.4          64.7
Without ED                    7.3        28.0          35.3
Total                         56.6       43.4          100

Table x. Confusion matrix for the discrimination of patients/subjects with and without ED

based on the GCF spectra through strategy three (80.1 % of correct predictions and 6 LV).

Strategy three (%)
                              Subjects group (real)
Subjects group (predicted)    With ED    Without ED    Total
With ED                       50.1       16.6          64.7
Without ED                    3.3        30.0          35.3
Total                         56.6       43.4          100


medication

Table x. Confusion matrix for the discrimination of patients/subjects with and without

medication based on the GCF spectra through strategy one (72.8 % of correct predictions and

6 LV).

Strategy one (%)
                              Subjects group (real)
Subjects group (predicted)    With medication    Without medication    Total
With medication               16.2               12.2                  28.4
Without medication            14.9               56.7                  71.6
Total                         31.1               68.9                  100

Table x. Confusion matrix for the discrimination of patients/subjects with and without

medication based on the GCF spectra through strategy two (73.5 % of correct predictions and

6 LV).

Strategy two (%)
                              Subjects group (real)
Subjects group (predicted)    With medication    Without medication    Total
With medication               18.2               11.2                  29.4
Without medication            15.3               55.3                  70.6
Total                         33.5               66.5                  100


Table x. Confusion matrix for the discrimination of patients/subjects with and without

medication based on the GCF spectra through strategy three (81.5 % of correct predictions and

4 LV).

Strategy three (%)
                              Subjects group (real)
Subjects group (predicted)    With medication    Without medication    Total
With medication               23.4               9.9                   33.3
Without medication            8.6                58.1                  66.7
Total                         32.0               68.0                  100

List of figures
Figure 1 - Raw spectra of the gingival crevicular fluid collected on strip perio-papers.

Figure x - Mean of raw (a) and pre-processed (b) spectra using the best spectral region for the

discrimination of eating disorders.


Figure x - Mean of raw (a) and pre-processed (b) spectra using the best spectral region for the

discrimination of medication intake.


Figure x - Regression coefficient vector of the PLSDA model obtained adopting strategy three

within 920 and 620 cm-1 (region 5) and through the application of SNV followed by a Savitzky-

Golay filter (15-point filter width, second-order polynomial and second derivative). Labelled

peaks: 862, 836, 684, 654, 642 and 635 cm-1.

Extra

Model             Most important regions (cm-1)             Active absorptions in these regions   Range (cm-1)

Eating disorder   862, 836, 684, 654, 642 and 635           C-Br stretch                          700-600
                                                            O-H alcohol                           720-590
                                                            C-H alkyne bend                       680-610
                                                            C-H aromatic bend                     900-670
                                                            C-O-O-C peroxides stretch             890-820
                                                            P-O-C aromatic phosphates             995-850
                                                            C-S thioethers stretch                660-630
                                                            C-S disulfides                        705-570
                                                            C-S aryl thioethers                   715-670
                                                            Sulfate ion                           680-610
                                                            Nitrate ion                           840-815
                                                            Carbonate ion                         880-860

Smoking habits    1608, 1340, 1330, 1316, 1300, 1224, 1208  C-C skeletal vibrations               1300-700
                                                            C-H methyne bend                      1350-1330
                                                            C=C conjugated                        1600
                                                            C-H vinylidene bend                   1310-1290
                                                            C=C-C aromatic ring stretch           1615-1580
                                                            O-H primary or secondary              1350-1260
                                                            O-H phenol                            1410-1310
                                                            C-N tertiary amine stretch            1210-1150
                                                            C-N aromatic primary amine            1340-1250
                                                            C-N aromatic secondary amine          1350-1280
                                                            C-N aromatic tertiary amine           1360-1310
                                                            Carboxylate                           1420-1300
                                                            Quinone or conjugated ketone          1650-1600
                                                            Aromatic nitro compounds              1355-1320
                                                            Organic phosphates (P=O)              1350-1250
                                                            Aromatic phosphates (P-O-C)           1240-1190
                                                            Dialkyl/aryl sulfones                 1335-1300
                                                            -N=N- open chain azo                  1630-1575

(Coates, MIR bibliography)
