You are on page 1of 96

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

net/publication/264851338

Processing and analysis of NMR data Impurity determination and metabolic


profiling

Article

CITATION READS

1 4,656

1 author:

Jenny Forshed
Karolinska Institutet
56 PUBLICATIONS   1,339 CITATIONS   

SEE PROFILE

All content following this page was uploaded by Jenny Forshed on 30 April 2015.

The user has requested enhancement of the downloaded file.


Processing and analysis of NMR data
Impurity determination and metabolic profiling

Jenny Forshed

A
Doctoral Thesis
Department of Analytical Chemistry
Stockholm University
2005
Processing and analysis of NMR data
Impurity determination and metabolic profiling

Doctoral Thesis, 2005

Jenny Forshed
jenny.forshed@anchem.su.se
Department of Analytical Chemistry
Stockholm University
SE-106 91 Stockholm

Academic dissertation for the Degree of Doctor of Philosophy in Analytical


Chemistry at Stockholm University to be publicly defended on Thursday
1 December 2005 at 13:00 in Magnélisalen, Kemiska övningslaboratoriet,
Svante Arrhenius v. 10-12.

© Jenny Forshed 2005, pp 1-95.


ISBN: 91-7155-139-5
Cover by Mina, Folke and Jenny Forshed
Akademitryck AB, Edsbruk, Sweden
till Mina, Folke och Östen
Abstract

This thesis describes the use of nuclear magnetic resonance (NMR)


spectroscopy as an analytical tool. The theory of NMR spectroscopy
in general and quantitative NMR spectrometry (qNMR) in particular
is described and the instrumental properties and parameter setups
for qNMR measurements are discussed. Examples of qNMR are
presented by impurity determination of pharmaceutical compounds
and analysis of urine samples from rats fed with either water or
a drug (metabolic profiling). The instrumental parameter setup of
qNMR and traditional data pre-treatments are examined. Spectral
smoothing by convolution with a triangular function, which is an
unusual application in this context, was shown to be successful
regarding the sensitivity and robustness of the method in paper II.
In addition, papers III and IV comprise the field of peak alignment,
especially designed for 1H-NMR spectra of urine samples. This is
an important preprocessing tool when multivariate analysis is to
be applied. A novel peak alignment method was developed and
compared to the traditional bucketing approach and a conceptually
different alignment method.
Univariate, multivariate, linear and nonlinear data analyses were
applied to qNMR data. In papers I–II, calibration models were
created to examine the potential of qNMR for these applications.
The data analysis in papers III–VI was mainly explorative. The
potential of data fusion and data correlation was examined in order
to increase the possibilities of analysing the highly complex samples
from metabolic profiling (papers V–VI). Data from LC/MS analysis
of the same samples were used with the 1H-NMR data in different
ways. Correlation analyses between the 1H-NMR data and the drug
metabolites identified from the LC/MS data were also performed.
In this process, data fusion proved to be a valuable tool.

|5
6 | Processing and analysis of NMR data
Sammanfattning

Denna avhandling beskriver hur NMR (nuclear magnetic resonance)


spektroskopi kan användas för kvantifiering i analytisk kemi. Teorin
för NMR-spektroskopi i allmänhet och kvantitativ NMR (qNMR)
i synnerhet är beskriven och instrumentella förutsättningar och
parameterinställningar för qNMR diskuteras. Föroreningsanalys
av två olika läkemedelskomponenter och analys av urin från råttor
som matats med vatten eller läkemedel (metaboliska profiler) ges
som exempel på qNMR. I de två första artiklarna utvärderades
instrumentella parameterinställningar för qNMR och betydelsen
av traditionell databehandling. Utjämning av spektra (”spectral
smoothing”) med hjälp av faltning med en triangulär funktion
visade sig förbättra både känslighet och robusthet hos qNMR-
metoden i artikel II, dock syns få sådana applikationer i litteraturen.
Artiklarna III–IV omfattar topplinjering, att anpassa analytiska
signaler utefter en rät linje, anpassat för 1H-NMR data från urinprov.
Detta är en viktig förprocessning av data vid multivariat analys. En
ny metod för topplinjering utvecklades och jämfördes med den
traditionella (integrering av förutbestämda, lika stora segment) och
en konceptuellt skild topplinjeringsmetod.
Univariat, multivariat, linjär och olinjär data-analys har tillämpats på
qNMR data. För att utreda potentialen av qNMR i artiklarna I–II
gjordes kalibrerings-modeller. Data-analysen i artiklarna III–VI var
i första hand explorativ. För att öka möjligheterna att analysera de
mycket komplexa proverna från metaboliska profiler prövades olika
metoder för datafusion och datakorrelation (artiklarna V–VI). Data
från LC/MS- och 1H-NMR-analys av samma prover analyserades
sammanslagna på olika sätt. Vidare utfördes korrelationsanalys
mellan 1H-NMR-data och LC/MS-variabler identifierade som
läkemedelsmetaboliter. I denna process visade sig datafusion vara
ett värdefullt verktyg.

|7
8 | Processing and analysis of NMR data
Preface

The work on this thesis has been carried out in the borderland
between the fields of nuclear magnetic resonance (NMR) spectroscopy
– mostly from a structure elucidation perspective, analytical chemistry
– aiming for low detection limits and high accuracy and precision,
and chemometrics – including the large field of multivariate data
analysis. These three fields of science do not often come together in
the same department, and even less often in the same person. This
PhD thesis, however, aims to combine these fields of knowledge
in order to create opportunities for novel research in the field of
quantitative NMR spectrometry (qNMR).
As I come from a background in analytical chemistry and
chemometrics, the work of this thesis started with the basic theories
and practice of NMR spectrometry, which was then a new technique
to me. The theory and background sections in this thesis were written
more as a handbook for myself at that time. The thesis also contains a
summary of the six papers I have completed during my PhD studies
at the Department of Analytical Chemistry at Stockholm University.
The laboratory work was done at AstraZeneca in Södertälje, which
provided all the laboratory materials, chemicals and people with
expertise in NMR spectroscopy.

|9
10 | Processing and analysis of NMR data
Contents

List of papers _____________________________________13


Abbreviations and notations _________________________15
1 Introduction ____________________________________19
2 Nuclear magnetic resonance theory ________________21
Spin ___________________________________________21
Applying a magnetic field ___________________________22
The laboratory frame ______________________________24
The rotating frame of reference ______________________25
NMR data acquisition _____________________________30
NMR data processing _____________________________33
3 NMR for quantitative analysis _____________________37
Background _____________________________________37
qNMR methods __________________________________39
The sensitivity of NMR spectrometry _________________39
Accuracy and precision of qNMR ____________________43
Discussion ______________________________________44
4 qNMR applications _____________________________45
Impurity determination ____________________________45
Metabolic analyses ________________________________50
5 Chemometrics __________________________________53
Principal component analysis ________________________53
Calibration ______________________________________55
Data preprocessing _______________________________59
Validation ______________________________________60
6 Peak alignment _________________________________63
Profile alignment or peak alignment ___________________65
Target for alignment ______________________________67
Optimisation ____________________________________67
NMR data compression ____________________________70

| 11
Discussion ______________________________________70
7 1H-NMR + LC/MS ______________________________71
Data fusion _____________________________________72
Data correlation __________________________________75
Discussion ______________________________________77
8 Conclusions ____________________________________79
9 Future perspectives ______________________________81
Acknowledgements ________________________________83
References ________________________________________85

12 | Processing and analysis of NMR data


List of papers

I. NMR and Bayesian regularized neural network regression for impurity


determination of 4-aminophenol
Forshed, J., Andersson, F.O. and Jacobsson, S.P.,
Journal of Pharmaceutical and Biomedical Analysis 29 (2002)
495-505.
The author is responsible for the experimental work and for writing the paper.

II. Quantification of aldehyde impurities in poloxamer by ¹H-NMR


spectrometry
Forshed, J., Erlandsson, B. and Jacobsson, S.P.
Analytica Chimica Acta, 552 (2005) 160-165.
The author is responsible for the experimental work, the calculations and the writing of
the paper.

III. Peak alignment of NMR signals by means of a genetic algorithm


Forshed, J., Schuppe-Koistinen, I. and Jacobsson, S.P.
Analytica Chimica Acta, 487 (2003) 189-199.
The author is responsible for the experimental work, developing the algorithm and
writing the paper.

IV. A comparison of methods for alignment of NMR peaks in the context


of cluster analysis
Forshed, J., Torgrip, R.J.O., Åberg, K.M., Karlberg, B.,
Lindberg, J. and Jacobsson, S.P.
Journal of Pharmaceutical and Biomedical Analysis 38 (2005)
824-832.
The author is responsible for the comparison idea, developing the algorithm for the
segmentwise method and writing the main part of the paper.

List of papers | 13
V. Evaluation of different techniques for fusion of LC/MS and 1H-
NMR data
Forshed, J., Idborg, H. and Jacobsson, S.P.
Chemometrics and Intelligent Laboratory Systems (2005)
submitted.
The author is, together with H. Idborg, responsible for the ideas, implementing the code,
and writing the paper.

VI. Enhanced multivariate analysis by correlation scaling and fusion of


LC/MS and 1H-NMR data.
Forshed, J., Stolt, R., Idborg, H. and Jacobsson, S.P.
Chemometrics and Intelligent Laboratory Systems (2005)
submitted.
The author is responsible for the major ideas (together with H. Idborg), some of the
computations and writing the paper.

14 | Processing and analysis of NMR data


Abbreviations and notations

Abbreviations
1
H-NMR proton-NMR
ADC analogue-to-digital converter
CC correlation coefficient
COW correlation optimised warping
DFT discrete Fourier transform
DSP digital signal processing
DTW dynamic time warping
FID free induction decay = S(t)
FT Fourier transform
FW fuzzy warping
GA genetic algorithm
GC gas chromatography
IR infrared spectroscopy
LC liquid chromatography
LOD limit of detection
LS least-squares regression
LW local warping
MLR multiple linear regression
MS mass spectrometry
NMR nuclear magnetic resonance
NN neural network
N-PLS N-way PLS
OPA outer product analysis
PAGA peak alignment by means of a genetic algorithm
PARAFAC parallel factor analysis
PARS peak alignment by reduced set mapping
PC principal component
PCA principal component analysis

Abbreviations and notations | 15


PCR principal component regression
PLF partial linear fit
PLS partial least squares or projection to latent structures
PLS-DA PLS-discriminant analysis
ppm parts per million (the units of the horizontal scale in
an NMR spectrum)
PTW parametric time warping
PWA piecewise alignment
qNMR quantitative NMR
RF radio frequency
RMS root-mean-square
RMSE root-mean-square error
RMSEP root-mean-square error of prediction
S/N signal-to-noise ratio
TMS tetramethyl silane
TMSP (sodium 3-(trimethylsilyl)propionate-2,2,3,3-d4)

Notations
Generally the algebraic notation is used as follows: small letters
represent scalars, small bold letters represent vectors, and bold
capitals represent data matrices.
A(ω) absorption spectra
at acquisition time
b regression vector
b receiver bandwidth
B0 external magnetic field
B1 RF pulse
D(ω) dispersion spectra
E residual matrix
E energy
f filling factor
F(ω) the spectrum as a function of frequency
Fsampling sampling frequency
Fsignal the highest frequency signal
h Planck’s constant (= 6.62618 E-34 Js)
I spin quantum number
I(ω) imaginary part of spectrum
k the Bolzmann constant (= 1.380 E-23 J/K)
m magnetic quantum states

16 | Processing and analysis of NMR data


M net magnetisation
N number of detectable nuclei (magnetically equivalent)
n the noise figure of the amplifier
nt the number of transients (scans)
p angular moment
P loading matrix
Q the quality factor of the RF coil
R(ω) real part of spectrum
S spectrum to be aligned
S(t) the NMR signal as a function of time (= FID)
rd relaxation delay
rt the pulse repetition time
T sample temperature
T scores matrix
T target spectrum
T1 spin-lattice relaxation time constant
T2 spin-spin relaxation time constant
Vs volume of the sample
w weights
β pulse angle
μ magnetic moment
ν0 the Larmor frequency [Hz]
φ phase error
ω0 the Larmor frequency [rad/s]
γ the magnetogyric ratio
τ pulse width [μs]
λ wavelength

Abbreviations and notations | 17


18 | Processing and analysis of NMR data
1 Introduction

Nuclear magnetic resonance (NMR) spectroscopy is a powerful


technique for determination of molecular structures as well as
chemical and physical properties. It allows the detection of the most
common atoms in organic compounds, i.e. carbon and hydrogen.
Experience of quantitative determinations by NMR spectrometry
(qNMR) is still, however, fairly limited.
One limiting factor of NMR spectroscopy as a quantitative tool is
that it is considered to be a relatively insensitive technique. However,
recent developments of the technique have led to fairly robust
instrumentation and low limits of detection in the picomole range
have been reported.1 Consequently, NMR spectroscopy has attracted
a growing interest and has become a viable analytical technique.

The scope of this thesis


The aim of this thesis has been to address the possibilities and
limitations in qNMR and to make a contribution to this field of
research. While the requirements for an accurate and precise analysis
with qNMR has been investigated in the literature,2-9 there is a large
demand for further investigation.
In analytical chemistry reference is frequently made to what is known
as the analytical chain:
Sampling → sample preparation → instrumental analysis → data analysis.
This work includes all the steps in this chain, with the main emphasis
on the last one. Linear and nonlinear, univariate and multivariate
calibration methods have been employed, as well as principal
component cluster analysis. The substantial need for methods to
correct peak shifts, especially when analysing biological samples,
were realised and explored. In response to this need, such methods

Introduction | 19
were developed. Furthermore, techniques for data fusion and data
correlation of 1H-NMR and LC/MS data from metabolic samples
have been investigated and evaluated.
This thesis will cover some basic theory of NMR spectroscopy, to
an extent comprehensive enough to enable one to understand the
significance of qNMR and be able to use it. This will be followed
by the special case of analysing low-level components by qNMR in
general. 1H-NMR spectrometry for impurity determination (papers
I–II) and urine analysis (papers III–VI) will be described in more
detail.
The final chapters will include the chemometric methods that have
been used in the papers. Separate chapters have also been devoted
to peak alignment, which has been dealt with in papers III and IV,
and data fusion, covered in papers V and VI.
My experiences, results and conclusions from the work in the papers
will be addressed in the chapters when considered relevant.

20 | Processing and analysis of NMR data


2 Nuclear magnetic resonance theory

This chapter includes the NMR theory needed for understanding


quantitative measurements by one-dimensional NMR spectrometry.
Some quantum mechanics and classical mechanics will be
addressed to facilitate the understanding of the theories. More
extensive descriptions of the NMR theory can be found in the
references.4,9-17

History
The theoretical bases of NMR spectroscopy were proposed by
W. Pauli as early as 1924. However, it was not until 1946 that Bloch
and Purcell independently showed that nuclei absorb electromagnetic
radiation in a strong magnetic field.18,19 They shared the Nobel prize
in Physics for this work in 1952. In 1953 the first high-resolution
NMR spectrometer was presented (a continuous wave instrument),
although it was not until about 1970 that the first Fourier transform
(FT) NMR instrument was available on the market. At least eight
Nobel prizes in physics and two in chemistry have been awarded to
scientists for their work in the field of magnetic resonance.1,8,17

Spin
Atomic nuclei have a spin quantum number, I. If I differs from zero,
the nucleus possess a magnetic moment (μ) that may interact with an
external magnetic field:
, (1)
where γ is the magnetogyric† ratio and p is the angular momentum. γ will
differ between nuclei and may be positive or negative. The possible
values of p are quantised and depend on I. The interrelation between
the nuclear spin and μ leads to 2I+1 observable magnetic quantum states (m).


Also denoted as gyromagnetic

Nuclear magnetic resonance theory | 21


Applying a magnetic field
The fact that nuclei are magnetic allows us to study them by means
of NMR spectroscopy. A nucleus with I ≠ 0 will interact with a
magnetic field and can absorb or emit electromagnetic radiation at
certain resonance conditions. Some of the most important nuclei in
organic chemistry and biochemistry (1H, 13C, 19F and 31P) have I = 1/2
and therefore two allowed magnetic quantum states: m = +1/2 and
–1/2. The 1H nuclei will be the model in the following theoretical
descriptions.
In the absence of an external magnetic field, the nuclear spins are
disordered. Applying a magnetic field (B0) will make the 1H nuclei
orient with a small access in the lower energy state (m = +1/2)
according to the Bolzmann distribution:

, (2)

where k is the Bolzmann constant, T is the absolute temperature


and N represents the population at the different stages. The excess
of nuclei in the lower energy state is small (64 nuclei per million
(for 1H) in a 400 MHz instrument). The difference in energy (ΔE) is
proportional to the strength of the magnetic field according to:

, (3)

where h is Planck’s constant and B0 the external/applied magnetic


field. This creates the basic prerequisites for NMR spectrometry.

Energy transitions
The transition between lower and upper energy levels in NMR
spectroscopy is analogous to absorption in other spectroscopic
methods, although wavelengths and frequencies differ (Figure 1).
However, the discrete energy levels between which the energy
transitions take place are created artificially in NMR spectroscopy,
by placing the nuclei in a magnetic field.

22 | Processing and analysis of NMR data


Frequency (Hz)
MHz
102 106 1010 1014 1018 1022
radio VIS γ-rays
NMR MW IR UV X-ray

106 102 10--2 10--6 10--10 10--14


cm µm Å
Wavelength (m)
Figure 1: The electromagnetic spectrum.

Transitions between energy states are brought about by the


absorption or emission of electromagnetic radiation of a frequency
ν0 with the corresponding energy:
(4)
From equations (3) and (4) it follows that the frequency of radiation
required for a transition is:

(5)

Applied energy with the frequency ν0 will induce both upward and
downward spin transitions of nuclei. As long as there are a greater
number of nuclei in the lower energy state, the energy absorption
will exceed the emission and a signal can be observed. However,
continuously applied energy of the “right” frequency will cause the
spin populations to become equal and no signal will be observed,
due to competing relaxation processes. This phenomenon is called
saturation and may be utilised when a specific signal (for example,
water) needs to be suppressed,20 e.g. when urine samples are to be
analysed.

Nuclear magnetic resonance theory | 23


The laboratory frame
To illustrate the measurement of energy transitions, a classical
description of the behaviour of a charged particle in a magnetic
field is illuminative. A coordinate system with B0 along the z axis,
and the x and y axes describing the horizontal plane, is introduced
to facilitate the following theories. This is called the laboratory frame
of reference.

Larmor frequency
When a magnetic field (B0) is applied to a nucleus with a magnetic
moment (I ≠ 0), it starts to precess about the z axis because of the
gyroscopic effect,† Figure 2.

Figure 2: A precessing nucleus. B0 is the applied magnetic field and


ω0 is the angular frequency of motion.

The angular frequency of the motion (ω0) is called the Larmor


frequency and is given by:
. (6)
ω0 [rad/s] is equivalent to ν0 [Hz] (equation (5)). The frequency, ν0,
for the 1H nucleus is actually what is referred to when the strength
of an NMR magnet is denoted in MHz.

Net magnetisation
The ensemble of spins in a sample generates an average magnetisation
or net magnetisation (M0) in the direction of the z axis upon interaction
with a magnetic field (B0), Figure 3. The individual spins may point in any
direction in the xy plane, so their component in the xy plane evens out.


As when a toy gyroscope starts to precess/wobble because of the gravity force.

24 | Processing and analysis of NMR data


Figure 3: The net magnetisation, M0.

M0 is the sum of all the spins and is directly proportional to the


population difference between the low and high energy levels,
N+1/2 – N-1/2. Although the magnetic moment (μ) in each nucleus
is quantised, M0 is continuous since it reflects the whole population
of nuclei.

The rotating frame of reference


To further facilitate the illustration of energy absorption and to
describe the single pulse experiment, we change the angle of view.
When studying the nuclei in the laboratory coordinate system, the
laboratory frame of reference, each nucleus is precessing at the
Larmor frequency (ν0 or ω0) as described above. By subtracting the
laboratory frame we can define a new rotating frame of reference in
which the precession around the z axis is cancelled out. Moving our
view into this rotating coordinate system makes it easier to illustrate
the observed energy transitions. This is further described below in
“Analogue filtering”.

RF pulse
When a radio frequency (RF) pulse (B1) at the Larmor frequency is
applied perpendicular to B0 we achieve a resonant condition and the
system absorbs energy (∆E). The populations of N+1/2 and N-1/2 are
changed according to the Bolzmann equation (2) and M0 changes
direction. Application of the RF pulse is equivalent to application
of a magnetic field along the x or y axis. This rotates M0 away from
the z axis into the xy plane to give Mxy , Figure 4.

Nuclear magnetic resonance theory | 25


Figure 4: The direction of the net magnetisation (M0) is changed
with the angle β when an RF pulse (B1) is applied to give Mxy.

The changed angle (β) of M0 depends on the field strength of the


RF pulse (B1) and the duration τ (typically μs) according to:

(7)

Relaxation
After applying the RF pulse, the system will relax back and
Bolzmann equilibrium (2) will be re-established. The relaxation can
be described by two different processes, the disappearance of Mxy
and the appearance of Mz until equilibrium is reached and Mxy = 0
and Mz = M0, as illustrated in Figure 5.

Figure 5: The directions of T1 and T2 are shown in a schematic


representation of the relaxation of Mxy back to M0.

26 | Processing and analysis of NMR data


The process restoring Mz is called the spin-lattice (or longitudinal)
relaxation and is defined by the relaxation time constant T1. The
energy of the excited state dissipates into molecular vibrations,
rotations etc, until M0 has returned to its original value on the z
axis. The “lattice” denotes the surroundings of the molecule, hence
T1 will depend on, for example, the viscosity of the sample (low
mobility will give a high T1).13
The loss of Mxy is due to intrinsic molecular properties and to
instrumental imperfections. When excluding the contributions
from instrumental imperfections, we may define the spin-spin (or
transverse) relaxation time constant T2, which includes interactions
between neighbouring nuclei. This results in loss of xy magnetisation
until no preferred orientation in the transverse plane remains.
When we include the contribution of instrumental imperfections
to the loss of xy magnetisation, we define T2*. T2* determines
the spectral resolution (peak width) and depends on, for example,
inhomogeneities in the magnetic field.
T2* is always less than T2 and T2 is always less than or equal to T1.
For many liquids, T1 and T2 are about the same and in the order of
seconds.
In contrast to other spectroscopic methods, it is not possible to
accurately measure the absolute values of transition frequencies.
Instead, the difference between two resonance frequencies is
measured. Each nucleus, rotating with its own resonance frequency
ωi, differs from the Larmor frequency ω0 by a few parts per million
(ppm) in a span of about 10 ppm for the 1H-nuclei. This relaxing
magnetisation induces a measurable current in the xy plane, the free
induction decay (FID), Figure 6.

Nuclear magnetic resonance theory | 27


Figure 6: Schemes of the relaxation of Mxy back to equilibrium
(M0) after an applied RF pulse. The drawing also depicts the RF
and detection coils. The relaxation of My with the time constant T2
shows the free induction decay (FID).

The FID is composed of exponentially decaying signals, expressed


in simple form by

, 0<t<at, (8)

where ωi represents the angular frequency of the ith of K magnetically


different nuclei in the same sample. φ is the phase error, which will
be corrected for later, t is the time and at is the acquisition time.

28 | Processing and analysis of NMR data


Quadrature detection
To distinguish between vectors rotating faster or slower than the
carrier frequency, the FID is sampled in two directions perpendicular
to each other (Figure 6). This is called quadrature detection14 and
results in two signals with phases shifted 90o. These two signals are
subsequently treated as one real R(t) and one imaginary part I(t):

(9)

. (10)

One signal of the FID may be illustrated as a combination of


equations (9) and (10):

. (11)

Chemical shifts
Resonance frequencies from the same isotopes will differ depending
on different molecular (magnetic) surroundings, which give the
chemical shifts. Therefore, nuclei with different surroundings result
in peaks at different frequencies in an NMR spectrum. In 1H-
NMR spectroscopy TMS (tetramethyl silane) or TMSP (sodium 3-
(trimethylsilyl)propionate-2,2,3,3-d4) are usually set as the reference
value of 0 ppm (Figure 7). This relative scale implies that each peak
will appear at the same chemical shift, regardless of the strength of
the magnetic field.

Coupling constants
The energy levels of a nucleus may also be affected by the spin
state of nuclei nearby, in which case the nuclei are said to couple
to each other. This will give split peaks and the magnitude of this
separation depends on the strength of the coupling interaction, the
coupling constant (Figure 7). These coupling patterns are crucial for
the determination of chemical structures but will not be described
further here.

Nuclear magnetic resonance theory | 29


Figure 7: A sketch of a 1H-NMR spectrum of ethane (CH3-CH3)
showing the chemical shift scale (δ) and the coupling constant (J).
TMS assigns the reference value δ=0.

NMR data acquisition

Locking and shimming


The frequency of a resonance in the sample (usually deuterium in the
solvent) is compared to a fixed frequency derived from the master
clock used to generate the spectrometer frequency. Any deviation
due, for example, to small changes in the magnetic field is converted
into a correction to B0. This is called field frequency locking.4,21
The lack of homogeneity in the magnetic field is compensated
for by spinning the sample and/or by shimming. Shimming is the
automatic or manual adjustment of certain shim coils or magnetic
gradients until the instrumental contributions to the observed line
width are minimised.4,21

Analogue filtering
We are interested in spectral frequencies (ωi) relative to some
reference frequency (ω0), as stated above. The NMR signal from
the laboratory frame of reference is demodulated by subtracting
the reference frequency (ω0) by a bandpass filter, leaving a signal
which then is sampled and digitised. This is comparable with how
FM radio stations broadcast in the MHz range. The audio signal
desired is superimposed on a high-frequency carrier signal, and this
carrier frequency is tuned into the radio receiver and subsequently
subtracted to leave a signal consisting of frequencies audible to the
human ear. ω0 is adjusted by tuning (to find the right frequency) and
matching (impedance).21

30 | Processing and analysis of NMR data


Analogue to digital conversion
The FID collected by the detection coils in the xy plane (Figure 6) is
converted to digital signals. An analogue-to-digital converter (ADC)
is characterised by its sampling rate and dynamic range. The Nyquist
sampling theorem22 says that the rate of sampling (Fsampling) must be
at least twice the frequency of the highest frequency signal (Fsignal)
that is to be detected, i.e. at least two points per cycle of the highest
frequency signal according to
Fsampling ≥ 2 Fsignal. (12)
The dynamic range of the ADC gives the limit of the smallest
signals that are measurable in the presence of a strong signal and
may be a source of error/noise when there are large concentration
differences.1,22 This is the case, for example, in impurity determinations
and biological samples containing a substantial amount of water. A
typical dynamic range is 16 bits, which corresponds to an intensity
resolution of 1/216. It is important to set the receiver gain of the
instrument so that the dynamic range is fully utilised, but to avoid
overflow, which results in a truncated FID. Theoretically, the height
of the smallest detectable peak is then 1/216-1 = 30 µmol per mole of
the main peak. This is also shown to be the practical limit in paper I,
where the influence of the dynamic range is further discussed.

Digital filtering
Digital signal processing (DSP) or digital filtering represents
processing for extending or enhancing the spectrometer capabilities,
particularly when analysing small signals in the presence of large
ones. The general steps involved are oversampling, filtering and
downsampling.23
Oversampling is a technique borrowed from audio technology and
has proved to be useful in NMR technology. The oversampling rate
describes how many times (q) greater the actual sampling rate is
compared to the Nyquist sampling rate (equation (12)). Theoretically,
an oversampling factor of q gives a gain in the effective dynamic
range of the ADC by log2(q) bit. For instance, with an oversampling
of 8, a gain of 3 bits in ADC resolution is obtained.22

Nuclear magnetic resonance theory | 31


After the oversampled data is acquired, a digital filter is applied to the
FID to reduce noise or extraneous signals outside the final spectral
width desired. This consists of signals with frequencies higher
than the Nyquist frequency, which will be “undersampled” and
consequently appear at the wrong frequency in the spectrum.
Downsampling of the filtered data is then done to reduce the data size,
which is generally reset to the values of the selected spectral width
and the Nyquist sampling frequency.

Signal averaging
In the pulsed NMR technique, the responses from multiple
experimental repetitions may be co-added, a technique known as
signal averaging. Like infrared (IR) spectroscopy, NMR spectroscopy
fits the definition of a “detector noise limited” measurement, which
means that the process of signal averaging results in an increase
of the signal-to-noise ratio by the square root of the number of
repetitions.24

, (13)
where S/N is the signal-to-noise ratio and nt is the number of
transients (scans).
Important parameters to be set for a repeated pulse experiment
are the acquisition time (at) during which the FID is collected and
the relaxation delay (rd), see Figure 8. The choice of at and rd is
influenced by the relaxation processes T2* and T1.
p p

at rd at rd
Figure 8: An example of two repeated RF pulses (p) from the
transmitter (upper) with acquisition time (at), during which the
FID is collected, and relaxation delay (rd) of the reciever (lower).

32 | Processing and analysis of NMR data


NMR data processing

Zero filling
The spectral resolution could be enhanced by zero filling (zero
padding) of the FID.9,11,24,25 The information content in the FID is
not increased but the distribution between the real and imaginary
parts of the spectrum is changed. A rule of thumb is to zero fill
by at least double the number of data points to regain the spectral
resolution after Fourier transformation, when the imaginary part of
the spectrum is eliminated.

Apodisation
More information from the spectrum may be obtained if the FID
is multiplied by a suitable apodisation function before Fourier
transformation.26,27 The Lorenzian line shapes in spectra may
be changed by multiplying the exponentially decaying FID by a
function. This will affect the spectral resolution as well as S/N.
A negative exponential function will emphasise the early part of
the FID with the strongest signal and suppress the late part, which
consists mostly of noise. Such a function would increase S/N but
also give line broadening since it would simulate faster relaxation
(T2*). However, in general, broader peaks will be covered with more
data points which help to obtain more precise integral values. e(-bt)
is a so-called matched filter (compare with equation (8)) where b is
called the line-broadening factor. The matched filter is evaluated in
papers I–II.
Besides the negative exponential function, an abundance of
various functions have been used together with the FID. Examples
are gaussian, lorenzian-to-gaussian transformation, sine bell and
trapezoidal function.12,27 No single apodisation is optimal for all
signals in spectra and different functions will enhance different
phenomena, such as S/N or resolution.

Convolution
Furthermore, convolution or spectral smoothing could be
performed on the spectrum in various ways. To give one example,
a one-dimensional convolution with a triangular function26 means

Nuclear magnetic resonance theory | 33


that the spectra are filtered by a triangular window. The convolution
theorem says that convolving two vectors is equal to multiplying
their Fourier transforms.14 Thus, the convolution with a triangular
function is the same as multiplying the FID by the Fourier transform
of a triangular function (an exponentially decaying cosine curve),
which also enhances the very first part of the FID. This has been
tried and evaluated in paper II.

Fourier transformation
One aim of NMR data processing is to calculate the frequencies (ωi)
from equation (8) and obtain a spectrum. The most usual method for
converting the NMR signal from the time domain into a frequency
domain is Fourier transformation (FT).24,26
Fourier-transforming the signal S(t) from equation (11) gives a
function of frequency, F(ω), also consisting of one real and one
imaginary part:
(14)
Since we have discrete values from the ADC, a discrete FT (DFT)
must be applied. DFT requires linear sampling, data sampling at
regular time intervals until the signal is close to zero, and truncation
of the FID will lead to artefacts.
Hodgkinson and Hore28 have pointed out possible ways in which
nonlinear sampling of data can be more efficient. A frequency
spectrum could then be obtained by alternative data processing
procedures such as least squares methods (LS),29 linear prediction
(LP),30 the Bayesian method31,32 or the maximum entropy method
(MEM).33 These alternative methods are recommended for
quantitative problems in the literature,5,10 although only a few have
been employed.34

34 | Processing and analysis of NMR data


Phase correction
The FID S(t ) as well as the Fourier-transformed spectrum F(ω) has
been described as a signal consisting of one real part R and one
imaginary part I, equations (11) and (14). Due to phase shifts, R(ω)
and I(ω) consist not just of the absorption spectra A(ω) and the
dispersion spectra D(ω) respectively, but a linear combination of the
two according to:

(15)

where φ represents the phase error. A linear combination of R(ω)


and I(ω) will give the pure absorption and dispersion spectra,
according to

(16)

This is obtained by phase correction, which is most often performed


manually. Since the phase correction is frequency dependent, φ is
defined as:
, (17)
where the constant phase error, φ0, arises from the inability of the
spectrometer to detect the signals from the two detector coils in
the exact x and y directions. Frequency-dependent linear phase
errors, φ1, arise because the detectors cannot detect the transverse
magnetisation immediately after the RF pulse due to the risk of
pulse breakthrough and the RF pulse being slightly off the Larmor
frequency of every nucleus.4

Nuclear magnetic resonance theory | 35


36 | Processing and analysis of NMR data
3 NMR for quantitative analysis

Peak area determination is very common in routine proton NMR


spectrometry. Accurately calculating integrals to determine the
relative number of protons contributing to particular peaks is
straightforward. For this purpose, an integral accuracy of 10–15%
is satisfying and easily achieved. When NMR spectrometry is
applied as a quantitative analytical tool, an accuracy of 1–2% or
better is desirable. In this thesis NMR spectrometry for quantitative
applications is denoted qNMR;35 however, this is not a generally
accepted convention. This chapter will review the possibilities
of, discuss the requirements for and hopefully give a deeper
understanding of the applicability of qNMR.

Background
Early research in the qNMR field was done, among others, by Muller
and Goldenson,36 Hollis,37 and Jungnickel and Forbes38 and by 1976
qNMR in pharmaceutical research had been reviewed.39 Several texts
have discussed this subject in the last two decades1,5,8-10,25,34,35,40,41 and
the notation QNMR was introduced in 1998.7
The use of NMR spectrometry as a quantitative technique has
been rather limited despite its many advantages, which include the
following: Since it is able to detect small differences in a molecular
structure, it is a very selective technique. It is also directly applicable
to most samples, little or no sample preparation being required. For
instance, samples with a complex background matrix such as urine,42
plasma,42 petrol,43,44 oil,45 beer46 or egg yolk extracts47 can be analysed
virtually directly. It is also a non-destructive tool of analysis. One
special advantage of NMR spectrometry is its property of being
a primary method of measurement, a method with the highest
metrological qualities whose operations can be completely described

NMR for quantitative analysis | 37


and understood. Thus the results can be accepted without reference
to an independently determined standard of the same material but
can be determined directly from the physical context. In theory, the
peak intensity of each NMR signal exactly reflects the molar ratio of
the nuclei present (i.e. the molar response is 1 for all nuclei of the
same isotope), making the technique conceptually and technically
simple for quantification.
Because organic compounds typically generate several peaks, the
complexity of the spectra, especially for 1H-NMR spectrometry, may
cause problems as well as bringing an advantage. A discouraging fact
in qNMR analysis may be the large amount of parameters that have
to be set. The most important ones are accounted for in this chapter.
Compared to other analytical techniques, the major disadvantage of
qNMR is its high limit of detection.

qNMR vs other analytical techniques


Despite its limitations, qNMR may rival chromatography in
sensitivity, speed precision and accuracy, while also avoiding the
need for a reference standard of each analyte. Examples where
qNMR is comparable and even more accurate and precise than the
standard liquid chromatography (LC) method are easily found in the
literature.7,40,48-51, papers I–II Other advantages of qNMR over LC include
time for development and validation, robustness and analysis time
(i.e. cost),48 together with the possibility of getting a good overall
picture of all types of organic compounds in the sample.
Compared to mass spectrometry (MS), where individual events could
be counted, NMR spectrometry has S/N ratios that are lower by
many orders of magnitude. However, it does have some advantages
over MS: normally, no separation (chromatographic or other) is
required, no expensive, authentic reference samples are necessary
and, in many cases, it is quicker and easier to perform.48
There are no general rules for predicting whether analytical problems
can be solved by NMR spectrometry. Pauli et al.40 demonstrated
the strong demand for non-chromatographic alternatives in the
quality assessment of reference compounds. The area of impurity
determination (papers I–II) is suitable for qNMR since the technique
is very selective. Favourable application areas of qNMR are also
where it can replace complicated and time-consuming sample
preparation and analysing methods, one example being found in
paper II.

38 | Processing and analysis of NMR data


qNMR methods
Two principal methods may be used for qNMR: the absolute method
with reference to an internal standard compound (a primary method)
and the calibration method with reference to a calibration model.

(i) The absolute method


The intensity of a signal (A) is proportional to the total number of
nuclei generating it (N). Comparing two signals, where one works as
an internal standard (is), gives the relationship:

, (18)

which makes it easy to determine the number of nuclei generating


the other signal (s) without any calibration. However, in order for
equation (18) to be valid, the nuclei s and is have to be relaxed to the
same extent or, easier to fulfil, fully relaxed.

(ii) The calibration method


NMR spectrometry can also be used without utilising the advantage
of the equal molar response for all nuclei. However, this requires
a calibration based on the internal standard technique. The equal
molar response does not then have to be valid, and incomplete
relaxations or peak interferences may be included in the model. The
focus of the analysis will then be to optimise the S/N ratio for the
particular quantification task. Calibration qNMR is used (more or
less) in all the papers in this thesis, although it has been presented in
the literature43,44,47,52-55 to a lesser extent than the absolute method.

The sensitivity of NMR spectrometry


The low sensitivity of NMR spectrometry26 is due particularly to the
very small differences in energy that are measured. Low-frequency
energies are also more difficult to detect than high-frequency
ones. Moreover, only the small differences in population between
the energy states (equation (2)) are observed in the experiment.

NMR for quantitative analysis | 39


However, since the nuclei have a relatively long lifetime in the excited
state (compared to other spectroscopic techniques), the resonance
is well defined, resulting in narrow peaks, according to Heisenberg’s
uncertainty principle.†,13
New technical developments in the NMR instrumental field will
provide greater sensitivity in the analyses. Imminent developments
include higher magnetic field strengths. Also NMR probes with
cooled transmitter/receiver coils and preamplifiers will increase
the sensitivity due to reduction of electronic thermal noise.
Probes cooled by cryogenic liquids such as liquid helium are called
cryoprobes and will increase the S/N ratio approximately fourfold
or more.56 References1,57 are also made to the possibilities of using
microcoil technology in order to reach a limit of detection (LOD)
down to picomole (10-12) level.

Signal-to-noise ratio
The achievable signal-to-noise ratio (S/N) of an NMR single pulse
experiment is a function of a number of parameters. These are
summarised in equation (19)58,59 and in Table 1, where the time
parameters are also included.

(19)

Instrumental parameters determined by the available equipment


are the filling factor ( f ), the magnetic field strength (B0), receiver
bandwidth (b), the quality factor of the RF coil (Q) and the noise
figure of the amplifier (n). The filling factor is a measure of the
fraction of the coil volume occupied by the sample and could be
increased by fixing the RF coil directly onto the sample cell. This
construction, however, makes it impossible to spin the sample to
get improved magnetic field homogeneity. On the other hand, the
magnetic field homogeneity in such a probe is superior to those with
an unfixed coil. According to equation (19) an increase in magnetic
field strength will increase S/N by a power of 3/2, i.e. changing
a magnet of 500 MHz into 600 MHz would give an increase in
S/N of 30%. Instrumental factors affecting sensitivity are further
discussed by Hoult.60


To measure an energy difference accurately, one needs a long time. If the system
relaxes rapidly, the time available is small and hence ΔE (=hν) is poorly defined and
the observed NMR signals are wide.
40 | Processing and analysis of NMR data
Table 1: A summary and categorisation of the factors influencing
NMR spectrometry S/N.16,59
Possibility of S/N
Categorisation Parameter
enhancement
RF coil fixed onto
f filling factor
the sample cell
receiver
b decrease
bandwidth
the quality
Probe Q factor of the increase
design RF coil
the noise
decreased by cooled
n figure of the
probe
amplifier
volume of
Vs a large coil volume
the sample
number of
Adjustable increased sample
N detectable
laboratory concentration
nuclei
parameters
T temperature decrease
Instrument Affect the magnetic
B0 increase
parameter Bolzmann field strength
B

distribution gyromagnetic a nucleus with a


Intrinsic �
ratio high �, e.g. 1H or 19F
nuclear
spin quantum a nucleus with a
properties I
number high I
spin-lattice decrease by e.g.
T1 relaxation temperature or
time relaxation reagents
experimental increase by
Time
spin-spin
parameters T2* shimming �
relaxation
time narrower peaks
number of
nt increase
scans

An adjustable instrumental parameter is the sample temperature (T).


Lowering the temperature will give a larger polarisation (larger M0)
according to the Bolzmann equation (2), giving increased sensitivity.
However, lowering the temperature may reduce T2* and lead to a
loss of S/N due to larger line widths.
The volume of the sample (Vs) and the number of detectable
nuclei (N) are factors that may be limited by the available amount
of sample and solubility, but should be as high as possible. Too high
a concentration, however, may risk incomplete dissolution or cause

NMR for quantitative analysis | 41


ADC overflow, which results in signal truncation or downscaling
and hence a decrease in the ADC dynamic range. A sample with
a very high concentration can also give line broadening due to
intermolecular interactions. The number of nuclei detected also
depends on the number of structurally equivalent nuclei that give
rise to one signal from the studied molecule.
The gyromagnetic ratio (γ) and the spin quantum number (I ) are
given parameters for the chosen nucleus. If possible, the nuclei of
protons are the ones to measure since 1H is a frequently occurring
isotope (99.99% natural abundance) and has a high gyromagnetic
ratio (26.75 compared to 6.73 for 13C). All in all, 1H-NMR is some
5,700 times more sensitive than 13C NMR spectrometry.9
Furthermore, the splitting of the signal, depending on the structural
and chemical surroundings of the proton, affects S/N. To give an
example, a singlet peak has a much higher S/N than a doublet peak,
although they come from the same number of nuclei and thus have
the same total area. If possible, this should be considered when
choosing one peak for quantification.
As discussed earlier, various time parameters will also affect the
S/N ratio in the resulting spectra: the number of transients (pulse
experiments) in signal averaging (equation (13)), acquisition time
(determined by T2*) and pulse repetition time (determined by T1).
These are also included in Table 1.
For the best S/N ratio, there is an optimum combination of the tip
angle (β), the T1 relaxation time and the time between pulses, given
by Ernst’s equation:61

(20)

where rt is the pulse repetition time. This equation is valid for


one nucleus with the spin-lattice relaxation T1 and does not give
the conditions for the best accuracy of the peak integral that are
required for the absolute method (i).
The simplest, and most frequently mentioned, way to ensure that the
spin system is in equilibrium between pulses is to wait for 5 times
the longest T1 after a 90° pulse before repulsing. This corresponds
to a relaxation of 99.3% and will give a maximum 0.7% error in
the integral accuracy. According to Trafficante,62 a pulse angle of

42 | Processing and analysis of NMR data


83° and a pulse repetition period of 4.5 times the longest T1 give
the optimum recovery after a pulse regarding accurate integrations.
The settings of these parameters for quantification are discussed in
several references.4,5,9,61,63,64

Accuracy and precision of qNMR


The signal peaks in an NMR spectrum are generally characterised
by four attributes: chemical shift, multiplicity, line width and relative
intensity. A well-resolved, narrow, single peak with a high S/N ratio
has the best chance of giving an accurate and precise quantification.
Several studies have discussed the uncertainties and errors in
qNMR,2-9 and accuracy and precision better than 1% have been
reported.2,3,7,25 The main contributions to the experimental error
have been localised in these studies to the sample preparation and
the analyst.2,7

Accuracy
The systematic errors presented will mainly affect the accuracy of the
absolute method (i) of qNMR, where no calibration is performed.
If the aquisition time is shorter than the time it takes for all nuclei
to relax completely, the molar response will differ from unity. To
avoid systematic errors in the peak areas, the relaxation delay must
also be long enough to allow complete relaxation in z-led (T1) for all
nuclei. Also vertical truncation of the FID, due to ADC overflow, will
introduce distortions of spectra.
Resonances (ωi) with different distances from the excitation pulse
(B1) may be differently excited and detected due to off resonance effects
and non-uniform responses to the RF pulse. If the bandpass filter
width prior to the ADC is too narrow, the lines at the ends of the
spectrum may be reduced in intensity. In this respect it is beneficial
to use peaks close to each other for quantification.
The integration range for Lorenzian NMR peaks is limited, especially in
a crowded spectrum. An integration error smaller than 1% requires
integration limits of ±24 times the peak width.25 However, if the
integration limits are consistent between peaks, errors or less than
0.5% are easily achieved.3,7,40

NMR for quantitative analysis | 43


Precision
Random errors will affect both the precision of the absolute (i) and
the calibration method (ii) for qNMR.
Point-to-point noise due to spectral resolution is a random error
source. If the spectral resolution exceeds 0.4 of the peak width, the
maximum error of the integral will be 0.1%.25 The spectral resolution
may be enhanced by zero filling. Furthermore, the intensity resolution
will be limited when a small peak is detected in the presence of a
strong peak, depending on the dynamic range of the ADC, which
will give random errors.
Automatically phased peaks are rarely acceptable with available
methods, so manual phasing with a vertical expansion of the peaks is
still the recommended method for the best results.35 Since perfect
phasing of NMR spectra then relies on a subjective judgment
of the goodness of the shape of signals, this will be a source of
error. Baseline and phase anomalies are thoroughly discussed in the
literature.5,40

Discussion
Very few articles describe the calibration method (ii) in the context
of qNMR. By using this method, the parameter determination
ensuring complete relaxation is avoided. This is a great advantage
over the absolute method. The drawbacks are that the calibration
samples have to be prepared and that all the sample components are
required as standards. The advantage of the absolute method (i) is
that it is very straightforward and that no calibration is needed.

44 | Processing and analysis of NMR data


4 qNMR applications

This chapter will describe qNMR applications for impurity


determination and metabolic profiling. The sample matrices, NMR
instrumental parameters and applied data processing from the
papers will be briefly illustrated.

Impurity determination
The presence of unwanted chemicals, e.g. residuals from synthesis and
storage, even in small amounts may affect the efficiency and safety of
pharmaceutical products. This is why the different pharmacopoeias
prescribe limits for the allowable levels of impurities present in
the active pharmaceutical ingredients and formulations. Impurity
profiling – the identification and quantification of impurities – is of
great importance in the pharmaceutical industry.65
qNMR is a very useful technique for impurity determinations due
to the absolute response factor, selectivity and the reproducibility
of chemical shifts. Impurities with similar chemical structures may
readily be both identified and quantified by NMR spectrometry,
making it a very cost-effective alternative compared to other
analytical methods that require the acquisition of separate standards
of the target analyte and possible impurities.
The use of 1H-NMR spectrometry for impurity (or low-level)
determination has been reported in several papers1,7,45,48-52,55,66-72 and
in papers I–II in this thesis. Organic components have been reliably
quantified below the 0.1% level7,70 and some papers have reported
LODs lower than 100 µg/g.49,52,67,70,papers I–II The main aim in papers
I–II was to investigate the potential of the qNMR technique for
impurity determination in these specific cases. There were no limits
of materials, i.e. no problem of making calibration samples, and no
attempt was made to perform an absolute determination.

qNMR applications | 45
Paracetamol samples (paper I)
Paracetamol (N-(4-hydroxyphenyl)-acetamide) is a drug having
an analgesic, antipyretic and antiphlogistic action that is widely
employed in therapeutics. The main impurity in paracetamol
preparations is 4-aminophenol, an intermediate in the synthesis of
paracetamol (Figure 9), which may also be formed during storage. The
4-aminophenol content in paracetamol is limited to 50 µg/g by the
European, United States, British and German pharmacopoeias.73

I II III

I II III

Figure 9: Structures of paracetamol (left) and 4-aminophenol (right).


The Roman numbers is referred in the legend of Figure 10.

In paper I, the potential of determining 4-aminophenol in


paracetamol by qNMR was explored. Mixtures with different
concentrations of the impurity were dissolved in DMSO-d6.
The pulse angle and repetition time chosen were 83° and 4.5*T1
respectively, the optimal settings for best peak integral accuracy
according to Trafficante.62 To expand the vertical dynamic range,
the oversampling factor was set to 20. The study is fully described in
paper I and part of a spectrum is shown in Figure 10.

46 | Processing and analysis of NMR data


p+4ap

4ap

6.8 6.6 6.4


δ(1Η)/ppm

Figure 10: A part of an NMR spectrum of paracetamol holding


1 mg/g of the impurity 4-aminophenol. The peak marked p+4ap
comprises a 13C-satellite signal (0.55% of the main peak intensity)
from two aromatic protons in paracetamol (marked I in Figure 9)
and the signal from the corresponding signals in 4-aminophenol
(II). The peak marked 4ap arises from 4-aminophenol (III).

Poloxamer samples (paper II)


Poloxamer materials are synthetic copolymers of ethylene oxide and
propylene oxide (Figure 11) and were first synthesised in 1954 by
Lundsted and Ile.74 Poloxamers form a thermoreversible gel, a low-
viscous liquid at low temperatures and a gel at body temperature,
which has been used for pharmaceutical formulations.75
Acetaldehyde and propionaldehyde (Figure 11) are well-known
impurities in poloxamers.76 They are formed from the monomers
ethylene oxide and propylene oxide, which are the building blocks
of the poloxamers. The maximum allowed amounts of acetaldehyde
and propionaldehyde in the poloxamer are 80 and 100 µg/g,
respectively, for some medical applications.

qNMR applications | 47
Figure 11: Structures of poloxamer (upper), acetaldehyde (left)
and propionaldehyde (right). a and b hold different numbers for
different types of poloxamers.

The LC method for the determination of acetaldehyde and


propionaldehyde in poloxamer in use today involves a very time-
consuming work-up procedure. By means of qNMR the poloxamer
samples can be analysed only by dissolving the samples in acidic
water. Due to the volatile aldehydes and because the poloxamer
forms a gel at temperatures above room temperature, the poloxamer
samples were analysed at 275 K. The viscosity of the samples was
also the reason for running the experiments in a non-spinning
mode. The oversampling factor was set to 16 to expand the vertical
dynamic range (chapter 2, “Digital filtering”). The remainder of the
parameter setup varied according to Table 2. An NMR spectrum is
shown in Figure 12.

Table 2: NMR instrumental parameter setups that were tried


for determination of acetaldehyde and propionaldehyde in
poloxamer.
Shimming Gradient Gradient Manual Gradient
Pulse angle 27o 27o 78o 78o
Recycle time (s) 6 (3.3�T1aa) 4 (2.1�T1aa) 9 (5�T1aa) 14 (7.8�T1aa)
Acquisition time (s) 6 4 4 4
Number of scans 64 256 / 64 64 64
Tot. exp. time (min) 6.4 17 / 4.3 9.6 14.9
T1aa: Spin-lattice relaxation time constant for acetaldehyde (1.8 s)

48 | Processing and analysis of NMR data


pa

aa

poloxamer

2.2 2 1.8 1.6 1.4 1.2 1 0.8


δ( 1 H)/ppm
Figure 12: A spectral segment including signals from the CH3
groups from the substances poloxamer at 1.2 ppm, acetaldehyde
(aa) at 2.22 ppm (176 μg/g poloxamer) and propionaldehyde (pa)
at 0.89 ppm (175 μg/g poloxamer).

Zero filling and line broadening (papers I–II)


Zero filling and line broadening were performed to a varied extent
previous to a Fourier transform in both paper I (Table 3) and
paper II. The evaluation showed no improvement in the quantitative
results due to zero filling, although line broadening improved the
quantifications to some extent in paper II.

Table 3: Root-mean-square error of prediction (RMSEP)


(equation (31)) of different test sets from NN calibration models
including partly different data sets (A-D), and with different data
preprocessing (see paper I for details).
RMSEP (�g 4ap/g pa)
Zero Wavelet Line
A B C D Mean
filling† compression broadening
no no no 35 24 15 27 25
yes yes no 21 59 32 38 37
yes yes yes 62 59 30 46 49
† todouble the number of data points
4ap: 4-aminophenol
pa: paracetamol

qNMR applications | 49
Spectral smoothing (paper II)
A one-dimensional convolution of spectra with a triangular function
(of height 1 and width 41) was performed in paper II. The evaluation
of this coarse smoothing of spectra improved both repeatability,
reproducibility and the LOD.

Metabolic analyses
NMR spectrometry is very attractive for quantitative and qualitative
determinations of small molecules in biological fluids since it
requires no or very little sample preparation. Drugs, toxic substances
and metabolites may be analysed without prior assumptions, e.g. to
measure the biological response to some xenobiotics. Already in the
early eighties NMR spectrometry was used for the analysis of serum,
plasma and urine.77-79 This field of research in combination with
high resolution 1H-NMR was reviewed by Nicholson and Wilson
in 1989.42
Although the terminology in this area of research is still evolving,
metabolic analyses may be divided into four general areas.80-83
• Metabolite target analysis: The quantification of specific
metabolites.
• Metabolite/metabolic profiling: Quantitative or qualitative
determination of a group of related compounds or of specific
metabolic pathways.
• Metabolomics/metabonomics: Comprehensive analysis of
all metabolites at a specified time and given environmental
conditions.
• Metabolic fingerprinting: Sample classification by global analysis
of all present components.
In drug research, the term metabolic profiling is often used to describe
the metabolic fate of an administrated drug.83 Metabolic profiling
makes it possible to generate information about in vivo multiorgan
functions. This is an effective preclinical screening method. This
type of analysis may also provide one key to “systems biology”: the
combination of genomic, proteomic, and metabolomic data.80,81,84

50 | Processing and analysis of NMR data


By means of urine sampling, information can be semi-continuously
collected from the same animal without euthanasia. The 1H-NMR
analysis of urine samples takes a few minutes and the sample
preparation includes only buffering and centrifugation. Urine
sample analysis is expected to reflect a large number of biochemical
processes in the body, making it a potential diagnostic tool.81,84
The improved capabilities of the NMR instruments due to the
development of technology makes it possible to detect thousands
of signals arising from hundreds of endogenous molecules in
spectra from 1H-NMR analysis of biological fluids such as urine or
plasma. Although these signals often overlap, this is a useful tool for
metabolic fingerprinting. When analysing these complex samples,
the assignment of all peaks is seldom accomplished. However,
even unidentified signals can be used as fingerprints. Techniques
for multivariate data analysis and pattern recognition then help
to give qualitative and quantitative interpretation out of complex
spectra.46,47,54-56,72,80,82,84-97
A lot of biological variations will affect the quality of urine sample
1
H-NMR spectra. For example, the ionic strength varies considerably
in urine samples and may affect tuning and matching (chapter 2,
“NMR data acquisition”).81 Water, which is present in a very high
concentration,42 has to be reduced by solvent suppression during
the NMR experiment. The concentration of the urine must also be
compensated for in the analyses.

Rat urine samples (papers III–VI)


The rat urine samples studied in papers III–VI were received from
an animal study at AstraZeneca. The study included two groups of
six rats each: one control group orally dosed with water and one
group dosed with citalopram (positive control for phospholipidosis).
The rats were dosed once a day for 14 consecutive days and urine
samples were collected on days -5 (pre-dose, two occasions), 1, 3, 7,
10 and 14. The study is fully described in paper III.
The samples were buffered and centrifuged prior to 1H-NMR
analysis on a Bruker DRX600 instrument working at 600.13 MHz.
Each sample was automatically shimmed by gradient shimming and
suppression of the water signal was achieved using presaturation
during relaxation delay and mixing time with a shaped pulse for

qNMR applications | 51
selective saturation. The acquisition time was 4.68 s and the total
pulse recycle time was 7.71 s, 64 scans per sample using a spectral
width of 7002.8 Hz. These parameters were chosen on the basis of
previous work analysing urine samples with NMR spectrometry.92,95
The total NMR acquisition time for one sample was 8.6 min. An
example of an NMR spectrum is shown in Figure 13.

10 8 6 4 2 0
δ( 1H)/ppm

Figure 13: 1H-NMR spectrum of urine from a rat dosed with


citalopram for 7 days.

Phase correction and peak alignment (papers I–VI)


Spectra were manually phased and referred to the TMS peak on the
NMR instrument and further processed in MATLAB.98 In paper I, the
spectral areas of interest (6.4-6.55 ppm) were cut out, automatically
phased and corrected laterally by means of a genetic algorithm
until the peaks were aligned. In papers II–VI each spectrum was
manually phased (again) in MATLAB and peak-aligned segmentwise
as described in papers III and IV. The subject of peak alignment is
thoroughly discussed in chapter 6.

52 | Processing and analysis of NMR data


5 Chemometrics

Data from a chemical analysis consist of information and noise. To


get high accuracy and precision for a chemical analysis, we need
knowledge about both of these components. The properties of
the instrumental noise of NMR spectrometry are discussed in
chapter 3 above. The noise also has chemical and/or biological
contributions.
The desired and relevant information is often just one part of the
total information. When it comes to data analysis, this is what we
want to extract. This can be performed by chemometrics, which
comprises the application of multivariate statistics, mathematics
and computational methods and also includes how to represent and
display this information. A philosophical article about the subject
has been written by Svante Wold.99 More theoretical references to
the subject of chemometrics can be found in books.100-102

Principal component analysis


Principal component analysis (PCA)103,104 is a well-established bilinear
projection method for interpreting multivariate data. Pearson105 is
considered to be the first person to have described PCA as we know
it today, although Cauchy derived it as early as 1829.100 PCA denotes
the extraction of eigenvectors† from the variance-covariance (var-
cov) matrix (21) to obtain a number of uncorrelated (orthogonal)
new variables called latent variables or principal components (PCs).

(21)

for variable i, j =1... p in the data matrix X of the dimensions n × p.


From the German eigen meaning “inherent, characteristic”

Chemometrics | 53
PCA gives a new representation of data in a new, rotated coordinate
system. Every PC is optimised in regard to its ability to explain
the direction of maximum variance in the original data and each PC
explains variation uncorrelated with the other PCs. The data are
approximated by the model:
X = TP’ + E,
where X is the original n × p set of data, typically n samples and p
variables. P’ is an m × p matrix of variable coefficients (loadings)
that give the relative contributions of every variable in calculating
the object scores, T, which is an n × m matrix, of the coordinate
values on the PCs. E is an n × p matrix containing the residuals not
explained by the model using m PCs. The score value representation
(scores plot) will show a clustering of similar samples, as sketched in
Figure 14. The loadings represent the variable direction, describing
the influence of the variables on the scores, and are a valuable
interpretation tool.

v1

v2

v3

Figure 14: A geometrical interpretation of a PCA model. The


original data matrix X consist of p variables and eight samples. The
first three variables (v) are represented in the coordinate system to
the left, where the first two principal component vectors, PC1 and
PC2, are also shown.

PCA as a projection technique in combination with NMR spectra is


frequently employed in systems biology and in the pharmaceutical
industry. Examples are found in papers III–VI and several
references.82,86,90-97,106

54 | Processing and analysis of NMR data


Parallel factor analysis
Parallel factor analysis (PARAFAC) is a multiway decomposition
method, a generalisation of PCA to higher order arrays.107,108 X is
decomposed in the loading matrices A, B and C, illustrated by:
Xnpq = AnmBpmCqm + Enpq
where n, p and q represent the dimensions of the data matrices, m
denotes the number of PARAFAC components and E is the residual
matrix in this three dimensional example.
PARAFAC was applied to three- and four-dimensional data in
paper VI and has the advantage of being able to retain the time-
related structure of this data. Compared to bilinear models with
unfolded data, multiway models are generally more robust and
easier to interpret, provided that all dimensions hold information.107
Theoretically, the PARAFAC model has a unique solution compared
to bilinear methods, which solutions suffer from rotational freedom.
However, scaling and centring are not as straightforward in the
multiway case.

Calibration
Since qNMR has the property of being a primary method of
measurement, no calibration is needed as long as we have only
random noise. However, if we are uncertain about the randomness
in the noise or we do not wait for full relaxation, the equal molar
ratio between peaks will not be valid and a calibration is required (as
discussed above in chapter 3).
The simplest form of calibration is to measure one sample for
which we have all the information, typically concentration. From
this sample we get a relationship (b) between the signal (x) and
the information (y), from which an unknown sample (yi) could be
predicted when we have new measured data (xi):
yi = b ⋅ xi (22)
This is suitable when we have samples to determine in a known,
rather limited, concentration range.

Chemometrics | 55
To cover a larger concentration interval, several samples (x) are
required in a calibration model:
y=b⋅x+e (23)
e will include both measurement and model error, and b is estimated
by least-squares regression (24) and is called the regression vector.
, (24)
where x = (x1, x2,... , xi,... , xn)’ and y = (y1, y2,... , yi,... , yn)’.
qNMR calibration modelling including area determinations,
i.e. univariate calibration, is exemplified in paper II and
references.43,44,53
NMR peak area determinations are easily performed with consistency
if the peaks are singlets and have no other peaks in their immediate
surroundings. However, if the peaks suffer from interferences or
if singlets have to be compared with multiplets, integration is a
more difficult task. A very complex spectrum, such as the NMR
spectra of biofluids, may also lead to difficulties in sorting out the
important signals. To exploit the full information content of NMR
data, including hundreds of overlapping signals, multivariate data
analysis methods are preferred.
The use of methods for multivariate calibration methods have
increased with the desire to solve increasingly difficult problems
and complex data. Naturally this goes hand-in-hand with the
increased computer performance. A simple method for multivariate
regression is multiple linear regression (MLR), with the estimate of
b, according to:
(25)
where X = (x1, x2,..., xi,... , xn), in accordance with equation (24)
above.
MLR may also be performed on PCA scores (T) on y, which is called
principal component regression (PCR). b is then estimated as:

(26)
The advantage of PCR over MLR is that it involves no limit in the
number of variables, since in MLR the number of variables must be
less than the number of samples.

56 | Processing and analysis of NMR data


Partial least squares
When the maximum variance in X does not correlate with the
maximum variance in y, PCR will give an unnecessarily complex
and unstable model for calibration. In such a case, it may be more
appropriate to use partial least squares or projection to latent
structures (PLS). The objective function of PLS is to find maximum
covariance between y and X, in contrast to PCR, where the maximum
variance in X is found.
PLS was first invented by Herman Wold in the 1960s109,110 and was
later modified and extended by Svante Wold et al. in 1983,111 when it
also was named partial least squares. This method for regression has
been established by its success in a wide variety of problems.112-114
The details of PLS regression have been thoroughly described in
the literature, and only a short summary follows here.
A vector of weights, w, describing the maximum covariance between
X and y is defined according to:
= X’y/|X’y| (27)
Scores values t are obtained when X is projected onto w
t = Xw, (28)
and the regression coefficient vector is then determined by
(29)
After estimation of the x-loadings vector
p = X’t/(t’t), (30)
the tp’ matrix is subtracted from X and the bt matrix is subtracted
from y, and the procedure is repeated from (27) to extract additional
PLS components.

PLS applied to NMR data


PLS has previously been applied to NMR data46,47,54,55,85-87 and in
the course of the work on all the papers in this thesis, both for
creating calibration models and for explorative analysis. In paper I,
PLS did not perform as well as a calibration model as the applied
neural network, and was therefore not described. In paper II, the
univariate analysis based on integration of peaks performed as well

Chemometrics | 57
as the PLS models. In order to chose as simple a model as possible,
the PLS models were rejected. In paper III, PLS was rejected in
favour of the unsupervised PCA model to avoid the risk of guiding
the latent variables in the direction of a false y variable. Day of
sampling was tried as y, although the kinetics of the drug-induced
metabolic changes were unknown. Papers IV–VI include PLS, both
with y as a variable of time and a measured y variable (LC/MS),
and as a discriminant analysis (DA) method.112 There is, however,
a risk, especially with the PLS-DA models, in such analyses. The
PLS model may appear to perform really well if only one or a few,
e.g. misaligned peaks or noise, discriminate between the samples,
though the actual model performance is poor. This risk must be
minimised by studying loadings and by a proper validation.
N-PLS was developed by Bro 1996115 as an extension of PLS to
higher order data. It was used in paper VI, making it possible to
utilise the structure of original data and avoid unfolding.

Nonlinear regression
Linear multivariate regression methods like PCR and PLS may to
some extent explain nonlinearities if a sufficient number of latent
variables are selected or the x and/or y block are transformed or
expanded. However, a better alternative may be to use a nonlinear
model. There are examples of nonlinear PLS models where the inner
relation between x and y scores is defined as nonlinear (a quadratic
polynomial).116 A more common way of modelling nonlinear data
is probably neural networks (NN), which was employed in paper I,
where a nonlinear relationship was observed. The first application
of NN to NMR data was demonstrated in 1989 by Thomsen and
Meyer,117 followed by, among others, Amendolia et al.118 A short
description will be given here, and further reading about NN can be
found in the literature.119
A multilayer feed-forward NN is composed of a highly interconnected
mesh of nonlinear and linear computing elements. The fundamental
processing element of an NN is the neuron. Neurons are connected
by links, and there is a coefficient (or a weight) associated with each
link. The neurons receive external inputs (first layer) or inputs from
other neurons and perform a weighted sum of these inputs. The
neurons process the resulting signal with a transfer function and

58 | Processing and analysis of NMR data


then produce an output passed on as an input to other neurons or as
an output from the entire model. This is illustrated in Figure 15.

Figure 15: A sketch of a neural network with three input data,


two neurons in the hidden layer and one output data. Each line
represents a weighting and each circle is a neuron defining a
transfer function.

A process known as training or learning establishes the values of


the weights. Output vectors ( ) are calculated from a set of input
vectors (x) for which y is known a priori. The weights are modified
based on the difference y- . As this process is repeated, the weights
gradually converge to values that transform each input pattern to
an output pattern close to its target (y). Any type of variation in
the inputs may be accounted for by including them in the training
process.

Data preprocessing
A common denominator of all of the multivariate methods is
the necessity of proper data preprocessing prior to data analysis.
Several approaches are available that deal with undesired spectral
artefacts, e.g. multiplicative scatter correction (MSC)120 for baseline
correction, orthogonal scatter correction (OSC)86,121 for removal of
unwanted variances and corrections for relative intensity variations
such as mean-centring (xij-xjmean, for variable j and sample i), unit
variance scaling (xij/xistd, where xistd is the standard deviation of xi),
and normalisation (which could be performed in various ways).
Mean-centring is performed prior to multivariate analysis to centre
the new data representation on the origin of coordinates since we
are generally not interested in the mean level of the variables, but
the variation around the mean. Unit variance scaling is generally not
suitable for spectral data since noise is scaled to be as important as
peaks, which may be risky, especially when PLS is applied due to the
risk of overfitting.

Chemometrics | 59
Normalisation is actually worth a chapter of its own, but will only
briefly be mentioned here. It must be performed with care since
normalising to the wrong standard could introduce erroneous
trends into data. In papers III–VI, spectra are normalised to equal
area in order to deal with concentration differences in the urine
samples. Also, normalisation with reference to one peak as an
internal standard (e.g. the TMSP55 or creatinine peak) appears in the
literature.
Furthermore, peak alignment is an important preprocessing tool in
multivariate analysis of NMR data. It is employed in all the papers in
this thesis and the subject is discussed in detail in chapter 6.

Validation
One may make an explorative model to search for trends or
groupings in data and use no validation at all, as is done in papers
III, IV and VI in this thesis. This is, however, a somewhat “quick
and dirty” approach. A proper validation of a model (explorative
or for prediction) is preferred. The purpose of the validation is to
derive estimates of the accuracy of the validity or predictions from
the model.
An appropriate method of validation is, prior to any data analysis,
to choose a (representative of the population) test set to be put aside.
The remaining samples, the training set, are then used to determine
the model parameters, e.g. the number of components in the model,
variable selection, number of learning cycles in a NN training, as well
as preprocessing and scaling of data. The test set is then predicted
or projected onto the optimised model to estimate the generalisation
error. This procedure was used in papers I, II and V.

Model selection
The optimisation of a model could be guided by cross validation.122
The data set (here the training set) is divided into a number of subsets.
Each subset is predicted onto a model built from the other subsets.
When all the subsets have been predicted, the generalisation error
of the model can be estimated. The subsets must be picked with
care, duplicates must be in the same subset, and the subset must not
represent all samples from one single corner of the sample domain.
An example of this type of optimisation is shown in paper V.

60 | Processing and analysis of NMR data


If the number of subsets equals the number of samples, this is called
leave-one-out cross validation. This could give a good estimate of
the error of prediction if the samples are too few to comprise a test
set and the samples are well spread in the domain, as in paper II. In
paper I, an “internal test set”, here called a validation set, is used for
the optimisation of the neural net parameters. Additional validation
methods for optimisation can be found in the literature.102

Objective function
We may have different performance objectives for the model
depending on the situation. Most often, the error of prediction is
to be minimised. Root-mean-square error of prediction (RMSEP),
equation (31), is a suitable and common measure (papers I–II).

, (31)

where is the predicted concentration, y is the measured


concentration and J represents the number of predictions.
Another measure of quality may be a measure of class separation.123
In papers IV–VI we have used a measure developed by M. Åberg,124
illustrated in Figure 16. In the scores vectors space, each class was
approximated by a bivariate normal probability distribution. The
boundary between a pair of classes was defined as the hypersurface
on which the probability densities of the two classes were equal.
The measure of class separation was calculated as the minimum
Mahalanobis distance125 between the class boundary and the class
centres.

Figure 16: A geometrical interpretation of the measure of class


separation where x and o represent two classes, the line between
them represents equal probability density and the arrow shows the
measure of class separation.

Chemometrics | 61
62 | Processing and analysis of NMR data
6 Peak alignment

NMR peak area and position give detailed chemical and quantitative
information. Peak shifts and differences in peak shapes may,
however, also arise from variations in experimental conditions
such as temperature variations and inhomogeneities in the applied
magnetic field or differences in the background matrix of the
sample.81,87 The pH of the sample is a major source of variation
in peak positions,126 and even if the samples are buffered, small
pH differences will be detected.42,92,95,127 In biofluid samples the
background matrix also reflects an individual variation depending
on the metabolism.82,128 These factors are, however, most often not
what we want to measure.
From an ideal multivariate analysis point of view, any NMR peak
originating from identical nuclei should have an identical position
and shape between samples. If this is not the case, a multivariate
model will be unnecessarily complex and more difficult to interpret.
Consequently, the usefulness of the results achieved will decrease.
Examples of problems and artefacts originating from misaligned
NMR signals have previously been discussed in papers III–IV and
in references.87,89,126,129,130

Peak alignment in NMR spectrometry


To overcome this NMR peak shift problem in combination with
pattern recognition techniques, different approaches have previously
been suggested. Cloarec et al. 88 have recently presented the use of
orthogonal projection to latent structures (O-PLS)131 discriminant
analysis to handle peak shift problems. The predominant method
in the literature is, however, integration of predetermined spectral
segments, i.e. bucketing, typically 0.04 ppm wide.86,91,92,95,96 The
bucketing approach deals with the peak shift problem but ruins

Peak alignment | 63
the spectral resolution of the acquired data, typically the spectral
resolution decreasing from 6,500 data points/ppm to 28 data
points/ppm. This makes multivariate detection of small variances
(peaks) impossible if they are confounded in the same bucket as
larger variances. Hence, the interpretation of a bucketed data model
will not reveal the specific peaks responsible for the clustering, as is
shown, for example, in paper III and by Cloarec et al.88
Also, more refined alignment methods (than bucketing) for NMR
data have been reported. One example is partial linear fit (PLF), a
method outlined by Vogels et al. 1996.47 PLF automatically picks out
segments in an NMR spectrum of size d and shifts them s points
left and right. Every possible and relevant combination of d and s is
tried until the sum of squared differences between the spectrum and
the target for alignment is minimised. Another approach, outlined
by Brown and Stoyanova 1996,129 performs automatic removal of
both frequency and phase shifts in NMR spectra by using PCA
to determine the misalignment in a single peak appearing across a
series of spectra. This method has been extended and applied to in
vivo NMR spectra,132 and has been further developed and applied to
high-resolution spectral data.130
As a contribution to this field of research, a method of peak
alignment by means of a genetic algorithm (PAGA) was developed
in Paper III. It is an automated segment picking method including
both linear interpolation and shifting of segments, optimised by
a genetic algorithm with the correlation coefficient as objective
function. This algorithm was speeded up by Lee and Woodruff 133
by replacing the genetic algorithm by a beam search algorithm, and
was further analysed and compared to peak alignment by reduced
set mapping (PARS) in paper IV.
PARS was developed by Torgrip et al.134 The method relies on peak
detection and converts spectra into sparse vectors consisting of
zeros and peak intensities corresponding to peak maxima. This peak
alignment is performed by a fast tree-search algorithm with early
pruning of non-optimal alignment solutions.

64 | Processing and analysis of NMR data


Peak alignment for other analytical techniques
There are also several examples of alignment of first-order data,
which have not yet been applied to NMR data. Some of them will
be mentioned here.
Dynamic time warping (DTW) is an application of dynamic
programming which has been used for aligning profiles from
industrial batch processes of polymerisation135 and GC/FT-IR/MS
chromatograms,136 although it has its origin in speech recognition.137
The signals are warped nonlinearly, point by point, to fit a reference
profile and the cumulative distance (e.g. Euclidean, equation (35))
between points is optimised. Additional methods developed from
DTW have been presented for chromatograms: one peak matching
algorithm relying on peak detection and local warping (LW)138 and
one parametric time warping method (PTW).139
Piecewise linear correlation optimised warping (COW)140 has been
applied to GC and LC data.141,142 This algorithm selects predetermined
spectral segments of equal length and by linear interpolation stretches
and shrinks them to fit a reference spectrum. The correlation
coefficient is optimised. A method for peak alignment very similar
to COW, but with sideways shifting instead of interpolation, is
piecewise alignment (PWA).123 Another segmentwise peak alignment
approach based on genetic algorithm optimisation has been applied
to LC chromatograms on peptide maps.143
Walczak and Wu recently presented a method for fuzzy warping
(FW) of chromatograms relying on only the main peaks.144 They also
compared the performance and processing time for several different
methods of peak alignment referred to above: FW, COW, PAGA,
PTW and DTW. PAGA (from paper III) performs really well in
comparison to the others, although it is relatively time-consuming
in some cases.

Profile alignment or peak alignment


Some of the methods mentioned rely on peak detection in some
way (PARS, FW and LW) or include just one peak at a time. When
a peak list is created, no noise is present and unit variance scaling
may successfully be applied as in paper IV, for instance. This and the
data reduction are some of the advantages of peak detection. The
disadvantage is the uncertainty in defining the peaks.

Peak alignment | 65
The data set from the urine sample 1H-NMR analysis studied in
this thesis has a lot of line-width variations due to, for example,
shimming problems. An example is shown in Figure 17, which
illustrates the difficulty of peak detection relying on a peak shape
definition in these spectra. In order to avoid detection of peaks,
methods relying on alignment of profiles are preferred, e.g. DTW,
PTW, COW, PLF, PWA and PAGA.

3.22 3.2 3.18 3.16 3.14 3.12 3.1 3.08 3.06

3.2 3.18 3.16 3.14 3.12 3.1 3.08 3.06 3.04


δ( 1 H)/ppm
Figure 17: Example of peak shifts and peak shape distortions due
to varying background matrix and shimming between 1H-NMR
spectra from rat urine samples. Black lines represents samples
from control rats and grey lines represent samples from dosed
rats (days 7-14). The upper figure shows unaligned spectra and the
lower figure shows spectra aligned by PAGA.

66 | Processing and analysis of NMR data


Target for alignment
In first-order real data (one vector per sample), there are no data
to guide or validate a peak alignment as in second-order data (one
matrix per sample). Hence, a target must be chosen – it could be one
spectrum, although various alternatives are possible.
When the same target is used for all the spectra in a data set,
differences between groups of spectra will easily be detected in
the same score plot from a multivariate analysis, assuming that
peaks appearing from the same substance appear at the same
position in all spectra. An unknown sample would be aligned to the
previously chosen target and projected into the multivariate model
space (loadings) for visual interpretation. This is the situation in
papers III–IV and references,47,123,134,135,138-140,144,145 where one target
spectrum is chosen and assumed to reflect all peaks of interest.
Also a combination of targets may well be employed, such as the
sum of two spectra (paper II) or a mean spectrum.138 For the PARS
method,134 there is also a possibility of using a recursively updated
target, where new (unaligned) peaks appearing in aligned spectra are
added to the target (paper IV), or a dendogram alignment scheme
where a root target is created.146
When interpreting data with defined class associations, this
information could be used. All the spectra in one class may be
aligned to one chosen target in its own class.143 The interpretation
with all classes in one multivariate model will not be valid in this
case. There are also examples where the whole data matrix (all
samples in the study) is part of the alignment and no special target
is chosen.129,130,132

Optimisation
An optimisation problem is a computational problem in which the
objective is to find the best of all possible solutions, or more formally,
find a solution in the feasible region which has the minimum or
maximum value of the objective function. There are several solution
methods to optimisation problems. I will bring up and compare
some used in the methods for alignment mentioned above.

Peak alignment | 67
Dynamic programming is an algorithmic technique which
decomposes a multivariate optimisation by evaluating all possible
solutions. This is a “safe” method but could be very computer-
intense if the variables are numerous, as when comparing two NMR
spectra with 65,000 variables each. Dynamic programming is utilised
in COW, DTW or LW, for example.
Genetic algorithms (GAs), used for PAGA (paper III), search the
solution space of a function by simulated evolution.147 A GA usually
starts with a random population of candidate solutions. Each solution
is subjected to a genetic search, which encompasses assignment to a
fitness value according to an objective function, selection of strings
for further replication, and recombination by crossover and/or
mutation. This genetic search runs until a predefined optimisation
criterion is met.
Beam search,148 which was used for alignment by Lee and Woodruff,133
is a heuristic search algorithm that includes an optimisation of best-
first search, where a number of nearly optimal alternatives (“the
beam”) are examined. Initially, a set of likely solutions is created
on the edge of a predetermined search radius and subsequently
evaluated. The algorithm then assigns one or more new candidate
solutions by selecting the k best steps from the current trial solution(s)
where k is the parametric input beam width. For each loop of the
algorithm, the radius is reduced until a stop criterion is met. In
paper IV and the paper by Lee and Woodruff,133 peak alignments by
a genetic algorithm and by beam search are compared. According
to these studies, they perform equally well, but beam search is about
ten times faster.
In optimisation by tree-search algorithms, nodes (e.g. possible peak
matches) are defined from the data structure. The nodes are then
searched level by level until the globally optimal solution is found
(breadth-first search as in PARS).

68 | Processing and analysis of NMR data


Objective function
Several objective functions are mentioned together with peak
alignment techniques. In paper I, the fitness criterion was the sum
of the absolute differences between the actual spectrum (S ) and a
target spectrum (T ):

(32)

where si and ti represents each data point i in spectrum and target


respectively. The correlation coefficient (CC ) was used as a measure for
the goodness of alignment in, for example, COW, PWA, LW and
PAGA (paper III). It is a numeric measure of the strength of linear
relationship between variables and it is calculated by dividing the
covariance (cov) by the product of their standard deviations (σ):

. (33)

The standard deviation (σ) is estimated by the square root of the


variance, the root mean square (RMS) of the deviations from the
average:

, (34)

where J is the number of observations. In COW, PWA and LW, a


Wallis filter140 was also used to minimise the effect of varying peak
heights. This was not used, however, in paper III since we wanted
the larger peaks to guide the alignment.
The root-mean-square error (RMSE) used as objective function
in PLF and PTW, for example, is actually a measure of the mean
Euclidean distances between profiles. The Euclidean distance is
defined as:

(35)

for J variables.

Peak alignment | 69
NMR data compression
Generally, no data compression of a 1H-NMR spectrum is required
to perform the data analysis. Since today’s powerful computers can
handle (quite) large data volumes, the whole data matrix should
not be a problem to cope with. However, special cases when, for
example, neural networks are utilised for calibration, the data need
to be reduced due to heavy computations, as shown in paper I. This
reduction can be performed by more sophisticated methods than
bucketing (mentioned above). Examples in the already referred
literature are PARS134 where a sparse matrix replace the spectra
by peak identification, or wavelet reduction and variable selection
(paper I). In paper IV, data compressions by bucketing to different
extents after peak alignment are compared. The conclusion was that
no difference in class separation from a PLS-DA model is shown
when a reduction from about 65,000 to 8,000 data points of spectra
is performed.

Discussion
The optimum choice of method for peak alignment will differ
between applications. However, most of the methods mentioned
above could be applied to any data produced by an analytical chemist.
On the other hand, it is not straightforward to put the optimum
parameters into the algorithms. Either the alignment methods rely
on peak detection, which implies some definition of a peak and may
cause problems, e.g. if the peak shapes differ between samples, or
the alignment requires some segment picking, in which optimum
parameters will also differ between applications. One conclusion
may be that the peak picking approach is favourable in samples with
well-defined peaks, appearing similar between samples, and a profile
alignment including segment picking is preferred when peaks or
profiles are not that well defined. Point-by-point alignments such
as DTW are, of course, also possible. However, for NMR data,
consisting of around 60K data points, the available methods are
very computer-intense.
Another parameter to choose in alignment tasks is the objective
function, which should be chosen to fit the samples, either the
largest intensities being allowed to guide the alignment (CC (paper
III) or RMSE47,139,145) or each peak being equally evaluated (Wallis
filter140).

70 | Processing and analysis of NMR data


7 1
H-NMR + LC/MS

As described in chapter 4, 1H-NMR spectrometry is a well-established


technique for generating metabolic profiles. Recently, liquid
chromatography/mass spectrometry (LC/MS) has also been used
for the same purpose.149,150 Both techniques have their advantages in
this field of research. For instance, 1H-NMR spectrometry requires
minor sample pretreatment, while LC/MS in general possesses a
higher sensitivity. 1H-NMR spectrometry may detect compounds
that are not retained on the LC column or not ionisable in the mass
spectrometer. LC/MS may detect compounds that are present in a
concentration below the LOD for 1H-NMR spectrometry.
The rat urine samples described in chapter 4 have been analysed
both by 1H-NMR (paper III) and by LC/MS.149,151 When comparing
the data from the two techniques, it is most likely that they generate
complementary information. A few previous metabolic studies
have analysed both LC/MS and NMR spectrometry for analysing
the same samples to verify and complement results of the other
technique.93,94,97 The data sets have then been studied separately. In
papers V and VI, we wanted to analyse the two data sets within the
same chemometric model. This was carried out by data fusion and
data correlation.

Original data
These data analyses started with the data schematically represented
in Figure 18. The LC/MS data consisted of a 95 variable peak list
for each sample, pretreated by curve resolution, peak alignment,
log10 transformation and mean centring. The 1H-NMR data were
pretreated by peak alignment according to PAGA (but with the
beam search optimisation algorithm), data reduction by segmentwise
integration, normalisation to equal area and mean centring, resulting

1H-NMR + LC/MS | 71
in 8,192 variables for the data fusion and 2,048 variables for the data
correlation. The data pretreatments are further described in papers
V and VI.

Figure 18: Original data.

Data fusion
In the field of chemometrics the number of possible operations
and combinations of method applications to data are numerous.
In paper V, the aim was to examine different possibilities of
performing data fusion of the measured LC/MS and 1H-NMR data.
Data fusion is defined as a method that combines information from
different sources to produce a single model or decision.152 There
are a few examples of data fusion in the literature153,154 showing that
this is not straightforward. Our hypothesis for this work was that
the classification of the samples would be improved by using data
fusion, making the diagnosis (i.e. pattern recognition) of the studied
phenomena more reliable.
Concatenation, full hierarchical modelling, batch modelling and
outer product analysis were the techniques examined for the fusion
of LC/MS and 1H-NMR data in papers V and VI. Various scalings of
the variables depending on the standard deviation and/or the mean
of each variable and two types of block scaling were tried, and the
possibilities of reaching an enhanced classification were explored.
Of course, various types of other data fusion methods, scalings
and multivariate analyses are possible; however, the optimum data
treatment must be tried out for each new case.

72 | Processing and analysis of NMR data


Data concatenation
Data concatenation (Figure 19) is denoted as data-level (low-
level) fusion according to definitions in the literature.152,155 It is
straightforward, but most certainly requires appropriate block
scaling156-158 to prevent one block of data from being numerically
dominant. In paper V, a number of different scalings of the
concatenated data were evaluated prior to mean centring and
multivariate analysis (MVA) – PCA or PLS.

Figure 19: Data concatenation.

Hierarchical modelling
Hierarchical modelling157,159 was also tried in paper V. It is a way
of modelling data at two levels: one lower level and one higher
level. The lower-level modelling is performed on separate blocks of
the data set (LC/MS and 1H-NMR in this case), while higher-level
modelling is performed on the concatenated and scaled scores from
the lower-level, Figure 20. Different combinations of PCA and PLS
were tried. This type of data fusion has previously been termed
decision-level (high-level) fusion in the literature.152,155

Figure 20: Hierarchical modelling.

1H-NMR + LC/MS | 73
Batch modelling
Batch modelling85,160 is a special case of hierarchical modelling in
which every batch (rat in paper V), is modelled with respect to one
variable (time in paper V) in the lower-level modelling. Prior to the
higher-level modelling, the scores from each rat and each technique
at the lower level are concatenated and scaled (Figure 21). Different
combinations of PCA and PLS were tried.

Figure 21: Batch modelling.

Outer product analysis


Outer product analysis (OPA) is another possible method of fusing
two different data sets in order to determine the relations that exist
between the two types of signals.161 The outer product matrix for
two vectors is defined as all possible product combinations of
the vector variables (Figure 22). In paper VI, this data fusion was
performed samplewise, resulting in one matrix per sample.

Figure 22: Outer product analysis.

74 | Processing and analysis of NMR data


Data correlation

Correlation scaling
Paper VI shows the potential of using prior knowledge, in this
case identified metabolites, in the data analysis. 1H-NMR variables
correlated with drug metabolites variables in LC/MS151 were
identified by correlation scaling and PARAFAC analysis. The
correlation scaling was performed by multiplying the NMR variables
by the correlation coefficient (33), scaled to the range [0 1], between
the time trajectory of each 1H-NMR variable and the LC/MS
variables from the identified drug metabolites. These identified 1H-
NMR signals (most likely arising from drug metabolites) were then
eliminated from 1H-NMR data. The drug metabolite variables in
LC/MS were also eliminated, and the two data sets were fused by
OPA and again analysed by PARAFAC. The data fusion step was
required to achieve satisfactory class separation. The PARAFAC
analysis gave rise, of course, to new 1H-NMR variables, separating
the classes of dosed and control rats. The kinetics of those variables
differed from the kinetics of the drug metabolites and they were thus
identified as potential endogenous markers for phospholipidosis.
Some of the peaks could be identified from the literature (see paper
VI). An example is shown in Figure 23.

1H-NMR + LC/MS | 75
14
10
7
day of 3
dosage
1
-5
7.52 7.44
7.66 7.59
7.80 7.73
δ( 1 H)/ppm

Figure 23: A time profile of a 1H-NMR spectral segment. The


peak at 7.75 ppm was selected in the correlation scaling analysis. It
grows with time in the same way as a drug metabolite. The peaks
at 6.64 and 7.56 ppm show a decrease (minimum at day 3) and then
increase with time, these peaks being selected in the PARAFAC
analysis after the elimination of data correlating with identified
drug metabolites and being identified as hippurate.

76 | Processing and analysis of NMR data


PLS
PLS was also used to find correlating variables in paper VI. There
the 1H-NMR data were chosen as X and selected LC/MS variables
identified as drug metabolites were chosen as y. The loadings from
PLS (unfolded data) and N-PLS showed the variables separating the
sample classes and correlating with the chosen LC/MS variables.
These results agreed with the correlation scaling and PARAFAC
analysis. However, problems occurred in the next step (with
eliminated metabolite variables), when the separation of 1H-NMR
data was not satisfactory, and no data fusion was possible since the
LC/MS data were used as y data.

Discussion
In paper V, data fusion by full hierarchical modelling and
concatenation of data generally improved the classification, especially
for the unsupervised models (PCA). This indicates that we may have
some complementary data between the two data sets. The batch
modelling, however, was not very successful in this study.
In paper VI, too, the separation between dosed and control classes
was improved by data fusion, although this time OPA was utilised
for the purpose. The large-scale use of data fusion in general was
discovered in this work when some dominating peaks had been
removed and the data fusion was actually needed to discriminate
between the different groups of samples.
It is not possible to choose the best data fusion method on the basis
of these studies. The four methods presented will be successful to
different extents in different cases. The scaling of data is of great
importance and should be optimised by a proper validation, as
exemplified in paper V.
The data correlation methodology in Paper VI shows a lot of
potential in the interpretation of metabolic data. More variables
could be selected in both steps of the analysis and hence give more
signals for interpretation. Moreover, further analysis of the LC/MS
data still remains to be done.

1H-NMR + LC/MS | 77
78 | Processing and analysis of NMR data
8 Conclusions

Determination of low-level components by qNMR in organic and


biological samples is very easy. The sample is simply dissolved in a
deuterated solvent, perhaps buffered to reduce pH variations, put in
an NMR sample tube and placed in the magnet. A lot of standard
pulse programs have already been prepared, ready for use, in the
modern NMR instruments.
However, to get good spectral resolution and quantitative information
out of an NMR spectrum, the instrumental parameters have to be
carefully chosen. Knowledge of the influence of the sample matrix,
relaxation processes and the NMR instrumentation will be of great
importance. The optimal parameter settings must be selected for
each new type of sample. The work of this thesis includes only
the initial step to enable qualified guesses to be made for the basic
starting parameters in a qNMR experiment.
The evaluation of qNMR data has been the subject uppermost in
my thoughts during this work. Classical data pretreatments (zero
filling and line broadening) were tried, together with more unusual
and novel ones. The conclusions from these studies were that the
classical methods (with typical setups) had very little influence
on the quantitative results (papers I–II), while the non-standard
methods (convolution by a triangular function and peak alignment)
significantly improved the evaluation of qNMR data (papers II–IV).
Extensions in the field of chemometrics were explored by fusion
of data from the various analytical techniques 1H-NMR and LC/MS
(papers V–VI). Data fusion was shown to be successful when
more detailed studies in the interpretation of 1H-NMR data were
performed by means of correlation with LC/MS data.

Conclusions | 79
80 | Processing and analysis of NMR data
9 Future perspectives

Of course, several additional parameter setups, data pretreatments,


and calibration models are possible in the field of qNMR. Experience
from the work carried out on this thesis provides me with even more
motivation to tackle new qNMR problems and try new methods. If
I get the chance to proceed with this work, my immediate efforts
would include further work on the profile/peak alignment problem
and nonlinear modelling, which may handle peak shifts, incomplete
nuclear relaxations and nonlinear relations between samples.
Moreover, the project of data fusion has just been initiated and a lot
of scaling and methods remain to be investigated.
In order to extract the maximum information out of a set of samples,
the noise must be minimised and the resolution maximised. As the
time devoted to experiments has taken up a very small part of the
total time in the research projects presented, an increase of more
transients in favour of S/N would be beneficial. More and shorter
pulses would probably have given a lower LOD in papers I–II. For
the metabonomic samples in papers III–VI, an improved spectral
resolution, in particular, would have been desirable. This will require
improved instrumental facilities such as the sample probe and
improved shimming. This would facilitate and improve the final
interpretation.
Metabonomic sample collection (which is really outside my subject)
may also be an area for improvement. More frequently collected
samples would, for example, give a more reliable estimate of the
kinetics of a metabolite or biomarker (paper VI).
In the future, the metabolic profiling analysis will perhaps have a
more important role to play in toxicology, enabling classification of
the type of toxicity. This will probably require more extensive data

Future perspectives | 81
analysis than the most straightforward types of analysis. In papers
III–VI, the relevant goal is to find a biomarker time pattern for
a specific type of toxicity, namely phospolipidosis. This task (and
many others) still remains to be solved and presents a challenge.
More than one analysis technique will most likely be needed to give
a clear and significant diagnosis or classification. This will require
appropriate methods of data fusion. Furthermore, metabolomics,
including metabolic profiling, will most likely be of greater
significance in the area of systems biology in the future, requiring
data fusion techniques.

82 | Processing and analysis of NMR data


Acknowledgements

Jag vill hjärtligt tacka alla Ni som direkt eller indirekt hjälpt och stöttat mig
under min tid som doktorand på Institutionen för Analytisk kemi (IAK).
Det har varit roligt att gå till jobbet varje dag.
Sven, tack för din oändliga entusiasm när det gäller forskningen, det
smittar av sig. Få har sån koll på vad som är rätt och hett, det har varit ett
privilegium att få ha dig som handledare.
Bosse, tack för att du visat både vetenskapligt och personligt engagemang,
trots att du inte har varit min huvudhandledare.
Tack, alla ni doktorander före mig, kursassistenter och inspiratörer: Åsa,
Gunnar, Jonas B, Kakan, Ludde, Magnus A, Bettan, Anders Ch, Joanna,
Petr, David och Eva.
Och TACK alla ni samtida doktorander: Helena I (för ett fantastiskt
samarbete i Bergen, på Smyril, på Färöarna, i källaren, i Ume och i
Tänndalen – high five!), Malin (för alla allvarliga och lättsamma prat och ditt
smittande skratt), Magnus Å (för alla bra diskussioner, mer och mindre
vetenskapliga), Stina C.B. (för trevligt rumssällskap och handledning på
kurslab), Leila (tack för att du hjälper mig att behålla mina vänner – och för
att du dansar så bra!), Helena H (för att du är en lyssnare och en toppen-
träningskompis), Petter (tack för allt tjôt om JAVA, barn, riskokare, brödbak
och locket på eller av), Stina M (för din humor och för att du är så bra på
danska – va fäen Määts!), Magnus E (för att du kan vara både arg och glad),
Johanna (för ditt lugn och din attityd – det är en konst att leva i nuet),
Sindra (för att du hjälpte mig överleva en grundkurs som assistent), Ove
(för en helt underbar kajak-tur och för att du är en rättskaffens), Yvonne
(som ju tydligen var doktor hela tiden, fast bara inte på pappret), Ragnar
(tack för gott samarbete, och lunchsällskap i bögparken), Thorvald (för att
du är så sträng mot studenterna men så barnkär), Ralf (för alla diskussioner
som inte leder nånstans, och för alla de som leder nånstans), Kent (för
bugg-kursen), Nana (för allt dagis-prat), Leo (för att du fortsätter att jobba
för jämställdheten på IAK – och aligna NMR data), Lina (tack för din
oblyghet och glada humör), Christoffer (den perfekta studenten), Axel (för
det tyska popen), Helena B, Bodil, Cristina S and Sharam.

Acknowledgements | 83
Tack alla ni andra som gör/gjorde det så trevligt på IAK: Jonas R (för att
du så väl har balanserat upp mitt ”bråttom” med ditt lugn, du är ovärderlig),
Anders Co (för att du är så mån om att stämningen skall vara god på
IAK), Anita (för trevligt samarbete på kurs), AnneMarie (för allt du alltid
fixar), Carlo (för vin och kulinariska upplevelser), Håkan (för äventyrliga
hemresor) Pol (for your whistling), Conny (för att du så vänligt lånar ut din
scanner), Ulrika, Lena, Björn, Roger, Ulla, Ramon, Eva, Rolf och Boven.
Och tack Hillevi på kemibiblioteket för din gränslösa service!
Stort tack Ni på AstraZeneca som har hjälpt mig med allt från teoretiska
problem till att hitta min dator och mina tofflor: Fredrik, Ina, Bengt, Johan,
Ingrid, Joanna med flera. Tack särskilt till Susanne, Anders och Alex; er
NMR-hjälp har varit ovärderlig!
Ett extra tack till Er som har korrekturläst och kommit med värdefulla
åsikter om manus: Magnus, Susanne, Ralf, Bosse, Sven, Jonas R.
Jag vill också tacka för värdefull inspiration från Ume: Henrik, Johan, Suss,
Mattias, Pera m.fl.
Tack alla ni sköna vänner som inte har med kemi att göra: Jenny, Micke,
Camilla, Mathias, Gunilla, Alvin, Eva, Anna, Robert, Karin (tackas extra för
snowboard och surfing), Danne, Jenny, Sandis, Christina, Magnus, Magda,
Ludde, Annika, Peter, Martin, Marie, Ola, Klara.
Jag vll också tacka Ylva, Kjell, Folke (d.ä.), Monica, Tim och Frida för
släktkalas och visat intresse.
Tack också alla trivsamma grannar, uthålliga barnvakter, trevliga
dagisföräldrar och proffsig dagis- och skolpersonal.
Och min närmsta familj: Farmor och Morfar, tack för att ni finns/fanns.
Mamma och Pappa, tack för ert outtröttliga och kravlösa stöd, jag är er
evigt tacksam för det. Sara och Carl med familjer, tack för att ni gör något
helt annat än jag och på så sätt ger mig perspektiv.
Men... ♫♪ Vad vore livet utan Östen, vad vore livet utan Folke, vad vore livet utan
Mina – Ingenting!! – Ingenting!! – Ingenting!! ♪♫ Östen, tack för att du är som
du är, jag älskar dig. Folke och Mina, mina kära solstrålar! Det är nog i alla
fall ni som har hjälpt mig mest i detta arbetet, genom att ni ger mig den
viktiga distansen till allt det här.

84 | Processing and analysis of NMR data


References

1. Szantay Jr, C. and Demeter, A., NMR spectroscopy, Progress in Pharmaceutical


and Biomedical Analysis 4 (2000) 109-145.
2. Al Deen, T.S., Hibbert, D.B., Hook, J.M. and Wells, R.J., An uncertainty budget
for the determination of the purity of glyphosate by quantitative nuclear magnetic resonance
(QNMR) spectroscopy, Accreditation and Quality Assurance 9 (2004) 55-63.
3. Bauer, M., Reproducibility of 1H-NMR integrals: a collaborative study, Journal of
Pharmaceutical and Biomedical Analysis 17 (1998) 419-425.
4. Derome, A.E., Modern NMR Techniques for Chemistry Research, Series ed: Baldwin,
J.E. and Magnus, P.D. Vol. 6. Pergamon, Oxford, UK (1987)
5. Grivet, J.-P., Accuracy and precision of intensity determinations in quantitative NMR,
Data Handling in Science and Technology (1996) 306-329.
6. Malz, F. and Jancke, H., Validation of quantitative NMR, Journal of Pharmaceutical
and Biomedical Analysis 38 (2005) 813.
7. Maniara, G., Rajamoorthi, K., Rajan, S. and Stockton, G.W., Method performance
and validation for quantitative analysis by 1H and 31P NMR spectroscopy. Applications to
analytical standards and agricultural chemicals, Analytical Chemistry 70 (1998) 4921-
4928.
8. Pieters, L.A.C. and Vlietinck, A.J., Applications of quantitative 1H- and 13C-NMR
spectroscopy in drug analysis, Journal of Pharmaceutical and Biomedical Analysis 7
(1989) 1405-1417.
9. Rabenstein, D.L. and Keire, D.A., Quantitative chemical analysis by NMR, Practical
Spectroscopy 11 (1991) 323-369.
10. Bain, A.D. and Lao, L., NMR as a quantitative analytical tool in chemical applications
of isotopes. in: Isotopes in the Physical and Biomedical Science, Ed: Buncel, E. and R.,
J.J., Elsevier Science Publishers B. V., Amsterdam, (1991) pp. 411-429.
11. Becker, E.D., Ferretti, J.A. and Gambhir, P.N., Selection of optimum parameters for
pulse Fourier transform nuclear magnetic resonance, Analytical Chemistry 51 (1979)
1413-20.

References | 85
12. Ernst, R.R., Bodenhausen, G. and Wokaun, A., Principles of nuclear magnetic
resonance in one and two dimensions. Clarendon, Oxford (1987)
13. Field, L.D. and Sternhell, S., Analytical NMR. John Wiley & sons, Chichester
(1989)
14. Fukushima, E. and Roeder, S.B.W., Experimental Pulse NMR: A Nuts and Bolts
Approach. Addison-Wesley Pub. Comp., Massachusetts (1981)
15. King, R.W. and Williams, K.R., The Fourier transform in chemistry. Part 1. Nuclear
magnetic resonance: introduction, Journal of Chemical Education 66 (1989) A213-
A219.
16. King, R.W. and Williams, K.R., The Fourier transform in chemistry. Part 2. Nuclear
magnetic resonance: the single pulse experiment, Journal of Chemical Education 66
(1989) A243-A248.
17. Skoog, D.A., Holler, F.J. and Nieman, T.A., Principles of Instrumental Analysis. 5:
th ed. Saunders College Publishing, Florida (1998)
18. Bloch, F., Hansen, W. and Packard, M.E., Nuclear induction, Physical Review 69
(1946) 127.
19. Purcell, E.M., Torrey, H.C. and Pound, R.V., Resonance absorption by nuclear
magnetic moments in a solid, Physical Review 69 (1946) 37-38.
20. Gueron, M., Plateau, P. and Decorps, M., Solvent Signal Suppression in NMR,
Progress in Nuclear Magnetic Resonance Spectroscopy 23 (1991) 135-209.
21. Braun, S., Kalinowski, H.-O. and Berger, S., 150 and more basic NMR experiments:
a practical course. 2nd ed. Wiley-VCH, Weinheim (1998)
22. Delsuc, M.A. and Lallemand, J.Y., Improvement of dynamic range in nmr by
oversampling, Journal of Magnetic Resonance 69 (1986) 504-507.
23. Cross, K.J., Digital filtering, Data Handling in Science and Technology 18 (1996)
120-145.
24. Glasser, L., Fourier transforms for chemists. Part II. Fourier transforms in chemistry and
spectroscopy, Journal of Chemical Education 64 (1987) A260-A266.
25. Griffiths, L. and Irving, A.M., Assay by nuclear magnetic resonance spectroscopy:
quantification limits, Analyst 123 (1998) 1061-1068.
26. Ernst, R.R., Sensitivity Enhancement in Magnetic Resonance. in: Advances in Magnetic
Resonance, Ed: Waugh, J.S., Academic Press Inc., New York, (1966) pp. 1-135.
27. Lindon, J.C. and Ferrige, A.G., digitisation and data processing in Fourier rtransform
NMR, Progress in NMR spectroscopy 14 (1980) 27-66.
28. Hodginsson, P. and Hore, P.J., Sampling and the quantification of NMR data. in:
Advances in magnetic and optical resonance, Academic press, (1997).
29. Martin, Y.L., A Global Approach to Accurate and Automatic Quantitative Analysis
of NMR Spectra by Complex Least-Squares Curve Fitting, Journal of Magnetic
Resonance. Series A 111 (1994) 1-10.
30. Gesmar, H., Led, J.J. and Abildgaard, F., Improved Methods For Quantitative Spectral
Analysis Of NMR Data, Progress in Nuclear Magnetic Resonance Spectroscopy
22 (1990) 255-288.

86 | Processing and analysis of NMR data


31. Bretthorst, G.L., Bayesian-Analysis. I. Parameter Estimation Using Quadrature NNR
Models, Journal of Magnetic Resonance 88 (1990) 533-551.
32. Kotyk, J.J., Hoffman, N.G., Hutton, W.C., Bretthorst, G.L. and Ackerman, J.J.H.,
Comparison Of Fourier And Bayesian-Analysis Of Nmr Signals. II. Examination Of
Truncated Free Induction Decay Nmr Data, Journal of Magnetic Resonance. Series
A 116 (1995) 1-9.
33. Sibisi, S., Skilling, J., Brereton, R.G., Laue, E.D. and Staunton, J., Maximum
entropy signal processing in practical NMR spectroscopy, Nature 311 (1984) 446-447.
34. Evilia, R.F., Quantitative NMR spectroscopy, Analytical Letters 34 (2001) 2227-
2236.
35. Pauli, G.F., Jaki, B.U. and Lankin, D.C., Quantitative 1H NMR: Development and
potential of a method for natural products analysis, Journal of Natural Products 68
(2005) 133-149.
36. Muller, N. and Goldenson, J., Rapid analysis of reaction mixtures by nuclear magnetic
resonance spectroscopy, Journal of the American Chemical Society 78 (1956) 5182-
5183.
37. Hollis, D.P., Quantitative analysis of aspirin, phenacetin, and caffeine mixtures by Nuclear
Magnetic Resonance Spectrometry, Analytical Chemistry 35 (1963) 1682-1684.
38. Jungnickel, J.L. and Forbes, J.W., Quantitative Measurement of Hydrogen Types by
Intregated Nuclear Magnetic Resonance Intensities, Analytical Chemistry 35 (1963)
938-942.
39. Rackham, D.M., Recent applications of quantitative nuclear magnetic resonance spectroscopy
in pharmaceutical research, Talanta 23 (1976) 269-274.
40. Pauli, G.F., qNMR - A versatile concept for the validation of natural product reference
compounds, Phytochemical Analysis 12 (2001) 28-42.
41. Szantay Jr, C., High-field NMR spectroscopy as an analytical tool for quantitative
determinations: pitfalls, limitations and possible solutions, TrAC Trends in Analytical
Chemistry 11 (1992) 332-344.
42. Nicholson, J.K. and Wilson, I.D., High resolution proton magnetic resonance spectroscopy
of biological fluids, Progress in NMR spectroscopy 21 (1989) 449-501.
43. Ichikawa, M., Nonaka, N., Amano, H., Takada, I., Ishimori, S., Andoh, H. and
Kumamoto, K., Proton NMR Analysis of Octane Number for Motor Gasoline: Part V,
Applied Spectroscopy 46 (1992) 1548-1551.
44. Sarpal, A.S., Kapur, G.S., Mukherjee, S. and Jain, S.K., Estimation of oxygenates in
gasoline by 13C NMR spectroscopy, Energy & Fuels 11 (1997) 662-667.
45. Yang, C.M., Grey, A.A., Archer, M.C. and Bruce, W.R., Rapid Quantitation of
Thermal Oxidation Products in Fats and Oils by 1H-NMR Spectroscopy, Nutrition and
Cancer 30 (1998) 64-68.
46. Lachenmeier, D.W., Frank, W., Humpfer, E., Schafer, H., Keller, S., Mörtter, M.
and Spraul, M., Quality control of beer using high-resolution nuclear magnetic resonance
spectroscopy and multivariate analysis, European Food Research And Technology
220 (2005) 215-221.

References | 87
47. Vogels, J.T.W.E., Tas, A.C., Venekamp, J. and Van der Greef, J., Partial Linear Fit:
a new NMR spectroscopy preprocessing tool for pattern recognition applications, Journal of
Chemometrics 10 (1996) 425-438.
48. Holzgrabe, U., Diehl, B.W.K. and Wawer, I., NMR spectroscopy in pharmacy,
Journal of Pharmaceutical and Biomedical Analysis 17 (1998) 557-616.
49. Lacroix, P.M., Dawson, B.A., Sears, R.W., Black, D.B., Cyr, T.D. and Ethier, J.C.,
Fenofibrate raw materials: HPLC methods for assay and purity and an NMR method for
purity, Journal of Pharmaceutical and Biomedical Analysis 18 (1998) 383-402.
50. Lankhorst, P.P., Poot, M.M. and deLange, M.P.A., Quantitative determination of
lovastatin and dihydrolovastatin by means of 1H NMR spectroscopy, Pharmacopeial
Forum 22 (1996) 2414-2422.
51. Wells, R.J., Hook, J.M., Al-Deen, T.S. and Hibbert, D.B., Quantitative nuclear
magnetic resonance (QNMR) spectroscopy for assessing the purity of technical grade
agrochemicals: 2,4-dichlorophenoxyacetic acid (2,4-D) and sodium 2,2-dichloropropionate
(dalapon sodium), Journal of Agricultural and Food Chemistry 50 (2002) 3366-
3374.
52. Berregi, I., Santos, J.I., del Campo, G. and Miranda, J.I., Quantitative determination
of (-)-epicatechin in cider apple juices by 1H NMR, Talanta 61 (2003) 139-145.
53. Fernández-Ramírez, A., Salazar-Cavazos, M.D.L., Rivas-Galindo, V., Ceniceros-
Almaguer, L. and Waksman de Torres, N., Determination of isoperoxisomicine A1
content in peroxisomicine A1 batches by means of 1H NMR, Analytical Letters 37
(2004) 2433-2444.
54. Ruiz-Calero, V., Saurina, J., Galceran, M.T., Hernández-Cassou, S. and Puignou,
L., Estimation of the composition of heparin mixtures from various origins using proton
nuclear magnetic resonance and multivariate calibration methods, Analytical and
Bioanalytical Chemistry 373 (2002) 259-265.
55. Ruiz-Calero, V., Saurina, J., Galceran, M.T., Hernández-Cassou, S. and Puignou,
L., Potentiality of proton nuclear magnetic resonance and multivariate calibration methods
for the determination of dermatan sulfate contamination in heparin samples, Analyst 125
(2000) 933-938.
56. Keun, H.C., Beckonert, O., Griffin, J.L., Richter, C., Moskau, D., Lindon, J.C.
and Nicholson, J.K., Cryogenic probe 13C NMR spectroscopy of urine for metabonomic
studies, Analytical Chemistry 74 (2002) 4588-4593.
57. Griffin, J.L., Nicholls, A.W., Keun, H.C., Mortishire-Smith, R.J., Nicholson, J.K.
and Kuehn, T., Metabolic profiling of rodent biological fluids via 1H NMR spectroscopy
using a 1 mm microlitre probe, Analyst 127 (2002) 582-584.
58. Hoult, D.I. and Richards, R.E., The signal-to-noise ratio of the Nuclear Magnetic
Resonance Experiment, Journal of Magnetic Resonance 24 (1976) 71-85.
59. Lindon, J.C., Nicholson, J.K. and Wilson, I.D., Direct coupling of chromatographic
separations to NMR spectroscopy, Progress in Nuclear Magnetic Resonance
Spectroscopy 29 (1996) 1-49.
60. Hoult, D.I., Sensitivity optimization [in NMR], Topics in Carbon 13 NMR
Spectroscopy 3 (1979) 16-28.

88 | Processing and analysis of NMR data


61. Ernst, R.R. and Anderson, W.A., Application of Fourier transform spectroscopy to
magnetic resonance, The Review of Scientific Instruments 37 (1966) 93-102.
62. Traficante, D.D., Optimum tip angle and relaxation delay for quantitative analysis,
Concepts in Magnetic Resonance 4 (1992) 153-160.
63. Rabenstein, D.L., Sensitivity enhancement by signal averaging in pulsed/Fourier transform
NMR spectroscopy, Journal of Chemical Education 61 (1984) 909-913.
64. Traficante, D.D. and Steward, L.R., The relationship between sensitivity and integral
accuracy. Further comments on optimum tip angle and relaxation delay for quantitative
analysis, Concepts in Magnetic Resonance 6 (1994) 131-135.
65. Roy, J., Pharmaceutical Impurities -A Mini-Review, AAPS PharmSciTech 3 (2002)
1-8.
66. Fux, P., Determination of trace amounts of polydimethylsiloxane in extracts of chemicals
by proton nuclear magnetic-resonance spectroscopy, Analyst 115 (1990) 179-183.
67. Kestner, T.A., Newmark, R.A. and Bardeau, C., Quantitative analysis of polyethylene
oxide in polyethylene by 1H-NMR spectroscopy, Polymer Preprints (American
Chemical Society, Division of Polymer Chemistry) 37 (1996) 232-233.
68. Lindgren, B. and Martin, J.R., Nuclear Magnetic Resonance (NMR) as a tool for
quantitative determination, Pharmaeuropa 5 (1993) 51-54.
69. Simeral, L.S., Improved Analysis of Minor Species in Mixtures Using Carbon-13
Decoupling in One-Dimensional Proton NMR, Applied Spectroscopy 49 (1995) 400-
402.
70. Soininen, P., Haarala, J., Vepsalainen, J., Niemitz, M. and Laatikainen, R.,
Strategies for organic impurity quantification by 1H NMR spectroscopy: Constrained total-
line-shape fitting, Analytica Chimica Acta 542 (2005) 178-185.
71. Thakur, K.A.M., Kean, R.T., Hall, E.S., Doscotch, M.A. and Munson, E.J.,
A quantitative method for determination of lactide composition in poly(lactide) using 1H
NMR, Analytical Chemistry 69 (1997) 4303-4309.
72. Vogels, J.T.W.E. and Venekamp, J.C., Profiling impurities in Chlomethiazole Edisilate
using proton NMR spectroscopy and multivariate calibration techniques, TNO Report
V97.659, TNO Nutrition and Food Research Institute, Zeist, (1997)
73. Wyszecka-Kaszuba, E., Warowna-Grzeskiewicz, M. and Fijalek, Z., Determination
of 4-aminophenol impurities in multicomponent analgesic preparations by HPLC with
amperometric detection, Journal of Pharmaceutical and Biomedical Analysis 32
(2003) 1081-1086.
74. Lundsted, L.G. and Ile, G., Polyoxyalkylene compounds, United States Patent, no: 2
674 619 (1954)
75. Scherlund, M., Malmsten, M., Holmqvist, P. and Brodin, A., Thermosetting
microemulsions and mixed micellar solutions as drug delivery systems for periodontal
anesthesia, International Journal of Pharmaceutics 194 (2000) 103-116.
76. Berg, M., Djurle, A., Melin, M. and Nyman, P., Process for the purification of aldehyde
impurities, United States Patent, no: 6448371 B1 (2002)
77. Bales, J.R., Higham, D.P., Howe, I., Nicholson, J.K. and Sadler, P.J., Use of high-
resolution proton nuclear magnetic resonance spectroscopy for rapid multi-component analysis
of urine, Clinical Chemistry 30 (1984) 426-432.

References | 89
78. Bock, J.L., Analysis of Serum by High-Field Proton Nuclear Magnetic Resonance,
Clinical Chemistry 28 (1982) 1873-1877.
79. Nicholson, J.K., Buckingham, M.J. and Sadler, P.J., High resolution 1H NMR
studies of vertebrate blood and plasma, Biochemical Journal 211 (1983) 605-615.
80. Goodacre, R., Vaidyanathan, S., Dunn, W.B., Harrigan, G.G. and Kell, D.B.,
Metabolomics by numbers: acquiring and understanding global metabolite data, Trends In
Biotechnology 22 (2004) 245-252.
81. Lindon, J.C., Nicholson, J.K., Holmes, E. and Everett, J.R., Metabonomics:
Metabolic processes studied by NMR spectroscopy of biofluids, Concepts in Magnetic
Resonance 12 (2000) 289-320.
82. Nicholson, J.K., Lindon, J.C. and Holmes, E., “Metabonomics”: understanding the
metabolic responses of living systems to pathophysiological stimuli via multivariate statistical
analysis of biological NMR spectroscopic data, Xenobiotica 29 (1999) 1181-1189.
83. Fiehn, O., Combining genomics, metabolome analysis, and biochemical modelling to
understand metabolic networks, Comparative and Functional Genomics 2 (2001)
155-168.
84. Keun, H.C., Metabonomic modeling of drug toxicity, Pharmacology & Therapeutics
In Press (2005) Available online 26 July 2005.
85. Antti, H., Bollard, M.E., Ebbels, T., Keun, H., Lindon, J.C., Nicholson, J.K.
and Holmes, E., Batch statistical processing of 1H NMR-derived urinary spectral data,
Journal of Chemometrics 16 (2002) 461-468.
86. Beckwith-Hall, B.M., Brindle, J.T., Barton, R.H., Coen, M., Holmes, E.,
Nicholson, J.K. and Antti, H., Application of orthogonal signal correction to minimise
the effects of physical and biological variation in high resolution 1H NMR spectra of
biofluids, Analyst 127 (2002) 1283-1288.
87. Brekke, T., Kvalheim, O.M. and Sletten, E., Prediction of Physical Properties of
Hydrocarbon Mixtures by Partial-Least-Squares Calibration of Carbon-13 Nuclear
Magnetic Resonance Data, Analytica Chimica Acta 223 (1989) 123-134.
88. Cloarec, O., Dumas, M.E., Trygg, J., Craig, A., Barton, R.H., Lindon, J.C.,
Nicholson, J.K. and Holmes, E., Evaluation of the orthogonal projection on latent
structure model limitations caused by chemical shift variability and improved visualization of
biomarker changes in 1H NMR spectroscopic metabonomic studies, Analytical Chemistry
77 (2005) 517-526.
89. Defernez, M. and Colquhoun, I.J., Factors affecting the robustness of metabolite
fingerprinting using 1H NMR spectra, Phytochemistry 62 (2003) 1009-1017.
90. Holmes, E. and Antti, H., Chemometric contributions to the evolution of metabonomics:
mathematical solutions to characterising and interpreting complex biological NMR spectra,
Analyst 127 (2002) 1549-1557.
91. Holmes, E., Foxall, P.J.D., Nicholson, J.K., Neild, G.H., Brown, S.M., Beddell,
C.R., Sweatman, B.C., Rahr, E., Lindon, J.C., Spraul, M., and Neidig, P., Automatic
Data Reduction and Pattern-Recognition Methods for Analysis of 1H Nuclear Magnetic
Resonance Spectra of Human Urine from Normal And Pathological States, Analytical
Biochemistry 220 (1994) 284-296.

90 | Processing and analysis of NMR data


92. Holmes, E., Nicholson, J.K., Nicholls, A.W., Lindon, J.C., Connor, S.C., Polley,
S. and Connelly, J., The identification of novel biomarkers of renal toxicity using automatic
data reduction techniques and PCA of proton NMR spectra of urine, Chemometrics
and Intelligent Laboratory Systems 44 (1998) 245-255.
93. Lenz, E.M., Bright, J., Knight, R., Wilson, I.D. and Major, H., Cyclosporin A-
induced changes in endogenous metabolites in rat urine: A metabonomic investigation using
high field 1H NMR spectroscopy, HPLC-TOF/MS and chemometrics, Journal of
Pharmaceutical and Biomedical Analysis 35 (2004) 599-608.
94. Lenz, E.M., Bright, J., Knight, R., Wilson, I.D. and Major, H., A metabonomic
investigation of the biochemical effects of mercuric chloride in the rat using 1H NMR and
HPLC-TOF/MS: time dependant changes in the urinary profile of endogenous metabolites
as a result of nephrotoxicity, Analyst 129 (2004) 535-541.
95. Potts, B.C.M., Deese, A.J., Stevens, G.J., Reily, M.D., Robertson, D.G. and
Theiss, J., NMR of biofluids and pattern recognition: assessing the impact of NMR
parameters on the principal component analysis of urine from rat and mouse, Journal of
Pharmaceutical and Biomedical Analysis 26 (2001) 463-476.
96. Spraul, M., Neidig, P., Klauck, U., Kessler, P., Holmes, E., Nicholson, J.K.,
Sweatman, B.C., Salman, S.R., Farrant, R.D., Rahr, E., Beddell, C.R., and
Lindon, J.C., Automatic Reduction of NMR Spectroscopic Data for Statistical and Pattern
Recognition Classification of Samples, Journal of Pharmaceutical and Biomedical
Analysis 12 (1994) 1215-1225.
97. Williams, R.E., Lenz, E.M., Evans, J.A., Wilson, I.D., Granger, J.H., Plumb,
R.S. and Stumpf, C.L., A combined 1H NMR and HPLC-MS-based metabonomic
study of urine from obese (fa/fa) Zucker and normal Wistar-derived rats, Journal of
Pharmaceutical and Biomedical Analysis 38 (2005) 465-471.
98. MATLAB 6.5. The MathWorks Inc., Natick, MA (2002)
99. Wold, S., Chemometrics; what do we mean with it, and what do we want from it?
Chemometrics and Intelligent Laboratory Systems 30 (1995) 109-115.
100. Brereton, R.G., Chemometrics: Data Analysis for the Laboratory and Chemical Plant.
John Wiley & Sons, West Sussex, UK. (2003)
101. Eriksson, L., Johansson, E., Kettaneh-Wold, N. and Wold, S., Multi- and
Megavariate Data Analysis, Principles and applications. Umetrics Academy, Umeå
(2001)
102. Martens, H. and Naes, T., Multivariate Calibration. Wiley, Chichester, Sussex, UK
(1991)
103. Jackson, J.E., A user’s guide to principal components. Wiley, New York (1991)
104. Wold, S., Esbensen, K. and Geladi, P., Principal Component Analysis, Chemometrics
and Intelligent Laboratory Systems 2 (1987) 37-52.
105. Pearson, K., On lines and planes of closest fit to systems of points in space, Philosophical
Magazine and Journal B 2 (1901) 559-572.
106. Keun, H.C., Ebbels, T.M.D., Antti, H., Bollard, M.E., Beckonert, O.,
Schlotterbeck, G., Senn, H., Niederhauser, U., Holmes, E., Lindon, J.C., and
Nicholson, J.K., Analytical reproducibility in 1H NMR-based metabonomic urinalysis,
Chemical Research in Toxicology 15 (2002) 1380-1386.

References | 91
107. Bro, R., PARAFAC. Tutorial and applications, Chemometrics and Intelligent
Laboratory Systems 38 (1997) 149-171.
108. Kiers, H.A.L. and Van Mechelen, I., Three-Way Component Analysis: Principles and
Illustrative Application, Physiological methods 6 (2001) 84-110.
109. Wold, H., Estimation of principal components and related models by iterative least squares.
in: Multivariate analysis, Ed: Krishnaiah, P.R., Academic Press, New York, (1966)
pp. 391-420.
110. Wold, H., Soft Modelling by Latent Variables: The Non-Linear Iterative Partial Least
Squares (NIPALS) Approach. in: Perspectives in probability and statistics, Ed: Gani, J.,
Jerusalem Academic Press, Jerusalem, (1975) pp. 117-142.
111. Wold, S., Martens, H. and Wold, H. The multivariate calibration problem in chemistry
solved by the PLS method. in: Matrix Pencils (Lecture notes in mathematics). 1983. Pite
Havsbad, Springer Verlag, Heidelberg.
112. Barker, M. and Rayens, W., Partial least squares for discrimination, Journal of
Chemometrics 17 (2003) 166-173.
113. Geladi, P. and Kowalski, B.R., Partial least-squares regression: a tutorial, Analytica
Chimica Acta 185 (1986) 1-17.
114. Wold, S., Sjöstrom, M. and Eriksson, L., PLS-regression: a basic tool of chemometrics,
Chemometrics and Intelligent Laboratory Systems 58 (2001) 109-130.
115. Bro, R., Multiway calibration. Multilinear PLS, Journal of Chemometrics 10 (1996)
47-61.
116. Wold, S., Kettaneh Wold, N. and Skagerberg, B., Nonlinear PLS Modeling,
Chemometrics and Intelligent Laboratory Systems 7 (1989) 53-65.
117. Thomsen, J.U. and Meyer, B., Pattern Recognition of the 1H NMR Spectra of Sugar
Alditols Using a Neural Network, Journal of Magnetic Resonance 84 (1989) 212-
217.
118. Amendolia, S.R., Doppiu, A., Ganadu, M.L. and Lubinu, G., Classification
and quantitation of 1H NMR spectra of alditols binary mixtures using artificial neural
networks, Analytical Chemistry 70 (1998) 1249-1254.
119. Zupan, J. and Gasteiger, J., Neural Networks for Chemists. VCH Verlagsgesellschaft,
Weinheim (1993)
120. Geladi, P., MacDougall, D. and Martens, H., Linearization and Scatter-Correction
for Near-Infrared Reflectance Spectra of Meat, Applied Spectroscopy 39 (1985) 491-
500.
121. Wold, S., Antti, H., Lindgren, F. and Öhman, J., Orthogonal signal correction of
near-infrared spectra, Chemometrics and Intelligent Laboratory Systems 44 (1998)
175-185.
122. Stone, M., Cross-Validatory Choise and Assessment of Statistical Predictions, Journal
of the Royal Statistical Society 36 (1974) 111-147.
123. Pierce, K.M., Hope, J.L., Johnson, K.J., Wright, B.W. and Synovec, R.E.,
Classification of gasoline data obtained by gas chromatography using a piecewise alignment
algorithm combined with feature selection and principal component analysis, Journal of
Chromatography A In Press (2005) Available online 17 May.

92 | Processing and analysis of NMR data


124. Åberg, M., Thesis: Variance reduction in analytical chemistry, Analytical Chemistry,
Stockholm University (2004)
125. Mahalanobis, P.C., On tests and measures of group divergence. Part I. Theoretical
formulae., Journal and Proceedings of the Asian Society of Bengal 26 (1930)
541-588.
126. Jaumot, J., Vives, M., Gargallo, R. and Tauler, R., Multivariate resolution of NMR
labile signals by means of hard- and soft-modelling methods, Analytica Chimica Acta
490 (2003) 253-264.
127. Rabenstein, D.L. and Isab, A.A., Determination of the intracellular pH of intact
erythrocytes by 1H NMR spectroscopy, Analytical Biochemistry 121 (1982) 423-432.
128. Bollard, M.E., Stanley, E.G., Lindon, J.C., Nicholson, J.K. and Holmes, E.,
NMR-based metabonomic approaches for evaluating physiological influences on biofluid
composition, NMR in Biomedicine 18 (2005) 143-162.
129. Brown, T.R. and Stoyanova, R., NMR spectral quantitation by principal-component
analysis. II. Determination of frequency and phase shifts, Journal of Magnetic
Resonance. Series B 112 (1996) 32-43.
130. Stoyanova, R., Nicholls, A.W., Nicholson, J.K., Lindon, J.C. and Brown, T.R.,
Automatic alignment of individual peaks in large high-resolution spectral data sets, Jounal
of Magnetic Resonance 170 (2004) 329-335.
131. Trygg, J. and Wold, S., Orthogonal projections to latent structures (O-PLS), Journal of
Chemometrics 16 (2002) 119-128.
132. Witjes, H., Melssen, W.J., Zandt, H., van der Graaf, M., Heerschap, A. and
Buydens, L.M.C., Automatic correction for phase shifts, frequency shifts, and lineshape
distortions across a series of single resonance lines in large spectral data sets, Journal of
Magnetic Resonance 144 (2000) 35-44.
133. Lee, G.-C. and Woodruff, D.L., Beam search for peak alignment of NMR signals,
Analytica Chimica Acta 513 (2004) 413-416.
134. Torgrip, R.J.O., Åberg, M., Karlberg, B. and Jacobsson, S.P., Peak alignment using
reduced set mapping, Journal of Chemometrics 17 (2003) 573-582.
135. Kassidas, A., MacGregor, J.F. and Taylor, P.A., Synchronization of batch trajectories
using dynamic time warping, Aiche Journal 44 (1998) 864-875.
136. Wang, C.P. and Isenhour, T.L., Time-Warping Algorithm Applied to Chromatographic
Peak Matching Gas Chromatography/Fourier Transform Infrared Mass Spectrometry,
Analytical Chemistry 59 (1987) 649-654.
137. Vintsyuk, T.K., Element-wise recognition of continuous speech composed of words from a
specified dictionary, Kibernetika 2 (1971) 133-143.
138. Johnson, K.J., Wright, B.W., Jarman, K.H. and Synovec, R.E., High-speed peak
matching algorithm for retention time alignment of gas chromatographic data for chemometric
analysis, Journal of Chromatography A 996 (2003) 141-155.
139. Eilers, P.H.C., Parametric Time Warping, Analytical Chemistry 76 (2004) 404-
411.
140. Nielsen, N.P.V., Carstensen, J.M. and Smedsgaard, J., Aligning of single and
multiple wavelength chromatographic profiles for chemometric data analysis using correlation
optimised warping, Journal of Chromatography A 805 (1998) 17-35.

References | 93
141. Nielsen, N.P.V., Smedsgaard, J. and Frisvad, J.C., Full second-order chromatographic/
spectrometric data matrices for automated sample identification and component analysis by
non-data-reducing image analysis, Analytical Chemistry 71 (1999) 727-735.
142. Pravdova, V., Walczak, B. and Massart, D.L., A comparison of two algorithms for
warping of analytical signals, Analytica Chimica Acta 456 (2002) 77-92.
143. Andersson, F.O., Kaiser, R. and Jacobsson, S.P., Data preprocessing by wavelets and
genetic algorithms for enhanced multivariate analysis of LC peptide mapping, Journal of
Pharmaceutical and Biomedical Analysis 34 (2004) 531-541.
144. Walczak, B. and Wu, W., Fuzzy warping of chromatograms, Chemometrics and
Intelligent Laboratory Systems 77 (2005) 173-180.
145. Westad, F. and Martens, H., Shift and intensity modeling in spectroscopy - general concept
and applications, Chemometrics and Intelligent Laboratory Systems 45 (1999)
361-370.
146. Åberg, K.M., Torgrip, R.J.O. and Jacobsson, S.P., Extensions to peak alignment
using reduced set mapping: classification of LC/UV data from peptide mapping, Journal
of Chemometrics 18 (2004) 465-473.
147. Wehrens, R. and Buydens, L.M.C., Evolutionary optimisation: a tutorial, Trends in
Analytical Chemistry 17 (1998) 193-203.
148. Bisiani, R., Search, Beam. in: Encyclopedia of Artificial Intelligence, Ed: Shapiro, S.C.,
Wiley, New York, (1992) pp. 1467-1468.
149. Idborg-Björkman, H., Edlund, P.-O., Kvalheim, O.M., Schuppe-Koistinen, I.
and Jacobsson, S.P., Screening of Biomarkers in Rat Urine Using LC/Electrospray
Ionization-MS and Two-Way Data Analysis, Analytical Chemistry 75 (2003) 4784-
4792.
150. Plumb, R.S., Stumpf, C.L., Gorenstein, M.V., Castro-Perez, J.M., Dear, G.J.,
Anthony, M., Sweatman, B.C., Connor, S.C. and Haselden, J.N., Metabonomics: the
use of electrospray mass spectrometry coupled to reversed-phase liquid chromatography shows
potential for the screening of rat urine in drug development., Rapid Communications in
Mass Spectrometry 16 (2002) 1991-1996.
151. Idborg, H., Edlund, P.-O. and Jacobsson, S.P., Multivariate approaches for efficient
detection of potential metabolites from liquid chromatography/mass spectrometry data,
Rapid Communications in Mass Spectrometry 18 (2004) 944-954.
152. Liu, Y. and Brown, S.D., Wavelet multiscale regression from the perspective of data fusion:
new conceptual approaches, Analytical and Bioanalytical Chemistry 380 (2004) 445-
452.
153. Roussel, S., Bellon-Maurel, V., Roger, J.M. and Grenier, P., Fusion of aroma, FT-
IR and UV sensor data based on the Bayesian inference. Application to the discrimination
of white grape varieties, Chemometrics and Intelligent Laboratory Systems 65
(2003) 209-219.
154. Steinmetz, V., Sevila, F. and Bellon-Maurel, V., A methodology for sensor fusion
design: Application to fruit quality assessment, Journal of Agricultural Engineering
Research 74 (1999) 21-31.
155. Hall, D.L., Mathematical techniques in Multisensor Data Fusion. Artech House,
London (1992)

94 | Processing and analysis of NMR data


156. Eriksson, L., Johansson, E., Lindgren, F., Sjöström, M. and Wold, S., Megavariate
analysis of hierarchical QSAR data, Journal of Computer-Aided Molecular Design
16 (2002) 711-726.
157. Wold, S., Kettaneh, N. and Tjessem, K., Hierarchical multiblock PLS and PC
models for easier model interpretation and as an alternative to variable selection, Journal of
Chemometrics 10 (1996) 463-482.
158. Workman Jr, J.J., Quantification of LDPE [low density poly(ethylene)], LLDPE [linear
low density poly(ethylene)], and HDPE [high density poly(ethylene)] in polymer film mixtures
“as received” using multivariate modeling with data augmentation (data fusion) and infrared,
Raman, and near-infrared spectroscopy, Spectroscopy Letters 32 (1999) 1057-1071.
159. Westerhuis, J.A., Kourti, T. and MacGregor, J.F., Analysis of multiblock and
hierarchical PCA and PLS models, Journal of Chemometrics 12 (1998) 301-321.
160. Rännar, S., MacGregor, J.F. and Wold, S., Adaptive batch monitoring using hierarchical
PCA, Chemometrics and Intelligent Laboratory Systems 41 (1998) 73-81.
161. Rutledge, D.N., Barros, A.S. and Giangiacomo, R., Interpreting near infrared spectra
of solutions by outer product analysis with time domain - NMR, Magnetic Resonance
in Food Science: a view to the future. - (Special Publication - Royal Society of
Chemistry) 262 (2001) 179-192.

References | 95

View publication stats

You might also like