Professional Documents
Culture Documents
Spectra Basic PDF
Spectra Basic PDF
spectral data.
SWST, Advanced analytical tools for the wood industry June 10, 2007
Data collection
-Local filters
Signal
-Smoothing
W avelength (nm )
-Derivatives
Absorbance
-Baseline correction Raw spectral data
Wavelength
Absorbance
-Orthogonal Scatter Correction (OSC) Corrected spectral
data after MSC
Wavelength
What type of information?
1. Qualitative information
Grouping and classification of spectral objects from samples
into supervised and non-supervised learning methods.
2. Quantitative information
Relationships between spectral data and parameter(s) of interest
2.5
0.5
375 x-variables
0.0
800 1200 1600 2000 2400
Wavelength (nm)
25
25 30 35 40 45 50
Measured cellulose content (%)
50
Multivariate analysis
Measured cellulose content versus Predicted Cellulose content (%) 45
25
25 30 35 40 45 50
Measured cellulose content (%)
Qualitative information
Principal Components Analysis (PCA)
Recognize patterns in data: outliers, trends, groups…
0.8
0.4
PC2 (8%)
0.0
-0.4
-0.8
-6 -4 -2 0 2
PC1 (89%)
Biomass species and near infrared spectra
n Samples
Spectral data
x variables
Transformation of the numbers into pictures by using
principal component analysis (scores)
0.18
PC1
30 PC2
Bagasse
0.12
Intensity (A.U.)
20
0.06
10 Red oak
Corn stover
Yellow poplar
PC2
0 0.00
Hichory
-10
Switchgrass -0.06
800 1200 1600 2000 2400
-20
Wavelength (nm)
-30
-30 -20 -10 0 10 20
PC1
Transformation of the numbers into pictures by using
principal component analysis (loadings)
0.18
PC1
30 PC2
Bagasse 0.12
Intensity (A.U.)
20
0.06
10 Red oak
Corn stover
Yellow poplar
PC2
0 0.00
Hichory
-10
Switchgrass -0.06
-20 800 1200 1600 2000 2400
Wavelength (nm)
-30
-30 -20 -10 0 10 20
PC1
Principal Components Analysis (PCA)
PCA is a projection method, it decomposes the spectral data into a
“structure” part and a “noise” part
X is an n samples (observations) by x variables (spectral variables)
matrix
0
x1
1 variable = 1 dimension
Principal Components Analysis (PCA)
X is an n samples (observations) by x variables (spectral variables)
matrix
x2
x2
x1
x1
x3
2.5
2.0
Absorbance
1.5
1.0
0.5
375 x-variables
0.0
800 1200 1600 2000 2400 2800
Wavelength (nm)
After PC1, next best direction for approximating the original data
The second PC lies along a direction orthogonal to the first PC
PC3 will be orthogonal to both PC1 and PC2 while simultaneously lying along the
direction of the third largest variation.
The new variables (PCs) are uncorrelated with each other (orthogonal)
Scores (T) = Coordinates of samples in the PC space
Original PC space
variable space
Relationship between the original variable space and the new PCs space
There is a set of loadings for each PC (loading vector)
Transformation of the numbers into pictures by using PCA
30
Bagasse
20
10 Red oak
Corn stover
Yellow poplar
PC2
0
Hichory
-10
Switchgrass
-20
0.18
PC1
-30 PC2
-30 -20 -10 0 10 20
PC1 0.12
Intensity (A.U.)
0.06
0.00
-0.06
800 1200 1600 2000 2400
Wavelength (nm)
Quantitative information
Projection to Latent Structures or Partial Least Squares Regression (PLS)
Establish relationships between input and output variables, creating predictive models.
X + Y ⇒ Model X + Model ⇒ Ŷ
PLS can be seen as two interconnected PCA analyses, PCA(X) and PCA(Y)
PLS uses the Y-data structure (variation) to guide the decomposition of X
The X-data structure also influences the decomposition of Y
Biomass composition and near infrared spectra
Samples
35 12
8
30
4
25 Intensity
25 30 35 40 0 45 50
Measured cellulose content (%)
-4
-8
-12
1000 1200 1400 1600 1800 2000 2200 2400
Wavelength (nm)
Validation of the model
to predict cellulose
content in pine
50
Predicted cellulose content (%)
r = 0.95
45 RMSEP = 1.55
40
35
30
25
25 30 35 40 45 50
Measured cellulose content (%)
PLS-Discriminant Analysis (PLS-DA)
A powerful method for classification. The aim is to create a predictive
model which can accurately classify future unknown samples.
-1
Yellow poplar
-2 Hickory 1.2
1.0
-0.6 -0.4 -0.2 0.0 0.2 0.4 0.6
Predicted Y variable
PC1 (92%) 0.8
0.6
0.4
0.2
r = 0.99
0.0 RMSEC = 0.04
-0.2
0.0 0.2 0.4 0.6 0.8 1.0 1.2
Measured Y variable
PLS-Discriminant Analysis (PLS-DA)
Validation of PLS-DA model
PLS-DA Y-reference Predicted Y
Validation model Spectrum 00008 0.0000 -0.0338
Spectrum 00009 0.0000 0.0270
Spectrum 00015 1.0000 0.9340
Spectrum 00016 1.0000 1.0220
1.2
1.0
r = 0.99
RMSEP = 0.04
0.8
Hickory
Predicted Y
0.6
0.4
Yellow
0.2 poplar
0.0
-0.2
0.0 0.2 0.4 0.6 0.8 1.0 1.2
Y reference
Two-dimensional correlation spectroscopy
simplify the
2D correlation tools spread
visualization of
spectral peaks over a
complex spectra
second dimension
Mechanical, electrical,
Perturbation chemical, magnetic
optical, thermal, …
Electro-magnetic 2D
Probe (eg, IR, System correlation
UV, LIBS,…) maps
0.15
2.5
0.10
2.0
Intensity (A.U.)
Intensity (A.U.)
0.05
1.5
0.00
1.0
-0.05
0.5
-0.10
1000 1200 1400 1600 1800 2000 2200 2400 1000 1200 1400 1600 1800 2000 2200 2400
Wavelength (nm) Wavelength (nm)
References
•A user-friendly guide to Multivariate Calibration and Classification; T. Næs, T.
Isaksson, t. Fearn, T. Davies, NIR Publications, Chichester, UK, 2002
•Multivariate calibration, H. Martens and T. Næs, John Wiley & Sons, Chichester, UK,
1989
•Chemometric techniques for quantitative analysis, Marcel Dekker, New York, 1998
•Two-dimensional correlation spectroscopy, I. Noda and Y. Ozaki, John Wiley & Sons,
Chichester, UK, 2004
•Neural Network Toolbox 5 User’s Guide – Matlab, H. Demuth, M. Beale and M.
Hagan.
Questions?