19/11/2017

Pre-processing of images

FOOD, U. Copenhagen, Denmark. www.models.life.ku.dk www.hypertools.org jmar@life.ku.dk

Why pre-processing

Why pre-processing?

Presence of erroneous data values and/or non-informative background or outliers

These anomalous observations can be generated by several sources:

1- The instrument/detector.

2- The geometry of the sample.

3- The radiation


Targets of pre-processing

• To improve a subsequent exploratory analysis

• To improve a subsequent bi-linear calibration

• To improve a subsequent classification model

• To get rid of noise (spatial and spectral noise)

Steps of pre-processing

There are many specific problems, and each must be solved in its own way. The slide figures label examples of them: shape, background, saturation, roughness and edges.

There is NOT a recipe

Try, and let’s see what happens


Many pre-processing methods have “side” effects. Examples:

- Spatial binning also removes surface and spectral noise
- Derivatives also remove noise and baseline drifts
- Etc.

There is ALWAYS A PRICE TO PAY

- You might lose spatial or spectral resolution
- The more you pre-process, the greater the risk of “destroying” information


Procedure of this lesson:

1) I will explain some of the most important pre-processing methods

Most of them, you already know

IMPORTANT: There are many more

2) Afterwards, we will define typical problems and think about which types of pre-processing suit them

3) At the end, we will build full pre-processing procedures

Image compression


Hyperspectral images are huge!

One image of 256 x 256 x 150 contains more than 9 million data points (256 x 256 x 150 = 9,830,400)

Compression means reducing the number of bytes of an image, at a cost:

Loss of spatial information

Loss of spectral information


Image compression methods

Byte encoding. Losing precision in the data

Hyperspectral images can be coded in different number formats

Matlab: only useful for the storage of images. Matlab mostly works with double-precision numbers.

The format of a number is related to numerical analysis and computer architecture.

This is out of the scope of this course, but it is useful for reading data from hyperspectral cameras.

We still need to know the basic and fundamental concepts


Byte encoding. Floating point number

Floating point number: a method for representing real-world numbers in computers

$3.1416 = 31416 \times 10^{-4}$, where 31416 is the mantissa and -4 is the exponent

Matlab can store numbers in different formats


Byte encoding. Some ways of encoding numbers

• Double precision: 64 bits (8 bytes) per number, encoded in binary

• Single precision: 32 bits (4 bytes) per number, encoded in binary

• Uint family (uint8 – uint16 – uint32): unsigned 8-, 16- and 32-bit integers

Integer ranges:

uint8: 0 to 255
uint16: 0 to 65535
uint32: 0 to 4294967295 (about 4.29 x 10^9)

Real numbers are rounded to the nearest integer

Typical example: RGB and grayscale images, which are usually stored as uint8
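The effect of the number format on storage can be checked with a few lines of code. The sketch below uses Python with NumPy instead of Matlab (an assumption made only for illustration). The helper `to_uint16` and the assumption that the data are reflectances between 0 and 1 are hypothetical.

```python
import numpy as np

# Size of the example image from the slides: 256 x 256 pixels, 150 wavelengths
rows, cols, bands = 256, 256, 150
n_values = rows * cols * bands

# Memory needed for the same image in different number formats
for name in ("float64", "float32", "uint16", "uint8"):
    itemsize = np.dtype(name).itemsize  # bytes per stored value
    print(f"{name}: {itemsize} bytes per value, {n_values * itemsize / 1e6:.1f} MB")


def to_uint16(reflectance):
    """Store values assumed to lie in [0, 1] as uint16 (hypothetical helper).

    The values are scaled to 0..65535 and rounded to the nearest integer,
    so some precision is lost.
    """
    scaled = np.rint(np.clip(reflectance, 0.0, 1.0) * 65535)
    return scaled.astype(np.uint16)


# Small random cube to measure the precision lost by the conversion
cube = np.random.default_rng(0).random((4, 4, 5))
stored = to_uint16(cube)
restored = stored.astype(np.float64) / 65535
max_error = np.abs(restored - cube).max()
```

The rounding error per value is at most half a step of the integer scale, which is the price paid for using a quarter of the memory of double precision.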


Spatial binning. Losing spatial resolution:

Used when the spatial resolution is too high (many pixels in a small area)

Calculate the mean (or median) spectrum of a sub-window of the hyperspectral image

The size of the sub-window must be selected (e.g. 2)

[Figure: binning with a sub-window of 2 turns a 10 x 10 pixel image into a 5 x 5 pixel image]
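A minimal sketch of spatial binning, assuming the image is a NumPy array ordered as rows x columns x wavelengths. The function name `spatial_bin` and its parameters are hypothetical, and pixels that do not fill a complete sub-window at the border are simply discarded here.

```python
import numpy as np


def spatial_bin(cube, window=2, statistic=np.mean):
    """Replace each window x window block of pixels by one spectrum.

    cube: array of shape (rows, cols, bands).
    statistic: np.mean or np.median, applied within each block.
    """
    rows, cols, bands = cube.shape
    r, c = rows // window, cols // window
    # Drop border pixels that do not fill a complete block (assumption)
    trimmed = cube[: r * window, : c * window, :]
    # Split both spatial axes into (block index, position inside block)
    blocks = trimmed.reshape(r, window, c, window, bands)
    return statistic(blocks, axis=(1, 3))
```

With a window of 2, the number of pixels drops by a factor of 4 while the number of wavelengths stays the same.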


Consequences:

Reduce the size of a hyperspectral image drastically

Easier to handle and to make multi-image analysis

Helps to reduce roughness of the surface

Helps to reduce spatial noise

Helps to reduce spectral noise

Applicable to multispectral and hyperspectral images

You might lose spatial or even spectral information


Spectral binning. Losing spectral resolution

Used when the spectral resolution is too high (many wavelengths measured)

Calculate the mean (or median) of a spectral sub-window

The size of the sub-window must be selected (e.g. 2)
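The same idea along the spectral axis can be sketched as follows, again assuming a NumPy array ordered as rows x columns x wavelengths. The name `spectral_bin` is hypothetical, and wavelengths that do not fill a complete sub-window at the end of the spectrum are discarded.

```python
import numpy as np


def spectral_bin(cube, window=2, statistic=np.mean):
    """Replace each group of `window` consecutive wavelengths by one value.

    cube: array of shape (rows, cols, bands).
    statistic: np.mean or np.median, applied within each spectral sub-window.
    """
    rows, cols, bands = cube.shape
    b = bands // window
    # Drop trailing wavelengths that do not fill a complete sub-window (assumption)
    trimmed = cube[:, :, : b * window]
    groups = trimmed.reshape(rows, cols, b, window)
    return statistic(groups, axis=3)
```

The spatial dimensions are untouched; only the number of wavelengths is divided by the window size.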


Consequences:

Reduce the size of a hyperspectral image

Easier to handle and to make multi-image analysis

Helps to reduce spectral noise

No big repercussion in the spatial dimension

You might lose spectral information

NOT applicable to multispectral images


Model approach. Losing information (in general)

Application of factorization or wavelets that compress the information into a few main components, leaving the noise in the residuals.

Useful for storage. Before the hyperspectral images are used again, they must be reconstructed.

Two examples:

Factor modelling with PCA

3D wavelets transformation


Factor models. Principal Component Analysis (PCA)

Scores and loadings are stored, together with the explained variance

More PCs than strictly necessary are stored, to be sure that the discarded residuals contain just noise

Example:
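A minimal sketch of PCA-based compression, assuming a NumPy array ordered as rows x columns x wavelengths and a PCA computed by singular value decomposition of the mean-centered, unfolded image. The function names are hypothetical; this is not the HYPER-Tools implementation.

```python
import numpy as np


def pca_compress(cube, n_components):
    """Keep only the scores and loadings of the first PCs of an image."""
    rows, cols, bands = cube.shape
    X = cube.reshape(-1, bands)           # unfold: one spectrum per row
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components]
    # Fraction of the total variance explained by each stored PC
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return scores, loadings, mean, explained


def pca_reconstruct(scores, loadings, mean, shape):
    """Rebuild the image from the stored model; the residuals are lost."""
    return (scores @ loadings + mean).reshape(shape)
```

Storing `scores`, `loadings` and `mean` instead of the full image is what saves space; the image has to be rebuilt with `pca_reconstruct` before it is used again.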


3D wavelet transformations

They combine 2D wavelets for spatial and 1D for spectral dimensions

Daubechies family of orthonormal wavelets

Named from D2 (Haar wavelet) to D20

The index refers to the number of coefficients to be calculated

There is no rule (again) for selecting the type of wavelet. Some advice:

- D12 is more advisable for retaining spatial detail
- The Haar wavelet is more advisable when speed matters

Model approach. Comparison

[Figures: comparison of the model-based compression approaches]

Model approach

Consequences:

Reduce the size of a hyperspectral image drastically

PCA is easy and straightforward

Wavelets must be selected

It is only useful for storage

Images must be reconstructed. You might lose spatial or spectral information

PCA is applicable to both multispectral and hyperspectral images

I have no experience with wavelets and multispectral images.


Background removal


Background removal methods. Regions of Interest (ROI)

The shape of the sample or the distance between elements plays an important role


The background cannot be analyzed:

It contains highly noisy spectra

It hampers the good performance of factor models

It occupies a lot of space!

Methods for removing background: as many as your imagination can come up with

The ones implemented in HYPER-Tools (so far):

PCA-based histograms
PCA score scatter plots
Specific wavelength ratios
Manual selection (mean and PCA-based)
K-means (my favourite)
Addition of morphological operations

FOOD, U. Copenhagen, Denmark. www.models.life.ku.dk www.hypertools.org jmar@life.ku.dk

Background removal

Background removal methods. Regions of Interest (ROI)

Example with K-means:
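A minimal sketch of background removal with K-means, assuming a NumPy array ordered as rows x columns x wavelengths and two clusters. It is a plain two-cluster K-means written for illustration, not the HYPER-Tools routine. It also assumes, loudly, that the sample is brighter on average than the background; the function name `kmeans_mask` is hypothetical.

```python
import numpy as np


def kmeans_mask(cube, n_iter=20):
    """Return a boolean mask that is True for pixels assigned to the sample."""
    rows, cols, bands = cube.shape
    X = cube.reshape(-1, bands).astype(float)
    brightness = X.mean(axis=1)
    # Start from the darkest and the brightest pixel spectra (deterministic)
    centroids = X[[brightness.argmin(), brightness.argmax()]].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance of every spectrum to both centroids
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centroids[k] = X[labels == k].mean(axis=0)
    # ASSUMPTION: the sample is the cluster with the brighter mean spectrum
    sample_cluster = centroids.mean(axis=1).argmax()
    return (labels == sample_cluster).reshape(rows, cols)
```

The spectra of the sample can then be selected with `cube[mask]`, which reduces the amount of data passed on to the following steps.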


Consequences:

Reduce the information to be treated (image reduction)

Many possibilities

Applicable to both multispectral and hyperspectral images

Sometimes we can remove important areas of a sample

This is especially difficult at the edges of the objects

Surface smoothing


Surface noise

Surface noise is the noise produced, normally, by the acquisition system

Not very common in hyperspectral images. Quite common in multispectral images, especially with sensors of different nature

Different types of noise:

Gaussian

Salt and pepper


Gaussian noise

Gaussian noise (normal noise) models the noise generated by the sensors, usually because of a lack of illumination.

The intensity of all pixels is affected


Salt and pepper noise (impulsive noise)

Salt and pepper noise is normally produced in the quantization step of the digitization


Minimization of surface noise. Filters

There are many filters that can be applied to remove noise

Mean - Median – Mode filters

The easiest to apply

1) Select a pixel and its neighbors

2) Calculate the mean, median or mode value

3) Replace the selected pixel with that value
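The three steps can be sketched for a single image band as follows. The function name `neighbourhood_filter` is hypothetical, the border is handled by repeating the edge pixels (an assumption), and only the mean and the median are covered, since NumPy has no built-in mode function.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def neighbourhood_filter(band, size=3, statistic=np.median):
    """Replace every pixel by a statistic of its size x size neighbourhood.

    band: 2D array (one wavelength of the image).
    statistic: np.mean or np.median.
    """
    pad = size // 2
    # Repeat the edge pixels so that border pixels also have neighbours
    padded = np.pad(band, pad, mode="edge")
    # windows[i, j] is the size x size neighbourhood centred on pixel (i, j)
    windows = sliding_window_view(padded, (size, size))
    return statistic(windows, axis=(-2, -1))
```

The median removes an isolated impulsive pixel completely, whereas the mean only spreads it over the neighbourhood, which is why the median is usually tried first on salt and pepper noise.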


Gaussian filters

As simple as the previous ones

1) Select a pixel and its neighbors; the standard deviation of the Gaussian sets the size of the neighborhood

2) Weight the neighbors with a Gaussian function of their distance to the selected pixel

3) Replace the selected pixel with the weighted mean value
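A minimal sketch of a Gaussian filter for a single image band. The names `gaussian_kernel` and `gaussian_smooth` are hypothetical, the kernel is truncated at three standard deviations and the border repeats the edge pixels; both choices are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def gaussian_kernel(sigma, radius=None):
    """2D Gaussian weights, normalised so that they sum to 1."""
    if radius is None:
        radius = int(np.ceil(3 * sigma))  # assumption: cut at 3 sigma
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_smooth(band, sigma=1.0):
    """Replace every pixel by the Gaussian-weighted mean of its neighbours."""
    kernel = gaussian_kernel(sigma)
    pad = kernel.shape[0] // 2
    padded = np.pad(np.asarray(band, dtype=float), pad, mode="edge")
    windows = sliding_window_view(padded, kernel.shape)
    # Weighted sum of each neighbourhood with the kernel weights
    return np.einsum("ijkl,kl->ij", windows, kernel)
```

A larger `sigma` removes more noise but also blurs edges more, which is the loss of spatial information to keep an eye on.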


Wiener filters

Gaussian Low pass – High pass

Butterworth Low pass – High pass

Ideal Low pass – High pass

Fourier Transform smoothing


Consequences:

Mostly used for RGB, digital and Multispectral images from different sources

Adapted to Hyperspectral in conjunction with spectral filtering

They must be optimized. Risk of losing spatial information

Different types of noise can be present

Dead pixels and spikes


Dead pixels – dead lines

Dead pixels and/or lines are usually caused by anomalies in the detectors.

Dead pixels can distort multivariate models; moreover, many of the routines for multivariate data analysis (e.g. PCA or MCR) can handle only a limited amount of missing values.


How to detect them?

They may be present as missing, infinite or zero values. Their location and size may vary: a specific pixel, a group of pixels or a complete line of pixels

How to eliminate them?

Entire line or row: eliminate it

Scattered pixels: interpolate from the neighbors
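A minimal sketch of the detection and interpolation of scattered dead pixels, assuming a NumPy array ordered as rows x columns x wavelengths. The function names are hypothetical. A pixel is flagged here only when all its values are missing, infinite or zero, which is an assumption; the interpolation uses the mean spectrum of the valid pixels in the surrounding 3 x 3 neighbourhood.

```python
import numpy as np


def dead_pixel_mask(cube):
    """True where the whole spectrum is NaN, infinite or zero (assumption)."""
    bad = ~np.isfinite(cube) | (cube == 0)
    return bad.all(axis=2)


def interpolate_dead_pixels(cube, dead):
    """Replace each dead pixel by the mean spectrum of its valid neighbours."""
    fixed = cube.astype(float).copy()
    rows, cols = dead.shape
    for r, c in zip(*np.nonzero(dead)):
        r0, r1 = max(r - 1, 0), min(r + 2, rows)
        c0, c1 = max(c - 1, 0), min(c + 2, cols)
        good = ~dead[r0:r1, c0:c1]
        if good.any():
            # Mean over the valid neighbouring pixels, wavelength by wavelength
            fixed[r, c, :] = cube[r0:r1, c0:c1, :][good].mean(axis=0)
    return fixed
```

Complete dead lines are better removed than interpolated; with the mask above they show up as a full row or column of `True` values.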


Spiked wavelengths

Spiked wavelengths are erroneous measurements at a single wavelength

[Figure: two panels of spectra; x-axis: wavelength (cm-1), from 4000 to 7000]


How to detect them?

For each pixel, calculate the mean of the spectrum and its standard deviation. Then, by establishing a threshold based on them, the spikes can be easily detected.

How to eliminate them?

Replace the spiked value with the mean of the neighboring values in a sub-window.
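A minimal sketch of this procedure for one spectrum. The function name `despike`, the threshold of 3 standard deviations and the half-window of 2 points are assumptions chosen for illustration; they must be tuned for real data.

```python
import numpy as np


def despike(spectrum, n_std=3.0, half_window=2):
    """Flag values far from the mean of the spectrum and replace them.

    A value is flagged when it differs from the mean of the spectrum by
    more than n_std standard deviations. Each flagged value is replaced by
    the mean of the non-flagged neighbours inside the sub-window.
    """
    s = np.asarray(spectrum, dtype=float)
    fixed = s.copy()
    spikes = np.abs(s - s.mean()) > n_std * s.std()
    for i in np.nonzero(spikes)[0]:
        lo, hi = max(i - half_window, 0), min(i + half_window + 1, s.size)
        neighbours = s[lo:hi][~spikes[lo:hi]]
        if neighbours.size:
            fixed[i] = neighbours.mean()
    return fixed, spikes
```

For an image, the function can be applied to every pixel spectrum in turn. Spectra with strong real peaks may need a higher threshold to avoid flagging true signal.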

[Figure: two panels of spectra; x-axis: wavelength (cm-1), from 4000 to 7000]

[Figure: two panels of spectra; x-axis: wavelength (cm-1), from 4000 to 7000]

Spectral pre-processing


Spectra always contain noise (of different types, of course)

Moreover, they contain artifacts that depend on the type of spectroscopy

NIR: baseline drifts

Raman: fluorescence influence

As usual, there are many filters (and some might be the same filter under a different name)

They have mostly been adapted from classical spectroscopy


Spectral pre-processing filters

Minimize scattering

Standard Normal Variate – SNV

Multiplicative scatter correction – MSC

Derivatives

Minimize spectral noise

Smoothing

Fourier transform

Wavelets (closely related to the Fourier transform)

Increase variability

Derivatives

Multiplicative Scatter Correction - MSC

Select a reference spectrum and plot each of the other spectra against it

[Figure: scatter plot; y-axis: spectrum to be corrected; x-axis: reference spectrum]


Least squares regression. Calculate the slope m and the offset o_0 of the line

$$\text{Spectrum to be corrected} = m \cdot \text{Reference spectrum} + o_0, \qquad m = \frac{\Delta y}{\Delta x}$$

[Figure: regression line; y-axis: spectrum to be corrected; x-axis: reference spectrum; m is the slope and o_0 is the intercept]

Correction of the spectrum by using the SCATTER COEFFICIENTS

$$\text{Spectrum corrected} = \frac{\text{Raw spectrum} - o_0}{m}$$
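A minimal sketch of MSC for a set of spectra stored as the rows of a NumPy array. The function name `msc` is hypothetical, and the mean spectrum is used as the reference when none is given, which is a common choice but an assumption here.

```python
import numpy as np


def msc(spectra, reference=None):
    """Multiplicative scatter correction of each row of `spectra`.

    Each spectrum is regressed against the reference by least squares,
    spectrum = m * reference + offset, and then corrected as
    (spectrum - offset) / m.
    """
    X = np.asarray(spectra, dtype=float)
    if reference is None:
        ref = X.mean(axis=0)  # assumption: mean spectrum as reference
    else:
        ref = np.asarray(reference, dtype=float)
    corrected = np.empty_like(X)
    slopes = np.empty(X.shape[0])
    offsets = np.empty(X.shape[0])
    for i, spectrum in enumerate(X):
        m, offset = np.polyfit(ref, spectrum, deg=1)  # slope, intercept
        corrected[i] = (spectrum - offset) / m
        slopes[i], offsets[i] = m, offset
    return corrected, slopes, offsets
```

For an image, the cube can be unfolded with `cube.reshape(-1, bands)` before the correction. Refolding `slopes` and `offsets` to rows x columns gives maps of the scatter coefficients.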


Consequences:

It removes additive and multiplicative artifacts in the spectra


The shadows and shapes are minimized
The scatter maps can help to visualize the effect of MSC


Strongly dependent on the reference spectrum. This is especially important in images


Standard Normal Variate - SNV

The mean value is subtracted from each spectrum, and the result is divided by the standard deviation of that spectrum.

$$\text{Spectrum corrected} = \frac{\text{Raw spectrum} - \text{Mean}(\text{raw spectrum})}{\text{STD}(\text{raw spectrum})}$$
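A minimal sketch of SNV with NumPy. The function name `snv` is hypothetical; the wavelengths are assumed to lie along the last axis, so the same function works for a single spectrum, a table of spectra or a whole image cube. The sample standard deviation (`ddof=1`) is used, which is an assumption.

```python
import numpy as np


def snv(spectra):
    """Standard Normal Variate: centre and scale each spectrum on its own."""
    X = np.asarray(spectra, dtype=float)
    mean = X.mean(axis=-1, keepdims=True)
    std = X.std(axis=-1, ddof=1, keepdims=True)
    return (X - mean) / std
```

No reference spectrum is needed, because every spectrum is corrected with its own mean and standard deviation.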

[Figure: a) Raw data and single wave number image for API at 5984 cm-1; b) SNV and smoothed data and single wave number image. Spectra axes: Absorbance (a.u.) versus Wavenumber (cm-1)]


Consequences:
It removes additive artifacts in the spectra
It is extremely simple and does NOT change the shape of the spectrum
The shadows and shapes are minimized


SNV vs. MSC

- SNV does not need a reference spectrum, which is a practical advantage

- MSC is based on a least squares regression and is therefore less sensitive to noise

- Effects corrected:
* SNV: additive effects
* MSC: multiplicative effects

- MSC might alter the shape of the spectra

- Relationship:

$$\text{MSC} \approx \text{SNV} \cdot \bar{S} + \overline{\text{Spectrum}}$$

where $\overline{\text{Spectrum}}$ is the mean of the means of all spectra and $\bar{S}$ is the mean of the standard deviations


Smoothing

Probably the most common filter for eliminating spectral noise.

Based on fitting a polynomial of a chosen degree within small moving windows.

[Figure: smoothing examples with a window of 5 and a window of 13]


Consequences:

Really good technique for minimizing noise

Parameters to optimize: window size and polynomial degree

* Rule of thumb: a polynomial degree of 2 is advisable

Smoothing changes the shape of the spectrum

* Take special care in the selection of the window!


Derivatives (Savitzky-Golay)

They minimize noise and transform the spectra

In this case, there are three parameters to optimize:

* Window size: a smoothing step is implied
* Polynomial degree: second degree is advisable
* Derivative degree: it may enlarge differences, but it may also create differences that do not exist
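The three parameters can be explored with a small Savitzky-Golay sketch written with NumPy only. The function names are hypothetical, the spectrum is assumed to be sampled at equal steps of size `delta`, and the ends are handled by repeating the edge values, so the first and last points are less reliable. With `deriv=0` the same function performs plain smoothing.

```python
from math import factorial

import numpy as np


def savgol_coefficients(window, polyorder, deriv=0, delta=1.0):
    """Weights that give the fitted polynomial (or its derivative) at the window centre."""
    half = window // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    # Columns are offsets**0, offsets**1, ..., offsets**polyorder
    A = np.vander(offsets, polyorder + 1, increasing=True)
    # Row `deriv` of the pseudo-inverse yields the polynomial coefficient a_deriv
    return np.linalg.pinv(A)[deriv] * factorial(deriv) / delta ** deriv


def savgol(spectrum, window=5, polyorder=2, deriv=0, delta=1.0):
    """Savitzky-Golay smoothing (deriv=0) or derivative of one spectrum."""
    if window % 2 == 0 or window <= polyorder or deriv > polyorder:
        raise ValueError("need an odd window > polyorder >= deriv")
    coeffs = savgol_coefficients(window, polyorder, deriv, delta)
    half = window // 2
    # Assumption: repeat the edge values to keep the original length
    padded = np.pad(np.asarray(spectrum, dtype=float), half, mode="edge")
    return np.correlate(padded, coeffs, mode="valid")
```

In practice a library routine such as `scipy.signal.savgol_filter` would normally be used. The sketch only shows where the window size, the polynomial degree and the derivative degree enter the calculation.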


[Figure: derivative examples labelled "Additive (scattering) correction" and "Multiplicative correction"]


Consequences:

Very useful to highlight minimal spectral differences

They also have a smoothing effect

First derivative removes the additive effect

Second derivative removes the multiplicative effects

They also minimize the impact of shadows and shapes

Three parameters to optimize:

* Window size: a smoothing step is implied
* Polynomial degree: second degree is advisable
* Derivative degree: it may enlarge differences, but it may also create differences that do not exist

Comparison between SNV and derivatives


Spectral pre-processing major problem

All the filters seen up to now work for hyperspectral imaging

But they do not work for multispectral imaging!

[Figure: multispectral spectrum; y-axis: Intensity (a.u.); x-axis: Wavelengths (nm), from 500 to 900]


Spectral pre-processing in multispectral images

Normally, digital imaging filters are applied first to remove spatial noise

Then, different combinations between the wavelengths are calculated

[Figure: Reflectance versus Wavelength (nm), from 1000 to 1700, with the bands R1150, R1200, R1250 and R1400 marked]
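One simple combination between wavelengths is the ratio of two bands. The sketch below assumes a NumPy array ordered as rows x columns x bands and a list with the wavelength of each band. The function name `band_ratio`, the choice of a plain ratio and the small constant that avoids division by zero are assumptions for illustration.

```python
import numpy as np


def band_ratio(cube, wavelengths, numerator_nm, denominator_nm, eps=1e-12):
    """Ratio image between the bands closest to two chosen wavelengths."""
    wl = np.asarray(wavelengths, dtype=float)
    i = np.abs(wl - numerator_nm).argmin()    # band closest to the numerator
    j = np.abs(wl - denominator_nm).argmin()  # band closest to the denominator
    # eps avoids division by zero in dark pixels (assumption)
    return cube[:, :, i] / (cube[:, :, j] + eps)
```

For the bands marked in the figure, a call such as `band_ratio(cube, [1150, 1200, 1250, 1400], 1200, 1400)` would give one image per chosen pair, which can then be used like any grayscale image.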



Full pre-processing

Summary

There is NOT a recipe

Try, and let’s see what happens

The SIMPLER, the BETTER!!!