
Correlation and covariance

“Covariance” indicates the direction of the linear relationship between variables.


“Correlation” on the other hand measures both the strength and direction of the linear
relationship between two variables. Correlation is a function of the covariance.

Background
Variance and covariance
The variance of a variable describes how spread out its values are. The covariance is a
measure of the amount of dependency between two variables.
A positive covariance means that the values of the first variable are large when values of the
second variable are also large. A negative covariance means the opposite: large values of
one variable are associated with small values of the other.
The covariance value depends on the scale of the variables, which makes it hard to interpret
on its own. The correlation coefficient, which is just the normalized covariance, is easier to
interpret.
A positive covariance means that large values of one variable are associated with large values
of the other (left). A negative covariance means that large values of one variable are
associated with small values of the other (right).

The covariance matrix is a matrix that summarises the variances and covariances of a set of
vectors, and it can tell you a lot about your variables. The diagonal corresponds to the
variance of each vector:

A matrix A and its matrix of covariance. The diagonal corresponds to the variance of each
column vector.
Let’s just check with the formula of the variance:
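The variance formula being checked can be reconstructed as follows (the original equation image is missing; the 1/(n-1) sample convention is an assumption, since some texts divide by n instead):

```latex
\operatorname{Var}(X) = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2
```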
This is the first cell of our covariance matrix. The second element on the diagonal
corresponds to the variance of the second column vector of A, and so on.
Note: the vectors extracted from the matrix A correspond to the columns of A.
The other cells correspond to the covariance between two column vectors of A. For
instance, the covariance between the first and the third columns is located in the covariance
matrix at column 1, row 3 (or column 3, row 1).

The position in the covariance matrix. Column corresponds to the first variable and row to the
second (or the opposite). The covariance between the first and the third column vector of A is
the element at column 1, row 3 (or the opposite: the same value, since the matrix is symmetric).
Let’s check that the covariance between the first and the third column vector of A is equal to
-2.67. The formula of the covariance between two variables X and Y is:
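Reconstructed in standard notation (the original equation image is missing; the normalization, 1/(n-1) versus 1/n, is an assumption):

```latex
\operatorname{cov}(X, Y) = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
```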
Visualize data and covariance matrices
In order to get more insights about the covariance matrix and how it can be useful, we will
create a function to visualize it along with 2D data. You will be able to see the link between
the covariance matrix and the data.
This function will calculate the covariance matrix as we have seen above. It will create two
subplots — one for the covariance matrix and one for the data. The heatmap() function
from Seaborn is used to create gradients of colour — small values will be coloured in light
green and large values in dark blue. We chose one of our palette colours, but you may prefer
other colours. The data is represented as a scatterplot.
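A minimal sketch of such a function, assuming NumPy for the covariance computation; the "GnBu" palette name and the figure layout are assumptions, not the author's exact code:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; remove for an interactive window
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

def plot_data_and_cov(data):
    """Plot the covariance matrix of `data` next to a scatterplot of the data.

    `data` is an (n_samples, 2) array; the columns are the two variables.
    """
    cov = np.cov(data, rowvar=False)  # 2x2 covariance matrix
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    # Heatmap of the covariance matrix: light green for small values,
    # dark blue for large ones ("GnBu" is an assumed palette choice).
    sns.heatmap(cov, annot=True, cmap="GnBu", ax=ax1)
    ax1.set_title("Covariance matrix")
    # The data itself, as a scatterplot.
    ax2.scatter(data[:, 0], data[:, 1], alpha=0.5)
    ax2.set_title("Data")
    return cov

# Example: positively correlated 2D Gaussian data.
rng = np.random.default_rng(0)
data = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=300)
cov = plot_data_and_cov(data)
```

With positively correlated data like this, the off-diagonal cells of the heatmap come out positive, and the scatterplot shows the corresponding upward-sloping cloud.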
Filtering

1. Low pass filters (Smoothing)


Low-pass filtering (aka smoothing) is employed to remove high-spatial-frequency noise from
a digital image. Low-pass filters usually employ a moving-window operator that affects
one pixel of the image at a time, changing its value by some function of a local region
(window) of pixels. The operator moves over the image so that all pixels are affected.
2. High pass filters (Edge Detection, Sharpening)
A high-pass filter can be used to make an image appear sharper. These filters emphasize fine
details in the image - the opposite of the low-pass filter. High-pass filtering works in the same
way as low-pass filtering; it just uses a different convolution kernel.
Mean Filter

Mean filtering is easy to implement. It is used as a method of smoothing images, reducing the
amount of intensity variation between one pixel and the next, which in turn reduces noise in
the image.

The idea of mean filtering is simply to replace each pixel value in an image with the mean
('average') value of its neighbors, including itself. This has the effect of eliminating pixel
values which are unrepresentative of their surroundings. Mean filtering is usually thought of
as a convolution filter. Like other convolutions it is based around a kernel, which represents
the shape and size of the neighborhood to be sampled when calculating the mean. Often
a 3×3 square kernel is used, as shown below:
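The 3×3 mean-filter kernel referred to here (a standard reconstruction, since the original figure is missing) gives each of the nine neighborhood pixels equal weight:

```latex
K = \frac{1}{9}
\begin{bmatrix}
1 & 1 & 1 \\
1 & 1 & 1 \\
1 & 1 & 1
\end{bmatrix}
```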
filter2()

filter2() is defined as follows:

Y = filter2(h,X) filters the data in X with the two-dimensional FIR filter in the matrix h. It
computes the result, Y, using two-dimensional correlation, and returns the central part of the
correlation that is the same size as X.

It returns the part of Y specified by the shape parameter. shape is a string with one of these
values:

1. 'full' : Returns the full two-dimensional correlation. In this case, Y is larger than X.
2. 'same' : (default) Returns the central part of the correlation. In this case, Y is the same
size as X.
3. 'valid' : Returns only those parts of the correlation that are computed without zero-
padded edges. In this case, Y is smaller than X.
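filter2() is a MATLAB function; an equivalent sketch in Python uses scipy.signal.correlate2d, whose mode parameter mirrors the three shape values above (this is an analogue for illustration, not the author's code):

```python
import numpy as np
from scipy.signal import correlate2d

X = np.arange(25, dtype=float).reshape(5, 5)  # a 5x5 "image"
h = np.ones((3, 3)) / 9.0                     # 3x3 mean-filter kernel

Y_full = correlate2d(X, h, mode="full")    # larger than X: (5+3-1) x (5+3-1)
Y_same = correlate2d(X, h, mode="same")    # same size as X (filter2's default)
Y_valid = correlate2d(X, h, mode="valid")  # smaller: only fully-overlapping windows

print(Y_full.shape, Y_same.shape, Y_valid.shape)  # (7, 7) (5, 5) (3, 3)
```

Note that, like filter2, correlate2d performs correlation, not convolution; for a symmetric kernel such as the mean filter the two give identical results.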

Now we want to apply the kernel defined in the previous section using filter2():

As mentioned earlier, the low-pass filter can be used for denoising. Let's test it. First, to make
the input a little bit dirty, we spray some salt and pepper on the image, and then apply the
mean filter:
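A sketch of that experiment in Python, with scipy standing in for filter2; the synthetic ramp image and the 5% noise fraction are arbitrary assumptions:

```python
import numpy as np
from scipy.signal import correlate2d

rng = np.random.default_rng(42)

# A smooth synthetic grayscale "image": a horizontal ramp in [0, 1].
img = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))

# Spray salt-and-pepper noise on roughly 5% of the pixels.
noisy = img.copy()
mask = rng.random(img.shape) < 0.05
noisy[mask] = rng.choice([0.0, 1.0], size=mask.sum())

# Apply the 3x3 mean filter.
h = np.ones((3, 3)) / 9.0
denoised = correlate2d(noisy, h, mode="same")

# Compare mean squared error on the interior (excluding zero-padded borders):
# the mean filter spreads each impulse out rather than removing it.
inner = (slice(3, -3), slice(3, -3))
err_noisy = np.mean((noisy[inner] - img[inner]) ** 2)
err_denoised = np.mean((denoised[inner] - img[inner]) ** 2)
print(err_noisy, err_denoised)
```

The error drops, but visually each salt or pepper dot is smeared into a gray blob rather than eliminated, which matches the observation below.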
It has some effect on the salt-and-pepper noise, but not much. It just made the noise blurred.

Sharpening filter

Human perception is sensitive to the edges and fine details of an image. Since these
correspond to the high-frequency components of an image, the visual quality degrades if those
components are attenuated or removed. Image sharpening encompasses any
enhancement technique that highlights the edges and fine details of an image. Image
sharpening is done by adding to the original image a signal proportional to a high-pass-
filtered version of the image. This process, referred to as unsharp masking, involves two
steps. The original image is first filtered by a high-pass filter that extracts the high-frequency
components. A scaled version of the high-pass filter output is then added to the original
image, thereby producing a sharpened image.

Unsharp masking produces an edge image from an input image using the following equation:
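A reconstruction of that relation (the original equation image is missing; the symbols f, g, f_smooth, and the scaling constant lambda are assumptions): the edge image g is the difference between the original and a smoothed copy, and the sharpened image adds a scaled version of g back to the original:

```latex
g(x, y) = f(x, y) - f_{\text{smooth}}(x, y), \qquad
f_{\text{sharp}}(x, y) = f(x, y) + \lambda \, g(x, y)
```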
Gradient Filter

Image gradient

The gradient at each pixel of an image is useful for detecting edges, so gradient
filters are a common choice for finding them. But what makes the gradient well suited
to edge detection, and why is it useful?
This can be answered by the example in Fig 2. The left panel is the given image,
and the panel in the center shows the pixel intensities along the red line of the
image. As we can observe from the two panels, the edges in an image are where the pixel
intensities change sharply, for example, from white to black or from black to white.

Given this definition of edges, the gradient is a natural tool to consider, because
regions without any edge return zero gradient, while the other regions output some
positive or negative values. This is illustrated in the right panel of Fig 2. The 1st-derivative
map shows a non-zero response only at the pixels where the edges lie. Thus, we can conclude
that edges in an image can be located by finding the pixels at the maxima of the 1st
derivative.

 Design a filter to compute derivatives

In order to compute the derivative of an image, we can make use of the concept of convolution
with a filter (note that it can also be computed by correlation). Let's recall how the partial
derivative is calculated for a 2D function f that represents an image. In the continuous setting,
the partial derivative of f with respect to x is defined as follows:
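Reconstructed in standard notation (the original equation image is missing):

```latex
\frac{\partial f(x, y)}{\partial x} = \lim_{\epsilon \to 0} \frac{f(x + \epsilon,\, y) - f(x, y)}{\epsilon}
```

In the discrete image setting the smallest possible step is epsilon = 1, so the derivative is approximated by f(x+1, y) - f(x, y), which is exactly what the filter [1 -1] computes.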
Before we sum up the filter design, we need to be aware that, in practice, we use a 3 ×
1 gradient filter for the x derivative instead of a 2 × 1 gradient filter, and likewise a 1
× 3 filter for the y derivative. Let me elaborate on why, and on how to construct the 3 × 1
filter (for the derivatives with respect to x).

Despite its simplicity, the filter [1 -1] has some issues.

 Firstly, once the image is convolved with this filter, it shifts the image by half a pixel.

 Secondly, and related to the first issue, when we apply [1 -1] to the two
pixels (x, y) and (x+1, y), it actually computes the gradient at position (x+0.5, y), not at
position (x, y) or (x+1, y). To fix this, we insert a 0 in between [1 -1] to
make [1 0 -1]. By convolving the new filter [1 0 -1] with pixels (x-1, y), (x, y) and (x+1,
y), it returns the gradient with respect to x for the center pixel (x, y).

In short,

 x-derivative filter: [1 -1] -> [1 0 -1]

 y-derivative filter: [1 -1]ᵀ -> [1 0 -1]ᵀ
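The two filters above can be sketched on a small synthetic step edge; the example below uses scipy (an assumed tool, not named in this section). Note that convolve2d flips the kernel, so convolving with [1 0 -1] computes f(x+1, y) - f(x-1, y), the centered gradient described above:

```python
import numpy as np
from scipy.signal import convolve2d

# A vertical step edge: dark (0) on the left, bright (1) on the right.
img = np.zeros((5, 5))
img[:, 3:] = 1.0

kx = np.array([[1, 0, -1]])  # x-derivative filter, as a 1x3 row kernel
ky = kx.T                    # y-derivative filter, its 3x1 transpose

# Convolution (kernel flipped) yields f(x+1, y) - f(x-1, y) at each pixel.
gx = convolve2d(img, kx, mode="same")
gy = convolve2d(img, ky, mode="same")

print(gx[2])  # nonzero only in the columns flanking the edge
```

gx responds only where the intensity changes along x, while gy is zero in the interior because this image is constant along y; this is the behavior the Fig 2 discussion describes.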


 Laplacian Filter

A Laplacian filter is one of the edge detectors used to compute the second spatial derivatives of an
image. It measures the rate at which the first derivative changes. In other words, the Laplacian
filter highlights the regions where the pixel intensities change dramatically. Due to this
characteristic, the Laplacian filter is often used to detect edges in an image. We will see how
the filter finds edges with a visual illustration later.

Given this definition, the discretized 3 × 3 Laplacian filter (because we are dealing with images,
which are discrete) for an image f is defined as the array below:
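A standard reconstruction of that array (the original figure is missing): the 4-neighbour discrete Laplacian with a negative center peak; the sign-flipped variant is equally common:

```latex
\begin{bmatrix}
0 & 1 & 0 \\
1 & -4 & 1 \\
0 & 1 & 0
\end{bmatrix}
```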

Note that the kernel defined above is not identical to the mathematical definition of the
Laplacian due to the opposite signs; it uses a negative peak because that form is more
commonly used and straightforward. It is still valid, however.
Edge detection

Let's get back to the reason why we use the Laplacian filter. As mentioned before, the
Laplacian filter is a common method for detecting edges, but how?

First we compute the second derivative at each pixel. Our goal is to locate the edge
locations (pixels), and previously we found them by looking at the maxima of the first
derivatives.
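A sketch of the Laplacian's behaviour on the same kind of step edge (scipy and the 4-neighbour kernel are assumed choices): the second derivative responds with opposite signs on the two sides of the edge, and the edge itself lies at the zero crossing between them:

```python
import numpy as np
from scipy.signal import convolve2d

# 4-neighbour discrete Laplacian with a negative center peak.
lap = np.array([[0,  1, 0],
                [1, -4, 1],
                [0,  1, 0]], dtype=float)

# A vertical step edge: dark (0) on the left, bright (1) on the right.
img = np.zeros((7, 7))
img[:, 4:] = 1.0

# The kernel is symmetric, so convolution and correlation coincide here.
response = convolve2d(img, lap, mode="same")
print(response[3])  # opposite-signed spikes flank the edge; flat regions give 0
```

Flat regions return exactly zero, and the paired positive/negative response localizes the edge between the two columns, which is why zero crossings of the Laplacian are used as edge locations.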
