
Image Databases

Raw Images

A raw image file contains minimally processed data from the image sensor of either a digital
camera, a motion picture film scanner, or another image scanner. Raw files are so named because
they are not yet processed and therefore are not ready to be printed or edited. Normally, the
image is processed by a raw converter in a wide-gamut internal color space where precise
adjustments can be made before conversion to a "positive" file format such as TIFF or JPEG for
storage, printing, or further manipulation. There are dozens of raw formats in use by different
manufacturers of digital image capture equipment.

In a raw image, the content of the image consists of all "interesting" objects in that image.

Each object is characterized by

● a shape descriptor that describes the shape and location of the region within which the object
is located in the image
● a property descriptor that describes the properties of individual pixels (e.g. RGB values
of the pixel, RGB values aggregated over a group of pixels, grayscale levels)

The property consists of

● a property name, e.g., red, green, blue, texture
● a property domain: the range of values that the property can assume (e.g., 0, 1, ..., 7)

Images

● Every image is associated with a pair of positive integers (m,n), called grid-resolution,
which divides the image into (m x n) cells of equal size (called image grid).
● Each cell consists of a collection of pixels.
● A cell property: (Name, Values, Method)

Example

● F (bwcolor, (b,w), bwalgo), where the possible values are b(black) and
w(white), and bwalgo is an algorithm that takes a cell as an input and
returns either black or white by somehow combining the black/white
levels of the pixels in the cell
● F (graylevel, [0,1], grayalgo), where the possible values are real numbers
within the interval [0,1].
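
For illustration only, here is a minimal Python sketch of how the (Name, Values, Method) triple for cell properties could be represented. The class name CellProperty, the 0.5 threshold inside bwalgo, and the toy image are assumptions added here, not part of the model described above.

# Minimal sketch (not from the text) of the (Name, Values, Method) triple for cell properties.
from dataclasses import dataclass
from typing import Callable, Sequence
import numpy as np

@dataclass
class CellProperty:
    name: str                                  # e.g. "bwcolor" or "graylevel"
    values: Sequence                           # property domain, e.g. ("b", "w") or [0, 1]
    method: Callable[[np.ndarray], object]     # algorithm that maps a cell to a value

def bwalgo(cell: np.ndarray) -> str:
    # Combine the gray levels of the pixels in the cell into a single black/white value.
    return "w" if cell.mean() >= 0.5 else "b"  # 0.5 threshold is an illustrative choice

def grayalgo(cell: np.ndarray) -> float:
    # Return the average gray level of the cell, a real number in [0, 1].
    return float(cell.mean())

bwcolor = CellProperty("bwcolor", ("b", "w"), bwalgo)
graylevel = CellProperty("graylevel", (0.0, 1.0), grayalgo)

# Example: a 4 x 4 image divided by a (2, 2) grid resolution into 2 x 2 cells.
image = np.random.rand(4, 4)
cell = image[0:2, 0:2]
print(bwcolor.method(cell), graylevel.method(cell))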

Problems with Image Databases

1) Images are often very large

It is infeasible to explicitly store the properties on a pixel-by-pixel basis. This led to a
family of image "compression" techniques: attempts to compress the image into one
containing fewer pixels.

2) There is a need to determine the "features" of the image (compressed or raw)

This is done by "segmentation": breaking up the image into a set of homogeneous
rectangular regions called segments.

3) Need to support "match" operations that compare either a whole image or a segmented
image against another

IMAGE COMPRESSION

Image compression is a process that produces an output that is smaller in size but looks similar
to the original image.

ADVANTAGES OF IMAGE COMPRESSION

The benefits of image compression can be listed as follows:

1. The cost of transmitting an image is reduced considerably, since transmission cost depends on
the duration for which data is transmitted.
2. It saves computing power, as transmitting a smaller image takes much less time.
3. It reduces transmission errors since fewer bits are transferred.
4. A secure level of transmission is possible because the image is encoded as well as compressed.

Image Compression Techniques

The image compression techniques are broadly classified into two categories. These are:
1. Lossy techniques
2. Lossless techniques

1) Lossy Compression Techniques:

Lossy compression methods achieve larger compression ratios than lossless compression
techniques and are used for most applications. The reconstructed output image is not an exact
copy of the original, but it closely resembles it over most of the image.

In a typical lossy coder, the prediction-transformation-decomposition stage is completely
reversible; the loss of information occurs in the quantization stage. The entropy coding applied
after quantization is lossless. At the decoder, entropy decoding is applied to the compressed
data to recover the quantized signal values. These are then de-quantized, and the recovered
image resembles the original.
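
To make the quantization step concrete, here is a small illustrative sketch of a uniform quantizer; the step size and the coefficient values are arbitrary assumptions. Only this rounding step discards information, while the entropy coding applied afterwards is lossless.

# Illustrative sketch of where information is lost: uniform quantization of coefficients
# with an assumed step size, followed by de-quantization at the decoder.
import numpy as np

step = 16.0                                 # assumed quantization step size
coeffs = np.array([203.7, -45.2, 8.9, 0.6])

quantized = np.round(coeffs / step)         # lossy: many values map to the same integer
recovered = quantized * step                # de-quantization at the decoder

print(quantized)    # values that get entropy coded: [13. -3.  1.  0.]
print(recovered)    # [208. -48.  16.   0.] - resembles, but does not equal, the original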

Lossy compression methods involve some basic performance considerations:

1. Speed of encoding and decoding
2. Compression ratio
3. Signal-to-noise ratio (SNR)

Lossy compression includes the following methods:

1. Block truncation coding
2. Code vector quantization
3. Fractal coding
4. Transform coding
5. Sub-band coding

Techniques

1) Block Truncation Coding

In this method, the image is divided into blocks, as in fractal coding. An N by N window of the
image is treated as a block, and the mean of the pixel values in that window is computed; this
mean is normally used as the threshold. A bitmap of the block is then generated by replacing
every pixel whose value is greater than or equal to the threshold with a 1 and all other pixels
with a 0. Then, for each segment of the bitmap (the 1s and the 0s), a reconstruction value is
determined as the average of the values of the corresponding pixels in the original block.
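
A minimal sketch of block truncation coding for a single block, following the description above; the 2 x 2 example block is an arbitrary assumption.

# Block truncation coding of one block: threshold = block mean, one value per bitmap segment.
import numpy as np

def btc_block(block: np.ndarray):
    threshold = block.mean()
    bitmap = block >= threshold                  # 1 where pixel >= threshold, else 0
    high = block[bitmap].mean()                  # reconstruction value for the 1-segment
    low = block[~bitmap].mean() if (~bitmap).any() else high
    return bitmap, low, high                     # compressed representation of the block

def btc_decode(bitmap: np.ndarray, low: float, high: float) -> np.ndarray:
    return np.where(bitmap, high, low)

block = np.array([[12, 200], [35, 180]], dtype=float)
bitmap, low, high = btc_block(block)
print(btc_decode(bitmap, low, high))             # [[ 23.5 190. ] [ 23.5 190. ]]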

2) Code Vector Quantization

The basic idea in vector quantization is to create a dictionary of fixed-size vectors, called
code vectors, each composed of the pixel values of a block. A given image is then partitioned
into non-overlapping blocks called image vectors. Each image vector is matched to the closest
code vector in the dictionary, and the index of that code vector is used to encode it. The image
is thus represented, and then entropy coded, as a sequence of these indices.
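
A minimal sketch of the encoding step, assuming a small fixed codebook of flattened 2 x 2 code vectors; the codebook contents and the toy image are illustrative assumptions.

# Vector quantization encoding: each image block is replaced by the index of the
# nearest code vector in an assumed, fixed codebook.
import numpy as np

codebook = np.array([[0, 0, 0, 0],               # assumed 2 x 2 code vectors, flattened
                     [255, 255, 255, 255],
                     [0, 255, 0, 255]], dtype=float)

def vq_encode(image: np.ndarray, block: int = 2) -> np.ndarray:
    h, w = image.shape
    indices = []
    for i in range(0, h, block):
        for j in range(0, w, block):
            vec = image[i:i + block, j:j + block].reshape(-1)
            # nearest code vector by Euclidean distance
            indices.append(np.argmin(((codebook - vec) ** 2).sum(axis=1)))
    return np.array(indices)                     # these indices are what gets entropy coded

image = np.array([[0, 0, 250, 255],
                  [0, 0, 255, 250],
                  [0, 250, 0, 255],
                  [5, 255, 0, 250]], dtype=float)
print(vq_encode(image))                          # [0 1 2 2]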

3) Fractal Compression

The basic idea behind this coding is to divide the image into segments using criteria such as
color difference, edges, frequency, and texture. Parts of an image often resemble other parts of
the same image. A dictionary of fractal segments is used as a look-up table; it contains codes
that are compact sets of numbers. Through an iterative algorithm, these fractal codes are applied
and the image is encoded. This scheme is most effective for compressing natural and textured
images.

4) Transform Coding

In this coding, transforms such as the Discrete Fourier Transform (DFT), the Discrete Cosine
Transform (DCT), and the Discrete Sine Transform are used to convert the pixel values from the
spatial domain into the frequency domain. A key advantage is the energy compaction property:
only a few coefficients carry most of the energy of the original image signal and can be used to
reproduce it. Only those few significant coefficients are kept, and the remaining ones are
discarded. The retained coefficients are then quantized and encoded. DCT coding has been the
most commonly used transform for image data.
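
A minimal sketch of DCT-based transform coding on a single block using SciPy's dctn/idctn; the block size, the number of coefficients kept, and the smooth test block are illustrative assumptions.

# Transform coding of one 8 x 8 block: keep only a few low-frequency DCT coefficients
# (energy compaction) and discard the rest.
import numpy as np
from scipy.fft import dctn, idctn

def compress_block(block: np.ndarray, keep: int = 3) -> np.ndarray:
    coeffs = dctn(block, norm='ortho')           # spatial domain -> frequency domain
    mask = np.zeros_like(coeffs)
    mask[:keep, :keep] = 1                       # keep a small low-frequency corner only
    return idctn(coeffs * mask, norm='ortho')    # reconstruct from the kept coefficients

block = np.outer(np.linspace(0, 255, 8), np.ones(8))    # a smooth test block
reconstructed = compress_block(block)
print(np.abs(block - reconstructed).max())       # small error for smooth image content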

5) Subband Coding

In this scheme, the image is first decomposed into frequency sub-bands, and quantization and
coding are then applied to each sub-band separately. This is very useful because the quantization
and coding can be matched more accurately to the characteristics of each sub-band.

2) Lossless Compression Techniques

In lossless compression techniques, the original image can be perfectly recovered from the
compressed image. These techniques do not add noise to the signal. Lossless compression is also
known as entropy coding because it uses decomposition techniques to minimize redundancy.

Following techniques are included in lossless compression:

1. Huffman encoding
2. Run-length encoding
3. LZW coding
4. Area coding

1) Huffman Coding

This is a general technique for coding symbols based on their statistical frequencies of
occurrence. Each pixel value in the image is assigned a binary code: pixel values that occur less
frequently are given longer codes, and pixel values that occur more frequently are given shorter
codes. Huffman coding produces a prefix code, so no two symbols in an image share the same
binary code. In the commercial arena, most standards use lossy or noisy methods of compression
in the early stages and Huffman coding in the final stage.
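
A minimal sketch of Huffman code construction over pixel values; the example pixel list is an arbitrary assumption.

# Huffman coding over pixel values: more frequent values receive shorter prefix-free codes.
import heapq
from collections import Counter

def huffman_codes(pixels):
    freq = Counter(pixels)
    # heap entries: [frequency, tie-breaker, [symbol, code], ...]
    heap = [[f, i, [sym, ""]] for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                           # degenerate case: a single symbol
        return {heap[0][2][0]: "0"}
    count = len(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)
        hi = heapq.heappop(heap)
        for pair in lo[2:]:
            pair[1] = "0" + pair[1]              # prepend bit for the low branch
        for pair in hi[2:]:
            pair[1] = "1" + pair[1]              # prepend bit for the high branch
        heapq.heappush(heap, [lo[0] + hi[0], count] + lo[2:] + hi[2:])
        count += 1
    return dict(heap[0][2:])

pixels = [7, 7, 7, 7, 7, 3, 3, 3, 0, 0, 255]
print(huffman_codes(pixels))    # the most frequent pixel value (7) gets the shortest code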

2) Run Length Encoding

Run-length coding is a simple method that works well when the data is sequential and repetitive.
Runs of identical consecutive symbols are replaced by shorter codes: the image is represented as
a sequence of pairs {Vi, Ri}, where Vi is the intensity of a pixel and Ri is the number of
consecutive pixels with that intensity. If both Vi and Ri are represented by one byte, a long run
of identical pixels can be stored in just two bytes, giving a high compression ratio for images
with large uniform areas.
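
A minimal sketch of run-length encoding and decoding as (value, run-length) pairs; the example row of pixels is an arbitrary assumption.

# Run-length encoding: consecutive identical pixels become (value, run-length) pairs.
def rle_encode(pixels):
    runs = []
    for v in pixels:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1                     # extend the current run
        else:
            runs.append([v, 1])                  # start a new run
    return runs

def rle_decode(runs):
    return [v for v, r in runs for _ in range(r)]

row = [12, 12, 12, 12, 12, 12, 12, 12, 200, 200, 200, 7]
encoded = rle_encode(row)
print(encoded)                        # [[12, 8], [200, 3], [7, 1]]
assert rle_decode(encoded) == row     # lossless: the original row is recovered exactly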

3) Area Coding

Area coding is an enhanced form of run-length coding that exploits the two-dimensional nature of
the image. Instead of one-dimensional runs, it looks for rectangular regions (2D blocks of
pixels) that share the same characteristics. Each such block or window is coded by stating its
spatial coordinates and its structure. The main drawback of this technique is that it is a
non-linear method, which makes it hard to implement in hardware.

4) LZW Coding

LZW (Lempel-Ziv-Welch) is another dictionary-based coding technique, and as such it has had a
great impact on the digital world. The dictionary used here can be fixed in advance or updated
adaptively as new sequences are encountered.
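
A minimal sketch of LZW encoding with an adaptively built dictionary, shown on a short string of symbols purely for illustration.

# LZW encoding: the dictionary starts with all single symbols and grows as the data is read.
def lzw_encode(data: str):
    dictionary = {chr(i): i for i in range(256)}     # start from all single symbols
    next_code = 256
    current, output = "", []
    for symbol in data:
        candidate = current + symbol
        if candidate in dictionary:
            current = candidate                      # keep extending the match
        else:
            output.append(dictionary[current])       # emit code for the longest match
            dictionary[candidate] = next_code        # add the new sequence to the dictionary
            next_code += 1
            current = symbol
    if current:
        output.append(dictionary[current])
    return output

print(lzw_encode("ABABABA"))    # [65, 66, 256, 258] - repeated patterns need fewer codes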

Image Processing

Image processing is the process of transforming an image into a digital form and performing
certain operations to get some useful information from it. The image processing system usually
treats all images as 2D signals when applying certain predetermined signal processing methods.

There are five main types of image processing:

● Visualization - Find objects that are not visible in the image
● Recognition - Distinguish or detect objects in the image
● Sharpening and restoration - Create an enhanced image from the original image
● Pattern recognition - Measure the various patterns around the objects in the image
● Retrieval - Browse and search images from a large database of digital images that are
similar to the original image.

Applications of Image Processing

1) Medical Image Retrieval

Image processing has been extensively used in medical research and has enabled more efficient
and accurate treatment plans. For example, it can be used for the early detection of breast cancer
using a sophisticated nodule detection algorithm in breast scans. Since medical usage calls for
highly trained image processors, these applications require significant implementation and
evaluation before they can be accepted for use.

2) Traffic Sensing Technologies

In the case of traffic sensors, we use a video image processing system or VIPS. This consists of
a) an image capturing system b) a telecommunication system and c) an image processing system.
When capturing video, a VIPS has several detection zones that output an “on” signal whenever a
vehicle enters the zone, and then output an “off” signal whenever the vehicle exits the detection
zone. These detection zones can be set up for multiple lanes and can be used to sense the traffic
in a particular station.

Besides this, it can auto-record the license plate of the vehicle, distinguish the type of vehicle,
monitor the speed of the driver on the highway, and lots more.

3) Image Reconstruction

Image processing can be used to recover and fill in the missing or corrupt parts of an image. This
involves using image processing systems that have been trained extensively with existing photo
datasets to create newer versions of old and damaged photos.

4) Face Detection

One of the most common applications of image processing that we use today is face detection. It
uses deep learning algorithms in which the machine is first trained on the specific features of
human faces, such as the shape of the face, the distance between the eyes, etc. After learning
these features, the machine can detect objects in an image that resemble a human face. Face
detection is a vital tool used in security, biometrics, and even the filters available on most
social media apps these days.

Benefits of Image Processing

The implementation of image processing techniques has had a massive impact on many tech
organizations. Here are some of the most useful benefits of image processing, regardless of the
field of operation:
● The digital image can be made available in any desired format (improved image, X-Ray,
photo negative, etc)
● It helps to improve images for human interpretation
● Information can be processed and extracted from images for machine interpretation
● The pixels in the image can be manipulated to any desired density and contrast
● Images can be stored and retrieved easily
● It allows for easy electronic transmission of images to third-party providers

Image Segmentation

Image segmentation is the process by which a digital image is partitioned into various subgroups
(of pixels) called Image Objects; this reduces the complexity of the image and makes analyzing
it simpler.

We use various image segmentation algorithms to split and group certain sets of pixels together
from the image. By doing so, we are actually assigning labels to pixels, and pixels with the same
label fall into a category in which they have something in common.

Using these labels, we can specify boundaries, draw lines, and separate the most important
objects in an image from the rest of the not-so-important ones. For example, from a main image we
might extract the major components, e.g. chairs, tables, etc., so that all the chairs are colored
uniformly; if instead we detect instances, which correspond to individual objects, each chair is
given a different color.

This is how different methods of segmentation of images work in varying degrees of complexity
and yield different levels of outputs.
From a machine learning point of view, these identified labels can later be used for both
supervised and unsupervised training, simplifying and solving a wide variety of business
problems. This is a brief overview of segmentation in image processing; let's now try to
understand the use cases, methodologies, and algorithms involved.
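
As a concrete illustration, here is a minimal sketch of one very simple segmentation approach: thresholding followed by connected-component labeling. The threshold value and the toy image are assumptions, and real systems use far more sophisticated algorithms.

# Simple segmentation sketch: threshold the image, then label connected groups of pixels
# as image objects.
import numpy as np
from scipy import ndimage

image = np.array([[0, 0, 9, 9, 0],
                  [0, 0, 9, 9, 0],
                  [7, 0, 0, 0, 0],
                  [7, 7, 0, 8, 8]], dtype=float)

foreground = image > 5                     # assumed threshold separating objects from background
labels, num_objects = ndimage.label(foreground)

print(num_objects)    # 3 separate objects found
print(labels)         # every pixel carries the label of the object it belongs to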

Need for Image Segmentation & Value Proposition

The concept of partitioning, dividing, fetching, and then labeling an image, and later using that
information to train various ML models, has addressed numerous business problems. In this
section, let's try to understand what problems are solved by image segmentation.

A facial recognition system implements image segmentation, identifying an employee and enabling
them to mark their attendance automatically. Segmentation in image processing is being used in
the medical industry for efficient and faster diagnosis, detecting diseases, tumors, and cell and
tissue patterns from various medical imagery generated from radiography, MRI, endoscopy,
thermography, ultrasonography, etc.

Satellite images are processed to identify various patterns, objects, geographical contours, soil
information, etc., which can be later used for agriculture, mining, geo-sensing, etc. Image
segmentation has a massive application area in robotics, like RPA, self-driving cars, etc. Security
images can be processed to detect harmful objects, threats, people, and incidents. Image
segmentation implementations in Python, Matlab, and other languages are extensively employed
for the process.

A very interesting case I stumbled upon was a television show about a food processing factory,
where tomatoes on a fast-moving conveyor belt were being inspected by a computer. It took
high-speed images from a suitably placed camera and passed instructions to a suction robot, which
picked up the rotten, unripe, and otherwise damaged tomatoes and allowed the good ones to pass
on.

This is a basic but pivotal and significant application of image classification, where the
algorithm was able to capture only the required components from an image, and those pixels were
later classified as the good, the bad, and the ugly by the system. A rather simple-looking system
was making a colossal impact on that business by eliminating human effort and human error and
increasing efficiency.

Image segmentation is very widely implemented in Python, along with other classical languages
like Matlab, C/C++, etc. In fact, image segmentation in Python has become one of the most
sought-after skills in the data science stack.

Similarity-based retrieval

Retrieval Methods

These sections look at retrieval:

● based on color, using color histograms and color invariants;
● based on texture, using variation in intensity and the topography of surfaces;
● based on shape, using aspect ratios, circularity, and moments for global features or using
boundary segments for local features;
● based on position, using spatial indexing;
● based on image characteristics, using complex transformations of pixel intensities;
● based on appearance, using a combination of color, texture, and intensity.

1) Retrieval Based on Color

The color of an image conveys a lot of information, and therefore most image database systems
support color content queries as an important cue for image matching and retrieval. Color can be
expressed in terms of hue, saturation, and intensity (HSI). Hue corresponds to the color content
of a pixel, represented by an angular scale from 0 to 360 (red is at 0°, green at 120°, and blue
at 240°). Saturation corresponds to the "depth" of the color; a saturation value of zero makes
the color gray. Intensity corresponds to the brightness of the incident illumination and is seen
as a gray-level image. In black and white pictures, only intensity is represented.

Color can often take the form of the distribution of colors based on an image. Several methods
for retrieving images on the basis of color similarity have been described in the literature, but
most are variations on the same basic idea. Each image added to the collection is analyzed to
compute a color histogram such as the one we studied in Chapter 2. This showed the proportion
of pixels of each color within the image. The color histogram for each image is then stored in the
database. The user can query the database by either specifying the desired proportion of each
color (75% olive green and 25% red, for example) or submitting an example image from which a
color histogram is calculated. The matching process then retrieves a set of images whose color
histograms most closely match those of the query. The matching technique most commonly used
is called histogram intersection. Swain and Ballard (1991) developed a system called color
indexing based on the similarity of color histograms. Their technique is used in many systems
including QBIC, which we used in Exercise 1.2. The color histogram is independent of many
imaging conditions, such as the orientation of the scene and the relative positions of particular
scene elements. However, a problem arises because image colors depend on the lighting conditions:
small variations in lighting can lead to indexing failure. Three solutions to this have been
suggested:

● Control lighting to remove the dependence on illumination. It has been found that the ratio of
adjacent colors, which is independent of illumination, is very useful.
● Use color constancy algorithms as a preprocessing step before indexing.
● Extract color invariant features, which are discussed later, from the images and use them for
indexing.
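
A minimal sketch of histogram intersection as a color-similarity score. This is a simplified per-channel variant for illustration, not Swain and Ballard's exact scheme; the bin count and the random test images are assumptions.

# Histogram intersection: the closer the score is to 1, the more similar the color distributions.
import numpy as np

def color_histogram(image: np.ndarray, bins: int = 8) -> np.ndarray:
    # image: H x W x 3 array of RGB values in [0, 255]; one histogram per channel,
    # normalized so that the proportions of pixels of each color are compared.
    hist = np.concatenate([np.histogram(image[..., c], bins=bins, range=(0, 255))[0]
                           for c in range(3)]).astype(float)
    return hist / hist.sum()

def histogram_intersection(h1: np.ndarray, h2: np.ndarray) -> float:
    return float(np.minimum(h1, h2).sum())

query = np.random.randint(0, 256, (32, 32, 3))
candidate = np.random.randint(0, 256, (64, 64, 3))
print(histogram_intersection(color_histogram(query), color_histogram(candidate)))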

Color distribution moments have also been used (moments are discussed later). Color image
normalization is useful because it removes bias due to illumination but suffers from the problem
that it does not recover the true colors of surfaces. Normalized colors are functions of the true
color and the context of the scenes. Unfortunately, this means that the same object in different
scenes would have different normalized colors so it cannot be used directly for color indexing.
However, color normalization is important for analyzing local image regions where it can solve
the problem of context of scenes. Local color normalizations are referred to as color invariants.

Swain and Ballard's indexing method divides an image into a set of bins along each of the R and
G color dimensions. The chromaticity of each bin is compared with a test image. This can deliver
a recognition rate of about 30-40%. Variants of this technique are now used in a high proportion
of current CBIR systems. There are a number of methods of improving on Swain and Ballard's
original technique including the use of cumulative color histograms (Stricker and Orengo, 1995),
combining histogram intersection with some element of spatial matching (Stricker and Dimai,
1996), and the use of region-based color querying (Carson et al., 1997). According to Eakins and
Graham (1999), the results from some of these systems can look quite impressive.

2) Retrieval Based on Texture

The study of the texture of an image can be very useful. The ability to match on texture
similarity can often distinguish between areas of images with similar colors (such as sky and
sea, or leaves and grass). The method uses the variation in pixel intensity values that results
from the reflection of light from illuminated surfaces or the transmission of light through
translucent media. This variation is the result of the nature of the illumination and the
topography of the surface. The two-dimensional (2D) arrangement of the intensities defines the
visual texture of the image.

A variety of techniques have been used for measuring texture similarity based on statistical
analysis. Essentially, these calculate and compare the relative brightness of selected pairs of
pixels from the query image with those of the other images, each in turn. From these, it is
possible to calculate measures of image texture such as the degree of directionality and
regularity, or periodicity (Tamura et al., 1978; Liu and Picard, 1996). Alternative methods of
texture analysis for information retrieval include the use of Gabor filters and fractals
(Manjunath and Ma, 1996). Gabor filters, which we will mention again later, are one of the most
powerful techniques for image analysis.

Texture queries can be formulated in a similar manner to color queries, by selecting examples of
desired textures from a palette or by supplying an example query image. The system then
retrieves images with texture measures most similar in value to the query. For example, in
processing images of faces, it is possible to infer the shape of the surface topography from the
variations in intensity. This is called "shape from shading" or "shape from texture". In addition, it
is now possible to use a texture thesaurus, which retrieves textured regions in images on the basis
of similarity to automatically derived codewords representing important classes of texture within
the collection.
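
For illustration, here is a toy statistical texture measure, the distribution of local intensity variation, compared between two images. This is only a simple stand-in, not the Tamura, Gabor, or thesaurus-based features cited above; the window size, bin count, and test images are assumptions.

# Toy texture measure: histogram the standard deviation of small windows, then compare
# two images by intersecting their normalized histograms.
import numpy as np

def texture_signature(gray: np.ndarray, win: int = 4, bins: int = 16) -> np.ndarray:
    h, w = gray.shape
    local_std = [gray[i:i + win, j:j + win].std()
                 for i in range(0, h - win + 1, win)
                 for j in range(0, w - win + 1, win)]
    hist, _ = np.histogram(local_std, bins=bins, range=(0, 128))
    return hist / hist.sum()

def texture_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.minimum(texture_signature(a), texture_signature(b)).sum())

smooth = np.full((32, 32), 120.0)                           # e.g. sky: little variation
rough = np.random.randint(0, 256, (32, 32)).astype(float)   # e.g. leaves: high variation
print(texture_similarity(smooth, rough))                    # low score: the textures differ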

3) Retrieval Based on Shape

The ability to retrieve by shape involves giving the shape of an object a quantitative
description that can be used to match other images. Unlike texture, shape is a fairly
well-defined concept, and there is considerable evidence that in the brain natural objects are
primarily recognized by their shape. The process involves computing a number of features
characteristic of an object's shape that are independent of its size or orientation. These
features are then computed for every object identified within each stored image. Queries are
answered by computing the same set of features for the query image and retrieving those stored
images whose features most closely match those of the query.

Two main types of shape features are commonly used: global features such as aspect ratio,
circularity, and moment invariants; local features such as sets of consecutive boundary segments.
The 2D boundaries of three-dimensional (3D) objects enable object recognition. Shape
representation is very difficult. A shape is defined by the x and y coordinates of its boundary
points. The similarity transformation could include translation, uniform scaling, and orientation
changes. If the camera changes its viewpoint with respect to the object, the boundary of the
object is deformed; for example, a circle will be converted to an ellipse. Two medical images of the
same object may differ from one another by rotation about an axis as well as differences in scale.
The deformation can be approximated by an affine transformation when, in addition to these
transformations, shapes are also subject to non-uniform scaling and shearing. In Figure 11.3 we
can see an image of the Zinfandel grape from the wine shop application. In the case of the
Zinfandel grape image, the circular nature of the shapes is relatively easy to detect. In Figure
11.4 there are a number of examples that give problems for retrieval based on shape that human
observers would not find difficult. Because of these problems, quite a wide selection of
different approaches has been tried.
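
A minimal sketch of two of the global shape features named above, aspect ratio and circularity (4*pi*A / P^2), computed from a binary object mask. The boundary-pixel perimeter estimate and the test shapes are rough illustrative assumptions.

# Global shape features from a binary mask: bounding-box aspect ratio and circularity.
import numpy as np

def shape_features(mask: np.ndarray):
    ys, xs = np.nonzero(mask)
    height = ys.max() - ys.min() + 1
    width = xs.max() - xs.min() + 1
    aspect_ratio = width / height
    area = mask.sum()
    padded = np.pad(mask, 1)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])   # pixels whose 4-neighbours are all set
    perimeter = area - (mask & interior).sum()          # rough boundary-pixel count
    circularity = 4 * np.pi * area / perimeter ** 2
    return aspect_ratio, circularity

# A filled disc should score higher on circularity than a thin bar.
yy, xx = np.mgrid[:21, :21]
disc = ((yy - 10) ** 2 + (xx - 10) ** 2 <= 64).astype(int)
bar = np.zeros((21, 21), int)
bar[10, 2:19] = 1
print(shape_features(disc), shape_features(bar))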

Queries presented to shape retrieval systems are formulated either by identifying an example
image to act as the query or by asking the user to draw a 2D sketch as in QBIC. There are several
different challenges involved in developing these systems:

How would the user formulate a 3D query? It could involve selecting an example 3D image or
the user presenting a sketch, probably a 2D viewpoint.

How is the 3D result set to be presented and its relevance evaluated?

Can 3D images be effectively retrieved based on one or more 2D viewpoints of an object?

There is no general solution to this problem, but some progress has been made based on images of
objects from different viewpoints, especially in facial recognition systems. One approach has
been to build up a set of plausible 3D models from the available 2D image and match them with
other models in the database. However, this method involves defining 3D shape similarity
measures. Another is to generate a series of alternative 2D views of each database object, each
of which is matched with the query image (Dickinson et al., 1998).

4) Retrieval Based on Position - Spatial Location

Spatial location is one of the oldest image retrieval methods and is an essential aspect of
geographical information systems and biological systems. However, to exploit this method the
image collections must contain objects in defined spatial relationships with each other. It is
possible to impose a spatial structure on an image of a natural system by using the Voronoi
tessellation (Blackburn and Dunckley, 1995). One of the advantages of this approach is that many
of the established methods work in both 2D and 3D. Spatial indexing is seldom useful on its own,
although it can be effective in combination with other cues such as color and shape.

5) Retrieval Based on Image Characteristics - Transformations

Several other types of image features have been proposed as a basis for CBIR. Most of these rely
on complex transformations of pixel intensities which have no obvious counterpart in any human
description of an image. These techniques aim to extract features that reflect some aspect of
image similarity which a human subject can perceive, even if he or she finds it difficult to
describe. The most well-researched technique of this kind uses the wavelet transform to model an
image at several different resolutions, and it may prove very effective.

6) Retrieval Based on Appearance

Visual appearance is an important part of judging image similarity. However, it is difficult to
define exactly what we mean by an object's appearance. It will depend on the object's 3D shape
and also on the viewpoint of the image, but it will be independent of color and texture. Ravela
and Manmatha (1998) have proposed that an image's intensity surface has features that could be
used to compute appearance similarity. In their method, in order to compute global appearance
similarity, features are extracted from pixel neighborhoods and their distributions over the
image are compared. Histograms are used to represent the distributions of features, and
correlation is used to compare the histograms.

Two versions of this method have been developed, one for whole-image matching and one for
matching selected parts of an image. Global appearance similarity is computed using differential
features of the image. A differential feature is a feature computed from spatial derivatives of an
image. They can give measures of curvature and orientation. Such features are obtained by
transforming simple derivatives so that they are invariant or tolerant to factors affecting the
object's appearance, such as rotations, scale, and illumination changes.

The part-image technique involves local curvatures and orientation. Global image similarity is
deduced by comparing distributions of these features.
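
Below is an illustrative sketch in the spirit of this approach, though not Ravela and Manmatha's exact method: a differential feature (gradient orientation computed from spatial derivatives) is histogrammed over the image, and histograms are compared by correlation. The bin count and the random test images are assumptions.

# Appearance-style comparison: histogram of gradient orientations, compared by correlation.
import numpy as np

def orientation_histogram(gray: np.ndarray, bins: int = 36) -> np.ndarray:
    dy, dx = np.gradient(gray.astype(float))             # spatial derivatives
    angles = np.arctan2(dy, dx)                          # orientation at each pixel
    hist, _ = np.histogram(angles, bins=bins, range=(-np.pi, np.pi))
    return hist.astype(float)

def appearance_similarity(a: np.ndarray, b: np.ndarray) -> float:
    ha, hb = orientation_histogram(a), orientation_histogram(b)
    return float(np.corrcoef(ha, hb)[0, 1])              # correlation of the two histograms

img1 = np.random.rand(64, 64)
img2 = np.random.rand(64, 64)
print(appearance_similarity(img1, img1))   # 1.0 for identical images
print(appearance_similarity(img1, img2))   # lower for images with different structure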

R Tree Implementation of Image Databases

Visit this link:
https://www.academia.edu/14887799/R_TREE_IMPLEMENTATION_OF_IMAGE_DATABASES
