

CHAPTER-1

OVERVIEW

1.1 Introduction

In color images, Red (R), Green (G) and Blue (B) components are combined to form
a composite color image. In this case each pixel is characterized by three values, which
together define a 3-D histogram. For example, for a color image with 256 levels per
channel, a 256×256×256 cube is formed, and each bin of the cube counts the pixels
having that particular combination of RGB intensities. Each entry can then be divided
by the total number of pixels in the image to form a normalized histogram.
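As an illustration, the following MATLAB sketch builds such a normalized 3-D color histogram by counting pixels per RGB bin; the image file name and the choice of 8 bins per channel are assumptions made here for illustration, not part of the method described later.

% Minimal sketch: normalized 3-D RGB histogram with B bins per channel.
img = imread('example.jpg');              % hypothetical uint8 RGB image
B   = 8;                                  % assumed number of bins per channel
idx = floor(double(img) / (256 / B)) + 1; % map 0..255 to bin indices 1..B
H   = zeros(B, B, B);
for r = 1:size(img, 1)
    for c = 1:size(img, 2)
        H(idx(r,c,1), idx(r,c,2), idx(r,c,3)) = H(idx(r,c,1), idx(r,c,2), idx(r,c,3)) + 1;
    end
end
H = H / (size(img,1) * size(img,2));      % divide by the total number of pixels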

1.1.1 Color Image – Fundamentals

A color spectrum is broadly divided into six colors: VIOLET, BLUE,


GREEN, YELLOW, ORANGE and RED. When viewed as a full color spectrum, no
color ends abruptly; rather, each color blends smoothly into the next. If the
light is achromatic, its only attribute is its intensity. GRAY LEVEL refers to a scalar
measure of intensity that ranges from black through shades of gray and finally to white.

Three basic quantities are used to describe the quality of a chromatic light source:
radiance, luminance and brightness. Radiance is the total amount of energy that flows
from a light source and is measured in watts. Luminance, measured in lumens, gives a
measure of the amount of energy an observer perceives from a light source.
Brightness is a subjective descriptor that cannot be measured and is one of the key
factors in describing color sensation.

With respect to the human eye, all colors are seen as variable combinations of
the three primary colors Red, Green and Blue. The specific wavelengths of the three
colors are blue = 435.8 nm, green = 546.1 nm and red = 700 nm. The primary colors
alone cannot produce all the colors of the visible range. The primary colors can be added
to produce the secondary colors of light as Magenta (R+B), Cyan (G+B) and Yellow
(R+G). The primary colors of light are Red, Green, Blue and the primary colors of
pigments are Magenta, Cyan and Yellow.

Three characteristics are generally used to distinguish one color from
another: brightness, hue and saturation. Hue is an attribute associated with the

dominant wavelength in a mixture of light waves, i.e., the color of an object is essentially
its hue. Saturation refers to relative purity, i.e., the amount of white light mixed with the
hue. Pure spectrum colors are fully saturated, while colors such as pink are less saturated.
Hue and Saturation together constitute chromaticity. A color is characterized by
brightness and chromaticity. A color is then specified by its trichromatic coefficients
defined as

r = \frac{R}{R+G+B}, \quad g = \frac{G}{R+G+B}, \quad b = \frac{B}{R+G+B}

A color model specifies colors in some standard way. The hardware-oriented
models most commonly used in practice are:

• RGB model for color monitors.

• CMY model for color printers.

• YIQ model, the standard for color TV broadcast (Y = luminance, I = in-phase,
Q = quadrature).

• HSI (Hue Saturation Intensity) and HSV (Hue Saturation Value) are used in
color image manipulations.

1.1.1(a) RGB color model

The RGB color model is composed of the primary colors Red, Green, and

Blue. This system defines the color model that is used in most color CRT monitors

and color raster graphics. They are considered the "additive primaries" since the

colors are added together to produce the desired color. The RGB model uses the

Cartesian coordinate system as shown in Figure 1.1. Notice the diagonal from (0,0,0)

black to (1,1,1) white which represents the grey-scale. Figure 1.2 is a view of the

RGB color model looking down from "White" to origin.

Figure 1.1: RGB coordinate system

Figure 1.2: RGB color model

1.1.1(b) HSV model

HSV is a common cylindrical-coordinate representation of points in an RGB


color model, which rearranges the geometry of RGB in an attempt to be more
perceptually relevant than the Cartesian representation. HSV stands for Hue,
Saturation, Value; the model is also known as HSB (Hue, Saturation, Brightness).

1.1.1(c) Color histogram

In image processing, a color histogram is a representation of the distribution of
colors in an image. For digital images it is basically the number of pixels that have colors
in each of a fixed list of color ranges that span the image's color space.

Color histograms are typical features used for classifying different ground
regions from aerial or satellite photographs. In the case of multispectral images, the
histogram may have more than three dimensions. Color histograms can be used in object
recognition and image retrieval systems and databases.

1.1.2 CBIR (Content-based image retrieval systems)

There is a growing interest in CBIR because of the limitations of metadata-
based systems, as well as the large range of possible uses for efficient image retrieval.

Essential uses of CBIR include

• Art collections

• Photograph archives

• Medical diagnosis

• Crime prevention

• Military

• Architectural and engineering design

• Geographical information and remote sensing systems.

1.2 Aim of the project

The objective of this project is mainly to develop a robust search method to be
used in content-based image retrieval (CBIR) systems. The project focuses on image
classification in content-based image retrieval systems using histogram features and
the quadratic distance as a similarity measure.

1.3 Methodology

CBIR systems have gained a lot of prominence in the last few years, mainly due to
the increase in multimedia databases and information repositories. CBIR systems can
focus on retrieving images similar to a given image based on color, shape or texture. Image
texture is an important visual primitive for searching and browsing through large collections
of similar looking patterns, hence this project focuses on CBIR for different groups of
images. The CBIR system creates a database of features from a database of images.

A feature vector is one method to represent an image by finding
measurements on a set of features. The feature vector is an n-dimensional vector
that contains these measurements, where n is the number of features. The
measurements may be symbolic, numerical, or both. An example of a symbolic
feature is a color such as "blue" or "red"; an example of a numerical feature is the
area of an object. If we take a symbolic feature and assign a number to it, it

becomes a numerical feature. The feature vector can be used to classify an object,
or provide us with condensed higher-level image information. Associated with the
feature vector is a mathematical abstraction called a feature space, which is also
n-dimensional and is created to allow visualization of feature vectors, and
relationships between them. With two- and three-dimensional feature vectors it is
modeled as a geometric construct with perpendicular axes and created by plotting
each feature measurement along one axis. For n-dimensional feature vectors it is
an abstract mathematical construction called a hyperspace. As we shall see the
creation of the feature space allows us to define distance and similarity measures
which are used to compare feature vectors and aid in the classification of
unknown samples.

The difference can be measured by a distance measure in the n-dimensional


feature space; the bigger the distance between the two vectors, the greater the
difference. Euclidean distance is the most common metric for measuring the
distance between two vectors, and is given by the square root of the sum of the
squares of the differences between vector components. Given two vectors A and B,
where A = [a_1, a_2, ..., a_n] and B = [b_1, b_2, ..., b_n], the Euclidean distance is
given by:

d_E(A, B) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}

Another distance measure, called the city block or absolute value metric, is defined as
follows (using A and B as above):

d_{CB}(A, B) = \sum_{i=1}^{n} |a_i - b_i|

This metric is computationally faster than the Euclidean distance, but gives
similar results. A distance metric that considers only the largest difference is the
maximum value metric, defined by:
d_{max}(A, B) = \max_{i} |a_i - b_i|

We can see that this will measure the vector component with the maximum distance,
which is useful for some applications. A generalized distance metric is the Minkowski
distance, defined as:

d_M(A, B) = \left( \sum_{i=1}^{n} |a_i - b_i|^r \right)^{1/r}

where r is a positive integer.

The Minkowski distance is referred to as generalized because, for instance, if


r=2 it is the same as Euclidean distance and when r=1 it is the city block metric.

The second type of metric used for comparing two feature vectors is the
similarity measure. Two vectors that are close in the feature space will have a large
similarity measure. The most common form of similarity measure is the vector inner
product. Using our definitions of the two vectors A and B, we can define the vector
inner product by the following equation:

A \cdot B = \sum_{i=1}^{n} a_i b_i

Another commonly used similarity measure is the Tanimoto metric, defined as:

S_T(A, B) = \frac{\sum_{i=1}^{n} a_i b_i}{\sum_{i=1}^{n} a_i^2 + \sum_{i=1}^{n} b_i^2 - \sum_{i=1}^{n} a_i b_i}

This metric takes on values between 0 and 1, which can be thought of as a "percent of
similarity", since the value is 1 (100%) for identical vectors and gets smaller as the
vectors get farther apart.

Histogram Quadratic Distance:

The quadratic-form distance between two feature vectors q and t is given by:

d_Q(q, t) = \sqrt{(q - t)^T A (q - t)}

where A = [a_{ij}] is the similarity matrix and a_{ij} denotes the similarity between the
elements with indices i and j. Note that q and t are treated as vectors.


Feature matching follows feature extraction, and a good similarity measure
is essential for effective retrieval. Unfortunately, the meaning of similarity is rather
vague and difficult to define. The difficulties in defining a similarity measure are:

• Different similarity measures capture different aspects of perceptual


similarities between images.

• Different features do not contribute equally, and therefore, cannot be


considered equally important for computing similarity between images.

• Different similarity measures for comparison purposes are presented but most
of these measures are not always consistent with human perception of visual
content, and their performance degrades as the dimensionality of the feature
space increases.

• So far in CBIR, no training algorithm is available and weight vectors have
been fixed heuristically. Unfortunately, fixing these weights a priori does not
utilize the full potential of the distance metric and does not reflect the user's
perception of similarity, for example when a user perceives two images as
being similar either in an individual feature or in some combination of
features.

The basic idea behind content-based image retrieval is the extraction of feature
vectors (the features can be color, shape, texture, region or spatial features, or features
in some compressed domain, etc.). These vectors are then stored in a database for
future use. When given a query image, its feature vectors are similarly extracted and
matched with those in the database. If the distance between the query image features
and a feature vector available in the database is small enough, the corresponding image
in the database is considered a match to the query.

1.4 Significance of the work

There are many applications where content-based image retrieval is important.


Some of them are:

• In architecture, real estate and interior design: allows users to find
similar buildings and room decorations that correspond to more
appealing structures in the database.

• In education: In history, for example, it is always helpful to have


immediate access to images and short video sequences of relevant
people. In such cases CBIR systems are extremely useful.

• In geographical information systems.

• In medicine, diagnosis may require recalling the current condition and


checking its resemblance with the conditions from the literature.

• In remote sensing, to find relevant data from satellite images.

• In film and video archives, to find video shots quickly for particular
characteristics such as color, texture and shape of even high level
concepts such as particular persons, places or objects.

1.5 Organization of the report

Chapter 1 describes the basic overview of the project. Chapter 2 deals with
CBIR systems and the different techniques used in them for image retrieving. Chapter
3 gives the theoretical background, algorithm and the code. Chapter 4 consists of the
results and conclusions.

CHAPTER-2
CBIR SYSTEMS

Content-based image retrieval (CBIR), also known as query by image
content (QBIC) and content-based visual information retrieval (CBVIR), is the
application of computer vision to the image retrieval problem, that is, the problem of
searching for digital images in large databases.

"Content-based" means that the search will analyze the actual contents of the
image. The term 'content' in this context might refer to colors, shapes, textures, or any
other information that can be derived from the image itself. Without the ability to
examine image content, searches must rely on metadata such as captions or keywords,
which may be laborious or expensive to produce.

2.1 History

The term CBIR seems to have originated in 1992, when it was used by T. Kato
to describe experiments into automatic retrieval of images from a database, based on
the colors and shapes present. Since then, the term has been used to describe the
process of retrieving desired images from a large collection on the basis of syntactical
image features. The techniques, tools and algorithms that are used originate from
fields such as statistics, pattern recognition, signal processing, and computer vision.

2.2 Technical progress

There is a growing interest in CBIR because of the limitations inherent in


metadata-based systems, as well as the large range of possible uses for efficient image
retrieval. Textual information about images can be easily searched using existing
technology, but requires humans to personally describe every image in the database.
This is impractical for very large databases, or for images that are generated
automatically, e.g. from surveillance cameras. It is also possible to miss images that
use different synonyms in their descriptions. Systems based on categorizing images in
semantic classes like "cat" as a subclass of "animal" avoid this problem but still face
the same scaling issues.

Potential uses for CBIR include:

• Art collections
• Photograph archives
• Retail catalogs
• Medical diagnosis

• Crime prevention
• The military
• Intellectual property
• Architectural and engineering design
• Geographical information and remote sensing systems

2.3 System architecture of CBIR

The basic idea behind content-based image retrieval is the extraction of feature
vectors (the features can be color, shape, texture, region or spatial features, or features
in some compressed domain, etc.). These vectors are stored in a database; when a query
image is given, its feature vectors are similarly extracted and matched with those in the
database. If the distance between the query image features and a feature vector available
in the database is small enough, the corresponding image in the database is considered a
match to the query.

The search is usually based on similarity rather than on exact match. The
retrieval results are then ranked according to a similarity index and a group of similar
target images is usually presented to the users.

The main block diagram consists of digitizer, feature extractor, image data base,
feature data base, matching and multidimensional indexing.

[Figure 2.1 shows this scheme as a block diagram: the input database images and the
query image each pass through a digitizer and a feature extractor; the image database
and the image feature database feed the image matching and multidimensional indexing
block, which produces the retrieved images.]

Figure 2.1: General scheme of content based image retrieval

The function of each block is explained below.

2.3.1 Digitizer

To add new images to the database, images acquired from CCD cameras, X-ray
imaging systems, microdensitometers, image dissectors, vidicon cameras, etc., need to
be digitized so that the computer can process them.

2.3.2 Image data base

The comparison between the query image and images from the database can be
done pixel by pixel, which will give a precise match. On the other hand, recognizing
objects at query time will limit the retrieval speed of the system. Due to the high expense
of such computing, this crude method of comparison is generally not used; however, the
image database still contains the raw images, which are required for visual display purposes.

2.3.3 Feature extractor

To avoid the above problem of pixel-by-pixel comparison, a better abstraction


level for representing images is the feature level. Every image is characterized by a set
of features such as texture, color and shape, and the extraction of these
features is summarized in a reduced set of k indices stored in the feature
database. The query image is then processed in the same way as the images in the
database. Thus, matching is done on the feature database.

2.3.4 Image matching and multidimensional indexing

Extracted features of the query image are compared with the features stored in
the image feature database. To achieve fast retrieval speed and make the retrieval
system truly scalable to large image collections, effective multidimensional
indexing is an indispensable part of the whole system. The system selects the N
images having the greatest overall similarity to the query image.
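As a rough sketch of this matching step (the variable names, the one-feature-vector-per-row layout and the choice of Euclidean distance are assumptions for illustration, not a description of any particular system), the selection of the N most similar images could look like:

% Minimal sketch: rank database images by distance to the query feature vector.
% 'featDB' (numImages x numFeatures) and 'queryFeat' (1 x numFeatures) are
% assumed to have been produced by the feature extractor.
N          = 10;
diffs      = bsxfun(@minus, featDB, queryFeat);      % difference to every row
d          = sqrt(sum(diffs.^2, 2));                 % Euclidean distance per image
[~, order] = sort(d, 'ascend');                      % smallest distance = best match
topN       = order(1:N);                             % indices of the N retrieved images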

2.4 Image retrieval techniques used in CBIR systems

2.4.1 Query techniques

Different implementations of CBIR make use of different types of user


queries.

2.4.2 Query by example

Query by example is a query technique that involves providing the CBIR


system with an example image that it will then base its search upon. The underlying
search algorithms may vary depending on the application, but result images should all
share common elements with the provided example.

Options for providing example images to the system include:

• A preexisting image may be supplied by the user or chosen from a random set.
• The user draws a rough approximation of the image they are looking for, for
example with blobs of color or general shapes.

This query technique removes the difficulties that can arise when trying to describe
images with words.

2.4.3 Semantic retrieval

The ideal CBIR system from a user perspective would involve what is referred
to as semantic retrieval, where the user makes a request like "find pictures of dogs" or
even "find pictures of Abraham Lincoln". This type of open-ended task is very
difficult for computers to perform - pictures of Chihuahuas and Great Danes look very
different, and Lincoln may not always be facing the camera or in the same pose.
Current CBIR systems therefore generally make use of lower-level features like
texture, color, and shape, although some systems take advantage of very common
higher-level features like faces. Not every CBIR system is generic. Some systems are
designed for a specific domain, e.g. shape matching can be used for finding parts
inside a CAD-CAM database.

2.4.4 Other query methods

Other query methods include browsing for example images, navigating


customized/hierarchical categories, querying by image region (rather than the entire
image), querying by multiple example images, querying by visual sketch, querying by
direct specification of image features, and multimodal queries (e.g. combining touch,
voice, etc.).

CBIR systems can also make use of relevance feedback, where the user
progressively refines the search results by marking images in the results as "relevant",
"not relevant", or "neutral" to the search query, then repeating the search with the new
information.

2.4.5 Content comparison techniques

Image retrieval techniques integrate both low-level visual features, addressing


the more detailed perceptual aspects, and high-level semantic features underlying the
more general conceptual aspects of visual data. The emergence of multimedia
technology and the rapid growth in the number and type of multimedia assets
controlled by public and private entities, as well as the expanding range of image and

video documents appearing on the web, have attracted significant research efforts in
providing tools for effective retrieval and management of visual data. Image retrieval
is based on the availability of a representation scheme of image content. Image
content descriptors may be visual features such as color, texture, shape, and spatial
relationships, or semantic primitives.

Conventional information retrieval is based solely on text, and these


approaches to textual information retrieval have been transplanted into image retrieval
in a variety of ways, including the representation of an image as a vector of feature
values. However, “a picture is worth a thousand words.” Image contents are much
more versatile compared with text, and the amount of visual data is already enormous
and still expanding very rapidly. Hoping to cope with these special characteristics of
visual data, content-based image retrieval methods have been introduced. It has been
widely recognized that the family of image retrieval techniques should become an
integration of both low-level visual features, addressing the more detailed perceptual
aspects, and high-level semantic features underlying the more general conceptual
aspects of visual data. Neither of these two types of features is sufficient to retrieve or
manage visual data in an effective or efficient way. Although efforts have been
devoted to combining these two aspects of visual data, the gap between them is still a
huge barrier in front of researchers. Intuitive and heuristic approaches do not provide
us with satisfactory performance. Therefore, there is an urgent need of finding and
managing the latent correlation between low-level features and high-level concepts.
How to bridge this gap between visual features and semantic features has been a
major challenge in this research field.

The different types of information that are normally associated with images are:

• Content-independent metadata: data that is not directly concerned with image


content, but related to it. Examples are image format, author’s name, date, and
location.
• Content-based metadata:
o Non-information-bearing metadata: data referring to low-level or
intermediate-level features, such as color, texture, shape, spatial

relationships, and their various combinations. This information can
easily be computed from the raw data.
o Information-bearing metadata: data referring to content semantics,
concerned with relationships of image entities to real-world entities.
This type of information, such as that a particular building appearing in
an image is the Empire State Building , cannot usually be derived from
the raw data, and must then be supplied by other means, perhaps by
inheriting this semantic label from another image, where a similar-
appearing building has already been identified.

Low-level visual features such as color, texture, shape and spatial relationships
are directly related to perceptual aspects of image content. Since it is usually easy to
extract and represent these features and fairly convenient to design similarity
measures by using the statistical properties of these features, a variety of content-
based image retrieval techniques have been proposed. High-level concepts, however,
are not extracted directly from visual contents, but they represent the relatively more
important meanings of objects and scenes in the images that are perceived by human
beings. These conceptual aspects are more closely related to users’ preferences and
subjectivity. Concepts may vary significantly in different circumstances. Subtle
changes in the semantics may lead to dramatic conceptual differences. Needless to
say, it is a very challenging task to extract and manage meaningful semantics and to
make use of them to achieve more intelligent and user-friendly retrieval.

High-level conceptual information is normally represented by using text


descriptors. Traditional indexing for image retrieval is text-based. In certain content-
based retrieval techniques, text descriptors are also used to model perceptual aspects.
However, the inadequacy of text description is very obvious:

• It is difficult for text to capture the perceptual saliency of visual features.


• It is rather difficult to characterize certain entities, attributes, roles or events by
means of text only.
• Text is not well suited for modeling the correlation between perceptual and
conceptual features.

• Text descriptions reflect the subjectivity of the annotator and the annotation
process is prone to be inconsistent, incomplete, ambiguous, and very difficult
to be automated.

Although it is an obvious fact that image contents are much more complicated
than textual data stored in traditional databases, there is an even greater demand for
retrieval and management tools for visual data, since visual information is a more
capable medium of conveying ideas and is more closely related to human perception
of the real world. Image retrieval techniques should provide support for user queries
in an effective and efficient way, just as conventional information retrieval does for
textual retrieval. In general, image retrieval can be categorized into the following
types:

• Exact Matching: This category is applicable only to static environments or


environments in which features of the images do not evolve over an extended
period of time. Databases containing industrial and architectural drawings or
electronics schematics are examples of such environments.

• Low-Level Similarity-Based Searching: In most cases, it is difficult to


determine which images best satisfy the query. Different users may have
different needs and wants. Even the same user may have different preferences
under different circumstances. Thus, it is desirable to return the top several
similar images based on the similarity measure, so as to give users a good
sampling. The similarity measure is generally based on simple feature
matching and it is quite common for the user to interact with the system so as
to indicate to it the quality of each of the returned matches, which helps the
system adapt to the users’ preferences.
• High-Level Semantic-Based Searching: In this case, the notion of similarity is
not based on simple feature matching and usually results from extended user
interaction with the system. Research in this area is quite active, yet still in its
infancy. Many important breakthroughs are yet to be made.

For either type of retrieval, the dynamic and versatile characteristics of image
content require expensive computations and sophisticated methodologies in the areas
of computer vision, image processing, data visualization, indexing, and similarity
measurement. In order to manage image data effectively and efficiently, many
schemes for data modeling and image representation have been proposed. Typically,
each of these schemes builds a symbolic image for each given physical image to
provide logical and physical data independence. Symbolic images are then used in
conjunction with various index structures as proxies for image comparisons to reduce
the searching scope. The high-dimensional visual data is usually reduced into a lower-
dimensional subspace so that it is easier to index and manage the visual contents.
Once the similarity measure has been determined, indexes of corresponding images
are located in the image space and those images are retrieved from the database. Due
to the lack of any unified framework for image representation and retrieval, certain
methods may perform better than others under differing query situations. Therefore,
these schemes and retrieval techniques have to be somehow integrated and adjusted
on the fly to facilitate effective and efficient image data management.

Visual feature extraction is the basis of any content-based image retrieval


technique. Widely used features include color, texture, shape and spatial relationships.
Because of the subjectivity of perception and the complex composition of visual data,
there does not exist a single best representation for any given visual feature. Multiple
approaches have been introduced for each of these visual features and each of them
characterizes the feature from a different perspective.

2.4.5 (a) Color

Color is one of the most widely used visual features in content-based image
retrieval. It is relatively robust and simple to represent. Various studies of color
perception and color spaces have been proposed, in order to find color-based
techniques that are more closely aligned with the ways that humans perceive color.
The color histogram has been the most commonly used representation technique,
statistically describing combined probabilistic properties of the various color channels
(such as the(R)ed, (G)reen, and (B)lue channels), by capturing the number of pixels
having particular properties. For example, a color histogram might describe the
number of pixels of each red channel value in the range [0, 255]. It is well known that
histograms lose information related to the spatial distribution of colors and that two
very different images can have very similar histograms. There has been much work

done in extending histograms to capture such spatial information. Two of the well-
known approaches for this are correlograms and anglograms. Correlograms capture
the distribution of colors of pixels in particular areas around pixels of particular
colors, while anglograms capture a particular signature of the spatial arrangement of
areas (single pixels or blocks of pixels) having common properties, such as similar
colors. Anglograms also can be used for texture and shape features.

2.4.5 (b) Texture

Texture refers to the patterns in an image that present the properties of


homogeneity that do not result from the presence of a single color or intensity value.
It is a powerful discriminating feature, present almost everywhere in nature. However,
it is almost impossible to describe texture in words, because it is virtually a statistical
and structural property. There are three major categories of texture-based techniques,
namely, probabilistic/statistical, spectral, and structural approaches. Probabilistic
methods treat texture patterns as samples of certain random fields and extract texture
features from these properties. Spectral approaches involve the sub-band
decomposition of images into different channels, and the analysis of spatial frequency
content in each of these sub-bands in order to extract texture features. Structural
techniques model texture features based on heuristic rules of spatial placements of
primitive image elements that attempt to mimic human perception of textural patterns.

The well known Tamura features include coarseness, contrast, directionality,


line-likeness, regularity, and roughness. Different researchers have selected different
subsets of these heuristic descriptors. It is believed that the combination of contrast,
coarseness, and directionality best represents the textural patterns of color images.

2.4.5(c) Shape

Shape representation is normally required to be invariant to translation,


rotation, and scaling. In general, shape representations can be categorized as either
boundary-based or region-based. A boundary-based representation uses only the outer
boundary characteristics of the entities, while a region-based representation uses the

entire region. Shape features may also be local or global. A shape feature is local if it
is derived from some proper subpart of an object, while it is global if it is derived
from the entire object.

A combination of the above features are extracted from each image and
transformed into a point of a high-dimensional vector space. Using this
representation, the many techniques developed by the information retrieval
community can be used to advantage. As the dimensionality of the underlying space is
still quite high, however, the many disadvantages caused by the curse of
dimensionality also prevail.

Originally devised in the context of estimating probability density functions in


high-dimensional spaces, the curse of dimensionality expresses itself in high-
dimensional indexing by causing log time complexity indexing approaches to behave
no better than linear search as the dimensionality of the search space increases. This is
why there has been so much effort spent in the development of efficient high-
dimensional indexing techniques, on the one hand, and in dimensional reduction
techniques which capture the salient semantics, on the other hand.

As the ultimate goal of image retrieval is to serve the needs and wants of users
who may not even know what they are looking for but can recognize it when they see
it, there has been much work done in trying to discover what is in the mind of the
user. A very common technique for this is relevance feedback. Originally advanced in
the information retrieval community, it has become a standard in most existing image
retrieval systems, although some researchers believe that more involved user
interactions are necessary to discover user semantics. This technique helps the system
refine its search by asking the user to rank the returned results as to relevance. Based
on these results, the system learns how to retrieve results more in line with what the
user wants. There have been many new approaches developed in recent years, but the
classical techniques are query refinement or feature reweighting. Query refinement
transforms the query so that more of the positive and less of the negative examples
will be retrieved. Feature reweighting puts more weight on features which help to
retrieve positive examples and less weight on features which aid in retrieving negative

examples. This process continues for as many rounds as is necessary to produce
results acceptable to the user.
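A minimal sketch of one common feature-reweighting heuristic (weighting each feature inversely to its spread over the images marked relevant); this is only one standard choice and is not claimed to be the scheme used by any particular system:

% Feature reweighting from relevance feedback (heuristic sketch).
% 'relFeats' (numRelevant x numFeatures): feature vectors the user marked relevant;
% 'q' and 'x' (1 x numFeatures): query and database feature vectors. All assumed given.
sigma = std(relFeats, 0, 1);          % spread of each feature over the relevant images
w     = 1 ./ (sigma + eps);           % small spread -> feature counts more
w     = w / sum(w);                   % normalize the weights
dw    = sqrt(sum(w .* (q - x).^2));   % weighted Euclidean distance used for re-ranking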

A variety of shape representation techniques is currently available, all of which
are invariant to size, position and orientation. They may be grouped as:

1. Methods based on local features such as points, angles, line segments, curvature,
etc.

2. Template matching methods.

3. Transformation coefficient methods: The wavelet transformation can be used with
different basis functions. In 1995 Jacobs used the Haar basis functions, which do not
perform well when the query image is a small translation of the target image. This
problem is less visible in the approach of Wang, which uses the Daubechies basis functions.

4. Global object methods: These methods work on the object as a whole. An important
drawback of this approach is that the object in the image must be clearly segmented. In
general, such methods are not robust against noise and occlusions. Global object
features such as area, circularity, eccentricity, compactness, major axis orientation,
Euler number, concavity tree, shape numbers, and algebraic moments can all be used
for shape description. A number of such features are used by the QBIC system.

5. Modal matching: Rather than working with the area of an object, the boundary can
be used instead. Samples of the boundary can be described with Fourier descriptors,
the coefficients of the discrete Fourier transform.

6. Curvature scale space: Another approach is the use of a scale space representation
of the curvature of the contour of objects. Another way of reducing curvature changes
is based on the turning angle function, or tangent space representation.

7. Voting schemes: Voting schemes generally work on so-called interest points.
For the purpose of CBIR, such points are, for example, corner points detected in
images. Geometric hashing is a method that determines whether there is a transformed
subset of the query point set that matches a subset of a target point set. The generalized
Hough transform, or pose clustering, is also a voting scheme. Wolfson made a
comparison between geometric hashing and the pose clustering alignment method.

CHAPTER-3
THEORY

3.1 Introduction
Content-based image retrieval (CBIR) systems are very useful and efficient if
images are classified according to particular aspects. For example, in a large
database the images can be divided into different classes such as landscapes,
buildings, animals, faces, artificial images, etc.

Many color image classification methods use color histograms. The aim of this
work is to develop a color-histogram-based classification approach which is efficient,
fast and sufficiently robust. To this end we used some features of the color
histograms and classified the images using these features. The advantage of this
approach is that the comparison of histogram features is much faster and more efficient
than that of other commonly used methods.

3.2 Theoretical background

3.2.1. Histogram Features

The histogram of an image is a plot of the gray level values or the intensity
values of a color channel versus the number of pixels at that value. The shape of
the histogram provides us with information about the nature of the image, or sub
image if we are considering an object within the image. For example, a very
narrow histogram implies a low contrast image, a histogram skewed toward the
high end implies a bright image, and a histogram with two major peaks, called
bimodal, implies an object that is in contrast with the background.

The histogram features that we will consider are statistical-based features,


where the histogram is used as a model of the probability distribution of the
intensity levels. These statistical features provide us with information about the
characteristics of the intensity level distribution for the image. We define the first-
order histogram probability, P(g) as:

P(g) = \frac{N(g)}{M}        ........ (3.1)

where M is the number of pixels in the image (if the entire image is under consideration,
then M = N^2 for an N×N image) and N(g) is the number of pixels at gray level g. As
with any probability distribution, all the values of P(g) are less than or equal to 1,
and the sum of all the P(g) values is equal to 1. The features based on the first-order
histogram probability are the mean, standard deviation, skew, energy, and entropy.
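A minimal MATLAB sketch of equation (3.1) for one 8-bit channel (the file name is only illustrative):

% First-order histogram probability P(g), g = 0..255, for a single channel.
I = imread('example.png');                  % hypothetical image file
if size(I, 3) == 3, I = rgb2gray(I); end    % assume we analyze one channel
PrH = imhist(I) / numel(I);                 % N(g)/M; the entries of PrH sum to 1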

The mean is the average value, so it tells us something about the general
brightness of the image. A bright image will have a high mean, and a dark image will

have a low mean. We will use L as the total number of intensity levels available, so
the gray levels range from 0 to L-1. For example, for typical 8-bit image data, L is
256 and g ranges from 0 to 255. We can define the mean as follows:

\bar{g} = \sum_{g=0}^{L-1} g \, P(g) = \frac{1}{M} \sum_{r} \sum_{c} I(r, c)        ........ (3.2)

If we use the second form of the equation, we sum over the rows and columns
corresponding to the pixels in the image under consideration.

The standard deviation, which is the square root of the variance,
tells us something about the contrast. It describes the spread in the data. So a high
contrast image will have a high variance, and a low contrast image will have a low
variance. It is defined as follows:

\sigma_g = \sqrt{\sum_{g=0}^{L-1} (g - \bar{g})^2 \, P(g)}        ........ (3.3)

The skew measures the asymmetry about the mean in the intensity level
distribution. It is defined as:

SKEW = \frac{1}{\sigma_g^{3}} \sum_{g=0}^{L-1} (g - \bar{g})^3 \, P(g)        ........ (3.4)

The skew will be positive if the tail of the histogram spreads to the right (positive) and
negative if the tail of the histogram spreads to the left (negative). Another method to
measure the skew uses the mean, mode, and standard deviation, where the mode is
defined as the peak, or highest value:

SKEW' = \frac{\bar{g} - \text{mode}}{\sigma_g}        ........ (3.5)

This method of measuring skew is more computationally efficient, especially


considering that, typically, the mean and standard deviation have already been
calculated.

The energy measure tells us something about how the intensity levels are distributed:

ENERGY = \sum_{g=0}^{L-1} [P(g)]^2        ........ (3.6)

The energy measure has a maximum value of 1 for an image with a constant value,
and gets increasingly smaller as the pixel values are distributed across more intensity
levels (remember all the values are less than or equal to 1). The larger this
value is, the easier it is to compress the image data. If the energy is high, it tells us that
the number of intensity levels in the image is small, that is, the distribution is
concentrated in only a small number of different intensity levels.

The entropy is a measure that tells us how many bits we need to code the image data,
and is given by:

ENTROPY = -\sum_{g=0}^{L-1} P(g) \log_2 [P(g)]        ........ (3.7)

As the pixel values in the image are distributed among more intensity levels, the
entropy increases. A complex image has higher entropy than a simple image. This
measure tends to vary inversely with the energy.

3.2.2 Feature Vectors and Feature Spaces

A feature vector is one method to represent an image by finding measurements


on a set of features. The feature vector is an n-dimensional vector that contains
these measurements, where n is the number of features. The measurements may be
symbolic, numerical, of both. An example of a symbolic feature is color such as
“blue” or “red”: an example of a numerical feature is the area of an object. If we
take a symbolic feature and assign a number to it, it becomes a numerical feature.
Care must be taken in assigning numbers to symbolic features, so that the numbers
are assigned in a meaningful way. For example, with color we normally think of
the hue by its name such as “orange” or “magenta”. In this case, we could perform
an HSL transform on the RGB data, and use the H (hue) value as a numerical
color feature. But with the HSL transform the hue value ranges from 0 to 360
degrees, and 0 is “next to” 360, so it would be invalid to compare two colors by
simply subtracting the hue values.
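For example, a comparison that respects this wrap-around takes the shorter way around the color circle (a small sketch, assuming hue is expressed in degrees):

% Circular difference between two hue angles (in degrees).
h1   = 10;  h2 = 350;              % two hues that are perceptually close
d    = abs(h1 - h2);
dHue = min(d, 360 - d);            % gives 20 rather than 340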

The feature vector can be used to classify an object, or provide us with


condensed higher-level image information. Associated with the feature vector is a
mathematical abstraction called a feature space, which is also n-dimensional and
is created to allow visualization of feature vectors, and relationships between
them. With two- and three-dimensional feature vectors it is modeled as a geometric
construct with perpendicular axes and created by plotting each feature
measurement along one axis. For n-dimensional feature vectors it is an abstract
mathematical construction called a hyperspace. As we shall see the creation of the
feature space allows us to define distance and similarity measures which are used
to compare feature vectors and aid in the classification of unknown samples.

3.2.3 Distance and Similarity Measures

The feature vector is meant to represent the object and will be used to classify
it. To perform the classification we need methods to compare two feature vectors.
The primary method is to either measure the difference between the two, or to
measure the similarity. Two vectors that are closely related will have a small
difference and a large similarity.

The difference can be measured by a distance measure in the n-dimensional


feature space; the bigger the distance between the two vectors, the greater the
difference. Euclidean distance is the most common metric for measuring the
distance between two vectors, and is given by the square root of the sum of the
squares of the differences between vector components. Given two vectors A and B,
where

A = [a_1, a_2, ..., a_n], \quad B = [b_1, b_2, ..., b_n]        ........ (3.8)

the Euclidean distance is given by:

d_E(A, B) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}        ........ (3.9)

Another distance measure, called the city block or absolute value metric, is defined as
follows (using A and B as above):

d_{CB}(A, B) = \sum_{i=1}^{n} |a_i - b_i|        ........ (3.10)

This metric is computationally faster than the Euclidean distance, but gives
similar results. A distance metric that considers only the largest difference is the
maximum value metric, defined by:

d_{max}(A, B) = \max_{i} |a_i - b_i|        ........ (3.11)

We can see that this will measure the vector component with the maximum distance,
which is useful for some applications. A generalized distance metric is the Minkowski
distance, defined as:

d_M(A, B) = \left( \sum_{i=1}^{n} |a_i - b_i|^r \right)^{1/r}        ........ (3.12)

where r is a positive integer.

The Minkowski distance is referred to as generalized because, for instance, if


r=2 it is the same as Euclidean distance and when r=1 it is the city block metric.
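A minimal MATLAB sketch of these distance measures for two feature vectors A and B (the example values are arbitrary):

% Distance measures between feature vectors A and B (equations 3.9 - 3.12).
A    = [1 2 3];  B = [2 2 5];        % small illustrative vectors
dE   = sqrt(sum((A - B).^2));        % Euclidean distance
dCB  = sum(abs(A - B));              % city block (absolute value) metric
dMax = max(abs(A - B));              % maximum value metric
r    = 3;                            % any positive integer
dMin = sum(abs(A - B).^r)^(1/r);     % Minkowski distance (r=2: Euclidean, r=1: city block)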

The second type of metric used for comparing two feature vectors is the
similarity measure. Two vectors that are close in the feature space will have a large
similarity measure. The most common form of similarity measure is the vector inner
product. Using our definitions of the two vectors A and B, we can define the vector
inner product by the following equation:

A \cdot B = \sum_{i=1}^{n} a_i b_i        ........ (3.13)

Another commonly used similarity measure is the Tanimoto metric, defined as:

S_T(A, B) = \frac{\sum_{i=1}^{n} a_i b_i}{\sum_{i=1}^{n} a_i^2 + \sum_{i=1}^{n} b_i^2 - \sum_{i=1}^{n} a_i b_i}        ........ (3.14)

This metric takes on values between 0 and 1, which can be thought of as a "percent of
similarity", since the value is 1 (100%) for identical vectors and gets smaller as the
vectors get farther apart.
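A corresponding sketch of the two similarity measures (again with arbitrary example vectors):

% Similarity measures between feature vectors A and B (equations 3.13 - 3.14).
A      = [1 2 3];  B = [2 2 5];
sInner = sum(A .* B);                                  % vector inner product
sTani  = sInner / (sum(A.^2) + sum(B.^2) - sInner);    % Tanimoto metric, in [0, 1]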

Histogram Quadratic Distance:


The quadratic-form distance between two feature vectors q and t is given by:

d_Q(q, t) = \sqrt{(q - t)^T A (q - t)}        ........ (3.15)

where A = [a_{ij}] is the similarity matrix and a_{ij} denotes the similarity between the
elements with indices i and j. Note that q and t are treated as vectors.
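A minimal sketch of equation (3.15); the particular bin-similarity matrix used here, a_{ij} = 1 - |i - j|/(N - 1), is only one common choice and is an assumption, not necessarily the matrix used in this project:

% Histogram quadratic distance between two normalized histograms q and t.
N      = 256;                             % assumed number of histogram bins
[i, j] = meshgrid(1:N, 1:N);
A      = 1 - abs(i - j) / (N - 1);        % assumed bin-similarity matrix a_ij
q      = rand(N, 1);  q = q / sum(q);     % placeholder histograms (replace with real data)
t      = rand(N, 1);  t = t / sum(t);
dQ     = sqrt((q - t)' * A * (q - t));    % d(q,t) per equation (3.15)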

3.2.4 Classification Algorithms and Methods

The simplest algorithm for identifying a sample from the test set is called the
nearest neighbor method. The object of interest is compared to every sample in the
training set, using a distance measure, a similarity measure, or a combination of
measures. The “unknown” object is then identified as belonging to the same class as
the closest sample in the training set. This is indicated by the smallest number if using
a distance measure, or the largest number if using a similarity measure. This process
is computationally intensive and not very robust.

We can make the nearest neighbor method more robust by selecting not just the
closest sample in the training set, but by considering a group of close feature
vectors. This is called the K-nearest neighbor method, where K is the number of
neighbors considered. We then assign the unknown feature vector to the class that
occurs most often in the set of K neighbors. This is still very computationally
intensive, since we have to compare each unknown sample to every sample in the
training set, and we want the training set to be as large as possible to maximize success.

We can reduce this computational burden by using a method called Nearest


Centroid. Here, we find the centroid for each class from the samples in the training
set, and then we compare the unknown samples to the representative centroids only. The
centroids are calculated by finding the average value of each vector component over the
training set.
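A minimal sketch of the nearest centroid rule (the layout of the training data and the use of numeric class labels are assumptions for illustration):

% Nearest centroid classification sketch.
% 'trainFeats' (numSamples x numFeatures), 'trainLabels' (numSamples x 1, numeric)
% and the unknown vector 'x' (1 x numFeatures) are assumed to be given.
classes   = unique(trainLabels);
centroids = zeros(numel(classes), size(trainFeats, 2));
for k = 1:numel(classes)
    centroids(k, :) = mean(trainFeats(trainLabels == classes(k), :), 1);
end
d = sum(abs(bsxfun(@minus, centroids, x)), 2);   % absolute value (city block) metric
[~, best] = min(d);
predictedClass = classes(best);                  % class of the closest centroid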

3.3 Experiments

During the experiments 200 images were used, divided into
four equal-sized classes: dinosaur, bus, flower and elephant images, each containing one
object on a homogeneous background. One image of each class can be seen in Figure 3.1.

Figure 3.1: Sample images of two classes: flowers, elephants

From each image class, 25 images formed the training set. During
the training phase the YCbCr color space was applied, because it is the most efficient
color space for classification.

Using each training set, the histograms of the three color channels were generated
and the above-mentioned histogram features were calculated. Hence each training
set contained 25 fifteen-dimensional feature vectors, which span a 15-
dimensional hyperspace. In these hyperspaces the nearest centroids were calculated
as the class property, using the absolute value metric.
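A sketch of how one such 15-dimensional feature vector can be assembled for a single image (five histogram features for each of the Y, Cb and Cr channels) using the HistogramProperties function listed at the end of this section; the particular choice of the five fields and the file name are assumptions for illustration:

% Sketch: build a 15-dimensional feature vector (5 histogram features x 3 YCbCr channels).
rgb = imread('example.jpg');                            % hypothetical training image
ycc = rgb2ycbcr(rgb);
fv  = [];
for ch = 1:3
    PrH = imhist(ycc(:, :, ch)) / numel(ycc(:, :, ch)); % first-order probability, eq. (3.1)
    t   = HistogramProperties(PrH, 256);                % features of this channel
    fv  = [fv, t.m, t.s, t.sk1, t.er, t.ep];            % mean, std, skew, energy, entropy
end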

After generating the class properties from the training set, the remaining 100
images were analyzed to determine which class each is closest to. It was found that 87% of
the images were correctly classified during the experiment.

The algorithms were coded in MATLAB, because this system is computationally
rather fast and code development is very simple. For example, the MATLAB code
of the histogram feature generation is given below.

function t = HistogramProperties(PrH, N)
% PrH: first-order histogram probability vector (N x 1); N: number of intensity levels
t.m   = sum([1:N]' .* PrH);                          % mean
t.s   = sqrt(sum(([1:N]' - t.m).^2 .* PrH));         % standard deviation
t.sk1 = sum(([1:N]' - t.m).^3 .* PrH) / t.s^3;       % skew (moment-based)
mode  = find(PrH == max(PrH));                       % location(s) of the histogram peak
t.sk2 = (t.m - mode(ceil(length(mode)/2))) / t.s;    % skew (mode-based)
t.er  = sum(PrH.^2);                                 % energy
t.ep  = -sum(PrH .* log2(PrH + eps));                % entropy

3.4 Algorithm
• Load all database images in MATLAB\Work using load function.

• Select an image from database as query image using browse function.

• In browse, the query image is selected with the MATLAB user-interface
file-selection command (uigetfile).

• Use the search function to search for similar images in the
database.

• Search function performs the following tasks.

1. A feature vector is created for every image by converting the image from the
RGB to the HSV model.

2. Red, green and blue histograms are generated and the values are stored in a
feature vector.

3. In a similar way, histograms are generated for all images in the database.

4. The quadratic distance is calculated between the query image feature vector
and the database image feature vectors.

5. A text file is created with the image names in ascending order of distance, so that
the image with the least quadratic distance is placed first (a sketch of this
ranking step is given after this list).

6. 20 images are selected in the red plane, 20 images are selected in the green
plane, and 20 images are selected in the blue plane.

7. Finally, the first 10 images in all 3 planes are displayed in the GUIDE
window.

• The image numbers of the displayed images are compared with each domain range.
If an image falls in a domain, a counter for that domain is incremented by one; if not,
the image is compared with the next domain range. In this way counters are incremented
for each range, and the name of the domain with the highest count is displayed in a
dialogue box: "The query image belongs to ----- domain".

• Accuracy is also calculated as the ratio of the number of retrieved images that
actually belong to the query image domain to the total number of images in that plane.

• The close function closes the GUIDE window.

• The exit function exits from the MATLAB prompt.
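The following sketch illustrates the ranking step referred to above, using per-channel histograms of the HSV image and the quadratic distance of equation (3.15). It is only an illustration under stated assumptions (file names, bin count and the similarity matrix are hypothetical); the actual GUI code of the project is listed in Appendix A.7.

% Illustrative sketch of the ranking step of the search function.
files = {'img1.jpg', 'img2.jpg', 'img3.jpg'};   % hypothetical database image files
Nbins = 64;                                     % assumed number of bins per channel
[i, j] = meshgrid(1:Nbins, 1:Nbins);
A      = 1 - abs(i - j) / (Nbins - 1);          % assumed bin-similarity matrix

qhsv = rgb2hsv(imread('query.jpg'));            % hypothetical query image, HSV in [0,1]
qH   = zeros(Nbins, 3);
for ch = 1:3
    qH(:, ch) = imhist(qhsv(:, :, ch), Nbins) / numel(qhsv(:, :, ch));
end

d = zeros(numel(files), 1);
for k = 1:numel(files)
    hsv = rgb2hsv(imread(files{k}));
    for ch = 1:3                                % accumulate distance over the H, S, V planes
        h    = imhist(hsv(:, :, ch), Nbins) / numel(hsv(:, :, ch));
        v    = qH(:, ch) - h;
        d(k) = d(k) + sqrt(v' * A * v);         % quadratic distance, eq. (3.15)
    end
end
[~, order]  = sort(d);                          % smallest total distance first
rankedNames = files(order);                     % database images ranked by similarity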

CHAPTER-4

RESULTS & CONCLUSIONS

4.1 Results

Number of images belonging to the query image domain in the red plane = 9

Number of images belonging to the query image domain in the green plane = 10

Number of images belonging to the query image domain in the blue plane = 9

Therefore the total number of images belonging to the query image domain is 28 out
of 30. Hence the query image belongs to the elephant domain.

Percentage accuracy = (28/30)*100 = 93.3%

Number of images belonging to the query image domain in the red plane = 4

Number of images belonging to the query image domain in the green plane = 10

Number of images belonging to the query image domain in the blue plane = 10

Therefore the total number of images belonging to the query image domain is 24 out
of 30. Hence the query image belongs to the flower domain.

Percentage accuracy = (24/30)*100 = 80%

4.2 Conclusions
Through this project a new approach to color image classification was
introduced. The main advantage of this method is the use of simple image features,
namely histogram features. Histogram features can be generated from the image histogram
very quickly, and the comparison of these features is computationally fast and
efficient. From the results obtained we observe that the accuracy for the retrieval of
images from the elephant domain is 93.3% and from the flower domain is 80%. In
general, the accuracy obtained using this technique ranges from 80% to 95%.

REFERENCES
[1] C. Carson, M. Thomas, S. Belongie, J. M. Hellerstein, J. Malik, "Blobworld: A
system for region-based indexing and retrieval", Third International Conference on
Visual Information Systems, Springer, 1999.

[2] Sz. Sergyán, "Color content-based image classification", 5th Slovakian-Hungarian
Joint Symposium on Applied Machine Intelligence and Informatics, Poprad, Slovakia,
pp. 427-434, 2007.

[3] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, R. Jain, "Content-based
image retrieval at the end of the early years", IEEE Transactions on Pattern Analysis
and Machine Intelligence, vol. 22, no. 12, pp. 1349-1380, 2000.

[4] S. E. Umbaugh, "Computer Imaging - Digital Image Analysis and Processing",
CRC Press, 2005.

[5] A. Vadivel, A. K. Majumdar, S. Sural, "Characteristics of weighted feature vector
in content-based image retrieval applications", International Conference on Intelligent
Sensing and Information Processing, pp. 127-132, 2004.

APPENDIX A

INTRODUCTION TO DIGITAL IMAGE PROCESSING

A.1 What is DIP?

Digital image processing is the use of computer algorithms to perform image


processing on digital images. As a subfield of digital signal processing, digital image
processing has many advantages over analog image processing. It allows a much
wider range of algorithms to be applied to the input data, and can avoid problems such
as the build-up of noise and signal distortion during processing.

A.2 What is an image?

An image is represented as a two dimensional function f(x,y) where x and y are


spatial co-ordinates and the amplitude of ‘f’ at any pair of coordinates (x,y) is called
the intensity of the image at that point.

A.2.1 Gray scale image

A grayscale or greyscale digital image is an image in which the value of each pixel is
a single sample, that is, it carries only intensity information. Images of this sort, also
known as black-and-white, are composed exclusively of shades of gray, varying from
black at the weakest intensity to white at the strongest.

Figure A.1: A typical grayscale image.


Grayscale images are distinct from one-bit black-and-white images, which in the
context of computer imaging are images with only the two colors, black,
and white (also called bi-level or binary images). Grayscale images have many shades

of gray in between. Grayscale images are also called monochromatic, denoting the
absence of any chromatic variation.

A.2.2 Color images

A digital color image is a digital image that includes color information for each pixel.
For visually acceptable results, it is necessary (and almost sufficient) to provide
three samples (color channels) for each pixel, which are interpreted as coordinates in
some color space. The RGB color space is commonly used in computer displays, but
other spaces such as YCbCr, HSV are often used in other contexts.

A.3 Image as matrices:


A digital image can be represented by the following function:

f(x,y) = | f(0,0)      f(0,1)      ...  f(0,N-1)   |
         | f(1,0)      f(1,1)      ...  f(1,N-1)   |
         | ...         ...              ...        |
         | f(M-1,0)    f(M-1,1)    ...  f(M-1,N-1) |

The right side of this equation is a digital image by definition. Each element of this
array is called an image element, picture element, pixel or pel.

A digital image can be represented naturally as a MATLAB matrix:

f = | f(1,1)    f(1,2)    ...  f(1,N) |
    | f(2,1)    f(2,2)    ...  f(2,N) |
    | ...       ...            ...    |
    | f(M,1)    f(M,2)    ...  f(M,N) |

where f(1,1) = f(0,0). Clearly the two representations are identical, except for the shift
in origin. The notation f(p,q) denotes the element located in row p and column q.
For example, f(6,2) is the element in the sixth row and second column of the matrix f.
Typically we use the letters M and N respectively to denote the number of rows and
columns in a matrix. A 1×N matrix is called a row vector, whereas an M×1 matrix is
called a column vector. A 1×1 matrix is a scalar.

Matrices in MATLAB are stored in variables with names such as A, a, RGB,
real_array and so on. Variables must begin with a letter and contain only letters,
numerals and underscores. All MATLAB quantities are written using monospace

characters. We use conventional Roman, italic notation such as f (x, y), for
mathematical expressions.

A.4 Reading images


You can read standard image files (TIFF, JPEG, BMP, etc.) into MATLAB using the
‘imread’ function. The type of data returned by imread depends on the type of image
you are reading.

Syntax:

A = imread(filename,fmt)

[X,map] = imread(filename,fmt)

[...] = imread(filename)

[...] = imread(URL,...)

[...] = imread(...,idx) (CUR, GIF, ICO, and TIFF only)

[...] = imread(...,'PixelRegion',{ROWS, COLS}) (TIFF only)

[...] = imread(...,'frames',idx) (GIF only)

[...] = imread(...,ref) (HDF only)

[...] = imread(...,'BackgroundColor',BG) (PNG only)

[A,map,alpha] = imread(...) (ICO, CUR, and PNG only)

Description

The imread function supports four general syntaxes, described below. The imread
function also supports several other format-specific syntaxes.

A = imread(filename,fmt) reads a grayscale or color image from the file specified by


the string filename, where the string fmt specifies the format of the file. If the file is
not in the current directory or in a directory in the MATLAB path, specify the full
pathname of the location on your system. For a list of all the possible values for fmt,
see Supported Formats. If imread cannot find a file named filename, it looks for a file
named filename.fmt. imread returns the image data in the array A. If the file contains
a grayscale image, A is a two-dimensional (M-by-N) array. If the file contains a color
image, A is a three-dimensional (M-by-N-by-3) array. The class of the returned array
depends on the data type used by the file format. For most file formats, the color
image data returned uses the RGB color space. For TIFF files, however, imread can
return color data that uses the RGB, CIELAB, ICCLAB, or CMYK color spaces. If
the color image uses the CMYK color space, A is an M-by-N-by-4 array.

[X,map] = imread(filename,fmt) reads the indexed image in filename into X and its
associated color map into map. The color map values are rescaled to the range [0, 1].
[...] = imread(filename) attempts to infer the format of the file from its content.

[...] = imread(URL,...) reads the image from an Internet URL. The URL must include the protocol type (e.g., http://).
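For example, a minimal sketch of these syntaxes is shown below; 'peppers.png' is one of the sample images supplied with MATLAB, while 'myscan.tif' is a hypothetical file name used only for illustration.

rgb = imread('peppers.png');        % color image: M-by-N-by-3 uint8 array
[X, map] = imread('myscan.tif');    % indexed image and its color map
A = imread('myscan.tif', 2);        % second image of a multipage TIFF (idx syntax)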

A.5 Displaying images

Images are displayed on the MATLAB desktop using function imshow, which has the
basic syntax imshow(f,g) where f is an image array and g is the number of intensity
levels used to display it. If g is omitted, it defaults to 256 levels.

imshow(I,[low high]) displays I as a grayscale intensity image, specifying the data


range for I. The value low (and any value less than low) displays as black; the value
high (and any value greater than high) displays as white. Values in between are
displayed as intermediate shades of gray, using the default number of gray levels. If
you use an empty matrix ([]) for [low high], imshow uses [min(I(:)) max(I(:))]; that is,
the minimum value in I is displayed as black, and the maximum value is displayed as
white.

imshow(BW) displays the binary image BW. imshow displays pixels with the value 0
(zero) as black and pixels with the value 1 as white.

imshow(X,map) displays the indexed image X with the color map map.

imshow(RGB) displays the true-color image RGB.

imshow(...,display_option) displays the image, where display_option specifies how imshow handles the sizing of the image. display_option is a string that can take one of a set of predefined values; option strings can be abbreviated.
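A minimal sketch of the imshow variants described above, assuming the grayscale image file 'pout.tif' (a low-contrast sample image supplied with the Image Processing Toolbox):

I = imread('pout.tif');
imshow(I)                      % display with the default number of gray levels
figure, imshow(I, [])          % scale display range to [min(I(:)) max(I(:))]
figure, imshow(I, [100 160])   % values <= 100 show as black, >= 160 as white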

A.6 Writing images


Images are written to disk using function imwrite, which has the following basic syntax:

imwrite(f,'filename'). With this syntax, the string contained in filename must include a recognized file format extension.

imwrite(X,map,filename,fmt) writes the indexed image in X and its associated


colormap map to filename in the format specified by fmt. If X is of class uint8 or
uint16, imwrite writes the actual values in the array to the file. If X is of class double,
the imwrite function offsets the values in the array before writing, using uint8(X-1).
The map parameter must be a valid MATLAB color map.

imwrite(...,filename) writes the image to filename, inferring the format to use from
the filename's extension. The extension must be one of the values for fmt, listed in
Supported Formats.

imwrite(...,Param1,Val1,Param2,Val2...) specifies parameters that control various


characteristics of the output file for HDF, JPEG, PBM, PGM, PNG, PPM, and TIFF
files. For example, if you are writing a JPEG file, you can specify the quality of the
output image.
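For example, a minimal sketch of imwrite, assuming f is an image array already in the workspace (such as one returned by imread in the previous section):

imwrite(f, 'result.tif')                   % format inferred from the extension
imwrite(f, 'result.jpg', 'quality', 25)    % JPEG file with reduced quality (0-100)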

A.7 Project Code


function varargout = histo(varargin)

% initialization code----------

gui_Singleton = 1;

gui_State = struct('gui_Name', mfilename, ...

'gui_Singleton', gui_Singleton, ...

'gui_OpeningFcn', @histo_OpeningFcn, ...

'gui_OutputFcn', @histo_OutputFcn, ...

'gui_LayoutFcn', [] , ...

'gui_Callback', []);

if nargin && ischar(varargin{1})

gui_State.gui_Callback = str2func(varargin{1});

end

if nargout

[varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});

else

gui_mainfcn(gui_State, varargin{:});

end

% Executes just before histo is made visible----------

function histo_OpeningFcn(hObject, eventdata, handles, varargin)

handles.output = hObject;

a=imread('min.bmp');

axes(handles.axes1);

imshow(a);

axes(handles.axes2);

imshow(a);

axes(handles.axes3);

imshow(a);

axes(handles.axes4);

imshow(a);

axes(handles.axes5);

imshow(a);

axes(handles.axes6);

imshow(a);

axes(handles.axes7);
imshow(a);

axes(handles.axes8);

imshow(a);

axes(handles.axes9);

imshow(a);

axes(handles.axes10);

imshow(a);

axes(handles.axes11);

imshow(a);

axes(handles.axes12);

imshow(a);

axes(handles.axes13);

imshow(a);

axes(handles.axes14);

imshow(a);

axes(handles.axes15);

imshow(a);

axes(handles.axes16);

imshow(a);

axes(handles.axes17);

imshow(a);

axes(handles.axes18);

imshow(a);

axes(handles.axes19);

imshow(a);

axes(handles.axes20);

imshow(a);
axes(handles.axes21);

imshow(a);

axes(handles.axes22);

imshow(a);

axes(handles.axes23);

imshow(a);

axes(handles.axes24);

imshow(a);

axes(handles.axes25);

imshow(a);

axes(handles.axes26);

imshow(a);

axes(handles.axes27);

imshow(a);

axes(handles.axes28);

imshow(a);

axes(handles.axes29);

imshow(a);

axes(handles.axes30);

imshow(a);

b=imread('max.bmp');

axes(handles.one);

imshow(b);

% Update handles structure-----------

guidata(hObject, handles);

% Outputs from this function are returned to the command line-------

function varargout = histo_OutputFcn(hObject, eventdata, handles)

% Get default command line output from handles structure-------

varargout{1} = handles.output;

% Executes on button press in load--------

function load_Callback(hObject, eventdata, handles)

data;

helpdlg('Database successfully loaded...');

% Executes on button press in Browse---------

function Browse_Callback(hObject, eventdata, handles)

[filename, pathname] = uigetfile('*.bmp', 'Pick an Image');

if isequal(filename,0) || isequal(pathname,0)

warndlg('Image is not selected');

else

a=imread(filename);

handles.queryx=a;

axes(handles.one);

imshow(a);

handles.filename=filename;

guidata(hObject, handles);

end

% --- Executes on button press in Search.

function Search_Callback(hObject, eventdata, handles)

filename=handles.filename;
[X1] = imread(filename);

HSVmap1 = rgb2hsv(X1);

% Open database txt file... for reading------

fid = fopen('database.txt');

resultValues = []; % Results matrix...

resultNames = {};

i = 1; % Indices...

j = 1;

while 1

imagename = fgetl(fid);

if ~ischar(imagename), break, end % Meaning: End of File...

[X] = imread(imagename);

HSVmap = rgb2hsv(X);

[D1,D2,D3] = quadratic1(X1, HSVmap1, X, HSVmap);

resultValues1(i) = D1;

resultValues2(i) = D2;

resultValues3(i) = D3;

resultNames(j) = {imagename};

i = i + 1;

j = j + 1;

end

fclose(fid);

% Sorting colour results--------


[sortedValues1, index1] = sort(resultValues1); % Sorted results... the vector index

[sortedValues2, index2] = sort(resultValues2);

[sortedValues3, index3] = sort(resultValues3); % is used to find the resulting files.

handles.sortedValues1 = sortedValues1;

handles.sortedValues2 = sortedValues2;

handles.sortedValues3 = sortedValues3;

handles.index1=index1;

handles.index2=index2;

handles.index3=index3;

%creating a file containing the top 20 matches for the query image in the red plane----

fid = fopen('colourResults_R_C.txt', 'w+');

for i = 1:20

tempstr = char(resultNames(index1(i)));

fprintf(fid, '%s\r', tempstr);

disp(resultNames(index1(i)));

disp(sortedValues1(i));

disp(' ');

end

fclose(fid);

%creating a file containing the top 20 matches for the query image in the green plane

fid = fopen('colourResults_G_C.txt', 'w+');

for i = 1:20

tempstr = char(resultNames(index2(i)));
fprintf(fid, '%s\r', tempstr);

disp(resultNames(index2(i)));

disp(sortedValues2(i));

disp(' ');

end

fclose(fid);

%creating a file containing the top 20 matches for the query image in the blue plane---

fid = fopen('colourResults_B_C.txt', 'w+');

for i = 1:20

tempstr = char(resultNames(index3(i)));

fprintf(fid, '%s\r', tempstr);

disp(resultNames(index3(i)));

disp(sortedValues3(i));

disp(' ');

end

fclose(fid);

disp('Colour part done...');

disp('Colour results saved...');

disp('');

%Displaying the top 10 matches of the query image in the red plane------

filename='colourResults_R_C.txt';

fid = fopen(filename);

indexr=[];

results1={};

i = 1;

while 1

imagename = fgetl(fid);

if ~ischar(imagename), break, end

[x, map] = imread(imagename);

if i==1;

axes(handles.axes1);

indexr=[indexr index1(i)];

imshow(x);

end

if i==2

axes(handles.axes2);

indexr=[indexr index1(i)];

imshow(x);

end

if i==3

axes(handles.axes3);

indexr=[indexr index1(i)];

imshow(x);

end

if i==4

axes(handles.axes4);

indexr=[indexr index1(i)];

imshow(x);

end

if i==5
axes(handles.axes5);

indexr=[indexr index1(i)];

imshow(x);

end

if i==6

axes(handles.axes6);

indexr=[indexr index1(i)];

imshow(x);

end

if i==7

axes(handles.axes7);

indexr=[indexr index1(i)];

imshow(x);

end

if i==8

axes(handles.axes8);

indexr=[indexr index1(i)];

imshow(x);

end

if i==9

axes(handles.axes9);

indexr=[indexr index1(i)];

imshow(x);

end

if i==10

axes(handles.axes10);

indexr=[indexr index1(i)];
imshow(x);

end

index1r=num2str(indexr);

results1(i)={index1r};

fad=fopen('indexValue1.txt','w+');

indexr1=results1(i);

indexr1=char(indexr1);

fprintf(fad, '%s\r', indexr1);

i = i + 1;

end

fclose(fid);

fclose(fad);

% Displaying the top 10 matches of the query image in green plane-------

filename='colourResults_G_C.txt';

fid = fopen(filename);

indexg=[];

i = 1;

while 1

imagename = fgetl(fid);

if ~ischar(imagename), break, end

[x, map] = imread(imagename);

if i==1;

axes(handles.axes11);

indexg=[indexg index2(i)];

imshow(x);

end
if i==2

axes(handles.axes12);

indexg=[indexg index2(i)];

imshow(x);

end

if i==3

axes(handles.axes13);

indexg=[indexg index2(i)];

imshow(x);

end

if i==4

axes(handles.axes14);

indexg=[indexg index2(i)];

imshow(x);

end

if i==5

axes(handles.axes15);

indexg=[indexg index2(i)];

imshow(x);

end

if i==6

axes(handles.axes16);

indexg=[indexg index2(i)];

imshow(x);

end

if i==7

axes(handles.axes17);
indexg=[indexg index2(i)];

imshow(x);

end

if i==8

axes(handles.axes18);

indexg=[indexg index2(i)];

imshow(x);

end

if i==9

axes(handles.axes19);

indexg=[indexg index2(i)];

imshow(x);

end

if i==10

axes(handles.axes20);

indexg=[indexg index2(i)];

imshow(x);

end

index1g=num2str(indexg);

results1(i)={index1g};

fad=fopen('indexValue2.txt','w+');

indexg1=results1(i);

indexg1=char(indexg1);

fprintf(fad, '%s\r', indexg1);

i = i + 1;

end

fclose(fid);
fclose(fad);

% Displaying the top 10 matches of the query image in blue plane-------

filename='colourResults_B_C.txt';

fid = fopen(filename);

indexb=[];

i = 1;

while 1

imagename = fgetl(fid);

if ~ischar(imagename), break, end % Meaning: End of File...

[x, map] = imread(imagename);

if i==1;

axes(handles.axes21);

indexb=[indexb index3(i)];

imshow(x);

end

if i==2

axes(handles.axes22);

indexb=[indexb index3(i)];

imshow(x);

end

if i==3

axes(handles.axes23);

indexb=[indexb index3(i)];

imshow(x);

end
if i==4

axes(handles.axes24);

indexb=[indexb index3(i)];

imshow(x);

end

if i==5

axes(handles.axes25);

indexb=[indexb index3(i)];

imshow(x);

end

if i==6

axes(handles.axes26);

indexb=[indexb index3(i)];

imshow(x);

end

if i==7

axes(handles.axes27);

indexb=[indexb index3(i)];

imshow(x);

end

if i==8

axes(handles.axes28);

indexb=[indexb index3(i)];

imshow(x);

end

if i==9

axes(handles.axes29);
indexb=[indexb index3(i)];

imshow(x);

end

if i==10

axes(handles.axes30);

indexb=[indexb index3(i)];

imshow(x);

end

index1b=num2str(indexb);

results1(i)={index1b};

fad=fopen('indexValue3.txt','w+');

indexb1=results1(i);

indexb1=char(indexb1);

fprintf(fad, '%s\r', indexb1);

i = i + 1;

end

fclose(fid);

fclose(fad);

% Update handles structure----

guidata(hObject, handles);

% Executes on button press in Class----

function Class_Callback(hObject, eventdata, handles)

sortedValues1=handles.sortedValues1;

sortedValues2=handles.sortedValues2;

sortedValues3=handles.sortedValues3;
index1=handles.index1;

index2=handles.index2;

index3=handles.index3;

r=1;k=1;l=1;m=1;n=1;

class1=0;class2=0;class3=0;class4=0;class5=0;

for i=1:10

if index1(i)<=19

class1=r;

r=r+1;

end

if (index1(i)>=20) & (index1(i)<=39)

class2=k;

k=k+1;

end

end

for i=1:10

if index2(i)<=19

class1=r;

r=r+1;

end

if (index2(i)>=20) & (index2(i)<=39)

class2=k;

k=k+1;

end

end

for i=1:10

if index3(i)<=19
class1=r;

r=r+1;

end

if (index3(i)>=20) & (index3(i)<=39)

class2=k;

k=k+1;

end

end

ma=[class1 class2 ];

[val ind]=max(ma);

switch ind

case 1

msgbox('The Query Image is Elephant Animal Domain');

case 2

msgbox('The Query Image is Flower Domain');

end

% Executes on button press in Clear----

function Clear_Callback(hObject, eventdata, handles)

a=imread('min.bmp');

axes(handles.axes11);

imshow(a);

axes(handles.axes12);

imshow(a);

axes(handles.axes13);

imshow(a);

axes(handles.axes14);
imshow(a);

axes(handles.axes15);

imshow(a);

axes(handles.axes16);

imshow(a);

axes(handles.axes17);

imshow(a);

axes(handles.axes18);

imshow(a);

axes(handles.axes19);

imshow(a);

axes(handles.axes20);

imshow(a);

axes(handles.axes21);

imshow(a);

axes(handles.axes22);

imshow(a);

axes(handles.axes23);

imshow(a);

axes(handles.axes24);

imshow(a);

axes(handles.axes25);

imshow(a);

axes(handles.axes26);

imshow(a);

axes(handles.axes27);

imshow(a);
axes(handles.axes28);

imshow(a);

axes(handles.axes29);

imshow(a);

axes(handles.axes30);

imshow(a);

axes(handles.axes1);

imshow(a);

axes(handles.axes2);

imshow(a);

axes(handles.axes3);

imshow(a);

axes(handles.axes4);

imshow(a);

axes(handles.axes5);

imshow(a);

axes(handles.axes6);

imshow(a);

axes(handles.axes7);

imshow(a);

axes(handles.axes8);

imshow(a);

axes(handles.axes9);

imshow(a);

axes(handles.axes10);

imshow(a);

b=imread('max.bmp');
axes(handles.one);

imshow(b);

% Executes on button press in Close----

function Close_Callback(hObject, eventdata, handles)

close histo;

% Executes on button press in Back----

function Back_Callback(hObject, eventdata, handles)

CBIRclassification;

% Executes on button press in exit----

function exit_Callback(hObject, eventdata, handles)

exit;

% Program for loading the database ----

function data

fid = fopen('database.txt', 'w+');

for i=1:39

a=num2str(i);

b='.jpg';

c1='.bmp';

filename=strcat(a,c1);

fprintf(fid,'%s\r',filename);

end

fclose(fid);

% Quadratic distance between two color images ----

% Executes on being called, with inputs:

% X1 - pixel data of 1st image

% X2 - pixel data of 2nd image

% map1 - HSV colour map of 1st image

% map2 - HSV colour map of 2nd image

function [value1,value2,value3] = quadratic1(X1, map1, X2, map2)

% Obtain the histograms of the two images...

% [count1, y1] = imhist(X1, map1);

% [count2, y2] = imhist(X2, map2);

[rHist1 gHist1 bHist1] = rgbhist(X1);

[rHist2 gHist2 bHist2] = rgbhist(X2);

% Obtain the difference between the pixel counts...

% q = count1 - count2;

% s = abs(q);

q1 = rHist1 - rHist2;

s1 = abs(q1);

q2 = gHist1 - gHist2;

s2 = abs(q2);

q3 = bHist1 - bHist2;

s3 = abs(q3);

% Obtain the similarity matrix...

A = similarityMatrix(map1, map2);
% Obtain the quadratic distance...

d1 = s1.'*A*s1;

d1 = d1^(1/2);

d1 = d1 / 1e8;

d2 = s2.'*A*s2;

d2 = d2^(1/2);

d2 = d2 / 1e8;

d3 = s3.'*A*s3;

d3 = d3^(1/2);

d3 = d3 / 1e8;

% Return the distance metric.

value1 = d1;

value2 = d2;

value3 = d3;

% To obtain the Similarity Matrix between two HSV color histograms. This is to be used in the Histogram Quadratic Distance equation.

% Executes on being called, with input matrices I and J.

function value = similarityMatrix(I, J)

% Obtain the Matrix elements... r - rows, c - columns. The general assumption is that these dimensions are the same for both matrices.

[r c p] = size(I);

A = [];

for i = 1:r

for j = 1:r

M1 = (I(i, 2) * sin(I(i, 1)) - J(j, 2) * sin(J(j, 1)))^2;

M2 = (I(i, 2) * cos(I(i, 1)) - J(j, 2) * cos(J(j, 1)))^2;

M3 = (I(i, 3) - J(j, 3))^2;

M0 = sqrt(M1 + M2 + M3);

A(i, j) = 1 - (M0/sqrt(5));

end

end

%Obtain Similarity Matrix...

value = A;

% CBIR Classification

function varargout = CBIRclassification(varargin)

% Begin initialization code ----

gui_Singleton = 1;

gui_State = struct('gui_Name', mfilename, ...

'gui_Singleton', gui_Singleton, ...

'gui_OpeningFcn', @CBIRclassification_OpeningFcn, ...

'gui_OutputFcn', @CBIRclassification_OutputFcn, ...

'gui_LayoutFcn', [] , ...

'gui_Callback', []);

if nargin && ischar(varargin{1})

gui_State.gui_Callback = str2func(varargin{1});

end

if nargout

[varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});

else

gui_mainfcn(gui_State, varargin{:});

end

% End initialization code -----

% Executes just before CBIRclassification is made visible---

function CBIRclassification_OpeningFcn(hObject, eventdata, handles, varargin)

% Choose default command line output for CBIRclassification---

handles.output = hObject;

% Update handles structure---

guidata(hObject, handles);

% Outputs from this function are returned to the command line---

function varargout = CBIRclassification_OutputFcn(hObject, eventdata, handles)

% Get default command line output from handles structure---

varargout{1} = handles.output;

% Executes on button press in histogram---

function Histogram_Callback(hObject, eventdata, handles)

histogram;

% --- Executes on button press in Exit.

function Exit_Callback(hObject, eventdata, handles)

exit;

APPENDIX B

INTRODUCTION TO MATLAB

B.1 What Is MATLAB?

MATLAB is a high-performance language for technical computing. It


integrates computation, visualization, and programming in an easy-to-use
environment where problems and solutions are expressed in familiar mathematical
notation. Typical uses include

Math and computation

Algorithm development

Data acquisition

Modeling, simulation, and prototyping

Data analysis, exploration, and visualization

Scientific and engineering graphics

Application development, including graphical user interface building

MATLAB is an interactive system whose basic data element is an array that


does not require dimensioning. This allows you to solve many technical computing
problems, especially those with matrix and vector formulations, in a fraction of the
time it would take to write a program in a scalar non-interactive language such as C or
Fortran. The name MATLAB stands for matrix laboratory. MATLAB was originally
written to provide easy access to matrix software developed by the LINPACK and
EISPACK projects. Today, MATLAB engines incorporate the LAPACK and BLAS
libraries, embedding the state of the art in software for matrix computation.

MATLAB has evolved over a period of years with input from many users. In
university environments, it is the standard instructional tool for introductory and
advanced courses in mathematics, engineering, and science. In industry, MATLAB is
the tool of choice for high-productivity research, development, and analysis.

MATLAB features a family of add-on application-specific solutions called


toolboxes. Very important to most users of MATLAB, toolboxes allow you to learn
and apply specialized technology. Toolboxes are comprehensive collections of
MATLAB functions (M-files) that extend the MATLAB environment to solve
particular classes of problems. Areas in which toolboxes are available include signal
processing, control systems, neural networks, fuzzy logic, wavelets, simulation, and
many others.

B.2 The MATLAB System

The MATLAB system consists of five main parts:

B.2.1 Development Environment

This is the set of tools and facilities that help you use MATLAB functions and
files. Many of these tools are graphical user interfaces. It includes the MATLAB
desktop and Command Window, a command history, an editor and debugger, and
browsers for viewing help, the workspace, files, and the search path.

B.2.2 The MATLAB Mathematical Function Library

This is a vast collection of computational algorithms ranging from


elementary functions, like sum, sine, cosine, and complex arithmetic, to more
sophisticated functions like matrix inverse, matrix eigenvalues, Bessel functions, and
fast Fourier transforms.
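For example, a minimal sketch contrasting a few elementary and more sophisticated library functions (the values are purely illustrative):

x = 0:0.1:1;
s = sum(sin(x));    % elementary functions: sum, sine
A = magic(3);
Ai = inv(A);        % matrix inverse
ev = eig(A);        % eigenvalues
F  = fft(x);        % fast Fourier transform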

B.2.3 The MATLAB Language

This is a high-level matrix/array language with control flow statements,


functions, data structures, input/output, and object-oriented programming features. It
allows both "programming in the small" to rapidly create quick and dirty throw-away
programs, and "programming in the large" to create large and complex application
programs.

B.2.4 Graphics

MATLAB has extensive facilities for displaying vectors and matrices as


graphs, as well as annotating and printing these graphs. It includes high-level
functions for two-dimensional and three-dimensional data visualization, image
processing, animation, and presentation graphics. It also includes low-level functions
that allow you to fully customize the appearance of graphics as well as to build
complete graphical user interfaces on your MATLAB applications.

B.2.5 The MATLAB Application Program Interface (API)

This is a library that allows you to write C and FORTRAN programs that
interact with MATLAB. It includes facilities for calling routines from MATLAB
(dynamic linking), calling MATLAB as a computational engine, and for reading and
writing MAT-files.

B.3 SOME BASIC COMMANDS

pwd displays the current working directory

demo access product demos via help browser

who lists all of the variables in your Matlab workspace

whos lists the variables and describes their matrix size

clear erases variables and functions from memory

clear x erases the matrix ‘x’ from your workspace

close by itself, closes the current figure window

figure creates an empty figure window

hold on holds the current plot and all axis properties so that subsequent

graphing commands add to the existing graph

hold off sets the next plot property of the current axes to “replace”

find find indices of nonzero elements. E.g.: d=find(x>100) returns the

indices of the elements of vector x that are greater than 100

break terminate execution of m-file or WHILE or FOR loop

for repeat statements a specific number of times

diff difference and approximate derivative

NaN the arithmetic representation for Not-a-Number; a NaN is

obtained as a result of mathematically undefined operations

like 0.0/0.0

INF the arithmetic representation for positive infinity; an infinity is

produced by operations like dividing by zero, e.g. 1.0/0.0,

or from overflow, e.g. exp(1000).

save saves all the matrices defined in the current session into the file,

matlab.mat, located in the current working directory

load loads contents of matlab.mat into current workspace.

save filename x y z saves the matrices x, y and z into the file titled filename.mat

save filename x y z -ascii saves the matrices x, y and z into the file titled filename.dat

load filename loads the contents of filename into the current workspace, the

file can be a binary(.mat) file

load filename.dat loads the contents of filename.dat into the variable filename

xlabel(‘’) allows you to label x-axis

ylabel(‘’) allows you to label y-axis

title(‘’) allows you to give title for plot

subplot() allows you to create multiple plots in the same window.

B.4 SOME BASIC PLOT COMMANDS

plot(x,y) creates a Cartesian plot of the vectors x &y

plot(y) creates a plot of y vs the numerical values of the elements in

the y vector

semilogx(x,y) plots log(x) vs y

semilogy(x,y) plots x vs log(y)

loglog(x,y) plots log(x) vs log(y)

polar(theta,r) creates a polar plot of the vectors r & theta where theta is in

radians

bar(x) creates a bar graph of the vector x (note also the command

stairs(x))

bar(x,y) creates a bar-graph of the elements of the vector y, locating the


bars according to the vector elements of ‘x’

Plot description:

grid creates a grid on the graphics plot

title(‘text’) places a title at top of graphics plot

xlabel(‘text’) writes ‘text’ beneath the x-axis of a plot

ylabel(‘text’) writes ‘text’ beside the y-axis of a plot

text(x,y,’text’) writes ‘text’ at the location (x,y)

text(x,y, ‘text’, ‘sc’) writes ‘text’ at point x,y assuming lower left corner is (0,0)

axis([xmin xmax ymin ymax]) sets scaling for the x and y axes
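For example, a minimal sketch that combines several of the commands listed above into one plot:

x = 0:0.1:2*pi;
y = sin(x);
plot(x,y)
grid
title('Sine wave')
xlabel('x (radians)')
ylabel('sin(x)')
axis([0 2*pi -1.2 1.2])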

B.5 MATLAB WORKING ENVIRONMENT

B.5.1 MATLAB DESKTOP

The MATLAB desktop is the main MATLAB application window. The desktop contains five sub-windows: the Command Window, the Workspace Browser, the Current Directory window, the Command History window, and one or more Figure windows, which are shown only when the user displays a graphic.

The command window is where the user types MATLAB commands and
expressions at the prompt (>>) and where the output of those commands is displayed.
MATLAB defines the workspace as a set of variables that the user creates in a work
session. The workspace browser shows these variables and some information about
them. Double clicking on a variable in the workspace browser launches the array
editor, which can be used to obtain information and edit certain properties of the
variable.

The MATLAB workspace consists of the set of variables (named arrays) built
up during a MATLAB session and stored in memory. You add variables to the
workspace by using functions, running M-files, and loading saved workspaces. For
example, if you type

t = 0:pi/4:2*pi;

y = sin(t);

The workspace includes two variables, y and t, each having nine values. You can
perform workspace operations and related features using the Workspace browser.
Equivalent functions are available and are documented with each feature of the
Workspace browser.

To open the Workspace browser, select ‘Workspace’ from the Desktop menu in the
MATLAB desktop, or type ‘workspace’ at the Command Window prompt. The
Workspace browser shows the name of each variable, its value, its array size, its size
in bytes, and the class. The icon for each variable denotes its class.

The current directory tab above the workspace tab shows the contents of the
current directory, whose path is shown in the current directory window. For example,
in the windows operating system the path might be as follows: C:\MATLAB\work,
indicating that the directory “work” is a subdirectory of the main directory
“MATLAB” which is installed in drive C. Clicking on the arrow in the current

directory window shows a list of recently used paths. Clicking on the button to the
right of the window allows the user to change the current directory.

MATLAB uses a search path to find M-files and other MATLAB related files,
which are organized in directories on your file system. Any file you want to run in
MATLAB must reside in the current directory or in a directory that is on the search
path. When you create M-files and related files for MATLAB, add the directories in
which they are located to the MATLAB search path. By default, the files supplied
with MATLAB and other Math Works products are included in the search path. To
see which directories are on the search path or to change the search path, select File ->
Set Path and use the resulting Set Path dialog box. Alternatively, you can use the
‘path’ function to view the search path, ‘addpath’ to add directories to the path, and ‘rmpath’ to remove directories from the path.
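For example, a minimal sketch of inspecting and changing the search path from the command line (C:\MATLAB\myproject is a hypothetical directory):

path                               % display the current search path
addpath('C:\MATLAB\myproject')     % add a directory to the search path
rmpath('C:\MATLAB\myproject')      % remove it from the search path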

The Command History window displays a log of the statements most recently
run in the Command Window. To show or hide the Command History window, use
the Desktop menu. Alternatively, use commandhistory to open the MATLAB
Command History when it is closed, or to select it when it is open. Previously entered
MATLAB commands can be selected and re-executed from the command history
window by right clicking on a command or sequence of commands. This action launches a menu from which to select various options in addition to executing the commands. This is a useful feature when experimenting with various commands in a work session.

B.5.2 Using the MATLAB editor to create M-files

The MATLAB editor is both a text editor specialized for creating M-files and
a graphical MATLAB debugger. The editor can appear in a window by itself, or it can
be a sub window in the desktop. M-files are denoted by the extension of .m, as in
pixelup.m. The MATLAB editor window has numerous pull-down menus for tasks
such as saving, viewing, and debugging files. Because it performs some simple
checks and also uses color to differentiate between various elements of code, this text
editor is recommended as the tool of choice for writing and editing M-functions. To open the editor, type edit at the prompt. Typing edit filename at the prompt opens the M-file filename.m in an editor window, ready for editing. As noted earlier, the file must be in the current directory,
or in a directory in the search path.

B.5.3 Getting Help

The principal way to get help online is to use the MATLAB help browser, opened as a separate window either by clicking on the question mark symbol (?) on the desktop toolbar, or by typing helpbrowser at the prompt in the command window. The help browser is a web browser integrated into the MATLAB desktop that displays hypertext markup language (HTML) documents. The help browser consists of two
panes, the help navigator pane used to find information, and the display pane used to
view the information. Self-explanatory tabs other than navigator pane are used to
perform a search.

B.6 GUI (Graphical user interface)

A graphical user interface (GUI) is a user interface built with graphical


objects, such as buttons, text fields, sliders, and menus. In general, these objects
already have meanings to most computer users. For example, when a slider is moved,
a value changes; when an OK button is pressed, settings are applied and the dialog
box is dismissed. Of course, to leverage this built-in familiarity, users must be
consistent in how they use the various GUI-building components.

Applications that provide GUIs are generally easier to learn and use since the person
using the application does not need to know what commands are available or how
they work. The action that results from a particular user action can be made clear by
the design of the interface.

The sections that follow describe how to create GUIs with MATLAB. This
includes laying out the components, programming them to do specific things in
response to user actions, and saving and launching the GUI; in other words, the
mechanics of creating GUIs.

B.6.1 Creating GUIs with GUIDE

MATLAB implements GUIs as figure windows containing various styles of
uicontrol objects. Each object must be programmed to perform the intended action
when activated by the user of the GUI. In addition, GUI must be saved and launched.
All of these tasks are simplified by GUIDE, MATLAB's graphical user interface
development environment.

B.6.2 GUI Development Environment

The process of implementing a GUI involves two basic tasks:

• Laying out the GUI components


• Programming the GUI components
GUIDE primarily is a set of layout tools. However, GUIDE also generates an
M-file that contains code to handle the initialization and launching of the GUI. This
M-file provides a framework for the implementation of the callbacks - the functions
that execute when users activate components in the GUI.

B.6.3 The Implementation of a GUI

While it is possible to write an M-file that contains all the commands to lay
out a GUI, it is easier to use GUIDE to lay out the components interactively and to
generate two files that save and launch the GUI:

A FIG-file - contains a complete description of the GUI figure and all of its children
(uicontrols and axes), as well as the values of all object properties.

An M-file - contains the functions that launch and control the GUI and the callbacks,
which are defined as sub functions. This M-file is referred to as the application M-file
in this documentation.

The application M-file does not contain the code that lays out the uicontrols;
this information is saved in the FIG-file.

The following diagram illustrates the parts of a GUI implementation.

Figure B.1: GUI figure

B.6.4 Features of the GUIDE-Generated Application M-File

GUIDE simplifies the creation of GUI applications by automatically


generating an M-file framework directly from our layout. We can then use this
framework to code the application M-file. This approach provides a number of
advantages.

1. The M-file contains code to implement a number of useful features. The M-


file adopts an effective approach to managing object handles and executing
callback routines.

2. The M-file provides a way to manage global data.

3. The automatically inserted sub function prototypes for callbacks ensure


compatibility with future releases.

We can elect to have GUIDE generate only the FIG-file and write the application M-file ourselves. There are no uicontrol creation commands in the application M-file; the layout information is contained in the FIG-file generated by the Layout Editor.

B.6.5 Beginning the Implementation Process

To begin implementing GUI, proceed to the following sections:


Getting Started with GUIDE - the basics of using GUIDE

Selecting GUIDE Application Options - set both FIG-file and M-file options.

Using the Layout Editor - begin laying out the GUI.

Understanding the Application M-File - discussion of programming techniques


used in the application M-file.

Application Examples - a collection of examples that illustrate techniques which are


useful for implementing GUIs.

B.6.6 Command-Line Accessibility

When MATLAB creates a graph, the figure and axes are included in the list of
children of their respective parents and their handles are available through commands
such as findobj, set, and get. If another plotting command is issued, the output is
directed to the current figure and axes.

GUIs are also created in figure windows. Generally, GUI figures are not
desired to be available as targets for graphics output, since issuing a plotting
command could direct the output to the GUI figure, resulting in the graph appearing
in the middle of the GUI.

In contrast, if a GUI is created that contains axes and commands entered in


the command window are to be displayed in this axes, command-line access should be
enabled.

B.6.7 User Interface Controls

The Layout Editor component palette contains the user interface controls that
can be used in your GUI. These components are MATLAB uicontrol objects and are
programmable via their Callback properties. They are:

• Push Buttons
• Sliders
• Toggle Buttons
• Frames
• Radio Buttons
• Listboxes
• Checkboxes
• Popup Menus
• Edit Text
• Axes
• Static Text
• Figures
B.6.7 (a) Push Buttons

Push buttons generate an action when pressed (e.g., an OK button may close a
dialog box and apply settings). When a push button is clicked down, it appears
depressed; when the mouse is released, the button's appearance returns to its non-
depressed state; and its callback executes on the button up event.

Properties to Set

String - set this property to the character string that is to be displayed on the push
button.

Tag - GUIDE uses the Tag property to name the callback sub function in the
application M-file. Set Tag to a descriptive name (e.g., close_button) before activating
the GUI.

Programming the Callback

When the user clicks on the push button, its callback executes. Push buttons
do not return a value or maintain a state.
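For example, a minimal sketch of a push button callback in the GUIDE style used elsewhere in this appendix (close_button is a hypothetical Tag, and handles.figure1 assumes the GUI figure has Tag figure1):

function varargout = close_button_Callback(h, eventdata, handles, varargin)
% carry out the button's action, then close the GUI figure
delete(handles.figure1);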

B.6.7 (b) Toggle Buttons

Toggle buttons generate an action and indicate a binary state (e.g., on or off).
When a toggle button is clicked, it appears depressed and remains depressed when you release the mouse button, at which point the callback executes. A subsequent mouse
click returns the toggle button to the non-depressed state and again executes its
callback.

Programming the Callback

The callback routine needs to query the toggle button to determine what state
it is in. MATLAB sets the Value property equal to the Max property when the toggle
button is depressed (Max is 1 by default) and equal to the Min property when the
toggle button is not depressed (Min is 0 by default).

From the GUIDE Application M-File

The following code illustrates how to program the callback in the GUIDE
application M-file.

function varargout = togglebutton1_Callback(h,eventdata,handles,varargin)

button_state = get(h,'Value');

if button_state == get(h,'Max')

% toggle button is pressed

elseif button_state == get(h,'Min')

% toggle button is not pressed

end

Adding an Image to a Push Button or Toggle Button

Assign the CData property an m-by-n-by-3 array of RGB values that define a
true color image. For example, the array a defines a 16-by-128 true color image using
random values between 0 and 1 (generated by rand).

a(:,:,1) = rand(16,128);

a(:,:,2) = rand(16,128);

a(:,:,3) = rand(16,128);

set(h,'CData',a)

B.6.7 (c) Radio Buttons

Radio buttons are similar to checkboxes, but are intended to be mutually
exclusive within a group of related radio buttons (i.e., only one button is in a selected
state at any given time). To activate a radio button, click the mouse button on the
object. The display indicates the state of the button.

Implementing Mutually Exclusive Behavior

Radio buttons have two states - selected and not selected. The state of a radio
button can be queried and set through its Value property:

Value = Max, button is selected.

Value = Min, button is not selected.

To make radio buttons mutually exclusive within a group, the callback for
each radio button must set the Value property to 0 on all other radio buttons in the
group. MATLAB sets the Value property to 1 on the radio button clicked by the user.

The following sub function, when added to the application M-file, can be
called by each radio button callback. The argument is an array containing the handles
of all other radio buttons in the group that must be deselected.

function mutual_exclude(off)

set(off,'Value',0)

Obtaining the Radio Button Handles

The handles of the radio buttons are available from the handles structure,
which contains the handles of all components in the GUI. This structure is an input
argument to all radio button callbacks.

The following code shows the call to mutual_exclude being made from the
first radio button's callback in a group of four radio buttons.

function varargout = radiobutton1_Callback(h,eventdata,handles,varargin)

off = [handles.radiobutton2,handles.radiobutton3,handles.radiobutton4];

mutual_exclude(off)

% Continue with callback

After setting the radio buttons to the appropriate state, the callback can
continue with its implementation-specific tasks.

B.6.7 (d) Checkboxes

Check boxes generate an action when clicked and indicate their state as
checked or not checked. Check boxes are useful when providing the user with a
number of independent choices that set a mode (e.g., display a toolbar or generate
callback function prototypes).

The Value property indicates the state of the check box by taking on the value
of the Max or Min property (1 and 0 respectively by default):

Value = Max, box is checked.

Value = Min, box is not checked.

You can determine the current state of a check box from within its callback by
querying the state of its Value property, as illustrated in the following example:

function checkbox1_Callback(h,eventdata,handles,varargin)

if (get(h,'Value') == get(h,'Max'))

% then checkbox is checked-take appropriate action

else

% checkbox is not checked-take appropriate action

end

B.6.7 (e) Edit Text

Edit text controls are fields that enable users to enter or modify text strings.
Use edit text when the text is desired as input. The String property contains the text
entered by the user.

To obtain the string typed by the user, get the String property in the callback.

function edittext1_Callback(h,eventdata, handles,varargin)

user_string = get(h,'string');

% proceed with callback...

Obtaining Numeric Data from an Edit Text Component

MATLAB returns the value of the edit text String property as a character
string. If users want to enter numeric values, they must convert the characters to
numbers. This can be done using the str2double command, which converts strings to
doubles. If the user enters non-numeric characters, str2double returns NaN.

The following code can be used in the edit text callback. It gets the value of
the String property and converts it to a double. It then checks if the converted value is
NaN, indicating the user entered a non-numeric character (isnan) and displays an error
dialog (errordlg).

function edittext1_Callback(h,eventdata,handles,varargin)

user_entry = str2double(get(h,'string'));

if isnan(user_entry)

errordlg('You must enter a numeric value','Bad Input','modal')

end

% proceed with callback...

Triggering Callback Execution

On UNIX systems, clicking on the menubar of the figure window causes the
edit text callback to execute. However, on Microsoft Windows systems, if an editable
text box has focus, clicking on the menubar does not cause the editable text callback

routine to execute. This behavior is consistent with the respective platform
conventions. Clicking on other components in the GUI executes the callback.

B.6.7 (f) Static Text

Static text controls display lines of text. Static text is typically used to label
other controls, provide directions to the user, or indicate values associated with a
slider. Users cannot change static text interactively and there is no way to invoke the
callback routine associated with it.

B.6.7 (g) Frames

Frames are boxes that enclose regions of a figure window. Frames can make a
user interface easier to understand by visually grouping related controls. Frames have
no callback routines associated with them and only uicontrols can appear within
frames.

Placing Components on Top of Frames

Frames are opaque. If a frame is added after adding components that need to be positioned within the frame, those components must be brought forward.
The Bring to Front and Send to Back operations in the Layout menu can be used for
this purpose.

B.6.7 (h) List Boxes

List boxes display a list of items and enable users to select one or more items.
The String property contains the list of strings displayed in the list box. The first item
in the list has an index of 1.

The Value property contains the index into the list of strings that correspond to
the selected item. If the user selects multiple items, then Value is a vector of indices.

By default, the first item in the list is highlighted when the list box is first
displayed. If no item is to be highlighted, then Value property must be set to empty,
[].

The ListboxTop property defines which string in the list displays as the
topmost item when the list box is not large enough to display all list entries.
ListboxTop is an index into the array of strings defined by the String property and
must have a value between 1 and the number of strings. Non-integer values are fixed
to the next lowest integer.

Single or Multiple Selection

The values of the Min and Max properties determine whether users can make
single or multiple selections:

If Max - Min > 1, then list boxes allow multiple item selection.

If Max - Min <= 1, then list boxes do not allow multiple item selection.

Selection Type

Listboxes differentiate between single and double clicks on an item and set the
figure SelectionType property to normal or open accordingly.

Triggering Callback Execution

MATLAB evaluates the list box's callback after the mouse button is released
or a keypress event (including arrow keys) that changes the Value property (i.e., any
time the user clicks on an item, but not when clicking on the list box scrollbar). This
means the callback is executed after the first click of a double-click on a single item
or when the user is making multiple selections.

In these situations, it is necessary to add another component, such as a Done


button (push button) and program its callback routine to query the list box Value
property (and possibly the figure SelectionType property) instead of creating a
callback for the list box.

B.6.7 (i) Popup Menus

Popup menus open to display a list of choices when users press the arrow. The
String property contains the list of string displayed in the popup menu. The Value
property contains the index into the list of strings that correspond to the selected item.

When not open, a popup menu displays the current choice, which is
determined by the index contained in the Value property. The first item in the list has
an index of 1.

Popup menus are useful for providing users with a number of mutually exclusive choices when you do not want to take up the amount of space that a series of radio buttons requires.

Programming the Popup Menu

The popup menu callback can be programmed to work by checking only the
index of the item selected (contained in the Value property) or by obtaining the actual
string contained in the selected item.

This callback checks the index of the selected item and uses a switch
statement to take action based on the value. If the contents of the popup menu are
fixed, then this approach can be used.

function varargout = popupmenu1_Callback(h,eventdata,handles,varargin)

val = get(h,'Value');

switch val

case 1

% The user selected the first item

case 2

% The user selected the second item

% etc.

This callback obtains the actual string selected in the popup menu. It uses the
value to index into the list of strings. This approach may be useful if the program
dynamically loads the contents of the popup menu based on user action and we have
to obtain the selected string. It is necessary to convert the value returned by the String
property from a cell array to a string.

function varargout = popupmenu1_Callback(h,eventdata,handles,varargin)

val = get(h,'Value');

string_list = get(h,'String');

selected_string = string_list{val}; % convert from cell array to string

% etc.

B.6.8 Enabling or Disabling Controls

Whether a control responds to mouse button clicks can be controlled by setting the Enable property. Controls have three states:

on - The control is operational

off - The control is disabled and its label (set by the string property) is grayed out.

inactive - The control is disabled, but its label is not grayed out.

When a control is disabled, clicking on it with the left mouse button does not
execute its callback routine. However, the left-click causes two other callback
routines to execute:

First the figure WindowButtonDownFcn callback executes. Then the


control's ButtonDownFcn callback executes.

A right mouse button click on a disabled control posts a context menu, if one
is defined for that control.
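For example, a minimal sketch of switching between these states from a callback (handles.pushbutton1 is a hypothetical handle taken from the handles structure):

set(handles.pushbutton1, 'Enable', 'off')       % disabled, label grayed out
set(handles.pushbutton1, 'Enable', 'inactive')  % disabled, label not grayed out
set(handles.pushbutton1, 'Enable', 'on')        % operational again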

B.6.9 Axes

Axes enables GUI to display graphics (e.g., graphs and images). Like all
graphics objects, axes have properties that can be set to control many aspects of its
behavior and appearance.

B.6.9 (a) Axes Callbacks

Axes are not uicontrol objects, but can be programmed to execute a callback
when users click a mouse button in the axes. The axes ButtonDownFcn property can
be used to define the callback.
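For example, a minimal sketch of assigning an axes callback (my_axes_click is a hypothetical function defined in the application M-file):

set(handles.axes1, 'ButtonDownFcn', @my_axes_click)

function my_axes_click(src, eventdata)
% executes when the user clicks a mouse button inside the axes
disp('Axes clicked');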

B.6.9 (b) Plotting to Axes in GUIs

GUIs that contain axes should ensure the Command-line accessibility option
in the Application Options dialog is set to Callback (the default). This enables to issue
plotting commands from callbacks without explicitly specifying the target axes.

B.6.9(c) GUIs with Multiple Axes

If a GUI has multiple axes, the target axes must be explicitly specified while
issuing the plotting commands. This can be done using the axes command and the
handles structure. For example,

axes(handles.axes1)

makes the axes whose Tag property is axes1 the current axes, and therefore the target
for plotting commands.

B.6.10 Figure

Figures are the windows that contain the GUI that is designed with the Layout
Editor.
