
Introduction – Hardware for image processing (basics) – Eye: the human vision sensor

Mars Orbiter image of the Marineris canyon on Mars. This image is obtained by combining several images from a so-called stereo camera.

Course information

Lecturer: Dr Igor Đurović
Number of lectures: 3+1
Type of examination: depending on the number of students
Literature: book
Other resources: www.obradaslike.cg.ac.yu, PPT presentations, examples, test images, English textbooks available, etc.

Covered topics

- Introduction and image acquisition
- Human vision, color models and image formation
- Color and point transforms – histogram
- Geometrical transforms
- Interpolation
- Image in spectral domains
- Filtering
- Basics of image reconstruction
- Edge detection
- Basics of image recognition

Other topics are out of the scope of this basic digital image processing course (some belong to the Multimedia systems course): compression, digital image protection, motion picture processing, stereo images, superresolution, computer graphics, etc.

History
Photography appeared in the XIX century. Ideas for developing fax machines and using telegraphic lines for image transmission were born during World War I. The idea of TV development appeared around 1930. The key event in digital image processing was the development of the first electronic computing machines: these machines enabled simple and fast image processing. The second important event was astronomical exploration and the race to space; JPL from California had the first task related to digital image processing within the NASA (USA) space program.

Image acquisition
A source of radiation illuminates a surface (along the normal line in the sketch), and sensors receive the signals reflected from the surface. Usually we will assume that the source of radiation is within the visible-light frequency band, but it could be in other electromagnetic bands (X-rays, gamma rays, radio waves) or be a non-electromagnetic source (ultrasound, vibrations, etc.).

Image acquisition
The acquisition chain is: optical system → sensor → digitizer. The optical system is the non-electronic part that consists of lenses and other similar elements. The image f(x,y) reflected from the surface is the input to the optical system, which can be modeled as a 2D linear space-invariant system with an impulse response h(x,y) that can be approximately known in advance. The sensor transforms the optical signal into an electrical one, and the digitizer transforms the continuous-time electrical signal into digital form.

Image acquisition
f(x,y) can be considered as the "power" of the light (or of some other signal that is the subject of visualization), and h(x,y) is the impulse response of the optical system. The output of the optical system is again an optical signal. For linear space-invariant systems the output is given by the 2D convolution (very similar to the 1D convolution for linear time-invariant systems):

b(x,y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(ξ,η) h(x−ξ, y−η) dξ dη

Since b(x,y) and f(x,y) represent optical power, they are non-negative.
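As a rough illustration (not from the lecture), the discrete counterpart of this 2D convolution can be sketched in Python/NumPy; the tiny "image", the averaging kernel and all values below are invented for the example:

```python
import numpy as np

def conv2d_full(f, h):
    """Discrete 2D convolution b(x,y) = sum_{k,l} f(k,l)*h(x-k, y-l) ('full' output)."""
    M, N = f.shape
    P, Q = h.shape
    b = np.zeros((M + P - 1, N + Q - 1))
    for k in range(M):
        for l in range(N):
            # each input sample spreads a scaled copy of the impulse response
            b[k:k + P, l:l + Q] += f[k, l] * h
    return b

f = np.array([[1.0, 2.0], [3.0, 4.0]])   # tiny "image"
h = np.ones((2, 2)) / 4.0                # averaging kernel as a stand-in for optical blur
print(conv2d_full(f, h))
```

The same result could be obtained with a library routine; the loop form is shown only to mirror the integral above.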

Image acquisition
The optical system can deform the image. Companies producing such systems can estimate the distortion caused by h(x,y) and develop techniques to reduce its effects. The second important element in the optical system is the sensor, which transforms the optical signal into its electrical equivalent. Some details related to sensors are given in our textbook, but development in this area is so fast that some of them may already be outdated; for this reason we will not study sensors within this course.

Digitizer
The analog electrical equivalent i(x,y) is transformed to the digital version within two procedures: sampling and quantization. A pixel (picture element) is an elementary point of the image; the luminance value of an image pixel such as i(1,1) is determined from the image content at a point or over an area. The fact that the eye perceives numerous close dots as a continuous image is exploited here.

Digitizer
The sampling phase produces samples i(n1,n2) (n1 varies for fixed n2, and vice versa). It is followed by digitalization: quantization to the closest multiple of the quantization step Δ. The number of quantization levels is commonly 2^k, where k is an integer, so each level can easily be represented by a binary number.
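For illustration only (the array, the range [0, 1] and the function name are my own assumptions), uniform quantization of samples to 2^k levels can be sketched as:

```python
import numpy as np

def quantize(i, k, i_max=1.0):
    """Round samples i(n1,n2) in [0, i_max] to the nearest of 2**k levels."""
    levels = 2 ** k
    delta = i_max / (levels - 1)             # quantization step
    codes = np.round(i / delta).astype(int)  # integer codes, storable in k bits
    return codes, codes * delta              # codes and reconstructed values

i = np.array([[0.0, 0.2], [0.7, 1.0]])
codes, iq = quantize(i, k=2)                 # 4 levels: 0, 1/3, 2/3, 1
print(codes)
```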

Most important systems
Photography: discovered in the XIX century; based on a chemical process, so it does not produce an electrical equivalent. It still has the best resolution among all sensors. It is little by little being removed from the market, but technological advances still exist.
Fax: developed in some variant at the beginning of the XX century; based on digitalization of the black points on the paper and coded transmission; standardized. Quality improvement in the last 5 years is larger than in the previous 50 years.

Image digitizers
Parts: sampling apparatus; scanning mechanical parts; source of "light"; quantizer (commonly an AD converter); memory medium.
Scanning technologies: SCAN-IN – the source of light moves across the image part by part; SCAN-OUT – the entire image is illuminated while sampling is performed part by part.
Important characteristics: pixel size; image size; what is transformed to the image (for visible objects it is their transmittance); linearity.
Check the textbook and the Internet for more details on digitalization technology.

Most important systems
TV: discovered around 1930. A camera (previously a vidicon tube camera) in the TV studio records images (frames); the images are transmitted through a communication channel and presented on the TV screen. In our country TV has 25 images per second. The eye can see only about 24 images per second, so the TV image appears continuous. The rate of 25 is appropriate for Europe, since it is one half of the power-network frequency (50 Hz); if 24 images per second were presented, non-synchronization between the eye and the TV would make the image appear to shake. Standards for analog TV: PAL (used in our country and numerous other countries; 25 images/s), SECAM (used in France, francophone countries, Iran, etc.), NTSC (US standard with 30 frames per second; the US power network is 60 Hz).

TV
Digital TV (DVB) is currently in rapid progress; the digital TV standard is called HDTV. High-compression, lower-quality formats are being developed for video streaming over the Internet. The development of new generations of digital video discs also requires new high-quality video and image formats. Since the development is rapid, other standards may emerge.

Image in non-visible spectrum
Details related to image acquisition in the non-visible spectrum are given in the textbook. The most important types of non-visible radiation used in digital imaging are: gamma rays (medicine, astronomy); X-rays (medicine, industrial inspection, astronomy); the ultraviolet spectrum (lithography, electronic industry, microscopy, lasers, biological imaging, astronomy); the infrared spectrum (the same application areas as visible light, plus food industry, meteorology, satellite observations, industrial inspection, etc.); radio waves (medicine, astronomy). It is also possible to visualize signals that are not produced by electromagnetic radiation, such as vibrations in seismology, ultrasound, etc.

Computers in image processing
The revolution in the development and application of digital image processing algorithms began with the development of fast and reliable image processing machines. PCs used for general purposes commonly have a special graphics card whose duty is to process graphical data and send it to the monitor. This card can contain a processor used for data processing as well as memory; the main purpose of the processor is usually data compression, while the memory commonly has a special construction that allows processing of huge amounts of data. The development of new devices for image processing is very rapid, and data given in a textbook just a couple of years old can be regarded as merely a historical overview.

. video data requires large amount of memory for example each pixel of monitor requires 3 bytes of memory multiplied with number of pixels and number of frames per second that is larger than in the case of TV.Computers in image processing The main problem in realization of these cards is access to other computer resources (data bus and communication with computer processor). In addition. Also. there could be additional cards for video data processing. There are several methods for solving this problem. there are rather expensive special purpose machines – graphical stations.

Printers
Printers are developed for putting images on paper. There are numerous types of printers; here we give some data about the most important ones:
- Matrix printers (an inked ribbon is pressed onto the paper with needles called pins).
- Line printers (a similar technology to matrix printers, but with the ability to print an entire line at once).
- Laser and LED printers (an electrophotography-based technology described in 1938 but first applied in 1980 by Canon). This technology, as one of the most important, is described in detail on the next couple of slides.
- Inkjet and bubble-jet printers (one or more boxes with ink are positioned above the paper, and the appropriate amount of ink is deposited on the paper by a piezoelectric device in inkjet printers and by a heater in bubble-jet printers).

Laser and LED printers
1. Laser printers have a cylinder made of special glass carrying a uniform electric charge.
2. The cylinder is illuminated selectively, according to the page that should be printed.
3. The illuminated parts of the cylinder remain without charge.
4. Toner is then put on the cylinder.
5. Toner is kept only on the places with charge.
6. At the end, the image is formed on the paper, creating a permanent copy (it remains on the paper due to the electrostatic effect).

Laser and LED printers
(Figure: laser beam source, lens system, photoconductive material.) For illuminating the photoconductive material, LED printers use LED diodes instead of a laser; the LED technology has the ability to print an entire line at once.

Printing
Laser printers print material in a single color; due to the imperfection of the human visual system, we perceive different amounts of the same color as different shades. There are numerous alternative printing schemes and apparatus. The two most important characteristics of printers are dpi = dots per inch (which determines the printer resolution, i.e., how fine the printed page is; 600 dpi is today assumed to be a reasonable minimum) and ppm = pages per minute.

Displays
Displays are devices for displaying images. Under displays we include: computer monitors, projectors, medical devices for displaying images, etc. We will not consider here so-called permanent displays, which produce permanent images (for example Xerox machines).

Monitors
Three basic technologies:
- CRT – cathode ray tube, a technology based on the physical properties of illuminated phosphor;
- LCD – liquid crystal displays;
- PDP – plasma display panel (with a special-purpose gas under high voltage).
CRT monitors have numerous disadvantages (they are large, energy demanding, exhibit various problems with image presentation such as shaking, are not usable in military applications, etc.). However, they still have important advantages: they are cheap, produced by numerous companies, allow viewing under large angles, etc.

How does a CRT monitor work?
The image in a CRT monitor is the result of the phosphor coating inside the cathode ray tube: phosphor hit by electrons produces light. At the other (narrower) end of the cathode ray tube there is an electron gun that sends electrons toward the phosphor; the gun consists of a cathode, a heating device and focusing elements. Focusing and directing the electron beams toward the phosphor coating is performed using a strong electromagnetic field produced by the high-voltage cascade. For each pixel there are phosphor dots in red, green and blue (RGB); their combination produces the corresponding colors.

Properties of the CRT monitors
The three most important properties of CRTs are:
- Diagonal length – usually given in inches (1" = 2.54 cm).
- Resolution – the number of pixels in the image; for CRT monitors the typical width/height ratio is 4/3 (e.g., 1600×1200). Parts of the screen close to the edges are not usable in CRT monitors.
- Refresh rate – how many images are presented on the screen in 1 s.

Refreshing and resolution
Today we can adjust the number of pixels that will be turned on (resolution adjustment) up to some maximal value. We can also adjust the refresh rate, expressed in semi-frames per second (i.e., 70 Hz means 35 frames/s, or 70 semi-images). For 1 pixel we need data about three colors, each color represented with k bits. Let the amount of memory available for on-screen presentation per second be W. If the resolution is M×N pixels, the maximal refresh rate is v = 2W/(3kMN). Why?
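The formula v = 2W/(3kMN) can be checked numerically; the value of W below is an arbitrary assumption made only for the example:

```python
# v = 2W / (3 k M N): W bits of video memory traffic per second,
# 3 colors of k bits per pixel, M x N pixels, and the factor 2
# because refreshing is counted in semi-frames.
W = 8 * 10**8          # assumed available memory throughput, bits/s
k = 8                  # bits per color channel
M, N = 1024, 768       # assumed resolution

v = 2 * W / (3 * k * M * N)   # maximal refresh rate, semi-frames per second
print(round(v, 1))
```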

Lowpass pattern
Numerous close pixels at a small distance are perceived as a continuous shade, and we are unable to recognize separate pixels. (Figure: a pixel region, with one pixel enlarged more than 100 times.) Due to numerous reasons (imperfection of the monitor, imperfection of the eyes, and other reasons) we cannot see a pixel in an ideal manner.

Lowpass pattern
The luminance of a single pixel is modeled as A·exp(−(x²+y²)/R²), where A is the maximal luminance and R depends on the monitor quality. What happens when pixels with the same luminance are neighbors at distance d? The resulting luminance, obtained by summing the overlapping single-pixel profiles, is not "flat" (the y-axis is neglected for brevity).
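The "non-flatness" of the summed luminance between two such pixels can be probed numerically; A, R and d below are arbitrary assumptions for the sketch:

```python
import math

A, R, d = 1.0, 1.0, 1.0   # peak luminance, spot radius, pixel spacing (assumed)

def pixel(x, x0):
    """Gaussian luminance model A*exp(-((x-x0)^2)/R^2) along one axis (y neglected)."""
    return A * math.exp(-((x - x0) ** 2) / R ** 2)

total_at_pixel = pixel(0.0, 0.0) + pixel(0.0, d)     # luminance on top of a pixel
total_between = pixel(d / 2, 0.0) + pixel(d / 2, d)  # luminance midway between pixels
ripple = total_at_pixel - total_between              # oscillation of the summed profile
print(round(ripple, 3))
```

A nonzero `ripple` is exactly the luminance oscillation the slide refers to; its size depends on the ratio d/R.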

Lowpass pattern
A monitor has a good lowpass pattern when the oscillations in luminance are as small as possible. Details related to lowpass pattern optimization are given in the textbook.

Highpass pattern
The highpass pattern is formed with alternating lines in different colors. Due to the previously described imperfections, the changes between colors are not abrupt, and in the case of bad monitors they are smooth or blurred. Why is this pattern called highpass? With the same purpose, a checkerboard pattern formed like a chess board is introduced. Details related to the highpass pattern can be found in the textbook.

Cameras
Analog cameras (the vidicon camera is based on the photomultiplier tube): light photons hit the semitransparent photocathode, which releases primary electrons. Accelerated by the electric field (successive cathodes at 0 V, +100 V, +200 V, …, +700 V), these electrons hit the next cathodes, causing an avalanche effect that significantly amplifies the number of secondary electrons. The output of the system is the video signal, i.e., the (significantly amplified) electrical equivalent of the light signal.

Digital cameras
Analog cameras have several advantages, but ever-cheaper digital video cameras are coming into use. There are two types of digital camera sensors: CCD (Charge Coupled Device) and CID (Charge Injection Device). CCD sensors are dominant (due to price reasons) and only they will be analyzed here; details related to CID sensors can be found in the textbook.

Digital cameras
The main part of such a camera is the photodiode array (photodiodes with MOS switches), made on a single chip. The diodes work as capacitors in a "light integrating regime": opening the MOS switches causes a flow of current proportional to the charge integrated on the diodes.

Digital cameras
This current is proportional to the luminance within the given period of integration; this is similar (with a little higher inertia) to the vidicon. There are problems with reading data from these sensors, and there are three reading strategies: full frame CCD, interline transfer CCD and frame transfer CCD. (Figure: image region, storage region and serial register, with line shifts and pixel shifts.)

Reading strategies for CCD: full frame transfer
Assume that the photodiode array is a matrix with a single masked row, not used for light integration. The formed image is moved row by row to the masked row, and from the masked row it is read using a shift register. Advantages: simple hardware and a large percentage of the photodiode array used for image acquisition. Drawback: slowness.

Reading strategies for CCD: interline transfer
Every second row is masked. After the integration interval, the integrated pixels are moved to the masked rows; the information is read from these rows while a new acquisition begins in the non-masked rows. Advantage: speed. Drawback: a low percentage of the diode array is used for acquisition. Check the textbook for details on the third strategy. In the case of color image acquisition, there are color filters that pass only a single basic color to each pixel; the combination of images from diodes with various filters forms the color image. Basic colors will be discussed later.

Human eye
The eye is one of the five human senses; about 5/6 of all information we receive comes through the eyes. The eye works similarly to other sensors. Parts of the eye: 1. posterior compartment, 2. ora serrata, 3. ciliary muscle, 4. ciliary zonules, 5. canal of Schlemm, 6. pupil, 7. anterior chamber, 8. cornea, 9. iris, 10. lens cortex, 11. lens nucleus, 12. ciliary process, 13. conjunctiva, 14. inferior oblique muscle, 15. inferior rectus muscle, 16. medial rectus muscle, 17. retinal arteries and veins, 18. optic disc, 19. dura mater, 20. central retinal artery, 21. central retinal vein, 22. optical nerve, 23. vorticose vein, 24. bulbar sheath, 25. macula, 26. fovea, 27. sclera, 28. choroid, 29. superior rectus muscle, 30. retina.

"Steps in human vision"
Light comes to the iris; photosensitive muscles adjust the size of the iris, and in this way the amount of light allowed into the eye is regulated. Light then comes to the lens; another group of muscles adjusts the lens in order to focus the image appropriately. Light passes through the posterior compartment (vitreous body) and comes to the retina. The light should arrive at an exact position on the retina called the yellow spot, which has a size of about 1 mm².

"Steps in human vision"
On the retina there are the human visual cells: rods (about 125 million, used for night vision) and cones (about 5.5 million, used for color daylight vision – what do you think, why do they have a prismatic shape?). Light is transformed in the visual cells into electric impulses. The number of visual cells decreases with distance from the yellow spot. At a relatively small distance from the yellow spot is the so-called blind spot, the position where the optical nerve exits. The optical nerve is connected with the visual cells through ganglions; this nerve "integrates" the outputs of the visual cells and is connected with the brain.

"Steps" in vision
The optical nerve is rather long (about 6 cm) and connects the eye with a very sensitive part of the brain called the (visual) cortex. Our brain reconstructs the object images: we look with the eyes, but we see with the brain! Transport of optical information through the optical nerve is relatively slow (compared to "wired connections" in machines). The human visual system is inertial: we are unable to form an image immediately after an abrupt luminance change. This leads to eye persistence – we are able to see 24 images per second.

Problems in the eye
Problems in focusing light in the eye lead to missing the yellow spot; then we have the two well-known drawbacks, short-sightedness and long-sightedness. Due to the slower reaction of the muscles that manage the size of the eye lens, a special type of long-sightedness common in older people can develop. A problem with the development of one of the types of cone cells can lead to daltonism. There are other very dangerous diseases: trachoma, cataract, conjunctivitis. Various other problems within the eye, including physical damage, can cause problems in vision.

Color model for the eye
There are 3 types of cone cells, sensitive to 3 different colors: cones that react to blue (≈460 nm), green (≈580 nm) and red (≈650 nm), where λ[nm] is the wavelength. The combination of the responses of these three groups of cone cells gives our color vision. There is a smaller number of cells sensitive to blue than to the other two basic colors, which leads to a smaller sensitivity to this color (this is probably caused by human evolution).

Color model for the eye
During the night our cells for black-and-white vision are more active, but we are not losing information related to colors entirely. Humans are able to distinguish about 1 million colors (the sensitivity to colors, and which colors can be distinguished, vary between humans), but only about 40–80 levels of gray. In the next lecture we will consider color models for digital images, and we will also review some additional important features of the eye.

Exercise No 1: Sensitivity to grayscale
The procedure will be performed as follows: we will create an image with 16×16 pixels and 256 shades of gray, then this image will be shown; try to detect the different shades.

MATLAB program:

clear
k=0;
for m=1:16
    for n=1:16
        A(m,n)=k;
        k=k+1;
    end
end
pcolor(A), colormap(gray(256)), shading interp

Exercise No 1
Obtained image. Question: how many different shades of gray can you observe?

Exercise No 2
Create a binary image with a chess-board pattern, adjusted to 100×100 monitor pixels.

MATLAB code:

[m,n]=meshgrid(1:100,1:100);
r=rem(m+n,2);
imshow(r)

What can you conclude about the quality of your monitor for highpass images?

For self-exercise
Inform yourselves about recent developments in CCD sensors. Consider the operation of some scanner: detect its basic parts and try to find data about the required parts; consider the construction of your own scanner. Inform yourselves about the operation of LCD and plasma displays. Analyze prices and performance of video projectors.

For self-exercise
Find data about various types of printing machines, especially about their construction and parts. Find data about modern computer cards for graphics and video processing. Find data about devices for medical imaging: performance and construction.

Digital Image Processing – Eye: Human Vision Sensor; Color Models

Eye sensitivity
The eye is more sensitive to the lower frequencies of red colors than to the higher frequencies of blue colors. During daylight the eye is most sensitive to the wavelength of 555 nm (close to green). With a decrease of luminance, the frequency of the most sensitive point decreases, approaching the dark red colors (the wavelength increases). The eye has the ability to adapt itself to the luminance level: in order to have the impression of looking at the same object, it is only required to keep a constant ratio between the object luminance and the total luminance.

Eye sensitivity
The eye's ability to detect luminance variations depends on the total luminance of the object: for bright objects we need a larger variation in luminance in order to detect it, while for dark objects a smaller variation suffices. The ratio between the detectable variation of luminance and the total luminance is constant for a particular eye. There is a threshold in the detection of noise (disturbances) in an image; the threshold level increases with luminance, so it is more difficult to detect noise in bright than in dark images.

Eye sensitivity
Sensitivity to contrast is the ability of the eye to distinguish a certain number of different luminance levels; this sensitivity decreases as the noise level increases. Noise correlated with the image can be detected more easily than uncorrelated noise. Noise is more difficult to detect in images with a large number of details than in images without details. Noise in motion (video) sequences causes fatigue of the spectators.

History of color models
The history of color models is very long. Aristotle thought that colors are rays coming from the heavens. Aguilonius and Sigfrid Forsius considered color models during the Renaissance; the former tried to construct a color model based on the colors appearing in nature from daybreak to dusk. Newton discovered, using the prism, that white light consists of all the other colors. Goethe defined a color model based on an equilateral triangle (philosophical considerations).

History of color models
Runge constructed a revolutionary color model based on a sphere and three coordinates: "colorness, whiteness and darkness". Maxwell considered color models and arrived at the Goethe triangle with the red, green and blue colors at the corners. In 1931, the International Commission on Illumination, CIE (Commission Internationale de l'Eclairage), standardized the RGB color model.

Color model and image
All colors that can be seen by humans can be described as a combination of three images coming from the cone cells in our eyes sensitive to blue, green and red. Note that T. Young found in 1802 that a combination of three independent colors (!?) can produce all visible colors. All colors can then be represented as a vector with three components, color = [r, g, b]. This model is called a three-channel model, where the channels are the shades of the particular colors in the model.

0) .1.1) (1. green and blue.0) Yellow lack of blue R (1.0) (1. (0. Coordinates correspond to three basic colors: red.1.1) Cian – lack of red G (0.RGB cube The simplest method for representing colors is RGB cube in Cartesian coordinates.0.1) Magenta lack of green B (0.0. Maximal luminance is 1 in any of colors while minimal is 0.1.0.

RGB cube
Black = (0,0,0), White = (1,1,1). On the main diagonal r=g=b lie the grayscale (achromatic) colors; achromatic colors can be described by a single coordinate only (luminance). Example: the section of the color cube with the plane 0.5r+0.3g+0.2b=1.

RGB model
The RGB model is used in computer and TV monitors. For example, classical cathode ray tube monitors have phosphor grains that, when bombarded with electrons, produce red, green and blue light. The RGB model is called an additive model, since the resulting color is obtained as a "sum" of the three basic colors. Based on the Newton experiment, white (the highest luminance) is obtained with full luminance in all basic colors. Memory representation of the RGB model data requires M×N (pixels) × 3 (colors) × k (bits for representing each color) bits. Commonly k=8, but there are alternative models.

Color table
The RGB model is memory demanding, and there are several techniques for reducing the memory requirements. The simplest is to reduce the number of bits used for representing the blue color, since humans have the smallest sensitivity to it. Other, more sophisticated, methods are based on the colormap, or CLUT – color look-up table: each pixel is coded with r bits, and the numbers from 0 to 2^r−1 each represent one of the colors from the RGB model.

Color table
Information about the color coding is given in a special table (the CLUT) that can be recorded together with the image. The most common type of CLUT is:

color code | amounts of red, green and blue for the coded color
0          | r0 g0 b0
1          | r1 g1 b1
2          | r2 g2 b2
3          | r3 g3 b3
...        | ...
2^r−2      | r(2^r−2) g(2^r−2) b(2^r−2)
2^r−1      | r(2^r−1) g(2^r−1) b(2^r−1)

Color table
What to do with colors from the RGB model that exist in the image but are not given within the CLUT? Most commonly, such a color is replaced with the color from the CLUT that is, by some criterion, the "closest" to the desired color. The memory requirements of the CLUT-based representation are M×N×r bits for the image representation plus (r+3×8)·2^r bits for the CLUT. There are alternatives: several color tables are commonly used in practice, so instead of the color table itself we can record and transmit the information about which table is selected, since the decoder has the same set of CLUTs.
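The "closest color" replacement can be sketched as a nearest-neighbor search in RGB space; the tiny 5-entry CLUT below is invented purely for the example:

```python
import numpy as np

clut = np.array([[0, 0, 0], [255, 0, 0], [0, 255, 0],
                 [0, 0, 255], [255, 255, 255]], dtype=float)  # assumed toy CLUT

def closest_code(rgb):
    """Return the CLUT index whose entry minimizes the Euclidean distance to rgb."""
    d = np.sum((clut - np.asarray(rgb, dtype=float)) ** 2, axis=1)
    return int(np.argmin(d))

print(closest_code((200, 30, 30)))   # a reddish pixel maps to the red entry
```

Euclidean distance in RGB is only one possible criterion; perceptually weighted distances are also used in practice.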

Printing color models
While monitors and projectors produce colors by adding basic colors, this is not the case with printing machines. Everybody knows that a combination of colors on paper produces a very dark, close-to-black color. The reason is the fact that we receive from the paper the reflected part of the luminance: the paper is white until we add some color, and by adding color we reduce a part of the reflected luminance. In this way, by adding colors on the paper, we get a darker and darker image.

Printing color models
To conclude: if we add colors we obtain a darker image, while not adding color produces a brighter image. For this reason, color models for printing are called subtractive. Colors for printing can be formed in the CMY (cyan–magenta–yellow) model. All basic colors represent the lack of one of the colors from the RGB model – check their positions in the color cube.

CMY model
The relationship between colors in the CMY and RGB models is:

C = 1 − R
M = 1 − G
Y = 1 − B

A problem of the CMY model is the inability to reproduce a large number of colors. It is difficult to reproduce fluorescent and similar colors and, most importantly, black! Reason: mix cyan, magenta and yellow and you will get a dark but not black color (or you will use a huge amount of ink, which is not economical). Black is very important for humans (important for image recognition, edges of shapes, etc.). Approximately only about 1 million colors can be reproduced with good quality.

CMYK model
Printing machines usually print images in the CMYK model (a 4-channel model); K means blacK, since B is reserved for Blue. The relationship between the CMY and CMYK models is:

K = min(C, M, Y)
C' = C − K
M' = M − K
Y' = Y − K
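The two sets of relations (C=1−R etc., then the K extraction) chain together as follows; this sketch simply restates the slide's formulas, with function and variable names of my own choosing:

```python
def rgb_to_cmyk(r, g, b):
    """RGB in [0,1] -> CMY via C=1-R, M=1-G, Y=1-B, then extract K=min(C,M,Y)."""
    c, m, y = 1 - r, 1 - g, 1 - b
    k = min(c, m, y)
    # subtracting K removes the shared dark component, printed with black ink instead
    return c - k, m - k, y - k, k

print(rgb_to_cmyk(1.0, 0.0, 0.0))   # pure red
```

Note that pure black (0,0,0) maps to (0,0,0,1): only the K channel is used, which is exactly the point of introducing black ink.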

CMY and CMYK models – comparison: CMY image vs CMYK image.

Other printing color models
The CMYK model can produce a significantly larger number of colors than the CMY model; this is very important, especially for some important colors such as black. Unfortunately, numerous colors still cannot be reproduced in this manner, which motivated research into the design of newer and better printing color models. The most important group are the so-called HiFi models. These models use (this is still not subject to standardization) 6 or 7 printing colors (channels); the additional channels are green, orange and violet.

RGB→GRAY
Sometimes it is required to transform the RGB color space to grayscale. Since the grayscale colors are on the main diagonal of the color cube, we can simply use the corresponding projection on this diagonal and consider it as the grayscale variant of the given color:

GRAY = (R²+G²+B²)^(1/2)

The square root is a numerically demanding operation, and it is sometimes considered that the obtained results are not realistic with respect to the image colors. The result should be quantized to the proper number of gray levels.

RGB→GRAY
The most common RGB→GRAY converter is the mean value of the channels: GRAY = (R+G+B)/3. Even simpler techniques, such as taking one of the channels (commonly red or green) as the grayscale, are sometimes used: GRAY = R or GRAY = G. The blue channel is rarely used, since it is assumed to be non-realistic. There are also some alternative schemes, some of them presented later within this lecture. GRAY→RGB will be taught at the end of the course, when we present the story about pseudo-coloring.
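The converters mentioned here and on the previous slide can be compared side by side on one pixel; the sample values are arbitrary:

```python
import math

r, g, b = 0.9, 0.6, 0.3   # an arbitrary sample pixel in [0, 1]

gray_norm = math.sqrt(r**2 + g**2 + b**2)  # norm-based variant; can exceed 1 unless rescaled
gray_mean = (r + g + b) / 3                # mean of the channels
gray_r, gray_g = r, g                      # single-channel shortcuts

print(round(gray_norm, 3), round(gray_mean, 3), gray_r, gray_g)
```

The spread between the four results illustrates why different converters give visibly different grayscale images.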

Binary image
A binary image has only two colors, usually black and white; white can be represented as 1 and black as 0. Usually this image is obtained from a grayscale image g(m,n) by comparing it with a threshold T:

b(m,n) = 1 for g(m,n) ≥ T
b(m,n) = 0 for g(m,n) < T

Binary images are used in industrial applications, edge detection, etc. Threshold selection will be discussed later.
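A minimal sketch of the thresholding rule (the array and the threshold value are invented for the example):

```python
import numpy as np

def binarize(g, T):
    """b(m,n) = 1 where g(m,n) >= T, else 0."""
    return (g >= T).astype(int)

g = np.array([[10, 200], [130, 90]])  # assumed 8-bit grayscale values
print(binarize(g, T=128))
```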

**Three channel model**

In Cartesian coordinates any color can be represented using a vector with three coordinates (R,G,B). However, T. Young (1802) concluded that a color can be determined using 3 independent pieces of information (three independent coordinates). Quite similarly to the RGB model, instead of Cartesian coordinates we can also use polar or spherical coordinates and develop a corresponding color space. Some other color models are also used in practice. An overview of color models different from RGB and CMYK follows.

**Three channel models**

Assume that we have three independent basic colors (c1,c2,c3). All other colors can be represented as a vector (C1,C2,C3) where Ci corresponds to the amount of the basic color ci. Chroma is defined as:

hi = Ci / (C1 + C2 + C3), (i = 1, 2, 3)

An alternative for storing colors is the color space (h1,h2,Y), where Y = C1 + C2 + C3 is the total luminance. This procedure is used in the development of numerous color models.

**CIE color models**

The International Commission on Illumination (CIE) developed the RGB model in 1931 (we call it the RGB CIE since it is different from the computer model). RCIE corresponds to the wavelength λ=700nm, GCIE to λ=546.1nm and BCIE to λ=435.8nm. The reference white for this model is RCIE=GCIE=BCIE=1. The RGB CIE model does not contain all colors that can be reproduced, and for this reason the XYZ model, defined via a linear transformation of the RGB CIE, was introduced; it can represent all visible colors.

**RGB CIE and XYZ CIE relationships**

X, Y, Z are linearly dependent on R, G, B and the transform can be given as:

[X]   [0.490 0.310 0.200] [R]
[Y] = [0.177 0.812 0.011] [G]
[Z]   [0.000 0.010 0.990] [B]
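The matrix product can be sketched directly; note that every row of the matrix sums to 1, which is why the reference white R=G=B=1 maps to X=Y=Z=1 (Python sketch, names illustrative):

```python
M = [[0.490, 0.310, 0.200],
     [0.177, 0.812, 0.011],
     [0.000, 0.010, 0.990]]

def rgb_cie_to_xyz(r, g, b):
    """Apply the 3x3 linear transform row by row."""
    return tuple(m0 * r + m1 * g + m2 * b for m0, m1, m2 in M)

print(rgb_cie_to_xyz(1, 1, 1))  # reference white
```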

X = Y = Z = 1 is the reference white.

X, Y and Z should represent the wavelengths at which the cone and rod cells are the most sensitive. Y corresponds to the most sensitive wavelength for rod cells.

The chrominance components can be defined as: x = X/(X+Y+Z), y = Y/(X+Y+Z).

CIE chromatic diagram. The illuminance of numerous light sources is still given through the (x,y) chromatic diagram, and this diagram is widely used for illumination system design. A problem with this diagram is the fact that it contains color areas of elliptic shape within which color differences are not visible to humans.

"Computer" and CIE RGB models. The current monitor RGB model was developed as a recommendation of the NTSC (National Television Systems Committee). The relationship between RGB CIE and RGB NTSC is linear and can be given using the transformation matrix:

[RCIE]   [ 1.167 -0.146 -0.151] [R]
[GCIE] = [ 0.114  0.753  0.159] [G]
[BCIE]   [-0.001  0.059  1.128] [B]

Modification of the XYZ model. Three modifications are used for overcoming drawbacks of the XYZ model. The UCS model (model with uniform chromatic scale):

u = 4X / (X + 15Y + 3Z),  v = 6Y / (X + 15Y + 3Z)

where the luminance is the same as in the XYZ model. The UVW model:

U = 2X/3,  V = Y,  W = (-X + 3Y + Z)/2

Modification of the XYZ model. The U*V*W* model is formed in such a manner that the reference white is at the origin:

W* = 25(100Y)^(1/3) - 17,  for 0.01 ≤ Y ≤ 1
U* = 13W*(u - u0),  V* = 13W*(v - v0)

where (u0, v0) are the coordinates of the reference white color.

Colorimetry. Colorimetry is the scientific area specialized for color comparison. For example, in industry we can have some process that is finished when the color of an object is the same as or close to some color known in advance. Assume that the current color in the RGB model is (R1,G1,B1) while the target color is (R2,G2,B2). The distance between these two colors can be described by the Euclidean distance:

sqrt[(R1 - R2)² + (G1 - G2)² + (B1 - B2)²]

although some alternative distances are also used. Unfortunately, a distance defined in this manner for the RGB model does not produce reliable results, since similar colors can produce a large distance and quite different colors a relatively small one.
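The Euclidean distance above is straightforward to compute (a sketch; the threshold used to decide "close enough" would be application-specific):

```python
def color_distance(c1, c2):
    """Euclidean distance between two (R, G, B) triples."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2)) ** 0.5

print(color_distance((0, 0, 0), (3, 4, 0)))
```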

Colorimetry. All models linearly dependent on the RGB suffer from the same problem as the RGB itself. Therefore, for colorimetry applications the Lab color space is defined. The Lab model can be defined in various manners, but here we adopt a definition based on the XYZ model:

L* = -16 + 116·(Y/Y0)^(1/3)   (luminance component)
a* = 500[(X/X0)^(1/3) - (Y/Y0)^(1/3)]
b* = 200[(Y/Y0)^(1/3) - (Z/Z0)^(1/3)]

where a* > 0 indicates red and a* < 0 green, while b* > 0 indicates yellow and b* < 0 blue.

Colorimetry. (X0,Y0,Z0) is the reference white (almost always it is (1,1,1)). The Euclidean distance in the Lab coordinates is assumed to be a good measure of color difference. However, there are alternative approaches for defining color difference measures.

HSL and related models. As we have already seen, all colors can be represented using three independent colors; colors can be represented using Cartesian coordinates (the RGB cube), but they can also be represented using spherical or polar coordinates. Humans have two types of vision: night vision based on luminance and daylight vision based on colors. Before the development of color TV there existed black-and-white (grayscale) TV, and it was important to develop the new system so that people with old TV sets could follow the new signal with the same functionality as they did with the old one. Numerous color models were developed for these purposes, and we will describe only probably the most popular one, the HSL.

HSL color model. H stands for hue, S for saturation and L for luminosity. H is represented by an angle; in the HSL model, angles between 0 and 240 degrees represent colors that can be seen by humans, while angles between 240 and 360 degrees are UV colors. The procedure for transformation of the RGB model to the HSL model is: Step 1, coordinate transform:

xHS = (2R - G - B)/6,  yHS = (G - B)/2,  L = (R + G + B)/3

RGB→HSL. Step 2: transform from Cartesian coordinates (xHS, yHS) to polar coordinates, where the radius is a measure of saturation and the angle a measure of hue:

ρ = sqrt(xHS² + yHS²),  φ = ∠(xHS, yHS)

The obtained coordinate system (φ, ρ, L) corresponds to the HSL, but commonly several additional operations are performed. Step 3: normalized saturation:

S = ρ/ρmax = 1 - 3·min(R,G,B)/(R + G + B) = 1 - min(R,G,B)/L

RGB→HSL. Step 4: additional processing of the angle (hue):

θ = arccos{ 0.5[(R - G) + (R - B)] / sqrt[(R - G)² + (R - B)(G - B)] }

Step 5: final relationship for H:

H = θ for G ≥ B,  H = 2π - θ for G < B
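The five steps can be combined into one sketch (channels assumed in [0,1]; the clamp on the arccos argument guards against floating-point rounding and is our addition, not part of the slides):

```python
import math

def rgb_to_hsl(r, g, b):
    """Steps 1-5 above: luminosity, normalized saturation, hue angle."""
    l = (r + g + b) / 3.0
    s = 1.0 - min(r, g, b) / l if l > 0 else 0.0
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    theta = math.acos(max(-1.0, min(1.0, num / den))) if den > 0 else 0.0
    h = theta if g >= b else 2 * math.pi - theta
    return h, s, l

print(rgb_to_hsl(1.0, 0.0, 0.0))  # pure red: hue angle 0
```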

HSL model. The HSL→RGB transformation can be performed similarly; determine this transform for self-exercise with usage of the textbook. [Figure: popular presentation of the HSL color model, with labels White, Yellow, Red, Blue, Black.]

Color models for video signals. These are similar to the HSL and related models. Here we have a single component that corresponds to the grayscale (so-called black-and-white) image, due to backward compatibility with older formats of the TV signal. NTSC uses the YIQ color model, where intensity is given as:

Y = 0.299R + 0.587G + 0.114B

with the chrominance components given as:

I = cos(33°)·0.877(R - Y) - sin(33°)·0.493(B - Y) = 0.596R - 0.274G - 0.322B
Q = sin(33°)·0.877(R - Y) + cos(33°)·0.493(B - Y) = 0.211R - 0.523G + 0.312B
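A sketch of the RGB→YIQ conversion using the standard NTSC matrix coefficients (rounded to three decimals):

```python
def rgb_to_yiq(r, g, b):
    """Y carries the grayscale; I and Q carry chrominance."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    i = 0.596 * r - 0.274 * g - 0.322 * b
    q = 0.211 * r - 0.523 * g + 0.312 * b
    return y, i, q

print(rgb_to_yiq(1.0, 1.0, 1.0))  # white: chrominance vanishes
```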

Color models for video signals. The PAL color model (YUV; Y is the same as in the NTSC):

U = 0.493(B - Y) = -0.147R - 0.289G + 0.437B
V = 0.877(R - Y) = 0.615R - 0.515G - 0.100B

The SECAM color model (YDrDb; Y is the same as in the NTSC):

Db = 1.505(B - Y) = -0.450R - 0.883G + 1.333B
Dr = -1.902(R - Y) = -1.333R + 1.116G + 0.217B

These models are quite similar to the HSL.

Exercise No. 1. Realize the relationship between the RGB CIE and the standard RGB model. Here we will address several aspects of the problem: we will create the matrix that produces the inverse transform from the RGB CIE to the RGB model; we will determine the limits in which we should perform discretization of the RGB CIE model; and we will visualize the channels of an image for the RGB and RGB CIE models.

» A = [1.167 -0.146 -0.151; 0.114 0.753 0.159; -0.001 0.059 1.128];
» B = inv(A)
B =
    0.8417    0.1561    0.0907
   -0.1290    1.3189   -0.2032
    0.0075   -0.0688    0.8972

Exercise No. 1. The limits of the RGB model are usually 0 and 1 along all coordinates, but they are different for the RGB CIE model. The minimum of the R component in the CIE model is obtained for R=0, G=B=1 and equals -0.297, while the maximum is produced with R=1, G=B=0 and equals 1.167. The minimum of G in the CIE model follows for R=G=B=0 and equals 0, while the maximum follows for R=G=B=1 and equals 1.026. The B component achieves its minimum for R=1, G=B=0, equal to -0.001, and its maximum for R=0, G=B=1, equal to 1.187. For visualization of the color channels we can use the following commands:

a = double(imread('spep.jpg'));
b(:,:,1) = 1.167*a(:,:,1) - 0.146*a(:,:,2) - 0.151*a(:,:,3);
b(:,:,2) = 0.114*a(:,:,1) + 0.753*a(:,:,2) + 0.159*a(:,:,3);
b(:,:,3) = -0.001*a(:,:,1) + 0.059*a(:,:,2) + 1.128*a(:,:,3);

The channels can be represented with commands such as:

pcolor(flipud(b(:,:,1))); shading interp

For self-exercise. List of mini-projects and tasks for self-exercise:
1. Solve problems from the textbook.
2. Realize all color models given on these slides and in the textbook, create transformations between them and visualize the channels for the considered models.
3. Consider the following experiment. Colors can be printed in CMYK or in an alternative model with an appropriate amount of black. Assume that, in addition, we have a color space with three alternative colors (for example rose, green and orange). The rules for printing in this model are the same as for the CMY (it is possible to print up to 90% of any color). Assume that colors that cannot be printed in the CMYK model are those having any channel, except the black channel, represented with more than 90% of the maximal value. How many colors from the RGB model is it possible to print in the CMYK model, and how many in the model with the three additional alternative colors?

For self-exercise.
4. Make images of cakes in the process of baking, or of some other similar kitchen experiment. The main results of the first set of experiments should be: the average color of the cake several minutes before we assume that it is done, and the average color when the cake is done. The second set of experiments is performed after that. Try to determine an algorithm for automatically turning off the baking appliance based on the first set of experiments, and check how well it performs on the second set. Determine the number of correct and wrong results.

Digital Image Processing: Histogram, Point Operations, Geometrical Transforms, Interpolation

Histogram. The histogram is a simple (but very useful) image statistic:

H(X) = number of pixels with luminance X

For a grayscale image with b bits/pixel (here b = 8), the sum of the histogram values over all luminances is equal to the number of pixels in the image:

Σ_{X=0}^{255} H(X) = M × N
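The definition translates directly into a counting loop (sketch; the image is represented as a list of rows of integer luminances):

```python
def histogram(img, levels=256):
    """H[X] = number of pixels with luminance X."""
    h = [0] * levels
    for row in img:
        for px in row:
            h[px] += 1
    return h

img = [[0, 1], [1, 255]]
print(sum(histogram(img)))  # equals M x N = 4
```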

Example of histogram. The histogram has numerous applications. It is very useful in techniques that use a probabilistic model of the image with a probability density function of the image luminance (or at least an estimate of the pdf). How can we connect the histogram and the probability density function?

Histogram – Common shapes. Unipolar histograms correspond to dark and bright images; a bipolar histogram can be used for threshold determination and for obtaining binary images (how?). [Figure: three characteristic histogram shapes H(P) over the luminance range L.]

Histogram extension. Optical sensors very often concentrate the image in a very narrow region of luminance. Software systems are then usually employed to solve this problem, using information obtained from the histogram. Let the image be contained in the luminance domain [A,B] (estimation of A and B can be performed using the histogram), and assume that we want to extend the histogram over the entire 8-bit grayscale domain [0,255]. The luminance of the image with the extended histogram is:

f(X) = 255(X - A)/(B - A)

where X is the luminance of the original image.
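A sketch of this stretching, including the rounding and clipping back to integer levels discussed later in this lecture:

```python
def stretch_histogram(img, a, b):
    """f(X) = 255 (X - a) / (b - a), rounded and clipped to [0, 255]."""
    out = []
    for row in img:
        out.append([max(0, min(255, round(255 * (px - a) / (b - a))))
                    for px in row])
    return out

print(stretch_histogram([[100, 150, 200]], 100, 200))
```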

Histogram extension – Example. [Figure: original image and image after histogram extension, with the original histogram and the histogram after the operation.]

Histogram equalization. Histogram equalization is also one of the most common histogram operations. We want to transform the histogram to be approximately uniform, i.e., the goal is to have an image with approximately the same number of pixels for all luminances; in the equalized image the histogram is approximately flat. Images with an equalized histogram have good contrast, and that is the main reason for performing this operation.

Histogram equalization. This can be considered as the following problem: there is a random variable with probability density function fx(x) (it can be estimated using the histogram of the original image). We are looking for a transform y = g(x) producing a probability density function fy(y); in this case fy(y) is proportional to the equalized histogram. [Figure: original histogram H(P) and the goal, an equalized histogram, over the luminance range L.]

Histogram equalization. From probability theory it follows:

fy(y) = fx(x1)/|g'(x)| |x=x1 + fx(x2)/|g'(x)| |x=x2 + ... + fx(xN)/|g'(x)| |x=xN

where (x1, x2, ..., xN) are the real roots of the equation y = g(x). Assume that the solution is unique (this is possible for monotone functions g(x)):

fy(y) = fx(x1)/|g'(x)| |x=x1

Histogram equalization. Since fy(y) is constant, |g'(x1)| must be proportional to fx(x1). Assume that g(x) is a monotone increasing function; it means g'(x) = c·fx(x) for some constant c. Select c = 1 (so that the output image has the same luminance domain as the input one):

g(x) = ∫_{-∞}^{x} fx(x1) dx1

i.e., g is the integral of the probability density function, which is a monotone function on its domain.

Histogram equalization. Since the image is not a continuous but a discrete function, we have no continuous probability density function but its discrete version (the histogram). The MATLAB realization is quite simple:

I = imread('pout.tif');          % reading original image
a = imhist(I);
g = cumsum(a)/sum(a);            % function g (cumulative histogram)
J = uint8(255*g(double(I)+1));   % output image (+1 since MATLAB indices start at 1)

Histogram equalization – Example. [Figure: original image and equalized image (significantly improved contrast), with the corresponding histograms.] The obtained density is not uniform due to the discrete nature of images.

Histogram matching. Histogram equalization is an operation that produces a uniform probability density function. The procedure is the same for any monotone function g(x) (increasing or decreasing), which is satisfied in the equalization case; otherwise we need a more complicated operation involving segmentation of g(x) into monotone regions. Similarly, the histogram can be matched to any desired probability density function.

Applications of the histogram. All methods that use an estimate of the probability density function are histogram based. Applications include: improvement of image contrast (equalization); histogram matching; modification of colors. The histogram can be applied locally to image parts; also, it can be calculated for parts of the image or for channels of color images. We can perform histogram-based operations on a selected region, object or background, depending on our task: for example, a very bright object on a dark background.

Image negative. There are numerous operations that can be applied to the image luminance. Some of them are applied to each pixel independently of other pixels; these operations are called point operations. One of the simplest of them is determination of the image negative (or positive, if we have the image negative). This operation is performed in a different manner depending on the image format.

.m) RGB Operation for grayscale images is performed for each image channel.Image negative logical operation negation Binary image: Negativ(n.m)=2k-1-Original(n.m) number of bits used for memory representation of image pixels Grayscale Negativ(n.m)=~Original(n.

Color clipping. Color clipping is an operation performed on colors; it is not related to geometrical clipping. We keep some image colors as in the original image, while other colors are limited to selected limits:

b(i,j) = cmax       for a(i,j) > cmax
b(i,j) = a(i,j)     for cmin ≤ a(i,j) ≤ cmax
b(i,j) = cmin       for a(i,j) < cmin

Brightening (darkening). There are several methods to perform these operations. For example, f(n,m) = g(n,m) + r increases luminance for r > 0, while for r < 0 the image becomes darker. The second technique: f(n,m) = g(n,m)·r, brightening for r > 1 and darkening for 0 < r < 1. These techniques are not of high quality since they have several drawbacks. The most common is the following procedure (gamma correction):

f(n,m) = (2^k - 1)·{g(n,m)/(2^k - 1)}^γ

where k is the number of bits per pixel; γ > 1 darkens and γ < 1 brightens the image.
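The gamma rule can be sketched as follows (rounding back to integer levels as described later for point operations):

```python
def gamma_correct(img, gamma, k=8):
    """f = (2^k - 1) * (g / (2^k - 1))^gamma; gamma > 1 darkens, gamma < 1 brightens."""
    top = 2 ** k - 1
    return [[round(top * (px / top) ** gamma) for px in row] for row in img]

out = gamma_correct([[0, 64, 255]], 0.5)
print(out)  # mid tones are brightened; 0 and 255 are fixed points
```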

Luminance correction. [Figure: histogram modification functions for brightening and darkening, shown as curves over the normalized luminance range [0,1].] This way of representing point operations is accepted by almost all image processing software (for example Photoshop).

Results of point operations. Point operations, and other operations usually performed on images, do not always produce integers and numbers in the proper domain. After the operations are performed we should return the results to the image format. This is done through simple steps (described here for a grayscale image with 256 levels): rounding or truncating to integers (rounding is more precise but truncation is commonly performed); luminances below 0 are set to 0, while those above 255 are set to 255.

Point operations – Conclusion. Point operations are very common and usually they are performed after observing the histogram. We assumed that a pixel of the target image depends only on the pixel of the original image at the same position. It is possible to define more complex functions that establish relationships between more pixels. Some of these combinations remain for your self-exercise.

Geometrical transforms. In the case of a geometrical transform, the pixel at position (x,y) moves to position (x1,y1) in the target image. Here we are in fact considering a transformation of coordinates in the digital image that can be written as:

X1 = [x1; y1] = g(X) = g([x; y])

The simplest transform is translation, where the entire image is moved by a given vector (x0,y0).

Translation.

X1 = X - [x0; y0] = [x - x0; y - y0],  g(x,y) = f(x - x0, y - y0)

In this case we keep the dimension of the target image the same as that of the original image. In the region appearing due to translation we put white or some other default color. This strategy is called cropping. An alternative strategy (when we want to change the image dimension) is enlarging the image so that the entire original image is kept in the target image. Also, it is possible to have cyclical translation, where the parts removed from the image are cyclically shifted to the beginning.

Cropping. Cropping is an operation where part of the original image is used as a new image. Of course, this image has smaller dimensions than the original one. For example, let the original image f(x,y) have dimensions (M,N) and let us crop the region between (M1,N1) and (M2,N2), where 0 < M1 < M2 < M and 0 < N1 < N2 < N:

g(x - M1 + 1, y - N1 + 1) = f(x,y)  for x∈[M1,M2] and y∈[N1,N2]

Determine the dimensions of the target image.

Rotation. The coordinate transform in the case of rotation is defined as:

X' = [cos α  sin α; -sin α  cos α]·[x; y]

The obtained image is given as: g(x,y) = f(x·cos α + y·sin α, -x·sin α + y·cos α). The positive direction of rotation is counter-clockwise, the negative direction clockwise. We assumed that the coordinate transform is performed around the origin; this is a rare situation in digital images. Develop the coordinate transform for rotation around a pixel (x0,y0).

Distortion. We will demonstrate distortion along one of the coordinate axes. The coordinate transform:

[x'; y'] = [1  -cot θ; 0  1]·[x; y],  g(x,y) = f(x - y·cot θ, y)

Consider the distortion that would be performed parallel to the line y = ax + b.

Scaling. Coordinate scaling can be described as:

X' = [a 0; 0 b]·X

This is scaling along the x and y axes. Determine the function of the output image based on the input image, and the dimensions of the output image as a function of a and b. For which parameter values is the image enlarged? Is it possible to define scaling along alternative directions? Could reflection with respect to the coordinate axes or the origin be described using scaling?

Nonlinear transforms. There are numerous nonlinear transforms used in digital image processing; their number is significantly greater than that of the linear ones. Here we give a simple example:

g(x,y) = f(x + A·sin(by), y + A·sin(bx))

An important example is the fish-eye nonlinearity.

Fish-eye transform. The fish-eye effect is caused by the shape and the limited (relatively small) dimensions of the camera lens. It causes objects in the middle of the scene to appear larger than objects at the borders of the scene. Sometimes this effect is desired in photography, and photographers simulate it or produce it using a special form of lenses. Try to simulate the fish-eye transform and propose a method for removing this effect.

Problem. [Figure: original image with pixels on the grid; after a geometrical transform (image rotated by 45 degrees) the pixels are dislocated from the grid.]

Need for interpolation. Commonly, only a small number of pixels lie on the grid after geometrical transforms, while the others are displaced. We then have the problem of determining the pixels on the grid of the target image. Techniques for determination of grid values are called interpolation. Here we will describe several strategies for interpolation; for other techniques look at the textbook, the Internet, and the additional material available at the lecturer's office.

Nearest neighbor. The nearest neighbor technique is the simplest interpolation strategy: for a pixel on the grid we take the value of the nearest pixel of the interpolated image. This technique suffers from low quality. [Figure: original rectangle after rotation by 5 degrees with this interpolation technique.] The human eye is very sensitive to the broken edges and disturbed small details that are caused by this form of interpolation.

However.Bilinear Interpolation Strategy of bilinear interpolation is slightly better with respect to image quality than the nearest neighbor but little bit slower.y) transformed pixels (we assume that dimensions of square in which we perform interpolation are not changed) 1-y 1-x y x . Let pixel of original image be surrounded with four pixels of transformed image. pixel that we want to determined luminance g(x. calculation burden in this strategy is still reasonable.

Bilinear interpolation. For simpler determination we shift the coordinate system. Bilinear interpolation determines the luminance at the point (m+x, n+y), surrounded by f(m,n), f(m+1,n), f(m,n+1) and f(m+1,n+1), as:

f(m+x, n+y) = a·x·y + b·x + c·y + d

where the constants a, b, c and d should be determined.

Bilinear interpolation. The constants can be determined from the following conditions:

f(m,n) = d  ⇒  d = f(m,n)
f(m+1,n) = b + d  ⇒  b = f(m+1,n) - f(m,n)
f(m,n+1) = c + d  ⇒  c = f(m,n+1) - f(m,n)
f(m+1,n+1) = a + b + c + d  ⇒  a = f(m+1,n+1) + f(m,n) - f(m+1,n) - f(m,n+1)

Consider the following case. We are not performing a geometrical transform but want to change the number of pixels in the image (for example, instead of N×M we want to get kN×lM, where k and l are integers, k > 1 and l > 1). Determine the relationship that connects the original and target images with bilinear interpolation. This operation is called image resize.
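The four conditions give this direct implementation (sketch; x and y are the fractional offsets in [0,1] from the corner (m,n)):

```python
def bilinear(f00, f10, f01, f11, x, y):
    """f(m+x, n+y) = a*x*y + b*x + c*y + d with the constants derived above."""
    d = f00
    b = f10 - f00
    c = f01 - f00
    a = f11 + f00 - f10 - f01
    return a * x * y + b * x + c * y + d

print(bilinear(0, 10, 0, 10, 0.5, 0.5))  # halfway along x between 0 and 10
```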

Rotation – MATLAB program

function b = rotacija_i_interpolacija(a, theta, x0, y0)
% realization is performed within a function where a is the source image,
% theta is the rotation angle, x0 and y0 are the rotation center
[M, N] = size(a);    % size of grayscale image
b = zeros(M, N);     % target image
% we assume that the target image has the same dimensions as the source image
% and we will perform cropping of the remaining parts
for xp = 1 : M
  for yp = 1 : N     % performing the operation for all image pixels
    % determination of the origin of the pixel mapped to (xp, yp), i.e.,
    % where it is in the original image (inverse transform)
    x = (xp - x0) * cos(theta) - (yp - y0) * sin(theta) + x0;
    y = (xp - x0) * sin(theta) + (yp - y0) * cos(theta) + y0;

Rotation – MATLAB program

    if ((x >= 1) & (x <= M) & (y >= 1) & (y <= N))  % is the pixel within the proper domain?
      xd = floor(x); yd = floor(y);  % (xd, yd) bottom left corner of the
      xg = ceil(x);  yg = ceil(y);   % rectangle for interpolation, (xg, yg) upper right corner
      % determination of coefficients
      D = double(a(xd, yd));
      A = double(a(xg, yd)) - D;
      B = double(a(xd, yg)) - D;
      C = double(a(xg, yg)) - double(a(xg, yd)) - double(a(xd, yg)) + D;
      % value of the target image
      b(xp, yp) = C*(x-xd)*(y-yd) + A*(x-xd) + B*(y-yd) + D;

Rotation – MATLAB program

    end    % end of the if selection
  end
end        % end of the two for cycles
b = uint8(b);  % return image to proper format
%%% end of the program

Write a program for distortion. Write a program for rotation with nearest neighbor interpolation. Rotate an image by 5 degrees twice using the nearest neighbor and then perform rotation by -5 degrees twice. Perform the same operation with bilinear interpolation and compare the results.

Polar to rectangular raster. In medicine, numerous images are created by axial recording of objects under various angles. These images have a polar raster (pixel distribution), and the obtained image has a circular shape. This imaging technique is common for various types of medical scanners. Similar images are obtained by radars, sonars and some other acquisition instruments. Since monitors have a rectangular raster, we have to perform a corresponding interpolation.

Polar to rectangular raster. Let p1, p2, p3, p4 be the pixels of the polar raster surrounding a pixel c of the rectangular raster. The bilinear interpolation form that is commonly applied here is:

f(c) = [f(p1)/d1 + f(p2)/d2 + f(p3)/d3 + f(p4)/d4] / [1/d1 + 1/d2 + 1/d3 + 1/d4]

where di is the distance between pi and c, and f() is the luminance in the corresponding pixel.
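This inverse-distance weighting can be sketched as follows (the small epsilon guarding against a zero distance is our addition, not part of the slides):

```python
def idw(samples, eps=1e-12):
    """samples: (distance, luminance) pairs for the four nearest polar pixels."""
    num = sum(f / max(d, eps) for d, f in samples)
    den = sum(1.0 / max(d, eps) for d, f in samples)
    return num / den

print(idw([(1.0, 10), (1.0, 20), (1.0, 30), (1.0, 40)]))  # equal distances -> mean
```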

Other interpolation methods. Earlier, bicubic interpolation was not used due to its computational demands; today the calculation demands are considered reasonable and it is one of the most common strategies. It is based on third-order polynomial functions. Some sensors deform the image in the acquisition process (for example scanners). Fortunately, the distortion can be known in advance and we can select an appropriate interpolation strategy, commonly using a grid. The procedure is as follows: a scan of a rectangular grid is performed; then the positions to which the nodes of the rectangles have moved are observed; based on this we create the inverse transform that returns the image to its proper shape using software means.

Other interpolation forms. A group of polynomial interpolation algorithms is quite common, and among them the Lagrange interpolation technique is quite popular. A quite common interpolation technique today is based on splines; this technique is related to both polynomial interpolation and wavelets. The Fourier transform can also be used for interpolation purposes. There are numerous well-established interpolation techniques able to preserve important image features such as, for example, edges.

For self-exercise. Write your own program for evaluation of the image histogram. Write a program for histogram adjustment where the upper and lower bounds are adjusted to reject the 5% darkest and 5% brightest pixels; pixels outside this range should be set to the maximal, i.e., minimal luminance. Write programs for calculation of the image negative. How would you determine the negative of an image written using a colormap? How is the image negative calculated for color models different from the RGB? Realize your own functions for all variants of translation. Create a target image based on the original image using a hexagonally shaped range of size 2-4-2, where pixels of the destination image are equal to the mean of pixels of the original image.

For self-exercise. Write a coordinate transform where rotation is performed by an arbitrary angle. Write a coordinate transform that performs distortion parallel to an arbitrary line y = ax + b. Is the original image enlarged or shrunk with respect to a and b in the case of scaling? Can scaling be defined for directions other than along the x and y axes? Can reflection along the coordinate axes and with respect to the origin be described using scaling? Realize all introduced linear geometric transforms and determine the functional relationship between output and input images for all transforms defined within the lectures. Realize the coordinate transform g(x,y) = f(x + A·sin(by), y + A·sin(bx)) and perform experiments with A and b.

For self-exercise. Create a program for image resize based on the bilinear transform. This program should be able to handle non-integer values of the scales k and l, as well as the possibility that k and l are smaller than 1. Write a program for distortion. Write a program for rotation with the nearest neighbor interpolation; perform rotation by 5 degrees twice using the nearest neighbor and after that by -5 degrees twice; also repeat these operations with bilinear interpolation and compare the results. Check whether the bilinear interpolation used for transformation of the polar to the rectangular raster is the same as the standard bilinear interpolation introduced previously.

Project. Write a program that allows users to select colors and adjust them in an interactive manner, including usage of curves such as those presented on slide 18. When the user defines several points, the curve should be interpolated using Lagrange interpolation. Write a program that performs the fish-eye transform, as well as a program able to return an image distorted with the fish-eye to a normal (or close to normal) shape.

Project. Review the Lagrange interpolation formula and use it for polynomial interpolation on the grid. Find Internet resources related to interpolation and write a seminar paper on the found facts.

Digital Image Processing: Image and the Fourier Transform

FT of multidimensional signals. Images are 2D signals. The Fourier transform of a 2D continuous-time signal x(t1,t2) is:

X(ω1,ω2) = ∫∫ x(t1,t2) e^(-jω1t1 - jω2t2) dt1 dt2   (both integrals over (-∞,∞))

The inverse Fourier transform gives the original signal:

x(t1,t2) = 1/(2π)² ∫∫ X(ω1,ω2) e^(jω1t1 + jω2t2) dω1 dω2

FT of multidimensional signals. A signal and its FT represent a Fourier transform pair. This pair can be written in compact form by introducing vectors (allowing a larger number of coordinates than 2): t = (t1, t2, ..., tQ) and ω = (ω1, ω2, ..., ωQ), with ωt = ω1t1 + ... + ωQtQ and dt = dt1 dt2 ... dtQ. The Fourier transform can then be written as:

X(ω) = ∫ x(t) e^(-jωt) dt

FT of multidimensional signals. The inverse multidimensional Fourier transform:

x(t) = 1/(2π)^Q ∫ X(ω) e^(jωt) dω

We will consider 2D signals only. Since we are considering discretized signals, we will not consider the FT of continuous-time signals in detail. Before we proceed with the "story" about discretized signals, we give several general comments about the FT.

Fourier transform. The FT establishes a "1 to 1" mapping with the signal. Roughly speaking, the signal in the time domain and in the spectral domain (its Fourier transform) are different representations of the same signal. In addition, the FT and its inverse have quite similar definitions (differing in a constant and in the sign of the complex exponential). Why, then, do we in fact use the FT?

Motivation for introducing the FT. Consider the sinusoid represented by the red line. In the FT domain it can be represented by two clearly visible spectral peaks. When we add a large amount of Gaussian noise to the sinusoid, it cannot be recognized in the time domain (blue lines), while in the spectral domain the spectrum still achieves its maximum at the frequency corresponding to the considered sinusoid. [Figure: the signal in the time domain and in the frequency domain.]

Motivation for introducing the FT. Roughly speaking, some signals are represented better in the frequency than in the time domain. In the previous example we would detect the spectral maximum and design a cut-off filter that keeps just a narrow region of the frequency domain around the spectral maximum; in this way we significantly reduce the influence of noise on the sinusoid. In addition, filter design is much simpler in the spectral than in the space domain. Similar motivation holds in the case of 2D signals and images, and it is also used for introducing additional transforms in digital image processing, which we will consider later in this course.

Then we will skip properties of the 2D FT of continuous time signals but I propose you to check that in textbook.2D FT of discrete signals Here we will consider discrete signals. The sampling in the case of 1D signals is simple and we can take equidistantly separated samples: x(n)=c xa(n T) constant (commonly c=T) discrete-time signal continuous time signal (T is sampling rate) . Discrete-time signals are obtained from the continuous time counterpart by using sampling procedure.

2D FT of discrete signals
In order to be able to reconstruct the continuous-time signal from its discrete-time counterpart, the sampling theorem should be satisfied. The theorem is satisfied if the sampling rate satisfies T ≤ 1/(2f_m), where f_m is the maximal signal frequency. If the sampling theorem is not satisfied, we make a smaller or bigger mistake. How do we perform sampling in the case of digital images?

Sampling of 2D signals
The simplest sampling of digital images is x(n,m) = c·x_a(nT₁, mT₂). The constant is commonly selected as c = T₁T₂, and the sampling rate is usually equal for both coordinates, T₁ = T₂. The sampling theorem is satisfied when T₁ ≤ 1/(2f_m1) and T₂ ≤ 1/(2f_m2). Here f_m1 and f_m2 are the maximal frequencies along the corresponding coordinates (note that the 2D FT of continuous-time signals has two coordinates ω₁ and ω₂, with f_mi = ω_mi/2π).

Sampling of 2D signals
The previously described rectangular sampling is not the unique sampling scheme. In the case of rectangular sampling we replace each rectangle of the image with a single sample. This is a practical sampling scheme, but some alternative sampling patterns can be applied.

2D signal samplings
Some alternative sampling schemes are given below. The pattern can also be a rhomb, but the hexagon is the best pattern with respect to some well-established criteria. However, we will continue with the usage of rectangular sampling due to its simplicity and for practical reasons!!!

Quantization
The discretized signal is not used directly; it is quantized. Instead of the exact values we take values rounded (or truncated) to the closest value from the set of possible values (quants). The error caused by rounding is smaller than the error caused by truncation, but truncation is used more often in practice.

Quantization
Quantization can be performed with equidistantly separated possible values, but some systems and sensors use non-uniform quantization. The number of quantization levels is commonly 2^k, and these quantization levels are commonly represented as integers in the domain [0, 2^k−1]. We will almost always assume that signals are discretized and quantized (such signals are called digitalized).
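Uniform quantization to 2^k integer levels can be sketched as follows (a NumPy illustration; the input range [0, 1] is an assumption):

```python
import numpy as np

def quantize(x, k, x_min=0.0, x_max=1.0):
    """Uniformly quantize x in [x_min, x_max] to 2**k integer levels 0..2**k - 1."""
    levels = 2 ** k
    q = np.round((x - x_min) / (x_max - x_min) * (levels - 1))   # rounding, not truncation
    return np.clip(q, 0, levels - 1).astype(int)

x = np.array([0.0, 0.2, 0.5, 0.99, 1.0])
q8 = quantize(x, 8)    # 8-bit quantization, as in typical grayscale images
```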

2D FT of discrete signals
The Fourier transform pair between a discrete-time signal and the corresponding FT can be represented using the following relationships:

X(ω₁,ω₂) = ∑_{n=−∞}^{∞} ∑_{m=−∞}^{∞} x(n,m) e^{−jω₁n − jω₂m}

x(n,m) = (1/(2π)²) ∫_{−π}^{π} ∫_{−π}^{π} X(ω₁,ω₂) e^{jω₁n + jω₂m} dω₁ dω₂

The 2D FT of a discrete-time signal is a function of continuous variables and in this form it is not suitable for operation on computing machines. The 2D FT is periodic along both coordinates with period 2π.

2D DFT
We will not explain in detail the properties of the 2D FT of discrete-time signals since we will not use it further. Our goal is to have a discretized transform that is suitable for processing on computing machines. In order to achieve this we use the periodical extension property. Namely, assume that the signal x(n,m) is defined within a limited domain (this is always the case for digital images). Let the size of the signal (image) be N×M.

2D DFT
Perform a periodical extension of the original signal x(n,m) with period N₁×M₁ (it should be satisfied that N₁ ≥ N and M₁ ≥ M, but here for brevity we assume N₁ = N and M₁ = M):

x_p(n,m) = ∑_{r=−∞}^{∞} ∑_{p=−∞}^{∞} x(n + rN, m + pM)

2D DFT
The FT of the periodically extended signal is:

X_p(ω₁,ω₂) = ∑_{n=−∞}^{∞} ∑_{m=−∞}^{∞} ∑_{r=−∞}^{∞} ∑_{p=−∞}^{∞} x(n + rN, m + pM) e^{−jω₁n − jω₂m}
           = ∑_{r=−∞}^{∞} ∑_{p=−∞}^{∞} X(ω₁,ω₂) e^{jrNω₁ + jpMω₂}
           = X(ω₁,ω₂) ∑_{k₁=−∞}^{∞} ∑_{k₂=−∞}^{∞} δ(ω₁N − 2k₁π) δ(ω₂M − 2k₂π)

Here we changed the order of the sums and used the property of the FT of a translated (shifted) signal, possibly neglecting some multiplicative constants. The analog Dirac pulses (generalized functions) are produced by the summation over the infinite number of terms in the sums.

2D DFT
Finally we obtain:

X_p(ω₁,ω₂) = X(2k₁π/N, 2k₂π/M)

Thus, the FT of the periodically extended signal is equal to samples of the 2D FT taken on the discrete grid k₁∈[0,N) and k₂∈[0,M). The periodical extension produces a discretized FT (the DFT). However, we should keep in mind that we assumed that the smallest period for the extension is equal to the dimension of the image, N×M. The periodical extension is commonly not performed in practice due to the infinite number of terms in the sums and the usage of generalized functions.

2D DFT
The 2D discrete signal and the 2D DFT are a transformation pair:

X(k₁,k₂) = ∑_{n=0}^{N−1} ∑_{m=0}^{M−1} x(n,m) e^{−j2πk₁n/N − j2πk₂m/M}

x(n,m) = (1/NM) ∑_{k₁=0}^{N−1} ∑_{k₂=0}^{M−1} X(k₁,k₂) e^{j2πk₁n/N + j2πk₂m/M}

It is the FT of the discrete signal calculated over a limited interval and for discretized frequencies. Important fact: the inverse DFT can be calculated in almost the same way as the direct one, using sums. The differences are very small (the sign of the complex exponential and the normalization constant 1/NM).
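The definition above can be checked directly against a library FFT (a NumPy sketch; the matrix form W_N x W_M is just the double sum written compactly):

```python
import numpy as np

def dft2_direct(x):
    """2D DFT by the definition above: O(N^2 M^2), for illustration only."""
    N, M = x.shape
    WN = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N)
    WM = np.exp(-2j * np.pi * np.outer(np.arange(M), np.arange(M)) / M)
    return WN @ x @ WM.T     # sum over n via WN, sum over m via WM

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 6))
X = dft2_direct(x)
x_back = np.fft.ifft2(X)     # inverse: opposite exponent sign and 1/NM constant
```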

Domains of the 2D DFT and FT of discrete signals
The domain of the 2D FT of the discrete-time signal is the Descartes (Cartesian) product (ω₁,ω₂) ∈ [−π,π) × [−π,π) (“[” means the endpoint belongs to the interval, “)” that it does not). The domain of the 2D DFT is the discrete set of points (k₁,k₂) ∈ [0,N)×[0,M). We have to determine the relationship between the frequencies in these two domains!!! The relationships ω₁ = 2πk₁/N and ω₂ = 2πk₂/M can be satisfied only for 0 ≤ k₁ < N/2 and 0 ≤ k₂ < M/2, while for larger k₁ and k₂ these relationships would produce frequencies outside of the ω₁ and ω₂ domain.

Domains of the 2D DFT and FT of discrete signals
For larger k₁ and k₂ we can use the fact that the 2D FT of discrete-time signals is periodical along both coordinates ω₁ and ω₂ with period 2π, and we can establish the relationships ω₁ = −2π(N−k₁)/N and ω₂ = −2π(M−k₂)/M for N/2 ≤ k₁ < N and M/2 ≤ k₂ < M. [figure: the (k₁,k₂) plane from (0,0) to (N,M); arrows depict how the quadrants of the 2D DFT are shifted in this operation to obtain properly ordered frequencies]
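This bin-to-frequency mapping is exactly what library `fftshift` routines implement; a small NumPy check (illustrative sketch):

```python
import numpy as np

N = 8
k = np.arange(N)
# DFT bin k corresponds to angular frequency 2*pi*k/N, folded into [-pi, pi)
omega = 2 * np.pi * k / N
omega[k >= N // 2] -= 2 * np.pi      # bins N/2..N-1 are the negative frequencies

# fftshift reorders the bins so the frequencies run monotonically from -pi to pi
omega_shifted = np.fft.fftshift(omega)
```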

**2D DFT, convolution and LSIS**

A digital image can be subject to numerous forms of processing. Assume that an image is the input to a linear space invariant system (LSIS): [block diagram: image → LSIS → output]

These systems can serve various purposes, but we will assume their application to image filtering and denoising.

A system is linear when a linear combination of inputs produces the same linear combination of outputs: if x(n,m) is the input and T{x(n,m)} is the transformation of the input produced by the LSIS, then it holds that T{a·x₁(n,m) + b·x₂(n,m)} = a·T{x₁(n,m)} + b·T{x₂(n,m)}.

**2D DFT, convolution and LSIS**

A system is space invariant (an extension of the concept of time invariance) when the transform of a shifted image is equal to the shifted transform of the original image: if y(n,m) = T{x(n,m)}, it should hold that y(n−n₀, m−m₀) = T{x(n−n₀, m−m₀)}. Due to limits in image size, rounding, and the discrete nature of image systems, systems that process digital images are rarely LSIS, but most of them can be approximated with an LSIS. An LSIS has the important property that its output can be given as a convolution of the input signal with the 2D impulse response of the system: y(n,m) = x(n,m) *ₙ *ₘ h(n,m)

2D convolution

**2D DFT, convolution and LSIS**

h(n,m) = T{δ(n,m)}, where:

δ(n,m) = 1 for n = 0 and m = 0, and 0 elsewhere

Assume that the impulse response is finite, of size N₁×M₁. Then the linear convolution (here we consider only this type of convolution; other types will not be considered) can be defined as:

y(n,m) = T{x(n,m)} = x(n,m) *ₙ *ₘ h(n,m) = ∑_{n₁=0}^{N₁−1} ∑_{m₁=0}^{M₁−1} h(n₁,m₁) x(n−n₁, m−m₁)

The output domain is [0,N+N₁)×[0,M+M₁)!!!!

**2D DFT, convolution and LSIS**

We should keep in mind the size of the output in important applications. The FT of a convolution is equal to the product of the FTs. In order to reproduce this result in the case of the 2D DFT we should follow these steps:

Perform zero-padding of the signal up to the domain [0,N+N₁)×[0,M+M₁). The same operation should be performed with the impulse response. Determine the 2D DFTs of the signal and of the impulse response. Multiply the obtained 2D DFTs and calculate the inverse 2D DFT.
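The steps above can be sketched in NumPy (the padded size N+N₁−1 covers the full linear-convolution output; this is an illustration, not the course's MATLAB code):

```python
import numpy as np

def conv2_via_fft(x, h):
    """Linear 2D convolution through zero-padded FFTs, following the steps above."""
    N, M = x.shape
    N1, M1 = h.shape
    P, Q = N + N1 - 1, M + M1 - 1       # zero-padded size covering the full output
    X = np.fft.fft2(x, s=(P, Q))        # fft2 zero-pads to (P, Q) internally
    H = np.fft.fft2(h, s=(P, Q))
    return np.real(np.fft.ifft2(X * H)) # multiply DFTs, inverse DFT

rng = np.random.default_rng(2)
x = rng.standard_normal((16, 16))
h = rng.standard_normal((3, 3))
y = conv2_via_fft(x, h)
```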

2D DFT, convolution and LSIS
Writing these steps in mathematical form:

x'(n,m) = x(n,m) for n∈[0,N) ∧ m∈[0,M);  0 for n∈[N, N+N₁) ∨ m∈[M, M+M₁)

h'(n,m) = h(n,m) for n∈[0,N₁) ∧ m∈[0,M₁);  0 for n∈[N₁, N+N₁) ∨ m∈[M₁, M+M₁)

X'(k₁,k₂) = ∑_{n=0}^{N+N₁−1} ∑_{m=0}^{M+M₁−1} x'(n,m) e^{−j2πk₁n/(N+N₁) − j2πk₂m/(M+M₁)}

H'(k₁,k₂) = ∑_{n=0}^{N+N₁−1} ∑_{m=0}^{M+M₁−1} h'(n,m) e^{−j2πk₁n/(N+N₁) − j2πk₂m/(M+M₁)}

2D DFT, convolution and LSIS
y(n,m) = (1/((N+N₁−1)(M+M₁−1))) ∑_{k₁=0}^{N+N₁−1} ∑_{k₂=0}^{M+M₁−1} X'(k₁,k₂) H'(k₁,k₂) e^{j2πk₁n/(N+N₁) + j2πk₂m/(M+M₁)}

There are cases when the calculation of the convolution is faster using 3 2D DFTs than by direct computation. This is possible when we use fast algorithms for the evaluation of the 2D DFTs.

Need for fast algorithms
In the digital signal processing course we learned the two simplest fast algorithms for 1D DFT evaluation: decimation in time and decimation in frequency. Since digital images have many more samples than 1D signals, these algorithms are even more important than in the 1D case. We can freely claim that modern digital image processing could not have been developed without FFT algorithms (the FFT is the same transform as the DFT; the name only indicates a fast evaluation algorithm!!!).

Need for fast algorithms
X(k₁,k₂) = ∑_{n=0}^{N−1} ∑_{m=0}^{M−1} x(n,m) e^{−j2πk₁n/N − j2πk₂m/M}

Direct evaluation for a single frequency requires N×M complex multiplications (a complex multiplication is 4 real multiplications and 2 real additions) and N×M complex additions (2 real additions each). These operations should be repeated for each (k₁,k₂), i.e., N×M times. The calculation complexity is then of order: N²M² complex additions (2N²M² real) and N²M² complex multiplications (4N²M² real multiplications + 2N²M² real additions). For example, N = M = 1024 on a PC that can perform 1×10⁹ operations per second requires 8N²M² > 8×10¹² real operations, i.e., 8×10³ sec, which is more than 2h.

Step-by-step fast algorithm
The 2D DFT can be written as:

X(k₁,k₂) = ∑_{n=0}^{N−1} [ ∑_{m=0}^{M−1} x(n,m) e^{−j2πk₂m/M} ] e^{−j2πk₁n/N} = ∑_{n=0}^{N−1} X(n,k₂) e^{−j2πk₁n/N}

The inner sum is the 1D DFT of an image row; the outer sum is the FT of the result obtained in the first step. All applied DFTs are 1D and there are N+M of them in total. All 1D DFTs can be realized using fast algorithms.
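The step-by-step (row-column) decomposition can be verified numerically (NumPy sketch):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal((8, 8))

# Step 1: 1D FFT of every row; Step 2: 1D FFT of every column of the result
rows = np.fft.fft(x, axis=1)
X_step = np.fft.fft(rows, axis=0)
```

The result matches a direct 2D FFT exactly, confirming that N+M one-dimensional FFTs suffice.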

Step-by-step algorithm
For simplicity assume that N = M. For each of the FFTs over rows and columns we need N log₂N complex additions and multiplications, and for the entire frequency-frequency plane 2N·N log₂N complex additions and multiplications are required. This is equal to 8N² log₂N real additions and 8N² log₂N real multiplications, i.e., 16N² log₂N operations. For N = M = 1024 the required number of operations is 160×10⁶, which equals 0.16 sec on the considered machine.

Step-by-step algorithm
For 1D signals two FFT algorithms are in use: decimation in time and decimation in frequency. The complexity of both algorithms is similar. The step-by-step algorithm is not optimal for 2D signals, but it is quite popular due to its simplicity. Earlier PCs, and today mobile devices, have problems with the memory demands of the 2D FFT algorithms, since some machines still have only moderate memory amounts. Then we have a problem: in the step-by-step algorithm we need 3 matrices, for the original image, for the FT of columns or rows (a complex matrix written in memory as two real-valued matrices), and for the 2D DFT (again a complex-valued matrix).

Step-by-step algorithm – memory
In fact we then need memory for 5 real-valued matrices. Fortunately, an image is a real-valued signal, x*(n,m) = x(n,m), for which the following rule holds:

X*(k₁,k₂) = ∑_{n=0}^{N−1} ∑_{m=0}^{M−1} x(n,m) e^{+j2πk₁n/N + j2πk₂m/M} = ∑_{n=0}^{N−1} ∑_{m=0}^{M−1} x(n,m) e^{−j2π(N−k₁)n/N − j2π(M−k₂)m/M} = X(N−k₁, M−k₂)

You can find in the textbook how this relationship can be used to save memory space!!!
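The conjugate-symmetry rule for real-valued images is easy to confirm numerically (NumPy sketch; the modulo handles the k = 0 wrap-around):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_normal((8, 8))          # real-valued "image"
X = np.fft.fft2(x)

# Conjugate symmetry of the DFT of a real signal: X*(k1, k2) = X(N - k1, M - k2)
N, M = x.shape
k1, k2 = 3, 5
sym_ok = np.allclose(np.conj(X[k1, k2]), X[(N - k1) % N, (M - k2) % M])
```

This is the same redundancy that `rfft`-style routines exploit: only about half of the complex coefficients need to be stored.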

Advanced 2D FFT algorithms
Advanced FFT algorithms for images perform decimation directly along both coordinates. With W_N = exp(−j2π/N), and assuming N = M:

X(k₁,k₂) = ∑_{n=0}^{N−1} ∑_{m=0}^{N−1} x(n,m) W_N^{nk₁} W_N^{mk₂}

This 2D DFT can be given with 4 subsums over even and odd coefficients:

X(k₁,k₂) = ∑_{n even, m even} x(n,m) W_N^{nk₁} W_N^{mk₂} + ∑_{n even, m odd} x(n,m) W_N^{nk₁} W_N^{mk₂} + ∑_{n odd, m even} x(n,m) W_N^{nk₁} W_N^{mk₂} + ∑_{n odd, m odd} x(n,m) W_N^{nk₁} W_N^{mk₂}

Advanced 2D FFT algorithms
After simple manipulations we obtain all sums within the limits [0,N/2)×[0,N/2):

X(k₁,k₂) = S₀₀(k₁,k₂) + W_N^{k₂} S₀₁(k₁,k₂) + W_N^{k₁} S₁₀(k₁,k₂) + W_N^{k₁+k₂} S₁₁(k₁,k₂)

where

S₀₀(k₁,k₂) = ∑_{m₁=0}^{N/2−1} ∑_{m₂=0}^{N/2−1} x(2m₁, 2m₂) W_N^{2m₁k₁+2m₂k₂}
S₀₁(k₁,k₂) = ∑_{m₁=0}^{N/2−1} ∑_{m₂=0}^{N/2−1} x(2m₁, 2m₂+1) W_N^{2m₁k₁+2m₂k₂}
S₁₀(k₁,k₂) = ∑_{m₁=0}^{N/2−1} ∑_{m₂=0}^{N/2−1} x(2m₁+1, 2m₂) W_N^{2m₁k₁+2m₂k₂}
S₁₁(k₁,k₂) = ∑_{m₁=0}^{N/2−1} ∑_{m₂=0}^{N/2−1} x(2m₁+1, 2m₂+1) W_N^{2m₁k₁+2m₂k₂}

Advanced 2D FFT algorithms
The 2D DFT can now be represented, depending on (k₁,k₂), as (arguments (k₁,k₂) of the subsums omitted for brevity; each line holds for 0 ≤ k₁ ≤ N/2−1, 0 ≤ k₂ ≤ N/2−1):

X(k₁, k₂)      = S₀₀ + W_N^{k₂} S₀₁ + W_N^{k₁} S₁₀ + W_N^{k₁+k₂} S₁₁
X(k₁+N/2, k₂)  = S₀₀ + W_N^{k₂} S₀₁ − W_N^{k₁} S₁₀ − W_N^{k₁+k₂} S₁₁
X(k₁, k₂+N/2)  = S₀₀ − W_N^{k₂} S₀₁ + W_N^{k₁} S₁₀ − W_N^{k₁+k₂} S₁₁

Advanced 2D FFT algorithms
Finally:

X(k₁+N/2, k₂+N/2) = S₀₀ − W_N^{k₂} S₀₁ − W_N^{k₁} S₁₀ + W_N^{k₁+k₂} S₁₁,  for 0 ≤ k₁ ≤ N/2−1, 0 ≤ k₂ ≤ N/2−1

The decomposition can be presented using the following lattice: [butterfly diagram: inputs S₀₀(k₁,k₂), S₀₁(k₁,k₂), S₁₀(k₁,k₂), S₁₁(k₁,k₂), twiddle factors 1, W_N^{k₂}, W_N^{k₁}, W_N^{k₁+k₂}, and ±1 branches producing X(k₁,k₂), X(k₁,k₂+N/2), X(k₁+N/2,k₂), X(k₁+N/2,k₂+N/2)]
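One stage of this decimation can be written out and checked numerically; here library FFTs stand in for the recursively computed N/2×N/2 sub-DFTs (a NumPy sketch, not an optimized implementation):

```python
import numpy as np

def dit_2x2_step(x):
    """One 2D decimation stage: combine the four N/2 x N/2 sub-DFTs as derived above."""
    N = x.shape[0]                     # assume a square image with N even
    k = np.arange(N // 2)
    W1 = np.exp(-2j * np.pi * k / N).reshape(-1, 1)   # W_N^{k1}, column vector
    W2 = np.exp(-2j * np.pi * k / N).reshape(1, -1)   # W_N^{k2}, row vector
    S00 = np.fft.fft2(x[0::2, 0::2])   # even rows, even columns
    S01 = np.fft.fft2(x[0::2, 1::2])   # even rows, odd columns
    S10 = np.fft.fft2(x[1::2, 0::2])   # odd rows, even columns
    S11 = np.fft.fft2(x[1::2, 1::2])   # odd rows, odd columns
    X = np.empty((N, N), dtype=complex)
    X[:N//2, :N//2] = S00 + W2*S01 + W1*S10 + W1*W2*S11
    X[:N//2, N//2:] = S00 - W2*S01 + W1*S10 - W1*W2*S11
    X[N//2:, :N//2] = S00 + W2*S01 - W1*S10 - W1*W2*S11
    X[N//2:, N//2:] = S00 - W2*S01 - W1*S10 + W1*W2*S11
    return X

rng = np.random.default_rng(5)
x = rng.standard_normal((8, 8))
X = dit_2x2_step(x)
```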

[Diagram: full decomposition for an image with 4×4 pixels — the pixels x(0,0)…x(3,3) pass through butterfly stages with twiddle factors ±1 and ±j to produce the DFT coefficients X(0,0)…X(3,3). The decomposition can be performed in the next stage on each S_ij(k₁,k₂) block.]

Advanced 2D FFT algorithms
The number of complex multiplications for this type of FFT algorithm is 3N² log₂N/4, while the number of complex additions is 2N² log₂N. Then there are 3N² log₂N real multiplications and 5.5N² log₂N real additions. The total number of operations is about half of the number of operations in the step-by-step algorithm. There are myriad FFT algorithm types!!!

Features of the DFT of a real image
The 2D DFT can be calculated in MATLAB using F = fftshift(fft2(I)), where I is the image, fft2 calculates the 2D DFT using an FFT algorithm, and fftshift shifts the FFT coefficients into the natural frequency order (see the earlier slide on DFT domains). On the next slide the image Lena is shown together with the logarithm of its 2D DFT (white positions represent larger values, dark ones smaller).

Features of the 2D DFT
Test image Lena. The 2D FFT values around the origin correspond to (ω_x, ω_y) = (0,0); they are white and are up to 10¹⁰ times larger than the dark positions.

Features of the 2D DFT
In the considered image Lena of dimension 256×256 pixels, fewer than 10 samples of the 2D DFT hold more than 99% of the energy. Can we memorize the image with just 10 2D DFT samples (compared to 256×256 pixels)? The answer is NO, NO and NO!!! Namely, the main part of the energy is related to the image luminance, while the details of the image, which correspond to features very important for human vision, are on higher frequencies. High-frequency components have small energy and they are subject to noise influence. This is a very important feature of the human eye: components of small energy on higher frequencies contain the main part of the information that humans receive.
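The energy-concentration effect can be reproduced with a synthetic smooth image (a NumPy sketch; a separable Hanning surface is an assumed stand-in for a real photograph such as Lena):

```python
import numpy as np

# A smooth, "image-like" 256x256 signal: energy piles up in a few DFT coefficients
img = np.outer(np.hanning(256), np.hanning(256)) * 255.0

X = np.fft.fft2(img)
energy = np.abs(X) ** 2
total = energy.sum()

# Fraction of the total energy captured by the 10 largest-magnitude DFT coefficients
top10 = np.sort(energy.ravel())[-10:].sum()
fraction = top10 / total
```

For such a slowly varying image the fraction exceeds 0.99, yet discarding the remaining coefficients would destroy every fine detail.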

For self-exercise
Determine the properties of the 2D DFT of real-valued signals. Prove the properties of the 2D FT of continuous-time signals given in the textbook. Do these properties hold for the 2D FT of discrete signals and the 2D DFT? Assume that a 2D signal is discretized in some arbitrary manner (diamond or hexagonal instead of rectangular sampling). Reconstruct the original signal based on these samples. Consider convolution of 2D signals. Can evaluation of the convolution be more efficient using the 2D DFT? Within these slides we demonstrated one algorithm for the 2D FFT by decimation of the signal into 4 subsignals. Is this decimation in time or decimation in frequency?

For self-exercise
Realize the 2D FFT using the alternative decimation algorithm. Is it possible to combine decimations? For example, decimation in frequency along rows and decimation in time along columns. If it is possible, perform this decimation for a 4×4 image and present the full decomposition. If it is not possible, explain the reasons. Interpolate an image using zero-padding of the 2D DFT of the original image!!! Solve the problems given in the textbook at the end of the corresponding chapter.

Digital Image Processing
Radon transform, DCT, unitary and orthogonal transforms

Radon transform
The Radon transform was developed in the early XX century. The aim was to reconstruct the interior of objects by using projections made along different angles. The first application of this transform was recording the Sun's interior based on recordings from the Earth (the Sun was the source of light, i.e., of the projections in this experiment). In practice, signals that can penetrate through the object (X-rays or some other signals) are used for this transform, and we record the attenuation of these signals on their way through the object. The entire scientific area called computer tomography is based on this transform.

Radon transform
Medicine is the main consumer of the Radon transform, but it is also used in other fields such as astronomy. Recently it has been used intensively in geology: namely, the earth surface is searched for oil and other mineral goods, and sound signals are used for these recordings. [figure: propagation of a sound wave below the Earth surface]

Radon transform

Assume that we have a signal (wave, ray) that can penetrate through objects. This signal attenuates locally in a point (x,y) with some attenuation function f(x,y). This function can tell us some important information about the material through which the ray penetrates (ultrasound is able to propagate through liquid materials, X-rays attenuate significantly on bones, etc.). Therefore, our goal is visualization of the attenuation function f(x,y) for each object point. However, we know only the total attenuation of the beam passing through the object (accumulated along the considered path), and based on this information we want to reconstruct f(x,y).

Radon transform

Consider the attenuation function along direction s, where A and B are the points at which the beam enters and exits the object:

∫_{AB} f(x,y) ds

Under relatively mild assumptions the beam propagates linearly in the object, and s can be parameterized as:

x cosθ + y sinθ = t

θ is the angle with respect to the considered coordinate system, and t is the parameter determining the line among all possible parallel lines with the same θ. All lines can be parameterized in this manner.

Attenuation function

Now we can write the attenuation as a function determined by the angle θ and the parameter t:

P_θ(t) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x,y) δ(x cosθ + y sinθ − t) dx dy

Now we can see that the problem is reduced to the determination of f(x,y) based on P_θ(t), where θ and t are determined by the beams we send toward the object along different angles and for various values of t.

This relationship holds since (x,y) along s satisfies the relationship from the previous slide.

Usage of the 2D FT

[figure: beams sent toward the object at angle θ₁ for various t, and at angle θ₂ for various t]

Our goal is to reconstruct f(x,y) based on the known P_θ(t) for various angles θ and for various t. Consider the 2D FT of f(x,y):

F(u,v) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x,y) e^{−j(ux+vy)} dx dy

Relationship between P_θ(t) and f(x,y) in the Fourier domain
The FT of the signal P_θ(t) is:

S_θ(ω) = ∫_{−∞}^{∞} P_θ(t) e^{−jωt} dt

Introduce P_θ(t) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x,y) δ(x cosθ + y sinθ − t) dx dy. Now we get:

S_θ(ω) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x,y) δ(x cosθ + y sinθ − t) e^{−jωt} dx dy dt

Relationship between P_θ(t) and f(x,y) in the Fourier domain
Calculating the integral over t we obtain:

S_θ(ω) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x,y) e^{−jω(x cosθ + y sinθ)} dx dy = F(ω cosθ, ω sinθ)

Now we can claim that there is a relationship between the FT of the projections and the FT of f(x,y). Our problem can be solved in 4 steps: 1. Calculate the projections. 2. Determine the FT of the projections. 3. Determine the 2D FT of f(x,y). 4. Evaluate f(x,y) using the 2D IFT.
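The relationship S_θ(ω) = F(ω cosθ, ω sinθ) has an exact discrete analogue for θ = 0, which is easy to check numerically: the 1D DFT of the projection (the sum over y) equals the k₂ = 0 slice of the 2D DFT (a NumPy sketch with a random array standing in for f(x,y)):

```python
import numpy as np

rng = np.random.default_rng(6)
f = rng.standard_normal((32, 32))      # stand-in for the attenuation map f(x, y)

P0 = f.sum(axis=1)                     # projection for theta = 0 (integrate over y)
S0 = np.fft.fft(P0)                    # FT of the projection
slice0 = np.fft.fft2(f)[:, 0]          # corresponding slice of the 2D FT of f
```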

Problems in the Radon transform
Commonly, by the Radon transform we assume the determination of the projections P_θ(t), which are usually visualized as a 2D function of t and θ. However, there are plenty of problems in the realization of the Radon (and inverse Radon) transform. The first problem is caused by the fact that we obtain samples of the FT parameterized as (ω cosθ, ω sinθ), i.e., in polar coordinates. Obviously, we should perform interpolation between the polar and rectangular rasters in some of the Radon transform evaluation steps.

Problems in the Radon transform
The second important issue is related to the recording. Namely, the recording apparatus is moving and we have no direct relationship with the angles θ, but rather with the positions and velocities of the apparatus for scanning and recording. Then, based on position and velocity, our system should reconstruct the angle of the rays passing through the object. Sometimes we already have images of the object interior and we want to determine the Radon transform of these images in order to extract some important features presented in the image. For example, these important features could be the existence of straight or sinusoidal lines in the image.

Radon transform and projections
How do we calculate projections if we already have a recorded image? The projection of an image along a considered angle is the sum of the luminance of the pixels in the considered direction. Since the image is discrete, the “rays” do not pass exactly through the pixels.

Radon transform and projections
The problem of the discrete nature of images can be solved in numerous manners. Instead of calculating projections for angle θ, rotate the image by angle −θ; the rotation should be performed with appropriate interpolation. The projection is then calculated by summing pixels horizontally. In this manner we calculate P_θ(t). At the same time we can, without rotation, sum the pixels vertically and in this manner calculate P_{π/2−θ}(t). This procedure should be performed for all angles. Note that this is not the most efficient technique for calculating the Radon transform, but it is simple to understand, which is why it is quite common in practice.
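The rotate-and-sum idea can be sketched in pure NumPy (nearest-neighbor rotation is used here for simplicity instead of proper interpolation; the test image is an assumed synthetic bar):

```python
import numpy as np

def rotate_nn(img, theta):
    """Rotate an image by angle theta (radians) about its center, nearest neighbor."""
    N, M = img.shape
    y, x = np.mgrid[0:N, 0:M]
    cy, cx = (N - 1) / 2.0, (M - 1) / 2.0
    # inverse mapping: sample the source at the back-rotated coordinates
    xs = np.cos(theta) * (x - cx) + np.sin(theta) * (y - cy) + cx
    ys = -np.sin(theta) * (x - cx) + np.cos(theta) * (y - cy) + cy
    xi = np.rint(xs).astype(int)
    yi = np.rint(ys).astype(int)
    valid = (xi >= 0) & (xi < M) & (yi >= 0) & (yi < N)
    out = np.zeros_like(img)
    out[valid] = img[yi[valid], xi[valid]]
    return out

def radon_projection(img, theta):
    """P_theta(t): rotate the image by -theta, then sum the pixels vertically."""
    return rotate_nn(img, -theta).sum(axis=0)

img = np.zeros((64, 64))
img[20:44, 30:34] = 1.0            # a small bright bar
p0 = radon_projection(img, 0.0)    # projection at theta = 0: plain column sums
```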

Radon transform – Example
Assume that we have a white line y = ax + b on a black background. Our “image” can be written as f(x,y) = δ(y − ax − b). Calculate the Radon transform:

P_θ(t) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x,y) δ(x cosθ + y sinθ − t) dx dy = ∫_{−∞}^{∞} ∫_{−∞}^{∞} δ(y − ax − b) δ(x cosθ + y sinθ − t) dx dy

Radon transform – Example
After some calculations we obtain:

P_θ(t) = ∫_{−∞}^{∞} δ(x cosθ + (ax + b) sinθ − t) dx = δ(t − b sinθ) / (cosθ + a sinθ)

Consider this expression. At first glance we can note that it is equal to infinity for cosθ + a sinθ = 0!!! It means that we have infinity for θ = −arcctg(a) (a is the tangent of the angle between our line and the x-axis). For this angle we can determine the second parameter of the line, t = b sinθ.

Radon transform in MATLAB

clear
N=100;
Z=zeros(N,N);
Z(25,25:75)=1;
Z(75,25:75)=1;
Z(25:75,25)=1;
Z(25:75,75)=1;
[R,t]=radon(Z,0:89); %%% this function can be called with additional parameters
pcolor(0:89,t,R), shading interp
I=iradon(R,0:89);

This is a rather simple example! The image is a square of dimension 50×50 (check it with imshow(Z)). The second argument of the radon function is the set of angles for which this transform is evaluated. The image shown with the pcolor function has 4 peaks corresponding to the 4 edges of the square. The inverse transform does not reconstruct the image in an ideal manner. Why?

Conclusions
The Fourier transform, with its variants (DFT, FT of discrete-time signals), should be quite a clear concept for an engineer. The FT decomposes the signal into a series expansion with sinusoidal functions. Coefficients on low frequencies correspond to sinusoids with larger periods (slowly varying components), while sinusoids on higher frequencies correspond to highly varying signal components. For “flat” (slowly varying) images the FT is concentrated around the origin, while for fast-varying (textured) images the FT has components on high frequencies.

DFT – Conclusion
The DFT has two serious problems in digital image processing. First, the DFT of a real image gives a complex signal. This means that the memory requirements are double the memory requirements for real-valued signals. There are techniques to reduce the complexity by using some properties of the DFT of real signals; however, they do not solve the problem of handling the data. The second problem is more important. Namely, when an image is corrupted by noise we want to remove those samples of the DFT that are significantly corrupted by noise. Removing these components in the DFT domain produces undesired oscillatory effects in the filtered image (caused by the so-called Gibbs phenomenon). Sometimes it is better to keep the noisy image than the filtered image, since these artifacts can be very annoying. This effect can be reduced by smoothing (not truncating) the DFT coefficients, but that introduces other drawbacks. This is the reason why some alternative transforms were developed for image filtering and image compression.

**Discrete Cosine Transform (DCT)**

The most important tool for overcoming the drawbacks of the 2D DFT is the 2D DCT. There are several definitions of this transform in practice, and here we will use the following (for the 1D DCT):

C(0) = √(1/N) ∑_{n=0}^{N−1} x(n)

C(k) = √(2/N) ∑_{n=0}^{N−1} x(n) cos((2n+1)kπ/2N),  for k = 1, …, N−1

Inverse DCT

The inverse DCT is defined as:

x(n) = (1/√N) C(0) + √(2/N) ∑_{k=1}^{N−1} C(k) cos((2n+1)kπ/2N)

For self-exercise, try to prove that the DCT and the inverse DCT are mutually inverse.
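A direct NumPy transcription of this DCT pair (assuming the orthonormal scaling written above) lets you verify the mutual-inverse property numerically:

```python
import numpy as np

def dct1(x):
    """1D DCT as defined above (orthonormal scaling assumed)."""
    N = len(x)
    n = np.arange(N)
    C = np.empty(N)
    C[0] = np.sqrt(1.0 / N) * x.sum()
    for k in range(1, N):
        C[k] = np.sqrt(2.0 / N) * np.sum(x * np.cos((2 * n + 1) * k * np.pi / (2 * N)))
    return C

def idct1(C):
    """Inverse DCT as defined above."""
    N = len(C)
    k = np.arange(1, N)
    x = np.empty(N)
    for n in range(N):
        x[n] = C[0] / np.sqrt(N) + np.sqrt(2.0 / N) * np.sum(
            C[1:] * np.cos((2 * n + 1) * k * np.pi / (2 * N)))
    return x

x = np.array([1.0, 2.0, 4.0, 3.0])
x_rec = idct1(dct1(x))    # should recover x exactly (up to round-off)
```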

Fast DCT

Since the DCT is very useful in digital image processing, it is important to have fast algorithms for its evaluation. There are several approaches to solving this problem. One is to use the property:

cos((2n+1)kπ/2N) = [exp(j(2n+1)kπ/2N) + exp(−j(2n+1)kπ/2N)] / 2

and, using several simple relationships, to reduce the 1D DCT evaluation to a fast evaluation of the 1D DFT. Try this for homework!!!

Fast DCT

The second technique for fast DCT evaluation is based on a specific methodology for signal extension; check this methodology, which is described in the textbook. Again, using this methodology the fast DCT can be reduced to a fast DFT evaluation. Finally, it is possible to decompose the DCT into two DCTs with N/2 samples each. Do it for homework.

In MATLAB, the dct function is used for the 1D DCT evaluation, while idct is used for its inverse.

2D DCT
There is no unique form of the 2D DCT; several definitions of the 2D DCT can be used in practice, but here we adopt:

C(k₁,k₂) = ∑_{n₁=0}^{N₁−1} ∑_{n₂=0}^{N₂−1} 4 x(n₁,n₂) cos((2n₁+1)k₁π/2N₁) cos((2n₂+1)k₂π/2N₂)

The simplest realization technique is calculation of the 1D DCT along the columns and after that along the rows of the newly obtained matrix. However, there are alternative techniques for direct evaluation of the 2D DCT.
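The column-then-row (separable) realization is conveniently expressed with a 1D DCT matrix applied on both sides (a NumPy sketch; the orthonormal DCT-II scaling is an assumption, chosen so the inverse is simply the transpose):

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal 1D DCT-II matrix (assumed normalization)."""
    n = np.arange(N)
    T = np.sqrt(2.0 / N) * np.cos((2 * n[None, :] + 1) * n[:, None] * np.pi / (2 * N))
    T[0, :] = np.sqrt(1.0 / N)
    return T

rng = np.random.default_rng(7)
x = rng.standard_normal((8, 8))
T = dct_matrix(8)
C = T @ x @ T.T          # 1D DCT along columns, then along rows
x_rec = T.T @ C @ T      # inverse: T is orthonormal, so T^{-1} = T^T
```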

Inverse 2D DCT
The inverse 2D DCT (for our form of the 2D DCT) is:

x(n₁,n₂) = (1/(N₁N₂)) ∑_{k₁=0}^{N₁−1} ∑_{k₂=0}^{N₂−1} w₁(k₁) w₂(k₂) C(k₁,k₂) cos((2n₁+1)k₁π/2N₁) cos((2n₂+1)k₂π/2N₂)

where wᵢ(kᵢ) = 1/2 for kᵢ = 0 and wᵢ(kᵢ) = 1 for 1 ≤ kᵢ ≤ N−1.

For homework, prove that the 2D DCT and its inverse defined on this slide form a transformation pair, i.e., that they are mutually inverse. If this is not satisfied, propose a modification of one of them, or of both!!!

Fast 2D DCT
The same three methodologies used for the 1D DCT realization can be applied here for the fast realization of the 2D DCT. However, an additional problem is related to the problem dimensions, since we should decide between a “step-by-step” realization and a direct 2D evaluation. “Step-by-step” reduces the problem to the 1D DCT; (with help of the textbook) try to apply these variants to the realization of the 2D DCT. I share your excitement about this task!

2D DCT – MATLAB Example
The function for the 2D DCT realization in MATLAB is dct2, while its inverse is idct2. In the case of the 2D DCT (as well as of the 1D DCT), coefficient shifting is not required. The logarithm of the DCT coefficients for the Baboon image is given below: the “low-frequency coefficients” have very high values and correspond to the image luminance, while the “high-frequency coefficients” have extremely small values and correspond to the image details.

Unitary and orthogonal transforms
The DFT and the DCT are obviously quite similar. They have similar properties, and even their structure is quite similar. Namely, both groups of transforms can be written as (for 1D signals):

X(k) = ∑_{n=0}^{N−1} x(n) w(n,k)

They can also be written in matrix form.

Transforms using matrices
In matrix form:

[X(0), X(1), …, X(N−1)]ᵀ = W [x(0), x(1), …, x(N−1)]ᵀ,  i.e.  X = W x

where W is the matrix with elements w(n,k). The inverse transform can be written as x = W⁻¹X.

Unitary and orthogonal transforms
Two important classes of transforms are:
A unitary transform has a unitary transform matrix, W⁻¹ = Wᴴ (Wᴴ is the Hermitian matrix, i.e., transpose and conjugate, Wᴴ = (Wᵀ)*).
An orthogonal transform has an orthogonal transformation matrix, W⁻¹ = Wᵀ.
Obviously, all orthogonal transforms with real-valued W are at the same time unitary. At first glance the DFT does not belong to either of these two important classes. Then the DFT is sometimes defined as:

X(k) = (1/√N) ∑_{n=0}^{N−1} x(n) W_N^{nk}

The multiplicative constant has no impact on the transform properties.

DFT as a unitary transform
The inverse transform is now defined as:

x(n) = (1/√N) ∑_{k=0}^{N−1} X(k) W_N^{−nk}

It is easy to prove that this form of the DFT is a unitary transform. In order to avoid complications, we consider as orthogonal and unitary all transforms that can be reduced to such transforms by simply introducing multiplicative constants. The basic reason for using these two groups of transforms is the fact that inverse matrix calculation is a very demanding operation, and it is avoided for these transforms.
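The unitarity of this normalized DFT matrix is easy to verify numerically (NumPy sketch):

```python
import numpy as np

N = 8
n = np.arange(N)
W = np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)   # normalized DFT matrix

# Unitarity: W^{-1} = W^H, so W @ W^H must be the identity
WH = W.conj().T
I = W @ WH

x = np.arange(N, dtype=float)
X = W @ x            # matches the library FFT up to the 1/sqrt(N) constant
```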

Basis signals
Now we want to introduce the concept of basis signals. A signal (under some conditions, for example a signal of finite energy) can be represented by an expansion over some elementary functions g(n,k), with weighting coefficients equal to the transformation coefficients. Consider the inverse transform:

x(n) = ∑_{k=0}^{N−1} X(k) g(n,k)

Obviously this can be written in matrix form, and between the matrix G (whose rows are g(n,k)) and the matrix W there exists a simple relationship. Note that by using the formalism of the transform definition we very often forget the reasons for its introduction.

Basis signals
Since the considered transforms are linear, a sum of transforms is equal to the transform of the sum of signals. Write the transform as:

X(k) = ∑_{k₁=0}^{N−1} X_{k₁} δ(k − k₁)

This practically means that the transform can be considered as a sum of N transforms that are equal to X(k₁) = X_{k₁} for k = k₁ and 0 elsewhere. Analyze the combination of the last two relationships.

Basis signals
x(n) = ∑_{k=0}^{N−1} ∑_{k₁=0}^{N−1} X_{k₁} δ(k − k₁) g(n,k) = ∑_{k₁=0}^{N−1} X_{k₁} ∑_{k=0}^{N−1} δ(k − k₁) g(n,k) = ∑_{k₁=0}^{N−1} X_{k₁} g(n,k₁)

Thus, any signal can be represented as a weighted sum of the rows of the inverse transform matrix (which is commonly in a simple relation with the transform matrix rows or columns). The signals g(n,k₁) are called the basis, and their analysis can give us a lot of information about the nature of a transform.

Basis signals – Example

Consider the DFT, where g(n,k) = exp(j2πnk/N). Take N = 8 and try to visualize the real part of the transform. To further simplify visualization, we use the continuous time variable t instead of the discrete instants n. The obtained basis signals are:

1, cos(πt/4), cos(πt/2), cos(3πt/4), cos(πt), cos(5πt/4), cos(3πt/2), cos(7πt/4)

[Plots of the eight basis signals, from cos(7πt/4) down to 1.]

Basis signals describe the nature of a transform. Namely, the weighted sum of the basis functions produces the signal. When the weight of the low-frequency components (small k) is larger, the signal is more low-frequency; in the opposite case it contains more high frequencies.

Generalized transforms for 2D signals

Before we proceed to some important transforms (in addition to the DFT and DCT), we consider the generalization of transforms to 2D signals. Let the 2D signal x(n,m) have dimensions N×M. The generalized transform can be written as:

X(k1,k2) = Σ_{n=0}^{N-1} Σ_{m=0}^{M-1} x(n,m) J(n,m,k1,k2)

where J(n,m,k1,k2) is a 4D transformation matrix.

Common form of the transformation matrix

Fortunately, the 4D transform matrix is not in common usage; instead we perform the transform in a step-by-step manner, along columns (or rows) and after that along rows (or columns). The rationale is the same as in the step-by-step algorithm for 2D DFT calculation. The 2D transform can then be written as:

X(k1,k2) = Σ_{n=0}^{N-1} [ Σ_{m=0}^{M-1} x(n,m) H_c(m,k2) ] H_r(n,k1)

2D transforms

In matrix form (the 4D transform matrix can be written as a product of two 2D transform matrices), a separable transform can be written as:

X = H_c x H_r

The inverse transform can be written as (recall basic matrix algebra):

x = H_c⁻¹ X H_r⁻¹

Commonly, the transforms applied to rows and columns are the same, H_c = H_r = T, and it follows:

X = T x T,  x = T⁻¹ X T⁻¹

For T a unitary or orthogonal matrix, the inverse 2D transform can be performed as:

x = Tᴴ X Tᴴ  or  x = Tᵀ X Tᵀ
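A short NumPy sketch of the separable case (illustrative; T is taken here as the unitary DFT matrix, which is one possible choice) showing the forward transform X = T x T and the inverse via the Hermitian transpose:

```python
import numpy as np

N = 4
n, k = np.meshgrid(np.arange(N), np.arange(N))
T = np.exp(-2j * np.pi * n * k / N) / np.sqrt(N)  # unitary DFT matrix, Hc = Hr = T

x = np.random.rand(N, N)             # 2D signal (image)
X = T @ x @ T                        # forward: X = T x T (columns, then rows)
x_rec = T.conj().T @ X @ T.conj().T  # inverse via Hermitian transpose: x = T^H X T^H
print(np.allclose(x_rec, x))  # True
```

No matrix inversion is needed for the unitary T, which is exactly the point of using such transforms.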

Basis images

In analogy with basis signals, we can introduce basis images. For an image of dimensions N×M, N×M basis images can be defined, equal to the inverse transforms of the signals δ(i−p, j−q). If P = T⁻¹, we have a separable transform whose basis image (p,q) is:

f^{(p,q)}(n,m) = P(n,p) P(m,q)

These N×M basis images are obtained as p and q vary over the range (p,q) ∈ [0,N)×[0,M). Then any image can be represented as a sum of the basis images:

x(n,m) = Σ_{p=0}^{N-1} Σ_{q=0}^{M-1} X_{p,q} f^{(p,q)}(n,m)
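The basis-image expansion can be illustrated numerically (a NumPy sketch, under the assumption that T is the unitary DFT matrix, so P = T⁻¹):

```python
import numpy as np

N = 4
n, k = np.meshgrid(np.arange(N), np.arange(N))
T = np.exp(-2j * np.pi * n * k / N) / np.sqrt(N)  # unitary transform matrix
P = np.linalg.inv(T)                              # P = T^-1

x = np.random.rand(N, N)
X = T @ x @ T  # transform coefficients X_{p,q}

# Sum the basis images f^{(p,q)}(n,m) = P(n,p) P(m,q), weighted by X_{p,q}
x_rec = np.zeros((N, N), dtype=complex)
for p in range(N):
    for q in range(N):
        basis_image = np.outer(P[:, p], P[:, q])  # f^{(p,q)}
        x_rec += X[p, q] * basis_image
print(np.allclose(x_rec, x))  # True
```

The double loop is just the expansion written literally; in practice one computes x = P X P directly.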

Sinusoidal transforms

A signal can be given in the form of an expansion over functions of various types, but sinusoidal (or cosinusoidal) functions are the most common. They are a quite natural concept. In mechanics, this function type appears in the case of oscillations. In electrical engineering, they correspond to the electrical current produced in generators, which gives the current in our power lines its shape. Numerous phenomena in communication systems are also associated with sinusoidal functions. In addition, sinusoidal functions offer an elegant mathematical apparatus useful in the analysis of numerous practical phenomena.

Sinusoidal transforms

The DFT has transformation matrix coefficients:

w(n,k) = exp(−j2πnk/N)

(this can be multiplied by a normalization constant such as 1/√N). In addition, we introduced the DCT with coefficients:

c(n,k) = α(k) cos[π(2n+1)k / (2N)],  where α(k) = √(1/N) for k = 0 and √(2/N) for k ≠ 0

There are other sinusoidal transforms, such as the DST (discrete sine transform), with transformation matrix coefficients:

s(n,k) = √(2/(N+1)) sin[π(n+1)(k+1)/(N+1)]
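A quick numerical check of these definitions (NumPy sketch; the matrices are built directly from the formulas above, and orthogonality confirms that the inverse is simply the transpose):

```python
import numpy as np

N = 8
n, k = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")

# DCT matrix: c(n,k) = alpha(k) * cos(pi*(2n+1)*k/(2N))
alpha = np.where(k == 0, np.sqrt(1.0 / N), np.sqrt(2.0 / N))
C = alpha * np.cos(np.pi * (2 * n + 1) * k / (2 * N))

# DST matrix: s(n,k) = sqrt(2/(N+1)) * sin(pi*(n+1)*(k+1)/(N+1))
S = np.sqrt(2.0 / (N + 1)) * np.sin(np.pi * (n + 1) * (k + 1) / (N + 1))

# Both matrices are orthogonal, so their inverses are just their transposes
print(np.allclose(C.T @ C, np.eye(N)))  # True
print(np.allclose(S.T @ S, np.eye(N)))  # True
```

This is the self-exercise "prove that the DCT and the inverse DCT are mutually inverse" checked numerically rather than analytically.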

Hartley transform

The Hartley transform (DHT) is relatively common in practice. Its coefficients are defined as:

h(n,k) = cas(2πnk/N) = cos(2πnk/N) + sin(2πnk/N)

The Hartley transform is similar to the Fourier transform, but for real-valued signals it produces a real-valued transform, which is useful in some applications.
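A small NumPy sketch (illustrative, not from the slides) of the DHT definition, its relation to the DFT, and the known property that the cas-kernel matrix satisfies H·H = N·I, so the DHT is, up to a factor 1/N, its own inverse:

```python
import numpy as np

N = 16
n, k = np.meshgrid(np.arange(N), np.arange(N))
# cas(t) = cos(t) + sin(t)
H = np.cos(2 * np.pi * n * k / N) + np.sin(2 * np.pi * n * k / N)

x = np.random.rand(N)
Xh = H @ x  # Hartley transform: real-valued for real input

# Relation to the DFT: DHT(x) = Re(FFT(x)) - Im(FFT(x))
Xf = np.fft.fft(x)
print(np.allclose(Xh, Xf.real - Xf.imag))  # True

# H H = N I, so the inverse DHT is H/N
print(np.allclose(H @ H, N * np.eye(N)))  # True
```

The Re−Im relation follows directly from e^{−jt} = cos t − j sin t, which is why the DHT of a real signal is real.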

Sinusoidal transforms

We have concluded with the most common sinusoidal transforms used in practice. For exercise, determine the relationships between these transforms in the 1D and 2D cases. In addition, try to calculate the basis images for these transforms. Within the next lecture we will learn that there are alternatives to the sinusoidal transforms in the case of digital images.

For self-exercise

Here we just repeat the tasks for self-exercise mentioned within the lecture. Apply the Radon transform in MATLAB to several simple images and try to perform the reconstruction using the iradon command. What are your conclusions? For what image is the Radon transform concentrated in a single pixel? Is there a shape more complicated than a straight line that can achieve this kind of functional relationship? Note that the apparatus for recording the Radon transform moves with uniform angular velocity φ around the object being recorded.

For self-exercise

Prove that the DCT and the inverse DCT are mutually inverse. Determine the inverse transforms for all introduced sinusoidal transforms. Realize a fast DCT in three ways: using the specific periodical extension described in the textbook; by writing the cosine as two exponential functions; and by the direct procedure. Compare these solutions. Realize a fast 2D DCT using these three algorithms. Consider the 2D DFT and 2D DCT of some real-valued image, mask (filter out) part of the coefficients, calculate the inverse transform, and compare it with the original image. Present your conclusions.

For self-exercise

Determine the relationships between the 4 introduced sinusoidal transforms and realize fast algorithms for their evaluation. It is especially important (and hard) to realize, for example, the DHT with N samples by decimation into two DHTs with N/2 samples. For introduced sinusoidal transforms, determine the inverse transforms. For a given N, determine the basis signals and basis images of some introduced sinusoidal transforms by using MATLAB. Also, you can check the next slides.

Basis signals and basis images

Show basis signals for the DHT with N=32.

clear
N=32;
[n,k]=meshgrid(0:N-1,0:N-1);
H=cos(2*pi*n.*k/N)+sin(2*pi*n.*k/N);
G=inv(H);
for k=1:N
    plot(G(:,k))
    pause
end

Basis signals and images

Form basis images for the DHT with N=8.

clear
N=8;
[n,k]=meshgrid(0:N-1,0:N-1);
H=cos(2*pi*n.*k/N)+sin(2*pi*n.*k/N);
G=inv(H);
for m=1:8
    for n=1:8
        baz=G(m,:)'*G(n,:);
        pcolor(baz)
        title(['basis image for m=',num2str(m),' n=',num2str(n)])
        pause
    end
end
