Computer Graphics and Multimedia
CONTENT
Unit 1
Raster Scan 
Pixel 
Frame buffer
Vector and character generation 
Polygon scan conversion 
Line drawing algorithms 
Display Devices 
Boundary Fill Algorithm 
Flood Fill Algorithm 
Unit 2
Transformations 
Homogeneous coordinates 
Line Clipping 
The Cohen-Sutherland Line-Clipping Algorithm 
Clipping Polygons 
Weiler-Atherton Algorithm 
Unit 3
3D Transformations 
Parallel and Perspective Projection 
Z-buffering 
Bézier curve 
B-splines 
Painter's algorithm 
Unit 4
Diffuse Reflection 
Specular reflection 
Ray tracing 
Gouraud Shading
Phong Shading
Color Models in Computer Graphics 
Unit 5
Introduction 
Multi Media Hardware 
Audio in Multi Media 
Data Compression 
Lossless versus lossy Compression
Video

Unit I
Raster Scan
A raster scan, or raster scanning, is the rectangular pattern of image capture and
reconstruction in television. By analogy, the term is used for raster graphics, the pattern
of image storage and transmission used in most computer bitmap image systems. The
word raster comes from the Latin word rastrum (a rake), which is derived from radere (to
scrape); see also rastrum, an instrument for drawing musical staff lines. The pattern left
by the tines of a rake, when drawn straight, resembles the parallel lines of a raster: this
line-by-line scanning is what creates a raster. It's a systematic process of covering the
area progressively, one line at a time. Although often a great deal faster, it's similar in the
most-general sense to how one's gaze travels when one reads text.
In a raster scan, an image is subdivided into a sequence of (usually horizontal) strips
known as "scan lines". Each scan line can be transmitted in the form of an analog signal
as it is read from the video source, as in television systems, or can be further divided into
discrete pixels for processing in a computer system. This ordering of pixels by rows is
known as raster order, or raster scan order. Analog television has discrete scan lines
(discrete vertical resolution), but does not have discrete pixels (horizontal resolution) – it
instead varies the signal continuously over the scan line. Thus, while the number of scan
lines (vertical resolution) is unambiguously defined, the horizontal resolution is more
approximate, according to how quickly the signal can change over the course of the scan
line.

Scanning pattern

The beam position (sweep) follows roughly a sawtooth wave.


In raster scanning, the beam sweeps horizontally left-to-right at a steady rate, then blanks
and rapidly moves back to the left, where it turns back on and sweeps out the next line.
During this time, the vertical position is also steadily increasing (downward), but much
more slowly — there is one vertical sweep per image frame, but one horizontal sweep per
line of resolution. Thus each scan line is sloped slightly "downhill" (towards the lower
right), with a slope of approximately –1/horizontal resolution, while the sweep back to
the left (retrace) is significantly faster than the forward scan, and essentially horizontal.
The tilt in the scan lines is imperceptible – it is tiny to start, and is dwarfed in effect by
screen convexity and other modest geometrical imperfections.

Interlaced scanning

To obtain flicker-free pictures, analog CRT TVs write only odd-numbered scan lines on
the first vertical scan; then, the even-numbered lines follow, placed ("interlaced")
between the odd-numbered lines. This is called interlaced scanning. (In this case,
positioning the even-numbered lines does require precise position control; in old analog
TVs, trimming the Vertical Hold adjustment made scan lines space properly. If slightly
misadjusted, the scan lines would appear in pairs, with spaces between.) Modern high-definition TV displays use data formats like progressive scan, as in computer monitors (such as "1080p": 1080 lines, progressive), or interlaced (such as "1080i").
Pixel
In digital imaging, a pixel (or picture element) is a single point in a raster image. The
pixel is the smallest addressable screen element; it is the smallest unit of picture that can
be controlled. Each pixel has its own address. The address of a pixel corresponds to its
coordinates. Pixels are normally arranged in a 2-dimensional grid, and are often
represented using dots or squares. Each pixel is a sample of an original image; more
samples typically provide more accurate representations of the original. The intensity of
each pixel is variable. In color image systems, a color is typically represented by three or
four component intensities such as red, green, and blue, or cyan, magenta, yellow, and
black.
In some contexts (such as descriptions of camera sensors), the term pixel is used to refer
to a single scalar element of a multi-component representation (more precisely called a
photosite in the camera sensor context, although the neologism sensel is also sometimes
used to describe the elements of a digital camera's sensor), while in others the term may
refer to the entire set of such component intensities for a spatial position. In color systems
that use chroma subsampling, the multi-component concept of a pixel can become
difficult to apply, since the intensity measures for the different color components
correspond to different spatial areas in such a representation.
A pixel is generally thought of as the smallest single component of a digital image. The
definition is highly context-sensitive. For example, there can be "printed pixels" in a
page, or pixels carried by electronic signals, or represented by digital values, or pixels on
a display device, or pixels in a digital camera (photosensor elements). This list is not
exhaustive, and depending on context, there are several terms that are synonymous in
particular contexts, such as pel, sample, byte, bit, dot, spot, etc. The term "pixels" can be
used in the abstract, or as a unit of measure, in particular when using pixels as a measure
of resolution, such as: 2400 pixels per inch, 640 pixels per line, or spaced 10 pixels apart.
The measures dots per inch (dpi) and pixels per inch (ppi) are sometimes used
interchangeably, but have distinct meanings, especially for printer devices, where dpi is a
measure of the printer's density of dot (e.g. ink droplet) placement. For example, a high-
quality photographic image may be printed with 600 ppi on a 1200 dpi inkjet printer.
Even higher dpi numbers, such as the 4800 dpi quoted by printer manufacturers since
2002, do not mean much in terms of achievable resolution.
The more pixels used to represent an image, the closer the result can resemble the
original. The number of pixels in an image is sometimes called the resolution, though
resolution has a more specific definition. Pixel counts can be expressed as a single
number, as in a "three-megapixel" digital camera, which has a nominal three million
pixels, or as a pair of numbers, as in a "640 by 480 display", which has 640 pixels from
side to side and 480 from top to bottom (as in a VGA display), and therefore has a total
number of 640 × 480 = 307,200 pixels or 0.3 megapixels.
The pixels, or color samples, that form a digitized image (such as a JPEG file used on a
web page) may or may not be in one-to-one correspondence with screen pixels,
depending on how a computer displays an image. In computing, an image composed of
pixels is known as a bitmapped image or a raster image. The word raster originates from
television scanning patterns, and has been widely used to describe similar halftone
printing and storage techniques.
The number of distinct colors that can be represented by a pixel depends on the number
of bits per pixel (bpp). A 1 bpp image uses 1-bit for each pixel, so each pixel can be
either on or off. Each additional bit doubles the number of colors available, so a 2 bpp
image can have 4 colors, and a 3 bpp image can have 8 colors:
• 1 bpp, 2¹ = 2 colors (monochrome)
• 2 bpp, 2² = 4 colors
• 3 bpp, 2³ = 8 colors
...
• 8 bpp, 2⁸ = 256 colors
• 16 bpp, 2¹⁶ = 65,536 colors ("High color")
• 24 bpp, 2²⁴ ≈ 16.8 million colors ("True color")

For color depths of 15 or more bits per pixel, the depth is normally the sum of the bits
allocated to each of the red, green, and blue components. High color, usually meaning 16
bpp, normally has five bits for red and blue, and six bits for green, as the human eye is
more sensitive to errors in green than in the other two primary colors. For applications
involving transparency, the 16 bits may be divided into five bits each of red, green, and
blue, with one bit left for transparency. A 24-bit depth allows 8 bits per component. On
some systems, 32-bit depth is available: this means that each 24-bit pixel has an extra 8
bits to describe its opacity (for purposes of combining with another image).
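As a small illustrative sketch (not from the original text), the following Python snippet shows how colour components can be packed at two of the depths mentioned above: 16 bpp "high color" with 5 bits of red, 6 bits of green and 5 bits of blue, and 32 bpp with 8 bits per component plus an 8-bit opacity (alpha) value.

def pack_565(r5, g6, b5):
    # 16 bpp high color: 5 bits red, 6 bits green, 5 bits blue.
    return (r5 << 11) | (g6 << 5) | b5

def pack_8888(r, g, b, a):
    # 32 bpp: 8 bits each for alpha, red, green, blue.
    return (a << 24) | (r << 16) | (g << 8) | b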

Frame buffer
A frame buffer is a video output device that drives a video display from a memory
buffer containing a complete frame of data.
The information in the memory buffer typically consists of color values for every pixel
(point that can be displayed) on the screen. Color values are commonly stored in 1-bit
monochrome, 4-bit palettized, 8-bit palettized, 16-bit highcolor and 24-bit truecolor
formats. An additional alpha channel is sometimes used to retain information about pixel
transparency. The total amount of the memory required to drive the framebuffer depends
on the resolution of the output signal, and on the color depth and palette size.
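As a quick worked example (an illustration, not from the original text), the memory needed for a true-color framebuffer follows directly from the resolution and color depth: a 1024 x 768 display at 24 bits per pixel needs 1024 * 768 * 24 / 8 = 2,359,296 bytes, roughly 2.25 MiB.

def framebuffer_bytes(width, height, bits_per_pixel):
    # Bytes required to hold one full frame at the given color depth.
    return width * height * bits_per_pixel // 8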
Framebuffers differ significantly from the vector graphics displays that were common
prior to the advent of the framebuffer. With a vector display, only the vertices of the
graphics primitives are stored. The electron beam of the output display is then
commanded to move from vertex to vertex, tracing an analog line across the area between
these points. With a framebuffer, the electron beam (if the display technology uses one) is
commanded to trace a left-to-right, top-to-bottom path across the entire screen, the way a
television renders a broadcast signal. At the same time, the color information for each
point on the screen is pulled from the framebuffer, creating a set of discrete picture
elements (pixels).
The term "framebuffer" has also entered into colloquial usage to refer to a backing store
of graphical information. The key feature that differentiates a framebuffer from memory
used to store graphics — the output device — is lost in this usage.
Vector and character generation
A vector generator is the part of a computer graphics system in which the beam of a
CRT is deflected by analogue deflection signals to form an image composed of a
number of discrete vectors. Each vector is defined, in each of the two orthogonal
deflection axes, by two digital words giving respectively the initial position and the
length of the vector. The vector generator includes circuitry that ensures that each
vector is drawn with substantially uniform brightness regardless of its length.
Additional circuitry provides an instantaneous indication of a light-pen strike or the
occurrence of edge violations, and also yields continuous digital information about
the current beam position.

Polygon scan conversion

This term encompasses a range of algorithms where polygons are rendered, normally one
at a time, into a frame buffer. The term scan comes from the fact that an image on a CRT
is made up of scan lines. Examples of polygon scan conversion algorithms are the
painter's algorithm, the z-buffer, and the A-buffer. In this course we will generally
assume that polygon scan conversion (PSC) refers to the z-buffer algorithm or one of its
derivatives.
The advantage of polygon scan conversion is that it is fast. Polygon scan conversion
algorithms are used in computer games, flight simulators, and other applications where
interactivity is important. To give a human the illusion that they are interacting with a 3D
model in real time, you need to present the human with animation running at 10 frames
per second or faster. Polygon scan conversion can do this. The fastest hardware
implementations of PSC algorithms can now process millions of polygons per second.
One problem with polygon scan conversion is that it can support only simplistic lighting
models, so images do not necessarily look realistic. For example: transparency can be
supported, but refraction requires the use of an advanced and time-consuming technique
called "refraction mapping"; reflections can be supported, at the expense of duplicating
all of the polygons on the "other side" of the reflecting surface; shadows can be produced,
but only by methods more complicated than those used in ray tracing. The other limitation is that it only has a
single primitive: the polygon, which means that everything is made up of flat surfaces.
This is especially unrealistic when modeling natural objects such as humans or animals.
An image generated using a polygon scan conversion algorithm, even one which makes
heavy use of texture mapping, will tend to look computer generated.
Line drawing algorithms
a) Digital Differential Analyzer (DDA)
Basic idea: Calculate the (x, y)-coordinates of the pixels for the line using the parametric
representation for the line and scan t from 0 to 1.
( x(t), y(t) ) = (P1x, P1y) + t * (P2x - P1x, P2y - P1y)
             = (x1, y1) + t * (x2 - x1, y2 - y1)
x(t) = x1 + t ∆X
y(t) = y1 + t ∆Y
     = y1 + m (t ∆X)
where m = ∆Y/∆X.
Use suitable incremental values for t, calculate (x(t), y(t)) and round off to integers.
Simplified version:
If |m| ≤ 1 then increment x by 1 at each step (that is, scan over values tn such that tn ∆X = 0, 1, 2, ... are integers) and increment y by m.
If |m| > 1 then increment y by 1 at each step (that is, scan over values tn such that m tn ∆X = 0, 1, 2, ... are integers) and increment x by 1/m.
Potential problems:
• floating point division at start up
• floating point additions when incrementing
• rounding at every increment
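A minimal Python sketch of the DDA approach described above is given below; plot() stands for an assumed pixel-setting routine, the longer axis is stepped in unit increments, and the other coordinate is incremented by the slope (or its inverse).

def dda_line(x1, y1, x2, y2, plot):
    dx, dy = x2 - x1, y2 - y1
    steps = max(abs(dx), abs(dy))        # step along the longer axis
    if steps == 0:
        plot(round(x1), round(y1))
        return
    x_inc, y_inc = dx / steps, dy / steps
    x, y = float(x1), float(y1)
    for _ in range(int(steps) + 1):
        plot(round(x), round(y))         # rounding at every increment
        x += x_inc
        y += y_inc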
b) Bresenham’s Algorithm
For |m| < 1, x-increments are integer valued. Try to eliminate rounding of floating point
values by detecting when the rounding gives a value which differs from the previously
rounded value.

• x increases in integer steps: x = round(x)


• y has rounding "errors": d = round(y) - y, with d ∈ (-0.5, 0.5)
• Denote the previously drawn pixel by (x, y') = (round(x), round(y)) = (x, y + d)
• The next point will be (x+1, y+m), which rounded off yields round(x+1) = x+1 and
  round(y+m) = y'   if d + m < 0.5,
             = y'+1 if d + m > 0.5
Bresenham: it is enough to monitor the sign of s ≡ d + m - 0.5.
At the starting point d = 0 → s = m - 0.5
Algorithm (for 0 ≤ m ≤ 1, stepping x from x1 to x2):
s = m - 0.5; y' = round(y1);
for x = x1 to x2 {
    plot(x, y');
    if ( s > 0.0 )
    {
        s = s - 1.0;
        y' = y' + 1;
    }
    s = s + m;
}

Advantages (of the integer form of the algorithm, obtained by multiplying the decision variable through by 2∆X):
– No floating point divisions
– No floating point numbers
– Only integer multiplication by 2, i.e. shift left.
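Below is a hedged Python sketch of that all-integer form for the first octant (0 ≤ m ≤ 1), obtained by scaling the decision variable s = d + m - 0.5 by 2∆X so that only integer additions and a comparison remain; plot() is an assumed pixel-setting routine.

def bresenham_line(x1, y1, x2, y2, plot):
    dx, dy = x2 - x1, y2 - y1            # assumes dx >= dy >= 0 (first octant)
    s = 2 * dy - dx                      # s = 2*dx*(m - 0.5)
    y = y1
    for x in range(x1, x2 + 1):
        plot(x, y)
        if s > 0:
            y += 1
            s -= 2 * dx                  # corresponds to s = s - 1
        s += 2 * dy                      # corresponds to s = s + m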
2nd Order Curves

Curves which are algebraically of second order (circle, ellipse, parabola, some splines)
have special algorithms. Example: for circles use octant symmetry.
Assume that the circle is centered at the origin (0, 0).
• Let (x_i, y_i) be the last drawn pixel.
• Now study (x_{i+1}, y_{i+1}) and decide whether (x_i + 1, y_i) or (x_i + 1, y_i - 1) will be chosen.
• Circle equation: F(x, y) = x² + y² - R² = 0
• For any point (x, y): if it is inside the circle then F(x, y) < 0; if it is outside, F(x, y) > 0.
Introduce a decision variable p_i, evaluated at (x_i, y_i), to decide where the next pixel will be.
• p_i evaluates F(x, y) at the midpoint between y_i and y_i - 1 for x_{i+1} = x_i + 1:
  p_i = F(x_i + 1, y_i - 0.5) = (x_i + 1)² + (y_i - 0.5)² - R²
• If p_i < 0 the midpoint is inside the circle, and we choose the pixel above the midpoint: y_{i+1} = y_i.
• If p_i > 0 the midpoint is outside the circle, and we choose the pixel below the midpoint: y_{i+1} = y_i - 1.
Really neat trick: calculate p_{i+1} from values already calculated for p_i:
  p_{i+1} = F(x_{i+1} + 1, y_{i+1} - 0.5) = (x_{i+1} + 1)² + (y_{i+1} - 0.5)² - R²
  p_{i+1} - p_i = 2(x_i + 1) + 1 + (y_{i+1})² - (y_i)² - (y_{i+1} - y_i)
If p_i < 0 then y_{i+1} = y_i, and p_{i+1} = p_i + 2x_{i+1} + 1.
If p_i > 0 then y_{i+1} = y_i - 1, and p_{i+1} = p_i + 2x_{i+1} + 1 - 2y_{i+1}.
We only need p for the very first pixel, drawn at (0, R):
  p_0 = F(1, R - 0.5) = 1.25 - R
Note 1: p_i is incremented/decremented by integer values, hence round off p_0 = round(p_0); if R is an integer then start with p_0 = 1 - R.
Note 2: p_i is incremented with 2x_i and 2y_i, hence 2x_{i+1} = 2x_i + 2, and 2y_{i+1} = 2y_i - 2 or 2y_i.
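A minimal Python sketch of the midpoint circle algorithm described above follows; plot() is an assumed pixel-setting routine, R is an integer radius and the circle is centred at the origin. Only one octant is computed; the other seven pixels follow by symmetry.

def midpoint_circle(R, plot):
    x, y = 0, R
    p = 1 - R                            # p0 = 1.25 - R, rounded for integer R
    while x <= y:
        # Eight-way symmetry from the octant point (x, y).
        for px, py in ((x, y), (y, x), (-x, y), (-y, x),
                       (x, -y), (y, -x), (-x, -y), (-y, -x)):
            plot(px, py)
        x += 1
        if p < 0:
            p += 2 * x + 1               # midpoint inside: keep y
        else:
            y -= 1
            p += 2 * x + 1 - 2 * y       # midpoint outside: step down in y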

Display Devices
Computer monitor

A monitor or display (sometimes called a visual display unit) is an electronic visual
display for computers. The monitor comprises the display device, circuitry, and an
enclosure. The display device in modern monitors is typically a thin film transistor liquid
crystal display (TFT-LCD), while older monitors use a cathode ray tube (CRT).

Performance measurements

The performance of a monitor is measured by the following parameters:


• Luminance is measured in candelas per square meter.

• Viewable image size is measured diagonally. For CRTs, the viewable size is typically 1 in (25 mm) smaller than the tube itself.

• Aspect ratio is the ratio of the horizontal length to the vertical length. 4:3 is the standard aspect ratio, for example, so that a screen with a width of 1024 pixels will have a height of 768 pixels. If a widescreen display has an aspect ratio of 16:9, a display that is 1024 pixels wide will have a height of 576 pixels.

• Display resolution is the number of distinct pixels in each dimension that can be displayed. Maximum resolution is limited by dot pitch.

• Dot pitch is the distance between subpixels of the same color in millimeters. In general, the smaller the dot pitch, the sharper the picture will appear.

• Refresh rate is the number of times in a second that a display is illuminated. Maximum refresh rate is limited by response time.

• Response time is the time a pixel in a monitor takes to go from active (white) to inactive (black) and back to active (white) again, measured in milliseconds. Lower numbers mean faster transitions and therefore fewer visible image artifacts.

• Contrast ratio is the ratio of the luminosity of the brightest color (white) to that of the darkest color (black) that the monitor is capable of producing.

• Power consumption is measured in watts.

• Viewing angle is the maximum angle at which images on the monitor can be viewed without excessive degradation to the image. It is measured in degrees horizontally and vertically.

Problems

Phosphor burn-in

Phosphor burn-in is localized aging of the phosphor layer of a CRT screen where it has
displayed a static bright image for many years. This results in a faint permanent image on
the screen, even when turned off. In severe cases, it can even be possible to read some of
the text, though this only occurs where the displayed text remained the same for years.
This was once a common phenomenon in single purpose business computers. It can still
be an issue with CRT displays when used to display the same image for years at a time,
but modern computers are not normally used this way anymore, so the problem is not a
significant issue. The only systems that suffered the defect were ones displaying the same
image for years, and with these the presence of burn-in was not a noticeable effect when
in use, since it coincided with the displayed image perfectly. It only became a significant
issue in three situations:
• when some heavily used monitors were reused at home,

• or re-used for display purposes,

• in some high-security applications (but only those where the high-security data displayed did not change for years at a time).

Screen savers were developed as a means to avoid burn-in, but are unnecessary for CRTs
today, despite their popularity.
Phosphor burn-in can be gradually removed on damaged CRT displays by displaying an
all-white screen with brightness and contrast turned up full. This is a slow procedure, but
is usually effective.

Plasma burn-in

Burn-in re-emerged as an issue with early plasma displays, which are more vulnerable to
this than CRTs. Screen savers with moving images may be used with these to minimize
localized burn. Periodic change of the color scheme in use also helps.

Glare

Glare is a problem caused by the relationship between lighting and screen, or by using
monitors in bright sunlight. Matte-finish LCDs and flat-screen CRTs are less prone to
reflected glare than conventional curved CRTs or glossy LCDs; aperture grille CRTs,
which are curved on one axis only, are less prone to it than other CRTs curved on
both axes.
If the problem persists despite moving the monitor or adjusting lighting, a filter using a
mesh of very fine black wires may be placed on the screen to reduce glare and improve
contrast. These filters were popular in the late 1980s. They do also reduce light output.
A filter of the kind described above works only against reflective glare; direct glare (such as sunlight) will
completely wash out most monitors' internal lighting, and can only be dealt with by use
of a hood or a transflective LCD.

Color misregistration

With exceptions of correctly aligned video projectors and stacked LEDs, most display
technologies, especially LCD, have an inherent misregistration of the color channels, that
is, the centers of the red, green, and blue dots do not line up perfectly. Sub-pixel
rendering depends on this misalignment; technologies making use of this include the
Apple II from 1976, and more recently Microsoft (ClearType, 1998) and XFree86 (X
Rendering Extension).

Incomplete spectrum

RGB displays produce most of the visible color spectrum, but not all. This can be a
problem where good color matching to non-RGB images is needed. This issue is common
to all monitor technologies with three color channels.

Boundary Fill Algorithm


There are two ways to do this:
1. Four-connected fill where we propagate: left, right, up, down

Procedure Four_Fill (x, y, fill_col, bound_col: integer);
var
  curr_color: integer;
begin
  curr_color := inquire_color(x, y);
  if (curr_color <> bound_col) and (curr_color <> fill_col) then
  begin
    set_pixel(x, y, fill_col);
    Four_Fill (x+1, y, fill_col, bound_col);
    Four_Fill (x-1, y, fill_col, bound_col);
    Four_Fill (x, y+1, fill_col, bound_col);
    Four_Fill (x, y-1, fill_col, bound_col);
  end;
end;
2. Eight-connected fill algorithm where we test all eight adjacent pixels.

So we add the calls:


Procedure Eight_Fill (x, y, fill_col, bound_col: integer);
var
  curr_color: integer;
begin
  curr_color := inquire_color(x, y);
  if (curr_color <> bound_col) and (curr_color <> fill_col) then
  begin
    set_pixel(x, y, fill_col);
    Eight_Fill (x+1, y, fill_col, bound_col);
    Eight_Fill (x-1, y, fill_col, bound_col);
    Eight_Fill (x, y+1, fill_col, bound_col);
    Eight_Fill (x, y-1, fill_col, bound_col);
    Eight_Fill (x+1, y-1, fill_col, bound_col);
    Eight_Fill (x+1, y+1, fill_col, bound_col);
    Eight_Fill (x-1, y-1, fill_col, bound_col);
    Eight_Fill (x-1, y+1, fill_col, bound_col);
  end;
end;

Note: above 4-fill and 8-fill algorithms involve heavy duty recursion which may consume
memory and time. Better algorithms are faster, but more complex. They make use of
pixel runs (horizontal groups of pixels).
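As a hedged sketch of such a run-based approach (assuming a raster object with width, height, get_pixel and set_pixel members, names chosen here for illustration), the following Python routine replaces the deep recursion with an explicit stack of seed points and fills whole horizontal runs at a time.

def span_boundary_fill(raster, x, y, fill_col, bound_col):
    def fillable(px, py):
        if px < 0 or px >= raster.width or py < 0 or py >= raster.height:
            return False
        c = raster.get_pixel(px, py)
        return c != bound_col and c != fill_col

    stack = [(x, y)]
    while stack:
        sx, sy = stack.pop()
        if not fillable(sx, sy):
            continue
        # Expand the run to the left and right of the seed.
        left, right = sx, sx
        while fillable(left - 1, sy):
            left -= 1
        while fillable(right + 1, sy):
            right += 1
        for px in range(left, right + 1):
            raster.set_pixel(px, sy, fill_col)
        # Push one seed per fillable run in the rows above and below.
        for py in (sy - 1, sy + 1):
            px = left
            while px <= right:
                if fillable(px, py):
                    stack.append((px, py))
                    while px <= right and fillable(px, py):
                        px += 1
                else:
                    px += 1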

Flood Fill Algorithm


Sometimes we'd like an area fill algorithm that replaces all connected pixels of a selected
color with a fill color. The flood-fill algorithm does exactly that.

public void floodFill(int x, int y, int fill, int old)
{
    // Stop at the raster boundaries.
    if ((x < 0) || (x >= raster.width)) return;
    if ((y < 0) || (y >= raster.height)) return;
    // Only recolour pixels that still have the old color.
    if (raster.getPixel(x, y) == old) {
        raster.setPixel(fill, x, y);
        floodFill(x+1, y, fill, old);
        floodFill(x, y+1, fill, old);
        floodFill(x-1, y, fill, old);
        floodFill(x, y-1, fill, old);
    }
}

UNIT II
Transformations
2D Transformations

Given a point cloud, polygon, or sampled parametric curve, we can use transformations
for several purposes:
1. Change coordinate frames (world, window, viewport, device, etc).
2. Compose objects of simple parts with local scale/position/orientation of one part
defined with regard to other parts. For example, for articulated objects.
3. Use deformation to create new shapes.
4. Useful for animation.
There are three basic classes of transformations:

1. Rigid body - Preserves distance and angles.


• Examples: translation and rotation.

2. Conformal - Preserves angles.


• Examples: translation, rotation, and uniform scaling.

3. Affine - Preserves parallelism. Lines remain lines.


• Examples: translation, rotation, scaling, shear, and reflection.

Homogeneous coordinates

Homogeneous coordinates are another way to represent points to simplify the way in
which we express affine transformations. Normally, bookkeeping would become tedious
when affine transformations of the form A p + t (a linear map A followed by a translation t) are composed. With homogeneous
coordinates, affine transformations become matrices, and composition of transformations
is as simple as matrix multiplication. In future sections of the course we exploit this in
much more powerful ways.
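A brief Python/numpy sketch of this idea (names are illustrative): with homogeneous coordinates the affine map A p + t becomes a single 3x3 matrix, and composing two transformations is just a matrix product.

import numpy as np

def affine_2d(A, t):
    # Build a 3x3 homogeneous matrix from a 2x2 linear part A and a translation t.
    M = np.eye(3)
    M[:2, :2] = A
    M[:2, 2] = t
    return M

# Example: rotate 90 degrees about the origin, then translate by (5, 2).
R = affine_2d([[0, -1], [1, 0]], [0, 0])
T = affine_2d([[1, 0], [0, 1]], [5, 2])
M = T @ R                        # composition by matrix multiplication
p = np.array([1, 0, 1])          # the point (1, 0) in homogeneous form
print(M @ p)                     # [5. 3. 1.], i.e. the point (5, 3)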

Concept

It is desirable to restrict the effect of graphics primitives to a subregion of the canvas, to
protect other portions of the canvas. All primitives are clipped to the boundaries of this
clipping rectangle; that is, primitives lying outside the clip rectangle are not drawn.
The default clipping rectangle is the full canvas (the screen), and it is obvious that we
cannot see any graphics primitives outside the screen.
A simple example of line clipping can illustrate this idea: the display window is the
canvas and also the default clipping rectangle, thus all line segments inside the canvas are
drawn. The red box is the clipping rectangle we will use later, and the dotted line is the
extension of the four edges of the clipping rectangle.

Line Clipping

This section treats clipping of lines against rectangles. Although there are specialized
algorithms for rectangle and polygon clipping, it is important to note that other graphic
primitives can be clipped by repeated application of the line clipper.

Clipping Individual Points
Before we discuss clipping lines, let's look at the simpler problem of clipping
individual points.
If the x coordinate boundaries of the clipping rectangle are Xmin and Xmax, and
the y coordinate boundaries are Ymin and Ymax, then the following inequalities
must be satisfied for a point at (X,Y) to be inside the clipping rectangle:
Xmin < X < Xmax and Ymin < Y < Ymax
If any of the four inequalities does not hold, the point is outside the clipping
rectangle.

The Cohen-Sutherland Line-Clipping Algorithm
The more efficient Cohen-Sutherland Algorithm performs initial tests on a line to
determine whether intersection calculations can be avoided.

Steps for Cohen-Sutherland algorithm

1. End-point pairs are checked for trivial acceptance or trivial rejection using
the outcodes.
2. If the line is neither trivially accepted nor trivially rejected, it is divided into two
segments at a clip edge.
3. The segments are iteratively clipped, by testing for trivial acceptance or rejection and
dividing into two segments, until each is completely inside or trivially rejected. (A sketch of the algorithm follows below.)
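A hedged Python sketch of the algorithm is given below, clipping against the rectangle [xmin, xmax] x [ymin, ymax]; the outcode bit values and function names are illustrative.

INSIDE, LEFT, RIGHT, BOTTOM, TOP = 0, 1, 2, 4, 8

def outcode(x, y, xmin, ymin, xmax, ymax):
    code = INSIDE
    if x < xmin:   code |= LEFT
    elif x > xmax: code |= RIGHT
    if y < ymin:   code |= BOTTOM
    elif y > ymax: code |= TOP
    return code

def cohen_sutherland(x0, y0, x1, y1, xmin, ymin, xmax, ymax):
    c0 = outcode(x0, y0, xmin, ymin, xmax, ymax)
    c1 = outcode(x1, y1, xmin, ymin, xmax, ymax)
    while True:
        if not (c0 | c1):                # trivial accept: both endpoints inside
            return (x0, y0, x1, y1)
        if c0 & c1:                      # trivial reject: both outside one edge
            return None
        # Pick an endpoint that is outside and move it to the crossed clip edge.
        c = c0 or c1
        if c & TOP:
            x = x0 + (x1 - x0) * (ymax - y0) / (y1 - y0); y = ymax
        elif c & BOTTOM:
            x = x0 + (x1 - x0) * (ymin - y0) / (y1 - y0); y = ymin
        elif c & RIGHT:
            y = y0 + (y1 - y0) * (xmax - x0) / (x1 - x0); x = xmax
        else:  # LEFT
            y = y0 + (y1 - y0) * (xmin - x0) / (x1 - x0); x = xmin
        if c == c0:
            x0, y0 = x, y
            c0 = outcode(x0, y0, xmin, ymin, xmax, ymax)
        else:
            x1, y1 = x, y
            c1 = outcode(x1, y1, xmin, ymin, xmax, ymax)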

Clipping Polygons

An algorithm that clips a polygon must deal with many different cases. One case is
particularly noteworthy: a concave polygon may be clipped into two separate
polygons. All in all, the task of clipping seems rather complex. Each edge of the polygon
must be tested against each edge of the clip rectangle; new edges must be added, and
existing edges must be discarded, retained, or divided. Multiple polygons may result from
clipping a single polygon. We need an organized way to deal with all these cases.
The following example illustrates a simple case of polygon clipping.

Weiler-Atherton Algorithm
The Weiler-Atherton algorithm is capable of clipping a concave polygon with interior
holes to the boundaries of another concave polygon, also with interior holes. The polygon
to be clipped is called the subject polygon (SP) and the clipping region is called the clip
polygon (CP). The new boundaries created by clipping the SP against the CP are identical
to portions of the CP. No new edges are created. Hence, the number of resulting polygons
is minimized.
The algorithm describes both the SP and the CP by a circular list of vertices. The exterior
boundaries of the polygons are described clockwise, and the interior boundaries or holes
are described counter-clockwise. When traversing the vertex list, this convention ensures
that the inside of the polygon is always to the right. The boundaries of the SP and the CP
may or may not intersect. If they intersect, the intersections occur in pairs. One of the
intersections occurs when the SP edge enters the inside of the CP and one when it leaves.
Fundamentally, the algorithm starts at an entering intersection and follows the exterior
boundary of the SP clockwise until an intersection with a CP is found. At the intersection
a right turn is made, and the exterior of the CP is followed clockwise until an intersection
with the SP is found. Again, at the intersection, a right turn is made, with the SP now
being followed. The process is continued until the starting point is reached. Interior
boundaries of the SP are followed counter-clockwise.
A more formal statement of the algorithm is [3]:

• Determine the intersections of the subject and clip polygons - Add each intersection to the SP and CP vertex lists. Tag each intersection vertex and establish a bidirectional link between the SP and CP lists for each intersection vertex.

• Process nonintersecting polygon borders - Establish two holding lists: one for boundaries which lie inside the CP and one for boundaries which lie outside. Ignore CP boundaries which are outside the SP. CP boundaries inside the SP form holes in the SP. Consequently, a copy of the CP boundary goes on both the inside and the outside holding list. Place the boundaries on the appropriate holding list.

• Create two intersection vertex lists - One, the entering list, contains only the intersections for the SP edge entering the inside of the CP. The other, the leaving list, contains only the intersections for the SP edge leaving the inside of the CP. The intersection type will alternate along the boundary. Thus, only one determination is required for each pair of intersections.

• Perform the actual clipping - Polygons inside the CP are found using the following procedure.

  o Remove an intersection vertex from the entering list. If the list is empty, the process is complete.

  o Follow the SP vertex list until an intersection is found. Copy the SP list up to this point to the inside holding list.

  o Using the link, jump to the CP vertex list.

  o Follow the CP vertex list until an intersection is found. Copy the CP vertex list up to this point to the inside holding list.

  o Jump back to the SP vertex list.

  o Repeat until the starting point is again reached. At this point, the new inside polygon has been closed.
Polygons outside the CP are found using the same procedure, except that the
initial intersection vertex is obtained from the leaving list and the CP vertex list is
followed in the reverse direction. The polygon lists are copied to the outside
holding list.

Sutherland and Hodgman's polygon-clipping algorithm uses a divide-and-conquer


strategy: It solves a series of simple and identical problems that, when combined, solve
the overall problem. The simple problem is to clip a polygon against a single infinite clip
edge. Four clip edges, each defining one boundary of the clip rectangle, successively clip
a polygon against a clip rectangle.
Note the difference between this strategy for a polygon and the Cohen-Sutherland
algorithm for clipping a line: The polygon clipper clips against four edges in succession,
whereas the line clipper tests the outcode to see which edge is crossed, and clips only
when necessary.

Steps of Sutherland-Hodgman's polygon-clipping algorithm

• Polygons can be clipped against each edge of the window one at a time. Window/edge intersections, if any, are easy to find since the X or Y coordinates are already known.

• Vertices which are kept after clipping against one window edge are saved for clipping against the remaining edges.

• Note that the number of vertices usually changes and will often increase.

• We are using the Divide and Conquer approach.

• The original polygon and the clip rectangle.



• After clipping by the right clip boundary.

• After clipping by the right and bottom clip boundaries.

• After clipping by the right, bottom, and left clip boundaries.

• After clipping by all four boundaries.

Four Cases of polygon clipping against one edge

The clip boundary determines a visible and invisible region. The edges from vertex i to
vertex i+1 can be one of four types:
• Case 1: Wholly inside visible region - save endpoint

• Case 2: Exit visible region - save the intersection

• Case 3: Wholly outside visible region - save nothing

• Case 4: Enter visible region - save intersection and endpoint

Because clipping against one edge is independent of all others, it is possible
to arrange the clipping stages in a pipeline. The input polygon is clipped
against one edge and any points that are kept are passed on as input to the
next stage of the pipeline. This way four polygons can be at different stages
of the clipping process simultaneously. This is often implemented in
hardware.
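A hedged Python sketch of one such pipeline stage follows: it clips a polygon (a list of vertices) against a single clip edge using the four cases listed above, and is shown specialised to the right boundary x = xmax as an example. Function and variable names are illustrative.

def clip_against_edge(polygon, inside, intersect):
    # polygon: list of (x, y); inside(p): True if p is on the visible side;
    # intersect(p, q): intersection of segment p-q with the clip edge.
    output = []
    for i, current in enumerate(polygon):
        previous = polygon[i - 1]            # wraps to the last vertex
        if inside(current):
            if not inside(previous):         # case 4: entering, save intersection
                output.append(intersect(previous, current))
            output.append(current)           # cases 1 and 4: save endpoint
        elif inside(previous):               # case 2: exiting, save intersection
            output.append(intersect(previous, current))
        # case 3: wholly outside, save nothing
    return output

def clip_right(polygon, xmax):
    def inside(p):
        return p[0] <= xmax
    def intersect(p, q):
        t = (xmax - p[0]) / (q[0] - p[0])
        return (xmax, p[1] + t * (q[1] - p[1]))
    return clip_against_edge(polygon, inside, intersect)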
Unit III
3D Transformations
3D projection is any method of mapping three-dimensional points to a two-dimensional
plane. As most current methods for displaying graphical data are based on planar two-
dimensional media, the use of this type of projection is widespread, especially in
computer graphics, engineering and drafting.
Translation

Scaling

Rotation about z axis

Rotation about x axis

Rotation about y axis
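As a sketch of the transformations listed above (using numpy; this code is illustrative and not taken from the original figures), the standard 4x4 homogeneous matrices are:

import numpy as np

def translate(tx, ty, tz):
    return np.array([[1, 0, 0, tx],
                     [0, 1, 0, ty],
                     [0, 0, 1, tz],
                     [0, 0, 0, 1]], dtype=float)

def scale(sx, sy, sz):
    return np.diag([sx, sy, sz, 1.0])

def rotate_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0, 0],
                     [s,  c, 0, 0],
                     [0,  0, 1, 0],
                     [0,  0, 0, 1]])

def rotate_x(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0,  0, 0],
                     [0, c, -s, 0],
                     [0, s,  c, 0],
                     [0, 0,  0, 1]])

def rotate_y(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[ c, 0, s, 0],
                     [ 0, 1, 0, 0],
                     [-s, 0, c, 0],
                     [ 0, 0, 0, 1]])

# A point (x, y, z) is transformed as M @ [x, y, z, 1].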
Parallel and Perspective Projection
This is one of the final steps before anything is actually blitted to the screen. A lot of
programmers choose to include volume clipping in their projection functions because it
would be a waste of time to do projection on points or objects that are out of the viewing
area. We are covering two types of projection, parallel and perspective. Most games use
perspective projection to give the user a real view of the world around them, while
parallel projection is typically used for graphics that don't need that sense of realism.

Parallel Projection

Parallel projection is just a cheap imitation of what the real world would look like. This is
because it simply ignores the z extent of all points. It is just concerned with the easiest
way of getting the point to the screen. For these reasons, parallel projection is very easy
to do and is good for modeling where perspective would distort construction, but it is not
used as much as perspective projection. For the rest of this tutorial, assume that we are
dealing with members of the point class that we developed. Notice that in our test object,
after projection each edge is perfectly aligned just as it was in the world: each edge that
was parallel in the world appears parallel in the projection. The transition from world
coordinates to screen coordinates is very simple.

Remember that our world is designed to be viewed from the center (0,0,0). If we just
used the world coordinates, our view would be incorrect! This is because the upper left of
the screen is (0,0) and this is NOT aligned with what we want. To correct this, we must
add half the screen width and half the screen height to our world coordinates so that the
views are the same. Also notice that we are taking -1*wy. By now you should be able to
guess why! It is because in our world, as the object moves up, its y extent increases. This
is completely opposite of our screen, where moving down increases the y extent.
Multiplying the y extent by -1 will correct this problem as well! I told you it was simple!

Perspective Projection

This type of projection is a little more complicated than parallel projection, but the results
are well worth it. We are able to realistically display our world as if we were really
standing there looking around! The same display issues face us as before:
1. We need to align the screen (0,0) with the world (0,0,0).
2. We need to correct the y directional error.
We correct the two problems as we did before: we add half the screen width and height
to our point, and then reverse the sign of the y extent. Here's a generalized equation for
the actual perspective projection.

The only real difference between these equations and our parallel relatives is the
addition of the XSCALE, YSCALE and One Over Z variables. Firstly we define
OneOverZ (a double) which stores 1 divided by the z extent of the point. We need to
multiply our points by this because, by its nature, as points move further away they will
move closer to 0. This is exactly what we notice in the "real" world: if we look into a
plastic tube, the sides look as if they are moving closer to the center, and the longer the
tube, the more this effect is noticed. We then define XSCALE and YSCALE. These are
used to adjust our Field Of View. The larger our FOV, the more squished together objects
appear. This makes sense since we have a finite viewing area, and in order to
accommodate a larger FOV, we will have to squish stuff together. I've found that when
displaying objects, each video mode needs a different FOV factor to make it look truly
realistic. Play with the numbers to see what you like the best! Most vary from 40-150.
Adjusting one of the FOV variables while leaving the other unchanged will make the
display appear to stretch in that extent. Look at our cube now with a high FOV: the edges
are far from being parallel!
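A minimal Python sketch of the two projections as described above (wx, wy, wz are world coordinates; xscale and yscale are the field-of-view factors; names are illustrative):

def project_parallel(wx, wy, wz, screen_w, screen_h):
    sx = wx + screen_w / 2              # re-centre the origin on the screen
    sy = -wy + screen_h / 2             # flip y: screen y grows downward
    return sx, sy

def project_perspective(wx, wy, wz, screen_w, screen_h, xscale, yscale):
    one_over_z = 1.0 / wz               # distant points shrink toward the centre
    sx = wx * xscale * one_over_z + screen_w / 2
    sy = -wy * yscale * one_over_z + screen_h / 2
    return sx, sy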

Z-buffering
In computer graphics, z-buffering is the management of image depth coordinates in three-
dimensional (3-D) graphics, usually done in hardware, sometimes in software. It is one
solution to the visibility problem, which is the problem of deciding which elements of a
rendered scene are visible, and which are hidden. The painter's algorithm is another
common solution which, though less efficient, can also handle non-opaque scene
elements. Z-buffering is also known as depth buffering.
When an object is rendered by a 3D graphics card, the depth of a generated pixel (z
coordinate) is stored in a buffer (the z-buffer or depth buffer). This buffer is usually
arranged as a two-dimensional array (x-y) with one element for each screen pixel. If
another object of the scene must be rendered in the same pixel, the graphics card
compares the two depths and chooses the one closer to the observer. The chosen depth is
then saved to the z-buffer, replacing the old one. In the end, the z-buffer will allow the
graphics card to correctly reproduce the usual depth perception: a close object hides a
farther one. This is called z-culling.
The granularity of a z-buffer has a great influence on the scene quality: a 16-bit z-buffer
can result in artifacts (called "z-fighting") when two objects are very close to each other.
A 24-bit or 32-bit z-buffer behaves much better, although the problem cannot be entirely
eliminated without additional algorithms. An 8-bit z-buffer is almost never used since it
has too little precision.
Given: A list of polygons {P1, P2, ..., Pn}
Output: A COLOR array, which displays the intensity of the visible polygon surfaces.
Initialize:
    (note: z-depth and z-buffer(x,y) are positive)
    z-buffer(x,y) = max depth; and
    COLOR(x,y) = background color.
Begin:
    for (each polygon P in the polygon list) do {
        for (each pixel (x,y) that intersects P) do {
            Calculate z-depth of P at (x,y)
            If (z-depth < z-buffer(x,y)) then {
                z-buffer(x,y) = z-depth;
                COLOR(x,y) = Intensity of P at (x,y);
            }
        }
    }
    display COLOR array.

Bézier curve
Definition: A Bézier curve is a curved line or path defined by mathematical equations. It was named after
Pierre Bézier, a French mathematician and engineer who developed this method of computer drawing in the
late 1960s while working for the car manufacturer Renault. Most graphics software includes a pen tool for
drawing paths with Bézier curves.

The most basic Bézier curve is made up of two end points and control handles attached to
each node. The control handles define the shape of the curve on either side of the
common node. Drawing Bézier curves may seem baffling at first; it's something that
requires some study and practice to grasp the geometry involved. But once mastered,
Bezier curves are a wonderful way to draw!

A Bezier curve with three nodes. The center node is selected and the control handles are
visible.

A Bezier curve with three nodes. The second node (from left) is selected and the control
handles are visible.
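As a small hedged sketch (not from the original text), a cubic Bezier curve defined by four control points can be evaluated with the de Casteljau construction, which repeatedly interpolates between control points:

def lerp(a, b, t):
    return (a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t)

def cubic_bezier(p0, p1, p2, p3, t):
    # Repeated linear interpolation (de Casteljau) for parameter t in [0, 1].
    a, b, c = lerp(p0, p1, t), lerp(p1, p2, t), lerp(p2, p3, t)
    d, e = lerp(a, b, t), lerp(b, c, t)
    return lerp(d, e, t)

# Sampling the curve for drawing:
# points = [cubic_bezier(p0, p1, p2, p3, i / 100) for i in range(101)]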
B-splines
Definition: B-splines are not used very often in 2D graphics software but are used quite
extensively in 3D modeling software. They have an advantage over Bézier curves in that
they are smoother and easier to control. B-splines consist entirely of smooth curves, but
sharp corners can be introduced by joining two spline curve segments. The continuous
curve of a b-spline is defined by control points. While the curve is shaped by the control
points, it generally does not pass through them.

Creature House Expression and Ulead PhotoImpact are two 2D graphics programs that
offer spline curve drawing tools. Drawing with a spline tool is generally more intuitive
and much easier for computer drawing beginners to understand compared to drawing
with a Bézier tool. Because a Bézier curve often changes form as each new curve
segment is placed, it is difficult to predict the outcome unless you understand the
underlying geometry. On the other hand, when drawing splines, the shape of the curve
can be previewed as the pointer is moved and the curve does not change form as new
segments are placed.

Originally, a spline tool was a thin flexible strip of metal or rubber used by draftsmen to
aid in drawing curved lines.

The solid line represents an open path created with Expression's B-Spline tool. The points
on the dotted line represent mouse clicks. By moving the control points on the dotted
line, the path can be reshaped.

The solid line represents a closed path created with Expression's B-Spline tool.

Painter's algorithm
The painter's algorithm, also known as a priority fill, is one of the simplest solutions to
the visibility problem in 3D computer graphics. When projecting a 3D scene onto a 2D
plane, it is necessary at some point to decide which polygons are visible, and which are
hidden.
The name "painter's algorithm" refers to the technique employed by many painters of
painting distant parts of a scene before parts which are nearer, thereby covering some
areas of distant parts. The painter's algorithm sorts all the polygons in a scene by their
depth and then paints them in this order, farthest to closest. It will paint over the parts that
are normally not visible — thus solving the visibility problem — at the cost of having
painted redundant areas of distant objects.
The algorithm can fail in some cases, including cyclic overlap or piercing polygons. In
the case of cyclic overlap, as shown in the figure to the right, Polygons A, B, and C
overlap each other in such a way that it is impossible to determine which polygon is
above the others. In this case, the offending polygons must be cut to allow sorting.
Newell's algorithm, proposed in 1972, provides a method for cutting such polygons.
Numerous methods have also been proposed in the field of computational geometry.
The case of piercing polygons arises when one polygon intersects another. As with cyclic
overlap, this problem may be resolved by cutting the offending polygons.
In basic implementations, the painter's algorithm can be inefficient. It forces the system
to render each point on every polygon in the visible set, even if that polygon is occluded
in the finished scene. This means that, for detailed scenes, the painter's algorithm can
overly tax the computer hardware.
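A minimal sketch of the idea in Python (assuming polygon objects with an average_depth() method and a draw() method; these names are hypothetical):

def painters_algorithm(polygons, frame_buffer):
    # Sort by depth and paint farthest first, so nearer polygons overwrite
    # farther ones in the frame buffer.
    for poly in sorted(polygons, key=lambda p: p.average_depth(), reverse=True):
        poly.draw(frame_buffer)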

UNIT 4
Diffuse Reflection
Diffuse reflection is the reflection of light from a surface such that an incident ray is
reflected at many angles, in a seemingly random way, rather than at just one precise
angle, which is the case of specular reflection. If a surface is completely nonspecular, the
reflected light will be evenly scattered over the hemisphere surrounding the surface.
A surface built from a non-absorbing powder, such as plaster, or from fibers, such as
paper, or from a polycrystalline material, such as marble, can be a nearly perfect diffuser.
On the opposite side, calm water and glass objects give only specular reflection (and not
a strong one), while, among common materials, only polished metals can reflect light
specularly with great efficiency (in fact, the reflecting material of mirrors is usually
aluminum or silver). All other materials, even when perfectly polished, usually give no
more than about a 5 - 10% specular reflection, except in particular conditions, such as the
total internal reflection of a glass prism, or when structured in a purposely made complex
configuration, such as the silvery skin of many fish species. A white material, instead,
such as snow, can reflect back all the light it receives, but diffusely, not specularly.

Mechanism

Diffuse reflection from a solid surface is generally not due, as one might naively think, to
the roughness of its surface: a flat surface is indeed required to give specular reflection,
but it does not cancel diffuse reflection. We can grind and polish a piece of white marble
as much as we like, but it continues to be white and will never become a mirror: it will
simply gain a small specular reflection, while the remaining light continues to be diffusely
reflected.
The most general mechanism by which a surface gives diffuse reflection does not involve
the surface itself: most of the light is contributed by scattering centers beneath the
surface. If we imagine that the figure represents snow, and that the polygons are its
(transparent) ice crystallites, an impinging ray is partially reflected (a few percent) by the
first particle, enters it, is again reflected by the interface with the second particle, enters
it, impinges on the third, and so on, generating a series of "primary" scattered rays in
random directions, which, in turn, through the same mechanism, generate a large number
of "secondary" scattered rays, which generate "tertiary" rays. All these rays travel through
the snow crystallites, which do not absorb light, until they arrive at the surface and exit in
random directions. The result is that we get back in all directions all the light we sent, so
that we can say that snow is white, in spite of the fact that it is made of transparent
objects (ice crystals).
For simplicity, here we have spoken of "reflections", but more generally the interface
between the small particles that constitute many materials is irregular on a scale
comparable with light wavelength, so diffuse light is generated at each interface, rather
than a single reflected ray, but the story can be told the same way.

This mechanism is very general, because almost all common materials are made of
"small things" held together. Mineral materials are generally polycrystalline: we can
describe them as made of a 3-D mosaic of small, irregularly shaped, defective crystals.
Organic materials are usually composed of fibers or cells, with their membranes and their
complex internal structure. And each interface, inhomogeneity or imperfection can
deviate, reflect or scatter light, reproducing the above mechanism.
A few materials do not follow this mechanism: among them metals, which do not allow
light to enter; gases, liquids and glass (which has a liquid-like amorphous microscopic
structure); single crystals, such as some gems or a salt crystal; and some very special
materials, such as the tissues which make up the cornea and the lens of an eye. These
materials can reflect diffusely, however, if their surface is microscopically rough, as in
frosted glass (figure 2), or, of course, if their homogeneous structure deteriorates, as in the
eye lens.
A surface may also exhibit both specular and diffuse reflection, as is the case, for
example, with glossy paints used in home painting, which also give a fraction of specular
reflection, while matte paints give almost exclusively diffuse reflection.

Colored objects

We have up to now spoken of white objects, which do not absorb light. But the above
scheme continues to be valid in the case that the material is absorbent. In this case,
diffused rays will lose some wavelengths during their walk in the material, and will
emerge colored.
Moreover, diffusion affects the color of objects in a substantial way, because it determines
the average path of light in the material, and hence the extent to which the various
wavelengths are absorbed. Red ink looks black while it sits in its bottle; its vivid color
is only perceived when we put it on a scattering material (paper). This is because
light's path through the paper fibers (and through the ink) is only a fraction of a millimeter
long, whereas light coming from the bottle has crossed centimeters of ink and has been
heavily absorbed, even in its red wavelengths.
And, when a colored object has both diffuse and specular reflection, usually only the
diffuse component is colored. A cherry reflects red light diffusely, absorbs all other
colors, and has a specular reflection which is essentially white. This is quite general
because, except for metals, the reflectivity of most materials depends on their refractive
index, which varies little with wavelength (though it is this variation that causes the
chromatic dispersion in a prism), so that all colors are reflected with nearly the same
intensity. Reflections of different origin, however, may be colored: metallic reflections,
such as in gold or copper, or interferential reflections: iridescences, peacock feathers,
butterfly wings, beetle elytra, or the antireflection coating of a lens.

Interreflection

Diffuse interreflection is a process whereby light reflected from an object strikes other
objects in the surrounding area, illuminating them. Diffuse interreflection specifically
describes light reflected from objects which are not shiny or specular. In real life terms
what this means is that light is reflected off non-shiny surfaces such as the ground, walls,
or fabric, to reach areas not directly in view of a light source. If the diffuse surface is
colored, the reflected light is also colored, resulting in similar coloration of surrounding
objects.
In 3D computer graphics, diffuse interreflection is an important component of global
illumination. There are a number of ways to model diffuse interreflection when rendering
a scene. Radiosity and photon mapping are two commonly used methods.
Specular reflection
Unlike diffusion, specular reflection is viewpoint dependent. By the law of reflection,
light striking a specular surface is reflected at an angle which mirrors the incident
light angle, which makes the viewing angle very important. Specular reflection forms
tight, bright highlights, making the surface appear glossy (Figure 10.3, "Specular
Reflection.").
Figure 10.3. Specular Reflection.

 
In reality, diffusion and specular reflection are generated by exactly the same process of
light scattering. Diffusion is dominant for a surface which has so much small-scale
roughness, with respect to the wavelength of light, that light is reflected in many
different directions from each tiny bit of the surface, with tiny changes in surface angle.
Specular reflection, on the other hand, dominates on a surface which is smooth with
respect to the wavelength. This implies that the scattered rays from each point of the
surface are directed in almost the same direction, rather than being diffusely scattered. It
is just a matter of the scale of the detail: if the surface roughness is much smaller than the
wavelength of the incident light, the surface appears flat and acts as a mirror.
Like diffusion, specular reflection has a number of different implementations, or
specular shaders. Again, each of these implementations shares two common parameters:
the specular colour and the energy of the specularity, in the [0-2] range. This effectively
allows more energy to be shed as specular reflection than there is incident energy. As a
result, a material has at least two different colors, a diffuse one and a specular one. The
specular color is normally set to pure white, but it can be set to different values to obtain
interesting effects.
The four specular shaders are:
• CookTorr - This was Blender's only specular shader up to version 2.27. Indeed, up to that version it was not possible to separately set diffuse and specular shaders and there was just one plain material implementation. Besides the two standard parameters this shader uses a third, hardness, which regulates the width of the specular highlights. The lower the hardness, the wider the highlights.
• Phong - This is a different mathematical algorithm, used to compute specular highlights. It is not very different from CookTorr, and it is governed by the same three parameters.
• Blinn - This is a more 'physical' specular shader, thought to match the Oren-Nayar diffuse one. It is more physical because it adds a fourth parameter, an index of refraction (IOR), to the aforementioned three. This parameter is not actually used to compute refraction of rays (a ray-tracer is needed for that), but to correctly compute specular reflection intensity and extension via Snell's Law. Hardness and Specular parameters give additional degrees of freedom.
• Toon - This specular shader matches the Toon diffuse one. It is designed to produce the sharp, uniform highlights of toons. It has no hardness but rather a Size and Smooth pair of parameters which dictate the extension and sharpness of the specular highlights.
Thanks to this flexible implementation, which keeps separate the diffuse and specular
reflection phenomena, Blender allows us to easily control how much of the incident light
striking a point on a surface is diffusely scattered, how much is reflected as specularity,
and how much is absorbed. This, in turn, determines in what directions (and in what
amounts) the light is reflected from a given light source; that is, from what sources (and
in what amounts) the light is reflected toward a given point on the viewing plane.
It is very important to remember that the material color is just one element in the
rendering process. The color is actually the product of the light color and the material
color.
Ray tracing

Ray tracing has the tremendous advantage that it can produce realistic looking images.
The technique allows a wide variety of lighting effects to be implemented. It also permits
a range of primitive shapes which is limited only by the ability of the programmer to
write an algorithm to intersect a ray with the shape.
Ray tracing works by firing one or more rays from the eye point through each pixel. The
colour assigned to a ray is the colour of the first object that it hits, determined by the
object's surface properties at the ray-object intersection point and the illumination at that
point. The colour of a pixel is some average of the colours of all the rays fired through it.
The power of ray tracing lies in the fact that secondary rays are fired from the ray-object
intersection point to determine its exact illumination (and hence colour). This spawning
of secondary rays allows reflection, refraction, and shadowing to be handled with ease.
Ray tracing's big disadvantage is that it is slow. It takes minutes, or hours, to render a
reasonably detailed scene. Until recently, ray tracing had never been implemented in
hardware. A Cambridge company, Advanced Rendering Technologies, is trying to do just
that, but they will probably still not get ray tracing speeds up to those achievable with
polygon scan conversion.
Ray tracing is used where realism is vital. Example application areas are high quality
architectural visualisations, and movie or television special effects.
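A toy Python sketch of the primary-ray step (a single hard-coded sphere, a pinhole camera
at the origin, and an ASCII "image"; everything here is invented for illustration) shows the
basic loop of firing one ray per pixel and colouring the pixel according to the first hit:

    # Toy ray caster: one primary ray per pixel, tested against a single sphere.
    import math

    def hit_sphere(origin, direction, centre, radius):
        # Solve |origin + t*direction - centre|^2 = radius^2 for the nearest t > 0.
        oc = [o - c for o, c in zip(origin, centre)]
        a = sum(d * d for d in direction)
        b = 2.0 * sum(o * d for o, d in zip(oc, direction))
        c = sum(o * o for o in oc) - radius * radius
        disc = b * b - 4 * a * c
        if disc < 0:
            return None
        t = (-b - math.sqrt(disc)) / (2 * a)
        return t if t > 0 else None

    width, height = 16, 8
    sphere_centre, sphere_radius = (0.0, 0.0, -3.0), 1.0
    for y in range(height):
        row = ""
        for x in range(width):
            # Map the pixel to a direction through an imaginary image plane at z = -1.
            u = (x + 0.5) / width * 2 - 1
            v = 1 - (y + 0.5) / height * 2
            ray_dir = (u, v, -1.0)
            row += "#" if hit_sphere((0, 0, 0), ray_dir, sphere_centre, sphere_radius) else "."
        print(row)

A full ray tracer would, at each hit point, spawn the secondary (shadow, reflection and
refraction) rays described above instead of simply printing a character.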

Gouraud Shading:

In Gouraud Shading, the intensity at each vertex of the polygon is first calculated by
applying equation 1.7. The normal N used in this equation is the vertex normal which is
calculated as the average of the normals of the polygons that share the vertex. This is an
important feature of Gouraud Shading: the vertex normal is an approximation to the true
normal of the surface at that point. The intensities at the ends of each scan line are then
interpolated from the vertex intensities, and the intensities along the scan line from these
edge values.
The interpolation equations are as follows. For a scan line at height y_s that crosses the
polygon edges 1-2 and 1-3 (vertex intensities I_1, I_2, I_3 at heights y_1, y_2, y_3), the
intensities at the two edge intersections are

    I_a = I_1 + (I_2 - I_1)(y_s - y_1)/(y_2 - y_1)
    I_b = I_1 + (I_3 - I_1)(y_s - y_1)/(y_3 - y_1)

and the intensity at a pixel x_s between the intersections x_a and x_b is

    I_s = I_a + (I_b - I_a)(x_s - x_a)/(x_b - x_a)

For computational efficiency these equations are often implemented as incremental
calculations. The intensity of one pixel can be calculated from the previous pixel
according to the increment of intensity:

    ΔI = (I_b - I_a)/(x_b - x_a),  so that  I_(x+1) = I_x + ΔI        (2.3)
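A minimal Python sketch of this incremental scan-line interpolation (the edge positions and
intensities below are invented example values) might look like this:

    # Incremental Gouraud interpolation along one scan line (illustrative sketch).
    def gouraud_scanline(x_a, i_a, x_b, i_b):
        """Yield (x, intensity) for each pixel between the two edge intersections."""
        if x_b == x_a:
            yield x_a, i_a
            return
        delta = (i_b - i_a) / (x_b - x_a)   # constant intensity increment, as in (2.3)
        intensity = i_a
        for x in range(x_a, x_b + 1):
            yield x, intensity
            intensity += delta

    # Example: edge intensities 0.2 and 0.8 across pixels 10..20.
    for x, i in gouraud_scanline(10, 0.2, 20, 0.8):
        print(x, round(i, 3))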

Phong Shading:

Phong Shading overcomes some of the disadvantages of Gouraud Shading, and specular
reflection can be successfully incorporated in the scheme. The first stage in the process is
the same as for Gouraud Shading: for each polygon we evaluate the vertex normals. For
each scan line in the polygon we evaluate, by linear interpolation, the normal vectors at the
ends of that line. These two vectors, N_a and N_b, are then used to interpolate N_s. We
thus derive a normal vector for each point or pixel on the polygon that is an approximation
to the real normal on the curved surface approximated by the polygon. N_s, the
interpolated normal vector, is then used in the intensity calculation. The vector
interpolation tends to restore the curvature of the original surface that has been
approximated by a polygon mesh. We have (using the same vertex and scan-line notation
as in the Gouraud interpolation above):

    N_a = N_1 + (N_2 - N_1)(y_s - y_1)/(y_2 - y_1)
    N_b = N_1 + (N_3 - N_1)(y_s - y_1)/(y_3 - y_1)
    N_s = N_a + (N_b - N_a)(x_s - x_a)/(x_b - x_a)

These are vector equations that would each be implemented as a set of three equations, 
one for each of the components of the vectors in world space. This makes the Phong 
Shading interpolation phase three times as expensive as Gouraud Shading. In addition 
there is an application of the Phong model intensity equation at every pixel. The 
incremental computation is also used for the interpolation of the normal vector along a
scan line:

    ΔN = (N_b - N_a)/(x_b - x_a),  so that  N_(x+1) = N_x + ΔN
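A small Python sketch of the per-pixel normal interpolation (with invented end-of-scan-line
normals and light direction) shows why Phong Shading is more expensive: the interpolated
normal must be renormalized and fed through the intensity model at every pixel.

    # Per-pixel normal interpolation for Phong shading (sketch with invented data).
    def lerp3(a, b, t):
        return tuple(ai + (bi - ai) * t for ai, bi in zip(a, b))

    def normalize(v):
        length = sum(x * x for x in v) ** 0.5
        return tuple(x / length for x in v)

    def phong_scanline(x_a, n_a, x_b, n_b, light_dir):
        l = normalize(light_dir)
        for x in range(x_a, x_b + 1):
            t = (x - x_a) / (x_b - x_a) if x_b != x_a else 0.0
            n = normalize(lerp3(n_a, n_b, t))     # interpolated, renormalized normal
            diffuse = max(sum(ni * li for ni, li in zip(n, l)), 0.0)
            yield x, diffuse                      # intensity model applied per pixel

    for x, i in phong_scanline(0, (0, 0.5, 1), 8, (0, -0.5, 1), (0, 0, 1)):
        print(x, round(i, 3))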

Color Models in Computer Graphics
YIQ

YIQ is the color space used by the NTSC color TV system, employed mainly in North
and Central America, and Japan. In the U.S., it is federally mandated for analog
over-the-air TV broadcasting under part 73 of the FCC rules and regulations, the "TV
transmission standard".
I stands for in-phase, while Q stands for quadrature, referring to the components used in
quadrature amplitude modulation. Some forms of NTSC now use the YUV color space,
which is also used by other systems such as PAL.
The Y component represents the luma information, and is the only component used by
black-and-white television receivers. I and Q represent the chrominance information. In
YUV, the U and V components can be thought of as X and Y coordinates within the color
space. I and Q can be thought of as a second pair of axes on the same graph, rotated 33°;
therefore IQ and UV represent different coordinate systems on the same plane.
The YIQ system is intended to take advantage of human color-response characteristics.
The eye is more sensitive to changes in the orange-blue (I) range than in the purple-green
range (Q) — therefore less bandwidth is required for Q than for I. Broadcast NTSC limits
I to 1.3 MHz and Q to 0.4 MHz. I and Q are frequency interleaved into the 4 MHz Y
signal, which keeps the bandwidth of the overall signal down to 4.2 MHz. In YUV
systems, since U and V both contain information in the orange-blue range, both
components must be given the same amount of bandwidth as I to achieve similar color
fidelity.
Very few television sets perform true I and Q decoding, due to the high cost of such an
implementation. Compared to the cheaper R-Y and B-Y decoding, which requires only one
filter, I and Q each require a different filter to satisfy the bandwidth differences between I
and Q. These bandwidth differences also require that the 'I' filter include a time delay to
match the longer delay of the 'Q' filter. The Rockwell Modular
Digital Radio (MDR) was one I and Q decoding set, which in 1997 could operate in
frame-at-a-time mode with a PC or in realtime with the Fast IQ Processor (FIQP). Some
RCA "Colortrak" home TV receivers made circa 1985 not only used I/Q decoding, but
also advertised its benefits along with its comb filtering benefits as full "100 percent
processing" to deliver more of the original color picture content. Earlier, more than one
brand of color TV (RCA, Arvin) used I/Q decoding in the 1954 or 1955 model year on
models utilizing screens about 13 inches (measured diagonally). The original Advent
projection television used I/Q decoding. Around 1990, at least one manufacturer
(Ikegami) of professional studio picture monitors advertised I/Q decoding.
RGB
There are many models used to measure and describe color. The RGB color model is 
based on the theory that all visible colors can be created using the primary additive colors 
red,   green   and   blue.   These   colors   are   known   as   primary   additives   because   when 
combined in equal amounts they produce white. When two or three of them are combined 
in different amounts, other colors are produced. For example, combining red and green in 
equal  amounts  creates  yellow, green and blue creates  cyan, and red and blue creates 
magenta. As you change the amounts of red, green and blue you are presented with new
colors. Additionally, when none of these primary additive colors is present you get
black.
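A tiny sketch of additive mixing with 8-bit channel values (purely illustrative) reproduces
the combinations described above:

    # Additive mixing of 8-bit RGB primaries (channel values are capped at 255).
    def add_rgb(c1, c2):
        return tuple(min(a + b, 255) for a, b in zip(c1, c2))

    red, green, blue = (255, 0, 0), (0, 255, 0), (0, 0, 255)
    print(add_rgb(red, green))                  # (255, 255, 0)   -> yellow
    print(add_rgb(green, blue))                 # (0, 255, 255)   -> cyan
    print(add_rgb(red, blue))                   # (255, 0, 255)   -> magenta
    print(add_rgb(add_rgb(red, green), blue))   # (255, 255, 255) -> white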

RGB Color in Graphic Design

The RGB model is so important to graphic design because it is used in computer
monitors. The screen you are reading this very article on is using additive colors to
display   images   and   text.   Therefore,   when   designing   websites   (and   other   on­screen 
projects   such   as   presentations),   the   RGB   model   is   used   because   the   final   product   is 
viewed on a computer display.

Types of RGB Color Spaces
Within the RGB model are different color spaces, and the two most common are sRGB 
and   Adobe   RGB.   When   working   in   a   graphics   software   program   such   as   Adobe 
Photoshop or Illustrator, you can choose which setting to work in.

sRGB:  The sRGB space is best to use when designing for the web, as it is what most 
computer monitors use.

Adobe RGB: Because the Adobe RGB space contains a larger selection of colors that are 
not available in the sRGB space, it is best to use when designing for print. It is also 
recommended for use with photos taken with professional digital cameras (as opposed to 
consumer­level), because high­end cameras often use the Adobe RGB space. 

CMY and CMYK Color in Graphic Design

The CMY (Cyan, Magenta, Yellow) color model is used in printing, where layers of
colors are removed or filtered out. Combining the primary colors in equal parts creates
black.

Even though the whole color gamut can be created in CMY, black is often not produced by
this system, since the color created does not become a true black. The reason for this is
that the colored inks always contain minor impurities.

The CMYK color model (process color, four color) is a subtractive color model, used
in color printing, and is also used to describe the printing process itself. CMYK refers to
the four inks used in some color printing: cyan, magenta, yellow, and key (black). Though
it varies by print house, press operator, press manufacturer and press run, ink is typically
applied in the order of the abbreviation.
The “K” in CMYK stands for key since in four-color printing cyan, magenta, and yellow
printing plates are carefully keyed or aligned with the key of the black key plate. Some
sources suggest that the “K” in CMYK comes from the last letter in "black" and was
chosen because B already means blue.[1][2] However, this explanation, though plausible
and useful as a mnemonic, is incorrect.[3]
The CMYK model works by partially or entirely masking colors on a lighter, usually
white, background. The ink reduces the light that would otherwise be reflected. Such a
model is called subtractive because inks “subtract” brightness from white.
In additive color models such as RGB, white is the “additive” combination of all primary
colored lights, while black is the absence of light. In the CMYK model, it is the opposite:
white is the natural color of the paper or other background, while black results from a full
combination of colored inks. To save money on ink, and to produce deeper black tones,
unsaturated and dark colors are produced by using black ink instead of the combination
of cyan, magenta and yellow.
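A naive RGB-to-CMYK conversion that pulls the common grey component into the key
(black) ink can be sketched as follows; real print workflows use colour profiles rather than
this simple formula, so treat it only as an illustration of the subtractive idea:

    # Naive RGB (0..1) -> CMYK conversion with grey-component replacement by K.
    def rgb_to_cmyk(r, g, b):
        c, m, y = 1 - r, 1 - g, 1 - b        # subtractive complements of the primaries
        k = min(c, m, y)                     # pull the shared grey into the key ink
        if k == 1.0:
            return 0.0, 0.0, 0.0, 1.0        # pure black
        return (c - k) / (1 - k), (m - k) / (1 - k), (y - k) / (1 - k), k

    print(rgb_to_cmyk(1.0, 0.0, 0.0))   # red  -> (0, 1, 1, 0)
    print(rgb_to_cmyk(0.2, 0.2, 0.2))   # grey -> (0, 0, 0, 0.8): black ink only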

HSL and HSV Color Model


HSL and HSV are the two most common cylindrical-coordinate representations of points
in an RGB color model, which rearrange the geometry of RGB in an attempt to be more
perceptually relevant than the Cartesian representation. They were developed in the 1970s
for computer graphics applications, and are used for color pickers, in color-modification
tools in image editing software, and less commonly for image analysis and computer
vision.
HSL stands for hue, saturation, and lightness, and is often also called HLS. HSV stands
for hue, saturation, and value, and is also often called HSB (B for brightness). A third
model, common in computer vision applications, is HSI, for hue, saturation, and
intensity. Unfortunately, while typically consistent, these definitions are not standardized,
and any of these abbreviations might be used for any of these three or several other
related cylindrical models.
In each cylinder, the angle around the central vertical axis corresponds to “hue”, the
distance from the axis corresponds to “saturation”, and the distance along the axis
corresponds to “lightness”, “value” or “brightness”. Note that while “hue” in HSL and
HSV refers to the same attribute, their definitions of “saturation” differ dramatically.
Because HSL and HSV are simple transformations of device-dependent RGB models, the
physical colors they define depend on the colors of the red, green, and blue primaries of
the device or of the particular RGB space, and on the gamma correction used to represent
the amounts of those primaries. Each unique RGB device therefore has unique HSL and
HSV spaces to accompany it, and numerical HSL or HSV values describe a different
color for each basis RGB space.[1]
Both of these representations are used widely in computer graphics, and one or the other
of them is often more convenient than RGB, but both are also commonly criticized for
not adequately separating color-making attributes, or for their lack of perceptual
uniformity. Other more computationally intensive models, such as CIELAB or
CIECAM02, better achieve these goals.
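Python's standard colorsys module can be used to see how the two models describe the
same RGB triple, and in particular how their saturation values differ (the sample colour is
arbitrary):

    # HSV and HSL views of the same device RGB triple, using the standard library.
    import colorsys

    r, g, b = 0.8, 0.3, 0.3                        # a desaturated red in 0..1 RGB
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    h2, l, s2 = colorsys.rgb_to_hls(r, g, b)       # note the H, L, S return order
    print("HSV:", round(h * 360), round(s, 2), round(v, 2))
    print("HSL:", round(h2 * 360), round(s2, 2), round(l, 2))

Both calls report the same hue (0 degrees, i.e. red), but the saturation and the
lightness/value components differ, as described above.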

UNIT 5

Introduction
Multimedia is media and content that uses a combination of different content forms. The 
term can be used as a noun (a medium with multiple content forms) or as an adjective 
describing a medium as having multiple content forms. The term is used in contrast to 
media which only use traditional forms of printed or hand­produced material. Multimedia 
includes a combination of  text,  audio,  still images,  animation,  video, and  interactivity 
content forms.
Multimedia is usually recorded and played, displayed or accessed by information content 
processing devices, such as computerized and electronic devices, but can also be part of a 
live performance.  Multimedia  (as an adjective) also describes  electronic media  devices 
used to store and experience multimedia content. Multimedia is distinguished from mixed 
media in fine art; by including audio, for example, it has a broader scope. The term "rich 
media" is synonymous for  interactive multimedia.  Hypermedia  can be considered one 
particular multimedia application.

Usage

Multimedia   finds   its   application   in   various   areas   including,   but   not   limited   to, 
advertisements,  art,  education,  entertainment,  engineering,  medicine,  mathematics, 
business,  scientific research  and  spatial temporal applications. Several examples are as 
follows:

Creative industries

Creative industries  use multimedia for a variety of purposes ranging from fine arts, to 
entertainment, to commercial art, to journalism, to media and software services provided 
for any of the industries listed below. An individual multimedia designer may cover the 
spectrum throughout their career. Requests for their skills range from technical, to
analytical, to creative.

Commercial
Much of the electronic  old  and  new media  used by commercial artists is multimedia. 
Exciting presentations are used to grab and keep attention in  advertising. Business to 
business, and interoffice communications are often developed by creative services firms 
for advanced multimedia presentations beyond simple slide shows to sell ideas or liven­
up training. Commercial multimedia developers may be hired to design for governmental 
services and nonprofit services applications as well.

Entertainment and fine arts
In   addition,   multimedia   is   heavily   used   in   the   entertainment   industry,   especially   to 
develop  special   effects  in   movies   and   animations.   Multimedia   games   are   a   popular 
pastime and are software programs available either as CD­ROMs or online. Some video 
games also use multimedia features. Multimedia applications that allow users to actively 
participate   instead   of   just   sitting   by   as   passive   recipients   of   information   are   called 
Interactive Multimedia.

Education

In Education, multimedia is used to produce computer­based training courses (popularly 
called CBTs) and reference books like encyclopedia and almanacs. A CBT lets the user 
go   through   a   series   of   presentations,   text   about   a   particular   topic,   and   associated 
illustrations  in various  information  formats.  Edutainment  is an informal  term used  to 
describe combining education with entertainment, especially multimedia entertainment.

Engineering

Software   engineers  may   use   multimedia   in  Computer   Simulations  for   anything   from 
entertainment to training such as military or industrial training. Multimedia for software 
interfaces are often done as a collaboration between creative professionals and software 
engineers.

Industry

In   the  Industrial   sector,   multimedia   is   used   as   a   way   to   help   present   information   to 
shareholders,   superiors   and   coworkers.   Multimedia   is   also   helpful   for   providing 
employee   training,   advertising   and   selling   products   all   over   the   world   via   virtually 
unlimited web-based technology.

Multi Media Hardware
VIDEO CAMERAS
With the right adapters, software, and hardware, camcorders and digital video cameras 
can be used to capture full-motion images. Although regular camcorders store video on
tape, digital video cameras store images as digital data. This enables the digital images to
be transferred directly  into the product being created.  Digital video cameras  range in 
price   from   under   a   hundred   dollars   for   small   desktop   cameras   like   the   Connectix 
QuickCam, to thousands of dollars for higher­end equipment.

Digital video cameras offer an inexpensive means of getting images into your computer;
however, you should be aware that the resolution is often quite low and the color is
sometimes questionable.
DIGITAL CAMERAS
Digital cameras allow you to take pictures just as you would with a regular camera, but 
without film developing and processing. Unlike regular cameras, photographs are not 
stored on film but are instead stored in a digital format on magnetic disk or internal 
memory. The photographs can be immediately recognized by the computer and added to 
any multimedia product.
SCANNERS
Scanners digitize already developed images, including photographs, drawings, and pages of
text. By converting these images to a digital format, they can be interpreted and
recognized by the microprocessor of the computer. A convenient way of scanning larger
images is to use a page or flatbed scanner. These scanners look like small photocopiers. 
Page   scanners   are   either   gray­scale   scanners   that   work   well   with   black­and­white 
photographs or color scanners that can record millions of colors.

GRAPHICS TABLET

A graphics tablet is similar to a digitizing tablet; however, it contains additional
characters and commands. Like the digitizing tablet, each location on the graphics tablet 
corresponds to a specific location on the screen.

MICROPHONES

As is true with most equipment, not all microphones are created equal. If you are
planning to use a microphone for input, you will want to purchase a superior, high­quality 
microphone because your recordings will depend on its quality.

Next to the original sound, the microphone is the most important factor in any sound 
system. The microphone is designed to pick up and amplify incoming acoustic waves or 
harmonics precisely and correctly and convert them to electrical signals. Depending on 
its sensitivity, the microphone will pick up the sound of someone's voice, sound from a 
musical instrument, and any other sound that comes to it. Regardless of the quality of the 
other audio­system components, the true attributes of the original sound are forever lost if 
the microphone does not capture them.

Macintosh computers  come with a built­in microphone, and more and more PCs that 
include Sound Blaster sound cards also include a microphone. These microphones are 
generally adequate for medium-quality sound recording of voiceovers and narration.
These microphones are not adequate for recording music.

MIDI HARDWARE 

MIDI (Musical Instrument Digital Interface) is a standard that was agreed upon by the 
major   manufacturers   of   musical   instruments.   The   MIDI   standard   was   established   so 
musical instruments could be hooked together and could thereby communicate with one 
another.

To communicate, MIDI instruments have an "in" port and an "out" port that enables them 
to be connected to one another. Some MIDI instruments also have a "through" port that 
allows several MIDI instruments to be daisy chained together.

STORAGE

Multimedia   products   require   much   greater   storage   capacity   than   text­based   data.   All 
multimedia authors soon learn that huge drives are essential for the enormous files used 
in multimedia and audiovisual creation. Floppy diskettes really aren't useful for storing 
multimedia  products. Even small presentations  will quickly consume the 1.44 MB of 
storage allotted to a high­density diskette.

In addition to a hefty storage capacity, a fast drive is also important. This is because large 
files, even if they are compressed, take a long time to load and a long time to save and 
back   up.   Consequently,   if   the   drive   is   slow,   frustration   and   lost   productivity   will 
undoubtedly follow. When purchasing a storage medium, consider the speed of the
device (how fast it can retrieve and save large files) as well as the size of its storage
capacity.

OPTICAL DISKS

Optical storage offers much higher storage capacity than magnetic storage. This makes it 
a much better medium for storing and distributing multimedia products that are full of 
graphics, audio, and video files. In addition, reading data with lasers is more precise.
Therefore, when working with multimedia, optical storage media such as Magneto-Optical
disks (MO) and CD-ROMs are more common than magnetic media. Digital
Versatile   Disk   (DVD),   a   newer   optical   storage   medium   with   even   greater   storage 
capacity than a CD, will probably take the place of these other optical media within the 
next few years.

CDS
CD­ROM   stands   for   compact   disk   read   only   memory.   A   CD­ROM   can   hold   about 
650MB of data. Because CDs provide so much storage capacity, they are ideal for storing 
large   data   files,   graphics,   sound,   and   video.   Entire   references   such   as   encyclopedias 
complete   with   text   and   graphics,   as   well   as   audio   and   video   to   further   enhance   the 
information, can be stored on one CD­ROM. In addition, interactive components that 
enable the user to respond to and control the medium ensure that the user will be even 
more attentive and likely to retain information. For these reasons, CDs have been the 
medium of choice for publishing multimedia applications.

Because a CD­ROM is the most common type of optical disk, computers  sold today 
include a CD­ROM drive as standard equipment. In fact, in order to have a multimedia 
personal computer based on the standards set by the MPC, you must have a CD­ROM 
drive. Therefore, when considering the purchase of a multimedia computer, the important 
consideration in regard to the CD­ROM drive is the speed of transfer.

CD­ROM speed is measured in kilobytes (KB) per second. This refers to the speed at 
which data is transferred from the CD to the computer processor or monitor. Double 
speed (2x) CD­ROM drives can transfer data at a rate of 300 KB per second, quadruple 
speed (4x) can transfer data at a rate of 600 KB per second, and so on up to 24x and 
higher.
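As a rough worked example, taking the usual single-speed base rate of 150 KB per second:
a 4x drive transfers about 600 KB per second, so reading a full 650 MB disc would take
roughly 650 x 1024 / 600, or about 1,100 seconds (a little under 20 minutes), while a 24x
drive at about 3,600 KB per second needs only around 3 minutes, assuming the drive can
sustain its rated speed.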

DVDS

DVD, which in some places stands for Digital Versatile Disk, but really doesn't stand for 
anything,   is   the   newest   and   most   promising   multimedia   storage   and   distribution 
technology.   DVD   technology   offers   the   greatest   potential   to   multimedia   because   its 
storage capacity is extensive.

DVDs are the same size as CDs, but they offer much more storage capacity. DVDs are
either single or double sided. A double-sided DVD is actually two single DVDs glued
together. By using more densely packed data pits together with more closely spaced
tracks, DVDs can store tremendous amounts of data. DVD disk types and capacities
include the following four:
• DVD-5: one layer, one side - max. capacity about 4.7GB.
• DVD-9: two layers, one side - max. capacity about 8.5GB.
• DVD-10: one layer, dual sided - max. capacity about 9.4GB.
• DVD-18: two layers, dual sided - max. capacity about 17GB.
MONITORS 

The image displayed on the computer monitor depends on the quality of the monitor and 
software, as well as the capability of the video adapter card. For multimedia applications, 
it is critical that all of these elements work together to create high quality graphic images. 
However, because not all display systems are the same, you will have very little control over how
your images appear on other people's systems. Consequently, it is a good idea to test your 
projects on different display systems to see how your digital images appear.

When purchasing a computer monitor to be used with multimedia applications, you will 
want to consider purchasing a larger screen. Screen sizes are measured along the diagonal 
and range in size from eight to more than 50 inches. You will probably want at least a 17­
inch monitor. Though this larger monitor will cost you a bit more, it will prove well 
worth it if you intend to spend any time at all either designing or otherwise working with 
multimedia   applications.   In   fact,   after   you   have   spent   some   time   working   with 
multimedia applications, you may even want to consider purchasing two monitors if you 
are using a Macintosh or PC setup that will support two monitors.

The number of colors that the monitor can display is also important. The number of colors
is dependent on the amount of memory installed on the video board as well as the monitor
itself. The number of colors a monitor can display varies as listed below:
A 4-bit system will display 16 different colors.
An 8-bit system will display 256 different colors.
A 16-bit system will display 65,536 different colors.
A 24-bit system will display more than 16 million different colors.

Most monitors can display at least 256 colors (8­bit), which is probably adequate for 
multimedia presentations, particularly if the presentation is delivered via the Web, but it 
may not be adequate for video. Eight­bit images are the most compatible across multiple 
platforms  and they also take up very little disk space. Computer monitors capable of 
displaying thousands of colors (16­bit) are quickly becoming the multimedia standard. 
Images on these displays not only look better, they also display much faster.
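The figures above follow directly from the bit depth: an n-bit display can show 2 to the
power n colours, and the same number drives how much video memory the frame buffer
needs. A small Python sketch (the 1024 x 768 resolution is just an example value):

    # Colours available at a given bit depth, and the frame-buffer memory it implies.
    def colours(bits):
        return 2 ** bits

    def framebuffer_bytes(width, height, bits):
        return width * height * bits // 8

    print(colours(8), colours(16), colours(24))            # 256, 65536, 16777216
    print(framebuffer_bytes(1024, 768, 24) / 2**20, "MB")  # 2.25 MB for true colour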
Data and File format standards
Graphic images may be stored in a wide variety of file formats. In choosing a format, you 
should consider how and where the image will be used. This is because the application 
must support the file format you select. Some formats are proprietary while others have 
become universally supported by the graphics industry. Though proprietary formats may 
function   perfectly   in   their   own   environment,   their   lack   of   compatibility   with   other 
systems can create problems.

In the Macintosh environment, the PICT format, a vector­based file format, is the image 
format   supported   by   almost   all   Macintosh   applications.   Recently,   the   Windows 
environment has standardized on the BMP file format. Prior to this, there were multiple 
file formats under DOS that made it difficult to transfer graphic files from one application 
to another. The most common file formats are described below.

TIFF (TAGGED IMAGE FILE FORMAT)

The TIFF file format is probably the most widely used bitmapped file format. Image­
editing applications, scanning software, illustration programs, page­layout programs, and 
even word processing programs support TIFF files. The TIFF format works for all types 
of images and supports bit depths from 1 to 32 bits. In addition, TIFF is cross platform. 
Versions are available for the Mac, PC, and UNIX systems. The TIFF file format is often 
used when the output is printed.

BMP (SHORT FOR BITMAP)

The BMP format has been adopted as the standard bitmapped format on the Windows 
platform. It is a very basic format supported by most Windows applications. It is also the 
most efficient format to use with Windows.

GIF (GRAPHICS INTERCHANGE FORMAT)

CompuServe created this format. Consequently, you may see it listed as CompuServe 
GIF. It is one of two standard formats used on the Web without plug­ins. It is the method 
of storing bitmaps on the Web. The GIF format only supports up to 256 colors.

PICT/PICT2 (SHORT FOR PICTURE)

These are formats for the Macintosh. They are generally used only for screen display. 
Some Mac programs can only import images saved as either PICT or EPS. Unlike the 
EPS format, PICT does not provide information for separations, which means graphics 
saved with this file format will be smaller than EPS files. PICT2 added additional levels 
of color to the PICT format.

JPEG (JOINT PHOTOGRAPHIC EXPERTS GROUP)
This   format   creates   a   very   compact   file.   Because   of   its   small   file   size,   it   is   easy   to 
transmit   across   networks.   Consequently,   it   is   one   of   only   two   graphic   file   formats 
supported by the World Wide Web without plug­ins. Do keep in mind that in order to 
make the file so small, lossy compression is used when a file is saved or converted to this 
format. This means some image information is permanently discarded. JPEG files are
bitmapped images.
CD (PHOTO CD)

This is Kodak's Photo CD graphics file format. It is a bitmapped format that contains 
different image sizes for each photograph.

This is only a small sample of all of the different graphic file formats available. There are 
also   many   proprietary   formats.   If   you   plan   to   transfer   files   from   application   to 
application,   consider   using   the   most   common   file   format   supported   by   all   of   your 
applications. If you get stuck because an application does not support a graphics file 
format, graphic conversion software is available to help you change the file format so that 
you can import and export graphic images from almost any application to another.
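For instance, if the Pillow imaging library is available in a Python environment, a basic
format conversion can be scripted in a few lines; the file names below are placeholders and
this is only a sketch of the idea, not a recommendation of a particular tool:

    # Convert a TIFF image to JPEG and GIF with Pillow (assumes Pillow is installed
    # and that the source file exists; names are illustrative only).
    from PIL import Image

    img = Image.open("artwork.tiff")
    img.convert("RGB").save("artwork.jpg", quality=85)   # lossy JPEG, small file
    img.convert("P").save("artwork.gif")                 # palettized 256-colour GIF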

Keep in mind, that in most situations, commercial image providers are only selling the 
rights to use the image, they are not selling the image itself. In other words, they may sell 
you the right to use the image in one multimedia application, but the image does not 
become your property. If you want to use it again in a different multimedia application, 
you may very well have to pay another royalty. The agreements vary depending on the 
image, the original artist, and the commercial image provider. Take caution and read the 
licensing agreement carefully before you include an image from a CD or the Web in 
a multimedia application. Just because you purchased the CD or were given access to the 
image on the Web, that doesn't necessarily mean you own it.
Audio in Multi Media

Audio has long fought for equal billing with video. With the acceptance of stereo
sound, home theaters, and surround sound, audio has made great strides in the traditional
video world. But the battle is being fought all over again in the multimedia world of 
QuickTime   and   Video   for   Windows.   We   worry   about   the   smallest   detail   of   video 
compression methods, data rates, and color palettes, but all too often handle the audio as 
an afterthought. While there are some limitations to what we can achieve with audio in 
multimedia applications, proper care can yield far better results than the default case we 
often settle for. 
The familiar audio CD is composed of 16-bit samples at a 44.1 KHz sample rate. While this
rate is supported by the latest computer sound cards, handling that much audio data can 
tax   even   high­end   systems   and   reduce   video   performance.   Remember   that   when 
establishing the target data rate for compressed video, the audio data rate must be
subtracted from the total, with whatever is left over available for video. Every bit of extra
quality we give to the audio side comes straight out of the video side. Simply throwing 
more data at the audio to improve its quality is not a good solution. That stereo CD­
quality audio we all want needs 176.4 K bytes/second, more than we usually allocate to 
the combination of audio and video together! 
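The 176.4 KB/second figure comes straight from multiplying the sample rate by the sample
size and the number of channels; a tiny Python calculation (values taken from the text)
makes the trade-off concrete:

    # Uncompressed audio data rate: sample rate x bytes per sample x channels.
    def audio_rate_bytes_per_sec(sample_rate_hz, bits_per_sample, channels):
        return sample_rate_hz * (bits_per_sample // 8) * channels

    print(audio_rate_bytes_per_sec(44100, 16, 2))   # 176400  (CD-quality stereo)
    print(audio_rate_bytes_per_sec(22050, 8, 1))    # 22050   (8-bit mono, 22.05 kHz)
    print(audio_rate_bytes_per_sec(11025, 8, 1))    # 11025   (8-bit mono, 11.025 kHz)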
The most common sample rates used for audio in the multimedia environment are 22.050 
KHz and 11.025 KHz, both submultiples of the 44.1 KHz CD rate. Lower rates are also 
available, but are really only useful for low­quality voice. The sample rate we choose 
determines the maximum frequency that can be reproduced. Sampling theory tells us that 
the maximum accurately reproducible frequency can be no more than half the sample 
rate. This "half the sample rate" frequency is known as the Nyquist limit. Keep in mind, 
however, that this is the theoretical maximum; in the real world many factors conspire to 
keep you from actually getting to that limit. 
The sample size will be either 8­ or 16­bits. The most universal is 8­bit, but most sound 
cards sold today are 16­bit, slowly pushing the 8­bit cards out of the installed base. The 
sample size determines both the maximum dynamic range and the signal-to-noise ratio of
the samples. While 16-bit audio has a respectable 98 dB theoretical SNR, 8 bits yields less
than 50 dB SNR.
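For reference, these figures follow from the usual rule of thumb that quantisation
signal-to-noise ratio is roughly 6.02n + 1.76 dB for n-bit samples: 6.02 x 16 + 1.76 is about
98 dB, while 6.02 x 8 + 1.76 is just under 50 dB.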
The quality of your audio digitizing card is also important. Many sound cards add the 
audio input as an afterthought and have serious distortion in their input stages. Also, 
placing  audio  gear into a computer  box filled with digital  signals  invites  all sorts of 
interference and noise problems, particularly if working with microphone level inputs. 
Choose a digitizing card that has proper shielding and a good audio input section, or you 
will limit your results before you even get something digitized. If you are digitizing from 
a microphone, you will be better off using an external preamp to boost the signal to line 
level before feeding the digitizer card. 
So what can be done when we are forced to use mono, 11 KHz, 8­bit samples because of 
data rate limitations, when we know that we will be limited to a 5.5 KHz frequency range 
and a tiny dynamic range? Not to worry. With proper care in producing the audio, we can 
get surprisingly good results. And if we can allow ourselves to move to a 22 KHz sample 
rate, we can get something darn good. 
The first thing to do is to make sure that no frequencies above the Nyquist limit are ever 
sampled. This means inserting a low­pass filter into the audio before we ever get to the 
digitizing card. Adjust it so that nothing above the Nyquist limit will be passed through. 
Remember that audio filters are analog devices, so set the cutoff frequency somewhat 
below the Nyquist limit to allow for the slope of the filter. 
Next we need to work on the dynamic range of the material. Remember that digital audio 
has no "headroom". Once you hit 0 VU there is no more room to encode audio. If you 
think tape saturation sounds bad, try listening to digital clipping! To avoid clipping, use 
an audio compressor/limiter to compress the audio signal, reducing the dynamic range, 
and to limit it, making sure we never exceed the maximum signal level. Keep in mind 
that this is analog "compression" and is not related to the digital "compression" that we 
do on digitized video and audio data. 
Unlike video, compression of the digital audio data is not common. This is partly because 
the payoff for compression is not as great, what with audio having less data to compress, 
and   also   because   most   compression   algorithms   (A­law,   mu­law,   etc.)   came   from   the 
telephony   industry   and   were   designed   for   voice.   Most   high­fidelity   compression 
algorithms have been proprietary.

Data Compression
In computer science and information theory, data compression or source coding is the 
process of encoding information using fewer bits (or other information­bearing units) 
than an unencoded representation would use, through use of specific encoding schemes.
As with any communication, compressed data communication only works when both the 
sender  and receiver of the  information  understand the encoding scheme. For example, 
this text makes sense only if the receiver understands that it is intended to be interpreted 
as characters representing the English language. Similarly, compressed data can only be 
understood if the decoding method is known by the receiver.
Compression is useful because it helps reduce the consumption of expensive resources, 
such as  hard disk  space or transmission  bandwidth. On the downside, compressed data 
must be decompressed to be used, and this extra processing may be detrimental to some 
applications.   For   instance,   a   compression   scheme   for   video   may   require   expensive 
hardware   for   the   video   to   be   decompressed   fast   enough   to   be   viewed   as   it   is   being 
decompressed (the option of decompressing the video in full before watching it may be 
inconvenient, and requires storage space for the decompressed video). The design of data 
compression schemes therefore involves trade­offs among various factors, including the 
degree of compression, the amount of distortion introduced (if using a lossy compression 
scheme), and the computational resources required to compress and uncompress the data.

Lossless versus lossy compression

Lossless compression algorithms usually exploit statistical redundancy in such a way as 
to   represent   the   sender's   data   more   concisely   without   error.   Lossless   compression   is 
possible because most real­world data has statistical redundancy. For example, in English 
text, the letter 'e' is much more common than the letter 'z', and the probability that the 
letter 'q' will be followed by the letter 'z' is very small. Another kind of compression, 
called lossy data compression or perceptual coding, is possible if some loss of fidelity is 
acceptable.   Generally,   a   lossy   data   compression   will   be   guided   by   research   on   how 
people perceive the data in question. For example, the human eye is more sensitive to 
subtle variations in luminance than it is to variations in color. JPEG image compression 
works   in  part  by  "rounding off"  some  of this   less­important   information.  Lossy data 
compression   provides   a   way   to   obtain   the   best   fidelity   for   a   given   amount   of 
compression. In some cases, transparent (unnoticeable) compression is desired; in other 
cases, fidelity is sacrificed to reduce the amount of data as much as possible.
Lossless   compression   schemes   are   reversible   so   that   the   original   data   can   be 
reconstructed, while lossy schemes accept some loss of data in order to achieve higher 
compression.
However, lossless data compression algorithms will always fail to compress some files; 
indeed, any compression algorithm will necessarily fail to compress any data containing 
no discernible patterns. Attempts to compress data that has been compressed already will 
therefore usually result in an expansion, as will attempts to compress all but the most 
trivially encrypted data.
In practice, lossy data compression will also come to a point where compressing again 
does not work, although an extremely lossy algorithm, like for example always removing 
the last byte of a file, will always compress a file up to the point where it is empty.
An example of lossless vs. lossy compression is the following string:
25.888888888 

This string can be compressed as:
25.[9]8 

Interpreted as, "twenty five point 9 eights", the original string is perfectly recreated, just 
written in a smaller form. In a lossy system, using
26 

instead, the exact original data is lost, at the benefit of a smaller file.
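A lossless scheme in the spirit of the example above is simple run-length encoding; a toy
Python version (the encoded representation is invented here purely for illustration) shows
that the original data is recovered exactly:

    # Toy run-length encoder/decoder: repeated characters become (count, char) pairs.
    def rle_encode(text):
        out, i = [], 0
        while i < len(text):
            j = i
            while j < len(text) and text[j] == text[i]:
                j += 1
            out.append((j - i, text[i]))
            i = j
        return out

    def rle_decode(pairs):
        return "".join(ch * count for count, ch in pairs)

    encoded = rle_encode("25.888888888")
    print(encoded)               # [(1, '2'), (1, '5'), (1, '.'), (9, '8')] - "9 eights"
    print(rle_decode(encoded))   # '25.888888888' recovered exactly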
Example for Lossy and Lossless

Lossy

Lossy image compression is used in digital cameras, to increase storage capacities with 
minimal degradation of picture quality. Similarly,  DVDs  use the lossy  MPEG­2  Video 
codec for video compression.
In lossy audio compression, methods of psychoacoustics are used to remove non­audible 
(or   less   audible)   components   of   the  signal.   Compression   of   human   speech   is   often 
performed with even more specialized techniques, so that "speech compression" or "voice 
coding" is sometimes distinguished as a separate discipline from "audio compression". 
Different audio and speech compression standards are listed under  audio codecs. Voice 
compression is used in Internet telephony for example, while audio compression is used 
for CD ripping and is decoded by audio players.

Lossless

The Lempel­Ziv (LZ) compression methods are among the most popular algorithms for 
lossless storage. DEFLATE is a variation on LZ which is optimized for decompression 
speed and compression ratio, therefore compression can be slow. DEFLATE is used in 
PKZIP,  gzip  and  PNG.  LZW  (Lempel­Ziv­Welch)   is   used   in   GIF   images.   Also 
noteworthy   are   the   LZR   (LZ­Renau)   methods,   which   serve   as   the   basis   of   the   Zip 
method. LZ methods utilize  a table­based compression model where table entries are 
substituted for repeated strings of data. For most LZ methods, this table is generated 
dynamically from earlier data in the input. The table itself is often Huffman encoded (e.g. 
SHRI, LZX). A current LZ­based coding scheme that performs well is  LZX, used in 
Microsoft's CAB format.
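As a quick illustration of DEFLATE-style lossless compression in practice, Python's
standard zlib module (an implementation of DEFLATE) round-trips data exactly; the sample
text is arbitrary:

    # Lossless round trip with DEFLATE via Python's standard zlib module.
    import zlib

    data = b"to be or not to be, that is the question " * 20
    packed = zlib.compress(data, 9)
    print(len(data), "->", len(packed), "bytes")   # repetitive input compresses well
    assert zlib.decompress(packed) == data         # the original is recovered exactly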
The very best compressors use probabilistic models, in which predictions are coupled to 
an algorithm called  arithmetic coding. Arithmetic coding, invented by  Jorma Rissanen, 
and   turned   into   a   practical   method   by   Witten,   Neal,   and   Cleary,   achieves   superior 
compression to the better­known Huffman algorithm, and lends itself especially well to 
adaptive data compression tasks where the predictions are strongly context­dependent. 
Arithmetic   coding   is   used   in   the   bilevel   image­compression   standard  JBIG,   and   the 
document­compression   standard  DjVu.   The   text  entry  system,  Dasher,   is   an   inverse­
arithmetic­coder.
Video

AVI
Audio Video Interleave, known by its acronym AVI, is a multimedia container format 
introduced by Microsoft in November 1992 as part of its Video for Windows technology. 
AVI   files   can   contain   both   audio   and   video   data   in   a   file   container   that   allows 
synchronous audio­with­video playback. Like the DVD video format, AVI files support 
multiple streaming audio and video, although these features are seldom used. Most AVI 
files also use the file format extensions developed by the  Matrox  OpenDML group in 
February 1996. These files are supported by Microsoft, and are unofficially called "AVI 
2.0".
AVI is a derivative of the  Resource Interchange File Format  (RIFF), which divides a 
file's data into blocks, or "chunks." Each "chunk" is identified by a FourCC tag. An AVI 
file takes the form of a single chunk in a RIFF formatted file, which is then subdivided 
into two mandatory "chunks" and one optional "chunk".
The first sub­chunk is identified by the "hdrl" tag. This sub­chunk is the file header and 
contains metadata about the video, such as its width, height and frame rate. The second 
sub­chunk is identified by the "movi" tag. This chunk contains the actual audio/visual 
data   that  make  up the AVI movie.  The third  optional  sub­chunk is  identified  by the 
"idx1" tag which indexes the offsets of the data chunks within the file.
By way of the RIFF format, the audio/visual data contained in the "movi" chunk can be 
encoded   or   decoded   by   software   called   a  codec,   which   is   an   abbreviation   for 
(en)coder/decoder. Upon creation of the file, the codec translates between raw data and 
the (compressed) data format used inside the chunk. An AVI file may carry audio/visual 
data   inside   the   chunks   in   virtually   any   compression   scheme,   including   Full   Frame 
(Uncompressed),   Intel   Real   Time   (Indeo),  Cinepak,  Motion   JPEG,   Editable  MPEG, 
VDOWave, ClearVideo / RealVideo, QPEG, and MPEG­4 Video.

3GP 
3GP  (3GPP   file   format)   is   a  multimedia  container   format  defined   by   the  Third 
Generation Partnership Project (3GPP) for 3G UMTS multimedia services. It is used on 
3G mobile phones but can also be played on some 2G and 4G phones.
3G2 (3GPP2 file format) is a multimedia container format defined by the 3GPP2 for 3G 
CDMA2000 multimedia services. It is very similar to the 3GP file format, but has some 
extensions and limitations in comparison to 3GP.
3GP is defined in the ETSI 3GPP technical specification. 3GP is a required file format for 
video and associated speech/audio media types and timed text in ETSI 3GPP technical 
specifications   for  IP   Multimedia   Subsystem  (IMS),  Multimedia   Messaging   Service 
(MMS),  Multimedia Broadcast/Multicast Service  (MBMS) and Transparent end­to­end 
Packet­switched Streaming Service (PSS).
3G2 is defined in 3GPP2 technical specification.
The 3GP and 3G2 file formats are both structurally based on the  ISO base media file 
format defined in ISO/IEC 14496-12 (MPEG-4 Part 12), but older versions of the 3GP
file format did not use some of its features. 3GP and 3G2 are container formats similar to 
MPEG­4 Part 14 (MP4), which is also based on MPEG­4 Part 12. The 3GP and 3G2 file 
format   were   designed   to   decrease   storage   and   bandwidth   requirements   in   order   to 
accommodate mobile phones.
3GP and 3G2 are similar standards, but with some differences:
• 3GPP file format was designed for GSM-based phones and may have the filename
extension .3gp
• 3GPP2 file format was designed for CDMA-based phones and may have the
filename extension .3g2
The 3GP file format stores video streams as MPEG­4 Part 2 or H.263 or MPEG­4 Part 10 
(AVC/H.264), and audio streams as  AMR­NB,  AMR­WB,  AMR­WB+,  AAC­LC,  HE­
AAC  v1 or Enhanced aacPlus (HE­AAC v2). 3GPP allowed use of AMR and H.263 
codecs in the ISO base media file format (MPEG­4 Part 12), because 3GPP specified the 
usage of the Sample Entry and template fields in the ISO base media file format as well 
as defining new boxes to which codecs refer. These extensions were registered by the 
registration authority for code-points in ISO base media file format ("MP4 Family"
files). For the storage of MPEG-4 media specific information in 3GP files, the 3GP 
specification refers to MP4 and the AVC file format, which are also based on the ISO 
base media file format. The MP4 and the AVC file format specifications described usage 
of MPEG­4 content in the ISO base media file format.
The 3G2 file format can store the same video streams and most of the audio streams used 
in the 3GP file format. In addition, 3G2 stores audio streams as EVRC, EVRC­B, EVRC­
WB, 13K (QCELP), SMV or VMR­WB, which was specified by 3GPP2 for use in ISO 
base media file format. The 3G2 specification also defined some enhancements to 3GPP
Timed Text. The 3G2 file format does not store Enhanced aacPlus (HE-AAC v2) and
AMR-WB+ audio streams. For the storage of MPEG-4 media (AAC audio, MPEG-4 Part 2 
video, MPEG­4 Part 10 ­ H.264/AVC) in 3G2 files, the 3G2 specification refers to the 
MP4 file format and the AVC file format specification, which described usage of this 
content in the ISO base media file format. For the storage of H.263 and AMR content 
3G2 specification refers to the 3GP file format specification.
MPEG
The  Moving Picture Experts Group  (MPEG) is a  working group  of experts that was 
formed by the ISO to set standards for audio and video compression and transmission. It 
was established in 1988 and its first meeting was in May 1988 in Ottawa, Canada. As of 
late 2005, MPEG has grown to include approximately 350 members per meeting from 
various industries, universities, and research institutions. MPEG's official designation is 
ISO/IEC   JTC1/SC29   WG11   ­  Coding   of   moving   pictures   and   audio  (ISO/IEC   Joint 
Technical Committee 1, Subcommittee 29, Working Group 11).
The MPEG compression methodology is considered asymmetric as the encoder is more 
complex than the decoder. The encoder needs to be algorithmic or adaptive whereas the 
decoder is 'dumb' and carries out fixed actions. This is considered advantageous in 
applications such as broadcasting where the number of expensive complex encoders is 
small   but   the   number   of   simple   inexpensive   decoders   is   large.   The   MPEG's   (ISO's) 
approach to standardization is novel, because it is not the encoder that is standardized, but 
the way a decoder interprets the bitstream. A decoder that can successfully interpret the 
bitstream is said to be compliant. The advantage of standardizing the decoder is that over 
time encoding algorithms can improve, yet compliant decoders continue to function with 
them. The MPEG standards give very little information regarding the structure and operation
of the encoder, and implementers can supply encoders using proprietary algorithms. This 
gives   scope   for   competition   between   different   encoder   designs,   which   means   better 
designs can evolve and users have greater choice, because encoders of different levels of 
cost and complexity can exist, yet a compliant decoder operates with all of them.
MPEG also standardizes the protocol and syntax under which it is possible to combine or 
multiplex  audio   data   with   video   data   to   produce   a   digital   equivalent   of   a   television 
program.  Many  such  programs   can  be  multiplexed  and  MPEG  defines  the  way  such 
multiplexes can be created and transported. The definitions include the metadata used by 
decoders to demultiplex correctly.
Image Processing

The YIQ representation is sometimes employed in color image processing
transformations. For example, applying a histogram equalization directly to the channels
in an RGB image would alter the colors in relation to one another, resulting in an image
with colors that no longer make sense. Instead, the histogram equalization is applied to
the Y channel of the YIQ representation of the image, which only normalizes the
brightness levels of the image.
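A rough Python sketch of that pipeline, using the standard library's colorsys for the
colour-space conversion (the tiny "image" and the rank-based equalization are invented for
illustration and stand in for a proper histogram equalization):

    # Equalize only the Y (luma) channel of a small RGB image via YIQ.
    import colorsys

    def equalize_luma(pixels):           # pixels: list of (r, g, b) floats in 0..1
        yiq = [colorsys.rgb_to_yiq(r, g, b) for r, g, b in pixels]
        ys = sorted(y for y, _, _ in yiq)
        n = len(ys)
        # Map each luma value to its normalized rank: a crude equalization that
        # spreads brightness levels while leaving the chrominance untouched.
        rank = {y: idx / (n - 1) for idx, y in enumerate(ys)}
        return [colorsys.yiq_to_rgb(rank[y], ci, cq) for y, ci, cq in yiq]

    image = [(0.2, 0.2, 0.3), (0.25, 0.2, 0.3), (0.8, 0.7, 0.6), (0.3, 0.3, 0.35)]
    for px in equalize_luma(image):
        print(tuple(round(c, 2) for c in px))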
