1
1.1 Definitions
1.2 Sources of Photogrammetric Information
1.3 Types and Uses of Photogrammetry
1.3.1 Aerial vs Terrestrial (Close-range)
1.3.2 Metric vs Non-metric (Semantic)
1.3.3 Analog vs Analytical vs Digital – not included in the syllabus
1.3.4 Stereo vs Single Image (Monoscopic) – not included in the syllabus
1.4 Platform for Photogrammetric Sensing Systems
1.5 History and Developments in the field of Photogrammetry
1.6 Differences between Traditional and Digital Photogrammetry
REFERENCES
Lecture Notes (14 Feb. 2019) Chapter 1 Photogrammetric Concepts
PHOTOGRAMMETRIC CONCEPTS
Contents of this chapter
1 Introductory Concepts: [2 hrs]
1.1 Definitions
1.2 Sources of Photogrammetric Information
1.3 Types and uses of Photogrammetry – aerial and terrestrial, metric and non-metric
1.4 Platform for photogrammetric sensing systems
1.5 History and developments in the field of Photogrammetry
1.6 Differences between traditional and digital photogrammetry
1.1 Definitions
There is no universally accepted definition of photogrammetry. The definition given below
captures the most important notion of photogrammetry (Schenk, 2005).
The name “photogrammetry” is derived from three Greek words: photos, which means
light; gramma, which means letter or something drawn; and metron, to measure (Bhatta,
2011). The most important feature of photogrammetry is the fact that the objects are
measured without being touched (Bhatta, 2011).
Photogrammetry is the science behind the creation of almost every topographic map made
since the 1930s. The human ability to perceive depth is the basis for the science and
technology of photogrammetry. This ability to see in three dimensions is due to the offset
in perspective centers between the left and the right eyes. Photographs taken to mimic
this perspective shift are referred to as stereoscopic or stereo (meaning two). A stereo
pair (or series) of images is taken from consecutive positions, with the images overlapping
each other by at least 60%. Through the use of photogrammetry, highly detailed three-dimensional data
can be derived from the two-dimensional photographs of a stereo pair (Matthews, 2008).
It is also defined as the art, science, and technology of obtaining reliable information about
physical objects, and the environment, through processes of recording, measuring, and
interpreting images and patterns of electromagnetic radiant energy and other phenomena
(McGlone, 1980).
Photogrammetry can be defined as the science and art of determining qualitative and
quantitative characteristics of objects from the images recorded on photographic
emulsions. Objects are identified and qualitatively described by observing photographic
image characteristics such as shape, pattern, tone, and texture. Identification of deciduous
versus coniferous trees, delineation of geologic landforms, and inventories of existing land
use are examples of qualitative observations obtained from photography. The quantitative
characteristics of objects such as size, orientation, and position are determined from
measured image positions in the image plane of the camera taking the photography. Tree
heights, stockpile volumes, topographic maps, and horizontal and vertical coordinates of
unknown points are examples of quantitative measurements obtained from photography
(US Army Corps of Engineers, 1993).
Geometric information involves the spatial position and the shape of objects. It is the
most important information source in photogrammetry.
As indicated in Table 1-1 the remotely sensed objects may range from planets to portions
of the earth’s surface, to industrial parts, historical buildings or human bodies. The generic
name for data acquisition devices is sensor, consisting of an optical and detector system.
The sensor is mounted on a platform. The most typical sensors are cameras where
photographic material serves as detectors. They are mounted on airplanes as the most
common platforms. Table 1-1 summarizes the different objects and platforms and
associates them to different applications of photogrammetry (Schenk, 2005).
Gray scale: very dark tone, dark tone, dark gray, mid tone, light tone, white
Texture: Frequency of tonal change of the object, e.g., smooth, coarse, speckled,
fine, fuzzy, soft; for example, water appears smooth while forest appears coarse
Monoscopic (easier and quicker, flat area) versus Stereoscopic (number of photos, cost,
purpose)
Shadow: Used to determine 3D information from a 2D image, i.e. the shape and height of
an object via its shadow
Shadows can enable objects to be "seen" when the natural conditions may not be
favourable, e.g., power poles (small planimetric extent, can only be identified by shadow),
sand dunes (shadows result in contrast, can see topographic outline)
Appearance: How the surface appears in 3D, generally related to surface consistency and
change in height, e.g., Object (Tall, spherical) and Area (Undulating, flat, rugged)
c) Laser scanning (airborne) and imaging radar (airborne, satellite) offer alternatives, e.g.
for generating surface models, but also for observing other Earth surface phenomena. Like
some other active RS sensors, they record the reflected signals; the range
measurement principle is characterized by frequency, scan rate, and scanning distance.
Advantage of lasers: penetrate forest (to a certain extent).
Advantage of radar: almost no problems with clouds.
d) New commercial Earth observation satellites.
e) UAVs (Unmanned Aerial Vehicles).
Analog Photogrammetry: From 1900 to 1960 the world saw rapid developments in
aviation, mostly as the result of two world wars that occurred during that period. These
developments impacted aerial photogrammetry. Growth in other areas of technology also
occurred during this time, particularly with film cameras. Film allowed people to compare
images to one another, which eventually led to the development of analytical
photogrammetry.
There have been very rapid technological changes in the field of photogrammetry mainly
due to tremendous advancement in information technology and the general development
of science and engineering. Looking back over the last few decades one can distinguish
great developments in several facets of photogrammetry. The general development, in
particular electronics and computer technology, undoubtedly has opened up new advances
in photogrammetry in the areas of instrumentation, methodology, and integration.
Analytical Photogrammetry: This second phase of development began in the 1950s due
to the advent of computers. Many analytical techniques were developed, and
computer-aided photogrammetry and mapping were designed. The first operational photo
triangulation programs became available in the late sixties (Ackermann, Brown, Schut, to
name a few). Another area of development in this period was the generation of DEM and
manual feature extraction. These were also the result of consistent application of computer
technology. In these applications, the operator handles the task of measurement with very
few computer-assisted operations. It is the data processing that has made photo
triangulation, DEM generation, and feature extraction very efficient and reliable
techniques.
Perhaps the most important development in this period was the invention of the analytical
stereo plotter by Helava (1957). The analytical stereo plotter is essentially an instrument
with a built-in digital computer as its main component, which handles the physical and
mathematical relationship between object (ground) space and image space. The analytical
plotters were introduced into the market during the 1976 International Society of
Photogrammetry and Remote Sensing (ISPRS) Congress. Intergraph’s InterMap Analytic
(IMA), a flexible photogrammetric workstation that combines interactive graphics and an
advanced stereo plotter, was introduced in 1986.
ELEMENTARY PHOTOGRAMMETRY
Contents of this chapter
2 Elementary Photogrammetry: [5 hrs]
2.1 Perspective Projection
2.2 Scale and Coverage
2.3 Vanishing Points
2.4 Image Coordinate System
2.5 Relief Displacement
2.6 Parallax
2.7 Stereo
2.7.1 Stereo Vision
2.7.2 Vertical Exaggeration
2.7.3 Stereo Image Separation Technique
2.7.4 Epipolar Planes and Lines
Orthographic projection means that the object is projected onto a plane in a certain
uniform scale by rays perpendicular to that plane. The plane in question is usually parallel
to a predetermined reference or datum plane. In case of a central projection all rays pass
through a common point called the projection or perspective centre. Central projections
are independent of datum planes. They may be horizontal, vertical or tilted (Derenyi,
1982).
(relief) displacements occur (Figure 2-2). If the image plane and the object plane are not
parallel then tilt displacements occur. In both cases the scale of the image will not be
uniform but variable. The main task of photogrammetry is to produce maps from
photographs or in other words to transform the central projection into an orthographic
projection. The geometry of the central projection inside the camera is defined by the
interior orientation elements. Interior orientation establishes the relationship between the
interior projection or perspective centre and the image plane as shown in Figure 2-2
(Derenyi, 1982).
Figure 2-2: Idealized Objects in (a) Orthographic and (b) Perspective Projections
(Derenyi, 1982)
Engineers’ Scale
Engineers’ scale is expressed as a ratio in the form of 1 cm = 100 m, where one unit on
the photo or map represents a number of different units on the ground (Falkner and
Morgan, 2002).
Although the expressions differ, they fundamentally imply the same thing and can be
converted from one form to the other. To convert an RF of 1:500 to engineers’ scale,
assume that both sides are in centimetres, so that 1 cm on the photo or map represents
500 cm on the ground. Dividing 500 cm by 100 cm per metre gives 5 m, so the
resulting engineers’ scale is 1 cm = 5 m.
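The conversion can be sketched in a few lines of Python. The function name `rf_to_engineers_scale` is an illustrative choice rather than anything from the cited sources, and the sketch assumes the map unit is 1 cm and the ground unit is metres.

```python
def rf_to_engineers_scale(rf_denominator: float) -> float:
    """Return the ground distance in metres represented by 1 cm on the map.

    Assumes a representative fraction of 1:rf_denominator.
    """
    ground_cm = rf_denominator * 1.0  # 1 cm on the map is this many cm on the ground
    return ground_cm / 100.0          # 100 cm per metre


print(f"RF 1:500 -> 1 cm = {rf_to_engineers_scale(500):g} m")
```

For an RF of 1:500 the sketch reproduces the 1 cm = 5 m result obtained above.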
Photographic scale
Note that if a correct photographic scale ratio is to be computed using this definition, the
image distance and the ground distance must be measured in parallel horizontal planes.
This condition rarely occurs in practice since the photograph is likely to be tilted and the
ground surface is seldom a flat horizontal plane. Consequently, scale will vary throughout
the format of a photograph, and photographic scale can be defined only at a point (US Army
Corps of Engineers, 1993). Therefore, photographic scale at a point is defined as the ratio
of the focal length of the camera used to the height above terrain at which the photograph
was taken. The focal length is the distance from the optical centre of the lens to the plane
of focus of the light passing through the lens.
$$S = \frac{f}{H - h}$$
Figure 2-3: Photographic Scale
where
f = focal length or principal distance;
H = the flying height above a datum, usually sea level;
h = the height of a terrain point above datum.
Example 2.1 A wide angle camera is used to capture vertical photographs. If the flying
height is 4350 m amsl and the terrain elevation is 550 m, what is the scale of the
photographs?
Solution:
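A minimal sketch of this computation follows. It assumes that the wide angle camera has the 152 mm focal length quoted for wide angle cameras elsewhere in these notes; the function name `scale_denominator` is hypothetical.

```python
def scale_denominator(focal_length_m, flying_height_m, terrain_height_m):
    """Denominator n of the point scale S = f / (H - h) = 1:n."""
    return (flying_height_m - terrain_height_m) / focal_length_m


# Assumption: wide angle camera, f = 152 mm = 0.152 m.
n = scale_denominator(0.152, 4350.0, 550.0)
print(f"S = 1:{n:.0f}")
```

Under that assumption, S = 0.152 / (4350 − 550) = 0.152 / 3800, which corresponds to a scale of 1:25,000.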
With the exception of flat terrain, h is variable, hence the scale will also be variable. When
h_ave is taken as the average height of the terrain, (H − h_ave) becomes the average flying height
above terrain and the average scale of the photograph is obtained (Derenyi, 1982).
In some instances, such as flight planning calculations, approximate scaled distances are
adequate. If all ground points are assumed to lie at an average elevation, an average
photographic scale can be adopted for direct measurements of ground distances. Average
scale is calculated by (US Army Corps of Engineers, 1993):
$$S_{ave} = \frac{f}{H - h_{ave}}$$
where
h_ave = the average height of the terrain above datum
Figure 2-4 Scale of a vertical photograph over variable terrain (Derenyi, 1982)
Since the focal length and the flying height remain constant for a project, the photo scale
can be regarded as a function of the terrain elevation. If the terrain is level, the
photograph has a constant scale. However, if the terrain has varying elevation, the
photograph will exhibit a continuous range of scales over the terrain. Likewise, tilted and
oblique photographs also have non-uniform scale. Furthermore, Earth’s curvature and
atmospheric refraction add to the scale distortion.
Representative fractions (RF) are numerical statements of the map (air photo) to ground
distance. These scale statements may be an actual fraction (one map/air photo unit in the
numerator, and the equivalent number of units of ground distance in the denominator) or
it may be a ratio (one map/air photo unit: equivalent number of ground units). See the
Figure 2-5 for comparisons of different representative fractions.
A verbal scale is a written statement of the map to ground relationship (i.e., “one inch on
the map is equal to 63,360 inches on the ground” or “one centimetre on the map is equal
to one kilometre on the ground” or “an inch to a mile”).
Graphic or bar scales use a series of boxes and/or lines marked off in units of distance
measure. Use of a bar scale is somewhat analogous to using a ruler to measure the
distance between two points. Keep in mind that any map or air photo that is enlarged or
reduced through some type of photocopying will result in a change in scale. Bar scales will
change proportionally, but an RF or verbal scale statement will not.
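The effect of enlargement or reduction on a stated RF can be illustrated with a short sketch. The function name `rf_after_enlargement` and the magnification values are hypothetical; the only assumption is that the copy is scaled uniformly.

```python
def rf_after_enlargement(rf_denominator, magnification):
    """RF denominator of a copy enlarged (>1) or reduced (<1) by `magnification`."""
    if magnification <= 0:
        raise ValueError("magnification must be positive")
    return rf_denominator / magnification


enlarged = rf_after_enlargement(25000, 2.0)  # 2x enlargement of a 1:25,000 photo
reduced = rf_after_enlargement(25000, 0.5)   # reduction to half size
print(f"enlarged copy: 1:{enlarged:.0f}, reduced copy: 1:{reduced:.0f}")
```

A printed RF of 1:25,000 therefore no longer applies to the copy, whereas a bar scale printed on the same sheet is stretched or shrunk with it and stays valid.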
Example 2.2 Vertical photographs are taken with a wide angle camera from a flying
height of 4500 m above mean sea level. (a) What is the scale of the photographs at
points A and B, which lie at elevations of 700 m and 1460 m respectively? (b) What
ground distance corresponds to an image distance of 20.1 mm measured at each of these
elevations?
Solution:
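(a) Scale
$$S_A = \frac{f}{H - h_A} = \frac{0.152}{4500 - 700} = \frac{1}{25000} = 1:25000$$
$$S_B = \frac{f}{H - h_B} = \frac{0.152}{4500 - 1460} = \frac{1}{20000} = 1:20000$$
(b) Distance Measurement
d = 20.1 mm = 0.0201 m
$$D_A = 0.0201\ \text{m} \times 25000 = 502.5\ \text{m}$$
$$D_B = 0.0201\ \text{m} \times 20000 = 402\ \text{m}$$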
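Part (b) can be checked with a short sketch that converts an image distance directly to a ground distance at a given terrain elevation. The function name `ground_distance_m` is an illustrative choice; the image distance and focal length are both in millimetres, so their units cancel.

```python
def ground_distance_m(image_distance_mm, focal_length_mm, flying_height_m, terrain_height_m):
    """Ground distance D = d * (H - h) / f near a point at elevation h."""
    return image_distance_mm * (flying_height_m - terrain_height_m) / focal_length_mm


for name, elevation_m in (("A", 700.0), ("B", 1460.0)):
    distance = ground_distance_m(20.1, 152.0, 4500.0, elevation_m)
    print(f"D_{name} = {distance:.1f} m")
```

The same 20.1 mm image distance covers more ground at the lower point A than at the higher point B, because A is farther from the camera and is therefore imaged at a smaller scale.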
Example 2.3 An agricultural land measures 12.9 cm long and 8.7 cm wide on a vertical
photograph having a scale of 1:25,000. What is the area of the field at the ground level?
Solution:
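Ground Length, $$L = l \times \frac{1}{S} = 0.129\ \text{m} \times 25000 = 3225\ \text{m}$$
Ground Width, $$B = b \times \frac{1}{S} = 0.087\ \text{m} \times 25000 = 2175\ \text{m}$$
Ground Area, $$A = L \times B = 3225\ \text{m} \times 2175\ \text{m} = 7{,}014{,}375\ \text{m}^2 \approx 701.44\ \text{ha}$$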
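Because both photo dimensions are multiplied by the scale denominator, the ground area grows with the square of that denominator. A minimal sketch, with the hypothetical function name `ground_area_m2`:

```python
def ground_area_m2(photo_length_m, photo_width_m, scale_denominator):
    """Ground area of a rectangle measured on a photo of scale 1:scale_denominator."""
    return photo_length_m * photo_width_m * scale_denominator ** 2


area = ground_area_m2(0.129, 0.087, 25000)
print(f"{area:,.0f} m^2 = {area / 10_000:.2f} ha")
```

This assumes a vertical photograph over flat terrain, so that a single scale applies to the whole field.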
2.2.2 Coverage
By coverage we think about ground coverage, spectral coverage and temporal coverage.
Ground coverage implies the size of the area covered by one photograph. It depends
basically on focal length of the camera and the flying height. Therefore, ground coverage
of an aerial photograph changes either by changing the focal length or the flying height.
Depending on the ground coverage of a photograph, the scale of the photograph can be
described as either large scale or small scale (jointly known as relative scale). Relative
scale is used as a measure of viewable detail. Large scale implies that relatively more detail is
visible, and small scale implies that relatively less detail is visible. A map’s scale determines
how a feature will be represented. For example, on a large-scale map, a river might be
represented as a polygon rather than a line, and a city’s extent may be so large that it can only
be accurately represented as a polygon rather than a point.
The focal length together with the flying height determine the photo scale and ground
coverage. The focal length also determines the angle of view (AOV) of camera. The longer
the focal length, the narrower the angle of view. Camera lenses are available in standard
focal lengths ranging from 88 mm (super-wide angle) to 610 mm (narrow angle). The
152 mm lens is the most commonly used (Tempfli et al., 2009).
Figure 2-6: Effect of Focal Length on Ground Coverage (Tempfli et al., 2009)
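The relationship between focal length, angle of view and ground coverage can be sketched as follows. The 230 mm square image format and the 3800 m height above terrain are assumptions made for illustration, the function names are hypothetical, and the angle of view is computed across one side of the format rather than across the diagonal.

```python
import math


def angle_of_view_deg(format_size_mm, focal_length_mm):
    """Angle of view across a format dimension: 2 * atan(size / (2 * f))."""
    return math.degrees(2.0 * math.atan(format_size_mm / (2.0 * focal_length_mm)))


def ground_coverage_m(format_size_mm, focal_length_mm, height_above_terrain_m):
    """Ground length covered by one format dimension of a vertical photo over flat terrain."""
    return format_size_mm * height_above_terrain_m / focal_length_mm


FORMAT_SIDE_MM = 230.0      # assumed square film format
HEIGHT_ABOVE_TERRAIN_M = 3800.0  # assumed flying height above terrain

for f_mm in (88.0, 152.0, 610.0):
    aov = angle_of_view_deg(FORMAT_SIDE_MM, f_mm)
    side = ground_coverage_m(FORMAT_SIDE_MM, f_mm, HEIGHT_ABOVE_TERRAIN_M)
    print(f"f = {f_mm:5.0f} mm: angle of view = {aov:5.1f} deg, ground side = {side:7.0f} m")
```

At a fixed flying height, the shorter focal length gives both a wider angle of view and a larger ground coverage, at the cost of a smaller photo scale.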
Super-wide angle lenses have a focal length of 88 mm (3.5”). The other extreme is the narrow
angle lens with a focal length of 610 mm (24”). Between these two extremes are wide
angle, intermediate angle and normal angle lenses with focal lengths of 153 mm (6”), 213
mm (8.25”) and 305 mm (12”) respectively. Since the film format does not change, the
angle of view (AoV), and hence the ground coverage, changes with the focal length, as does
the scale. Table 2-1 depicts
the most relevant data with different lens configurations (Schenk, 2005).
Table 2-1: Aerial Camera Lenses and Data with Different Lens Assembly (Schenk, 2005)
Super-wide angle lenses are suitable for medium to small scale applications because the
flying height is much lower compared to a normal angle lens (same photo scale assumed).
Thus, atmospheric effects, such as clouds and haze, are much less of a problem. Normal
angle lenses are preferred for large scale applications in urban areas. Here, a super-wide
angle lens would generate many more occluded areas, particularly in built-up areas with
tall buildings (Schenk, 2005). The wide angle camera is the one most commonly used.
The others are used for special purposes, e.g., narrow and normal angles for urban areas;
super-wide angle for mapping large areas at small scale (Derenyi, 1982).
The total area of a project contained within neat models is commonly called stereo
coverage, or more precisely stereoscopic coverage. The area in which mapping can be
performed using photogrammetric methods is limited to the area of stereo coverage.
Features existing on only one photograph cannot be mapped. This should be taken into
account when planning a photography mission.
The spectral coverage is the total range of the spectrum that is covered by the channels
of the sensor. Hyperspectral scanners usually have a coherent coverage (no gaps between
the spectral bands) while multispectral sensors often have gaps in their spectral coverage.
The temporal coverage is the span of time over which images are recorded and stored in
image archives (Tempfli et al., 2009).
Scale, the ratio of map (or air photo) distances to ground distances, is often described
very simply as being either large or small. Large-scale maps or images show a small portion
of the Earth’s surface in good detail, while small-scale maps or images show a larger
portion of the Earth’s surface in less detail. Naturally, the larger the scale of the air photo,
the more details you can resolve, and the greater the potential for detailed image
interpretation. Keep in mind, however, that large-scale air photos are not always desirable
for regional to global scale studies, where a small-scale image might be more useful.
Depth can be estimated from monocular images by exploiting perspective distortion.
One major effect of this distortion is that a set of parallel lines in the real world converges
to a single point in the image plane.
Figure 2-7 Three vanishing points and vanishing lines of a Cube. (Rother, 2000)
The analysis of vanishing points provides strong cues for inferring information about the
3D structure of a scene. With the assumption of perfect projection, e.g. with a pin-hole
camera, a set of parallel lines in the scene is projected onto a set of lines in the image
that meet in a common point. This point of intersection, perhaps at infinity, is called the
vanishing point. Vanishing points which lie on the same plane in the scene define a line
in the image, the so-called vanishing line. Figure 2-7 shows the three vanishing points and
vanishing lines of a cube, where a finite vanishing point is defined by a point on the image
plane and a vanishing point at infinity is defined by a direction on the image plane. When
the camera geometry is known, each vanishing point corresponds to an orientation in the
scene and vice versa (Rother, 2000).
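The intersection of two imaged parallel lines can be computed compactly with homogeneous coordinates, where the line through two points and the intersection of two lines are both cross products. The sketch below is one common way to do this and is not taken from Rother (2000); the function names, the sample coordinates and the tolerance `eps` are illustrative assumptions.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def line_through(p, q):
    """Homogeneous line through two image points (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(segment_1, segment_2, eps=1e-12):
    """Intersect two image line segments that depict parallel scene lines.

    Returns ("finite", (x, y)) for a point on the image plane, or
    ("infinite", (dx, dy)) for a direction when the image lines are parallel.
    The absolute tolerance eps is a crude choice for this sketch.
    """
    x, y, w = cross(line_through(*segment_1), line_through(*segment_2))
    if abs(w) < eps:
        return "infinite", (x, y)
    return "finite", (x / w, y / w)


# Two rails converging in the image (hypothetical pixel coordinates).
rails = vanishing_point(((0, 0), (4, 10)), ((10, 0), (6, 10)))
# Two vertical edges that stay parallel in the image.
edges = vanishing_point(((0, 0), (0, 10)), ((5, 0), (5, 10)))
print(rails)
print(edges)
```

The two return kinds mirror the distinction made above between a finite vanishing point, given by a point on the image plane, and a vanishing point at infinity, given by a direction.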
The realization that we see lines known to be parallel in space as lines that appear to
converge in a corresponding vanishing point has led to techniques employed by artists
since at least the Renaissance to render a credible impression of perspective.
Parallel edges or lines found in the three-dimensional real world will, when captured on a
two-dimensional image, meet at a definable point termed a vanishing point. For example
the rails of a railway track appear to converge or meet as they head into the distance (See
Figure 2-8). Additionally the edge of a platform parallel with the track would, if extended,
also meet at the same vanishing point. The lines used to determine the location of
vanishing points are called perspective lines (HOSDB September 2007).
Figure 2-8:
Determining the position of vanishing points most accurately can be best achieved with a
thorough scene survey and a suitable software package. Whilst it is possible to calculate
these points manually, this can be a lengthy process. It should be noted that locating
vanishing points may require lines in the image to be extended, and that they often lie
beyond the borders of an image (See Figure 2-9).
Figure 2-9:
The vanishing points associated with a given plane all lie on the same line, termed a
vanishing line (see Figure 2-10). This is in effect the horizon for the plane in question.
Some photogrammetric techniques require the vanishing line associated with the ground
plane to be identified, usually through the location of two vanishing points (HOSDB
September 2007).
Photo coordinates can also be measured using a coordinate digitizer. Such devices
continuously display the x, y positions of a spatial reference mark as it is positioned
anywhere on the photograph. Another option for photo coordinate measurement is the
use of a precision instrument called a comparator. A mono-comparator can be used to
measure very accurate coordinates on one photograph at a time; a stereo-comparator can
be used for making measurements on stereo pairs (Lillesand, Kiefer and Chipman, 2014).
The relief displacement equation can be rearranged to calculate vertical heights of objects,
whereby (Derenyi, 1982)
$$h = \frac{dH}{r} \qquad (1)$$
where d is the relief displacement measured in the image, r is the radial distance from the
photo centre to the image of the top of the object, and H is the flying height above the base
of the object.
This equation is useful for determining the height of objects above ground. The answer is
only approximate because of the effect of tilt and because the exact value of H is usually
not known (Derenyi, 1982). In order to determine the height of an object, it is necessary that
both the top and bottom of the object be visible in the image so that the magnitude
of d can be measured in the image. Photo interpreters are often interested in the
relative heights of objects rather than the absolute elevations (Wolf, 1974).
Example 2.4 Vertical photographs were taken from a flying height of 4500 m above a
datum. An aerial image captured two points A and B, which are 700 m
above and 1460 m below the datum respectively. If the radial distances in the image
to points A and B are 72 mm and 86 mm respectively, find the corresponding relief
displacements.
Solution:
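Relief Displacement at A,
$$d_A = \frac{r_A h_A}{H} = \frac{72\ \text{mm} \times 700\ \text{m}}{4500\ \text{m}} = 11.20\ \text{mm (outward)}$$
Relief Displacement at B,
$$d_B = \frac{r_B h_B}{H} = \frac{86\ \text{mm} \times 1460\ \text{m}}{4500\ \text{m}} = 27.90\ \text{mm (inward)}$$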
Example 2.5 Relief displacement for a tower is measured as 5 mm and its radial
distance from the centre of the photo to the top of the tower is measured as 110 mm.
Find the height of the tower if the flying height is 850 m above the base of the tower.
Solution:
Height of the tower,
$$h = \frac{dH}{r} = \frac{5\ \text{mm} \times 850\ \text{m}}{110\ \text{mm}} = 38.64\ \text{m}$$
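Examples 2.4 and 2.5 use the same relationship in both directions, which the following sketch makes explicit. The function names are hypothetical, and the sign convention (a negative height for a point below the datum, giving a negative, i.e. inward, displacement) is an assumption adopted here for convenience.

```python
def relief_displacement_mm(r_mm, h_m, flying_height_m):
    """Relief displacement d = r * h / H; negative values point towards the photo centre."""
    return r_mm * h_m / flying_height_m


def object_height_m(d_mm, r_mm, flying_height_m):
    """Object height h = d * H / r, with H measured above the object's base."""
    return d_mm * flying_height_m / r_mm


d_a = relief_displacement_mm(72.0, 700.0, 4500.0)    # point A, above the datum
d_b = relief_displacement_mm(86.0, -1460.0, 4500.0)  # point B, below the datum
tower = object_height_m(5.0, 110.0, 850.0)           # Example 2.5
print(f"d_A = {d_a:.2f} mm, d_B = {d_b:.2f} mm, tower height = {tower:.2f} m")
```

Both functions assume a truly vertical photograph; as noted above, tilt and an uncertain flying height make the results approximate.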
Topographic relief can have a great effect on displacing image features. The amount of
image displacement increases on high-degree slopes. Feature displacement also increases
radially away from the photo centre (Falkner and Morgan, 2002).
the negatives. This separation will not be of the same magnitude on successive photos
(Falkner and Morgan, 2002).
Relief displacement is the apparent leaning of objects within a photograph away from the
photo center or principal point. Vertical aerial photographs are good examples of point
projections where only the center image is correctly represented in its true position. From
the center point outward, all objects are warped (radially displaced) away from the center
point. Also, the displacement becomes greater the farther it is away from the center.
Examples of relief displacement can be most easily understood when viewing man-made
objects. The graphic below depicts how a tall factory smokestack would appear in an aerial
photograph and how it would be represented on an orthographic map projection.
2.6 Parallax
Parallax is defined as the apparent shift in the position of an object, caused by a shift in
the position of the viewer. Alternately closing one eye and then the other will demonstrate
the concept of parallax, as near objects appear to shift, whereas far objects will appear
stationary. This effect is the primary mechanism by which we achieve binocular depth
perception, or stereo vision. This effect is exploited in photogrammetry when two
overlapping photographs are taken. Within the overlap area, objects are imaged from two
different exposure positions and parallax¹ is evidenced in the resulting images. Parallax
measurements can be used for approximate height computations in the following way
using the geometry in Figure 2-14. Two vertical overlapping photographs are taken at the
same altitude, and an image coordinate system is set up within each image with the x axis
parallel to the flight line, and with origin at the principal point in each image. For a given
point, the parallax is defined as (Bethel, 2003)
$$p = x_{left} - x_{right}$$
where
p = stereoscopic parallax of object point
x_left = measured photo coordinate of a point on the left image
x_right = measured photo coordinate of the same point on the right image
The photo coordinates are not measured with respect to the fiducial axis system. Rather,
they are measured with respect to the axis along the line of flight.
¹ Also called stereoscopic parallax or x-parallax or simply parallax.
Parallax displacements occur only parallel to the line of flight. In theory, the direction of
flight corresponds precisely to the fiducial x-axis. In reality, however, unavoidable changes
in the aircraft orientation will usually slightly offset the fiducial axis from the flight axis
(Lillesand, Kiefer and Chipman, 2014). The true flight line axis may be found by first
locating on a photograph the points that correspond to the image centres of the preceding
and succeeding photographs. These points are called the conjugate principal points. A line
drawn through the principal points and the conjugate principal points defines the flight
axis. The line of flight for any given stereo-pair defines a photo-coordinate x-axis for use
in parallax measurement. Lines drawn perpendicular to the flight line and passing through
the principal point of each photo form the photographic y-axis for parallax measurement
(Lillesand, Kiefer and Chipman, 2014).
or, $$X_A = \frac{x_a (H - h_A)}{f}$$
or, $$X_A = B \frac{x_a}{p_a}$$
Similarly,
$$Y_A = B \frac{y_a}{p_a}$$
This set of equations, which defines the position of point A, is called the parallax equations.
Figure 2-15: Parallax Relationships on Overlapping Vertical Photographs and
Superimposed Triangles (Lillesand, Kiefer and Chipman, 2014)
In many applications, the difference in elevation between two points is of more immediate
interest than is the actual value of elevation of either point. In such cases, the change in
elevation between two points can be found from the following equation (Lillesand, Kiefer
and Chipman, 2014):
$$\Delta h = \frac{\Delta p\, H'}{p_a}$$
where
Δh = difference in elevation between two points whose parallax difference is Δp
H' = flying height above the lower point
p_a = parallax of the higher point
Example 2.6 In a stereo-pair comprising images ‘a’ and ‘b’, the length of a line and the
difference in elevation between its end points are to be determined. Throughout the flying
mission, the average flying height was 1460 m, the distance between successive
exposures was 600 m, and the camera used was a wide angle camera (f = 152 mm). The
photo coordinates of the points A and B along the line of flight are x_a = 56.14 mm,
x_b = 76.89 mm, y_a = 58.20 mm, y_b = −20.45 mm, x_a’ = −49.54 mm, and x_b’ = −29.37 mm. Find
the length of line AB, the elevations of points A and B, and the difference in elevation between
the two points.
Solution:
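(a) Parallaxes, Ground Coordinates and Length of Line AB
$$p_a = x_a - x_a' = 56.14 - (-49.54) = 105.68\ \text{mm}$$
$$p_b = x_b - x_b' = 76.89 - (-29.37) = 106.26\ \text{mm}$$
$$X_A = B\frac{x_a}{p_a} = \frac{600 \times 56.14}{105.68} = 318.74\ \text{m}$$
$$X_B = B\frac{x_b}{p_b} = \frac{600 \times 76.89}{106.26} = 434.16\ \text{m}$$
$$Y_A = B\frac{y_a}{p_a} = \frac{600 \times 58.20}{105.68} = 330.43\ \text{m}$$
$$Y_B = B\frac{y_b}{p_b} = \frac{600 \times (-20.45)}{106.26} = -115.47\ \text{m}$$
$$D = \sqrt{(X_B - X_A)^2 + (Y_B - Y_A)^2} = \sqrt{(115.42)^2 + (-445.90)^2} = 460.60\ \text{m}$$
(b) Ground Elevations
$$h_A = H - \frac{Bf}{p_a} = 1460 - \frac{600 \times 152}{105.68} = 597.02\ \text{m}$$
$$h_B = H - \frac{Bf}{p_b} = 1460 - \frac{600 \times 152}{106.26} = 601.73\ \text{m}$$
(c) Difference in Elevation
$$\Delta h = h_B - h_A = 601.73 - 597.02 = 4.71\ \text{m}$$
Or, taking B as the higher point (parallax p_b) and H' = H − h_A as the flying height above the lower point A,
$$\Delta h = \frac{\Delta p\, H'}{p_b} = \frac{(106.26 - 105.68) \times (1460 - 597.02)}{106.26} = 4.71\ \text{m}$$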
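The whole of Example 2.6 can be reproduced with the parallax equations in a few lines. This is a minimal sketch: the function names `parallax_mm` and `ground_point` are illustrative, photo measurements are kept in millimetres and ground quantities in metres, and the photographs are assumed to be truly vertical and taken from the same flying height.

```python
import math


def parallax_mm(x_left_mm, x_right_mm):
    """Stereoscopic parallax p = x_left - x_right, along the flight-line axis."""
    return x_left_mm - x_right_mm


def ground_point(x_mm, y_mm, p_mm, air_base_m, focal_length_mm, flying_height_m):
    """Parallax equations: X = B*x/p, Y = B*y/p, h = H - B*f/p."""
    ground_x = air_base_m * x_mm / p_mm
    ground_y = air_base_m * y_mm / p_mm
    elevation = flying_height_m - air_base_m * focal_length_mm / p_mm
    return ground_x, ground_y, elevation


B, f, H = 600.0, 152.0, 1460.0  # air base (m), focal length (mm), flying height (m)
p_a = parallax_mm(56.14, -49.54)
p_b = parallax_mm(76.89, -29.37)
XA, YA, hA = ground_point(56.14, 58.20, p_a, B, f, H)
XB, YB, hB = ground_point(76.89, -20.45, p_b, B, f, H)

length_ab = math.hypot(XB - XA, YB - YA)
dh_direct = hB - hA
dh_parallax = (p_b - p_a) * (H - hA) / p_b  # B is the higher point, A the lower

print(f"AB = {length_ab:.2f} m, h_A = {hA:.2f} m, h_B = {hB:.2f} m")
print(f"dh (direct) = {dh_direct:.2f} m, dh (parallax difference) = {dh_parallax:.2f} m")
```

The two elevation differences agree because the parallax-difference formula is an algebraic rearrangement of h = H − Bf/p applied to both points.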
Example 2.7 In a stereo-pair an elevated water tank at 860 m above the sea level is
captured. Throughout the flying mission, the average flying height was 1460 m. Photo
measurement reveals that the average photo base (distance between principal point and
conjugate principal point) is 75 mm and the differential parallax at the base and top of
the water tank is 5 mm. Find the height of the tank above the ground.
Solution:
For clear and comfortable stereoscopic viewing, an essential condition is that the lines
joining the corresponding images be parallel with the direction of flight. When this
condition is not met, y-parallax is said to exist. Any slight amount of y-parallax
causes eyestrain, and excessive amounts prevent stereoscopic viewing altogether (Wolf,
1974).
The change in position of the image of an object from one photograph to the
next, perpendicular to the flight line, is referred to as y-parallax. Due to the presence of
y-parallax, viewing and measuring a DSM (digital stereo model) may be difficult. Unlike
x-parallax, y-parallax is an error which needs to be removed or reduced. The following
factors may introduce y-parallax (Bhatta, 2011: 255):
Unequal flying height between adjacent exposures. This effect causes a difference in
scale between the left and right images. As a result, y-parallax is introduced and the 3D
stereo view becomes distorted.
Flight line misalignment during image collection. This results in large differences in
image orientation between two overlapping images. As a result, we experience eye
strain and discomfort while viewing the DSM.
Erroneous sensor model information. Inaccurate sensor model information creates
large differences in y-parallax between two images comprising a DSM.
As a result of these factors, the DSMs contain y-parallax, which causes discomfort
during stereo viewing.
To minimize y-parallax, we are required to scale, translate, and rotate the images until a
clear and comfortable stereo view is available. When DSMs are created from sensor
model information, photogrammetric software automatically rotates, scales, and translates
the imagery to continually provide an optimum stereo view throughout the stereo model.
Thus y-parallax is automatically accounted for. The process of automatically creating a
clear stereo view is referred to as epipolar resampling on the fly. As we roam throughout
the DSM, many software packages account and adjust for y-parallax automatically (Bhatta, 2011:
255).
Parallax Measurement
– refer to (Wolf, 1974), (Lillesand, Kiefer and Chipman, 2014: 159), (Bhatta, 2011)
2.7 Stereoscopy
Stereoscopy is the science of producing a 3D visual model using 2D images. It has been the
basis for 3D measurement in photogrammetry. Not just any two images can be viewed
stereoscopically; they must form a stereo pair and fulfil several conditions. The
basic requirements for two images to form a stereo pair are that they show the same
objects or scene, are taken from different positions that are not too far apart, and are at a very
similar scale. The 3D visual impression is called the stereo model or stereoscopic model.
In order to obtain systematic coverage of an area by stereo images with an airborne frame
camera, successive vertical images within a strip shall overlap by at least 60% (Tempfli et al.,
2009).
While viewing an object with both eyes, each eye sees this object from a slightly different
angle and transmits a slightly different impulse to the brain, where a three dimensional
view is formed. This is natural or direct stereo vision (Derenyi, 1982).
Indirect stereoscopic vision occurs when photographs of an object taken from two different
stations are viewed. The conditions for such stereoscopic vision are the following:
There are no problems with satisfying the first condition. Condition 2 is, however, difficult
to satisfy since the two photos must be viewed with nearly parallel eye axes and, at the
same time, the eyes must focus at a short distance. Fortunately, there are several
instrumental aids available such as pocket stereoscope, mirror stereoscope, anaglyph
viewing system, polarization filters, stereo-image alternator, etc. (Derenyi, 1982).
When a person with normal two-eyed vision looks simultaneously at two photographs
which have been taken of the same scene from different viewpoints, using one eye for
each photograph, he can visualize the scene in three dimensions. This phenomenon is
called stereoscopic vision (Moffitt, 1962).
Since the distance between a person's eyes is fixed, natural stereoscopic vision is possible
only at distances ranging from about 10” to about 2000’. The perception of depth can be
obtained at a distance of less than 10” by the use of lenses. In order to be able to perceive
depth at a distance greater than 9000’, a person must, in effect, increase the distance
between his eyes. He may use a pair of binoculars or a range finder. Or a pair of
photographs may be taken with cameras located at widely separated points, and these
photographs may be viewed at the same time in a suitable manner. The simultaneous
viewing of two such photographs is known as stereoscopy (Moffitt, 1962).
Stereoscopic vision determines the distance to an object by intersecting two lines of sight.
In the human vision system, the brain senses the parallactic angle between the converging
lines of sight and unconsciously associates the angle with a distance. Overlapping aerial
photographs can be viewed stereoscopically with the aid of a stereoscope. The
stereoscope forces the left eye to view the left photograph and the right eye to view the
right photograph. Since the right photograph images the same terrain as the left
photograph, but from a different exposure station, the brain perceives a parallactic angle
when the two images are fused into one. As the viewer scans the entire overlap area of
the two photographs, a continuous stereo-model of the ground surface can be seen. The
stereo-model can be measured in three dimensions, yielding the elevation and horizontal
position of unknown points. The limitation that elevation cannot be determined in a single
photograph solution is overcome by the use of stereo-photography (US Army Corps of
Engineers, 1993).
Figure 2-17: Real Terrain Model (left) vs Perceptual Terrain Model (right)
The primary factor contributing to vertical exaggeration in the stereoscopic image is the
ratio of the ground distance between exposure stations to the flying height above the
average elevation of the terrain. All other factors remaining constant, an increase in this
ratio will cause a corresponding increase in vertical exaggeration. The apparent distortion
in the horizontal and vertical scales does not have any effect on the determination of
elevations from parallax measurements (Moffitt, 1962).
Although vertical exaggeration has many causes, the primary cause is the
lack of equivalence between the photographic base-height ratio (as seen in the real terrain model)
and the corresponding base-height ratio in stereo viewing (as seen in the perceptual model).
The ratio of these two base-height ratios gives an approximate value of vertical
exaggeration (Wolf, 1974). Therefore, referring to Figure 2-17, vertical exaggeration
is:
\[ E_v = \frac{B / H'}{b_e / h} = \left(\frac{B}{H'}\right)\left(\frac{h}{b_e}\right) \]
where
𝐸𝑣 = vertical exaggeration
𝐵 = air base, i.e., distance between successive exposures
𝐻′ = flying height above average ground level
ℎ = distance from the eyes at which the stereo-model is perceived
𝑏𝑒 = eye base, i.e., distance between the eyes
The average adult eye base is about 2.6”, and repeated tests have shown that the average value of
h is about 17”. Using these two average values, the stereo-viewing base-height ratio is
approximately 0.15 (Wolf, 1974). One of the unique and advantageous outcomes of stereo
photogrammetry is that the vertical scale is exaggerated when viewed by the observer.
This allows for accurate determination of height by the photogrammetrist. Increasing the
flying height will decrease the amount of vertical exaggeration.
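The relationship above can be checked numerically with a minimal sketch. The function name and the sample air base and flying height are hypothetical; the default eye base (2.6”) and perceived viewing distance (17”) are the average values quoted from Wolf (1974).

```python
def vertical_exaggeration(air_base, flying_height,
                          eye_base=2.6, viewing_distance=17.0):
    """Approximate E_v = (B / H') * (h / b_e).

    air_base and flying_height must share one unit (e.g. metres);
    eye_base and viewing_distance must share one unit (e.g. inches).
    """
    if flying_height <= 0 or eye_base <= 0:
        raise ValueError("flying_height and eye_base must be positive")
    return (air_base / flying_height) * (viewing_distance / eye_base)


# Hypothetical flight: B = 1200 m, H' = 2000 m (photographic base-height ratio 0.6)
ev_low = vertical_exaggeration(1200.0, 2000.0)
# Same air base flown twice as high: the exaggeration is halved
ev_high = vertical_exaggeration(1200.0, 4000.0)
print(round(ev_low, 2), round(ev_high, 2))
```

With these assumed inputs the exaggeration is a little under 4 at the lower flying height and half of that at the higher one, which agrees with the statement that increasing the flying height decreases vertical exaggeration.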
One may argue that the simplest way to achieve stereoscopic viewing is by displaying the
two images of a stereo-pair on two separate monitors. Viewing is achieved by means of
optical trains, e.g. a stereoscope, or by polarization. Matra adopted this principle by
arranging the two monitors at right angles, with horizontal and vertical polarization sheets
in front of them (Schenk, 2005).
Separation    Implementation
Spatial       2 monitors + stereoscope
              1 monitor + stereoscope (split screen)
              2 monitors + polarization
Spectral      Anaglyphic
              Polarization
Temporal      Alternate display of left and right image,
              synchronized by polarization
The most popular realization of spectral separation is by anaglyphs. The restriction to
monochromatic imagery and the reduced resolution outweigh the advantage of simplicity
and low cost. Most systems today use temporal separation in conjunction with polarized
light. The left and right images are displayed in quick succession on the same screen. In
order to achieve a flicker-free display, the images must be refreshed at a rate of 60 Hz
per image, requiring a 120 Hz monitor (Schenk, 2005).
End lap, also known as forward overlap, is the common image area on consecutive
photographs along a flight strip. This overlapping portion of two successive aerial photos,
which creates the three-dimensional effect necessary for mapping, is known as a stereo
model or more commonly as a “model.”
Figure 2-18 shows the end lap area on a single pair of consecutive photos in a flight line
(Falkner and Morgan, 2002).
Practically all projects require more than a single pair of photographs. Usually, the aircraft
follows a predetermined flight line as the camera exposes successive overlapping images
(Falkner and Morgan, 2002).
Figure 2-18: Overlap in Aerial Frame Images: End lap (left) and Side lap (right)
If a project is too large to cover in a single flight line, additional flights will be made to
assure stereo coverage. These are normally parallel to the first. Side lap between the two
flight lines must again be planned to assure continuous coverage between flights of the
project area.
Side lap, sometimes called side overlap, encompasses the overlapping areas of
photographs between adjacent flight lines. It is designed so that there are no gaps in the
three-dimensional coverage of a multiline project. Figure 2-18 shows the relative head-on
position of the aircraft in adjacent flight lines and the resultant area of exposure coverage
(Falkner and Morgan, 2002). Side lap is essential in aerial photography to prevent gaps
from occurring between flight strips as a result of drift, crab, tilt, flying height variations,
and terrain variations. The amount of side lap between adjacent flight strips can be determined
with the following equation (Wolf, 1974):
\[ S = \left(\frac{G - W}{G}\right) \times 100\,\% \]
where
𝑆 = side lap in percentage
𝑊 = distance between adjacent flight lines
𝐺 = ground coverage
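As a small worked illustration of the side lap equation, the sketch below evaluates S for a given flight line spacing and also inverts the formula to obtain the spacing needed for a target side lap. The function names and sample numbers are hypothetical, and G and W are assumed to be in the same ground units.

```python
def side_lap_percent(ground_coverage, line_spacing):
    """S = ((G - W) / G) * 100, with G and W in the same ground units."""
    if ground_coverage <= 0:
        raise ValueError("ground_coverage must be positive")
    return (ground_coverage - line_spacing) / ground_coverage * 100.0


def line_spacing_for_side_lap(ground_coverage, side_lap_pct):
    """Invert the formula: W = G * (1 - S / 100)."""
    return ground_coverage * (1.0 - side_lap_pct / 100.0)


# Hypothetical photo covering 2000 m across the flight direction
print(side_lap_percent(2000.0, 1400.0))          # flight lines 1400 m apart
print(line_spacing_for_side_lap(2000.0, 30.0))   # spacing for 30 % side lap
```

Flight lines spaced 1400 m apart over 2000 m of ground coverage correspond to roughly 30 % side lap, and the inverse function returns roughly 1400 m for that target.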
A designed 20% side lap is desirable for most mapping projects. This normally assures
coverage considering flight line deviations, ground relief, and ground control requirements.
Taking 30 % side lap eliminates the need to use extreme edges of the photography, where
the imagery is of poorer quality. Photography for mosaic work is sometimes taken with
greater than 30 % side lap (Wolf, 1974). It is noted however, that in mountainous terrain
these percentages may need to be increased to allow for terrain changes and excessive
relief displacement.
In some cases where aerial photography is to be used for very precise photogrammetric
control extension, it may be taken with 60 % side lap as well as 60 % end lap (Wolf,
1974).
The epipolar geometry between two views is essentially the geometry of the intersection
of the image planes with the pencil of planes having the baseline as axis (the baseline is
the line joining the camera centres). This geometry is usually motivated by considering
the search for corresponding points in stereo matching (Hartley and Zisserman, 2004).
Epipoles are the points at which the line through the centers of projection intersects the
image planes. Let el and er be the left and right epipoles respectively. It is by construction
that the left epipole is the image of the projection center of the right camera and vice
versa as can be seen in figure above.
A special point to consider is if the line through the centers of projection, called baseline,
is parallel to one of the image planes, then the corresponding epipole is the point at infinity
of that line.
Epipolar plane is the plane containing the baseline and the corresponding points P, pl and
pr. There is a one-parameter family of epipolar planes.
Epipolar line is the intersection of an epipolar plane with one image plane. All epipolar
lines intersect at the epipole. An epipolar plane intersects the left and right image planes
in epipolar lines, and defines the correspondence between the lines.
MATHEMATICAL CONCEPTS
Contents of this chapter
3 Mathematical Concepts in Photogrammetry: [10 hrs]
3.1 Photogrammetric Transformations
3.2 Coordinate Reference System
3.2.1 Image Space Coordinate System
3.2.2 Object Space Coordinate System
3.2.3 Camera Coordinate System
3.3 Mathematical Relationship between image and ground coordinates
3.3.1 Theory of orientation
3.3.2 Interior orientation (IO)
3.3.3 Exterior Orientation (EO)
3.3.4 Relative Orientation (RO)
3.3.5 Absolute Orientation (AO)
3.3.6 Classification of Points used in Orientation
3.3.7 Photogrammetric Conditions: Collinearity and coplanarity equations
Image space coordinate system is identical to image coordinate system except that it adds
a third axis – z. The origin of the image space coordinate system is defined at the
perspective centre. The perspective centre is commonly the lens of the camera as it existed
when the photograph was captured. Its x- and y-axis are parallel to the x- and y-axis in
the image coordinate system. The z-axis is the optical axis; therefore, the z-value of an
image point in an image space coordinate system is usually equal to the focal length of
the camera. Image space coordinates are used to describe positions inside the camera,
and therefore, it is also known as the camera coordinate system (Bhatta, 2011).
An image space coordinate system (Figure 3-1) is identical to image coordinates, except
that it adds a third axis (z). The origin of the image space coordinate system is defined at
the perspective centre S as shown in Figure 3-1. The perspective centre is commonly the
lens of the camera as it existed when the photograph was captured. Its x-axis and y-axis
are parallel to the x-axis and y-axis in the image plane coordinate system. The z-axis is
the optical axis; therefore, the z value of an image point in the image space coordinate
system is usually equal to the focal length of the camera (f). Image space coordinates are
used to describe positions inside the camera, and usually use units in millimetres or
microns. This coordinate system is referenced as image space coordinates (x, y, z) in this
chapter (ERDAS, Inc., 2010).
Figure 3-1: Image Space and Ground Space Coordinate System (ERDAS, Inc., 2010)
3.1.2 Object Space Coordinate System (Ground Coordinate System)
A ground coordinate system is usually defined as a 3D coordinate system that utilizes a
known geographic map projection. Ground coordinates (X, Y, Z) are usually expressed in
feet or meters. The Z value is elevation above mean sea level for a given vertical datum
(ERDAS, Inc., 2010).
Most photogrammetric applications account for the curvature of the Earth in their
calculations. This is done by adding a correction value or by computing geometry in a
coordinate system that includes curvature. Two such systems are geocentric and
topocentric coordinates. A geocentric coordinate system has its origin at the centre of the
Earth ellipsoid. The Z-axis equals the rotational axis of the Earth, and the X-axis passes
through the Greenwich meridian. The Y-axis is perpendicular to both the Z-axis and X-
axis, so as to create a three-dimensional coordinate system that follows the right hand
rule. A topocentric coordinate system has its origin at the centre of the image projected
on the Earth ellipsoid. The three perpendicular coordinate axes are defined on a tangential
plane at this centre point. The plane is called the reference plane or the local datum. The
x-axis is oriented eastward, the y-axis northward, and the z-axis is vertical to the reference
plane (up) (ERDAS, Inc., 2010).
The scaling and rotation are each defined by one parameter. The translations involve two
parameters. Thus, there are a total of four parameters in this transformation. The
transformation requires a minimum of two points, called control points, that are common
to both systems. With the minimum of two points, the four parameters of the
transformation can be determined uniquely. If more than two control points are available,
a least squares adjustment is possible. After determining the values of the transformation
parameters, any points in the original system can be transformed.
𝑋 ′ = 𝑠𝑋
𝑌 ′ = 𝑠𝑌 (3-1)
2. Rotation: In order to analyse the rotation, let us construct an arbitrary coordinate
system E’N’ parallel to the EN coordinate system. Now superimpose the scaled coordinate
system X’Y’ onto this constructed coordinate system such that the origins of both
systems coincide (Figure 3-5). Although the origins coincide, the axes are
not aligned. In order to align the axes of the scaled coordinate system with the E’N’ system, they
need to be rotated about the origin. Rotating the scaled coordinate system clockwise by the
angle θ, the rotated coordinates of a point can be determined by using the following
equations:
𝐸 ′ = 𝑋 ′ 𝑐𝑜𝑠𝜃 − 𝑌′𝑠𝑖𝑛𝜃
𝑁 ′ = 𝑋 ′ 𝑠𝑖𝑛𝜃 + 𝑌′𝑐𝑜𝑠𝜃 (3-2)
Rotation angle θ is the sum of the angles α and β indicated in Figure 3-4. From the coordinates
of two control points, these angles are calculated as
Figure 3-5: X’Y’ Coordinate System superimposed onto the EN Ground Coordinate System
3. Translation: Finally, it is necessary to translate the origin of the rotated coordinate
system i.e., E’N’ to the origin of the ground coordinate system. Referring to the Figure
3-5, this can be accomplished by adding translation factors as follows:
𝐸 = 𝐸 ′ + 𝑇𝐸
𝑁 = 𝑁 ′ + 𝑇𝑁 (3-3)
where the translation factors TE and TN can be calculated from the coordinates of the control
points as
𝑇𝐸 = 𝐸𝐴 − 𝐸′𝐴 = 𝐸𝐵 − 𝐸′𝐵
𝑇𝑁 = 𝑁𝐴 − 𝑁′𝐴 = 𝑁𝐵 − 𝑁′𝐵
If equations 3-1, 3-2 and 3-3 are combined, a single set of equations is produced that can
be used directly to transform the coordinates of unknown points from the arbitrary coordinate
system to the ground coordinate system as
𝐸 = 𝑠𝑋 cos 𝜃 − 𝑠𝑌 sin 𝜃 + 𝑇𝐸
𝑁 = 𝑠𝑋 𝑠𝑖𝑛 𝜃 + 𝑠𝑌 𝑐𝑜𝑠 𝜃 + 𝑇𝑁 (3-4)
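The four-parameter transformation of Equations (3-4) can be sketched in code as follows. This is only an illustration: instead of the angle construction described in the text, it solves for a = s cos θ and b = s sin θ algebraically from the coordinate differences of two control points, which yields the same four parameters. All function and variable names are hypothetical.

```python
import math


def fit_conformal_2d(xy_a, xy_b, en_a, en_b):
    """Return (s, theta, t_e, t_n) from two control points A and B.

    xy_*: (X, Y) in the arbitrary system; en_*: (E, N) in the ground system.
    """
    dx, dy = xy_b[0] - xy_a[0], xy_b[1] - xy_a[1]
    de, dn = en_b[0] - en_a[0], en_b[1] - en_a[1]
    d2 = dx * dx + dy * dy
    if d2 == 0.0:
        raise ValueError("the two control points must be distinct")
    a = (de * dx + dn * dy) / d2      # a = s * cos(theta)
    b = (dn * dx - de * dy) / d2      # b = s * sin(theta)
    s = math.hypot(a, b)
    theta = math.atan2(b, a)
    t_e = en_a[0] - (a * xy_a[0] - b * xy_a[1])
    t_n = en_a[1] - (b * xy_a[0] + a * xy_a[1])
    return s, theta, t_e, t_n


def transform_conformal_2d(params, x, y):
    """Apply E = sX cos(theta) - sY sin(theta) + TE, N = sX sin(theta) + sY cos(theta) + TN."""
    s, theta, t_e, t_n = params
    e = s * x * math.cos(theta) - s * y * math.sin(theta) + t_e
    n = s * x * math.sin(theta) + s * y * math.cos(theta) + t_n
    return e, n


# Hypothetical check: generate ground coordinates from known parameters,
# then recover the parameters from the two control points alone.
truth = (2.0, math.radians(30.0), 100.0, 200.0)
A, B = (1.0, 1.0), (4.0, 5.0)
fitted = fit_conformal_2d(A, B,
                          transform_conformal_2d(truth, *A),
                          transform_conformal_2d(truth, *B))
print(fitted)
```

With more than two control points the same unknowns would be estimated by least squares, as noted above; that case is not covered by this sketch.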
plus a small non-orthogonality correction between the x and y axes. This results in a total
of six unknowns, and it is therefore also known as the six-parameter transformation (Ghilani and
Wolf, 2006).
A two-dimensional affine transformation consists of four basic steps: (1) scale change in
x and y, (2) correction for non-orthogonality, (3) rotation, and (4) translation. Figure 3-6
illustrates the geometric relationship between the arbitrary coordinate system xy and the
final coordinate system XY. In this figure, the non-orthogonality of x and y is indicated by
the angle ε. The rotation angle necessary to make the two systems parallel is θ, and
translations TX and TY account for the offset of the origin. The four steps of the derivation
are as follows (Wolf, DeWitt and Wilkinson, 2014: 543):
Several geometric possibilities exist; however, two configurations are most common and
are illustrated in Figure 3-7 (a) and (b). In both figures the x’y’ measurement systems
have already been scaled in accordance with step 1. The first configuration, illustrated in
(a), is appropriate for most comparators where separate x and y carriages provide
independent movement in both directions. The x’ coordinate is measured parallel to the x’
axis from the origin to the point, and the y’ coordinate is measured parallel to the y’ axis
from the origin to the point. The second configuration, shown in (b), is appropriate when
one is using satellite imagery that is acquired in a scanning fashion while the earth rotates
beneath. The resulting image has a distinct parallelogram shape. In this configuration the
x’ coordinate is measured parallel to the x’ axis from the y’ axis to the point, and the y’
coordinate is measured perpendicular to the x’ axis. For the configuration of Figure 3-7
(a), the correction for non-orthogonality is given by Eqs. (3-6) (Wolf, DeWitt and
Wilkinson, 2014).
Figure 3-7: (a) Two-dimensional affine relationship for typical comparator. (b) Two-
dimensional affine relationship for typical scanning-type satellite image.
3-6
Equations (3-7) express the relationship for the configuration of Figure 3-7 (b).
3-7
Step 3: Rotation
3-8
Step 4: Translation
The final step is to translate the origin by TX and TY to make it coincide with the origin
of the final system, as shown in Eqs. (3-9).
3-9
Combining the four steps for configuration (a) gives Eqs. (3-10).
3-10
Simplifying the X and Y expressions in turn gives Eqs. (3-11).
These equations are linear and can be solved uniquely when three control points exist
(i.e., points whose coordinates are known in both systems). This is because for each
point, an equation set in the form of Equations (3-11) can be written, and three points
yield six equations involving six unknowns. If more than three control points are available,
a least squares solution can be obtained (Ghilani and Wolf, 2006).
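The unique three-point solution can be illustrated with a short sketch. It assumes the common linear form of the six-parameter transformation, X = a0 + a1·x + a2·y and Y = b0 + b1·x + b2·y; this form is an assumption made here for illustration, and the function names are hypothetical. The two 3×3 systems are solved with Cramer's rule so that only the standard language is needed.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:   # crude tolerance, adequate for this small example
        raise ValueError("control points are collinear or repeated")
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = rhs[r]
        solution.append(det3(m) / d)
    return solution


def fit_affine_2d(src, dst):
    """Six parameters from exactly three control points.

    src: three (x, y) pairs in the arbitrary system;
    dst: the same three points as (X, Y) in the final system.
    """
    design = [[1.0, x, y] for x, y in src]
    a = solve3(design, [p[0] for p in dst])   # a0, a1, a2
    b = solve3(design, [p[1] for p in dst])   # b0, b1, b2
    return a, b


def apply_affine_2d(a, b, x, y):
    return a[0] + a[1] * x + a[2] * y, b[0] + b[1] * x + b[2] * y


# Hypothetical control points: three points give six equations in six unknowns
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dst = [(10.0, -5.0), (12.0, -5.2), (10.1, -2.0)]
a, b = fit_affine_2d(src, dst)
print(a, b, apply_affine_2d(a, b, 2.0, 3.0))
```

Collinear control points make the design matrix singular, which is why the three points must not lie on one line.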
Fiducial marks are only used to establish the interior orientation for photos taken with a
traditional metric film camera. Digital cameras equipped with an area CCD (charge-coupled
device) sensor do not need them because each CCD element gives the same image pixel every time.
Therefore, the method of camera definition differs a bit from what you already know, and the interior
orientation is given directly and need not be carried out image by image (Linder, 2009).
Interior orientation refers to the geometric relationship between the image plane and the
perspective centre of the lens. Typically, the interior orientation and refinement
parameters are considered known based on the calibration report. Camera calibration
parameters define the interior orientation of the imaged bundle of rays (US Army Corps
of Engineers, 2002).
Interior orientation involves placing the photographs in proper relation to the perspective
centre of the stereo-plotter by matching the fiducial marks to corresponding marks on the
photography holders and by setting the principal distances of the stereo-plotter to
correspond to the focal length of the camera (adjusted for overall film shrinkage) (US
Army Corps of Engineers, 2002). This recreates the geometry of the projected rays so that it
exactly duplicates the geometry of the original photos. The three steps of interior orientation are:
1. Centering diapositives on the projectors
2. Setting off the proper principal distance
3. Compensation for image distortion
Compiled by: Bikash Sherchan
Lecture Notes (14 Feb. 2019) Chapter 3 Mathematical Concepts
Unlike interior orientation, which only works on one photograph at a time, exterior
orientation requires locating the image coordinates of the control points in both images
of a stereo-pair (Jensen, 2011).
All aerial photographs are tilted to some degree. We need to know how to model this tilt
if we are going to extract useful measurements from aerial photography. There are six
elements of exterior orientation (Figure 3-8) that express the spatial location and angular
orientation of a tilted aerial photograph at the moment of exposure (𝑋𝑆 , 𝑌𝑆 , 𝑍𝑆 , 𝜔, 𝜙, 𝜅)
(Jensen, 2011). All the methods developed to determine these six parameters for each
aerial photograph require photographic images of at least three ground-control points
whose X, Y, Z coordinates are known (Wolf, DeWitt and Wilkinson, 2014). If we can
determine these parameters for each aerial photograph, we can use the information to
relate image coordinates to real-world (exterior) map coordinates (Jensen, 2011).
(Chikatsu, n.d.)
procedures have been devised for determining the relative orientation in an analog
fashion. Most commonly used are stereo-plotters, optical devices that permit viewing of
image pairs and superimposed synthetic features called floating marks (Horn, 1990).
Relative orientation of each stereo pair is performed by a least squares adjustment using
the collinearity equations. The stereo-model is created in an arbitrary coordinate system,
and the adjustment is unconstrained by ground coordinate values. Therefore, the photo
coordinate residuals should be representative of the point transfer and measuring
precision. The photo coordinate residuals should be examined to detect misidentified or
poorly measured points. The minimum number of points that will uniquely determine a
relative orientation is six (US Army Corps of Engineers, 2002).
RO recreates the relative relationship between dia-positives that existed at the time of the
photography. It creates, in miniature, a true 3D stereo-model of the overlapping area.
After the dia-positives have been placed in the projectors and the lights turned on,
corresponding light rays will not intersect to form a clear model.
(Chikatsu, n.d.)
Then, these are transformed to terrain co-ordinates in the absolute orientation (Linder,
2009). Absolute orientation uses the known ground coordinates of points identifiable in
the stereoscopic model to scale and to level the model. When this step is completed, the
X, Y, and Z ground coordinates of any point on the stereoscopic model may be measured
and/or mapped (US Army Corps of Engineers, 2002).
After relative orientation, a true 3D model is formed. The purpose of absolute orientation is
to bring this stereoscopic model to the desired map scale and to place it in its correct
orientation with respect to the reference system. Absolute orientation is achieved by a 3D
conformal coordinate transformation (7 parameters): three rotations, one scale factor, and
three shifts (translations). In order to perform AO, two horizontal control points are needed
to scale the model and three vertical control points are needed to level the model.
(Chikatsu, n.d.)
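The seven-parameter transformation used in absolute orientation can be sketched as below. The notes do not state a rotation convention, so the order R = Rz(κ)·Ry(φ)·Rx(ω) used here is an assumption, as are all function names and sample values. The sketch only applies known parameters; estimating them from control points is a separate adjustment problem.

```python
import math


def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_matrix(omega, phi, kappa):
    """Assumed convention: R = Rz(kappa) * Ry(phi) * Rx(omega), angles in radians."""
    co, so = math.cos(omega), math.sin(omega)
    cp, sp = math.cos(phi), math.sin(phi)
    ck, sk = math.cos(kappa), math.sin(kappa)
    rx = [[1.0, 0.0, 0.0], [0.0, co, -so], [0.0, so, co]]
    ry = [[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]]
    rz = [[ck, -sk, 0.0], [sk, ck, 0.0], [0.0, 0.0, 1.0]]
    return matmul3(rz, matmul3(ry, rx))


def absolute_orientation(point, scale, rotation, shift):
    """Ground = scale * R * model + shift (one scale, three rotations, three shifts)."""
    return tuple(
        scale * sum(rotation[i][k] * point[k] for k in range(3)) + shift[i]
        for i in range(3)
    )


# Hypothetical model point rotated 90 degrees about Z, scaled by 2 and shifted
r = rotation_matrix(0.0, 0.0, math.pi / 2)
ground = absolute_orientation((1.0, 0.0, 0.0), 2.0, r, (10.0, 20.0, 30.0))
print(ground)
```

In this example the model X axis is turned onto the ground Y axis, so the point lands at approximately (10, 22, 30).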
But often we have no signalised GCPs. Then we must look for real object (terrain) points
which we can clearly identify in the image as well as in the topographic map mentioned
before. But not every point is really good to serve as a GCP: As far as possible, choose
rectangular corners (e.g. of buildings) or small circular points. These have the
advantage of being scale-invariant. Take into account that we also need the elevation; this
might be a problem when using a point on the roof of a building, because it is not possible to
get its elevation from the map! Therefore, if possible, prefer points on the ground (Linder,
2009).
Figure 3-9: Examples for natural ground control points (Linder, 2009)
Ground Control Points may be horizontal control points, vertical control points or both.
Horizontal control point positions are known planimetrically in some X-Y coordinate
systems (e.g., a state plane coordinate system). Vertical control points have known
elevations with respect to a level datum (e.g., mean sea level). A single point with known
planimetric position and known elevation can serve as both a horizontal and a vertical control
point (Lillesand, Kiefer and Chipman, 2014) and is often referred to as a full control point.
For horizontal control, these control points should be clearly identifiable both on the
photograph and on the ground. The point should be well-defined with sharp contrast
between the feature and the surrounding area. A good quality vertical control point is one
whose elevation is higher than the surrounding area. Again, it needs to be identifiable on
both the terrain and photography.
Tie Points (Connection Point): Now imagine the case that we have much more than two
images, let’s say a block formed of 3 strips each containing 7 images as we will use in this
example, and we have no signalised points but only a topographic map, scale 1:50,000.
Greater parts of our area are covered with forest, so we can only find a few points which
we can exactly identify. It may happen that for some images we are not even able to find
the minimum of 3 points (Linder, 2009).
This may serve as a first motivation for what we want to do now: The idea is to
measure points in the images whose object coordinates we do not know but
which will be used to connect the images together. These are called connection points or
tie points. In addition, we will measure GCPs wherever we will find some (Linder, 2009).
Common points between adjacent strips are called tie points or bridging points. Tie points
are usually in areas of the photograph where the resolution is considerably lower (Schenk,
n.d.).
A point that can be recognized on multiple overlapping photos but whose
coordinates are not known.
Ground coordinates for tie points are computed during block triangulation.
Tie points can be measured manually and automatically.
Manual measurement in block images typically involves nine points in each image.
A point added to each neat model to produce adequate horizontal and/ or vertical control
and to eliminate a photo control target that otherwise would have required field survey
effort. Also known as an artificial point or a Pug Point (Caltrans).
Examples of particular problems for which the collinearity equations are useful include
space resection (camera exterior orientation unknown, object points known, observed
image coordinates given, usually implying a single image), space intersection (camera
exterior orientations known, object point unknown, observed image coordinates given,
usually implying a single object point), and bundle block adjustment (simultaneous
resection and intersection, multiple images, and multiple points). This equation is
nonlinear and a linear approximation is usually made if we attempt to solve for any of the
variables as unknowns. This dictates an iterative solution (Bethel, 2003).
The coplanarity condition implies that the two perspective centers, any object point and
the corresponding image points on the two photographs of stereo-pair, must all lie on a
common plane. This condition is fundamental to Relative Orientation or Space Intersection
(Ghosh, 2005).
Figure 3-11: Coplanarity Condition and Intersection for Relative Orientation (Ghosh,
2005)
The phrase conjugate image points refers to multiple image instances of the same object
point. If we consider a pair of properly oriented images and the pair of rays defined by
two conjugate image points, then this pair of rays together with the base vector between
the perspective centres should define a plane in space. The coplanarity condition enforces
this geometrical configuration. This is done by forcing these three vectors to be coplanar,
which is in turn guaranteed by setting the triple scalar product to zero. An alternative
explanation is that the parallelepiped defined by the three vectors as edges has zero
volume. Figure 3-11 illustrates this geometry. The left vector is given by (Bethel, 2003)
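\[ \vec{R}_1 = k_1 \begin{bmatrix} u_1 \\ v_1 \\ w_1 \end{bmatrix} = k_1 M_1^{t} \begin{bmatrix} x - x_o \\ y - y_o \\ -f \end{bmatrix}_1 \]
The base vector is
\[ \vec{b} = \begin{bmatrix} b_x \\ b_y \\ b_z \end{bmatrix} = \begin{bmatrix} X_{O2} - X_{O1} \\ Y_{O2} - Y_{O1} \\ Z_{O2} - Z_{O1} \end{bmatrix} \]
The coplanarity equation is the above-stated triple scalar product, i.e.
\[ F = \vec{b} \cdot (\vec{R}_1 \times \vec{R}_2) = 0 \]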
where
The most prominent application for which the coplanarity equation is used is relative
orientation. The equation is non-linear and a linear approximation is usually made in order
to solve for any of the variables as unknowns. This dictates an iterative solution (Bethel,
2003).
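A minimal numerical sketch of the coplanarity condition follows. It assumes the two rays are already expressed in object space (that is, the rotations have been applied), and it builds them directly from a hypothetical object point rather than from measured image coordinates. The function names are illustrative only.

```python
def cross(u, v):
    """Vector cross product u x v."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def coplanarity(base, ray1, ray2):
    """Triple scalar product F = b . (R1 x R2); zero when the three vectors are coplanar."""
    return dot(base, cross(ray1, ray2))


# Hypothetical geometry: two perspective centres and one object point
o1, o2 = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
p = (0.3, 0.2, -5.0)
base = tuple(b - a for a, b in zip(o1, o2))
ray1 = tuple(c - a for a, c in zip(o1, p))
ray2 = tuple(c - a for a, c in zip(o2, p))

f_ok = coplanarity(base, ray1, ray2)

# Displacing the right ray across the base direction (a y-parallax) breaks the condition
ray2_bad = (ray2[0], ray2[1] + 0.05, ray2[2])
f_bad = coplanarity(base, ray1, ray2_bad)
print(f_ok, f_bad)
```

For correctly intersecting rays F vanishes up to rounding, while the displaced ray gives a clearly non-zero value. In relative orientation the orientation parameters are adjusted iteratively until F is as close to zero as possible for all measured conjugate points.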
PHOTOGRAMMETRIC TRIANGULATION
Contents of this chapter
4 Photogrammetric Triangulation: [8 hrs]
Investigating the possibility of object space reconstruction from imagery using
4.1 Single Image
4.2 Stereo Pair (two overlapping images)
4.3 Single flight lines (strip triangulation)
4.4 Image Blocks
4.4.1 Block Adjustment of Independent Models (BAIM)
4.4.2 Bundle Block Adjustment
4.4.3 Block Adjustment with added parameters (Self-Calibration)
4.5 Advantages and disadvantages
4.6 Statistical Evaluation: Precision, Accuracy and Reliability
4.1 Introduction
Photogrammetry is the science of obtaining reliable information about objects and of
measuring and interpreting this information. The task of obtaining information is called
data acquisition. Another major task of photogrammetry is concerned with reconstructing
the object space from images. This entails two problems: geometric reconstruction (e.g.
the position of objects) and radiometric reconstruction (e.g. the gray shades of a surface).
The latter problem is relevant when photographic products are generated, such as
orthophotos. Photogrammetry is mainly concerned with the geometric reconstruction. The
object space is only partially reconstructed, however. By partial reconstruction we mean
that only a fraction of the information recorded from the object space is used for its
representation (Schenk, 2005).
The geometrical relationship between image and object space can best be established by
introducing suitable coordinate systems for referencing both spaces (Schenk, 2005). Such
relationship can be achieved by using:
Single image
Stereo-pair
Single strip or
Image blocks
Aero-triangulation is the term most frequently applied to the process of determining the
X, Y, and Z ground coordinates of individual points based on photo coordinate
measurements. Photo-triangulation is perhaps a more general term, however, because
the procedure can be applied to terrestrial photos as well as aerial photos. With improved
photogrammetric equipment and techniques, accuracies to which ground coordinates can
be determined by these procedures have become very high (Wolf, DeWitt and Wilkinson,
2014: 396).
Aero-triangulation is used extensively for many purposes. One of the principal applications
lies in extending or densifying ground control through strips and/or blocks of photos for
use in subsequent photogrammetric operations. When used for this purpose, it is often
called bridging, because in essence a “bridge” of intermediate control points is
developed between field-surveyed control that exists in only a limited number of photos
in a strip or block. Establishment of the needed control for compilation of topographic
Besides having an economic advantage over field surveying, aero-triangulation has other
benefits: (1) most of the work is done under laboratory conditions, thus minimizing delays
and hardships due to adverse weather conditions; (2) access to much of the property
within a project area is not required; (3) field surveying in difficult areas, such as marshes,
extreme slopes, and hazardous rock formations, can be minimized; and (4) the accuracy
of the field-surveyed control necessary for bridging is verified during the aero-triangulation
process, and as a consequence, chances of finding erroneous control values after initiation
of compilation are minimized and usually eliminated. This latter advantage is so meaningful
that some organizations perform bridging even though adequate field-surveyed control
exists for stereo model control. It is for this reason also that some specifications for
mapping projects require that aero-triangulation be used to establish photo control (Wolf,
DeWitt and Wilkinson, 2014).
First, the initial pair of photographs in the strip, say photos 1 and 2, is relatively oriented
at some arbitrary scale (Davis et al., 2014). It is convenient, although not necessary, to
have enough field-surveyed ground control in the beginning model of the strip to enable
that model to be absolutely oriented. After the orientation of the first model, 1/2, is
complete, photo 3 is oriented to photo 2 (Wolf, 1974: 428). This procedure
is required so as not to destroy the first model and to guarantee the continuity of the strip
model. This operation, therefore, is usually referred to as dependent relative orientation.
The scale of the model 2/3 may in general be different from that of the model 1/2. The
process of scale transfer between the two models is performed so that the scale of the
new model 2/3 is the same as that of the preceding model, 1/2. The processes of dependent
relative orientation and scale transfer are repeated for all other models throughout the
strip. At the end, a continuous strip model at an arbitrary uniform scale results. The data
from the plotter are then transferred and adjusted to fit available ground control (Davis et
al., 2014).
If there is not enough ground control in the beginning model of the strip to enable
absolute orientation, this first model may be set to some arbitrary scale and approximately
levelled. Then the strip is formed as previously described and elevations are read and
positions plotted as before for all ground control points and pass points. Pass point
positions and elevations are finally adjusted by numerical methods according to the differences
between control coordinates and measured strip coordinates for all field-surveyed control
points. The amount of field-surveyed control needed in a strip depends upon the length
of the strip. As a minimum, about two horizontal and three or four vertical points should
exist in approximately every fifth model of the strip. The use of more control than this
amount will generally improve accuracy of the control extension (Wolf, 1974: 431).
Once the only possibility, analog methods do not play a significant role nowadays
(Schenk, n.d.: 48).
Pass points for analogical control extension are normally selected in the general
photographic locations as shown in the Figure 4-1. The points may be images of natural
well-defined objects that appear in the required photo areas, or if no such points are
available, pass points may be artificially marked using a special point marking device. Even
though satisfactory natural points may exist in the required general locations on the
photographs, many photogrammetrists prefer to mark pass points artificially for two
reasons. First, a more discrete point is obtained so that more accurate measurements of
its position can be obtained. Second, the likelihood of misidentifying pass points is greatly
reduced. In analogical control extension only three pass points near the y axis of each
photo are marked as shown in the figure. When stereopairs of photos with pass points
marked in this manner are oriented in a plotter, six points appear in each stereomodel as
shown in the figure (Wolf, 1974: 432).
Figure 4-1: Idealized pass point locations for analogical control extension (top) and
locations of pass points in two adjacent stereomodels (bottom)
the images that make up a stereomodel are more “tightly” oriented with respect to each
other, whereas in fully analytical adjustments the images are oriented to optimize their fit
with respect to a block of multiple photos which may lead to residual y parallax in the
orientation between individual stereopairs. Regardless of whether the sequential or
simultaneous method is employed, the process yields coordinates of the pass points in the
ground system. Additionally, coordinates of the exposure stations can be determined in
either process. Thus, semi-analytical solutions can provide initial approximations for a
subsequent bundle adjustment (Wolf, DeWitt and Wilkinson, 2014).
After a strip model has been formed, it is numerically adjusted to the ground coordinate
system using all available control points. If the strip is short, i.e., up to about four models,
this adjustment may be done using a three-dimensional conformal coordinate
transformation. This requires that a minimum of two horizontal control points and three
vertical control points be present in the strip. More control than the minimum is desirable,
however, as it adds stability and redundancy to the solution. As discussed later in this
section, if the strip is long, a polynomial adjustment is preferred to transform model
coordinates to the ground coordinate system. In the short strip illustrated in Figure 4-2 c,
horizontal control points H1 through H4 and vertical control points V1 through V4 would be
used in a three-dimensional conformal coordinate transformation to compute the ground
coordinates of pass points a through l and exposure stations O1 through O4 (Wolf, DeWitt
and Wilkinson, 2014).
or large size computer is necessary to contain and compute the volume of data for
extensive problems with economy and speed (Wolf, 1974: 439). The analytical aero-
triangulation methods can be classified according to their mathematical models as
(Schenk, n.d.):
Figure 4-3 shows the overlap configuration of an ideal block. The forward overlap is 60%
and the side overlap 20 to 30 %. Common points are found in the overlapping areas. As
indicated in the figure, the degree of overlap varies across the block. If we move from left
to right, we find the typical pattern of 1, 2, 3, 2, 3, ..., where the numbers refer to the
number of photographs covering the same area. This sequence of numbers doubles where
adjacent strips overlap. With this standard overlap configuration, the maximum overlap is
6. This corresponds to an area of 44mm×44mm in the four corners of the photographs.
Consecutive models overlap by 20 % (Schenk, n.d.).
Figure 4-4 (a) shows the typical pattern of 9 points per photograph, or 6 points per model.
The six model points serve to orient the model after the points are known in the object
coordinate system by the process of aerotriangulation. With this regular pattern we obtain
a total of 3p + 6 object points, where p is the number of photographs per strip. With every
adjacent strip, this number is increased by two thirds. Thus, the total number of object
points, n, becomes n = 2ps + 4s + p + 2, with s the number of strips. To generalize, we
have (Schenk, n.d.):
n = a1 p s + a2 s + a3 p + a4
where
s = number of strips
p = number of photographs per strip
n = total number of object points
In Table 4-1 the coefficients a1, . . . , a4 for different point densities are listed. A block
with 3 strips, 4 photographs each, and 9 points per photograph contains 42 object points.
Compare this with the number of measured points: 12 photographs, 9 points each, or 108
points. The ratio r of measured points to object points indicates the redundancy. For a
block with 10 strips, 20 photographs each, 25 points per photograph, we obtain n = 1763
and r = 2.84. The larger the redundancy the higher the reliability of a block adjustment.
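The counting rule for the 9-points-per-photograph pattern can be checked with a few lines of Python. This is only a sketch of the arithmetic quoted above; the function names are illustrative and the coefficients 2, 4, 1, 2 are the ones from n = 2ps + 4s + p + 2, not the other rows of Table 4-1.

```python
def object_points(strips, photos_per_strip):
    """Object points for the regular 9-points-per-photograph pattern:
    n = 2*p*s + 4*s + p + 2."""
    p, s = photos_per_strip, strips
    return 2 * p * s + 4 * s + p + 2


def redundancy(strips, photos_per_strip, points_per_photo=9):
    """Ratio r of measured image points to object points."""
    measured = strips * photos_per_strip * points_per_photo
    return measured / object_points(strips, photos_per_strip)


n = object_points(strips=3, photos_per_strip=4)
r = redundancy(strips=3, photos_per_strip=4)
print(n, round(r, 2))  # 42 object points, 108 measured points
```

For a single strip the function reduces to 3p + 6, and each further strip adds 2p + 4 points, which is the "two thirds" increase mentioned in the text.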
Table 4-1: Coefficients for computing the total number of object points
With the enormous increase of computer performance, the method of adjusting strips by
polynomials lost its significance. Schut was one of the pioneers who developed this method
in the sixties. His published programs were very popular and in widespread use.
The concept of the independent model method is to simultaneously determine the absolute
orientation of all models of a block by a least squares adjustment. Fig. 4.12 illustrates the
concept (Schenk, n.d.).
Inclusion of the perspective centers as tie points is critical for the geometric strength of
the solution. In standard block configurations, stereomodels overlap only at the very edges
in the along-strip direction. If tie points were chosen only within the edge regions, the tie
points would be nearly collinear, allowing the connected models to rotate around the
overlap region. Adding the perspective center as a tie point reinforces the connection by
preventing this rotation. If an analog stereoplotter is used to form the models, the
coordinates of the perspective center must be measured while the model is set in the
instrument. If analytical methods are used, the perspective center coordinates are
determined as part of the solution (Mikhail, Bethel and McGlone, 2013).
Independent model block adjustment was widely adopted, since it could use either existing
stereoplotters or comparators for input data generation, the computational expense was
well suited to available computers, and the accuracy was much better than that of the
polynomial methods (Mikhail, Bethel and McGlone, 2013: 122).
Several variations of the independent model method were developed. One of the most
successful was the implementation of PAT-M-43, developed at the University of Stuttgart.
This divided the seven-parameter adjustment into a four-parameter planimetric adjustment
and a three-parameter height adjustment. The planimetric adjustment incorporated the
scale factor, the kappa rotation
around the Z-axis, and the X and Y translations, while the height adjustment incorporated
the two levelling rotations, omega and phi, and the Z translation. These two adjustments
are not independent, since the planimetric adjustment assumes that the model is level
and the height adjustment assumes that the points are properly positioned planimetrically.
The two adjustments must therefore be repeated alternately until the corrections to the
point coordinates are negligible. Since the conditions are usually close to being met at the
start of the procedure and the solution is nearly linear, only a few iterations are typically
required (Mikhail, Bethel and McGlone, 2013: 122).
A variation of independent model adjustment is to work with larger units, such as triplets
or sub-blocks. Using larger basic units further reduces the number of parameters and
simplifies editing of tie points for blunders (Mikhail, Bethel and McGlone, 2013: 123).
Bundle adjustment is the most accurate and flexible method of triangulation currently in
use. The accuracy levels attainable in aerial triangulation allow its use for geodetic control
extension, while in close-range applications, precision of 1:500,000 of the largest
Compiled by: Bikash Sherchan
Lecture Notes (03 March 2019) Chapter 4 Photogrammetric Triangulation
dimension of the measured object has been reported. Numerous extensions of the basic
method have increased both flexibility and accuracy. The addition of self-calibration
parameters corrects for remaining systematic errors and increases the overall accuracy of
the adjustment. Geometric relationships within the scene, such as collinearity or
coplanarity, can be incorporated to add information and thereby increase the precision.
High-quality navigational data from GPS receivers can be rigorously included to reduce
control requirements dramatically. Bundle block adjustment is based on the collinearity
equations (Section 3.5.1) (Mikhail, Bethel and McGlone, 2013).
However, there are many situations in which we would like to determine the systematic
errors under operational conditions as part of the solution. The examination of residuals
from block adjustments often shows clear trends attributable to uncorrected systematic
error, due to differences in environmental conditions between the calibration laboratory
and the operational environment. In some cases, it is not feasible to completely calibrate
the camera in a laboratory, as in close-range applications that use nonmetric cameras.
Although many navigational sensors, such as inertial navigation systems (INS) or statoscopes,
are known to have systematic drift errors, the exact amount of drift cannot be determined
beforehand and must be part of the solution. For these reasons, block adjustment with
self-calibration parameters was developed (Mikhail, Bethel and McGlone, 2013: 124).
Examination of the collinearity equations shows that the principal point coordinates and
the focal length appear directly in the equations and could be treated as additional
unknowns in the solution. It would seem that the refined image coordinates could be
replaced by unrefined image coordinates and the correction equations for lens distortion,
so that the coefficients can be obtained from the solution (Mikhail, Bethel and McGlone,
2013).
The problem with naively solving for the interior orientation parameters in this way is that
in many cases there is not enough geometric information to separate the effects of the
parameters so that they can be determined. In this case, we say that the parameters are
correlated, since choosing a value for one parameter fixes the value of the other. For
example, take the case of a perfectly vertical aerial image of flat terrain. There are an
infinite number of combinations of flying height and focal length that will yield the same
image. If we are given the measurements of an object in the image and on the ground,
we cannot determine both the focal length and flying height, since their effects cannot be
separated. Exactly the same thing happens in an aerial triangulation with added parameters
– the geometry is often such that the effects of the self-calibration parameters cannot be
distinguished from each other or from the effects of the orientation parameters (Mikhail,
Bethel and McGlone, 2013: 124).
Note that parameters that are not perfectly correlated can still cause problems with the
solution. In the previous example, if we are given the measurements of two objects at
different elevations, we can theoretically determine both the flying height and focal length.
However, in a practical sense, we will still have trouble determining both parameters
unless the difference in elevation is significant with respect to the flying height. If the
difference in scale for the two objects is not greater than the effects of measurement or
computational roundoff errors, the parameters determined will not be accurate (Mikhail,
Bethel and McGlone, 2013: 124).
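The correlation between focal length and flying height can be illustrated numerically with the vertical-photo scale relation, scale = f / (H - h). The sketch below uses made-up values (a 100 m object, focal lengths of 152 mm and 304 mm, flying heights of 1520 m and 3040 m); none of these numbers come from the cited text.

```python
import math


def image_length(ground_length, focal_length, flying_height, elevation=0.0):
    """Image size of a horizontal object on a vertical photo: scale = f / (H - h).
    All lengths are in metres."""
    return ground_length * focal_length / (flying_height - elevation)


# Flat terrain: doubling both f and H leaves the image unchanged,
# so the two parameters cannot be separated.
flat_a = image_length(100.0, 0.152, 1520.0)
flat_b = image_length(100.0, 0.304, 3040.0)
print(math.isclose(flat_a, flat_b))

# A second object 50 m above the first separates the two hypotheses clearly...
high_a = image_length(100.0, 0.152, 1520.0, elevation=50.0)
high_b = image_length(100.0, 0.304, 3040.0, elevation=50.0)
print(abs(high_a - high_b))

# ...but with only 0.5 m of relief the difference is a few micrometres,
# which is easily lost in measurement noise.
low_a = image_length(100.0, 0.152, 1520.0, elevation=0.5)
low_b = image_length(100.0, 0.304, 3040.0, elevation=0.5)
print(abs(low_a - low_b))
```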
RPC are not given or considered not accurate enough, you need GCPs. For aerial surveys
we can solve the exterior orientation in one of the following ways (Ip, 2005):
a) Indirect camera orientation: identify GCPs in the image and measure the row-column
coordinates; acquire (X,Y,Z) coordinates for these points, e.g. by GPS or a sufficiently
accurate topographic map; use adequate software to calculate the exterior orientation
parameters, after having completed the interior orientation.
b) Direct camera orientation: make use of GPS and IMU recordings during image
acquisition employing digital photogrammetric software.
c) Integrated camera orientation, which is a combination of (a) and (b)
For images from very high resolution sensors such as Ikonos or QuickBird, adding one
GCP can already considerably improve the exterior orientation as defined by the RPC. For
Cartosat images it is advisable to improve the exterior orientation by at least five GCPs.
For orienting a frame camera image you need at least three GCPs (unless you also have
GPS and IMU data). After orientation, you can use the terrain coordinates of any reference
point and calculate its position in the image. The differences between measured and
calculated image coordinates (the residuals) allow you to estimate the accuracy of
orientation. As you may guess, advanced camera/sensor/image orientation is a topic for
further studies (Tempfli et al., 2009).
Lecture Notes (14 Feb. 2019) Chapter 5 Direct vs Indirect Orientation
The indirect orientation approach uses aerial triangulation, which relies on adjusting a
network of tie points in a block of images with a sufficient number of well distributed
known GCPs.
When the availability of GCPs is in question, such as within forests, snow covered grounds,
deserts, or along a coastline, the ability to resolve the exterior orientation parameters
indirectly is limited. Often these areas are also very important when emergency response
application is required, such as in the case of forest fires, flooding and hurricanes. Such
an application requires fast orthophoto generation, and there is insufficient time and
resources to extract the exterior orientation parameters using traditional aerial
triangulation. In addition, some projects only require a single strip or single photo
orientation. For instance, in the case where there is an existing DEM, the use of traditional
aerial triangulation to determine EOP is impractical because it requires excessive GCPs and
additional overlapping photos. Hence in many applications direct orientation is either the
only practical solution, or the most cost effective solution (Ip, 2005).
The ground accuracy of object coordinates, when using a direct orientation, is dependent
upon the GPS accuracy for position and the IMU accuracy for orientation. The orientation
error produces a position error on the ground as a function of the flying height (or photo
scale). At best, GPS provides about 5 to 10 cm RMS when using dual-frequency
differential processing with highly accurate GPS base station data. For a high-end direct
georeferencing system using ring-laser gyros (RLG), fibre optic gyros (FOG) or dry-tuned
gyros (DTG), the orientation accuracy is typically about 0.005 deg RMS for roll and pitch,
and about 0.008 deg RMS for the heading. For traditional large scale mapping projects
that are flown relatively low above the ground, the ground accuracy becomes dominated
by the DGPS position error [Mostafa et al (2001b)]. Therefore, for large scale mapping
(>1:1000) projects requiring centimetre accuracy, direct EO usage from DG system is
marginal for film camera system [Dardanelli et al (2004)]. To improve the GPS positioning
accuracy, further research has to be done [Brunton et al (2001)]. However, using AT to
improve the photo centre positioning accuracy has long been considered by researchers
as an acceptable solution, and has been introduced in OEEPE tests as the Integrated
Sensor Orientation (ISO) [Jacobsen and Wegmann (2001), Heipke et al (2001a)]. (Ip,
2005)
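A rough feel for how the two error sources compare can be obtained with a short calculation. The sketch below assumes that an attitude error maps to the ground as H·tan(angle) and that position and attitude errors are independent, so they combine as a root sum of squares. Both are simplifying assumptions, and the flying heights are hypothetical.

```python
import math


def ground_error_from_attitude(flying_height_m, angle_error_deg):
    """Ground displacement caused by a roll or pitch error at a given flying height."""
    return flying_height_m * math.tan(math.radians(angle_error_deg))


def total_horizontal_error(flying_height_m, gps_rms_m=0.05, attitude_rms_deg=0.005):
    """Root-sum-square of the GPS position error and the attitude-induced error
    (assumes the two are independent)."""
    attitude_part = ground_error_from_attitude(flying_height_m, attitude_rms_deg)
    return math.hypot(gps_rms_m, attitude_part)


for height in (300.0, 1000.0, 3000.0):
    att = ground_error_from_attitude(height, 0.005)
    print(height, round(att, 3), round(total_horizontal_error(height), 3))
```

At the low flying height the attitude term is smaller than the 5 cm GPS term, which matches the statement that large-scale projects are dominated by the DGPS position error.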
Accuracy Analysis of Reconstructed Points in Object Space From Direct and Indirect
Exterior Orientation Methods - Ayman Habib and Toni Schenk, Department of Civil and
Environmental Engineering The Ohio State University
Towards a Closer Combination of Direct and Indirect Sensor Orientation of Frame Cameras
- Birger Reese and Christian Heipke
Certain factors, depending generally on the purpose of the photography, must be specified
to guide a flight crew in executing its mission of taking aerial photographs. Some of them
are (1) boundaries of the area to be covered, (2) required scale of the photography, (3)
camera focal length and format size, (4) endlap, and (5) sidelap. Once these elements
have been fixed, it is possible to compute the entire flight plan and prepare a flight map
on which the required flight lines have been delineated. The pilot then flies the specified
flight lines by choosing and correlating headings on existing natural features shown on the
flight map. In the most modern systems, the flight planning is done using a computer and
the coordinates of flight lines are calculated. Then the aircraft is automatically guided by
an on-board GNSS system along the planned flight lines (Ghilani and Wolf, 2012: 828).
The purpose of the photography is the paramount consideration in flight planning. For
example, in taking aerial photos for topographic mapping using a stereo-plotter, endlap
should optimally be 60% and sidelap 30%. The required scale and contour interval of the
final map must be evaluated to settle flying height. Enlargement capability from photo
scale to map compilation scale is restricted for stereo-plotting, and generally should not
exceed about 5 if satisfactory accuracy is to be achieved. By these criteria, if required map
scale is 200 ft/in., photo scale becomes fixed at 1000 ft/in. If the camera focal length is 6
in., flying height is established by Equation (27.2) at 6(1000) = 6000 ft above average
terrain. Some organizations may push this factor higher than 5, but it should be done with
caution (Ghilani and Wolf, 2012: 828).
Information ordinarily calculated in flight planning includes (1) flying height above mean
sea level, (2) distance between exposures, (3) number of photographs per flight line, (4)
distance between flight lines, (5) number of flight lines, and (6) total number of
photographs. A flight plan is prepared based on these items (Ghilani and Wolf, 2012:
829).
NUMERICAL Example: A flight plan for an area 10 mi wide and 15 mi long is required. The average
terrain in the area is 1500 ft above datum. The camera has a 6 in. focal length with 9 × 9
in. format. Endlap is to be 60%, sidelap 25%. The required scale of the photography is
1:12,000 (1000 ft/in.).
Solution
1. Flying height above datum, H: Scale = f / (H - havg). With f = 6 in. = 0.5 ft and
havg = 1500 ft, 1/12,000 = 0.5 / (H - 1500), which gives H = 7500 ft.
2. Distance between exposures, de: endlap is 60%, so the linear advance per photograph
is 40% of the total coverage of 9 in. × 1000 ft/in. = 9000 ft. Thus, the distance between
exposures is 0.40 × 9000 = 3600 ft.
3. Total number of photographs per flight line: length of each flight line = 15 mi (5280
ft/mi) = 79,200 ft, so the number of photos per line = 79,200/3600 + 1 = 23.
Adding two photos on each end to ensure complete coverage, the total is 23 + 2 + 2
= 27 photos per flight line.
4. Distance between flight lines: sidelap is 25%, so the lateral advance per flight line is
75% of the total photographic coverage, 0.75 × 9000 = 6750 ft.
5. Number of flight lines: width of area = 10 mi (5280 ft/mi) = 52,800 ft, so the
number of spaces between flight lines = 52,800 ft / 6750 ft per line = 7.8 (say 8), and the
number of flight lines = 8 + 1 = 9.
(Note: The first and last flight lines should either coincide with or be near the edges
of the area, thus providing a safety factor to ensure complete coverage.)
6. Total number of photographs: 27 photos per flight line × 9 flight lines = 243 photos.
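The same flight-plan arithmetic can be written as a small Python function. This is a sketch for a rectangular area only; the function and parameter names are illustrative. It assumes flight lines run along the longer side, with the first and last lines on the area edges, as in the example.

```python
import math

FT_PER_MILE = 5280


def flight_plan(width_mi, length_mi, terrain_ft, focal_in, format_in,
                endlap_pct, sidelap_pct, scale_denominator, extra_per_end=2):
    """Compute the six flight-planning quantities for a rectangular area."""
    # 1. Flying height above datum from Scale = f / (H - h_avg).
    flying_height = (focal_in / 12) * scale_denominator + terrain_ft

    # Ground distance covered by one side of the photo format.
    coverage_ft = (format_in / 12) * scale_denominator

    # 2. Distance between exposures: the share of the coverage that is not endlap.
    exposure_spacing = coverage_ft * (100 - endlap_pct) / 100

    # 3. Photos per flight line, plus extra photos at each end for safety.
    line_length_ft = length_mi * FT_PER_MILE
    photos_per_line = (math.ceil(line_length_ft / exposure_spacing) + 1
                       + 2 * extra_per_end)

    # 4. Distance between flight lines: the share of the coverage that is not sidelap.
    line_spacing = coverage_ft * (100 - sidelap_pct) / 100

    # 5. Number of flight lines: spaces rounded up, plus one.
    spaces = math.ceil(width_mi * FT_PER_MILE / line_spacing)
    flight_lines = spaces + 1

    # 6. Total number of photographs.
    return {
        "flying_height_ft": flying_height,
        "exposure_spacing_ft": exposure_spacing,
        "photos_per_line": photos_per_line,
        "line_spacing_ft": line_spacing,
        "flight_lines": flight_lines,
        "total_photos": photos_per_line * flight_lines,
    }


plan = flight_plan(width_mi=10, length_mi=15, terrain_ft=1500, focal_in=6,
                   format_in=9, endlap_pct=60, sidelap_pct=25,
                   scale_denominator=12000)
print(plan)
```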
Rectangular project areas are most conveniently covered with flight lines oriented north
and south or east and west. As illustrated in figure ____, this is desirable because the pilot
can take advantage of section lines and roads running in the cardinal directions and fly
parallel to them (Wolf, 1974).
If the project area is irregular in shape or if it is long and narrow and skewed to cardinal
directions, it may not be economical to fly north and south or east and west. In planning
coverage for such irregular areas, it may be most economical to align flight lines parallel
to the project boundaries as nearly as possible. Flight planning templates are useful for
determining best and most economical photographic coverage for mapping, especially for
small areas. These templates, which show blocks of neat models, are prepared of
transparent plastic sheets at scales which correspond to the scales of the base maps upon
which flight plan is prepared. The templates are then simply superimposed on the map
over the project area and oriented in the position which yields best coverage with the
fewest number of neat models. Such a template is shown in figure ____. The crosses
represent exposure stations, and these may be individually marked on the flight map. This
template method of flight planning is exceptionally useful in planning exposure station
locations when artificial targets are used (Wolf, 1974).
Once camera focal length, photo scale, and image overlaps have been selected, the flight
map can be prepared.
5.3.2 Specifications
Most flight plans include a set of detailed specifications which outline the materials,
equipment, and procedures to be used on the project. These specifications include
requirements and tolerances pertaining to photographic scale (including camera focal
length and flying height), image overlaps, tilt, crab, and photographic quality (Wolf, 1974).
The following is a sample set of specifications for aerial photography:
Flight plans are normally portrayed on a map for the flight crew. However, old
photography, an index mosaic, or even a satellite image may be used for this purpose.
DIGITAL PHOTOGRAMMETRY
Contents of this chapter
6 Digital Photogrammetry: [5 hrs]
6.1 Digital Imagery
6.2 Digital Image Processing
6.3 Digital Image Resampling
6.4 Digital Image Compression
6.5 Digital Image Measurement
6.6 Feature Extraction
The digital or soft-copy photogrammetric systems are much simpler in design than the
analytical systems. They consist of a computer with a stereo-capable graphics system, 3D
glasses with electronic shutters, and a 3D mouse as a user interface. The 3D mouse is a
reconfigured optical mouse with x, y, and z motion control and several user configurable
buttons. All other hardware of an analytical system has been replaced with software
programming (Bhatta, 2011).
Grayscale Image
A black and white image is made up of pixels each of which holds a single number
corresponding to the gray level of the image at a particular location. These gray levels
span the full range from black to white in a series of very fine steps, normally 256 different
grays. Since the eye can barely distinguish about 200 different gray levels, this is enough
to give the illusion of a stepless tonal scale as illustrated below:
Assuming 256 gray levels, each black and white pixel can be stored in a single byte (8
bits) of memory.
Figure 6-2:
Note that for images of the same size, a black and white version will use one third of the
memory of a color version.
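The memory comparison is simple arithmetic. The sketch below assumes one byte per pixel for grayscale and three bytes per pixel (one each for red, green and blue) for color. The 1024 × 768 image size is an arbitrary example.

```python
def image_bytes(width, height, bytes_per_pixel):
    """Uncompressed storage needed for an image, in bytes."""
    return width * height * bytes_per_pixel


gray = image_bytes(1024, 768, bytes_per_pixel=1)   # 8-bit grayscale
color = image_bytes(1024, 768, bytes_per_pixel=3)  # assumed 24-bit RGB
print(gray, color, color // gray)
```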
processing, the digital number of each pixel in an original image is input to a computer,
with its inherent row and column location. The computer operates on the digital number
according to some preselected mathematical function or functions, and then stores the
results in another array which represents the new or modified image. When all pixels of
the original image have been processed in this manner and stored in the new array, the
result is a new digital image.
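The pixel-by-pixel procedure described above can be sketched in plain Python. The linear contrast stretch used here is just one possible preselected function, and the 2 × 2 image is a made-up example. Clamping to the range 0 to 255 is an added assumption for 8-bit output.

```python
def apply_point_operation(image, func):
    """Return a new image in which every digital number dn is replaced by func(dn),
    rounded and clamped to the 8-bit range. The input image is left unchanged."""
    return [[max(0, min(255, int(round(func(dn))))) for dn in row]
            for row in image]


original = [[10, 50],
            [100, 200]]

# Example function: stretch the range 10..200 to the full range 0..255.
stretched = apply_point_operation(original, lambda dn: (dn - 10) * 255 / 190)
print(stretched)
```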
Many different types of digital image processes can be performed. One type falls under
the general heading of preprocessing operations. These are generally aimed at
correcting for distortions in the images which stem from the image acquisition process,
and they include corrections for such conditions as scanner or camera imperfections and
atmospheric refraction. Another type of digital image processing, called image
enhancement, has as its goal the improvement of the visual quality of images. Image
enhancement makes interpretation and analysis of images easier, faster, and more
accurate; and thus it can significantly improve the quality of photogrammetric products
developed from digital images, and reduce the cost of producing them. Digital orthophotos
in particular benefit significantly from the improved image quality that results from image
enhancements.
A third type of digital image processing, called image classification, attempts to replace
manual human visual analysis with automated procedures for recognizing and identifying
objects and features in a scene. Image classification processes have been widely used in
a host of different interpretation and analysis applications, as well as in the production of
a variety of thematic maps. They are also used in automated soft copy mapping systems.
A final type of digital image processing, data merging, combines image data for a certain
geographic area with other geographically referenced information in the same area. The
procedures may overlay multiple images of the same area taken at different dates—a
technique which is very useful in identifying changes over time, such as monitoring a forest
fire or following the spread of a disease in a certain tree species. The procedures can also
combine image data with nonimage data such as DEMs, land cover, and soils. These types
of digital image processing are extremely important in the operation of geographic
information systems.
There are several techniques available for resampling digital images, although three
particular ones are by far the most prevalent. They are known as nearest-neighbour
interpolation, bilinear interpolation, and bicubic interpolation. Other, more computationally
intensive techniques are generally not employed since they tend to be sensitive to sensor
noise which exists in digital imagery.
Figure 6-3: Relationship between pixels from the originally sampled image and the
resampled image
6.4.1 Nearest Neighbour
The nearest-neighbour interpolation is the simplest of the three. As its name implies, the
DN chosen will be that of the image pixel whose center is closest to the center of the grid
cell. From a computational standpoint, all that is required is to round off the fractional row
and column values to the nearest integral value. Figure 6-4 shows the DNs for a 4 x 4
subarea from a digital image. A pixel is superimposed at a fractional location (R = 619.71,
C = 493.39). Rounding these values to the nearest integer yields 620
and 493 for the row and column indices, respectively. Thus, the resampled value is 56.
Figure 6-4: A 4x4 subarea of image pixels with superimposed grid cell at a fractional
location
The primary advantage of the nearest-neighbor method is that it is the fastest of the three
techniques in terms of computational time. It also has the advantage of not modifying the
original image data, which is important if remote-sensing image classification will be
performed. However, since a continuous interpolation is not being performed, the resultant
appearance can be somewhat jagged or blocky.
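A minimal sketch of nearest-neighbour resampling follows. The 4 × 4 array is hypothetical: only the value 56 at row 620, column 493 is taken from the text, and the other DNs are invented stand-ins for the figure. The offsets `row0` and `col0` are illustrative parameters giving the image coordinates of the subarea's first row and column.

```python
import math


def nearest_neighbour(image, row, col, row0=0, col0=0):
    """Return the DN of the pixel whose centre is closest to (row, col).
    floor(x + 0.5) is used so that halves always round up; Python's built-in
    round() would round halves to the nearest even integer instead."""
    r = math.floor(row + 0.5)
    c = math.floor(col + 0.5)
    return image[r - row0][c - col0]


# Hypothetical subarea covering rows 618..621 and columns 492..495.
subarea = [
    [58, 54, 65, 65],  # row 618
    [63, 62, 68, 58],  # row 619
    [51, 56, 59, 53],  # row 620
    [59, 49, 50, 52],  # row 621
]

print(nearest_neighbour(subarea, 619.71, 493.39, row0=618, col0=492))
```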
The primary advantage of bilinear interpolation is the smoother appearance of the result.
This appearance is slightly compromised by the fact that some high-frequency detail is
filtered out. In other words, edges in the scene are slightly less distinct. In terms of
computational time, bilinear interpolation is slower than nearest-neighbor interpolation but
faster than bicubic interpolation.
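For comparison, bilinear interpolation can be sketched as two linear interpolations along the columns followed by one along the rows, using the four pixels surrounding the fractional location. The function below is an illustrative sketch with a made-up 2 × 2 image. It does no bounds checking at the image edges.

```python
import math


def bilinear(image, row, col):
    """Interpolate a DN at a fractional (row, col) from the four surrounding pixels."""
    r0, c0 = math.floor(row), math.floor(col)
    dr, dc = row - r0, col - c0
    # Interpolate along the columns in the upper and lower rows...
    top = (1 - dc) * image[r0][c0] + dc * image[r0][c0 + 1]
    bottom = (1 - dc) * image[r0 + 1][c0] + dc * image[r0 + 1][c0 + 1]
    # ...then between the two rows.
    return (1 - dr) * top + dr * bottom


tile = [[0, 10],
        [20, 30]]
print(bilinear(tile, 0.5, 0.5))
print(bilinear(tile, 0.25, 0.75))
```

Because the result is a weighted average, it can produce DNs that do not occur in the original image, which is why nearest-neighbour is preferred when classification follows.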
Bicubic interpolation is the most rigorous resampling technique of the three on the basis
of signal processing theory. It achieves the smooth appearance without sacrificing as much
high frequency (edge) detail as is characteristic of the bilinear interpolation. The reason
for this is illustrated in Fig. E-5. This figure shows a profile of four consecutive samples
along a single row of a digital image. The curve represents a smooth (cubic spline)
interpolation of image brightness based on the samples at the four dots. The dashed line
represents the linear interpolation between the two center pixels. For the indicated
fractional column, the bicubic interpolation is able to capture the trend of the signal better
than bilinear interpolation. By capturing trends in the signal, the bicubic interpolation
retains high frequency information that would be clipped by the bilinear interpolation
method. However, this enhanced appearance comes at a penalty in terms of
computational time.
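The difference between the two interpolators can be seen in one dimension with four consecutive samples. The sketch below uses a Catmull-Rom cubic as a stand-in for the smooth curve in the figure. That choice is an assumption, since the text does not specify the cubic kernel, and the sample values are invented. A bicubic resampler applies the same idea along rows and then along columns.

```python
def linear(p1, p2, t):
    """Linear interpolation between the two centre samples, 0 <= t <= 1."""
    return (1 - t) * p1 + t * p2


def catmull_rom(p0, p1, p2, p3, t):
    """Cubic interpolation between p1 and p2 using all four samples, 0 <= t <= 1."""
    return 0.5 * (2 * p1
                  + (-p0 + p2) * t
                  + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t ** 2
                  + (-p0 + 3 * p1 - 3 * p2 + p3) * t ** 3)


# Four made-up samples along one image row, with a brightness peak in the middle.
samples = (10, 50, 60, 20)
print(linear(samples[1], samples[2], 0.5))
print(catmull_rom(*samples, 0.5))
```

Halfway between the two centre samples the linear value stays on the straight line, while the cubic value rises above it because the outer samples indicate a peak.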
Compression Ratio
Lossy Compression
In lossy compression techniques, degradations of gray values are allowed in the
reconstructed image in exchange for a reduced bit rate as compared to lossless schemes.
Typically, lossy compression algorithms consist of three steps:
Loss-less Compression
Some image compression applications require the reconstructed image to be numerically
identical to the original image on a pixel-by-pixel basis. An example is medical imaging,
where lossy compression schemes may compromise diagnostic accuracy. Lossless
compression schemes are also important in remote sensing, where the spectral
characteristics of image have to be preserved. As one might expect, the price to be paid
for an error-free image is a much lower compression ration as compared to lossy schemes
(Rabbani et al., 1990).
The primary difference between lossy and the lossless schemes is the inclusion or
quantization in lossy techniques. By quantization, the number of possible output symbols
is reduced. The reduction of the number of output symbols at the quantization step leads
to degradations in the reconstructed image in exchange for a higher compression ratio.
For and 8-bit image, the maximum compression ratio that can be obtained using lossless
schemes is 8 divided by the entropy of that image.
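The bound stated above can be checked numerically. The sketch below assumes that "entropy" means the entropy of the gray-level histogram in bits per pixel; the pixel values and function names are made up for illustration.

```python
import math
from collections import Counter

def entropy_bits(pixels):
    """Entropy of the gray-level histogram, in bits per pixel."""
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def max_lossless_ratio(pixels, bits_per_pixel=8):
    """Bits per pixel divided by the entropy (infinite for a constant image)."""
    h = entropy_bits(pixels)
    return float("inf") if h == 0 else bits_per_pixel / h

# Hypothetical 16-pixel image using only four gray levels.
pixels = [0] * 8 + [64] * 4 + [128] * 2 + [255] * 2

h = entropy_bits(pixels)
ratio = max_lossless_ratio(pixels)
print(f"entropy = {h:.2f} bits/pixel, bound on ratio = {ratio:.2f}")
```

An image with a nearly uniform histogram has an entropy close to 8 bits, so the bound approaches 1 and almost no lossless compression is possible under this model.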
The most common lossless scheme is Lossless Predictive Coding. It can be implemented
by using the same steps as explained in the lossy scheme (Rabbani et al., 1990) with the
exclusion of the quantization step.
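The idea can be illustrated with a minimal sketch. It assumes the simplest possible predictor (the previous pixel in the same row); real coders may use other predictors, and the entropy coding of the residuals is not shown.

```python
def encode_row(row):
    """Replace each pixel by its difference from the predicted value."""
    residuals = []
    prediction = 0  # assumed prediction for the first pixel
    for value in row:
        residuals.append(value - prediction)
        prediction = value  # previous-pixel predictor
    return residuals

def decode_row(residuals):
    """Invert encode_row exactly; no quantization, so nothing is lost."""
    row = []
    prediction = 0
    for residual in residuals:
        value = prediction + residual
        row.append(value)
        prediction = value
    return row

row = [100, 102, 103, 103, 101, 98]   # hypothetical gray values
residuals = encode_row(row)
print(residuals)
print(decode_row(residuals) == row)
```

After the first value the residuals are small numbers clustered around zero, so they have a lower entropy than the original gray values and can be coded with fewer bits.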
The amount of compression provided by any of the various processes is dependent on the
characteristics of the particular image being compressed, as well as on the picture quality
desired by the application and the desired speed of compression and decompression.
The first class of coding processes, based on the discrete cosine transform (DCT), is lossy, thereby allowing substantial
compression to be achieved while producing a reconstructed image with high visual fidelity
to the encoder’s source image. The second class of coding processes is not based upon
the DCT and is provided to meet the needs of applications requiring lossless compression.
These lossless encoding and decoding processes are used independently of any of the
DCT-based processes.
PHOTOGRAMMETRIC PRODUCTS
within 1.645 times the standard deviation, to be accurate to within half the contour
interval would require a standard deviation of 5/1.645 = 3 m.)
To estimate the maximum flying height that can be used to produce maps with a given
contour interval, photogrammetrists use the C-factor, which is the ratio of the flying
height to the contour interval. The C-factor is influenced by all parts of the mapping
system, including the camera, the quality of the imagery, the stereoplotter, and the
operator. However, it can provide a rough guide for standard production situations. Older
direct-projection analog plotters were rated at a C-factor of 700 to 1500, while analytical
plotters have C-factors of 2000-2200. For softcopy (digital) systems using scanned aerial
imagery, the C-factor is affected by the same geometric factors as for analog plotters and
also by the resolution at which the image is scanned. C-factors between 800 and 2100
have been estimated for softcopy systems, depending on the scanning resolution (Mikhail,
Bethel and McGlone, 2013: 226).
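Since the C-factor is the flying height divided by the contour interval, the maximum flying height follows by multiplication. The short sketch below uses C-factor values from the ranges quoted above; the 2 m contour interval is only an example.

```python
def max_flying_height(c_factor, contour_interval):
    """Maximum flying height = C-factor x contour interval (same length unit)."""
    return c_factor * contour_interval

contour_interval = 2.0  # metres, example value
for system, c_factor in [("analog plotter (low)", 700),
                         ("analog plotter (high)", 1500),
                         ("analytical plotter", 2000)]:
    height = max_flying_height(c_factor, contour_interval)
    print(f"{system}: about {height:.0f} m")
```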
Contour maps usually include spot elevations at the tops of hills, in road intersections, and
at other points as required to fully describe the topography (Mikhail, Bethel and McGlone,
2013: 227).
If the terrain is relatively flat, the rectified image can be enlarged to a nominal scale and
used as a low-precision map. However, the more the terrain deviates from a plane, the
more inaccurate such a map will be. Even though rectified prints are relatively inexpensive
to produce, they have been replaced in most applications by digital orthophotos (Mikhail,
Bethel and McGlone, 2013: 227).
Manually produced image mosaics are seldom used now, except as indices to locate
specific images within a set of flight lines. Most mosaics are now generated automatically
from digital orthophotos (Mikhail, Bethel and McGlone, 2013: 228).
For instance, one current product of the USGS is the digital raster graphic (DRG), a
1:24,000 topographic map scanned into digital image form. This is distributed as a TIFF
image file, which can be displayed and manipulated with standard image manipulation
programs. However, the user cannot specify that only the roads and drainage be displayed
or have the computer calculate the total mileage of two-lane roads shown on the map. In
contrast, given the digital elevation model, the user can generate alternative
representations of the data or process the data layers to extract new information (Mikhail,
Bethel and McGlone, 2013).
Another example of the difference between a simple graphical representation and a digital
data structure is the usefulness of different representations of elevation data. Contours
may be represented digitally as connected strings of point coordinates with attached
attributes indicating their elevation. In some sense a digital file of contours is a digital
elevation model. However, contours cannot be easily or efficiently used to determine the
elevation of an arbitrary point. Instead, digital contours are primarily a display
representation (Mikhail, Bethel and McGlone, 2013).
A digital elevation model represents the Earth’s surface elevation digitally as an array of
points. The most common DEM format is the raster grid, with elevations given at regularly
spaced points, or posts. DEMs are often classified by their post spacing, for example, a
1-arc-second or approximately 30 m DEM. A smaller horizontal spacing usually implies more
accurate elevation values, but in practice elevation accuracy is determined by the
production method and the product specifications. The horizontal spacing may be relative
to any coordinate system; for instance, UTM with post spacings given in meters or
geodetic coordinate systems with post spacings specified in arc-seconds (Mikhail, Bethel
and McGlone, 2013: 228).
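Unlike contours, a raster DEM makes it easy to look up the elevation of an arbitrary point. The sketch below is one possible layout, not a standard file format: the class name, the origin convention (rows increasing with Y) and the coordinates are assumptions, and bilinear interpolation between the four surrounding posts is used.

```python
import math
from dataclasses import dataclass

@dataclass
class GridDEM:
    x0: float        # X of the post at column 0
    y0: float        # Y of the post at row 0
    spacing: float   # post spacing in ground units
    z: list          # z[row][col], at least 2 x 2 posts

    def elevation(self, x, y):
        """Bilinear interpolation of Z at an arbitrary (x, y)."""
        col = (x - self.x0) / self.spacing
        row = (y - self.y0) / self.spacing
        n_rows, n_cols = len(self.z), len(self.z[0])
        if not (0 <= col <= n_cols - 1 and 0 <= row <= n_rows - 1):
            raise ValueError("point lies outside the DEM")
        c0 = min(int(math.floor(col)), n_cols - 2)
        r0 = min(int(math.floor(row)), n_rows - 2)
        dc, dr = col - c0, row - r0
        z = self.z
        near = (1 - dc) * z[r0][c0] + dc * z[r0][c0 + 1]
        far = (1 - dc) * z[r0 + 1][c0] + dc * z[r0 + 1][c0 + 1]
        return (1 - dr) * near + dr * far

# Hypothetical 30 m DEM with four posts, in UTM-like coordinates.
dem = GridDEM(x0=500000.0, y0=3000000.0, spacing=30.0,
              z=[[100.0, 110.0],
                 [120.0, 130.0]])
print(dem.elevation(500015.0, 3000015.0))
```

The lookup needs only the origin, the post spacing and four array values, which is why the grid form is preferred for analysis.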
DEMs are currently produced by both manual and automated methods. In manual
production, either the stereoplotter sets the floating mark at the horizontal position of
each post and the operator places it on the ground, or the system drives along a profile
while the operator keeps the mark on the ground. The DEM may also be interpolated from
manually generated contours, although it is now more common for contours to be
generated from a DEM. Automated systems use computer vision techniques to perform
the operator’s task of determining the ground surface elevation by matching corresponding
portions of two stereo images (Mikhail, Bethel and McGlone, 2013: 230).
Both production methods have their strengths and weaknesses. Manual methods are
typically reliable, but are slow and expensive for large areas. Automated methods can be
fast and relatively inexpensive, but fail on complicated scenes, such as urban areas or
forests, and in featureless areas. Manual editing of automated results is nearly always
required. In some systems, the operator can specify beforehand any areas in which he
believes the automated process will not work well. The stereomatcher then skips these
areas, leaving them for the operator to do (Mikhail, Bethel and McGlone, 2013: 231).
Automated systems match on the visible surfaces, such as the tops of buildings or trees,
instead of the terrain surface that we want to represent in the DEM. While some progress
has been made toward automatically recognizing trees and buildings and dropping the
elevations to the ground, this process still requires extensive manual editing (Mikhail,
Bethel and McGlone, 2013: 232).
A common problem for both manual and automated methods is banding or corn rows,
which give the DEM a ridged appearance. This may be caused by systematic offsets
between adjacent elevation profiles, or it may be an artefact of interpolating the DEM from
contours. Operators often have a tendency to let the floating mark dig into the ground
when profiling uphill and to let it float slightly above the ground when profiling downhill.
If adjacent profiles are done in opposite directions, to reduce the amount of image
movement required, the profile elevations will differ by the amount of this systematic
offset. Automated systems may also have systematic errors in their correlation along
profiles, leading to the same problems. Even if the overall amplitude of the regular pattern
is within accuracy specifications, systematic errors have an effect on relative accuracy and
also detract from the appearance of products generated from the DEM (Mikhail, Bethel
and McGlone, 2013: 232).
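A rough way to see the effect of such an alternating offset is sketched below. This is an illustrative check, not a published procedure: it assumes the terrain varies smoothly across profiles, so that each profile should lie close to the mean of its two neighbours, and it uses synthetic data.

```python
def alternating_offset(profiles):
    """Estimate the systematic offset between even and odd profiles.

    Each interior profile is compared with the mean of its two neighbours;
    the sign is flipped for odd profiles so that the offsets add up.
    """
    total, count = 0.0, 0
    for k in range(1, len(profiles) - 1):
        sign = 1.0 if k % 2 == 0 else -1.0
        for before, here, after in zip(profiles[k - 1], profiles[k], profiles[k + 1]):
            total += sign * (here - 0.5 * (before + after))
            count += 1
    return total / count

# Synthetic tilted plane; even profiles are 0.4 m higher than odd ones.
true_offset = 0.4
profiles = [[100.0 + 2.0 * k + 0.5 * i
             + (true_offset / 2 if k % 2 == 0 else -true_offset / 2)
             for i in range(5)]
            for k in range(6)]

print(round(alternating_offset(profiles), 3))
```

On real terrain the estimate is contaminated by genuine relief, so it is only meaningful when averaged over many points.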
Automated systems may produce spikes or holes due to noise in the data or bad matches.
Large anomalies can usually be detected and corrected automatically; the harder problem
is errors close to the average terrain variation in magnitude, which must be recognized
and corrected manually (Mikhail, Bethel and McGlone, 2013: 233).
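Automatic detection of large anomalies can be sketched with a simple neighbourhood test. The example below is an assumption for illustration (a median of the surrounding posts and a fixed threshold), not the method of any specific system.

```python
from statistics import median

def despike(dem, threshold):
    """Flag posts that differ from the median of their neighbours by more
    than threshold, and replace them by that median."""
    n_rows, n_cols = len(dem), len(dem[0])
    cleaned = [row[:] for row in dem]
    flagged = []
    for r in range(n_rows):
        for c in range(n_cols):
            neighbours = [dem[rr][cc]
                          for rr in range(max(r - 1, 0), min(r + 2, n_rows))
                          for cc in range(max(c - 1, 0), min(c + 2, n_cols))
                          if (rr, cc) != (r, c)]
            m = median(neighbours)
            if abs(dem[r][c] - m) > threshold:
                cleaned[r][c] = m
                flagged.append((r, c))
    return cleaned, flagged

# Hypothetical DEM with one spike (150) and one hole (60).
dem = [[100, 101, 102, 103],
       [100, 101, 150, 103],
       [60, 101, 102, 103],
       [100, 101, 102, 103]]

cleaned, flagged = despike(dem, threshold=10)
print(flagged)
```

If the threshold is lowered towards the natural terrain variation, real features start to be flagged as errors, which is why the small anomalies still need manual inspection.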
Feature types may simply be identified by the file name, as when a file contains only roads
or only buildings, or by an attribute, which is an identification code included in the data
structure. Attribute codes may refer to very broad classes of features (e.g., buildings or
roads), or be based on hierarchical classifications (e.g., four-lane divided highways, two-
lane highways, unpaved roads, etc.) (Mikhail, Bethel and McGlone, 2013).
Along with positions and object types, we want to represent the topology of the data set.
Topology is concerned with properties that are not changed by spatial deformation, such
as connectivity. In geospatial data sets, we use topological relationships to model
networks, such as roads or drainage, and area properties. Having topological information
allows us to query the dataset based on connectivity (for path planning or drainage
calculations, for example) and adjacency (to determine which parcels adjoin a certain
road, for example) (Mikhail, Bethel and McGlone, 2013: 233).
Basic topological elements are of three types: nodes, lines and faces. Nodes define line
endpoints and intersections. Lines are ordered sets of points between nodes. Note that a
point is not the same thing as a node – a point specifies a location only, while a node
specifies connectivity. There may be several points between nodes on a line. A line cannot
cross itself, and connects to or crosses other lines only at nodes. Lines may describe linear
features such as roads or may be the boundary of a face. A line has a left side and a right
side. Faces may represent physical regions, such as an area with a defined soil type, or
may just be the interior region of a set of intersecting lines with no particular physical
significance. Every data set has at least two faces, one bounded by the data set and one
outside it (Mikhail, Bethel and McGlone, 2013: 233).
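The node, line and face elements can be represented with a small data structure. The sketch below is a hypothetical layout (the class, field and feature names are invented) that supports the two kinds of query mentioned earlier: connectivity and adjacency.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Line:
    name: str
    start_node: str
    end_node: str
    points: list      # ordered (x, y) points from start node to end node
    left_face: str    # face on the left when travelling start -> end
    right_face: str   # face on the right

def reachable_nodes(lines, start):
    """Connectivity query: all nodes that can be reached from start."""
    neighbours = {}
    for line in lines:
        neighbours.setdefault(line.start_node, set()).add(line.end_node)
        neighbours.setdefault(line.end_node, set()).add(line.start_node)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def faces_adjoining(lines, name):
    """Adjacency query: the faces on either side of a named line."""
    return {face for line in lines if line.name == name
            for face in (line.left_face, line.right_face)}

roads = [
    Line("road_1", "A", "B", [(0, 0), (5, 1), (10, 0)], "parcel_1", "parcel_2"),
    Line("road_2", "B", "C", [(10, 0), (10, 8)], "parcel_1", "parcel_3"),
    Line("road_3", "D", "E", [(30, 0), (40, 0)], "parcel_4", "parcel_5"),
]

print(sorted(reachable_nodes(roads, "A")))
print(sorted(faces_adjoining(roads, "road_1")))
```

Note that road_1 has three points but only two nodes: the middle point shapes the line and carries no connectivity.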
Digital representations are discrete, meaning that we are trying to describe the continuous
outline of the object with a finite number of points. We can keep adding points to the
description of irregular objects and gain more detail; however, this added detail may not
be displayable, useful, or even meaningful (Mikhail, Bethel and McGlone, 2013: 234).
Generalization refers to the reduction in displayed detail as map scale decreases. For
example, a large-scale map will show bends in a river, both sides of a road, and most
buildings; a small-scale map will show only the main path of the river, the centreline of
the road, and an outline of the urban area. A similar process of generalization is performed
with digital vector data. Digital features are generalized by removing points, thereby making
the representation less detailed, as shown in Figure 7-1. Although this may seem like
a fairly simple operation, generalization becomes much more complicated when the
interactions between adjacent features and different types of features must be taken into
account. As the scale is reduced, the relationships between the features must be
maintained. A road adjacent to a river cannot intersect or cross the river as both are
generalized (Mikhail, Bethel and McGlone, 2013: 234).
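Point removal for a single line can be sketched with the Douglas-Peucker algorithm. The notes do not name a specific algorithm, so this choice is an assumption; the sketch also ignores interactions with neighbouring features, which is the hard part described above.

```python
import math

def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length

def simplify(points, tolerance):
    """Douglas-Peucker: keep a point only if it deviates from the chord
    between the retained end points by more than tolerance."""
    if len(points) < 3:
        return list(points)
    distances = [point_line_distance(p, points[0], points[-1])
                 for p in points[1:-1]]
    max_distance = max(distances)
    index = distances.index(max_distance) + 1
    if max_distance <= tolerance:
        return [points[0], points[-1]]
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # the split point appears in both halves

# Hypothetical digitized line with small wiggles and one large bend.
line = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0), (5, 0.1), (6, 0)]

print(simplify(line, tolerance=0.5))
print(simplify(line, tolerance=3.0))
```

A larger tolerance corresponds to a smaller map scale: the small wiggles disappear first, and eventually the large bend as well.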
One major difference between digital and hardcopy vector data is that the digital data has
no inherent scale. A feature in a digital data file can be shown in any size on a display
screen; expressing the scale as the ratio of the displayed size to the feature’s real size is
therefore meaningless. However, we can use the concept of generalization for an
approximate scale description. There are well-established levels of generalization for each
standard map scale. For instance, roads are represented to the same level of detail on all
1:24,000 maps and to another level of detail on 1:250,000 maps, although the same road
will be represented very differently at the two map scales. The amount of detail captured
in the data indicates the type of map for which it would be suitable, and we can use that
as a very rough indication of the “scale” of the data. In many cases, the digital data has
been derived by digitizing an existing map and, in this case, the scale in terms of
generalization is relatively well-defined (Mikhail, Bethel and McGlone, 2013).
7.2.4 Orthoimagery
An orthoimage is an image based on an orthographic projection, rather than the
perspective projection of a regular frame photograph. In a frame image, the rays of light pass
through a single point, the perspective center, before intersecting the image plane. Points
at the same horizontal location but at different elevations, such as the top and bottom of
a building, will therefore be imaged at different locations in the image. This is called relief
displacement. The scale of the frame image varies with the elevation of the terrain, due
to the central projection. In an orthographic projection, the projecting rays are
perpendicular to a horizontal reference plane. Changing the elevation of a point does not
affect its projection, so the scale of an orthoimage is constant.
Orthoimages are produced by first obtaining or generating a DEM of the area. This elevation
information is then used to remove the elevation effects from the perspective image by
reprojection. Orthoimages are often produced from more than one source image to obtain
the required coverage for the final product. In addition, selecting only the center portions
of images minimizes the relief displacement shown by buildings or elevated objects.
Another way to reduce the relief displacement in images intended for orthoimage
production is to use lenses with longer focal lengths than the standard 6” lens.
There are two basic approaches to generating an orthoimage: forward projection and
backward projection. In forward projection, pixels in the source image are projected onto
the DEM and their object space coordinates are determined; the object space points are
then projected into the orthoimage. Since the spaces between the points projected into
the orthoimage vary due to the terrain variation and perspective effects, the final
orthoimage pixels must be determined by interpolating between the projected points.
The depiction of tall objects, such as buildings, that are not included in the terrain model
is a problem in orthoimage production. When the ortho projection is performed in image
regions showing objects not in the DEM, the wrong pixels will be extracted from the source
image. This building lean effect can be particularly noticeable in urban areas. Some
orthoimage production systems use an elevation model containing building outlines, either
automatically derived or manually delineated. This problem can be minimized by using
long focal length photography and by using only the central part of the photo, since relief
displacement increases with shorter focal length and with the radial distance from the
image nadir point. In general, points that have multiple elevations, such as a multilevel
freeway interchange, cannot be correctly orthorectified unless the photographic view is
from directly overhead.
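The influence of focal length and radial distance can be illustrated with the usual relief displacement relation for a vertical photograph, d = r h / H, where r is the radial distance of the image point from the nadir point, h the object height and H the flying height above the base of the object. This relation is not stated in these notes and is introduced here as an assumption; the numbers are invented.

```python
def flying_height(focal_length_mm, scale_number):
    """Flying height in metres for a vertical photo at scale 1:scale_number."""
    return focal_length_mm / 1000.0 * scale_number

def relief_displacement(r_mm, object_height_m, flying_height_m):
    """d = r * h / H, in the same unit as r (millimetres on the photo)."""
    return r_mm * object_height_m / flying_height_m

scale_number = 10000   # photo scale 1:10,000 in both cases
building_height = 30.0  # metres
r = 80.0                # mm from the nadir point

for focal_length in (152.0, 305.0):   # roughly a 6-inch and a 12-inch lens
    H = flying_height(focal_length, scale_number)
    d = relief_displacement(r, building_height, H)
    print(f"f = {focal_length:.0f} mm, H = {H:.0f} m, d = {d:.2f} mm")
```

At the same photo scale the longer lens requires a higher flying height, so the lean of the building is roughly halved; using r = 0 (the nadir point) gives no displacement at all.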
In some military applications, points determined from a triangulation solution are stored
in a database, along with image “chips” that show the point. Future images may then be
easily registered using these stored control points (Mikhail, Bethel and McGlone, 2013: 238).
Most elevation data are stored in a digital elevation model that can be analysed in
conjunction with other spatial data in a GIS. A digital elevation model (DEM) is defined as
a file or database containing elevation points over a contiguous area (Miller, 2004; Ma,
2005). DEMs may be subdivided into (Jensen, 2011):
1. Digital Surface Models (DSM) that contain elevation information about all features in
the landscape, such as vegetation, buildings, and other structures; and
2. Digital Terrain Models (DTM) that contain elevation information about the bare Earth
surface without the influence of vegetation or man-made structures.
Four major technologies are used to obtain elevation information, including (Bossler et al.,
2002, cited in Jensen, 2011):
In situ surveying,
Photogrammetry,
Interferometric Synthetic Aperture Radar (IFSAR), and
Light Detection and Ranging (LiDAR)
A DTM is a digital representation of terrain relief, a model of the shape of the ground
surface. We have a variety of sensors at our disposal that can provide us with 3D data:
line cameras and frame cameras, laser scanners and microwave radar instruments. They
all can produce (X, Y, Z) coordinates of terrain points, but not all of these points will be
points on the ground surface. Consider a stereo pair of photographs, or a stereo pair from
SPOT-5, of a tropical rain forest. Will you be able to see the ground? Obtained coordinates
will pertain to points on the ground only in open terrain. Since a model based on such
data is not a DTM, we refer to it as a digital surface model (DSM). The difference between
DTM and DSM is illustrated by Figure 7-2; we need to filter DSM data to obtain DTM data
(Tempfli et al., 2009).
Figure 7-2: Difference between DTM and DSM (Tempfli et al., 2009)
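The filtering step can be illustrated on a single elevation profile. The sketch below uses a morphological opening (a sliding minimum followed by a sliding maximum), which is one simple filter among many and is chosen here only for illustration; the profile values and the window size are invented.

```python
def sliding(values, radius, func):
    """Apply func (min or max) over a window of +/- radius posts."""
    n = len(values)
    return [func(values[max(i - radius, 0):min(i + radius + 1, n)])
            for i in range(n)]

def opening(dsm_profile, radius):
    """Morphological opening: removes objects narrower than the window."""
    eroded = sliding(dsm_profile, radius, min)
    return sliding(eroded, radius, max)

# Hypothetical DSM profile: gently rising ground with a building two posts wide.
dsm = [100, 100, 101, 111, 112, 101, 102, 102, 103, 103]

dtm = opening(dsm, radius=2)
heights = [s - t for s, t in zip(dsm, dtm)]
print(dtm)
print(heights)
```

The building is removed because it is narrower than the window, but the last two ground posts are also lowered by 1 m. Simple filters of this kind distort sloping terrain, which is one reason why DSM filtering usually needs checking and editing.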
In terrain modelling, it is handy to choose the coordinate system such that Z stands for
elevation. If we model a surface digitally by nothing else than elevation values Z at
horizontal positions (X, Y), why not call such a model a digital elevation model (DEM)? The
term DEM was introduced in the 1970s to distinguish the simplest form
of terrain relief modelling from more complex types of digital surface representation. The
term DEM was originally used exclusively for raster representation (elevation values given
at the intersection nodes of a regular grid). Note that both a DTM and DSM can be a DEM
and, moreover, ‘elevation’ would not have to relate to terrain but could relate to some
subsurface layer such as ground water level, or some soil layer, or the ocean floor (Tempfli
et al., 2009).
DEM – Digital Elevation Model (bare ground surface, or a generic term for both DTM and DSM)
DTM – Digital Terrain Model (bare ground surface)
DSM – Digital Surface Model (the elevation of the top of the surface, i.e. ground, trees, buildings)
Figure 7-4: Comparison of unrectified vertical aerial image and orthoimage of same
scene (GEOG 482: The Nature of Geographic Information).
Orthophotos alone do not convey topographic information. However, they can be used as
base maps for contour line overlays prepared in a separate stereoplotting operation. The
result of overprinting contour information on an orthophoto is a topographic
orthophotomap. Much time is saved in the preparation of such maps because the
instrument operator need not map the planimetric data in the map compilation process
(Lillesand, Kiefer and Chipman, 2014).
One important point is that tall objects such as buildings will still appear to lean in
an orthophoto. This effect is particularly troublesome in urban areas. It can be
removed by including building outline elevations in the DEM, or minimized by using the
central portion of a photograph, where relief displacement of vertical features is at a
minimum (Lillesand, Kiefer and Chipman, 2014). An orthophoto in which building lean has
been removed in this way is called a true orthophoto.
the photo. Any photo that has constant scale throughout is an orthophoto, having the
same planimetric correctness as a map. It should be mentioned that although relief
displacements due to variable terrain are removed, a shortcoming of orthophotos is that
relief displacements of vertical surfaces such as walls of buildings cannot be removed (Wolf,
1974).
At first glance, an orthophoto looks the same as a perspective photo. But upon comparison
of an orthophoto and a perspective photo of the same area, differences can usually be
detected. Figure 7-4 (left) is a portion of a perspective photo, while the right one is an
orthophoto produced from the same portion of the perspective photo. Note in particular
how relief displacement has made the power line on the left image appear crooked,
whereas it appears straight on the orthophoto because relief displacements have been removed
(Wolf, 1974).
The desirability of orthophotos over perspective photos has been recognized for many
years. As early as 1903, Scheimpflug had conceived the idea of directly producing
orthophotos from perspective photos. In the early 1930s, the Gallus-Ferber restitution
machine was introduced in France. In addition to its use for stereoscopic plotting, the
instrument could produce orthophotos directly from perspective photos. Even though the
instrument was operational, inefficiencies together with a general lack of enthusiasm for
the finished product caused orthophoto production to lie dormant until the 1950s (Wolf,
1974).
In 1950 Russell Bean of the US Geological Survey began experimenting with the equipment
for producing orthophotos, and his work led to the development of the instrument
presently known as the orthophotoscope. The first orthophotoscope was introduced in
1953, and it was followed by several improved generations. Continued developments led
to the model T-64 Orthophotoscope which is currently used extensively by the Geological
Survey in producing orthophotos (Wolf, 1974).
Orthophotomaps prepared from orthophotos offer significant advantages over both line
maps and aerial photos. This is because orthophotos possess the advantages of both air
photos and line maps. On the one hand, orthophotos have the pictorial qualities of air
photos because the images of an infinite number of ground objects can be recognized and
identified. Furthermore, because of the planimetric correctness with which images are
shown, measurements may be taken directly from orthophotos just as from maps (Wolf,
1974).
The capability of being able to correlate images on the orthophotomap with what is
observed on the ground is an asset in many fields of endeavour. Engineers, planners,
surveyors, foresters, geologists, agronomists, etc., can use orthophotos advantageously
as base maps for plotting field observations. Foresters, for example, can classify and
delineate different timber types in their true position directly on the orthophotomap by
correlating images with what they observe in the field. Also, soil scientists can plot
locations where soil samples were taken and delineate soil-type boundaries directly on the
ground. Property surveyors can utilize orthophotomaps to advantage because fence lines
and other key items of evidence that can be field identified are shown in true plan view
on the orthophotomap (Wolf, 1974).
Because images can be correlated with their corresponding objects on the ground,
orthophotos make excellent base maps for preparing flight plan maps. Orthophotos are
also very useful communication tools. Property owners and lay people can understand an
orthophotomap, whereas they are frequently awestruck by a line and symbol map. Using
orthophotos, engineers are better able to communicate with property owners to discuss
right of way purchases, access to property, etc., (Wolf, 1974).
Orthophotos are produced from perspective photos (usually aerial photos) through a
process called differential rectification, which eliminates image displacements due to
photographic tilt and terrain relief. The process is essentially the same as standard
rectification described in Secs. 10-14 and 10-15, except that it is performed independently
at myriad individual, tiny surface patches or differential elements. In this way, rather than
rectify the photograph to some average scale (which does not correct for relief
displacements), each differential element is rectified to a common scale. Prior to the age
of digital photogrammetry, complicated optical-mechanical devices were employed to
produce orthophotos from film diapositives. Several of these are described in the second
edition of this book. Modern orthophotos, however, are produced digitally. Softcopy
systems are particularly well-suited for differential rectification. The essential inputs for
the process of differential rectification are a DEM and a digital aerial photo having known
exterior orientation parameters (ω, φ, κ, XL, YL, and ZL). It is also necessary to obtain
the digital image coordinates (row and column) of the fiducials so that a transformation
can be computed to relate photo coordinates to digital image coordinates. A systematic
application of the collinearity equations is then performed to produce the orthophoto
(Wolf, DeWitt and Wilkinson, 2014: 321).
Figure 7-5 illustrates the collinearity condition for a particular groundel point P in the DEM.
The X and Y coordinates of point P are based upon the row and column within the DEM
array, and its Z coordinate is stored in the DEM groundel array at that position. Given the
X, Y, and Z coordinates of point P and the known exterior orientation parameters for the
photograph, the collinearity equations can be solved to determine photo coordinates xp
and yp. These photo coordinates define the position where the image of groundel point P
will be found. Since photo coordinates are related to the fiducial axis system, a
transformation must be performed on these coordinates to obtain row and column
coordinates in the digital image. The transformed row and column coordinates will
generally not be whole numbers, so resampling is done within the scanned photo to obtain
the digital number associated with groundel point P (Wolf, DeWitt and Wilkinson, 2014:
321).
Figure 7-5: Collinearity relationship for a DEM point P and its corresponding image p.
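For reference, the collinearity equations used in this step can be written as below. This is a sketch in the form commonly given for the ω, φ, κ rotation sequence. The symbols f (focal length), x_0 and y_0 (principal point coordinates) and m_ij (rotation matrix elements) are assumptions introduced here, since they are not defined in this section of the notes.

```latex
\begin{aligned}
x_p &= x_0 - f\,
  \frac{m_{11}(X_P - X_L) + m_{12}(Y_P - Y_L) + m_{13}(Z_P - Z_L)}
       {m_{31}(X_P - X_L) + m_{32}(Y_P - Y_L) + m_{33}(Z_P - Z_L)} \\[4pt]
y_p &= y_0 - f\,
  \frac{m_{21}(X_P - X_L) + m_{22}(Y_P - Y_L) + m_{23}(Z_P - Z_L)}
       {m_{31}(X_P - X_L) + m_{32}(Y_P - Y_L) + m_{33}(Z_P - Z_L)}
\end{aligned}

\begin{aligned}
m_{11} &= \cos\varphi\cos\kappa, &
m_{12} &= \sin\omega\sin\varphi\cos\kappa + \cos\omega\sin\kappa, &
m_{13} &= -\cos\omega\sin\varphi\cos\kappa + \sin\omega\sin\kappa, \\
m_{21} &= -\cos\varphi\sin\kappa, &
m_{22} &= -\sin\omega\sin\varphi\sin\kappa + \cos\omega\cos\kappa, &
m_{23} &= \cos\omega\sin\varphi\sin\kappa + \sin\omega\cos\kappa, \\
m_{31} &= \sin\varphi, &
m_{32} &= -\sin\omega\cos\varphi, &
m_{33} &= \cos\omega\cos\varphi.
\end{aligned}
```

For a truly vertical photograph (ω = φ = κ = 0) the rotation matrix is the identity, and the equations reduce to x_p = x_0 + f (X_P − X_L) / (Z_L − Z_P), with the corresponding expression for y_p.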
The process of creating the digital orthophoto requires repetitive application of the
collinearity equations for all the points in the DEM array. Figure 13-7 gives a schematic
illustration of the process. In this figure, two arrays are shown in vertical alignment, where
each of the groundels of the DEM array corresponds one-to-one with a pixel of the
orthophoto array. The orthophoto array is initially empty, and will be populated with digital
numbers (shown in the figure as x’s) as the process is carried out. At each step of the
process, the X, Y, Z coordinates of the center point of a particular groundel of the DEM
are substituted into the collinearity equations as discussed in the preceding paragraph.
The resulting photo coordinates are then transformed to row and column coordinates of
the digital aerial photo. Resampling is performed to obtain a digital number, which is then
placed into the corresponding pixel of the digital orthophoto. The process is complete
when all pixels of the orthophoto have been populated with digital numbers (Wolf, DeWitt
and Wilkinson, 2014: 322).
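The loop described above can be sketched in a few lines. This is a simplified illustration under several assumptions that are not in the notes: the photo-to-pixel transformation is reduced to a scale and a shift with the principal point at the image centre (instead of a transformation computed from the fiducials), nearest-neighbour resampling is used, DEM row 0 is the northernmost row, and all names and numbers are hypothetical.

```python
import math

def rotation_matrix(omega, phi, kappa):
    """Rotation matrix for the omega-phi-kappa sequence (angles in radians)."""
    so, co = math.sin(omega), math.cos(omega)
    sp, cp = math.sin(phi), math.cos(phi)
    sk, ck = math.sin(kappa), math.cos(kappa)
    return [[cp * ck, so * sp * ck + co * sk, -co * sp * ck + so * sk],
            [-cp * sk, -so * sp * sk + co * ck, co * sp * sk + so * ck],
            [sp, -so * cp, co * cp]]

def ground_to_photo(X, Y, Z, eo, f):
    """Collinearity equations with the principal point at (0, 0)."""
    omega, phi, kappa, XL, YL, ZL = eo
    m = rotation_matrix(omega, phi, kappa)
    d = (X - XL, Y - YL, Z - ZL)
    u, v, w = (sum(m[i][k] * d[k] for k in range(3)) for i in range(3))
    return -f * u / w, -f * v / w

def photo_to_pixel(x, y, pixel_size, n_rows, n_cols):
    """Assumed transformation: x to the right, y upward, centre at (0, 0)."""
    col = x / pixel_size + (n_cols - 1) / 2
    row = (n_rows - 1) / 2 - y / pixel_size
    return row, col

def make_orthophoto(dem, x0, y0, spacing, image, eo, f, pixel_size):
    """Fill one orthophoto pixel per DEM groundel (nearest-neighbour)."""
    n_rows, n_cols = len(image), len(image[0])
    ortho = [[None] * len(dem[0]) for _ in dem]
    for i, dem_row in enumerate(dem):
        for j, Z in enumerate(dem_row):
            X = x0 + j * spacing
            Y = y0 - i * spacing          # row 0 is the northernmost row
            x, y = ground_to_photo(X, Y, Z, eo, f)
            row, col = photo_to_pixel(x, y, pixel_size, n_rows, n_cols)
            r, c = math.floor(row + 0.5), math.floor(col + 0.5)
            if 0 <= r < n_rows and 0 <= c < n_cols:
                ortho[i][j] = image[r][c]
    return ortho

# Vertical photo from 1000 m, f = 100 mm, 1 mm pixels (10 m on the ground).
eo = (0.0, 0.0, 0.0, 0.0, 0.0, 1000.0)
image = [[10 * r + c for c in range(5)] for r in range(5)]
flat_dem = [[0.0] * 3 for _ in range(3)]
hill_dem = [[500.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

flat = make_orthophoto(flat_dem, -10.0, 10.0, 10.0, image, eo, 0.1, 0.001)
hill = make_orthophoto(hill_dem, -10.0, 10.0, 10.0, image, eo, 0.1, 0.001)
print(flat)
print(hill)
```

For the flat DEM the orthophoto is simply the central part of the image. When the corner groundel is raised by 500 m, its digital number is taken from a pixel farther from the image centre, which is the relief displacement being undone.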
Figure 8-1: Aerial Photo (top) and Topo Map (bottom) (Courtesy: Google Maps)
The field of highway planning and design provides an excellent example of how important
photogrammetry has become in engineering. In this field, high-altitude photos or satellite
images are used to assist in area and corridor studies and to select the best route; small-scale
topographic maps are prepared for use in preliminary planning; large-scale
topographic maps and DEMs are compiled for use in final design; and earthwork cross
sections are taken to obtain contract quantities. In many cases the plan portions of the
“plan profile” sheets of highway plans are prepared from aerial photographs. Partial
payments and even final pay quantities are often calculated from photogrammetric
measurements. Map information collected from modern photogrammetric instruments is
directly compatible with computer-aided drafting (CAD) systems commonly used in
highway design. The use of photogrammetry in highway engineering not only has reduced
costs but also has enabled better overall highway designs to be created (Wolf, DeWitt and
Wilkinson, 2014).
Aerial photographs can provide the first objective impression of damage in a very rapidly
changing environment. Specifically, in situations such as earthquakes or flooding, aerial
photos facilitate the planning and coordination of adequate response in terms of rescue
resources, identifying population groups at risk where human intervention is most needed
to limit and prevent hazards during first response stages (Haseena, Kiran and Murthy,
2013).
Photogrammetry still has some practical applications for finding mineral and fuel deposits,
mapping areas and tracking geological changes and water management as well as general
geological research that other applications cannot contribute to (Ray, 1960). A great
example of this is water drainage ahead of proposed new urban developments - flood
plain risks and subsidence.
They can also be used to study the process of natural changes, such as variations in soil
and geology over time as well as changes to the underlying ground that lead to disasters
such as landslides (Lohmann and Altrogge, n.d.).
REFERENCES
Alamús, R., Baron, A. and Talaya, J. (2001) 'Integrated Sensor Orientation at ICC,
mathematical models and experience', OEEPE-Workshop on Integrated Sensor
Orientation, Hannover.
Andersen, Ø. and Nilsen, B. (2001) 'Can map compilation rely on GPS/INS alone?', OEEPE-
Workshop on Integrated Sensor Orientation, Hannover.
Anderson, J.M. and Mikhail, E.M. (2013) Surveying: Theory and Practice, 7th edition, New
Delhi: McGraw Hill Education.
Anderson, J.M. and Mikhail, E.M. (2014) Surveying: Theory and Practice, 7th edition, New
Delhi: McGraw Hill Education (India) Private Limited.
Bannister, A., Raymond, S. and Baker, R. (2012) Surveying, 8th edition, New Delhi: Pearson
Education.
Bethel, J.S. (2003) 'Photogrammetry and Remote Sensing', in Civil Engineering Handbook,
2nd edition, CRC Press.
Bhatta, B. (2011) Remote Sensing and GIS, Oxford University Press, YMCA Library
Building, New Delhi 110001, India.
Bischof, H. and Leberl, F. (n.d.) 'Chapter 5 - Digital Image Processing', in McGlone, J.C.
(ed.) Manual of Photogrammetry, American Society for Photogrammetry and Remote
Sensing.
Campbell, J.B. and Wynne, R.H. (2011) Introduction to Remote Sensing, New York,
London: The Guilford Press.
Caprile, B. and Torre, V. (1990) 'Using Vanishing Points for Camera Calibration',
International Journal of Computer Vision, vol. 4, pp. 127-139.
Coughlan, J.M. and Yuille, A.L. (September 1999) 'ManhattanWorld: Compass Direction
from Single Image by Bayesian Inference', Proceedings of the 11th International
Conference on Computer Vision (ICCV’99), Kerkyra, Greece, pp. 941-947.
Criminisi, A., Reid, I. and Zisserman, A. (September 1999) 'Single View Metrology',
Proceedings of the 11th International Conference on Computer Vision (ICCV’99), Kerkyra,
Greece, pp. 434-441.
Davis, R.E., Foote, F.S., Anderson, J.M. and Mikhail, E.M. (2014) Surveying: Theory and
Practice, 6th edition, McGraw Hill.
Derenyi, E.E. (1982) Photogrammetry for Civil and Forest Engineers, Department of
Geodesy and Geomatics Engineering, University of New Brunswick.
Falkner, E. and Morgan, D. (2002) Aerial Mapping - Methods and Applications, 2nd edition,
Washington DC: CRC Press.
Ghilani, C.D. and Wolf, P.R. (2006) Adjustment Computations: Spatial Data Analysis, 4th
edition, New Jersey: John Wiley & Sons.
Gonzalez, C.R. and Woods, E.R. (2014) Digital Image Processing, 3rd edition, New Delhi:
Dorling Kindersley.
Hartley, R. and Zisserman, A. (2004) 'Epipolar Geometry and the Fundamental Matrix', in
Multiple View Geometry in Computer Vision, 2nd edition, Cambridge University Press.
Haseena, H.K., Kiran, B.R. and Murthy, K.S. (2013) 'Application of Aerial Photography and
Remote Sensing in Environmental and Geological Interpretations in India: An Overview',
International Journal of Environmental Biology, vol. 3, no. 3, August, pp. 100-114,
Available: http://www.urpjournals.com.
Horn, B.K.P. (1990) 'Relative Orientation', International Journal of Computer Vision, vol.
4, pp. 59-78.
Ip, A.W.L. (2005) Analysis of Integrated Sensor Orientation for Aerial Mapping, Calgary:
Department of Geomatics Engineering, University of Calgary, Available:
http://www.geomatics.ucalgary.ca/links/GradThesis.html.
Jensen, J.R. (2011) Remote Sensing of the Environment, 2nd edition, Dorling Kindersley
India Pvt. Ltd.
Kasser, M. and Egels, Y. (2004) Digital Photogrammetry, London and New York: Taylor &
Francis.
Konecny, G. (1985) 'The International Society for Photogrammetry and Remote Sensing -
75 years old or 75 years young', in Photogrammetric Engineering and Remote Sensing.
Kraus, K., Jansa, J. and Kager, H. (1997) Photogrammetry Vol II: Advanced Methods and
Applications, Dümmlers.
Lillesand, T.M., Kiefer, R.W. and Chipman, J.W. (2014) Remote Sensing and Image
Interpretation, 6th edition, New Delhi: Wiley.
Lohmann, P. and Altrogge, G. (n.d.) The use of SPOT and CIR aerial photography for urban
planning, Hannover: Institute for Photogrammetry and Engineering Surveys.
McGlone, J.C. (ed.) (1980) Manual of Photogrammetry, 4th edition, American Society for
Photogrammetry and Remote Sensing.
McGlone, J.C., Mikhail, E.M. and Bethel, J.S. (eds.) (2004) Manual of Photogrammetry, 5th
edition, American Society for Photogrammetry and Remote Sensing.
Mikhail, E.M., Bethel, J.S. and McGlone, J.C. (2013) Introduction to Modern
Photogrammetry, New Delhi: Wiley India.
Ray, R.G. (1960) Aerial Photographs in Geologic Interpretation and Mapping, Washington:
United States Government Printing Office.
Reddy, M.A. (2012) Textbook of Remote Sensing and Geographical Information Systems,
4th edition, Hyderabad: BS Publications.
Rother, C. (2000) 'A New Approach for Vanishing Point Detection in Architectural
Environments', Proceedings of the Eleventh British Machine Vision Conference, 11-14
September, Available: http://www.bmva.org/bmvc/2000/papers/p39.pdf [21 November
2014].
Schofield, W. and Breach, M. (2007) Engineering Surveying, 6th edition, Oxford: Elsevier.
Straforini, M., Coelho, C. and Campani, M. (1993) 'Extraction of vanishing points from
images of indoor and outdoor scenes', Image and Vision Computing, pp. 91-99.
Svedberg, D. and Carlsson, S. (1999) 'Calibration, Pose and Novel Views from Single
Images of Constrained Scenes', Proceedings of the 11th Scandinavian Conference on
Image Analysis (SCIA’99), June, pp. 111-117.
Tempfli, C., Kerle, N., Huurneman, G.C. and Janssen, L.L.F. (eds.) (2009) Principles of
Remote Sensing: An Introductory Textbook, 4th edition, Enschede: The International
Institute for Geo-Information Science and Earth Observation (ITC).
van den Heuvel, F.A. (1998) 'Vanishing point detection for architectural photogrammetry',
International Archives of Photogrammetry and Remote Sensing, pp. 652-659.
Wolf, P.R. (1974) Elements of Photogrammetry (with Air Photo Interpretation and Remote
Sensing), International Student Edition, McGraw Hill.
Wolf, P., DeWitt, B. and Wilkinson, B.E. (2014) Elements of Photogrammetry with
Applications in GIS, 4th edition, McGraw Hill Professional.