Yonghuai Liu · Nick Pears · Paul L. Rosin · Patrik Huber (Editors)

3D Imaging, Analysis and Applications

Second Edition
Editors

Yonghuai Liu
Department of Computer Science
Edge Hill University
Ormskirk, Lancashire, UK

Nick Pears
Department of Computer Science
University of York
York, UK
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
This book is primarily a graduate text on 3D imaging, shape analysis, and asso-
ciated applications. In addition to serving masters-level and doctoral-level research
students, much of the text is accessible enough to be useful to final-year under-
graduate project students. Also, we hope that it will serve wider audiences, for
example, as a reference text for professional academics, people working in com-
mercial research and development labs, and industrial practitioners.
We believe that this text is unique in the literature on 3D imaging in several
respects: (1) it provides a wide coverage of topics in 3D imaging; (2) it pays special
attention to the clear presentation of well-established core techniques; (3) it covers a
wide range of the most promising recent techniques that are considered to be
state-of-the-art.
This second edition (2020) upgrades existing topics presented in the first edition
(2012) with the most significant novel findings in the intervening period.
Additionally, it has new material covering consumer-grade RGB-D cameras, 3D
morphable models, deep learning on 3D datasets, as well as new applications in the
3D digitization of cultural heritage and the 3D phenotyping of plants.
Three-dimensional (3D) imaging techniques have been developed within the field
of computer vision that automatically reconstruct the 3D shape of the imaged
objects and scene. This is referred to as a 3D scan or 3D image and it often comes
with a registered color-texture image that can be pasted over the captured shape and
rendered from many viewpoints (if desired) on a computer display.
The techniques developed include both active systems, where some form of
illumination is projected onto the scene, and passive systems, where the natural
illumination of the scene is used. Perhaps the most intensively researched area of
3D shape acquisition has been focused on stereo vision systems, which, like the
human visual system, use a pair of views (images) in order to compute 3D
structure. Here, researchers have met challenging problems such as the establish-
ment of correspondences between overlapping images for the dense reconstruction
of the imaged scene. Many applications require further processing and data analysis
once 3D shape data has been acquired. For example, identification of salient points
within the 3D data, segmentation into semantic parts, registration of multiple partial
3D data scans, computation of 3D symmetry planes, and matching of whole 3D
objects.
It is one of today’s challenges to design a technology that can cover the whole
pipeline of 3D shape capture, processing, and visualization. The different steps of
this pipeline have raised important topics in the research community for decades,
owing to the numerous theoretical and technical problems that they induce.
Capturing the 3D shape, instead of just a 2D projection as a standard camera does,
makes an extremely wide array of new kinds of application possible. For instance,
computer aided geometric design, 3D and free-viewpoint TV, virtual and aug-
mented reality, natural user interaction based on monitoring gestures, 3D object
recognition and 3D recognition for biometrics, 3D medical imaging, 3D remote
sensing, industrial inspection, and robot navigation, to name just a few. These
applications, of course, involve many more technological advances than just 3D
shape capture; storage, analysis, transmission, and visualization of the 3D shape are
also part of the whole pipeline.
While 3D imaging provides opportunities for numerous applications in the real
world, these applications also provide a platform to test the techniques developed.
3D phenotyping of plants is one of the representative applications that spans the
whole spectrum and different stages of 3D imaging: data capture, plant modeling,
segmentation, skeletonization, feature extraction and measurements, and tracking.
As discussed later in the book, despite the rapid progress of research and devel-
opment in the past four decades, the problem is far from being solved, especially in
terms of real-time performance and high throughput, and thus will continue to attract
intensive attention from the community.
3D imaging and analysis is closely associated with computer vision, but it also
intersects with a number of other fields, for example, image processing, pattern
recognition, machine learning, computer graphics, information theory, statistics,
computational geometry, and physics. It involves building sensors, modeling them,
and then processing the output images. In particular, 3D image analysis bridges the
gap between low-level and high-level vision in order to deduce high-level (se-
mantic) information from basic 3D data.
The objective of this book is to bring together a set of core topics in 3D imaging,
analysis, and applications, both in terms of well-established fundamental techniques
and the most promising recent techniques. Indeed, we see that many similar
techniques are being used in a variety of subject areas and applications and we feel
that we can unify a range of related ideas, providing clarity to both academic and
industrial practitioners, who are acquiring and processing 3D datasets. To ensure
the quality of the book, all the contributors we have chosen have attained a
world-class standing by publishing in the top conferences and journals in this area.
Thus, the material presented in this book is informative and authoritative and
represents mainstream work and opinions within the community.
After an introductory chapter, the book covers 3D image capture methods,
particularly those that use two cameras, as in passive stereo vision, or one or more
cameras and light projector, as in active 3D imaging. It also covers how 3D data is
represented, stored, and visualized. Later parts of the book cover 3D shape analysis
and inference, firstly in a general sense, which includes feature extraction, shape
registration, shape matching, 3D morphable models, and deep learning on 3D
datasets. The final part of the book focuses on applications including 3D face
recognition, 3D cultural heritage, and 3D phenotyping of plants.
Acknowledgements We would like to express our sincere gratitude to all chapter authors for their
contributions, their discussions, and their support during the book preparation. It has been our
honor to work with so many truly world-leading academics and, without them, the production of
this book would not have been possible.
We would also like to thank all of the chapter reviewers for their insightful comments, which
have enabled us to produce a high quality book. In particular, we thank Song Bai, Francesco
Banterle, Stefano Berretti, Stefano Brusaporci, Benjamin Bustos, Umberto Castellani, Chi Chen,
Andrew French, Ryo Furukawa, Yulan Guo, Lasse Klingbeil, Huibin Li, Feng Liu, Andreas
Morel-Forster, Michael Pound, Tomislav Pribanić, Stephen Se, Boxin Shi, Ferdous Sohel, Hedi
Tabia, Damon L. Woodard, Jin Xie, and Ruigang Yang.
We are grateful for the support of our publisher, Springer; in particular, we would like to thank
Helen Desmond who worked with us in a friendly and effective way throughout all stages of the
book production process.
Contents

1 Introduction
Johannes Brünger, Reinhard Koch, Nick Pears, Yonghuai Liu,
and Paul L. Rosin

Index

Contributors
1.1 Introduction

Since ancient times, humans have tried to capture their surrounding 3D environment
and important aspects of social life on wall paintings. Early drawings, mostly animal
paintings, are thought to date back 32,000 years, such as the early works in the
Chauvet Cave, France. Drawings in the famous Lascaux Caves near Montignac,
France are also very old and date back to around 17,000 years [16]. These drawings
were not correct in terms of perspective, but did capture the essence of the objects
in an artistic way.

1 Typically, this term is used when the 3D data is acquired from multiple-viewpoint 2D images. We
prefer to avoid the use of this term for what we see as raw 3D data, as this is easily confused with
3D statistical shape models.
2 Typically, this term is used when a scanner acquires the 3D data, such as a laser stripe scanner.
3 Typically, this term is used when the data is ordered in a regular grid, such as the 2D array of depth
A rigorous mathematical treatment of vision was postulated by Euclid4 in his
book Optics [13]. Thus, already early on in history, some aspects of perspective were
known. Another very influential mathematical text was the Kitab al-Manazir (Book
of Optics) by Alhazen5 [72].
In parallel with the mathematical concepts of vision and optics, physical optics,
developed through the use of lenses and mirrors, forms the basis of modern optical
instruments. Very early lenses were found as polished crystals, like the famous Nimrud
lens that was discovered by Austen Henry Layard.6 The lens quality is far from perfect,
but it allows light to be focused at a focal distance of 110 mm. Lenses were used as
burning lenses to focus sunlight and as magnification lenses. An early written record
of such use is found in the writings of Seneca the Younger,7 who noted
Letters, however small and indistinct, are seen enlarged and more clearly through a globe or
glass filled with water [42].
Thus, he describes the effect of a spherical convex lens. Early on, the use of such
magnification for observing distant objects was recognized and optical instruments
were devised, such as corrective lenses for bad eye-sight in the thirteenth–fifteenth
century CE and the telescope at the beginning of the seventeenth century. It is unclear
who invented the telescope, as several lens makers observed the magnification effects
independently. The German-born Dutch lens maker Hans Lippershey (1570–1619)
from Middelburg, province Zeeland, is often credited as the inventor of the telescope,
since he applied for a patent, which was denied. Other lens makers like his fellow
Middelburg lens maker Zacharias Janssen were also claiming the invention [38].
Combined with the camera obscura, optically a pinhole camera, they form the basic
concept of modern cameras. The camera obscura, Latin for dark room, has been
used for a long time to capture images of scenes. Light reflected from a scene enters
a dark room through a very small hole and is projected as an image onto the back
wall of the room. Already Alhazen had experimented with a camera obscura and it
was used as a drawing aid by artists and as a visual attraction later on. The name
camera is derived from the camera obscura. The pinhole camera generates an inverse
image of the scene with a scale factor f = i/o, where i is the image distance between
pinhole and image and o is the object distance between object and pinhole. However,
4 Euclid of Alexandria, Greek mathematician, also referred to as the Father of Geometry, lived in
5 Alhazen studied physical optics and experimented with lenses, mirrors, camera obscura, refraction,
and reflection.
6 Sir Austen Henry Layard (1817–1894), British archeologist, found a polished rock crystal during
the excavation of ancient Nimrud, Iraq. The lens has a diameter of 38 mm, presumed creation date
750–710 BC and now on display at the British Museum, London.
7 Lucius Annaeus Seneca, around 4 BC–65 CE, was a Roman philosopher, statesman, dramatist,
the opening aperture of the pinhole itself has to be very small to avoid blurring. A
light-collecting and focusing lens is then used to enlarge the opening aperture and
brighter, yet still sharp images can be obtained for thin convex lenses.8 Such lenses
follow the Gaussian thin lens equation: 1/ f = 1/i + 1/o, where f is the focal length
of the lens. The drawback, as with all modern cameras, is the limited depth of field,
in which the image of the scene is in focus.
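As a quick numerical illustration of the Gaussian thin lens equation and of the pinhole scale factor i/o mentioned above (the 50 mm lens and 1 m object distance below are hypothetical values, not from the text):

```python
def image_distance(f_mm: float, o_mm: float) -> float:
    """Solve the thin lens equation 1/f = 1/i + 1/o for the image
    distance i, given focal length f and object distance o."""
    if o_mm <= f_mm:
        raise ValueError("object at or inside the focal length forms no real image")
    return 1.0 / (1.0 / f_mm - 1.0 / o_mm)

# A hypothetical 50 mm lens focused on an object 1 m (1000 mm) away:
i = image_distance(50.0, 1000.0)        # a little beyond the focal length
scale = i / 1000.0                      # the pinhole-model scale factor i/o
print(round(i, 2), round(scale, 4))
```

Note that the image distance (about 52.6 mm) only slightly exceeds the focal length: for distant objects, the image forms essentially in the focal plane, which is why the pinhole model often approximates i by f.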
Until the mid-nineteenth century, the only way to capture an image was to man-
ually paint it onto canvas or other suitable backgrounds. With the advent of pho-
tography,9 images of the real world could be taken and stored for future use. This
invention was soon expanded from monochromatic to color images, from mono-
scopic to stereoscopic10 and from still images to film sequences. In our digital age,
electronic sensor devices have taken the role of chemical film and a variety of elec-
tronic display technologies have taken over the role of painted pictures.
It is interesting to note, though, that some of the most recent developments in dig-
ital photography and image displays have their inspiration in technologies developed
over 100 years ago. In 1908, Gabriel Lippmann11 developed the concept of integral
photography, a camera composed of very many tiny lenses side by side, in front of
a photographic film [45]. These lenses collect view-dependent light rays from all
directions onto the film, effectively capturing a three-dimensional field of light rays,
the light field [1]. The newly established research field of computational photography
has taken on his ideas and is actively developing novel multilens-camera systems for
capturing 3D scenes, enhancing the depth of field, or computing novel image transfer
functions. In addition, the reverse process of projecting an integral image into space
has led to the development of lenticular sheet 3D printing and to auto-stereoscopic
(glasses-free) multi-view displays that let the observer see the captured 3D scene
with full depth parallax without wearing special purpose spectacles (glasses). These
3D projection techniques have spawned a huge interest in the display community,
both for high-quality auto-stereoscopic displays with full 3D parallax, as used in
advertising (3D signage), and for novel 3D-TV display systems that might
eventually conquer the 3D-TV home market. This technique is discussed further in
Sect. 1.2.3. However, the market today is still dominated by traditional stereoscopic
displays, either based on actively switched shutter glasses for gamers or based on
passive polarization techniques for 3D cinema production.
8 Small and thin bi-convex lenses look like lentils, hence the name lens, which is Latin for lentil.
9 Nicéphore Niépce, 1765–1833, is credited as one of the inventors of photography by solar light
etching (Heliograph) in 1826. He later worked with Louis-Jacques-Mandé Daguerre, 1787–1851,
who acquired a patent for his Daguerreotype, the first practical photography process based on silver
iodide, in 1839. In parallel, William Henry Fox Talbot, 1800–1877, developed the calotype process,
which uses paper coated with silver iodide. The calotype produced a negative image from which a
positive could be printed using silver chloride coated paper [28].
10 The Greek word stereos for solid is used to indicate a spatial 3D extension of vision, hence
Fig. 1.1 Left: Human binocular perception of 3D scene. Right: the perceived images of the left and
right eye, showing how the depth-dependent disparity results in a parallax shift between foreground
and background objects. Both observed images are fused into a 3D sensation by the human eye–brain
visual system
It is important to note that many visual cues give the perception of depth, some of
which are monocular cues (e.g., occlusion, shading, and texture gradients) and some
of which are binocular cues (e.g., retinal disparity, parallax, and eye convergence). Of
course, humans, and most predator animals, are equipped with a very sophisticated
binocular vision system and it is the binocular cues that provide us with accurate
short range depth perception. Clearly, it is advantageous for us to have good depth
perception to a distance at least as large as the length of our arms. The principles of
binocular vision were already recognized in 1838 by Sir Charles Wheatstone,12 who
described the process of binocular perception:
... the mind perceives an object of three dimensions by means of the two dissimilar pictures
projected by it on the two retinae... [82]
The important observation was that the binocular perception of two correctly dis-
placed 2D-images of a scene is equivalent to the perception of the 3D scene itself.
Figure 1.1 illustrates the human binocular perception of a 3D scene, comprised of
a cone in front of a torus. At the right of this figure are the images perceived by the left
and the right eye. If we take a scene point, for example, the tip of the cone, this projects
to different positions on the left and right retina. The difference between these two
positions (retinal correspondences) is known as disparity and the disparity associated
with nearby surface points (on the cone) is larger than the disparity associated with
more distant points (on the torus). As a result of this difference between foreground
and background disparity, the position (or alignment) of the foreground relative to
the background changes as we shift the viewpoint from the left eye to the right eye.
This effect is known as parallax.13
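The inverse relationship between disparity and depth can be made concrete for the simplest case of a rectified two-camera setup: depth Z = f·B/d, where f is the focal length in pixels, B the baseline between the two viewpoints, and d the disparity. A minimal sketch, with hypothetical rig numbers (the 65 mm baseline is roughly a human inter-eye distance):

```python
def depth_from_disparity(f_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth Z = f * B / d for a rectified stereo pair: focal length f
    in pixels, baseline B in metres, disparity d in pixels."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity: point at infinity
    return f_px * baseline_m / disparity_px

# Hypothetical rig: f = 700 px, B = 0.065 m. As in Fig. 1.1, the nearby
# cone has a larger disparity than the distant torus:
print(depth_from_disparity(700, 0.065, 50))  # cone: nearer, ~0.9 m
print(depth_from_disparity(700, 0.065, 10))  # torus: farther, ~4.5 m
```

This also makes the footnote's point about parallax quantitative: as the background distance grows, its disparity tends to zero and the background becomes stationary in the image.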
Imagine now that the 3D scene of the cone in front of the torus is observed by
a binocular camera with two lenses that are separated horizontally by the inter-eye
distance of a human observer. If these images are presented to the left and right eyes
of the human observer later on, she or he cannot distinguish the observed real scene
from the binocular images of the scene. The images are fused inside the binocular
perception of the human observer to form the 3D impression. This observation led to
the invention of the stereoscope by Sir David Brewster14 [11], where two displaced
images could convey 3D information to the human observer.
13 The terms disparity and parallax are sometimes used interchangeably in the literature and this
misuse of terminology is a source of confusion. One way to think about parallax is that it is induced
by the difference in disparity between foreground and background objects over a pair of views
displaced by a translation. The end result is that the foreground is in alignment with different parts
of the background. The disparity of foreground objects and parallax then only become equivalent
when the distance of background objects can be treated as infinity (e.g., distant stars); in this case,
the background objects are stationary in the image.
14 Sir David Brewster, 1781–1868, Scottish physicist and inventor.
One of the first computer-based passive stereo systems employed in a clearly defined
application was that of Gennery who, in 1977, presented a stereo vision system for
an autonomous vehicle [27]. In this work, interest points are extracted and area-
based correlation is used to find correspondences across the stereo image pair. The
relative pose of the two cameras is computed from these matches, camera distortion
is corrected, and, finally, 3D points are triangulated from the correspondences. The
extracted point cloud is then used in the autonomous vehicle application to distinguish
between the ground and objects above the ground surface.
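The final step of such a pipeline, triangulating 3D points from correspondences once the camera poses are known, can be sketched with standard linear (DLT) triangulation. The camera matrices and scene point below are synthetic stand-ins for illustration, not taken from Gennery's system:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one scene point from two views.
    P1, P2: 3x4 camera projection matrices; x1, x2: (u, v) pixel
    coordinates of the same point in each image."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)   # null vector of A is the homogeneous point
    X = Vt[-1]
    return X[:3] / X[3]           # de-homogenise

# Toy rig: shared intrinsics K; second camera translated 1 unit along x.
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.2, -0.1, 4.0])
x1 = P1 @ np.append(X_true, 1); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1); x2 = x2[:2] / x2[2]
print(triangulate(P1, P2, x1, x2))  # recovers X_true (up to numerical precision)
```

With noise-free correspondences the null space of A is exact; with real matches, as in Gennery's area-based correlation, the SVD gives the least-squares solution.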
When eight or more image correspondences are given for a stereo pair, captured
by cameras with known intrinsic parameters, it is possible to linearly estimate the
relative position and orientation (pose) of the two viewpoints from which the two
projective images were captured. Once these relative viewpoints are known, then
the 3D structural information of the scene can be easily recovered from the image
features.

22 The selection is subjective and open to debate. We are merely attempting to give a glimpse of the
boy who likes to play games in which the lead soldier and other
of imitation warfare have a part, can make his own lead
soldiers, and other castings, by the use of a plaster-of-Paris mold. If
he cannot undertake this work alone, the process is interesting for
his older brother, or even for “daddy.” A mold of plaster of Paris, as
shown in the illustration, is used for the casting box. The hollow
impression of the soldier is filled with the molten lead, which is
poured in through the sprue hole at the top. When the lead cools, the
mold is opened, the casting removed, and the process repeated. An
entire army can thus be made with a single mold.
First obtain a small lead soldier, and coat it with shellac. Make a
box somewhat larger than the pattern for the soldier, as shown in the
sketch. Make it about 1¹⁄₂ in. deep, and set bolts near the corners, as
shown, pouring the plaster around them. Fill the box half full of
plaster of Paris. While still soft, press the pattern into the center of
the plaster so that half its thickness is imbedded. Permit the under
mold to dry, and remove the pattern. Shellac the surface of the
plaster and the impression. Wrap a layer of oiled paper around the
bolts. Replace the pattern in the impression and fill the remaining
half of the box with plaster, and permit it to dry.
Also make a small wooden plug, and set it in the center, its point
touching the pattern, and pour the plaster around it. When the mold
is dry remove this plug, thus forming the sprue hole, through which
the molten lead is poured into the mold.
Lead Soldiers, and Many Other Small Castings, can be Made by the Use of
This Plaster-of-Paris Mold
When the second part of the mold is dry, lift it carefully from the
under mold, and remove the pattern. Shellac the surface of the top
mold, cleaning away any small bits of plaster around the edges. Trim
down the box so that the top mold projects over it about ³⁄₈ in.,
making it easy to drop the top mold into place over the bolts. To use
the mold, make certain that it is clean inside and set the top into
place. Fasten down the wing nuts at the washers. Be very careful
that the mold is dry, as hot metal poured on a wet surface may cause
a dangerous splash. Repeat this process, and if care is taken about
300 castings can be made with one mold. The soldiers can be
painted suitably and even sold in sets. The process can be adapted
to many forms of other small castings, using other suitable metals, or
wax, where the casting is to be molded into shape further.
A Trap Nest for the Poultry House
The Trap Nest Automatically Closes as the Hen Enters the Nest Box,
Releasing the Trigger
Poultry raisers find a trap nest useful, and one can be made
quickly by fitting an old packing box with a suitable sliding gate. In
the arrangement shown, the gate is raised slightly as the hen enters
the nest box, releasing the spring and causing the gate to drop. The
gate and spring can be adjusted to various-sized breeds of poultry.
The two grooved uprights can be cut from flooring and the other
wooden parts made from laths or wooden strips. The trigger is made
of wire.—A. J. Call, Hartsville, Mass.
A Simple Wireless Detector
This Neat Wireless Detector was Made of Materials Easily Gathered in the
Boy’s Workshop