Research Activities in Digital Photogrammetry at The Ohio State University

A Collection of Papers Presented at the XVII Congress of ISPRS

Toni Schenk, Editor

Report No. 418
Department of Geodetic Science and Surveying
The Ohio State University
Columbus, Ohio 43210-1247
July 1992

Foreword

Most of the research efforts in photogrammetry are now directed toward digital photogrammetry. The arrival of digital photogrammetric workstations is a clear demonstration of the considerable success that has been achieved in this rapidly developing subfield of photogrammetry.

At The Ohio State University we embarked on research in digital photogrammetry six years ago. From a rather small group with no special equipment we have grown: two faculty, several postdoctoral researchers and fifteen Ph.D. students are now actively involved in digital photogrammetry research projects. Our laboratories are equipped with a high-performance softcopy workstation (Intergraph ImageStation), several UNIX workstations, image processing systems, digital cameras and scanners, all networked together.

It is with great pleasure that I serve as editor of this report. I have gently persuaded the majority of my advisees to submit a paper to the ISPRS Congress 1992. This report is a collection of those contributions. The research in my group is primarily focused on automating photogrammetric processes. Specifically, we are working on surface reconstruction, feature extraction and recognition, and on automated aerotriangulation.

By surface reconstruction I refer not only to automatic DEM collection but also to segmentation processes whose goal is to group the surface into breaklines and smooth patches to support the subsequent processes of object recognition. Several papers contribute toward that goal.

The first two contributions are my invited papers for the ISPRS Congress 1992. They may serve as a framework within which all the other contributions fit.
The first paper summarizes the most important concepts and issues of computer vision and relates them to digital photogrammetry. The second paper builds on this overview and focuses on conceptual and algorithmic aspects. There is some repetition because the presentations will not address the same audience.

Zong's paper is concerned with matching edges, in our case zero-crossings. She continued work originally initiated by Jin-Cheng Li. The idea is to position a template on an edge in one image and to find the corresponding edge in the other image by cross-correlation. The matching results are checked for continuity, as it is unlikely that discontinuities occur along edges.

Matched edges are irregularly distributed in object space. Thus, the problem of interpolating the surface arises. Al-Tahir investigates surface fitting methods. The thin plate method with weak continuity constraints is of particular interest because it allows detecting breaklines. They are compared with the position of edges, which are also potential breaklines. It now becomes possible to verify the hypothesis about breaklines and to use this information on the next level of matching.

Wang analyzes the interpolated surface for objects of a certain vertical dimension, called humps. The surface is segmented into regions of similar elevations, followed by comparing the shapes of their boundaries. The boundaries are grouped and classified into near horizontal and vertical edges. Hump detection is important for reconstructing surfaces in large-scale urban areas.

One of the reasons for the astounding capability of the human visual system to reconstruct surfaces is that it integrates several depth cues, e.g., apparent size, perspective, motion and texture. Lee's paper is concerned with segmenting the image by analyzing texture. Surface orientation and texture are very closely related.
Most surface reconstruction methods adopt a hierarchical approach, for example by constructing image pyramids. Stefanidis examines the hierarchical approach with regard to scale space theory. He explores the relationship between images and surfaces, since both can be represented in scale space.

The goal of our OSU surface reconstruction system is to segment the surface into smooth patches and breaklines and to represent them by a symbolic description. This step is important for the subsequent task of object recognition. Krupnik groups matched edges in the object space into straight lines and regular curves. He compares different methods that allow 3-D segmentation.

A fundamental task that occurs at all levels of the computer vision paradigm is comparing shapes. Fourier descriptors have long been used for that purpose. However, there is no real quantitative criterion for measuring the similarity of two objects. Tseng employs an innovative approach by embedding shape invariants in a least-squares adjustment procedure that provides not only a superior measure for the goodness of the match but also the transformation parameters between the two shapes.

Late vision processes, such as object recognition and image understanding, are application dependent (or goal-driven, if you prefer) and must incorporate domain-specific knowledge. Al-Garni's work on interpreting landforms with the help of a knowledge-based system is an important contribution to our research, since we will have the surface reconstruction system under the control of a knowledge-based system.

This report contains two contributions in the area of automated aerotriangulation, a subject of considerable research interest. Agouris' paper addresses the problem of matching multiple image patches simultaneously. This important step corresponds to the classical procedure of transferring and measuring points.
Considering the notorious problem with point transferring, one can expect a significant increase in reliability from multiple image matching. My paper describes general mathematical models which are suitable for matching multiple image patches.

The remaining two papers from Toth deal with analytical plotters and their digital counterparts, softcopy workstations. Both workstation types play an important role in our research. So does Charles Toth, who developed software systems for analytical plotters that are invaluable not only for research but for student laboratories as well. His second paper describes our research efforts on the softcopy workstation to keep the measuring mark (cursor) automatically on the ground. Thus, the operator is relieved from setting the cursor precisely on the ground.

Finally, I want to thank the authors for their contributions. However, this report would not have been possible without the help of Peggy Agouris, who spent many night shifts to put everything together. I wish to thank Irene Tesfai, who diligently read all the papers. Her comments are appreciated by every writer, none with English as mother tongue.

Funding for most of the research reported here was provided in part by the NASA Center for the Commercial Development of Space Component of the Center for Mapping at The Ohio State University.

Toni Schenk

TABLE OF CONTENTS

Foreword
1. Machine Vision and Close-Range Photogrammetry
   Toni Schenk
2. Algorithms and Software Concepts for Digital Photogrammetric Workstations
   Toni Schenk
3. Resampling Digital Imagery to Epipolar Geometry
   Woosug Cho, Toni Schenk & Mustafa Madani
4. Aerial Image Matching Based on Zero-Crossings
   Jia Zong, Jin-Cheng Li & Toni Schenk
5. On the Interpolation Problem of Automated Surface Reconstruction
   Raid Al-Tahir & Toni Schenk
6. 3-D Urban Area Surface Analysis
   Zhong Wang & Toni Schenk
7. Image Segmentation from Texture Measurement
   Dong-Cheon Lee & Toni Schenk
8. On the Application of Scale Space Techniques in Digital Photogrammetry
   Anthony Stefanidis & Toni Schenk
9. Segmentation of Edges in 3-D Object Space
   Amnon Krupnik & Toni Schenk
10. A Least-Squares Approach to Matching Lines with Fourier Descriptors
    Yi-Hsing Tseng & Toni Schenk
11. Control Strategies for an Expert System to Interpret Landforms
    Abdallah Al-Garni & Toni Schenk
12. Multiple Image Matching
    Peggy Agouris & Toni Schenk
13. Reconstructing Small Surface Patches from Multiple Images
    Toni Schenk & Charles Toth
14. On Matching Image Patches Under Various Geometrical Constraints
    Charles Toth & Toni Schenk
15. A GIS Workstation-Based Analytical Plotter
    Charles Toth & Toni Schenk

MACHINE VISION AND CLOSE-RANGE PHOTOGRAMMETRY

Toni Schenk
Department of Geodetic Science and Surveying
The Ohio State University, Columbus, Ohio 43210-1247
USA

ABSTRACT

This paper provides an overview of concepts and methods of machine vision as it may pertain to close-range photogrammetry. The ultimate goal of a machine vision system is to recognize objects from one or several 2-D images. This cannot be achieved in one giant step. Intermediate processes and representations are necessary. Usually, the first goal is to reconstruct the 3-D surface of the object space, with emphasis placed on a symbolic description in which surface properties are made explicit. The surface information aids the subsequent object recognition task. The paper concludes with suggestions on how some of the concepts developed in machine vision can (and should!) be employed in digital close-range applications.

1 INTRODUCTION

Since time immemorial, mankind has been fascinated by the idea of creating a machine that would somehow exhibit mental capabilities. The robot is a typical example of such dreams.
With the attempt of endowing computers with information processing capabilities similar to those of humans, researchers in artificial intelligence pursue this dream in modern times. Ever since computers became available, researchers tried to mimic the mental faculty of seeing. The endeavour of machine vision seemed to achieve quick success. Expectations were pushed far beyond what could be delivered, and disillusion followed. The problem has been tremendously underestimated, like many other problems tackled by artificial intelligence.

We see and interpret scenes without conscious effort; however, this does not mean that the task is easy. Clearly, the lack of a detailed understanding of vision is the reason why it is so difficult to make computers understand and analyze images. It seems only natural that someone who attempts to solve a vision task should have a good understanding of the human visual system. Admittedly, this view is not shared by every vision researcher.

As the name suggests, digital photogrammetry deals with digital imagery. Great strides have been made during the last ten years due to the availability of new hardware and software, such as image processing workstations, parallel processing, and increased storage capacity. This in turn spurred much interest in research and development. The arrival of digital photogrammetric workstations is a clear demonstration of the progress achieved.

The goal of digital photogrammetry is to capture images and to store, manipulate and process them automatically. In that regard, digital photogrammetry and machine vision have the same goals. The purpose of this paper is to present the major concepts, methods, solutions and issues of machine vision.
This may be a risky enterprise, considering the glut of publications in that field and the high probability that the machine vision research community would not unanimously agree on what the concepts and issues are.

We begin with a summary of human vision, for it is a measure beyond all bounds. Most of the material presented is based on recent research results. We conclude the section about human vision with Marr's theory about vision because it is the most advanced approach to date. It has been widely accepted by visual psychologists and the machine vision research community.

The exposition of machine vision begins with the paradigm, followed by the most important concepts, methods, and critical issues. This paves the way for comparing digital close-range photogrammetry and machine vision. We elaborate on a few but very important aspects which the two disciplines share and point out where they differ. It is hoped that the concluding remarks stimulate discussions on how digital photogrammetry and machine vision can benefit from each other, more than they do now.

2 HUMAN VISION

For an animal or person to respond properly to a changing environment it must detect objects, events and structures. This ability, called perception, requires that a living organism be sensitive to afferent stimuli which carry important information about the environment. Most animals have some visual perception abilities. For people, vision is the most important sense. By the same token, it is by far the most impressive and complicated sense.

We see and analyze our environment continuously, nearly in real-time. That we do this without conscious effort does not imply that we know how we analyze and understand scenes, however. In fact, the lack of a detailed understanding of vision is the reason why it is so difficult to program a computer to analyze and understand images.
It seems only natural then that someone who attempts to solve part of this task should have a basic understanding of human vision. Consider the following summary as an exciting journey through the fascinating world of vision. Most of the material presented in the next subsection is from Hubel (1988).

2.1 Neurophysiology of Human Vision

Neurophysiology is concerned with the processes that are performed by specialized tissues and cells of the nervous system (Uttal, 1975). Visual information is processed in various stages at centers of specialized nerve cells, from the retina to the primary visual cortex. The processing centers are connected by the visual pathway, which can be thought of as a serial link (see Fig. 1).

Fig. 1: Visual pathway; each structure consists of millions of cells. Information is sent to one or several higher-order structures. (Figure adapted from Hubel, 1988)

Light is focused on the retina to form an image. Approximately 125 million light-sensitive photoreceptors (rods and cones) are unevenly distributed over the entire posterior portion of the eyeball. The retina consists of three layers: photoreceptors, middle layer, and ganglion cells whose axons are bundled together to form the optic nerve. Oddly, light passes through two layers before it reaches the photoreceptors, except for the site of acute vision, the fovea, a region smaller than a millimeter in diameter.

It is tempting to compare the eye with a camera. The analogy must be met with caution, however. First, the quality of the retinal image is far inferior to that of any cheap Instamatic camera. Aberrations of lens and cornea are responsible for considerable distortions. The curvature of the retina causes straight lines in object space to appear curved, disturbing the metrical relationship between image and object space. Moreover, the constant movements of the eye result in a blurred image.
While the purpose of the camera is to render a static snapshot of the world, the eye's and brain's purpose is to extract useful information to guide a person's response to an ever-changing environment.

How do the ganglion cells respond to incident light and what is reported back to the next processing centers? First, we note that there are far fewer ganglion cells than photoreceptors (the ratio is approximately 1:125). This is a first indication that the retinal image is processed by the cells of the middle layer and the ganglion cells. It also implies that one ganglion cell receives impulses from several photoreceptors.

The receptive field of a ganglion cell refers to those receptors which are "connected" to it. The circular center of a receptive field is surrounded by a ring-shaped region. An on-center ganglion cell reacts most (increases its firing rate) if the center of its receptive field is stimulated, for example by shining a spot of light on the receptors that form the center. The ganglion cell stops firing if the region surrounding the center of its receptive field is stimulated, but reacts with a burst of impulses when the stimulus is turned off. Off-center cells exhibit the opposite behavior. For example, if their centers are stimulated, firing is suppressed. Neither on-center nor off-center cells respond if their entire receptive field is evenly illuminated.

We conclude that ganglion cells respond to brightness differences within their receptive fields, that is, to local intensity differences. Receptive fields differ in size. As one would expect, the size is smallest in the fovea and progressively increases further out in the visual field. The light intensity changes, transmitted by the optic nerve, are detected by biological filters of the retina. Campbell and Robson (1968) showed that cells are sensitive to different spatial frequencies, a strong indication that the visual input is processed in multiple independent channels.
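The center-surround behavior described above can be illustrated with a small numerical sketch. It is only a toy model under stated assumptions: the receptive field is represented as a narrow excitatory Gaussian minus a broad inhibitory Gaussian, a positive weighted sum stands for an increased firing rate and a negative one for suppressed firing. The function names, the 9 x 9 field size and the two standard deviations are hypothetical choices made for this illustration, not values taken from the text.

```python
import math

def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel whose weights sum to 1."""
    half = size // 2
    raw = [[math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2.0 * sigma ** 2))
            for x in range(size)] for y in range(size)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]

def on_center_field(size=9, sigma_center=1.0, sigma_surround=2.5):
    """Hypothetical on-center field: narrow excitation minus broad inhibition."""
    center = gaussian_kernel(size, sigma_center)
    surround = gaussian_kernel(size, sigma_surround)
    return [[center[y][x] - surround[y][x] for x in range(size)]
            for y in range(size)]

def response(field, patch):
    """Weighted sum of the light intensities falling on the receptive field."""
    return sum(f * p
               for field_row, patch_row in zip(field, patch)
               for f, p in zip(field_row, patch_row))

def spot(size, radius, inside, outside):
    """Patch with one intensity inside a central disc and another outside."""
    half = size // 2
    return [[inside if (x - half) ** 2 + (y - half) ** 2 <= radius ** 2 else outside
             for x in range(size)] for y in range(size)]

field = on_center_field()
uniform = response(field, spot(9, 1, 1.0, 1.0))       # whole field evenly lit
center_lit = response(field, spot(9, 1, 1.0, 0.0))    # light on the center only
surround_lit = response(field, spot(9, 1, 0.0, 1.0))  # light on the surround only
```

Because both Gaussians are normalized to the same sum, even illumination cancels out (`uniform` is zero up to rounding), a spot on the center gives a positive value and light on the surround alone gives a negative one. Negating the field would give the corresponding off-center cell in this toy model.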
In the interest of brevity, we skip the next processing stage, the lateral geniculate bodies, and shift our attention to the primary visual (striate) cortex, a complex substructure of the cerebral cortex. The visual cortex is topographically organized: an area of about two millimeters square has all the functionality. These areas, self-contained modules of the striate cortex, map out a portion of the visual field. Consequently, if one such area is damaged, the corresponding part of the retinal image is not processed further and the result is local blindness. Neighboring modules do not compensate for the loss. However, the perceptual process "filling in" completes the missing information by interpolating it from the surrounding area.

The specialization of cells in the cortex increases. So does the complexity of their receptive fields. Unlike cells of earlier levels, cortical cells have no circularly symmetrical receptive fields, and they respond quite differently too. A simple cell, for example, responds best if a slit of light crosses its receptive field at a specific angle. Changing the orientation and position only slightly evokes no response. Other simple cells respond more strongly if one half of the receptive field is stimulated.

The most commonly found cells in the striate cortex are the complex cells. Like simple cells, they respond to properly oriented stimuli. However, the cell's firing rate fades out rather quickly unless the stimulus is moved. So, complex cells are movement sensitive; they respond with a barrage of impulses if a properly oriented slit is swept across their receptive fields. Some complex cells are also direction sensitive. That is, it matters in which direction the slit is moved. That a large population of cells is highly sensitive to movements makes a lot of sense, at least from an evolutionary point of view. After all, to react properly and timely to the environment, moving objects should be discovered promptly.
End-stopped cells are further specialized in that they are sensitive to the length of the stimulus. They respond much more strongly if the slit of light ends or changes direction within their receptive field. Thus, they respond best to corners and curvature.

So far information from the two eyes was treated separately, even though one cortical hemisphere receives information from both eyes. As photogrammetrists we are professionally interested in stereopsis. The corpus callosum is the site of stereovision. Here, binocular cells are found that respond to depth. Some of these cells only fire if the stimulus is roughly as far away as the distance on which the two eyes are focused (zero parallax). Other cells evoke a brisk barrage of impulses if the stimulus is nearer or farther away from the fixation point. Another characteristic feature of these disparity-tuned cells is that they are also orientation and movement sensitive. As one would expect, they do not respond at all if only one eye is stimulated. Though disparity-tuned cells undoubtedly contribute to stereovision, they are just a partial explanation of how we perceive depth. One should bear in mind that stereopsis is only one of several depth cues.

Let us interrupt our journey through the visual system for a moment and recapitulate. What reaches the brain is not an image, but information about changes in the scene, e.g. light intensity differences, their orientation, and movement. The specialization of cells and the complexity of their receptive fields increase. How far will this specialization go? After cells were discovered in the visual area of a monkey that responded to the shape of paws, the notion of a grandmother cell arose. Is there a cell that would respond to grandmother's face?

2.2 Visual Perception

Visual perception is the ability of humans to organize and interpret visual sensory information. The psychology of human visual perception was dominated in the late 19th century by associationism.
It was thought that perception could be explained by associating simple sensations. This was precisely what the Gestalt psychologists attacked most, for their basic tenet was that "the whole is more than the sum of its parts". They argued that the form and structure of sensations and their interrelationships should be taken into account. The Gestaltists thought that this synergism is accomplished by magnetic force fields between brain events. Gestalt psychology has fallen into disrepute, mainly because no evidence was found for the force fields in the brain.

Cognitive psychology adopts a more information theoretical approach where computer models of perceptual processes are legitimate goals for establishing psychological theories. This, together with a more quantitative approach in research, paves the way for "computational perception", results that can be converted to algorithms.

Perceptual organization

The neurophysiological approach to vision left us with the image decomposed into simple local features, such as edges, corners and some depth information. Such low-level descriptions must be organized into larger perceptual structures. Perceptual organization is the first process of perception (Hock, 1978). It detects groupings and structures in images which in turn are believed to be the input for object recognition and image understanding.

The following are examples of a set of criteria for grouping the image and finding associations. Most of these principles have been advocated by the Gestalt psychologists and are known as the Gestalt laws of organization. Proximity groups local features together which are close together. Depth is a very strong cue for proximity. Things with similar disparity values are grouped together and perceived as belonging to the same surface. Similarity groups similar features together. Similarity can override proximity. Common fate groups things together which appear to move together.
It can be demonstrated by generating randomly distributed dots and superimposing a copy with a slight shift or rotation. The shift or rotation is clearly perceived. Another Gestalt law is good continuation, which emphasizes smooth continuity over abrupt changes. Closure emphasizes preference for closed figures, and symmetry groups symmetrical features together. Figure-ground separation is quite a strong perceptual organization process.

In reality, grouping processes work concurrently on the same image. Two (or more) processes yielding the same interpretation result in a more salient perception. McCafferty and Fryer (1987) showed that a very strong and stable perception results from combining stereo with figure-ground separation.

Other perceptual processes

Here, we mention some other powerful perceptual processes which could be used in computational vision.

Filling in or completion is responsible for us not perceiving the world as a patchwork of edges and blobs (as might be concluded from the neurophysiological discussion about vision). A very illustrative example is the blind spot. Close one eye and fix a point with the open eye. Move a pencil with one hand so that it crosses the visual field. When the pencil is imaged at the blind spot, it disappears, as expected. However, you are not left with a black spot; rather, the hole in the retinal image is covered (filled in) by the surrounding background. Filling in appears to belong to a more general perceptual process called surface interpolation (Ramachandran, 1992).

Fig. 2(a): Example for virtual lines. Fig. 2(b) demonstrates the phenomenon of illusionary contours. The figure is perceived as a square and not as four partial circles.

Virtual lines are imaginary lines, linking nearby tokens. Fig. 2(a) is an example. A similar phenomenon are illusionary contours, investigated by Kanizsa (1979). In Fig. 2(b) we perceive the structure of a square. The four corners are lying on circles.
Another (unlikely) interpretation of this figure is four partial circles.

Texture is a very important but not well understood perceptual process. Texture is strongly related to surfaces. Slowly changing texture patterns give a strong perception of surface normals. Julesz studied texture segmentation intensively. He concludes that textured regions cannot be segregated if their first and second order statistics are identical. In Julesz and Bergen (1989) the notion of textons is introduced. The authors claim that they play a complementary role in human texture segregation.

2.3 Marr's Theory about Vision

The physiological approach to vision answered the question: what happens where? How something happens cannot be fully explained unless the cells' behavior can be described by a complete wiring diagram. For answering the question why single cells respond the way they do, a broader view must be adopted. As Marr put it:

    trying to understand perception by studying only neurons is like trying to understand bird flight by studying only feathers: It just cannot be done. In order to understand bird flight, we have to understand aerodynamics; only then do the structure of feathers and the different shapes of birds' wings make sense. (Marr, 1982, p. 27)

Marr's theory about vision has a strong information processing underpinning. He argues for understanding an information process (vision) at three different levels:

computational theory specifies what the visual system must do. It answers the question about the purpose of the computation and the strategy for solutions.

representation and algorithm investigates the representation of input and output and the algorithm that transforms one into the other.

hardware implementation answers the question how the representation and the algorithm can be physically implemented by neurons.
The tenet of Marr's theory is that the shapes and positions of things can be made explicit from images without knowing what the things are and what role they play. However, this cannot be accomplished in one step, but rather in a sequence of representations designed to facilitate the subsequent construction of physical properties of objects. The three main steps are briefly discussed.

Primal sketch

The purpose of the primal sketch is to make intensity changes in the image explicit. Intensity changes, or edges for short, are an important physical property of objects. In the real world edges occur over a wide range of spatial extents. A sharp edge, for example, is manifest within a small area, comprising a few pixels only. On the other hand, a fuzzy edge can only be detected by looking at a much larger area. Marr and Hildreth (1980) propose a sequence of LoG operators to detect edges at various scales. The LoG operator (Laplacian of a Gaussian) is obtained by taking the second derivative of a Gaussian filter. The Laplacian (∇²) is particularly suited because it is direction independent. By varying the standard deviation σ of the Gaussian, the desired sequence, also called multi-channel implementation, is obtained. Obviously, the parameter σ determines the spatial extent within which an edge is detected. Edges are identical with the zero-crossing contours that result from intersecting the convolution surface with a plane whose convolution value is zero. Thus, a sharp edge is obtained by convolving the image with a small σ (fine channel), and fuzzy edges result from coarser channels.

There is much evidence that the human visual system performs the same operations. Cells exist in the cortex that respond to different spatial frequencies. Spatial information is processed in each part of the visual field by five independent channels (Wilson and Bergen, 1978).
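The multi-channel zero-crossing idea can be sketched in one dimension. The following is a minimal illustration, not the Marr-Hildreth implementation: it samples the second derivative of a 1-D Gaussian, convolves two test profiles (a sharp step and a gradual ramp standing in for a fuzzy edge) and reports sign changes of the response. The truncation at 4σ, the mean subtraction, the border handling, the three σ values and the contrast threshold `eps` are all assumptions made for this sketch.

```python
import math

def log_kernel(sigma):
    """Sampled second derivative of a 1-D Gaussian, truncated at 4*sigma."""
    radius = int(math.ceil(4 * sigma))
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    kernel = [((x * x - sigma ** 2) / sigma ** 4) * norm
              * math.exp(-x * x / (2.0 * sigma ** 2))
              for x in range(-radius, radius + 1)]
    # Assumption: subtract the mean so the truncated kernel sums to zero and
    # regions of constant intensity produce no response.
    mean = sum(kernel) / len(kernel)
    return [k - mean for k in kernel]

def convolve(signal, kernel):
    """Convolve with a symmetric kernel, repeating the border samples."""
    radius = len(kernel) // 2
    last = len(signal) - 1
    return [sum(kernel[j + radius] * signal[min(max(i + j, 0), last)]
                for j in range(-radius, radius + 1))
            for i in range(len(signal))]

def zero_crossings(response, eps=1e-4):
    """Indices i where the response changes sign between samples i and i+1.

    eps is a hypothetical contrast threshold that rejects sign changes
    caused by rounding noise or very weak responses.
    """
    return [i for i in range(len(response) - 1)
            if (response[i] > eps and response[i + 1] < -eps)
            or (response[i] < -eps and response[i + 1] > eps)]

def edges_per_channel(signal, sigmas=(1.0, 2.0, 4.0)):
    """Zero-crossing positions for each channel (one channel per sigma)."""
    return {s: zero_crossings(convolve(signal, log_kernel(s))) for s in sigmas}

# A sharp step between samples 31 and 32, and a ramp spread over 15 samples.
sharp = [0.0] * 32 + [1.0] * 32
fuzzy = [min(max((i - 24) / 15.0, 0.0), 1.0) for i in range(64)]

sharp_edges = edges_per_channel(sharp)
fuzzy_edges = edges_per_channel(fuzzy)
```

With these assumed settings the step produces a zero-crossing at the same position in every channel, while the ramp is not reported by the finest channel and is reported by the coarsest one, which mirrors the statement that fuzzy edges result from coarser channels. The outcome for the ramp depends on the chosen threshold.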
Actually, the LoG operator is approximated by the difference of two Gaussians of slightly different σ. The two coarser channels have transient properties, responding to fluctuating patterns, while the finer channels respond to stationary objects. The finest channel is related to acute vision.

The primal sketch is more than just an agglomeration of zero-crossings. Perceptual processes operate on the image as well as on the edges, resulting in a curvilinear organization, virtual lines and groupings. Zero-crossings from different channels are combined, governed by the rule that edges in different channels are localized in space.

2.5-D sketch

Its purpose is to make explicit the orientation and depth of visible surfaces and their discontinuities. The name of this sketch derives from the assumption that it captures a great deal about the relative depths and surface orientations, and local changes and discontinuities, but some aspects are more accurately represented than others.

    Very locally we can easily say from motion or stereopsis information whether one point is in front of another. But if we try to compare the distances to two surfaces that lie in different parts of the visual field, we do very poorly and can do this much less accurately than we can compare their surface orientations. (Marr, 1982, p. 282)

The 2.5-D sketch is built up from the primal sketch, augmented with information from stereopsis, texture, analysis of motion, and shading. The surface orientation is much more accurate than depth. Only local changes in depth have a comparable accuracy. Discontinuities in depth may arise from stereopsis and occlusion. Occlusion may be specified by the presence of occluded edges in the primal sketch, or by analyzing motion patterns.

The 2.5-D sketch is represented as a set of primitives, depicted as "needles". The length of each needle describes the degree of tilt of that part of the surface, while the orientation of a needle reflects the direction of slant. The distance from the viewer is represented by a scalar quantity.
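One possible way to picture the needle primitives is to derive them from a small depth grid. This is a sketch under stated assumptions rather than Marr's formulation: the needle is taken as the image-plane projection of the unit surface normal, so its length grows with the inclination of the patch relative to the line of sight and vanishes for a patch facing the viewer, and its direction is taken as the image direction in which depth increases fastest. The function names, the central-difference gradients and the sign convention are hypothetical.

```python
import math

def needle(p, q):
    """Needle for a surface patch with depth gradient (p, q) = (dz/dx, dz/dy).

    Returns (length, direction). The length is the sine of the angle between
    the surface normal and the viewing direction (0 for a patch facing the
    viewer). The direction is the image-plane angle, in radians, along which
    depth increases fastest (a sign convention assumed for this sketch).
    """
    norm = math.sqrt(1.0 + p * p + q * q)
    length = math.hypot(p, q) / norm
    direction = math.atan2(q, p) if length > 0.0 else 0.0
    return length, direction

def needle_map(depth):
    """(length, direction, distance) at interior cells of a depth grid."""
    rows, cols = len(depth), len(depth[0])
    needles = {}
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            # Central differences approximate the local depth gradient.
            p = (depth[y][x + 1] - depth[y][x - 1]) / 2.0
            q = (depth[y + 1][x] - depth[y - 1][x]) / 2.0
            needles[(x, y)] = needle(p, q) + (depth[y][x],)
    return needles

# A plane receding to the right, and a plane facing the viewer.
tilted = [[0.5 * x for x in range(4)] for _ in range(4)]
frontal = [[2.0] * 4 for _ in range(4)]

tilted_needles = needle_map(tilted)
frontal_needles = needle_map(frontal)
```

Each entry pairs the two orientation components with the scalar distance, in the spirit of the representation described above. For the frontal plane every needle has zero length, while the receding plane yields needles of equal length that all point in the same direction.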
Interpolation procedums are invoked in areas of fnguficient information. In ateas oflow con tart, no edges are present and therefore no depth informetion. ‘Tae missing depth infor: ‘mation is interpolated from surrounding areas ‘where contrast is present. Another example for ‘an interpolation process are illusory contours (see Fig: 28. ‘The 25D sketch jn the end product of early vision procesies, solely derived from images, vithout support from late vision or knowledge of the scene, The early vision processes are modular, they work pardlel and independent ‘from one ancther. ‘The segmentation problem is implicitly solved by making explicit the ds- continuities between diffrent surface, S:D Model representation ‘The purpote of this last step is to. describe shapes and their spatial ganization in object centered coordinate system. Marr and Nishi hhara (1978) suggest « modular organization of shape descriptions in a cosrdinate frame which is determined by the shape itself (canonical co crdinate freme). The modslar organization l- Tows a description that js independent on the degree of details an object is described. ‘The theory is restricted to a sot of guneralized cones. A generalized cone is obtained by mov- ing a cross section of eonitant shape but vari able size along an axis, A. vase is 8 good ex: ‘ample of a generalized cone. An object may ‘consist of several generalized cones, each with its own axis. All axer of one object form the ‘component axes of that object A library of -D model descriptions at diferent levels of specificity is generated for objects that say possibly appear in «scene. ‘The same 3 ‘D model description mus: be derived from the Jimage. Object recognitioa then entails to com- pare these descriptions with the library Occluding contours of an image provide strong: clues for finding the axes of generalized cones Oceluding contours are the silhouettes of ob- jects. Even though mos: sihouettes ace am- biguous, humane interpre! 
them in a particular way. Marr hypothesizes that additional information is used to constrain the perception of 3-D shapes to silhouettes as we see them. These constraints are general and do not require prior knowledge of the scene.

3 MACHINE VISION

3.1 Introduction

From time immemorial people dreamed of creating machines that would exhibit mental abilities. With the invention of computers, researchers in the field of artificial intelligence (AI) pursue this dream to endow computers with information processing capabilities similar to those of humans. Richie (1988) defines AI as "the study of how to make computers do things at which, at the moment, people are better". Vision is not only our most impressive sense but also the most intensively studied sense in AI.

By and large, machine vision pursues the same goal as human vision: generate descriptions about the scene from images. The descriptions must be explicit and meaningful so as to allow other system components to carry out a task. In that respect, machine vision is part of an entire system that interacts with the environment, say a robot. Consequently, tasks such as decision making, planning, and executing decisions are not part of machine vision. By the way, the terms computer vision and machine vision are used interchangeably.

Machine vision is a relatively new and rapidly changing field. Many of the essential concepts have only evolved during the last ten years. The purpose of this chapter is to elucidate the most important concepts and to elaborate on the major issues. Even though machine vision is now a field in its own right, it is related to other areas, such as psychology, computer graphics, pattern recognition and image processing. In fact, significant progress has been made, and will be made, when an interdisciplinary approach is adopted. Take Marr's theory of vision as an example.
It is actually the combination of research results in neurophysiology, psychophysics, perception, computer science and signal processing.

Even though our knowledge of the human visual system is only fragmentary, we know that it is very complex. Machine vision, therefore, is a nontrivial task. Not surprisingly then, no general purpose vision system exists today and will not exist in the foreseeable future. The lack of rapid success, as enthusiastically predicted thirty years ago, led some AI researchers to a rather pessimistic assessment. In their view, machine vision is so ill-defined and underconstrained that no general solution exists. As Barrow put it:

Despite considerable progress in recent years, our understanding of the principles underlying visual perception remains primitive. Attempts to construct computer models for the interpretation of arbitrary scenes have resulted in such poor performance, limited range of abilities, and inflexibility that, were it not for the human existence proof, we might have been tempted long ago to conclude that high-performance, general-purpose vision is impossible. (Barrow, 1978)

Nevertheless, progress has been made, mainly in industrial applications, where the environment, such as lighting conditions, can be better controlled.

3.2 Machine Vision Paradigm

Marr's theory of vision gave rise to the most advanced and widely accepted paradigm of machine vision. Fig. depicts the building blocks. Usually, at the outset is a raw image. We also include image formation, a point forcefully advocated by Horn (see Horn, 1986) and now accepted by many vision researchers. After all, machine vision may be viewed as the inverse process of image formation. Thus it only makes sense to obtain a thorough understanding of image formation.

The primal sketch is the result of edge detection. Edges are likely to have been caused by structures in the scene, such as object boundaries, markings and surface discontinuities.
The unorganized edge fragments, bars and blobs are grouped into higher-level tokens, which are now processed by the independent modules stereopsis, shading, motion, texture to yield the 2.5-D sketch.

The 2.5-D sketch contains fewer data than the raw image, but more important, it is more explicit. An edge could be an object boundary or a shadow; a single pixel can be everything. Depth and 3-D shape information is particularly important. Shape and depth information is obtained independently from stereo, shading, motion and texture processes, also called shape-from-X processes. Note that the 2.5-D sketch is purely obtained from the raw images. It is the result of bottom-up processes, also referred to as early vision.
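The edge-detection step that produces the primal sketch can be illustrated in one dimension. The following is a minimal sketch, not the operator used in any particular system: it approximates the LoG response by a difference of two Gaussians (DoG) and reports the zero-crossings of that response as edge locations. The ratio of 1.6 between the two σ values, the kernel radius of three wider standard deviations, the border handling by sample replication, and all function names are assumptions made for this illustration.

```python
import math


def gaussian_kernel(sigma, radius):
    """Discrete Gaussian weights on [-radius, radius], normalized to sum 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, kernel):
    """Convolve a 1-D signal with a symmetric kernel (borders replicated)."""
    radius = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)  # clamp index at the borders
            acc += w * signal[j]
        out.append(acc)
    return out


def dog_zero_crossings(signal, sigma=1.0, ratio=1.6, eps=1e-9):
    """Return indices i where the DoG response changes sign between i and i+1."""
    radius = int(math.ceil(3.0 * sigma * ratio))
    narrow = smooth(signal, gaussian_kernel(sigma, radius))
    wide = smooth(signal, gaussian_kernel(sigma * ratio, radius))
    # Difference of Gaussians; responses below eps are treated as zero so
    # that rounding noise in flat regions does not create false crossings.
    dog = [a - b for a, b in zip(narrow, wide)]
    dog = [0.0 if abs(v) < eps else v for v in dog]
    return [i for i in range(len(dog) - 1) if dog[i] * dog[i + 1] < 0.0]


# A step edge between samples 19 and 20.
step = [0.0] * 20 + [1.0] * 20
print(dog_zero_crossings(step))  # -> [19]
```

Running the detector with several values of `sigma` would correspond to the different channels mentioned earlier; combining crossings that coincide across channels is left out of this sketch. A crossing that passes exactly through a zero-valued sample is also not handled here.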
