Chapter 2
Computational Model for Dynamic Visual Analysis

In this chapter, we propose a computational model motivated by the human visual process. We call it the dynamic visual model (DVM). The model performs visual analysis on video sequences. We first review the fundamentals of the human visual system in Section 2.1. The proposed DVM is then presented in Section 2.2, and its potential applications are discussed in Section 2.3.

2.1 The Human Visual System
2.1.1 Structure and Motion Perceptions

As already mentioned, video sequences contain both spatial and temporal (spatiotemporal) information about scenes and are suitable for both structural and motion analyses of dynamic environments. The question arises whether these two kinds of analysis can be carried out in a single framework or must be done in two separate frameworks. To answer this question, we turn to the human visual system. It is known that there are two types of ganglion cells in the eye, called parvocellular and magnocellular cells. They are the starting neurons of two distinct pathways, referred to as the parvocellular and magnocellular pathways. The former is responsible for structural analysis, while the latter is devoted to motion analysis. These two pathways are only partially independent [Zek74] because they interact at a number of different levels. The intermediate results of structural and motion analyses can influence each other at some stages of processing [Lev91].


Although structural and motion analyses are conducted in separate pathways, they are not necessarily processed in parallel. Motion is often analyzed before form and meaning, and the faster an object moves, the sooner its motion is perceived. This capability is especially prominent in lower animals when they escape from predators, search for food, or select a mate. Likewise, we have all had the experience of getting out of the way of a fast-moving object before determining what it is. Our brain is not fast enough to analyze the form of a rapidly moving object before perceiving its motion. Cognitive scientists refer to this phenomenon as an adaptive response of the brain [Mar91].

Structural analysis is indeed more complicated than motion analysis. Motion analysis involves the determination of two main attributes: the speed and the direction of motion. Structural analysis, however, must work out the details of a form in order to perceive it. Furthermore, structural analysis demands higher precision than motion analysis. Several structures in the human visual system reflect this fact. It has long been known that the retina contains two classes of receptors, cones and rods. There are about 15 times as many rods as cones. Cones are located primarily in the center of the retina (the fovea) and decrease rapidly in density toward the outermost periphery of the retina. Rods, on the other hand, are least dense around the fovea, increase in density up to a certain distance from it, and then decrease gradually toward the fringe of the retina. This anatomical evidence indicates that rods dominate a large proportion of the retina and serve to provide an overall picture of the field of view.

Since motion detection requires a large visual extent to be covered, the extensive periphery of the retina provides adequate space for this purpose. Rods, which dominate the periphery of the retina, play an important role in motion analysis. Moreover, rods are sensitive to illumination rather than color. This hints that color may not be a critical attribute for motion perception. Furthermore, whereas each cone has its own nerve end, several rods are connected to a single nerve end. The ganglion cell receptive fields associated with rods are consequently larger than those associated with cones. A larger receptive field means that lower-resolution spatial information is transmitted to the brain. This suggests that high-precision spatial information is not necessary for motion perception.
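The effect of several rods sharing one nerve end can be illustrated with a toy example. The sketch below is not part of the proposed model. It assumes a hypothetical one-dimensional "retina" of 16 receptors, pooled in groups of 4, with a single bright spot that moves between two frames. The names `pool`, `frame_with_spot`, and `peak` are invented for illustration. Under these assumptions, the pooled signal still reveals the direction of motion and a coarse estimate of the displacement, although the exact position of the spot is lost.

```python
def pool(receptors, group_size):
    """Sum the responses of adjacent receptors that share one nerve end."""
    return [sum(receptors[i:i + group_size])
            for i in range(0, len(receptors), group_size)]


def frame_with_spot(width, position):
    """Hypothetical 1-D retina with a single bright spot at `position`."""
    return [1 if i == position else 0 for i in range(width)]


def peak(signal):
    """Index of the strongest response in a signal."""
    return signal.index(max(signal))


WIDTH, GROUP = 16, 4                  # assumed sizes, chosen for illustration
before = frame_with_spot(WIDTH, 2)    # spot at receptor 2 in the first frame
after = frame_with_spot(WIDTH, 9)     # spot at receptor 9 in the second frame

pooled_before = pool(before, GROUP)   # coarse signal carried by 4 nerve ends
pooled_after = pool(after, GROUP)

# Displacement measured at full resolution and at pooled resolution.
shift_fine = peak(after) - peak(before)
shift_coarse = peak(pooled_after) - peak(pooled_before)

# The sign of the shift gives the direction; scaling by GROUP gives a
# coarse displacement estimate in receptor units.
direction = "right" if shift_coarse > 0 else "left" if shift_coarse < 0 else "none"
coarse_estimate = shift_coarse * GROUP

print(shift_fine, shift_coarse, direction, coarse_estimate)
```

The coarse estimate is accurate only to within one receptive field, which is consistent with the observation that motion perception tolerates low spatial resolution.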

2.1.2 Pattern Recognition

Both structural and motion analyses can be transformed into pattern analysis. A pattern is an abstract representation of an object's shape, a spatial relationship, a movement, or some combination of these. From a practical point of view, a pattern consists of a set of distinctive features. A distinctive feature is an attribute that characterizes a shape, relationship, or movement. Pattern recognition (e.g., object and action recognition) relies on detected features that uniquely distinguish an object or an action. Many studies on the visual systems of lower animals [Hur86, Let59, Tin51] and humans [Arb72, Rob75, Sek75, Sel59] have supported this assertion. The same studies reveal that a hierarchy of organs exists along the path from the retina to the primary receiving area of the cortex, and then onward to the areas of the brain concerned with association. Each organ is sensitive to a specific feature of a certain level of complexity. Pattern recognition is accomplished in the association areas of the brain based on the features detected by these organs.

2.1.3 Parallel Distributed Processing

Traditional sequential computers are known to be clumsy at pattern recognition because such tasks require many pieces of information to be considered simultaneously. Human brains, with their extremely high degree of parallelism, seem to manage these tasks effortlessly. In addition, the method of knowledge representation is another important factor that influences the efficacy of the brain in pattern analysis. Diverse representation schemes have been studied. Of these, the distributed representation scheme has a number of appealing characteristics [Hur86], such as constructivity (recalling knowledge from incomplete or imperfect contents), generalization (generalizing modifications to related knowledge), and tunability (automatically adapting to changing circumstances). This representation scheme, together with parallel processing, leads to what cognitive scientists call parallel distributed processing.
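Constructivity, the recall of knowledge from incomplete or imperfect contents, can be illustrated with a small associative memory. The sketch below is not the scheme described in [Hur86]. It swaps in a Hopfield-style network with Hebbian weights as one well-known instance of a distributed representation. The two stored patterns, the corrupted cue, and the function names `train` and `recall` are assumptions chosen for illustration.

```python
def train(patterns):
    """Build Hebbian weights: each stored pattern is spread over all weights."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:                      # no self-connections
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, cue, steps=5):
    """Update all units together until the state stops changing."""
    state = list(cue)
    for _ in range(steps):
        new_state = [1 if sum(w * s for w, s in zip(row, state)) >= 0 else -1
                     for row in weights]
        if new_state == state:
            break
        state = new_state
    return state


# Two hypothetical patterns of +1/-1 values (chosen to be orthogonal).
stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
weights = train(stored)

# An imperfect copy of the first pattern: its last element is flipped.
cue = [1, 1, 1, 1, -1, -1, -1, 1]
restored = recall(weights, cue)

print(restored == stored[0])
```

No single weight holds either pattern. Each pattern is distributed over the whole weight matrix, and all units are updated together, which is a small-scale analogue of the parallel, distributed style of processing discussed above.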

Various forms of information processing are active both within and between layers of neurons, giving rise to different levels of information analysis, such as sensory, perceptual, syntactic, semantic, episodic, and action analyses. Pattern recognition is accomplished through a series of information analyzers arranged in a hierarchy. Although the analyzers play different roles in a pattern recognition process, they possess a similar structure [Kon67, Mar81]. Recall that different areas of the cerebral cortex are specialized for different functions, such as visual, auditory, olfactory, and motor
