THE STRUCTURE OF INTELLIGENCE
Get any book for free on: www.Abika.com
then it follows that the universe as we perceive it must possess the tendency to take habits. Of course, this line of thought is circular, because our argument for the general Gestalt rule involved the nature of our model of mind, and our model of mind is based on the usefulness of pattern-recognitive induction, which is conditional on the tendency to take habits. But all this does serve to indicate that perception is not merely a technical issue; it is intricately bound up with the nature of mind, intelligence, and the external world.
10 Motor Learning
 
10.0 Generating Motions
Twenty years ago, Marr (1969) and Albus (1971) suggested that the circuitry of the cerebellum resembles the learning machine known as the "perceptron." A perceptron learns how to assign an appropriate output to each input by obeying the suggestions of its "teacher". The teacher provides encouragement when the perceptron is successful, and discouragement otherwise. Marr and Albus proposed that the climbing fibers in the cerebellum play the role of the teacher, and the mossy fibers play the role of the input to which the perceptron is supposed to assign output.

Perceptrons are no longer in vogue. However, the general view of the cerebellum as a learning machine has received a significant amount of experimental support. For instance, Ito (1984) has studied the way the brain learns the vestibulo-ocular reflex -- the reflex which keeps the gaze of the eye at a fixed point, regardless of head movement. This reflex relies on a highly detailed program, but it is also situation-dependent in certain respects; and it is now clear that the cerebellum can change the gain of the vestibulo-ocular reflex in an adaptive way.

The cerebellum, in itself, is not capable of coordinating complex movements. However, Fabre and Buser (1980) have suggested that similar learning takes place in the motor cortex -- the part of the cortex that is directly connected to the cerebellum. In order to learn a complex movement, one must do more than just change a few numerical values in a previous motion (e.g. the gain of a reflex arc, the speed of a muscle movement). Sakamoto, Porter and Asanuma (1987) have obtained experimental evidence that the sensory cortex of a cat can "teach" its motor cortex how to retrieve food from a moving beaker.

Asanuma (1989) has proposed that "aggregates of neurons constitute the basic modules of motor function", an hypothesis which is in agreement with Edelman's theory of Neural Darwinism. He goes on to observe that "each module has multiple loop circuits with many other modules located in various areas of the brain" -- a situation illustrated roughly by Figure 10. In this view, the motor cortex is a network of "schemes" or "programs", each one interacting with many others; and the most interesting question is: how is this network structured?
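The perceptron mentioned at the start of this section, which is corrected by a "teacher" whenever its output is wrong, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the task (logical AND), the names train_perceptron, predict and AND_SAMPLES, and the learning rate. It shows the classical perceptron update rule only, and is not a model of the cerebellar circuitry proposed by Marr or Albus.

```python
# Classical perceptron rule on a toy task (illustrative assumptions throughout).

AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def predict(w, b, x):
    """Output 1 if the weighted sum of the inputs exceeds zero, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


def train_perceptron(samples, epochs=20, rate=1.0):
    """Adjust the weights only when the teacher signals a wrong output."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, target in samples:
            # Teacher's signal: 0 when the output is right, +1 or -1 when wrong.
            error = target - predict(w, b, x)
            if error != 0:
                w = [wi + rate * error * xi for wi, xi in zip(w, x)]
                b += rate * error
    return w, b
```

Calling train_perceptron(AND_SAMPLES) returns weights for which predict reproduces the AND table. The weights change only when the output is wrong, which corresponds to the "discouragement" half of the teacher's role; a correct output leaves them untouched.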
10.1 Parameter Adaptation
 
 
Consider an algorithm y=A(f,x) which takes in a guess x at the solution to a certain problem f and outputs a (hopefully better) guess y at the solution. Assume that it is easy to compute and compare the quality Q(x) of guess x and the quality Q(y) of guess y. Assume also that A contains some parameter p (which may be a numerical value, a vector of numerical values, etc.), so that we may write y=A(f,x,p). Then, for a given set S of problems f whose solutions all lie in some set R, there may be some value p which maximizes the average over all f in S of the average over all x in R of Q(A(f,x,p)) - Q(x). Such a value of p will be called optimal for S.

The determination of the optimal value of p for a given S can be a formidable optimization problem, even in the case where S has only one element. In practice, since one rarely possesses a priori information as to the performance of an algorithm under different parameter values, one is required to assess the performance of an algorithm with respect to different parameter values in a real-time fashion, as the algorithm operates. For instance, a common technique in numerical analysis is to try p=a for (say) fifty passes of A, then p=b for fifty passes of A, and then adopt the value that seems to be more effective on a semi-permanent basis. Our goal here is a more general approach.

Assume that A has been applied to various members of S from various guesses x, with various values of p. Let U denote the n x 2 matrix whose i'th row is (f_i, x_i), and let P denote the n x 1 vector whose i'th entry is p_i, where f_i, x_i and p_i are the values of f, x and p to which the i'th pass of A was applied. Let I denote the n x 1 vector whose i'th entry is Q(A(f_i, x_i, p_i)) - Q(x_i). The crux of adaptation is finding a connection between parameter values and performance; in terms of these matrices this implies that what one seeks is a function C(X,Y) such that ||C(U,P) - I|| is small, for some norm || ||.

So: once one has by some means determined C which thus relates U and I, then what? The overall object of the adaptation (and of A itself) is to maximize the size of I (specifically, the most relevant measure of size would seem to be the l_1 norm, according to which the norm of a vector is the sum of the absolute values of its entries). Thus one seeks to maximize the function C(X,Y) with respect to Y.
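As a rough illustration of the fifty-pass comparison of p=a against p=b described above, and of the records U, P and I, the following sketch may help. All specifics are assumptions chosen to keep the example small: the "problem" f is a target number, A is one gradient step of size p toward it, Q is negative squared error, and the names quality, A, run_passes and compare_parameters are hypothetical. As a further simplification, both parameter values are tried on the same randomly drawn (f, x) pairs rather than on successive passes of a running algorithm.

```python
import random


def quality(f, x):
    """Q(x): higher is better. Assumed here to be negative squared error."""
    return -(x - f) ** 2


def A(f, x, p):
    """Hypothetical stand-in for A: one gradient step of size p on (x - f)^2."""
    return x - p * 2.0 * (x - f)


def run_passes(p, cases):
    """Apply A once to each (f, x) pair; return the records U, P and I."""
    U, P, I = [], [], []
    for f, x in cases:
        y = A(f, x, p)
        U.append((f, x))                         # i'th row of U is (f_i, x_i)
        P.append(p)                              # i'th entry of P is p_i
        I.append(quality(f, y) - quality(f, x))  # improvement on pass i
    return U, P, I


def compare_parameters(a, b, passes=50, seed=0):
    """Try p=a and p=b for `passes` passes each; keep the better on average."""
    rng = random.Random(seed)
    cases = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
             for _ in range(passes)]
    _, _, I_a = run_passes(a, cases)
    _, _, I_b = run_passes(b, cases)
    mean_a = sum(I_a) / passes
    mean_b = sum(I_b) / passes
    best = a if mean_a >= mean_b else b
    return best, mean_a, mean_b
```

For example, compare_parameters(0.1, 0.4) selects 0.4 under these assumptions, since a step of size 0.4 removes more of the error on every pass than a step of size 0.1. Averaging I separately for each value of p amounts to a very crude C that ignores U altogether; the more general approach in the text would instead fit C(U,P) to I.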
PARAMETER ADAPTATION AS A BANDIT PROBLEM
The problem here is that one must balance three tasks: experimenting with p so as to locate an accurate C, experimenting with P so as to locate a maximum of C with respect to Y, and at each stage implementing what seems, on the basis of current knowledge, to be the most appropriate p, so as to get the best answer out of A. This sort of predicament, in which one must balance experimental variation with use of the best results found through past experimentation, is known as a "bandit problem" (Gittins, 1989). The reason for the name is the following question: given a "two-armed bandit", a slot machine with two handles such that pulling each handle gives a possibly different payoff, according to what strategy should one distribute pulls among the two handles? If after a hundred pulls, the first handle seems to pay off twice as well, how much more should one pull the second handle just in case this observation is a fluke?

To be more precise, the bandit problem associated with adaptation of parameters is as follows. In practice, one would seek to optimize C(X,Y) with respect to Y by varying Y about the current