Opposition-Based Learning: A New Scheme for Machine Intelligence

Hamid R. Tizhoosh

Pattern Analysis and Machine Intelligence Lab

Systems Design Engineering, University of Waterloo,

Waterloo, Ontario, Canada

Internet: http://pami.uwaterloo.ca/tizhoosh/, E-Mail: tizhoosh@uwaterloo.ca

Abstract

Opposition-based learning as a new scheme for machine intelligence is introduced. Estimates and counter-estimates, weights and opposite weights, and actions versus counter-actions are the foundation of this new approach. Examples are provided, possibilities for extending existing learning algorithms are discussed, and preliminary results are presented.

1. Introduction

Many machine intelligence algorithms are inspired by natural systems. Genetic algorithms, neural nets, reinforcement agents, and ant colonies, to mention some examples, are well-established methodologies motivated by evolution, the human nervous system, psychology, and animal intelligence, respectively. Learning in such natural contexts is generally sluggish. Genetic changes, for instance, take generations to introduce a new direction into biological development. Behavior adjustment based on evaluative feedback, such as reward and punishment, requires prolonged learning time as well.

Social revolutions are, compared to the progress rate of natural systems, extremely fast changes in human society. They occur, simply expressed, to establish the opposite circumstances. A revolution is defined as "...a sudden, radical, or complete change...a fundamental change in political organization;...a fundamental change in the way of thinking about or visualizing something" [1]. Whether in a scientific, economic, cultural, or political sense, revolutions rest on sudden and radical changes. Of course, nobody can guarantee that redirecting a society, a system, or the solution of a complex problem in the opposite direction will necessarily result in a more desirable situation. The same phenomenon occurs in machine learning: mutation, for instance, can generate less fit offspring, causing sub-optimal solutions and/or slower convergence.

Machine intelligence offers numerous paradigms for solving hyper-dimensional problems. Search and optimization techniques (e.g. genetic algorithms), connectionist approaches (e.g. neural nets), and feedback-oriented algorithms (e.g. reinforcement agents) are among the most widely used intelligent methods for coping with challenging problems. In this paper, opposition-based learning is introduced as a new learning scheme. Section 2 describes the basic idea. To illustrate its usefulness, sections 3-5 present extensions of three existing paradigms, namely genetic algorithms, reinforcement learning, and neural nets, and provide preliminary results. Section 6 concludes the paper.

2. Basic Idea

Learning, optimization, and search are fundamental tasks in machine intelligence research. Algorithms learn from past data or instructions, optimize estimated solutions, and search large spaces for an existing solution. The problems are of different nature, and the algorithms are inspired by diverse biological, behavioral, and natural phenomena.

Whenever we are looking for the solution x of a given problem, we usually make an estimate x̂. This estimate is not the exact solution and may be based on experience or on a totally random guess. The latter is usually the case for complex problems (e.g. random initialization of the weights of a neural net). In some cases we are satisfied with the estimate x̂; in others we try to further reduce the difference between the estimated value and the optimal value, if the latter is directly or indirectly known. In this sense, if we understand the task of many intelligent techniques as function approximation, then we generally have to cope with computational complexity: a solution can be achieved, but the necessary computation time is usually beyond the permissible application limits (the curse of dimensionality).

Proceedings of the 2005 International Conference on Computational Intelligence for Modelling, Control and Automation, and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC'05), 0-7695-2504-0/05 $20.00 © 2005 IEEE
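The guess/counter-guess idea of this section can be sketched in a few lines. All names below are illustrative; the opposite of x on [a, b] is taken as a + b − x (the definition given in the next part of this section), and the closeness comparison assumes that the distance |f(x) − target| tracks the distance of x from the solution (e.g. f monotone and roughly linear):

```python
# Minimal sketch (assumed setup, not the paper's implementation) of the
# guess/counter-guess scheme with recursive interval halving (Figure 1).

def opposite(x, a, b):
    """Opposite number of x on the interval [a, b]: a + b - x."""
    return a + b - x

def opposition_search(f, target, a, b, tol=1e-6, max_iter=200):
    """Halve [a, b] toward whichever of guess and counter-guess
    evaluates closer to the target value."""
    while b - a > tol and max_iter > 0:
        x = a + 0.25 * (b - a)       # guess in the left half
        x_o = opposite(x, a, b)      # counter-guess in the right half
        if abs(f(x) - target) <= abs(f(x_o) - target):
            b = 0.5 * (a + b)        # keep the half containing the guess
        else:
            a = 0.5 * (a + b)        # keep the half containing the counter-guess
        max_iter -= 1
    return 0.5 * (a + b)

# Example: solve 2x + 1 = 9 on [0, 10]; the solution is x = 4.
root = opposition_search(lambda x: 2.0 * x + 1.0, 9.0, 0.0, 10.0)
```

Because each iteration evaluates both the guess and its opposite, the retained half always contains the better of the two, and the interval shrinks geometrically.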

In many cases, learning begins at a random point. We begin, so to speak, from scratch and move, hopefully, toward an existing solution. The weights of a neural network are initialized randomly, the parameter population of a genetic algorithm is configured randomly, and the action policy of a reinforcement agent is initially based on randomness, to mention some examples. A random guess that is not far from the optimal solution can result in fast convergence. It is natural to state, however, that if we begin with a random guess that is very far from the existing solution — in the worst case, at the opposite location — then the approximation, search, or optimization will take considerably more time or, in the worst case, become intractable. Of course, in the absence of any a-priori knowledge, we cannot make the best initial guess. Logically, we should be looking in all directions simultaneously or, more concretely, in the opposite direction. If we are searching for x, and if we agree that searching in the opposite direction could be beneficial, then calculating the opposite number x̃ is the first step.

Definition – Let x be a real number defined on a certain interval, x ∈ [a,b]. The opposite number x̃ is defined as

x̃ = a + b − x. (1)

For a = 0 and b = 1 we obtain

x̃ = 1 − x. (2)

The opposite number can be defined analogously in the multidimensional case.

Definition – Let P(x_1, x_2, ..., x_n) be a point in an n-dimensional coordinate system with x_i ∈ [a_i, b_i]. The opposite point P̃ is completely defined by its coordinates x̃_1, ..., x̃_n, where

x̃_i = a_i + b_i − x_i,  i = 1, ..., n. (3)

The opposition scheme for learning can now be concretized:

Opposition-Based Learning – Let f(x) be the function in focus and g(·) a proper evaluation function. If x ∈ [a,b] is an initial (random) guess and x̃ is its opposite value, then in every iteration we calculate f(x) and f(x̃). The learning continues with x if g(f(x)) ≥ g(f(x̃)), otherwise with x̃. The evaluation function g(·), as a measure of optimality, compares the suitability of results (e.g. a fitness function, reward and punishment, an error function, etc.).

Considering the interval [a_1, b_1] in Figure 1, the solution of a given problem can be found by repeated examination of guess and counter-guess. The opposite number x_o of the initial guess x is generated. Based on which of the estimate and counter-estimate is closer to the solution, the search interval can be recursively halved until either the estimate or the counter-estimate is close enough to an existing solution.

Figure 1. Solving a one-dimensional equation via recursive halving of the search interval with respect to the optimality of the estimate x and the opposite estimate x_o.

3. Extending Genetic Algorithms

Genetic algorithms (GAs) are stochastic search methods inspired by natural selection in the evolutionary process [2,3]. The parameters of an algorithm or system can be regarded as the chromosomes of individuals in a population of solutions; the fitter solutions are reproduced.

Idea – For every selected chromosome, a corresponding anti-chromosome can be generated. The initial chromosomes are generally generated randomly, meaning they can possess high or low fitness. In a complex problem, however, it is usually very likely that the initial population is far from optimal. Lacking any knowledge about the optimal solution, it is therefore reasonable to look at anti-chromosomes simultaneously. Considering the search direction and its opposite at the same time makes it more likely to reach the best population in a shorter time. Especially at the beginning of the optimization, either the chromosome or the anti-chromosome may be fitter (in some cases both may be fit solutions!). Considering a population member together with its genetic opponent should accelerate finding the fittest member.

Opposition-Based Extension – Besides the regular mutation, define a total mutation resulting in a complete bit inversion of the selected chromosome.


Experiments – In each iteration, four of the best individuals and four of the weakest members are selected. The anti-chromosomes are generated by total mutation of the weakest members.

Experiment 1 – A continuous and unimodal function should be maximized. The following function with a known maximum at f(x_opt) = 20971520 is selected:

f(x) = Σ_{i=1}^{20} x_i². (4)

Results – Table 1 shows the improvement caused by the anti-chromosomes used by the opposition-based GA (OGA) compared to the conventional GA. This improvement, however, is no longer present in the results after 50 and 200 iterations. The reason can be understood from the graphical display of the iteration progress shown in Figure 2.

Table 1. Average f(x) (upper row) and standard deviation (lower row) for the conventional GA and the opposition-based GA (OGA).

                          GA         OGA
After 25 iterations    11245058   14083021
                        1198332     709189
After 50 iterations    16525243   16385800
                        1112704     616371
After 200 iterations   20462917   20148009
                         396008     249261

The anti-chromosomes result in a dramatic spike in progress in the first several iterations, when the individuals in the population are still relatively far from the global optimum. In the subsequent iterations, the anti-chromosomes may be ineffective or may even negatively affect the evolution toward optimality. The pattern seen in Figure 2 was repeatedly observed.

Experiment 2 – In order to verify the results obtained from the initial investigations, the more complicated Rosenbrock's valley optimization problem, also known as the "banana" function, was used:

f(x) = Σ_{i=1}^{n−1} [100 (x_{i+1} − x_i²)² + (1 − x_i)²]. (5)

Convergence to the global minimum is difficult, and the function is therefore used to assess the performance of the anti-chromosomes. The same effect, namely improvement in early iterations, was observed here as well, although the benefit was not as pronounced as in the previous case.

Other Thoughts – The opposition-based extension of genetic algorithms by means of total mutation seems straightforward. However, it can be performed in different ways. In general, if we generate the anti-chromosomes by total mutation, then we are inverting all parameters of which each chromosome consists. This means we are assuming that all parameters should be changed in order to approach the solution faster. It is possible, however, that only a few parameters have to be changed to their opposites. This brings us to the more logical operation we may call total sub-mutation. Total sub-mutation selects parameters within the chromosome and inverts them (Figure 3).
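The difference from total mutation can be sketched in one function; the helper name and the block boundaries are assumptions for illustration:

```python
# Sketch of "total sub-mutation": instead of inverting the whole
# chromosome, only the bits of one selected parameter block are inverted.
def total_sub_mutation(chromosome, start, end):
    """Invert only the bits of the parameter block [start, end)."""
    return (chromosome[:start]
            + [1 - bit for bit in chromosome[start:end]]
            + chromosome[end:])

c = [1, 1, 0, 0, 1, 0, 1, 1]
c_sub = total_sub_mutation(c, 2, 5)   # only bits 2..4 are flipped
```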


Figure 3. Total mutation versus total sub-mutation. Top: all bits are inverted to mutate to the anti-chromosome. Bottom: only a part of the chromosome completely mutates.

4. Extending Reinforcement Learning

Reinforcement learning (RL) is based on the interaction of an intelligent agent with its environment through reward and punishment [4]. The agent takes actions to change the state of the environment and receives reward and punishment from the environment. In this sense, reinforcement learning is a type of (weakly) supervised learning. To explain how the concept of opposition-based learning can be used to extend reinforcement agents, we focus on the simplest and most popular reinforcement algorithm, namely Q-learning [4,5].

Generally, RL agents begin from scratch, make stochastic decisions, explore the environment, find rewarding actions, and exploit them. Especially at the very beginning, the performance of RL agents is poor due to a lack of knowledge about which actions can steer the environment in the desired direction. The agent should therefore also consider the opposite action and/or the opposite state. This shortens the state-space traversal and should consequently accelerate convergence.

Opposition-Based Extension – The modification of the Q-learning algorithm involves performing two Q-value updates for each action taken. The conventional algorithm updates the Q-value for the selected action taken in a particular state. The modified algorithm also updates the Q-value corresponding to the action opposite to the one chosen. This "double update" is done in order to speed up the learning process.

To accommodate this modification, a contrariness value is calculated at each update to determine how different an action is from its nominally opposite action. For instance, in a grid world or maze-learning problem, an action that takes the agent one cell to the right is nominally a direct opposite of the action that takes the agent one cell to the left. Depending on where the goal is situated, however, the actual reinforcements from the two actions may not be very dissimilar. This situation arises when both actions are equally unfavourable to reaching the goal. In order to take the estimated reinforcement of the nominally opposite action into account without explicitly performing the action, the contrariness value is used to define a corrective measure b for reinforcement modification:

b = e^{−c}, (6)

where c is the contrariness:

c = 2 |Q(s,a) − Q(s,ã)| / max(|Q(s,a)|, |Q(s,ã)|), (7)

where s is the state, a the action, and ã the counter or opposite action. A high contrariness value produces a small b, which is multiplied with the reinforcement received by performing the ordinary action to produce an estimate of the reinforcement for the imaginary opposite action. If the two Q-values are alike, c will be zero and b will be 1, thus assigning the same reinforcement to the opposite action.

The modified algorithm follows the conventional Q-learning algorithm with another crucial difference: an update also needs to be performed for the opposite action,

Q(s,ã) ← Q(s,ã) + α (r̃ + γ max_{a'} Q(s',a') − Q(s,ã)), (8)

where the reward r̃ for the opposite action is the reinforcement of the ordinary action scaled by the corrective measure, r̃ = b · r.

Two sets of experiments were conducted: 1) the Q-matrix is reset after each trial, and 2) the matrix is propagated throughout all trials. Each is compared to the conventional algorithm to determine the effectiveness of the modification.

Experiment – The program simulates a grid world of size 100×100 in which an agent has to learn a path to a fixed goal (Figure 4). The starting point of each learning trial is chosen randomly. The agent successively progresses from one state to another, receiving reinforcements along the way. Reinforcements are inversely proportional to the distance between the goal and the agent. The reinforcement received at each step is used to update the Q-matrix. Learning is considered complete when the agent reaches the goal, which is the terminal state.
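The double update of equations (6)-(8) can be sketched as follows. The two-state toy environment, the reward value, and the hyper-parameters alpha and gamma are illustrative assumptions; only the contrariness and update logic follow the text:

```python
# Sketch of the opposition-based double Q-update (eqs. 6-8).
import math

def contrariness(q, q_opp):
    """Eq. (7): c = 2|Q(s,a) - Q(s,a~)| / max(|Q(s,a)|, |Q(s,a~)|)."""
    denom = max(abs(q), abs(q_opp))
    return 0.0 if denom == 0.0 else 2.0 * abs(q - q_opp) / denom

def double_update(Q, s, a, a_opp, r, s_next, alpha=0.1, gamma=0.9):
    """Update Q for the taken action and for its nominal opposite."""
    best_next = max(Q[s_next].values())
    # conventional update for the action actually taken
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
    # corrective measure b = e^(-c) scales the reward for the opposite action
    b = math.exp(-contrariness(Q[s][a], Q[s][a_opp]))
    r_opp = b * r
    Q[s][a_opp] += alpha * (r_opp + gamma * best_next - Q[s][a_opp])

# two states; "left" is treated as the nominal opposite of "right"
Q = {0: {"right": 0.0, "left": 0.0}, 1: {"right": 0.0, "left": 0.0}}
double_update(Q, s=0, a="right", a_opp="left", r=1.0, s_next=1)
```

When the two Q-values are alike, b approaches 1 and the opposite action receives nearly the same reinforcement; a large gap between them shrinks b and dampens the imaginary update.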


Results – The most direct criterion for evaluating the effectiveness of the algorithms is to observe the number of steps and the number of trials required to reach the goal. A lower number of trials indicates that the goal is reached faster and with fewer failures. A smaller number of steps means the charted paths are more direct, which is also an indication of faster learning.

Figure 4. Grid world problem: the agent can move in different directions to reach the goal (cell with star).

The results of the different test runs are presented in Table 2 (average μ and standard deviation σ). From Table 2, a noticeable improvement in the convergence behavior of the new algorithm is obvious. The opposition-based extension of Q-learning surpasses conventional Q-learning in every criterion: average number of steps per trial (407 versus 450), average number of trials required for success (5.6 versus 8.3), and average number of steps per run (2259 versus 3754).

Table 2. Results for standard Q-learning (QL) and opposition-based Q-learning (OQL). Columns: Trials, Average Steps, and Total Steps for each algorithm.

In a third experiment, the two algorithms QL and OQL were run for different grid world sizes. For each grid world, two optimal policies π_1 and π_2 were generated manually. A performance measure A was defined to quantify the accuracy of the policy π* generated by the agent:

A(π*) = |(π* ∩ π_1) ∪ (π* ∩ π_2)| / m², (10)

where |·| denotes the cardinality of a two-dimensional set and the grid world is of size m×m. For K trials, the total average accuracy is

A_{m×m} = (1/K) Σ_k A(π*^{(k)}). (11)

The results are presented in Table 3.

Table 3. Results of experiments: average policy accuracies for Q-learning (QL) and the proposed opposition-based algorithm (OQL) for different grid world sizes. A soft-max policy was used for all algorithms (max. 50 episodes and 1000 iterations per episode).

Other Thoughts – Of course, the concept of opposition can only be applied if opposite actions and opposite states are meaningful in the context of the problem at hand. The counter-action is defined as "to make ineffective or restrain or neutralize the usually ill effects of by an opposite force" [1]. With regard to action a and state s and the existence of their opposites ã and s̃, the following cases can be distinguished:

• Both ã and s̃ can be defined: four cases can be updated per state observation.
• Only ã can be defined: two cases can be updated per state observation.
• Only s̃ can be defined: two cases can be updated per state observation.
• Neither ã nor s̃ can be given: the application of the opposition concept is not straightforward.

Assuming that opposite actions and opposite states both exist, at least four state-action pairs can be updated in each iteration. In general, if action a is rewarded for state s, then a is punished for the opposite state s̃, and the counter-action ã is punished for s and rewarded for s̃ (Figure 5).

Figure 5. Time saving in RL: the action a is rewarded for the state s (the gray cell). The opposite cases are updated simultaneously without explicit action-taking.

5. Extending Neural Networks

Artificial neural networks (ANNs) have been established as a major paradigm in machine intelligence research [6,7]. Their ability to learn from samples or instructions, supervised or unsupervised, and from binary or discrete inputs has made them an invaluable tool for dealing with complex problems.

The idea of opposition-based learning can be integrated into neural computing in different ways. Generally, there are two possibilities for employing the opposition concept:

• Opposite Weights – Weights of an ANN can be selected and replaced by opposite weights. This procedure is similar to mutation in genetic algorithms. How many weights should be selected, and how they should be selected, offers a wide range of possible schemes to be investigated.

• Opposite Nets – An instance of the network with opposite weights can be generated. The entire network is duplicated, and all its weights are replaced with opposite weights. The errors of the net and the opposite net are then observed to decide which one should be deleted.

Assuming w_ij is the i-th weight of the j-th layer, a randomly selected weight w_ij is changed to its opposite value w̃_ij by

w̃_ij = a + b − w_ij, (12)

where a and b are the minimum and maximum weight values, respectively. Since we may not know the range of every single weight, we have to estimate it. Assuming that w̄_ij(k) is the average value of the weight over k consecutive iterations, the interval boundaries can be calculated as follows:

a(k) = w̄_ij(k) (1 − A e^{−k}), (13)
b(k) = w̄_ij(k) (1 + A e^{−k}), (14)

where A is a sufficiently large positive number. Hence, at the beginning a large interval is assumed, which shrinks as learning proceeds.

The other alternative is to create an opposite network. An identical instance of the network is created and all its weights are replaced by opposite values. The convergence criterion, e.g. the total error, is calculated for the net and the opposite net and compared. Learning continues with whichever of the net and the opposite net has the lower error, as described in the following algorithm:

Algorithm
1. Initialize the net N.
2. Create the opposite net Ñ.
3. Calculate the error of each network.
4. If Error(N) ≥ Error(Ñ), then replace N with Ñ.
5. Adjust the weights.
6. If converged, then stop; else go to step 2.

5.3 Experiments with Opposite Weights

As a learning task, the digits 0 to 9 should be recognized by a neural net. The training set consisted of gray-level images of size 20×20. By minor translations and added noise, the size of the training set was increased to 396 images (Figure 6).

Figure 6. An example from the training set.

In order to demonstrate that the training set is suitable, a conventional network as implemented in Matlab was used. The network had 400 inputs and 2 active layers (20 neurons in the hidden layer and 1 output neuron).

For training, a resilient backpropagation algorithm was employed. The error function is illustrated in Figure 7, which shows that the selected net dimension was suitable for digit recognition in our case. We kept this dimension for all other experiments and used this net as a gold standard.
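The opposite-net algorithm (steps 1-6) can be sketched on a toy problem. Here the "network" is just a weight vector, the error is a quadratic distance to assumed target weights, and the weight adjustment is a plain gradient step; none of this is the paper's Matlab setup, only the alternation between net and opposite net follows the algorithm:

```python
# Toy sketch of the opposite-net scheme: keep whichever of the net and its
# opposite (eq. 12, with a fixed assumed weight range) has the lower error,
# then adjust the weights.
import random

def opposite_weights(w, a, b):
    """Eq. (12): every weight w_ij is replaced by a + b - w_ij."""
    return [a + b - wi for wi in w]

def error(w, target):
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))

random.seed(1)
a, b = -1.0, 1.0                               # assumed weight range
target = [0.6, -0.4, 0.8]                      # assumed "solution" weights
net = [random.uniform(a, b) for _ in target]   # step 1: initialize N

for step in range(100):
    anti = opposite_weights(net, a, b)         # step 2: opposite net
    if error(anti, target) < error(net, target):
        net = anti                             # step 4: continue with the better net
    # step 5: adjust the weights (gradient step on the quadratic error)
    net = [wi - 0.1 * 2.0 * (wi - ti) for wi, ti in zip(net, target)]
```

Since the opposite net is only adopted when it has the lower error, the swap can never worsen the solution, and the gradient step then contracts the remaining error.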


We investigated two different versions in order to verify the usefulness of opposition-based learning for the extension of neural nets.

Figure 7. Training with a resilient backpropagation net for digit recognition.

Version 1 – In every learning iteration, an opposite net consisting of opposite weights is built. The entire original net is duplicated and its weights are replaced by opposite weights, which are estimated as follows:

w̃_{k,ij} = (2/l) Σ_{n=k−l+1}^{k} w_{n,ij} − w_{k,ij}, (15)

where k is the learning step, i the neuron, j the input of the neuron, and l the length of the stored weight history (l = 3 learning steps).

Version 2 – The second version calculates the opposite weights as follows:

w̃_{k,ij} = [(2/l) Σ_{n=k−l+1}^{k} w_{n,ij} − w_{k,ij}] · a / (e^{kb} + 1), (16)

where a and b are selected in such a way that the last term is almost 10 at the beginning and equal to 1 after the 50th learning step.

The training was performed 50 times. Learning was interrupted after 200 epochs, and the net with the lowest error was declared the winner. In Table 4, the frequency of winning for each net is presented for three different learning rates. Interestingly, in the case of a low learning rate, both versions of the revolutionary extension of the network were the best (had the lowest error).

Table 4. The number of times a network had the lowest error, for learning rates 0.1, 0.5, and 1.0.

Net          0.1    0.5    1.0
Resilient      0     28     49
Version 1     50     14      1
Version 2     50      8      0

6. Conclusions

The concept of opposition-based learning was introduced in this work. The main idea, to consider counter-estimates, opposite numbers, anti-chromosomes, counter-actions, and opposite weights in machine learning algorithms, was discussed, and preliminary results were provided. Considering the multitude of methodologies and the vast number of possibilities for modifying them via the opposition concept, the results presented in this work are certainly far from sufficient or conclusive. The preliminary results should solely demonstrate the usefulness of the opposition-based extension of existing algorithms in some cases. Hence, this work does not advocate a final acceptance of opposition-based learning but attempts to demonstrate its benefits in a limited manner. Based on the observations made, however, one could conclude that revolutionary jumps during learning have clear advantages in the early stages and turn into a disadvantage as learning continues. Apparently, sudden switching to opposite values should only be utilized at the start, to save time, and should not be maintained once the estimate is already in the vicinity of an existing solution. Extensive research is still required to implement different concepts, conduct experiments, and verify for which tasks and in which situations the proposed learning scheme can bring a clear improvement in convergence speed.

Acknowledgements – The author would like to thank the students who conducted the experiments. Special thanks go to Kenneth Lam, Leslie Yen, Graham Taylor, Monish Ghandi, Rene Rebmann, and Jendrik Schmidt.

7. References

[1] Webster Dictionary, www.webster.com.
[2] D.E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley Professional, 1989.
[3] M. Mitchell, An Introduction to Genetic Algorithms, MIT Press, 1998.
[4] R.S. Sutton, A.G. Barto, Reinforcement Learning: An Introduction, MIT Press, 2003.
[5] C. Watkins, Learning from Delayed Rewards, PhD Thesis, University of Cambridge, England, 1989.
[6] L.V. Fausett, Fundamentals of Neural Networks: Architectures, Algorithms and Applications, Prentice Hall, 1993.
[7] M. Anthony, Neural Network Learning: Theoretical Foundations, Cambridge University Press, 1999.

