
Solving inverse kinematics of PUMA 600 robot using Parallel Genetic Algorithms

Khaled Elghayesh School of Electrical Engineering and Computer Science University of Ottawa Ottawa, Canada K1N 6N5 kelgh092@uottawa.ca April 25, 2014

Introduction

Industrial robots are reprogrammable multi-use manipulators used in a variety of industrial applications. They generally consist of multiple rigid links connected by joints, with one end attached to the ground and the other acting as an end effector. The Programmable Universal Machine for Assembly (PUMA) is a robotic manipulator developed by Unimation in 1978. Shown in figure 1 [1], the PUMA 600 is one model of the PUMA series; it has six degrees of freedom and consists of seven links with six rotary joints. The problem of calculating the inverse kinematics of a PUMA 600 robot is the main topic of this study: it involves finding the set of joint variable values that realizes a desired end-effector pose for the robot. Determining the pose of the robotic arm includes calculating the position vector p and the orientation vectors from the joint values θ. Known as forward kinematics, this straightforward process is generally used to define the trajectory of the end effector of a PUMA robot. The reverse process, which calculates the joint variable values, is inverse kinematics. It is a complex, ill-posed, non-linear problem that has been tackled by a multitude of different approaches, from analytical to numerical to optimization-based. The success of each of the previously implemented approaches in the literature has often been limited to certain robotic configurations and environments. Therefore, the need has risen for a general, robust, and efficient mechanism that solves robotic inverse kinematics in a real-time or semi-real-time fashion.

Fig. 1 - PUMA 600 manipulator

The purpose of this study is to investigate the feasibility of using an implementation of a parallel genetic algorithm to solve the inverse kinematics of the PUMA 600 robot. Different implementation ideas and topologies will be explored and compared statistically to results that are already available from a sequential GA implementation by the author.

Literature Review

Parallel Genetic Algorithms (PGAs) are an important member of the wider family of parallel evolutionary algorithms (PEAs). Sequential evolutionary algorithms are population-based stochastic optimization search methods that have proven successful in many applications in engineering and machine learning. They comprise a family of search algorithms and techniques such as genetic algorithms, artificial neural networks, genetic programming, and evolution strategies. Starting with work by Grefenstette from as early as 1981, there have been attempts to propose a parallel computing architecture for a GA and to investigate the issues underlying the design of an efficient PGA. This was followed by much other literature from different researchers in the field on all aspects of PGAs, such as the work by Mühlenbein and Tanese in [2], [5] and [6]. The success of PGAs was demonstrated by their application to well-known scheduling problems, such as the school timetabling problem [7] and the train scheduling problem [8]. Before reviewing the literature on PGAs, it is worth stating the objectives of parallelizing a problem through a PGA implementation. We use PGAs not only to minimize the wall-clock time needed to reach a solution, but also to maximize the quality of the solution by obtaining an average solution that is closer to the optimum. Having more processors that work on small data structures in parallel gives higher chances of reaching more accurate solutions [3]. However, it is usually hard to measure speedup in parallel evolutionary algorithms due to their probabilistic nature, in which each run is different from the others and produces different results,

therefore performance evaluation has to be done in a statistical manner. Most of the literature discusses the different design aspects of PGAs in terms of topologies, subpopulation sizes, and isolation time for synchronisation between generations. The most fundamental concept of PGAs is migration. Genetic algorithms mimic the natural evolution process by having a population that stores certain characteristics encoded in genes and having them evolve over time through genetic operators, so that only individuals with better fitness are kept. PGAs add the ability to have multiple subpopulations, each operating independently and communicating according to a defined topology. Migration refers to the exchange of individuals between subpopulations in a manner that does not affect the behavior of the algorithm as a whole from the serial point of view [4]. It is a major point of research in PGAs and was highlighted in the work on migration and communication strategies by Tanese, one of the pioneers in the field [9], and by Belding [10]. Another interesting paper, on synchronous vs. asynchronous migration and when to migrate strategically, was written by Hijaze and Corne [11]. Migration directly affects GA performance in operators like selection and mating. Performance degrades when the size of the neighborhood increases; if it is too big, the behavior approaches that of a panmictic population. To monitor this, Sarma and De Jong propose a useful parameter in [12]: the ratio between the neighborhood radius and the population radius. PGAs are also categorized by topology. There are a few main streams of PGA topology design that the literature follows and agrees upon. The master-slave scheme is the most commonly used base concept, in which the master processor stores the population, selects the parents of the next generation, and applies genetic operators to individuals that are sent to slave processors to be evaluated and returned at the end of each generation [13].
There are many variants on this principle that use different communication and memory protocols. Higher processor utilization can be achieved by abolishing the division between generations, so that the master sends and receives without synchronization to account for different processor speeds [14]. Shared memory versus distributed memory is another variant applied in some topologies. This whole approach is called coarse-grained PGA, which, in contrast to fine-grained PGA, involves a higher computation-to-communication ratio, as illustrated in figure 2 below [3]. The exact answer of which is better depends on many factors, most importantly the application itself. A detailed analytical comparison between both approaches can be found in [14], [15] and [16].

Fig. 2 - Coarse-grained vs. fine-grained PGA

Other factors to consider when designing and choosing a PGA for an optimization application include the type of communication between different subpopulations. Most

PGA implementations are synchronous, since the time required to send individuals back and forth for genetic operators already exceeds any performance gains. It has been shown that with a high level of communication, the overhead makes the solution quality almost the same as what a sequential GA would achieve. Another interesting concept to note here is the difference between cellular evolutionary algorithms (cEA) and distributed evolutionary algorithms (dEA) [17]. dEA systems are those that involve sparse exchange of individuals, while cEAs are based on a neighborhood approach. The more subpopulations we have, the closer we get to a cEA system; however, the bigger the subpopulation size we use, the closer we approach a normal sequential panmictic GA. Sefrioui and Periaux propose the idea of real-coded GAs with a two-way migration scheme that involves two types of migration, and also emphasize the distinction between dEA and cEA. Alba and Troya show in [19] that a square or rectangular topology provides better performance for cEAs. There have been quite a few ready-made successful implementations of PGAs in the literature, based on different architectures. Due to the ease of the sequential implementation, a PGA can basically be run on most parallel frameworks like MPI, PVM, and JRMI, and even on metacomputing internet-based architectures like Globus. The idea of having independent equivalent runs of the algorithm with different initial conditions has been explored before and was quite successful [3]. Given how stochastic the GA is by nature, it is no surprise that this methodology can succeed, especially since statistical information is usually used to evaluate performance and speedup anyway. Calculating the speedup of a PGA is another challenge, again due to the probabilistic, stochastic nature of the algorithms.
It is hard to identify which run generated the best execution for a sequential GA so that it can be used as the baseline in the speedup calculation. Therefore, the usual way to analyze the performance of PGAs is to gradually increase the number of processors and statistically measure the performance by recording the best, worst, and average performance, both time-wise and solution-accuracy-wise [20]. This has been a general overview of PGA implementations and developments in the literature. There is also a theoretical background in the literature that describes the behavior and convergence of PGAs, which is out of the scope of this literature review, as the research methodology for investigating the feasibility of using PGAs for the PUMA 600 inverse kinematics optimization problem will be empirical analysis and comparison with different techniques, in particular those implemented by the author in [20]. A further review and implementation of particular algorithms from the state-of-the-art PGA literature is to be done as the next step in this study. Robotic inverse kinematics is an ill-posed non-linear problem whose complexity varies with the robotic arm design. Several approaches have been explored to try to solve it in the most optimal way. Previous methods range from analytical solutions, geometric or algebraic, to numerical iterative solutions that utilize the Jacobian of the position matrix [23] [24]. A major drawback of these methods is that they are so computationally intensive that their usability in practical applications is limited, since running in a real-time or semi-real-time fashion is expected there. Hence the motivation of this research: to investigate the feasibility of using PGAs to solve robotic inverse kinematics, building on previous work on sequential genetic algorithms and other computational intelligence techniques.

3 Project Report

3.1 Sequential Genetic Algorithms

An important evolutionary computational technique that has been extensively and successfully used in various applications is Genetic Algorithms (GA). Inspired by the natural historical process of evolution first explored by Darwin, the extensive research done on genetic algorithms since as early as 1957 [12] has provided valuable solutions to various problems that had no feasible solution by any conventional method. The idea of natural selection, to have only the fittest members and their descendants surviving into the upcoming generations, was quite attractive to implement in population-based stochastic optimization methods, and intuitively promised good results. The methodology of GAs in general is based on generating a population of candidate solutions, with a size depending on the problem, and having them evolve strategically towards an optimal solution. Some specific GA operators define how a GA behaves and evolves differently from other computational intelligence techniques. The three basic genetic operators are selection, crossover, and mutation. Selection is an application of the principle of natural selection, in which individuals with higher fitness are probabilistically chosen to be forwarded to the new pool of individuals for the next generation. There are many selection mechanisms, each suited to a different category of applications; the most popular are roulette wheel selection and tournament selection [13]. The operator that mimics the process of mating is the crossover operator. Crossover is a genetic operator in which two individuals are chosen from the population, a crossover site along the two bit strings is randomly chosen, and the values of the two strings are exchanged past that site. For example, if S1 = 10001000 and S2 = 01011111, and the crossover point is 3, then the two offspring are 10011111 and 01001000.
Moreover, the genetic diversity of the population is maintained through mutation, in which a particular subset of the genetic material (chromosome) is changed from its initial state. There are many types of mutation, depending on the representation used for the genetic material. A crucial parameter is the mutation rate, which controls the frequency of mutations and whether they are based on preconditions or not. Figure 3 shows a basic illustration of how the genetic operators work together. In the following section, some important basic concepts and the different parameters will be discussed.

Fig. 3 - Basic GA in action

3.1.1 Parametrization and configuration

Encoding Encoding the variables of a real-valued optimization problem is a question of which encoding method will provide more flexibility towards the optimal solution of the problem. There are two main encoding methods in the literature, which will be explored below.

Binary Encoding Genetic algorithms naturally use chromosomes that are encoded in a binary format. In most GA implementations, the encoding of the variables is done in binary due to the ease of processing and of applying genetic operators such as crossover and mutation. The number of bits used reflects the required precision of the problem being optimized. For a particular range in the solution space, a linear scaling of the following form is used:

value_decoded = min + (max − min) × value_chromosome / 2^20

where min and max are the minimum and maximum values of the solution space, and value_chromosome is the integer value of the binary-encoded chromosome. Since up to 6 decimal places are required for the precision of the angles in the search space, 20 bits are used, giving a search space of 2^20 = 1,048,576 steps.
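The decoding step above can be sketched in a few lines of Python (the implementation language is not specified in this report, so this is an illustrative sketch; the joint-1 range of [-160, 160] degrees from table 1 is used as the example bounds):

```python
# Decoding a 20-bit chromosome with the linear scaling above. A minimal
# sketch; the joint-1 range of [-160, 160] degrees is used for illustration.

def decode(bits, lo, hi, n_bits=20):
    """Map an n_bits binary string onto the real interval [lo, hi)."""
    value_chromosome = int(bits, 2)            # chromosome as an integer
    return lo + (hi - lo) * value_chromosome / 2 ** n_bits

# All-zeros decodes to the lower bound; the chromosome with a leading one
# followed by zeros decodes to the middle of the range.
print(decode("0" * 20, -160.0, 160.0))          # -160.0
print(decode("1" + "0" * 19, -160.0, 160.0))    # 0.0
```

Note that with the 2^20 divisor of the formula, the upper bound itself is approached but never reached exactly.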

Continuous Encoding For higher-precision real-valued problems, and also for lower storage requirements, a continuous GA is usually preferred and continuous encoding is used. Single floating-point numbers are used to store and represent the variables instead of N-bit integers [13]. The main difference between continuous and binary encoding in an N-dimensional optimization problem is that each variable is represented as a separate floating-point number, i.e. a single chromosome in the population is stored in the format [p1, p2, p3, ..., pN]. A continuous GA implementation starts by randomly initializing the variables in a matrix of Npop × Nvar random values. Of course, there are limits on the values that each variable can take, and this has to be taken into consideration when generating the random values.
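The random initialization of a continuous-GA population described above can be sketched as follows (a minimal sketch, assuming the six joint-angle ranges listed in table 1 as the variable bounds):

```python
import random

# Random initialization of a continuous-GA population: an Npop x Nvar matrix
# of floats, each variable drawn within its own bounds. The six bounds below
# are the PUMA 600 joint-angle ranges in degrees.

BOUNDS = [(-160, 160), (-225, 45), (-45, 225),
          (-170, 170), (-135, 135), (-170, 170)]

def init_population(n_pop, bounds):
    return [[random.uniform(lo, hi) for (lo, hi) in bounds]
            for _ in range(n_pop)]

population = init_population(50, BOUNDS)   # 50 chromosomes of 6 variables each
```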

3.1.2 Selection

The process of selection involves the decision of which chromosomes are chosen for mating to produce offspring for the next generation. It is an important factor, and the choice of technique has an impact on how easily the algorithm converges to the global optimum.

Natural Selection and Thresholding Selection is based on the concept of survival of the fittest. There are different ways to model this, but the most straightforward is to introduce a selection rate, Xrate, which is the fraction of the population that survives to the next step of mating. For every iteration of the algorithm, this fraction of the population is preserved as the elite, most fit individuals, while the rest of the chromosomes are discarded. The usual and common selection rate in many implementations is 50%, and that is what is used here. One particular disadvantage of this approach is the need to sort the chromosomes by fitness at every iteration, which adds to the computational cost of each iteration. Thresholding is another technique, based on a particular threshold: all chromosomes that have a fitness higher than this threshold are discarded, regardless of their number. It has the advantage that no sorting is needed, but the question remains of how to choose the threshold, and how to adapt it across generations so that it stays suitable relative to the current mean fitness.
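The truncation scheme with Xrate described above can be sketched as follows (an illustrative Python sketch; here, as in the rest of this report, fitness is a cost to be minimized, so smaller is better):

```python
# Truncation ("natural") selection with a selection rate Xrate: sort the
# population by fitness and keep only the top fraction. A minimal sketch;
# fitness is a cost to be minimized, so smaller is better.

def select_survivors(population, fitness, x_rate=0.5):
    ranked = sorted(zip(fitness, population))      # best (smallest cost) first
    n_keep = int(len(population) * x_rate)
    return [individual for _, individual in ranked[:n_keep]]

# With Xrate = 0.5, half of the population survives to the mating step.
survivors = select_survivors([[10], [20], [30], [40]], [4.0, 1.0, 3.0, 2.0])
print(survivors)   # [[20], [40]]
```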

Cost weighting The idea of cost weighting is embodied in the famous roulette wheel selection approach, based on the image of a roulette wheel spinning with a ball that rolls and falls into a particular slot. Bigger slots have a higher chance of being selected, and how big a slot a particular individual gets is inversely proportional to its fitness. In most roulette wheel implementations, the selection probabilities are normalized to be between 0 and 1, but this is not done here because those calculations would be redundant.

The probability of an individual chromosome n being selected is given by

Pn = (1 / fitness_n) / Σ_{n=1..N} (1 / fitness_n)

A random number is selected between 0 and the sum of the weights, and the chromosome in whose wheel slot the number falls is selected. This process is done twice to select the two parents for the mating process.
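The unnormalized roulette wheel spin can be sketched as follows (an illustrative Python sketch of the cost-weighting idea, not the report's actual code):

```python
import random

# Roulette-wheel (cost-weighted) parent selection for a minimization problem:
# each chromosome's slot is proportional to 1/fitness. The weights are
# deliberately left unnormalized, as in the text.

def roulette_select(population, fitness):
    weights = [1.0 / f for f in fitness]          # smaller cost -> bigger slot
    spin = random.uniform(0.0, sum(weights))
    cumulative = 0.0
    for individual, w in zip(population, weights):
        cumulative += w
        if spin <= cumulative:
            return individual
    return population[-1]                         # guard against rounding

# Two independent spins pick the two parents for the mating process.
population = [[0.0], [1.0], [2.0]]
fitness = [1.0, 2.0, 4.0]
parents = (roulette_select(population, fitness),
           roulette_select(population, fitness))
```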

Rank weighting Another selection method proven to be successful is rank weighting [15]. Its principle is similar to that of cost weighting, except that the individual chromosomes are sorted and set up for the selection process based only on how they compare to each other, regardless of their particular fitness values. The power of this approach is its ability to maintain a constant selection pressure among the chromosomes without suffering the drawback of a small subset of very fit chromosomes dominating the selection process. Selection pressure is a measure of how competitive the selection process is between different individuals, and it is calculated by the formula Pbest / Pavg. It is clear that rank weighting always has a constant selection pressure, compared to cost weighting and roulette wheel selection.
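One common rank-weighting scheme assigns the chromosome of rank n (out of N, with rank 1 the best) a probability proportional to N − n + 1; the exact weights here are an assumption for illustration, not taken from this report:

```python
# Rank weighting: selection probability depends only on a chromosome's rank,
# so the selection pressure stays constant regardless of the raw fitness
# values. Rank 1 is the best chromosome.

def rank_probabilities(n):
    total = n * (n + 1) / 2                   # 1 + 2 + ... + n
    return [(n - rank + 1) / total for rank in range(1, n + 1)]

print(rank_probabilities(4))   # [0.4, 0.3, 0.2, 0.1]
```

The probabilities always sum to 1 and keep the same ratios between ranks whatever the fitness values are, which is exactly the constant-pressure property described above.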

3.1.3 Crossover

Crossover is the actual act of creating new offspring out of parents already present in the current population. Crossover is the core of the genetic algorithm engine, and it is how new information is created out of the diversity already present in the population at a given point in time.

Single/double point crossover The simplest form of crossover is to select a point along the bits and interchange the parts of the parent chromosomes before and after it to form the bits of the offspring. With one such point this is called single-point crossover; a variant is to choose two crossover points and interchange what lies between them. In a continuous GA, single/double point crossover is performed by interchanging the variables rather than the bits. The major difference is that in continuous GA we would just be interchanging the values of potential solutions, while in binary GA with single/double point crossover we would actually be generating new information, causing more diversity in the population. Continuous GA crossover is shown below.

parent1 = [pf1, pf2, pf3, pf4, ..., pfN]
parent2 = [pm1, pm2, pm3, pm4, ..., pmN]
offspring1 = [pf1, pf2, pm3, pm4, ..., pmN]
offspring2 = [pm1, pm2, pf3, pf4, ..., pfN]
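Both variants above can be sketched with one function (an illustrative Python sketch; Python slicing handles binary strings and continuous chromosomes stored as lists in the same way):

```python
# Single-point crossover: everything past the cut point is swapped between
# the two parents. The same function handles binary strings (swapping bits)
# and continuous chromosomes stored as lists (swapping whole variables).

def single_point_crossover(parent1, parent2, point):
    offspring1 = parent1[:point] + parent2[point:]
    offspring2 = parent2[:point] + parent1[point:]
    return offspring1, offspring2

# The binary example from the text: S1 = 10001000, S2 = 01011111, cut at 3.
print(single_point_crossover("10001000", "01011111", 3))   # ('10011111', '01001000')
```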

Uniform crossover Uniform crossover uses a randomly generated mask of 0s and 1s to decide which of the two parents each bit is taken from. It can be considered a generalization of the other crossover methods, as it produces a more scattered distribution of the potential resulting offspring. It was shown by Falkenauer in [16] that it performs better than a single or double point crossover scheme in almost all cases.

Blending (continuous GA) The problem with many crossover methods is that there is not much chance of getting new information into the generated offspring, since the offspring all lie along a line parallel to the coordinate axes [13]. Extending the possible range of generated offspring is the key to getting more diversity into the coming generations. Michalewicz proposed in [17] an interesting way to resolve this problem with the blending method, in which values are generated from a combination of the two parent values through the following equation:

p_new = β p_fn + (1 − β) p_mn

where β is a random number in [0, 1], and p_fn, p_mn are the nth variables from the parent chromosomes. However, the blending method implemented this way does not allow any values outside those already in the parent chromosomes. An extrapolating factor can improve this, as introduced by Wright in [14], as follows:

p_new1 = 0.5 p_fn + 0.5 p_mn
p_new2 = 1.5 p_fn − 0.5 p_mn
p_new3 = −0.5 p_fn + 1.5 p_mn

Three offspring are generated, and any variable that happens to fall outside the bounds of the solution space is rounded to the boundary. Then the fittest two offspring are chosen for the next generation.
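The blending and extrapolation formulas can be sketched for a single variable as follows (an illustrative Python sketch; the clipping to the variable bounds follows the rounding-to-boundary rule described above):

```python
# Michalewicz blending and Wright's extrapolating crossover for a single
# variable. beta is a uniform random number in [0, 1]; Wright's variant
# produces three candidates and rounds any out-of-bounds value to the
# boundary of the solution space.

def blend(p_f, p_m, beta):
    return beta * p_f + (1 - beta) * p_m

def wright_offspring(p_f, p_m, lo, hi):
    candidates = [0.5 * p_f + 0.5 * p_m,
                  1.5 * p_f - 0.5 * p_m,
                  -0.5 * p_f + 1.5 * p_m]
    return [min(max(c, lo), hi) for c in candidates]

# Unlike plain blending, extrapolation can reach outside the parent values.
print(wright_offspring(10.0, 20.0, -160.0, 160.0))   # [15.0, 5.0, 25.0]
```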

3.1.4 Mutation

Mutation introduces random new information into random members of the population in order to avoid premature convergence. GAs tend to converge quickly to easily accessed local minima, and left without intervention they will almost always converge to a local minimum. Mutation counters this by making random changes to random individuals, which opens up new regions of the search space and, whether the mutated individuals end up with higher or lower fitness, tends to have a positive impact on global convergence.

Binary Binary mutation involves flipping a random bit in a randomly selected chromosome from 0 to 1 or vice versa. The number of bits to be mutated is determined by the mutation rate, μ, which is the percentage of bits chosen randomly for mutation. Most of the literature uses a value of μ somewhere between 0.01 and 0.02.

Continuous Since continuous GAs do not have the luxury of simply flipping a bit to its other value, the randomness induced through mutation comes from changing the selected variable to another random value within the allowable range for that variable. The mutation rate in continuous mutation determines the percentage of variables chosen to be randomly changed.

Elitism To avoid ruining the fittest individuals by mutating them, a concept called elitism is used, in which a number of the fittest individuals are protected from mutation so that their powerful genes are preserved for subsequent generations. In [14], and also in many papers in the literature, this number is chosen to be 1.

3.1.5 Why Genetic Algorithms?

The wider scope of this research is to explore the performance of different computational intelligence methods on the PUMA 600 inverse kinematics problem. Several other techniques were implemented, such as swarm intelligence algorithms and neural networks. Only the results of a few particle swarm optimization algorithms were comparable to those of the implemented genetic algorithm. An illustration of how the genetic algorithm outperforms particle swarm optimization (PSO) algorithms is shown in the percentile plot in figure 4 below, which shows the distribution of the time taken over different runs of the different techniques.


Fig. 4 - GA performance vs different PSO algorithms*

* MPSO-TVAC refers to Mutation-based PSO with Time-varying acceleration coefficients [20]

3.2 Parallel Genetic Algorithms

The idea of parallelizing a sequential genetic algorithm comes naturally when we think of how well suited genetic algorithms are to parallelisation. Evolution and natural selection are inherently parallel in nature, since individuals evolve in subpopulations rather than in a single population. There are several aspects to consider when designing a parallelisation scheme for a GA, including fitness evaluation, mutation, the number of subpopulations, the communication scheme, and the selection scope (global or local). Accordingly, there are more than eight different classes used to categorize PGAs [25]. The most commonly used implementations are explained next, highlighting the main differences between them.

3.2.1 Independent runs

The earliest idea is automatic parallelisation through the compiler, where a pool of processors is used to speed up the execution of a sequential algorithm through what are known as independent runs [3]. No interaction at all takes place between the processors, as the work to be done is simply divided among those available. This technique speeds up the execution time of a sequential GA, and it can also be exploited to perform several independent runs and statistically analyze their performance, which is usually needed due to the stochastic nature of GAs.

3.2.2 Master-slave model

Another important factor that differentiates between PGA techniques is the number of subpopulations and their distribution. One of the simplest designs is the master-slave model, which has a single panmictic population whose fitness evaluation

process is distributed among the different processors. The population usually resides in one processor, referred to as the master, from which all communication takes place to all other processors, referred to as slaves. The master processor stores the population data and periodically distributes individuals to the slaves, where genetic operators like crossover and mutation are performed and fitness is evaluated. Parallelisation occurs by assigning a specific fraction of the population to each of the slave processors. An example master-slave schematic is shown in figure 5.
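The farming out of fitness evaluation can be sketched with a process pool (an illustrative Python sketch using the standard multiprocessing module; the quadratic fitness function is a placeholder, not the report's inverse-kinematics error):

```python
from multiprocessing import Pool

# A minimal master-slave sketch: the master holds the population and farms
# fitness evaluation out to worker processes. The quadratic fitness below is
# a placeholder for the actual inverse-kinematics error function.

def fitness(individual):
    return sum(x * x for x in individual)

def evaluate_population(population, n_workers=4):
    with Pool(n_workers) as pool:
        # Synchronous scheme: the master blocks until every slave returns.
        return pool.map(fitness, population)

if __name__ == "__main__":
    population = [[1.0, 2.0], [0.0, 0.0], [3.0, -1.0]]
    print(evaluate_population(population))   # [5.0, 0.0, 10.0]
```

An asynchronous variant would replace `pool.map` with `pool.imap_unordered` or per-individual `apply_async` calls, so that faster workers are not held back by slower ones.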

Fig. 5 - Master-slave PGA model

Communication in a master-slave implementation can be synchronous or asynchronous. A synchronous implementation behaves like a normal sequential GA that is merely faster in execution, while asynchronous communication introduces a challenge for the selection operators: the master processor only waits until a specific fraction of the individuals have been processed and are ready to proceed to the next generation. That is why tournament selection is better suited to master-slave PGA implementations, because it can operate on those individuals whose fitness has already been evaluated [25].

3.2.3 Fine-grained cellular GA

Fine-grained implementations of PGAs are based on the concepts of neighborhoods and decentralization. The massively parallel architecture used in fine-grained implementations exploits the high number of available processors by having a larger number of subpopulations with a smaller subpopulation size, which gives this approach its more popular name of cellular GA, or cGA. Communication within a cGA occurs only between members of each neighborhood, which are in turn members of different neighborhoods at the same time, a concept illustrated below in figure 6 for a 2D grid structure.

Fig. 6 - Neighborhoods in cGA
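The overlapping neighborhoods of figure 6 can be sketched for the usual 2D toroidal grid as follows (an illustrative Python sketch; the north/south/east/west scheme is the von Neumann neighborhood mentioned in the text below):

```python
# Von Neumann (north/south/east/west) neighborhood on a 2D toroidal grid,
# the usual cGA topology: indices wrap around the grid edges, so every
# individual has exactly four neighbors.

def neighbors(row, col, rows, cols):
    return [((row - 1) % rows, col),    # north
            ((row + 1) % rows, col),    # south
            (row, (col + 1) % cols),    # east
            (row, (col - 1) % cols)]    # west

# The corner cell of a 4x4 grid wraps to the opposite edges.
print(neighbors(0, 0, 4, 4))   # [(3, 0), (1, 0), (0, 1), (0, 3)]
```

Because every cell belongs to the neighborhoods of its own neighbors, good genes diffuse slowly across the grid, which is what maintains the diversity cGAs are known for.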


cGAs open the door to much greater fitness and genotype diversity over a larger number of iterations [26]. They are well suited to implementation on SIMD machines or on GPUs [27] due to the simple structure of each subpopulation. Parametrization is done by choosing the particular neighborhood topology and structure for the population of individuals. The most commonly used topology is the 2D toroidal grid, in which each individual communicates with its direct neighbors (north, south, east, west) and uses them for the genetic operators. The proven high performance of cGAs stems from their swarm-like structure with decentralized control flow, which lets them explore more distant areas of the search space thanks to the population diversity they can maintain.

3.2.4 Coarse-grained distributed GA

Last but not least, the most popular PGA implementation model is the coarse-grained distributed PGA, or dGA. It is popular not only for its relative ease of implementation: results reported since as early as Tanese's work in [9], consistently superior to those of cGAs, have been a steady motive to use dGAs in many PGA applications. Contrary to cGAs, coarse-grained implementations of PGA are based on loosely coupled, larger subpopulations that evolve more independently and communicate with each other at a lower migration rate than in fine-grained implementations. dGAs are based on what is known as the island model, in which subpopulations are represented by islands, as in figure 7. There are several island setups, and each determines the migration scheme of individuals across the islands [16].

Fig. 7 - dGA island model

The most basic island model, the isolated island GA, involves no migration whatsoever. It simply consists of separate subpopulations, each evolving on its own without intermixing individuals; parallelisation takes place through the diversity of having different individuals across the islands. Once migration is introduced into the islands, the question arises of whether to use synchronous or asynchronous communication. In a synchronous island GA, subpopulations evolve at almost the same rate and over the same number of generations. In each island, the evolution process is blocked until individuals are received from the other islands, depending on the migration scheme and topology chosen. This makes it easier to implement on a dedicated machine, but much harder to manage in a distributed environment, where the different speeds of different machines can cause some processors to wait for

long times. An asynchronous island GA relaxes this constraint and allows each subpopulation to evolve at its own speed, mimicking the natural process of evolution more closely. This implementation is better suited to a distributed setting due to the expected variance in speed between workstations. Moreover, a novel technique in dGAs is the injection island GA (iiGA), a heterogeneous island-based dGA in which each subpopulation uses a different encoding scheme for the problem. With different block sizes in use, migration becomes a one-way exchange of information from low-resolution to high-resolution nodes; nodes are said to inject their best individuals into higher-resolution nodes for fine-grained modification. The connection topology between islands is another important factor to consider. The most common topologies are the ring topology (as in figure 7) and the hypercube, as in figure 8 below [11].
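A migration step on the ring topology of figure 7 can be sketched as follows (an illustrative Python sketch; the best-replaces-worst replacement policy is a common choice assumed here, not taken from this report):

```python
# A minimal sketch of ring-topology migration: at each isolation interval,
# every island sends a copy of its best individual to the next island on the
# ring, where it replaces that island's worst individual. fitness_of is a
# cost function, so lower is better.

def migrate_ring(islands, fitness_of):
    best = [min(island, key=fitness_of) for island in islands]
    for i, island in enumerate(islands):
        incoming = best[(i - 1) % len(islands)]      # from the previous island
        worst = max(range(len(island)), key=lambda j: fitness_of(island[j]))
        island[worst] = incoming
    return islands

# Two islands of scalar "chromosomes" with identity cost:
print(migrate_ring([[3, 1, 2], [9, 7, 8]], lambda x: x))   # [[7, 1, 2], [1, 7, 8]]
```

In a synchronous island GA this step runs at fixed generation boundaries on every island; an asynchronous variant would instead apply incoming migrants whenever they happen to arrive.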

Fig. 8 - Hypercube dGA topology

There is a distinction between the population structures of cGA and dGA in terms of how loosely or tightly coupled they are, as well as in other aspects such as the size of the subpopulations and the number of evolving sub-algorithms running in parallel. A graphical illustration of how dGA and cGA compare to each other is shown below in figure 9 [13].

Fig. 9 - The structured-population genetic algorithm cube


3.3 Problem description

3.3.1 Representation

As a robot with six degrees of freedom, the kinematic position of the PUMA 600 is described by a three-dimensional vector pT = (px, py, pz) that specifies the position of the end effector at any point in time. The orientation is defined using three vectors in three-dimensional Cartesian space, namely nT = (nx, ny, nz), oT = (ox, oy, oz) and aT = (ax, ay, az). The parameters that define the PUMA 600 are the joint angles and the displacement parameters of each link between the joints. There are six joint angles in the PUMA 600 arm, (θ1, θ2, θ3, θ4, θ5, θ6), each with a possible range determined by the maximum rotation range of its joint. The other parameters are the link lengths, (a1, a2, a3, a4, a5, a6), and the link offsets, (d1, d2, d3, d4, d5, d6). The shape of the robotic manipulator also depends on the link twists, (α1, α2, α3, α4, α5, α6). A graphical representation of the PUMA 600 robotic manipulator with all its joint angles is shown below in figure 10 [20].

Fig. 10 - PUMA 600 manipulator workspace

The link parameters for the PUMA 600 robot are as listed in table 1.


Joint   α (deg)   θ     d (cm)        a (cm)        Range (deg)
1       -90       θ1    0             0             -160 ≤ θ1 ≤ 160
2       0         θ2    0             a2 = 17.000   -225 ≤ θ2 ≤ 45
3       90        θ3    d3 = 4.937    a3 = 0.75     -45 ≤ θ3 ≤ 225
4       -90       θ4    d4 = 17.000   0             -170 ≤ θ4 ≤ 170
5       90        θ5    0             0             -135 ≤ θ5 ≤ 135
6       0         θ6    0             0             -170 ≤ θ6 ≤ 170

Table 1 - Link parameters for the PUMA 600
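The joint ranges in Table 1 are what the genetic operators must respect: when an offspring or mutant falls outside them, it is clamped to the corresponding boundary. A minimal sketch of that clamp, assuming the limits above (the function name is my own, not the thesis code):

```c
#include <assert.h>

/* Joint limits of the PUMA 600 in degrees, taken from Table 1. */
static const double THETA_MIN[6] = {-160, -225,  -45, -170, -135, -170};
static const double THETA_MAX[6] = { 160,   45,  225,  170,  135,  170};

/* Clamp a candidate joint vector to the allowable ranges, as the GA
   does when an offspring falls out of bounds. */
void clamp_joints(double theta[6]) {
    for (int i = 0; i < 6; i++) {
        if (theta[i] < THETA_MIN[i]) theta[i] = THETA_MIN[i];
        if (theta[i] > THETA_MAX[i]) theta[i] = THETA_MAX[i];
    }
}
```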

The kinematic equations for the position and orientation of the PUMA 600 robot are given as follows:

px = cos(θ1) [ d4 sin(θ2 + θ3) + a3 cos(θ2 + θ3) + a2 cos(θ2) ] − d3 sin(θ1)
py = sin(θ1) [ d4 sin(θ2 + θ3) + a3 cos(θ2 + θ3) + a2 cos(θ2) ] + d3 cos(θ1)
pz = −[ d4 cos(θ2 + θ3) + a3 sin(θ2 + θ3) + a2 sin(θ2) ]

nx = cos(θ1) [ cos(θ2 + θ3) [ cos(θ4)cos(θ5)cos(θ6) − sin(θ4)sin(θ6) ] − sin(θ2 + θ3)sin(θ5)cos(θ6) ] − sin(θ1) [ sin(θ4)cos(θ5)cos(θ6) + cos(θ4)sin(θ6) ]
ny = sin(θ1) [ cos(θ2 + θ3) [ cos(θ4)cos(θ5)cos(θ6) − sin(θ4)sin(θ6) ] − sin(θ2 + θ3)sin(θ5)cos(θ6) ] + cos(θ1) [ sin(θ4)cos(θ5)cos(θ6) + cos(θ4)sin(θ6) ]
nz = −sin(θ2 + θ3) [ cos(θ4)cos(θ5)cos(θ6) − sin(θ4)sin(θ6) ] − cos(θ2 + θ3)sin(θ5)cos(θ6)

ox = cos(θ1) [ −cos(θ2 + θ3) [ cos(θ4)cos(θ5)sin(θ6) + sin(θ4)cos(θ6) ] + sin(θ2 + θ3)sin(θ5)sin(θ6) ] − sin(θ1) [ −sin(θ4)cos(θ5)sin(θ6) + cos(θ4)cos(θ6) ]
oy = sin(θ1) [ −cos(θ2 + θ3) [ cos(θ4)cos(θ5)sin(θ6) + sin(θ4)cos(θ6) ] + sin(θ2 + θ3)sin(θ5)sin(θ6) ] + cos(θ1) [ −sin(θ4)cos(θ5)sin(θ6) + cos(θ4)cos(θ6) ]
oz = sin(θ2 + θ3) [ cos(θ4)cos(θ5)sin(θ6) + sin(θ4)cos(θ6) ] + cos(θ2 + θ3)sin(θ5)sin(θ6)

ax = cos(θ1) [ cos(θ2 + θ3)cos(θ4)sin(θ5) + sin(θ2 + θ3)cos(θ5) ] − sin(θ1)sin(θ4)sin(θ5)
ay = sin(θ1) [ cos(θ2 + θ3)cos(θ4)sin(θ5) + sin(θ2 + θ3)cos(θ5) ] + cos(θ1)sin(θ4)sin(θ5)
az = −sin(θ2 + θ3)cos(θ4)sin(θ5) + cos(θ2 + θ3)cos(θ5)

These equations define the workspace coordinates for the position, along with the orientation, of the end effector for every possible joint angle value in its allowable range. Mathematically speaking, the inverse kinematics problem consists of computing which values of (θ1, θ2, θ3, θ4, θ5, θ6) generate the particular values of pT, nT, oT and aT. Note: because of its redundancy, the aT orientation vector is omitted from the calculation of the accuracy of the solution, since correct values for the nT and oT vectors guarantee a correct corresponding aT vector.


3.4 Implementation

3.4.1 Sequential GA

The following algorithm was designed and implemented with the settings and parameters below [20]. It was tested with the same ten points in the search space as the PGA, to allow a direct comparison. The results of the sequential GA are shown in the results section alongside those of the PGA.

Population size = 60
Mutation rate = 0.1
Selection rate = 0.5
Maximum number of generations = 5000

Initialize population matrix of popSize x dimensions within allowable range;
Calculate fitness for all individuals;
while bestFitness worse than required accuracy and generations less than max generations do
    Sort population by fitness;
    Move top-fitness chromosomes to next generation according to selection rate;
    Set cost-weighted selection probabilities for all individuals;
    for i <- 1 to populationSize / 4 do
        Select parent1, parent2 through roulette wheel selection;
        Perform blending extrapolating crossover to generate 3 offspring;
        for each offspring do
            if out of bounds then
                Set to corresponding max/min boundary;
            end
        end
        Get fittest two offspring;
        Move to next generation;
    end
    for j <- 1 to mutationRate * populationSize * solutionDimension do
        Pick a random individual: getRandomInt(0, popSize - 1);
        Pick a random variable: getRandomInt(0, solutionDimension);
        Change the value of the chosen variable to a random value within range;
    end
    Calculate fitness for all individuals;
end

Algorithm 1: Proposed Genetic Algorithm
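The roulette-wheel step above can be sketched as follows; this is a generic illustration with hypothetical names, not the thesis code. Passing the uniform draw r in [0,1) as a parameter keeps the function deterministic and easy to test.

```c
#include <assert.h>

/* Roulette-wheel selection: given selection probabilities prob[0..n-1]
   (summing to 1) and a uniform draw r in [0,1), return the chosen index.
   Fitter individuals get larger slices and are picked more often. */
int roulette_select(const double prob[], int n, double r) {
    double cum = 0.0;
    for (int i = 0; i < n; i++) {
        cum += prob[i];
        if (r < cum)
            return i;
    }
    return n - 1; /* guard against floating-point shortfall in the sum */
}
```

In the GA itself, r would come from the random number generator, and prob[] from the cost-weighted selection probabilities computed each generation.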

3.4.2 PGA

As explained earlier, the main motive of this research was to explore how a parallel genetic algorithm implementation would perform on the inverse kinematics problem of the PUMA 600 robot. The choice of a particular PGA implementation was therefore directed towards ease of implementation and simplicity, so


that it can serve as a proof of concept and a gateway to more research in PGAs for the same problem. The PGA implemented was a coarse-grained, distributed, synchronous island genetic algorithm model, run on an Intel dual-core i3-2310M multicore processor. The PGA was written in C using MPI, the same language and architecture in which the sequential GA was written and tested. This was necessary to ensure a fair comparison and a correct measurement of the speedup gained from parallelisation, by eliminating all other factors that could have contributed to better or worse efficiency, such as different language compilers or a different processor architecture. Again targeting simplicity of analysis, the synchronous communication scheme was chosen over the asynchronous one, despite results for asynchronous implementations having been shown to be better than those for synchronous implementations [11]. Last but not least, as explained before, the proven success of dGA compared to other PGA techniques made the synchronous island GA model the more sensible and straightforward choice. Different settings were tested for the island PGA model in order to empirically fine-tune the various available parameters for the best performance. The number of processors n is the most basic parameter to vary; n was set to 4 to start with, and was increased to 6 and 8. The generation gap, referred to here as g, was another important parameter to vary: it is the number of generations through which each subpopulation evolves on its island before migration is performed to exchange individuals. Genetic operators were used both at the local (island or subpopulation-specific) level and at the global (between islands) level.
Local genetic operators, such as crossover, selection and mutation, were all used as explained in the sequential GA implementation in section 3.4.1, while for global evolution only selection was performed, through unidirectional nearest-neighbor migration of the best individual from each subpopulation to its nearest neighbor, as shown below in figure 11.

Fig. 11 - Unidirectional nearest neighbor migration in a dGA with 4 islands
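A minimal serial sketch of this migration pattern (illustrative only; the names are assumed, and the real implementation exchanges individuals with MPI messages between processes): each island copies its best individual onto its right neighbor on the ring, and the neighbor keeps the migrant only if it improves on its own best.

```c
#include <assert.h>

#define N_ISLANDS 4

/* One best-so-far fitness value per island (lower = better here). */
void migrate_best_right(double best[N_ISLANDS]) {
    double incoming[N_ISLANDS];
    /* Island i receives from its left neighbor (i - 1 + N) % N,
       i.e. every island sends to its right neighbor. */
    for (int i = 0; i < N_ISLANDS; i++)
        incoming[i] = best[(i - 1 + N_ISLANDS) % N_ISLANDS];
    /* Accept the migrant only if it improves the local best. */
    for (int i = 0; i < N_ISLANDS; i++)
        if (incoming[i] < best[i])
            best[i] = incoming[i];
}
```

Buffering all incoming values before applying them mirrors the synchronous exchange: every island migrates the best it had at the start of the step, not a value already overwritten by a neighbor.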

The generation gap, g, was tested with the values 10 and 20, meaning that each subpopulation evolves on its own for g generations and then, on the g-th generation, the exchange of best individuals is performed as shown in the figure above. A maximum limit of 5000 generations was set before declaring that a run has failed to find a global optimum. The pseudocode of the implemented PGA is shown below.

Initialize island subpopulation;
Calculate fitness for all individuals;
generations = 0;
while bestFitness worse than required accuracy and generations less than max generations do
    Evolve individuals and apply genetic operators;
    Form next generation individuals;
    if generations mod g == 0 then
        Send best individual to right neighbor;
        Receive best individual from left neighbor;
    end
    generations++;
end

Algorithm 2: Implemented island-model PGA

3.4.3 Evaluation criteria and speedup measurement

Being a stochastic evolutionary algorithm with no guarantee of convergence to a global optimum, special arrangements were needed when defining the evaluation criteria for the PGA and comparing them with those of the sequential GA. Furthermore, since there are no analytical dimensions or aspects in this research to be proven, empirical statistical analysis of the results was chosen as the suitable method for evaluating the performance of the experiments. The evaluation criteria used are the time taken to find a solution, the number of generations needed to reach that solution, and the accuracy of the result found (if any). The accuracy of a solution is the sum of squared errors between the actual and the desired output, given by the following equation:

accuracy = Σ over dimensions (computed − required)²

To calculate the speedup of the PGA, and hence its efficiency, a different arrangement was needed than the conventional speedup equation known for all parallel algorithms, S = T1 / TM, where M is the number of processors. The main difference with PGA, and with parallel evolutionary algorithms in general, is that a single successful run is insignificant, since the same run repeated n more times might fail every time. It was therefore necessary to average a given number of statistically independent runs in order to obtain representative time values. Ten problems in the reachable workspace of the robot were chosen to be tested through the algorithms, by calculating the joint variables that make the end effector of the PUMA 600 reach them. The algorithm was run 100 times for each problem in order to get a general understanding of how each parametrization of the algorithm behaves, given its nondeterministic stochastic nature.

3.4.4 Results
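Both measures can be sketched in a few lines of C (the helper names are my own, not the thesis code): the sum-of-squared-error accuracy over the nine pose dimensions (p, n, o), and the averaged-runs speedup S = T1 / TM.

```c
#include <assert.h>
#include <math.h>

/* Sum of squared errors across solution dimensions (9 here: p, n, o). */
double sse_accuracy(const double computed[], const double required[], int dims) {
    double acc = 0.0;
    for (int i = 0; i < dims; i++) {
        double e = computed[i] - required[i];
        acc += e * e;
    }
    return acc;
}

/* Speedup from averaged run times: S = T1 / TM. */
double speedup(double avg_seq_time, double avg_par_time) {
    return avg_seq_time / avg_par_time;
}
```

With an error of 0.1 cm in each of the nine dimensions, the accuracy is exactly the 9 × 10⁻² cm² threshold used in the results; with the averaged times from Tables 2 and 3 (6.62 s sequential, 5.205 s for n = 4, g = 10), the speedup evaluates to about 1.27, matching the reported 1.2718.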

For the experiments performed, the details of the results are given below; the results of the sequential GA are displayed first, followed by those of the implemented PGA. The maximum number of iterations allowed before declaring a run failed was set to 5000. The required accuracy is set to the order of 10⁻² cm² per dimension (9 × 10⁻² cm² in total).
                  Time taken (sec)   Generations   Result accuracy
Best              0.237              12            0.05497
Worst             244.45             4696          0.08632
Average           6.62               158           0.07197
Std deviation     44.44              897.61        0.0175

Successful runs %: 41.57%

Table 2 - Sequential GA results

n   g    Statistic       Time taken (sec)   Generations   Result accuracy   Successful runs %   Speedup
4   10   Best            0.564              10            0.005507          69.38%              1.2718
         Worst           125.38             45.05         0.073551
         Average         5.205              293           0.09
         Std deviation   7.889              38.52         0.020677
4   20   Best            0.714              13            0.00532           68.75%              0.7376
         Worst           353.155            4992          0.09
         Average         8.9748             78.14         0.07626
         Std deviation   18.211             223.61        0.01688
6   10   Best            0.512              9             0.011753          73.13%              0.9866
         Worst           64.302             411           0.08999
         Average         6.71               49.59         0.074969
         Std deviation   8.861              44.36         0.018072
8   10   Best            0.714              12            0.009884          74.88%              1.1754
         Worst           42.864             323           0.09
         Average         5.632              39.09         0.015194
         Std deviation   6.244              48.24         0.076848

Table 3 - PGA results

The wide performance gap in the percentage of successful runs for the PGA compared to the sequential GA is clearly the most remarkable result. An improvement of at least 26% in the percentage of successful runs shows how effective the idea of a more diverse search using different subpopulations is. The poor speedup, calculated using the average running time, is explained by the synchronous blocking nature of the communication between nodes, which considerably degrades the processing-time aspect of performance. The number of islands in the model also has a role in the time taken, since more subpopulations mean more exchanges of information and individuals, which already take considerable time because of the blocking synchronous communication. The percentile plot in figure 12 below shows graphically how the 4-island, 10-generation-gap model provides the best time-wise performance.


Fig. 12 - Percentile plot of the time taken for all implemented GAs (sequential, n=4 g=10, n=4 g=20, n=6 g=10, n=8 g=10)

3.5 Conclusion and Future work

The results confirm the proposed claim that PGAs can perform better at solving the inverse kinematics problem for the PUMA 600 robot. The significant improvement in performance gained from parallelising the execution of the GA and dividing it into distributed coarse-grained subpopulations in n islands helps to explore more areas of the search space, and also introduces new fit individuals into each subpopulation every g generations. This directly tackles premature convergence, the long-known bottleneck of genetic algorithms and evolutionary techniques in general. The improvement in the percentage of successful runs is quite significant and shows how much more effective a search with multiple subpopulations is. Although the measured speedup is not in itself a reason to use PGAs instead of a sequential GA and to incur the overhead of writing and maintaining parallel programs, the improved convergence success outweighs this. The promising fact is that these results come from a proof-of-concept attempt to explore the feasibility of using PGA, and there is much more that can be done to improve the processing time and speedup. As explained before, synchronous communication is not the right choice when seeking the fastest possible performance. The asynchronous island GA model [11] [16] is the next parallel technique that should be implemented and tested for the inverse kinematics problem; its implementation should be straightforward since the island-model setup is already in place. There are also other promising models, such as the hybrid PGA explained by Alba and Troya in [17], a mixture of cellular and distributed GA that distributes island models in a fine-grained style.
There is a multitude of different PGA techniques still to be explored, and with them, and the performance improvements they promise, comes more confidence in how they can perform on more complex robots with a higher number of degrees of freedom, possibly with the additional complexity of working in a constrained environment. This study concludes that using PGA to solve robotic inverse kinematics problems was successful, in that it helped drastically alleviate the premature convergence problem that faces all evolutionary algorithms and stochastic computational intelligence methods. With the many possible enhancements and potential applications in the robotics domain mentioned above, this idea can be exploited further along different tracks.

References

1. Grefenstette, John J. Parallel Adaptive Algorithms for Function Optimization (Preliminary Report). Computer Science Department, Vanderbilt University, 1981.
2. Mühlenbein, Heinz. Evolution in time and space - the parallel genetic algorithm. Foundations of Genetic Algorithms, 1991.
3. Alba, Enrique, and Marco Tomassini. Parallelism and evolutionary algorithms. IEEE Transactions on Evolutionary Computation 6.5 (2002): 443-462.
4. Mühlenbein, Heinz, M. Schomisch, and Joachim Born. The parallel genetic algorithm as function optimizer. Parallel Computing 17.6 (1991): 619-632.
5. Mühlenbein, Heinz, Martina Gorges-Schleuter, and Ottmar Krämer. Evolution algorithms in combinatorial optimization. Parallel Computing 7.1 (1988): 65-85.
6. Tanese, R. Parallel genetic algorithms for a hypercube. In Proc. 2nd Int. Conf. on Genetic Algorithms, J. J. Grefenstette, Ed., 1987, p. 177.
7. Abramson, David, and J. Abela. A parallel genetic algorithm for solving the school timetabling problem. Division of Information Technology, CSIRO, 1991.
8. Abramson, David, Graham Mills, and Sonya Perkins. Parallelisation of a genetic algorithm for the computation of efficient train schedules. Parallel Computing and Transputers 37 (1994): 139-149.
9. Tanese, R. Distributed genetic algorithms. In J. D. Schaffer, Ed., Proceedings of the Third International Conference on Genetic Algorithms, pp. 434-439, Morgan Kaufmann (San Mateo, CA), 1989.
10. Belding, T. C. The distributed genetic algorithm revisited. In Proc. 6th Int. Conf. on Genetic Algorithms, L. J. Eshelman, Ed., 1995, pp. 114-121.
11. Hijaze, Muhannad, and David Corne. An investigation of topologies and migration schemes for asynchronous distributed evolutionary algorithms. World Congress on Nature & Biologically Inspired Computing (NaBIC 2009). IEEE, 2009.
12. Sarma, Jayshree, and Kenneth De Jong. An analysis of the effects of neighborhood size and shape on local selection algorithms. Parallel Problem Solving from Nature - PPSN IV. Springer Berlin Heidelberg, 1996. 236-244.
13. Cantú-Paz, Erick. A survey of parallel genetic algorithms. Calculateurs Paralleles, Reseaux et Systemes Repartis 10.2 (1998): 141-171.
14. Baluja, Shumeet. Structure and performance of fine-grain parallelism in genetic search. ICGA, 1993.
15. Gordon, V. Scott, and Darrell Whitley. Serial and parallel genetic algorithms as function optimizers. ICGA, 1993.
16. Lin, Shyh-Chang, W. F. Punch III, and Erik D. Goodman. Coarse-grain parallel genetic algorithms: Categorization and new approach. Sixth IEEE Symposium on Parallel and Distributed Processing. IEEE, 1994.
17. Alba, Enrique, and José M. Troya. A survey of parallel distributed genetic algorithms. Complexity 4.4 (1999): 31-52.
18. Sefrioui, Mourad, and Jacques Périaux. A hierarchical genetic algorithm using multiple models for optimization. Parallel Problem Solving from Nature - PPSN VI. Springer Berlin Heidelberg, 2000.
19. Alba, Enrique, and José M. Troya. Improving flexibility and efficiency by adding parallelism to genetic algorithms. Statistics and Computing 12.2 (2002): 91-114.
20. Elghayesh, Khaled. Solving Inverse Kinematics of PUMA 600 Using Tools of Computational Intelligence. Unpublished Master's Thesis, School of Electrical Engineering and Computer Science, University of Ottawa, 2014.
21. Herrera, Francisco, and Manuel Lozano. Gradual distributed real-coded genetic algorithms. IEEE Transactions on Evolutionary Computation 4.1 (2000): 43-63.
22. Alba, Enrique, et al. Parallel heterogeneous genetic algorithms for continuous optimization. Parallel Computing 30.5 (2004): 699-719.
23. De Angulo, Vicente Ruiz, and Carme Torras. Learning inverse kinematics: Reduced sampling through decomposition into virtual robots. IEEE Transactions on Systems, Man, and Cybernetics, Part B 38.6 (2008): 1571-1577.
24. Bocsi, Botond, et al. Learning inverse kinematics with structured prediction. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2011.
25. Nowostawski, Mariusz, and Riccardo Poli. Parallel genetic algorithm taxonomy. Third International Conference on Knowledge-Based Intelligent Information Engineering Systems. IEEE, 1999.
26. Alba, Enrique, and José Ma Troya. Cellular evolutionary algorithms: Evaluating the influence of ratio. Parallel Problem Solving from Nature - PPSN VI. Springer Berlin Heidelberg, 2000.
27. Vidal, Pablo, and Enrique Alba. Cellular genetic algorithm on graphic processing units. Nature Inspired Cooperative Strategies for Optimization (NICSO 2010). Springer Berlin Heidelberg, 2010. 223-232.
