
JOURNAL OF COMPUTER SCIENCE AND ENGINEERING, VOLUME 16, ISSUE 2, DECEMBER 2012

Hybrid Clonal Selection Algorithm with Curriculum Based Statistics for University Course Timetabling
Salisu M. Borodo and Safaai B. Deris
Abstract: The university course timetabling problem involves the allocation of courses to classrooms and timeslots subject to constraints classified as either hard or soft. Timetabling is also known as a constraint satisfaction problem, which involves satisfying numerous constraints as well as optimizing a desirable objective function. The timetabling problem recurs every semester in higher institutions; hence several methodologies have been employed to generate high quality timetables effectively. These include sequential methods, graph coloring, cluster methods, constraint-based methods and metaheuristic methods. The Hybrid Clonal Selection Algorithm with Curriculum Based Statistics is the technique used in this research. The clonal algorithm starts with an initial solution, then selects some solutions and clones them in order for the clones to be mutated; the mutated solutions are used to generate new solutions that are closer to the optimum. The Curriculum Based Statistics help the clonal algorithm search the solution space efficiently. The algorithm employs a greedy search approach in order to optimize the exploitation and exploration of the search space.
Index Terms: timetabling, clonal selection, immune system

1 INTRODUCTION

The university course timetabling problem (UCTP) is one type of timetabling problem. There are numerous others, such as the transport timetabling problem (e.g. train and bus timetabling), the healthcare timetabling problem (e.g. surgeon and nurse timetabling) and the sport timetabling problem (e.g. timetabling of matches between pairs of teams). The UCTP involves assigning lectures to a number of rooms and timeslots based on constraints that must be satisfied (hard constraints) and other constraints that are desirable to satisfy (soft constraints). The hard constraints are usually similar across academic institutions: a course must not be assigned to more than one room at the same time; courses taken by students of the same group (students with courses in common) should not be assigned at the same time; a lecturer should not be allocated more than one course at the same time; and the number of students taking a course should not exceed the capacity of the allocated room.

The soft constraints usually differ between institutions, since they are based on the needs of each particular institution. For example, a specific course may need to be allocated to a certain time period, students should not be assigned more than a certain number of courses in a day, professors may prefer to teach in a particular room or at a particular time of day, and a course may need to be scheduled before another course.

The course timetabling problem also has an objective function that needs to be maximized for the timetable solution to be of high quality. The objective function usually assigns points to each satisfied soft constraint; the more soft constraints the timetabling solution satisfies, the higher the value of the objective function and the higher the quality of the solution. The university course timetabling problem is known to be NP-complete: it has many constraints to satisfy, and the search space to be explored grows rapidly as the problem size increases.

2 LITERATURE REVIEW

[1] combined PSO and local search for the university course timetabling problem. The experiments were carried out on three datasets; the results showed that the combination of PSO and local search solved the UCTP better than the other two techniques, using less computational time and incurring zero penalties in generating a feasible solution. [2] used a hybrid memetic algorithm and simulated annealing for combining classes in timetables. The problem was formulated as a subgraph problem, and the system was tested on real-world data. The analysis found that the timetable solution was of high quality and that the algorithm employed is efficient and convergent. The Curriculum Based Statistics (CBS) of [3] is a mechanism to store (memorize) which variables are in conflict with a certain assignment of another variable, so as to serve as a deterrent in subsequent assignments. Conflict-based statistics is a data structure that memorizes the hard conflicts that occurred during the search, together with their frequency and the assignments that caused them. [4] investigated the areas to which Artificial Immune Systems (AIS) have been applied, as well as the contribution AIS has made in those areas. In experiments, the Clonal Selection Algorithm has performed relatively well on benchmark datasets for optimization tasks.

S.M. Borodo is a student with the Faculty of Computer Science and Information System, Universiti Teknologi Malaysia, Skudai, Johor, Malaysia.
S.B. Deris is a Professor with the Faculty of Computer Science and Information System, Universiti Teknologi Malaysia, Skudai, Johor, Malaysia.

3 METHODOLOGY
The study was carried out in five stages, which describe the logical workflow of the research process from beginning to end.

3.1 Literature Review
The review of the related literature was the first stage. It was carried out first because it gives the researcher first-hand information on the subject area, so that the background knowledge and understanding of the research area are acquired at the onset of the study. The review was extensive and covered academic timetabling as well as artificial immune systems.

3.2 Problem Formulation and Technique Selection
After the review of the related literature had provided comprehensive knowledge of the problem and technique, the course timetabling problem was selected as the research problem because it matches the FSKSM timetabling data used as the dataset for the study. The timetabling problem consists of several constraints peculiar to the FSKSM. The Clonal Selection Algorithm (CLONALG) was selected from its numerous immune system counterparts because of its positive track record in optimization tasks, especially on benchmark datasets. The CBS was chosen to complement CLONALG because it is an effective mechanism for storing the frequency of conflicting assignments. This is useful when CLONALG selects a random solution to be improved: the CBS provides an array of the frequency of conflict between the selected values and the potential values. When the value in the CBS array for the affected room/time pair is not zero, CLONALG abandons the potential values for another one, which helps in minimizing the search time as well as avoiding conflicting assignments.

3.3 Modeling and Design of the Hybrid CLONALG-CBS
The Hybrid Clonal Selection Algorithm with Curriculum Based Statistics (Hybrid CLONALG-CBS) had to be modeled and designed to solve the university course timetabling problem. In simple terms, CLONALG applied to the university timetabling problem consists of antibodies (each a subject consisting of subject group, subject section, number of students, number of lessons and lecturer), which are represented in an event matrix with n rooms {r1, r2, ..., rn}, m timeslots {t1, t2, ..., tm} and w events {e1, e2, ..., ew} to be scheduled in the rooms and timeslots. The scheduling of the events is subject to some rules (hard constraints) and desired requirements (soft constraints). The successfully scheduled events (feasible solutions) are then improved based on a parameter called the objective function.

3.4 Implementation and Testing on FSKSM Timetabling Data
The designed Hybrid CLONALG-CBS algorithm is tested on the Faculty of Computer Science and Information System (FSKSM) timetabling data for Semester 1 of the 2012/2013 session. The basic data consist of the following:

Total number of subjects: 170
Number of subjects with four lessons per week: 87
Number of subjects with three lessons per week: 71
Number of subjects with two lessons per week: 12
Number of lessons for all the subjects: 673
Number of lecturers: 99
Number of student groups: 16
Number of students: 974
Total number of classrooms: 27
Number of classrooms with 50 persons capacity: 20
Number of classrooms with 120 persons capacity: 6
Number of classrooms with 160 persons capacity: 1
Total timeslots: 55
Total timeslots reserved: 12
Total timeslots available: 43

The testing involves initializing (allocating) all feasible subjects based on the constraints of the problem.
The second stage involves taking the preference weight of each subject; this is obtained by summing the preferences of the room and the time allocated to the subject. To improve the courses, the Hybrid CLONALG-CBS algorithm randomly selects an allocated course from the low quality allocated subjects (mutated courses) chosen for improvement, and then searches for a free room/time pair in the solution space for a possible reallocation. The weight of the free room/timeslot is also computed. If it is greater than the weight of the randomly selected course, the improvement is carried out by assigning the course to the free room/timeslot; otherwise the improvement is cancelled. The Curriculum Based Statistics (CBS) is then used to store this unsuccessful improvement attempt by adding 1 to the value of the corresponding CBS array entry. The CBS stores the course involved, its allocated room/time and the free room/time that was involved in the comparison. The values stored by the CBS are used in subsequent searches by CLONALG, which checks whether an allocated course that is randomly selected for improvement has ever been tested with the free room/timeslot available. The weight of the allocated subjects is based on the objective function of the problem; the objective function is the sum of the weights of the room and timeslot pair corresponding to each allocated subject in the solution space.

3.5 Analysis and Discussion of Result
In this stage, the results gathered from testing the Hybrid Clonal Selection Algorithm with Curriculum Based Statistics on the timetabling data of FSKSM are analyzed. The Hybrid CLONALG-CBS is evaluated on parameters of the university course timetabling problem such as room utilization, timeslot utilization and timetabling solution quality. The results are analyzed and discussed, and the findings are communicated in a way that can be utilized by the research community.

Fig. 1. Study Methodology

4 MODELLING AND DESIGN OF THE HYBRID CLONAL SELECTION ALGORITHM WITH CURRICULUM BASED STATISTICS

4.1 Modeling the Timetabling as a Constraint Satisfaction Problem
The first step involves modeling the timetabling problem as a Constraint Satisfaction Problem (CSP). The academic timetabling problem satisfies the conditions of the CSP. A CSP Q consists of a finite set of variables, a range of values for each variable and a set of constraints restricting the allocation of certain values to some variables or a relationship between variables:

A finite set of variables $X = \{X_1, X_2, \ldots, X_n\}$.
A set of domains $\{D_1, D_2, \ldots, D_n\}$, where $D_i$ is the set of possible values for the variable $X_i$.
A finite set of constraints $\{C_1, C_2, \ldots, C_q\}$, where each constraint $C_i$ is a pair $(S_i, R_i)$: $S_i \subseteq X$ is a subset of the variables called the scope of $C_i$, and $R_i$ is a relation $R_i \subseteq D_{S_i}$ with $D_{S_i} = \times_{X_j \in S_i} D_j$. The relation denotes all compatible tuples of $D_{S_i}$ permitted by the constraint.

A binary constraint $C_{ij}$ on variables $X_i$ and $X_j$ is likewise a set of pairs: $C_{ij}$ allows $X_i$ to take the value $v_i$ and $X_j$ to take the value $v_j$ iff $(v_i, v_j) \in C_{ij}$, in which case the binary constraint is satisfied; otherwise it is violated.

A solution to the CSP involves assigning values from the domains to all variables such that all constraints are satisfied. The solution to the constraint satisfaction problem Q can also be defined as a complete assignment s of the variables from X that satisfies all the constraints. Therefore, the CSP model for the timetabling problem can be formulated by deciding the variables, values and constraints. In this research study, the variables are the timeslot T(Si) and the room R(Si) for each lesson Si. Each variable in the CSP is associated with a domain containing the possible values that can be assigned to it. The values to be assigned to the timeslot variables T(Si) are the total available timeslots in a week.
In the case of the Faculty of Computer Science and Information Systems, Universiti Teknologi Malaysia, there are 11 periods of 1 hour per day for 5 working days each week. Therefore the total number of timeslots (m) is 11 x 5 = 55. The values assigned to a room variable R(Si) are the available rooms rk, 1 <= k <= p, where p is the number of available rooms. A solution to a timetabling problem can be defined relationally as an assignment of a time tj, 1 <= j <= m, and a room rk, 1 <= k <= p, to each lesson Si, 1 <= i <= n, taught by lecturer L(Si), such that all constraints C(Si) are satisfied. L(Si) and C(Si) are the lecturer and the constraints of lesson Si, respectively. The constraints refer to the relationship between two variables. The basic constraints or relations are the mathematical relations, i.e., <=, >=, = and ≠. The types of constraints that have to be satisfied in the timetabling of the study are as follows.

The hard constraints that need to be met are:
Lecturer time-clash constraints: a lecturer should not be assigned to teach more than one subject in the same timeslot.
Student group time-clash constraints: a student group should not be assigned to attend more than one subject in the same timeslot.
Classroom time-clash constraints: one room should not be assigned to more than one subject in the same timeslot.
Classroom capacity constraints: the number of students of a lesson assigned to a room should be less than or equal to the capacity of the classroom.
Classroom and timeslot domain constraints: classrooms or timeslots assigned to subjects must be within the range of the domain.
Timeslot constraints: certain timeslots are reserved for non-academic activities such as co-curriculum and lunch hours; therefore, they are not available for any lectures.

The soft constraints that need to be met are:
The scheduled timeslot of the subject should fall within the preference sets as much as possible. The higher the preference score, the better the timeslot.
The scheduled classroom of the subject should fall within the preference sets as much as possible. The higher the preference score, the better the classroom facility.
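The hard constraints listed above can be read as a single feasibility test applied before a lesson is placed. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Lesson` record, the `is_feasible` function and the dictionary-based `schedule` are hypothetical names introduced for illustration, and timeslots are assumed to be numbered 0 to 54 (11 periods on each of 5 days).

```python
from dataclasses import dataclass

DAYS, PERIODS_PER_DAY = 5, 11                    # 11 one-hour periods on 5 working days
ALL_TIMESLOTS = range(DAYS * PERIODS_PER_DAY)    # 55 timeslot indices, 0..54


@dataclass(frozen=True)
class Lesson:
    """Hypothetical record for one lesson S_i."""
    name: str
    lecturer: str
    group: str
    students: int


def is_feasible(lesson, room, slot, capacity, reserved, schedule):
    """Return True if placing `lesson` in (`room`, `slot`) breaks no hard constraint.

    capacity: dict room -> seats; reserved: set of blocked timeslots;
    schedule: dict (room, slot) -> Lesson for everything already placed.
    """
    if room not in capacity or slot not in ALL_TIMESLOTS:
        return False                     # classroom and timeslot domain constraint
    if slot in reserved:
        return False                     # timeslot reserved for other activities
    if lesson.students > capacity[room]:
        return False                     # classroom capacity constraint
    if (room, slot) in schedule:
        return False                     # classroom time clash
    for (_, other_slot), other in schedule.items():
        if other_slot != slot:
            continue
        if other.lecturer == lesson.lecturer:
            return False                 # lecturer time clash
        if other.group == lesson.group:
            return False                 # student group time clash
    return True
```

Keeping all six checks in one predicate means the initialization and improvement phases can share the same test, so a move that improves the preference score can never silently break a hard constraint.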
The most preferred rooms are the ones with all the needed facilities, such as a good projector and a new computer with the latest software or operating system. The most preferred rooms are given a score of 4, the rooms with average facilities are given a score of 2 and the rooms with the fewest facilities are given a score of 1. The most preferred timeslots are the first three morning lessons of the weekdays. This first set of timeslots is given a score of 4, the afternoon timeslots are given a score of 2, and the late evening timeslots are given a score of 1.

4.2 Variable Ordering
Variable ordering is directly related to the search efficiency and feasibility of the timetabling solution. There are two types of ordering: static and dynamic. In static ordering, the order of the variables is specified before the search begins and is not changed thereafter. In dynamic ordering, the choice of the next variable to be considered at any point depends on the current state of the search. The main advantage of static ordering is that the search procedure can be done very fast because the order is fixed. On the other hand, it does not use any information about the current state during the search. With dynamic ordering, the choice of the next variable can use information from the current set of instantiations; therefore the variable order can vary from branch to branch in the search tree. This research study uses static variable ordering, because of its simplicity, to help the search find a good quality solution.

4.3 Value Ordering
Variable ordering determines which activity is going to be scheduled next, while value ordering determines which reservation (timeslot and classroom) should be assigned to the selected activity. Once the activity to be scheduled next has been selected, the value ordering heuristic determines which reservation to assign to the activity. Value ordering can affect the search topology by ensuring that a branch leading to a feasible solution is searched earlier than branches that would lead to infeasible solutions.
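The preference scores of Section 4.1 and the idea of value ordering can be illustrated with a short sketch. It assumes that each room and each timeslot already carries its score of 4, 2 or 1 in a lookup table, and that candidate room/timeslot pairs are tried in decreasing order of combined score; the text does not state the exact ordering rule, so that choice, together with the function names `lesson_score`, `objective` and `order_values`, is an assumption made for illustration.

```python
def lesson_score(room, slot, room_pref, slot_pref):
    """Preference score of one allocation: room score plus timeslot score (max 8)."""
    return room_pref[room] + slot_pref[slot]


def objective(allocation, room_pref, slot_pref):
    """Overall objective: sum of lesson scores over all allocated lessons."""
    return sum(lesson_score(room, slot, room_pref, slot_pref)
               for room, slot in allocation.values())


def order_values(free_pairs, room_pref, slot_pref):
    """Assumed value ordering: most preferred free (room, slot) pairs first."""
    return sorted(free_pairs,
                  key=lambda pair: lesson_score(pair[0], pair[1], room_pref, slot_pref),
                  reverse=True)


# Small illustrative tables (not the FSKSM data).
room_pref = {"A": 4, "B": 2, "C": 1}
slot_pref = {0: 4, 6: 2, 10: 1}
allocation = {"S1": ("A", 0), "S2": ("C", 10)}
total = objective(allocation, room_pref, slot_pref)
best_first = order_values([("C", 10), ("A", 0), ("B", 0)], room_pref, slot_pref)
```

With these scores a lesson can reach at most 4 + 4 = 8, which is the maximum per-lesson score referred to in the mutation stage.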
Because all the solutions found are feasible, there is no need for a backtracking algorithm to repair solutions that are inconsistent with the problem's constraints. The value (room and timeslot) ordering in this research study depends on the variable that has been selected at that instant.

4.4 Hybrid Clonal Selection Algorithm with Curriculum Based Statistics for Academic Timetabling
The Hybrid Clonal Selection Algorithm with Curriculum Based Statistics solves the academic timetabling problem of FSKSM in two phases. The first phase involves the initialization of timetable solutions for all the 170 offered courses in the timetable data; the solution has to satisfy all the hard constraints of the problem. The second phase involves the improvement of the already assigned courses by optimizing the objective function of the problem (improving the soft constraint preference scores). The CLONALG algorithm consists of five stages. The first stage covers the first phase of the algorithm execution, while the other four stages make up the second phase of the algorithm.

4.4.1 Initialization
The Hybrid CLONALG-CBS algorithm begins by assigning subjects to their respective rooms and timeslots based on the constraints of the timetabling problem; the allocation of subjects at this stage continues until all the subjects are allocated. The initialization process requires that all subjects allocated satisfy the hard constraints. The algorithm at this stage also tries to allocate preferred rooms and timeslots to the subject being allocated, but the core objective of the initialization stage is to ensure that the courses allocated satisfy the hard constraints of the problem.

4.4.2 Selection
The selection involves assigning the objective function score to each allocated subject; this is called affinity measurement in the parlance of the CLONALG algorithm. The score is simply the sum of the preference scores of the room and the timeslot allocated to a lesson.

4.4.3 Cloning
All allocated subjects in the population are duplicated in proportion to their affinity and enter the clone population C of size Nc, which is computed by the equation below:

$N_c = \sum_{i=1}^{n} \mathrm{round}\left(\frac{\beta N}{i}\right)$

where Nc is the total number of clones generated, β is a multiplying factor, N is the total number of antibodies, n is the number of antibodies selected for cloning (ranked by affinity, as in the standard CLONALG formulation) and round(.) is the operator that rounds its argument towards the closest integer.

4.4.4 Mutation
The mutation process of CLONALG is designed in such a way that a low quality solution (an allocated subject) is given more chance to be improved upon.
This is accomplished by restricting the improvement phase of the algorithm to subjects with lower objective function scores. The lower quality solutions are the ones with an objective function score of less than 6 (8 being the maximum). This ensures that only the lower quality courses are improved during the improvement phase of the algorithm execution. The Curriculum Based Statistics come into action at this stage of the algorithm; they prevent CLONALG from searching for values that would conflict with the variable. The algorithm in this stage randomly selects an allocated subject and checks whether it is among those that need to be improved in the second phase of the algorithm. The chosen subject is first tested for its preference score; when the score is 6 or greater, the algorithm drops the subject for another one, and when the score is less than 6 the algorithm proceeds with the rest of the improvement phase. The algorithm randomly selects an available room and timeslot pair for possible allocation to the chosen subject (improvement). The process involves calculating the objective function score of the free room and time pair. If the score is higher than the score of the chosen subject and all hard constraints are satisfied, then the chosen subject is assigned to the new room and time. The student group and lecturer availability arrays are also updated to reflect the new room and timeslot, so that the lecturer and students do not have any conflict as a result of the move. A variable called subjectallocated is then incremented by one. This variable is used as a counter that controls the improvement phase of the algorithm: when its value equals the value of the allg variable (allg holds the number of subjects to be improved), the improvement is stopped. The Hybrid CLONALG-CBS is stopped when all the low quality solutions have been improved at that stage.
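One improvement attempt, as described above, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function `improve_once`, its parameters and the use of a `Counter` keyed by (lesson, current pair, candidate pair) as the CBS array are all hypothetical, and the rule of skipping any move whose CBS count is non-zero follows the description in Section 3.2.

```python
import random
from collections import Counter

LOW_QUALITY_BELOW = 6   # lessons scoring below 6 (out of 8) are eligible for improvement


def improve_once(allocation, free_pairs, score, feasible, cbs, rng=random):
    """Try one improvement move; return True if a lesson was moved.

    allocation: dict lesson -> (room, slot); free_pairs: set of unused pairs;
    score(pair) -> preference score; feasible(lesson, pair) -> hard-constraint check;
    cbs: Counter of failed attempts keyed by (lesson, current pair, candidate pair).
    """
    lesson = rng.choice(sorted(allocation))
    current = allocation[lesson]
    if score(current) >= LOW_QUALITY_BELOW or not free_pairs:
        return False                      # already good enough, or nowhere to move
    candidate = rng.choice(sorted(free_pairs))
    key = (lesson, current, candidate)
    if cbs[key] > 0:
        return False                      # this move failed before: skip it
    if score(candidate) > score(current) and feasible(lesson, candidate):
        allocation[lesson] = candidate    # accept the strictly better placement
        free_pairs.remove(candidate)
        free_pairs.add(current)
        return True
    cbs[key] += 1                         # remember the unsuccessful attempt
    return False
```

A caller would repeat `improve_once` until every low quality lesson has been moved or a time bound has elapsed, updating the lecturer and student group availability inside the `feasible` check.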
The key point is that the new allocation must have a higher preference score than the previous score of the subject, and all the hard constraints of the problem must still be satisfied with the new allocation; otherwise the status quo is maintained and the algorithm randomly selects a different subject to repeat the process. After a set time bound has elapsed, the algorithm is stopped even if not all the subjects needing improvement have been improved. This is necessary because of the NP-complete nature of the timetabling problem.

4.4.5 Reselection and Diversity Introduction
The reselection involves choosing a certain number of solutions for possible admission into the next generation of the diversified population. A certain number of the low quality solutions (subjects) are randomly replaced with new ones. The reselection is a mechanism to improve the quality of the solutions by giving a fair chance to low quality antibodies to either improve or be replaced with better solutions.
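The clone population size of Section 4.4.3 can be illustrated with the rule used in the standard CLONALG formulation, where the antibody ranked i-th by affinity receives round(β·N / i) clones. Whether the authors used exactly this rule is an assumption; the function name `clone_counts` and the choice to round ties upward are also illustrative.

```python
import math


def clone_counts(n_selected, beta, n_antibodies):
    """Clones per selected antibody, best-ranked first: round(beta * N / i).

    Ties at .5 are rounded up here; the text only says 'closest integer'.
    """
    return [int(math.floor(beta * n_antibodies / rank + 0.5))
            for rank in range(1, n_selected + 1)]


counts = clone_counts(n_selected=3, beta=1, n_antibodies=10)
total_clones = sum(counts)   # N_c, the size of the clone population
```

Because the count falls with rank, higher affinity antibodies contribute more clones, which is what "duplicated in proportion to their affinity" amounts to under this rule.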

Fig. 2. Hybrid Clonal Selection Algorithm with Curriculum Based Statistics flowchart

5 DISCUSSION AND ANALYSIS OF RESULT
The designed Hybrid Clonal Selection Algorithm with Curriculum Based Statistics was tested on the case study timetable dataset. The Hybrid CLONALG-CBS was evaluated on parameters of the university course timetabling problem such as room utilization, timeslot utilization, timetabling solution quality (lecture spread) and maximum objective function achieved.

5.1 Preferred Timeslot Utilization
The Hybrid CLONALG-CBS algorithm was able to optimize the allocation of rooms to the most preferred timeslots; the most preferred timeslots are the early morning lectures during the five days of the week. The preferred timeslots are 15 in number. The algorithm was designed to first allocate 350 subjects that satisfy the problem's constraints to the most preferred timeslots; this implies that several classrooms must accommodate the 350 subjects using the 15 timeslots only. As a result, most of the classrooms use many of the preferred timeslots. Subsequently, the remaining 28 timeslots are added to the pool of timeslot values that can be allocated to the remaining timetable subjects. This was done so that the timetabling solution would be of higher quality; maximizing the allocation of preferred timeslots implies a higher value of the overall objective function score.

Fig. 3. Preferred Timeslot Utilization

Fig. 3 shows the first 15 most preferred timeslots being assigned most of the timetabling rooms by the Hybrid CLONALG-CBS; timeslot 11 was allocated 25 rooms, which is the highest allocation of rooms to a timeslot. Overall, the most preferred timeslots have been allocated more than 50% of the available timetable subjects; utilization of the 15 preferred timeslots by subjects is above 80%.

5.2 Preferred Room Utilization
The Hybrid Clonal Selection Algorithm with Curriculum Based Statistics was able to utilize all the rooms in the dataset for the allocation of subjects. The most preferred rooms, which have the highest preference score of 4, had the highest allocation of timeslots. The first 11 rooms of the timetabling data are the most preferred rooms of the case study dataset. These 11 rooms were allocated more than 60% of the timeslots. The algorithm achieved this because it has been optimized to allocate many timeslots to the most preferred rooms in its initialization phase. The algorithm allocates 150 subjects to the most preferred rooms using any of the timeslots; after the 150 subjects have been allocated, the remaining 16 rooms have the same chance of being allocated timeslots as the most preferred rooms.
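The utilization figures above can be made concrete with a small sketch. The text does not define utilization formally, so the definitions below are assumptions: the share of allocated lessons that fall in preferred timeslots, and the share of room/timeslot cells in a chosen sub-grid that are occupied. The function names are hypothetical.

```python
from collections import Counter


def rooms_per_timeslot(allocation):
    """Number of rooms in use in each timeslot."""
    return Counter(slot for _room, slot in allocation.values())


def preferred_share(allocation, preferred_slots):
    """Fraction of allocated lessons that sit in a preferred timeslot."""
    hits = sum(1 for _room, slot in allocation.values() if slot in preferred_slots)
    return hits / len(allocation)


def grid_utilization(allocation, rooms, slots):
    """Fraction of the rooms x slots grid that is occupied by a lesson."""
    used = {(r, t) for r, t in allocation.values() if r in rooms and t in slots}
    return len(used) / (len(rooms) * len(slots))


# Tiny illustrative allocation (not the FSKSM data).
allocation = {"S1": ("R1", 0), "S2": ("R2", 0), "S3": ("R1", 7), "S4": ("R2", 20)}
per_slot = rooms_per_timeslot(allocation)
share = preferred_share(allocation, preferred_slots={0, 1})
grid = grid_utilization(allocation, rooms={"R1", "R2"}, slots={0, 1})
```

Under the second definition, the reported numbers are at least consistent: 350 allocations in a grid of 15 preferred timeslots by 27 rooms gives 350 / 405, about 0.86, which agrees with the stated utilization of above 80%, and 350 out of 673 lessons is about 52%, which agrees with "more than 50%".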


Fig. 4. Preferred Room Utilization

5.3 Lecture Spread for Student Group
The Hybrid Clonal Selection Algorithm with Curriculum Based Statistics was able to spread the lectures across the week for the 16 student groups in the timetabling problem. The lecture spread ensures that the subjects of some student groups are spread over the week with lecture-free day(s) in between, while the subjects of student groups that have many lectures are spread through the entire week. There are more lectures on Monday because that is the beginning of the week.

Fig. 5. Subject allocation for the student groups

The daily average among the student groups is 8 lessons a day, with the exception of groups 1 and 3. These two groups have many students; hence their higher number of lessons is due to the many sections available in the subjects of the group. From the figure it can be seen that six of the groups have lecture-free days on Wednesdays, Thursdays and Fridays. This shows the ability of the algorithm to provide lecture-free days to student groups that have fewer lessons in the week.

5.4 Objective Function Score Achieved by Solution
The objective function of the problem needs to be maximized for the solution to be of the highest possible quality; the overall score of the objective function is obtained by summing the preference scores of all the allocated subjects of the timetabling problem. The formula is given below:

$\text{Score} = \sum_{i=1}^{n} \left[ P(T(S_i)) + P(R(S_i)) \right]$

where P(T(Si)) are the timeslot preferences for subjects Si, i = 1, 2, ..., n, and P(R(Si)) are the room preferences for subjects Si, i = 1, 2, ..., n. The table below provides the objective function score of the timetable solution as well as the computational time taken to achieve it. The algorithm was executed five times to obtain the values in the table.

TABLE 1. Overall Objective Function Scores
Run     1st    2nd    3rd    4th    5th
Score   2562   2502   2484   2517   2496
Time    1.27   2.18   2.10   1.30   1.54

Fig. 6. Graph of the objective function score and the time taken

It can be seen from the above figure that the objective function scores for the five runs of the algorithm fall in almost the same range; the average over the five executions is 2512.

6 CONCLUSION
This paper designed the Hybrid Clonal Selection Algorithm with Curriculum Based Statistics for the academic timetabling problem. The algorithm was evaluated in terms of timetabling solution quality, namely room utilization, timeslot utilization and lecture spread. The results showed that the algorithm is able to produce a high quality timetable solution.


REFERENCES
[1] S. F. H. Irene, S. Deris, and S. Z. M. Hashim, "A Combination of PSO and Local Search in University Course Timetabling Problem," 2009 International Conference on Computer Engineering and Technology, Vol. II, Proceedings, pp. 492-495, 2009.
[2] J.-l. L. Zan Wang, "Hybrid Memetic Algorithm for Uniting Classes of University Timetabling Problem," in 2009 International Conference on Computational Intelligence and Security, 2009, p. 5.
[3] R. B. T. Muller, H. Rudova, "Conflict-based statistics," presented at the EU/ME Workshop on Design and Evaluation of Advanced Hybrid Meta-Heuristics, University of Nottingham, 2004.
[4] J. Greensmith, U. Aickelin, S. Cayzer, "Introducing dendritic cells as a novel immune-inspired algorithm for anomaly detection," presented at the 4th International Conference on Artificial Immune Systems (ICARIS 2005), Banff, Alberta, Canada, 2005.
[5] M. Riebisch, I. Philippow, M. Götze, "UML-Based Statistical Test Case Generation," presented at the Revised Papers from the International Conference NetObjectDays on Objects, Components, Architectures, Services, and Applications for a Networked World, 2003.