

**Computer Simulation Techniques: The definitive introduction!**

by

Harry Perros

Computer Science Department NC State University Raleigh, NC

All rights reserved 2009

TO THE READER…

This book is available for free download from my web site: http://www.csc.ncsu.edu/faculty/perros//simulation.pdf

Self-study: You can use this book to learn the basic simulation techniques by yourself! At the end of Chapter 1, you will find three examples. Select one of them and do the corresponding computer assignment given at the end of the chapter. Then, after you read each new chapter, do all the problems at the end of the chapter and also do the computer assignment that corresponds to your problem. By the time you reach the end of the book, you will have developed a very sophisticated simulation model! You can use any high-level programming language you like.

Errors: I am not responsible for any errors in the book, and if you do find any, I would appreciate it if you could let me know (hp@csc.ncsu.edu).

Acknowledgment: Please acknowledge this book if you use it in a course, or in a project, or in a publication.

Copyright: Please remember that it is illegal to reproduce parts of this book or all of the book in other publications without my consent.

I would like to thank Attila Yavuz and Amar Shroff for helping me revise Chapters 2 and 4, and also my thanks to many other students for their suggestions.

Enjoy!

Harry Perros, December 2009

TABLE OF CONTENTS

Chapter 1. Introduction, 1
    1.1 The OR approach, 1
    1.2 Building a simulation model, 2
    1.3 Basic simulation methodology: Examples, 5
        1.3.1 The machine interference model, 5
        1.3.2 A token-based access scheme, 11
        1.3.3 A two-stage manufacturing system, 18
    Problems, 22
    Computer assignments, 23

Chapter 2. The generation of pseudo-random numbers, 25
    2.1 Introduction, 25
    2.2 Pseudo-random numbers, 26
    2.3 The congruential method, 28
        2.3.1 General congruential methods, 30
        2.3.2 Composite generators, 31
    2.4 Tausworthe generators, 31
    2.5 The lagged Fibonacci generators, 32
    2.6 The Mersenne Twister, 33
    2.7 Statistical tests of pseudo-random number generators, 36
        2.7.1 Hypothesis testing, 37
        2.7.2 Frequency test (Monobit test), 38
        2.7.3 Serial test, 40
        2.7.4 Autocorrelation test, 42
        2.7.5 Runs test, 43
        2.7.6 Chi-square test for goodness of fit, 45
    Problems, 45
    Computer assignments, 46

Chapter 3. The generation of stochastic variates, 47
    3.1 Introduction, 47
    3.2 The inverse transformation method, 47
    3.3 Sampling from continuous-time probability distributions, 49
        3.3.1 Sampling from a uniform distribution, 49
        3.3.2 Sampling from an exponential distribution, 50
        3.3.3 Sampling from an Erlang distribution, 51
        3.3.4 Sampling from a normal distribution, 53
    3.4 Sampling from a discrete-time probability distribution, 55
        3.4.1 Sampling from a geometric distribution, 56
        3.4.2 Sampling from a binomial distribution, 57
        3.4.3 Sampling from a Poisson distribution, 58
    3.5 Sampling from an empirical probability distribution, 59
        3.5.1 Sampling from a discrete-time probability distribution, 59
        3.5.2 Sampling from a continuous-time probability distribution, 61
    3.6 The rejection method, 63
    3.7 Monte Carlo methods, 65
    Problems, 66
    Computer assignments, 68
    Solutions, 69

Chapter 4. Simulation designs, 71
    4.1 Introduction, 71
    4.2 Event-advance design, 71
    4.3 Future event list, 73
    4.4 Sequential arrays, 74
    4.5 Linked lists, 75
        4.5.1 Implementation of a future event list as a linked list, 76
            a. Creating and deleting nodes, 76
            b. Function to create a new node, 78
            c. Deletion of a node, 79
            d. Creating linked lists, inserting and removing nodes, 79
            e. Inserting a node in a linked list, 80
            f. Removing a node from the linked list, 83
        4.5.2 Time complexity of linked lists, 86
        4.5.3 Doubly linked lists, 87
    4.6 Unit-time advance design, 87
        4.6.1 Selecting a unit time, 90
        4.6.2 Implementation, 91
        4.6.3 Event-advance vs. unit-time advance, 91
    4.7 Activity-based simulation design, 92
    4.8 Examples, 94
        4.8.1 An inventory system, 94
        4.8.2 Round-robin queue, 97
    Problems, 101
    Computer assignments, 102
    Appendix A: Implementing a priority queue as a heap, 104
        A.1 A heap, 104
            A.1.1 Implementing a heap using an array, 105
                a. Maintaining the heap property, 106
        A.2 A priority queue using a heap, 109
        A.3 Time complexity of priority queues, 112

        A.4 Implementing a future event list as a priority queue, 113

Chapter 5. Estimation techniques for analyzing endogenously created data, 115
    5.1 Introduction, 115
    5.2 Collecting endogenously created data, 115
    5.3 Transient state vs. steady-state simulation, 118
        5.3.1 Transient-state simulation, 118
        5.3.2 Steady-state simulation, 119
    5.4 Estimation techniques for steady-state simulation, 120
        5.4.1 Estimation of the confidence interval of the mean of a random variable, 120
            a. Estimation of the autocorrelation coefficients, 123
            b. Batch means, 128
            c. Replications, 129
            d. Regenerative method, 131
        5.4.2 Estimation of other statistics of a random variable, 138
            a. Probability that a random variable lies within a fixed interval, 138
            b. Percentile of a probability distribution, 140
            c. Variance of the probability distribution, 141
    5.5 Estimation techniques for transient state simulation, 142
    5.6 Pilot experiments and sequential procedures for achieving a required accuracy, 143
        5.6.1 Independent replications, 143
        5.6.2 Batch means, 144
    Computer assignments, 145

Chapter 6. Validation of a simulation model, 149
    Computer assignments, 150

Chapter 7. Variance reduction techniques, 155
    7.1 Introduction, 155
    7.2 The antithetic variates technique, 156
    7.3 The control variates technique, 162
    Computer assignments, 165

Chapter 8. Simulation projects, 166
    8.1 An advance network reservation system – blocking scheme, 166
    8.2 An advance network reservation system – delayed scheme, 173


CHAPTER 1: INTRODUCTION

1.1 The OR approach

The OR approach to solving problems is characterized by the following steps:

1. Problem formulation
2. Construction of the model
3. Model validation
4. Using the model, evaluate various available alternatives (solutions)
5. Implementation and maintenance of the solution

The above steps are not carried out just once. Rather, during the course of an OR exercise, one frequently goes back to earlier steps. For instance, during model validation one frequently has to examine the basic assumptions made in steps 1 and 2.

The basic feature of the OR approach is that of model building. Operations Researchers like to call themselves model builders! A model is a representation of the structure of a real-life system. In general, models can be classified as follows: iconic, analogue, and symbolic.

An iconic model is an exact replica of the properties of the real-life system, but in smaller scale. Examples are: model airplanes, maps, etc. An analogue model uses a set of properties to represent the properties of a real-life system. For instance, a hydraulic system can be used as an analogue of electrical, traffic and economic systems. Symbolic models represent the properties of the real-life system through the means of symbols, such as mathematical equations and computer programs. Obviously, simulation models are symbolic models.

Operations Research models are in general symbolic models and they can be classified into two groups, namely deterministic models and stochastic models. Deterministic models are models which do not contain the element of probability. Examples are: linear programming, non-linear programming and dynamic programming. Stochastic models are models which contain the element of probability. Examples are: queueing theory, stochastic processes, reliability, and simulation techniques.

Simulation techniques rely heavily on the element of randomness. However, deterministic simulation techniques, in which there is no randomness, are not uncommon. Simulation techniques are easy to learn and are applicable to a wide range of problems. Simulation is probably the handiest technique to use amongst the array of OR models. The argument goes that "when everything else fails, then simulate". That is, when other models are not applicable, then use simulation techniques.

1.2 Building a simulation model

Any real-life system studied by simulation techniques (or for that matter by any other OR model) is viewed as a system. A system, in general, is a collection of entities which are logically related and which are of interest to a particular application. The following features of a system are of interest:

1. Environment: Each system can be seen as a subsystem of a broader system.
2. Interdependency: No activity takes place in total isolation.
3. Sub-systems: Each system can be broken down to sub-systems.
4. Organization: Virtually all systems consist of highly organized elements or components, which interact in order to carry out the function of the system.
5. Change: The present condition or state of the system usually varies over a long period of time.

When building a simulation model of a real-life system under investigation, one does not simulate the whole system. Rather, one simulates those sub-systems which are related to the problems at hand. This involves modelling parts of the system at various levels of detail. This can be graphically depicted using Beard's managerial pyramid, as shown in Figure 1.1. The collection of blackened areas form those parts of the system that are incorporated in the model.

Figure 1.1: Beard's managerial pyramid

A simulation model is, in general, used in order to study real-life systems which do not currently exist. In particular, one is interested in quantifying the performance of a system under study for various values of its input parameters. Such quantified measures of performance can be very useful in the managerial decision process.

All the relevant variables of a system under study are organized into two groups: those which are considered as given and are not to be manipulated (uncontrollable variables), and those which are to be manipulated so as to come to a solution (controllable variables). The distinction between controllable and uncontrollable variables mainly depends upon the scope of the study.

The basic steps involved in carrying out a simulation exercise are depicted in Figure 1.2.

Figure 1.2: Basic steps involved in carrying out a simulation study [Define the Problem; Formulate Sub-Models; Combine Sub-Models; Collect Data; Write the Simulation Program; Debug; Validate Model; Design Simulation Experiments (alternatives); Run the Simulator; Analyze Data; Analyze the Results; Implement Results; with returns to earlier steps as needed].

Another characterization of the relevant variables is whether or not they are affected during a simulation run. A variable whose value is not affected is called exogenous. A variable having a value determined by other variables during the course of the simulation is called endogenous. For instance, when simulating a single server queue, the following variables may be identified and characterized accordingly.

Exogenous variables
1. The time interval between two successive arrivals.
2. The service time of a customer.
3. Number of servers.
4. Priority discipline.

Endogenous variables
1. Mean waiting time in the queue.
2. Mean number of customers in the queue.

The above variables may be controllable or uncontrollable depending upon the experiments we want to carry out. For instance, if we wish to find the impact of the number of servers on the mean waiting time in the queue, then the number of servers becomes a controllable variable. The remaining variables, the time interval between two arrivals and the service time, will remain fixed (uncontrollable variables).

Some of the variables of the system that are of paramount importance are those used to define the status of the system. Such variables are known as status variables. These variables form the backbone of any simulation model. At any instance, one should be able to determine how things stand in the system using these variables. Obviously, the selection of these variables is affected by what kind of information regarding the system one wants to maintain.

We now proceed to identify the basic simulation methodology through the means of a few simulation examples.

1.3 Basic simulation methodology: Examples

1.3.1 The machine interference problem

Let us consider a single server queue with a finite population, known as the machine interference problem. This problem arose originally out of a need to model the behavior of machines. Later on, it was used extensively in computer modelling. Let us consider M machines. Each machine is operational for a period of time and then it breaks down. We assume that there is one repairman. A machine remains broken down until it is fixed by the repairman. Broken down machines are served in a FIFO manner, and the service is non-preemptive. A machine becomes immediately operational after it has been fixed. Thus, each machine follows a basic cycle, as shown in figure 1.3, which is repeated continuously.

Figure 1.3: The basic cycle of a machine.

In general, one has information regarding the operational time and the repair time of a machine. However, in order to determine the down time of a machine, one should be able to calculate the queueing time for the repairman. Obviously, the total down time of a machine is made up of the time it has to "queue" for the repairman and the time it takes for the repairman to fix it. If this quantity is known, then one can calculate the utilization of a machine. Other quantities of interest could be the utilization of the repairman.

Let us now look at the repairman's queue. This can be visualized as a single server queue fed by a finite population of customers, as shown in figure 1.4.

Figure 1.4: The machine interference problem [a repairman serving a finite population of machines].

For simplicity, we will assume that the operational time of each machine is equal to 10 units of time. Also, the repair time of each machine is equal to 5 units of time. In other words, we assume that all the machines have identical constant operational times.
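To make the exogenous/endogenous distinction above concrete, here is a small sketch, not from the book, of a single server queue; the constant interarrival and service times are hypothetical inputs, and the waiting times are the endogenous quantities they determine.

```python
# Exogenous inputs: interarrival times and service times (constants here).
# Endogenous output: the waiting time of each customer in the queue.
def waiting_times(interarrivals, services):
    arrival = depart = 0.0
    waits = []
    for gap, service in zip(interarrivals, services):
        arrival += gap                   # exogenous: next arrival instant
        start = max(arrival, depart)     # customer waits if server is busy
        waits.append(start - arrival)    # endogenous: time spent queueing
        depart = start + service
    return waits

# Arrivals every 1 unit with 2-unit services: the queue builds up.
print(waiting_times([1, 1, 1], [2, 2, 2]))
```

The mean of the returned list is the mean waiting time in the queue, one of the endogenous variables listed above.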

They also have identical and constant repair times. (This can be easily changed to more complicated cases where each machine has its own random operational and repair times.)

The first and most important step in building a simulation model of the above system is to identify the basic events whose occurrence will alter the status of the system. This brings up the problem of having to define the status variables of the above problem. The selection of the status variables depends mainly upon the type of performance measures we want to obtain about the system under study.

In this problem, the most important status variable is n, the number of broken down machines, i.e., those waiting in the queue plus the one being repaired. If n=0, then we know that the queue is empty and the repairman is idle. If n=1, then the queue is empty and the repairman is busy. If n>1, then the repairman is busy and there are n-1 broken down machines in the queue.

Now, there are two events whose occurrence will cause n to change. These are:

1. A machine breaks down, i.e., an arrival occurs at the queue.
2. A machine is fixed, i.e., a departure occurs from the queue.

The flow-charts given in figures 1.5 and 1.6 show what happens when each of these events occurs.

Figure 1.5: An arrival event [a machine breaks down; if the repairman is busy it joins the queue, otherwise the repairman becomes busy and the repair starts].

Figure 1.6: A departure event [a machine is repaired; if there are other machines to be repaired a new repair starts, otherwise the repairman becomes idle].

In order to incorporate the above two basic events in the simulation model, we need a set of variables, known as clocks, which will keep track of the time instants at which an arrival or departure event will occur. In particular, for this specific model, we need to associate a clock with each machine. The clock will simply show the time instant at which the machine will break down, i.e., it will arrive at the repairman's queue. Obviously, at any instance, only the clocks of the operational machines are of interest. In addition to these clocks, we require another clock which shows the time instant at which a machine currently being repaired will become operational, i.e., it will cause a departure event to occur. Thus, if we have m machines, we need m+1 clocks in total: m clocks associated with the m arrival events, and one clock associated with the departure event. Each of these clocks is associated with the occurrence of an event.

In addition to these clocks, it is useful to maintain a master clock, which simply keeps track of the simulated time. The heart of the simulation model centers around the manipulation of these events. In particular, using the above clocks, the model decides which of all the possible events will occur next. Then the master clock is advanced to this time instant, and the model takes action as indicated in the flow-charts given in figures 1.5 and 1.6. This event manipulation approach is depicted in figure 1.7.

Figure 1.7: Event manipulation [choose the next event; for a breakdown, join the queue, or generate a new service if the queue was empty; for a repair, create the machine's next operation time, and generate a new service if the queue is not empty].

Let us assume that we have 3 machines. Let CL1, CL2, and CL3 be the clocks associated with machines 1, 2, and 3 respectively (arrival event clocks). Let CL4 be the clock associated with the departure event. Finally, let MC be the master clock and let R indicate whether the repairman is busy or idle. We assume that at time zero all three machines are operational and that CL1=1, CL2=4, CL3=9. (These are known as initial conditions.) We are now ready to carry out the hand simulation shown below in table 1.

MC   CL1   CL2   CL3   CL4   n   R
 0     1     4     9     -   0   idle
 1     -     4     9     6   1   busy
 4     -     -     9     6   2   busy
 6    16     -     9    11   1   busy
 9    16     -     -    11   2   busy
11    16    21     -    16   1   busy
16     -    21    26    21   1   busy

Table 1: Hand simulation for the machine interference problem

We note that in order to schedule a new arrival time we simply have to set the associated clock to MC+10. Similarly, each time a new repair service begins we set CL4=MC+5. We observe in the above hand simulation that the master clock gives us the time instants at which something happened (i.e., an event occurred). These times are: 0, 1, 4, 6, 9, 11, 16, ... We note that in-between these instants no event occurs and, therefore, the system's status remains unchanged. In view of this, it suffices to check the system at time instants at which an event occurs. Furthermore, having taken care of an event, we simply advance the master clock to the next event, which has the smallest clock time. For instance, in the above hand simulation, after we have taken care of the departure at time 11, there are three events that are scheduled to take place in the future. These are: a) arrival of machine 1 at time 16, b) arrival of machine 2 at time 21, and c) a departure of machine 3 at time 16. Obviously, the next event to occur is the latter event at time 16, and following the time instant 11 the simulation will advance to time 16. A very important aspect of this simulation model is that we only check the system at time instants at which an event takes place.
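The hand simulation in table 1 can be reproduced in a few lines of code. The sketch below is an illustrative event-advance implementation (the variable names are ours, not the book's); on a tie it processes the departure before the arrival, as in the hand simulation above.

```python
OP_TIME, REPAIR_TIME = 10, 5   # constant operational and repair times

def simulate(end_time):
    arrival = {1: 1, 2: 4, 3: 9}      # CL1, CL2, CL3: breakdown clocks
    departure = None                  # CL4: repair completion clock
    queue = []                        # broken-down machines, FIFO
    trace = []                        # (MC, n, repairman state) per event
    while True:
        events = [(t, 1, m) for m, t in arrival.items()]
        if departure is not None:
            events.append((departure, 0, queue[0]))  # 0: departures first
        mc, kind, machine = min(events)              # smallest clock wins
        if mc > end_time:
            return trace
        if kind == 0:                          # a machine is fixed
            queue.pop(0)
            arrival[machine] = mc + OP_TIME    # schedule next breakdown
            departure = mc + REPAIR_TIME if queue else None
        else:                                  # a machine breaks down
            del arrival[machine]
            queue.append(machine)
            if departure is None:              # repairman was idle
                departure = mc + REPAIR_TIME
        trace.append((mc, len(queue), "busy" if queue else "idle"))

print(simulate(20))
```

Running it up to time 20 visits the master-clock values 1, 4, 6, 9, 11 and 16 found in table 1.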

The above simulation can be easily done using a computer program, as described below. An outline of the flow-chart of the simulation program is given in figure 1.8. The actual implementation of this simulation model is left as an exercise.

Figure 1.8: A flowchart of the simulation program [initiate the simulation with MC=0, CL1=1, CL2=4, CL3=9, n=0, R=0 (idle); then repeatedly locate the next event: for an arrival of machine i, set MC=CLi, n=n+1, and if n was 0 set R=1 and CL4=MC+5; for a departure of machine i, set MC=CL4, n=n-1, CLi=MC+10, and if n=0 set R=0, otherwise set CL4=MC+5].

1.3.2 A token-based access scheme

We consider a computer network consisting of a number of nodes interconnected via a shared wired or wireless transport medium, as shown in figure 1.9. Access to the shared medium is controlled by a token. That is, a node cannot transmit on the network unless it has the token. In this example, we simulate a simplified version of such an access scheme.

Figure 1.9: Nodes interconnected by a shared medium

There is a single token that visits the nodes in a certain logical sequence. The nodes are logically connected so that they form a logical ring. In the case of a bus-based or ring-based wireline medium, the order in which the nodes are logically linked may not be the same with the order in which they are attached to the network. A node cannot transmit unless it has the token. When a node receives the token, it may keep it for a period of time up to T, the maximum time that a node can keep the token. During this time, the node transmits packets. A packet is assumed to consist of data and a header. The header consists of the address of the sender, the address of the destination, and various control fields.

The node surrenders the token when: a) time T has run out, or b) it has transmitted out all the packets in its queue before T expires, or c) it receives the token at a time when it has no packets in its queue to transmit. If time T runs out and the node is in the process of transmitting a packet, it will complete the transmission and then it will surrender the token. Surrendering the token means that the node will transmit it to its next downstream logical neighbor. We will assume that the token never gets lost.

Conceptually, this network can be seen as comprising a number of queues, one per node, as shown in figure 1.10. Only the queue that has the token can transmit packets. The token can be seen as a server who switches cyclically between the queues. Once the token is switched to a queue, packets waiting in this queue can be transmitted on the network, and the token is surrendered following the rules described above. The time it takes for the token to switch from one queue to the next is known as the switch-over time. Such queueing systems are referred to as polling systems, and they have been studied in the literature under a variety of assumptions.

Figure 1.10: The conceptual queueing system [three queues served cyclically by the token].

It is much simpler to use the queueing model given in figure 1.10 when constructing the simulation model of this access scheme. In the hand simulation given below we assume that the token-based network consists of three nodes. That is, the queueing system in figure 1.10 consists of three queues. The inter-arrival times to queues 1, 2, and 3 are constant and equal to 10, 15, and 20 unit times respectively. T is assumed to be equal to 15 unit times. The time it takes to transmit a packet is assumed to be constant and equal to 6 unit times. The switch-over time is equal to 1 unit time. For initial conditions we assume that the system is empty at time zero, the token is in queue 1, and the first arrivals to queues 1, 2, and 3 will occur at times 2, 4, and 6 respectively. If the token and a packet arrive at a queue at the same time, we will assume that the packet arrives first. In the case when an arrival and a departure occur simultaneously at the same queue, we will assume that the arrival occurs first.

The following events have to be taken into account in this simulation. For each queue, there is an arrival event and a service completion event. For the token, there is an arrival at the next queue event, and the time when the token has to be surrendered to the next node, known as the time-out.

For each queue, we keep track of the time of arrival of the next packet, the number of customers in the queue, and the time a packet is scheduled to depart, if one is being transmitted. For the token, we keep track of the time of arrival at the next queue, the number of the queue that may hold the token, and the time-out. The following variables represent the clocks used in the flow-charts:
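With constant inter-arrival times, the arrival instants of each queue can be generated directly. The helper below is our own sketch, not part of the book; it lists a queue's arrival times up to a chosen horizon.

```python
def arrivals(first, interarrival, horizon):
    """Arrival instants of a queue with constant inter-arrival times."""
    times, t = [], first
    while t <= horizon:
        times.append(t)
        t += interarrival
    return times

# Queues 1-3: first arrivals at 2, 4, 6; inter-arrival times 10, 15, 20.
print(arrivals(2, 10, 42))
```

Applied to the three queues, it reproduces the arrival-clock values that appear in the hand simulation of table 2.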

a. MC: Master clock
b. ATi: Arrival time clock at queue i, i=1,2,3
c. DTi: Departure time clock from queue i, i=1,2,3
d. TOUT: Time-out clock for the token
e. ANH: Arrival time clock of the token to the next queue

The logic for each of the events in this simulation is summarized in figures 1.11 to 1.15. Specifically, in figure 1.11 we show what happens when an arrival occurs, figure 1.12 deals with the case where a service is completed at queue i, figure 1.13 describes the action taken when a time-out occurs, and figure 1.14 summarizes the action taken when the token arrives at a queue. The set of all possible events that can occur at a given time in a simulation is known as the event list. This event list has to be searched each time in order to locate the next event. This event manipulation forms the basis of the simulation, and it is summarized in figure 1.15.

Figure 1.11: Arrival event at queue i [set MC=ATi, join the queue (qi=qi+1), and schedule the next arrival: ATi=MC + new inter-arrival time].

Figure 1.12: Service completion at queue i [set MC=DTi and depart from the queue (q=q-1); if the queue is not empty and the token has not timed out, schedule a new service (DTi=MC + new service time); otherwise pass the token: H=(H+1) mod 3 and schedule its arrival, ANH=MC + switch-over time].

Figure 1.13: Time-out of token [this means that the node is still transmitting; raise a flag and return].

Figure 1.14: Arrival of token at next queue.

Figure 1.15: Event manipulation [initialize the simulation; locate the next event, i.e., search the event list for the smallest clock value; take the appropriate action, i.e., branch to the appropriate part of the program (or procedure); if new events are scheduled, update the event list; repeat until the simulation is over].
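The "locate next event" step of figure 1.15 amounts to repeatedly extracting the event with the smallest clock from the event list. One common way to sketch this (the event labels below are hypothetical) is with a binary heap:

```python
import heapq

event_list = []                                  # the future event list
heapq.heappush(event_list, (2, "arrival at queue 1"))
heapq.heappush(event_list, (4, "arrival at queue 2"))
heapq.heappush(event_list, (1, "token arrives at queue 1"))

clock, event = heapq.heappop(event_list)         # smallest clock first
print(clock, event)
```

A linear search over the clock variables, as in the hand simulations of this chapter, works just as well for a handful of events; a heap becomes worthwhile when the event list is large, a topic taken up in the appendix of Chapter 4.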

The hand simulation is given in table 2.

Table 2: Hand simulation for the token-based access scheme [for each value of the master clock MC, the table records, for each of queues 1-3, the arrival clock, the departure clock, and the queue size, together with the number of the node holding the token, its time-out clock, and the arrival time of the token at the next node].

1.3.3 A two-stage manufacturing system

Let us consider a two-stage manufacturing system as depicted by the queueing network shown in figure 1.16. Stage 1 consists of an infinite capacity queue, referred to as queue 1, served by a single server, referred to as server 1. Stage 2 consists of a finite capacity queue, referred to as queue 2, served by a single server, referred to as server 2.

Figure 1.16: A two-stage queueing network [queue 1 and server 1 in stage 1, feeding queue 2 and server 2 in stage 2].

When queue 2 becomes full, server 1 gets blocked. More specifically, upon service completion at server 1, the server gets blocked if queue 2 is full. In this case, server 1 stops, and it cannot serve any other customers that may be waiting in queue 1. Server 1 will remain blocked until a customer departs from queue 2. At that point, a space will become available in queue 2 and the served customer in front of server 1 will be able to move into queue 2, thus freeing the server to serve other customers in queue 1.

Each server may also break down. For simplicity, we will assume that a server may break down whether it is busy or idle. A broken down server cannot provide service until it is repaired. If a customer was in service when the breakdown occurred, the customer will resume its service after the server is repaired, without any loss to the service it received up to the time of the breakdown. That is, after the server becomes operational again, the customer will receive the balance of its service.

The following events (and associated clocks) may occur:

1. Arrival of a customer to queue 1 (clock AT)
2. Service completion at server 1 (clock DT1)
3. Service completion at server 2 (clock DT2)
4. Server 1 breaks down (clock BR1)

5. Server 1 becomes operational (clock OP1)
6. Server 2 breaks down (clock BR2)
7. Server 2 becomes operational (clock OP2)

Below, we briefly describe the events that may be triggered when each of the above events occurs.

1. Arrival to queue 1:
   a. Arrival to queue 1 (new value for AT clock): This event is always scheduled each time an arrival occurs.
   b. Service completion at server 1 (new value for DT1 clock): This event will be triggered if the new arrival finds the server idle.

2. Service completion at server 1:
   a. Service completion at server 2 (new value for DT2 clock): This event will occur if the customer who just completed its service at server 1 finds server 2 idle.
   b. Service completion at server 1 (new value for DT1 clock): This event will occur if there is one or more customers in queue 1. The occurrence of a service completion event at server 1 may cause server 1 to get blocked, if queue 2 is full.

3. Service completion at server 2:
   a. Service completion at server 2 (new value for DT2 clock): This event will occur if there is one or more customers in queue 2.
   b. Service completion at server 1 (new value for DT1 clock): This event will occur if server 1 was blocked.

4. Server 1 breaks down:
   a. Server 1 becomes operational (new value for OP1 clock): This event gives the time in the future when the server will be repaired and will become operational.
   b. If the server was busy when it broke down, update the clock of the service completion event at server 1 to reflect the delay due to the repair.

5. Server 1 becomes operational:
   a. Service completion time (new value for DT1 clock): If the server was idle when it broke down, and queue 1 is not empty at the moment it becomes operational, then a new service will begin.
   b. Server 1 breaks down (new value for BR1 clock): This event gives the time in the future when the server will break down. During this time the server is operational.

6. Server 2 breaks down:
   a. Server 2 becomes operational (new value for OP2 clock): This event gives the time in the future when server 2 will be repaired, and therefore it will become operational. During that time the server is broken down.
   b. If server 2 was busy when it broke down, update the clock of the service completion event at server 2 to reflect the delay due to the repair.

7. Server 2 becomes operational:
   a. Service completion time (new value for DT2 clock): If the server was idle when it broke down, and queue 2 is not empty at the moment it becomes operational, then a new service will begin.
   b. Server 2 breaks down (new value for BR2 clock): This event gives the time in the future when server 2 will break down. During this time the server is operational.

In the hand simulation given in table 3, it is assumed that the buffer capacity of the second queue is 4 (this includes the customer in service). All service times, interarrival times, operational times and repair times are constant, with the following values: interarrival time = 40, service time at node 1 = 20, service time at node 2 = 30, operational time for server 1 = 200, operational time for server 2 = 300, repair time for server 1 = 50, and repair time for server 2 = 150. Initially the system is assumed to be empty. The first arrival occurs at time 10, server 1 will break down for the first time at time 80, and server 2 at time 90.
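Each step of the hand simulation in table 3 is produced by picking the smallest scheduled clock among the seven events listed above. A minimal sketch of that selection step, using the clock names above and the stated initial conditions:

```python
# None marks an event that is not currently scheduled.
clocks = {"AT": 10, "DT1": None, "DT2": None,
          "BR1": 80, "OP1": None, "BR2": 90, "OP2": None}

scheduled = {name: t for name, t in clocks.items() if t is not None}
next_event = min(scheduled, key=scheduled.get)   # event with smallest clock
print(next_event, scheduled[next_event])
```

Here the first event selected is the arrival at time 10, which matches the first row of table 3; a full simulator would then update the state, reschedule clocks, and repeat.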

[Table 3: Hand simulation for the two-stage manufacturing system. For each value of the master clock MC, the table tracks, for stage 1: the arrival clock AT, the number of customers, the clocks DT1, BR1 and OP1, and the server status (idle, busy, blocked or down); and, for stage 2: the number of customers, the clocks DT2, BR2 and OP2, and the server status. The hand simulation starts at MC = 10 and is carried out to MC = 380.]

Since we are dealing with integer numbers, it is possible that more than one clock may have the same value. That is, more than one event may occur at the same time. In this particular simulation, simultaneous events can be taken care of in any arbitrary order. In general, however, the order in which such events are dealt with may matter, and it has to be accounted for in the simulation. In a simulation, clocks are typically represented by real numbers, and therefore it is not possible to have events occurring at the same time.

Problems

1. Do the hand simulation of the machine interference problem, discussed in section 1.3.1, for the following cases:
   a. Vary the repair and operational times.
   b. Vary the number of repairmen.
   c. Assume that the machines are not repaired in a FIFO manner, but according to which machine has the shortest repair time.

2. Do the hand simulation of the token-based access scheme, described in section 1.3.2, for the following cases:
   a. Vary the inter-arrival times.
   b. Vary the number of queues.
   c. Assume that packets have priority 1 or 2 (1 being the highest). The packets in a queue are served according to their priority. Packets with the same priority are served in a FIFO manner.

3. Do the hand simulation of the two-stage manufacturing system, described in section 1.3.3, for the following cases:
   a. Assume no breakdowns.
   b. Assume a three-stage manufacturing system. (The third stage is similar to the second stage.)

Computer assignments

1. Write a computer program to simulate the machine interference problem as described in section 1.3.1. Each time an event occurs, print out a line of output to show the current values of the clocks and of the other status parameters (as in the hand simulation). Run your simulation until the master clock is equal to 20. Check by hand whether the simulation advances from event to event properly, and whether it updates the clocks and the other status parameters correctly.

2. Write a computer program to simulate the token-based access scheme as described in section 1.3.2. Each time an event occurs, print out a line of output to show the current values of the clocks and of the other status parameters (as in the hand simulation). Run your simulation until the master clock is equal to 100. Check by hand whether the simulation advances from event to event properly, and whether it updates the clocks and the other status parameters correctly.

3. Write a computer program to simulate the two-stage manufacturing system as described in section 1.3.3. Each time an event occurs, print out a line of output to show the current values of the clocks and of the other status parameters (as in the hand simulation). Run your simulation until the master clock is equal to 500. Check by hand whether the simulation advances from event to event properly, and whether it updates the clocks and the other status parameters correctly.
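Any high-level language works for these assignments. As a starting point, here is a minimal sketch of the advance-and-print loop for a machine interference model; everything in it is an illustrative assumption (3 machines, one repairman, constant operational time 10 and repair time 5, FIFO repairs, run until the master clock reaches 20), not the book's solution.

```python
# Illustrative skeleton for computer assignment 1 (machine interference).
OP_TIME, REPAIR_TIME, N_MACHINES, END = 10, 5, 3, 20

mcl = 0                                  # master clock
down_clock = [OP_TIME] * N_MACHINES      # next breakdown time per machine (None = down)
queue = []                               # machines waiting for / under repair (FIFO)
repair_done = None                       # clock of the current repair, if any

def report():
    print(f"MC={mcl:3} down={down_clock} queue={queue} repair_done={repair_done}")

while mcl < END:
    # Next event: the earliest breakdown or the pending repair completion.
    candidates = [t for t in down_clock if t is not None]
    if repair_done is not None:
        candidates.append(repair_done)
    mcl = min(candidates)
    if repair_done == mcl:               # a repair completes
        i = queue.pop(0)
        down_clock[i] = mcl + OP_TIME    # machine becomes operational again
        repair_done = mcl + REPAIR_TIME if queue else None
    else:                                # a machine breaks down
        i = down_clock.index(mcl)
        down_clock[i] = None             # machine is now down
        queue.append(i)
        if repair_done is None:
            repair_done = mcl + REPAIR_TIME
    report()                             # one output line per event
```

The printed trace can then be checked by hand against the hand simulation, event by event, exactly as the assignments request.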


CHAPTER 2: THE GENERATION OF PSEUDO-RANDOM NUMBERS

2.1 Introduction

We shall consider methods for generating random numbers that are uniformly distributed. Based on these methods, we shall then proceed in the next Chapter to consider methods for generating random numbers that have a certain distribution, i.e. exponential, normal, etc.

Numbers chosen at random are useful in a variety of applications. For instance, in numerical analysis, random numbers are used in the solution of complicated integrals. In computer programming, random numbers make a good source of data for testing the effectiveness of computer algorithms. Random numbers also play an important role in cryptography.

In simulation, random numbers are used in order to introduce randomness in the model. For instance, let us consider for a moment the machine interference simulation model discussed in the previous Chapter. In this model it was assumed that the operational time of a machine was constant. Also, it was assumed that the repair time of a machine was constant. It is possible that one may identify real-life situations where these two assumptions are valid. However, in most of the cases one will observe that the time a machine is operational varies. Similarly, the repair time may vary from machine to machine. If we are able to observe the operational times of a machine over a reasonably long period, we will find that they are typically characterized by a theoretical or an empirical probability distribution. Similarly, the repair times can also be characterized by a theoretical or empirical distribution. Therefore, in order to make the simulation model more realistic, one should be able to randomly generate numbers that follow a given theoretical or empirical distribution.

In order to generate such random numbers, one needs to be able to generate uniformly distributed random numbers, otherwise known as pseudo-random numbers. Strictly speaking, there is no such thing as a single random number; rather, we speak of a sequence of random numbers that follow a specified theoretical or empirical distribution. These pseudo-random numbers can be used either by themselves, or they can be used to generate random numbers from different theoretical or empirical distributions, known as random variates or stochastic variates.

There are two main approaches to generating random numbers. In the first approach, a physical phenomenon is used as a source of randomness from where random numbers can be generated. Random numbers generated in this way are called true random numbers. A true random number generator requires a completely unpredictable and non-reproducible source of randomness. Such sources can be found in nature, or they can be created from hardware and software. For instance, the elapsed time between emissions of particles during radioactive decay is a well-known randomized source. Also, the thermal noise from a semiconductor diode or resistor can be used as a randomized source. Finally, sampling a human-computer interaction process, such as the keyboard or mouse activity of a user, can give rise to a randomized source.

True random numbers are ideal for critical applications such as cryptography due to their unpredictable and realistic random nature. However, despite their several attractive properties, they are not useful in computer simulation, where, as will be seen below, we need to be able to reproduce a given sequence of random numbers. In addition, the production and storing of true random numbers is very costly.

An alternative approach to generating random numbers, which is the most popular approach, is to use a mathematical algorithm. Efficient algorithms have been developed that can be easily implemented in a computer program to generate a string of random numbers.
The generation of stochastic variates is described in Chapter 3. In this Chapter, we focus our discussion on the generation and statistical testing of pseudo-random numbers.

2.2 Pseudo-random numbers

These algorithms produce numbers in a deterministic fashion. That is, given a starting value, known as the seed, the same sequence of random numbers can be produced each time, as long as the seed remains the same. Despite the deterministic way in which random numbers are created, these numbers appear to be random, since they pass a number of statistical tests designed to test various properties of random numbers. In view of this, these random numbers are referred to as pseudo-random numbers.

An advantage of generating pseudo-random numbers in a deterministic fashion is that they are reproducible, since the same sequence of random numbers is produced each time we run a pseudo-random generator, given that we use the same seed.

Pseudo-random numbers, and in general random numbers, are typically generated on demand. That is, each time a random number is required, the appropriate generator is called, which then returns a random number. Consequently, there is no need to generate a large set of random numbers in advance and store them in an array for future use, as in the case of true random numbers.

We note that the term pseudo-random number is typically reserved for random numbers that are uniformly distributed in the space [0,1]. All other random numbers, including those that are uniformly distributed within any space other than [0,1], are referred to as random variates or stochastic variates. For simplicity, we will refer to pseudo-random numbers as random numbers.

In general, an acceptable method for generating random numbers must yield sequences of numbers or bits that are:

a. uniformly distributed,
b. statistically independent,
c. reproducible, and
d. non-repeating for any desired length.

Historically, the first method for creating random numbers by computer was Von Neumann's mid-square method. His idea was to take the square of the previous random number and to extract the middle digits.
This is helpful when debugging a simulation program, as we typically want to reproduce the same sequence of events in order to verify the accuracy of the simulation.
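This reproducibility is easy to see in practice. For instance, Python's built-in random module (whose underlying generator is the Mersenne twister, discussed later in this chapter) returns the identical sequence whenever it is re-seeded with the same seed; the snippet below is just a demonstration of that property.

```python
import random

# Re-seeding with the same seed reproduces the identical sequence --
# the reproducibility property needed when debugging a simulation.
random.seed(12345)
first_run = [random.random() for _ in range(5)]

random.seed(12345)
second_run = [random.random() for _ in range(5)]

print(first_run == second_run)  # → True
```

Running the simulation twice with the same seed therefore replays exactly the same sequence of events, which makes discrepancies attributable to the program logic rather than to the random input.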

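The mid-square step can be sketched in a few lines. The helper below is only an illustration (the function name and the `width` parameter are mine): square the current value, zero-pad the square to twice the digit width, and keep the middle digits.

```python
def mid_square(x, width=10):
    """One step of Von Neumann's mid-square method:
    square x and extract the middle `width` digits."""
    s = str(x * x).zfill(2 * width)       # pad the square to 2*width digits
    start = (len(s) - width) // 2         # position of the middle digits
    return int(s[start:start + width])

# 5772156649 squared is 33317792380594909201; its middle ten digits
# give the next value in the sequence.
print(mid_square(5772156649))  # → 7923805949
```

Repeatedly feeding the output back in produces the mid-square sequence, which, as noted below, looks random but is statistically unsatisfactory.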

Systematic testing of various values for these parameters have led to generators which have a full period and which are statistically satisfactory. The division can be avoided by setting m equal to the size of the computer word. an overflow will occur. In particular. c. m and c have no common divisor.Generating Pseudo-random numbers 29 The numbers generated by a congruential method are between 0 and m-1. the larger is the period. It will become obvious later on that the seed value does not affect the sequence of the generated random numbers after a small set of numbers has been generated. That is. The larger the value of m. 806. The number of successively generated pseudo-random numbers after which the sequence starts repeating itself is called the period. It is important to note that one should not use any arbitrary values for a. c and m. If the overflow does not cause the execution of the program to be aborted. if r is a prime number (divisible only by itself and 1) that divides m. a = 1 (mod r) if r is a prime factor of m. Now. and m = 232 (for a 32 bit machine). when the calculation is performed. If the period is equal to m. 2. In this case. if the total numerical value of the expression axi+c is less than the word size. the following conditions on a. A set of such values is: a = 314. and m guarantee a full period: 1. then it divides a-1. 245. then the generator is said to have a full period. 3. 159. The implementation of a pseudo-random number generator involves a multiplication. then it is in itself the result of the operation axi+c (mod m). c = 453. an addition and a division. a = 1 (mod 4) if m is a multiple of 4. where m is set equal to the word size. 269. but it simply causes the significant digits to be lost. let us assume that the expression axi+c gives a number greater than the word size. Uniformly distributed random numbers between 0 and 1 can be obtained by simply dividing the resulting xi by m. For. This is because the lost . 
In order to get a generator started. we further need an initial seed value for x. then the remaining digits left in the register is the remainder of the division (axi+c)/m. Theorems from number theory show that the period depends on m.

1 General congruential methods The mixed congruential method described above can be thought of as a special case of a following generator: xi+1 = f(xi.. which is based on the relation f(xi. In this case. The first significant digit will be lost and thus the register will contain the number 60. the largest number that can be held in the register is 99. xi-1. xi-k) = a1xi + a2xi-1 + . The special case of a1=a2=1. Obviously. and 26 (mod 100) = 26. 2.. This has the form: 2 xi+1=a1xi + a2xi-1+ c (mod m).30 Computer Simulation Techniques significant digits will represent multiples of the value of m. which is the quotient of the above division.) is a function of previously generated pseudo-random numbers. if x=20. However. we set m equal to 100. akxi-k.3. c=0 and m being a power of 2 has been found to be related to the midsquare method. then we have that axi + c = 170. xi-1. .... the remaining of the division 170/100. . In order to demonstrate the above idea. x=2. Another special case that has been considered is the additive congruential method. and c=10.) (mod m). A special case of the above general congruential method is the quadratic congruential generator. the product axi (which is equal to 8x20) will cause an overflow to occur. For a=8... which is. If we now add c (which is equal to 10) to the above result we will obtain 70. . where f(. we have that axi + c = 26. let us consider a fictitious decimal calculator whose register can accommodate a maximum of 2 digits. Now.

The next random number that will be returned by the composite generator. By combining separate generators. 2. One of the good examples for this type of generator uses the second congruential generator to shuffle the output of the first congruential generator. The procedure repeats itself in this fashion. In view of this. xi-1)=xi+xi-1 has received attention.3. 2. In particular. even if two separate generators used are bad. The random number stored in the rth position of the vector is returned as the first random number of the composite generator. k.4 Tausworthe generators Tausworthe generators are additive congruential generators obtained when the modulus m is equal to 2.. one hopes to achieve better statistical behavior than either individual generator. It has been demonstrated that such a combined generator has good statistical properties.Generating Pseudo-random numbers The case of f(xi. the first generator is used to fill a vector of size n with its first k generated random numbers. is the one selected by the second generator from the updated vector of random numbers. In particular. . notated as ⊕ and defined by the following truth table. Thus. it suffices to assume that the coefficients ai are also binary. ….2 Composite generators 31 We can develop composite generators by combining two separate generators (usually congruential generators). xi is obtained from the above expression by adding some of preceding bits and then carrying out a modulo 2 operation. The second generator is then used to generate a random integer r uniformly distributed over the numbers 1.+ anxi-n) (mod 2) where xi can be either 0 or 1. This is equivalent to the exclusive OR operation. 2. This type of generator produces a stream of bits {bi}.. xi = (a1xi-1 + a2xi-2 + . The first generator then replaces the random number in the rth position with a new random number.

Based on this recurrence relation. 3.32 Computer Simulation Techniques A 0 0 1 1 B 0 1 0 1 A⊕B 0 1 1 0 A⊕B is true (i. and they are widely used in simulation. In the composite generator scheme discussed earlier on. whereby each element is computed using the two previously computed elements. when either A is true and B false. However. equal to 1). they are too slow since they only produce bits. 21. A fast variation of this generator is the trinomial-based Tausworthe generator. 5. an additive recurrence relation. The generated bits can be put together sequentially to form an l-bit binary integer between 0 and 2l-1. Tausworthe generators are independent of the computer used and its word size and have very long cycles.5 The lagged Fibonacci generators The lagged Fibonacci generators (LFG) are an important improvement over the congruential generators. one of the generators (but not both) could be a Tausworthe generator. 1. 8. or A is false and B true. Two or more such generators have to be combined in order to obtain a statistically good output.e. as shown below: xn = xn-1 + xn-2 where x0=0 and x1=1. They are based on the well-known Fibonacci sequence. 2. Several bit selection techniques have been suggested in the literature. 2. 13. The beginning of the Fibonacci sequence is: 0. 1. the general form of LFG can be expressed as follows: xn = xn-j O xn-k (mod m) .

In this case. equal to mk-1. LFGs generate a sequence of random numbers with very good statistical properties. or a multiplication as well as it can be a binary operation XOR. then this LFG is called the additive LFG (ALFG). can be obtained if m is a prime number. SRC-TR-94-115.6 The Mersenne twister The Mersenne twister (MT) is an important pseudo-random number generator with . Cuccaro. Thus. However. Likewise. it is very easy to implement and also it is quite fast. The maximum period is (2k-1)*2m-3. Their execution can also be parallelized. or a subtraction. and appropriate initial conditions have been made. Further references on this generator can be found in: "A fast. then it is called the multiplicative LFG (MLFG).Generating Pseudo-random numbers 33 where 0<j<k. the next element is calculated as follows: xn = xn-j + xn-k (mod M) where 0<j<k. Robinson.and reproducible parallel lagged-fibonacci pseudorandom number generator " by M. The additive LFG is the most frequently used generator. As can be seen. This operation O can be an addition. Mascagni. high quality. S. and they are nearly as efficient as the linear congruential generators. In this generator. Pryor & M. the statistical properties of an output sequence of random numbers varies from seed to seed. LFGs are highly sensitive on the seed. A very long period. the next element is determined by combining two previously calculated elements that lag behind the current element utilizing an algebraic operation O. Determining a good seed for LFGs is a difficult task. typically m is set to 232 or 264. In this case the maximum period of the additive LFG is (2k-1)x2m-1. D. if O is the multiplication operation. In general. using a prime number may not be very fast. If O is the addition operation. 1994 2. That is. However. Supercomputing Research Center Technical Report. The multiplicative LFG is as follows: xn = xn-j * xn-k (mod m) where m is set to 232 or 264 and 0<j<k.

and its output has very good statistical properties.e. which is as large as the period of the generator after which it begins to repeat itself. The blocks are considered to be random. k≥0 We assume that each block.where 0≤r≤w. represented by x. The meaning of the parameters used in the above equation is as follows: € u x k : The upper w-r bits of xk. A: A constant w x w matrix defined as below so that the multiplication operation in the above recurrence can be performed extremely fast. l x k +1 : The lower r bit of xk+l. joining) of two bit strings.34 Computer Simulation Techniques superior performance. ⎛ 0 ⎜ ⎜ 0 ⎜ 0 A = ⎜ ⎜ 0 ⎜ 0 ⎜ ⎜ a ⎝ w−1 1 0 0 0 0 a w− 2 0 0 0 0 0 0 0 0 0 0 0 0 0 € ⎞ ⎟ ⎟ ⎟ ⎟ ⎟ 0 0 0 1 ⎟ ⎟ a 0 ⎟ ⎠ . Its maximum period is 219937-1.e. This bit sequence is typically grouped into 32-bit blocks (i.. n: Degree of recurrence relation m: Integer in the range of 1≤m≤n. has a size of w bits. which is much higher than many other pseudo-random number generators. • • • • • • • ⊕ : exclusive OR |: It indicates the concatenation (i. The following is the main recurrence relation for the generation of random sequence of bits: u k x k +n = x k +m ⊕ ( x k | x k +1 ) A . blocks equal to the computer word). The MT generates a sequence of bits..

x w−2 . xn-1. 1. x0. 1. each generated block is multiplied from the right with a special wxw invertible tempering matrix T. a1 .aw−2 . Model. Bit Mask Initializations: u ← 110 0 : (The bitmask for upper w-r bits) w− r r ll ← 0 011 : (The bitmask for lower r bits ) w− r r a ← [a w−1 . x 0 }. € At the last state of the algorithm. a 0 ]: (the last row of the matrix A) .Generating Pseudo-random numbers 35 The recurrence relation is initialized by providing seeds for the first d blocks. ACM Trans. This multiplication is performed in a similar manner as the multiplication with matrix A above and it involves only bitwise operations.. The multiplication operation xA can be done very fast as follows: ⎧ shiftright(x) if x 0 = 0 xA = ⎨ ⎩ shiftright(x) ⊕ a if x 0 = 1 where a = {aw−1 . as follows. i. Below we present a description of the MT algorithm following the paper: “Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator”. • (x >> u) indicates a shiftright operation by u times for variable x. Simul. t and u are pre-determined integer constants. • l. Nishimura. Comput. . . in order to increase the statistical properties of € generator’s output. M. 3-30 (1998).e. .a0 } and x = { x w−1 . . . • (x << u) indicates a shiftleft operation by u times for variable x. Matsumoto and T. x1. s. 8. a w− 2 . y = x ⊕ ( x >> u ) y = y ⊕ (( y << s ) ∧ b) y = y ⊕ ((y << t) ∧ c) z = y ⊕ (y >> l) where • b and c are block size binary bitmasks (vector parameters).

. x[n −1]) ←"any non . chi-square test for goodness of fit. 3. frequency test. serial test.36 2. 2.zero initial values" u l 3. x[1]. Not all random-number generators available through programming languages and other software packages have a good statistical behavior. In this section. 2..7 Statistical tests for pseudo-random number generators It is a good practice to check statistically the output of a pseudo-random number generator prior to using it. 4. Compute ( x k | x k +1 ) € y ← (x[i] ∧ u) ∨ (x[(i + 1 ) mod n] ∧ ll) 4 Multiplication with A: x[i ] ← x[(i + m) mod n] ⊕ ( y >> 1) ⎧0 if the least significant bit of y = 0 ⊕ ⎨ ⎩a if the least significant bit of y = 1 5. Multiplication operation x[i]*T y ← x[i] y ← y ⊕ (y >> u) y ← y ⊕ ((y << s) ∧ b) y ← y ⊕ ((y << t) ∧ c) y ← y ⊕ ( y >> l ) Output y 6. i = i+1 mod n and goto step 3. we describe the following five statistical tests: 1. runs test. autocorrelation test. Seed Initialization: Computer Simulation Techniques i←0 ( x[0]. and 5.

For instance. The last two statistical tests. into a sequence of bits. let us formulate the assertion that 30% of the cars are red. the runs test and the chi-square test for goodness of fit. in order to test the assertion that 30% of the cars are red.) 2. the null hypothesis is that “the produced sequence from the generator is random”. namely. or about some measure of a distribution. is also formulated. Statistical testing involves testing whether a particular assumption. The bitwise statistical tests can be used on pseudo-random number generators that generate real numbers.1]. such as the mean and the variance. by simply concatenating a very large number of random numbers. This assumption is called a hypothesis. known in statistics as a hypothesis. NIST special publication 800-22. denoted as Ha. The alternative hypothesis is that “the produced sequence is not random”. it is not true that 30% of the cars are red.1 Hypothesis testing Hypothesis testing is used in statistics to decide whether a particular assumption is correct or not. and it is denoted as H0. The first three statistical tests described below. we review the basic principles of hypothesis testing in statistics. is correct. and the autocorrelation test. The tested statistical hypothesis is called the null hypothesis. To test a hypothesis. (More information on this subject can be found in any introductory book on statistics. namely. red or not .Generating Pseudo-random numbers 37 Many useful statistical tests can be found in: “A statistical testing for pseudorandom number generators for cryptographic applications”. As an example. we first have to collect a representative sample of the colour of cars. the frequency test. test the randomness of a sequence of bits. that is. An alternative hypothesis. Before we proceed to describe the statistical tests. each represented by a binary string. check the randomness of pseudo-random numbers in [0.7. the serial test. This assertion is our null hypothesis. 
the alternative hypothesis is that this is not true. In the case of the testing a pseudo-random number generator. and it typically is an assertion about a distribution of one or more random variables. In this case. we first collect data and then based on the data we carry out a statistical hypothesis test.

Based on sampled data, a statistical hypothesis test is used to decide whether to accept or reject the null hypothesis. For instance, in order to test the assertion that 30% of the cars are red, we collect a sample, and then based on the sample we calculate pred, the percent of cars that were red. Due to sampling variation, pred will never be exactly equal to 0.30, even if our assumption is correct. The statistical test will help us determine whether the difference between pred and 0.30 is due to randomness in the obtained sample, or because our assumption that "30% of the cars are red" is incorrect.

The result of the test (accept or reject) is not deterministic but rather probabilistic. That is, with some probability we will accept or reject the null hypothesis. There are two errors associated with hypothesis testing: a type I error and a type II error. A type I error occurs when we reject the null hypothesis whereas in fact it is correct (which of course we do not know). A type II error occurs when we accept the null hypothesis when in fact it is not correct. Since we do not know the truth, we do not know whether we have committed a type I or a type II error. In the case of testing the randomness of a generator, the type I error is commonly known as a false negative, and the type II error is known as a false positive. Table 2.1 summarizes the type I and type II errors.

                              Real situation
Decision                  H0 is true        H0 is not true
H0 is not rejected        Valid             Type II error
H0 is rejected            Type I error      Valid

Table 2.1: Type I and type II errors

The probability of rejecting the null hypothesis when it is true, that is, the probability that a type I error occurs, is denoted by α, and it is known as the level of significance. Typically, we set α to 0.01 or 0.05 in practical applications.

2.7.2 Frequency test (Monobit test)

The frequency test is one of the most fundamental tests for pseudo-random number generators.

In a true random sequence, the number of 1's and 0's should be about the same. This test checks whether this is correct or not. In order to achieve this, the 0's are first converted into -1's, and then the obtained set of -1 and 1 values is added up. Let Sn = X1 + X2 + ... + Xn. A large positive value of Sn is indicative of too many ones, and a large negative value of Sn is indicative of too many zeros. The test makes use of the complementary error function (erfc); the code for this function can be found in a statistical package like Matlab, or in a programming language like Java. The following steps are involved:

1. Generate m pseudo-random numbers and concatenate them into a string of bits. Let the length of this string be n.

2. Compute the test statistic Sobs = |Sn| / √n.

3. Compute the P-value = erfc(Sobs / √2).

If the P-value < 0.01, then we decide to reject the null hypothesis, i.e., the sequence is non-random; else, the sequence is accepted as a random sequence. Note that the P-value is small, i.e., less than 0.01, if |Sn|, and consequently Sobs, is large.

As an example, let us consider the bit string 1011010101. Then, Sn = 1-1+1+1-1+1-1+1-1+1 = 2 and Sobs = 0.6325. For a level of significance α = 0.01, we obtain that P-value = erfc(0.6325/√2) = 0.5271 > 0.01. Thus, the sequence is accepted as being random.

It is recommended that a sequence of a few thousand bits is tested. If a pseudo-random number generator fails this test, then it is highly likely that it will also fail more sophisticated tests. For further details, see: "A statistical testing for pseudorandom number generators for cryptographic applications", NIST special publication 800-22.
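The three steps above fit in a few lines of code; the function name below is mine, and the erfc comes from Python's standard math module. The snippet reproduces the worked example.

```python
from math import erfc, sqrt

def monobit_p_value(bits):
    """Frequency (monobit) test: convert 0's to -1's, sum them up,
    and compute the P-value erfc((|S_n| / sqrt(n)) / sqrt(2))."""
    n = len(bits)
    s_n = sum(1 if b else -1 for b in bits)
    s_obs = abs(s_n) / sqrt(n)
    return erfc(s_obs / sqrt(2))

# The example from the text: 1011010101 gives S_n = 2, S_obs ≈ 0.6325,
# and a P-value of about 0.5271 > 0.01, so the string is accepted.
p = monobit_p_value([1, 0, 1, 1, 0, 1, 0, 1, 0, 1])
print(round(p, 4))  # → 0.5271
```

Applying the same function to a few thousand concatenated output bits of a generator gives a quick first screen of its quality.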

2.7.3 Serial test

There are 2^k different ways of combining k bits. Each of these combinations has the same chance of occurring, if the sequence of the k bits is random. The serial test determines whether the number of times each of these combinations occurs is uniformly distributed. If k = 1, then this test becomes equivalent to the frequency test described above in section 2.7.2.

Let e denote a sequence of n bits created by a pseudo-random number generator. The serial statistical test checks the randomness of overlapping blocks of k, k-1, and k-2 bits found in e, where k < ⌊log2 n⌋ - 2. The minimum recommended value for n is 100. The following steps are involved:

1. Augment e by appending its first k-1 bits to the end of the bit string e, and let e1' be the resulting bit string. Likewise, augment e by appending its first k-2 bits to the end of the bit string e, and let e2' be the resulting bit string. Finally, augment e by appending its first k-3 bits to the end of the bit string e, and let e3' be the resulting bit string.

2. Compute the frequency of occurrence of each k, k-1, and k-2 bit overlapping combination, using the sequences e1', e2', and e3' respectively. Let fi, fi', and fi'' represent the frequency of occurrence of the ith k, k-1, and k-2 bit overlapping combination, respectively.

3. Compute the following statistics:

   Sk^2 = (2^k / n) Σi fi^2 - n
   Sk-1^2 = (2^(k-1) / n) Σi fi'^2 - n
   Sk-2^2 = (2^(k-2) / n) Σi fi''^2 - n

4. Compute the following difference statistics:

   ΔSk^2 = Sk^2 - Sk-1^2
   Δ^2 Sk^2 = Sk^2 - 2Sk-1^2 + Sk-2^2
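The statistics of steps 1-4 can be computed mechanically. The sketch below (illustrative; the function name is mine) uses the example string e = 0011011101 with k = 3 from the worked example that follows; the P-values of step 5 would additionally need an incomplete gamma function, e.g. scipy.special.gammaincc.

```python
def s_statistic(e, k):
    """S_k^2 = (2^k / n) * (sum of squared frequencies of the overlapping
    k-bit blocks of e, augmented with its first k-1 bits) - n."""
    n = len(e)
    aug = e + e[:k - 1]                  # wrap around: append the first k-1 bits
    freq = {}
    for i in range(n):
        block = aug[i:i + k]
        freq[block] = freq.get(block, 0) + 1
    return (2 ** k / n) * sum(f * f for f in freq.values()) - n

e = "0011011101"
s3, s2, s1 = (s_statistic(e, k) for k in (3, 2, 1))
print(round(s3, 4), round(s2, 4), round(s1, 4))  # → 2.8 1.2 0.4

delta1 = s3 - s2             # ΔS^2  = 1.6
delta2 = s3 - 2 * s2 + s1    # Δ²S^2 = 0.8
```

These are exactly the values obtained by hand in the example below, which makes the helper convenient for checking longer strings.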

5. Compute the P-values:

P-value1 = IncompleteGamma(2^(k-2), ΔSk² / 2)
P-value2 = IncompleteGamma(2^(k-3), Δ²Sk² / 2)

The code for the IncompleteGamma(.,.) function can be found in software packages.

6. If P-value1 or P-value2 < 0.01, then we reject the null hypothesis that the sequence is random; else we accept the null hypothesis that the sequence is random. (0.01 is the level of significance.) As in the frequency test, if the overlapping bit combinations are not uniformly distributed, ΔSk² and Δ²Sk² become large, which causes the P-values to be small.

As an example, let us assume the output sequence e = 0011011101, with n = 10 and k = 3.

For the 3-bit blocks we use e′1. We append the k-1 = 2 bits from the beginning of the sequence e to the end of e, obtaining e′1 = 001101110100. The 3-bit combinations are: c1=000, c2=001, c3=010, c4=011, c5=100, c6=101, c7=110, and c8=111. Their corresponding frequencies of occurrence are: f1=0, f2=1, f3=1, f4=2, f5=1, f6=2, f7=2 and f8=1.

For the 2-bit blocks we use e′2. We repeat the process by appending the first k-2 = 1 bits of e to the end of e, obtaining e′2 = 00110111010. The 2-bit combinations are: c′1=00, c′2=01, c′3=10, and c′4=11. Their corresponding frequencies of occurrence are: f′1=1, f′2=3, f′3=3 and f′4=3.

For the 1-bit blocks we use e itself. We do not need to repeat the process for k-3 = 0, since e′3 = e = 0011011101. The 1-bit combinations are c″1=0 and c″2=1, and their frequencies of occurrence are f″1=4 and f″2=6.

We then compute the following statistics and differences:

S3² = (2³/10)(0+1+1+4+1+4+4+1) - 10 = 2.8
S2² = (2²/10)(1+9+9+9) - 10 = 1.2
S1² = (2/10)(16+36) - 10 = 0.4
ΔS3² = S3² - S2² = 2.8 - 1.2 = 1.6
Δ²S3² = S3² - 2 S2² + S1² = 2.8 - 2.4 + 0.4 = 0.8

Finally, we compute the P-values:

P-value1 = IncompleteGamma(2, 0.8) = 0.9057
P-value2 = IncompleteGamma(1, 0.4) = 0.8805

Since both P-value1 and P-value2 are greater than 0.01, this sequence passes the serial test. For further details see: "A statistical test suite for random and pseudorandom number generators for cryptographic applications", NIST special publication 800-22.

2.7.4 Autocorrelation test

Let e denote a sequence of n bits created by a pseudo-random number generator. The intuition behind the autocorrelation test is that if the sequence of bits in e is random, then it will be different from another bit string obtained by shifting the bits of e by d positions, where 1 ≤ d ≤ n/2. Let ei be the bit sequence obtained by shifting e by i positions. For instance, let e = 1001101. Then 0011011 is a bit string obtained by shifting e by 1 position, 0110110 by 2 positions, 1101100 by 3 positions, etc. We compare shifted versions of e to find how many bits are different, and this information is then used in a statistic for the hypothesis testing. The following steps are involved:

1. Given an integer value d, compute the number of positions in which e and its shift by d differ:

A(d) = Σ_{i=0}^{n-d-1} ei ⊕ ei+d,

where ei here denotes the ith bit of e.

2. Compute the statistic:

S = 2 (A(d) - (n-d)/2) / √(n-d).

3. S follows approximately the standardized normal distribution N(0,1) if n-d ≥ 10. For a significance level of 0.05, we will accept the null hypothesis that the bit string e is random if -1.96 ≤ S ≤ 1.96. (Values for various levels of significance can be readily computed using a function available in software packages. They can also be found in a table for the standardized normal distribution in any introductory statistics book.)

As an example, let us consider a bit string e which is made up by combining the following bit sequence e′ = (11100 01100 01000 10100 11101 11100 10010 01001) four times; the total length of e is n = 160. For d = 8, we have that A(8) = 100, and consequently S = 3.8933. At the 5% level of significance, S > 1.96, and therefore we reject the hypothesis that e is random. Obviously, e is not very random! Further details can be found in: "Handbook of Applied Cryptography" by A. J. Menezes, P. C. van Oorschot and S. A. Vanstone, CRC Press, 1996.

2.7.5 Runs test

The runs test can be used to test the assumption that the pseudo-random numbers are independent of each other. We start with a sequence of pseudo-random numbers in [0,1]. We then look for unbroken subsequences of numbers, where the numbers within each subsequence are monotonically increasing. Such a subsequence is called a run up, and it may consist of just one number. For example, let us consider the sequence: 0.8, 0.7, 0.75, 0.55, 0.6, 0.7, 0.3, 0.4, 0.5. Starting from the beginning of this sequence, we find a run up of length 1 (0.8), then a run up of length 2 (0.7, 0.75), followed by two successive run ups of length 3 (0.55, 0.6, 0.7 and 0.3, 0.4, 0.5).

In general, let ri be the number of run ups of length i, 1 ≤ i ≤ 6. (All run ups with a length i ≥ 6 are grouped together into a single run-up count. In the above example we have r1=1, r2=1, r3=2.) The ri values calculated for a particular sequence are then used to calculate the following statistic:

R = (1/n) Σ_{1≤i,j≤6} (ri - nbi)(rj - nbj) aij,

where n is the sample size, and bi and aij are known coefficients. The bi coefficient is obtained as the ith element of the vector

(b1, ..., b6) = (1/6, 5/24, 11/120, 19/720, 29/5040, 1/840),

and the aij coefficient is obtained as the (i,j)th element of the matrix

 4529.4   9044.9  13568   18091   22615   27892
 9044.9  18097   27139   36187   45234   55789
13568    27139   40721   54281   67852   83685
18091    36187   54281   72414   90470  111580
22615    45234   67852   90470  113262  139476
27892    55789   83685  111580  139476  172860

For n > 4000, R has a chi-square distribution with 6 degrees of freedom (d.f.) under the null hypothesis that the random numbers are independent and identically distributed (i.i.d.). (See section 2.7.6 below on how to do a hypothesis testing with the chi-square distribution.)

The runs test can be also applied to a sequence of 0's and 1's. In this case, a run is an uninterrupted sequence of 0's or 1's. For further information see "The Art of Computer Programming - Volume 2" by Knuth, Addison-Wesley.
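The autocorrelation statistic is easy to compute directly. Below is an illustrative Python sketch (not part of the book's text) applied to the 160-bit example of section 2.7.4:

```python
import math

def autocorrelation_test(bits, d):
    """Autocorrelation test: count the positions where the bit string differs
    from itself shifted by d, then standardize (approximately N(0,1))."""
    n = len(bits)
    a_d = sum(bits[i] ^ bits[i + d] for i in range(n - d))  # A(d)
    s = 2 * (a_d - (n - d) / 2) / math.sqrt(n - d)
    return a_d, s

# The 40-bit pattern from the example, repeated four times (n = 160):
pattern = [int(b) for b in "1110001100010001010011101111001001001001"]
a, s = autocorrelation_test(pattern * 4, d=8)
# a = 100 and s = 3.8933; since |s| > 1.96, randomness is rejected at the 5% level
```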

2.7.6 Chi-square test for goodness of fit

The chi-square test, in general, can be used to check whether an empirical distribution follows a specific theoretical distribution. The test measures whether the difference between the observed and the theoretical values is due to random fluctuations, or due to the fact that the empirical distribution does not follow the specific theoretical distribution. In our case, we are concerned about testing whether the numbers produced by a generator are uniformly distributed in [0,1].

Let us consider a sequence of n pseudo-random numbers between 0 and 1. We divide the interval [0,1] into k subintervals of equal length, where k > 100. Let fi be the number of pseudo-random numbers that fall within the ith subinterval (make sure that enough random numbers are generated so that fi > 5). The fi values are called the observed values. Now, if these generated random numbers are truly uniformly distributed, then the mean number of random numbers that fall within each subinterval is n/k. This value is called the theoretical value. For the case where the theoretical distribution is the uniform distribution, the chi-square statistic is given by the expression

χ² = (k/n) Σ_{i=1}^{k} (fi - n/k)²,

and it has k-1 degrees of freedom. The null hypothesis is that the generated random numbers are uniformly distributed in [0,1]. This hypothesis is rejected if the computed value of χ² is greater than the one obtained from the chi-square tables for k-1 degrees of freedom and α level of significance. The chi-square tables can be found in any introductory statistics book, and the function to generate χ² values can be found in software packages.

Problems

1. Consider the multiplicative congruential method for generating random digits.

Assuming that m = 10, determine the length of the cycle for each set of values of a and x0 given below:

a. a = 2, x0 = 1
b. a = 3, x0 = 5

2. Use the statistical tests described in section 2.7 to test a random number generator available on your computer.

Computer Assignments

1. This is a more involved exercise! Implement the random number generators described in this chapter, and then test them using the statistical tests described in section 2.7. Measure the execution time for each generator for the same output size. Present your results in a table that gives the execution speed of each generator and whether it has failed or passed each statistical test. Discuss your results. What is your favorite generator for your simulation assignments?
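As a starting point for these assignments, the chi-square test of section 2.7.6 can be sketched in Python (an illustrative sketch; the function name and the use of the built-in generator are assumptions, not from the book):

```python
import random

def chi_square_uniformity(numbers, k):
    """Chi-square statistic for uniformity in [0,1]: divide [0,1] into k equal
    subintervals and compare observed counts fi with the theoretical count n/k."""
    n = len(numbers)
    f = [0] * k
    for x in numbers:
        f[min(int(x * k), k - 1)] += 1   # bin index; guard against x == 1.0
    expected = n / k
    return (k / n) * sum((fi - expected) ** 2 for fi in f)

random.seed(1)
chi2 = chi_square_uniformity([random.random() for _ in range(20000)], 128)
# Reject uniformity if chi2 exceeds the critical chi-square value for
# k-1 = 127 degrees of freedom (approximately 154 at the 5% level).
```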

CHAPTER 3: GENERATING STOCHASTIC VARIATES

3.1 Introduction

In the previous Chapter we examined techniques for generating random numbers. In this Chapter, we discuss techniques for generating random numbers with a specific distribution. Pseudo-random numbers which are uniformly distributed are normally referred to as random numbers. Random numbers following a specific distribution are called random variates or stochastic variates. Random numbers play a major role in the generation of stochastic variates.

There are many techniques for generating random variates. The inverse transformation method is one of the most commonly used techniques. This is discussed below. Sections 3.3 and 3.4 give methods for generating stochastic variates from known continuous and discrete theoretical distributions. Section 3.5 discusses methods for obtaining stochastic variates from empirical distributions. Section 3.6 describes an alternative method for stochastic variates generation known as the rejection method.

3.2 The inverse transformation method

This method is applicable only to cases where the cumulative density function can be inverted analytically. Assume that we wish to generate stochastic variates from a probability density function (pdf) f(x), and let F(x) be its cumulative density function. We note that F(x) takes values in the region [0,1]. We exploit this property of the cumulative density function to obtain the following simple stochastic variates generator.

We first generate a random number r which we set equal to F(x). That is, F(x) = r. The quantity x is then obtained by inverting F. That is, x = F⁻¹(r), where F⁻¹(r) indicates the inverse transformation of F.

As an example, let us assume that we want to generate random variates with probability density function

f(x) = 2x, 0 ≤ x ≤ 1.

A graphical representation of this probability density function is given in figure 3.1a. We first calculate the cumulative density function F(x). We have

F(x) = ∫_0^x 2t dt = x², 0 < x < 1.

Let r be a random number. We have r = x², or x = √r.

Figure 3.1a: pdf f(x)
Figure 3.1b: Inversion of F(x)
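The inversion for this example can be sketched in Python (an illustrative sketch, not part of the book's text):

```python
import random

def variate_2x():
    """Inverse transformation for f(x) = 2x on [0, 1]: F(x) = x^2, so x = sqrt(r)."""
    return random.random() ** 0.5

# Sanity check: the sample mean should approach E(X) = 2/3.
random.seed(1)
mean = sum(variate_2x() for _ in range(100000)) / 100000
```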

This inversion is shown graphically in figure 3.1b.

In sections 3.3 and 3.4 we employ the inverse transformation method to generate random variates from various well-known continuous and discrete probability distributions.

3.3 Sampling from continuous-time probability distributions

In this section, we use the inverse transformation method to generate variates from a uniform distribution, an exponential distribution, and an Erlang distribution. We also describe two techniques for generating variates from the normal distribution.

3.3.1 Sampling from a uniform distribution

The probability density function of the uniform distribution is defined as follows:

f(x) = 1/(b-a) for a < x < b, and 0 otherwise,

and it is shown graphically in figure 3.2.

Figure 3.2: The uniform distribution.

The cumulative density function is:

F(x) = ∫_a^x 1/(b-a) dt = (x-a)/(b-a).

The expectation and variance are given by the following expressions:

E(X) = ∫_a^b x f(x) dx = (1/(b-a)) ∫_a^b x dx = (b+a)/2

Var(X) = ∫_a^b (x - E(X))² f(x) dx = (b-a)²/12.

The inverse transformation method for generating random variates is as follows:

r = F(x) = (x-a)/(b-a), or x = a + (b-a)r.

3.3.2 Sampling from an exponential distribution

The probability density function of the exponential distribution is defined as follows:

f(x) = ae⁻ᵃˣ, a > 0, x > 0.

The cumulative density function is:

F(x) = ∫_0^x f(t) dt = ∫_0^x ae⁻ᵃᵗ dt = 1 - e⁻ᵃˣ.

The expectation and variance are given by the following expressions:

E(X) = ∫_0^∞ t ae⁻ᵃᵗ dt = 1/a

Var(X) = ∫_0^∞ (t - E(X))² ae⁻ᵃᵗ dt = 1/a².

The inverse transformation method for generating random variates is as follows:

r = F(x) = 1 - e⁻ᵃˣ, or 1 - r = e⁻ᵃˣ, or x = -(1/a) log(1-r) = -E(X) log(1-r).

Since 1 - F(x) is uniformly distributed in [0,1], we can use the following short-cut procedure: set r = e⁻ᵃˣ, and therefore x = -(1/a) log r.

3.3.3 Sampling from an Erlang distribution

On many occasions an exponential distribution may not represent a real life situation. For example, the execution time of a computer program, or the time it takes to manufacture an item, may not be exponentially distributed. It can be seen, however, as a number of exponentially distributed services which take place successively. If the mean of each of these individual services is the same, then the total service time follows an Erlang distribution, as shown in figure 3.3.

Figure 3.3: The Erlang distribution.

The Erlang distribution is the convolution of k exponential distributions having the same mean 1/a. An Erlang distribution consisting of k exponential distributions is referred to as Ek. The expected value and the variance of a random variable X that follows the Erlang distribution are:

E(X) = k/a

Var(X) = k/a².

Erlang variates may be generated by simply reproducing the random process on which the Erlang distribution is based. This can be accomplished by taking the sum of k exponential variates x1, x2, ..., xk with identical mean 1/a. We have

x = Σ_{i=1}^k xi = -(1/a) Σ_{i=1}^k log ri = -(1/a) log Π_{i=1}^k ri.
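The exponential and Erlang samplers of sections 3.3.2 and 3.3.3 can be sketched in Python (an illustrative sketch, not part of the book's text; 1-r is used inside the logarithm so the argument stays strictly positive):

```python
import math
import random

def exp_variate(a):
    """Exponential variate with rate a: x = -(1/a) log(1 - r)."""
    return -math.log(1.0 - random.random()) / a

def erlang_variate(k, a):
    """Erlang E_k variate: -(1/a) times the log of the product of k random
    numbers, i.e. the sum of k exponential variates with mean 1/a."""
    prod = 1.0
    for _ in range(k):
        prod *= 1.0 - random.random()
    return -math.log(prod) / a

random.seed(1)
exp_mean = sum(exp_variate(0.5) for _ in range(100000)) / 100000      # ~ 1/a = 2
erl_mean = sum(erlang_variate(3, 0.5) for _ in range(50000)) / 50000  # ~ k/a = 6
```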

3.3.4 Sampling from a normal distribution

A random variable X with probability density function

f(x) = (1/(σ√(2π))) e^(-(x-µ)²/(2σ²)), -∞ < x < +∞,

where σ is positive, is said to have a normal distribution with parameters µ and σ. The expectation and variance of X are µ and σ² respectively. If µ = 0 and σ = 1, then the normal distribution is known as the standard normal distribution, and its probability density function is

f(x) = (1/√(2π)) e^(-x²/2), -∞ < x < +∞.

If a random variable X follows a normal distribution with mean µ and variance σ², then the random variable Z defined as

Z = (X - µ)/σ

follows the standard normal distribution.

In order to generate variates from a normal distribution with parameters µ and σ, we employ the central limit theorem. (This approach is named after this particular theorem.) The central limit theorem briefly states that if X1, X2, ..., Xn are n independent random variates, each having the same probability distribution with E(Xi) = µ and Var(Xi) = σ², then the sum ΣXi = X1 + X2 + ... + Xn approaches a normal distribution as n becomes large. The mean and variance of this normal distribution are:

E(ΣXi) = nµ
Var(ΣXi) = nσ².

The procedure for generating normal variates requires k random numbers r1, r2, ..., rk. Since each ri is a uniformly distributed random number over the interval [0,1], we have that

E(ri) = (a+b)/2 = 1/2
Var(ri) = (b-a)²/12 = 1/12.

Using the central limit theorem, we have that the sum Σri of these k random numbers approaches the normal distribution with mean k/2 and variance k/12, and therefore

(Σri - k/2) / √(k/12) ~ N(0,1).    (3.1)

Now, let us consider the normal distribution with parameters µ and σ from which we want to generate normal variates. Let x be such a normal variate. Then

(x - µ)/σ ~ N(0,1).    (3.2)

Equating (3.1) and (3.2) gives

(x - µ)/σ = (Σri - k/2) / √(k/12),

or

x = σ √(12/k) (Σri - k/2) + µ.

This equation provides us with a simple formula for generating normal variates with a mean µ and standard deviation σ. The value of k has to be very large, since the larger it is the better the accuracy. Usually, one has to balance accuracy against efficiency. The smallest value recommended is k = 10. (In fact, one can observe that k = 12 has computational advantages, since then √(12/k) = 1.)

An alternative approach to generating normal variates (known as the direct approach) is the following. Let r1 and r2 be two uniformly distributed independent random numbers. Then

x1 = (-2 loge r1)^(1/2) cos 2πr2
x2 = (-2 loge r1)^(1/2) sin 2πr2

are two random variates from the standard normal distribution. This method produces exact results, and the speed of the calculations compares well with the central limit approach, subject to the efficiency of the special function subroutines.
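Both normal-variate techniques can be sketched in Python (an illustrative sketch, not part of the book's text):

```python
import math
import random

def normal_clt(mu, sigma, k=12):
    """Central limit approach: x = sigma * sqrt(12/k) * (sum of k randoms - k/2) + mu."""
    s = sum(random.random() for _ in range(k))
    return sigma * math.sqrt(12.0 / k) * (s - k / 2.0) + mu

def normal_direct(mu, sigma):
    """Direct approach: an exact standard normal variate from two random numbers."""
    r1 = 1.0 - random.random()            # keep the log argument positive
    r2 = random.random()
    return mu + sigma * math.sqrt(-2.0 * math.log(r1)) * math.cos(2.0 * math.pi * r2)

random.seed(1)
clt_mean = sum(normal_clt(10.0, 2.0) for _ in range(50000)) / 50000
dir_mean = sum(normal_direct(10.0, 2.0) for _ in range(50000)) / 50000
```

Both sample means should be close to µ = 10; the direct approach is exact, while the central limit approach is approximate.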

. Let p and q be the probability of a success and failure respectively. The random variable that gives the number of successive failures that occur before a success occurs follows the geometric distribution. . . The generation of geometric variates using the inverse transformation method can be accomplished as follows. n F(n) = ∑pqs s=0 . The probability density function of the geometric distribution is p(n) = pqn. where the outcome of each trial is either a failure or a success.2. 1. n = 0. 2. .56 Computer Simulation Techniques 3. n = 0.1. and its cumulative probability density function is n F(n) = ∑pqs s=0 . We have that p+q=1.1 Sampling from a geometric distribution Consider a sequence of independent trials. .. . The expectation and the variance of a random variable following the geometric distribution are: p E(X) = q p Var(X) = q2 .4.

We have

F(n) = p Σ_{s=0}^n q^s = p (1 - q^(n+1)) / (1 - q) = 1 - q^(n+1),

since p = 1 - q. From this expression we obtain that 1 - F(n) = q^(n+1). We observe that 1 - F(n) varies between 0 and 1. Therefore, let r be a random number, and set r = q^(n+1). Then

log r = (n+1) log q, or n = (log r / log q) - 1.

Alternatively, since (1 - F(n))/q = qⁿ also varies between 0 and 1, we can set r = qⁿ, obtaining

n = log r / log q.

3.4.2 Sampling from a binomial distribution

Consider a sequence of independent trials (Bernoulli trials). Let p be the probability of a success and q = 1 - p the probability of a failure, and let X be a random variable indicating the number of successes in n trials. Then, this random variable follows the binomial distribution. The probability density function of X is

p(k) = (n choose k) p^k q^(n-k), k = 0, 1, 2, ..., n.

The expectation and variance of the binomial distribution are:

E(X) = np
Var(X) = npq.

We can generate variates from a binomial distribution with a given p and n as follows. We generate n random numbers r1, r2, ..., rn, after setting a variable k0 equal to zero. For each random number ri, i = 1, 2, ..., n, a check is made, and the variable ki is incremented as follows:

ki = ki-1 + 1 if ri < p
ki = ki-1     if ri ≥ p

The final quantity kn is the binomial variate. This method for generating variates is known as the rejection method, which is discussed in detail below in section 3.6.

3.4.3 Sampling from a Poisson distribution

The Poisson distribution models the occurrence of a particular event over a time period. Let λ be the average number of occurrences during a unit time period. Then, the number of occurrences x during a unit period has the following probability density function:

p(n) = e^(-λ) λⁿ / n!, n = 0, 1, 2, ...

It can be demonstrated that the time elapsing between two successive occurrences of the event is exponentially distributed with mean 1/λ, i.e. f(t) = λe^(-λt). One method for generating Poisson variates involves the generation of exponentially distributed time intervals t1, t2, t3, ..., each with an expected value equal to 1/λ. These intervals are accumulated until they exceed 1, the unit time period. That is,

Σ_{i=1}^n ti < 1 < Σ_{i=1}^{n+1} ti.

The stochastic variate n is simply the number of events that occurred during the unit time period. Since ti = -(1/λ) log ri, n can be obtained by simply multiplying random numbers together until the product for n+1 drops below the quantity e^(-λ). That is, n is given by the following expression:

Π_{i=1}^n ri > e^(-λ) > Π_{i=1}^{n+1} ri.
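The binomial and Poisson procedures described above can be sketched in Python (an illustrative sketch, not part of the book's text):

```python
import math
import random

def binomial_variate(n, p):
    """Binomial variate: count how many of n random numbers satisfy ri < p."""
    return sum(1 for _ in range(n) if random.random() < p)

def poisson_variate(lam):
    """Poisson variate: multiply random numbers together until the product for
    n+1 drops below e^(-lambda); the count n of factors kept is the variate."""
    threshold = math.exp(-lam)
    n, prod = 0, random.random()
    while prod >= threshold:
        prod *= random.random()
        n += 1
    return n

random.seed(1)
bin_mean = sum(binomial_variate(10, 0.3) for _ in range(50000)) / 50000   # ~ np = 3
poi_mean = sum(poisson_variate(4.0) for _ in range(50000)) / 50000        # ~ lambda = 4
```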

3.5 Sampling from an empirical probability distribution

Quite often an empirical probability distribution may not be approximated satisfactorily by one of the well-known theoretical distributions. In such a case, one is obliged to generate variates which follow this particular empirical probability distribution. In this section, we show how one can sample from a discrete or a continuous empirical distribution.

3.5.1 Sampling from a discrete-time probability distribution

Let X be a discrete random variable, and let p(X = i) = pi, where pi is calculated from empirical data. Also, let p(X ≤ i) = Pi be the cumulative probability. Random variates from this probability distribution can be easily generated as follows. Let r be a random number. Then, if Pi-1 < r ≤ Pi, the random variate is x = i. This method is based on the fact that pi = Pi - Pi-1, and that since r is a random number, it will fall in the interval (Pi-1, Pi] with probability pi.

As an example, let us consider the well-known newsboy problem. Let X be the number of newspapers sold by a newsboy per day. From historical data we have the following distribution for X:

X      1     2     3     4     5
f(x)   0.20  0.20  0.30  0.15  0.15

The cumulative probability distribution is:

X      1     2     3     4     5
F(x)   0.20  0.40  0.70  0.85  1.00

Now, let us assume that r falls between P2 and P3 (see figure 3.4). Then, the random variate x is equal to 3.

Figure 3.4: Sampling from an empirical discrete probability distribution.

The random variate generator can be summarized as follows:

1. Sample a random number r.
2. Locate the interval within which r falls in order to determine the random variate x:
   a. If 0.85 < r ≤ 1.00, then x = 5.
   b. If 0.70 < r ≤ 0.85, then x = 4.
   c. If 0.40 < r ≤ 0.70, then x = 3.
   d. If 0.20 < r ≤ 0.40, then x = 2.
   e. Otherwise, x = 1.
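The table-lookup generator above can be sketched in Python (an illustrative sketch, not part of the book's text; the empirical values are those of the newsboy example):

```python
import random

# Values and cumulative probabilities from the newsboy example:
demand = [1, 2, 3, 4, 5]
cum_prob = [0.20, 0.40, 0.70, 0.85, 1.00]

def newsboy_variate():
    """Inverse transformation on an empirical discrete distribution:
    return the first x whose cumulative probability reaches r."""
    r = random.random()
    for x, p in zip(demand, cum_prob):
        if r <= p:
            return x

# Empirical check: relative frequencies should approach f(x).
random.seed(1)
counts = {x: 0 for x in demand}
for _ in range(100000):
    counts[newsboy_variate()] += 1
```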

3.5.2 Sampling from a continuous-time probability distribution

Let us assume that the empirical observations of a random variable X can be summarized into the histogram shown in figure 3.5. From this histogram, a set of values (xi, f(xi)) can be obtained,

Figure 3.5: Histogram of a random variable X.

where xi is the midpoint of the ith interval, and f(xi) is the height of the ith rectangle. Using this set of values we can approximately construct the cumulative probability distribution shown in figure 3.6, where F(xi) = Σ_{1≤k≤i} f(xk). The cumulative distribution is assumed to be monotonically increasing within each interval [F(xi-1), F(xi)].



Figure 3.6: The cumulative distribution.

Now, let r be a random number, and let us assume that F(xi-1) < r < F(xi). Then, using linear interpolation, the random variate x can be obtained as follows:

x = xi-1 + (xi - xi-1) (r - F(xi-1)) / (F(xi) - F(xi-1)),

where xi is the extreme right point of the ith interval.

Figure 3.7: "Discretizing" the probability density function.



This approach can also be used to generate stochastic variates from a known continuous probability distribution f(x). We first obtain a set of values (xi, f(xi)) as shown in figure 3.7. This set of values is then used in place of the exact probability density function. (This is known as "discretizing" the probability density function.) Using this set of values we can proceed to construct the cumulative probability distribution and then obtain random variates as described above. The accuracy of this approach depends on how close the successive xi points are.
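The interpolation scheme above can be sketched in Python (an illustrative sketch; the histogram values below are hypothetical, not from the book):

```python
import random

def empirical_variate(xs, F):
    """Sample by linear interpolation of a piecewise cumulative distribution:
    xs[i] are interval end points and F[i] cumulative probabilities,
    with F[0] = 0 at xs[0] and F[-1] = 1."""
    r = random.random()
    for i in range(1, len(xs)):
        if r <= F[i]:
            return xs[i - 1] + (xs[i] - xs[i - 1]) * (r - F[i - 1]) / (F[i] - F[i - 1])

# Hypothetical summary of a histogram on [0, 3]:
xs = [0.0, 1.0, 2.0, 3.0]
F = [0.0, 0.5, 0.8, 1.0]
random.seed(1)
samples = [empirical_variate(xs, F) for _ in range(100000)]
```

About half of the samples should fall in [0, 1], matching the cumulative jump of 0.5 over that interval.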

Figure 3.8: Normalized f(x)

3.6 The rejection method

The rejection technique can be used to generate random variates if f(x) is bounded and x has a finite range, say a ≤ x ≤ b. The following steps are involved:

1. Normalize the range of f(x) by a scale factor c so that cf(x) ≤ 1, a ≤ x ≤ b (see figure 3.8).
2. Define x as a linear function of r, i.e. x = a + (b-a)r, where r is a random number.
3. Generate pairs of random numbers (r1, r2).
4. Accept the pair and use x = a + (b-a)r1 as a random variate whenever the pair satisfies the relationship r2 ≤ cf(a + (b-a)r1), i.e. the pair (x, r2) falls under the curve in figure 3.8.

The idea behind this approach is that the probability of r2 being less than or equal to cf(x) is p[r2 ≤ cf(x)] = cf(x). Consequently, if x is chosen at random from the range (a,b) and then rejected whenever r2 > cf(x), the probability density function of the accepted x's will be exact. We demonstrate the rejection method by giving two different examples. The first example deals with random variate generation, and the second example deals with a numerical integration problem.

Example 1: Use the rejection method to generate random variates with probability density function f(x) = 2x, 0 ≤ x ≤ 1. This can be accomplished using the following procedure:

1. Select c such that cf(x) ≤ 1, i.e. c = 1/2.
2. Generate a pair of random numbers (r1, r2), and set x = r1.
3. If r2 ≤ cf(r1) = (1/2)2r1 = r1, then accept the pair and use x = r1 as a random variate; otherwise, go back to step 2.

Example 2: Use the rejection method to compute the area of the first quadrant of a unit circle. We first note that any pair of uniform numbers (r1, r2) defined over the unit interval corresponds to a point within the unit square. A pair (r1, r2) lies on the circumference if r1² + r2² = 1. The numerical integration can be accomplished by carrying out the following two steps a large number of times:

1. Generate a pair of random numbers (r1, r2).
2. If r2 < f(r1), where f(r1) = √(1 - r1²), then r2 is under (or on) the curve, and hence the pair (r1, r2) is accepted; otherwise, it is rejected.

The area under the curve can then be obtained as the ratio

area = (total number of accepted pairs) / (total number of generated pairs).
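Both examples can be sketched in Python (an illustrative sketch, not part of the book's text):

```python
import random

def rejection_2x():
    """Example 1: rejection sampling for f(x) = 2x on [0, 1] with c = 1/2;
    accept x = r1 whenever r2 <= c * f(r1) = r1."""
    while True:
        r1, r2 = random.random(), random.random()
        if r2 <= r1:
            return r1

def quadrant_area(trials):
    """Example 2: hit-or-miss estimate of the area of the first quadrant
    of the unit circle (exact value pi/4 = 0.7854)."""
    hits = sum(1 for _ in range(trials)
               if random.random() ** 2 + random.random() ** 2 <= 1.0)
    return hits / trials

random.seed(1)
mean = sum(rejection_2x() for _ in range(50000)) / 50000   # ~ E(X) = 2/3
area = quadrant_area(200000)                               # ~ 0.7854
```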

The rejection method is not very efficient when c(b-a) becomes very large. The method of mixtures can then be used, whereby the distribution is broken into pieces and the pieces are then sampled in proportion to the amount of distribution area each contains. This process is identical to the rejection method for each piece of the distribution, plus a straightforward sampling of data.

3.7 Monte Carlo methods

Monte Carlo methods comprise that branch of experimental mathematics which is concerned with experiments on random numbers. Monte Carlo methods are usually associated with problems of theoretical interest, as opposed to the simulation methods described in this book, otherwise known as direct simulation. Unlike direct simulation techniques, Monte Carlo methods are not concerned with the passage of time. Every Monte Carlo computation that leads to quantitative results may be regarded as estimating the value of a multiple integral.

The previously demonstrated rejection method for calculating the integral of a function is an example of a Monte Carlo method, known as hit-or-miss Monte Carlo. An alternative method for calculating the integral of a function is the crude Monte Carlo method. Let us assume that we wish to calculate the one-dimensional integral

θ = ∫_0^1 f(x) dx.

Let ζ1, ζ2, ..., ζn be random numbers between 0 and 1. Then, the quantities fi = f(ζi) are independent random variates with expectation θ. Therefore,

f̄ = (1/n) Σ_{i=1}^n fi

is an unbiased estimator of θ. Its variance can be estimated using the expression

s² = (1/(n-1)) Σ_{i=1}^n (fi - f̄)².

Thus, the standard error of f̄ is s/√n. This technique is more efficient than the hit-or-miss technique mentioned in the previous section.

The above technique of evaluating θ is based on the following idea. In general, if X is a random variable that takes values in [0,1] and f(X) is a function of this random variable, then

E(f(X)) = ∫_0^1 f(x) g(x) dx,

where g(x) is the density function of X. Assuming that X is uniformly distributed in (0,1), i.e. g(x) = 1, we obtain

E(f(X)) = ∫_0^1 f(x) dx.

Thus, f̄ is an unbiased estimate of E(f(X)).
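The crude Monte Carlo estimator can be sketched in Python (an illustrative sketch, not part of the book's text; the integrand x² is an assumed example with exact integral 1/3):

```python
import random

def crude_monte_carlo(f, n):
    """Crude Monte Carlo estimate of the integral of f over [0, 1],
    together with its estimated standard error s / sqrt(n)."""
    fs = [f(random.random()) for _ in range(n)]
    mean = sum(fs) / n
    s2 = sum((fi - mean) ** 2 for fi in fs) / (n - 1)
    return mean, (s2 / n) ** 0.5

random.seed(1)
theta, se = crude_monte_carlo(lambda x: x * x, 100000)   # exact value: 1/3
```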

Problems

1. Use the inverse transformation method to generate random variates with probability density function

f(x) = 3x² for 0 ≤ x ≤ 1, and 0 otherwise.

2. Apply the inverse transformation method and devise specific formulae that yield the value of the variate x given a random number r. (Note that f(x) below needs to be normalized.)

f(x) = 5x for 0 ≤ x ≤ 4, and f(x) = x - 2 for 4 < x ≤ 10.

3. Set up a procedure to generate stochastic variates from

f(x) = x for 0 ≤ x ≤ 1/2, and f(x) = 1 - x for 1/2 < x ≤ 1.

4. A customer in a bank may receive service which has an Erlang distribution E3 (each phase with mean 10) or an Erlang distribution E4 (each phase with mean 5), with probability 0.4 and 0.6 respectively. Set up a procedure to generate random variates of a customer's service time.

5. Use the rejection method to generate stochastic variates from

f(x) = (x-3)⁴, 0 ≤ x ≤ 10.

6. A Coxian distribution consists of a series of exponential distributions, each with a different mean, and it has the following structure:

(Figure: a series of exponential stages with parameters µ1, µ2, ..., µn; after each stage i the customer departs with probability bi or continues with probability ai.)

After receiving an exponential service time with parameter µ1, there is a probability b1 (= 1 - a1) of departing, or a probability a1 of receiving another exponential service time with parameter µ2, and so on, until the kth exponential service is received. Modify the procedure for generating stochastic variates from an Erlang distribution, in order to generate variates from a Coxian distribution.

Computer Assignments

1. Test statistically one of the stochastic variates generators discussed in this Chapter.

2. Consider the machine interference problem. Change your simulation program so that the operational time and the repair time of a machine are exponentially distributed with the same means as before. Make sure that your clocks are defined as real variables. Run your simulation model as before. Each time an event occurs, print out a line of output to show the new value of the clocks and the other relevant parameters.

3. Consider the token-based access scheme. Change your simulation program so that the inter-arrival times are exponentially distributed with the same means as before. The switch-over time and the time period T remain constant as before. The packet transmission time is calculated as follows. We assume that 80% of the transmitted packets are short and the remaining 20% are long. The time to transmit a short packet is exponentially distributed with mean 2.5, and the time to transmit a long packet is exponentially distributed with mean 20. Make sure that your clocks are defined as real variables. Run your simulation model as before. Each time an event occurs, print out a line of output to show the new value of the clocks and the other relevant parameters.

4. Consider the two-stage manufacturing system. Change your simulation program so that the inter-arrival, service, operational, and repair times are all exponentially distributed with the same means as before. Make sure that your clocks are defined as real variables. Run your simulation model as before. Each time an event occurs, print out a line of output to show the new value of the clocks and the other relevant parameters.

Solutions

1. We have

∫_0^1 3x² dx = x³ |_0^1 = 1,

so f(x) is a proper density, and

F(x) = ∫_0^x 3t² dt = x³.

Hence, r = x³, or x = r^(1/3).

2. In order to normalize f(x), we compute

∫_0^4 5x dx + ∫_4^10 (x - 2) dx = 40 + 30 = 70.

Thus,

F(x) = (1/70)(5x²/2) for 0 ≤ x ≤ 4, and
F(x) = (1/70)(40 + x²/2 - 2x - (16/2 - 8)) = (1/70)(40 + x²/2 - 2x) for 4 < x ≤ 10.

Procedure: draw a random number r. If r < 40/70, then r = (1/70)(5x²/2), or x = √(28r). Otherwise, r = (1/70)(40 + x²/2 - 2x), i.e. x² - 4x + 80 - 140r = 0, from which one can solve for x: x = 2 + √(140r - 76).

5. Step 1: use calculus to establish the bounds of f(x) = (x-3)⁴ over [0, 10]: f(0) = 3⁴ = 81, f(3) = 0, and f(10) = 7⁴ = 2401. Thus c = 1/7⁴. Step 2: generate r1 and r2, and set x = 10r1. If r2 < (1/7⁴) f(10r1), then accept 10r1 as a stochastic variate; otherwise, go back to step 2.

2x⎟ .2 + 2 . . ⎞ 1 ⎛ x2 Otherwise.1 2.70 or ⎪4 x2 ⎪10 5 = 2 x2 ⎪ + 2 . generate r2. 7 Otherwise go back to 2.4 Thus.2x ⎪ ⎪0 ⎪ 4 5 100 42 = 2 16 + 2 = 2 x 10 . Use calculus to establish the bounds of f(x) 74 34 1 2 3 10 Thus c = 1/74 Step 2: 2.2x⎞ ⎟ ⎩ 70 ⎜ ⎝ ⎠ 2 0≤x≤4 4 < x ≤ 10 Procedure: Draw a random number r. ⎝ ⎠ 5. from which one can solve for x. Computer Simulation Techniques ⎧ 1 5 x2 ⎪ 70 2 F(x) = ⎨ ⎪ 1 ⎛40 + x2 . r = 70 ⎜40 + 2 . Then 10r1. 40 5 140 If r < 70 then r = 140 x2 or x = 5 xr . 1 if r2 < 4 f(10r1) then accept 10r1 as a stochastic variate. Step 1.1.2 generate r1.

CHAPTER 4: SIMULATION DESIGNS

4.1 Introduction

In this Chapter, we examine three different designs for building simulation models. The first two designs are: a) event-advance and b) unit-time advance. Both these designs are event-based, but they utilize different ways of advancing the time. The third design is activity-based. The event-advance design is probably the most popular simulation design.

4.2 Event-advance design

This is the design employed in the three examples described in Chapter 1. The basic idea behind this design is that the status of the system changes each time an event occurs. During the time that elapses between two successive events, the system's status remains unchanged. In view of this, it suffices to monitor the changes in the system's status. In order to implement this idea, each event is associated with a clock. The value of this clock gives the time instance in the future at which the event will occur. The simulation model, upon completion of processing an event, say at time t1, regroups all the possible events that will occur in the future and finds the one with the smallest clock value, say at time t2. It then advances the time, i.e. the master clock, to this particular time when the next event will occur. It takes appropriate action as dictated by the occurrence of this event, and then repeats the process of finding the next event (say at time t3). The simulation model, therefore, moves through time by simply visiting the time instances at which events occur. In view of this, it is known as the event-advance design.

In the machine interference problem, described in section 3.1 of Chapter 1, there are two types of events: the event of an arrival at the repairman's queue, and the event of a departure from the repairman's queue. These events are known as primary events. Quite often the occurrence of a primary event may trigger off the creation of a new event. For instance, the occurrence of an arrival at the repairman's queue may trigger off the creation of a departure event, if this arrival occurs at a time when the repairman is idle. Such triggered events are known as conditional events. The basic approach of the event-advance design is shown in the flow chart in figure 4.1.

[Figure 4.1: The event-advance simulation design. Find the next event in the future event list, advance the time, take appropriate action depending on the type of event, create any new conditional event(s) in the future event list, and repeat.]

4.3 Future event list

Let us assume that a simulation model is currently at time t. The collection of all events scheduled to occur in the future, i.e. events with clock greater than t, is known as the future event list. For each event scheduled to occur in the future, the list contains the following information:

- Time of occurrence (i.e., the value of the event's clock)
- Type of event

The event type is used in order to determine what action should be taken when the event occurs. For instance, using the event type the program can determine which procedure to call. In the simulation examples described in section 3 of Chapter 1, there were only a few events. For instance, in the case of the machine interference problem there were only two: an arrival to the repairman's queue and a service-ending (departure) event. However, when simulating complex systems, the number of events may be very large. In such cases, finding the next event might require more than a few comparisons. It is important to have an efficient algorithm for finding the next event, since this operation may well account for a large percentage of the total computations involved in a simulation program. An event list should be stored in such a way so as to lend itself to an efficient execution of the following operations:

1. Locating the next future event time and the associated event type.
2. Deleting an event from the list after it has occurred.
3. Inserting newly scheduled events in the event list.

Below we examine two different schemes for storing an event list. In the first scheme, the event list is stored in a sequential array. In the second scheme, it is stored as a linked list.

4.4 Sequential arrays

In this scheme, all future event times are stored sequentially in an array. The simplest way to implement this is to associate each event type with an integer number i. The clock associated with this event is then always stored in the ith location of the array. For instance, in figure 4.2, the clock CL1 for event type 1 is kept in the first location of the array, the clock CL2 for event type 2 is kept in the second position of the array, and so on.

[Figure 4.2: Future event list stored sequentially in an array: CL1, CL2, CL3, ..., CLn.]

Finding the next event is reduced to the problem of locating the smallest value in an array. The following simple algorithm can be used to find the smallest value in an array A:

    minIndex ← 1
    minValue ← A(1)
    for i ← 2, n
        if minValue ≤ A(i) continue
        else
            minValue ← A(i)
            minIndex ← i

Variable minIndex will eventually contain the location of the array with the smallest value. If minIndex = i, then the next event is of type i and it will occur at time A(i). A newly-scheduled event j is inserted in the list by simply updating its clock given by A(j). An event is not deleted from the array after it has occurred. If an event is not valid at a particular time, then its clock can be set to a very large value, so that the above algorithm will never select it.
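The scan above translates directly into C. This is a sketch under the assumption that the clocks are stored as doubles and that invalid events are flagged with a very large clock value:

```c
/* Scan the future event clock array and return the index of the
   smallest clock, i.e. the type of the next event to occur.
   Invalid events are assumed to carry a very large clock value,
   so they are never selected. */
int next_event(const double clock[], int n)
{
    int minIndex = 0;               /* C arrays are 0-based */
    double minValue = clock[0];
    for (int i = 1; i < n; i++) {
        if (clock[i] < minValue) {
            minValue = clock[i];
            minIndex = i;
        }
    }
    return minIndex;
}
```

Note that C arrays are indexed from 0, so event type i is stored in clock[i-1] if one wishes to keep the book's 1-based numbering.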

The advantage of storing an event list in a sequential array is that insertions of new events and deletions of caused events can be done very easily (i.e., in constant time). Locating the smallest number in an array does not take much time if the array is small. However, if the array is large, it becomes time consuming; the time it takes to find the smallest number in the array depends, in general, on the length n of the array (i.e., its complexity is linear in time). To overcome these problems, one should store the future event list in a linked list.

4.5 Linked lists

A linked list representation of data allows us to store each data element of the list in a different part of the memory. Thus, we no longer need contiguous memory locations, and data can be dynamically added at runtime. In order to access the data elements in the list in their correct order, we store along with a data element the address of the next data element. This is a pointer that points to the location of the next data element, and it is often referred to as a link. The data element and the link (or links) are generally referred to as a node. In general, a node may consist of a number of data elements and links. Linked lists are drawn graphically as shown in figure 4.3. Each node is represented by a box consisting of as many compartments as the number of data elements and links stored in the node. In the example given in figure 4.3, each node consists of two compartments, one for storing a data element and the other for storing the pointer to the next node. The pointer called head points to the first node in the list. If the linked list is empty, i.e. it contains no nodes, then head is set to a special value called NULL, indicating that it does not point to any node. The pointer of the last node is likewise set to NULL, indicating that this is the last node in the linked list structure.

[Figure 4.3: A linked list.]

Due to the fact that two successive nodes are connected by a single pointer, this data structure is known as a singly linked list. A singly linked list can be used to store a future event list, as shown in figure 4.4. Each node consists of two data elements, namely a clock CLi showing the future time of an event, and a value i indicating the type of the event. The nodes are arranged in ascending order, so that CLi ≤ CLj ≤ ... ≤ CLn.

[Figure 4.4: Future event list stored sequentially in a linked list.]

4.5.1 Implementation of a future event list as a linked list

The following basic operations are required in order to manipulate the future event list:

a. Organize information, such as data elements and pointers, into a node.
b. Create a new node(s) or delete an existing unused node(s).
c. Access a node through the means of a pointer.
d. Add a node to the linked list.
e. Remove a node from the linked list.

We shall use the C programming language syntax to explain the implementation.

a. Creating and deleting nodes

Let us first see how we can organize the data elements and pointers in a node. A simple node which saves the event type, the event clock, and a pointer to the next node can be represented as follows:

    struct node {
        int type;              /* event type                      */
        float clock;           /* time at which the event occurs  */
        struct node* next;     /* pointer to the next node        */
    };

The above structure is an example of a self-referential structure, since it contains a pointer to a structure of the same type as itself, namely the next node. In addition to the data elements (i.e., type and clock) shown above, the structure can contain other satellite data which may be specific to each event. A new node must be dynamically allocated each time a new event is created. Dynamic allocation of memory refers to reserving a number of bytes in the memory and setting up a pointer to the first memory location of the reserved space. In C, the system call malloc() is used to return such a pointer. Thus, a node can be dynamically allocated as follows:

    struct node* nodePtr = (struct node*) malloc(sizeof(struct node));

The above call can be visualized as shown in figure 4.5.

[Figure 4.5: Allocating memory for a node using malloc(). nodePtr points to sizeof(struct node) bytes of reserved memory.]

We use the cast (struct node*) to explicitly tell the compiler that the memory location returned by this call will be pointed to by a variable which is a pointer to the node structure that we have defined.

Some C compilers may give warnings if such a cast is not applied. We can modify the fields of the structure at any time in the program, as follows:

    nodePtr->type = ARRIVAL;

b. Function to create a new node

A simple function can be written which creates a new node and sets its fields using the parameters passed to it. Consider the following function that creates a new node and returns a pointer to it:

    struct node* createNode(int type, float clock)
    {
        struct node* nodePtr = (struct node*) malloc(sizeof(struct node));
        nodePtr->type = type;
        nodePtr->clock = clock;
        nodePtr->next = NULL;
        return nodePtr;
    }

In the above function, we allocate memory for the node and assign to its fields the values passed into the function as parameters. The last line returns a pointer to the created node. The function can be called from the program as follows:

    newNodePtr = createNode(ARRIVAL, MCL+5.5);

where ARRIVAL is one of the predefined event types and MCL is the master clock. Thus, we are creating a new ARRIVAL event that will occur 5.5 units of time in the future. After the above call, newNodePtr points to the created node, whose next pointer points to NULL. This can be visualized as shown in figure 4.6.

[Figure 4.6: Result of calling createNode(). newNodePtr points to a node containing the type and the clock, with its next pointer set to NULL.]

c. Deletion of a node

When a node is deleted, the memory allocated for it must be returned to the system. This can be easily done with the system call free(), as follows:

    free(nodePtr);

d. Creating linked lists, inserting and removing nodes

Now comes the interesting part of creating a linked list, adding nodes, and removing nodes. To access a linked list, we need a pointer to the first node of the list, called the head of the linked list. In C, the head can be defined as follows:

    struct node* head = NULL;

The linked list is initially empty, i.e. the head does not point to any node. Thus, the head must be explicitly initialized to point to NULL, as shown in figure 4.7. After inserting nodes, the list can be traversed using the head and the pointer to the next node stored in each member node of the list.

[Figure 4.7: The head of an empty linked list points to NULL.]

e. Inserting a node in a linked list

In order to insert a node into the linked list, the pointer to the head of the list and the pointer to the new node must be provided. The node can then be inserted into the list in an appropriate position, such that the nodes are ordered according to a specified field. For the purpose of simulation, this field will be the event clock.

    struct node* insertNode(struct node* head, struct node* newNodePtr)
    {
        if (head == NULL) {                                  // Case 1
            return newNodePtr;
        }
        else if (head->clock > newNodePtr->clock) {          // Case 2
            newNodePtr->next = head;
            return newNodePtr;
        }
        else {                                               // Case 3
            struct node* prev = NULL;
            struct node* curr = head;
            while ((curr != NULL) && (curr->clock <= newNodePtr->clock)) {
                prev = curr;
                curr = curr->next;
            }
            newNodePtr->next = curr;
            prev->next = newNodePtr;
            return head;
        }
    }

The above function can be called from a program as follows:

    newNodePtr = createNode(ARRIVAL, MCL+5.5);
    head = insertNode(head, newNodePtr);

As commented in the code above, there are three separate cases that must be handled, examined below.

Case 1: head is NULL

In this case, the head is currently pointing to NULL, which means that the list is empty and the node being inserted will be the first element in the list. Thus, after the call to insertNode(), head must point to the node that is being inserted. This can be visualized as shown in figures 4.8 and 4.9.

[Figure 4.8: Case 1, head and newNodePtr before the call to insertNode(): head points to NULL, and newNodePtr points to an ARR node with clock 8.93.]

[Figure 4.9: Case 1, head after the call to insertNode(): head now points to the ARR 8.93 node.]

Case 2: head->clock > newNodePtr->clock

In the second case, the head is not pointing to NULL, and the clock value of the head is greater than the clock value of the new node. Thus, the new node must be inserted before the head, and the pointer to the new head node must be returned. This can be visualized as shown in figures 4.10 and 4.11.

[Figure 4.10: Case 2, head and newNodePtr before the call to insertNode(): the list holds ARR 8.93 followed by DEP 15.86, and newNodePtr points to an OTH node with clock 6.7.]

[Figure 4.11: Case 2, head after the call to insertNode(): head now points to OTH 6.7, followed by ARR 8.93 and DEP 15.86.]

Notice that in cases 1 and 2 the pointer to the head changes. Thus, the pointer returned by insertNode() must be assigned to the head.

Case 3: Insertion between the head and the last node of the list

In the third case, the head is not pointing to NULL, and the clock value of the head is less than the clock value of the new node. Thus, the new node must be inserted after the head. This can be visualized as shown in figures 4.12 and 4.13.

[Figure 4.12: Case 3, head and newNodePtr before the call to insertNode(): the list holds ARR 8.93 followed by DEP 15.86, and newNodePtr points to an OTH node with clock 10.77.]

[Figure 4.13: Case 3, head after the call to insertNode(): the list is now ARR 8.93, OTH 10.77, DEP 15.86.]

f. Removing a node from the linked list

The linked list is always maintained in ascending order of the clock value. Therefore, the next event to occur is always the first node of the linked list. The node can be obtained from the linked list as follows:

    struct node* removeNode(struct node* head, struct node** nextRef)
    {
        *nextRef = head;
        if (head != NULL) {
            head = head->next;
            (*nextRef)->next = NULL;
        }
        return head;
    }

The function detaches the node that was pointed to by the head, makes the head point to the next node, and returns the new head. It can be called as follows:

    struct node* nextNodePtr = NULL;
    head = removeNode(head, &nextNodePtr);

The first parameter passed to removeNode() is the head of the linked list. The second parameter (struct node** nextRef) may look different from what we have seen so far: this parameter is of the type pointer to a pointer to a struct node. Such variables are sometimes called reference pointers.

[Figure 4.14: head and nextNodePtr before the call to removeNode(): the list holds ARR 8.93 followed by DEP 15.86, and nextNodePtr points to NULL.]

We need a reference pointer in this function in order to return to the calling function the memory location of the next node. Note that in the function call, the address of the variable nextNodePtr (&nextNodePtr) is passed. It is interesting to see what goes on inside the function removeNode(). When the function gets called, the address of nextNodePtr is passed to it, as shown in figure 4.15.

[Figure 4.15: nextRef points to nextNodePtr, since the address of nextNodePtr (&nextNodePtr) is passed to removeNode().]

After the assignment *nextRef = head, this changes as shown in figure 4.16.

[Figure 4.16: nextNodePtr now points to the head node of the list.]

After the statement head = head->next, the head points to the next node after the head. And after the statement (*nextRef)->next = NULL, the node that is being removed points to NULL, as shown in figure 4.17. This new head is returned by the function removeNode() as its return value.

[Figure 4.17: The previous head node of the list is now removed from the list, and head points to the second node.]

Finally, let us take a look at the calling function. As shown in figure 4.18, the head of the list returned by the function removeNode() is assigned to the head variable in the calling function. The nextNodePtr is returned at the address (&nextNodePtr) passed while calling the function.

[Figure 4.18: Result of calling the function removeNode(): nextNodePtr points to the removed ARR 8.93 node, and head points to DEP 15.86.]

4.5.2 Time complexity of linked lists

Let us try to analyze the time complexity of the insert and remove operations on the linked list. In order to insert a node in the linked list, we have to traverse the linked list and compare each node until we find the correct insertion position. The maximum number of nodes compared in the worst case will be the total number of nodes in the list. Thus, the complexity of the insert operation is linear in n. The remove operation, on the other hand, simply detaches the first node, and therefore takes constant time.
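A binary min-heap keyed on the event clock brings both the insertion and the removal of the next event down to O(log n) comparisons. The sketch below uses a fixed capacity and static storage purely for illustration; these choices, and the function names, are my own assumptions rather than part of the text:

```c
#define HEAP_MAX 1024

/* A fixed-capacity binary min-heap of event clocks.  Slots heap[1..heap_size]
   are used; every parent clock is <= its children's clocks, so heap[1] is
   always the clock of the next event. */
static double heap[HEAP_MAX + 1];
static int heap_size = 0;

void heap_insert(double clock)
{
    int i = ++heap_size;                 /* start at the new last slot */
    while (i > 1 && heap[i / 2] > clock) {
        heap[i] = heap[i / 2];           /* sift the hole up           */
        i /= 2;
    }
    heap[i] = clock;
}

double heap_remove_min(void)
{
    double min = heap[1];
    double last = heap[heap_size--];
    int i = 1;
    for (;;) {                           /* sift the last clock down   */
        int child = 2 * i;
        if (child > heap_size) break;
        if (child < heap_size && heap[child + 1] < heap[child]) child++;
        if (last <= heap[child]) break;
        heap[i] = heap[child];
        i = child;
    }
    heap[i] = last;
    return min;
}
```

In a full simulator each heap entry would also carry the event type, as in the linked-list nodes above; only the clock is kept here to keep the sketch short.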

Searching a linked list might be time consuming if n is very large. In this case, one can employ better searching procedures. For example, a simple solution is to maintain a pointer B to a node which is in the middle of the linked list. This node logically separates the list into two sublists. By comparing the clock value of the node to be inserted with the clock stored in this node, we can easily establish in which sublist the insertion is to take place. The actual insertion position can then be located by sequentially searching the nodes of that sublist. Alternatively, other data structures like heaps have a much better time complexity for these operations.

4.5.3 Doubly linked lists

So far we have examined singly linked lists. The main disadvantage of these lists is that they can only be traversed in one direction, namely from the first node to the last one. Doubly linked lists allow traversing a linked list in both directions. This is possible because any two successive nodes are linked with two pointers, as shown in figure 4.19. Depending upon the application, a doubly-linked list may be more advantageous than a singly-linked list. A doubly-linked list can be processed using procedures similar to those described above in section 4.5.1.

[Figure 4.19: A doubly linked list.]

4.6 Unit-time advance design

In the event-advance simulation, the master clock is advanced from event to event. Alternatively, the master clock can be advanced in fixed increments of time, each increment being equal to one unit of time. In view of this particular mode of advancing

the master clock, this simulation design is known as the unit-time advance design. Each time the master clock is advanced by a unit time, all future event clocks are compared with the current value of the master clock. If any of these clocks is equal to the current value of the master clock, then the associated event has just occurred and appropriate action has to take place.

[Figure 4.20: The unit-time advance design: the master clock visits the time axis in unit steps, at which the events E1, E2, ..., En may occur.]

[Figure 4.21: The unit-time advance design: the master clock is increased by a unit time; if any future event clock equals MC, an event has occurred, appropriate action is taken, and any conditional event(s) are scheduled; the cycle then repeats.]

If no clock is equal to the current value of the master clock, then no event has occurred and no action has to take place. In either case, the master clock is again increased by a unit time and the cycle is repeated. This mode of advancing the simulation through time is depicted in figure 4.20, and the basic approach of the unit-time design is summarized in the flow-chart given in figure 4.21.

[Figure 4.22: A unit-time advance design of a single server queue: at each step MCL = MCL + 1; if ST > 1 then ST = ST - 1, while if ST = 1 then ST = ST - 1 and an end of service has occurred, so appropriate action is taken; similarly, if AT > 1 then AT = AT - 1, while if AT = 1 then AT = AT - 1 and an arrival has occurred, so appropriate action is taken.]

In the flow-chart of the unit-time simulation design, given in figure 4.21, it was implicitly assumed that a future event clock is a variable which, as in the case of the


event-advance design, contains a future time with respect to the origin. That is, it contains the time at which the associated event will occur. Alternatively, a future event clock can simply reflect the duration of a particular activity. For instance, in the machine interference problem, the departure clock would simply contain the duration of a service, rather than the future time at which the service will be completed. In this case, the unit-time design can be modified as follows. Each time the master clock is advanced by a unit of time, the value of each future clock is decreased by a unit of time. If any of these clocks becomes equal to zero, then the associated event has occurred and appropriate action has to take place. Obviously, the way one defines the future event clocks does not affect the unit-time simulation design.

In order to demonstrate the unit-time advance design, we simulate a single queue served by one server. The population of customers is assumed to be infinite. Figure 4.22 gives the flow-chart of the unit-time design. Variable AT contains the inter-arrival time between two successive arrivals, variable ST contains the service time of the customer in service, and variable MCL contains the master clock of the simulation model.
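The loop of figure 4.22 can be sketched in C as follows. This is a hedged illustration: AT and ST here count down the remaining inter-arrival and service durations, deterministic durations replace the generated variates so that the behaviour is easy to check, and, within a tick, service completions are examined before arrivals, as in the flow-chart:

```c
/* Unit-time advance simulation of a single server queue (figure 4.22).
   AT counts down to the next arrival; ST counts down to the next service
   completion, with ST == 0 meaning that the server is idle. */
typedef struct {
    long arrivals;
    long departures;
    long queue;      /* customers in the system, including the one in service */
} SimStats;

SimStats unit_time_queue(int inter_arrival, int service, long total_time)
{
    SimStats s = {0, 0, 0};
    long AT = inter_arrival;
    long ST = 0;
    for (long MCL = 1; MCL <= total_time; MCL++) {
        if (ST > 0 && --ST == 0) {       /* an end of service has occurred */
            s.departures++;
            s.queue--;
            if (s.queue > 0)             /* start serving the next customer */
                ST = service;
        }
        if (--AT == 0) {                 /* an arrival has occurred */
            s.arrivals++;
            s.queue++;
            if (ST == 0)                 /* server was idle */
                ST = service;
            AT = inter_arrival;
        }
    }
    return s;
}
```

In a real model, the two assignments of fresh durations would be replaced by calls to the stochastic variate generators of the previous Chapter.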


Figure 4.23: Events occurring in between two successive values of the master clock.

4.6.1 Selecting a unit time

The unit time is readily obtained in the case where all future event clocks are represented by integer variables, for then each event clock is simply a multiple of the unit time. However, quite frequently future event clocks are represented by real variables. In this case, it is quite likely that an event may occur in between two successive time instants of the master clock, as shown in figure 4.23. Obviously, the exact time of occurrence of an event E is not known to the master clock. In fact, as far as the simulation is concerned, event E


occurs at time t+1 (or t+2 depending upon how the program is set-up to monitor the occurrence of events). This introduces inaccuracies when estimating time parameters related to the occurrence of events. Another complication that might arise is due to the possibility of having multiple events occurring during the same unit of time. In general, a unit time should be small enough so that at most one event occurs during the period of a unit of time. However, if it is too small, the simulation program will spend most of its time in non-productive mode, i.e. advancing the master clock and checking whether an event has occurred or not. Several heuristic and analytic methods have been proposed for choosing a unit time. One can use a simple heuristic rule such as setting the unit time equal to one-half of the smallest variate generated. Alternatively, one can carry out several simulation runs, each with a different unit time, in order to observe its impact on the computed results. For instance, one can start with a small unit time. Then, it can be slightly increased. If it is found to have no effect on the computed results, then it can be further increased, and so on. 4.6.2 Implementation

The main operation related to the processing of the future event list is to compare all the future event clocks against the master clock each time the master clock is increased by a unit time. An implementation using a sequential array as described in section 4.3.1 would suffice in this case. 4.6.3 Event-advance vs. unit-time advance

The unit-time advance method is advantageous in cases where there are many events which occur at times close to each other. In this case, the next event can be selected fairly rapidly provided that an appropriate value for the unit time has been selected. The best case, in fact, would occur when the events are about a unit time from each other. The worst case for the unit-time advance method is when there are few events and they are far apart from each other. In this case, the unit-time advance design will spend a


lot of non-productive time simply advancing the time and checking if an event has occurred. In such cases, the event-advance design is obviously preferable. 4.7 Activity-based simulation design

This particular type of simulation design is activity based rather than event based. In an event oriented simulation, the system is viewed as a world in which events occur that trigger changes in the system. For instance, in the simulation model of the single server queue described in section 4.6, an arrival or a departure will change the status of the system. In an activity oriented simulation, the system modelled is viewed as a collection of activities or processes. For instance, a single server queueing system can be seen as a collection of the following activities: a) inter arriving, b) being served, and c) waiting for service. In an activity based design, one mainly concentrates on the set of conditions that determine when activities start or stop. This design is useful when simulating systems with complex interactive processing patterns, sometimes referred to as machine-oriented models.

Figure 4.24: Time components related to the ith arrival.

We demonstrate this design by setting up an activity-based simulation model of the single server queue problem studied in section 4.6. Let STi and WTi be the service

time and the waiting time, respectively, of the ith arrival. Also, let us assume that the ith arrival occurs at time ai, starts its service at time si, and ends its service at time si + STi, as shown in figure 4.24. Let ATi+1 be the inter-arrival time between the ith and the (i+1)st arrival. Then, one of the following three situations may occur:

1. The (i+1)st arrival occurs during the time that the ith arrival is waiting.
2. The (i+1)st arrival occurs when the ith arrival is in service.
3. The (i+1)st arrival occurs after the ith arrival has departed from the queue.

Let us assume now that we know the waiting time WTi and the service time STi of the ith customer. For each of these three cases, the waiting time WTi+1 of the (i+1)st customer can be easily determined as follows:

a. WTi+1 = (WTi - ATi+1) + STi = (WTi + STi) - ATi+1 = TWi - ATi+1
b. WTi+1 = (WTi + STi) - ATi+1 = TWi - ATi+1
c. WTi+1 = 0

where TWi is the total waiting time in the system of customer i. Having calculated WTi+1, we generate a service time STi+1 for the (i+1)st arrival and an inter-arrival time ATi+2 for the (i+2)nd arrival. The waiting time of the (i+2)nd arrival can then be obtained using the above expressions. The basic mechanism of this activity-based simulation model is depicted in figure 4.25.
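The three cases above collapse into the single recursion WTi+1 = max(0, (WTi + STi) - ATi+1), which can be coded directly. In the sketch below, the inter-arrival and service times are assumed to have been generated beforehand and stored in arrays:

```c
/* Activity-based computation of waiting times for a single server queue.
   AT[i] is the inter-arrival time between arrivals i-1 and i, ST[i] is
   the service time of arrival i, and WT[i] receives its waiting time:
       WT[0]   = 0
       WT[i+1] = max(0, WT[i] + ST[i] - AT[i+1]) */
void waiting_times(const double AT[], const double ST[], double WT[], int n)
{
    WT[0] = 0.0;                   /* the first customer finds an empty system */
    for (int i = 0; i + 1 < n; i++) {
        double TW = WT[i] + ST[i]; /* total time in system of customer i */
        WT[i + 1] = (TW > AT[i + 1]) ? TW - AT[i + 1] : 0.0;
    }
}
```

Note that no event list is needed at all: the simulation advances from arrival to arrival purely by applying the recursion, which is precisely what makes the activity-based design attractive for this system.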

[Figure 4.25: An activity-based simulation design for a single server queue. Initial conditions: empty system, WT = ST = 0, TW = 0. Generate AT and compare TW with AT: if TW > AT, the next arrival occurs while the current arrival is either waiting or in service, so generate ST and set TW = ST + (TW - AT); if TW ≤ AT, the next arrival occurs when the server is idle, so generate ST and set TW = ST.]

4.8 Examples

In this section, we further highlight the simulation designs discussed in this Chapter by presenting two different simulation models.

4.8.1 An inventory system

In an inventory system, one is mainly concerned with making decisions in order to minimize the total cost of the operation. These decisions are mainly related to the quantity of inventory to be acquired (or produced) and the frequency of the acquisitions. The total cost of running an inventory system is made up of different types of costs. Here, we

will consider the following three costs: a) holding cost, b) setup cost, and c) shortage cost. The holding cost is related to the cost of keeping one item of inventory over a period of time. One of the most important components of this cost is that of the invested capital. The setup cost is related to the cost of placing a new order or making changes to production. Finally, the shortage cost is associated with the cost of not having available a unit of inventory when demanded. This cost may be in the form of transportation charges (i.e., expediting deliveries), increased overtime, and loss of future business. During stockout days, orders are backlogged; they are satisfied on the day the new stock arrives.

The fluctuation in the inventory level is shown in figure 4.26. Let It be the inventory at time t, let S be the quantity added in the system between time t and t´, and let D be the demand between these time instances. Then, the inventory at time t´ is It´ = It + S - D. The inventory level is checked at the end of each day. If it is less than or equal to the re-ordering level, then an order is placed. The time it takes for the ordered stock to arrive is known as the lead time, and it begins to count from the following day. Orders arrive in the morning and they can be disposed of during the same day. We assume that the daily demand and the lead time follow known arbitrary distributions.

[Figure 4.26: An inventory system.]
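The daily bookkeeping just described can be sketched as a C function. The deterministic demand and lead time parameters below are assumptions made purely for illustration; the model of figures 4.27 and 4.28 draws them from the given distributions:

```c
/* One-day update of the inventory model; returns the new inventory level.
   *LT is the remaining lead time in days (0 = no order outstanding).
   A negative inventory level represents backlogged demand. */
long inventory_day(long I, long *LT, long Q, long ROP, long lead, long demand)
{
    if (*LT > 0 && --*LT == 0)   /* the outstanding order arrives this morning */
        I += Q;
    I -= demand;                 /* satisfy (or backlog) today's demand        */
    if (I <= ROP && *LT == 0)    /* end-of-day check: place an order if needed */
        *LT = lead;              /* the lead time counts from the next day     */
    return I;
}
```

Running this function once per simulated day, and accumulating the holding, setup, and shortage costs after each call, reproduces the structure of the unit-time design described above.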

The simulation model is described in the flow chart given in figures 4.27 and 4.28. The model keeps track of It on a daily basis. A unit of time is simply equal to one day, and the lead time is expressed in the same unit of time. We note that the unit-time advance design arises naturally in this case; in view of this, the model was developed using the unit-time advance design. The model estimates the total cost of the inventory system for specific values of the reordering point and the quantity ordered.

[Figure 4.27: A unit-time simulation design of an inventory system. After initialization, each day: if an order is outstanding (LT ≥ 1), set LT = LT - 1, and if LT becomes 0 the order has arrived, so I = I + Q; then generate the daily demand D and set I = I - D; when t > T, print the results and stop.]

The basic input parameters to the simulation model are the following: a) ROP, the reordering point; b) Q, the quantity ordered; c) BI, the

The remaining input parameters are: d) the probability distributions of the variables D and LT, representing the daily demand and the lead time, respectively; e) T, the total simulation time; and f) C1, C2, C3, representing the holding cost per unit per unit time, the setup cost per order, and the shortage cost per unit per unit time, respectively. The output parameters are TC1, TC2, TC3, representing the total holding, setup, and shortage costs, respectively.

[Figure 4.28 flow chart: if I < ROP and no order is outstanding (LT = 0), generate LT and charge the setup cost TC2 = TC2 + C2; if I ≥ 0, charge the holding cost TC1 = TC1 + I*C1; otherwise charge the shortage cost TC3 = TC3 + I*C3.]

Figure 4.28: A unit-time simulation design of an inventory system.

4.6.2 A round-robin queue

We consider a client-server system where a number of clients send requests to a

particular server. For simplicity, we assume that a client is a computer operated by a user, and that it is connected to the server over the Internet. A user spends some time thinking (i.e., typing a line or thinking what to do next), upon completion of which it creates a request that is executed at the CPU of the server. Each time a user sends a request, the computer gets blocked; that is, the user cannot do any more processing, nor can it send another request, until it gets an answer back from the server. (These requests are often referred to in the client-server parlance as synchronous requests.) We assume that each user always has work to do, i.e., it never becomes idle. That is, it continuously cycles through a think time and a CPU time, as shown in figure 4.29.

Figure 4.29: Cycling through a think time and a CPU time.

We model this client-server scheme using the queueing system shown in figure 4.30, which operates under the round-robin scheme. The requests are executed by the CPU in a round-robin fashion; that is, each request is allowed to use the CPU for a small quantum of time. If the request is done at the end of this quantum (or during the quantum), then it departs from the CPU. Otherwise, it is simply placed at the end of the CPU queue. In this manner, each request in the CPU queue gets a chance to use the CPU, and short requests get done faster than long ones. A request that leaves the CPU simply goes back to the originating computer, and at that instance the user at the computer goes into a think state. The think time is typically significantly longer than the duration of a quantum. For instance, the mean think time could be 30 seconds, whereas a quantum could be less than 1 msec. A request, in general, would require many CPU quanta. If a request requires 5 seconds of CPU time, and a quantum is 1 msec, then it would cycle through the CPU queue 5000 times. (We note that we only model the user's think time and the CPU queue.)

Figure 4.30: A round-robin queue.

This system is similar to the machine interference problem described in section 1.3.1 of Chapter 1. A client in the think state is like a machine being operational; the clients are the machines and the CPU is the repairman. The main difference from the machine interference problem is that the CPU queue is served in a round-robin fashion rather than in a FIFO manner. The basic events associated with this system are: a) the arrival of a request at the CPU queue, and b) a service completion at the CPU. A request arriving at the CPU queue may be either a new arrival (i.e., a user sends a request, thus ending the think state), or it may be a request that has just received a quantum of CPU time and requires further processing. Likewise, a departing request from the CPU may either go back to the originating client, or it may simply join the end of the CPU queue for further execution.

Following the same approach as in the machine interference problem, we can easily develop an event-advance simulation model. The future event list will contain one event associated with a departure from the CPU, and the remaining events will be associated with future new arrivals of requests. We observe that during the time that the CPU is busy, departure events occur every quantum of time. When a departure occurs, most likely a new departure event will be scheduled to occur in the next quantum of time, and this new event will more likely be the next event to occur. On the other hand, new arrivals of requests at the CPU queue occur at time instances which may be a few hundreds of quanta apart. In view of this, the event-advance design is quite efficient most of the time.

Specifically, all the events related to new arrivals at the CPU queue are kept in a linked list, as shown in figure 4.31. Each node contains a future event time and a client identification number. The event time simply shows the time at which the user will stop thinking and will access the CPU. Nodes are ordered in ascending order of the data element that contains the future event time. Thus, the next new arrival event is given by the first node of the linked list.

Figure 4.31: Future event list of all new arrivals to the CPU queue.

All the information regarding the requests in the CPU queue, including the one in service, is maintained in a separate linked list, shown in figure 4.32. Each node contains the number of quanta required by a request and its terminal identification number. The nodes are ordered in the same way that the requests are kept in the CPU queue. Thus, the first node corresponds to the request currently in service. Pointers B and E point to the beginning and the end of the list, respectively. When a new request arrives at the CPU queue, a new node is created which is attached after node E. If a request requires further service upon completion of its quantum, then its node is simply placed at the end of the list. This linked list is known as a circular singly linked list, since it is singly linked and the last node is linked to the first node. Moving the node in service to the end of the list is achieved by simply setting E←B and B←LINK(B). Such insertions can be done in O(1) time.

Figure 4.32: The linked list for the CPU queue.

We now give an alternative simulation design of the round-robin queue which utilizes both the event-advance and the unit-time advance designs!

The simulation model operates under the unit-time advance design during the period of time that the CPU is busy. During the time that the CPU is idle, the simulation model switches to an event-advance design. This hybrid design is summarized in figure 4.33. Note that tarr gives the time of the next new arrival at the CPU queue.

[Figure 4.33 flow chart: if the CPU queue is empty, set MCL = tarr and a new arrival event occurs; otherwise set MCL = MCL + 1 and a departure event occurs, and if MCL = tarr a new arrival event also occurs.]

Figure 4.33: Hybrid simulation design of the round-robin queue.

The remaining details of this program are left up to the reader as an exercise!

Problems

Consider the following systems:
a. Checkout stands at a supermarket

b. Runways at an airport
c. Teller's window at a bank
d. Elevators serving an office building
e. Traffic lights in a configuration of 8 city blocks
f. Outpatient clinic
g. Pumps at a gasoline station
h. Parking lot
i. A telecommunication system with 5 nodes
j. Solar heating of a house

Choose any of the above systems. First, describe how the system operates. (Make your own assumptions whenever necessary. Make sure that these assumptions do not render the system trivial!) Then, set up a simulation model to represent the operations of the system. If you choose an event-based simulation design, state clearly which are the state variables and what are the events. For each caused event, state clearly what action the simulation model will take. If you choose an activity-based design, state clearly which are the activities and under what conditions they are initiated and terminated.

Computer assignments

1. Implement the hybrid simulation model of the round-robin queue discussed in section 4.6.2.

2. Consider the machine interference problem. Assume that the queue of broken-down machines can be repaired by more than one server (i.e., there is more than one repairman repairing machines off the same queue). Modify your simulation model so that the event list is maintained in the form of a linked list. Parametrize your program so that it can run for any number of machines and repairmen (maximum 20 and 10, respectively). Run your simulation model until 20 repairs have been completed. As before, each time an event occurs, print out the usual line of output and also additional information pertaining to the event list. Check by hand that your linked list implementation is correct.

3. Consider the token-based access scheme problem. Assume that transmissions are not error-free. That is, when a packet arrives at the destination node, it may contain errors. There is a 0.01 probability that the packet has been transmitted erroneously. The procedure is as follows: upon completion of the transmission of a packet, the host will wait to hear from the receiving host whether the packet has been received correctly or not. The time for the receiver to notify the sender may be assumed to be constant. If the packet has been correctly received, the host will proceed with the next transmission. If the packet has been erroneously transmitted, the packet will have to be re-transmitted; the sender will re-transmit the packet immediately. This procedure will be repeated until the packet is correctly transmitted. No more than 5 re-transmissions will be attempted; after the 5th re-transmission, the packet will be discarded. That is, the host will carry on re-transmitting until either the packet is transmitted correctly or the packet is discarded. All these re-transmissions take place while the host has the token. Describe how you will modify your simulation model in order to accommodate the above acknowledgement scheme. Can this re-transmission scheme be accommodated without introducing more events? If yes, how? If no, what additional events need to be introduced? Describe what action will be taken each time one of these additional events takes place. Also, describe how these new events will interact with the existing events (i.e., triggering-off each other).

4. Consider the token-based access scheme problem. Modify the simulation design in order to take advantage of the structure of the system. Specifically, do not generate arrivals to each of the nodes. When the token's time-out occurs, or when the node surrenders the token, store the residual inter-arrival time. Then, when the token comes back to the station, generate arrivals (starting from where you stopped last time) until the token times-out or it is

surrendered by the node. This change leads to a considerably simpler simulation model.

Appendix A: Implementing a priority queue using a heap

A priority queue is an interesting data structure used for storing events, and it can be used to implement a future event list. It supports two important operations: insertion of a new event and removal of an event. These operations are done much faster than when we use arrays or linked lists. The highest priority event can be obtained from a priority queue without performing a time-intensive search, as is done in arrays and linked lists. This is because the events in the priority queue are organized in such a way that the highest priority event will always be stored at a specific location in its internal data structure. Thus, simulations with a large number of events will run much faster if the future event list is implemented as a priority queue. A priority queue is a binary tree called a heap, with functions to insert a new event and remove the highest priority event. (We note that the heap algorithms were taken from Chapter 6 of “Introduction to Algorithms” by T. H. Cormen et al.) In the next section we explain the heap data structure.

A.1 A heap

A heap is a binary tree data structure with the following important properties:

1. The binary tree is filled completely at each level, except possibly the last level. At the last level the elements are filled from left to right. Such trees are called complete binary trees.
2. The value of a child element is always greater than or equal to the value of its parent. This property is also called the heap property. It ensures that the value of the root element is the smallest. Such heaps are called min-heaps. Alternatively, a max-heap is a structure where the value of a child node is always smaller than or equal to that of its parent. Thus, for a max-heap the root element will always be the one with the largest value.
Whenever an event is added to the priority queue certain operations are performed in order to maintain this property.

In this section we will only deal with min-heaps. Three different examples of such heaps are shown in figure A1, of which only (a) satisfies both the properties of a heap. The indices 1 through 6 represent the six consecutive array locations, and the values 2, 3, 5, 9, 6 and 7 are the data elements stored in the nodes of the tree. The data structure (b) does not satisfy the heap property, since the element containing 6 has a value greater than its right child, which has value 3. The data structure (c) does not satisfy the property of complete binary trees, since the second level is not completely filled. In order for (c) to be a heap, either the element with value 9 or the element with value 6 must be the child of the root element, which has value 2.

Figure A1: Examples of valid and invalid heaps.

A.1.1 Implementing a heap using an array

The binary tree of a heap can be efficiently stored using an array of n sequential locations. If we number the nodes from 1 at the root and place:
• the left child of node i at 2i
• the right child of node i at 2i + 1
then the elements of the heap can be stored in consecutive locations of an array. An example of this is shown in figure A2. Observe that the root element is stored at location 1 of the array. The left child of the root element is at

location 2, i.e., 2 × 1, and the right child at location 3, i.e., 2 × 1 + 1. Each node shares this relationship with its left and right child.

Figure A2: Storing a heap in an array.

The indices of the parent, left child and right child of the node with index i can be computed as follows:

PARENT(i)
    return (floor(i/2))
LEFT(i)
    return (2i)
RIGHT(i)
    return (2i + 1)

a. Maintaining the heap property

In order to maintain the heap property, the elements of a heap must be rearranged by calling the HEAPIFY routine. This is a recursive routine called on a node that violates the heap property, i.e., its value is bigger than one of its children's values. When HEAPIFY is called on a node, it is assumed that the trees sub-rooted at its left and right children already satisfy the heap property. All through our discussion, our first call to HEAPIFY will be to the root node of the heap; the recursive calls will be made to nodes at lower levels. The violating element moves down in the tree in each of the recursive calls, until it finds a place where the heap property is satisfied.
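With 1-based indexing, these three index computations are one-liners. A small sketch (illustrative only; Python's own heapq uses 0-based indices instead):

```python
def parent(i):
    return i // 2      # floor(i/2)

def left(i):
    return 2 * i

def right(i):
    return 2 * i + 1

# For the root (i = 1): children at locations 2 and 3, as in figure A2.
print(parent(3), left(1), right(1))   # 1 2 3
```

Integer division gives the floor directly, so no explicit floor function is needed.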

In order to understand how HEAPIFY works, we consider the example given in figures A.3, A.4 and A.5. Let A be the name of the array representing the heap. In figure A.3, the violating node is the root node (array index 1). Observe that the trees sub-rooted at its left and right children satisfy the heap property. The first call to HEAPIFY will be HEAPIFY(A, 1). In this call, A[1] is swapped with min(A[2], A[3]), which in this example is A[3]. The violating element with value 7 moves to A[3], but it still violates the heap property in its new location, as shown in figure A.4. Thus, HEAPIFY(A, 3) is called recursively and A[3] is swapped with A[6]. Now the element with value 7 no longer violates the heap property, as shown in figure A.5.

Figure A.3: The violating node is the root.
Figure A.4: The node with value 7 still violates the heap property.
Figure A.5: The node with value 7 no longer violates the heap property.

The HEAPIFY algorithm

HEAPIFY (A, i)
1.  left ← LEFT(i)
2.  right ← RIGHT(i)
3.  if (left ≤ heap-size[A]) and (A[left] < A[i])
4.      smallest ← left
5.  else
6.      smallest ← i
7.  if (right ≤ heap-size[A]) and (A[right] < A[smallest])
8.      smallest ← right
9.  if (smallest ≠ i)
10.     SWAP(A[i], A[smallest])
11.     HEAPIFY (A, smallest)

In lines 1 and 2, LEFT(i) and RIGHT(i) are obtained using the method described above. In lines 3 to 8, the index of the smallest of A[left], A[right] and A[i] is stored in the variable smallest. In lines 9 to 11, if i was not the smallest, then the values A[i] and A[smallest] are swapped and HEAPIFY is called recursively on the node to which the violating element was moved.
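A direct transcription of HEAPIFY into Python might look as follows. It keeps the pseudocode's 1-based indexing by leaving A[0] unused; this is a sketch that mirrors the pseudocode, not Python's built-in heapq.

```python
def heapify(A, i, heap_size):
    """Sift the violating element at index i down until the min-heap
    property holds again. A is 1-based: A[0] is unused."""
    left, right = 2 * i, 2 * i + 1
    smallest = i
    if left <= heap_size and A[left] < A[smallest]:
        smallest = left
    if right <= heap_size and A[right] < A[smallest]:
        smallest = right
    if smallest != i:
        A[i], A[smallest] = A[smallest], A[i]   # SWAP(A[i], A[smallest])
        heapify(A, smallest, heap_size)

# The example of figures A.3-A.5: the 7 at the root violates the heap property.
A = [None, 7, 3, 2, 9, 6, 5]
heapify(A, 1, 6)
print(A[1:])   # [2, 3, 5, 9, 6, 7]
```

The 7 first swaps with the 2 at index 3 and then with the 5 at index 6, exactly as in the figures.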

A.2 A priority queue using a heap

As mentioned earlier, a priority queue is a heap with sub-routines to insert a new event and remove the highest priority event. HEAP-INSERT is used to insert a new event, and EXTRACT-MIN is used to remove the event with the highest priority from the heap. Below, we describe each sub-routine and highlight its operation through an example. Again, let A be the array representing the heap.

HEAP-INSERT (A, key)
1.  heap-size[A] ← heap-size[A] + 1
2.  A[heap-size[A]] ← key
3.  i ← heap-size[A]
4.  while (i > 1) and (A[PARENT(i)] > A[i])
5.      SWAP(A[i], A[PARENT(i)])
6.      i ← PARENT(i)

The newly inserted node is compared with its parent, and since the parent has a greater value than the new node, the parent will move down while the new node moves into the place of its parent. As an example, let us consider a heap with six nodes numbered 1 through 6. The heap and its array representation are shown in figure A6. Now, if we want to insert a new element with value 4 into this heap, we would call HEAP-INSERT with key equal to 4. The newly inserted node is compared with its parent, as shown in figure A7, and since the parent has a greater value, the two switch places, as shown in figure A8.

Figure A6: The heap and its array representation.
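HEAP-INSERT can be transcribed the same way (again 1-based, with A[0] unused; a sketch rather than production code):

```python
def heap_insert(A, key):
    """Append key and sift it up until its parent is no larger."""
    A.append(key)               # A is 1-based: A[0] is unused
    i = len(A) - 1              # index of the newly inserted element
    while i > 1 and A[i // 2] > A[i]:
        A[i], A[i // 2] = A[i // 2], A[i]   # SWAP with PARENT(i)
        i //= 2

# The example of figures A6-A8: insert 4 into the heap 2 3 5 9 6 7.
A = [None, 2, 3, 5, 9, 6, 7]
heap_insert(A, 4)
print(A[1:])   # [2, 3, 4, 9, 6, 7, 5]
```

The 4 lands at index 7, swaps with its parent 5 at index 3, and stops because its new parent 2 is smaller.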

Figure A7: The new node is inserted in the heap and it is compared to its parent.
Figure A8: The parent node and the new node switch places.

EXTRACT-MIN (A)
1.  if (heap-size[A] < 1)
2.      error “No elements in the heap”
3.  min ← A[1]
4.  A[1] ← A[heap-size[A]]
5.  heap-size[A] ← heap-size[A] – 1
6.  HEAPIFY (A, 1)
7.  return (min)

The minimum is always the root element of the heap. Let us perform EXTRACT-MIN on the heap obtained after the insert operation in the previous example. The call will be made as EXTRACT-MIN(A). As shown in figure A9, the root element has a value of 2. The last

element A[heap-size[A]] is moved to the root node A[1], and the heap-size is reduced by one; see figure A10. HEAPIFY(A, 1) is now called, and the final result is shown in figure A11.

Figure A9: The minimum is always the root element.
Figure A10: The last element A[heap-size[A]] is moved to the root node A[1].
Figure A11: The heap now satisfies the heap property.
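EXTRACT-MIN combines the two steps above. The sketch below is self-contained, with HEAPIFY written as an iterative sift-down inside the function; it keeps the same 1-based convention (A[0] unused).

```python
def extract_min(A):
    """Remove and return the root of a 1-based min-heap A (A[0] unused)."""
    if len(A) < 2:
        raise IndexError("No elements in the heap")
    root = A[1]
    A[1] = A[-1]          # move the last element to the root
    A.pop()               # heap-size <- heap-size - 1
    i, n = 1, len(A) - 1
    while True:           # iterative HEAPIFY(A, 1)
        left, right, smallest = 2 * i, 2 * i + 1, i
        if left <= n and A[left] < A[smallest]:
            smallest = left
        if right <= n and A[right] < A[smallest]:
            smallest = right
        if smallest == i:
            return root
        A[i], A[smallest] = A[smallest], A[i]
        i = smallest

# The example of figures A9-A11: extract the 2 from the heap 2 3 4 9 6 7 5.
A = [None, 2, 3, 4, 9, 6, 7, 5]
print(extract_min(A), A[1:])   # 2 [3, 5, 4, 9, 6, 7]
```

The last element, 5, is promoted to the root and then sifted down one level, restoring the heap property.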

A.3 Time complexity of priority queues

So far we have discussed how to implement a priority queue as a heap, and how to insert a new event and remove the highest priority event. In this section, we show that a priority queue runs much faster than arrays and linked lists. The running time (called the algorithmic time complexity) of an algorithm is proportional to the number of comparisons performed by the algorithm in order to achieve its objective. The efficiency of various algorithms is compared based on their time complexity. The two basic operations performed on arrays, linked lists and priority queues are: insert a new event, and remove the next event. In table A.1, we approximate the maximum number of comparisons done when these two operations are performed using each of the three data structures. We assume that we already have n events in the data structure before either operation is performed.

Insert new event:
• Array: store the new event at the first available location. Maximum comparisons = 0.
• Linked list: search for the correct position where the new event must be inserted. Maximum comparisons = n.
• Priority queue: use HEAP-INSERT, which searches for the correct position to insert the event, from the lowest level up to the root level in the worst case. Maximum comparisons = height of the tree* = log2(n).

Get/Remove next event:
• Array: search the entire array to get the event whose execution time is least. Maximum comparisons = n.
• Linked list: remove the first event. Maximum comparisons = 0.
• Priority queue: use EXTRACT-MIN, which removes the root element, moves the last element to the root, and runs HEAPIFY on the root node. HEAPIFY performs two comparisons (one with the left and one with the right child) and calls itself recursively until the correct position for the violating element is found. The maximum number of times this procedure will be called recursively equals the height of the binary tree. Maximum comparisons = 2 × height of the tree = 2 log2(n).

Table A.1: Time complexity comparisons.

*If a heap has n elements, then its height is equal to log2(n).

A.4 Implementing a future event list as a priority queue

An event should be defined as a structure or an object, with one of its attributes being the clock associated with the occurrence of this event. All comparisons between the events of the heap should be made by comparing this attribute. In the examples above, the values used in the nodes of the binary trees or in the arrays can be seen as the clock attribute of the event objects. An array of such objects can then be declared to hold all the events. The size of the array must be big enough to hold the maximum number of events at any point during the simulation. The size of the array must be declared at the beginning of the simulation, which means that the maximum required size of the array must be determined in advance. Also, whenever an element is moved to another location in the array, the entire event object must be moved. Alternatively, one can allocate memory for events dynamically. For instance, in C one can use the malloc and realloc functions, whereas in Java one can use an ArrayList or a Vector. Finally, almost all high-level programming languages have a library class for the priority queue. Thus, you can directly use the library without implementing it on your own.
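In Python, for instance, the standard heapq module provides exactly such a min-heap, and storing (clock, event) tuples makes the clock the comparison attribute. A minimal sketch:

```python
import heapq

# Future event list as a min-heap of (clock, event-type) tuples.
# The first field of the tuple plays the role of the clock attribute.
fel = []
heapq.heappush(fel, (12.4, "departure"))
heapq.heappush(fel, (3.7, "arrival"))
heapq.heappush(fel, (9.1, "arrival"))

clock, event = heapq.heappop(fel)   # always the event with the smallest clock
print(clock, event)                 # 3.7 arrival
```

Because the heap grows and shrinks dynamically, no maximum size needs to be declared in advance.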


CHAPTER 5: ESTIMATION TECHNIQUES FOR ANALYZING ENDOGENOUSLY CREATED DATA

5.1 Introduction

So far we have examined techniques for building a simulation model. These techniques were centered around the topics of random number generation and simulation design. Using the techniques outlined in the previous Chapters, one can construct, say, an event simulation model of a system. This model simply keeps track of the state of the system as it changes through time. The reason why one develops a simulation model is because one needs to estimate various performance measures. These measures are obtained by collecting and analyzing endogenously created data. In this Chapter, we will examine various estimation techniques that are commonly used in simulation. Before we proceed to discuss these techniques, we will first discuss briefly how one can collect data generated by a simulation program.

5.2 Collecting endogenously created data

A simulation model can be seen as a reconstruction of the system under investigation. Within this reconstructed environment, one can collect data pertaining to parameters of interest. This is similar to collecting actual data pertaining to parameters of interest in a real-life system. That is, one can incorporate additional logic to the simulation program in order to collect various statistics of interest, such as the frequency of occurrence of a particular activity, or the duration of an activity. These

statistics are obtained using data generated from within the simulation program, known as endogenous data. Endogenous data can sometimes be collected by simply introducing into the simulation program single variables, acting as counters. However, quite frequently one is interested in the duration of an activity. In such cases, one needs to introduce a storage scheme where the relevant endogenously created data can be stored. The form of the storage scheme depends, of course, upon the nature of the problem.

For instance, in the machine interference problem one may be interested in the down time of a machine. This is equal to the time the machine spends queueing up for the repairman plus the duration of its repair. The down time of a machine can be obtained by keeping the following information: a) the time of arrival at the repairman's queue, and b) the time at which the repair was completed. This information can be kept in an array. At the end of the simulation run, the array will simply contain the arrival and departure times for all the simulated breakdowns. The down time for each breakdown can be easily calculated. Using this information one can then obtain various statistics such as the mean, the standard deviation, and percentiles of the down time.

A more efficient method would be to maintain a linked list representing those machines which are in the repairman's queue. The nodes are linked so as to represent the FIFO manner in which the machines are served. This linked list is shown in figure 5.1. Each node contains the following two data elements: a) the time of arrival at the repairman's queue, and b) the index number of the machine. The first node, pointed to by F, represents the machine currently in service. If a machine arrives at the repairman's queue, a new node will be appended after the last node, pointed to by B. The total down time of a machine is calculated at the instance when the machine departs from the repairman.

Figure 5.1: The linked list for the repairman's queue.


This is equal to the master clock's value at that instance minus its arrival time. In this fashion we can obtain a sample of observations, where each observation is the duration of a down time. Such a sample of observations can be analyzed statistically.

Another statistic of interest is the probability distribution of the number of broken-down machines. In this case, the maximum number of broken-down machines will not exceed m, the total number of machines. In view of this, it suffices to maintain an array with m + 1 locations. Location i will contain the total time during which there were i broken-down machines. Each time an arrival or a departure occurs, the appropriate location of the array is updated. At the end of the simulation run, the probability p(n) that there are n machines down is obtained by dividing the contents of the nth location by T, the total simulation time.

As another example of how to collect endogenously created data, let us consider the simulation model of the token-based access scheme. The simulation program can be enhanced so that each node is associated with a two-dimensional array, as shown in figure 5.2. In each array, the first column contains the arrival times of packets, and the second column contains their departure times. When a packet arrives at the node, its arrival time is stored in the next available location in the first column. When the packet departs from the node, its departure time is stored in the corresponding location of the second column. Thus, each array contains the arrival and departure times of all packets that have been through the node. Also, it contains the arrival times of all the packets currently waiting in the queue. Instead of keeping two columns per node, one can keep one column. When a packet arrives, its arrival time is stored in the next available location. Upon departure of the packet, its arrival time is substituted by its total time in the system.
These arrays can be processed later in order to obtain various statistics per node. An alternative approach is to maintain a linked list per node, as described above for the machine interference problem. Each time a packet is transmitted, its total time is calculated by subtracting its arrival time, stored in the linked list, from the current master clock value. This value is then stored into a single array containing the durations of all the packets, in the order in which they departed.
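The single-array variant described above can be sketched as follows; the variable names and the way the master clock is passed in are illustrative assumptions.

```python
# Collect one down-time (or packet-delay) observation per departure:
# on departure, subtract the stored arrival time from the master clock.
down_times = []                     # one entry per completed repair/departure

def on_departure(mcl, arrival_time):
    down_times.append(mcl - arrival_time)

on_departure(mcl=17.0, arrival_time=12.5)
on_departure(mcl=21.0, arrival_time=17.0)

n = len(down_times)
mean = sum(down_times) / n
print(n, mean)                      # 2 4.25
```

At the end of the run, the list holds the complete sample of observations, ready for the statistical analysis discussed below.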


[Figure 5.2 shows, for nodes 1, 2 and 3, a two-dimensional array with columns Arr and Dep; pointers mark the top of the queue, the current queue, and the bottom of the queue.]

Figure 5.2: Data structure for the token-based access scheme simulation model.

5.3 Transient state vs. steady-state simulation

In general, a simulation model can be used to estimate a parameter of interest during the transient state or the steady state. Let us consider the machine interference problem, and let us assume that one is interested in obtaining statistics pertaining to the number of broken-down machines. The simulation starts by assuming that the system at time zero is at a given state. This is known as the initial condition. Evidently, the behaviour of the system will be affected by the particular initial condition. However, if we let the simulation run for a long period, its statistical behaviour will eventually become independent of the particular initial condition. In general, the initial condition will affect the behavior of the system for an initial period of time, say T. Thereafter, the simulation will behave statistically in the same way whatever the initial condition. During this initial period T, the simulated system is said to be in a transient state. After period T is over, the simulated system is said to be in a steady state.

5.3.1 Transient-state simulation

One may be interested in studying the behavior of a system during its transient state. In


this case, one is mostly interested in analyzing problems associated with a specific initial starting condition. This arises, for example, if we want to study the initial operation of a new plant. Also, one may be forced to study the transient state of a system if the system does not have a steady state. Such a case may arise when the system under study is constantly changing.

5.3.2 Steady-state simulation

Typically, a simulation model is used to study the steady-state behaviour of a system. In this case, the simulation model has to run long enough so as to get away from the transient state. There are two basic strategies for choosing the initial conditions. The first strategy is to begin with an empty system. That is, we assume that there are no activities going on in the system at the beginning of the simulation. The second strategy is to make the initial condition as representative as possible of the typical states the system might find itself in. This reduces the duration of the transient period. However, in order to set the initial conditions properly, an a priori knowledge of the system is required.

One should be careful about the effects of the transient period when collecting endogenously created data, for the data created during the transient period are dependent on the initial condition. Two methods are commonly used to remove the effects of the transient period. The first one requires a very long simulation run, so that the amount of data collected during the transient period is insignificant relative to the amount of data collected during the steady-state period. The second method simply requires that no data collection is carried out during the transient period. This can be easily implemented as follows. Run the simulation model until it reaches its steady state, and then clear all statistical accumulations (while leaving the state of the simulated system intact!).
Continue to simulate until a sufficient number of observations have been obtained. These observations have all been collected during the steady-state. The second method is easy to implement and it is quite popular. The problem of determining when the simulation system has reached its steady state is a difficult one. A simple method involves trying out different transient periods T1,T2,T3,...,Tk, where T1<T2<T3<...<Tk. Compile steady-state statistics for each

Another similar method requires the computation of a moving average of the output; steady state is assumed to have set in when the moving average no longer changes significantly over time.

5.4 Estimation techniques for steady-state simulation

Most of the performance measures that one would like to estimate through simulation are related to the probability distribution of an endogenously created random variable. The most commonly sought measures are the mean and the standard deviation of a random variable. Also of importance is the estimation of percentiles of the probability distribution of an endogenously created random variable. For instance, in the machine interference problem one may be interested in the distribution of the down time of a machine, though one may settle for its mean and standard deviation. From the management point of view, percentiles can be very useful too. In particular, one may be interested in the 95th percentile of the down time, that is, the value such that only 5% of the down times are greater than it. Percentiles are often more meaningful to management than the mean down time.

5.4.1 Estimation of the confidence interval of the mean of a random variable

Let x1, x2, ..., xn be n consecutive endogenously obtained observations of a random variable. Then

x̄ = (1/n) Σ_{i=1}^{n} xi     (5.1)

is an unbiased estimate of the true population mean, i.e. of the expectation of the random variable. In order to obtain the confidence interval for the sample mean, we first have to estimate the standard deviation.

If the observations x1, x2, ..., xn are independent of each other, then

s² = (1/(n−1)) Σ_{i=1}^{n} (xi − x̄)²     (5.2)

or, using the short-cut formula,

s² = (1/(n−1)) [ Σ_{i=1}^{n} xi² − (Σ_{i=1}^{n} xi)² / n ],

and we obtain the confidence interval

( x̄ − 1.96 s/√n , x̄ + 1.96 s/√n )

at 95% confidence. The confidence interval provides an indication of the error associated with the sample mean. It is a very useful statistical tool and it should always be computed; unfortunately, quite frequently it is ignored. The confidence interval tells us that the true population mean lies within the interval 95% of the time. That is, if we were to repeat the above experiment 100 times, then on the average the true population mean would lie within the interval 95 of these times.

The theory behind the confidence interval is very simple. The observations x1, x2, ..., xn are assumed to come from a population, known as the parent population, whose mean µ we are trying to estimate. Let σ² be the variance of the parent population. The distribution that x̄ follows is known as the sampling distribution, and using the Central Limit Theorem we have that x̄ follows the normal distribution N(µ, σ/√n). Now, let us fix two points a and b in this distribution so that 95% of the observations (an observation here being a sample mean x̄) fall in between the two points, as shown in figure 5.3.

i.95 s . one can use in its place the sample standard deviation s. points a and b are calculated from the table of the standard normal distribution. The areas (-∞.96 σ n ) € € € € 95% of the time.122 Computer Simulation Techniques symmetrical around µ.e. The value 95% is known as the confidence.3: The normal distribution. 95% and 90%. x + 1.96σ/ n .96σ/ n .t. µ will lie in the€ interval € x.1. That is. this observation will lie in the interval [a. but points a and b will be obtained using the t distribution.e. Figure 5. If the sample size is small (less than 30). Most typical confidence values are 99%.96 standard deviation below µ.96σ/ n 95% of the time. +∞) account for 5% of the total distribution. For each value. i. ( x . Using the table of the standard normal distribution. its distance from µ will be less than 1. x + t.b] 95% of € € the time. In general. If σ is not known. a=µ-1..96 σ n .. a confidence interval can be calculated for any value of confidence. a) and (b. ( x . if we consider an arbitrary observation Therefore. Likewise. we have that a is 1. then we can construct similar confidence intervals. b=µ+1.95 s ) n n € € . Now.

In general, the observations x1, x2, ..., xn that one obtains endogenously from a simulation model are correlated. For instance, the down time of a machine depends on the down time of the machine that was ahead of it in the repairman's queue. The correct procedure, therefore, for obtaining the confidence interval of the sample mean is to first check whether the observations x1, x2, ..., xn are correlated. If they are not, one can proceed as described above. If the observations are correlated, then one has to use a special procedure to get around this problem: in the presence of correlated observations, expression (5.1) for the mean still holds, but expression (5.2) for the variance does not. Below, we discuss the following four procedures for estimating the variance of correlated observations:

a. Estimation of the autocorrelation coefficients.
b. Batch means.
c. Replications.
d. Regenerative method.

a. Estimation of the autocorrelation coefficients

Let X and Y be two random variables, let µX and µY be the expectations of X and Y respectively, and let σX² and σY² be their variances. We define the covariance between X and Y to be

Cov(X, Y) = E[(X − µX)(Y − µY)] = E(XY) − µXµY.

This statistic reflects the dependency between X and Y. If X and Y are uncorrelated, then Cov(X, Y) = 0. If Cov(X, Y) > 0, then X and Y are positively correlated, and if Cov(X, Y) < 0, then they are negatively correlated. If Y is identical to X, then Cov(X, X) = σX².

Figure 5.4: The three cases of correlation: (i) uncorrelated, Cov(X, Y) = 0; (ii) positive correlation, Cov(X, Y) > 0; (iii) negative correlation, Cov(X, Y) < 0.
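The covariance and correlation coefficient defined above can be estimated from n paired observations as in the following sketch (not from the book; the function names are hypothetical):

```python
# Sketch: estimating Cov(X,Y) and the correlation coefficient rho_XY
# from paired observations, using the definitions in the text.
def covariance(xs, ys):
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    # rho_XY = Cov(X,Y) / (sigma_X * sigma_Y); dimensionless, in [-1, 1]
    sx = covariance(xs, xs) ** 0.5
    sy = covariance(ys, ys) ** 0.5
    return covariance(xs, ys) / (sx * sy)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # ys = 2*xs: perfectly positively correlated
print(correlation(xs, ys))  # close to 1.0
```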

The Cov(X, Y) may take values in the region (−∞, +∞). Also, it is not dimensionless, which makes its interpretation troublesome. In view of this, the correlation ρXY, defined by

ρXY = Cov(X, Y) / (σX σY),

is typically used as the measure of dependency between X and Y. It can be shown that −1 ≤ ρXY ≤ 1. If ρXY is close to 1, then X and Y are highly positively correlated; if ρXY is close to −1, then they are highly negatively correlated; and if ρXY = 0, then they are uncorrelated. Let us assume that we have obtained actual observations of the random variables X and Y in the form of pairs (x, y). Then the scatter diagram can be plotted out. The scatter diagrams given in figure 5.4 show the three cases of correlation between X and Y mentioned above.

Now, let us assume we have n observations x1, x2, ..., xn. We form the following n−1 pairs of observations: (x1, x2), (x2, x3), (x3, x4), ..., (xi, xi+1), ..., (xn−1, xn). Let us regard the first observation in each pair as coming from a variable X and the second observation as coming from a variable Y. Then, in this case, ρXY is called the autocorrelation or the serial correlation coefficient. It can be estimated as follows:

r1 = Σ_{i=1}^{n−1} (xi − X̄)(xi+1 − Ȳ) / √[ Σ_{i=1}^{n−1} (xi − X̄)² Σ_{i=1}^{n−1} (xi+1 − Ȳ)² ],

where X̄ = (1/(n−1)) Σ_{i=1}^{n−1} xi and Ȳ = (1/(n−1)) Σ_{i=2}^{n} xi. For n reasonably large, ρXY can be approximated by

r1 = Σ_{i=1}^{n−1} (xi − X̄)(xi+1 − X̄) / Σ_{i=1}^{n} (xi − X̄)²,

where X̄ = (1/n) Σ_{i=1}^{n} xi is the overall mean. We refer to the above estimate of ρXY as r1 in order to remind ourselves that this is the correlation between observations which are a distance of 1 apart; this autocorrelation is often referred to as the lag 1 autocorrelation. In a similar fashion, we can obtain the lag k autocorrelation, that is, the correlation between observations which are a distance k apart. This can be calculated using the expression

rk = Σ_{i=1}^{n−k} (xi − X̄)(xi+k − X̄) / Σ_{i=1}^{n} (xi − X̄)².

In practice, the autocorrelation coefficients are usually calculated by computing the series of autocovariances R0, R1, ..., where Rk is given by the formula

Rk = (1/n) Σ_{i=1}^{n−k} (xi − X̄)(xi+k − X̄).     (5.3)

We then compute rk as the ratio

rk = Rk / R0,     (5.4)

where R0 is the sample variance. Typically, rk is not calculated for values of k greater than about n/4.
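The computation of the rk through the autocovariances, as in expressions (5.3) and (5.4), can be sketched as follows (an illustrative sketch, not the book's code; any high-level language can be used):

```python
# Sketch: lag-k autocorrelation coefficients r_k computed through the
# autocovariances R_k, following expressions (5.3) and (5.4).
def autocorrelations(x, max_lag=None):
    n = len(x)
    if max_lag is None:
        max_lag = n // 4          # r_k is rarely needed beyond about n/4
    mean = sum(x) / n
    def R(k):                     # autocovariance at lag k, expression (5.3)
        return sum((x[i] - mean) * (x[i + k] - mean) for i in range(n - k)) / n
    r0 = R(0)                     # R_0 is the sample variance
    return [R(k) / r0 for k in range(1, max_lag + 1)]   # expression (5.4)

data = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0]
print(autocorrelations(data))  # an alternating series gives a negative r_1
```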

A useful aid in interpreting the autocorrelation coefficients is the correlogram. This is a graph in which rk is plotted against the lag k. If the time series exhibits a short-term correlation, then its correlogram will be similar to the one shown in figure 5.5. This is characterized by a fairly large value of r1, followed by 2 or 3 more coefficients which, while significantly greater than zero, tend to get successively smaller. Values of rk for longer lags tend to be approximately zero.

Figure 5.5: A correlogram with short-term correlation (rk plotted against lag k).

If the time series has a tendency to alternate, with successive observations on different sides of the overall mean, then the correlogram also tends to alternate, as shown in figure 5.6.

Figure 5.6: An alternating correlogram.

Let us now return to our estimation problem. Having obtained a sample of n observations x1, x2, ..., xn, we calculate the autocorrelation coefficients using expressions (5.3) and (5.4). Then, the variance can be estimated using the expression

s² = sX² [ 1 + 2 Σ_{k=1}^{n/4} (1 − k/n) rk ],

where sX² is the sample variance,

sX² = (1/(n−1)) Σ_{i=1}^{n} (xi − X̄)², with X̄ = (1/n) Σ_{i=1}^{n} xi.

b. Batch means

This is a fairly popular technique. It involves dividing n successive observations into k batches, as shown in figure 5.7. Each batch contains the same number of observations. Let the batch size be equal to b. Then batch 1 contains the observations x1, x2, ..., xb, batch 2 contains the observations xb+1, xb+2, ..., x2b, and so on.

| x1, x2, ..., xb | xb+1, xb+2, ..., x2b | ... | x(k−1)b+1, ..., xkb |
      batch 1            batch 2                     batch k

Figure 5.7: The batch means method.

The observations in batch 1 that are close to batch 2 are likely to be correlated with the observations in batch 2 that are close to batch 1. Likewise, the observations in batch 2 which are close to batch 3 are likely to be correlated with those in batch 3 which are close to batch 2, and so on. Let X̄i be the sample mean of the observations in batch i.

If we choose b to be large enough, then the sequence X̄1, X̄2, ..., X̄k can be shown to be approximately uncorrelated. Therefore, we can treat these means as a sample of independent observations and calculate their mean and standard deviation:

X̄ = (1/k) Σ_{i=1}^{k} X̄i  and  s² = (1/(k−1)) Σ_{i=1}^{k} (X̄i − X̄)².

For large k (i.e. k ≥ 30), we obtain the confidence interval

( X̄ − 1.96 s/√k , X̄ + 1.96 s/√k ).

For small k, we can construct our confidence interval using the t distribution. In general, the batch size b has to be large enough so that the successive batch means are not correlated. If b is not large enough, then the successive batch means will be correlated, and the above estimation procedure will yield severely biased estimates. An estimate of b can be obtained by plotting out the correlogram of the observations x1, x2, ..., which can be obtained from a preliminary simulation run. We can fix b so that it is 5 times the smallest value b' for which rb' is approximately zero.

c. Replications

Another approach to constructing a confidence interval for a mean is to replicate the simulation run several times. Suppose we make n replications, each resulting in m observations, as follows:

replication 1: x11, x12, ..., x1m
replication 2: x21, x22, ..., x2m
...
replication n: xn1, xn2, ..., xnm

For each replication i, we can construct the sample mean

X̄i = (1/m) Σ_{j=1}^{m} xij.

We can then treat the sample means X̄1, X̄2, ..., X̄n as a sample of independent observations, thus obtaining

X̄ = (1/n) Σ_{i=1}^{n} X̄i  and  s² = (1/(n−1)) Σ_{i=1}^{n} (X̄i − X̄)².

Using the above statistics we can construct our confidence interval. The problems that arise with this approach are: a) deciding on the length of each simulation run, i.e. the value of m, and b) deciding on the length of the transient period. Furthermore, the pseudo-random numbers used in a replication have to be independent of those used in the previous replications. One of the following two approaches can be employed. The first approach is to run n independent simulations, starting each simulation run with different values for the seeds of the random number generators, allowing the simulation to reach its steady state, and then collecting the sample observations. This procedure is repeated n times.
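The replication statistics above can be sketched as follows (not from the book; the function name is hypothetical, and 1.96 is used for simplicity, though for a small number of replications the t value should be used):

```python
# Sketch: a 95% confidence interval for the mean via independent replications.
# Each inner list holds the m observations of one replication, collected
# after that replication's own transient period.
def replications_ci(replications, z=1.96):
    means = [sum(r) / len(r) for r in replications]     # X_bar_i
    n = len(means)
    grand = sum(means) / n                               # X_bar
    s2 = sum((m - grand) ** 2 for m in means) / (n - 1)
    half = z * (s2 ** 0.5) / n ** 0.5
    return grand - half, grand + half

reps = [[4.0, 6.0], [5.0, 5.0], [3.0, 7.0], [6.0, 4.0]]
lo, hi = replications_ci(reps)
print(lo, hi)   # interval centred on the grand mean 5.0
```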

Alternatively, we can run one very long simulation. We first allow the simulation to reach its steady state and then collect the first sample of observations. Subsequently, instead of terminating the simulation and starting all over again, we extend the simulation run in order to collect the second sample of observations, then the third sample, and so on. The advantage of this approach is that it does not require the simulation to go through a transient period for each sampling period. However, some of the observations collected at the beginning of a sampling period will be correlated with observations collected towards the end of the previous sampling period.

The replication method appears to be similar to the batch means approach. However, in the batch means method the batch size is relatively small and one collects a large number of batches, whereas here each sampling period is very large and one collects only a few samples.

d. Regenerative method

The last two methods described above can be used to obtain independent or approximately independent sequences of observations. The batch means method generates approximately independent sequences by breaking up the output generated in one run into successive subsequences which are approximately independent. The method of independent replications generates independent sequences through independent runs. The regenerative method produces independent subsequences from a single run. Its applicability, however, is limited to cases which exhibit a particular probabilistic behaviour.

Let us consider a single server queue. Let t0 be the instance at which the simulation run starts, assuming an empty system. The first customer that arrives will see an empty system. During its service, other customers may arrive, thus forming a queue. Let t1 be the point at which the last customer departs and leaves behind an empty system; that is, t1 is the time instance at which the server becomes idle. This will repeat itself, as shown in figure 5.8. In general, let t0, t1, t2, ... be the points at which the simulation model enters the state where the system is empty. It is important to note that the activity of the queue in the interval (ti, ti+1) is independent of its activity during the previous interval (ti−1, ti).

ti). the process . For instance. The queue-length distribution observed during a cycle is independent of the distribution observed during the previous cycles. The time between two such points is known as the regeneration cycle or tour. ti+1) does not depend on the number of customers in the system during the previous interval. Such time instances occur when the last machine is repaired and the system enters the state where all machines are operational. the probability of finding customer n in the interval (ti. the time instances when the system contains 5 customers are regeneration points. or waiting time in the queue. Number in system t0 Time t1 t2 t3 Figure 5. The regeneration points identified above in relation with the queue-length (or waiting time) probability distribution are the time points when the system enters the empty state. can be also obtained if the service time is exponentially distributed. The same applies for the density probability distribution of the total time a customer spends in the system. can be identified with the time instances when the repairman is idle. the process repeats itself probabilistically.132 Computer Simulation Techniques ti+1) is independent of its activity during the previous interval (ti-1. In the case of the machine interference problem. regeneration points related to the repairman's queue-length probability. These time instances are known as regeneration points. That is to say. Due to the memoryless property of the service time.8: Regeneration points. Due to the assumption that operational times are exponentially distributed. Regeneration points identified with time instances when the system enters another state.

the number of observations in each batch was constant equal to batch size. E(X) is estimated using the ratio E(Z)/E(N). Due to the nature of the regeneration method. . where Z is a random variable indicating the sum of all observations of X in a cycle. € . This sequence of observations will be independent from the one obtained during the (i+1)st cycle. and N is a random variable indicating the number of observations in a cycle. Then a point estimate of E(X) can be obtained using the expression: 1 M M ∑ Zi 1 M i=1 M _ X = . Now.Estimation Techniques 133 repeats itself probabilistically following the occurrence of this state. the output is partitioned into independent sequences.. The simulation starts by initially setting the model to the regeneration state. In the batch means case. Let ni Zi = ∑xij j=1 be the sum of all realizations of X in the ith cycle. Let ni be the total number of observations. However. the output was partitioned into approximately independent sequences. assuming exponentially distributed repair times. Let (xi1. Thus. In this case. say M cycles. Then the simulation is run for several cycles. ∑ ni i=1 That is. xini) be a sequence of realizations of X during the ith regeneration cycle occurring between ti and ti+1 regeneration points. xi2.. the number of observations in each cycle is a random variable.. let us assume that we are interested in estimating the mean value of a random variable X. Other states can be used to identify regeneration points. there is no need to discard observations at the beginning of the simulation run.

and the jackknife method. it is consistent.E(N)E(X) = 0.134 Computer Simulation Techniques X is not an unbiased estimator of E(X). This because € € E(V) = E(Z . That is. However.2ZNE(X) + N2E(X)2) = E(Z2) .NE(X)) = E(Z) . The € € € random variables: the central limit theorem. the Vi's are independent and identically distributed with a mean E(V) = 0.E(Y)2 we have that 2 2 σ2 = σ11 + E(Z)2 . X → E(X) as M → ∞. and σ 22 = Var(N). That is. Now. E(V . € The central limit theorem 2 2 Let σ12 = Cov (Z. The variance σ2 of the Vi's is as follows: v = E(V2) = E(Z .E(V))2 .2(E(X)E(ZN) + E(Z)2 + E(N)2E(X)2.NE(X))2 = E(Z2 . That is E( X ) ≠ E(X). in order to construct a confidence € interval of following two methods can be used to construct confidence intervals for a ratio of two X .2E(X)E(ZN) + E(X)2 σ22 + E(N)2E(X)2 v 2 2 = σ11 + E(X)2 σ22 . we note that X is obtained as the ratio of two random variables. Also. let V=Z-NE(X). for cycle i we have Vi=Zi-niE(X). N). Since Var(Y) = E(Y2) .2E(X)E(ZN) + E(N2)E(X)2. Then. σ 11= Var(Z).

E(Y2)Y1 + E(Y1)E(Y2)) = E(Y1Y2) .2E(X)[E(ZN) . . we have that as M increases V becomes normally distributed with a mean equal to 0 and a standard deviation equal to 2 σ v / M .E(X)N .2E(X)[E(ZN) . _ By the central limit theorem.2E(X)σ12 v since Cov(Y1.E(Z)E(N)] or 2 2 σ2 = σ11 + E(X)2 σ22 .2E(X)E(ZN) + 2E(N)2E(X)2 v 2 2 = σ11 + E(X)2 σ22 .E(X) M ∑ ni i=1 i=1 _ _ = Z .Estimation Techniques 135 Given that E(Z) = E(N)E(X) we have 2 2 σ2 = σ11 + E(X)2 σ22 .E(Y1)E(Y2).E(Y1))(Y2 . Y2) = E(Y1 .E(Y2)E(Y1) + E(Y1)E(Y2) = E(Y1Y2) .E(Y2)) = E(Y1Y2 .E(Y1)Y2 .E(X)E(N)2 2 2 = σ11 + E(X)2 σ22 . where _ 1 M V = M ∑ Vi i=1 1 M = M ∑ (Zi .niE(X)) i=1 € 1 M 1 M = M ∑ Zi .E(Y1)E(Y2) .

Estimate Z . ZM (n1... Estimate 2 sZ = 1 M ∑ (zi −Z 2 ) M −1 i=1 € . it v € can be used in place of σ2 in the above confidence interval.E(X)N ~ N(0. 1) σ2 v M Dividing by N we obtain that _ _ Z/N . _ _ Z .. _ σ2 v (1/N) M Therefore. N and set X = Z / N . we obtain the confidence interval _ Z _ ± 1.E(X)N ~ N(0. v The above method can be summarized as follows: 1. Run the simulation for M cycles and obtain Z1.. Z3. nM.2X RZ.N + X 2sN is an estimate of σ2. . it can be shown that s2 = sZ . Z2. n3. or Computer Simulation Techniques σ2 v M ) _ _ Z ..136 Hence. .E(X) ~ N(0. 1).) _ _ _ _ _ 2.96 2 σv / M _ N N _ _ 2 2 Now.. n2. 3. Therefore.

Suppose we want to estimate φ=E(Y)/E(X) from the data y1. ∑xj j≠g The jacknife point estimator of φ is ^ φJ= ^ This is. The classical point estimator of φ is φc = constructed as follows.Y be two random variables. € The jacknife method € 1 M ∑ (zi −Z)(ni −N) .(n .. σ2 = J Then. Let n n ∑θg/n g=1 . it can be shown that € ∑ g=1 (θ g −φ j ) 2 n −1 . where the yi's are i.i.. Let X. sine the classical point estimator is usually biased. ...n. and Cov(yi. xn. in general. M −1 i=1 This method constitutes an alternative method to obtaining a confidence interval of a ratio of two random variables..xj)≠0 for i≠j. g = 1... The jacknife point and interval estimator can be €€ ∑yj j≠g ^ θg = n φ C . Let Y / X and its confidence interval can be obtained as shown above.i. less biased than φ C. the xi's are i. y2..d.1) .d. ..Estimation Techniques 137 M 1 2 sN = ∑ (ni −N 2 ) M −1 i=1 sZ2N = . yn and x1. . It also provides a means of obtaining a less biased point estimator of the mean of the ratio of two random variables.2.. x2. .

b.138 σ2 J n Computer Simulation Techniques ^ φ J ~ N(φ. n n The difficulty in using the regenerative method is that real-world simulations may not have regeneration points. we examine ways of estimating the above statistics. Probability that a random variable lies within a fixed interval. Below. Probability that a random variable lies within a fixed interval The estimation of this type of probability can be handled exactly the same way as the estimation of the mean of a random variable.4.96 . Percentiles of the probability distribution. we can obtain the confidence interval at 95% σJ σJ ^ ^ (φ J -1. Let I be the designated interval. ). as n → ∞. Variance of the probability distribution. We want to estimate p = Pr (X ∈ I) where X is an endogenously created random variable.96 ). c. the expected regeneration cycle may be too large so that only a few cycles may be simulated. Other interesting statistics related to the probability distribution of a random variable are: a. If they do happen to have regeneration points.2 Estimation of other statistics of a random variable So far we considered estimation techniques for constructing a confidence interval of the mean of an endogenously created random variable. 5. φ J +1. We generate M replications of the simulation. a. For each replication i we collect N . That is.

the batch means method or the regenerative method can be used. is not interested in the mean of a particular random variable.4 of this Chapter can be employed to estimate p. Management. Let vi be the number of times X was observed to lie in I. instead of replications. in one replication we have that (1/N)ΣYi is equal to Vi/N or pi. b.Estimation Techniques 139 observations of X. _ 1 M p = M ∑pi i=1 and s2 = M 1 ∑ ( pi − p)2 . Percentile of a probability distribution This is a very important statistic that is often ignored in favour of the mean of a random variable X. Let Yi=1 if ith realization of X belongs to I. Then.4 of this Chapter can be used in order to remove unwanted autocorrelations. Yi=0. each giving rise to one realization of p.9: Percentile xβ . pi = vi/N is an estimate of probability p. For instance. M −1 i=1 _ As before. For instance. Then. the techniques developed in section 5. that the estimation of p requires M independent replications. € We observe. sometimes. Other methods examined in section 5. the estimation of p can be seen as estimating a population mean. Alternatively. ! x! Figure 5. Thus.1) if M is large. the person in charge of a web service may not be interested in its . (p -p)/(s/ M ) ~ N(0. Otherwise. Thus.

a value such that the response time of the real time system is below it 95% of the time. x0. Let us assume independent replications of the simulation. y12. € The estimation of extreme percentiles requires long simulation runs. having allowed for the transient period. let xi1. If the runs are not long.90.. the 100βth percentile x (i) for the ith replication is observation yik where k=Nβ. let us consider a probability density function f(x). Hence € € 1 M (i) x β = ∑ xβ M i=1 and € s2 = 1 M ∑ (x (i) − x β ) 2 β M −1 i=1 Confidence intervals can now be constructed in the usual manner. These two operations can . That is. The € € 95th percentile is ⎣50x0. if we have a sample of 50 observations ordered in an € ascending order.95⎦+1= ⎣47.. In general. . x1N be the observed realizations of X. Each replication yields N observations. Typically. x0. More specifically. x12.50 or in extreme percentiles such as x0. y1N so that yij<yi. For each replication i. The 100βth percentile is the smallest value xβ such that f(xβ) < β. he or she may be interested in "serving as many as possible as fast as possible". Now.. .140 Computer Simulation Techniques mean response time.j+1.. the area from -∞ to xβ under f(x) is less or equal to β as shown in figure 5. he or she may be interested in knowing the 95th percentile of the response time. let us consider a reordering of these observations yi1.9.50 = 45. Rather.95. We are interested in placing a confidence interval on the point estimator of xβ of a distribution of a random variable X.90.99.5⎦ +1=47+1=48. there is interest in the 50th percentile (median) x0. then the estimates will be biased. and b) that we order the sample of observations in an ascending order. if Nβ is an β integer. Then... The calculation of a percentile requires that a) we store the entire sample of observations until the end of the simulation. then the 90th percentile is the observation number 0. That is. 
or k= ⎣Nβ ⎦ + 1 if Nβ is not an integer ( ⎣x⎦ means the largest integer which is less or equal to x). For instance.

These two operations can be avoided by constructing a frequency histogram of the random variable on the fly. When a realization of the random variable becomes available, it is immediately classified into the appropriate interval of the histogram. Thus, it suffices to keep track of how many observations fall within each interval, and at the end of the replication the 100βth percentile can easily be picked out from the histogram. Obviously, the accuracy of this implementation depends on the chosen width of the intervals of the histogram. Finally, we note that instead of independent replications of the simulation, other methods can be used, such as the regenerative method.

c. Variance of the probability distribution

Let us consider M independent replications of the simulation. From each replication i we obtain N realizations xi1, xi2, ..., xiN of a random variable, after we allow for the transient period. We have

µ̄i = (1/N) Σ_{j=1}^{N} xij,

and

µ̄ = (1/M) Σ_{i=1}^{M} µ̄i

is the grand sample mean. Hence, the sample variance within the ith replication, measured about the grand sample mean, is

si² = (1/N) Σ_{j=1}^{N} (xij − µ̄)² = (1/N) Σ_{j=1}^{N} xij² − 2µ̄iµ̄ + µ̄².

As the point estimate of the variance σ² we take

s² = (1/M) Σ_{i=1}^{M} si² = (1/M) Σ_{i=1}^{M} (1/N) Σ_{j=1}^{N} xij² − µ̄².
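The point estimate s² can be computed as in the following sketch (not from the book; the function name is hypothetical):

```python
# Sketch: the point estimate s^2 of the variance from M replications of N
# observations each, using the grand sample mean mu_bar.
def variance_estimate(replications):
    M = len(replications)
    N = len(replications[0])
    mu_i = [sum(r) / N for r in replications]
    mu = sum(mu_i) / M                      # grand sample mean
    s2_i = [sum((x - mu) ** 2 for x in r) / N for r in replications]
    return sum(s2_i) / M                    # s^2 = (1/M) * sum of s_i^2

reps = [[1.0, 3.0], [2.0, 2.0]]
print(variance_estimate(reps))   # mu_bar = 2.0, s^2 = (1+1+0+0)/4 = 0.5
```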

The estimates si² are all functions of µ̄ and, therefore, they are not independent. A confidence interval for σ² can be constructed by jackknifing the estimator s². Let σi² be the estimate of σ² obtained as above with the ith replication left out, i.e.

σi² = (1/(M−1)) Σ_{j≠i} (1/N) Σ_{k=1}^{N} xjk² − µ̄(i)²,

where µ̄(i) denotes the grand sample mean with the ith replication left out. Define

zi = M s² − (M−1) σi².

Then

z̄ = (1/M) Σ_{i=1}^{M} zi  and  sz² = (1/(M−1)) Σ_{i=1}^{M} (zi − z̄)²,

and a confidence interval can be constructed in the usual way. Alternatively, we can obtain a confidence interval for the variance by running the simulation only once, rather than using replications, and then calculating the standard deviation assuming that the successive observations are independent! This approach is correct only when the sample of observations is extremely large.

5.5 Estimation techniques for transient-state simulation

The statistical behaviour of a simulation during its transient state depends on the initial condition. In order to estimate statistics of a random variable X during the transient state, one needs to be able to obtain independent realizations of X. The only way to obtain such independent observations is to repeat the simulation; that is, we employ the replications technique discussed above in section 5.4.

Each independent simulation run has to start with the same initial condition, and each replication is obtained by running the simulation during its transient period; this, of course, requires advance knowledge of the length of the transient period. For each replication we can obtain an estimate of the statistic of interest, and a confidence interval can be obtained as shown in section 5.4. Obviously, the pseudo-random numbers used in a replication have to be independent of those used in the previous replications. This can be achieved by using a different seed for each replication; the difference between two seeds should be about 10,000.

5.6 Pilot experiments and sequential procedures for achieving a required accuracy

So far, we have discussed techniques for generating confidence intervals for various statistics of an endogenously generated random variable. In general, the accuracy of an estimate depends on the width of its confidence interval: the smaller the width, the higher the accuracy. The expected width of the confidence interval is proportional to 1/√N, where N is the number of i.i.d. observations used; obviously, the larger the value of N, the smaller the width of the confidence interval. For instance, in order to halve the width of the confidence interval of the mean of a random variable, N has to be increased four times, since 1/√(4N) = (1/2)(1/√N).

Quite frequently, one does not have prior information regarding the value of N that will give the required accuracy. As an example, let us assume that we want to estimate the 95th percentile of the probability distribution of a random variable X so that the width of its confidence interval is equal to 10% of the estimate. (The width here is defined as half the confidence interval.) In general, one does not know the value of N that will yield such a width. Typically, this problem is tackled by conducting a pilot experiment, which provides a rough estimate of the value of N that will yield the desired confidence interval width. An alternative approach is the sequential method, in which the main simulation experiment is carried out continuously and, periodically, a test is carried out to see whether the desired accuracy has been achieved; the simulation stops the first time it senses that the desired accuracy has been achieved. Below, we examine the methods of independent replications and batch means.

5.6.1 Independent replications

This discussion is applicable to both transient-state and steady-state estimation. Let us assume that we want to estimate a statistic θ of a random variable using independent replications, and that the width of the confidence interval is required to be approximately less than or equal to 0.1θ̂. A pilot experiment is first carried out, involving N1 replications. Let θ̂1 be the resulting point estimate of θ and let Δ1 be the width of its confidence interval. If Δ1 ≤ 0.1θ̂1, then the simulation stops. If Δ1 > 0.1θ̂1, then we need to conduct a main experiment involving N2 replications, where

N2 = (Δ1 / 0.1θ̂1)² N1,

appropriately rounded off to the nearest integer. We note that it is not necessary to run a new experiment involving N2 replications. Rather, one can use the N1 replications from the pilot experiment and carry out only N2 − N1 additional replications. Let θ̂2 and Δ2 be the new estimate and width based on the N2 replications. The simulation stops if Δ2 ≤ 0.1θ̂2. Due to randomness, however, the N2 observations will not exactly yield a confidence interval of width 0.1θ̂2; if the desired accuracy has not been achieved, the procedure is repeated.

The above approach can also be implemented as a sequential procedure, as follows. The simulation program is modified to conduct N* independent replications at a time; typically, N* can be set equal to 10. The point estimate θ̂1 and the width Δ1 of its confidence interval are calculated. If Δ1 ≤ 0.1θ̂1, the simulation stops. Otherwise, the simulation proceeds to carry out N* additional replications. A new point estimate θ̂2 and width Δ2 are constructed based on the total of 2N* replications. If Δ2 ≤ 0.1θ̂2, the simulation stops; otherwise, it proceeds to generate another N* replications, and so on.

5.6.2 Batch means

This discussion is obviously applicable to steady-state estimation.

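The sequential stopping rule described above can be sketched in a few lines of code. The sketch below is illustrative rather than part of the original text: run_batch is a hypothetical stand-in for one batch (or one replication) of the reader's own simulation, and the constant 1.96 assumes a 95% confidence level with a reasonably large number of batches.

```python
import math
import random

def sequential_batch_means(run_batch, k_star=10, rel_width=0.1, max_batches=10_000):
    """Run batches k* at a time until the confidence interval width drops
    below rel_width times the point estimate (the 10% rule of section 5.6)."""
    means = []
    while len(means) < max_batches:
        # Generate k* additional batch means.
        means.extend(run_batch() for _ in range(k_star))
        n = len(means)
        theta = sum(means) / n                          # point estimate
        s2 = sum((m - theta) ** 2 for m in means) / (n - 1)
        delta = 2 * 1.96 * math.sqrt(s2 / n)            # full CI width
        if delta <= rel_width * abs(theta):             # accuracy achieved
            return theta, delta, n
    return theta, delta, n

# Dummy "simulation" whose batch means are noisy around 5.0, for illustration.
random.seed(1)
theta, delta, n = sequential_batch_means(lambda: 5.0 + random.gauss(0, 1))
```

In a real model, run_batch would advance the simulation by one batch of observations and return that batch's mean; the procedure then stops the first time the desired accuracy is sensed.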
Computer Assignments

1. Consider the machine interference problem. Modify the simulation model to carry out the following tasks:

a. First remove the print statement that prints out a line of output each time an event occurs. Then, set up a data structure to collect information regarding the amount of time each machine spends being broken down, i.e., waiting in the queue and also being repaired (see section 2 of Chapter 3).

b. Augment your simulation program to calculate the mean and standard deviation of the time a machine is broken down. Run your simulation for 1050 observations (i.e., repairs). Discard the first 50 observations to account for the transient state. Based on the remaining 1000 observations, calculate the mean and the standard deviation of the time a machine is broken down. The calculation of the standard deviation may not be correct, due to the presence of autocorrelation.

c. Write a program (or use an existing statistical package) to obtain a correlogram based on the above 1000 observations. (You may graph the correlogram by hand!) Based on the correlogram, calculate the batch size.

d. Implement the batch means approach in your program and run your program for 31 batches. Disregard the first batch, and use the other 30 batch means to construct a confidence interval of the mean time a machine is broken down.

e. Using the batch means approach, implement a sequential procedure for estimating the mean time a machine is broken down (see section 5.6.2).

f. Implement a scheme to estimate the 95th percentile of the time a machine is broken down.

2. Consider the token-based access scheme. Change the inter-arrival times to 20, 25, and 30, instead of the 10, 15, and 20 that you used before (which lead to an unstable system). Modify the simulation model to carry out the following tasks:

a. First remove the print statement that prints out a line of output each time an event occurs. Then, augment your simulation to record how much time each packet spends in a queue, i.e., from the time when it joins the queue to the time it is transmitted out. We will refer to this time as the response time. Record this information in the order in which the packets arrive.

b. Augment your simulation program to calculate the mean and standard deviation of the response times collected during the simulation. Run your simulation for a total of 3100 packet departures (to all three queues). Discard the first 100 observations to account for the transient state. Based on the remaining observations, calculate the mean and the standard deviation of the response time. The calculation of the standard deviation may not be correct, due to the presence of autocorrelation.

c. Write a program (or use an existing statistical package) to calculate the correlogram based on the response times collected from the above task a. (You may graph the correlogram by hand.) Based on the correlogram, calculate the batch size.

d. Implement the batch means approach in your program and run your program for 31 batches. Disregard the first batch, and use the other 30 batch means to construct a confidence interval of the mean time a packet spends in a queue.

e. Using the batch means approach, implement a sequential procedure for estimating the mean time a packet spends in a queue (see section 5.6.2).

f. Implement a scheme to estimate the 95th percentile of the time a packet spends in a queue.

3. Consider the two-stage manufacturing system. (Make sure that the input values you use lead to a stable system, i.e., the utilization of the first server is less than 100%.) Modify the simulation model to carry out the following tasks:

a. First remove the print statement that prints out a line of output each time an event occurs. Then, set up a data structure to collect information regarding the amount of time each customer spends in the system, i.e., from the time it arrives at stage 1 to the time it departs from stage 2.

b. Augment your simulation program to calculate the mean and standard deviation of the time a customer spends in the system. Run your simulation for 1050 observations (i.e., customer arrivals). Discard the first 50 observations to account for the transient state. Based on the remaining 1000 observations, calculate the mean and the standard deviation of the time a customer spends in the system. The calculation of the standard deviation may not be correct, due to the presence of autocorrelation.

c. Write a program (or use an existing statistical package) to obtain a correlogram based on the above 1000 observations. (You may graph the correlogram by hand!) Based on the correlogram, calculate the batch size.

d. Implement the batch means approach in your program and run your program for 31 batches. Disregard the first batch, and use the other 30 batch means to construct a confidence interval of the mean time a customer spends in the system.

e. Using the batch means approach, implement a sequential procedure for estimating the mean time a customer spends in the system (see section 5.6.2).

f. Implement a scheme to estimate the 95th percentile of the time a customer spends in the system.

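Several of the tasks above ask for a correlogram, from which a batch size is chosen. A minimal sketch of computing the lag-k sample autocorrelations is given below; the function name and the data series are hypothetical illustrations, not part of the assignments.

```python
def correlogram(x, max_lag=20):
    """Sample autocorrelation r_k for lags 1..max_lag. A batch size can be
    chosen as the smallest lag at which r_k becomes negligible."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n          # lag-0 autocovariance
    r = []
    for k in range(1, max_lag + 1):
        ck = sum((x[i] - mean) * (x[i + k] - mean) for i in range(n - k)) / n
        r.append(ck / c0)
    return r

# Purely illustrative, strongly autocorrelated data (a repeating ramp).
series = [i % 10 for i in range(1000)]
r = correlogram(series, max_lag=5)
```

In the assignments, the input would be the 1000 retained observations; plotting r against the lag (even by hand) gives the correlogram.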

CHAPTER 6: VALIDATION OF A SIMULATION MODEL

Validation of a simulation model is a very important issue that is often neglected. How accurately does a simulation model (or, for that matter, any kind of model) reflect the operations of a real-life system? How confident can we be that the obtained simulation results are accurate and meaningful?

In general, one models a system, say using simulation techniques, with a view to studying its performance. This system under study may or may not exist in real life. Let us consider, for instance, a communications equipment manufacturer who is currently designing a new communications device, such as a switch. The manufacturer would like to know in advance if the new switch will have an acceptable performance, and would also like to be able to study various alternative configurations of this new switch, so as to come up with a good product. Obviously, these alternative configurations do not exist. Their performance can only be evaluated by constructing a model: the performance of such a switch can only be estimated through modelling, since the actual system does not exist. The question that arises here is how does one make sure that the model that will be constructed is a valid representation of the system under study?

Let us consider another example, involving a communication system already in operation. Let us assume that it operates nearly at full capacity. The management is considering various alternatives for expanding the system's capacity. Which of these alternatives will improve the system's performance at minimum cost? The standard method is to construct a model of the existing system. Then, one can change the model appropriately in order to analyze each of the above alternatives.

The model of the existing system can be validated by comparing its results against actual data obtained from the system under investigation. However, there is no guarantee that, when altering the simulation model so as to study one of the above alternatives, this new simulation model will be a valid representation of the configuration under study! Also, it is important that the management has the opportunity to check whether the model's assumptions are credible.

The following checks can be carried out in order to validate a simulation model.

1. Check the pseudo-random number generators. Are the pseudo-random numbers uniformly distributed in (0,1), and do they satisfy statistical criteria of independence? Usually, one takes for granted that a random number generator is valid.

2. Check the stochastic variate generators. Similar statistical tests can be carried out for each stochastic variate generator built into a simulation model.

3. Check the logic of the simulation program. This is a rather difficult task. One way of going about it is to print out the status variables, the future event list, and other relevant data structures each time an event takes place in the simulation. Then, one has to check by hand whether the data structures are updated appropriately. This is a rather tedious task. However, using this method one can discover possible logical errors and also get a good feel about the simulation model.

4. Relationship validity. Quite frequently the structure of a system under study is not fully reflected down to its very detail in a simulation model.

5. Output validity. This is one of the most powerful validity checks. If actual data are available regarding the system under study, then these data can be compared with the output obtained from the simulation model. Obviously, if they do not compare well, the simulation model is not valid.

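As an illustration of check 1 above, the following sketch applies a chi-square test of uniformity to a stream of pseudo-random numbers. It is only an outline: the ten-bin layout is an arbitrary choice, and the critical value 16.9 corresponds to 9 degrees of freedom at the 5% level.

```python
import random

def chi_square_uniformity(samples, bins=10):
    """Chi-square statistic for testing that samples are uniform in (0,1)."""
    counts = [0] * bins
    for u in samples:
        counts[min(int(u * bins), bins - 1)] += 1
    expected = len(samples) / bins
    return sum((c - expected) ** 2 / expected for c in counts)

random.seed(42)
stat = chi_square_uniformity([random.random() for _ in range(10_000)])
# With 10 bins there are 9 degrees of freedom; the 5% critical value is
# about 16.9, so a good generator should usually produce a smaller statistic.
uniform_ok = stat < 16.9
```

A similar test, with the expected counts taken from the target distribution, can be used for check 2 (the stochastic variate generators).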
Computer assignments

1. Consider the machine interference problem. This problem can also be analyzed using queueing theory. Set up the following validation scheme. Obtain exact values of the mean time a machine is broken down using queueing theory results, for various values of the mean operational time. Let the mean operational time vary, for instance, from 1 to 50 so as to get a good spread. Keep all the other input values fixed, and use these values in all the experiments. For each of these values, obtain an estimate of the mean down time of a machine using your simulation model. Use the batch means method. (Note: Remove the iterative option from the simulation. By now, you should have a good idea as to how many batches and what batch size you will need in order to get the required accuracy.) Graph both sets of results, and be sure to indicate the confidence interval at each simulated point. Compare the two sets of results. The analytic result should be within the confidence interval obtained from the simulation.

2. Consider the token-based access scheme. Carry out the following validations of your simulation.

a. Modify your code so that the token is always in queue 1 and it never visits the other queues. (You can do this by setting T to a very large number, so that the time-out never occurs during the life of the simulation, and by not letting the token change queues when queue 1 is empty.) In this case, queue 1 will be the only queue that will be served, while the other queues will grow in length for ever. Make sure that the data structures for the other queues do not become very large, so as to cause memory problems. The only departures are from queue 1, and consequently your estimation procedure will automatically give the mean waiting time of queue 1. Use the batch means method to estimate the mean waiting time in queue 1. Compare your results to the mean waiting time of an M/M/1 queue with the same arrival and service rates. The M/M/1 mean waiting time should be within the confidence intervals of your simulated mean waiting time.

b. Remove the changes you made in the code in task a above, so as to restore the code to its original state. Assume that T is very small, i.e., T = 0.001, and that the switch-over time is very small as well, i.e., it is equal to 0.001. In this case the queues will be served in a round-robin fashion, where a maximum of one customer is served from each queue. Assume that the inter-arrival time to each queue is equal to 20

and that the mean service time is the same for all packets and is equal to 2. Use the batch means method to estimate the mean waiting time in the system. Compare your results to the mean waiting time of an M/M/1 queue with an arrival rate equal to the sum of the arrival rates to the three queues, and a mean service time of 2. The M/M/1 mean waiting time should be within the confidence intervals of your simulated mean waiting time. Note: Remove the iterative option from the simulation. Keep all the other input values fixed, and use these values in all the experiments. By now, you should have a good idea as to how many batches and what batch size you will need in order to get the required accuracy.

3. Consider the two-stage manufacturing system. If we assume that the percent of time each machine is down is very small, i.e., the mean downtime divided by the sum of the mean operational time and the mean downtime is less than 2%, then the service time at each queue can be safely approximated by an exponential distribution. As the buffer capacity of the second queue increases, the probability that server 1 will get blocked upon service completion, due to queue 2 being full, will decrease, and eventually it will become zero. At that point, the two-stage manufacturing system is equivalent to two infinite-capacity queues in tandem. Each queue then becomes an M/M/1 queue, and the mean waiting time in the two-stage manufacturing system is equal to the mean waiting time W1 in the first queue plus the mean waiting time W2 in the second queue. The mean waiting time in the first queue is given by the expression:

W1 = 1/[(β1µ1/(α1 + β1)) − λ],

where µ1, 1/α1 and 1/β1 are the service rate, the mean operational time and the mean downtime respectively at the first queue, and λ is the arrival rate to the two-stage manufacturing system. Likewise, the mean waiting time in the second queue is given by the expression:

W2 = 1/[(β2µ2/(α2 + β2)) − λ],

where µ2, 1/α2 and 1/β2 are the service rate, the mean operational time and the mean downtime respectively at the second queue.

Based on the above, carry out the following experiment. Run the simulation to calculate the mean waiting time in the two-stage manufacturing system for different buffer capacities of queue 2. Set λ = 0.15, and keep λ, µ1, 1/α1, 1/β1, µ2, 1/α2 and 1/β2 the same for all the simulations. (Remember to fix the mean downtime and mean operational times so that the mean downtime divided by the sum of the mean operational time and the mean downtime is less than 2%.) Plot the mean waiting time against the buffer capacity of queue 2. Show that this curve approaches W1 + W2, as calculated using the above expressions for the M/M/1 queues. Note: Remove the iterative option from the simulation, and use the same input values in all the experiments. By now, you should have a good idea as to how many batches and what batch size you will need in order to get the required accuracy.

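The M/M/1 expressions above are straightforward to evaluate numerically. The sketch below does so for λ = 0.15; the service rate and the breakdown parameters are illustrative values chosen so that the downtime fraction stays under 2%, not values prescribed by the assignment.

```python
def mm1_wait_with_breakdowns(mu, mean_op, mean_down, lam):
    """Mean waiting time W = 1/((beta*mu/(alpha+beta)) - lam), where
    1/alpha is the mean operational time and 1/beta the mean downtime,
    so beta/(alpha+beta) is the fraction of time the server is up."""
    alpha, beta = 1.0 / mean_op, 1.0 / mean_down
    effective_rate = beta * mu / (alpha + beta)
    return 1.0 / (effective_rate - lam)

# Illustrative numbers: service rate 0.5 at both stages, mean operational
# time 100 and mean downtime 2, so the downtime fraction is about 1.96%.
lam = 0.15
w1 = mm1_wait_with_breakdowns(mu=0.5, mean_op=100.0, mean_down=2.0, lam=lam)
w2 = mm1_wait_with_breakdowns(mu=0.5, mean_op=100.0, mean_down=2.0, lam=lam)
total_wait = w1 + w2     # mean waiting time in the two-stage system
```

The simulated curve of mean waiting time against the buffer capacity of queue 2 should approach this total as the buffer grows.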

CHAPTER 7: VARIANCE REDUCTION TECHNIQUES

7.1 Introduction

It was mentioned earlier that the accuracy of an estimate is proportional to 1/√n, where n is the sample size. One way to increase the accuracy of an estimate (i.e., reduce the width of its confidence interval) is to increase n. For instance, the confidence interval width can be halved if the sample size is increased to 4n. However, large sample sizes require long simulation runs which, in general, are expensive. An alternative way of increasing an estimate's accuracy is to reduce its variance. If one can reduce the variance of an endogenously created random variable without disturbing its expected value, then the confidence interval width will be smaller for the same amount of simulation.

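The 1/√n relationship can be checked with a quick numerical experiment. The following sketch is purely illustrative: it compares the 95% confidence interval half-width obtained with n and with 4n observations of an arbitrary random variable.

```python
import math
import random

def ci_half_width(samples):
    """95% confidence half-width, 1.96 * s / sqrt(n)."""
    n = len(samples)
    mean = sum(samples) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in samples) / (n - 1))
    return 1.96 * s / math.sqrt(n)

random.seed(7)
w_n = ci_half_width([random.expovariate(1.0) for _ in range(1000)])
w_4n = ci_half_width([random.expovariate(1.0) for _ in range(4000)])
ratio = w_n / w_4n          # roughly 2: quadrupling n halves the width
```

The slow 1/√n shrinkage is exactly why the variance reduction techniques of this chapter are attractive.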
Techniques aiming at reducing the variance of a random variable are known as variance reduction techniques. Most of these techniques were originally developed in connection with Monte Carlo techniques. Variance reduction techniques require additional computation in order to be implemented. Furthermore, it is not possible to know in advance whether a variance reduction technique will effectively reduce the variance in comparison with straightforward simulation, nor how much variance reduction will be achieved. It is standard practice, therefore, to carry out pilot simulation runs in order to get a feel of the effectiveness of a variance reduction technique and of the additional computational cost required for its implementation. In this chapter, we will examine two variance reduction techniques, namely (a) the antithetic variates technique and (b) the control variates technique.

7.2 The antithetic variates technique

This is a very simple technique to use, and it only requires a few additional instructions in order to be implemented. However, no general guarantee of its effectiveness can be given, and a small pilot study may be useful in order to decide whether or not to implement it.

Let X be an endogenously created random variable, and let x1(1), x2(1), ..., xn(1) be n i.i.d. realizations of X obtained in a simulation run. Also, let x1(2), x2(2), ..., xn(2) be n i.i.d. observations of X obtained in a second simulation run. Let us define a new random variable

zi = (xi(1) + xi(2))/2,  i = 1, 2, ..., n.     (7.1)

More specifically, let Z = (X(1) + X(2))/2, where X(i), i = 1, 2, indicates the random variable X as observed in the ith simulation run. We have

E(Z) = E[(X(1) + X(2))/2]

= (1/2)[E(X(1)) + E(X(2))] = E(X),

seeing that the expected value of X(1) or X(2) is that of X. Thus, the expected value of this new random variable Z is identical to that of X. Now, let us examine its variance. We have

Var(Z) = Var[(X(1) + X(2))/2]
       = (1/4)[Var(X(1)) + Var(X(2)) + 2Cov(X(1), X(2))].

Remembering that Var(X(1)) = Var(X(2)) = Var(X), we have that

Var(Z) = (1/2)[Var(X) + Cov(X(1), X(2))].

Since Cov(X(1), X(2)) = ρ√(Var(X(1))Var(X(2))) = ρVar(X), where ρ is the correlation between X(1) and X(2), we have that

Var(Z) = (1/2)Var(X)(1 + ρ).     (7.2)

In order to construct an interval estimate of E(X), we use the random variable Z. Observations zi are obtained from (7.1), and the confidence interval is obtained using (7.2). In the special case where the two sets of observations x1(1), ..., xn(1) and x1(2), ..., xn(2) are independent of each other, we have that ρ = 0 and, therefore, Var(Z) = Var(X)/2, thus simply replicating the results. However, by appropriately constructing the two samples, we can cause Var(Z) to become significantly less than Var(X). This is achieved by causing ρ to become negative.

The antithetic variates technique attempts to introduce a negative correlation between the two sets of observations. As an example, let us consider a simulation model of a single server queue, and let X and Y indicate the waiting time in the queue and the interarrival time respectively. If Y is very small, then customers arrive faster and, hence, the queue size gets larger; the larger the queue size, the more a customer has to wait, i.e., X is larger. On the other hand, if Y is large, then customers arrive slower and, therefore, the queue size gets smaller; the smaller the queue size, the less a customer has to wait, i.e., X is small. Thus, we see that X and Y can be negatively correlated. (X is also positively correlated with the random variable representing the service time.)

This negative correlation between the two sets of observations can be created in a systematic way as follows. Let F(t) and G(s) be the cumulative distributions of the interarrival and the service time respectively, and let ri and vi be pseudo-random numbers. Then ti = F⁻¹(ri) and si = G⁻¹(vi) are an interarrival and a service variate respectively, and these two variates can be associated with the ith simulated customer. An indication of whether the queue is tending to increase or decrease can be obtained by considering the difference di = ti − si. This difference may be positive or negative, indicating that the queue is going through a slack or a busy period respectively. Now, in the second run, we associate the pseudo-random numbers r′i and v′i with the ith simulated customer, so that d′i = t′i − s′i (where t′i = F⁻¹(r′i) and s′i = G⁻¹(v′i)) has the opposite sign of di. That is, if the queue was going through a slack (busy) period in the first run at the time of the ith simulated customer, it now goes through a busy (slack) period. It can be shown that this can be achieved by simply setting r′i = 1 − ri and v′i = 1 − vi.

In this example, we make use of two controllable variables, Yj, j = 1, 2, indicating the interarrival time and the service time respectively. These two random variables are strongly correlated with X. Hence, the two samples of observations of X can be negatively correlated by simply using the complements of the pseudo-random numbers used in the first run.

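The scheme just described can be sketched with a simple single-server queue based on Lindley's recursion. This is an illustrative outline, not the program used in the book; the arrival and service rates are arbitrary choices.

```python
import math
import random

def waiting_times(us, vs, arr_rate=1.0, svc_rate=1.25):
    """Waiting times in a single server queue via Lindley's recursion
    W_{n+1} = max(0, W_n + S_n - T_{n+1}), with interarrival and service
    variates obtained by inverse transform from the pseudo-random numbers."""
    w, out, prev_s = 0.0, [], None
    for u, v in zip(us, vs):
        t = -math.log(1.0 - u) / arr_rate    # interarrival variate F^-1(u)
        s = -math.log(1.0 - v) / svc_rate    # service variate G^-1(v)
        if prev_s is not None:
            w = max(0.0, w + prev_s - t)
        out.append(w)
        prev_s = s
    return out

random.seed(3)
n = 20_000
us = [random.random() for _ in range(n)]
vs = [random.random() for _ in range(n)]
run1 = waiting_times(us, vs)                                    # first run
run2 = waiting_times([1 - u for u in us], [1 - v for v in vs])  # antithetic run
z = [(a + b) / 2 for a, b in zip(run1, run2)]                   # averaged sample

# The paired observations should come out negatively correlated.
m1, m2 = sum(run1) / n, sum(run2) / n
cov = sum((a - m1) * (b - m2) for a, b in zip(run1, run2)) / n
```

The interval estimate of E(X) would then be built from the z values, exactly as described above.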

0 45.0 35.0 5. .1: Antithetic variates technique applied to an M/M/1 queue.0 1 5 Simulated Number of Customers 10 15 20 25 Original Sample Antithetic Variates Mean of Two Samples 30 Figure 7.160 Computer Simulation Techniques Waiting Time Per Customer 55.0 15.0 25.

69 1.2 show that in this case there is little benefit to be gained from using the antithetic variates technique.68 .92 16.98 9.36 9.84 15.88 1.06 1. $ 1.12 Table 7.85 16.31 2.73 9.03 . this should not be construed that this method always works well.82 6.69 16. .Variance Reduction Techniques 161 In the above example.00 17. 9.59 6.72 .2 and figure 7.60 6.19 1.69 4.21 4.09 .4 800 1600 2400 Sample Size 3200 Figure 7. an M/M/2 queuing system was simulated.74 0.3: Straight simulation and antithetic variates techniques for an M/M/2 queue.70 8.93 1. Type of Simulation Sample Size 400 800 800 1600 1600 2400 2400 3000 3000 Mean Standard Dev.30 1.6 $!Straight Simulation #!Standard Antithetic Variate Technique # $ $ $ $ # $ # $ # Standard Error 1.2: Standard error plotted against sample size for straight simulation and antithetic variates techniques for an M/M/2 queue.84 0.61 6.64 1.36 1. the antithetic variates technique worked quite well.97 .11 Standard Error as a % of Mean 8.72 Cost $.17 Standard Error 1. However.67 5. Table 7.23 16.22 4.81 3.8 0.25 1.48 15.96 6. in the following example. In particular.44 1.01 3.69 0.07 2.61 0.64 0. Interval +/-3.17 0.04 15.79 1.79 5.2 0.60 Straight Straight Standard Antithetic Straight Standard Antithetic Straight Standard Antithetic Straight Standard Antithetic 18.96 Conf.

2|Cov(X. Var(Z) = Var(X) + Var(Y) + 2 Cov (X.Y)| Therefore.Y)| < 0 then a reduction in the variance of Z has been achieved.Y)<0. b. Since X and Y are negatively correlated. This is known as the control variable. we have that Cov(X. Let us consider the example of a single server queue mentioned in section 7. a reduction in the variance of Z has been achieved. We have the following two cases. We have E(Z)=E(X+Y-E(Y))=E(X). Then E(Z) = E(X .Y)| < 0 then. Var(Z) = Var(X) + Var(Y) .162 7. Let X be an endogenously created random variable whose mean we wish to estimate.2 |Cov(X. if Var(Y) . Then X is negatively correlated with the . Let X be the time a customer spends in the system. a. Random variable Y is strongly correlated with X. Let Y be another endogenously created random variable whose mean is known in advance. if Var(Y) .2|Cov(X.2. X and Y are negatively correlated: Define a new random variable Z so that Z=X+Y-E(Y). X and Y are positively correlated: Define Z=X-Y+E(Y).Y + E(Y)) = E(X). Y). Therefore.3 The control variates technique Computer Simulation Techniques This method is otherwise known as the method of Concomitant Information.

i.) yi is the inter-arrival time associated with the xi observation.. observations of X.Y) < 0. .96 n where _ 1 n Z = n ∑ zi i=1 and 2 sz = 1 n ∑ (zi −Z) 2 . Likewise.2a Cov(X.. . y2.. _ sz Then we can construct a confidence interval for the estimate of E(X) using Z +1. Also.) Let x1..i.d.. (These observations are i. Let zi = xi + yi .. (It is also positively correlated with the random variable representing the service time.a (Y . x2.E(Y).2. where a is a constant to be estimated and Y is positively or negatively correlated to X.E(Y)). i=1. xn be n i. yn be n observations of Y. .d. Again...Y) so that Z has a smaller variance than X if a2 Var(Y) . Var(Z) = Var(X) + a2 Var(Y) . by nature of the simulation model.2a Cov(X. let y1. n −1 i=1 More generally. we have E(Z)=E(X). random variable Z can be obtained using € Z = X .Variance Reduction Techniques 163 random variable Y representing the inter-arrival time. n...

. provided that X and Y are correlated. We have 2a Var(Y) . we always get a reduction in the variance of Z for the optimal value of a. as follows m Z=X- ∑ ai(Yi .2 m ai i=1 ∑ Cov(X. are any real numbers. Thus. In this case.E(Yi)) .Yi) + 2 ∑ i=2 i-1 aiaj j=1 ∑ Cov(Yi.2Cov(X. Sample estimates can be used in order to approximately obtain a*. substituting into the expression for Var(Z) we have [Cov(X... Var(Z) = Var(X) + m a2 i=1 i ∑ m Var(Yi) .164 Computer Simulation Techniques We select a so that to minimize the right-hand side given in the above expression.Y) = 0 or Cov (X. The definition of Z can be further generalized using multiple control variables. .. i=1.m.Yj). i=1 where ai. The determination of a* requires a priori knowledge of the Var(Y) and Cov (X.Y)]2 Var(Y) Var(Z) = Var(X) - 2 = (1-ρXY ) Var(X).Y) a* = Var (Y) Now.Y).2.

educe as the sample size goes up. Consider the machine interference problem. b. Implement the control variates technique. . Observe how the variance and the standard error expressed as a percentage of the mean. Implement the antithetic variance reduction technique. Compare the above two variance reduction techniques against straight simulation.Variance Reduction Techniques Computer assignments 1. In particular. Contrast this against the execution cost (if available). 165 c. Carry out the following tasks: a. compare the three techniques for various sample sizes.
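As a starting point for task b above, the following sketch implements the estimator Z = X − a(Y − E(Y)), with a estimated from the sample. The toy data at the end is hypothetical: X is constructed as Y plus a small noise term, so that E(X) = E(Y) = 1 is known exactly and the improvement over the naive estimate can be seen.

```python
import random

def control_variate_estimate(xs, ys, ey):
    """Estimate E(X) using control variable Y with known mean ey.
    a* = Cov(X, Y) / Var(Y) minimizes the variance of Z = X - a(Y - E(Y))."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    var_y = sum((y - my) ** 2 for y in ys) / (n - 1)
    a = cov / var_y                                  # sample estimate of a*
    zs = [x - a * (y - ey) for x, y in zip(xs, ys)]
    return sum(zs) / n

# Toy data: Y is exponential with mean 1, X = Y + small Gaussian noise.
random.seed(11)
ys = [random.expovariate(1.0) for _ in range(5000)]
xs = [y + random.gauss(0, 0.1) for y in ys]
naive = sum(xs) / len(xs)                    # straight sample mean of X
est = control_variate_estimate(xs, ys, ey=1.0)
```

In the queueing setting of this chapter, xs would be the observed times in system and ys the associated inter-arrival times, whose mean is known from the input distribution.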

we will assume reservations with blocking. its available bandwidth is reduced appropriately during that interval. The requests arrive long before the actual start of the requested reservation interval. The bandwidth of all the links is the same and finite and will be one of the inputs to the simulation. Every time a link is reserved for an interval. either one link or two links. This period between the arrival of the request and the start of the interval is called the intermediate period. For this simulation we will limit the path to be over a maximum of two links. A path can be established between two LSRs directly over the link connecting them or via other LSRs over multiple links.. In this project. Each LSR is connected to every other LSR by two simplex links in opposite directions. Requests are independent of each other and each one specifies a required bandwidth and a time interval in the future for which it needs a path to be reserved between two LSRs. Project description You have to simulate a fully connected network of N LSRs and an advance reservation system in which reservation requests arrive dynamically.166 Computer Simulation Techniques CHAPTER 8: SIMULATION PROJECTS 8.1 An advance network reservation system – blocking scheme The objective of this project is to simulate an advance reservation system that is used to reserve bandwidth for connections in a mesh network of LSRs in advance of the time that the connections are required. That is. a reservation request is blocked if network resources cannot be reserved within a certain window of time defined by the reservation request. i.e. .

we will assume that the window is twice the length of the reservation interval. i. where t' varies from 0 to t. The simulation model The following are the inputs to the simulation: 1. t1+t'+t]. we will assume that time is slotted and the length of a slot is equal to some given duration. N. Link capacity. destination LSR y. t2] within which the reservation has to be scheduled. The request is accepted if a path can be reserved from x to y during any such interval for the requested bandwidth. For simplicity. as shown in the figure below. Maximum intermediate period. 4. Also. arriving at time t0 specifies the source LSR x. 3. Mean inter-arrival time of reservation requests. t2.t1 = 2t. All values for t. you have to check if a path can be reserved for any interval t within a window t1 to t2. In this simulation. 5. you have to check if each of the link(s) along the one-hop or along any of the two-hop paths from x to y has sufficient available bandwidth during the requested window. t1. Thus.. 2. . Intermediate period t0 Arrival of request t1 Reservation window t' t t2 For each request. and a window [t1. Otherwise. duration of the reservation interval t. 6.e. Total simulated number of arrivals. t2 are therefore integers. you have to check if a path can be reserved for the request during any interval [t1+t' . Maximum duration of reservation interval. it is blocked.Variance Reduction Techniques 167 A reservation request. Specifically. number of LSRs. we assume that a slot is equal to 5 minutes.

If reservations can be done up to five hours in advance then the value of 'maximum intermediate period’ will be 60. The link capacity is the maximum available bandwidth of a link when it is idle. This array will be used to indicate the link’s available bandwidth at each time slot. {0-1. This method is. you may chose to generate these paths on the fly during the execution of each reservation request. then the value of the input 'maximum duration of reservation interval' will be 48 (i. then the value of the 'mean inter-arrival time of reservation requests' will be 8. the possible paths from LSR 0 to LSR 3 are: {0-3}. we assume that time is slotted. and each slot is equal to five minutes.168 Computer Simulation Techniques As mentioned above. Next. 4-3}. Consequently. Thus each element of the array gives the available bandwidth on the link. Now. 1-3}. the first five minutes are represented by 1. 12 five minute slots/hour * 2 hours). Alternatively. For simplicity. in the simulation time scale. Thus. at the time instant given by the index of the array. and so on. Save the paths in a data structure of your choice. Arrivals of reservation requests . as this will be used to dimension an integer array for each link in the network. of course. {0-2. the next five minutes by 2. which will be either 1 or 0.5 * Total simulated number of arrivals * mean inter-arrival period. more CPU intensive. we will assume that the link capacity of each link is 1 Gbps. Recall that we will only consider paths of up to a maximum of two links. if we decied to set the maximum duration of a reservation interval to two hours. As an example. As mentioned above. A safe estimate is: 1. There will be as many such integer arrays as the number of links. you do not need to generate paths longer than 2 links. in a mesh network of 5 LSRs. and that the bandwidth requested each time a reservation request arrives is also 1 Gbps. and it should be initialized to the link capacity. 
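The per-slot link arrays described above make the feasibility check simple. The sketch below is one possible outline (the function names and the small ten-slot example are hypothetical): it scans every start offset t′ in the window and, on success, marks the slots of each link on the path as reserved.

```python
def path_available(links, t_start, t_end):
    """True if every link on the path has free bandwidth (value 1)
    in every slot of [t_start, t_end)."""
    return all(link[s] == 1 for link in links for s in range(t_start, t_end))

def try_reserve(links, t1, t):
    """Try every start offset t' in 0..t within the window [t1, t1 + 2t]."""
    for tp in range(t + 1):
        if path_available(links, t1 + tp, t1 + tp + t):
            for link in links:
                for s in range(t1 + tp, t1 + tp + t):
                    link[s] = 0          # reserve the 1 Gbps on each link
            return True
    return False                          # the request is blocked

# A two-link path over a ten-slot horizon, all bandwidth initially free.
links = [[1] * 10, [1] * 10]
accepted = try_reserve(links, t1=2, t=3)
```

In the full simulation, this check would be repeated for the one-hop path and each two-hop path from x to y, accepting the request on the first path and offset that fits.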
all links are assumed to have the same capacity. 2-3} and {0-4. If requests arrive on the average every forty minutes. Initialization You must first define the maximum simulation time (in slots).e. you have to generate a list of all possible paths from each LSR to every other LSR.. For example link03[5] is the available bandwidth on the link from LSR 0 to LSR 3 at simulation time = 5.
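As a concrete illustration, the initialization step above might be sketched in Python (the book lets you use any high-level language). The full-mesh assumption and all names (bandwidth, paths, the constants) are illustrative choices of this sketch, not prescribed by the text:

```python
# Illustrative sketch of the initialization step for a full-mesh network.
# Constants and names (bandwidth, paths, ...) are assumptions of this sketch.

N = 5                   # number of LSRs
LINK_CAPACITY = 1       # 1 Gbps link capacity, in bandwidth units
MEAN_INTERARRIVAL = 8   # mean inter-arrival time, in 5-minute slots
TOTAL_ARRIVALS = 9000   # total simulated number of arrivals

# Safe estimate of the maximum simulation time, in slots.
MAX_TIME = int(1.5 * TOTAL_ARRIVALS * MEAN_INTERARRIVAL)

# One integer array per link; element s holds the link's available
# bandwidth at time slot s, initialized to the link capacity.
bandwidth = {(i, j): [LINK_CAPACITY] * MAX_TIME
             for i in range(N) for j in range(N) if i != j}

def paths(x, y, n=N):
    """All paths of at most two links from x to y in a full mesh:
    the one-link path plus a two-link path through every other LSR."""
    result = [[(x, y)]]
    for k in range(n):
        if k != x and k != y:
            result.append([(x, k), (k, y)])
    return result
```

For example, paths(0, 3) yields the four paths {0-3}, {0-1, 1-3}, {0-2, 2-3} and {0-4, 4-3} listed above.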

Arrivals of reservation requests occur in a Poisson fashion, i.e., the inter-arrival times are exponentially distributed with a mean inter-arrival time specified as input to the simulation. The actual arrival times can be generated as:

arrival_time = previous_arrival_time - (mean inter-arrival time) * loge(random()),

where random() is a pseudo-random number generator that yields a random number between 0 and 1. The expression -(mean inter-arrival time) * loge(random()) yields a positive value which is a random sample from an exponential distribution with a mean equal to the "mean inter-arrival time". This is basically a random value of the next inter-arrival time, which, if added to the previous arrival time, will give you the time instance in the future when the arrival will occur.

Such a generator is readily available in most programming languages. However, be aware that a random number generator requires an initial value, known as a seed. Some of the random number generators are automatically seeded. For debugging purposes, you should use a random number generator that you can seed yourself. Also, it is advisable that you always use the same seed, so as to recreate the same sequence of events. (Make sure you read the rules of the random number generator as to what type of number it will accept as a seed.) Occasionally, these generators do not have good statistical behaviour. In view of this, it may not be a bad idea to download one from a reliable mathematical package, such as Matlab.

Each reservation request specifies the reservation parameters: source LSR x, destination LSR y, beginning of the reservation interval t1, and duration of the reservation interval t. (The reservation window is set equal to 2*t, and the required bandwidth is 1 Gbps.) You have to generate these values as described below before processing an arrival request.

1. The source LSR is generated randomly as: x = N * random(). The destination LSR is generated in the same way: y = N * random(). (Number the LSRs from 0 to N-1.) You should ensure that x and y are not the same.

2. The intermediate period is generated as: t1 = arrival_time + (maximum intermediate period) * random().
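A sketch of the arrival and request generation in Python, assuming the language's built-in generator is adequate; the constant values, the fixed seed, and the clamping of the duration to at least one slot are assumptions of this sketch:

```python
import math
import random

random.seed(1)            # fixed seed, so the event sequence is reproducible

N = 5                     # number of LSRs
MEAN_INTERARRIVAL = 8     # in slots
MAX_INTERMEDIATE = 60     # maximum intermediate period, in slots
MAX_DURATION = 24         # maximum duration of reservation interval, in slots

def next_arrival(previous_arrival_time):
    # -mean * log(random()) is a sample from the exponential distribution
    return previous_arrival_time - MEAN_INTERARRIVAL * math.log(random.random())

def new_request(arrival_time):
    x = int(N * random.random())           # source LSR, 0..N-1
    y = int(N * random.random())           # destination LSR
    while y == x:                          # x and y must not be the same
        y = int(N * random.random())
    t1 = int(arrival_time + MAX_INTERMEDIATE * random.random())
    t = 1 + int(MAX_DURATION * random.random())  # at least one slot long
    return x, y, t1, t                     # window is [t1, t1 + 2*t]
```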

3. The duration of the reservation interval is generated as: t = (maximum duration of reservation interval) * random().

Now check if a path can be reserved from x to y. There are several search schemes you can employ to find such a path. Below, we describe one such scheme. Let us assume a mesh network of 5 LSRs and that x = 0, y = 3. Then, the one-link path is {0-3}, and the two-link paths are {0-1, 1-3}, {0-2, 2-3} and {0-4, 4-3}.

1. Check the one-link path to find a free period of time of length t starting at time t1+t', where t' = 0, 1, 2, …, t.
2. Failing to find a free period of length t, proceed to check each two-link path, selected randomly, to find a free period of time of length t starting at time t1+t', where t' = 0, 1, 2, …, t.
3. If a free period is found, the links on that path are reserved for the time period t starting at t1+t'.
4. The request is blocked if no free period of duration t is found on the one-link and all the two-link paths.

The main logic of the simulation program is as follows:

1. Generate next arrival.
2. Check to see whether it can be scheduled, using scheme 1 or 2.
3. Update variables accordingly.
4. Update statistics.
5. Is the total number of simulated arrivals equal to the value of "total simulated number of arrivals" specified at input time? If no, go back to step 1. If yes, stop the simulation and calculate the final statistics.

Statistical estimation

You will use the simulation model to calculate a) the probability that a reservation request is blocked, and b) the mean delay to schedule a reservation request. In addition, for a given set of input values, you will have to calculate confidence intervals. A confidence interval provides an indication of the error associated with the sample mean. An easy way to do this is to use the batch means method, as described below. To calculate the confidence interval you will have to obtain about 30 different estimates of the blocking probability under the same input values. Run your simulation for a large number of arrivals.
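The search scheme above might be sketched as follows. This sketch assumes 1 Gbps links and 1 Gbps requests, so a slot is free when its array element is at least 1; is_free, reserve and schedule are illustrative names:

```python
import random

def is_free(path, start, t, bandwidth):
    """True if every link on the path has a free slot in [start, start+t)."""
    return all(bandwidth[link][s] >= 1
               for link in path for s in range(start, start + t))

def reserve(path, start, t, bandwidth):
    for link in path:
        for s in range(start, start + t):
            bandwidth[link][s] -= 1

def schedule(one_link, two_link, t1, t, bandwidth):
    """Scheme described above: exhaust the one-link path over t' = 0..t,
    then each two-link path in random order. Returns t', or None if blocked."""
    for path in [one_link] + random.sample(two_link, len(two_link)):
        for t_prime in range(t + 1):
            if is_free(path, t1 + t_prime, t, bandwidth):
                reserve(path, t1 + t_prime, t, bandwidth)
                return t_prime
    return None
```

The returned t' is the scheduling delay used later for the delay statistics.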

Then group your results as follows. Let us say that you run the simulation for 9,000 arrivals, and that you block your results in groups of 300 arrivals. That is, you run your simulation for the first 300 arrivals and you calculate the blocking probability by dividing the number of times an arrival was blocked by the total number of arrivals. You store this number in an array. Then, zero the variable where you keep count of the number of blocked requests, and continue to simulate for another 300 arrivals without changing anything in your simulation. You then calculate the new blocking probability over the second set of 300 observations. You repeat this process until you have obtained 30 different blocking probabilities.

Now, let x1, x2, x3, …, xn be the calculated blocking probabilities from the n = 30 batches of observations. Then, the mean blocking probability is:

x_mean = (1/n) * Σ_{i=1..n} x_i

and the standard deviation s is:

s = sqrt( Σ_{i=1..n} (x_mean - x_i)^2 / (n - 1) )

The confidence interval at 95% confidence is given by:

( x_mean - 1.96 s/√n , x_mean + 1.96 s/√n )

The confidence interval tells us that the true population mean lies within the interval 95% of the time. That is, if we repeat the above experiment 100 times, then, on the average, 95% of these times the true population mean will be within the interval. Accurate simulation results require that the confidence interval is very small, i.e., the error 1.96(s/√n) ~ 0.01 x_mean. If the error is not small enough, increase the batch size from 300 observations to 350, and run the simulation again for 30 batches. Keep increasing the batch size until you are satisfied that the error is within the limits.

The delay to schedule a reservation request is the difference from the time of the beginning of the requested window, t1, to the beginning of the time that the request was actually scheduled. That is, it is given by the value of t'. Keep track of all the values of t' for all the successfully scheduled requests.
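The batch means computation can be written directly from the formulas above; this minimal Python sketch assumes the batch estimates have already been collected in a list:

```python
import math

def confidence_interval(batch_means):
    """Sample mean and 95% confidence half-width (error) of n batch estimates."""
    n = len(batch_means)
    x_mean = sum(batch_means) / n
    s = math.sqrt(sum((x_mean - x) ** 2 for x in batch_means) / (n - 1))
    error = 1.96 * s / math.sqrt(n)
    return x_mean, error

def small_enough(batch_means):
    """True if the error is within about 1% of the mean, as suggested above."""
    x_mean, error = confidence_interval(batch_means)
    return error <= 0.01 * x_mean
```

With 30 batch blocking probabilities, the interval is (x_mean - error, x_mean + error); if small_enough returns False, increase the batch size and rerun.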

At the end of each batch, calculate the average delay. Use the batch means method as described above to calculate the mean delay and its confidence intervals, assuming that xi is the mean delay of the ith batch.

Input values and required results

For your simulation experiments, use the following input values:

    Inputs                            Values                  Distribution
    Number of LSRs (N)                5, 6, 7
    Link capacity                     1 Gbps
    Mean inter-arrival time           5, 6, …, 20             Exponential
    Maximum intermediate period       50                      Uniform
    Maximum duration of reservation   100, 110, 120, …, 200   Uniform
    Simulated number of arrivals      Initially 9000

Recall that in this simulation, the requested bandwidth is always equal to 1 Gbps, which is the link's capacity.

When you start the simulation, you will assume that there are no reservations, i.e., the system is empty. The initial behaviour of the simulation will be affected by this "system empty" initial condition. In order to eliminate the effects of the initial condition, you should run the simulation for 100 arrivals, and after that start the batch means method described above.

Run your simulation to obtain the following:

a) Plot the mean blocking probability and its confidence interval vs. the mean inter-arrival time (range given in the above table) for N (number of LSRs) = 5, 6, and 7.
b) Plot the mean blocking probability and its confidence interval vs. the maximum duration of reservation (range given in the above table) for N = 5, 6, and 7.
c) Plot the mean delay and its confidence interval vs. the mean inter-arrival times (range given in the above table) for N = 5, 6, and 7.
d) Plot the mean delay and its confidence interval vs. the maximum duration of reservation (range given in the above table) for N = 5, 6, and 7.

For (a) and (c), assume that the maximum reservation duration is 150. For (b) and (d), assume that the mean inter-arrival time is equal to 12.

e) Discuss your results.

An advance network reservation system - delayed scheme

This project builds on the previous simulation model of an advance reservation system; you will be able to re-use most of the previous code. In Project 1, a request was blocked if it was not possible to find a path for the requested duration within the requested reservation window. In this project, we will assume that a reservation request is never blocked. Rather, a reservation is made on one of the paths at the first available time from the requested start time of the reservation window. The objective of this project is to simulate the same advance reservation system as before under the assumption of delayed reservations.

The simulation model

All the assumptions regarding the simulation and data structures are the same as before, with the exception of the search algorithm used to find a free path. The search algorithm, unlike the one used in Project 1, searches for a free period of time over the one-link and over all the two-link paths at each time slot. As an example, let us assume a mesh network of 5 LSRs and that x = 0, y = 3. Then, the one-link path is {0-3}, and the two-link paths are {0-1, 1-3}, {0-2, 2-3} and {0-4, 4-3}. The algorithm first checks the one-link path to see if it is available for t units of time starting at time t1. Failing that, it repeats the same process by checking each of the two-link paths, selected randomly. Failing to find a free period, repeat the above but start at time t1+t', where t' = 1. Continue by increasing t' by one, until you find a free period on one of the paths. The value of t' is the delay from the beginning of the requested reservation window to the beginning of the actual reservation.
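The delayed search might be sketched as follows. The demand parameter anticipates the variable bandwidth requests of this project and is an assumption of this sketch; note also that a real implementation must keep t1 + t' + t within the dimensioned length of the per-link arrays:

```python
import random

def is_free(path, start, t, bandwidth, demand=1):
    """True if every link on the path can carry `demand` in [start, start+t)."""
    return all(bandwidth[link][s] >= demand
               for link in path for s in range(start, start + t))

def schedule_delayed(one_link, two_link, t1, t, bandwidth, demand=1):
    """Delayed scheme: never blocks; keeps increasing t' until a free
    period of length t is found on the one-link or a two-link path.
    Returns the delay t' from the start of the requested window."""
    t_prime = 0
    while True:
        for path in [one_link] + random.sample(two_link, len(two_link)):
            if is_free(path, t1 + t_prime, t, bandwidth, demand):
                for link in path:
                    for s in range(t1 + t_prime, t1 + t_prime + t):
                        bandwidth[link][s] -= demand
                return t_prime
        t_prime += 1
```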

Values and required results

For your simulation experiments, use the following input values:

    Inputs                            Values                  Distribution
    Number of LSRs (N)                5, 6, 7
    Link capacity                     10 Gbps
    Mean inter-arrival time           5, 6, …, 20             Exponential
    Maximum intermediate period       50                      Uniform
    Maximum duration of reservation   200, 210, 220, …, 300   Uniform

The requested bandwidth is an integer number between 1 and 10 Gbps, uniformly distributed. The simulated number of arrivals should be the same as in Project 1, and you should use 30 batches with the same batch size you used in Project 1.

Run your simulation to obtain the following:

a) Plot the mean delay and its confidence interval vs. the mean inter-arrival times (range given in the above table) for N (number of LSRs) = 5, 6, and 7.
b) Plot the mean delay and its confidence interval vs. the maximum duration of reservation (range given in the above table) for N = 5, 6, and 7.
c) Plot the percent of time that a reservation was made within its requested reservation window and its confidence interval vs. the mean inter-arrival times (range given in the above table) for N = 5, 6, and 7. This percent of time can be calculated in the same way you calculate the blocking probability in Project 1. That is, follow the batch means method, and for each batch i calculate the ratio of the number of reservations made within their requested reservation window divided by the total number of successful reservations, associating this value with xi.
d) Plot the percent of time that a reservation was made within its requested reservation window and its confidence interval vs. the maximum duration of reservation (range given in the above table) for N = 5, 6, and 7.
e) Plot the percent of time a reservation was made on a one-link path and its confidence interval vs. the mean inter-arrival times (range given in the above table) for N = 5, 6, and 7.

This percentage can be calculated in the same way as in item (c) above. That is, follow the batch means method, and for each batch i calculate the ratio of the number of reservations made on one-link paths divided by the total number of successful reservations, associating this value with xi.

f) Plot the percent of time a reservation was made on a one-link path and its confidence interval vs. the maximum duration of reservation (range given in the above table) for N = 5, 6, and 7.

g) Discuss your results.

For (a), (c), and (e), assume that the maximum reservation duration is 250. For (b), (d), and (f), assume that the mean inter-arrival time is 12.
