
Chapter 9:

Queuing Models

2007 Pearson Education

Queuing or Waiting Line Analysis


Queues (waiting lines) affect people
every day
A primary goal is finding the best level of
service
Analytical modeling (using formulas) can
be used for many queues
For more complex situations, computer
simulation is needed

Queuing System Costs


1. Cost of providing service
2. Cost of not providing service (waiting time)

Three Rivers Shipping Example

Average of 5 ships arrive per 12 hr shift


A team of stevedores unloads each ship
Each team of stevedores costs $6000/shift
The cost of keeping a ship waiting is
$1000/hour
How many teams of stevedores to employ
to minimize system cost?

Three Rivers Waiting Line Cost Analysis

                                         Number of Teams of Stevedores
                                         1          2          3          4
Ave hours waiting per ship               7          4          3          2
Cost of ship waiting time (per shift)    $35,000    $20,000    $15,000    $10,000
Stevedore cost (per shift)               $6,000     $12,000    $18,000    $24,000
Total Cost                               $41,000    $32,000    $33,000    $34,000
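
The cost comparison can be reproduced with a short Python sketch. The average waiting hours per ship are not stated as numbers in the example; here they are back-calculated from the waiting costs (for one team, $35,000 / (5 ships x $1000 per hour) = 7 hours, and so on), so treat them as an assumption of this sketch.

```python
# Three Rivers Shipping: total cost per shift for 1 to 4 stevedore teams.
SHIPS_PER_SHIFT = 5          # average arrivals per 12-hour shift
WAIT_COST_PER_HOUR = 1000    # cost of keeping one ship waiting, $/hour
TEAM_COST_PER_SHIFT = 6000   # cost of one stevedore team, $/shift

# Average hours each ship waits, keyed by number of teams.
# Assumption: back-calculated from the waiting costs in the table.
avg_wait_hours = {1: 7, 2: 4, 3: 3, 4: 2}


def total_cost(teams: int) -> int:
    """Waiting cost plus service cost for one shift."""
    waiting = SHIPS_PER_SHIFT * avg_wait_hours[teams] * WAIT_COST_PER_HOUR
    service = teams * TEAM_COST_PER_SHIFT
    return waiting + service


costs = {teams: total_cost(teams) for teams in avg_wait_hours}
best = min(costs, key=costs.get)
print(costs)
print("cheapest number of teams:", best)
```

Under these numbers, the lowest total cost occurs with two teams.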

Characteristics of a
Queuing System
The queuing system is determined by:
Arrival characteristics
Queue characteristics
Service facility characteristics

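A Queuing System

[Figure: a queuing system. Customers arrive at the arrival rate, wait in the queue, are served at the service rate, and depart. Labels: average wait in queue (Wq), average number in queue (Lq), average time in system (W), average number in system (L).]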

Why is Queuing Analysis Important?
Capacity problems are very common in industry and one
of the main drivers of process redesign
Need to balance the cost of increased capacity against the
gains of increased productivity and service

Queuing and waiting time analysis is particularly
important in service systems
Large costs of waiting and of lost sales due to waiting

Prototype Example ER at County Hospital


Patients arrive by ambulance or by their own accord
One doctor is always on duty
More and more patients seek help, leading to longer waiting
times
Question: Should another MD position be instated?

Examples of Real World Queuing Systems
Commercial Queuing Systems
Commercial organizations serving external customers
Ex. Dentist, bank, ATM, gas stations, plumber, garage

Transportation service systems


Vehicles are customers or servers
Ex. Vehicles waiting at toll stations and traffic lights, trucks or
ships waiting to be loaded, taxi cabs, fire engines, elevators,
buses

Business-internal service systems


Customers receiving service are internal to the organization
providing the service
Ex. Inspection stations, conveyor belts, computer support

Social service systems


Ex. Judicial process, the ER at a hospital, waiting lists for organ
transplants or student dorm rooms

Components of a Basic Queuing Process

[Figure: jobs from the input source (calling population) enter the queuing system through the arrival process, wait in the queue (queue configuration, queue discipline), pass through the service mechanism (service process), and served jobs leave the system.]

Principal Queue Parameters

1. Calling Population
2. Arrival Process
3. Service Process
4. Number of Servers
5. Queue Discipline

1. The Calling Population


Population of customers or jobs
The size can be finite or infinite
The latter is most common

Can be homogeneous
Only one type of customer/job

Or heterogeneous
Several different kinds of customers/jobs

2. Arrival Process
In what pattern do jobs/customers arrive at the
queuing system?
Distribution of arrival times?
Batch arrivals?
Finite population?
Finite queue length?

Poisson arrival process often assumed


Many real-world arrival processes can be modeled
using a Poisson process

3. Service Process
How long does it take to service a job or
customer?
Distribution of service times?
Rework or repair?
Service center (machine) breakdown?

Exponential service times often assumed


Works well for maintenance or unscheduled service
situations


4. Number of Servers
How many servers are available?
Single Server Queue

Multiple Server Queue


Example: Two Queue Configurations

[Figure: two configurations. Multiple queues: one line in front of each server. Single queue: one line feeding all servers.]

Multiple vs. Single Customer Queue Configuration

Multiple Line Advantages
1. The service provided can be differentiated
Ex. Supermarket express lanes
2. Labor specialization possible
3. Customer has more flexibility
4. Balking behavior may be deterred
Several medium-length lines are less intimidating than one long line

Single Line Advantages
1. Guarantees fairness
FIFO applied to all arrivals
2. No customer anxiety regarding choice of queue
3. Avoids "cutting in" problems
4. The most efficient setup for minimizing time in the queue
5. Jockeying (line switching) is avoided

5. Queue Discipline
How are jobs / customers selected from the
queue for service?
First Come First Served (FCFS)
Shortest Processing Time (SPT)
Earliest Due Date (EDD)
Priority (jobs are in different priority classes)

FCFS is the default assumption for most models

Arrival Characteristics
Size of the arrival population either
infinite or limited
Arrival distribution:
Either fixed or random
Either measured by time between
consecutive arrivals, or arrival rate
The Poisson distribution is often used
for random arrivals

Poisson Distribution
Average arrival rate is known
Average arrival rate is constant for some
number of time periods
Number of arrivals in each time period is
independent
As the time interval approaches 0, the
average number of arrivals approaches 0

Poisson Distribution
$\lambda$ = the average arrival rate per time unit
P(x) = the probability of exactly x arrivals
occurring during one time period

$P(x) = \dfrac{e^{-\lambda} \lambda^{x}}{x!}$
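
As a quick illustration, the Poisson probability of exactly x arrivals in one period, e^(-lambda) * lambda^x / x!, can be evaluated directly in Python. The arrival rate of 2 per period used here is only an illustrative value.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x arrivals in one period with mean rate lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


lam = 2.0  # illustrative average arrival rate per time period (assumed)
for x in range(5):
    print(x, round(poisson_pmf(x, lam), 4))
```

The probabilities over all x sum to 1, which is a handy sanity check when adapting the function.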

Behavior of Arrivals
Most queuing formulas assume that all
arrivals stay until service is completed
Balking refers to customers who do not
join the queue
Reneging refers to customers who join
the queue but give up and leave before
completing service

Poisson Process with Rate $\lambda$

Interarrival times are independent and
exponentially distributed
Models well the accumulated traffic of
many independent sources
The average interarrival time is $1/\lambda$
(secs/packet), so $\lambda$ is the arrival rate
(packets/sec)
[Figure: timeline of arrivals showing the interarrival times]

Batch Arrivals
Some sources transmit in packet bursts
May be better modeled by a batch arrival
process (e.g., bursts of packets arriving
according to a Poisson process)
The case for a batch model is weaker at
queues after the first, because of shaping
[Figure: timeline of batch arrivals showing the interarrival times]

Queue Characteristics
Queue length (max possible queue length)
either limited or unlimited
Service discipline usually FIFO (First In
First Out)

Service Facility Characteristics


1. Configuration of service facility
Number of servers (or channels)
Number of phases (or service stops)
2. Service distribution
The time it takes to serve 1 arrival
Can be fixed or random
Exponential distribution is often used

Exponential Distribution
$\mu$ = average service rate (so $1/\mu$ is the average service time)
t = the length of service time (t > 0)
P(t) = probability that service time will be
greater than t

$P(t) = e^{-\mu t}$
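
A minimal Python sketch of this tail probability, P(service time > t) = e^(-mu * t), follows. The service rate of 3 customers per hour is an illustrative assumption, and t must be in the same time unit as the rate.

```python
import math


def prob_service_exceeds(t: float, mu: float) -> float:
    """P(service time > t) for exponential service with rate mu."""
    return math.exp(-mu * t)


mu = 3.0  # illustrative service rate, customers per hour (assumed)
for minutes in (10, 20, 30, 60):
    t_hours = minutes / 60
    print(minutes, "min:", round(prob_service_exceeds(t_hours, mu), 4))
```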

Queuing System Concepts:
Arrival Rate, Occupancy, Time in the System
Queuing system
Data network where packets arrive, wait in various
queues, receive service at various points, and exit after
some time

Arrival rate
Long-term number of arrivals per unit time

Occupancy
Number of packets in the system (averaged over a
long time)

Time in the system (delay)


Time from packet entry to exit (averaged over many
packets)

Stability and Steady-State


A single queue system is stable if
packet arrival rate < system transmission capacity

For a single queue, the ratio


packet arrival rate / system transmission capacity

is called the utilization factor


Describes the loading of a queue

In an unstable system packets accumulate in various
queues and/or get dropped
For unstable systems with large buffers some packet delays
become very large
Flow/admission control may be used to limit the packet arrival rate
Prioritization of flows keeps delays bounded for the important traffic

Stable systems with time-stationary arrival traffic approach a
steady-state

Little's Law
For a given arrival rate, the time in the system is
proportional to packet occupancy
$N = \lambda T$
where
N: average # of packets in the system
$\lambda$: packet arrival rate (packets per unit time)
T: average delay (time in the system) per packet
Examples:
On rainy days, streets and highways are more crowded
Fast food restaurants need a smaller dining room than regular
restaurants with the same customer arrival rate
Large buffering together with large arrival rate cause large
delays
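
The restaurant example can be made concrete with a small Python sketch of Little's Law (occupancy = arrival rate x time in system). The arrival rate and the two dining times are hypothetical numbers chosen only to show the effect.

```python
def occupancy(arrival_rate: float, time_in_system: float) -> float:
    """Little's Law: average number in system = arrival rate * average time in system."""
    return arrival_rate * time_in_system


arrival_rate = 30.0  # customers per hour (hypothetical, same for both restaurants)

fast_food = occupancy(arrival_rate, 0.25)  # customers stay 15 minutes (hypothetical)
regular = occupancy(arrival_rate, 1.0)     # customers stay 1 hour (hypothetical)

print("fast food seats needed on average:", fast_food)
print("regular restaurant seats needed on average:", regular)
```

With the same arrival rate, the shorter stay gives a proportionally smaller average occupancy.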

Explanation of Little's Law

Amusement park analogy: people arrive, spend
time at various sites, and leave
They pay 1 dollar per unit time in the park
The rate at which the park earns is N dollars per unit
time (N: average # of people in the park)
The rate at which people pay is $\lambda T$ dollars per unit time
($\lambda$: traffic arrival rate, T: time per person)
Over a long horizon:
Rate of park earnings = Rate of people's payment
or
$N = \lambda T$

Measuring Queue Performance

$\rho$ = utilization factor (for a single server, the
probability that the server is busy)
Lq = average number in the queue
L = average number in the system
Wq = average waiting time
W = average time in the system
P0 = probability of 0 customers in system
Pn = probability of exactly n customers in
system

Kendall's Notation
A/B/s
A = Arrival distribution
(M for Poisson, D for deterministic, and
G for general)
B = Service time distribution
(M for exponential, D for deterministic,
and G for general)
s = number of servers

The Queuing Models Covered Here All Assume

1. Arrivals follow the Poisson distribution
2. FIFO service
3. Single phase
4. Unlimited queue length
5. Steady state conditions

We will look at 5 of the most commonly used
queuing systems.

Models Covered

Name (Kendall Notation)                    Example
Simple system (M/M/1)                      Customer service desk in a store
Multiple server (M/M/s)                    Airline ticket counter
Constant service (M/D/1)                   Automated car wash
General service (M/G/1)                    Auto repair shop
Limited population (M/M/s/$\infty$/N)      An operation with only 12 machines that might break

Device Queuing Mechanisms


Common queue examples for IP routers
FIFO: First In First Out
PQ: Priority Queuing
WFQ: Weighted Fair Queuing
Combinations of the above

Service types from a queuing theory standpoint


Single server (one queue - one transmission line)
Multiple server (one queue - several transmission lines)
Priority server (several queues with hard priorities - one
transmission line)
Shared server (several queues with soft priorities - one
transmission line)

Single Server FIFO


Single transmission line serving packets on a FIFO
(First-In-First-Out) basis
Each packet must wait for all packets found in the
system to complete transmission, before starting
transmission
Departure Time = Arrival Time + Workload Found in the System + Transmission Time
Packets arriving to a full buffer are dropped
[Figure: arrivals entering a FIFO queue served by one transmission line]
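
The departure-time relation for a single FIFO server can be sketched in Python as follows. This sketch assumes an unlimited buffer (no drops), and the arrival and transmission times are made-up sample values.

```python
def fifo_departures(arrivals, tx_times):
    """Departure times for a single FIFO server with an unlimited buffer.

    arrivals: packet arrival times in non-decreasing order
    tx_times: transmission time of each packet
    """
    departures = []
    prev_departure = 0.0
    for arrival, tx in zip(arrivals, tx_times):
        # Workload found: unfinished work still ahead of this packet on arrival.
        workload_found = max(prev_departure - arrival, 0.0)
        departure = arrival + workload_found + tx
        departures.append(departure)
        prev_departure = departure
    return departures


# Sample (hypothetical) arrival and transmission times.
print(fifo_departures([0.0, 1.0, 2.0, 10.0], [3.0, 1.0, 2.0, 1.0]))
# prints [3.0, 4.0, 6.0, 11.0]
```

The last packet arrives to an empty system, so its departure time is just its arrival time plus its own transmission time.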

FIFO Queue
Packets are placed on outbound link to egress device in FIFO order
Device (router, switch) multiplexes different flows arriving on various
ingress ports onto an output buffer forming a FIFO queue

Multiple Servers
Multiple packets are transmitted
simultaneously on multiple lines/servers
Head of the line service: packets wait in a
FIFO queue, and when a server becomes
free, the first packet goes into service
[Figure: arrivals entering a single FIFO queue served by multiple transmission lines]

Priority Servers
Packets form priority classes (each may have several flows)
There is a separate FIFO queue for each priority class
Packets of lower priority start transmission only if no higher priority
packet is waiting
Priority types:
Non-preemptive (high priority packet must wait for a lower priority
packet found under transmission upon arrival)
Preemptive (high priority packet does not have to wait)
[Figure: three arrival streams (Class 1, Class 2, Class 3) held in separate high, intermediate, and low priority queues that share one transmission line]

Priority Queuing
Packets are classified into separate queues
E.g., based on source/destination IP address, source/destination TCP port,
etc.

All packets in a higher priority queue are served before a lower priority
queue is served
Typically in routers, if a higher priority packet arrives while a lower priority
packet is being transmitted, it waits until the lower priority packet completes

Shared Servers
Again we have multiple classes/queues, but they are
served with a soft priority scheme
Round-robin
Weighted fair queuing
[Figure: three arrival streams (Class 1, Class 2, Class 3) held in separate weighted queues, with weights 10, 3, and 1, that share one transmission line]

Round-Robin/Cyclic Service
Round-robin serves each queue in sequence
A queue that is empty is skipped
Each queue when served may have limited service (at most k packets
transmitted with k = 1 or k > 1)

Round-robin is fair for all queues (as long as some
queues do not have longer packets than others)
Round-robin cannot be used to enforce bandwidth
allocation among the queues.
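
A minimal Python sketch of cyclic service with a per-visit limit of k packets is shown below. The function name and the packet labels are illustrative, and all packets are assumed to be waiting at the start.

```python
from collections import deque


def round_robin(queues, k=1):
    """Serve queues in cyclic order, at most k packets per visit.

    Empty queues are skipped. Returns the packets in transmission order.
    """
    queues = [deque(q) for q in queues]
    order = []
    while any(queues):              # an empty deque is falsy
        for q in queues:
            for _ in range(k):
                if not q:
                    break           # queue is empty: move on to the next one
                order.append(q.popleft())
    return order


# Three queues; the second one is empty and is skipped.
print(round_robin([["a1", "a2", "a3"], [], ["c1", "c2"]], k=1))
print(round_robin([["a1", "a2", "a3"], [], ["c1", "c2"]], k=2))
```

The service counts packets, not bits, which is why a queue with longer packets would receive more bandwidth than the others.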

Fair Queuing
This scheduling method is inspired by the most fair of methods:
Transmit one bit from each queue in cyclic order (bit-by-bit round robin)
Skip queues that are empty

To approximate the bit-by-bit processing behavior, for each packet


We calculate upon arrival its finish time under bit-by-bit round robin
assuming all other queues are continuously busy, and we transmit by
FIFO within each queue
Transmit next the packet with the minimum finish time

Important properties:
Priority is given to short packets
Equal bandwidth is allocated to all queues that are continuously busy
[Figure: timeline showing the arrival times and finish times of packets i-1 and i, and the departure time of packet i]

Weighted Fair Queuing


Fair queuing cannot be used to implement bandwidth allocation and
soft priorities
Weighted fair queuing is a variation that corrects this deficiency
Let $w_k$ be the weight of the kth queue
Think of round-robin with queue k transmitting $w_k$ bits upon its turn
If all queues always have something to send, the kth queue receives
bandwidth equal to a fraction $w_k / \sum_i w_i$ of the total bandwidth

Fair queuing corresponds to $w_k = 1$
Priority queuing corresponds to the weights being very high as we
move to higher priorities
Again, to deal with the segmentation problem, we approximate as
follows: For each packet:
We calculate its finish time (under the weighted bit-by-bit round robin
scheme)
We next transmit the packet with the minimum finish time
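
The finish-time rule can be illustrated with a simplified Python sketch. It assumes every packet is already waiting at time 0, so the finish tag of a packet is just the previous tag in its queue plus length / weight. A real scheduler must also track arrival times, which this sketch leaves out. The weights 3, 1, 1 and the 300-bit packets are sample values.

```python
import heapq


def wfq_order(queues, weights):
    """Transmission order under simplified weighted fair queuing.

    queues:  list of lists of packet lengths in bits, one list per queue
    weights: weight w_k of each queue
    Assumes all packets are backlogged at time 0.
    Returns (queue_index, packet_index) pairs in transmission order.
    """
    heap = []
    for k, (packets, w) in enumerate(zip(queues, weights)):
        finish = 0.0
        for j, length in enumerate(packets):
            finish += length / w  # finish tag under weighted bit-by-bit round robin
            heapq.heappush(heap, (finish, k, j))
    order = []
    while heap:
        _, k, j = heapq.heappop(heap)  # packet with the minimum finish tag
        order.append((k, j))
    return order


queues = [[300] * 6, [300] * 2, [300] * 2]  # sample packet lengths in bits
order = wfq_order(queues, weights=[3, 1, 1])
print(order)
```

In this run queue 0 sends three of the first five packets, matching its share of 3 / (3 + 1 + 1) of the bandwidth. Setting all weights to 1 gives plain fair queuing.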

Weighted Fair Queuing

Illustration
Weights:
Queue 1 = 3
Queue 2 = 1
Queue 3 = 1

Combination of Several Queuing Schemes

Example: voice (PQ), guaranteed bandwidth (WFQ), best effort
(Cisco's LLQ implementation)

Single Server Queuing System (M/M/1)

Poisson arrivals
Arrival population is unlimited
Exponential service times
All arrivals wait to be served
$\lambda$ is constant
$\mu > \lambda$ (average service rate > average
arrival rate)

M/M/1 System
Nomenclature: M stands for Memoryless (a property of
the exponential distribution)
The first M stands for Poisson arrival process (which is memoryless)
The second M stands for exponentially distributed transmission times

Assumptions:
Arrival process is Poisson with rate $\lambda$ packets/sec
Packet transmission times are exponentially distributed with mean $1/\mu$
One server
Independent interarrival times and packet transmission times

Transmission time is proportional to packet length
Note $1/\mu$ is secs/packet so $\mu$ is packets/sec (packet
transmission rate of the queue)
Utilization factor: $\rho = \lambda/\mu$ (stable system if $\rho < 1$)

Delay Calculation
Let
Q = Average time spent waiting in queue
T = Average packet delay (transmission plus
queuing)
Note that $T = 1/\mu + Q$
Also by Little's law
$N = \lambda T$ and $N_q = \lambda Q$
where
$N_q$ = Average number waiting in queue
These quantities can be calculated with formulas
derived by Markov chain analysis (see references)

M/M/1 Results
The analysis gives the steady-state
probabilities of number of packets in queue or
transmission
$P\{n \text{ packets}\} = \rho^n (1-\rho)$ where $\rho = \lambda/\mu$
From this we can get the averages:
$N = \rho/(1-\rho)$
$T = N/\lambda = \rho/(\lambda(1-\rho)) = 1/(\mu-\lambda)$

Example: How Delay Scales with Bandwidth

Occupancy and delay formulas
$N = \rho/(1-\rho)$, $T = 1/(\mu-\lambda)$, $\rho = \lambda/\mu$

Assume:
Traffic arrival rate $\lambda$ is doubled
System transmission capacity $\mu$ is doubled

Then:
Queue sizes stay at the same level ($\rho$ stays the same)
Packet delay is cut in half ($\mu$ and $\lambda$ are doubled)

A conclusion: In high speed networks
propagation delay increases in importance relative to delay
buffer size and packet loss may still be a problem
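
The scaling claim can be checked numerically with the M/M/1 formulas N = rho / (1 - rho) and T = 1 / (mu - lambda). The rates of 80 and 100 packets/sec are illustrative assumptions.

```python
def mm1_occupancy_and_delay(lam: float, mu: float):
    """Return (rho, N, T) for an M/M/1 queue with arrival rate lam and service rate mu."""
    if lam >= mu:
        raise ValueError("unstable: arrival rate must be below service rate")
    rho = lam / mu
    n = rho / (1 - rho)    # average number of packets in the system
    t = 1 / (mu - lam)     # average delay per packet
    return rho, n, t


# Illustrative rates in packets/sec (assumed, not from the slides).
rho1, n1, t1 = mm1_occupancy_and_delay(80.0, 100.0)
rho2, n2, t2 = mm1_occupancy_and_delay(160.0, 200.0)  # both rates doubled

print("base:   ", rho1, n1, t1)
print("doubled:", rho2, n2, t2)
```

Utilization and occupancy are unchanged after doubling, while the delay drops to half of its original value.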

M/M/m, M/M/$\infty$ System

Same as M/M/1, but it has m (or $\infty$) servers
In M/M/m, the packet at the head of the queue
moves to service when a server becomes free
Qualitative result
Delay increases to $\infty$ as $\rho = \lambda/(m\mu)$ approaches 1

There are analytical formulas for the occupancy
probabilities and average delay of these
systems

Finite Buffer Systems: M/M/m/k


The M/M/m/k system
Same as M/M/m, but there is buffer space for at most
k packets. Packets arriving at a full buffer are dropped

There are formulas for average delay, steady-state
occupancy probabilities, and loss probability
The M/M/m/m system is used widely to size
telephone or circuit switching systems
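
For the M/M/m/m case, the loss probability is given by the Erlang B formula, which is not spelled out in these slides. The sketch below uses its standard recursion. The function names and the sizing helper are illustrative, and the offered load a = lambda / mu is in erlangs.

```python
def erlang_b(offered_load: float, m: int) -> float:
    """Blocking probability of an M/M/m/m system (Erlang B).

    Uses the recursion B(0) = 1, B(j) = a*B(j-1) / (j + a*B(j-1)).
    """
    b = 1.0
    for servers in range(1, m + 1):
        b = offered_load * b / (servers + offered_load * b)
    return b


def lines_needed(offered_load: float, target_blocking: float) -> int:
    """Smallest number of lines whose blocking probability meets the target."""
    m = 1
    while erlang_b(offered_load, m) > target_blocking:
        m += 1
    return m


print(erlang_b(1.0, 1))   # one line, one erlang of offered load
print(erlang_b(1.0, 2))
print(lines_needed(1.0, 0.25))
```

With one erlang offered to one line, half of the calls are blocked. Adding a second line lowers blocking to about 0.2.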

Characteristics of M/M/· Systems
Advantage: Simple analytical formulas
Disadvantages:
The Poisson assumption may be violated
The exponential transmission time distribution is an
approximation at best
Interarrival and packet transmission times may be
dependent (particularly in the network core)
Head-of-the-line assumption precludes heterogeneous
input traffic with priorities (hard or soft)

M/G/1 System
Same as M/M/1 but the packet transmission
time distribution is general, with given mean $1/\mu$
and variance $\sigma^2$
Utilization factor $\rho = \lambda/\mu$
Pollaczek-Khinchine formula:
Average time in queue = $\dfrac{\lambda(\sigma^2 + 1/\mu^2)}{2(1-\rho)}$
Average delay = $\dfrac{1}{\mu} + \dfrac{\lambda(\sigma^2 + 1/\mu^2)}{2(1-\rho)}$

The formulas for the steady-state occupancy
probabilities are more complicated
Insight: As $\sigma^2$ increases, delay increases
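
The effect of service-time variance can be seen by evaluating the Pollaczek-Khinchine formula, time in queue = lambda * (sigma^2 + 1/mu^2) / (2 * (1 - rho)), for two cases. The rates below are illustrative assumptions.

```python
def mg1_queue_time(lam: float, mu: float, variance: float) -> float:
    """Average time in queue for M/G/1 (Pollaczek-Khinchine formula)."""
    rho = lam / mu
    if rho >= 1:
        raise ValueError("unstable: need lam < mu")
    return lam * (variance + 1 / mu ** 2) / (2 * (1 - rho))


lam, mu = 0.5, 1.0  # illustrative rates (assumed)

deterministic = mg1_queue_time(lam, mu, variance=0.0)         # constant service time
exponential = mg1_queue_time(lam, mu, variance=1 / mu ** 2)   # exponential service time

print("constant service:   ", deterministic)
print("exponential service:", exponential)
print("average delay (exponential):", 1 / mu + exponential)
```

With exponential service the variance is 1/mu^2 and the result equals the M/M/1 value. With zero variance the queueing time is half as large.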

G/G/1 System
Same as M/G/1 but now the packet interarrival
time distribution is also general, with mean $1/\lambda$
and variance $\sigma_a^2$
We still assume FIFO and independent
interarrival times and packet transmission times
Heavy traffic approximation:
Average time in queue ~ $\dfrac{\lambda(\sigma_a^2 + \sigma^2)}{2(1-\rho)}$

Becomes increasingly accurate as $\rho \to 1$

Operating Characteristics for M/M/1 Queue

1. Average server utilization
$\rho = \lambda/\mu$
2. Average number of customers waiting
$L_q = \dfrac{\lambda^2}{\mu(\mu - \lambda)}$
3. Average number in system
$L = L_q + \lambda/\mu$
4. Average waiting time
$W_q = \dfrac{L_q}{\lambda} = \dfrac{\lambda}{\mu(\mu - \lambda)}$
5. Average time in the system
$W = W_q + 1/\mu$
6. Probability of 0 customers in system
$P_0 = 1 - \lambda/\mu$
7. Probability of exactly n customers in
system
$P_n = (\lambda/\mu)^n P_0$

Arnold's Muffler Shop Example

Customers arrive on average 2 per hour
($\lambda$ = 2 per hour)
Average service time is 20 minutes
($\mu$ = 3 per hour)
Install ExcelModules
Go to file 9-2.xls
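
As an alternative to the spreadsheet, the M/M/1 operating characteristics for this example (arrival rate 2 per hour, service rate 3 per hour) can be computed with a short Python sketch. The function and key names are illustrative, not part of ExcelModules.

```python
def mm1_characteristics(lam: float, mu: float) -> dict:
    """Operating characteristics of an M/M/1 queue."""
    if lam >= mu:
        raise ValueError("need service rate mu > arrival rate lam")
    rho = lam / mu                       # average server utilization
    lq = lam ** 2 / (mu * (mu - lam))    # average number waiting
    l = lq + lam / mu                    # average number in system
    wq = lq / lam                        # average waiting time
    w = wq + 1 / mu                      # average time in system
    p0 = 1 - rho                         # probability of an empty system
    return {"rho": rho, "Lq": lq, "L": l, "Wq": wq, "W": w, "P0": p0}


def prob_n_in_system(n: int, lam: float, mu: float) -> float:
    """Probability of exactly n customers in the system."""
    return (lam / mu) ** n * (1 - lam / mu)


arnold = mm1_characteristics(lam=2.0, mu=3.0)
for name, value in arnold.items():
    print(name, round(value, 4))
print("P(3 in system):", round(prob_n_in_system(3, 2.0, 3.0), 4))
```

Times are in hours, so a W of 1.0 means a customer spends one hour in the shop on average.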

Total Cost of Queuing System


Total Cost = Cw x L + Cs x s
Cw = cost of customer waiting time per
time period
L = average number customers in system
Cs = cost of servers per time period
s= number of servers
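
The cost formula is simple to evaluate in code. In the sketch below the waiting cost of $10 per hour and the server cost of $12 per hour are hypothetical values chosen for illustration. L = 2 is the average number in system for an M/M/1 queue with 2 arrivals and 3 services per hour.

```python
def total_cost(cw: float, l: float, cs: float, s: int) -> float:
    """Total cost per time period = Cw * L + Cs * s."""
    return cw * l + cs * s


# Hypothetical cost rates, $/hour (assumed for illustration).
CW = 10.0   # cost of customer waiting time
CS = 12.0   # cost of one server

cost_one_server = total_cost(cw=CW, l=2.0, cs=CS, s=1)
print("total cost per hour:", cost_one_server)
```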

Multiple Server System (M / M / s)

Poisson arrivals
Exponential service times
s servers
Total service rate must exceed arrival rate
($s\mu > \lambda$)
Many of the operating characteristic
formulas are more complicated
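
For reference, the standard M/M/s steady-state formulas can be coded in a few lines. They are not written out in these slides, so treat the sketch as a supplement. The sample call uses arrival rate 2 per hour, service rate 3 per hour, and 2 servers.

```python
from math import factorial


def mms_characteristics(lam: float, mu: float, s: int) -> dict:
    """Operating characteristics of an M/M/s queue (standard formulas)."""
    a = lam / mu       # offered load
    rho = a / s        # utilization per server
    if rho >= 1:
        raise ValueError("need s * mu > lam")
    # Probability that the system is empty.
    p0 = 1.0 / (
        sum(a ** n / factorial(n) for n in range(s))
        + a ** s / (factorial(s) * (1 - rho))
    )
    lq = p0 * a ** s * rho / (factorial(s) * (1 - rho) ** 2)  # average number waiting
    wq = lq / lam                                             # average waiting time
    w = wq + 1 / mu                                           # average time in system
    l = lam * w                                               # average number in system
    return {"rho": rho, "P0": p0, "Lq": lq, "L": l, "Wq": wq, "W": w}


two_servers = mms_characteristics(lam=2.0, mu=3.0, s=2)
for name, value in two_servers.items():
    print(name, round(value, 4))
```

With s = 1 the function reduces to the single-server M/M/1 results, which is a useful check.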

Arnold's Muffler Shop With Multiple Servers

Two options have already been considered:

System                               Cost
Keep the current system (s=1)        $32/hr
Get a faster mechanic (s=1)          $25/hr

Multi-server option
1. Have 2 mechanics (s=2)            ?
Go to file 9-3.xls

Single Server System With Constant Service Time (M/D/1)
Poisson arrivals
Constant service times (not random)
Has shorter queues than M/M/1 system
- Lq and Wq are one-half as large

Garcia-Golding Recycling Example

$\lambda$ = 8 trucks per hour (random)
$\mu$ = 12 trucks per hour (fixed)
Truck & driver waiting cost is $60/hour
New compactor will be amortized at
$3/unload
Total cost per unload = ?
Go to file 9-4.xls
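
A possible way to work this example without the spreadsheet is sketched below. It uses the standard M/D/1 results Lq = lambda^2 / (2 * mu * (mu - lambda)) and Wq = Lq / lambda. It also assumes the waiting cost applies only to time spent in the queue (Wq), which is an assumption of this sketch.

```python
def md1_characteristics(lam: float, mu: float) -> dict:
    """Operating characteristics of an M/D/1 queue (constant service time)."""
    if lam >= mu:
        raise ValueError("need mu > lam")
    lq = lam ** 2 / (2 * mu * (mu - lam))  # half of the M/M/1 value
    wq = lq / lam                          # average wait in queue, hours
    return {"Lq": lq, "Wq": wq, "L": lq + lam / mu, "W": wq + 1 / mu}


WAIT_COST_PER_HOUR = 60.0   # truck and driver waiting cost, $/hour
AMORTIZED_COST = 3.0        # new compactor cost, $/unload

garcia = md1_characteristics(lam=8.0, mu=12.0)
# Assumption: waiting cost is charged on time in queue only.
waiting_cost_per_unload = garcia["Wq"] * WAIT_COST_PER_HOUR
cost_per_unload = waiting_cost_per_unload + AMORTIZED_COST

print("Wq (minutes):", round(garcia["Wq"] * 60, 2))
print("total cost per unload:", round(cost_per_unload, 2))
```

If unloading time should also be charged, replace `Wq` with `W` in the cost line.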

Single Server System With General Service Time (M/G/1)
Poisson arrivals
General service time distribution with
known mean ($\mu$) and standard deviation ($\sigma$)
$\mu > \lambda$

Professor Crino Office Hours

Students arrive randomly at an average
rate of $\lambda$ = 5 per hour
Service (advising) time is random at an
average rate of $\mu$ = 6 per hour
The service time standard deviation is
$\sigma$ = 0.0833 hours
Go to file 9-5.xls
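
The office-hours numbers can also be run through the standard M/G/1 result Lq = (lambda^2 * sigma^2 + (lambda/mu)^2) / (2 * (1 - lambda/mu)). The sketch below is illustrative and its names are not taken from ExcelModules.

```python
def mg1_characteristics(lam: float, mu: float, sigma: float) -> dict:
    """Operating characteristics of an M/G/1 queue.

    lam: arrival rate, mu: service rate, sigma: service time standard deviation.
    """
    rho = lam / mu
    if rho >= 1:
        raise ValueError("need mu > lam")
    lq = (lam ** 2 * sigma ** 2 + rho ** 2) / (2 * (1 - rho))  # average number waiting
    wq = lq / lam                                              # average wait, hours
    return {"rho": rho, "Lq": lq, "L": lq + rho, "Wq": wq, "W": wq + 1 / mu}


crino = mg1_characteristics(lam=5.0, mu=6.0, sigma=0.0833)
for name, value in crino.items():
    print(name, round(value, 4))
```

Times are in hours, so multiply Wq and W by 60 to read them in minutes.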

Multi-Server System With Finite Population (M/M/s/$\infty$/N)
Poisson arrivals
Exponential service times
s servers with identical service time
distributions
Limited population of size N
Arrival rate decreases as queue lengthens

Department of Commerce Example

Uses 5 printers (N=5)
Printers break down on average every 20
hours
$\lambda$ = 1 printer / 20 hours = 0.05 printers per hour
Average service time is 2 hours
$\mu$ = 1 printer / 2 hours = 0.5 printers per hour
Go to file 9-6.xls
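
A finite-population queue can be solved numerically as a birth-death chain, which avoids the long closed-form expressions. In the sketch below each working printer breaks down at rate 0.05 per hour and repairs take 2 hours on average. The number of repair technicians is not stated in the example, so s = 1 is an assumption of this sketch.

```python
def finite_population_queue(lam: float, mu: float, s: int, n_pop: int) -> dict:
    """Steady-state measures for an M/M/s queue fed by a population of n_pop units.

    lam: arrival (breakdown) rate per unit, mu: service rate per server.
    """
    # Unnormalized state probabilities: in state n - 1 the arrival rate is
    # (n_pop - (n - 1)) * lam, and in state n the service rate is min(n, s) * mu.
    weights = [1.0]
    for n in range(1, n_pop + 1):
        birth = (n_pop - (n - 1)) * lam
        death = min(n, s) * mu
        weights.append(weights[-1] * birth / death)
    total = sum(weights)
    probs = [w / total for w in weights]

    l = sum(n * p for n, p in enumerate(probs))                # average number in system
    lq = sum(max(n - s, 0) * p for n, p in enumerate(probs))   # average number waiting
    lam_eff = lam * (n_pop - l)                                # effective arrival rate
    return {"P0": probs[0], "L": l, "Lq": lq, "W": l / lam_eff, "Wq": lq / lam_eff}


# s = 1 technician is assumed; it is not given in the example.
printers = finite_population_queue(lam=0.05, mu=0.5, s=1, n_pop=5)
for name, value in printers.items():
    print(name, round(value, 4))
```

The effective arrival rate falls as more printers are down, which is the "arrival rate decreases as queue lengthens" property of this model.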

More Complex Queuing Systems


When a queuing system is more complex,
formulas may not be available
The only option may be to use computer
simulation, which we will study in the next
chapter