
# Chapter 9: Queuing Models

## Queuing or Waiting Line Analysis

- Queues (waiting lines) affect people every day
- A primary goal is finding the best level of service
- Analytical modeling (using formulas) can be used for many queues
- For more complex situations, computer simulation is needed

## Queuing System Costs

1. Cost of providing service
2. Cost of not providing service (waiting time)

## Three Rivers Example

- An average of 5 ships arrive per 12-hour shift
- A team of stevedores unloads each ship
- Each team of stevedores costs \$6000/shift
- The cost of keeping a ship waiting is \$1000/hour
- How many teams of stevedores should be employed to minimize system cost?

## Three Rivers Waiting Line Cost Analysis

| Number of teams of stevedores | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Average hours waiting per ship | 7 | 4 | 3 | 2 |
| Cost of ship waiting time (per shift) | \$35,000 | \$20,000 | \$15,000 | \$10,000 |
| Stevedore cost (per shift) | \$6,000 | \$12,000 | \$18,000 | \$24,000 |
| Total cost | \$41,000 | \$32,000 | \$33,000 | \$34,000 |
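The comparison in the cost analysis can be reproduced with a few lines of Python. This is a minimal sketch: the waiting costs and the \$6000 team cost come from the text, while the names `waiting_cost`, `total_cost`, and `best_teams` are chosen here for illustration.

```python
# Three Rivers staffing: total cost per shift for 1 to 4 teams of stevedores.
TEAM_COST_PER_SHIFT = 6000  # dollars per team per shift (from the text)

# Ship waiting cost per shift, keyed by number of teams (from the cost analysis).
waiting_cost = {1: 35000, 2: 20000, 3: 15000, 4: 10000}

def total_cost(teams: int) -> int:
    """Cost of providing service plus cost of waiting, per shift."""
    return teams * TEAM_COST_PER_SHIFT + waiting_cost[teams]

costs = {teams: total_cost(teams) for teams in waiting_cost}
best_teams = min(costs, key=costs.get)  # staffing level with the lowest total cost

print(costs)
print(best_teams)
```

Adding teams always lowers the waiting cost, but beyond two teams the extra service cost outweighs the saving.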

## Characteristics of a Queuing System

The queuing system is determined by:
- Arrival characteristics
- Queue characteristics
- Service facility characteristics

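## A Queuing System

[Figure: arrivals enter at the arrival rate, wait in the queue, receive service, and depart. Labels: average number in queue ($L_q$), average wait in queue ($W_q$), average number in system ($L$), average time in system ($W$).]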

## Why is Queuing Analysis Important?

- Capacity problems are very common in industry and are one of the main drivers of process redesign
  - Need to balance the cost of increased capacity against the gains of increased productivity and service
- Queuing and waiting time analysis is particularly important in service systems
  - Large costs of waiting and of lost sales due to waiting

## Prototype Example: ER at County Hospital

- Patients arrive by ambulance or on their own
- One doctor is always on duty
- More and more patients seek help, which leads to longer waiting times
- Question: Should another MD position be instated?

## Examples of Real World Queuing Systems

- Commercial queuing systems
  - Commercial organizations serving external customers
  - Ex. Dentist, bank, ATM, gas stations, plumber, garage
- Transportation service systems
  - Vehicles are customers or servers
  - Ex. Vehicles waiting at toll stations and traffic lights, trucks or ships waiting to be loaded, taxi cabs, fire engines, elevators, buses
- Internal service systems
  - Customers receiving service are internal to the organization providing the service
  - Ex. Inspection stations, conveyor belts, computer support
- Social service systems
  - Ex. Judicial process, the ER at a hospital, waiting lists for organ transplants or student dorm rooms

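[Figure: the queuing process. Jobs from the input source (calling population) enter the queuing system, wait in the queue, pass through the service mechanism, and served jobs leave the system. The diagram labels the arrival process, queue configuration, queue discipline, and service process.]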

## Principal Queue Parameters

1. Calling Population
2. Arrival Process
3. Service Process
4. Number of Servers
5. Queue Discipline

## 1. The Calling Population

- Population of customers or jobs
- The size can be finite or infinite
  - The latter is most common
- Can be homogeneous
  - Only one type of customer or job
- Or heterogeneous
  - Several different kinds of customers or jobs

## 2. Arrival Process

- In what pattern do jobs or customers arrive at the queuing system?
  - Distribution of arrival times?
  - Batch arrivals?
  - Finite population?
  - Finite queue length?
- Poisson arrival process often assumed
  - Many real-world arrival processes can be modeled using a Poisson process

## 3. Service Process

- How long does it take to service a job or customer?
  - Distribution of service times?
  - Rework or repair?
  - Service center (machine) breakdown?
- Exponential service times often assumed
  - Works well for maintenance or unscheduled service situations

## 4. Number of Servers

- How many servers are available?

[Figure: a single-server queue; multiple queues, each with its own server; a single queue feeding multiple servers.]

## Queue Configuration: Multiple Line vs. Single Line

Multiple-line configuration:

1. The service provided can be differentiated (e.g., supermarket express lanes)
2. Labor specialization is possible
3. The customer has more flexibility
4. Balking behavior may be deterred: several medium-length lines are less intimidating than one long line

Single-line configuration:

1. Guarantees fairness: FIFO is applied to all arrivals
2. No customer anxiety regarding choice of queue
3. Avoids cutting-in problems
4. The most efficient setup for minimizing time in the queue
5. Jockeying (line switching) is avoided

## 5. Queue Discipline

- How are jobs or customers selected from the queue for service?
  - First Come First Served (FCFS)
  - Shortest Processing Time (SPT)
  - Earliest Due Date (EDD)
  - Priority (jobs are in different priority classes)
- FCFS is the default assumption for most models

## Arrival Characteristics

- Size of the arrival population is either infinite or limited
- Arrival distribution:
  - Either fixed or random
  - Measured either by the time between consecutive arrivals or by the arrival rate
  - The Poisson distribution is often used for random arrivals

## Poisson Distribution

- Average arrival rate is known
- Average arrival rate is constant for some number of time periods
- Number of arrivals in each time period is independent
- As the time interval approaches 0, the average number of arrivals approaches 0

With

- $\lambda$ = the average arrival rate per time unit
- $P(x)$ = the probability of exactly $x$ arrivals occurring during one time period

$$P(x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$$
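The Poisson probability above is straightforward to evaluate numerically. The sketch below assumes $\lambda$ is the average number of arrivals per time period. The function name `poisson_pmf` and the rate of 2 arrivals per period are illustrative choices, not values from the text.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(x) = e^(-lam) * lam^x / x!: probability of exactly x arrivals in one period."""
    return math.exp(-lam) * lam**x / math.factorial(x)

# Illustrative rate: 2 arrivals per period on average.
for x in range(5):
    print(x, round(poisson_pmf(x, 2.0), 4))
```

Summing `poisson_pmf(x, lam)` over all `x` gives 1, which is a quick sanity check on any chosen rate.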

## Behavior of Arrivals

- Most queuing formulas assume that all arrivals stay until service is completed
- Balking refers to customers who do not join the queue
- Reneging refers to customers who join the queue but give up and leave before completing service

## Poisson Process with Rate $\lambda$

- Interarrival times are independent and exponentially distributed
- Models well the accumulated traffic of many independent sources
- The average interarrival time is $1/\lambda$ (secs/packet), so $\lambda$ is the arrival rate (packets/sec)

[Figure: arrivals on a time axis, with the interarrival times marked.]

## Batch Arrivals

- Some sources transmit in packet bursts
- May be better modeled by a batch arrival process (e.g., bursts of packets arriving according to a Poisson process)
- The case for a batch model is weaker at queues after the first, because of shaping

[Figure: batches of arrivals on a time axis, with the interarrival times marked.]

## Queue Characteristics

- Queue length (maximum possible queue length) is either limited or unlimited
- Service discipline is usually FIFO (First In First Out)

## Service Facility Characteristics

1. Configuration of service facility
   - Number of servers (or channels)
   - Number of phases (or service stops)
2. Service distribution
   - The time it takes to serve 1 arrival
   - Can be fixed or random
   - Exponential distribution is often used

## Exponential Distribution

- $\mu$ = average service rate (so the average service time is $1/\mu$)
- $t$ = the length of service time ($t > 0$)
- $P(t)$ = probability that service time will be greater than $t$

$$P(t) = e^{-\mu t}$$
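The exponential tail probability can be checked numerically. This sketch assumes that $\mu$ denotes the service rate (customers per time unit), so the mean service time is $1/\mu$. The function name and the sample values are illustrative.

```python
import math

def prob_service_exceeds(t: float, mu: float) -> float:
    """P(service time > t) = e^(-mu * t) for an exponential service time with rate mu."""
    return math.exp(-mu * t)

# Illustrative values: a server handling 3 customers per hour on average.
mu = 3.0
for t in (0.0, 1 / mu, 0.5, 1.0):
    print(round(t, 3), round(prob_service_exceeds(t, mu), 4))
```

At `t = 1 / mu`, the mean service time, the probability is $e^{-1}$, so roughly 37% of services take longer than average.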

## Queuing System Concepts: Arrival Rate, Occupancy, Time in the System

- Queuing system
  - Data network where packets arrive, wait in various queues, receive service at various points, and exit after some time
- Arrival rate
  - Long-term number of arrivals per unit time
- Occupancy
  - Number of packets in the system (averaged over a long time)
- Time in the system (delay)
  - Time from packet entry to exit (averaged over many packets)

- A single-queue system is stable if packet arrival rate < system transmission capacity
- For a single queue, the ratio (packet arrival rate) / (system transmission capacity) is called the utilization factor
- In an unstable system packets accumulate in various queues and/or get dropped
- For unstable systems with large buffers some packet delays become very large
  - Flow/admission control may be used to limit the packet arrival rate
  - Prioritization of flows keeps delays bounded for the important traffic
- Stable systems with time-stationary arrival traffic approach a steady state

## Little's Law

- For a given arrival rate, the time in the system is proportional to packet occupancy:

$$N = \lambda T$$

where
- $N$: average number of packets in the system
- $\lambda$: packet arrival rate (packets per unit time)
- $T$: average delay (time in the system) per packet

Examples:
- On rainy days, streets and highways are more crowded
- Fast food restaurants need a smaller dining room than regular restaurants with the same customer arrival rate
- Large buffering together with a large arrival rate causes large delays
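Little's Law lends itself to a quick numerical illustration of the restaurant example. The numbers below (40 customers per hour, a 15-minute stay versus a 1-hour stay) are assumptions made up for this sketch, as are the function names.

```python
def mean_occupancy(arrival_rate: float, mean_delay: float) -> float:
    """Little's Law: N = lambda * T."""
    return arrival_rate * mean_delay

def mean_delay_from(occupancy: float, arrival_rate: float) -> float:
    """Little's Law solved for T: T = N / lambda."""
    return occupancy / arrival_rate

arrival_rate = 40.0  # customers per hour (assumed, same for both restaurants)
fast_food = mean_occupancy(arrival_rate, 0.25)  # customers stay 15 minutes
regular = mean_occupancy(arrival_rate, 1.0)     # customers stay 1 hour
print(fast_food, regular)
```

With the same arrival rate, the shorter stay means fewer customers present on average, which is why the fast food dining room can be smaller.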

## Explanation of Little's Law

- Amusement park analogy: people arrive, spend time at various sites, and leave
- They pay \$1 per unit time in the park
- The rate at which the park earns is \$$N$ per unit time ($N$: average number of people in the park)
- The rate at which people pay is \$$\lambda T$ per unit time ($\lambda$: traffic arrival rate, $T$: time per person)
- Over a long horizon: rate of park earnings = rate of people's payment, or

$$N = \lambda T$$

## Measuring Queue Performance

- $\rho$ = utilization factor (probability of all servers being busy)
- $L_q$ = average number in the queue
- $L$ = average number in the system
- $W_q$ = average waiting time
- $W$ = average time in the system
- $P_0$ = probability of 0 customers in system
- $P_n$ = probability of exactly $n$ customers in system

## Kendall's Notation

A/B/s
- A = arrival distribution (M for Poisson, D for deterministic, and G for general)
- B = service time distribution (M for exponential, D for deterministic, and G for general)
- s = number of servers

## The Queuing Models Covered Here All Assume

1. Arrivals follow the Poisson distribution
2. FIFO service
3. Single phase
4. Unlimited queue length

## Models Covered

We will look at 5 of the most commonly used queuing systems.

| Name | Kendall notation | Example |
|---|---|---|
| Simple system | M/M/1 | … store |
| Multiple server | M/M/s | |
| Constant service | M/D/1 | |
| General service | M/G/1 | |
| Limited population | M/M/s/∞/N | An operation with only 12 machines that might break |

## Device Queuing Mechanisms

- Common queue examples for IP routers
  - FIFO: First In First Out
  - PQ: Priority Queuing
  - WFQ: Weighted Fair Queuing
  - Combinations of the above
- Service types from a queuing theory standpoint
  - Single server (one queue, one transmission line)
  - Multiple server (one queue, several transmission lines)
  - Priority server (several queues with hard priorities, one transmission line)
  - Shared server (several queues with soft priorities, one transmission line)

## Single Server FIFO

- Single transmission line serving packets on a FIFO (First-In-First-Out) basis
- Each packet must wait for all packets found in the system to complete transmission before starting transmission
  - Departure time = arrival time + workload found in the system + transmission time
- Packets arriving at a full buffer are dropped

[Figure: arrivals entering a queue served by one transmission line.]

## FIFO Queue

- Packets are placed on the outbound link to the egress device in FIFO order
- The device (router, switch) multiplexes different flows arriving on various ingress ports onto an output buffer, forming a FIFO queue
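The FIFO departure-time relation (departure time = arrival time + workload found in the system + transmission time) can be traced packet by packet. The sketch below assumes an unlimited buffer, so no packet is dropped. The function name and the sample arrival and transmission times are illustrative.

```python
def fifo_departures(arrivals, tx_times):
    """Departure time of each packet on a single FIFO line with an unlimited buffer.

    arrivals: nondecreasing arrival times; tx_times: transmission time of each packet.
    """
    departures = []
    line_free_at = 0.0  # time at which all earlier packets have finished transmission
    for arrival, tx in zip(arrivals, tx_times):
        workload_found = max(0.0, line_free_at - arrival)  # unfinished work ahead of this packet
        departure = arrival + workload_found + tx
        departures.append(departure)
        line_free_at = departure
    return departures

print(fifo_departures([0, 1, 2, 10], [3, 3, 3, 1]))
```

The second and third packets find work ahead of them and are delayed. The last packet arrives at an idle line and leaves after its own transmission time only.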

## Multiple Servers

- Multiple packets are transmitted simultaneously on multiple lines/servers
- Head-of-the-line service: packets wait in a FIFO queue, and when a server becomes free, the first packet goes into service

[Figure: arrivals entering one queue served by several transmission lines.]

## Priority Servers

- Packets form priority classes (each may have several flows)
- There is a separate FIFO queue for each priority class
- Packets of lower priority start transmission only if no higher priority packet is waiting
- Priority types:
  - Non-preemptive (a high priority packet must wait for a lower priority packet found under transmission upon arrival)
  - Preemptive (a high priority packet does not have to wait)

[Figure: arrivals of class 1 (high priority), class 2 (intermediate priority), and class 3 (low priority) in separate queues feeding one transmission line.]

## Priority Queuing

- Packets are classified into separate queues
  - E.g., based on source/destination IP address, source/destination TCP port, etc.
- All packets in a higher priority queue are served before a lower priority queue is served
  - Typically in routers, if a higher priority packet arrives while a lower priority packet is being transmitted, it waits until the lower priority packet completes

## Shared Servers

- Again we have multiple classes/queues, but they are served with a soft priority scheme
  - Round-robin
  - Weighted fair queuing

[Figure: arrivals of classes 1, 2, and 3 in separate queues, each with its own weight, sharing one transmission line.]

## Round-Robin/Cyclic Service

- Round-robin serves each queue in sequence
  - A queue that is empty is skipped
  - Each queue when served may have limited service (at most k packets transmitted, with k = 1 or k > 1)
- Round-robin is fair for all queues (as long as some queues do not have longer packets than others)
- Round-robin cannot be used to enforce bandwidth allocation among the queues
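The cyclic service rule can be sketched in a few lines. This is an illustration only: it assumes all packets are already waiting, ignores packet lengths, and uses made-up packet labels. `round_robin` is a name chosen here.

```python
from collections import deque

def round_robin(queues, k=1):
    """Serve queues cyclically, sending at most k packets per visit; empty queues are skipped."""
    pending = [deque(q) for q in queues]
    order = []
    while any(pending):          # an empty deque is falsy, so this stops when all are drained
        for q in pending:
            for _ in range(k):
                if not q:        # nothing (left) to send from this queue on this visit
                    break
                order.append(q.popleft())
    return order

print(round_robin([["a1", "a2", "a3"], [], ["c1"]], k=1))
print(round_robin([["a1", "a2", "a3"], [], ["c1"]], k=2))
```

Because the scheduler counts packets rather than bits, a queue with longer packets receives more bandwidth, which is the fairness caveat noted above.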

## Fair Queuing

- This scheduling method is inspired by the fairest of methods:
  - Transmit one bit from each queue in cyclic order (bit-by-bit round robin)
  - Skip queues that are empty
- To approximate the bit-by-bit processing behavior, for each packet:
  - We calculate upon arrival its finish time under bit-by-bit round robin, assuming all other queues are continuously busy, and we transmit by FIFO within each queue
  - Transmit next the packet with the minimum finish time
- Important properties:
  - Priority is given to short packets
  - Equal bandwidth is allocated to all queues that are continuously busy

[Figure: timeline showing arrival, finish, and departure times for packets i−1 and i.]

## Weighted Fair Queuing

- Fair queuing cannot be used to implement bandwidth allocation and soft priorities
- Weighted fair queuing is a variation that corrects this deficiency
  - Let $w_k$ be the weight of the $k$th queue
  - Think of round-robin with queue $k$ transmitting $w_k$ bits upon its turn
  - If all queues always have something to send, the $k$th queue receives bandwidth equal to a fraction $w_k / \sum_i w_i$ of the total bandwidth
- Fair queuing corresponds to $w_k = 1$
- Priority queuing corresponds to the weights being very high as we move to higher priorities
- Again, to deal with the segmentation problem, we approximate as follows. For each packet:
  - We calculate its finish time (under the weighted bit-by-bit round robin scheme)
  - We next transmit the packet with the minimum finish time
## Weighted Fair Queuing Illustration

Weights:
- Queue 1 = 3
- Queue 2 = 1
- Queue 3 = 1

## Combining Queuing Schemes

Example: voice (PQ), guaranteed bandwidth (WFQ), best effort (Cisco's LLQ implementation)

## Single Server Queuing System (M/M/1)

- Poisson arrivals
- Arrival population is unlimited
- Exponential service times
- All arrivals wait to be served
- $\lambda$ is constant
- $\mu > \lambda$ (average service rate > average arrival rate)

## M/M/1 System

- Nomenclature: M stands for memoryless (a property of the exponential distribution)
  - The first M stands for a Poisson arrival process (which is memoryless)
  - The second M stands for exponentially distributed transmission times
- Assumptions:
  - Arrival process is Poisson with rate $\lambda$ packets/sec
  - Packet transmission times are exponentially distributed with mean $1/\mu$
  - One server
  - Independent interarrival times and packet transmission times
- Transmission time is proportional to packet length
- Note that $1/\mu$ is secs/packet, so $\mu$ is packets/sec (packet transmission rate of the queue)
- Utilization factor: $\rho = \lambda/\mu$ (stable system if $\rho < 1$)

## Delay Calculation

- Let
  - $Q$ = average time spent waiting in queue
  - $T$ = average packet delay (transmission plus queuing)
- Note that $T = 1/\mu + Q$
- Also, by Little's law, $N = \lambda T$ and $N_q = \lambda Q$, where $N_q$ = average number waiting in queue
- These quantities can be calculated with formulas derived by Markov chain analysis (see references)

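## M/M/1 Results

- Steady-state probabilities of the number of packets in queue or transmission:

$$P\{n \text{ packets}\} = \rho^{n}(1-\rho), \quad \text{where } \rho = \lambda/\mu$$

- From this we can get the averages:

$$N = \frac{\rho}{1-\rho}, \qquad T = \frac{N}{\lambda} = \frac{\rho}{\lambda(1-\rho)} = \frac{1}{\mu-\lambda}$$

[Figure: plots of N and T against ρ over the range 0 to 1.]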
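The step from the occupancy probabilities to the averages is a standard geometric-series computation, sketched here for completeness. It assumes $\rho < 1$ so that the series converges.

```latex
\begin{align*}
N &= \sum_{n=0}^{\infty} n\,\rho^{n}(1-\rho)
   = (1-\rho)\,\rho \sum_{n=1}^{\infty} n\,\rho^{\,n-1}
   = (1-\rho)\,\rho \cdot \frac{1}{(1-\rho)^{2}}
   = \frac{\rho}{1-\rho}, \\[4pt]
T &= \frac{N}{\lambda}
   = \frac{\rho}{\lambda(1-\rho)}
   = \frac{\lambda/\mu}{\lambda\,(1-\lambda/\mu)}
   = \frac{1}{\mu-\lambda}.
\end{align*}
```

The second line is Little's Law, $N = \lambda T$, solved for $T$.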

## Bandwidth

- Occupancy and delay formulas:

$$N = \frac{\rho}{1-\rho}, \qquad T = \frac{1}{\mu-\lambda}, \qquad \rho = \frac{\lambda}{\mu}$$

- Assume:
  - Traffic arrival rate $\lambda$ is doubled
  - System transmission capacity $\mu$ is doubled
- Then:
  - Queue sizes stay at the same level ($\rho$ stays the same)
  - Packet delay is cut in half ($\mu$ and $\lambda$ are doubled)
- A conclusion: in high speed networks
  - propagation delay increases in importance relative to queuing and transmission delay
  - buffer size and packet loss may still be a problem
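The scaling claim follows directly from the M/M/1 formulas. Writing primes for the doubled system (a notation introduced only for this sketch):

```latex
\begin{align*}
\lambda' &= 2\lambda, \qquad \mu' = 2\mu, \\
\rho' &= \frac{\lambda'}{\mu'} = \frac{2\lambda}{2\mu} = \rho
  \quad\Longrightarrow\quad N' = \frac{\rho'}{1-\rho'} = N, \\
T' &= \frac{1}{\mu'-\lambda'} = \frac{1}{2(\mu-\lambda)} = \frac{T}{2}.
\end{align*}
```

Occupancy depends only on the ratio $\rho$, while delay depends on the absolute rates.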

## M/M/m, M/M/∞ System

- Same as M/M/1, but it has m (or ∞) servers
- In M/M/m, the packet at the head of the queue moves to service when a server becomes free
- Qualitative result:
  - Delay increases to ∞ as $\rho = \lambda/(m\mu)$ approaches 1
- There are analytical formulas for the occupancy probabilities and average delay of these systems

## Finite Buffer Systems: M/M/m/k

- The M/M/m/k system
  - Same as M/M/m, but there is buffer space for at most k packets. Packets arriving at a full buffer are dropped
- Formulas exist for average delay, steady-state occupancy probabilities, and loss probability
- The M/M/m/m system is used widely to size telephone or circuit switching systems
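For the M/M/m/m case, the loss probability is given by the standard Erlang B formula, which the slides do not spell out. The sketch below uses the usual numerically stable recursion in the number of servers. The function name and the offered loads in the usage lines are illustrative.

```python
def erlang_b(offered_load: float, servers: int) -> float:
    """Loss probability of an M/M/m/m system with offered load a = lambda / mu."""
    blocking = 1.0  # with zero servers every arrival is lost
    for m in range(1, servers + 1):
        # Erlang B recursion: B(m) = a * B(m-1) / (m + a * B(m-1))
        blocking = offered_load * blocking / (m + offered_load * blocking)
    return blocking

# Illustrative sizing question: how does loss fall as circuits are added at load a = 1?
for m in (1, 2, 3):
    print(m, round(erlang_b(1.0, m), 4))
```

In practice one increases `servers` until the loss probability drops below a target value.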

## Characteristics of M/M/· Systems

- The Poisson assumption may be violated
- The exponential transmission time distribution is an approximation at best
- Interarrival and packet transmission times may be dependent (particularly in the network core)
- Input traffic with priorities (hard or soft)

## M/G/1 System

- Same as M/M/1 but the packet transmission time distribution is general, with given mean $1/\mu$ and variance $\sigma^2$
- Utilization factor $\rho = \lambda/\mu$
- Pollaczek-Khinchine formula:

$$\text{Average time in queue} = \frac{\lambda(\sigma^2 + 1/\mu^2)}{2(1-\rho)}$$

$$\text{Average delay} = \frac{1}{\mu} + \frac{\lambda(\sigma^2 + 1/\mu^2)}{2(1-\rho)}$$

- The formulas for the steady-state occupancy probabilities are more complicated
- Insight: as $\sigma^2$ increases, delay increases
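The effect of the variance on delay can be seen by evaluating the Pollaczek-Khinchine formula for two special cases. The rates `lam = 2` and `mu = 3` are illustrative, and `mg1_delay` is a name chosen for this sketch.

```python
def mg1_delay(lam: float, mu: float, sigma2: float):
    """Return (average time in queue, average delay) for a stable M/G/1 queue.

    lam: arrival rate; mu: service rate (mean service time 1/mu);
    sigma2: variance of the service (transmission) time.
    """
    rho = lam / mu
    if rho >= 1:
        raise ValueError("unstable queue: need lam < mu")
    queue_time = lam * (sigma2 + 1 / mu**2) / (2 * (1 - rho))
    return queue_time, 1 / mu + queue_time

lam, mu = 2.0, 3.0
print(mg1_delay(lam, mu, 1 / mu**2))  # exponential service: variance = (1/mu)^2
print(mg1_delay(lam, mu, 0.0))        # constant service: variance = 0
```

With exponential service the result reduces to the M/M/1 delay $1/(\mu - \lambda)$. With zero variance the time in queue is half as large.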

## G/G/1 System

- Same as M/G/1 but now the packet interarrival time distribution is also general, with mean $1/\lambda$ and variance $\sigma_a^2$
- We still assume FIFO and independent interarrival times and packet transmission times
- Heavy traffic approximation:

$$\text{Average time in queue} \approx \frac{\lambda(\sigma_a^2 + \sigma^2)}{2(1-\rho)}$$

## Operating Characteristics for M/M/1 Queue

1. Average server utilization: $\rho = \lambda/\mu$
2. Average number of customers waiting: $L_q = \dfrac{\lambda^2}{\mu(\mu-\lambda)}$
3. Average number in system: $L = L_q + \lambda/\mu$
4. Average waiting time: $W_q = \dfrac{L_q}{\lambda} = \dfrac{\lambda}{\mu(\mu-\lambda)}$
5. Average time in the system: $W = W_q + 1/\mu$
6. Probability of 0 customers in system: $P_0 = 1 - \lambda/\mu$
7. Probability of exactly $n$ customers in system: $P_n = (\lambda/\mu)^n P_0$

## Arnold's Muffler Shop Example

- Customers arrive on average 2 per hour ($\lambda$ = 2 per hour)
- Average service time is 20 minutes ($\mu$ = 3 per hour)
- Install ExcelModules
- Go to file 9-2.xls
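As a cross-check alongside the spreadsheet, the M/M/1 operating characteristics for this example can be computed directly. This is a minimal sketch: the rates come from the example, while the function name and dictionary keys are chosen here.

```python
def mm1_characteristics(lam: float, mu: float) -> dict:
    """Operating characteristics of an M/M/1 queue (requires mu > lam)."""
    if mu <= lam:
        raise ValueError("service rate must exceed arrival rate")
    rho = lam / mu                      # average server utilization
    lq = lam**2 / (mu * (mu - lam))     # average number waiting
    wq = lq / lam                       # average waiting time
    return {"rho": rho, "Lq": lq, "L": lq + rho,
            "Wq": wq, "W": wq + 1 / mu, "P0": 1 - rho}

arnold = mm1_characteristics(lam=2, mu=3)  # customers per hour
for name, value in arnold.items():
    print(name, round(value, 4))
```

Times are in hours, so the average time in the system of 1 hour consists of a 40-minute wait plus the 20-minute service.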

## Total Cost of Queuing System

Total Cost = $C_w \times L + C_s \times s$

- $C_w$ = cost of customer waiting time per time period
- $L$ = average number of customers in system
- $C_s$ = cost of servers per time period
- $s$ = number of servers
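The total cost formula is a one-liner in code. In the sketch below the cost rates are assumptions and do not come from the text: \$10 per customer-hour of waiting and \$12 per server-hour, picked so that a single server with $L = 2$ yields \$32 per hour. The function name is also illustrative.

```python
def total_cost(cw: float, avg_in_system: float, cs: float, servers: int) -> float:
    """Total cost per time period = Cw * L + Cs * s."""
    return cw * avg_in_system + cs * servers

# Assumed rates (not from the text): Cw = $10/hour, Cs = $12/hour.
print(total_cost(cw=10, avg_in_system=2, cs=12, servers=1))
```

Comparing alternatives means recomputing `avg_in_system` for each configuration and choosing the lowest total.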

## Multiple Server System (M/M/s)

- Poisson arrivals
- Exponential service times
- $s$ servers
- Total service rate must exceed arrival rate ($s\mu > \lambda$)
- Many of the operating characteristic formulas are more complicated

## Arnold's Muffler Shop With Multiple Servers

Two options have already been considered:

| System | Cost |
|---|---|
| Keep the current system (s=1) | \$32/hr |
| Get a faster mechanic (s=1) | \$25/hr |

Multi-server option:

| System | Cost |
|---|---|
| Have 2 mechanics (s=2) | ? |

Go to file 9-3.xls
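The multi-server characteristics can also be computed outside the spreadsheet. This sketch uses the standard M/M/s formulas with offered load $a = \lambda/\mu$ and per-server utilization $\rho = a/s$. Those formulas are not written out in the slides, and the function name is chosen here. The rates $\lambda = 2$ and $\mu = 3$ per hour are those of the muffler shop.

```python
import math

def mms_characteristics(lam: float, mu: float, s: int) -> dict:
    """Operating characteristics of an M/M/s queue (requires s * mu > lam)."""
    if s * mu <= lam:
        raise ValueError("total service rate must exceed arrival rate")
    a = lam / mu    # offered load
    rho = a / s     # utilization per server
    tail = a**s / (math.factorial(s) * (1 - rho))
    p0 = 1 / (sum(a**n / math.factorial(n) for n in range(s)) + tail)
    lq = p0 * tail * rho / (1 - rho)   # average number waiting
    wq = lq / lam                      # average waiting time
    return {"P0": p0, "Lq": lq, "L": lq + a, "Wq": wq, "W": wq + 1 / mu}

two_mechanics = mms_characteristics(lam=2, mu=3, s=2)
for name, value in two_mechanics.items():
    print(name, round(value, 4))
```

With `s=1` the function reproduces the M/M/1 results. The value of `L` for `s=2` is what enters the total cost formula for the two-mechanic option.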

## Single Server System With Constant Service Time (M/D/1)

- Poisson arrivals
- Constant service times (not random)
- Has shorter queues than the M/M/1 system
  - $L_q$ and $W_q$ are one-half as large

Example:
- $\lambda$ = 8 trucks per hour (random)
- $\mu$ = 12 trucks per hour (fixed)
- Truck and driver waiting cost is \$60/hour
- New compactor will be amortized at
- Total cost per unload = ?
- Go to file 9-4.xls
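The queue figures for the truck example can be sketched as follows, using $L_q = \lambda^2 / (2\mu(\mu-\lambda))$, which is half the M/M/1 value. Charging the \$60/hour only for time spent in the queue is an assumption of this sketch. The amortization figure is not available here, so the total cost per unload is not computed.

```python
def md1_characteristics(lam: float, mu: float) -> dict:
    """Operating characteristics of an M/D/1 queue (requires mu > lam)."""
    if mu <= lam:
        raise ValueError("service rate must exceed arrival rate")
    lq = lam**2 / (2 * mu * (mu - lam))  # one-half of the M/M/1 value
    wq = lq / lam
    return {"Lq": lq, "L": lq + lam / mu, "Wq": wq, "W": wq + 1 / mu}

trucks = md1_characteristics(lam=8, mu=12)        # trucks per hour
waiting_cost_per_truck = 60 * trucks["Wq"]        # dollars; assumes only queue time is charged
print({k: round(v, 4) for k, v in trucks.items()})
print(round(waiting_cost_per_truck, 2))
```

Times are in hours, so the average wait in the queue is 5 minutes per truck.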

## Single Server System With General Service Time (M/G/1)

- Poisson arrivals
- General service time distribution with known mean (service rate $\mu$, so mean service time $1/\mu$) and standard deviation ($\sigma$)
- $\mu > \lambda$

## Professor Crino Office Hours

- Students arrive randomly at an average rate of $\lambda$ = 5 per hour
- Service (advising) time is random at an average rate of $\mu$ = 6 per hour
- The service time standard deviation is $\sigma$ = 0.0833 hours
- Go to file 9-5.xls
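As a hand check on the spreadsheet, the M/G/1 characteristics for these inputs can be worked out from the Pollaczek-Khinchine result, written here in terms of $L_q$. The values are rounded and times are in hours.

```latex
\begin{align*}
\rho &= \frac{\lambda}{\mu} = \frac{5}{6} \approx 0.833, \\
L_q &= \frac{\lambda^{2}\sigma^{2} + \rho^{2}}{2(1-\rho)}
     = \frac{25\,(0.0833)^{2} + (0.8333)^{2}}{2\,(0.1667)}
     \approx \frac{0.1735 + 0.6944}{0.3333} \approx 2.60, \\
W_q &= \frac{L_q}{\lambda} \approx 0.52, \qquad
W = W_q + \frac{1}{\mu} \approx 0.69, \qquad
L = L_q + \frac{\lambda}{\mu} \approx 3.44.
\end{align*}
```

So a student waits about 31 minutes on average before the roughly 10-minute advising session begins.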

## Multi-Server System With Finite Population (M/M/s/∞/N)

- Poisson arrivals
- Exponential service times
- $s$ servers with identical service time distributions
- Limited population of size $N$
- Arrival rate decreases as queue lengthens

## Department of Commerce Example

- Uses 5 printers ($N$ = 5)
- Printers break down on average every 20 hours
  - $\lambda$ = 1 printer / 20 hours = 0.05 printers per hour
- Average service time is 2 hours
  - $\mu$ = 1 printer / 2 hours = 0.5 printers per hour
- Go to file 9-6.xls
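The finite-population model can be evaluated from its birth-death balance equations. With $n$ printers down, the breakdown rate is $(N-n)\lambda$ and the repair rate is $\min(n, s)\mu$. The number of servers is not stated in the example, so `s=1` (a single repair technician) is an assumption of this sketch, and the function name is illustrative.

```python
def finite_population(lam: float, mu: float, s: int, n_pop: int) -> dict:
    """M/M/s queue with a finite calling population of size n_pop.

    lam: breakdown rate per working unit; mu: service rate per server.
    """
    weights = [1.0]  # unnormalized P_n from the birth-death balance equations
    for n in range(1, n_pop + 1):
        arrival = (n_pop - (n - 1)) * lam   # arrival rate when n-1 units are in the system
        service = min(n, s) * mu            # service rate when n units are in the system
        weights.append(weights[-1] * arrival / service)
    total = sum(weights)
    probs = [w / total for w in weights]
    in_system = sum(n * p for n, p in enumerate(probs))
    in_queue = sum((n - s) * p for n, p in enumerate(probs) if n > s)
    return {"P0": probs[0], "L": in_system, "Lq": in_queue}

printers = finite_population(lam=0.05, mu=0.5, s=1, n_pop=5)  # s=1 is an assumption
print({k: round(v, 4) for k, v in printers.items()})
```

Because each broken printer stops generating breakdowns, the arrival rate falls as the queue grows.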

## More Complex Queuing Systems

- When a queuing system is more complex, formulas may not be available
- The only option may be to use computer simulation, which we will study in the next chapter