
An Approximate Analysis of Dynamic Pricing, Outsourcing, and Scheduling Policies for a Multiclass Make-to-Stock Queue in the Heavy Traffic Regime

Baris Ata* and Nasser Barjesteh†

* Booth School of Business, The University of Chicago
† Rotman School of Management, University of Toronto

Abstract
We consider a make-to-stock manufacturing system selling multiple products to price-sensitive
customers. The system manager seeks to maximize the long-run average profit by making dy-
namic pricing, outsourcing, and scheduling decisions: First, she adjusts prices dynamically
depending on the system state. Second, when the backlog of work is judged excessive, she may
outsource new orders, thereby incurring outsourcing costs. Third, she decides dynamically on
which product to prioritize in the manufacturing process, i.e., she makes dynamic scheduling
decisions. This problem appears analytically intractable. Thus, we resort to an approximate
analysis in the heavy-traffic regime and consider the resulting Brownian control problem. We
solve this problem explicitly by exploiting the solution to a particular Riccati equation. The
optimal solution to the Brownian control problem is a two-sided barrier policy with drift rate
control: Outsourcing and idling processes are used to keep the workload process above the lower
reflecting barrier and below the upper reflecting barrier, respectively. Between the two barriers,
a state-dependent drift rate is used to control the workload. By interpreting this solution in the
context of the original model, we propose a joint dynamic pricing, outsourcing, and scheduling
policy, and demonstrate its effectiveness through a simulation study.

Keywords: dynamic pricing, make-to-stock production, heavy-traffic analysis, stochastic control, Riccati equation

1 Introduction

This paper studies the dynamic control of a make-to-stock manufacturing system that sells multiple products
to price-sensitive customers. In make-to-stock manufacturing systems, the products are manufactured based
on a demand forecast and stored in finished goods inventory. Customers are dynamically quoted a price.
When a customer order is received, it is served from the finished goods inventory if the item requested is in
stock. Otherwise, it is backordered. The manufacturer incurs a holding cost for each product in inventory
and a backorder cost for each request on backorder (per unit of time). The make-to-stock manufacturing
system is modeled as a multiclass single-server queueing system. The system manager seeks to maximize the
long-run average profit by making dynamic pricing, outsourcing, and scheduling decisions: First, she con-
trols the demand by dynamically adjusting the prices. Second, when the backlog of work is judged excessive,

she outsources customer orders. Finally, she dynamically decides when and what products to manufacture.
The joint consideration of pricing and operational decisions, such as scheduling and outsourcing, leads to a
manufacturing system that is more responsive to the variability in the demand and production processes.
Moreover, dynamic prices provide the system manager with the opportunity to take advantage of the het-
erogeneity of customers in their price sensitivity to increase profits. In determining the prices, she takes into
account the price sensitivity of demand, the holding, backlog, and outsourcing costs, the variability in the
demand and production processes, and the production capacity of the manufacturing system.

Because the joint dynamic pricing, outsourcing, and scheduling control problem appears intractable in
its exact form, we follow Harrison (1988) and approximate it by a far more tractable formulation referred to
as the Brownian control problem. We do so in the so-called heavy traffic asymptotic regime which assumes
the demand and the system capacity are large, and the server utilization is near one. We derive an effective
and intuitive joint dynamic pricing, outsourcing, and scheduling policy for the manufacturing system by
solving the approximating Brownian control problem explicitly and interpreting its solution in the context
of the original control problem.

The approximating Brownian control problem is equivalent to a one-dimensional drift rate control prob-
lem, whose state process, called the workload process, represents the total (scaled) inventory (or backlog) in
the system measured in hours of total work. An important feature of this drift rate control problem is the
presence of state costs that capture the holding and backorder costs in the original control problem. The
drift rate control problem is solved explicitly by exploiting the solution to a Riccati equation. The optimal
policy comprises the drift rate control policy and a two-sided barrier policy. The outsourcing and idling
processes are used to keep the workload process between the lower and upper reflecting barriers. Between
the two barriers, a state-dependent drift rate is used to control the workload, which is ultimately interpreted
as a dynamic pricing policy.

The optimal solution to the approximating Brownian control problem leads to an intuitively appealing
and easily implementable policy and reveals several structural insights. First, the pricing, outsourcing, and
scheduling decisions depend primarily on the (aggregate) workload. Second, in large systems, the magnitude
of the prescribed price changes is small. Third, as the backlog of work increases and the inventory decreases,
prices should be adjusted to decrease the (effective) demand rate. Finally, outsourcing is worthwhile only
when the backlog of work is excessive.

This paper extends the existing literature in several ways and makes the following contributions. First,
it studies dynamic pricing and outsourcing in addition to the scheduling of a make-to-stock manufacturing
system, thereby extending Wein (1992a) and Veatch and Wein (1996). To the best of our knowledge, it is the
first paper in the literature that jointly considers these controls for a multiclass make-to-stock manufacturing

system. The approximating Brownian control problems of Wein (1992a) and Veatch and Wein (1996) only
use singular controls. Thus, in those control problems, the search for the optimal policy is limited to finding
the upper barrier of a barrier policy; see Appendix N for further details. In the Brownian control problem
studied in this paper, the system manager can also control the drift rate, which makes the problem more
challenging. Because the trade-off between the immediate cost and the future cost is a crucial part of our
Brownian control problem, it does not admit a pathwise solution. That is, one
cannot characterize a policy that is optimal for all sample paths. Therefore, we need to solve the associated
Bellman equation. Solving this Bellman equation requires solving a nonlinear differential equation with free
boundaries. The solution to this equation dictates explicit dynamic pricing, outsourcing, and scheduling
policies.

Second, this paper solves a drift rate control problem with state costs and endogenously determined
reflecting barriers. The solution of the Bellman equation is explicit in the sense that the calculations involved
are computationally trivial; they only require finding the root of a strictly increasing function via a one-dimensional
search.

Third, the solution derived, i.e., the optimal prices as well as the outsourcing and manufacturing policies,
depends crucially on the second moments of the stochastic model primitives, unlike the solutions of most
Brownian control problems previously considered in the literature. This highlights the fact that the trade-off
between the immediate cost and the future cost is captured in our proposed policies. It also shows that the
second moments of the primitives are crucial in this problem, which further highlights the intricacy of the
problem.

2 Literature Review

This paper is related to four streams of literature: i) The scheduling of make-to-stock manufacturing systems,
ii) the control of queueing systems, iii) the revenue management of manufacturing and queueing systems,
and iv) the expediting and outsourcing of orders in manufacturing systems. The literature on the scheduling
of manufacturing systems is vast; see Nahmias and Olsen (2015, Chapter 9) for a brief overview. The most
closely related paper to ours and an important antecedent of our paper is Wein (1992a). Wein (1992a)
proposes a scheduling policy to minimize the long-run average holding and backorder costs. Similar to
our paper, Wein (1992a) uses a heavy traffic approximation. Our proposed scheduling policy is similar to
the one proposed by Wein (1992a) except for the choice of the idling threshold. In this scheduling policy,
when there are backordered products, the system manager manufactures the backordered product with the
largest bµ index. Ha (1997) provides some theoretical justifications for this policy. In particular, it shows

that in a two-product setting, it is optimal to manufacture the product with the larger bµ index if it is
backordered (independent of the inventory of the other product). When no product is backordered, under
the Wein (1992a) policy, the system manager manufactures the product with the smallest hµ index. We
build on Wein (1992a) by introducing dynamic pricing and outsourcing. These extra features lead to a
different workload formulation, where the analytical solution of the associated Bellman equation dictates
both a specific dynamic pricing policy and a novel discretionary outsourcing threshold in addition to the
scheduling policy, which is similar to that of Wein (1992a).

Two other closely related papers are Perez and Zipkin (1997) and Veatch and Wein (1996). They propose
effective and computationally tractable index policies. In these policies, an index is dynamically calculated for
each product and the product with the smallest index is manufactured. Sanajian et al. (2010) numerically
shows that these index policies outperform the First-Come-First-Served (FCFS) policy. Véricourt et al.
(2000) provides a partial characterization of the optimality of an index policy proposed by Perez and Zipkin
(1997). The index policies proposed by Perez and Zipkin (1997) and Veatch and Wein (1996) motivate two
of the heuristic policies we use in our simulation study. Two other related papers are Zheng and Zipkin
(1990) and Zipkin (1995), which show that in the case of symmetric products, the policy that manufactures
the product with the smallest inventory outperforms the FCFS policy.

Other related papers consider multi-part systems, systems with switching costs, and systems with both
make-to-stock and make-to-order products. Véricourt et al. (2000) focuses on the optimal scheduling of a
two-product make-to-stock manufacturing system. It characterizes the hedging point for a particular region
of the state space and shows that in this region, the monotone switching curve is a straight line. Veatch
and Véricourt (2003) studies the scheduling of a two-part-type single-product make-to-stock queue using the
model of Wein (1992a).

An important paper on switching costs is Duenyas and Van Oyen (1995), which considers the
scheduling of a multiclass queue with instantaneous but costly switchovers. Duenyas and Van Oyen (1996)
considers a similar problem with non-zero switching times and zero switching costs. In both papers, the
authors propose a heuristic threshold-type policy to decide when to switch or idle. Kim and Van Oyen
(1998) extends the analysis to a make-to-stock manufacturing system with both setup and switching costs.
Ayhan and Olsen (2000) proposes a heuristic policy to minimize the second moment of the waiting time in
a multiclass make-to-order queue. Olsen (1999) considers a similar problem with setup costs and proposes
a heuristic policy to minimize the mean (and outer percentiles) of the waiting time. Under the proposed
policy, the server works on the class with the greatest scaled age. Lan and Olsen (2006) considers a system
with non-zero setup times and setup costs. The authors use a fluid model to propose a scheduling policy
and a lower bound on the system performance.

Some of the early and important papers that consider the hybrid make-to-stock and make-to-order
systems are Carr and Duenyas (2000) and Iravani et al. (2012). Carr and Duenyas (2000) considers a
manufacturing system with two products. Product 1 is make-to-stock, while product 2 is make-to-order.
The demand for product 1 has to be met either from the stock or through an external supplier (at a
cost). The system manager makes scheduling decisions for both products and admission control decisions
for product 2. The authors show that a switching curve characterizes the optimal scheduling and admission
control policies. Iravani et al. (2012) extends Carr and Duenyas (2000) by backlogging product 1 orders.
The authors show that structural properties similar to Carr and Duenyas (2000) hold. See Peeters and van
Ooijen (2020) for further literature.

Other related papers on the scheduling of make-to-stock queues include Rubio and Wein (1996), Karaes-
men et al. (2002), and Bradley and Glynn (2002). Rubio and Wein (1996) considers a multi-product/multi-
server make-to-stock manufacturing system. Karaesmen et al. (2002) studies a system with advance order
information. Bradley and Glynn (2002) studies the joint capacity investment and scheduling of a single-class
make-to-stock manufacturing system.

The second stream of literature studies the control of queueing systems; see Stidham (1988, 2002) for
surveys. Our paper is closely related to the literature on the dynamic control of queueing systems in
heavy traffic, pioneered by Harrison (1988, 2000, 2003). In this stream of literature, one approximates the
queueing system with a Brownian system, which is simpler to analyze. Early examples of this approach
include Harrison and Wein (1989), which studies an optimal sequencing problem for a criss-cross network, and
Harrison and Wein (1990), which studies a multiclass two-station closed queueing network. In both cases,
the limiting Brownian models admit pathwise optimal solutions that have straightforward interpretations
in the original problem. Harrison and Wein show that their policies perform well. Since then, many other
researchers have followed the heavy traffic approach to study manufacturing and queueing systems; see, for example,
Wein (1990, 1991, 1992b), Chevalier and Wein (1993), Krichagina and Wein (1994), Reiman and Wein (1998),
Kumar (2000), Bell and Williams (2001), Plambeck et al. (2001), Markowitz and Wein (2001), Plambeck
(2004), Ata and Olsen (2009), Tezcan and Dai (2010), and Dai and Tezcan (2011).

Control of queueing systems often boils down to admission or service rate control, which have been
tackled either by heavy traffic approximations or Markov decision processes in the literature. The heavy
traffic approximations often result in drift rate control problems for Brownian models, which we review next.
Krichagina and Taksar (1992) and Krichagina et al. (1993) study service rate control problems and establish
the optimality of threshold-type policies. An important antecedent of our paper is Ata et al. (2005), which
studies a (one-dimensional) drift rate control problem of a reflected Brownian motion on a bounded interval.

Ata (2006) builds on Ata et al. (2005) to study the admission control of a multiclass single-server make-to-order
system. Another methodologically relevant paper is Ghosh and Weerasinghe (2007), which extends
Ata et al. (2005) by incorporating holding costs and allowing the system manager to choose the upper
barrier endogenously. The authors establish that an optimal solution exists and that it can be characterized
by solving the Bellman equation. We build on Ghosh and Weerasinghe (2007) by considering a drift rate
control problem on the entire real line and derive an explicit solution by solving the Bellman equation in
closed-form. Examples of other papers that study admission control, service rate control, or systems with
abandonment using heavy traffic approximations include Ghosh and Weerasinghe (2010), Budhiraja et al.
(2011), Ata and Tongarlak (2013), Rubino and Ata (2009), Weerasinghe (2015), and Ata et al. (2019).

As mentioned above, various authors have tackled admission or service rate control problems using Markov
decision processes. Some of the relevant papers in this area include Crabill (1972, 1974), Stidham and Weber
(1989), George and Harrison (2001), Ata (2005), Ata and Zachariadis (2007), Adusumilli and Hasenbein
(2010), and Kumar et al. (2013).

The third stream of literature studies the revenue management and pricing of queueing and manufacturing
systems; see Elmaghraby and Keskinocak (2003) and Gallego and Topaloglu (2019) for overviews of this
literature. The revenue management of manufacturing systems has been studied using various techniques.
The closest paper to ours in this stream of literature is Çelik and Maglaras (2008), which studies the problem
of dynamic pricing, lead-time quotation, outsourcing, and scheduling in a make-to-order manufacturing
system. It approximates the original problem with a drift rate control problem. The solution to the drift
rate control problem is used to propose an effective dynamic control policy. Ata and Olsen (2013) is another
related paper that considers the problem of dynamically quoting price and leadtime menus for customers
in a system where two classes of customers compete for a given resource. Customers have convex–concave
delay costs. The authors derive a policy that is asymptotically optimal in the heavy traffic limit. Another
related paper is Caldentey and Wein (2006), which considers a single-product make-to-stock queue that uses
two alternative selling channels: long-term contracts and a spot market of electronic orders.

Many papers, as done in this paper, study the revenue management of manufacturing and queueing
systems by mapping them to admission and service rate control problems. Stidham (1992) is among the first
papers in this literature; see also Yoon and Lewis (2004). Two of the most closely related papers in this area are
Vulcano (2008) and Xu and Chao (2009). Vulcano (2008) studies the problem of dynamic pricing in a single-
class make-to-stock queue. The system manager can set the prices and a threshold for the admissible backlog.
For the linear demand model, Vulcano (2008) obtains closed-form expressions for the optimal demand rates,
which are then used as an approximate solution to the original problem. Another related paper, Xu and
Chao (2009), studies the production rate control and pricing in a single-product make-to-stock queueing
system. The controller aims to maximize the long-run average profit, which consists of the sales profit minus

holding, backlog, and the linear cost of production effort. The controller can dynamically choose between a
high and a low price and choose a production rate within a bounded interval. A closed-form solution for the
prices is obtained and an algorithm is proposed to compute the optimal state-dependent price and production
rates. Some papers in the literature have also considered the objective of maximizing social welfare; see e.g.,
Mendelson (1985), Ata and Shneorson (2006), and Mendelson and Whang (1990). Afèche (2013) builds on
this literature by considering a price/delay menu design problem. Examples of other papers that consider
pricing in queueing and manufacturing systems include Ziya et al. (2006, 2008), Chao and Zhou (2006),
Gayon and Dallery (2007), and Afèche et al. (2013).

The fourth stream of literature studies the expediting and outsourcing of orders in manufacturing sys-
tems. These papers propose policies that resemble our threshold-type outsourcing policy. A related paper
in this area is Arslan et al. (2001), which studies the expediting of orders in a single-class make-to-order
manufacturing system. It shows that the optimal policy is a threshold-type policy when expediting has
both fixed and variable costs. When backlog reaches level S, expediting is used to bring it down to level s.
Bradley (2005) studies the joint scheduling and subcontracting of an M/M/1 manufacturing system. It shows
that the optimal policy is a dual base-stock policy. Bradley (2004) considers a single-class make-to-stock
manufacturing system, where inventory is either manufactured in-house or obtained from a subcontractor,
both of which have finite capacity. The author assumes a dual base-stock policy is used and provides a
closed-form expression for one optimal base-stock. Another related paper in this area is Huggins and Olsen
(2010), which considers a single-product, periodic-review inventory system. When shortages occur, the unmet
demand must be filled by expediting. The authors show that an (s, S) policy is optimal for regular produc-
tion. When the expediting cost is concave, the optimal expediting policy is a generalized (s, S) policy. When
the expediting cost consists of a fixed and linear per-unit cost, it is an order-up-to policy; see references in
Huggins and Olsen (2010) for further literature.

3 Model

This section advances a model of a make-to-stock manufacturing system selling K different products to
price-sensitive customers. We model the manufacturing system as a multiclass single server queue. Class $k$
products have a general production time distribution with mean $m_k$ and squared coefficient of variation $\nu_{sk}^2$
for $k = 1, \dots, K$; we refer to $\mu_k = m_k^{-1}$ as the production rate of class $k$ products. Define $m = (m_k)$ and let
$S_k(t)$ for $k = 1, \dots, K$ denote the number of class $k$ products manufactured until time $t$ if the system were to
continuously work on class k products up to time t. We model the demand over time as a non-homogeneous
Poisson process whose intensity depends on the prices the system manager charges. To be specific, demand
for product k arrives according to a non-homogeneous Poisson process with instantaneous rate λk (t) for t ≥ 0

and $k = 1, \dots, K$. That is, the demand for product $k$ up to time $t$ is
\[ N_k\left(\int_0^t \lambda_k(s)\,ds\right), \]

where Nk is a unit rate Poisson process. We assume that for k = 1, . . . , K, the processes Nk and Sk are
independent of each other. Letting λ(t) = (λk (t)) denote the instantaneous demand rate vector at time t,
we refer to λ = {λ(t) : t ≥ 0} as the instantaneous demand rate process.

The system manager chooses the price vector p(t) = (pk (t)) ∈ P for t ≥ 0, where pk (t) denotes the price
of product $k$ at time $t$ for $k = 1, \dots, K$ and $P \subset \mathbb{R}_+^K$ denotes the set of admissible price vectors. We refer
to $p = \{p(t) : t \ge 0\}$ as the price process. The price sensitivity of demand is captured by a non-negative
demand function Λ that maps the price vector to an instantaneous demand rate vector. That is,

λ(t) = Λ(p(t)), t ≥ 0.

The set of admissible instantaneous demand rate vectors, denoted by L, is given as follows:

\[ L = \{x : x = \Lambda(p) \text{ for some } p \in P\}. \]

Note that $L \subset \mathbb{R}_+^K$ because $\Lambda$ is non-negative by definition. Following Çelik and Maglaras (2008), we make
the following regularity assumptions on the demand function.

Assumption 1. The demand function Λ satisfies the following:

(i) It is bounded and continuously differentiable on P.

(ii) For each product k = 1, . . . , K, Λk is strictly decreasing in pk for all p ∈ P.

(iii) The set of admissible instantaneous demand rate vectors L is convex.

(iv) There exists a unique inverse demand function $\Lambda^{-1} : L \to P$ that maps each admissible instantaneous
demand rate vector to the price vector that induces it.

We denote the variable cost of manufacturing for product k by δk for k = 1, . . . , K and let δ = (δk )
denote the associated cost rate vector. Then, we define the profit rate function Π as a function of the demand
rate as follows:

\[ \Pi(x) = x'(\Lambda^{-1}(x) - \delta), \quad x \in L. \]

For simplicity, we make the following regularity assumption on the profit rate function.

Assumption 2. The profit rate function Π is twice continuously differentiable on L and has a negative
definite Hessian matrix.

An immediate consequence of Assumption 2 is that the profit rate function is strictly concave.
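As an illustration only (this specification is not part of the model above), suppose demand is linear, $\Lambda(p) = a - Bp$, where the intercept vector $a$ and the symmetric positive definite matrix $B$ are hypothetical parameters. Under this assumption, the inverse demand and profit rate functions take the following form:

```latex
\begin{align*}
\Lambda^{-1}(x) &= B^{-1}(a - x), \\
\Pi(x) &= x'\bigl(B^{-1}(a - x) - \delta\bigr) = x'(B^{-1}a - \delta) - x'B^{-1}x, \\
\nabla \Pi(x) &= B^{-1}a - \delta - 2B^{-1}x, \qquad \nabla^2 \Pi(x) = -2B^{-1}.
\end{align*}
```

Since $B^{-1}$ is positive definite, the Hessian is negative definite, so this example satisfies Assumption 2.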

In what follows, we view the instantaneous demand rate process λ as the system manager’s control, from
which the price process $p$ can be inferred using the inverse demand function $\Lambda^{-1}$, i.e., $p(t) = \Lambda^{-1}(\lambda(t))$
for t ≥ 0. In order to avoid high backorder costs, the system manager may outsource customer orders
when the backlog of work is judged excessive, thereby incurring outsourcing costs. We denote the number
of class k orders outsourced up to time t by Ok (t) for k = 1, . . . , K. Then, letting O(t) = (Ok (t)), we
define O = {O(t) : t ≥ 0} to be the K-dimensional outsourcing process. In addition to pricing and outsourcing
decisions, the system manager makes dynamic scheduling decisions. We allow preemptive-resume scheduling
and focus on head-of-line scheduling policies. Letting Tk (t) denote the cumulative amount of time devoted
to serving class k until time t, the system manager’s dynamic scheduling policy can be described by a K-
dimensional allocation process T = (Tk ). In summary, the dynamic control policy is denoted by (O, T, λ),
where O is the outsourcing process, T is the allocation process, and λ is the instantaneous demand rate
process.

We let $Q_k(t)$ denote the (possibly negative) number of class $k$ products in inventory at time $t$. Letting
$Q(t) = (Q_k(t))$, the process $Q = \{Q(t), t \ge 0\}$ is called the inventory process. Assuming the system is empty
initially, under policy $(O, T, \lambda)$, the inventory process $Q$ evolves as follows: For $k = 1, \dots, K$ and $t \ge 0$,
\[ Q_k(t) = S_k(T_k(t)) - N_k\left(\int_0^t \lambda_k(s)\,ds\right) + O_k(t). \tag{1} \]

The cumulative idleness process associated with scheduling policy $T$ is defined as follows:
\[ I(t) = t - \sum_{k=1}^{K} T_k(t), \quad t \ge 0. \tag{2} \]

The dynamic control policy (O, T, λ) is said to be feasible if it is non-anticipating and

I, O, T are non-decreasing with I(0) = O(0) = T (0) = 0, (3)

I, T are continuous, (4)

λ(t) ∈ L for t ≥ 0. (5)

We define the state cost function $\upsilon_k : \mathbb{R} \to \mathbb{R}_+$ that comprises holding and backlogging costs as follows:
For $k = 1, \dots, K$ and $x \in \mathbb{R}$,
\[ \upsilon_k(x) = \begin{cases} \alpha_k x, & x \ge 0, \\ -\beta_k x, & x < 0, \end{cases} \]

where αk > 0 and βk > 0 are the per unit holding and backorder costs of class k products, respectively. We
denote the outsourcing cost in excess of the production cost for a class k order by νk > 0 for k = 1, . . . , K.
For brevity, we refer to νk as the outsourcing cost for a class k order; to avoid confusion,
we refer to δk + νk as the total outsourcing cost for a class k order for k = 1, . . . , K.

Given $\Pi$, $\upsilon_k$, and $\nu_k$ for $k = 1, \dots, K$, we define the cumulative profit process associated with policy
$(O, T, \lambda)$ as follows:¹
\[ V(t) = \int_0^t \Pi(\lambda(s))\,ds - \sum_{k=1}^{K} \int_0^t \upsilon_k(Q_k(s))\,ds - \sum_{k=1}^{K} \nu_k O_k(t), \quad t \ge 0. \tag{6} \]

The first term on the right-hand side of (6) is a surrogate for the revenue (minus the variable costs) obtained
from selling the products. This term can be interpreted as the expected revenue obtained from selling the
products; see Plambeck et al. (2001), Çelik and Maglaras (2008), and Rubino and Ata (2009) for similar
treatments. The second term captures the holding cost associated with the unsold products and the backorder
cost associated with the unfulfilled orders. The third term is the cost associated with the outsourced orders.
Adopting the long-run average cost criterion, the system manager seeks to find the policy $(O, T, \lambda)$ so as to
\[ \text{maximize } \liminf_{t \to \infty} \frac{1}{t}\, E[V(t)] \quad \text{subject to (1)--(5)}. \tag{7} \]
Unfortunately, this formulation is a (nonlinear) multi-dimensional stochastic control problem. When the
production times are not exponentially distributed, it is also non-Markovian. Therefore, it is not analytically
tractable and suffers from the curse of dimensionality unless K is small. Following Harrison (1988, 2003),
Section 4 considers a sequence of closely related systems in the heavy-traffic regime and formulates the
approximating Brownian control problem, which is tractable analytically.
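To make the three terms of (6) concrete, the sketch below approximates V(t) along a given sample path by a left Riemann sum on a uniform time grid. The function names, the discretization, and the toy inputs are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def state_cost(q, alpha, beta):
    """Holding cost alpha*q for q >= 0 and backorder cost -beta*q for q < 0."""
    return np.where(q >= 0, alpha * q, -beta * q)

def cumulative_profit(profit_rate, lam_path, q_path, outsourced, alpha, beta, nu, dt):
    """Left-Riemann approximation of V(t) in (6) on a uniform grid of step dt.

    lam_path, q_path: arrays of shape (steps, K) with demand rates and inventories.
    outsourced: array of shape (K,) holding O_k(t) at the end of the horizon.
    """
    revenue = sum(profit_rate(lam) for lam in lam_path) * dt
    inventory_cost = state_cost(q_path, alpha, beta).sum() * dt
    outsourcing_cost = float(nu @ outsourced)
    return revenue - inventory_cost - outsourcing_cost

# Toy two-class path (all numbers hypothetical).
lam_path = np.tile(np.array([1.0, 2.0]), (4, 1))
q_path = np.array([[1.0, -1.0], [2.0, 0.0], [0.0, -2.0], [-1.0, 1.0]])
v = cumulative_profit(
    profit_rate=lambda x: float(x.sum()),   # placeholder for Pi
    lam_path=lam_path,
    q_path=q_path,
    outsourced=np.array([1.0, 0.0]),
    alpha=np.array([1.0, 2.0]),
    beta=np.array([3.0, 4.0]),
    nu=np.array([5.0, 6.0]),
    dt=0.5,
)
print(v)
```

Dividing the result by the horizon length gives a sample-path estimate of the long-run average objective in (7); averaging over many simulated paths would approximate the expectation.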

To facilitate the analysis to follow, ignoring the randomness in the system, consider the following static
planning problem: Find the instantaneous demand rate vector λ so as to

maximize Π(λ) subject to λ ∈ L. (8)

This formulation seeks to maximize the profit rate subject to the constraint λ ∈ L that ensures the instan-
taneous demand rate vector λ is achievable. If the system manager were to ignore the randomness in the
system, thus ignore all congestion-related costs, she would use an instantaneous demand rate that solves the
static planning problem (8). However, because of the randomness, she may benefit from dynamic adjust-
ments to the instantaneous demand rate. In order to model these dynamic adjustments, we next discuss the
Brownian approximation.
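As a hedged illustration of the static planning problem (8), the sketch below solves it in closed form for a hypothetical linear demand model $\Lambda(p) = a - Bp$ with $B$ symmetric positive definite. The demand specification and every numerical value are assumptions made for this example; the paper does not commit to a specific demand function here. The production rates are then chosen so that the resulting nominal demand rate balances the server.

```python
import numpy as np

# Hypothetical linear demand model (not from the paper): Lambda(p) = a - B p.
a = np.array([10.0, 8.0])              # demand intercepts (assumed)
B = np.array([[2.0, -0.5],
              [-0.5, 1.0]])            # symmetric positive definite sensitivities (assumed)
delta = np.array([1.0, 2.0])           # variable manufacturing costs (assumed)

B_inv = np.linalg.inv(B)

def profit_rate(x):
    """Pi(x) = x' (Lambda^{-1}(x) - delta), with Lambda^{-1}(x) = B^{-1} (a - x)."""
    return x @ (B_inv @ (a - x) - delta)

# First-order condition: B^{-1} a - delta - 2 B^{-1} x = 0, hence x = (a - B delta) / 2.
lam_star = 0.5 * (a - B @ delta)
p_star = B_inv @ (a - lam_star)        # prices that induce lam_star

# Choose production rates so that sum_k lam_star_k / mu_k = 1:
# each class receives an equal share 1/K of the server's time.
K = len(lam_star)
mu = K * lam_star
utilization = np.sum(lam_star / mu)

print(lam_star, p_star, profit_rate(lam_star), utilization)
```

Because the Hessian of this profit rate function is $-2B^{-1}$, which is negative definite, the stationary point is the unique maximizer whenever it lies in the interior of the admissible set.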

¹ Since the production costs are accounted for in the profit rate function, the third term in (6) only includes the
outsourcing cost in excess of the production cost. Moreover, since the fixed production cost does not depend on the
policy, it is excluded from the profit calculation.

4 Approximating Brownian Control Problem

In Brownian approximations, one considers a sequence of closely related systems indexed by a parameter
n, whose formal limit is the Brownian control problem. We attach a superscript n to various quantities of
interest corresponding to the n-th system in this sequence. We focus on the asymptotic regime where both
the demand and system capacity grow with n as specified below.

Assumption 3 (Heavy Traffic Assumption). We assume that for all $n \in \mathbb{N}$,
\[ \Lambda^n(x) = n\Lambda(x), \quad x \in P, \tag{9} \]
\[ \mu^n = n\mu + \sqrt{n}\,\eta, \tag{10} \]
where $\mu, \eta \in \mathbb{R}_+^K$ are given constants. Furthermore, the static planning problem (8) has a unique optimal
solution $\lambda^* \in \operatorname{interior}(L)$ that satisfies $\sum_{k=1}^{K} \lambda_k^*/\mu_k = 1$.

Assumption 3 states that the profit maximizing demand rate puts the system in heavy traffic. Since
$L \subset \mathbb{R}_+^K$, it follows from Assumption 3 that $\lambda_k^* > 0$ for $k = 1, \dots, K$. We refer to $\lambda^*$ as the nominal
instantaneous demand rate in order to emphasize the fact that $\lambda^*$ is derived from an idealized planning
problem in which stochastic variability is suppressed. It is straightforward to show using (9) that for $n \in \mathbb{N}$
and $x \in L$,
\[ (\Lambda^n)^{-1}(nx) = \Lambda^{-1}(x) \quad \text{and} \quad \Pi^n(nx) = n\Pi(x). \tag{11} \]
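For completeness, the following is a short sketch of why (11) holds. It assumes (this is not stated explicitly above) that the profit rate function of the $n$-th system is defined analogously to $\Pi$, i.e., $\Pi^n(y) = y'((\Lambda^n)^{-1}(y) - \delta)$ with the same cost vector $\delta$.

```latex
\begin{align*}
&\text{Fix } x \in L \text{ and let } p = \Lambda^{-1}(x). \text{ By (9), } \Lambda^n(p) = n\Lambda(p) = nx,
  \text{ so } (\Lambda^n)^{-1}(nx) = p = \Lambda^{-1}(x). \\
&\text{Consequently, } \Pi^n(nx) = (nx)'\bigl((\Lambda^n)^{-1}(nx) - \delta\bigr)
  = n\,x'\bigl(\Lambda^{-1}(x) - \delta\bigr) = n\,\Pi(x).
\end{align*}
```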

It is evident from (8) and (11) that if the system manager were to ignore the randomness in the system, thus
ignore all congestion-related and backlog-related costs, she would choose the instantaneous demand rate of
$n\lambda^*$. However, she may benefit from (dynamic) adjustments to the instantaneous demand rate. Moreover,
the system inventory is of second order relative to the system size in the heavy-traffic asymptotic regime.
Therefore, we focus our attention on instantaneous demand rate vectors of the following form: For all $n \in \mathbb{N}$
and some $\zeta : [0, \infty) \to \mathbb{R}^K$,
\[ \lambda^n(t) = n\lambda^* + \sqrt{n}\,\zeta(t), \quad t \ge 0; \tag{12} \]

see Ata (2006), Çelik and Maglaras (2008), and Ata et al. (2019) for similar treatments.
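The following numerical check illustrates what the heavy traffic scaling implies, reading (10) as $\mu^n = n\mu + \sqrt{n}\,\eta$. When demand sits at $n\lambda^*$, the utilization of the $n$-th system approaches one at rate $1/\sqrt{n}$, and $\sqrt{n}\,(1 - \text{utilization})$ converges to $\sum_k \rho_k \eta_k / \mu_k$ with $\rho_k = \lambda_k^*/\mu_k$. All numerical values are illustrative assumptions rather than parameters from the paper.

```python
import numpy as np

# Illustrative primitives (assumed, not from the paper).
lam_star = np.array([4.5, 3.25])   # nominal demand rates
mu = 2.0 * lam_star                # production rates, so that sum(lam_star / mu) = 1
eta = np.array([1.0, 0.5])         # second-order capacity terms in (10)

def utilization(n):
    """Traffic intensity of the n-th system when demand equals n * lam_star."""
    mu_n = n * mu + np.sqrt(n) * eta
    return np.sum(n * lam_star / mu_n)

rho = lam_star / mu
limit = np.sum(rho * eta / mu)     # predicted limit of sqrt(n) * (1 - utilization)

for n in (10**2, 10**4, 10**6):
    gap = np.sqrt(n) * (1.0 - utilization(n))
    print(n, utilization(n), gap, limit)
```

The spare capacity therefore shrinks with $n$ but remains visible after multiplying by $\sqrt{n}$, which is the scale at which the demand perturbation $\zeta$ operates.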

In other words, we study a large balanced-flow system for large $n$. For such systems, the inventory and
outsourcing processes are expected to be of order $\sqrt{n}$. Thus, we scale them accordingly. To be specific, for
$t \ge 0$, we define the scaled inventory and outsourcing processes as follows:
\[ Z^n(t) = \frac{Q^n(t)}{\sqrt{n}} \quad \text{and} \quad R^n(t) = \frac{O^n(t)}{\sqrt{n}}. \tag{13} \]
For $k = 1, \dots, K$, let $\rho_k = \lambda_k^*/\mu_k$ denote the proportion of time the system manager should devote to class

$k$. As argued in Harrison (1988) (see also Harrison (2000, 2003)), any policy worthy of consideration satisfies
$T_k^n(t) \approx \rho_k t$ for $t \ge 0$ and large $n$. That is, $\rho_k$ should give a first-order approximation to the fraction of
time the system manager allocates to class $k$ for $k = 1, \dots, K$ under allocation process $T^n$. However, the
system manager can choose the second-order, i.e., order $1/\sqrt{n}$, deviations from $\rho_k$. In order to capture these
deviations, we define the centered and scaled allocation and the scaled idleness processes for product $k$ for
$k = 1, \dots, K$ as follows:
\[ Y_k^n(t) = \sqrt{n}\,(\rho_k t - T_k^n(t)) \quad \text{and} \quad U^n(t) = \sqrt{n}\,I^n(t), \quad t \ge 0. \tag{14} \]

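We assume that the outsourcing cost $\nu_k^n$, holding cost $\alpha_k^n$, and backorder cost $\beta_k^n$ vary with n as follows: For
k = 1, . . . , K,
\[ \nu_k^n = \frac{r_k}{\sqrt{n}}, \quad \alpha_k^n = \frac{h_k}{\sqrt{n}}, \quad\text{and}\quad \beta_k^n = \frac{b_k}{\sqrt{n}}, \tag{15} \]
where $r_k$, $h_k$, and $b_k$ are given constants. It is immediate from (15) that for k = 1, . . . , K and x ∈ R,
\[ \upsilon_k^n(x) = \frac{g_k(x)}{\sqrt{n}}, \quad\text{where}\quad g_k(x) = \begin{cases} h_k x, & x \ge 0, \\ -b_k x, & x < 0. \end{cases} \tag{16} \]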

The next result shows that the profit under the deterministic system is an upper bound on the system
performance; see Appendix O for its proof.

Proposition 1. We have $V^n(t) \le n\,\Pi(\lambda^\star)\,t$ for t ≥ 0.

Motivated by Proposition 1, we define the cumulative cost process as the deviation of the cumulative
profit process $V^n$ from that in the corresponding deterministic system, $n\,\Pi(\lambda^\star)\,t$. That is,

\[ \xi^n(t) = n\,\Pi(\lambda^\star)\,t - V^n(t), \quad t \ge 0. \tag{17} \]

In a similar fashion to Harrison (1988), Appendix A.1 formally derives the approximating Brownian control
problem as n gets large. In the approximating Brownian control problem, the performance-related processes
$\xi^n$, $Z^n$, $Y^n$, $U^n$, and $R^n$ are replaced with their formal limits ξ, Z, Y, U, and R, which jointly satisfy the
following for t ≥ 0:
\begin{align}
\xi(t) &= \int_0^t \zeta(s)' H \zeta(s)\,ds + \sum_{k=1}^{K} \int_0^t g_k(Z_k(s))\,ds + \sum_{k=1}^{K} r_k R_k(t), \tag{18} \\
Z_k(t) &= X_k(t) - \int_0^t \zeta_k(s)\,ds - \mu_k Y_k(t) + R_k(t), \quad k = 1, \ldots, K, \tag{19} \\
U(t) &= \sum_{k=1}^{K} Y_k(t), \tag{20} \\
& U, R \text{ are nondecreasing with } U(0) = R(0) = 0, \tag{21}
\end{align}

where $H = -\nabla^2 \Pi(\lambda^\star)/2$ and $X_k = \{X_k(t), t \ge 0\}$ are independent Brownian motions with infinitesimal
drift $\rho_k \eta_k$ and infinitesimal variance $\sigma_k^2 = \lambda_k^\star (1 + \nu_{sk}^2)$. Equation (18) states that the cumulative cost process
comprises three terms. The first term captures the impact of the instantaneous demand rate on the profit
rate. The second term captures the holding and backorder cost. The third term captures the outsourcing
cost. The process ξ is (the limit of) the centered and scaled version of the process V; see Equations (6)
and (17). Equation (19), the counterpart of Equation (1), captures the impact of the demand rate and the
scheduling and outsourcing decisions on the scaled inventory process. Equation (20), the counterpart of
Equation (2), captures the impact of the scheduling decisions on the scaled idleness process. Equation (21),
the counterpart of Equation (3), ensures that the idleness and outsourcing decisions are not undone.

Hereafter, we call the adapted control (R, Y, ζ) admissible if the state process Z satisfies
\[ \limsup_{t \to \infty} \frac{E[\|Z(t)\|]}{t} = 0, \tag{22} \]
where $\|\cdot\|$ denotes the Euclidean norm. The approximating Brownian Control Problem (BCP) can now be
stated as follows: Choose the adapted control (R, Y, ζ) to
\[ \text{minimize} \quad \limsup_{t \to \infty} \frac{1}{t}\, E\big[\xi(t)\big] \quad \text{subject to (18)-(22)}. \tag{23} \]

As mentioned earlier, we consider a sequence of closely related systems indexed by n, whose formal limit
is the Brownian control problem. The original control problem can be viewed as a specific element of this
sequence of problems, which is determined by the particular choice of the system parameter n. Therefore,
the final step in the Brownian approximation is to choose a system parameter n, which will be used to
unscale the processes. The underlying assumption of the Brownian approximation is that the production
and instantaneous demand rates (for the original problem of interest modeled in Section 3) are large enough
so that various (scaled) processes of the original system can be approximated by the corresponding processes
of the Brownian control problem. Eventually, one uses the same system parameter to interpret the solution
to the Brownian control problem in the context of the original control problem.

We conclude this section by discussing the dynamic pricing policy prescribed by the BCP (23). Recall
from (12) that the instantaneous demand rate $\lambda^n(t)$ is of the form $\lambda^n(t) = n\lambda^\star + \sqrt{n}\,\zeta(t)$. It follows from
Equation (9) and the one-to-one correspondence between the instantaneous demand rate process and the
price process that the proposed price process in the n-th system is equal to the nominal price vector, $\Lambda^{-1}(\lambda^\star)$,
plus a term of order $1/\sqrt{n}$. To be specific, we have that
\[ p^n(t) \simeq \Lambda^{-1}(\lambda^\star) + \frac{\nabla \Lambda^{-1}(\lambda^\star)\,\zeta(t)}{\sqrt{n}}, \quad t \ge 0, \tag{24} \]
where $\nabla \Lambda^{-1}(\lambda^\star)$ denotes the Jacobian of $\Lambda^{-1}$ evaluated at $\lambda^\star$; see Appendix A.2 for its derivation.
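
The first-order expansion behind (24) can be sketched as follows. This is only an illustration, under the assumptions that the price in the n-th system is recovered from the demand rate through $(\Lambda^n)^{-1}$, that the demand rate has the form $\lambda^n(t) = n\lambda^\star + \sqrt{n}\,\zeta(t)$, and that $\Lambda^{-1}$ is differentiable at $\lambda^\star$; Appendix A.2 contains the actual derivation.

```latex
\begin{align*}
p^n(t) &= (\Lambda^n)^{-1}\big(\lambda^n(t)\big)
        = \Lambda^{-1}\big(\lambda^n(t)/n\big) && \text{by (11)} \\
       &= \Lambda^{-1}\big(\lambda^\star + \zeta(t)/\sqrt{n}\big) && \text{by (12)} \\
       &= \Lambda^{-1}(\lambda^\star)
          + \frac{\nabla \Lambda^{-1}(\lambda^\star)\,\zeta(t)}{\sqrt{n}}
          + o\big(1/\sqrt{n}\big) && \text{(first-order Taylor expansion)}.
\end{align*}
```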

5 Equivalent Workload Formulation

Although the Brownian control problem advanced in Section 4 is simpler than the original control problem it
approximates, it is not easy to solve because it remains a multidimensional stochastic control problem. This
section develops a one-dimensional formulation that admits a closed-form solution and that is equivalent to
the Brownian control problem. To this end, we define the one-dimensional workload process W as follows:
\[ W(t) = \sum_{k=1}^{K} m_k Z_k(t), \quad t \ge 0, \tag{25} \]

which represents the total (scaled) inventory (or backlog) in the system at time t measured in hours of total
work for the server. Pre-multiplying Equation (19) by $m'$ gives
\[ W(t) = B(t) - \int_0^t \theta(s)\,ds - U(t) + L(t), \quad t \ge 0, \]
which describes the evolution of the workload process, where
\[ B(t) = \sum_{k=1}^{K} m_k X_k(t), \quad \theta(t) = \sum_{k=1}^{K} m_k \zeta_k(t), \quad U(t) = \sum_{k=1}^{K} Y_k(t), \quad\text{and}\quad L(t) = \sum_{k=1}^{K} m_k R_k(t), \quad t \ge 0. \tag{26} \]

We refer to L = {L(t) : t ≥ 0} as the effective outsourcing process and refer to θ = {θ(t) : t ≥ 0} as the
effective drift rate control process. One can interpret U (t) as the cumulative (scaled) idleness by time t and
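interpret L(t) as the cumulative (scaled) amount of work outsourced by time t. Defining the indices $j^\star$ and
$l^\star$ as
\[ j^\star = \operatorname{argmin}\Big\{ \frac{h_k}{m_k} : k = 1, \ldots, K \Big\} \quad\text{and}\quad l^\star = \operatorname{argmin}\Big\{ \frac{b_k}{m_k} : k = 1, \ldots, K \Big\}, \]
let $h^\star = h_{j^\star}/m_{j^\star}$, $b^\star = b_{l^\star}/m_{l^\star}$, and define the effective holding cost function as follows: For w ∈ R,
\[ h(w) = \begin{cases} h^\star w, & w \ge 0, \\ -b^\star w, & w < 0. \end{cases} \tag{27} \]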

The effective holding cost function can be interpreted as the state cost associated with the following policy:
If workload is positive, it is held as inventory of product $j^\star$, the cheapest product to hold inventory per unit
of work. If workload is negative, it is held as backordered requests of product $l^\star$, the cheapest product to
have backordered requests per unit of work; see Wein (1992a) for a similar interpretation. Also, define the
effective outsourcing cost and the corresponding class as follows:
\[ i^\star = \operatorname{argmin}\Big\{ \frac{r_k}{m_k} : k = 1, \ldots, K \Big\} \quad\text{and}\quad \kappa = \frac{r_{i^\star}}{m_{i^\star}}. \tag{28} \]
The effective outsourcing cost κ is the outsourcing cost associated with the greedy outsourcing policy that
only outsources product $i^\star$, which is the cheapest product to outsource in the sense that it has the lowest
outsourcing cost per unit of work outsourced. Moreover, we define the cost function c(x) associated with
the effective drift rate x and the corresponding optimal drift rate vector $\zeta^\star(x)$ as follows:
\[ c(x) = \min\big\{ \zeta' H \zeta : m'\zeta = x,\ \zeta \in \mathbb{R}^K \big\} \quad\text{and}\quad \zeta^\star(x) = \operatorname{argmin}\big\{ \zeta' H \zeta : m'\zeta = x,\ \zeta \in \mathbb{R}^K \big\}, \quad x \in \mathbb{R}. \]

The next lemma provides closed-form expressions for c and $\zeta^\star$.

Lemma 1. For x ∈ R, we have that
\[ c(x) = \frac{x^2}{m' H^{-1} m} \quad\text{and}\quad \zeta^\star(x) = \frac{H^{-1} m}{m' H^{-1} m}\, x. \]
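
As an informal numerical check of Lemma 1, the sketch below evaluates the closed forms for a small two-product instance and compares the cost with a brute-force search along the constraint line m'ζ = x. The matrix `H`, the vector `m`, and all function names are hypothetical choices made for illustration; they do not come from the paper.

```python
def inverse_2x2(H):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = H
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def dot(x, y):
    """Inner product of two equal-length lists."""
    return sum(xi * yi for xi, yi in zip(x, y))


def mat_vec(A, x):
    """Matrix-vector product for nested-list matrices."""
    return [dot(row, x) for row in A]


def drift_cost_and_allocation(x, H, m):
    """Closed forms of Lemma 1: returns (c(x), zeta_star(x))."""
    h_inv_m = mat_vec(inverse_2x2(H), m)
    scale = dot(m, h_inv_m)  # the scalar m' H^{-1} m
    return x * x / scale, [x * v / scale for v in h_inv_m]


def brute_force_cost(x, H, m, grid=4001, span=10.0):
    """Minimize zeta' H zeta over the line m' zeta = x by a 1-D grid search."""
    best = float("inf")
    for i in range(grid):
        z1 = -span + 2 * span * i / (grid - 1)
        z2 = (x - m[0] * z1) / m[1]  # enforce the constraint m' zeta = x
        zeta = [z1, z2]
        best = min(best, dot(zeta, mat_vec(H, zeta)))
    return best


# Hypothetical instance: H is symmetric positive definite, m has positive entries.
H = [[2.0, 0.5], [0.5, 1.0]]
m = [0.5, 1.0]
cost, zeta_star = drift_cost_and_allocation(3.0, H, m)
```

For this instance m'H^{-1}m equals one, so the closed-form cost reduces to x squared and the whole adjustment is placed on the second product.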
The Equivalent Workload Formulation (EWF) is then formulated as follows: Choose the adapted control
(L, U, θ) to
\[ \text{minimize} \quad \limsup_{t \to \infty} \frac{1}{t}\, E\Big[ \int_0^t c(\theta(s))\,ds + \int_0^t h(W(s))\,ds + \kappa L(t) \Big] \tag{29} \]
subject to
\begin{align}
W(t) &= B(t) - \int_0^t \theta(s)\,ds - U(t) + L(t), \tag{30} \\
& L, U \text{ are nondecreasing with } L(0) = U(0) = 0, \tag{31}
\end{align}
where B is a Brownian motion with infinitesimal drift $\mu = \sum_{k=1}^{K} m_k \rho_k \eta_k$ and infinitesimal variance
$\sigma^2 = \sum_{k=1}^{K} m_k^2 \sigma_k^2$. We call the adapted control (L, U, θ) admissible if it satisfies (30)-(31) and
\[ \limsup_{t \to \infty} \frac{E[|W(t)|]}{t} = 0. \tag{32} \]
We refer to W as the workload process associated with control (L, U, θ). Proposition 4 (in Appendix B)
establishes the equivalence of the EWF (29)-(31) and the BCP (23). Proposition 4 shows that given the
workload w, the lifting map $\Delta : \mathbb{R} \to \mathbb{R}^K$ given by
\[ \Delta_k(w) = \begin{cases} w/m_k, & \text{if } k = j^\star \text{ and } w \ge 0, \\ w/m_k, & \text{if } k = l^\star \text{ and } w < 0, \\ 0, & \text{otherwise}, \end{cases} \tag{33} \]
for k = 1, . . . , K yields the optimal workload configuration. This lifting map distributes the workload as
follows: If workload is positive, it is held as inventory of product $j^\star$, the cheapest product to hold inventory
per unit of work. If workload is negative, it is held as backordered requests of product $l^\star$, the cheapest
product to have backordered requests per unit of work.
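
The lifting map (33) and the index definitions preceding (27) can be sketched in a few lines. The function names are hypothetical, indices are zero-based, and the numerical values are borrowed from the simulation study of Section 8 purely for illustration (with $m_k$ taken to be $1/\mu_k$, which is an assumption here).

```python
def cheapest_per_unit_work(costs, m):
    """Index k minimizing costs[k] / m[k], as in the definitions of j*, l*, and i*."""
    return min(range(len(m)), key=lambda k: costs[k] / m[k])


def lifting_map(w, m, j_star, l_star):
    """Map a scalar workload w to a K-vector of scaled inventories, following (33)."""
    z = [0.0] * len(m)
    if w >= 0:
        z[j_star] = w / m[j_star]  # hold positive workload as inventory of j*
    else:
        z[l_star] = w / m[l_star]  # hold negative workload as backorders of l*
    return z


# Illustrative values; m_k = 1 / mu_k is an assumption made for this sketch.
m = [1 / 94.7, 1 / 47.4, 1 / 31.6]
holding = [0.005, 0.005, 0.005]
backorder = [0.01, 0.01, 0.01]
j_star = cheapest_per_unit_work(holding, m)
l_star = cheapest_per_unit_work(backorder, m)
z = lifting_map(0.5, m, j_star, l_star)
```

By construction, the workload of the lifted vector, i.e., the sum of `m[k] * z[k]`, reproduces the input `w`.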

6 Solution to the Equivalent Workload Formulation

This section solves the EWF (29)-(31). In order to minimize technical complexity, we restrict our attention
to stationary Markov control policies. That is, the (effective) drift rate θ(t) chosen at any time t is assumed
to depend on past history only through the workload W (t) at time t. To reflect this assumption, in what
follows, we write θ(W(t)) as opposed to θ(t). The optimal outsourcing and idling policies we derive can be
viewed as a barrier policy, which we define in Section 6.1.

6.1 Barrier Policies

The optimal outsourcing and idling policy we derive belongs to a class of barrier policies that we introduce
next. Below, we also derive an auxiliary identity for these barrier policies, which facilitates our solution
approach; see Equations (35)-(36) and Proposition 2. Following Harrison (2013, Section 7.7), we define a
barrier policy as follows.

Definition 1 (Barrier Policy). Given policy parameters l, u ∈ R with l < u, we call (L, U, θ) a barrier policy
if it is an admissible control for the EWF (29)-(31), and it satisfies W(t) ∈ [l, u],
\[ \int_0^t 1_{\{W(s) > l\}}\,dL(s) = 0 \quad\text{and}\quad \int_0^t 1_{\{W(s) < u\}}\,dU(s) = 0, \quad t \ge 0. \tag{34} \]

Processes L and U enforce the lower and upper reflecting barriers at l and u, respectively. Let $C^2[l, u]$
denote the space of functions f : [l, u] → R that are twice continuously differentiable up to the boundary
(i.e., f is twice continuously differentiable on the interior of the interval, and its first and second derivatives
approach finite limits at the end points). The next proposition, proved in Appendix O, provides a useful
identity that can be used to compute the long-run average cost associated with a barrier policy. It also helps
motivate the Bellman equation. As a preliminary to the proposition, for a given real-valued function θ(·),
define the differential operator $\Gamma_\theta$ as follows:
\[ \Gamma_\theta f(w) = \frac{1}{2}\sigma^2 f''(w) + \mu f'(w) - \theta(w) f'(w), \quad w \in (l, u). \]
Consider γ ∈ R and $f \in C^2[l, u]$ and assume that they jointly satisfy

\[ \Gamma_\theta f(w) + c(\theta(w)) + h(w) = \gamma, \quad w \in (l, u), \tag{35} \]

subject to the boundary conditions

\[ f'(l) = -\kappa \quad\text{and}\quad f'(u) = 0. \tag{36} \]

Proposition 2. Consider the barrier policy (L, U, θ) with a lower barrier at l and an upper barrier at u. If
γ ∈ R and $f \in C^2[l, u]$ jointly satisfy (35)-(36), then
\[ \lim_{t \to \infty} \frac{1}{t}\, E\Big[ \int_0^t c(\theta(W(s)))\,ds + \int_0^t h(W(s))\,ds + \kappa L(t) \Big] = \gamma. \]

Consider the barrier policy (L, U, θ) with a lower barrier at l and an upper barrier at u. Proposition 2
states that if γ ∈ R and $f \in C^2[l, u]$ jointly satisfy
\[ \frac{1}{2}\sigma^2 f''(w) + \mu f'(w) - \theta(w) f'(w) + c(\theta(w)) + h(w) = \gamma, \quad w \in (l, u), \]
subject to the boundary conditions $f'(l) = -\kappa$ and $f'(u) = 0$, then γ is the long-run average cost associated
with the barrier policy. Building on this result, the next section proposes the Bellman equation that will be
used to solve the EWF (29)-(31).

6.2 Bellman Equation

Ata et al. (2005, Section 3.1) motivates the derivation of the Bellman equation for a related drift-rate control
problem under the average cost criterion; see also Rubino and Ata (2009), which incorporates state costs as
we do here. The boundary conditions given in Equation (38) are similar to those in Ata et al. (2005). We
further augment the Bellman equation with the smooth pasting conditions given in Equation (39), because
the barriers l and u are to be chosen optimally; see Harrison (2013, Section 7.7) for the intuition behind such
constraints associated with the optimal barriers. Combining these leads to the following Bellman equation
for our workload formulation: find l, u, γ ∈ R and $f \in C^2[l, u]$ that satisfy

\[ \min_{x \in \mathbb{R}} \Big\{ \frac{1}{2}\sigma^2 f''(w) + \mu f'(w) - x f'(w) + c(x) + h(w) \Big\} = \gamma, \quad w \in (l, u), \tag{37} \]

subject to the boundary conditions

\[ f'(l) = -\kappa \quad\text{and}\quad f'(u) = 0, \tag{38} \]

and the smooth pasting conditions

\[ f''(l) = 0 \quad\text{and}\quad f''(u) = 0. \tag{39} \]

Motivated by Proposition 2, we interpret γ as a guess at the minimum average cost and interpret l and u as
the lower and upper reflecting barriers to be imposed on the workload process. The unknown function f is
often called the relative value function in average cost dynamic programming.

The Bellman equation is introduced primarily to motivate our solution approach; the properties of the
Bellman equation that we require will be proved from first principles. We shall develop an explicit solution
(l, u, γ, f ) of the Bellman equation, and define the candidate policy as the one that chooses in each state
w the (effective) drift rate θ(w) equal to the minimizer x in (37). Then, we will prove that this candidate
policy is optimal.

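As a preliminary to analyzing (37)-(39), following Ata et al. (2005, Section 2), define the convex conjugate
φ of c, and its derivative ψ, as follows:
\[ \phi(y) = \sup_{x \in \mathbb{R}} \big\{ yx - c(x) \big\} \quad\text{and}\quad \psi(y) = \operatorname*{argmax}_{x \in \mathbb{R}} \big\{ yx - c(x) \big\}, \quad y \in \mathbb{R}. \tag{40} \]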

The following lemma provides closed-form expressions for φ and ψ.

Lemma 2. For y ∈ R, we have that
\[ \phi(y) = \frac{m' H^{-1} m}{4}\, y^2 \quad\text{and}\quad \psi(y) = \frac{m' H^{-1} m}{2}\, y. \tag{41} \]
Since Equation (37) does not involve the unknown function f itself, it is really a first-order equation.
Setting $v(w) = f'(w)$ and using the definition (40), we rewrite (37) as

\[ \frac{1}{2}\sigma^2 v'(w) + \mu v(w) + h(w) - \phi(v(w)) = \gamma, \quad w \in (l, u). \tag{42} \]

Then, using Lemma 2 and rearranging the terms in (42), we rewrite the Bellman equation as follows: find
l, u, γ ∈ R and $v \in C^1[l, u]$ that satisfy²
\[ v'(w) = \frac{m' H^{-1} m}{2\sigma^2}\, v^2(w) - \frac{2\mu}{\sigma^2}\, v(w) + \frac{2(\gamma - h(w))}{\sigma^2}, \quad w \in (l, u), \tag{43} \]
subject to the boundary conditions

\[ v(l) = -\kappa \quad\text{and}\quad v(u) = 0, \tag{44} \]

and the smooth pasting conditions

\[ v'(l) = 0 \quad\text{and}\quad v'(u) = 0. \tag{45} \]

Equation (43) is a Riccati equation; see Appendix C for a brief overview of Riccati equations.
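
For readers who want the intermediate algebra, the passage from (42) to (43) can be written out as follows. It uses only the closed form of φ from Lemma 2; the minimizer in (37) is $x = \psi(v(w))$.

```latex
\begin{align*}
\tfrac{1}{2}\sigma^2 v'(w)
  &= \gamma - h(w) - \mu v(w) + \phi(v(w)) && \text{rearranging (42)} \\
  &= \gamma - h(w) - \mu v(w) + \frac{m' H^{-1} m}{4}\, v^2(w) && \text{by (41)}, \\
v'(w)
  &= \frac{m' H^{-1} m}{2\sigma^2}\, v^2(w) - \frac{2\mu}{\sigma^2}\, v(w)
     + \frac{2(\gamma - h(w))}{\sigma^2} && \text{multiplying by } 2/\sigma^2 .
\end{align*}
```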

6.3 Solution to the Bellman Equation

To solve the Bellman equation, we proceed in three steps. First, we use the boundary and smooth pasting
conditions (44)-(45) to express l, u in terms of γ and show that l < 0 < u. Second, for each γ, we solve
Equation (43) on [l, 0] and [0, u], separately. This step exploits the special structure of the Bellman equation,
i.e., the solution of the Riccati equation. Finally, we use the desired continuity of v(w) at w = 0 to pin down
γ, which completes the solution of the Bellman equation.

To this end, we substitute $v(l) = -\kappa$ and $v'(l) = 0$ into Equation (43) for w = l to obtain
\[ \gamma = h(l) - \mu\kappa - \frac{m' H^{-1} m}{4}\,\kappa^2. \tag{46} \]
Similarly, substituting $v(u) = v'(u) = 0$ into Equation (43) for w = u gives

\[ \gamma = h(u). \tag{47} \]

Combining (46) and (47) gives

\[ h(u) = h(l) - \mu\kappa - \frac{m' H^{-1} m}{4}\,\kappa^2. \tag{48} \]

² Note that $C^1[l, u]$ denotes the space of functions v : [l, u] → R that are continuously differentiable on (l, u), and
whose first derivatives approach finite limits at l and u.
The next result uses Equations (46)-(48) to obtain useful structural insights.

Lemma 3. Assume l, u, γ ∈ R and $v \in C^1[l, u]$ jointly satisfy (43)-(45). Then, we have that γ > 0 and
\[ l = -\frac{1}{b^\star}\Big( \gamma + \mu\kappa + \frac{m' H^{-1} m}{4}\,\kappa^2 \Big) < 0 \quad\text{and}\quad u = \frac{\gamma}{h^\star} > 0. \tag{49} \]

Next, we split the analysis of the Bellman equation (43)-(45) into two sub-intervals: [l, 0] and [0, u]. We
fix γ and solve the Bellman equation (43)-(45) on each sub-interval, separately. Then, we use the continuity
of v at zero to pin down γ. To do so, fix γ ≥ 0 and define³
\[ l_\gamma = -\frac{1}{b^\star}\Big( \gamma + \mu\kappa + \frac{m' H^{-1} m}{4}\,\kappa^2 \Big) < 0 \quad\text{and}\quad u_\gamma = \frac{\gamma}{h^\star} \ge 0. \]
Substituting (46) into (43) and focusing on the interval $(l_\gamma, 0]$ yields the following:
\[ v'(w) = \frac{m' H^{-1} m}{2\sigma^2}\big( v^2(w) - \kappa^2 \big) - \frac{2\mu}{\sigma^2}\big( v(w) + \kappa \big) + \frac{2\big( h(l_\gamma) - h(w) \big)}{\sigma^2}, \quad w \in (l_\gamma, 0], \tag{50} \]
subject to the boundary condition

\[ v(l_\gamma) = -\kappa. \tag{51} \]

It is straightforward to verify by substituting (51) into (50) for $w = l_\gamma$ that the solution to (50)-(51) satisfies
the smooth pasting condition $v'(l_\gamma) = 0$.

Similarly, substituting (47) into (43) and focusing on the interval $[0, u_\gamma)$ yields the following:
\[ v'(w) = \frac{m' H^{-1} m}{2\sigma^2}\, v^2(w) - \frac{2\mu}{\sigma^2}\, v(w) + \frac{2\big( h(u_\gamma) - h(w) \big)}{\sigma^2}, \quad w \in [0, u_\gamma), \tag{52} \]
subject to the boundary condition

\[ v(u_\gamma) = 0. \tag{53} \]

It is again straightforward to verify by substituting (53) into (52) for $w = u_\gamma$ that the solution to (52)-(53)
satisfies the smooth pasting condition $v'(u_\gamma) = 0$.

It is straightforward to show that (50)-(51) has a unique continuously differentiable solution, denoted
by $v_\gamma^- : [l_\gamma, 0] \to \mathbb{R}$, for each γ ≥ 0. Similarly, for γ ≥ 0, (52)-(53) has a unique continuously differentiable
solution, denoted by $v_\gamma^+ : [0, u_\gamma] \to \mathbb{R}$. Lemma 4 shows that $v_\gamma^-(0)$ is strictly increasing in γ and $v_\gamma^+(0)$ is
strictly decreasing in γ; see Appendix O for its proof.

Lemma 4. We have the following:

(i) $v_\gamma^-(0)$ is continuous and strictly increasing in γ with $v_0^-(0) < 0$ and $\lim_{\gamma \to \infty} v_\gamma^-(0) = \infty$.

(ii) $v_\gamma^+(0)$ is continuous and strictly decreasing in γ with $v_0^+(0) = 0$ and $\lim_{\gamma \to \infty} v_\gamma^+(0) = -\infty$.

³ Although Lemma 3 proves that the average cost parameter γ that solves the Bellman equation is strictly positive,
we do not assume this a priori and allow γ ≥ 0 for mathematical convenience.
The following corollary is immediate from Lemma 4, and the unique γ characterized in the corollary
is illustrated in Figure 4 (in Appendix D). Corollary 1 plays a crucial role in solving the Bellman equation
(43)-(45).

Corollary 1. There exists a unique $\gamma^\star$ such that $v_{\gamma^\star}^-(0) = v_{\gamma^\star}^+(0)$.

Equations (50)-(51) and (52)-(53) belong to a class of Riccati differential equations that admits unique $C^1$
solutions; see, e.g., Zaitsev and Polyanin (2002, Section 1.2.2). Appendix D uses the structure of the Riccati
equations to obtain closed-form expressions for $v_\gamma^-$ and $v_\gamma^+$. The parameter $\gamma^\star$, characterized in Corollary 1,
can be computed using the closed-form expressions for $v_\gamma^-$ and $v_\gamma^+$ provided in Lemma 7 (in Appendix D). By
Lemma 4, this computation can be done using a simple line search. Given $\gamma^\star$, we define
\[ v(w) = \begin{cases} v_{\gamma^\star}^-(w), & w \in [l_{\gamma^\star}, 0], \\ v_{\gamma^\star}^+(w), & w \in (0, u_{\gamma^\star}]. \end{cases} \tag{54} \]

Proposition 3 shows that $(l_{\gamma^\star}, u_{\gamma^\star}, \gamma^\star, v)$ is the unique solution to the Bellman equation (43)-(45); see
Appendix O for its proof.

Proposition 3. The function v is non-positive, strictly increasing, and continuously differentiable on
$[l_{\gamma^\star}, u_{\gamma^\star}]$. Moreover, $(l_{\gamma^\star}, u_{\gamma^\star}, \gamma^\star, v)$ is the unique solution to (43)-(45).
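
The line search for $\gamma^\star$ can be illustrated numerically. The sketch below does not use the closed-form expressions of Lemma 7; as a stand-in, it integrates the Riccati equation (43) with a classical fourth-order Runge-Kutta scheme from each barrier to zero and bisects on the continuity gap $v_\gamma^-(0) - v_\gamma^+(0)$, which Lemma 4 implies is increasing in γ. All parameter values, the dictionary keys, the function names, and the blow-up guard `cap` are hypothetical; `a` stands for $m' H^{-1} m$.

```python
def holding_cost(w, h_star, b_star):
    """Effective holding cost h(w) from (27)."""
    return h_star * w if w >= 0 else -b_star * w


def riccati_rhs(w, v, gamma, p):
    """Right-hand side of the Riccati equation (43)."""
    h = holding_cost(w, p["h_star"], p["b_star"])
    return (p["a"] / (2 * p["sigma2"]) * v * v
            - 2 * p["mu"] / p["sigma2"] * v
            + 2 * (gamma - h) / p["sigma2"])


def value_at_zero(w_start, v_start, gamma, p, steps=1000, cap=1e6):
    """Integrate (43) from w_start to 0 with RK4 (the step is negative if w_start > 0)."""
    step = (0.0 - w_start) / steps
    w, v = w_start, v_start
    for _ in range(steps):
        k1 = riccati_rhs(w, v, gamma, p)
        k2 = riccati_rhs(w + step / 2, v + step * k1 / 2, gamma, p)
        k3 = riccati_rhs(w + step / 2, v + step * k2 / 2, gamma, p)
        k4 = riccati_rhs(w + step, v + step * k3, gamma, p)
        v += step * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        w += step
        if abs(v) > cap:  # numerical blow-up guard; an assumption of this sketch
            return cap if v > 0 else -cap
    return v


def barriers(gamma, p):
    """Barriers l_gamma and u_gamma for a given gamma, as defined before (50)."""
    l = -(gamma + p["mu"] * p["kappa"] + p["a"] * p["kappa"] ** 2 / 4) / p["b_star"]
    return l, gamma / p["h_star"]


def continuity_gap(gamma, p):
    """v_gamma^-(0) - v_gamma^+(0), using boundary conditions (51) and (53)."""
    l, u = barriers(gamma, p)
    v_minus = value_at_zero(l, -p["kappa"], gamma, p)
    v_plus = value_at_zero(u, 0.0, gamma, p)
    return v_minus - v_plus


def solve_gamma_star(p, iterations=60):
    """Bisection on the continuity gap, which is negative at gamma = 0."""
    lo, hi = 0.0, 1.0
    while continuity_gap(hi, p) <= 0:
        hi *= 2
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if continuity_gap(mid, p) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


# Hypothetical parameters chosen only to exercise the code.
params = {"a": 1.0, "sigma2": 1.0, "mu": 0.0, "h_star": 1.0, "b_star": 2.0, "kappa": 1.0}
gamma_star = solve_gamma_star(params)
l_star, u_star = barriers(gamma_star, params)
```

Given `gamma_star`, the drift rate in (56) at any workload can be obtained by integrating from the nearer barrier to that workload instead of to zero.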

To solve the Bellman equation (37)-(39), define
\[ f(w) = \int_{l_{\gamma^\star}}^{w} v(x)\,dx, \quad w \in [l_{\gamma^\star}, u_{\gamma^\star}], \tag{55} \]
where v is given by (54). The following corollary provides the unique solution to the Bellman equation
(37)-(39).

Corollary 2. The function f is strictly decreasing, strictly convex, and twice continuously differentiable.
Moreover, $(l_{\gamma^\star}, u_{\gamma^\star}, \gamma^\star, f)$ solves the Bellman equation (37) subject to the boundary condition (38) and the
smooth pasting condition (39). It is also unique up to an additive constant.

6.4 Candidate Policy and Its Optimality for the Equivalent Workload Formulation

Our candidate policy is the barrier policy with the lower barrier at $l_{\gamma^\star}$, the upper barrier at $u_{\gamma^\star}$, and the
(effective) drift rate that is the minimizer of the left-hand side of Equation (37) at every $w \in [l_{\gamma^\star}, u_{\gamma^\star}]$, i.e.,
\[ \theta(w) = \frac{m' H^{-1} m}{2}\, v(w), \quad w \in [l_{\gamma^\star}, u_{\gamma^\star}], \tag{56} \]
where v is given by (54). The candidate policy is stationary and its (effective) drift rate is non-positive,
continuous, and strictly increasing in the workload. Under the candidate policy, the workload process W
evolves as a diffusion process with a strictly increasing state-dependent drift rate and reflecting barriers at
$l_{\gamma^\star}$ and $u_{\gamma^\star}$. Theorem 1 shows that the candidate policy is optimal for the EWF (29)-(31); see Appendix O
for its proof. Corollary 3 (in Appendix E) builds on this result to propose an optimal solution to the BCP
(23).

Theorem 1. The barrier policy with the lower barrier at $l_{\gamma^\star}$, the upper barrier at $u_{\gamma^\star}$, and the (effective)
drift rate control θ(·) given by (56) is optimal for the EWF (29)-(31) and it has a long-run average cost of
$\gamma^\star$.

7 Proposed Policy

This section proposes a dynamic control policy for the problem introduced in Section 3 by interpreting the
solution to the EWF (29)-(31) in the context of the original control problem. The proposed policy has three
components: pricing, outsourcing, and scheduling. To describe the policy, recall that we consider a sequence
of systems indexed by n, whose formal limit is the BCP (23). The original control problem introduced in
Section 3 is a specific element of this sequence of problems, which is determined by the particular choice
of n. Thus, in order to describe the proposed policy, first, we need to choose the system parameter n that
will be used to unscale the processes of interest. Given n, define $\mathcal{W} = \{\mathcal{W}(t) : t \ge 0\}$ as the unscaled,
or nominal, workload process associated with the control problem introduced in Section 3. To be specific,
$\mathcal{W}(t) = \sum_{k=1}^{K} m_k Q_k^n(t)$ for t ≥ 0.

Pricing Policy. Given the nominal workload process $\mathcal{W}$, we choose the price vector
\[ p(t) = \Lambda^{-1}(\lambda^\star) + \frac{\nabla \Lambda^{-1}(\lambda^\star)\, H^{-1} m}{2\sqrt{n}}\; v\Big( l_{\gamma^\star} \vee \Big( \frac{\mathcal{W}(t)}{\sqrt{n}} \wedge u_{\gamma^\star} \Big) \Big), \quad t \ge 0, \tag{57} \]
where v is given by (54) and $\nabla \Lambda^{-1}(\lambda^\star)$ denotes the Jacobian of $\Lambda^{-1}$ evaluated at $\lambda^\star$.

This pricing policy follows from Corollary 3 and Equation (24). The term $l_{\gamma^\star} \vee ((\mathcal{W}(t)/\sqrt{n}) \wedge u_{\gamma^\star})$, which
projects $\mathcal{W}(t)/\sqrt{n}$ onto $[l_{\gamma^\star}, u_{\gamma^\star}]$, enables us to prescribe a price for any value of the nominal workload process.
Note that although the workload process W in the equivalent workload formulation lives in $[l_{\gamma^\star}, u_{\gamma^\star}]$, the
scaled nominal workload process $\mathcal{W}(t)/\sqrt{n}$ in the original control problem may leave the interval $[l_{\gamma^\star}, u_{\gamma^\star}]$.
Therefore, we restrict it to $[l_{\gamma^\star}, u_{\gamma^\star}]$ by projecting it there if needed. Recall that the optimal effective drift
rate control θ is non-positive and strictly increasing in the workload. Thus, our proposed pricing policy
induces an effective demand rate $\sum_{k=1}^{K} \lambda_k(t) m_k$ that is less than or equal to $\sum_{k=1}^{K} \lambda_k^\star m_k = 1$ and is strictly
increasing in the workload.

Outsourcing Policy. Outsource orders of product $i^\star$ when the nominal workload $\mathcal{W}$ is less than or equal
to the outsourcing threshold $l^n = \sqrt{n}\, l_{\gamma^\star} < 0$.
The optimal solution to the EWF (29)-(31) imposes a lower barrier on W at $l_{\gamma^\star}$ using the effective
outsourcing process L. We interpret this in the context of the original control problem as stated above. The
intuition behind this policy is that when the backlog is so large that $\mathcal{W} < l^n$, it is worthwhile to outsource
orders as opposed to letting the backlog grow further. Furthermore, the system manager only outsources the
"cheapest" product, i.e., product $i^\star$; see Equation (28). For a discussion of the impact of the outsourcing
cost on the outsourcing policy, see Appendix N.

The approximating Brownian control problem sets all but one of the inventory levels to zero (whenever
the total workload is positive). As articulated in Harrison (1996), the zeros in the Brownian control problem
correspond to positive but small inventory levels, e.g., safety stocks; for further details, see Harrison (1996,
1998), Harrison and López (1999), Maglaras (2000), and Ata and Kumar (2005). Therefore, as is customary
in the heavy traffic literature, we put small safety stocks in various buffers. The safety stock for buffer k is
denoted by sk for k = 1, . . . , K; the values sk can be calibrated via simulation as done in Wein (1992a).

Scheduling Policy. Interpreting the solution to the equivalent workload formulation in the context of
the original control problem, we propose idling the server whenever the nominal workload $\mathcal{W}$ exceeds the
idling threshold $u^n = \sqrt{n}\, u_{\gamma^\star}$ and $Q_k^n \ge s_k$ for k = 1, . . . , K. Whenever the server is working, it first
prioritizes those products for which $Q_k^n < s_k$, i.e., the inventory is below the safety stock. Among such
products, the system manager prioritizes them in the descending order of $b_k/m_k$. Finally, if $Q_k^n \ge s_k$ for all
k = 1, . . . , K and $\mathcal{W} < u^n$, i.e., the server is working, then the server focuses solely on product $j^\star$, i.e., the
cheapest product to hold in inventory.
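
The outsourcing and scheduling rules above can be summarized in a short sketch. Function and argument names are hypothetical, indices are zero-based, and the thresholds `l_n` and `u_n` are taken as given inputs. The text does not say what happens when the nominal workload equals the idling threshold exactly; the sketch assumes the server keeps working in that case. The last helper computes the projected scaled workload that is passed to v in (57).

```python
import math


def nominal_workload(q, m):
    """Nominal workload: the sum over k of m[k] * q[k]."""
    return sum(mk * qk for mk, qk in zip(m, q))


def should_outsource(q, m, l_n):
    """Outsource (product i*) when the nominal workload is at or below l_n."""
    return nominal_workload(q, m) <= l_n


def scheduling_decision(q, s, m, b, u_n, j_star):
    """Return the index of the product to produce, or None to idle the server."""
    below_safety = [k for k in range(len(q)) if q[k] < s[k]]
    if below_safety:
        # Serve products under their safety stock, highest b[k] / m[k] first.
        return max(below_safety, key=lambda k: b[k] / m[k])
    if nominal_workload(q, m) > u_n:
        return None  # all safety stocks met and workload above the idling threshold
    return j_star  # otherwise build inventory of the cheapest product to hold


def projected_scaled_workload(q, m, n, l_gamma, u_gamma):
    """Project the scaled nominal workload onto [l_gamma, u_gamma], as in (57)."""
    return max(l_gamma, min(nominal_workload(q, m) / math.sqrt(n), u_gamma))


# Illustrative call; every number below is made up for the example.
m = [1 / 94.7, 1 / 47.4, 1 / 31.6]
b = [0.01, 0.01, 0.01]
s = [0, 1, 1]
decision = scheduling_decision([0, 0, 0], s, m, b, u_n=1.0, j_star=2)
```

In the illustrative call, products 2 and 3 (indices 1 and 2) are below their safety stocks, and the one with the larger ratio `b[k] / m[k]` is served first.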

8 Simulation Study

In this section, we consider an example that illustrates the effectiveness of our proposed policy. Following
Wein (1992a, Section 10), we consider a manufacturing system with K = 3 products. In our example, the
nominal instantaneous demand rate vector is $\lambda^\star = (30, 15, 10)$, which is one hundred times the demand rate
vector in Wein (1992a). Following Wein (1992a), we assume the system manager spends the same fraction
of time on all classes, i.e., $\lambda_1^\star/\mu_1 = \lambda_2^\star/\mu_2 = \lambda_3^\star/\mu_3$. To be more specific, similar to Wein (1992a), we assume
the production times are exponentially distributed and the production rates satisfy µ1 = 2µ2 = 3µ3. We set
µ1 = 94.7, µ2 = 47.4, and µ3 = 31.6, which result in an average server utilization of $\sum_{k=1}^{3} \lambda_k^\star/\mu_k = 0.95$;
see Appendix N for a further discussion of the connection of our example to that of Wein (1992a). If one
assumes each period in our model corresponds to one week, then the average production times are two to
five hours, and the daily demand rate is approximately eight orders per day.

We assume the production costs are δ1 = δ2 = δ3 = 1.⁴ Following Çelik and Maglaras (2008), we use a
multinomial logit demand model. We assume customers arrive according to a Poisson process with a total
arrival rate of 100 per time unit and choose to order product k with probability
\[ P(\text{Ordering Product } k) = \frac{\exp(a_k - d_k p_k(t))}{1 + \sum_{i=1}^{3} \exp(a_i - d_i p_i(t))}, \quad k = 1, 2, 3. \tag{58} \]

For simplicity, we assume d1 = d2 = d3 = 8.88 and a1 = 10.71, a2 = 10.01, and a3 = 9.61. These
yield the nominal instantaneous demand rate vector $\lambda^\star = (30, 15, 10)$ and the (nominal) static price vector
$p^\star = \Lambda^{-1}(\lambda^\star) = (1.25, 1.25, 1.25)$. In particular, the profit margin is 25% under the nominal instantaneous
demand rate; see Appendix N for historical profit margins in industry.
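
As a quick sanity check of these parameters, the sketch below evaluates the multinomial logit model (58) at the nominal prices and computes the resulting server utilization. The function names are hypothetical; the numbers are the ones stated in this section.

```python
import math


def mnl_demand_rates(prices, a, d, total_rate=100.0):
    """Demand rates implied by (58): total arrival rate times choice probabilities."""
    weights = [math.exp(ak - dk * pk) for ak, dk, pk in zip(a, d, prices)]
    denominator = 1.0 + sum(weights)  # the leading 1 is the no-purchase option
    return [total_rate * w / denominator for w in weights]


def server_utilization(demand_rates, production_rates):
    """Sum over products of demand rate divided by production rate."""
    return sum(lam / mu for lam, mu in zip(demand_rates, production_rates))


a = [10.71, 10.01, 9.61]
d = [8.88, 8.88, 8.88]
nominal_prices = [1.25, 1.25, 1.25]
production_rates = [94.7, 47.4, 31.6]

rates = mnl_demand_rates(nominal_prices, a, d)
utilization = server_utilization([30.0, 15.0, 10.0], production_rates)
```

With these inputs, `rates` comes out close to (30, 15, 10), up to the rounding of the reported parameters, and `utilization` is approximately 0.95.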

Following Wein (1992a), we assume the holding and backorder costs of all classes are the same, i.e.,
α1 = α2 = α3 and β1 = β2 = β3 . Moreover, the ratio of the backorder cost to the holding cost is
β1 /α1 = β2 /α2 = β3 /α3 = 2. To be specific, we set α1 = α2 = α3 = 0.005 and β1 = β2 = β3 = 0.01; see
Appendix N for a justification of these parameter choices. We conduct a sensitivity analysis on the state
costs in Table 3 (in Appendix F) and show that the proposed policy performs well in a wide range of state
costs.

Recall from Section 3 that νk > 0 denotes the outsourcing cost in excess of the production cost for class
k orders. Therefore, the total outsourcing cost for class k orders for k = 1, 2, 3 is νk + δk > δk = 1. In our
base case, we set ν1 = ν2 = ν3 = 0.33, corresponding to the total outsourcing cost of 1.33. In Table 5 (in
Appendix F), we conduct a sensitivity analysis on the total outsourcing cost and show that the proposed
policy performs well in a wide range of parameters.

In this example, we have $i^\star = j^\star = l^\star = 3$, i.e., the proposed policy only holds inventory of product 3,
only holds backorders of product 3, and only outsources product 3. We take the system parameter to be
n = 100; see Appendix N for a discussion of our choice of n. We determine the best safety stocks sk for
k = 1, 2, 3 for each set of primitives using a simulation-based search. For the base example, the safety stocks
are s1 = 0 and s2 = s3 = 1.

We compare our proposed policy with seven other policies; see Appendix F for a detailed description of
these policies. The first policy is a joint dynamic pricing, outsourcing, and scheduling policy that is computed
using the dynamic programming approach numerically; see Appendices G-H for its formulation and solution.
This is the (exact) optimal policy, which we can compute only for formulations with few products. For larger
problems, this approach suffers from the curse of dimensionality. We refer to this policy as the Markov
Decision Process (MDP) policy.

The second policy is the joint (optimal) dynamic outsourcing and scheduling policy that uses the best
static price vector. We refer to this policy as the MDP policy with static prices.

⁴ Since only the relative costs and revenues matter, we normalize the production costs to one.

The next three policies use scheduling and outsourcing policies similar to the proposed policy. They
are characterized by an outsourcing threshold l < 0, an idling threshold u > 0, and safety stocks sk for
k = 1, 2, 3. Policies three to five use different pricing policies. Their outsourcing and idling thresholds are
computed differently, as well. Next, we describe these differences.

The third policy uses the static price vector $p^\star = \Lambda^{-1}(\lambda^\star)$, i.e., the optimal solution to the static planning
problem (8). This corresponds to a zero drift rate in the EWF (29)-(31). We refer to the third policy as the
two-parameter static pricing policy. The outsourcing threshold, idling threshold, and safety stocks of this
policy are obtained using a simulation-based search.

The fourth policy uses the static price vector given by
\[ \Lambda^{-1}(\lambda^\star) + \frac{\nabla \Lambda^{-1}(\lambda^\star)}{\sqrt{n}}\, \frac{H^{-1} m}{m' H^{-1} m}\, \bar{\theta}, \]
where θ̄ ∈ R is the static drift rate. We refer to this policy as the three-parameter static pricing policy. The
outsourcing threshold, idling threshold, static drift rate, and safety stocks of this policy are obtained using
a simulation-based search.

The fifth policy uses N ∈ N price vectors. We refer to this policy as the N -price policy. We consider this
policy with N = 1, 2, 3. We obtain the outsourcing and idling thresholds of this policy by solving a modified
equivalent workload formulation; see Appendix F for details. We determine the best safety stocks for this
policy using a simulation-based search.

The sixth policy combines the pricing, outsourcing, and idleness policies of the one-price policy described
above with the scheduling policy, referred to as the Myopic(P) index scheduling policy, proposed by Perez
and Zipkin (1997) and further studied in Veatch and Wein (1996). We refer to this policy as the PZVW
policy with static prices. The seventh policy combines the pricing, outsourcing, and idleness policies of our
proposed policy with the Myopic(P) index scheduling policy. We refer to this policy as the PZVW policy
with dynamic prices.

We simulate each policy 100 times for $10^8$ time units starting from an initial inventory of zero. In order
to eliminate the transient effects, the initial 20% of each run is discarded. Table 2 (in Appendix F) reports
the long-run average cost of the MDP policy and the optimality gap of the other policies (with respect to
the MDP policy); see Appendix N for the details of the results. Table 2 shows that while the performance
gap between the MDP policy, which is optimal, and all other policies is significant, our proposed policy
outperforms the other policies by a significant margin. This is more pronounced at higher average server
utilizations. Interestingly, for N = 3, the N-price policy's performance is relatively close to that of the proposed
policy (a gap under 1.7% ± 0.8% for all server utilizations considered).

[Figure 1 (plots not reproduced). Two panels show the optimality gap, in percent, of the following policies: Proposed Policy, MDP Policy with Static Prices, Three-Parameter Static Pricing Policy, One-Price Policy, Two-Price Policy, Three-Price Policy, PZVW Policy with Static Prices, and PZVW Policy with Dynamic Prices. Panel (a): impact of the average server utilization (horizontal axis: utilization, from 0.90 to 0.99). Panel (b): impact of the total outsourcing cost (horizontal axis: total outsourcing cost, from 1.05 to 1.3).]

Figure 1: Impact of the average server utilization and the total outsourcing cost on the performance of
the various policies. The shaded areas depict the 95% confidence intervals. For visual clarity, we have not
included the two-parameter static pricing policy, which is substantially outperformed by all other policies.

In order to study the effect of server utilization, we vary the production rates with the same factor while
keeping the demand function fixed; see Figure 1a and Table 2 (in Appendix F) for the optimality gap of
the various policies and Figure 5a (in Appendix F) for the gap of the various policies from the proposed
policy. As we increase the server utilization, the optimality gap of our proposed policy (slightly) decreases,
whereas that of the static pricing policies increases substantially. This is because as the average server
utilization increases, so does the backlog. The proposed policy can adjust prices dynamically as needed in
order to counter this,5 whereas the static pricing policies cannot. Consequently, their performance degrades
significantly.

Next, we study the impact of the state costs on the performance of the various policies considered. First,
we keep the ratio of the backorder cost to the holding cost constant as we change them; see Table 3 (in
Appendix F). This is equivalent to changing the cost of capital. Then, we keep the holding cost constant
and change the backorder cost, which varies their ratio; see Table 4 (in Appendix F). We observe (in Tables
3-4) that the proposed policy outperforms the other policies (except for the MDP policy) in a wide range
of holding and backorder costs. Moreover, the optimality gap of all policies increases with the holding and
backorder costs.

Next, we study the impact of the total outsourcing cost (δk + νk for k = 1, 2, 3) on the performance of the
various policies considered; see Figure 1b and Table 5 (in Appendix F) for the optimality gap of the various
policies and Figure 5a (in Appendix F) for the gap of the various policies from the proposed policy. The
optimality gap of the proposed policy is (almost) insensitive to the outsourcing cost. The proposed policy
outperforms all static pricing policies for all outsourcing costs considered. However, as the outsourcing cost
decreases, the gap between the proposed policy and the static pricing policies decreases. This is because
when the outsourcing cost is small, the static pricing policies use outsourcing as a substitute for dynamic
pricing, which helps them decrease the (total) cost.

5 By increasing the prices dynamically as the backlog increases, the proposed policy can decrease the instantaneous
arrival rates, lowering the congestion as needed.

[Figure 2 plot. Panel (a): Lognormal production times. Panel (b): Gamma production times. In both panels, the horizontal axis is the Coefficient of Variation of Production Times and the vertical axis is the Gap from the Proposed Policy. Legend: MDP Policy, MDP Policy with Static Prices, Three-Parameter Static Pricing Policy, One-Price Policy, Two-Price Policy, Three-Price Policy, PZVW Policy with Static Prices, PZVW Policy with Dynamic Prices.]

Figure 2: Impact of the coefficient of variation of the production times on the performance of the various
policies. The shaded areas depict the 95% confidence intervals.

Next, we study the impact of the production time distribution on the performance of the various policies
considered. We consider lognormal and gamma distributions; see Appendix N for further details. We keep
the production rates constant and change the coefficient of variation of the production times (of all classes);
see Figure 2 and Table 7 (in Appendix F). Because the exponential production time distribution is a crucial
assumption for the MDP policy and the MDP policy with static prices, we cannot compute these policies
for non-exponential production time distributions. As such, we compute these policies as if the production
times were exponentially distributed (using the mean of the true distribution) and simulate these policies
under lognormal and gamma production time distributions. Then, we report their long-run average cost.
We observe (in Figure 2 and Table 7) that the proposed policy performs well in a wide range of coefficients
of variation under both lognormal and gamma distributions. The MDP policy outperforms our proposed
policy when the coefficient of variation is near one. This is expected since the MDP policy is designed for the
exponential production time distribution, whose coefficient of variation is one. However, as the coefficient
of variation of the production times deviates from one, the gap between the MDP policy and the proposed
policy decreases. In fact, when the coefficient of variation of the production times is zero, the MDP policy
and the proposed policy are statistically indistinguishable. Similarly, as the coefficient of variation of the
production times deviates from one, the gap between the proposed policy and the MDP policy with static
prices increases. The gap between the proposed policy and the two-parameter static pricing policy, the
three-parameter static pricing policy, the N-price policy, and the PZVW policy with static prices increases with
the coefficient of variation. This is because as the coefficient of variation of the production times increases,
so does the backlog. The proposed policy can adjust prices dynamically as needed (with no constraints on
its price adjustments) to counter this. However, the static pricing policies cannot adjust the prices, and the
N-price policy can only use one of the N prices. Consequently, their performance degrades compared to the
proposed policy.

Lastly, we study the impact of the system parameter n on the performance of the various policies
considered. We keep the parameters of the Brownian approximating control problem, i.e., µ, η, hk , bk , and
rk for k = 1, 2, 3, constant. Then, we scale the demand function, production rates, holding costs, backorder
costs, and outsourcing costs of the manufacturing system as in (11) and (15). Figure 6 and Table 6 (in
Appendix F) show that the proposed policy outperforms the other policies (except for the MDP policy) in
a wide range of system parameters. Moreover, as the system parameter increases, the optimality gap of the
proposed policy, the two-price policy, and the three-price policy decreases while the optimality gap of the
MDP policy with static prices increases. The performance of the PZVW policy with dynamic prices also
improves as n grows, once n is sufficiently large.

In Appendix L, we compare the pricing, outsourcing, and scheduling decisions of the proposed policy
with those of the MDP policy. We show that the decisions made under the proposed policy are close to the
decisions made under the MDP policy.

9 Concluding Remarks

Although this paper focuses on a make-to-stock manufacturing system with backlogged orders, its solution
technique can be easily adapted to the lost sales case; Appendix M sketches that analysis. As
a possible extension, one could add the capacity investment decision to the outsourcing, scheduling, and
pricing controls. Another interesting extension would be the addition of setup times and setup costs. One
could also consider the joint outsourcing, scheduling, and pricing of a hybrid multiclass make-to-stock and
make-to-order manufacturing system.

References
Abramowitz, M. and Stegun, I. A. (1965). Handbook of mathematical functions: with formulas, graphs, and
mathematical tables, volume 55. Courier Corporation.
Adusumilli, K. M. and Hasenbein, J. J. (2010). Dynamic admission and service rate control of a queue. Queueing
Systems, 66(2):131–154.
Afèche, P. (2013). Incentive-compatible revenue management in queueing systems: Optimal strategic delay. M&SOM,
15(3):423–443.
Afèche, P., Baron, O., and Kerner, Y. (2013). Pricing time-sensitive services based on realized performance. M&SOM,
15(3):492–506.

Arreola-Risa, A. and DeCroix, G. A. (1998). Make-to-order versus make-to-stock in a production–inventory system
with general production times. IIE Trans., 30(8):705–713.
Arslan, H., Ayhan, H., and Olsen, T. L. (2001). Analytic models for when and how to expedite in make-to-order
systems. IIE Trans., 33(11):1019–1029.
Ata, B. (2005). Dynamic power control in a wireless static channel subject to a quality-of-service constraint. Oper.
Res., 53(5):842–851.
Ata, B. (2006). Dynamic control of a multiclass queue with thin arrival streams. Oper. Res., 54(5):876–892.
Ata, B., Harrison, J. M., and Shepp, L. A. (2005). Drift rate control of a Brownian processing system. Ann. Appl.
Probab., 15(2):1145–1160.
Ata, B. and Kumar, S. (2005). Heavy traffic analysis of open processing networks with complete resource pooling:
Asymptotic optimality of discrete review policies. Ann. Appl. Probab., 15(1A):331–391.
Ata, B., Lee, D., and Sönmez, E. (2019). Dynamic volunteer staffing in multicrop gleaning operations. Oper. Res.,
67(2):295–314.
Ata, B. and Olsen, T. L. (2009). Near-optimal dynamic lead-time quotation and scheduling under convex-concave
customer delay costs. Oper. Res., 57(3):753–768.
Ata, B. and Olsen, T. L. (2013). Congestion-based leadtime quotation and pricing for revenue maximization with
heterogeneous customers. Queueing Systems, 73(1):35–78.
Ata, B. and Shneorson, S. (2006). Dynamic control of an M/M/1 service system with adjustable arrival and service
rates. Manage. Sci., 52(11):1778–1791.
Ata, B., Tongarlak, M., Lee, D., and Field, J. (2021a). A diffusion model of dynamic volunteer capacity management.
Working paper.
Ata, B., Tongarlak, M., Lee, D., and Field, J. (2021b). A dynamic model for managing volunteer engagement.
Working paper.
Ata, B. and Tongarlak, M. H. (2013). On scheduling a multiclass queue with abandonments under general delay
costs. Queueing Systems, 74(1):65–104.
Ata, B. and Zachariadis, K. E. (2007). Dynamic power control in a fading downlink channel subject to an energy
constraint. Queueing Systems, 55(1):41–69.
Ayhan, H. and Olsen, T. L. (2000). Scheduling of multi-class single-server queues under nontraditional performance
measures. Oper. Res., 48(3):482–489.
Barzilai, J. and Borwein, J. M. (1988). Two-point step size gradient methods. IMA J. Numer. Anal., 8(1):141–148.
Bell, S. L. and Williams, R. J. (2001). Dynamic scheduling of a system with two parallel servers in heavy traffic with
resource pooling: Asymptotic optimality of a threshold policy. Ann. Appl. Probab., 11(3):608–649.
Bertsekas, D. P. (2012). Dynamic programming and optimal control, volume 2. Athena Scientific, 4th edition.
Boyd, S. and Vandenberghe, L. (2004). Convex optimization. Cambridge University Press.
Bradley, J. R. (2004). A Brownian approximation of a production-inventory system with a manufacturer that
subcontracts. Oper. Res., 52(5):765–784.
Bradley, J. R. (2005). Optimal control of a dual service rate M/M/1 production-inventory model. Eur. J. Oper. Res.,
161(3):812–837.
Bradley, J. R. and Glynn, P. W. (2002). Managing capacity and inventory jointly in manufacturing systems. Manage.
Sci., 48(2):273–288.
Budhiraja, A., Ghosh, A. P., and Lee, C. (2011). Ergodic rate control problem for single class queueing networks.
SIAM J. Control Optim., 49(4):1570–1606.
Caldentey, R. and Wein, L. M. (2006). Revenue management of a make-to-stock queue. Oper. Res., 54(5):859–875.
Carr, S. and Duenyas, I. (2000). Optimal admission control and sequencing in a make-to-stock/make-to-order
production system. Oper. Res., 48(5):709–720.
Çelik, S. and Maglaras, C. (2008). Dynamic pricing and lead-time quotation for a multiclass make-to-order queue.
Manage. Sci., 54(6):1132–1146.
Chao, X. and Zhou, S. X. (2006). Joint inventory-and-pricing strategy for a stochastic continuous-review system. IIE
Trans., 38(5):401–408.
Chevalier, P. B. and Wein, L. M. (1993). Scheduling networks of queues: Heavy traffic analysis of a multistation
closed network. Oper. Res., 41(4):743–758.
Crabill, T. B. (1972). Optimal control of a service facility with variable exponential service times and constant arrival
rate. Manage. Sci., 18(9):560–566.
Crabill, T. B. (1974). Optimal control of a maintenance system with variable service rates. Oper. Res., 22(4):736–745.
Dai, J. and Tezcan, T. (2011). State space collapse in many-server diffusion limits of parallel server systems. Math.
Oper. Res., 36(2):271–320.
Duenyas, I. and Van Oyen, M. P. (1995). Stochastic scheduling of parallel queues with set-up costs. Queueing
Systems, 19(4):421–444.
Duenyas, I. and Van Oyen, M. P. (1996). Heuristic scheduling of parallel heterogeneous queues with set-ups. Manage.
Sci., 42(6):814–829.
Elmaghraby, W. and Keskinocak, P. (2003). Dynamic pricing in the presence of inventory considerations: Research
overview, current practices, and future directions. Manage. Sci., 49(10):1287–1309.
Ethier, S. N. and Kurtz, T. G. (2009). Markov processes: Characterization and convergence. John Wiley & Sons.
Gallego, G. and Topaloglu, H. (2019). Revenue management and pricing analytics, volume 209. Springer.
Gayon, J.-P. and Dallery, Y. (2007). Dynamic vs static pricing in a make-to-stock queue with partially controlled
production. OR Spectrum, 29(2):193–205.
George, J. M. and Harrison, J. M. (2001). Dynamic control of a queue with adjustable service rate. Oper. Res.,
49(5):720–731.
Ghosh, A. P. and Weerasinghe, A. P. (2007). Optimal buffer size for a stochastic processing network in heavy traffic.
Queueing Systems, 55(3):147–159.
Ghosh, A. P. and Weerasinghe, A. P. (2010). Optimal buffer size and dynamic rate control for a queueing system
with impatient customers in heavy traffic. Stoch. Process. Their Appl., 120(11):2103–2141.
Ha, A. Y. (1997). Optimal dynamic scheduling policy for a make-to-stock production system. Oper. Res., 45(1):42–53.
Harrison, J. M. (1988). Brownian models of queueing networks with heterogeneous customer populations. In Stochastic
differential systems, stochastic control theory and applications, pages 147–186. Springer.
Harrison, J. M. (1996). The BIGSTEP approach to flow management in stochastic processing networks. Stochastic
Networks: Theory and Applications, 4:147–186.
Harrison, J. M. (1998). Heavy traffic analysis of a system with parallel servers: Asymptotic optimality of discrete-
review policies. Ann. Appl. Probab., pages 822–848.
Harrison, J. M. (2000). Brownian models of open processing networks: Canonical representation of workload. Ann.
Appl. Probab., 10(1):75–103.
Harrison, J. M. (2003). A broader view of Brownian networks. Ann. Appl. Probab., 13(3):1119–1150.
Harrison, J. M. (2013). Brownian models of performance and control. Cambridge University Press.
Harrison, J. M. and López, M. J. (1999). Heavy traffic resource pooling in parallel-server systems. Queueing Systems,
33(4):339–368.
Harrison, J. M. and Van Mieghem, J. A. (1997). Dynamic control of Brownian networks: State space collapse and
equivalent workload formulations. Ann. Appl. Probab., 7(3):747–771.
Harrison, J. M. and Wein, L. M. (1989). Scheduling networks of queues: Heavy traffic analysis of a simple open
network. Queueing Systems, 5(4):265–279.
Harrison, J. M. and Wein, L. M. (1990). Scheduling networks of queues: Heavy traffic analysis of a two-station closed
network. Oper. Res., 38(6):1052–1064.
Huggins, E. L. and Olsen, T. L. (2010). Inventory control with generalized expediting. Oper. Res., 58(5):1414–1426.
Iravani, S. M., Liu, T., and Simchi-Levi, D. (2012). Optimal production and admission policies in
make-to-stock/make-to-order manufacturing systems. Prod. Oper. Manag., 21(2):224–235.
Karaesmen, F., Buzacott, J. A., and Dallery, Y. (2002). Integrating advance order information in make-to-stock
production systems. IIE Trans., 34(8):649–662.
Kim, E. and Van Oyen, M. P. (1998). Dynamic scheduling to minimize lost sales subject to set-up costs. Queueing
Systems, 29(2-4):193–229.
Kim, J. and Randhawa, R. S. (2017). The value of dynamic pricing in large queueing systems. Oper. Res.,
66(2):409–425.
Krichagina, E. V., Lou, S. X., Sethi, S. P., and Taksar, M. I. (1993). Production control in a failure-prone
manufacturing system: Diffusion approximation and asymptotic optimality. Ann. Appl. Probab., 3(2):421–453.
Krichagina, E. V. and Taksar, M. I. (1992). Diffusion approximation for GI/G/1 controlled queues. Queueing Systems,
12(3-4):333–367.
Krichagina, E. V. and Wein, L. M. (1994). Heavy traffic analysis of a production-distribution system. Working paper.
Kumar, R., Lewis, M. E., and Topaloglu, H. (2013). Dynamic service rate control for a single-server queue with
Markov-modulated arrivals. Nav. Res. Log., 60(8):661–677.
Kumar, S. (2000). Two-server closed networks in heavy traffic: Diffusion limits and asymptotic optimality. Ann.
Appl. Probab., 10(3):930–961.
Lan, W.-M. and Olsen, T. L. (2006). Multiproduct systems with both setup times and costs: Fluid bounds and
schedules. Oper. Res., 54(3):505–522.
Maglaras, C. (2000). Discrete-review policies for scheduling stochastic networks: Trajectory tracking and fluid-scale
asymptotic optimality. Ann. Appl. Probab., 10(3):897–929.
Markowitz, D. M. and Wein, L. M. (2001). Heavy traffic analysis of dynamic cyclic policies: A unified treatment of
the single machine scheduling problem. Oper. Res., 49(2):246–270.
Mendelson, H. (1985). Pricing computer services: Queueing effects. Commun. ACM, 28(3):312–321.
Mendelson, H. and Whang, S. (1990). Optimal incentive-compatible priority pricing for the M/M/1 queue. Oper.
Res., 38(5):870–883.

Nahmias, S. and Olsen, T. L. (2015). Production and operations analysis. Waveland Press.
Olsen, T. L. (1999). A practical scheduling method for multiclass production systems with setups. Manage. Sci.,
45(1):116–130.
Peeters, K. and van Ooijen, H. (2020). Hybrid make-to-stock and make-to-order systems: A taxonomic review. Int.
J. Prod. Res., 58(15):4659–4688.
Perez, A. P. and Zipkin, P. (1997). Dynamic scheduling rules for a multiproduct make-to-stock queue. Oper. Res.,
45(6):919–930.
Plambeck, E., Kumar, S., and Harrison, J. M. (2001). A multiclass queue in heavy traffic with throughput time
constraints: Asymptotically optimal dynamic controls. Queueing Systems, 39(1):23–54.
Plambeck, E. L. (2004). Optimal leadtime differentiation via diffusion approximations. Oper. Res., 52(2):213–228.
Reiman, M. I. and Wein, L. M. (1998). Dynamic scheduling of a two-class queue with setups. Oper. Res.,
46(4):532–547.
Royden, H. L. and Fitzpatrick, P. (1968). Real analysis. Macmillan New York, 4th edition.
Rubino, M. and Ata, B. (2009). Dynamic control of a make-to-order, parallel-server system with cancellations. Oper.
Res., 57(1):94–108.
Rubio, R. and Wein, L. M. (1996). Setting base stock levels using product-form queueing networks. Manage. Sci.,
42(2):259–268.
Sanajian, N., Abouee-Mehrizi, H., and Balcıoglu, B. (2010). Scheduling policies in the M/G/1 make-to-stock queue.
J. Oper. Res. Soc., 61(1):115–123.
Stidham, S. (1988). Scheduling, routing, and flow control in stochastic networks. In Stochastic Differential Systems,
Stochastic Control Theory and Applications, pages 529–561. Springer.
Stidham, S. (1992). Pricing and capacity decisions for a service facility: Stability and multiple local optima. Manage.
Sci., 38(8):1121–1139.
Stidham, S. (2002). Analysis, design, and control of queueing systems. Oper. Res., 50(1):197–216.
Stidham, S. and Weber, R. R. (1989). Monotonic and insensitive optimal policies for control of queues with
undiscounted costs. Oper. Res., 37(4):611–625.
Tezcan, T. and Dai, J. (2010). Dynamic control of N-systems with many servers: Asymptotic optimality of a static
priority policy in heavy traffic. Oper. Res., 58(1):94–110.
Veatch, M. H. and Véricourt, F. D. (2003). Zero-inventory conditions for a two-part-type make-to-stock production
system. Queueing Systems, 43(3):251–266.
Veatch, M. H. and Wein, L. M. (1996). Scheduling a make-to-stock queue: Index policies and hedging points. Oper.
Res., 44(4):634–647.
Véricourt, F. D., Karaesmen, F., and Dallery, Y. (2000). Dynamic scheduling in a make-to-stock system: A partial
characterization of optimal policies. Oper. Res., 48(5):811–819.
Vulcano, G. (2008). Dynamic pricing for an M/M/1 make-to-stock system with controllable backlog. Working paper.
Weerasinghe, A. (2015). Optimal service rate perturbations of many server queues in heavy traffic. Queueing Systems,
79(3-4):321–363.
Wein, L. M. (1990). Scheduling networks of queues: Heavy traffic analysis of a two-station network with controllable
inputs. Oper. Res., 38(6):1065–1078.
Wein, L. M. (1991). Due-date setting and priority sequencing in a multiclass M/G/1 queue. Manage. Sci.,
37(7):834–850.
Wein, L. M. (1992a). Dynamic scheduling of a multiclass make-to-stock queue. Oper. Res., 40(4):724–735.
Wein, L. M. (1992b). Scheduling networks of queues: Heavy traffic analysis of a multistation network with controllable
inputs. Oper. Res., 40(3-supplement-2):S312–S334.
Xu, Y. and Chao, X. (2009). Dynamic pricing and inventory control for a production system with average profit
criterion. Probab. Eng. Inf. Sci., 23(3):489–513.
Yoon, S. and Lewis, M. E. (2004). Optimal pricing and admission control in a queueing system with periodically
varying parameters. Queueing Systems, 47(3):177–199.
Zaitsev, V. F. and Polyanin, A. D. (2002). Handbook of exact solutions for ordinary differential equations. CRC
Press.
Zheng, Y.-S. and Zipkin, P. (1990). A queueing model to analyze the value of centralized inventory information.
Oper. Res., 38(2):296–307.
Zipkin, P. H. (1995). Performance analysis of a multi-item production-inventory system under alternative policies.
Manage. Sci., 41(4):690–703.
Ziya, S., Ayhan, H., and Foley, R. D. (2006). Optimal prices for finite capacity queueing systems. Oper. Res. Lett.,
34(2):214–218.
Ziya, S., Ayhan, H., and Foley, R. D. (2008). A note on optimal pricing for finite capacity queueing systems with
multiple customer classes. Nav. Res. Log., 55(5):412–418.

A Derivations

A.1 Formal Derivation of the Approximating Brownian System

We start by writing the scaled inventory and the cumulative cost processes in terms of other scaled processes.
Then, we formally take the limit to obtain the approximating Brownian system.

Scaled Inventory Process. By Equation (1), for k = 1, . . . , K and t ≥ 0, we have


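\begin{align}
Q_k^n(t) &= S_k^n\big(T_k^n(t)\big) - N_k^n\Big(\int_0^t \lambda_k^n(s)\,ds\Big) + O_k^n(t) \notag \\
&= S_k^n\big(T_k^n(t)\big) + \big[\mu_k^n T_k^n(t) - \mu_k^n T_k^n(t)\big] + \big[\mu_k^n \rho_k t - \mu_k^n \rho_k t\big] \notag \\
&\quad + \Big[\int_0^t \lambda_k^n(s)\,ds - \int_0^t \lambda_k^n(s)\,ds\Big] - N_k^n\Big(\int_0^t \lambda_k^n(s)\,ds\Big) + O_k^n(t) \tag{59} \\
&= \big[S_k^n\big(T_k^n(t)\big) - \mu_k^n T_k^n(t)\big] - \mu_k^n \big[\rho_k t - T_k^n(t)\big] + \mu_k^n \rho_k t - \int_0^t \lambda_k^n(s)\,ds \notag \\
&\quad - \Big[N_k^n\Big(\int_0^t \lambda_k^n(s)\,ds\Big) - \int_0^t \lambda_k^n(s)\,ds\Big] + O_k^n(t), \tag{60}
\end{align}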

where (59) is obtained by adding and subtracting the terms inside the brackets and (60) is obtained by
rearranging the terms. It follows from the functional strong approximations (Ethier and Kurtz (2009, Section
7.5)) that for t ≥ 0,
\begin{align}
S_k^n(t) &= \mu_k^n t + \sqrt{n \mu_k}\, \nu_{sk}\, \hat{\chi}_k^n(t) + o(\sqrt{n}), \tag{61} \\
N_k^n(nt) &= nt + \sqrt{n}\, \tilde{\chi}_k^n(t) + o(\sqrt{n}), \tag{62}
\end{align}
almost surely, where $\hat{\chi}_k^n$ and $\tilde{\chi}_k^n$ are independent standard Brownian motions and the notation $f(n) = o(g(n))$
denotes that $f(n)/g(n) \to 0$ as $n \to \infty$. Plugging (61)-(62) into (60) gives
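\begin{align}
Q_k^n(t) &= \sqrt{n \mu_k}\, \nu_{sk}\, \hat{\chi}_k^n\big(T_k^n(t)\big) - \mu_k^n \big[\rho_k t - T_k^n(t)\big] + \mu_k^n \rho_k t - \int_0^t \lambda_k^n(s)\,ds \notag \\
&\quad - \sqrt{n}\, \tilde{\chi}_k^n\Big(\frac{1}{n} \int_0^t \lambda_k^n(s)\,ds\Big) + O_k^n(t) + o(\sqrt{n}), \qquad t \ge 0. \tag{63}
\end{align}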
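As an informal sanity check on the scaling in (62), the sketch below simulates a unit-rate Poisson process and looks at the centered and scaled quantity (N(nt) − nt)/√n, whose mean and variance should be close to 0 and t, respectively, if it behaves like a standard Brownian motion at time t. The parameter values and function names are arbitrary choices made for this illustration; the check concerns only the first two moments, not the almost-sure approximation itself.

```python
import random
import statistics

def poisson_count(horizon, rng):
    """Number of arrivals of a unit-rate Poisson process in [0, horizon]."""
    count = 0
    clock = rng.expovariate(1.0)  # time of the first arrival
    while clock <= horizon:
        count += 1
        clock += rng.expovariate(1.0)  # add the next interarrival time
    return count

def centered_scaled_samples(n, t, replications, seed=0):
    """Independent samples of (N(nt) - nt) / sqrt(n)."""
    rng = random.Random(seed)
    return [(poisson_count(n * t, rng) - n * t) / n ** 0.5
            for _ in range(replications)]

# Illustrative (hypothetical) values: n = 400, t = 1, 2000 replications.
samples = centered_scaled_samples(n=400, t=1.0, replications=2000)
sample_mean = statistics.mean(samples)
sample_variance = statistics.variance(samples)
```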

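Dividing all terms in (63) by $\sqrt{n}$ and substituting in (10)-(14) gives
\begin{align}
\frac{Q_k^n(t)}{\sqrt{n}} &= \sqrt{\mu_k}\, \nu_{sk}\, \hat{\chi}_k^n\big(T_k^n(t)\big) - \mu_k Y_k^n(t) + \sqrt{n} \rho_k \mu_k t + \rho_k \eta_k t - \sqrt{n} \lambda_k^{*} t - \int_0^t \zeta_k(s)\,ds \notag \\
&\quad - \tilde{\chi}_k^n\Big(\lambda_k^{*} t + \frac{1}{\sqrt{n}} \int_0^t \zeta_k(s)\,ds\Big) + R_k^n(t) + o(1), \qquad t \ge 0. \tag{64}
\end{align}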
n 0
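Then, substituting $\rho_k \mu_k = \lambda_k^{*}$ into (64) gives
\begin{align*}
\frac{Q_k^n(t)}{\sqrt{n}} &= \sqrt{\mu_k}\, \nu_{sk}\, \hat{\chi}_k^n\big(T_k^n(t)\big) - \mu_k Y_k^n(t) + \rho_k \eta_k t - \int_0^t \zeta_k(s)\,ds \\
&\quad - \tilde{\chi}_k^n\Big(\lambda_k^{*} t + \frac{1}{\sqrt{n}} \int_0^t \zeta_k(s)\,ds\Big) + R_k^n(t) + o(1), \qquad t \ge 0.
\end{align*}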

Next, using a straightforward application of the random time change theorem (on the fifth term), we obtain
\[
\frac{Q_k^n(t)}{\sqrt{n}} = \sqrt{\mu_k}\, \nu_{sk}\, \hat{\chi}_k^n\big(T_k^n(t)\big) - \mu_k Y_k^n(t) + \rho_k \eta_k t - \int_0^t \zeta_k(s)\,ds - \tilde{\chi}_k^n(\lambda_k^{*} t) + R_k^n(t) + o(1), \qquad t \ge 0.
\]

Finally, letting

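\[
\chi_k^n(t) = \sqrt{\mu_k}\, \nu_{sk}\, \hat{\chi}_k^n\big(T_k^n(t)\big) + \rho_k \eta_k t - \tilde{\chi}_k^n(\lambda_k^{*} t),
\]

for k = 1, . . . , K and t ≥ 0, we arrive at

\[
Z_k^n(t) = \chi_k^n(t) - \mu_k Y_k^n(t) - \int_0^t \zeta_k(s)\,ds + R_k^n(t) + o(1)
\]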

for k = 1, . . . , K and t ≥ 0.

Cost Process. By Equation (6), we have


\[
V^n(t) = \int_0^t \Pi^n\big(\lambda^n(s)\big)\,ds - \sum_{k=1}^{K} \int_0^t \upsilon_k^n\big(Q_k^n(s)\big)\,ds - \sum_{k=1}^{K} \nu_k^n O_k^n(t), \qquad t \ge 0. \tag{65}
\]

Let us start by focusing on the first term:


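\[
\Pi^n\big(\lambda^n(s)\big) = \Pi^n\big(n \lambda^{*} + \sqrt{n}\, \zeta(s)\big) = n\, \Pi\Big(\lambda^{*} + \frac{\zeta(s)}{\sqrt{n}}\Big), \tag{66}
\]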
where (66) follows from (11). By Taylor’s expansion, we have
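\begin{align}
\Pi\Big(\lambda^{*} + \frac{\zeta(s)}{\sqrt{n}}\Big) &= \Pi(\lambda^{*}) + \frac{1}{\sqrt{n}} \nabla \Pi(\lambda^{*}) \zeta(s) + \frac{1}{2n} \zeta(s)' \nabla^2 \Pi(\lambda^{*}) \zeta(s) + o\Big(\frac{1}{n}\Big) \notag \\
&= \Pi(\lambda^{*}) + \frac{1}{2n} \zeta(s)' \nabla^2 \Pi(\lambda^{*}) \zeta(s) + o\Big(\frac{1}{n}\Big), \tag{67}
\end{align}
where (67) follows from the fact that $\nabla \Pi(\lambda^{*}) = 0$ by Assumption 3. Therefore,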
\[
\int_0^t \Pi^n\big(\lambda^n(s)\big)\,ds = \Pi(\lambda^{*})\, nt + \frac{1}{2} \int_0^t \zeta(s)' \nabla^2 \Pi(\lambda^{*}) \zeta(s)\,ds + o(1), \qquad t \ge 0. \tag{68}
\]
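The second-order expansion behind (67)-(68) can be checked numerically on a toy example. The sketch below assumes a single product with a hypothetical profit rate Π(λ) = λ log(A/λ), which is maximized at λ* = A/e and has Π''(λ) = −1/λ; neither this functional form nor the constant A comes from the paper. It compares n[Π(λ* + ζ/√n) − Π(λ*)] with the quadratic term (1/2) ζ Π''(λ*) ζ.

```python
import math

A = 10.0                  # hypothetical market-size parameter
LAMBDA_STAR = A / math.e  # maximizer of the hypothetical profit rate below

def profit_rate(lam):
    """Hypothetical profit rate: demand rate lam sold at price log(A / lam)."""
    return lam * math.log(A / lam)

def scaled_profit_change(zeta, n):
    """n * [Pi(lambda* + zeta / sqrt(n)) - Pi(lambda*)]."""
    perturbed = profit_rate(LAMBDA_STAR + zeta / math.sqrt(n))
    return n * (perturbed - profit_rate(LAMBDA_STAR))

def quadratic_term(zeta):
    """(1/2) * zeta * Pi''(lambda*) * zeta, using Pi''(lam) = -1 / lam."""
    return -0.5 * zeta ** 2 / LAMBDA_STAR

def expansion_error(zeta, n):
    """Absolute difference between the scaled change and its quadratic limit."""
    return abs(scaled_profit_change(zeta, n) - quadratic_term(zeta))
```

Because the first-order term vanishes at λ*, the difference returned by `expansion_error` shrinks as n grows, which is the o(1) remainder in (68) for this example.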
Then, by (17), (65), and (68), we have for t ≥ 0,
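\begin{align}
\xi^n(t) &= -\frac{1}{2} \int_0^t \zeta(s)' \nabla^2 \Pi(\lambda^{*}) \zeta(s)\,ds + \sum_{k=1}^{K} \int_0^t \upsilon_k^n\big(Q_k^n(s)\big)\,ds + \sum_{k=1}^{K} \nu_k^n O_k^n(t) + o(1) \notag \\
&= -\frac{1}{2} \int_0^t \zeta(s)' \nabla^2 \Pi(\lambda^{*}) \zeta(s)\,ds + \sum_{k=1}^{K} \int_0^t g_k\big(Z_k^n(s)\big)\,ds + \sum_{k=1}^{K} r_k R_k^n(t) + o(1), \tag{69}
\end{align}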

where (69) follows from (13) and (15)-(16).

Approximating Brownian System. So far, we have established that for k = 1, . . . , K and t ≥ 0,


\[
Z_k^n(t) = \chi_k^n(t) - \mu_k Y_k^n(t) - \int_0^t \zeta_k(s)\,ds + R_k^n(t) + o(1), \tag{70}
\]

where

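\[
\chi_k^n(t) = \sqrt{\mu_k}\, \nu_{sk}\, \hat{\chi}_k^n\big(T_k^n(t)\big) + \rho_k \eta_k t - \tilde{\chi}_k^n(\lambda_k^{*} t), \qquad t \ge 0.
\]

Moreover, for $t \ge 0$,
\[
\xi^n(t) = -\frac{1}{2} \int_0^t \zeta(s)' \nabla^2 \Pi(\lambda^{*}) \zeta(s)\,ds + \sum_{k=1}^{K} \int_0^t g_k\big(Z_k^n(s)\big)\,ds + \sum_{k=1}^{K} r_k R_k^n(t) + o(1). \tag{71}
\]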

The key step in the development of our Brownian approximation is the following claim: If n is large, the
only interesting allocation policies are those for which

Tkn (t) ≈ ρk t (72)

for k = 1, . . . , K and t ≥ 0. For a defense of this claim, let us first restrict our attention to policies that
do not idle the manufacturing system as long as there are backordered requests. Because we focus on a
large system with balanced loading, i.e., n is large and ρ is near one, we expect the idleness to be small,
i.e., asymptotically negligible. That is, $I^n(t) \approx 0$ for large n. Moreover, the relative amount of time the
manufacturing system spends on product k over the interval [0, t] must be approximately equal
to $\rho_k$. Otherwise, large inventories or backlogs of class k, i.e., of order $O(n)$, would build. Thus, we conclude that the
difference between $T_k^n(t)$ and $\rho_k t$ must be small. Namely, Equation (72) should hold. The same conclusion
holds for policies that allow idleness while there are backordered requests provided $I^n(t) \approx 0$. Any idleness
greater than this is undesirable. The heart of the matter is that backorders of order n result inevitably
if $I^n(t) \not\approx 0$ but can be otherwise avoided; see Harrison (1988, Sections 5 and 11) for a similar informal
argument. Next, using a straightforward application of the random time change theorem and Equation (72),
we obtain

χnk ⇒ Xk (73)

for k = 1, . . . , K, where $X_1, \ldots, X_K$ are independent Brownian motions with infinitesimal drift $\rho_k \eta_k$,
infinitesimal variance $\sigma_k^2 = \lambda_k^{*} (1 + \nu_{sk}^2)$, and initial value zero.
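To see where this drift and variance come from, the following informal calculation evaluates the first two moments of $\chi_k^n(t)$. It is only a sketch: it treats (72) as an equality and relies on the independence of the standard Brownian motions $\hat{\chi}_k^n$ and $\tilde{\chi}_k^n$.

```latex
\begin{align*}
\mathbb{E}\big[\chi_k^n(t)\big] &\approx \rho_k \eta_k t, \\
\operatorname{Var}\big[\chi_k^n(t)\big]
  &\approx \mu_k \nu_{sk}^2 \, \rho_k t + \lambda_k^{*} t
   = \lambda_k^{*} \nu_{sk}^2 \, t + \lambda_k^{*} t
   = \lambda_k^{*} \big(1 + \nu_{sk}^2\big) t,
\end{align*}
```

where the first variance term comes from $\sqrt{\mu_k}\, \nu_{sk}\, \hat{\chi}_k^n(\rho_k t)$, the second from $\tilde{\chi}_k^n(\lambda_k^{*} t)$, and the simplification uses $\rho_k \mu_k = \lambda_k^{*}$.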

Motivated by (70), (71), and (73), we argue similar to Harrison (1988) that as the system gets large, the
scaled processes defined above converge weakly to ξ, Z, Y , U , and R that satisfy the following for t ≥ 0,
\begin{align*}
\xi(t) &= \int_0^t \zeta(s)' H \zeta(s)\,ds + \sum_{k=1}^{K} \int_0^t g_k\big(Z_k(s)\big)\,ds + \sum_{k=1}^{K} r_k R_k(t), \\
Z_k(t) &= X_k(t) - \mu_k Y_k(t) - \int_0^t \zeta_k(s)\,ds + R_k(t), \qquad k = 1, \ldots, K, \\
U(t) &= \sum_{k=1}^{K} Y_k(t), \\
&U, R \text{ are nondecreasing with } U(0) = R(0) = 0,
\end{align*}

where $H = -\nabla^2 \Pi(\lambda^{*})/2$ and $X_k = \{X_k(t), t \ge 0\}$ are independent $(\eta_k \rho_k, \sigma_k^2)$ Brownian motions with initial
value zero.

A.2 Derivation of Equation (24)

Recall that the instantaneous demand rate in the n-th system is of the form

\[
\lambda^n(t) = n \lambda^{*} + \sqrt{n}\, \zeta(t), \qquad t \ge 0.
\]

It follows from the definition of the inverse demand function that for t ≥ 0,
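\[
p^n(t) = (\Lambda^n)^{-1}\big(\lambda^n(t)\big) = (\Lambda^n)^{-1}\big(n \lambda^{*} + \sqrt{n}\, \zeta(t)\big) = \Lambda^{-1}\Big(\lambda^{*} + \frac{\zeta(t)}{\sqrt{n}}\Big), \tag{74}
\]
where the last equality in (74) follows from (11). Then, by Taylor’s expansion, we have
\[
p^n(t) = \Lambda^{-1}(\lambda^{*}) + \frac{\nabla \Lambda^{-1}(\lambda^{*})\, \zeta(t)}{\sqrt{n}} + o\Big(\frac{1}{\sqrt{n}}\Big),
\]
for $t \ge 0$, where $\nabla \Lambda^{-1}(\lambda^{*})$ denotes the Jacobian of $\Lambda^{-1}$ evaluated at $\lambda^{*}$.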
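The first-order price expansion above can be illustrated with a single-product toy model. The sketch below assumes a hypothetical demand function Λ(p) = A exp(−p), so that Λ^{-1}(λ) = log(A/λ) with derivative −1/λ; this form, the constant A, and the nominal rate are illustrative choices and are not taken from the paper. The error of the linearized price, multiplied by √n, should vanish as n grows.

```python
import math

A = 10.0           # hypothetical market-size parameter
LAMBDA_STAR = 4.0  # hypothetical nominal demand rate

def inverse_demand(lam):
    """Hypothetical inverse demand: the price inducing demand rate lam."""
    return math.log(A / lam)

def exact_price(zeta, n):
    """Price that induces the perturbed rate lambda* + zeta / sqrt(n)."""
    return inverse_demand(LAMBDA_STAR + zeta / math.sqrt(n))

def first_order_price(zeta, n):
    """Nominal price plus the linear correction; the derivative is -1 / lambda*."""
    return inverse_demand(LAMBDA_STAR) - zeta / (LAMBDA_STAR * math.sqrt(n))

def scaled_error(zeta, n):
    """sqrt(n) * |exact - first order|, the scaled o(1 / sqrt(n)) remainder."""
    return math.sqrt(n) * abs(exact_price(zeta, n) - first_order_price(zeta, n))
```

In this example a positive ζ (more demand than nominal) corresponds to a price below the nominal price, and a negative ζ to a price above it.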

B Equivalence of the BCP and the EWF

Proposition 4 establishes an equivalence between the BCP (23) and the EWF (29)-(31). In particular, it
shows that given the state $Z(t)$ in the BCP, one sets $W(t) = \sum_{k=1}^{K} m_k Z_k(t)$ to arrive at the equivalent state
process in the EWF. To go in the other direction, given the state W (t) in the EWF, one sets Z(t) = ∆(W (t))
to arrive at the equivalent state process in the BCP.

Proposition 4. The EWF (29)-(31) is equivalent to the BCP (23) in the following sense:

(i) Suppose that (R, Y, ζ) is an admissible control for the BCP (23) with Brownian motion X and state
process Z. Then (L, U, θ) defined by (20) and (26) is an admissible control for the EWF (29)-(31) with
Brownian motion B given by (26) and state process W given by (25). Moreover, the cost of control
(L, U, θ) for the EWF (29)-(31) is less than or equal to the cost of control (R, Y, ζ) for the BCP (23).

(ii) Suppose that (L, U, θ) is an admissible control for the EWF (29)-(31) with Brownian motion B and
state process W . Then, there exists a Brownian motion X that satisfies (26). Furthermore, there
exists (R, Y, ζ) that is an admissible control for the BCP (23) with Brownian motion X and state
process {∆(W (t)) : t ≥ 0}, where ∆ is given by (33). These two policies have the same cost.

C Riccati Equation

This section discusses a special case of the Riccati equation and its solution. First, we introduce the Riccati
equation. Then, we provide its general solution in terms of the Bessel functions. Finally, we use this
characterization and the relationship between the Bessel functions and the Airy functions to write the
solution in terms of the Airy functions. Fix $c_0, c_1, c_2, \hat{c}, y_0 \in \mathbb{R}$ such that $c_2 \neq 0$ and consider the differential
equation

\[
y'(x) = c_2 y^2(x) + c_1 y(x) + \hat{c} x + c_0 \tag{75}
\]

for x ∈ (0, ∞) subject to the boundary condition y(0) = y0 . This equation falls in the class of Riccati
equations; see e.g., Zaitsev and Polyanin (2002, Section 1.2.2.24). The next lemma provides a closed-form
expression for the general solution to Equation (75) in terms of the Bessel functions of order 1/3, i.e., J1/3
and Y1/3 . As a preliminary to this lemma, let C1 and C2 be fixed constants and define
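\begin{align*}
z(x) &= x + \frac{4 c_0 c_2 - c_1^2}{4 \hat{c} c_2}, \qquad x \in [0, \infty), \\
u(x) &= C_1 \exp\Big(\frac{c_1 x}{2}\Big) \sqrt{z(x)}\, J_{1/3}\Big(\tfrac{2}{3} \sqrt{\hat{c} c_2}\, z^{3/2}(x)\Big) + C_2 \exp\Big(\frac{c_1 x}{2}\Big) \sqrt{z(x)}\, Y_{1/3}\Big(\tfrac{2}{3} \sqrt{\hat{c} c_2}\, z^{3/2}(x)\Big), \qquad x \in [0, \infty),
\end{align*}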

where J1/3 and Y1/3 are the order 1/3 Bessel functions of the first and second kind.

Lemma 5. Zaitsev and Polyanin (2002, Sections 1.2.1.2 and 2.1.2.12). The (general) solution to the
differential equation (75) is
\[
y(x) = -\frac{1}{c_2} \frac{u'(x)}{u(x)}, \qquad x \in [0, \infty).
\]

Proof. First, we transform the Riccati equation into a second-order linear differential equation. Then, we
write the solution to the second-order equation in terms of the Bessel functions of order 1/3. Following
Zaitsev and Polyanin (2002, Section 1.2.1.2), we substitute
\[
u(x) = \exp\Big(-c_2 \int_0^x y(w)\,dw\Big), \qquad x \in [0, \infty) \tag{76}
\]

into Equation (75) to obtain

\[
u''(x) - c_1 u'(x) + c_2 (\hat{c} x + c_0) u(x) = 0, \qquad x \in (0, \infty). \tag{77}
\]

By Zaitsev and Polyanin (2002, Section 2.1.2.12), the (general) solution to the differential equation (77) is
of the form
\[
u(x) = C_1 \exp\Big(\frac{c_1 x}{2}\Big) \sqrt{z(x)}\, J_{1/3}\Big(\tfrac{2}{3} \sqrt{\hat{c} c_2}\, z^{3/2}(x)\Big) + C_2 \exp\Big(\frac{c_1 x}{2}\Big) \sqrt{z(x)}\, Y_{1/3}\Big(\tfrac{2}{3} \sqrt{\hat{c} c_2}\, z^{3/2}(x)\Big), \qquad x \in [0, \infty),
\]

where $C_1$ and $C_2$ are constants, $J_{1/3}$ and $Y_{1/3}$ are the order 1/3 Bessel functions of the first and second kind,
respectively, and
\[
z(x) = x + \frac{4 c_0 c_2 - c_1^2}{4 \hat{c} c_2}, \qquad x \in [0, \infty).
\]
Note that Equation (76) can be equivalently written as
\[
y(x) = -\frac{1}{c_2} \frac{u'(x)}{u(x)}, \qquad x \in [0, \infty).
\]
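The substitution in the proof can be verified numerically without any special functions. The sketch below integrates the Riccati equation (75) directly and, separately, the linear equation (77) with the initial conditions u(0) = 1 and u'(0) = −c2 y0 implied by (76), and then compares y with −u'/(c2 u). The coefficient values, the boundary value, and the use of a classical Runge-Kutta integrator are arbitrary choices made for this illustration.

```python
def rk4(f, state, x0, x1, steps):
    """Classical fourth-order Runge-Kutta for a vector ODE state' = f(x, state)."""
    h = (x1 - x0) / steps
    x = x0
    for _ in range(steps):
        k1 = f(x, state)
        k2 = f(x + h / 2, [s + h / 2 * k for s, k in zip(state, k1)])
        k3 = f(x + h / 2, [s + h / 2 * k for s, k in zip(state, k2)])
        k4 = f(x + h, [s + h * k for s, k in zip(state, k3)])
        state = [s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
        x += h
    return state

# Hypothetical coefficients and boundary value, chosen only for illustration.
C2, C1, C_HAT, C0, Y0 = -1.0, 0.5, 1.0, 0.2, 0.3

def riccati(x, s):
    """Right-hand side of (75): y' = c2 y^2 + c1 y + c_hat x + c0."""
    (y,) = s
    return [C2 * y * y + C1 * y + C_HAT * x + C0]

def linear(x, s):
    """Equation (77) as a first-order system in (u, u')."""
    u, du = s
    return [du, C1 * du - C2 * (C_HAT * x + C0) * u]

def y_direct(x, steps=2000):
    """Solve the Riccati equation (75) directly from y(0) = y0."""
    return rk4(riccati, [Y0], 0.0, x, steps)[0]

def y_via_u(x, steps=2000):
    """Solve (77) with u(0) = 1, u'(0) = -c2 * y0, then recover y = -u' / (c2 u)."""
    u, du = rk4(linear, [1.0, -C2 * Y0], 0.0, x, steps)
    return -du / (C2 * u)
```

For these values the two routes should agree up to the discretization error of the integrator, for example when comparing `y_direct(1.0)` with `y_via_u(1.0)`.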

Next, we use the relationship between the Airy functions and the Bessel functions to write the solution
to the differential equation (75) subject to the boundary condition y(0) = y0 in terms of the Airy functions.
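The Airy functions of the first kind Ai : R → R and the second kind Bi : R → R are defined as follows:
\[
\mathrm{Ai}(x) = \frac{1}{\pi}\int_0^\infty \cos\!\big(\tfrac{1}{3}z^3 + zx\big)\,dz, \quad x \in \mathbb{R},
\]
\[
\mathrm{Bi}(x) = \frac{1}{\pi}\int_0^\infty \Big[\exp\!\big(-\tfrac{1}{3}z^3 + zx\big) + \sin\!\big(\tfrac{1}{3}z^3 + zx\big)\Big]\,dz, \quad x \in \mathbb{R}.
\]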
The Airy functions and the Bessel functions of order 1/3 can be expressed in terms of each other; see e.g.,
Abramowitz and Stegun (1965, Section 10.4). The following lemma uses this relationship to provide a closed-
form solution for the differential equation (75) in terms of the Airy functions. This result is crucially used in
solving the Bellman equation in Section 6.3. In particular, Lemma 7 solves Equations (50)-(51) and (52)-(53)
by showing that they belong to the class of Riccati equations studied here. As a preliminary to the next
lemma, define the function
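\[
\tilde{z}(x) = \frac{c_1^2 - 4c_2(\hat{c}x + c_0)}{4(\hat{c}c_2)^{2/3}}, \quad x \in [0, \infty),
\]
define the constants
\[
\tilde{C}_1 = \frac{\mathrm{Bi}'(\tilde{z}(0)) - \mathrm{Bi}(\tilde{z}(0))\,(\hat{c}c_2)^{-1/3}\big(c_2 y_0 + \tfrac{c_1}{2}\big)}{\mathrm{Ai}(\tilde{z}(0))\,\mathrm{Bi}'(\tilde{z}(0)) - \mathrm{Ai}'(\tilde{z}(0))\,\mathrm{Bi}(\tilde{z}(0))},
\]
\[
\tilde{C}_2 = \frac{-\mathrm{Ai}'(\tilde{z}(0)) + \mathrm{Ai}(\tilde{z}(0))\,(\hat{c}c_2)^{-1/3}\big(c_2 y_0 + \tfrac{c_1}{2}\big)}{\mathrm{Ai}(\tilde{z}(0))\,\mathrm{Bi}'(\tilde{z}(0)) - \mathrm{Ai}'(\tilde{z}(0))\,\mathrm{Bi}(\tilde{z}(0))},
\]
and the function
\[
\tilde{u}(x) = \tilde{C}_1 \exp\!\big(\tfrac{c_1 x}{2}\big)\,\mathrm{Ai}(\tilde{z}(x)) + \tilde{C}_2 \exp\!\big(\tfrac{c_1 x}{2}\big)\,\mathrm{Bi}(\tilde{z}(x)), \quad x \in [0, \infty). \tag{78}
\]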

Lemma 6. The (unique) solution to the differential equation (75) subject to the boundary condition y(0) =
y0 is
\[
y(x) = -\frac{1}{c_2}\,\frac{\tilde{u}'(x)}{\tilde{u}(x)}, \quad x \in [0, \infty).
\]

Proof. By Lemma 5, the (general) solution to the differential equation (75) is of the form
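\[
y(x) = -\frac{1}{c_2}\,\frac{u'(x)}{u(x)}, \quad x \in [0, \infty), \tag{79}
\]
where
\[
u(x) = C_1 \exp\!\big(\tfrac{c_1 x}{2}\big) \sqrt{z(x)}\, J_{1/3}\!\big(\tfrac{2}{3}\sqrt{\hat{c}c_2}\, z^{3/2}(x)\big) + C_2 \exp\!\big(\tfrac{c_1 x}{2}\big) \sqrt{z(x)}\, Y_{1/3}\!\big(\tfrac{2}{3}\sqrt{\hat{c}c_2}\, z^{3/2}(x)\big), \quad x \in [0, \infty), \tag{80}
\]
the functions J1/3 and Y1/3 are the order 1/3 Bessel functions of the first and second kind, respectively, and
\[
z(x) = x + \frac{4c_0c_2 - c_1^2}{4\hat{c}c_2}, \quad x \in [0, \infty). \tag{81}
\]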

First, we use the relationship between the Bessel functions of the first and second kind to write u solely
in terms of the Bessel functions of the first kind. To do so, note that by Abramowitz and Stegun (1965,
Equation (9.1.2)),
\[
Y_{1/3}(x) = \frac{J_{1/3}(x)\cos(\pi/3) - J_{-1/3}(x)}{\sin(\pi/3)}, \quad x \in \mathbb{R}. \tag{82}
\]
It follows from Equations (80) and (82) that u can be written as
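\[
u(x) = \hat{C}_1 \exp\!\big(\tfrac{c_1 x}{2}\big) \sqrt{z(x)}\, J_{1/3}\!\big(\tfrac{2}{3}\sqrt{\hat{c}c_2}\, z^{3/2}(x)\big) + \hat{C}_2 \exp\!\big(\tfrac{c_1 x}{2}\big) \sqrt{z(x)}\, J_{-1/3}\!\big(\tfrac{2}{3}\sqrt{\hat{c}c_2}\, z^{3/2}(x)\big), \quad x \in [0, \infty) \tag{83}
\]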

for some Ĉ1 , Ĉ2 ∈ R. Next, we use the relationship between the Airy functions and the Bessel functions to
write u in terms of the Airy functions. By Abramowitz and Stegun (1965, Equation (10.4.22)),
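\[
\sqrt{\hat{z}}\; J_{\pm 1/3}\!\big(\tfrac{2}{3}\hat{z}^{3/2}\big) = \tfrac{3}{2}\,\mathrm{Ai}(-\hat{z}) \mp \tfrac{\sqrt{3}}{2}\,\mathrm{Bi}(-\hat{z}), \quad \hat{z} \in \mathbb{R}, \tag{84}
\]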

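where Ai and Bi are the Airy functions of the first and second kind, respectively. It follows from Equations
(83)-(84) and the change of variable $\hat{z} = \sqrt[3]{\hat{c}c_2}\, z$ that $u = \tilde{u}$, where
\[
\tilde{u}(x) = \tilde{C}_1 \exp\!\big(\tfrac{c_1 x}{2}\big)\,\mathrm{Ai}(\tilde{z}(x)) + \tilde{C}_2 \exp\!\big(\tfrac{c_1 x}{2}\big)\,\mathrm{Bi}(\tilde{z}(x)) \tag{85}
\]
for x ∈ [0, ∞), $\tilde{C}_1$ and $\tilde{C}_2$ are constants, and for x ∈ [0, ∞),
\[
\begin{aligned}
\tilde{z}(x) &= -\sqrt[3]{\hat{c}c_2}\; z(x) \\
&= -\sqrt[3]{\hat{c}c_2}\,\Big(x + \frac{4c_0c_2 - c_1^2}{4\hat{c}c_2}\Big) \\
&= -\frac{4(\hat{c}c_2)^{2/3}}{4(\hat{c}c_2)^{2/3}}\,\sqrt[3]{\hat{c}c_2}\,\Big(x + \frac{4c_0c_2 - c_1^2}{4\hat{c}c_2}\Big) \\
&= \frac{c_1^2 - 4c_2(\hat{c}x + c_0)}{4(\hat{c}c_2)^{2/3}}.
\end{aligned} \tag{86}
\]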
Moreover, it follows from u = ũ, Equation (79), and boundary condition y(0) = y0 that

\[
\tilde{u}(0) = 1 \quad \text{and} \quad \tilde{u}'(0) = -c_2 y_0. \tag{87}
\]

It follows from Equations (85) and (87) that the constants C̃1 and C̃2 must satisfy
 
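\[
\begin{aligned}
\tilde{C}_1\, \mathrm{Ai}(\tilde{z}(x))\Big|_{x=0} + \tilde{C}_2\, \mathrm{Bi}(\tilde{z}(x))\Big|_{x=0} &= \exp\!\big(-\tfrac{c_1 x}{2}\big)\,\tilde{u}(x)\Big|_{x=0}, \\
\tilde{C}_1\, \big[\mathrm{Ai}(\tilde{z}(x))\big]'\Big|_{x=0} + \tilde{C}_2\, \big[\mathrm{Bi}(\tilde{z}(x))\big]'\Big|_{x=0} &= \big[\exp\!\big(-\tfrac{c_1 x}{2}\big)\,\tilde{u}(x)\big]'\Big|_{x=0},
\end{aligned}
\]
which can be simplified to
\[
\begin{aligned}
\tilde{C}_1\, \mathrm{Ai}(\tilde{z}(0)) + \tilde{C}_2\, \mathrm{Bi}(\tilde{z}(0)) &= 1, \\
\tilde{C}_1\, \mathrm{Ai}'(\tilde{z}(0)) + \tilde{C}_2\, \mathrm{Bi}'(\tilde{z}(0)) &= (\hat{c}c_2)^{-1/3}\big(c_2 y_0 + \tfrac{c_1}{2}\big).
\end{aligned}
\]
Thus,
\[
\begin{bmatrix} \tilde{C}_1 \\ \tilde{C}_2 \end{bmatrix}
= \begin{bmatrix} \mathrm{Ai}(\tilde{z}(0)) & \mathrm{Bi}(\tilde{z}(0)) \\ \mathrm{Ai}'(\tilde{z}(0)) & \mathrm{Bi}'(\tilde{z}(0)) \end{bmatrix}^{-1}
\begin{bmatrix} 1 \\ (\hat{c}c_2)^{-1/3}\big(c_2 y_0 + \tfrac{c_1}{2}\big) \end{bmatrix},
\]
which gives
\[
\begin{aligned}
\begin{bmatrix} \tilde{C}_1 \\ \tilde{C}_2 \end{bmatrix}
&= \frac{1}{\mathrm{Ai}(\tilde{z}(0))\,\mathrm{Bi}'(\tilde{z}(0)) - \mathrm{Ai}'(\tilde{z}(0))\,\mathrm{Bi}(\tilde{z}(0))}
\begin{bmatrix} \mathrm{Bi}'(\tilde{z}(0)) & -\mathrm{Bi}(\tilde{z}(0)) \\ -\mathrm{Ai}'(\tilde{z}(0)) & \mathrm{Ai}(\tilde{z}(0)) \end{bmatrix}
\begin{bmatrix} 1 \\ (\hat{c}c_2)^{-1/3}\big(c_2 y_0 + \tfrac{c_1}{2}\big) \end{bmatrix} \\
&= \frac{1}{\mathrm{Ai}(\tilde{z}(0))\,\mathrm{Bi}'(\tilde{z}(0)) - \mathrm{Ai}'(\tilde{z}(0))\,\mathrm{Bi}(\tilde{z}(0))}
\begin{bmatrix} \mathrm{Bi}'(\tilde{z}(0)) - \mathrm{Bi}(\tilde{z}(0))\,(\hat{c}c_2)^{-1/3}\big(c_2 y_0 + \tfrac{c_1}{2}\big) \\ -\mathrm{Ai}'(\tilde{z}(0)) + \mathrm{Ai}(\tilde{z}(0))\,(\hat{c}c_2)^{-1/3}\big(c_2 y_0 + \tfrac{c_1}{2}\big) \end{bmatrix}.
\end{aligned} \tag{88}
\]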

In conclusion,
\[
y(x) = -\frac{1}{c_2}\,\frac{\tilde{u}'(x)}{\tilde{u}(x)}, \quad x \in [0, \infty),
\]
where ũ is given by Equation (85), z̃ is given by Equation (86), and C̃1 and C̃2 are given by Equation
(88).
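
The closed form of Lemma 6 can be checked numerically. The sketch below is illustrative only and is not part of the original analysis. It assumes that Equation (75) has the form y'(x) = c2 y(x)^2 + c1 y(x) + ĉx + c0, which is what the substitution (76) and Equation (77) imply, and that ĉ c2 > 0 so that the real cube root is unambiguous. The helper names (`airy`, `riccati_airy`, `riccati_rk4`) and the test parameters are hypothetical. The Airy functions are evaluated by their Maclaurin series, which is adequate only for moderate arguments.

```python
import math

# Ai(0) and -Ai'(0), expressed through the gamma function.
A0 = 1.0 / (3.0 ** (2.0 / 3.0) * math.gamma(2.0 / 3.0))
A1 = 1.0 / (3.0 ** (1.0 / 3.0) * math.gamma(1.0 / 3.0))


def airy(x, terms=40):
    """Return (Ai, Ai', Bi, Bi') at x via Maclaurin series (moderate |x| only)."""
    f = fp = g = gp = 0.0
    a = b = 1.0  # coefficients of x^(3k) in f and of x^(3k+1) in g
    for k in range(terms):
        f += a * x ** (3 * k)
        g += b * x ** (3 * k + 1)
        gp += b * (3 * k + 1) * x ** (3 * k)
        if k > 0:
            fp += a * 3 * k * x ** (3 * k - 1)
        a /= (3 * k + 2) * (3 * k + 3)
        b /= (3 * k + 3) * (3 * k + 4)
    r3 = math.sqrt(3.0)
    return (A0 * f - A1 * g, A0 * fp - A1 * gp,
            r3 * (A0 * f + A1 * g), r3 * (A0 * fp + A1 * gp))


def riccati_airy(x, c0, c1, c2, c_hat, y0):
    """Lemma 6 closed form for y' = c2*y^2 + c1*y + c_hat*x + c0, y(0) = y0.

    Assumes c_hat * c2 > 0 (an assumption of this sketch).
    """
    root = (c_hat * c2) ** (1.0 / 3.0)

    def z(t):  # z-tilde of Equation (86)
        return (c1 ** 2 - 4.0 * c2 * (c_hat * t + c0)) / (4.0 * root ** 2)

    ai0, aip0, bi0, bip0 = airy(z(0.0))
    k = (c2 * y0 + c1 / 2.0) / root
    det = ai0 * bip0 - aip0 * bi0
    C1 = (bip0 - bi0 * k) / det    # Equation (88)
    C2 = (-aip0 + ai0 * k) / det
    ai, aip, bi, bip = airy(z(x))
    # u'(x)/u(x); the common factor exp(c1*x/2) cancels in the ratio,
    # and d/dx z-tilde(x) = -(c_hat*c2)^(1/3).
    log_deriv = c1 / 2.0 - root * (C1 * aip + C2 * bip) / (C1 * ai + C2 * bi)
    return -log_deriv / c2


def riccati_rk4(x, c0, c1, c2, c_hat, y0, steps=2000):
    """Reference solution by classical RK4, used only as a cross-check."""
    def rhs(t, y):
        return c2 * y * y + c1 * y + c_hat * t + c0

    h, t, y = x / steps, 0.0, y0
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, y + h * k1 / 2)
        k3 = rhs(t + h / 2, y + h * k2 / 2)
        k4 = rhs(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y
```

For example, with the hypothetical parameters (c0, c1, c2, ĉ, y0) = (0, 0.5, −0.5, −1, 0), `riccati_airy(1.0, ...)` and `riccati_rk4(1.0, ...)` should agree to within the integration error. For large |z̃| a library implementation of the Airy functions would be preferable to the series.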

D Closed Form Solution of Equations (50)-(51) and (52)-(53)

Equations (50)-(51) and (52)-(53) both fall in the class of Riccati equations; e.g., see Zaitsev and Polyanin
(2002, Section 1.2). In this section, we use the structure of the Riccati equations to obtain closed-form
expressions for the solution to (50)-(51) and (52)-(53). Recall from Section 6.3 that the unique solution to
(50)-(51) is denoted by vγ- and the unique solution to (52)-(53) is denoted by vγ+ . To facilitate our analysis,
define Ai : R → R and Bi : R → R as
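\[
\mathrm{Ai}(w) = \frac{1}{\pi}\int_0^\infty \cos\!\big(\tfrac{1}{3}x^3 + xw\big)\,dx, \quad w \in \mathbb{R},
\]
\[
\mathrm{Bi}(w) = \frac{1}{\pi}\int_0^\infty \Big[\exp\!\big(-\tfrac{1}{3}x^3 + xw\big) + \sin\!\big(\tfrac{1}{3}x^3 + xw\big)\Big]\,dx, \quad w \in \mathbb{R}.
\]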
Functions Ai and Bi are referred to as the Airy functions of the first and second kind, respectively. The Airy
functions of the first and second kind are depicted in Figure 3. The Airy functions oscillate on (−∞, 0) and
are monotone on (0, ∞).⁶

[Figure 3: Airy functions of the first and second kind.]

[Figure 4: $v_\gamma^-(0)$ and $v_\gamma^+(0)$ as a function of the average cost γ.]

As a preliminary to discussing closed-form expressions for vγ- and vγ+ , define the functions
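\[
\bar{z}(w) = \big(m'H^{-1}m\, b^*\sigma^2\big)^{-2/3}\big(\mu^2 - m'H^{-1}m\,(b^* w + b^* l_\gamma + \gamma)\big), \quad w \in [0, -l_\gamma],
\]
\[
\hat{z}(w) = \big(m'H^{-1}m\, h^*\sigma^2\big)^{-2/3}\big(\mu^2 - m'H^{-1}m\, h^* w\big), \quad w \in [0, u_\gamma].
\]

Similarly, define the constants
\[
\bar{C}_1 = \frac{\mathrm{Bi}'(\bar{z}(0)) + \mathrm{Bi}(\bar{z}(0))\,\big(m'H^{-1}m\, b^*\sigma^2\big)^{-1/3}\big(\tfrac{1}{2}m'H^{-1}m\,\kappa + \mu\big)}{\mathrm{Ai}(\bar{z}(0))\,\mathrm{Bi}'(\bar{z}(0)) - \mathrm{Ai}'(\bar{z}(0))\,\mathrm{Bi}(\bar{z}(0))},
\]
\[
\bar{C}_2 = \frac{-\mathrm{Ai}'(\bar{z}(0)) - \mathrm{Ai}(\bar{z}(0))\,\big(m'H^{-1}m\, b^*\sigma^2\big)^{-1/3}\big(\tfrac{1}{2}m'H^{-1}m\,\kappa + \mu\big)}{\mathrm{Ai}(\bar{z}(0))\,\mathrm{Bi}'(\bar{z}(0)) - \mathrm{Ai}'(\bar{z}(0))\,\mathrm{Bi}(\bar{z}(0))},
\]
\[
\hat{C}_1 = \frac{\mathrm{Bi}'(\hat{z}(0)) - \mathrm{Bi}(\hat{z}(0))\,\big(m'H^{-1}m\, h^*\sigma^2\big)^{-1/3}\mu}{\mathrm{Ai}(\hat{z}(0))\,\mathrm{Bi}'(\hat{z}(0)) - \mathrm{Ai}'(\hat{z}(0))\,\mathrm{Bi}(\hat{z}(0))},
\]
\[
\hat{C}_2 = \frac{-\mathrm{Ai}'(\hat{z}(0)) + \mathrm{Ai}(\hat{z}(0))\,\big(m'H^{-1}m\, h^*\sigma^2\big)^{-1/3}\mu}{\mathrm{Ai}(\hat{z}(0))\,\mathrm{Bi}'(\hat{z}(0)) - \mathrm{Ai}'(\hat{z}(0))\,\mathrm{Bi}(\hat{z}(0))},
\]
and define the functions
\[
\bar{y}(w) = \bar{C}_1 \exp(-\mu w/\sigma^2)\,\mathrm{Ai}(\bar{z}(w)) + \bar{C}_2 \exp(-\mu w/\sigma^2)\,\mathrm{Bi}(\bar{z}(w)), \quad w \in [0, -l_\gamma],
\]
\[
\hat{y}(w) = \hat{C}_1 \exp(\mu w/\sigma^2)\,\mathrm{Ai}(\hat{z}(w)) + \hat{C}_2 \exp(\mu w/\sigma^2)\,\mathrm{Bi}(\hat{z}(w)), \quad w \in [0, u_\gamma].
\]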

Lemma 7 provides closed-form expressions for vγ- and vγ+ in terms of the Airy functions; see Figure 4 for an
illustration of vγ- (0) and vγ+ (0) as a function of γ.

⁶ The Airy functions can be expressed in terms of the Bessel functions and the modified Bessel functions of order
1/3; see e.g., Abramowitz and Stegun (1965, Section 10.4).

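Lemma 7. We have the following closed-form solutions for $v_\gamma^-$ and $v_\gamma^+$:
\[
v_\gamma^-(w) = -\frac{2\sigma^2}{m'H^{-1}m}\,\frac{\bar{y}'(w - l_\gamma)}{\bar{y}(w - l_\gamma)}, \quad w \in [l_\gamma, 0],
\]
\[
v_\gamma^+(w) = \frac{2\sigma^2}{m'H^{-1}m}\,\frac{\hat{y}'(u_\gamma - w)}{\hat{y}(u_\gamma - w)}, \quad w \in [0, u_\gamma].
\]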

Proof. We prove the result for vγ- and vγ+ , separately. We start with vγ- . Consider the differential equation
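\[
\hat{v}'(w) = \frac{m'H^{-1}m}{2\sigma^2}\big(\hat{v}^2(w) - \kappa^2\big) - \frac{2\mu}{\sigma^2}\big(\hat{v}(w) + \kappa\big) + \frac{2b^*}{\sigma^2}\,w, \quad w \in (0, \infty), \tag{89}
\]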
subject to the boundary condition v̂(0) = −κ. It is straightforward to see that

vγ- (w) = v̂(w − lγ ), w ∈ [lγ , 0]. (90)

Equation (89) is a Riccati equation of the form (75) with


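\[
c_0 = -\frac{m'H^{-1}m}{2\sigma^2}\,\kappa^2 - \frac{2\mu}{\sigma^2}\,\kappa, \quad c_1 = -\frac{2\mu}{\sigma^2}, \quad c_2 = \frac{m'H^{-1}m}{2\sigma^2}, \quad \text{and} \quad \hat{c} = \frac{2b^*}{\sigma^2}.
\]
The closed-form expression for $v_\gamma^-$ then follows from Lemma 6 and the fact that by Equation (46), $c_0 = \gamma + b^* l_\gamma$.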

Next, we prove the results for vγ+ . Consider the differential equation
\[
\tilde{v}'(w) = -\frac{m'H^{-1}m}{2\sigma^2}\,\tilde{v}^2(w) + \frac{2\mu}{\sigma^2}\,\tilde{v}(w) - \frac{2h^*}{\sigma^2}\,w, \quad w \in (0, \infty), \tag{91}
\]
subject to the boundary condition ṽ(0) = 0. It is straightforward to see that

vγ+ (w) = ṽ(uγ − w), w ∈ [0, uγ ]. (92)

Equation (91) is a Riccati equation of the form (75) with


\[
c_0 = 0, \quad c_1 = \frac{2\mu}{\sigma^2}, \quad c_2 = -\frac{m'H^{-1}m}{2\sigma^2}, \quad \text{and} \quad \hat{c} = -\frac{2h^*}{\sigma^2}.
\]
The closed-form expression for vγ+ then follows from Lemma 6.

E The Optimal Solution to the BCP

Corollary 3 uses Lemma 1, Proposition 4, and Theorem 1 to propose an optimal solution to the BCP.

Corollary 3. Let $(L, U, \theta)$ denote the barrier policy with the lower barrier at $l_{\gamma^*}$, the upper barrier at $u_{\gamma^*}$,
and the (effective) drift rate control $\theta(\cdot)$ given by (56). Let $W$ denote the state process associated with the
barrier policy $(L, U, \theta)$ and let $X$ denote a Brownian motion that satisfies (26). Then, the control policy
$(R, Y, \zeta)$ given by
\[
\zeta(t) = \frac{H^{-1}m}{2}\, v(W(t)), \tag{93}
\]
\[
R_k(t) = \begin{cases} \dfrac{L(t)}{m_k} & \text{if } k = i^*, \\ 0 & \text{otherwise}, \end{cases} \tag{94}
\]
\[
Y_k(t) = \Big( X_k(t) - \int_0^t \zeta_k(s)\,ds + R_k(t) - \Delta_k(W(t)) \Big)\, m_k \tag{95}
\]
for k = 1, . . . , K and t ≥ 0 is optimal for the BCP (23) with Brownian motion X and state process
Z(t) = {∆(W (t)) : t ≥ 0}, where ∆ is given by (33).

F Supplementary Material for Section 8

This section provides supplementary material for the simulation study in Section 8. We start with a detailed
description of the alternative policies considered; see Table 1 for a brief description of the pricing, outsourcing,
and scheduling decisions made under these policies.

Table 1: Comparison of the policies in terms of their decisions.

Policy | Pricing decision | Outsourcing decision | Scheduling decision
-------|------------------|----------------------|--------------------
MDP Policy | Dynamic (optimal) | Dynamic (optimal) | Dynamic (optimal)
MDP Policy with Static Prices | Static (optimal static prices) | Dynamic (optimal given the static prices) | Dynamic (optimal given the static prices)
Two-Parameter Static Pricing Policy | Static (zero drift rate) | Dynamic (threshold tuned by simulation) | Dynamic (threshold tuned by simulation)
Three-Parameter Static Pricing Policy | Static (drift rate tuned by simulation) | Dynamic (threshold tuned by simulation) | Dynamic (threshold tuned by simulation)
N-Price Policy | Dynamic with N admissible drift rates (obtained using the EWF) | Dynamic (threshold obtained using the EWF) | Dynamic (threshold obtained using the EWF)
PZVW Policy with Static Prices | Static with one admissible drift rate (obtained using the EWF) | Dynamic (threshold obtained using the EWF) | Dynamic (threshold obtained using the EWF)
PZVW Policy with Dynamic Prices | Dynamic with unrestricted drift rates (obtained using the EWF) | Dynamic (threshold obtained using the EWF) | Dynamic (threshold obtained using the EWF)
Proposed Policy | Dynamic with unrestricted drift rates (obtained using the EWF) | Dynamic (threshold obtained using the EWF) | Dynamic (threshold obtained using the EWF)

The first policy is a joint dynamic pricing, outsourcing, and scheduling policy that is computed numerically
using dynamic programming; see Appendices G-H for its formulation and solution. This
is the (exact) optimal policy, which we can compute only for formulations with few products. For larger
problems, this approach suffers from the curse of dimensionality. We refer to this policy as the Markov
Decision Process (MDP) policy.

The second policy is the joint dynamic outsourcing and scheduling policy that uses a static price vector.
We compute this policy as follows: First, we use the dynamic programming approach to find the dynamic
outsourcing and scheduling policies given a static price vector. Then, we use the gradient descent algorithm
described in Appendix I to find the best static price vector. We refer to this policy as the MDP policy with
static prices. When the workload is sufficiently high, i.e., close to the idling threshold, under the proposed
policy and the MDP policy, the system manager uses the (nominal) static prices; see Figure 10 (in Appendix
L). As the backlog increases, the system manager turns away more customers by increasing the prices under
those policies. This helps control the backlog. When the outsourcing cost is small, the system manager could
alternatively outsource these orders. In fact, when the outsourcing cost is small, the MDP policy with static
prices uses outsourcing as a substitute for dynamic pricing to limit the backlog; see Figure 1b and Table 5
(in Appendix F).

The next three policies use scheduling and outsourcing policies similar to the proposed policy. They
are characterized by an outsourcing threshold $l < 0$, an idling threshold $u > 0$, and safety stocks $s_k$ for
k = 1, 2, 3. They outsource orders of product $i^* = 3$ when the nominal workload W is less than or equal to
their outsourcing threshold l. They idle the server whenever the nominal workload W exceeds their idling
threshold u and $Q^n_k \ge s_k$ for k = 1, 2, 3. Under these policies, whenever the server is working, it first
prioritizes those products for which $Q^n_k < s_k$, i.e., the inventory is below the safety stock. Among such
products, the system manager prioritizes them in the descending order of $b_k/m_k$, i.e., products 1, 2, and
then 3. Finally, if $Q^n_k \ge s_k$ for all k = 1, 2, 3 and W < u, i.e., the server is working, then the server focuses
solely on product $j^* = 3$. Policies three to five use different pricing policies. Their outsourcing and idling
thresholds are computed differently, as well. Next, we describe these differences.
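
The threshold rules just described can be summarized in a short sketch. It is illustrative only: the function names, the zero-based product indexing (product 3 is index 2), and the choice to idle when W equals u exactly are assumptions made here, not details taken from the paper.

```python
def next_action(Q, s, W, u, priority=(0, 1, 2), default_product=2):
    """Return the index of the product to manufacture, or None to idle.

    Q: current inventories, s: safety stocks, W: nominal workload,
    u: idling threshold. `priority` lists products in descending b_k/m_k.
    """
    # Products below their safety stock are served first, in priority order.
    below = [k for k in priority if Q[k] < s[k]]
    if below:
        return below[0]
    # All safety stocks are met: work on the default product while W < u.
    if W < u:
        return default_product
    # Otherwise idle (W at or above u is treated as idling in this sketch).
    return None


def outsource(k, W, l, i_star=2):
    """Outsource an incoming order of product k only if k = i* and W <= l."""
    return k == i_star and W <= l
```

For instance, `next_action((5, 0, 0), (1, 1, 1), W=1.0, u=3.0)` selects index 1, because products 2 and 3 are both below their safety stocks and product 2 has the higher priority.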

The third policy uses the static price vector $p^* = \Lambda^{-1}(\lambda^*)$, i.e., the optimal solution to the static planning
problem (8). This corresponds to a zero drift rate in the EWF (29)-(31). We refer to the third policy as the
two-parameter static pricing policy. The outsourcing threshold, idling threshold, and safety stocks of this
policy are obtained using a simulation-based search. The fourth policy uses the static price vector given by
\[
\Lambda^{-1}(\lambda^*) + \frac{\nabla\Lambda^{-1}(\lambda^*)}{\sqrt{n}}\,\frac{H^{-1}m}{m'H^{-1}m}\,\bar{\theta},
\]
where $\bar{\theta} \in \mathbb{R}$ is the static drift rate. We refer to this policy as the three-parameter static pricing policy. The
outsourcing threshold, idling threshold, static drift rate, and safety stocks of this policy are obtained using
a simulation-based search.

The fifth policy uses N ∈ N price vectors. For a given N ∈ N, this policy is computed as follows:
First, we solve the equivalent workload formulation with the additional constraint that the effective drift
rate θ(t) ∈ Θ for t ≥ 0, where Θ is a given finite set with N elements. Appendix J provides a solution to this
formulation; see Ata (2006), Ata et al. (2019), and Ata et al. (2021a,b) for similar formulations and solution
approaches. Second, we use the gradient descent algorithm described in Appendix I to find the set Θ with
the smallest long-run average cost in the (modified) equivalent workload formulation. We denote this set by
$\Theta^*$. Third, we interpret the solution to the (modified) equivalent workload formulation for the set $\Theta^*$ as a
joint dynamic pricing, scheduling, and outsourcing policy as we did for our proposed policy. We refer to this
policy as the N -price policy. We consider this policy with N = 1, 2, 3. We determine the best safety stocks
for this policy using a simulation-based search.

The sixth policy combines the pricing, outsourcing, and idleness policies of the one-price policy described
above with the scheduling policy, referred to as the Myopic(P) index scheduling policy, proposed by Perez
and Zipkin (1997) and further studied in Veatch and Wein (1996). Under this policy, when the workload
is below the idling threshold and the server is not manufacturing a product, an index is calculated for each
product and the product with the smallest index is manufactured. These indices correspond to the change in
the holding and backorder cost if the server were to manufacture one more unit of that product. To be more
specific, they are equal to $\mu_k \Delta g_k(Q^n_k)$, where $\Delta g_k(Q^n_k) = E[\nu(Q^n_k + 1 - D_k)] - \nu(Q^n_k)$ and $D_k$ denotes the
demand for product k during one production time of product k; see Appendix N for further details. We refer
to this policy as the PZVW policy with static prices. The seventh policy combines the pricing, outsourcing,
and idleness policies of our proposed policy with the Myopic(P) index scheduling policy. We refer to this
policy as the PZVW policy with dynamic prices.
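
The index computation above can be sketched as follows. This is a minimal illustration under stated assumptions: the state cost ν is taken to be piecewise linear with a holding rate h and a backorder rate b, and the distribution of D_k is supplied by the caller as a probability mass function. Neither choice is taken from Appendix N, and all function names are hypothetical.

```python
def state_cost(q, h, b):
    """Assumed cost nu(q): holding cost h per unit held, b per unit backordered."""
    return h * q if q >= 0 else -b * q


def myopic_index(q, mu, h, b, demand_pmf):
    """mu * (E[nu(q + 1 - D)] - nu(q)) for demand D with pmf {d: P(D = d)}."""
    expected = sum(p * state_cost(q + 1 - d, h, b) for d, p in demand_pmf.items())
    return mu * (expected - state_cost(q, h, b))


def pick_product(Q, mu, h, b, demand_pmfs):
    """Return the (zero-based) product with the smallest index."""
    indices = [myopic_index(Q[k], mu[k], h[k], b[k], demand_pmfs[k])
               for k in range(len(Q))]
    return min(range(len(Q)), key=indices.__getitem__)
```

With a backlog of two units, no demand during production, h = 1, b = 2, and µ = 1.5, the index is 1.5 × (2 − 4) = −3, so producing one more unit lowers the cost rate.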

Note that our proposed policy, the N -price policy, the PZVW policy with static prices, and the PZVW
policy with dynamic prices are easy to compute whereas the other policies require substantial computational
power (either for solving the MDPs or finding the best outsourcing threshold, idling threshold, and static
drift rate).

Next, we provide some supplementary figures and tables for the simulation study. Table 2 reports the
long-run average cost of the MDP policy and the optimality gap of the other policies (with respect to the
MDP policy). Figure 5 depicts the impact of the average server utilization and the total outsourcing cost
on the gap between the proposed policy and the other policies.
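
As a reading aid for the tables that follow, the two kinds of gaps can be related by simple arithmetic. The sketch below assumes that both gaps are relative cost differences, i.e., (cost of policy − cost of benchmark) / cost of benchmark. This interpretation is an assumption made here; since the reported entries are rounded and estimated by simulation, agreement is only approximate.

```python
def gap_from_proposed(gap_policy, gap_proposed):
    """Convert optimality gaps (relative to the MDP policy) into a gap
    relative to the proposed policy, assuming relative cost differences."""
    return (1.0 + gap_policy) / (1.0 + gap_proposed) - 1.0


# Table 2, utilization 0.95: MDP policy with static prices (17.9%)
# versus the proposed policy (7.3%).
print(round(100 * gap_from_proposed(0.179, 0.073), 1))
```

This yields roughly 9.9%, matching the corresponding entry in the lower block of Table 2.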

Table 2: Comparative evaluation of the long-run average cost of the policies considered at different average
server utilizations. The values followed by ± indicate the 95% confidence interval.

Average Server Utilization


0.90 0.95 0.97 0.99
Cost of the MDP Policy 0.036 0.049 0.057 0.066
Optimality gap
Proposed Policy 7.4% ± 1.0% 7.3% ± 0.5% 7.0% ± 0.6% 6.5% ± 0.6%
MDP Policy with Static Prices 8.6% 17.9% 23.0% 28.3%
Two-Parameter Static Pricing Policy 24.4% ± 1.3% 67.7% ± 0.8% 113.8% ± 0.5% 179.8% ± 0.5%
Three-Parameter Static Pricing Policy 15.3% ± 1.3% 24.6% ± 0.8% 29.2% ± 0.6% 34.0% ± 0.7%
One-Price Policy 17.1% ± 1.0% 25.3% ± 0.9% 29.8% ± 0.8% 35.2% ± 0.6%
Two-Price Policy 9.5% ± 1.1% 10.8% ± 0.8% 11.2% ± 0.7% 10.9% ± 0.6%
Three-Price Policy 8.3% ± 1.2% 8.8% ± 0.8% 8.5% ± 0.7% 8.3% ± 0.5%
PZVW Policy with Static Prices 25.2% ± 2.2% 34.6% ± 1.7% 38.2% ± 1.4% 43.1% ± 1.2%
PZVW Policy with Dynamic Prices 17.6% ± 2.1% 17.9% ± 1.5% 15.4% ± 1.3% 14.8% ± 1.3%
Gap from the Proposed Policy
MDP Policy with Static Prices 1.2% ± 0.9% 9.9% ± 0.5% 15.0% ± 0.7% 20.5% ± 0.7%
Two-Parameter Static Pricing Policy 15.9% ± 1.6% 56.2% ± 1.0% 99.9% ± 1.3% 162.7% ± 1.5%
Three-Parameter Static Pricing Policy 7.4% ± 1.6% 16.1% ± 0.9% 20.8% ± 0.9% 25.8% ± 0.9%
One-Price Policy 9.0% ± 1.4% 16.8% ± 1.0% 21.4% ± 1.0% 27.0% ± 0.9%
Two-Price Policy 2.0% ± 1.4% 3.2% ± 0.9% 4.0% ± 0.9% 4.1% ± 0.8%
Three-Price Policy 0.9% ± 1.5% 1.3% ± 0.9% 1.5% ± 0.9% 1.7% ± 0.8%
PZVW Policy with Static Prices 16.6% ± 2.3% 25.5% ± 1.7% 29.2% ± 1.5% 34.3% ± 1.4%
PZVW Policy with Dynamic Prices 9.6% ± 2.2% 9.8% ± 1.5% 7.9% ± 1.4% 7.8% ± 1.4%

[Figure 5 panels: (a) Impact of the average server utilization (horizontal axis: Utilization; vertical axis: Gap from the Proposed Policy). (b) Impact of the total outsourcing cost (horizontal axis: Total Outsourcing Cost; vertical axis: Gap from the Proposed Policy). Policies plotted: Proposed Policy, MDP Policy with Static Prices, Three-Parameter Static Pricing Policy, One-Price Policy, Two-Price Policy, Three-Price Policy, PZVW Policy with Static Prices, PZVW Policy with Dynamic Prices.]

Figure 5: Impact of the average server utilization and the total outsourcing cost on the gap between the
proposed policy and the other policies. The shaded areas depict the 95% confidence intervals. For visual
clarity, we have not included the two-parameter static pricing policy, which is substantially outperformed by
all other policies.

Tables 3-4 report the impact of the state costs.

Table 3: Comparative evaluation of the long-run average cost of the policies considered at different state
costs. These values can be thought of as corresponding to different costs of capital, which increases from left
to right. The values followed by ± indicate the 95% confidence interval.

State Costs (Holding Cost, Backorder Cost)


(0.001, 0.002) (0.005, 0.01) (0.01, 0.02) (0.05, 0.1)
Cost of the MDP Policy 0.013 0.049 0.086 0.306
Optimality gap
Proposed Policy 5.9% ± 1.9% 7.3% ± 0.5% 8.5% ± 0.5% 10.8% ± 0.1%
MDP Policy with Static Prices 11.4% 17.9% 19.6% 20.2%
Two-Parameter Static Pricing Policy 31.3% ± 3.7% 67.7% ± 0.8% 84.3% ± 0.4% 91.4% ± 0.1%
Three-Parameter Static Pricing Policy 16.1% ± 4.0% 24.6% ± 0.8% 26.9% ± 0.4% 30.4% ± 0.1%
One-Price Policy 17.6% ± 2.8% 25.3% ± 0.9% 28.4% ± 0.5% 30.8% ± 0.1%
Two-Price Policy 8.5% ± 2.5% 10.8% ± 0.8% 11.9% ± 0.5% 13.9% ± 0.1%
Three-Price Policy 7.3% ± 2.5% 8.8% ± 0.8% 10.1% ± 0.5% 12.2% ± 0.1%
PZVW Policy with Static Prices 27.6% ± 7.3% 34.6% ± 1.7% 38.8% ± 1.0% 41.4% ± 0.3%
PZVW Policy with Dynamic Prices 11.4% ± 7.3% 17.9% ± 1.5% 20.4% ± 0.9% 22.0% ± 0.3%
Gap from the Proposed Policy
MDP Policy with Static Prices 5.2% ± 1.9% 9.9% ± 0.5% 10.2% ± 0.5% 8.5% ± 0.1%
Two-Parameter Static Pricing Policy 23.9% ± 4.1% 56.2% ± 1.0% 69.9% ± 0.9% 72.7% ± 0.3%
Three-Parameter Static Pricing Policy 9.6% ± 4.2% 16.1% ± 0.9% 17.0% ± 0.7% 17.7% ± 0.2%
One-Price Policy 11.1% ± 3.3% 16.8% ± 1.0% 18.4% ± 0.7% 18.0% ± 0.2%
Two-Price Policy 2.5% ± 3.0% 3.2% ± 0.9% 3.2% ± 0.7% 2.8% ± 0.2%
Three-Price Policy 1.4% ± 3.0% 1.3% ± 0.9% 1.4% ± 0.7% 1.3% ± 0.2%
PZVW Policy with Static Prices 20.4% ± 7.2% 25.5% ± 1.7% 28.0% ± 1.1% 27.6% ± 0.3%
PZVW Policy with Dynamic Prices 5.2% ± 7.2% 9.8% ± 1.5% 10.9% ± 1.0% 10.1% ± 0.3%

Table 4: Comparative evaluation of the long-run average cost of the policies considered at different backorder
costs. The values followed by ± indicate the 95% confidence interval.

Ratio of the Backorder Cost to the Holding Cost


1 2 3 5
Cost of the MDP Policy 0.035 0.049 0.058 0.070
Optimality gap
Proposed Policy 5.9% ± 1.2% 7.3% ± 0.5% 8.3% ± 0.6% 12.1% ± 0.6%
MDP Policy with Static Prices 13.1% 17.9% 21.4% 25.9%
Two-Parameter Static Pricing Policy 49.3% ± 1.3% 67.7% ± 0.8% 77.3% ± 0.5% 90.6% ± 0.1%
Three-Parameter Static Pricing Policy 17.7% ± 1.0% 24.6% ± 0.8% 28.9% ± 0.4% 37.1% ± 0.4%
One-Price Policy 19.1% ± 1.2% 25.3% ± 0.9% 29.2% ± 0.7% 37.9% ± 0.6%
Two-Price Policy 8.9% ± 1.1% 10.8% ± 0.8% 11.8% ± 0.7% 15.8% ± 0.5%
Three-Price Policy 7.0% ± 1.3% 8.8% ± 0.8% 9.9% ± 0.8% 13.6% ± 0.6%
PZVW Policy with Static Prices 24.9% ± 2.2% 34.6% ± 1.7% 37.7% ± 1.5% 43.6% ± 1.1%
PZVW Policy with Dynamic Prices 16.9% ± 2.4% 17.9% ± 1.5% 17.5% ± 1.5% 19.2% ± 1.1%
Gap from the Proposed Policy
MDP Policy with Static Prices 6.8% ± 1.2% 9.9% ± 0.5% 12.0% ± 0.6% 12.3% ± 0.6%
Two-Parameter Static Pricing Policy 41.0% ± 2.0% 56.2% ± 1.0% 63.6% ± 1.0% 70.0% ± 0.9%
Three-Parameter Static Pricing Policy 11.2% ± 1.5% 16.1% ± 0.9% 19.0% ± 0.8% 22.3% ± 0.8%
One-Price Policy 12.5% ± 1.7% 16.8% ± 1.0% 19.3% ± 1.0% 23.0% ± 0.9%
Two-Price Policy 2.9% ± 1.5% 3.2% ± 0.9% 3.1% ± 0.8% 3.3% ± 0.7%
Three-Price Policy 1.1% ± 1.7% 1.3% ± 0.9% 1.5% ± 0.9% 1.3% ± 0.8%
PZVW Policy with Static Prices 18.0% ± 2.5% 25.5% ± 1.7% 27.1% ± 1.5% 28.1% ± 1.2%
PZVW Policy with Dynamic Prices 10.5% ± 2.6% 9.8% ± 1.5% 8.4% ± 1.5% 6.3% ± 1.2%

Table 5 reports the impact of the total outsourcing cost.

Table 5: Comparative evaluation of the long-run average cost of the policies considered at different total
outsourcing costs. The values followed by ± indicate the 95% confidence interval.

Total Outsourcing Cost


1.05 1.1 1.15 1.25 1.33
Cost of the MDP Policy 0.049 0.049 0.049 0.049 0.049
Optimality gap
Proposed Policy 7.6% ± 0.8% 7.3% ± 0.6% 7.3% ± 0.6% 7.3% ± 0.6% 7.3% ± 0.5%
MDP Policy with Static Prices 12.5% 16.7% 17.8% 17.9% 17.9%
Two-Parameter Static Pricing Policy 32.4% ± 0.5% 50.8% ± 0.8% 59.9% ± 0.8% 66.2% ± 0.8% 67.7% ± 0.8%
Three-Parameter Static Pricing Policy 18.4% ± 0.6% 23.6% ± 0.8% 24.4% ± 0.8% 24.6% ± 0.8% 24.6% ± 0.8%
One-Price Policy 19.0% ± 0.7% 24.1% ± 0.8% 25.1% ± 0.8% 25.3% ± 0.9% 25.3% ± 0.9%
Two-Price Policy 9.8% ± 0.7% 10.3% ± 0.8% 10.7% ± 0.7% 10.7% ± 0.8% 10.8% ± 0.8%
Three-Price Policy 8.7% ± 0.8% 8.8% ± 0.8% 9.0% ± 0.9% 8.9% ± 0.9% 8.8% ± 0.8%
PZVW Policy with Static Prices 28.4% ± 1.5% 33.5% ± 1.6% 34.6% ± 1.6% 34.7% ± 1.5% 34.6% ± 1.7%
PZVW Policy with Dynamic Prices 18.0% ± 1.7% 17.8% ± 1.7% 17.9% ± 1.7% 17.9% ± 1.6% 17.9% ± 1.5%
Gap from the Proposed Policy
MDP Policy with Static Prices 4.6% ± 0.8% 8.7% ± 0.6% 9.7% ± 0.7% 9.9% ± 0.7% 9.9% ± 0.5%
Two-Parameter Static Pricing Policy 23.0% ± 1.0% 40.5% ± 1.1% 48.9% ± 1.2% 54.8% ± 1.2% 56.2% ± 1.0%
Three-Parameter Static Pricing Policy 10.0% ± 1.0% 15.2% ± 1.0% 15.9% ± 1.0% 16.1% ± 1.0% 16.1% ± 0.9%
One-Price Policy 10.6% ± 1.1% 15.6% ± 1.0% 16.6% ± 1.0% 16.7% ± 1.1% 16.8% ± 1.0%
Two-Price Policy 2.0% ± 1.0% 2.8% ± 1.0% 3.1% ± 0.9% 3.1% ± 1.0% 3.2% ± 0.9%
Three-Price Policy 1.0% ± 1.1% 1.4% ± 0.9% 1.6% ± 1.0% 1.5% ± 1.0% 1.3% ± 0.9%
PZVW Policy with Static Prices 19.3% ± 1.7% 24.3% ± 1.6% 25.4% ± 1.6% 25.5% ± 1.6% 25.5% ± 1.7%
PZVW Policy with Dynamic Prices 9.6% ± 1.7% 9.7% ± 1.7% 9.9% ± 1.7% 9.8% ± 1.7% 9.8% ± 1.5%

Table 6 and Figure 6 report the impact of the system parameter n.

Table 6: Comparative evaluation of the long-run average costs at different system parameters.

System Parameter
10 25 50 100 250 500 1000
Cost of the MDP Policy 0.059 0.054 0.052 0.049 0.047 0.046 0.045
Optimality gap
Proposed Policy 11.7% ± 0.3% 11.2% ± 0.4% 9.2% ± 0.5% 7.3% ± 0.5% 5.6% ± 1.2% 4.2% ± 1.9% 2.9% ± 3.3%
MDP Policy with Static Prices 12.5% 15.1% 16.8% 17.9% 19.3% 20.3% 21.2%
Two-Parameter Static Pricing Policy 59.6% ± 0.1% 63.9% ± 0.3% 67.1% ± 0.4% 67.7% ± 0.8% 67.9% ± 1.4% 67.3% ± 1.5% 65.4% ± 2.2%
Three-Parameter Static Pricing Policy 21.8% ± 0.2% 24.1% ± 0.3% 24.4% ± 0.5% 24.6% ± 0.8% 24.0% ± 1.2% 22.9% ± 1.4% 22.0% ± 2.6%
One-Price Policy 24.2% ± 0.2% 26.3% ± 0.4% 25.6% ± 0.5% 25.3% ± 0.9% 24.6% ± 1.3% 24.2% ± 2.0% 23.0% ± 4.4%
Two-Price Policy 14.0% ± 0.2% 13.8% ± 0.3% 12.2% ± 0.6% 10.8% ± 0.8% 8.8% ± 1.5% 7.8% ± 1.4% 6.0% ± 2.4%
Three-Price Policy 12.7% ± 0.2% 12.4% ± 0.4% 10.7% ± 0.6% 8.8% ± 0.9% 6.9% ± 1.5% 5.8% ± 2.4% 4.0% ± 4.0%
PZVW Policy with Static Prices 35.1% ± 0.4% 40.6% ± 0.7% 37.6% ± 0.9% 34.6% ± 1.7% 31.8% ± 2.0% 29.8% ± 3.7% 28.5% ± 7.4%
PZVW Policy with Dynamic Prices 23.8% ± 0.4% 25.9% ± 0.8% 21.4% ± 1.0% 17.9% ± 1.6% 13.2% ± 2.6% 10.0% ± 5.0% 8.1% ± 4.1%
Gap from the Proposed Policy
MDP Policy with Static Prices 0.8% ± 0.2% 3.4% ± 0.4% 6.9% ± 0.5% 9.9% ± 0.5% 12.9% ± 1.3% 15.5% ± 2.1% 17.9% ± 3.8%
Two-Parameter Static Pricing Policy 42.9% ± 0.4% 47.3% ± 0.6% 53.0% ± 0.8% 56.2% ± 1.1% 59.0% ± 2.3% 60.6% ± 3.3% 60.8% ± 5.6%
Three-Parameter Static Pricing Policy 9.1% ± 0.3% 11.6% ± 0.5% 13.9% ± 0.7% 16.1% ± 0.9% 17.4% ± 1.8% 18.0% ± 2.5% 18.6% ± 4.6%
One-Price Policy 11.3% ± 0.3% 13.5% ± 0.5% 15.0% ± 0.7% 16.8% ± 1.0% 18.0% ± 1.9% 19.3% ± 2.9% 19.6% ± 5.8%
Two-Price Policy 2.1% ± 0.3% 2.3% ± 0.5% 2.7% ± 0.7% 3.2% ± 0.9% 3.0% ± 1.9% 3.5% ± 2.3% 3.0% ± 4.0%
Three-Price Policy 0.9% ± 0.3% 1.0% ± 0.5% 1.3% ± 0.7% 1.3% ± 0.9% 1.2% ± 1.8% 1.6% ± 3.0% 1.1% ± 5.1%
PZVW Policy with Static Prices 21.0% ± 0.5% 26.4% ± 0.8% 26.0% ± 1.0% 25.5% ± 1.7% 24.8% ± 2.4% 24.7% ± 4.3% 25.0% ± 8.2%
PZVW Policy with Dynamic Prices 10.8% ± 0.4% 13.2% ± 0.8% 11.2% ± 1.1% 9.8% ± 1.5% 7.2% ± 2.8% 5.6% ± 5.1% 5.1% ± 5.3%

[Figure 6 panels: (a) Gap from the MDP policy (horizontal axis: System Parameter, logarithmic scale; vertical axis: Optimality Gap). (b) Gap from the proposed policy (horizontal axis: System Parameter, logarithmic scale; vertical axis: Gap from the Proposed Policy). Policies plotted: Proposed Policy, MDP Policy with Static Prices, Three-Parameter Static Pricing Policy, One-Price Policy, Two-Price Policy, Three-Price Policy, PZVW Policy with Static Prices, PZVW Policy with Dynamic Prices.]

Figure 6: Impact of the system parameter on the performance of the various policies. The shaded areas
depict the 95% confidence intervals.

Table 7 reports the impact of the production time distribution.

Table 7: Comparative evaluation of the long-run average costs at different coefficients of variation of the
production times. The case of zero coefficient of variation corresponds to deterministic production times.

Coefficient of Variation of Production Times


0 0.5 1 1.5 2
Lognormal
Cost of the Proposed Policy 0.0333 ± 0.0003 0.0387 ± 0.0004 0.0533 ± 0.0004 0.0761 ± 0.0004 0.1045 ± 0.0004
Gap from the Proposed Policy
MDP Policy 0.3% ± 1.6% −4.7% ± 1.2% −6.9% ± 1.0% −5.9% ± 0.7% −4.9% ± 0.5%
MDP Policy with Static Prices 24.6% ± 1.5% 14.5% ± 1.5% 10.3% ± 1.1% 19.2% ± 0.7% 33.2% ± 0.7%
Two-Parameter Static Pricing Policy 36.0% ± 1.7% 41.6% ± 1.5% 56.5% ± 1.4% 70.4% ± 1.0% 79.7% ± 0.7%
Three-Parameter Static Pricing Policy 13.2% ± 1.7% 13.7% ± 1.6% 15.4% ± 1.0% 17.3% ± 0.6% 18.5% ± 0.7%
One-Price Policy 13.5% ± 1.7% 14.2% ± 1.4% 17.1% ± 1.2% 20.2% ± 0.7% 23.1% ± 0.6%
Two-Price Policy 2.7% ± 1.6% 2.6% ± 1.4% 3.0% ± 1.1% 3.7% ± 0.8% 4.4% ± 0.6%
Three-Price Policy 1.2% ± 1.6% 0.8% ± 1.3% 1.3% ± 1.1% 1.7% ± 0.7% 1.8% ± 0.5%
PZVW Policy with Static Prices 13.8% ± 2.4% 17.1% ± 2.2% 27.2% ± 1.9% 36.8% ± 1.3% 49.4% ± 1.1%
PZVW Policy with Dynamic Prices 0.9% ± 2.8% 3.4% ± 2.3% 11.8% ± 2.0% 17.9% ± 1.2% 26.6% ± 0.9%
Gamma
Cost of the Proposed Policy 0.0333 ± 0.0003 0.0385 ± 0.0003 0.0529 ± 0.0003 0.0724 ± 0.0004 0.0991 ± 0.0004
Gap from the Proposed Policy
MDP Policy 0.3% ± 1.6% −4.7% ± 1.3% −7.0% ± 0.9% −5.2% ± 0.8% −1.5% ± 0.6%
MDP Policy with Static Prices 24.6% ± 1.5% 14.5% ± 1.5% 9.9% ± 0.9% 16.2% ± 0.8% 32.8% ± 0.7%
Two-Parameter Static Pricing Policy 36.0% ± 1.7% 41.8% ± 1.6% 56.2% ± 1.0% 65.3% ± 1.1% 78.1% ± 0.8%
Three-Parameter Static Pricing Policy 13.2% ± 1.7% 14.5% ± 1.4% 16.1% ± 0.9% 17.7% ± 0.8% 19.7% ± 0.6%
One-Price Policy 13.5% ± 1.7% 14.3% ± 1.5% 16.8% ± 1.0% 18.1% ± 0.9% 19.8% ± 0.7%
Two-Price Policy 2.7% ± 1.6% 2.6% ± 1.3% 3.2% ± 0.9% 3.5% ± 0.8% 3.4% ± 0.6%
Three-Price Policy 1.2% ± 1.6% 1.3% ± 1.4% 1.3% ± 0.9% 1.8% ± 0.8% 1.3% ± 0.6%
PZVW Policy with Static Prices 13.8% ± 2.4% 17.1% ± 2.3% 25.5% ± 1.7% 36.7% ± 1.4% 47.4% ± 1.1%
PZVW Policy with Dynamic Prices 0.9% ± 2.8% 3.1% ± 2.5% 9.8% ± 1.5% 19.2% ± 1.2% 29.4% ± 1.0%

G Markov Decision Process Formulation

This section describes a Markov decision process (MDP) formulation of the joint dynamic pricing, outsourcing,
and scheduling control problem of the three-product example introduced in Section 7; see e.g., Bertsekas
(2012). This formulation allows the system manager to make new pricing, outsourcing, and scheduling
decisions only at the event times, i.e., whenever an order arrives or a product is manufactured.
The Markov decision process formulation focuses on policies that truncate the inventory of product
k for k = 1, 2, 3 at ±Mk , where Mk ∈ N is a tuning parameter referred to as the inventory truncation
parameter for product k.

State Space. We denote the system state for our MDP formulation by q(t) = (q1 (t), q2 (t), q3 (t)) for
t ≥ 0, where qk (t) denotes the inventory of product k at time t. The state space, denoted by Q, is as follows:

\[
Q = \big\{ (q_1, q_2, q_3) : q_k \in \mathbb{Z} \text{ and } -M_k \le q_k \le M_k \text{ for } k = 1, 2, 3 \big\}.
\]

Note that we restrict qk (t) for k = 1, 2, 3 to integer values between −Mk and Mk for computational tractabil-
ity. As discussed in Appendix H, we use policy iteration to (numerically) solve our MDP. Due to the high
memory requirement of the policy evaluation step and the fact that we only have access to 512GB of memory
(for each computational job), we can only solve MDPs with up to about 1 billion states. Consequently, we
can solve MDPs with M1 = M2 = M3 ≤ 500. This appears to contain the likely states.
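
The size of the state space quoted above follows from a one-line count. The sketch below reproduces it and adds a rough memory figure for a single value vector, assuming 8 bytes per entry (double precision). That byte count is an assumption for illustration only; the actual memory footprint of the policy evaluation step is not described here.

```python
def state_count(M):
    """Number of states in Q: product over k of (2 * M_k + 1)."""
    n = 1
    for m in M:
        n *= 2 * m + 1
    return n


def value_vector_gib(M, bytes_per_entry=8):
    """Approximate size in GiB of one array with an entry per state."""
    return state_count(M) * bytes_per_entry / 2 ** 30


print(state_count((500, 500, 500)))  # 1001 ** 3 states
```

With M1 = M2 = M3 = 500 the count is 1001³ = 1,003,003,001, i.e., about 1 billion states, and one such vector occupies roughly 7.5 GiB under the stated assumption.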

Actions. The system manager makes three decisions dynamically over time: First, she chooses the prices.
We denote the price vector chosen at time t by p(t) = (p1 (t), p2 (t), p3 (t)). The set of admissible price vectors
is $\bar{P} = \mathbb{R}^3_+$.

Second, she decides whether to outsource the next product k order for k = 1, 2, 3. We denote the
outsourcing decision for the next product k order by ok ∈ {0, 1} for k = 1, 2, 3. To be specific, ok = 1
denotes the decision to outsource and ok = 0 denotes the decision not to outsource.

Lastly, she decides which product to manufacture if any. We denote the scheduling (or manufacturing)
decision with s ∈ {0, 1, 2, 3}, where s = 0 denotes the decision to idle, and s = 1, s = 2, and s = 3 denote
the decision to manufacture products 1, 2, and 3, respectively. Combining these, the action space for the
MDP formulation is given as follows:

\[
A = \big\{ (p_1, p_2, p_3, o_1, o_2, o_3, s) : (p_1, p_2, p_3) \in \bar{P},\ o_1, o_2, o_3 \in \{0, 1\},\ s \in \{0, 1, 2, 3\} \big\}.
\]

Letting A(q) denote the set of feasible actions at state q ∈ Q, we have that

\[
A(q) = \big\{ (p_1, p_2, p_3, o_1, o_2, o_3, s) \in A : o_k = 1 \text{ if } q_k = -M_k \text{ for } k = 1, 2, 3,\
s \in \{0, 2, 3\} \text{ if } q_1 = M_1,\ s \in \{0, 1, 3\} \text{ if } q_2 = M_2, \text{ and } s \in \{0, 1, 2\} \text{ if } q_3 = M_3 \big\}.
\]

In words, the constraint ok = 1 when qk = −Mk for k = 1, 2, 3 means that the incoming orders for
product k are outsourced when the inventory of product k is at the lowest allowable value,
i.e., −Mk . The remaining constraints mean that when the inventory of product k for k = 1, 2, 3 is at the
maximum allowable value, i.e., Mk , product k is not manufactured.
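
The feasibility constraints can be made concrete by enumerating the discrete part of an action. The sketch below is illustrative: it leaves the (continuous) price vector out, and the function name and tuple layout (o1, o2, o3, s) are choices made here rather than notation from the paper.

```python
from itertools import product


def feasible_discrete_actions(q, M):
    """Enumerate (o_1, ..., o_K, s) allowed at state q; prices are omitted."""
    K = len(q)
    actions = []
    for o in product((0, 1), repeat=K):
        # Orders must be outsourced when inventory is at its lower bound -M_k.
        if any(q[k] == -M[k] and o[k] != 1 for k in range(K)):
            continue
        for s in range(K + 1):  # s = 0 means idle; s = k means make product k
            # A product at its upper bound M_k cannot be manufactured.
            if s > 0 and q[s - 1] == M[s - 1]:
                continue
            actions.append(o + (s,))
    return actions
```

At an interior state all 2³ × 4 = 32 combinations are feasible. At q = (−M1, 0, M3) only 4 × 3 = 12 remain, since o1 is forced to 1 and s = 3 is excluded.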

A policy is a mapping π : Q → A with π(q) ∈ A(q) for q ∈ Q. That is, we restrict attention to stationary
Markov policies. The system manager’s problem is to choose a policy π that maximizes the long-run average
profit over an infinite planning horizon.

To facilitate the analysis below, we introduce the following notation s(a), pk (a), Λk (a), and ok (a) for
k = 1, 2, 3 and a ∈ A. These describe the scheduling decision, price, demand rate, and outsourcing decision
for product k, respectively, under action a ∈ A. We also let ek ∈ R3 for k = 1, 2, 3 denote the unit vector
whose k-th component is equal to 1 and the other components are zero, e0 = (0, 0, 0), µ0 = 0, and δ0 = 0.

Uniformization. As done usually in the dynamic programming literature, we use the uniformization
technique to write down the Bellman equation. To that end, we let
\[
\Psi = \sup_{(p_1, p_2, p_3) \in \bar{P}} \sum_{k=1}^{3} \Lambda_k(p_1, p_2, p_3) + \max\{\mu_1, \mu_2, \mu_3\} < \infty,
\]

which serves as an upper bound on the transition rates. Note that the time between two consecutive events
is exponentially distributed with rate Ψ. An action a ∈ A(q) is chosen after each event, where q ∈ Q denotes
the state after the event. The action is not updated until the next event. Under action a ∈ A(q), the next
event corresponds to the arrival of an order for product k for k = 1, 2, 3 with probability Λk (a)/Ψ, the
manufacturing of product s(a) if s(a) ∈ {1, 2, 3} with probability µs(a) /Ψ, and a fictitious transition with
probability (Ψ − µs(a) − Σ_{k=1}^{3} Λk (a))/Ψ, under which the system state remains the same.

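Transition Probabilities. In the uniformized system, the probability of transitioning from state q ∈ Q
to state q′ ∈ Q at the next event under action a ∈ A(q) is given by

\[
P_{qq'}(a) =
\begin{cases}
\Lambda_1(a)\,\mathbf{1}_{\{o_1(a)=0\}}/\Psi, & \text{if } q' = q - e_1 \in Q, \\
\Lambda_2(a)\,\mathbf{1}_{\{o_2(a)=0\}}/\Psi, & \text{if } q' = q - e_2 \in Q, \\
\Lambda_3(a)\,\mathbf{1}_{\{o_3(a)=0\}}/\Psi, & \text{if } q' = q - e_3 \in Q, \\
\mu_{s(a)}/\Psi, & \text{if } q' = q + e_{s(a)} \in Q \text{ and } s(a) \in \{1, 2, 3\}, \\
\bigl(\Psi - \mu_{s(a)} - \sum_{k=1}^{3} \Lambda_k(a)\,\mathbf{1}_{\{o_k(a)=0\}}\bigr)/\Psi, & \text{if } q' = q, \\
0, & \text{otherwise,}
\end{cases}
\tag{96}
\]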

where 1{·} denotes the indicator function.
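
As an illustration of (96), the following Python sketch enumerates the uniformized transition probabilities for a single state-action pair. It is not the authors' implementation. The function names, the box-shaped state space Q = {−Mk , . . . , Mk } for each product, the multinomial logit demand of the form (105), and the choice Ψ = 130 in the example (any valid upper bound on the total event rate would do) are assumptions made for the example.

```python
import math

def mnl_demand(prices, a, d, scale=100.0):
    """Multinomial logit demand rates, following the form of (105)."""
    weights = [math.exp(ak - dk * pk) for ak, dk, pk in zip(a, d, prices)]
    denom = 1.0 + sum(weights)
    return [scale * w / denom for w in weights]

def transition_probs(q, prices, outsource, s, M, a, d, mu, psi):
    """Map each reachable next state to its probability under (96).

    q: tuple of inventory levels; outsource: tuple of 0/1 flags;
    s: 0 (idle) or the index 1..3 of the product being manufactured.
    Moves that would leave the assumed box Q are folded into the self-loop.
    """
    lam = mnl_demand(prices, a, d)
    probs = {}
    leftover = psi  # rate mass not yet assigned to a real transition
    for k in range(3):
        if outsource[k] == 0 and q[k] - 1 >= -M[k]:
            nxt = tuple(x - 1 if j == k else x for j, x in enumerate(q))
            probs[nxt] = lam[k] / psi  # in-house order: inventory of k drops by one
            leftover -= lam[k]
    if s in (1, 2, 3) and q[s - 1] + 1 <= M[s - 1]:
        nxt = tuple(x + 1 if j == s - 1 else x for j, x in enumerate(q))
        probs[nxt] = mu[s - 1] / psi  # production completion of product s
        leftover -= mu[s - 1]
    probs[tuple(q)] = leftover / psi  # fictitious self-transition
    return probs

# Hypothetical parameters: equal products, product 2 orders outsourced, product 2 in production.
example = transition_probs(
    q=(0, 0, 0), prices=(1.0, 1.0, 1.0), outsource=(0, 1, 0), s=2,
    M=(5, 5, 5), a=(1.0, 1.0, 1.0), d=(1.0, 1.0, 1.0),
    mu=(30.0, 30.0, 30.0), psi=130.0,
)
```

In this example each demand rate equals 25, so the two in-house arrival transitions each carry probability 25/130, the production transition 30/130, and the self-loop the remaining 50/130.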

Rewards. The (one-stage) expected reward, i.e., the expected reward until the next event, at state q ∈ Q
under action a ∈ A(q) is given by
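\[
R(q, a) = -\sum_{k=1}^{3} \frac{\upsilon_k(q_k)}{\Psi} + \sum_{k=1}^{3} \frac{\Lambda_k(a)}{\Psi} \Bigl( p_k(a) - (\nu_k + \delta_k)\,\mathbf{1}_{\{o_k(a)=1\}} \Bigr) - \frac{\mu_{s(a)}}{\Psi}\,\delta_{s(a)}\,\mathbf{1}_{\{s(a) \in \{1,2,3\}\}}
\tag{97}
\]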

for q ∈ Q and a ∈ A(q). The first term on the right-hand side of Equation (97) denotes the expected holding
and backorder cost, the second term denotes the expected revenue, and the last term denotes the expected
manufacturing cost.

Bellman Equation. Next, we introduce the Bellman equation, which provides a means for characterizing
an optimal policy:
\[
\gamma + f(q) = \sup_{a \in A(q)} \Bigl\{ R(q, a) + \sum_{q' \in Q} P_{qq'}(a) f(q') \Bigr\}, \quad q \in Q.
\tag{98}
\]

Here, one interprets γ as a guess at the maximum long-run average profit. The function f : Q → R is often
called a relative value function in average-cost dynamic programming. It is easy to see that the relative
value function can only be determined up to an additive constant, even if γ is treated as a known constant.
Therefore, we set f (0, 0, 0) = 0.

H Solving the Bellman Equation

This section describes the policy iteration algorithm for solving the Bellman equation (98). In order to
initialize the policy iteration algorithm, we choose policy π0 that makes the following pricing, outsourcing,
and scheduling decisions: First, it uses the price vector Λ^{−1}(λ*) in all states, i.e.,

\[
p_k(\pi_0(q)) = \Lambda_k^{-1}(\lambda^{*}), \quad \text{for } k = 1, 2, 3 \text{ and } q \in Q.
\]

Second, it outsources product k orders for k = 1, 2, 3 only when qk = −Mk , i.e., for k = 1, 2, 3 and q ∈ Q,

\[
o_k(\pi_0(q)) =
\begin{cases}
1, & \text{if } q_k = -M_k, \\
0, & \text{otherwise.}
\end{cases}
\]

Third, it manufactures the product with the largest backlog among the products that have a backlog. If no
product has a backlog, it does not manufacture (any products). In other words, for q ∈ Q,

\[
s(\pi_0(q)) =
\begin{cases}
\operatorname*{argmin}_{k \in \{1,2,3\}} q_k, & \text{if } q_k < 0 \text{ for some } k \in \{1, 2, 3\}, \\
0, & \text{otherwise.}
\end{cases}
\]

Starting with π0 , we iteratively derive πi for i ∈ N as follows:

Policy Evaluation. Given policy πi−1 : Q → A, we solve for the function fi−1 : Q → R and scalar γi−1
that satisfy
\[
\gamma_{i-1} + f_{i-1}(q) = R(q, \pi_{i-1}(q)) + \sum_{q' \in Q} P_{qq'}(\pi_{i-1}(q))\, f_{i-1}(q'), \quad q \in Q
\tag{99}
\]

and fi−1 (0, 0, 0) = 0. Equation (99) corresponds to a sparse system of linear equations.7 We use the
BiCGSTAB iterative sparse solver of the C++ library Eigen to solve (99).8
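
To make the policy evaluation step concrete, the following Python sketch solves a system of the form (99) for a small generic chain. It is only an illustration under stated assumptions: it uses a dense NumPy solve rather than the sparse BiCGSTAB solver of Eigen mentioned above, the function name `evaluate_policy` and the two-state example are hypothetical, and the chain induced by the policy is assumed to be unichain so that the system has a unique solution once one value is pinned to zero.

```python
import numpy as np

def evaluate_policy(P, R, ref=0):
    """Solve gamma + f = R + P f with f[ref] = 0 for a fixed policy.

    P: (n, n) transition matrix under the policy; R: (n,) one-stage rewards.
    Returns (gamma, f).
    """
    P = np.asarray(P, dtype=float)
    R = np.asarray(R, dtype=float)
    n = len(R)
    A = np.eye(n) - P
    # f[ref] is pinned to zero, so its column is free to carry the unknown gamma.
    A[:, ref] = 1.0
    x = np.linalg.solve(A, R)
    gamma = x[ref]
    f = x.copy()
    f[ref] = 0.0
    return gamma, f

# Hypothetical two-state chain that jumps to either state with probability 1/2.
gamma, f = evaluate_policy([[0.5, 0.5], [0.5, 0.5]], [1.0, 3.0])
```

For the two-state example the average reward is 2 and the relative values are (0, 2). For state spaces of the size discussed in Appendix G a dense solve is infeasible, which is why an iterative sparse solver is needed there.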

Policy Improvement. Given fi−1 : Q → R, we define the updated policy πi : Q → A as follows:


\[
\pi_i(q) = \operatorname*{argmax}_{a \in A(q)} \Bigl\{ R(q, a) + \sum_{q' \in Q} P_{qq'}(a) f_{i-1}(q') \Bigr\}, \quad q \in Q.
\tag{100}
\]

Termination. The algorithm terminates when the actions under πi are sufficiently close to those under
πi−1 for some i ∈ N. To be specific, given an error threshold ε > 0, we repeat the policy evaluation and
policy improvement steps until for k = 1, 2, 3, q ∈ Q, and some i ∈ N, we have

|pk (πi (q)) − pk (πi−1 (q))| ≤ ε, ok (πi (q)) = ok (πi−1 (q)), and s(πi (q)) = s(πi−1 (q)).

The characterization of the updated policy πi in (100) involves solving a multi-dimensional optimization
problem. To facilitate the computation of the updated policy, we next provide a simpler characterization
that follows from a decomposition of the scheduling, outsourcing, and pricing decisions in Equation (100).
To be specific, in Lemma 8, we show that the updated scheduling decisions in step i for i ∈ N can be made
independently of the other decisions and solely based on fi−1 . Similarly, in Lemma 9, we show that the updated
outsourcing decisions in step i for i ∈ N can be made independently of the other decisions and solely based
on fi−1 . Finally, in Lemma 10, we characterize the updated pricing decisions using the updated outsourcing
decisions. Lemma 10 shows that by using a relationship between the optimal prices of a multinomial logit
model, the three-dimensional pricing problem can be transformed into a one-dimensional problem in terms
of the price for product 1. The updated prices for products 2 and 3 then follow from the aforementioned
relationship and the updated price for product 1.

7 The coefficient matrix of this system of linear equations is an (8M1 M2 M3 + 1) × (8M1 M2 M3 + 1) matrix. Each
row of this coefficient matrix has fewer than 7 non-zero elements. For M1 = M2 = M3 = 500, this corresponds to
fewer than 7 non-zero elements out of every 1 billion elements.
8 Eigen is available at http://eigen.tuxfamily.org (accessed on July 1st, 2021).

To simplify the analysis to follow, let us write out the definition of the updated policy πi . To be specific,
let us substitute (96) and (97) into (100) to obtain

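\[
\begin{aligned}
\pi_i(q) = \operatorname*{argmax}_{(p_1, p_2, p_3, o_1, o_2, o_3, s) \in A(q)} \Bigl\{ & \frac{\Lambda_1(p_1, p_2, p_3)}{\Psi} \Bigl( p_1 - (\nu_1 + \delta_1)\mathbf{1}_{\{o_1=1\}} + \bigl(f_{i-1}(q - e_1) - f_{i-1}(q)\bigr)\mathbf{1}_{\{o_1=0\}} \Bigr) \\
& + \frac{\Lambda_2(p_1, p_2, p_3)}{\Psi} \Bigl( p_2 - (\nu_2 + \delta_2)\mathbf{1}_{\{o_2=1\}} + \bigl(f_{i-1}(q - e_2) - f_{i-1}(q)\bigr)\mathbf{1}_{\{o_2=0\}} \Bigr) \\
& + \frac{\Lambda_3(p_1, p_2, p_3)}{\Psi} \Bigl( p_3 - (\nu_3 + \delta_3)\mathbf{1}_{\{o_3=1\}} + \bigl(f_{i-1}(q - e_3) - f_{i-1}(q)\bigr)\mathbf{1}_{\{o_3=0\}} \Bigr) \\
& + \frac{\mu_s}{\Psi} \Bigl( -\delta_s + \bigl(f_{i-1}(q + e_s) - f_{i-1}(q)\bigr) \Bigr) \mathbf{1}_{\{s \in \{1,2,3\}\}} \Bigr\}, \quad q \in Q.
\end{aligned}
\tag{101}
\]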
This explicit characterization facilitates the decomposition of the scheduling, outsourcing, and pricing deci-
sions used in Lemmas 8-10.

Lemma 8. The updated scheduling decision at state q ∈ Q in step i ∈ N is given by


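\[
s(\pi_i(q)) = \operatorname*{argmax}_{s \in \{0\} \cup \{k : q_k < M_k,\ k = 1, 2, 3\}} \frac{\mu_s}{\Psi} \Bigl( -\delta_s + \bigl(f_{i-1}(q + e_s) - f_{i-1}(q)\bigr) \Bigr) \mathbf{1}_{\{s \in \{1,2,3\}\}}.
\]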

Proof. The scheduling decision s at state q ∈ Q only impacts the last term on the right-hand side of (101).
In fact, the last term on the right-hand side of (101) only depends on the scheduling decision s. Therefore,
the updated scheduling decision can be characterized independently of the other decisions and solely based
on the last term on the right-hand side of (101), which yields the result.

Lemma 9. The updated outsourcing decision for product k for k = 1, 2, 3 at state q ∈ Q in step i ∈ N is
given by

\[
o_k(\pi_i(q)) =
\begin{cases}
0, & \text{if } f_{i-1}(q - e_k) - f_{i-1}(q) > -(\nu_k + \delta_k) \text{ and } q_k > -M_k, \\
1, & \text{otherwise.}
\end{cases}
\tag{102}
\]

Proof. First, we assume the (updated) prices (p1 , p2 , p3 ) ∈ P̄ are given. Then, we show that the updated
outsourcing decisions do not depend on the updated prices and obtain a simple characterization of the
outsourcing decisions. The outsourcing decision for product k for k = 1, 2, 3 only impacts the k-th term on
the right-hand side of (101). Therefore, given (p1 , p2 , p3 ) ∈ P̄, the updated outsourcing decision for product
k for k = 1, 2, 3 at state q ∈ Q satisfies
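\[
o_k(\pi_i(q)) = \operatorname*{argmax}_{o_k \in O_k(q)} \frac{\Lambda_k(p_1, p_2, p_3)}{\Psi} \Bigl( p_k - (\nu_k + \delta_k)\mathbf{1}_{\{o_k=1\}} + \bigl(f_{i-1}(q - e_k) - f_{i-1}(q)\bigr)\mathbf{1}_{\{o_k=0\}} \Bigr),
\tag{103}
\]

where

\[
O_k(q) =
\begin{cases}
\{0, 1\}, & \text{if } q_k > -M_k, \\
\{1\}, & \text{otherwise,}
\end{cases}
\]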

denotes the set of feasible outsourcing decisions for product k for k = 1, 2, 3 at state q ∈ Q. Since
Λk (p1 , p2 , p3 ) ≥ 0 for (p1 , p2 , p3 ) ∈ P̄, it follows from (103) that for k = 1, 2, 3 and q ∈ Q,

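\[
\begin{aligned}
o_k(\pi_i(q)) &= \operatorname*{argmax}_{o_k \in O_k(q)} \Bigl\{ p_k - (\nu_k + \delta_k)\mathbf{1}_{\{o_k=1\}} + \bigl(f_{i-1}(q - e_k) - f_{i-1}(q)\bigr)\mathbf{1}_{\{o_k=0\}} \Bigr\} \\
&= \operatorname*{argmax}_{o_k \in O_k(q)} \Bigl\{ -(\nu_k + \delta_k)\mathbf{1}_{\{o_k=1\}} + \bigl(f_{i-1}(q - e_k) - f_{i-1}(q)\bigr)\mathbf{1}_{\{o_k=0\}} \Bigr\}.
\end{aligned}
\tag{104}
\]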

It is evident from (104) that the updated outsourcing decisions do not depend on the prices (p1 , p2 , p3 ) ∈ P̄
and they are given by (102).
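
The decomposition in Lemmas 8 and 9 can be sketched in a few lines of Python. This is an illustrative reading of the two rules, not the authors' code: the relative value function is passed as a hypothetical callable `f` on state tuples, ties in the scheduling rule are broken in favor of idling (an assumption, since the source does not specify a tie-breaking rule), and all parameter values in the example are made up.

```python
def shift(q, k, step):
    """Return state q with component k (0-based) moved by step."""
    return tuple(x + step if j == k else x for j, x in enumerate(q))

def improved_outsourcing(f, q, k, M, nu, delta):
    """Rule (102): outsource unless accepting the order in-house is strictly better."""
    if q[k] <= -M[k]:
        return 1  # forced outsourcing at the lower inventory limit
    gain_in_house = f(shift(q, k, -1)) - f(q)
    return 0 if gain_in_house > -(nu[k] + delta[k]) else 1

def improved_scheduling(f, q, M, mu, delta, psi):
    """Lemma 8: pick s maximizing (mu_s / Psi) * (-delta_s + f(q + e_s) - f(q)); s = 0 idles."""
    best_s, best_val = 0, 0.0  # idling has value zero
    for k in range(3):
        if q[k] < M[k]:
            val = mu[k] / psi * (-delta[k] + f(shift(q, k, +1)) - f(q))
            if val > best_val:
                best_s, best_val = k + 1, val
    return best_s

# Hypothetical relative value function that penalizes inventory imbalance.
f_toy = lambda q: -float(sum(x * x for x in q))
q0 = (-2, 0, 1)
s_new = improved_scheduling(f_toy, q0, M=(5, 5, 5), mu=(1.0, 1.0, 1.0),
                            delta=(0.5, 0.5, 0.5), psi=10.0)
o_new = tuple(improved_outsourcing(f_toy, q0, k, M=(5, 5, 5),
                                   nu=(2.0, 2.0, 2.0), delta=(0.5, 0.5, 0.5))
              for k in range(3))
```

With this toy value function the rules manufacture product 1, which has the only backlog, and outsource only product 1 orders, since deepening that backlog is costlier than the outsourcing cost.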

Lemma 10 provides a simple characterization of the updated pricing decisions for the multinomial logit
model discussed in Section 7. To be specific, Lemma 10 assumes
\[
\Lambda_k(p_1, p_2, p_3) = 100\, \frac{\exp(a_k - d_k p_k)}{1 + \sum_{i=1}^{3} \exp(a_i - d_i p_i)}, \quad k = 1, 2, 3, \text{ and } (p_1, p_2, p_3) \in \bar{P},
\tag{105}
\]

where ak and dk for k = 1, 2, 3 are given constants. The (updated) pricing decisions at state q ∈ Q only
impact the first three terms on the right-hand side of (101). Therefore, given the updated outsourcing
decisions, the updated pricing decisions at state q ∈ Q are given by
\[
\operatorname*{argmax}_{(p_1, p_2, p_3) \in \bar{P}} \sum_{k=1}^{3} \Lambda_k(p_1, p_2, p_3) \bigl( p_k - z_k(q) \bigr),
\tag{106}
\]

where

zk (q) = (νk + δk )1{ok (πi (q))=1} − (fi−1 (q − ek ) − fi−1 (q))1{ok (πi (q))=0} , k = 1, 2, 3 and q ∈ Q. (107)

If the next product k order is outsourced, zk (q) is equal to the total outsourcing cost for product k. Otherwise,
zk (q) is equal to the change in the relative value function as a result of receiving a product k order. Given
the state q ∈ Q and price of product 1, p1 ∈ R+ , we define the auxiliary price functions p̃2 and p̃3 as
\[
\tilde{p}_2(q, p_1) = z_2(q) + \frac{1}{d_2} + p_1 - z_1(q) - \frac{1}{d_1} \quad \text{and} \quad \tilde{p}_3(q, p_1) = z_3(q) + \frac{1}{d_3} + p_1 - z_1(q) - \frac{1}{d_1}.
\tag{108}
\]
As shown in Lemma 10, p̃2 (q, p1 ) and p̃3 (q, p1 ) characterize the updated price for products 2 and 3 given the
state q ∈ Q and the updated price for product 1, p1 ∈ R+ .

Lemma 10. Assume Λk for k = 1, 2, 3 is given by (105). The updated pricing decision for product 1 at
state q ∈ Q in step i ∈ N is given by

\[
p_1(\pi_i(q)) = \operatorname*{argmax}_{p_1 \in \mathbb{R}_+} \frac{\exp(a_1 - d_1 p_1)\bigl(p_1 - z_1(q)\bigr) + \sum_{k=2}^{3} \exp\bigl(a_k - d_k \tilde{p}_k(q, p_1)\bigr)\bigl(\tilde{p}_k(q, p_1) - z_k(q)\bigr)}{1 + \exp(a_1 - d_1 p_1) + \sum_{k=2}^{3} \exp\bigl(a_k - d_k \tilde{p}_k(q, p_1)\bigr)}.
\tag{109}
\]

Given p1 (πi (q)), the updated pricing decisions for products 2 and 3 at state q ∈ Q in step i ∈ N are given by

\[
p_2(\pi_i(q)) = \tilde{p}_2\bigl(q, p_1(\pi_i(q))\bigr) \quad \text{and} \quad p_3(\pi_i(q)) = \tilde{p}_3\bigl(q, p_1(\pi_i(q))\bigr).
\]

Proof. We use a relationship between the optimal prices in a multinomial logit model to transform the
three-dimensional optimization problem (106) into a one-dimensional optimization problem. To be specific,
it follows from Gallego and Topaloglu (2019, Section 8.6.2) that the optimal solution (p1*(q), p2*(q), p3*(q)) to
(106) satisfies

\[
p_k^{*}(q) = z_k(q) + \frac{1}{d_k} + \theta(q), \quad k = 1, 2, 3 \text{ and } q \in Q
\tag{110}
\]

for some θ : Q → R. It follows from (110) for k = 1 that

\[
\theta(q) = p_1^{*}(q) - z_1(q) - \frac{1}{d_1}, \quad q \in Q.
\tag{111}
\]

Substituting (111) into (110) for k = 2, 3 gives

\[
p_k^{*}(q) = z_k(q) + \frac{1}{d_k} + p_1^{*}(q) - z_1(q) - \frac{1}{d_1}, \quad k = 2, 3 \text{ and } q \in Q.
\tag{112}
\]

It follows from (105) and (112) that (106) has the same optimal solution as

\[
\begin{aligned}
\operatorname*{argmax}_{(p_1, p_2, p_3) \in \bar{P}} \quad & \sum_{k=1}^{3} \frac{\exp(a_k - d_k p_k)}{1 + \sum_{i=1}^{3} \exp(a_i - d_i p_i)} \bigl( p_k - z_k(q) \bigr) \\
\text{subject to} \quad & p_2 = z_2(q) + \frac{1}{d_2} + p_1 - z_1(q) - \frac{1}{d_1}, \\
& p_3 = z_3(q) + \frac{1}{d_3} + p_1 - z_1(q) - \frac{1}{d_1}.
\end{aligned}
\tag{113}
\]

Substituting the two constraints of (113) into its objective function gives (109). The updated prices for
products 2 and 3 follow from (108) and (112).

We use the BFGS solver of the Python package SciPy to solve the one-dimensional nonlinear optimization
problem (109).
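
A minimal sketch of this one-dimensional search, assuming SciPy's `scipy.optimize.minimize` with `method="BFGS"`, is given below. The function name `updated_prices`, the starting point, and the numerical values of z, a, and d are hypothetical. The sketch also ignores the constraint p1 ≥ 0, which is harmless only when the unconstrained maximizer is positive, as it is in the example.

```python
import numpy as np
from scipy.optimize import minimize

def updated_prices(z, a, d, p1_start=1.0):
    """Maximize the ratio in (109) over p1; recover p2 and p3 from (108)."""
    z, a, d = (np.asarray(v, dtype=float) for v in (z, a, d))

    def prices_from_p1(p1):
        theta = p1 - z[0] - 1.0 / d[0]   # common term theta(q) from (111)
        return z + 1.0 / d + theta       # component 0 equals p1; others follow (108)

    def neg_objective(x):
        p = prices_from_p1(x[0])
        w = np.exp(a - d * p)
        return -np.sum(w * (p - z)) / (1.0 + np.sum(w))  # negative of the ratio in (109)

    res = minimize(neg_objective, x0=[p1_start], method="BFGS")
    return prices_from_p1(res.x[0]), -res.fun

# Hypothetical inputs for one state q.
z_demo = np.array([1.0, 0.5, 2.0])
a_demo = np.array([2.0, 1.5, 2.5])
d_demo = np.array([1.0, 0.8, 1.2])
prices_demo, value_demo = updated_prices(z_demo, a_demo, d_demo)
```

One way to check the output is the first-order condition of (106): setting the derivative with respect to each price to zero shows that, at an interior optimum, the common term θ(q) in (110) equals the optimal value of the ratio in (109). The returned prices should therefore satisfy p1 − z1 − 1/d1 ≈ `value_demo`.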

I A Gradient Descent Algorithm

This section describes a gradient descent algorithm. We use this algorithm to find the best static price vector
for the MDP policy with static prices and the best drift rates for the N -price policy; see Barzilai and Borwein
(1988) and Boyd and Vandenberghe (2004, Section 9.3) for further discussion of gradient descent algorithms.
We describe the gradient descent algorithm using the MDP policy with static prices. Let f : R^3_+ → R
denote the long-run average cost of the MDP policy with static price vector p ∈ R^3_+ . We compute f (p) by
(numerically) solving the MDP policy with the set of admissible prices P = {p}; see Appendices G-H. We
compute ∇f (p) using finite differences. To be specific, we fix a small δ ∈ R+ , and approximate the k-th
component of ∇f (p) for k = 1, 2, 3 with (f (p + δek ) − f (p − δek ))/(2δ). We initialize the algorithm with the
static price vector p0 = Λ^{−1}(λ*) and step size γ0 = 0.01. Starting with p0 and γ0 , we iteratively compute pi
and γi for i ∈ N as follows:

Update. Given pi−1 and γi−1 , we define the updated static price vector pi and step size γi as follows:

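\[
p_i = \max\bigl( p_{i-1} - \gamma_{i-1} \nabla f(p_{i-1}),\, 0 \bigr) \quad \text{and} \quad \gamma_i = \frac{\bigl| (p_i - p_{i-1})' \bigl( \nabla f(p_i) - \nabla f(p_{i-1}) \bigr) \bigr|}{\| \nabla f(p_i) - \nabla f(p_{i-1}) \|^2}.
\]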

The maximum in the first equality ensures that the updated static price vector is non-negative. In all
iterations of our numerical experiments in Section 8, the updated price vector is (strictly) positive.

Termination. The algorithm terminates when the gradient of the long-run average cost under pi is
sufficiently small. To be specific, given an error threshold ε > 0, we repeat the update step until for some
i ∈ N, we have ‖∇f (pi )‖ ≤ ε, where ‖ · ‖ denotes the Euclidean norm.9
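
The update and termination rules above can be sketched as follows. This is an illustrative implementation, not the authors' code: the objective in the example is a hypothetical convex quadratic standing in for the long-run average cost (which in the paper requires solving an MDP per evaluation), and the function names, the finite-difference width, the iteration cap, and the guard against a zero denominator are assumptions.

```python
import numpy as np

def central_gradient(f, p, delta=1e-4):
    """Approximate each gradient component by (f(p + delta e_k) - f(p - delta e_k)) / (2 delta)."""
    p = np.asarray(p, dtype=float)
    grad = np.zeros_like(p)
    for k in range(p.size):
        e = np.zeros_like(p)
        e[k] = delta
        grad[k] = (f(p + e) - f(p - e)) / (2.0 * delta)
    return grad

def bb_gradient_descent(f, p0, step0=0.01, eps=1e-6, max_iter=1000):
    """Projected gradient descent with the step-size rule stated in the Update step."""
    p = np.asarray(p0, dtype=float)
    step = step0
    grad = central_gradient(f, p)
    for _ in range(max_iter):
        if np.linalg.norm(grad) <= eps:
            break  # termination: Euclidean norm of the gradient below the threshold
        p_new = np.maximum(p - step * grad, 0.0)  # keep prices non-negative
        grad_new = central_gradient(f, p_new)
        dg = grad_new - grad
        denom = dg @ dg
        if denom > 0.0:  # assumption: keep the previous step if the gradient did not change
            step = abs((p_new - p) @ dg) / denom
        p, grad = p_new, grad_new
    return p

# Hypothetical stand-in objective with minimizer (20, 30, 40).
target = np.array([20.0, 30.0, 40.0])
weights = np.array([1.0, 2.0, 3.0])
cost = lambda p: 0.5 * float(np.sum(weights * (p - target) ** 2))
p_best = bb_gradient_descent(cost, p0=[19.0, 29.0, 39.0])
```

Each gradient evaluation costs two objective evaluations per coordinate, so in the paper's setting every iteration requires six MDP solves; this is the main reason a step-size rule that avoids line searches is attractive.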

J The Equivalent Workload Formulation with a Finite Number of Admissible Drift Rates

In this section, we consider an equivalent workload formulation, in which the (effective) drift rate control
process θ can only take values in a given finite set. To be specific, we consider the following formulation:
Choose the adapted control (L, U, θ) to
\[
\text{minimize} \quad \limsup_{t \to \infty} \frac{1}{t}\, E\Bigl[ \int_0^t c(\theta(s))\,ds + \int_0^t h(W(s))\,ds + \kappa L(t) \Bigr]
\tag{114}
\]

subject to
\[
W(t) = B(t) - \int_0^t \theta(s)\,ds - U(t) + L(t),
\tag{115}
\]

L, U are nondecreasing with L(0) = U (0) = 0, (116)

θ(t) ∈ Θ, (117)

where B is a Brownian motion with infinitesimal drift µ and infinitesimal variance σ², and Θ ⊂ R− is a
given finite set. The EWF (114)-(117) is identical to the EWF (29)-(31) except for constraint (117), which
ensures that the (effective) drift rate control process θ only takes values in the set Θ. We refer to Θ as the
set of admissible drift rates. Letting N denote the number of elements in Θ, we denote the elements of Θ
by θ1 , . . . , θN . Without loss of generality, we assume θ1 < θ2 < . . . < θN ≤ 0. We call the adapted control
(L, U, θ) admissible if it satisfies (115)-(117) and

\[
\limsup_{t \to \infty} \frac{E[|W(t)|]}{t} = 0.
\tag{118}
\]

9 In all our numerical experiments in Section 8, the gradient descent algorithm terminated in fewer than 30 iterations.

Similar to Section 6, we restrict our attention to stationary Markov control policies and write θ(W (t))
as opposed to θ(t). The optimal outsourcing and idling policies we derive can be viewed as a barrier policy;
see Section 6.1 for the definition of a barrier policy. Consider γ ∈ R and f ∈ C²([l, u]) and assume that they
jointly satisfy

Γθ f (w) + c(θ(w)) + h(w) = γ, w ∈ (l, u), (119)

subject to the boundary conditions

f′(l) = −κ and f′(u) = 0. (120)

Proposition 2 in Section 6.1 shows that γ is the long-run average cost of the barrier policy (L, U, θ) with a
lower barrier at l and an upper barrier at u.

J.1 Bellman Equation

Proposition 2 and the smooth pasting arguments (see e.g., Harrison (2013, Section 7.7)) motivate the
following Bellman equation: find l, u, γ ∈ R and f ∈ C²([l, u]) that satisfy

\[
\min_{x \in \Theta} \Bigl\{ \tfrac{1}{2}\sigma^2 f''(w) + \mu f'(w) - x f'(w) + c(x) + h(w) \Bigr\} = \gamma, \quad w \in (l, u),
\tag{121}
\]

subject to the boundary conditions

f′(l) = −κ and f′(u) = 0, (122)

and the smooth pasting conditions

f″(l) = 0 and f″(u) = 0. (123)

Motivated by Proposition 2, we interpret γ as a guess at the minimum average cost and interpret l and u as
the lower and upper reflecting barriers to be imposed on the workload process. The unknown function f is
often called the relative value function in average cost dynamic programming.

The Bellman equation is introduced primarily to motivate our solution approach; the properties of the
Bellman equation that we require will be proved from first principles. We shall develop an explicit solution
(l, u, γ, f ) of the Bellman equation, and define the candidate policy as the one that chooses in each state w
the (effective) drift rate θ(w) equal to the minimizer x in (121). Then, we will prove that this candidate
policy is optimal.

[Figure 7: An illustrative φ with N = 3.]

[Figure 8: An illustrative ψ with N = 3.]

As a preliminary to analyzing (121)-(123), following Ata et al. (2005, Section 2), define

\[
\phi(y) = \sup_{x \in \Theta} \bigl\{ yx - c(x) \bigr\}, \quad y \in \mathbb{R}.
\tag{124}
\]

It is straightforward to see the following: First, the supremum in (124) is finite for all y ∈ R, and second,
there exists a smallest x ∈ Θ that achieves the supremum. Hereafter, that smallest maximizer will be denoted
by ψ(y). That is,

\[
\psi(y) = \min \operatorname*{argmax}_{x \in \Theta} \bigl\{ yx - c(x) \bigr\}, \quad y \in \mathbb{R}.
\tag{125}
\]

Also, for i = 1, . . . , N − 1, let

\[
\tau_i = \frac{\theta_i + \theta_{i+1}}{m' H^{-1} m}.
\tag{126}
\]
It follows from θ1 < θ2 < . . . < θN ≤ 0 that τ1 < τ2 < . . . < τN −1 < 0. Lemma 11 provides a closed-form
expression for φ and ψ in terms of τ1 , . . . , τN −1 ; see Figures 7 and 8 for an illustration of φ and ψ when
N = 3 and Ata et al. (2021b, Appendix E) for a similar characterization.

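Lemma 11. For N = 1 and y ∈ R, we have ψ(y) = θ1 and φ(y) = θ1 y − c(θ1 ). For N > 1 and y ∈ R, we
have

\[
\psi(y) =
\begin{cases}
\theta_1, & \text{if } y \le \tau_1, \\
\theta_i, & \text{if } \tau_{i-1} < y \le \tau_i,\ i = 2, \ldots, N - 1, \\
\theta_N, & \text{if } \tau_{N-1} < y,
\end{cases}
\tag{127}
\]

and

\[
\phi(y) =
\begin{cases}
\theta_1 y - c(\theta_1), & \text{if } y \le \tau_1, \\
\theta_i y - c(\theta_i), & \text{if } \tau_{i-1} < y \le \tau_i,\ i = 2, \ldots, N - 1, \\
\theta_N y - c(\theta_N), & \text{if } \tau_{N-1} < y.
\end{cases}
\tag{128}
\]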

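Proof. The proof for N = 1 is straightforward. To show the desired result for N > 1, we fix i ∈ {1, . . . , N }
and y ∈ R. We have θi ∈ argmax_{x∈Θ} {yx − c(x)} if and only if

\[
y\theta_i - \frac{\theta_i^2}{m' H^{-1} m} \ge y\theta_j - \frac{\theta_j^2}{m' H^{-1} m}, \quad j = 1, \ldots, N,\ j \ne i.
\tag{129}
\]

Rearranging the terms in (129) gives

\[
y(\theta_i - \theta_j) \ge \frac{\theta_i^2 - \theta_j^2}{m' H^{-1} m}, \quad j = 1, \ldots, N,\ j \ne i,
\tag{130}
\]

which is equivalent to

\[
y \ge \frac{\theta_i + \theta_j}{m' H^{-1} m}, \quad j = 1, \ldots, i - 1, \quad \text{and} \quad y \le \frac{\theta_i + \theta_j}{m' H^{-1} m}, \quad j = i + 1, \ldots, N.
\tag{131}
\]

It follows from θ1 < θ2 < . . . < θN that (131) holds if and only if

\[
y \ge \frac{\theta_i + \theta_{i-1}}{m' H^{-1} m}, \quad i > 1, \quad \text{and} \quad y \le \frac{\theta_i + \theta_{i+1}}{m' H^{-1} m}, \quad i < N.
\tag{132}
\]

Thus, θi ∈ argmax_{x∈Θ} {yx − c(x)} if and only if (132) holds. Therefore,

\[
\operatorname*{argmax}_{x \in \Theta} \{ yx - c(x) \} =
\begin{cases}
\{\theta_1\}, & \text{if } y < \tau_1, \\
\{\theta_{i-1}, \theta_i\}, & \text{if } y = \tau_{i-1},\ i = 2, \ldots, N, \\
\{\theta_i\}, & \text{if } \tau_{i-1} < y < \tau_i,\ i = 2, \ldots, N - 1, \\
\{\theta_N\}, & \text{if } \tau_{N-1} < y.
\end{cases}
\tag{133}
\]

The desired result then follows from (124), (125), and (133).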
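
Lemma 11 can be checked numerically by evaluating (124) and (125) by brute force over the finite set Θ. The Python sketch below does this under the quadratic cost c(x) = x²/(m′H⁻¹m), which is the form read off (129). The function names and the numerical values of Θ and of m′H⁻¹m (passed as `m_hm`) are hypothetical.

```python
import numpy as np

def phi_psi(y, thetas, m_hm):
    """Return (phi(y), psi(y)) from (124)-(125) with c(x) = x**2 / m_hm."""
    thetas = np.sort(np.asarray(thetas, dtype=float))
    values = y * thetas - thetas ** 2 / m_hm
    best = values.max()
    # psi picks the smallest maximizer, so ties at a threshold go to the smaller theta.
    maximizers = thetas[np.isclose(values, best, rtol=0.0, atol=1e-12)]
    return float(best), float(maximizers.min())

def thresholds(thetas, m_hm):
    """Breakpoints tau_i = (theta_i + theta_{i+1}) / m_hm from (126)."""
    thetas = np.sort(np.asarray(thetas, dtype=float))
    return (thetas[:-1] + thetas[1:]) / m_hm

# Hypothetical drift rates and scaling constant.
theta_demo_set = [-3.0, -2.0, -0.5]
taus = thresholds(theta_demo_set, m_hm=2.0)
psi_values = [phi_psi(y, theta_demo_set, 2.0)[1] for y in (-3.0, -2.5, -2.0, -1.25, 0.0)]
```

For these values the breakpoints are τ1 = −2.5 and τ2 = −1.25, and ψ switches from θ1 to θ2 just above τ1 and from θ2 to θ3 just above τ2, in line with (127).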

Since Equation (121) does not involve the unknown function f itself, it is really a first-order equation.
Setting v(w) = f′(w) and using the definition (125), we rewrite (121) as

\[
\tfrac{1}{2}\sigma^2 v'(w) + \mu v(w) + h(w) - \phi(v(w)) = \gamma, \quad w \in (l, u).
\tag{134}
\]

Then, by rearranging the terms in (134), we rewrite the Bellman equation as follows: find l, u, γ ∈ R and
v ∈ C¹([l, u]) that satisfy

\[
v'(w) = -\frac{2\mu v(w)}{\sigma^2} + \frac{2\phi(v(w))}{\sigma^2} + \frac{2(\gamma - h(w))}{\sigma^2}, \quad w \in (l, u),
\tag{135}
\]

subject to the boundary conditions

v(l) = −κ and v(u) = 0, (136)

and the smooth pasting conditions

v′(l) = 0 and v′(u) = 0. (137)

J.2 Solution to the Bellman Equation

To solve the Bellman equation, we proceed in three steps. First, we use the boundary and smooth pasting
conditions (136)-(137) to express l, u in terms of γ and show that l < 0 < u. Second, for each γ, we solve
Equation (135) on [l, 0] and [0, u], separately. Finally, we use the desired continuity of v(w) at w = 0 to pin
down γ, which completes the solution of the Bellman equation.

To this end, we substitute v(l) = −κ and v′(l) = 0 into Equation (135) for w = l to obtain

γ = h(l) − µκ − φ(−κ). (138)

Similarly, substituting v(u) = v′(u) = 0 into Equation (135) for w = u gives

γ = h(u) − φ(0). (139)

Combining (138) and (139) gives

h(u) = h(l) − µκ + φ(0) − φ(−κ). (140)

The next result uses Equations (136)-(140) to obtain useful structural insights.

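Lemma 12. Assume l, u, γ ∈ R and v ∈ C¹([l, u]) jointly satisfy (135)-(137). Then, we have that γ > 0 and

\[
l = -\frac{1}{b^{*}} \bigl( \gamma + \mu\kappa + \phi(-\kappa) \bigr) < 0 \quad \text{and} \quad u = \frac{1}{h^{*}} \bigl( \gamma + \phi(0) \bigr) > 0.
\tag{141}
\]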

Proof. It is straightforward to write l and u in terms of γ using Equations (138)-(139). The argument for
proving l < 0 and γ > 0 is identical to the one provided in the proof of Lemma 3 with the exception that
we also use the fact that φ is non-increasing. It remains to show u > 0. We argue by contradiction.
Assume u ≤ 0. First, we show that v′, v″ > 0 on (w0 , u) for some w0 ∈ (l, u). Then, we show that this
result contradicts boundary condition (137). Note that v and h are continuously differentiable on (l, u) for
l, u ≤ 0. Moreover, it follows from boundary condition (136) and Lemma 11 that there exists w0 ∈ (l, u)
such that φ(v(w)) = θN v(w) − c(θN ) for w ∈ (w0 , u). Therefore, by (135), we have v ∈ C²((w0 , u]); i.e., v″
is well-defined on (w0 , u). Thus, we can differentiate both sides of (135) to obtain

\[
v''(w) = -\frac{2}{\sigma^2}(\mu - \theta_N)\, v'(w) - \frac{2h'(w)}{\sigma^2}, \quad w \in (w_0, u).
\tag{142}
\]

Recall that v and v′ are continuous on [l, u] and v(u) = v′(u) = 0 by boundary conditions (136)-(137).
Therefore, the first term on the right-hand side of (142) is small, i.e., close to zero, in a neighborhood of
u. However, since h′(w) = −b* for w ∈ [l, u) with l, u ≤ 0, the second term on the right-hand side of (142)
is positive. Therefore, for w0 sufficiently close to u, we have v″ > 0 on (w0 , u). Moreover, since l, u ≤ 0,
by Corollary 7, we have v′ > 0 on (l, u). Combining v″ > 0 and v′ > 0 on (w0 , u) gives v′(u) > 0, which
contradicts boundary condition (137).

Next, we split the analysis of the Bellman equation (135)-(137) into two sub-intervals: [l, 0] and [0, u].
We fix γ and solve the Bellman equation (135)-(137) on each sub-interval, separately. Then, we use the
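continuity of v at zero to pin down γ. To this end, let γ0 = −φ(0) ≥ 0 and for γ ≥ γ0 , define

\[
l_\gamma = -\frac{1}{b^{*}} \bigl( \gamma + \mu\kappa + \phi(-\kappa) \bigr) < 0 \quad \text{and} \quad u_\gamma = \frac{1}{h^{*}} \bigl( \gamma + \phi(0) \bigr) \ge 0.
\tag{143}
\]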
Substituting (143) into (135) and focusing on the interval (lγ , 0] yields the following:

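\[
v'(w) = -\frac{2\mu}{\sigma^2} \bigl( v(w) + \kappa \bigr) + \frac{2}{\sigma^2} \bigl( \phi(v(w)) - \phi(-\kappa) \bigr) + \frac{2\bigl( h(l_\gamma) - h(w) \bigr)}{\sigma^2}, \quad w \in (l_\gamma, 0],
\tag{144}
\]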
subject to the boundary condition

v(lγ ) = −κ. (145)

It is straightforward to verify by substituting (143) into (144) for w = lγ that the solution to (144)-(145)
satisfies the smooth pasting condition v′(lγ ) = 0.

Similarly, substituting (143) into (135) and focusing on the interval [0, uγ ) yields the following:

\[
v'(w) = -\frac{2\mu}{\sigma^2} v(w) + \frac{2}{\sigma^2} \bigl( \phi(v(w)) - \phi(0) \bigr) + \frac{2\bigl( h(u_\gamma) - h(w) \bigr)}{\sigma^2}, \quad w \in [0, u_\gamma),
\tag{146}
\]
subject to the boundary condition

v(uγ ) = 0. (147)

It is again straightforward to verify by substituting (147) into (146) for w = uγ that the solution to
(146)-(147) satisfies the smooth pasting condition v′(uγ ) = 0.

It is straightforward to show that (144)-(145) has a unique continuously differentiable solution, denoted
by v_γ^− : [lγ , 0] → R for each γ ≥ γ0 . Similarly, for γ ≥ γ0 , (146)-(147) has a unique continuously differentiable
solution, denoted by v_γ^+ : [0, uγ ] → R. Lemma 13 shows that v_γ^−(0) is strictly increasing in γ and v_γ^+(0) is
strictly decreasing in γ.

Lemma 13. We have the following:

(i) v_γ^−(0) is continuous and strictly increasing in γ with v_{γ0}^−(0) < 0 and lim_{γ→∞} v_γ^−(0) = ∞.

(ii) v_γ^+(0) is continuous and strictly decreasing in γ with v_{γ0}^+(0) = 0 and lim_{γ→∞} v_γ^+(0) = −∞.

Proof. The proof of this lemma closely resembles the proof of Lemma 4. We start by proving v_{γ0}^−(0) < 0.
We argue by contradiction. Assume v_{γ0}^−(0) ≥ 0. First, we show that there exists w0 ∈ (l_{γ0} , 0] such that
∂v_{γ0}^−(w0 )/∂w ≤ 0. Then, we show that this result contradicts Corollary 7. It follows from the intermediate
value theorem (see e.g., Royden and Fitzpatrick (1968, Section 1.6)), the continuity of v_{γ0}^− , and the fact that
v_{γ0}^−(l_{γ0} ) = −κ < 0 and v_{γ0}^−(0) ≥ 0 that there exists w0 ∈ (l_{γ0} , 0] such that v_{γ0}^−(w0 ) = 0. Then, by Equation
(144), we have

\[
\begin{aligned}
\frac{\partial}{\partial w} v_{\gamma_0}^{-}(w_0) &= -\frac{2\mu}{\sigma^2} \bigl( v(w_0) + \kappa \bigr) + \frac{2}{\sigma^2} \bigl( \phi(v(w_0)) - \phi(-\kappa) \bigr) + \frac{2\bigl( h(l_{\gamma_0}) - h(w_0) \bigr)}{\sigma^2} && (148) \\
&= -\frac{2\mu}{\sigma^2} \kappa + \frac{2}{\sigma^2} \bigl( \phi(0) - \phi(-\kappa) \bigr) + \frac{2\bigl( h(l_{\gamma_0}) - h(w_0) \bigr)}{\sigma^2} && (149) \\
&\le \frac{2}{\sigma^2} \bigl( -\mu\kappa - \phi(-\kappa) + h(l_{\gamma_0}) \bigr) - \frac{2}{\sigma^2} \gamma_0 && (150) \\
&= 0, && (151)
\end{aligned}
\]

where (149) follows from v_{γ0}^−(w0 ) = 0, (150) follows from the non-negativity of h and the definition of γ0 ,
and (151) follows from (143). However, by Corollary 7, we know that ∂v_{γ0}^−(w)/∂w > 0 for w ∈ (l_{γ0} , 0], which
contradicts (151). Therefore, the assumption that v_{γ0}^−(0) ≥ 0 is incorrect.

Next, we show that v_γ^−(0) is strictly increasing in γ. Consider the differential equation

\[
\hat{v}'(w) = -\frac{2\mu}{\sigma^2} \bigl( \hat{v}(w) + \kappa \bigr) + \frac{2}{\sigma^2} \bigl( \phi(\hat{v}(w)) - \phi(-\kappa) \bigr) + \frac{2b^{*}}{\sigma^2} w, \quad w \in (0, \infty),
\tag{152}
\]

subject to the boundary condition v̂(0) = −κ. It follows from (144) and (152) that

v_γ^−(w) = v̂(w − lγ ), w ∈ [lγ , 0]. (153)

Thus, for γ ≥ γ0 ,

\[
\frac{\partial}{\partial \gamma} v_\gamma^{-}(0) = \frac{\partial}{\partial \gamma} \hat{v}(-l_\gamma) = \frac{1}{b^{*}} \hat{v}'(-l_\gamma) > 0,
\tag{154}
\]

where the second equality in (154) follows from (143) and the inequality in (154) follows from Lemma 22.
Therefore, v_γ^−(0) is strictly increasing in γ.

It remains to show that v_γ^−(0) → ∞ as γ → ∞. By (143) and (153), it suffices to show that v̂(w) → ∞ as
w → ∞. We argue by contradiction. Assume v̂ is upper bounded. Then, since v̂ is also lower bounded (by
Lemma 21), the first two terms on the right-hand side of (152) are bounded. Then, since the third term on
the right-hand side of (152) increases without bound, i.e., 2b* w/σ² → ∞ as w → ∞, we have v̂′(w) → ∞ as
w → ∞. Therefore, by the fundamental theorem of calculus, we have

\[
\hat{v}(w) = \hat{v}(0) + \int_0^w \hat{v}'(x)\,dx \to \infty,
\tag{155}
\]

which contradicts the assumption that v̂ is upper bounded.

The proof of part (ii) resembles the proof of part (i). The equality v_{γ0}^+(0) = 0 follows from (147). To prove
that v_γ^+(0) is strictly decreasing in γ, we consider the differential equation

\[
\tilde{v}'(w) = \frac{2\mu}{\sigma^2} \tilde{v}(w) - \frac{2}{\sigma^2} \bigl( \phi(\tilde{v}(w)) - \phi(0) \bigr) - \frac{2h^{*}}{\sigma^2} w, \quad w \in (0, \infty),
\tag{156}
\]

subject to the boundary condition ṽ(0) = 0. It is straightforward to see that

v_γ^+(w) = ṽ(uγ − w), w ∈ [0, uγ ]. (157)

By Lemma 23, ṽ, ṽ′ < 0 on (0, ∞). Therefore, for γ ≥ γ0 ,

\[
\frac{\partial}{\partial \gamma} v_\gamma^{+}(0) = \frac{\partial}{\partial \gamma} \tilde{v}(u_\gamma) = \frac{1}{h^{*}} \tilde{v}'(u_\gamma) < 0.
\]

In other words, v_γ^+(0) is strictly decreasing in γ.

It remains to show that v_γ^+(0) → −∞ as γ → ∞. By (157), it suffices to show that ṽ(w) → −∞ as w → ∞.
Since ṽ < 0 (by Lemma 23), the first two terms on the right-hand side of (156) are upper bounded. Then,
since the third term on the right-hand side of (156) decreases without bound, i.e., −2h* w/σ² → −∞ as
w → ∞, we have ṽ′(w) → −∞ as w → ∞. Therefore,

\[
\tilde{v}(w) = \tilde{v}(0) + \int_0^w \tilde{v}'(w')\,dw' \to -\infty.
\]

The following corollary is immediate from Lemma 13 and the unique γ characterized in the corollary is
illustrated in Figure 4 in Section 6.3. Corollary 4 is crucially used in solving the Bellman equation (135)-(137).

Corollary 4. There exists a unique γ* such that v_{γ*}^−(0) = v_{γ*}^+(0).
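
Corollary 4 and the monotonicity in Lemma 13 suggest a bracketing root search for γ*. The Python sketch below illustrates the idea by integrating the auxiliary equations (152) and (156) numerically with SciPy, rather than using closed-form expressions, and then applying Brent's method to the gap v_γ^−(0) − v_γ^+(0). It is not the authors' procedure. The function names, solver tolerances, bracket-expansion rule, and the quadratic cost c(x) = x²/(m′H⁻¹m) read off (129) are assumptions; the holding cost enters only through the slopes b* and h* that appear in (143), (152), and (156).

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

def make_phi(thetas, m_hm):
    """phi(y) = max over x in Theta of y*x - x**2 / m_hm, as in (124)."""
    thetas = np.asarray(thetas, dtype=float)
    return lambda y: float(np.max(y * thetas - thetas ** 2 / m_hm))

def _integrate(rhs, y0, length):
    """Integrate y' = rhs(w, y) from w = 0 to w = length and return y(length)."""
    if length <= 0.0:
        return y0
    sol = solve_ivp(lambda w, y: [rhs(w, y[0])], (0.0, length), [y0],
                    rtol=1e-10, atol=1e-12)
    return float(sol.y[0, -1])

def solve_gamma_star(mu, sigma, kappa, b_star, h_star, thetas, m_hm):
    """Find gamma with v_gamma^-(0) = v_gamma^+(0) by a bracketing root search."""
    phi = make_phi(thetas, m_hm)
    s2 = sigma ** 2
    gamma0 = -phi(0.0)

    def gap(gamma):
        neg_l = (gamma + mu * kappa + phi(-kappa)) / b_star  # -l_gamma from (143)
        u = (gamma + phi(0.0)) / h_star                      # u_gamma from (143)
        v_hat = lambda w, v: (-2.0 * mu / s2 * (v + kappa)
                              + 2.0 / s2 * (phi(v) - phi(-kappa))
                              + 2.0 * b_star / s2 * w)       # right-hand side of (152)
        v_til = lambda w, v: (2.0 * mu / s2 * v
                              - 2.0 / s2 * (phi(v) - phi(0.0))
                              - 2.0 * h_star / s2 * w)       # right-hand side of (156)
        # v_gamma^-(0) = v_hat(-l_gamma) by (153); v_gamma^+(0) = v_til(u_gamma) by (157)
        return _integrate(v_hat, -kappa, neg_l) - _integrate(v_til, 0.0, u)

    lo, hi = gamma0, gamma0 + 1.0
    if gap(lo) >= 0.0:
        raise ValueError("gap is not negative at gamma0; check the parameters")
    for _ in range(60):  # expand the bracket until the gap changes sign
        if gap(hi) > 0.0:
            break
        hi = gamma0 + 2.0 * (hi - gamma0)
    else:
        raise ValueError("no sign change found")
    return brentq(gap, lo, hi, xtol=1e-12)
```

A convenient check is the degenerate case Θ = {0} with µ = 0: then φ ≡ 0, both equations integrate to quadratics, and the root is γ* = σ (κ/(1/b* + 1/h*))^{1/2}. For σ = 1, κ = 1, b* = 2, and h* = 1 this gives γ* = (2/3)^{1/2}.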

Next, we propose closed-form expressions for v_γ^− and v_γ^+ . To simplify the analysis, first, we show that v_γ^−
and v_γ^+ are strictly increasing.

Lemma 14. We have that v_γ^− is strictly increasing on [lγ , 0] and v_γ^+ is strictly increasing on [0, uγ ].

Proof. To show v_γ^− is strictly increasing, we consider the differential equation

\[
\hat{v}'(w) = -\frac{2\mu}{\sigma^2} \bigl( \hat{v}(w) + \kappa \bigr) + \frac{2}{\sigma^2} \bigl( \phi(\hat{v}(w)) - \phi(-\kappa) \bigr) + \frac{2b^{*}}{\sigma^2} w, \quad w \in (0, \infty),
\tag{158}
\]

subject to the boundary condition v̂(0) = −κ. It follows from (144) and (158) that

v_γ^−(w) = v̂(w − lγ ), w ∈ [lγ , 0]. (159)

The desired result then follows from (159) and the fact that v̂ is strictly increasing by Lemma 22.
To show v_γ^+ is strictly increasing, we consider the differential equation

\[
\tilde{v}'(w) = \frac{2\mu}{\sigma^2} \tilde{v}(w) - \frac{2}{\sigma^2} \bigl( \phi(\tilde{v}(w)) - \phi(0) \bigr) - \frac{2h^{*}}{\sigma^2} w, \quad w \in (0, \infty),
\tag{160}
\]

subject to the boundary condition ṽ(0) = 0. It is straightforward to see that

v_γ^+(w) = ṽ(uγ − w), w ∈ [0, uγ ]. (161)

The desired result then follows from (161) and the fact that ṽ is strictly decreasing by Lemma 23.

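It follows from Lemma 14 and boundary conditions (145) and (147) that v_{γ*}^−(w) ∈ [−κ, 0) for w ∈ [l_{γ*} , 0]
and v_{γ*}^+(w) ∈ (−κ, 0] for w ∈ [0, u_{γ*} ]. Therefore, by Lemma 11, we have

\[
\begin{aligned}
\phi\bigl(v_{\gamma^{*}}^{-}(w)\bigr) &\in \bigl\{ \theta_l v_{\gamma^{*}}^{-}(w) - c(\theta_l) : \tau_l > -\kappa,\ l = 1, \ldots, N - 1 \bigr\} \cup \bigl\{ \theta_N v_{\gamma^{*}}^{-}(w) - c(\theta_N) \bigr\}, \quad w \in [l_{\gamma^{*}}, 0], \\
\phi\bigl(v_{\gamma^{*}}^{+}(w)\bigr) &\in \bigl\{ \theta_l v_{\gamma^{*}}^{+}(w) - c(\theta_l) : \tau_l > -\kappa,\ l = 1, \ldots, N - 1 \bigr\} \cup \bigl\{ \theta_N v_{\gamma^{*}}^{+}(w) - c(\theta_N) \bigr\}, \quad w \in [0, u_{\gamma^{*}}].
\end{aligned}
\]

Then, it follows from (143)-(147) that γ*, l_{γ*} , u_{γ*} , v_{γ*}^− , and v_{γ*}^+ do not depend on
{θl : τl ≤ −κ, l = 1, . . . , N − 1}. Motivated by this observation and without loss of generality, we assume
τl > −κ for l = 1, . . . , N − 1. To simplify the analysis to follow, we let τ0 = −κ and τN = 0.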

We obtain a closed-form expression for v_γ^− in three steps. First, we show that [lγ , 0] can be partitioned
into disjoint sub-intervals such that ψ(v_γ^−(·)) is constant on each interval. Second, we provide a
closed-form expression for v_γ^− on each sub-interval (of the aforementioned partition). Finally, we determine the
partition and subsequently v_γ^− . It follows from Lemmas 11 and 13 that ψ(v_γ^−(·)) is non-decreasing. Thus, we
can partition (lγ , 0] into disjoint intervals (ŵ_0 , ŵ_1 ], . . . , (ŵ_{N^−−1} , ŵ_{N^−} ] for some N^− ∈ {1, . . . , N } such that
ŵ_0 = lγ , ŵ_{N^−} = 0, and ψ(v_γ^−(w)) = θl for w ∈ (ŵ_{l−1} , ŵ_l ] and l = 1, . . . , N^− . Then, it follows from (144)-(145)
that for l = 1, . . . , N^− , v_γ^− satisfies

\[
v'(w) = -\frac{2}{\sigma^2}(\mu - \theta_l)\, v(w) + \frac{2\bigl( h(l_\gamma) - h(w) \bigr)}{\sigma^2} - \frac{2}{\sigma^2}(\mu + \theta_1)\kappa + \frac{2(\theta_1^2 - \theta_l^2)}{m' H^{-1} m\, \sigma^2}, \quad w \in (\hat{w}_{l-1}, \hat{w}_l],
\tag{162}
\]

subject to the boundary condition

v(ŵ_{l−1} ) = τ_{l−1} , (163)

which is a linear first order differential equation. The following lemma exploits the (unique) solution provided
in Appendix K for this class of differential equations to obtain a closed-form solution for v_γ^− on each interval
(ŵ_{l−1} , ŵ_l ] for l = 1, . . . , N^− . As a preliminary to this lemma, for l = 1, . . . , N^− , let

\[
c_{l,0}^{-} = -\frac{2}{\sigma^2}(\mu + \theta_1)\kappa + \frac{2b^{*}}{\sigma^2}(l_\gamma - \hat{w}_{l-1}) + \frac{2(\theta_1^2 - \theta_l^2)}{m' H^{-1} m\, \sigma^2}, \quad c_{l,1}^{-} = -\frac{2}{\sigma^2}(\mu - \theta_l), \quad \text{and} \quad \hat{c}_l^{-} = \frac{2b^{*}}{\sigma^2}.
\]

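Lemma 15. We have the following closed form solution for v_γ^− on each interval (ŵ_{l−1} , ŵ_l ] for l = 1, . . . , N^− :

\[
v_\gamma^{-}(w) = \Bigl( \tau_{l-1} + \frac{c_{l,1}^{-} c_{l,0}^{-} + \hat{c}_l^{-}}{(c_{l,1}^{-})^2} \Bigr) e^{c_{l,1}^{-} w} - \frac{\hat{c}_l^{-} w}{c_{l,1}^{-}} - \frac{c_{l,1}^{-} c_{l,0}^{-} + \hat{c}_l^{-}}{(c_{l,1}^{-})^2}, \quad w \in [\hat{w}_{l-1}, \hat{w}_l).
\tag{164}
\]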

It remains to determine N^− and ŵ_1 , . . . , ŵ_{N^−−1} . We use a recursive algorithm for this purpose. Starting
with ŵ_0 = lγ , we iteratively derive ŵ_l for l = 1, . . . , N^− − 1 as follows: Let

\[
v_{\gamma,l}^{-}(w) = \Bigl( \tau_{l-1} + \frac{c_{l,1}^{-} c_{l,0}^{-} + \hat{c}_l^{-}}{(c_{l,1}^{-})^2} \Bigr) e^{c_{l,1}^{-} w} - \frac{\hat{c}_l^{-} w}{c_{l,1}^{-}} - \frac{c_{l,1}^{-} c_{l,0}^{-} + \hat{c}_l^{-}}{(c_{l,1}^{-})^2}, \quad w \in [\hat{w}_{l-1}, 0].
\]

If v_{γ,l}^−(0) > τl , we let ŵ_l = (v_{γ,l}^−)^{−1}(τl ), where (v_{γ,l}^−)^{−1} denotes the inverse of v_{γ,l}^− , and proceed to the next
step. Otherwise, we let N^− = l and terminate the algorithm.
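
The closed forms above rest on the solution of a linear first-order equation with an affine forcing term. As a sanity check, the following worked verification uses generic notation: the constants c_0, c_1 ≠ 0, ĉ and the initial value τ stand in for the subscripted quantities above, and the variable x is measured from the endpoint at which the boundary condition is imposed. How the argument is shifted on each sub-interval is specified in Appendix K, which is not reproduced here, so that convention is an assumption of this sketch.

\[
v'(x) = c_1 v(x) + c_0 + \hat{c}\,x, \quad v(0) = \tau
\quad \Longrightarrow \quad
v(x) = A\, e^{c_1 x} - \frac{\hat{c}\,x}{c_1} - \frac{c_1 c_0 + \hat{c}}{c_1^2}, \qquad A = \tau + \frac{c_1 c_0 + \hat{c}}{c_1^2}.
\]

Indeed, differentiating the candidate gives

\[
v'(x) = c_1 A\, e^{c_1 x} - \frac{\hat{c}}{c_1},
\qquad
c_1 v(x) + c_0 + \hat{c}\,x = c_1 A\, e^{c_1 x} - \hat{c}\,x - \frac{c_1 c_0 + \hat{c}}{c_1} + c_0 + \hat{c}\,x = c_1 A\, e^{c_1 x} - \frac{\hat{c}}{c_1},
\]

so the two sides agree, and v(0) = A − (c_1 c_0 + ĉ)/c_1² = τ.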

Next, we obtain a closed-form expression for $v_\gamma^+$ in three steps. First, we show that $[0, u_\gamma]$ can be
partitioned into disjoint sub-intervals such that $\psi(v_\gamma^+(\cdot))$ is constant on each interval. Second, we provide a
closed-form expression for $v_\gamma^+$ on each sub-interval (of the aforementioned partition). Finally, we determine
the partition and subsequently $v_\gamma^+$. It follows from Lemmas 11 and 13 that $\psi(v_\gamma^+(\cdot))$ is non-decreasing. Thus,
we can partition $[0, u_\gamma)$ into disjoint intervals $[\hat{w}_{N^+ - 1}, \hat{w}_{N^+}), \ldots, [\hat{w}_{N-1}, \hat{w}_N)$ for some $N^+ \in \{1, \ldots, N\}$ such
that $\hat{w}_{N^+ - 1} = 0$, $\hat{w}_N = u_\gamma$, and $\psi(v_\gamma^+(w)) = \theta_l$ for $w \in (\hat{w}_{l-1}, \hat{w}_l]$ and $l = N^+, \ldots, N$. Then, it follows from
(146)-(147) that for $l = N^+, \ldots, N$, $v_\gamma^+$ satisfies

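\[
v'(w) = -\frac{2}{\sigma^2}(\mu - \theta_l)\,v(w) + \frac{2\big(h(u_\gamma) - h(w)\big)}{\sigma^2} + \frac{2(\theta_N^2 - \theta_l^2)}{m' H^{-1} m\,\sigma^2}, \quad w \in (\hat{w}_{l-1}, \hat{w}_l], \tag{165}
\]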
subject to the boundary condition

v(ŵl ) = τl , (166)

which is a linear first order differential equation. The following lemma exploits the (unique) solution provided
in Appendix K for this class of differential equations to obtain a closed-form solution for $v_\gamma^+$ on each $(\hat{w}_{l-1}, \hat{w}_l]$
for $l = N^+, \ldots, N$. As a preliminary to the lemma, for $l = N^+, \ldots, N$, let
\[
c^+_{l,0} = -\frac{2h^\star}{\sigma^2}(u_\gamma - \hat{w}_l) - \frac{2(\theta_N^2 - \theta_l^2)}{m' H^{-1} m\,\sigma^2}, \quad
c^+_{l,1} = \frac{2}{\sigma^2}(\mu - \theta_l), \quad \text{and} \quad
\hat{c}^+_l = -\frac{2h^\star}{\sigma^2}.
\]
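Lemma 16. We have the following closed form solution for $v_\gamma^+$: For $l = N^+, \ldots, N$,
\[
v_\gamma^+(w) = \left( \tau_l + \frac{c^+_{l,1} c^+_{l,0} + \hat{c}^+_l}{(c^+_{l,1})^2} \right) e^{c^+_{l,1} (\hat{w}_l - w)} - \frac{\hat{c}^+_l (\hat{w}_l - w)}{c^+_{l,1}} - \frac{c^+_{l,1} c^+_{l,0} + \hat{c}^+_l}{(c^+_{l,1})^2}, \quad w \in [\hat{w}_{l-1}, \hat{w}_l). \tag{167}
\]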

It remains to determine $\hat{w}_{N^+}, \ldots, \hat{w}_N$. We use a recursive algorithm for this purpose. Starting with
$\hat{w}_N = u_\gamma$, we iteratively derive $\hat{w}_l$ for $l = N - 1, N - 2, \ldots, N^+$ as follows: Let
\[
v^+_{\gamma,l}(w) = \left( \tau_l + \frac{c^+_{l,1} c^+_{l,0} + \hat{c}^+_l}{(c^+_{l,1})^2} \right) e^{c^+_{l,1} (\hat{w}_l - w)} - \frac{\hat{c}^+_l (\hat{w}_l - w)}{c^+_{l,1}} - \frac{c^+_{l,1} c^+_{l,0} + \hat{c}^+_l}{(c^+_{l,1})^2}, \quad w \in [0, \hat{w}_l).
\]
If $v^+_{\gamma,l}(0) < \tau_{l-1}$, we let $\hat{w}_{l-1} = (v^+_{\gamma,l})^{-1}(\tau_{l-1})$, where $(v^+_{\gamma,l})^{-1}$ denotes the inverse of $v^+_{\gamma,l}$, and proceed to the
next step. Otherwise, we let $N^+ = l - 1$ and terminate the algorithm.

The parameter $\gamma^\star$, characterized in Corollary 4, can be computed using the closed-form expressions for
$v_\gamma^-$ and $v_\gamma^+$ provided in Lemmas 15 and 16 and the recursive algorithms discussed above. By Lemma 13, this
computation can be done using a simple line search. Given $\gamma^\star$, we define
\[
v(w) = \begin{cases} v^-_{\gamma^\star}(w), & w \in [l_{\gamma^\star}, 0], \\ v^+_{\gamma^\star}(w), & w \in (0, u_{\gamma^\star}]. \end{cases} \tag{168}
\]
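
The line search for $\gamma^\star$ mentioned above can be sketched as a bisection. The sketch assumes that $\gamma^\star$ is the zero of a continuous "gap" function of $\gamma$ that is strictly increasing, for instance the difference between the values at zero of the two one-sided solutions. That characterization is an assumption here, since Corollary 4 is not reproduced in this appendix. The function `toy_gap` is a made-up monotone stand-in and is not derived from the model.

```python
def bisect_root(gap, lo, hi, tol=1e-10):
    """Find the zero of a continuous, strictly increasing function on [lo, hi]."""
    if gap(lo) > 0.0 or gap(hi) < 0.0:
        raise ValueError("the root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid   # the root lies to the right of mid
        else:
            hi = mid   # the root lies at or to the left of mid
    return 0.5 * (lo + hi)


def toy_gap(gamma):
    """Hypothetical stand-in for the gap function; strictly increasing in gamma."""
    return gamma ** 3 + gamma - 2.0


GAMMA_STAR = bisect_root(toy_gap, 0.0, 5.0)
print(GAMMA_STAR)
```

In an actual implementation, each evaluation of the gap function would rerun the two breakpoint recursions for the trial value of $\gamma$.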

Proposition 5 shows that $(l_{\gamma^\star}, u_{\gamma^\star}, \gamma^\star, v)$ is the unique solution to the Bellman equation (135)-(137).

Proposition 5. The function $v$ is non-positive, strictly increasing, and continuously differentiable on
$[l_{\gamma^\star}, u_{\gamma^\star}]$. Moreover, $(l_{\gamma^\star}, u_{\gamma^\star}, \gamma^\star, v)$ is the unique solution to (135)-(137).

We skip the proof of Proposition 5 because it is almost identical to the proof of Proposition 3. To solve
the Bellman equation (121)-(123), define
\[
f(w) = \int_{l_{\gamma^\star}}^{w} v(x)\,dx, \quad w \in [l_{\gamma^\star}, u_{\gamma^\star}], \tag{169}
\]

where v is given by (168). The following corollary provides the unique solution to the Bellman equation
(121)-(123).

Corollary 5. The function $f$ is strictly decreasing, strictly convex, and twice continuously differentiable.
Moreover, $(l_{\gamma^\star}, u_{\gamma^\star}, \gamma^\star, f)$ solves the Bellman equation (121) subject to the boundary condition (122) and
the smooth pasting condition (123). Moreover, $f$ is unique up to an additive constant.

J.3 Candidate Policy

Our candidate policy is the barrier policy with the lower barrier at $l_{\gamma^\star}$, the upper barrier at $u_{\gamma^\star}$, and the
(effective) drift rate that is the minimizer of the left-hand side of Equation (121) at every $w \in [l_{\gamma^\star}, u_{\gamma^\star}]$, i.e.,
\[
\theta(w) = \psi(v(w)) = \min\{\theta_l : v(w) \leq \tau_l, \ l = 1, \ldots, N\}, \quad w \in [l_{\gamma^\star}, u_{\gamma^\star}], \tag{170}
\]
where $v$ is given by (168). The candidate policy is stationary and its (effective) drift rate is non-decreasing
in the workload. Under the candidate policy, the workload process $W$ evolves as a diffusion process with a
non-decreasing state-dependent drift rate and reflecting barriers at $l_{\gamma^\star}$ and $u_{\gamma^\star}$. Theorem 2 shows that the
candidate policy is optimal for the EWF (114)-(117). Its proof is almost identical to the proof of Theorem
1.

Theorem 2. The barrier policy with the lower barrier at $l_{\gamma^\star}$, the upper barrier at $u_{\gamma^\star}$, and the (effective)
drift rate control $\theta(\cdot)$ given by (170) is optimal for the EWF (114)-(117) and it has a long-run average cost
of $\gamma^\star$.
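
The drift-rate rule in (170) is a simple lookup, which the following sketch illustrates. It assumes the admissible drift rates and their thresholds are given as two equal-length lists. The numerical values of `THETAS` and `TAUS` are hypothetical and chosen only for illustration; they do not come from the paper.

```python
def psi(v, thetas, taus):
    """Smallest drift rate theta_l among those whose threshold satisfies v <= tau_l."""
    feasible = [theta for theta, tau in zip(thetas, taus) if v <= tau]
    if not feasible:
        raise ValueError("v exceeds every threshold tau_l")
    return min(feasible)


# Hypothetical increasing drift rates and thresholds (illustration only).
THETAS = [0.0, 0.5, 1.0]
TAUS = [-2.0, -1.0, 0.0]

for v in (-2.5, -1.5, -1.0, -0.2):
    print(v, psi(v, THETAS, TAUS))
```

Because $v$ is increasing in the workload, feeding $v(w)$ through such a lookup yields a drift rate that is non-decreasing in $w$, in line with the statement above.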

K An Auxiliary Linear Differential Equation

This section discusses a special linear first order differential equation and its solution. Fix $c_0, c_1, \hat{c}, y_0 \in \mathbb{R}$
such that $c_1 \neq 0$ and consider the differential equation
\[
y'(x) = c_1 y(x) + \hat{c} x + c_0, \quad x \in (0, \infty), \tag{171}
\]

subject to the boundary condition y(0) = y0 . This equation falls in the class of linear first order differential
equations; see e.g., Zaitsev and Polyanin (2002, Section 1.1.4). The next lemma provides a closed-form
expression for its solution. The proof follows from the general solution provided in Zaitsev and Polyanin
(2002, Section 1.1.4) and the substitution of boundary condition y(0) = y0 into the general solution to
determine the undetermined coefficient.

Lemma 17 (Zaitsev and Polyanin (2002, Section 1.1.4)). The (unique) solution to the differential equation
(171) subject to boundary condition $y(0) = y_0$ is
\[
y(x) = \left( y_0 + \frac{c_1 c_0 + \hat{c}}{c_1^2} \right) e^{c_1 x} - \frac{\hat{c} x}{c_1} - \frac{c_1 c_0 + \hat{c}}{c_1^2}, \quad x \in [0, \infty).
\]
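
As a quick numerical sanity check of the formula in Lemma 17, the sketch below evaluates the closed form and compares a central finite-difference derivative with the right-hand side of (171). The parameter values are arbitrary and serve only as an illustration.

```python
import math


def linear_ode_solution(x, c0, c1, c_hat, y0):
    """Closed form of y'(x) = c1*y(x) + c_hat*x + c0 with y(0) = y0 and c1 != 0."""
    k = (c1 * c0 + c_hat) / c1 ** 2
    return (y0 + k) * math.exp(c1 * x) - c_hat * x / c1 - k


def ode_residual(x, c0, c1, c_hat, y0, step=1e-5):
    """Finite-difference y'(x) minus the right-hand side of the ODE; near zero if correct."""
    y = linear_ode_solution(x, c0, c1, c_hat, y0)
    dy = (linear_ode_solution(x + step, c0, c1, c_hat, y0)
          - linear_ode_solution(x - step, c0, c1, c_hat, y0)) / (2.0 * step)
    return dy - (c1 * y + c_hat * x + c0)


# Arbitrary illustrative parameters (not taken from the paper).
PARAMS = dict(c0=0.3, c1=-0.8, c_hat=0.5, y0=-1.2)

for x in (0.5, 1.0, 2.0):
    print(x, linear_ode_solution(x, **PARAMS), ode_residual(x, **PARAMS))
```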

L Comparison of the Pricing, Outsourcing, and Scheduling Decisions of the Proposed Policy with those of the MDP Policy

In this section, we compare the pricing, outsourcing, and scheduling decisions of the proposed policy with
those of the MDP policy. We use the example introduced in Section 8 except for the outsourcing cost. In the
example for the base case in Section 8, the total outsourcing cost is δ1 + ν1 = δ2 + ν2 = δ3 + ν3 = 1.33, which
renders outsourcing costly and largely irrelevant in that case. Although we believe our base case is more
realistic from a practical perspective, setting a lower outsourcing cost helps illustrate the policy structure
more completely. To do so, in this subsection, we set the total outsourcing cost to δ1 + ν1 = δ2 + ν2 =
δ3 + ν3 = 1.05. To repeat, this change causes outsourcing to play a more prominent role, which helps with
the comparison of the outsourcing decisions of the proposed policy and the MDP policy.

(a) Proposed policy. Rows: Product 2 Inventory. Columns: Product 1 Inventory. Entries: probability.

        -8     -7     -6     -5     -4     -3     -2     -1      0      1      2
  2  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
  1  0.000  0.000  0.000  0.000  0.002  0.007  0.026  0.097  0.370  0.000  0.000
  0  0.000  0.000  0.000  0.001  0.002  0.006  0.019  0.057  0.157  0.000  0.000
 -1  0.000  0.000  0.000  0.000  0.001  0.004  0.011  0.030  0.075  0.000  0.000
 -2  0.000  0.000  0.000  0.000  0.001  0.002  0.006  0.016  0.038  0.000  0.000
 -3  0.000  0.000  0.000  0.000  0.001  0.001  0.003  0.008  0.019  0.000  0.000
 -4  0.000  0.000  0.000  0.000  0.000  0.001  0.002  0.004  0.010  0.000  0.000
 -5  0.000  0.000  0.000  0.000  0.000  0.000  0.001  0.002  0.005  0.000  0.000
 -6  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.001  0.003  0.000  0.000
 -7  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.001  0.001  0.000  0.000
 -8  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.001  0.000  0.000

(b) MDP policy. Rows: Product 2 Inventory. Columns: Product 1 Inventory. Entries: probability.

        -8     -7     -6     -5     -4     -3     -2     -1      0      1      2
  2  0.000  0.000  0.000  0.000  0.000  0.001  0.004  0.015  0.057  0.216  0.000
  1  0.000  0.000  0.000  0.000  0.000  0.001  0.005  0.016  0.050  0.142  0.000
  0  0.000  0.000  0.000  0.000  0.001  0.003  0.011  0.039  0.136  0.071  0.000
 -1  0.000  0.000  0.000  0.000  0.001  0.003  0.008  0.025  0.068  0.012  0.000
 -2  0.000  0.000  0.000  0.000  0.001  0.002  0.005  0.013  0.033  0.002  0.000
 -3  0.000  0.000  0.000  0.000  0.000  0.001  0.003  0.007  0.017  0.000  0.000
 -4  0.000  0.000  0.000  0.000  0.000  0.001  0.001  0.003  0.008  0.000  0.000
 -5  0.000  0.000  0.000  0.000  0.000  0.000  0.001  0.002  0.004  0.000  0.000
 -6  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.001  0.002  0.000  0.000
 -7  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.001  0.000  0.000
 -8  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.001  0.000  0.000

Figure 9: The joint probability distribution of the inventory of products 1 and 2 under the proposed policy
and the MDP policy. The value at (i, j) is equal to P(q : q1 = i, q2 = j), where P(·) denotes the stationary
distribution of the policy and q denotes the queue length vector. The shades depict the same information.
Darker shades represent tuples with a higher probability.

Figure 9 depicts the joint probability distribution of the inventory of products 1 and 2 under our proposed
policy as well as the MDP policy. In particular, it shows that the inventory of products 1 and 2 under both
policies is close to zero with high probability.

In this example, the system spends more than 99.99% of its time in the following states under the MDP
policy:
\[
S = \{(q_1, q_2, q_3) : -8 \leq q_1 \leq 1, \ -12 \leq q_2 \leq 2, \ -18 \leq q_3 \leq 6\}. \tag{172}
\]

We refer to these states as the likely states.

Next, we compare the pricing, outsourcing, and scheduling decisions of the proposed policy with those
of the MDP policy focusing attention on the likely states.

[Three panels: (a) Product 1 Price, (b) Product 2 Price, (c) Product 3 Price. Each panel plots the price against the workload. Legend: Price (Proposed Policy), Expected Price (MDP Policy), Max/Min Price (MDP Policy).]

Figure 10: The prices as a function of the workload under the proposed policy and the MDP policy in the
likely states.

Comparison of the Pricing Policies. The proposed policy chooses the prices based on the nominal
workload. Figure 10 shows the prices under the proposed policy as a function of the workload. It also shows
the minimum, maximum, and expected prices under the MDP policy as a function of the nominal workload
in the likely states.$^{10}$ Comparing the prices under the two policies for each likely state (see Equation (172))
reveals that they are within 0.5% of one another.

$^{10}$ The conditional expectation is taken with respect to the stationary distribution of the MDP policy.

Comparison of the Outsourcing Policies. The proposed policy does not outsource product 1 or
product 2 orders. In our simulations of the MDP policy, no product 1 or product 2 orders are outsourced,
either. Next, we compare the outsourcing decisions for product 3, which is the cheapest product to outsource.
The proposed policy outsources product $i^\star = 3$ orders when the nominal workload $W$ is less than or equal
to the outsourcing threshold $l^n = -65.9$. On the other hand, the outsourcing decisions of the MDP policy
depend also on the inventory of products 1 and 2. In particular, given $(q_1, q_2)$, there exists a threshold
$\tilde{q}_3(q_1, q_2)$ such that product 3 orders are outsourced only when $q_3 \leq \tilde{q}_3(q_1, q_2)$. Let
\[
M = \{(q_1, q_2, \tilde{q}_3(q_1, q_2)) \in S\}
\]
denote the set of likely states with $q_3 = \tilde{q}_3(q_1, q_2)$. Note that the MDP policy outsources product 3 orders
whenever the system state reaches the manifold $M$. Thus, we refer to $M$ as the outsourcing manifold for
product 3 under the MDP policy. Then, we let $M_W = \{m'q : q \in M\}$ denote the set of workload values
corresponding to the outsourcing manifold $M$, i.e., the projection of $M$ onto the workload space. It turns
out that almost all the probability mass of the set $M_W$ is put on the three workload values $-62.2$, $-61.1$,
and $-60.0$ under the MDP policy. To be more specific, $P(M_W \setminus \{-62.2, -61.1, -60.0\}) \leq 10^{-7}$. Figure 11
depicts the probability distribution of the workload conditioned on $q \in M$. It shows that the MDP policy
starts outsourcing product 3 orders at a larger nominal workload compared to the proposed policy. Indeed,
our simulations show that more product 3 orders are outsourced under the MDP policy compared to the
proposed policy. To be specific, about 0.24% of product 3 orders are outsourced under the MDP policy while
only 0.15% of product 3 orders are outsourced under the proposed policy.

[Bar chart: conditional probability (vertical axis, from 0.0 to 0.5) against workload (horizontal axis, with ticks at −62.2, −61.1, −60.0, and −58.9).]

Figure 11: Conditional probability distribution of the workload in the outsourcing manifold for product 3
under the MDP policy.

Roughly speaking, the MDP policy outsources product 3 when the workload falls in the interval
[−62.2, −60.0], and it only outsources product 3. Although this is not a threshold policy of the kind our
proposed policy uses for its outsourcing decisions, it can be approximated by one that has a threshold in
the interval [−62.2, −60.0].

Comparison of the Scheduling Policies. Under the proposed policy, the system idles whenever
the nominal workload $W$ exceeds the idling threshold $u^n = 27.6$ and $Q^n_k \geq s_k$ for $k = 1, 2, 3$. Whenever
the server is working, it first prioritizes those products for which $Q^n_k < s_k$, i.e., the inventory is below the
safety stock. Among such products, the system manager prioritizes them in the descending order of $b_k/m_k$,
i.e., products 1, 2, and then 3. Finally, if $Q^n_k \geq s_k$ for all $k = 1, 2, 3$ and $W < u^n = 27.6$, i.e., the server is
working, then the server focuses solely on product $j^\star = 3$, i.e., the cheapest product to hold in inventory.
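
The scheduling rule just described can be written as a short decision function. The sketch below is an illustration under stated assumptions: the safety stocks $s_1 = 0$ and $s_2 = 1$ appear in the text, while $s_3 = 1$ is inferred from the case analysis in this appendix. The workload is passed in directly because the vector $m$ is not given here, so the workload values in the example are hypothetical. Treating a workload exactly equal to the threshold as idling is also an assumption.

```python
def schedule(q, workload, safety_stock, priority, idle_threshold, cheapest_to_hold):
    """Return the product (1-based index) to manufacture, or None to idle."""
    # Products whose inventory is below the safety stock, in priority order.
    short = [k for k in priority if q[k - 1] < safety_stock[k - 1]]
    if short:
        return short[0]
    # Every inventory is at or above its safety stock.
    if workload >= idle_threshold:
        return None  # idle; the tie at the threshold is an assumption
    return cheapest_to_hold


SAFETY_STOCK = (0, 1, 1)  # s1 and s2 from the text; s3 = 1 is inferred
PRIORITY = (1, 2, 3)      # descending order of b_k / m_k
U_N = 27.6                # idling threshold
J_STAR = 3                # cheapest product to hold in inventory

# Hypothetical workload values, for illustration only.
print(schedule((-1, 0, 0), -5.0, SAFETY_STOCK, PRIORITY, U_N, J_STAR))
print(schedule((0, 0, -5), -20.0, SAFETY_STOCK, PRIORITY, U_N, J_STAR))
print(schedule((0, 1, 3), 10.0, SAFETY_STOCK, PRIORITY, U_N, J_STAR))
print(schedule((1, 2, 6), 30.0, SAFETY_STOCK, PRIORITY, U_N, J_STAR))
```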

For ease of comparison, we partition the set of likely states into three subsets and consider them separately:

i) {q ∈ S : q1 < 0},

ii) {q ∈ S : q1 = 0},

iii) {q ∈ S : q1 = 1}.

The system spends 18%, 38%, and 44% of its time under the MDP policy in each of these subsets, respectively.

We start with Case i): {q ∈ S : q1 < 0}. In these states, the proposed policy and the MDP policy make the
same scheduling decision. To be specific, both policies manufacture product 1 only.

Table 8: The likely states in which the scheduling decisions of the proposed policy and the MDP policy
differ. The last column shows the probability under the stationary distribution of the MDP policy.

                                            Scheduling Decisions
q1        q2        q3                  Proposed Policy    MDP Policy    Probability
q1 = 0    q2 = 0    −18 ≤ q3 ≤ −1       Product 2          Product 3     0.097
q1 = 0    q2 = 0    0 ≤ q3 ≤ 6          Product 2          Product 1     0.039
q1 = 0    q2 = 1    0 ≤ q3 ≤ 6          Product 3          Product 1     0.046
q1 = 1    q2 = 0    −18 ≤ q3 ≤ −1       Product 2          Product 3     0.004
q1 = 1    q2 = 1    2 ≤ q3 ≤ 6          Product 3          Product 2     0.093

[Three panels: (a) Proposed policy, (b) MDP policy, (c) States with differing policies. Each panel plots Product 3 Inventory (vertical axis, from −18 to 6) against Product 2 Inventory (horizontal axis, from −12 to 2), with regions labeled Manufacture Product 1, Manufacture Product 2, and Manufacture Product 3.]

Figure 12: The manufacturing decisions in the likely states with q1 = 0.

Next, we consider Case ii): {q ∈ S : q1 = 0}. We partition this set into three subsets and consider them
separately:

ii-a) {q ∈ S : q1 = 0, q2 < 0},

ii-b) {q ∈ S : q1 = 0, q2 = 0},

ii-c) {q ∈ S : q1 = 0, q2 > 0}.

In Case ii-a): {q ∈ S : q1 = 0, q2 < 0}, the proposed policy and the MDP policy make the same scheduling
decision. To be specific, both policies manufacture product 2 only.

In Case ii-b): {q ∈ S : q1 = 0, q2 = 0}, the proposed policy manufactures product 2 (because q2 < s2 = 1).
However, when −18 ≤ q3 ≤ −1, the MDP policy manufactures product 3, and when 0 ≤ q3 ≤ 6, it
manufactures product 1; see the q1 = 0 section of Table 8 and Figure 12.

In Case ii-c): {q ∈ S : q1 = 0, q2 > 0}, the proposed policy manufactures product 3 if either q3 < 1 or
$W < u^n = 27.6$. Otherwise, it idles. The MDP policy makes similar decisions as depicted in Figure 12.
There are only seven likely states with q1 = 0 and q2 = 1 in which the scheduling decisions of the proposed
policy and the MDP policy differ.$^{11}$ These states are listed in the q1 = 0 section of Table 8 and depicted in
Figure 12c. In these states, the proposed policy manufactures product 3 while the MDP policy manufactures
product 1.

Next, we consider Case iii): {q ∈ S : q1 = 1}. We partition {q ∈ S : q1 = 1} into three subsets and
consider them separately:

iii-a) {q ∈ S : q1 = 1, q2 < 0},

iii-b) {q ∈ S : q1 = 1, q2 = 0},

iii-c) {q ∈ S : q1 = 1, q2 > 0}.

In Case iii-a): {q ∈ S : q1 = 1, q2 < 0}, the proposed policy and the MDP policy make the same scheduling
decision. To be specific, both policies manufacture product 2 only.

In Case iii-b): {q ∈ S : q1 = 1, q2 = 0}, the proposed policy manufactures product 2 (because q2 < s2 = 1).
In states with q3 ≥ 0, the MDP policy makes the same decision. However, in states with −18 ≤ q3 ≤ −1,
the MDP policy manufactures product 3; see the q1 = 1 section of Table 8 and Figure 13.

In Case iii-c): {q ∈ S : q1 = 1, q2 > 0}, the proposed policy manufactures product 3 if either q3 < 1 or
$W < u^n = 27.6$. Otherwise, it idles. The MDP policy makes similar decisions as depicted in Figure 13.
There are only six likely states with q1 = q2 = 1 in which the scheduling decisions of the proposed policy and
the MDP policy differ. These states are listed in the q1 = 1 section of Table 8 and depicted in Figure 13c.
In these states, the proposed policy manufactures product 3 while the MDP policy manufactures product 2.

[Three panels: (a) Proposed policy, (b) MDP policy, (c) States with differing policies. Each panel plots Product 3 Inventory (vertical axis, from −18 to 6) against Product 2 Inventory (horizontal axis, from −12 to 2), with regions labeled Manufacture Product 2 and Manufacture Product 3.]

Figure 13: The manufacturing decisions in the likely states with q1 = 1.

In summary, the proposed policy and the MDP policy make similar scheduling decisions. The MDP
policy spends more than 72% of its time in those likely states where its scheduling decisions are the same as
those of the proposed policy. Furthermore, its scheduling decisions differ from those of the proposed policy
only when the inventory of product 1 or product 2 is close to its safety stock. In these states, the proposed
policy first prioritizes those products for which $q_k < s_k$, i.e., the inventory is below the safety stock. Among
such products, it prioritizes them in the descending order of $b_k/m_k$, i.e., products 1, 2, and then 3. Although
the MDP policy makes similar decisions, it sometimes prioritizes the products differently. To be specific,
first, when $q_1 \in \{0, 1\}$, $q_2 = 0$, and $-18 \leq q_3 \leq -1$, the proposed policy manufactures product 2 since
$q_1 \geq s_1 = 0$, $q_2 < s_2 = 1$, and $b_2/m_2 > b_3/m_3$. However, the MDP policy manufactures product 3 since
it is the only backordered product. Second, when $q_1 = 0$, $q_2 = 0$, and $0 \leq q_3 \leq 6$, the proposed policy
manufactures product 2 since $q_1 \geq s_1 = 0$ and $q_2 < s_2 = 1$. In contrast, the MDP policy manufactures
product 1, aiming to hold a slightly larger stock of product 1 before switching to the other products. Lastly, if
$q_k \geq s_k$ for $k = 1, 2, 3$, the proposed policy either manufactures product 3 or idles the server. However, when
$q_1 = 0$, $q_2 = 1$, and $0 \leq q_3 \leq 6$, the MDP policy manufactures product 1, aiming to hold a slightly larger
stock of product 1 before switching to another product. Similarly, when $q_1 = 1$, $q_2 = 1$, and $2 \leq q_3 \leq 6$, the
MDP policy manufactures product 2, aiming to hold a slightly larger stock of product 2 before switching to
another product.

$^{11}$ Further analysis (available from the authors) shows that the policy comparison provided in Figure 12 and Table
8 remains similar if the entire state space is considered, although the set of states where the two policies differ is
slightly larger in that case.

M Lost Sales

Although this paper focuses on a make-to-stock manufacturing system with backlogged orders, its solution
technique can be easily adapted to the lost sales case by imposing an exogenous lower reflecting barrier at
zero on the workload process. Letting $\kappa$ denote the (effective) lost sales cost in the workload formulation,
the Bellman equation would be: find $u, \gamma \in \mathbb{R}$ and $f \in C^2([0, u])$ that satisfy
\[
\min_{x \in \mathbb{R}} \left\{ \tfrac{1}{2}\sigma^2 f''(w) + \mu f'(w) - x f'(w) + c(x) + h(w) \right\} = \gamma, \quad w \in (0, u),
\]
subject to the boundary conditions $f'(0) = -\kappa$ and $f'(u) = 0$ and the smooth pasting condition $f''(u) = 0$.$^{12}$
As done in Section 6.2, the Bellman equation can be equivalently written in terms of $u, \gamma \in \mathbb{R}$ and $v \in C^1([0, u])$.
Then, $u_\gamma$ and $v_\gamma : [0, u_\gamma] \to \mathbb{R}$ can be defined for each $\gamma \geq 0$ as done in Section 6.3. It is straightforward to
characterize $v_\gamma$ as done in Lemma 7, argue that $\gamma^\star$ is the unique solution to $v_{\gamma^\star}(0) = -\kappa$, and show that
the optimal policy is characterized by $(u_{\gamma^\star}, \gamma^\star, v_{\gamma^\star})$. This two-sided barrier policy with drift rate control
can then be interpreted in the context of the original control problem to obtain an effective dynamic control
policy for the manufacturing system with lost sales as done above for the backlog case.

$^{12}$ Since the lower reflecting barrier is exogenous, no smooth pasting condition is needed for it.

N Additional Details

Barrier Policies of Wein (1992a) and Veatch and Wein (1996). The policy proposed in Wein (1992a)
is a one-sided barrier policy (with no drift rate control). The policy proposed for the lost sales case in Veatch
and Wein (1996) is a two-sided barrier policy (with no drift rate control). However, its lower barrier is fixed
at zero.

Impact of the Scaling of the Outsourcing Cost. The scaling we use for the outsourcing cost ensures
that the trade-off between outsourcing and dynamic pricing carries over to the limiting control problem. If
the outsourcing cost (in excess of the production cost) were larger than $O(1/\sqrt{n})$, outsourcing would be too
expensive to use in the limit, and the system manager would choose dynamic pricing instead of outsourcing.
That is, no orders would be outsourced.

Relationship Between the Base Case in Our Simulation Study and That of Wein (1992a).
The values of $\lambda_k^\star$ and $\mu_k$ used in Section 8 for $k = 1, 2, 3$ are consistent with Wein (1992a) in the heavy
traffic limit, because Wein (1992a) scales time by $n = 100$ to arrive at the approximating Brownian control
problem whereas we scale the instantaneous demand and service rates by $n = 100$. Wein (1992a) does not
consider dynamic pricing, and uses an average server utilization of 0.90. We consider a range of average
server utilizations from 0.9 to 0.99 to assess the effectiveness of the proposed policy for a broader set of
parameters; see Table 2 in Appendix F.

System Parameter Used in the Simulation Study. In problems like ours, one can set $n = O(1/(1-\rho)^2)$;
see e.g., Wein (1992a, Section 2). For our base case (in Section 8), this would give $n = O(400)$. Another
common approach is to set $n = O(\sum_{k=1}^{K} \lambda_k^\star)$; see e.g., Ata and Olsen (2009, Section 7). That approach
would give $n = O(55)$. We followed a middle ground and set $n = 100$.
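
The first rule of thumb is easy to evaluate numerically. The sketch below computes $1/(1-\rho)^2$ for a few utilizations. The value $\rho = 0.95$ for the base case is an inference from the figure of 400 quoted above and is not stated in this appendix. The other two values are the endpoints of the utilization range considered in the simulation study.

```python
def scale_from_utilization(rho):
    """Order of magnitude of n suggested by the rule n = O(1 / (1 - rho)^2)."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("utilization must lie in [0, 1)")
    return 1.0 / (1.0 - rho) ** 2


# 0.95 is an inferred base-case utilization; 0.90 and 0.99 bound the studied range.
for rho in (0.90, 0.95, 0.99):
    print(rho, round(scale_from_utilization(rho)))
```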

Combining the Myopic(P) Index Scheduling Policy with our Proposed Pricing and Outsourcing
Policies. If the products can be renumbered such that $h_k/m_k$ and $b_k/m_k$ are non-increasing in $k$, the
Myopic(P) index policy holds most of its backlog and inventory in the products suggested by BCP (23).
Thus, we can combine this scheduling policy with the pricing and outsourcing policies derived from the
solution to the Brownian control problem. Perez and Zipkin (1997) also propose an index policy, in which
the indices correspond to the change in the holding and backorder cost if the server were to manufacture
the product for a longer duration of time. This duration of time is equal to the sojourn time of a system in
which the other products are removed. In our simulations, the two index policies had a similar performance.
In fact, in many cases, their long-run average costs were statistically indistinguishable. Therefore, motivated
by Veatch and Wein (1996), we use the Myopic(P) indices.

Holding and Backorder Costs Used in the Simulation Study. In Section 8, we assume the holding
cost is equal to the (per period) cost of capital multiplied by the production cost. To be specific, letting
r denote the (per period) cost of capital, we have αk = rδk for k = 1, 2, 3. Taking a (per period) cost of
capital of r = 0.005 (equivalent to an annual cost of capital of 52 × 0.005 = 0.26, i.e., 26%, when each period
corresponds to one week), we obtain α1 = α2 = α3 = 0.005.

Historical Profit Margins. See https://www.readyratios.com/sec/ratio/gross-margin/ (accessed on November
1st, 2021) for historical profit margins in various industries; see e.g., apparel and other finished products
made from fabrics and similar material, lumber and wood products, industrial and commercial machinery
and computer equipment, and miscellaneous manufacturing industries.

Reporting the Long-Run Average Cost As Opposed to the Long-Run Average Profit In the
Simulation Study. To highlight the difference among the policies we consider, we use the long-run average
cost of the manufacturing system, i.e., $\liminf_{t \to \infty} E[\xi(t)]/t$, as our performance criterion. This criterion
focuses solely on the controllable costs, i.e., the holding cost, the backorder cost, the outsourcing cost, and
the profit loss due to the deviation (of the price) from the (nominal) static price $p^\star = \Lambda^{-1}(\lambda^\star)$. It does
not include the profit under the nominal instantaneous demand rate $\Pi(\lambda^\star)$ since this term is uncontrollable.
Using the long-run average cost as opposed to the long-run average profit allows us to better highlight the
difference among the various policies, which helps our comparison; see Ata and Olsen (2009, Section 7)
and Kim and Randhawa (2017, Section 6.1 and Appendix A.1) for similar treatments. If we were to use
the long-run average profit, the profit under the nominal instantaneous demand rate would dominate the
controllable costs, and it would seem that all policies perform well.

Production Time Distributions Used in the Simulation Study. In Section 8, we use lognormal and
gamma distributions motivated by Bradley and Glynn (2002), which uses lognormal production times, and
Arreola-Risa and DeCroix (1998), which uses gamma production times. These distributions have the added
benefit of allowing us to change their coefficient of variation while keeping the average production time
constant. This helps us gain additional insights into the impact of the production times on the performance
of the various policies considered.

O Miscellaneous Proofs

Proof of Proposition 1. Fix $n$ and let $(O^n, T^n, \lambda^n)$ denote an arbitrarily chosen admissible policy for the
$n$-th system. The cumulative profit process of this system is given by
\[
V^n(t) = \int_0^t \Pi^n(\lambda^n(s))\,ds - \sum_{k=1}^{K} \int_0^t \upsilon_k^n(Q_k^n(s))\,ds - \sum_{k=1}^{K} \nu_k^n O_k^n(t), \quad t \geq 0. \tag{173}
\]
Since the second and third terms on the right-hand side of (173) are non-positive, we have
\[
V^n(t) \leq \int_0^t \Pi^n(\lambda^n(s))\,ds, \quad t \geq 0. \tag{174}
\]
Substituting the second equality in (11) into (174) gives
\[
V^n(t) \leq \int_0^t n\,\Pi(\lambda^n(s)/n)\,ds, \quad t \geq 0. \tag{175}
\]
By Assumption 3, the static planning problem (8) has the unique optimal solution $\lambda^\star$, i.e., $\Pi(\lambda) \leq \Pi(\lambda^\star)$
for $\lambda \in L$. Substituting this into (175) yields the desired result.

Proof of Proposition 4. The proof of this proposition is almost identical to the proof of Rubino and
Ata (2009, Proposition 3) and closely follows the proof of Harrison and Van Mieghem (1997, Propositions
3-4 and Theorem 2) using the approach outlined in the appendix of Harrison and Van Mieghem (1997).
The only minor modification needed is the generalization of the arguments in Harrison and Van Mieghem
(1997) to allow a state-dependent drift term and an outsourcing term. However, the analysis of Harrison
and Van Mieghem (1997) is not sensitive to the drift term or the inclusion of an outsourcing process, and
therefore, a state-dependent drift term and an outsourcing term can be accommodated easily.

Proof of Proposition 2. Consider the barrier policy $(L, U, \theta)$ and assume $\gamma \in \mathbb{R}$ and $f \in C^2([l, u])$ jointly
satisfy (35)-(36). A routine application of Ito's lemma gives
\begin{align}
E\big[f(W(t)) - f(W(0))\big] &= E\Big[\int_0^t \Gamma_\theta f(W(s))\,ds\Big] + E\Big[\int_0^t f'(W(s))\,dL(s)\Big] - E\Big[\int_0^t f'(W(s))\,dU(s)\Big] \notag \\
&= E\Big[\int_0^t \Gamma_\theta f(W(s))\,ds\Big] + E\Big[\int_0^t -\kappa\,dL(s)\Big] \tag{176} \\
&= \gamma t - E\Big[\int_0^t c(\theta(W(s)))\,ds + \int_0^t h(W(s))\,ds\Big] - E\big[\kappa L(t)\big] \tag{177} \\
&= \gamma t - E\Big[\int_0^t c(\theta(W(s)))\,ds + \int_0^t h(W(s))\,ds + \kappa L(t)\Big], \tag{178}
\end{align}
where (176) follows from (34) and (36), (177) follows from (35), and (178) follows from (177) and a
rearrangement of the terms. Dividing both sides of (178) by $t$ and taking the limit as $t \to \infty$ gives
\[
\lim_{t \to \infty} \frac{1}{t} E\big[f(W(t))\big] - \lim_{t \to \infty} \frac{1}{t} f(W(0)) = \gamma - \lim_{t \to \infty} \frac{1}{t} E\Big[\int_0^t c(\theta(W(s)))\,ds + \int_0^t h(W(s))\,ds + \kappa L(t)\Big].
\]
The first term on the left-hand side of this equation vanishes because $f'$ is bounded and $W(t) \in [l, u]$ for
$t \geq 0$. The second term on the left-hand side of this equation also vanishes since $W(0) = 0$. Consequently,
we arrive at
\[
\lim_{t \to \infty} \frac{1}{t} E\Big[\int_0^t c(\theta(W(s)))\,ds + \int_0^t h(W(s))\,ds + \kappa L(t)\Big] = \gamma.
\]

Proof of Lemma 3. It is straightforward to write $l$ and $u$ in terms of $\gamma$ using Equations (46)-(47). It
remains to prove $l < 0$, $u > 0$, and $\gamma > 0$. We prove these statements separately.

Lower barrier is negative. As a preliminary to proving $l < 0$, note that $\kappa$ is non-negative by definition.
Moreover, $\mu$ is non-negative because $\eta \geq 0$. Furthermore, $H^{-1}$ is positive definite because $H$ is positive
definite by Assumption 2. Therefore, by Equation (48), we have

h(u) ≤ h(l). (179)

We prove l < 0 by contradiction. Assume l ≥ 0. Since h is strictly increasing on [0, ∞), u ≥ l by definition,
and h(u) ≤ h(l) by (179), we must have l = u. However, l = u contradicts the boundary conditions (44)-(45).
Therefore, the assumption that l ≥ 0 is incorrect.

Upper barrier is positive. We prove $u > 0$ by contradiction. Assume $u \leq 0$. First, we show that $v', v'' > 0$
on $(w_0, u)$ for some $w_0 \in (l, u)$. Then, we show that this result contradicts boundary condition (45). Note
that $v \in C^1([l, u])$, the quadratic function is smooth, and $h$ is continuously differentiable on $(l, u)$ for $l, u \leq 0$.
Therefore, it follows from (43) that $v \in C^2([l, u])$; i.e., $v''$ is well-defined on $[l, u]$. Thus, we can differentiate
both sides of (43) to obtain
\[
v''(w) = \frac{m' H^{-1} m}{\sigma^2}\, v(w)\, v'(w) - \frac{2\mu}{\sigma^2}\, v'(w) - \frac{2h'(w)}{\sigma^2}, \quad w \in (l, u). \tag{180}
\]
Recall that $v$ and $v'$ are continuous on $[l, u]$ and $v(u) = v'(u) = 0$ by boundary conditions (44)-(45).
Therefore, the first two terms on the right-hand side of (180) are small, i.e., close to zero, in a neighborhood
of $u$. However, since $h'(w) = -b^\star$ for $w \in [l, u)$ with $l, u \leq 0$, the third term on the right-hand side of (180)
is positive. Therefore, there exists $w_0 \in (l, u)$ such that $v'' > 0$ on $(w_0, u)$. Moreover, since $l, u \leq 0$, by
Corollary 6, we have $v' > 0$ on $(l, u)$. Combining $v'' > 0$ on $(w_0, u)$ and $v' > 0$ on $(l, u)$ gives $v', v'' > 0$ on
$(w_0, u)$, which implies $v'(u) > 0$. This contradicts boundary condition (45). Therefore, the assumption that
$u \leq 0$ is incorrect.

Average Cost is positive. It follows from Equation (49) and u > 0 that γ > 0.

Proof of Lemma 4.

Part 1. We start by proving $v_0^-(0) < 0$. We prove this statement by contradiction. Assume $v_0^-(0) \geq 0$. First,
we show that there exists $w_0 \in (l_0, 0]$ such that $\partial v_0^-(w_0)/\partial w \leq 0$. Then, we show that this result contradicts
Corollary 6. It follows from the intermediate value theorem (see e.g., Royden and Fitzpatrick (1968, Section
1.6)), the continuity of $v_0^-$, and the fact that $v_0^-(l_0) = -\kappa < 0$ and $v_0^-(0) \geq 0$ that there exists $w_0 \in (l_0, 0]$
such that $v_0^-(w_0) = 0$. Then, by Equation (50), we have
\begin{align}
\frac{\partial}{\partial w} v_0^-(w_0) &= \frac{m' H^{-1} m}{2\sigma^2} \Big( (v_0^-)^2(w_0) - \kappa^2 \Big) - \frac{2\mu}{\sigma^2} \Big( v_0^-(w_0) + \kappa \Big) + \frac{2\big(h(l_0) - h(w_0)\big)}{\sigma^2} \notag \\
&= -\frac{m' H^{-1} m}{2\sigma^2} \kappa^2 - \frac{2\mu}{\sigma^2} \kappa + \frac{2\big(h(l_0) - h(w_0)\big)}{\sigma^2} \tag{181} \\
&\leq -\frac{m' H^{-1} m}{2\sigma^2} \kappa^2 - \frac{2\mu}{\sigma^2} \kappa + \frac{2h(l_0)}{\sigma^2} \tag{182} \\
&= 0, \tag{183}
\end{align}
where (181) follows from $v_0^-(w_0) = 0$, (182) follows from the non-negativity of $h$, and (183) follows from (46).
However, by Corollary 6, we know that $\partial v_0^-(w)/\partial w > 0$ for $w \in (l_0, 0]$, which contradicts (183). Therefore,
the assumption that $v_0^-(0) \geq 0$ is incorrect.

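Next, we show that $v_\gamma^-(0)$ is strictly increasing in $\gamma$. Consider the differential equation
\[
\hat{v}'(w) = \frac{m' H^{-1} m}{2\sigma^2} \big( \hat{v}^2(w) - \kappa^2 \big) - \frac{2\mu}{\sigma^2} \big( \hat{v}(w) + \kappa \big) + \frac{2b^\star}{\sigma^2} w, \quad w \in (0, \infty), \tag{184}
\]
subject to the boundary condition $\hat{v}(0) = -\kappa$. It follows from (50) and (184) that
\[
v_\gamma^-(w) = \hat{v}(w - l_\gamma), \quad w \in [l_\gamma, 0]. \tag{185}
\]
Thus, for $\gamma > 0$,
\[
\frac{\partial}{\partial \gamma} v_\gamma^-(0) = \frac{\partial}{\partial \gamma} \hat{v}(-l_\gamma) = \frac{1}{b^\star} \hat{v}'(-l_\gamma) > 0, \tag{186}
\]
where the second equality in (186) follows from (49) and the inequality in (186) follows from Lemma 19.
Therefore, $v_\gamma^-(0)$ is strictly increasing in $\gamma$.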

It remains to show that vγ- (0) → ∞ as γ → ∞. By (49) and (185), it suffices to show that v̂(w) → ∞ as
w → ∞. We prove this statement by contradiction. Assume v̂ is upper bounded. Then, since v̂ is also lower
bounded (by Lemma 18), the first two terms on the right-hand side of (184) are bounded. Then, since the
third term on the right-hand side of (184) increases without bound, i.e., 2b? w/σ 2 → ∞ as w → ∞, we have
v̂ 0 (w) → ∞ as w → ∞. Then, by the fundamental theorem of calculus, we have
\[
\hat v(w) = \hat v(0) + \int_0^w \hat v'(w')\,dw' \to \infty, \tag{187}
\]

which contradicts the assumption that v̂ is upper bounded.
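
As an informal numerical companion to Part 1, the sketch below integrates (184) with a classical fourth-order Runge-Kutta scheme and checks that the computed solution stays at or above $-\kappa$ and increases along the grid. The parameter values (`M` standing for $m'H^{-1}m$, `SIGMA2`, `MU`, `KAPPA`, `B_STAR`), the step count, and the horizon are illustrative assumptions and are not taken from the paper. The check only illustrates the statements of Lemmas 18 and 19 on a short interval; it proves nothing.

```python
# Illustrative check of the behaviour of v_hat in (184).
# Every number below is a hypothetical choice, not a value from the paper.
M = 1.0        # stands for m' H^{-1} m (assumed)
SIGMA2 = 1.0   # sigma^2 (assumed)
MU = 0.5       # mu (assumed)
KAPPA = 1.0    # kappa (assumed)
B_STAR = 1.0   # b^* (assumed)


def rhs_hat(w, v):
    """Right-hand side of (184)."""
    return (M / (2 * SIGMA2) * (v * v - KAPPA * KAPPA)
            - 2 * MU / SIGMA2 * (v + KAPPA)
            + 2 * B_STAR / SIGMA2 * w)


def rk4_path(rhs, v0, w_end, steps):
    """Return the list [(w_0, v_0), ..., (w_n, v_n)] computed with classical RK4."""
    h = w_end / steps
    w, v = 0.0, v0
    path = [(w, v)]
    for i in range(steps):
        k1 = rhs(w, v)
        k2 = rhs(w + h / 2, v + h * k1 / 2)
        k3 = rhs(w + h / 2, v + h * k2 / 2)
        k4 = rhs(w + h, v + h * k3)
        v += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        w = (i + 1) * h
        path.append((w, v))
    return path


# Start from the boundary condition v_hat(0) = -kappa.
path = rk4_path(rhs_hat, -KAPPA, w_end=1.0, steps=1000)
values = [v for _, v in path]

stays_above = all(v >= -KAPPA for v in values)                 # Lemma 18
increasing = all(b > a for a, b in zip(values, values[1:]))    # Lemma 19
print(stays_above, increasing)
```

The horizon is kept short on purpose: the sketch is meant to show the qualitative shape near the boundary condition, not to explore the long-run growth used in the last step of Part 1.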

Part 2. The proof of Part 2 resembles the proof of Part 1. The equality v0+ (0) = 0 follows from (53). To
prove vγ+ (0) is strictly decreasing in γ, we consider the differential equation
\[
\tilde v'(w) = -\frac{m' H^{-1} m}{2\sigma^2}\,\tilde v^2(w) + \frac{2\mu}{\sigma^2}\,\tilde v(w) - \frac{2h^*}{\sigma^2}\,w, \quad w \in (0, \infty), \tag{188}
\]
subject to the boundary condition ṽ(0) = 0. It is straightforward to see that

vγ+ (w) = ṽ(uγ − w), w ∈ [0, uγ ]. (189)

By Lemma 20, ṽ, ṽ 0 < 0 on (0, ∞). Therefore, for γ > 0,


\[
\frac{\partial}{\partial\gamma} v_\gamma^+(0) = \frac{\partial}{\partial\gamma} \tilde v(u_\gamma) = \frac{1}{h^*}\,\tilde v'(u_\gamma) < 0.
\]
In other words, vγ+ (0) is strictly decreasing in γ.

It remains to show that $v_\gamma^+(0) \to -\infty$ as $\gamma \to \infty$. By (189), it suffices to show that $\tilde v(w) \to -\infty$ as $w \to \infty$.
Since $\tilde v < 0$ (by Lemma 20), the first two terms on the right-hand side of (188) are upper bounded. Therefore,
since the third term on the right-hand side of (188) decreases without bound, i.e., $-2h^* w/\sigma^2 \to -\infty$ as
$w \to \infty$, we have $\tilde v'(w) \to -\infty$ as $w \to \infty$. Therefore,
\[
\tilde v(w) = \tilde v(0) + \int_0^w \tilde v'(w')\,dw' \to -\infty.
\]

Proof of Proposition 3. We prove the result in four steps. In Step 1, we show that v is non-positive
and strictly increasing. In Step 2, we show that v is continuously differentiable. In Step 3, we show that
(lγ ? , uγ ? , γ ? , v) is a solution to (43)-(45). Finally, in Step 4, we show that the solution to (43)-(45) is unique.

Step 1: v is non-positive and strictly increasing on [lγ ? , uγ ? ]. If we show that v is strictly increasing
on [lγ ? , uγ ? ], it follows from v(uγ ? ) = 0 that v is non-positive. To prove v is strictly increasing, we consider
the intervals (lγ ? , 0) and (0, uγ ? ), separately. It follows from the definition of v in (54) and Corollary 6 that
v 0 (w) > 0 for w ∈ (lγ ? , 0). To prove v 0 (w) > 0 for w ∈ (0, uγ ? ), consider the differential equation
\[
\tilde v'(w) = -\frac{m' H^{-1} m}{2\sigma^2}\,\tilde v^2(w) + \frac{2\mu}{\sigma^2}\,\tilde v(w) - \frac{2h^*}{\sigma^2}\,w, \quad w \in (0, \infty),
\]
subject to the boundary condition ṽ(0) = 0. It is straightforward to see that

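\[
v_{\gamma^*}^+(w) = \tilde v(u_{\gamma^*} - w), \quad w \in [0, u_{\gamma^*}].
\]

Since $\tilde v'(w) < 0$ for $w \in (0, u_{\gamma^*})$ (by Lemma 20) and $v'(w) = -\tilde v'(u_{\gamma^*} - w)$ on this interval, we have $v'(w) > 0$ for $w \in (0, u_{\gamma^*})$.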

Step 2: v is continuously differentiable. Since $v_{\gamma^*}^-$ is continuously differentiable on $[l_{\gamma^*}, 0]$, $v_{\gamma^*}^+$ is
continuously differentiable on $[0, u_{\gamma^*}]$, and $v$ is given by (54), we just need to show that $v$ and $v'$ are
continuous at the origin. The continuity of $v$ at the origin follows from Corollary 1. To show $v'$ is continuous
at the origin, we calculate the left and right limits of $v'$ at the origin and show that the two limits coincide.
By (50)-(51) and (54), we have
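\[
\lim_{w \to 0^-} v'(w) = \lim_{w \to 0^-} \frac{\partial}{\partial w} v_{\gamma^*}^-(w) = \frac{m' H^{-1} m}{2\sigma^2}\Big[(v_{\gamma^*}^-)^2(0) - \kappa^2\Big] - \frac{2\mu}{\sigma^2}\Big[v_{\gamma^*}^-(0) + \kappa\Big] + \frac{2h(l_{\gamma^*})}{\sigma^2}. \tag{190}
\]
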
Similarly, by (52)-(53) and (54), we have
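\[
\lim_{w \to 0^+} v'(w) = \lim_{w \to 0^+} \frac{\partial}{\partial w} v_{\gamma^*}^+(w) = \frac{m' H^{-1} m}{2\sigma^2}\,(v_{\gamma^*}^+)^2(0) - \frac{2\mu}{\sigma^2}\,v_{\gamma^*}^+(0) + \frac{2h(u_{\gamma^*})}{\sigma^2}. \tag{191}
\]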
Then, it follows from Corollary 1, Equations (190)-(191), and the definition of lγ ? and uγ ? that

\[
\lim_{w \to 0^-} v'(w) = \lim_{w \to 0^+} v'(w). \tag{192}
\]

Step 3: (lγ ? , uγ ? , γ ? , v) is a solution to (43)-(45). First, we show that (lγ ? , uγ ? , γ ? , v) satisfies the
differential equation (43) and boundary condition (44). Then, we show that it also satisfies boundary
condition (45). It follows from the definition of lγ ? , the fact that vγ- ? solves (50)-(51), and the definition of
v in (54) that v satisfies (43) on (lγ ? , 0) and v(lγ ? ) = −κ. Similarly, it follows from the definition of uγ ? , the
fact that vγ+ ? satisfies (52)-(53), and the definition of v in (54) that v satisfies (43) on (0, uγ ? ) and v(uγ ? ) = 0.

It remains to show $v'(l_{\gamma^*}) = v'(u_{\gamma^*}) = 0$. Substituting (51) into (50) gives $\partial v_{\gamma^*}^-(l_{\gamma^*})/\partial w = 0$. Then,
since $v(w) = v_{\gamma^*}^-(w)$ for $w \in [l_{\gamma^*}, 0]$, we have $v'(l_{\gamma^*}) = 0$. Similarly, substituting (53) into (52) gives
$\partial v_{\gamma^*}^+(u_{\gamma^*})/\partial w = 0$. Then, since $v(w) = v_{\gamma^*}^+(w)$ for $w \in [0, u_{\gamma^*}]$, we have $v'(u_{\gamma^*}) = 0$.

Step 4: The solution to (43)-(45) is unique. Any solution to (43)-(45) must satisfy (50)-(51) and
(52)-(53). It must also be continuous at the origin. Given γ, the solutions to (50)-(51) and (52)-(53) are
unique. Moreover, there exists a unique γ for which the solutions to (50)-(51) and (52)-(53) coincide at the
origin (to ensure v is continuous at the origin). Therefore, the solution to (43)-(45) is unique.
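
The matching argument in Step 4 can be mimicked numerically by a bisection on $\gamma$. The sketch below is only an illustration under stated assumptions: it takes $l_\gamma = -\gamma/b^*$ and $u_\gamma = \gamma/h^*$ (a reading of (49) inferred from the factors $1/b^*$ and $1/h^*$ in the proof of Lemma 4, since (49) is not restated here), uses the representations (185) and (189), and picks hypothetical parameter values and a hypothetical bracket $[0, 1]$ for $\gamma$. The names `gap`, `find_gamma_star`, and `rk4_end` are introduced here for illustration.

```python
# Bisection sketch for the gamma at which v_gamma^-(0) and v_gamma^+(0) coincide.
# All numbers are hypothetical; l_gamma = -gamma/b^* and u_gamma = gamma/h^* are assumed.
M, SIGMA2, MU, KAPPA = 1.0, 1.0, 0.5, 1.0   # m'H^{-1}m, sigma^2, mu, kappa (assumed)
B_STAR, H_STAR = 1.0, 2.0                   # b^*, h^* (assumed)


def rhs_hat(w, v):
    """Right-hand side of (184)."""
    return (M / (2 * SIGMA2) * (v * v - KAPPA * KAPPA)
            - 2 * MU / SIGMA2 * (v + KAPPA)
            + 2 * B_STAR / SIGMA2 * w)


def rhs_tilde(w, v):
    """Right-hand side of (188)."""
    return (-M / (2 * SIGMA2) * v * v
            + 2 * MU / SIGMA2 * v
            - 2 * H_STAR / SIGMA2 * w)


def rk4_end(rhs, v0, w_end, steps=2000):
    """Integrate v' = rhs(w, v) from 0 to w_end with classical RK4; return v(w_end)."""
    h = w_end / steps
    w, v = 0.0, v0
    for i in range(steps):
        k1 = rhs(w, v)
        k2 = rhs(w + h / 2, v + h * k1 / 2)
        k3 = rhs(w + h / 2, v + h * k2 / 2)
        k4 = rhs(w + h, v + h * k3)
        v += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        w = (i + 1) * h
    return v


def gap(gamma):
    """v_gamma^-(0) - v_gamma^+(0), via (185) and (189) under the assumed l_gamma, u_gamma."""
    left = rk4_end(rhs_hat, -KAPPA, gamma / B_STAR)    # v_hat(-l_gamma)
    right = rk4_end(rhs_tilde, 0.0, gamma / H_STAR)    # v_tilde(u_gamma)
    return left - right


def find_gamma_star(lo=0.0, hi=1.0, tol=1e-10):
    """Bisection on gamma; relies on gap being increasing with a sign change on [lo, hi]."""
    assert gap(lo) < 0 < gap(hi), "bracket does not contain a sign change"
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


gamma_star = find_gamma_star()
v_at_origin = rk4_end(rhs_hat, -KAPPA, gamma_star / B_STAR)
print(gamma_star, v_at_origin, gap(gamma_star))
```

Bisection is used here because Lemma 4 gives monotonicity of both one-sided values in $\gamma$, so a sign change of the gap pins down a single crossing; any other bracketing root finder would serve equally well.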

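Proof of Theorem 1. Given $v$, define $f : \mathbb{R} \to \mathbb{R}$ as follows:

\[
f(w) =
\begin{cases}
\kappa\,(l_{\gamma^*} - w), & w \in (-\infty, l_{\gamma^*}), \\[4pt]
\displaystyle\int_{l_{\gamma^*}}^{w} v(x)\,dx, & w \in [l_{\gamma^*}, u_{\gamma^*}], \\[8pt]
\displaystyle\int_{l_{\gamma^*}}^{u_{\gamma^*}} v(x)\,dx, & w \in (u_{\gamma^*}, \infty).
\end{cases}
\tag{193}
\]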

Note that Equation (193) extends the domain of the real-valued function f defined in (55) from [lγ ? , uγ ? ] to
the entire real line. We prove the theorem in two steps. In Step 1, we show that −κ ≤ f 0 (w) ≤ 0 for w ∈ R
and

\[
\min_{x \in \mathbb{R}} \Big\{ \tfrac{1}{2}\sigma^2 f''(w) + \mu f'(w) - x f'(w) + c(x) + h(w) \Big\} \ge \gamma^*, \quad w \in \mathbb{R}. \tag{194}
\]

In Step 2, we show that the average cost of any arbitrary policy (L, U, θ) is greater than or equal to the
average cost of our candidate policy.

Step 1. By Proposition 3, $v$ is strictly increasing on $(l_{\gamma^*}, u_{\gamma^*})$, $v(l_{\gamma^*}) = -\kappa$, and $v(u_{\gamma^*}) = 0$. Therefore,
$-\kappa \le v(w) \le 0$ for $w \in [l_{\gamma^*}, u_{\gamma^*}]$, which gives $-\kappa \le f'(w) \le 0$ for $w \in \mathbb{R}$. Moreover, by Corollary
2, $(l_{\gamma^*}, u_{\gamma^*}, \gamma^*, f)$ solves (37)-(38). Equation (194), then, follows from (37) and the fact that $h$ is strictly
decreasing on $(-\infty, l_{\gamma^*})$ and strictly increasing on $(u_{\gamma^*}, \infty)$.

Step 2. Let $(L, U, \theta)$ be an arbitrary admissible policy and denote its state process by $W$. A routine
application of Itô's lemma gives
\begin{align*}
\mathbb{E}\big[f(W(t))\big] - f(W(0)) ={}& \mathbb{E}\Big[\int_0^t \tfrac{1}{2}\sigma^2 f''(W(s))\,ds + \int_0^t \mu f'(W(s))\,ds - \int_0^t \theta(s) f'(W(s))\,ds\Big] \\
&+ \mathbb{E}\Big[\int_0^t f'(W(s))\,dL(s)\Big] - \mathbb{E}\Big[\int_0^t f'(W(s))\,dU(s)\Big], \quad t \ge 0. \tag{195}
\end{align*}

First, we obtain a lower bound for each of the three terms on the right-hand side of Equation (195). Then,
we substitute the lower bounds into Equation (195) and take the lim inf as t → ∞ to obtain the desired
result. We start with the first term on the right-hand side of (195). By (194),

\[
\tfrac{1}{2}\sigma^2 f''(W(t)) + \mu f'(W(t)) - \theta(t) f'(W(t)) + c(\theta(t)) + h(W(t)) \ge \gamma^*, \quad t \ge 0. \tag{196}
\]

Integrating both sides of (196) from 0 to t, taking the expectation, and rearranging the terms gives
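\begin{align*}
\mathbb{E}\Big[\int_0^t \tfrac{1}{2}\sigma^2 f''(W(s))\,ds + \int_0^t \mu f'(W(s))\,ds - \int_0^t \theta(s) f'(W(s))\,ds\Big] \ge{}& \gamma^* t - \mathbb{E}\Big[\int_0^t c(\theta(s))\,ds\Big] \\
&- \mathbb{E}\Big[\int_0^t h(W(s))\,ds\Big] \tag{197}
\end{align*}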

for t ≥ 0. To obtain a lower bound for the second term on the right-hand side of (195), we write it out as
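\[
\mathbb{E}\Big[\int_0^t f'(W(s))\,dL(s)\Big] = \mathbb{E}\Big[\int_0^t f'(W(s))\,dL^c(s) + \sum_{s \le t} \Big( f\big(W(s^-) + \Delta L(s)\big) - f\big(W(s^-)\big) \Big)\Big], \tag{198}
\]
where $L^c$ denotes the continuous part of $L$ and $\Delta L$ denotes its jumps. To be specific, for $t \ge 0$,
\[
\Delta L(t) = L(t) - L(t^-) \quad \text{and} \quad L^c(t) = L(t) - \sum_{\substack{0 \le s \le t \\ L(s) \ne L(s^-)}} \Delta L(s).
\]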

Substituting f 0 (w) ≥ −κ for w ∈ R, which we showed in Step 1, into (198) gives


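\begin{align*}
\mathbb{E}\Big[\int_0^t f'(W(s))\,dL(s)\Big] &\ge -\mathbb{E}\Big[\int_0^t \kappa\,dL^c(s) + \kappa \sum_{s \le t} \Delta L(s)\Big] \\
&= -\kappa\,\mathbb{E}\big[L(t)\big], \quad t \ge 0. \tag{199}
\end{align*}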

Using a similar argument, we obtain


\[
\mathbb{E}\Big[\int_0^t f'(W(s))\,dU(s)\Big] \le 0, \quad t \ge 0. \tag{200}
\]

Substituting (197), (199), and (200) into (195) gives

\[
\mathbb{E}\big[f(W(t))\big] - f(W(0)) \ge \gamma^* t - \mathbb{E}\Big[\int_0^t c(\theta(s))\,ds + \int_0^t h(W(s))\,ds + \kappa L(t)\Big], \quad t \ge 0. \tag{201}
\]
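
For reference, the substitution can be spelled out term by term. The display below only restates (195), (197), (199), and (200); the first bracket on the right of (195) is bounded by (197), the second by (199), and the third enters with a minus sign and is non-positive by (200).

```latex
\begin{align*}
\mathbb{E}\big[f(W(t))\big] - f(W(0))
&= \mathbb{E}\Big[\int_0^t \Big(\tfrac{1}{2}\sigma^2 f''(W(s)) + \mu f'(W(s)) - \theta(s) f'(W(s))\Big)\,ds\Big] \\
&\quad + \mathbb{E}\Big[\int_0^t f'(W(s))\,dL(s)\Big]
       - \mathbb{E}\Big[\int_0^t f'(W(s))\,dU(s)\Big] \\
&\ge \Big(\gamma^* t - \mathbb{E}\Big[\int_0^t c(\theta(s))\,ds\Big] - \mathbb{E}\Big[\int_0^t h(W(s))\,ds\Big]\Big)
     - \kappa\,\mathbb{E}\big[L(t)\big] - 0 .
\end{align*}
```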

We divide both sides of (201) by $t$ and take the lim inf as $t \to \infty$. The first term on the left-hand side
vanishes because $f'$ is bounded and $W$ satisfies (32). The second term on the left-hand side vanishes because
$W(0) = 0$. Consequently, we obtain
\[
\liminf_{t \to \infty} \frac{1}{t}\,\mathbb{E}\Big[\int_0^t c(\theta(s))\,ds + \int_0^t h(W(s))\,ds + \kappa L(t)\Big] \ge \gamma^*.
\]

P Auxiliary Results

This section introduces some auxiliary results that aid in the proof of various lemmas and propositions in
Appendices J and O. First, we present the auxiliary results for Appendix O since the main results are proved
in this appendix. Then, we provide the auxiliary results for Appendix J.

P.1 Auxiliary Results for Appendix O

Lemma 18. Let v ∈ C 1 ([0, ∞)) be the solution to


\[
v'(w) = \frac{m' H^{-1} m}{2\sigma^2}\Big[v^2(w) - \kappa^2\Big] - \frac{2\mu}{\sigma^2}\Big[v(w) + \kappa\Big] + \frac{2b^*}{\sigma^2}\,w, \quad w \in (0, \infty), \tag{202}
\]
subject to the boundary condition v(0) = −κ. Then, v(w) ≥ −κ for w ∈ [0, ∞).

Proof. We prove this result by contradiction. Assume there exists w0 ∈ (0, ∞) such that v(w0 ) < −κ. Then,
it follows from the continuity of v, v 0 and the initial condition v(0) = −κ that there exists w1 ∈ (0, w0 ) such
that v(w0 ) < v(w1 ) < −κ and v 0 (w1 ) ≤ 0. The first two terms on the right-hand side of Equation (202)
evaluated at w = w1 are non-negative because v(w1 ) < −κ. Therefore, it follows from Equation (202) that
\[
v'(w_1) \ge \frac{2b^*}{\sigma^2}\,w_1 > 0,
\]
which contradicts v 0 (w1 ) ≤ 0. Thus, the assumption that there exists w0 ∈ [0, ∞) with v(w0 ) < −κ is
incorrect.

Lemma 19. Let v ∈ C 1 ([0, ∞)) be the solution to


\[
v'(w) = \frac{m' H^{-1} m}{2\sigma^2}\Big[v^2(w) - \kappa^2\Big] - \frac{2\mu}{\sigma^2}\Big[v(w) + \kappa\Big] + \frac{2b^*}{\sigma^2}\,w, \quad w \in (0, \infty), \tag{203}
\]
subject to the boundary condition v(0) = −κ. Then, v 0 (w) > 0 for w ∈ (0, ∞).

Proof. First, we show that there exists w0 > 0 such that v 0 (w) > 0 for w ∈ (0, w0 ]. Then, we partition the
interval (w0 , ∞) into three segments and show that v 0 > 0 on each segment. Since v ∈ C 1 ([0, ∞)) and the
quadratic function is smooth, it follows from (203) that v ∈ C 2 ([0, ∞)), i.e., v 00 is well-defined on [0, ∞).
Therefore, we can differentiate both sides of (203) to obtain
\[
v''(w) = \frac{m' H^{-1} m}{\sigma^2}\,v(w)\,v'(w) - \frac{2\mu}{\sigma^2}\,v'(w) + \frac{2b^*}{\sigma^2}, \quad w \in [0, \infty). \tag{204}
\]

It follows from Equation (203), boundary condition v(0) = −κ, and the continuous differentiability of v that
v 0 (0) = 0. Since v and v 0 are continuous on [0, ∞) and v 0 (0) = 0, the first two terms on the right-hand
side of (204) are small, i.e., close to zero, in a neighborhood of zero. Therefore, since the third term on the
right-hand side of (204) is positive, there exists w0 ∈ (0, ∞) such that v 00 (w) > 0 for w ∈ (0, w0 ]. It then
follows from v 0 (0) = 0 that v 0 (w) > 0 for w ∈ (0, w0 ].
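
The two values at the origin used above can be checked by direct substitution of $v(0) = -\kappa$ into (203) and (204). The display below is a worked restatement of that step, using $b^* > 0$ and no further assumptions.

```latex
\begin{align*}
v'(0) &= \frac{m' H^{-1} m}{2\sigma^2}\Big[(-\kappa)^2 - \kappa^2\Big]
        - \frac{2\mu}{\sigma^2}\Big[-\kappa + \kappa\Big]
        + \frac{2b^*}{\sigma^2}\cdot 0 = 0, \\
v''(0) &= \frac{m' H^{-1} m}{\sigma^2}\,v(0)\,v'(0)
        - \frac{2\mu}{\sigma^2}\,v'(0)
        + \frac{2b^*}{\sigma^2}
        = \frac{2b^*}{\sigma^2} > 0 .
\end{align*}
```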

Next, we show that v 0 (w) > 0 for w ∈ (w0 , ∞). To simplify the analysis, we partition (w0 , ∞) into three
subsets. To do so, let
\[
w_1 = \inf\Big\{ w \in (0, \infty) : v(w) \ge \frac{2\mu}{m' H^{-1} m} \Big\}. \tag{205}
\]
Without loss of generality, we assume w0 is small enough such that
\[
v'(w_0) < \frac{2b^*}{2\mu + m' H^{-1} m\,\kappa}, \tag{206}
\]
and w0 < w1 . We prove v 0 (w) > 0 for w ∈ (w0 , w1 ), w = w1 , and w ∈ (w1 , ∞), separately.

We start with w ∈ (w0 , w1 ). We would like to find a condition that ensures v 00 (w) ≥ 0. To this end, we fix
w ∈ (w0 , w1 ) and assume v 00 (w) < 0. It follows from Equation (204) that
\[
\frac{m' H^{-1} m}{\sigma^2}\,v(w)\,v'(w) - \frac{2\mu}{\sigma^2}\,v'(w) + \frac{2b^*}{\sigma^2} < 0. \tag{207}
\]
Multiplying both sides of (207) by σ 2 and rearranging the terms gives

\[
2b^* < \big(2\mu - m' H^{-1} m\,v(w)\big)\,v'(w). \tag{208}
\]

It follows from w < w1 and the definition of w1 in (205) that 2µ − m0 H -1 m v(w) > 0. Thus, dividing both
sides of (208) by 2µ − m0 H -1 m v(w) gives
\[
v'(w) > \frac{2b^*}{2\mu - m' H^{-1} m\,v(w)} \ge \frac{2b^*}{2\mu + m' H^{-1} m\,\kappa}, \tag{209}
\]
where the second inequality follows from $v(w) \ge -\kappa$ (by Lemma 18). So far, we have shown that for any
$w \in (w_0, w_1)$ for which $v''(w) < 0$, (209) holds. Therefore, by contraposition, if $v'(w)$ does not exceed the right-most bound in (209), i.e.,
\[
v'(w) \le \frac{2b^*}{2\mu + m' H^{-1} m\,\kappa}, \tag{210}
\]
we have $v''(w) \ge 0$. In other words, at $w \in (w_0, w_1)$ for which (210) holds, $v'$ is non-decreasing. It then follows
from (206) and $w_0 < w_1$ that $v'(w) \ge v'(w_0) > 0$ for $w \in (w_0, w_1)$. Since $v$ is continuously differentiable, we
also have $v'(w_1) > 0$.

It remains to show that $v'(w) > 0$ for $w \in (w_1, \infty)$. We prove this statement by contradiction. Assume there
exists $w_2 \in (w_1, \infty)$ such that $v'(w_2) \le 0$. Since $v'(w_1) > 0$ and $v'(w_2) \le 0$, there exists $w_3 \in (w_1, w_2)$ such
that $v'(w_3) > 0$ and $v''(w_3) \le 0$. Without loss of generality, we assume $w_3$ is the smallest such value, i.e.,
$v'(w) > 0$ for $w \in (w_1, w_3]$. Then, it follows from $v(w_1) = 2\mu/(m' H^{-1} m)$ (by the definition of $w_1$ in (205)) and
$v'(w) > 0$ for $w \in (w_1, w_3]$ that $v(w_3) > 2\mu/(m' H^{-1} m)$. Substituting $v'(w_3) > 0$ and $v(w_3) > 2\mu/(m' H^{-1} m)$
into (204) gives $v''(w_3) > 0$, which contradicts $v''(w_3) \le 0$. Therefore, the assumption that there exists
$w_2 \in (w_1, \infty)$ such that $v'(w_2) \le 0$ is incorrect.

Corollary 6. Given l < 0, let v ∈ C 1 ([l, 0]) be the solution to


\[
v'(w) = \frac{m' H^{-1} m}{2\sigma^2}\Big[v^2(w) - \kappa^2\Big] - \frac{2\mu}{\sigma^2}\Big[v(w) + \kappa\Big] + \frac{2\big(h(l) - h(w)\big)}{\sigma^2}, \quad w \in (l, 0),
\]
subject to the boundary condition v(l) = −κ. Then, v 0 (w) > 0 for w ∈ (l, 0].

Lemma 20. Let v ∈ C 1 ([0, ∞)) be the solution to


\[
v'(w) = -\frac{m' H^{-1} m}{2\sigma^2}\,v^2(w) + \frac{2\mu}{\sigma^2}\,v(w) - \frac{2h^*}{\sigma^2}\,w, \quad w \in (0, \infty), \tag{211}
\]
subject to the boundary condition v(0) = 0. Then, v, v 0 < 0 on (0, ∞).

Proof. First, we show that there exists $w_0 > 0$ such that $v, v' < 0$ on $(0, w_0]$. Then, we use contradiction to
argue that the same result holds for $w \in (w_0, \infty)$. Since $v \in C^1([0, \infty))$ and the quadratic function is smooth,
it follows from (211) that $v \in C^2([0, \infty))$, i.e., $v''$ is well-defined on $[0, \infty)$. Therefore, we can differentiate
both sides of (211) to obtain
\[
v''(w) = -\frac{m' H^{-1} m}{\sigma^2}\,v(w)\,v'(w) + \frac{2\mu}{\sigma^2}\,v'(w) - \frac{2h^*}{\sigma^2}, \quad w \in (0, \infty). \tag{212}
\]
It follows from Equation (211), boundary condition v(0) = 0, and the continuous differentiability of v that
v 0 (0) = 0. Then, since v and v 0 are continuous on [0, ∞) and v 0 (0) = 0, the first two terms on the right-hand
side of Equation (212) are small, i.e., close to zero, in a neighborhood of zero. Therefore, since the third
term on the right-hand side of Equation (212) is negative, there exists w0 ∈ (0, ∞) such that v 00 (w) < 0 for
w ∈ (0, w0 ]. It then follows from v 0 (0) = 0 that v 0 (w) < 0 for w ∈ (0, w0 ]. Similarly, it follows from boundary
condition v(0) = 0 and v 0 (w) < 0 for w ∈ (0, w0 ] that v(w) < 0 for w ∈ (0, w0 ].

It remains to show that $v'(w) < 0$ for $w \in (w_0, \infty)$. We prove this by contradiction. Assume there exists
$w_1 \in (w_0, \infty)$ such that $v'(w_1) \ge 0$. Since $v'(w_0) < 0$ and $v'(w_1) \ge 0$, there exists $w_2 \in (w_0, w_1)$ such
that $v''(w_2) \ge 0$. Without loss of generality, we assume $w_2$ is the smallest such value, i.e., $v''(w) < 0$ for
$w \in (w_0, w_2)$. Then, since $v(w_0), v'(w_0) < 0$, we have $v(w_2), v'(w_2) < 0$. Substituting $v(w_2), v'(w_2) < 0$ into
(212) evaluated at $w = w_2$ gives $v''(w_2) < 0$, which contradicts $v''(w_2) \ge 0$. Thus, the assumption that there
exists $w_1 \in (w_0, \infty)$ such that $v'(w_1) \ge 0$ is incorrect. The result $v(w) < 0$ for $w \in (w_0, \infty)$ then follows
from $v(w_0) < 0$ and $v'(w) < 0$ for $w \in (w_0, \infty)$.
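
A quick numerical illustration of Lemma 20 is sketched below: it integrates (211) from $v(0) = 0$ with a classical fourth-order Runge-Kutta scheme and checks the signs of $v$ and $v'$ at the interior grid points. The parameter values (`M` standing for $m'H^{-1}m$, `SIGMA2`, `MU`, `H_STAR`), the horizon, and the step count are hypothetical choices made only for this example, and the check covers a short interval rather than $(0, \infty)$.

```python
# Illustrative sign check for the solution of (211); all numbers are hypothetical.
M = 1.0        # stands for m' H^{-1} m (assumed)
SIGMA2 = 1.0   # sigma^2 (assumed)
MU = 0.5       # mu (assumed)
H_STAR = 2.0   # h^* (assumed)


def rhs_tilde(w, v):
    """Right-hand side of (211)."""
    return (-M / (2 * SIGMA2) * v * v
            + 2 * MU / SIGMA2 * v
            - 2 * H_STAR / SIGMA2 * w)


def rk4_path(rhs, v0, w_end, steps):
    """Return the list [(w_0, v_0), ..., (w_n, v_n)] computed with classical RK4."""
    h = w_end / steps
    w, v = 0.0, v0
    path = [(w, v)]
    for i in range(steps):
        k1 = rhs(w, v)
        k2 = rhs(w + h / 2, v + h * k1 / 2)
        k3 = rhs(w + h / 2, v + h * k2 / 2)
        k4 = rhs(w + h, v + h * k3)
        v += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        w = (i + 1) * h
        path.append((w, v))
    return path


path = rk4_path(rhs_tilde, 0.0, w_end=0.5, steps=500)
interior = path[1:]  # skip w = 0, where v(0) = v'(0) = 0

negative_values = all(v < 0 for _, v in interior)
negative_slopes = all(rhs_tilde(w, v) < 0 for w, v in interior)  # v' equals the right-hand side
print(negative_values, negative_slopes)
```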

P.2 Auxiliary Results for Appendix J

In the remainder of this section, we assume Θ ⊂ R− is a given finite set and ψ is characterized by (127).

Lemma 21. Let v ∈ C 1 ([0, ∞)) be the solution to
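\[
v'(w) = -\frac{2\mu}{\sigma^2}\Big[v(w) + \kappa\Big] + \frac{2}{\sigma^2}\Big[\phi(v(w)) - \phi(-\kappa)\Big] + \frac{2b^*}{\sigma^2}\,w, \quad w \in (0, \infty), \tag{213}
\]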
subject to the boundary condition v(0) = −κ. Then, v(w) ≥ −κ for w ∈ [0, ∞).

Proof. The proof of this lemma is almost identical to the proof of Lemma 18 with the small exception that
we also use the fact that φ is non-increasing. Therefore, we omit it.

Lemma 22. Let v ∈ C 1 ([0, ∞)) be the solution to


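\[
v'(w) = -\frac{2\mu}{\sigma^2}\Big[v(w) + \kappa\Big] + \frac{2}{\sigma^2}\Big[\phi(v(w)) - \phi(-\kappa)\Big] + \frac{2b^*}{\sigma^2}\,w, \quad w \in (0, \infty), \tag{214}
\]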
subject to the boundary condition v(0) = −κ. Then, v 0 (w) > 0 for w ∈ (0, ∞).

Proof. First, we show that there exists w0 ∈ (0, ∞) such that v 0 (w) > 0 for w ∈ (0, w0 ]. Then, we show
that v 0 (w) > 0 for w ∈ (w0 , ∞). Since φ is continuously differentiable on R\{τl : l = 1, . . . , N − 1}, there
exists τ̄ > −κ such that φ is continuously differentiable on [−κ, τ̄ ]. Then, since v is continuous, v(0) = −κ,
and v(w) ≥ −κ for w ∈ (0, ∞) (by Lemma 21), there exists w0 ∈ (0, ∞) such that φ(v(·)) is continuously
differentiable on [0, w0 ). Then, it follows from (214) that v 00 is well-defined on [0, w0 ) and we can differentiate
both sides of (214) to obtain
\[
v''(w) = -\frac{2\mu}{\sigma^2}\,v'(w) + \frac{2}{\sigma^2}\,\phi'(v(w))\,v'(w) + \frac{2b^*}{\sigma^2}, \quad w \in (0, w_0). \tag{215}
\]
It follows from Equation (214), boundary condition v(0) = −κ, and the continuous differentiability of v that
v 0 (0) = 0. Since v and v 0 are continuous on [0, w0 ) and v 0 (0) = 0, the first two terms on the right-hand
side of (215) are small, i.e., close to zero, in a neighborhood of zero. Therefore, since the third term on the
right-hand side of (215) is positive, for sufficiently small w0 , we have v 00 (w) > 0 for w ∈ (0, w0 ]. It then
follows from v 0 (0) = 0 that v 0 (w) > 0 for w ∈ (0, w0 ).

Next, we show that v 0 (w) > 0 for w ∈ (w0 , ∞). We argue by contradiction. Since v 0 is continuous and
positive on (0, w0 ], if v 0 is not positive on [w0 , ∞), there exists w1 ∈ [w0 , ∞) such that v 0 (w1 ) = 0. Without
loss of generality, assume w1 is the smallest such number. In other words, v 0 > 0 on (0, w1 ) and v 0 (w1 ) = 0.
Substituting v 0 (w) > 0 for w ∈ (0, w1 ) into (214) yields

\[
\mu\big(v(w) + \kappa\big) - \big(\phi(v(w)) - \phi(-\kappa)\big) < b^* w, \quad w \in (0, w_1). \tag{216}
\]

Similarly, substituting $v'(w_1) = 0$ into (214) yields

\[
\mu\big(v(w_1) + \kappa\big) - \big(\phi(v(w_1)) - \phi(-\kappa)\big) = b^* w_1. \tag{217}
\]

It follows from (216)-(217) that

\[
\mu\big(v(w_1) - v(w)\big) - \big(\phi(v(w_1)) - \phi(v(w))\big) > b^*(w_1 - w), \quad w \in (0, w_1).
\]

Then, since φ(v(w1 )) − (φ(v(w))) ≤ θN (v(w1 ) − v(w)) (see Figure 7), we have

\[
(\mu - \theta_N)\big(v(w_1) - v(w)\big) > b^*(w_1 - w), \quad w \in (0, w_1). \tag{218}
\]

Since $\mu$ is non-negative and $\theta_N$ is non-positive, we have $\mu - \theta_N \ge 0$. When $\mu - \theta_N = 0$, (218) simplifies to
$b^*(w_1 - w) < 0$ for $w \in (0, w_1)$, which contradicts $b^* > 0$. When $\mu - \theta_N > 0$, we rearrange the terms in (218)
to obtain
\[
\frac{v(w_1) - v(w)}{w_1 - w} > \frac{b^*}{\mu - \theta_N} > 0, \quad w \in (0, w_1).
\]
Taking the limit as $w \to w_1$ yields $v'(w_1) \ge b^*/(\mu - \theta_N) > 0$, which contradicts the assumption that
$v'(w_1) = 0$. Therefore, the assumption that $v'(w_1) \le 0$ for some $w_1 \in [w_0, \infty)$ is incorrect.

Corollary 7. Given l < 0, let v ∈ C 1 ([l, 0]) be the solution to



\[
v'(w) = -\frac{2\mu}{\sigma^2}\Big[v(w) + \kappa\Big] + \frac{2}{\sigma^2}\Big[\phi(v(w)) - \phi(-\kappa)\Big] + \frac{2\big(h(l) - h(w)\big)}{\sigma^2}, \quad w \in (l, 0),
\]
subject to the boundary condition v(l) = −κ. Then, v 0 (w) > 0 for w ∈ (l, 0].

Lemma 23. Let v ∈ C 1 ([0, ∞)) be the solution to


\[
v'(w) = \frac{2\mu}{\sigma^2}\,v(w) - \frac{2}{\sigma^2}\Big[\phi(v(w)) - \phi(0)\Big] - \frac{2h^*}{\sigma^2}\,w, \quad w \in (0, \infty), \tag{219}
\]
subject to the boundary condition v(0) = 0. Then, v, v 0 < 0 on (0, ∞).

Proof. Since $\mu$ is non-negative and $\theta_N$ is non-positive, we have $\mu - \theta_N \ge 0$. When $\mu = \theta_N = 0$, the first
two terms on the right-hand side of (219) are non-positive, which gives $v'(w) < 0$ for $w \in (0, \infty)$. Then, it
follows from $v(0) = 0$ that $v(w) < 0$ for $w \in (0, \infty)$. In the remainder of the proof, we assume $\mu - \theta_N > 0$.
First, we show that there exists $w_0 > 0$ such that $v' < 0$ on $(0, w_0]$. Then, we use contradiction to argue
that the same result holds for $w \in (w_0, \infty)$. It then follows from boundary condition $v(0) = 0$ that $v(w) < 0$
for $w \in (0, \infty)$. Substituting $\phi(v(w)) - \phi(0) \ge \theta_N v(w)$ (see Figure 7) into (219) and rearranging the terms
gives
\[
v'(w) \le \frac{2}{\sigma^2}\,(\mu - \theta_N)\,v(w) - \frac{2h^*}{\sigma^2}\,w, \quad w \in (0, \infty). \tag{220}
\]
Similarly, by substituting $v(0) = 0$ into (219), we obtain $v'(0) = 0$. It then follows from $v'(0) = 0$ and the
continuous differentiability of $v$ that there exists $w_0 \in (0, \infty)$ such that $|v'(w)| \le h^*/\big(2(\mu - \theta_N)\big)$ for $w \in (0, w_0]$.
Then,
\[
v(w) = \int_0^w v'(y)\,dy \le \int_0^w \frac{h^*}{2(\mu - \theta_N)}\,dy = \frac{h^*}{2(\mu - \theta_N)}\,w, \quad w \in (0, w_0]. \tag{221}
\]

Next, by substituting (221) into (220), we obtain
\[
v'(w) \le \frac{h^*}{\sigma^2}\,w - \frac{2h^*}{\sigma^2}\,w = -\frac{h^*}{\sigma^2}\,w < 0, \quad w \in (0, w_0]. \tag{222}
\]
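
The first term on the right of (222) comes from multiplying the bound (221) by the coefficient of $v(w)$ in (220), which is positive because $\mu - \theta_N > 0$ is assumed in this part of the proof. Written out, with no additional assumptions:

```latex
\[
\frac{2}{\sigma^2}\,(\mu - \theta_N)\,v(w)
\;\le\; \frac{2}{\sigma^2}\,(\mu - \theta_N)\cdot\frac{h^*}{2(\mu - \theta_N)}\,w
\;=\; \frac{h^*}{\sigma^2}\,w,
\qquad w \in (0, w_0].
\]
```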
It remains to show that $v'(w) < 0$ for $w \in (w_0, \infty)$. We argue by contradiction. Suppose there exists
$w_1 \in (w_0, \infty)$ such that $v'(w_1) \ge 0$. Without loss of generality, we assume $w_1$ is the smallest such number.
In other words, $v' < 0$ on $(0, w_1)$ and $v'(w_1) = 0$. Since $v' < 0$ on $(0, w_1)$ and $v(0) = 0$, we have $v(w_1) < 0$.
Also, by substituting $v'(w_1) = 0$ into (220), we obtain
\[
\frac{2}{\sigma^2}\,(\mu - \theta_N)\,v(w_1) \ge \frac{2h^*}{\sigma^2}\,w_1 > 0. \tag{223}
\]
Rearranging the terms in (223) gives $v(w_1) > 0$, which contradicts $v(w_1) < 0$. Therefore, the assumption
that there exists $w_1$ with $v'(w_1) \ge 0$ is incorrect.
