Stochastic Nash Equilibrium

Competitive Revenue Management of Perishable
Assets with Multiple Predetermined Options
Ming Hu
Department of Industrial Engineering and Operations Research,
Columbia University, New York, NY 10027, USA
mh2252@columbia.edu
Abstract
We study continuous-time revenue management models under price or sales competition with
multiple capacity providers competing to sell their own fixed initial inventories of perishable items
over a finite sales horizon. We assume the available menu of decision variables is given and each
player can observe others’ remaining capacities. For the sales competition of substitutable products
or price competition of complementary products, we obtain a threshold-type optimal control policy
for each player to switch sales target or price in closed form that can be sustained as an exact Nash
equilibrium for the stochastic game. Such a policy can be constructed with reasonable computational
effort.
Furthermore, we provide counterexamples for price competition of substitutable products and
sales competition of complementary products to show that Nash equilibrium in these types of
competition generally fails to have a monotone threshold structure. As a variant of price competition
of substitutable products, we consider dynamic competition under a Multinomial Logit customer choice
model in which each player sells finitely many differentiated products by choosing which offer sets to
make available; its Nash equilibrium policy is not necessarily nested by fare, in contrast to the
monopolistic version.
Key words: Revenue Management; Nash Equilibrium; Monotonicity; Optimal Switching Time
History:
1 Introduction
Oligopoly is arguably the most interesting market condition for studying revenue management
(RM) since it is the prevailing competitive situation in many RM industries. The two classic
models of oligopoly are the Cournot model (competition in quantities; Cournot (1838)) and the
Bertrand model (competition in prices; Bertrand (1883)). The empirical study of Brander and
Zhang (1990) shows that in the airline industry the Cournot model seems much more consistent
with the data than the Bertrand model, where the volume or output is aggregated over a quarter.
Wei and Hansen (2007) model airlines’ quantity decisions on both aircraft size and service frequency
under competition, based on an empirically derived cost function and market share model. Thus it is
empirically suggested that airlines’ long-term competitive business strategy is better modeled by a
Cournot model.
For the short-term decision at the operational level that RM is particularly focused on, it is
difficult to say that firms practicing RM compete purely on price or quantity. On the one hand, as
capacity is fixed in the short term, price becomes the relevant decision variable, so price competition (see footnote 1)
might seem more appropriate. On the other hand, the main decision variables in quantity-based
RM are capacity allocations, which are quantity-based variables. This would suggest that quantity
competition models are more relevant for quantity-based RM.
The price of a product provided by one firm can affect sales of another product offered by its
competitor in two ways, substitutably or complementarily. Most RM literature on competition
considers substitutable effects between differentiated products offered by competing firms, where
a price increase (decrease) in a product positively (negatively) influences sales of other related
products. For example, a price increase of a flight offered by one airline can increase the sales of a
competing flight. Most RM models often ignore complementary effects, where a price increase (de-
crease) in a product negatively (positively) influences sales of other related products. For example,
lowering the price of a particular flight may increase the demand for car rentals at the destination.
In this paper, we consider an oligopolistic market where multiple capacity providers com-
pete to sell their own fixed initial inventories of perishable products over a finite sales horizon.
1
We refer to price (quantity) competition rather than Bertrand (Cournot) competition in the RM setting
because our problem differs slightly from Bertrand (Cournot) competition in the economics literature:
our decisions are made dynamically in continuous time over a finite horizon, rather than as a single
static aggregated decision.
There are four types of competition as a combination of quantity/price competition with substi-
tutable/complementary products. First, quantity competition with substitutable products may be
more appropriate for the competition of hotels, cruise ships, and rental cars and other common
quantity-based RM industries with substitutable products. Firms fix the prices for the duration of
the sale of their inventories and change only the allocations to the products. This is how quantity-based
revenue management is practiced in the airline industry. Such static, fixed, prices are preferred
when prices have to be advertised, or resources are sold based on reservations, or it is otherwise
costly to change prices. Second, the current airline market with fierce competition from low cost
carriers demands a price competition model with substitutable products in a market environment
with price transparency and low cost of changing price. In addition, price or quantity competition
models with complementary products can best capture the independent efforts of the airline, car rental,
and hotel industries to secure their own business by taking the most favorable price-based or
quantity-based actions according to other industries’ decisions on related products with complementary
effects.
Singh and Vives (1984) point out that quantity (price) competition with substitutable products is
the dual of price (quantity) competition with complementary products. We extend the traditional
one-shot economic problem to an RM problem with decision making in continuous time. We
assume the decision options for each player are finite and pre-determined. For the quantity-based
RM competition problem with substitutable products and the price-based RM competition problem
with complementary products, we provide an algorithm to construct the exact Nash equilibrium
for the stochastic game.
In order to derive the feedback type of Nash equilibrium, we need to assume that the joint
inventory levels are observable by all players at any time. This requirement used to be unrealistic
but is arguably implementable now. For example, in 2004, to help customers avoid getting stuck
in a seat they do not want, Orbitz started offering a seat map feature that lets travelers compare
flight options by seat availability when reviewing fares. Now more online travel agencies (e.g.,
Expedia, Travelocity) and major airlines (e.g., Northwest, American, Delta, United, Continental,
US Airways) have joined to offer such preview seat features from their websites. It is indeed possible
to use scraper programs to keep track of competitors’ inventory levels as well as prices in real time.
The assumption of a fixed number of pricing options is not restrictive; it actually reflects current
practice. As Gallego and van Ryzin (1994) note, discrete price points become possible when there
exists an explicit or implicit consensus at the industry level, or arise if a firm wants to achieve
certain market segmentation. For example, airlines typically publish promotional and full fares
in advance and retailers usually consider discounts of 25%-50% off list prices. Moreover, current
practice in the airline industry rounds ticket prices to currency denominations within some
reasonable range, and retail practice favors prices whose last digit(s) are “9” (or “99”).
Gallego and Hu (2006) provide a choice-based, multi-player, game-theoretic pricing formulation
as a stochastic control problem to model an oligopolistic market with multiple capacity providers
competing to sell their own fixed perishable inventories over a finite sales horizon. The open-loop
(essentially closed-loop) Nash equilibrium in the corresponding differential game can be proved to
be a Nash equilibrium for the original stochastic game asymptotically, in the sense that either
the initial capacity of each player or the sales horizon is sufficiently large. This fluid heuristic sug-
gested by the open-loop (closed-loop) equilibrium provides a good approximation to the stochastic
game when a large sales volume and a long selling horizon are able to smooth out the stochastic
fluctuations in sales over the horizon. Similarly for a quantity-based stochastic game, the fluid
heuristic provides good performance asymptotically. However, it is less likely to be the case for
small initial inventories and a short selling horizon. That paper reports that, for a particular
MultiNomial Logit (MNL) model, the open-loop heuristic performs more than 10% below the Nash
equilibrium when any player holds fewer than 15 units of capacity, and is nearly optimal for more
than 37 units. The heuristic is also found not to perform very well for a short selling horizon.
Given the competitive nature and large size of the market (see footnote 2), a 5% gap is significant
(see footnote 3) and is truly what revenue management systems aim for.
Several papers model the price competition with gross substitution effects. Talluri (2003) (also
see Talluri and van Ryzin (2004, §8.4.3.2)) studies a duopoly quantity-based RM competition with
substitution, where each player makes decisions in discrete time and can effectively change the
prices by deciding on what subset they make available simultaneously. We believe this model is
2
Nowadays the major U.S. airlines have annual domestic revenues of about $2-$12 billion. (Source: Forbes)
3
By most estimates, the revenue gains from the use of quantity-based RM systems are 4-5%, roughly comparable
to many airlines’ total profitability in a good year (see Talluri and van Ryzin (2004, §1.2.2)). It is estimated that
network RM techniques add 0.5% to 1% of additional revenue on top of the revenue gains from single-leg capacity
allocation and overbooking controls (see Phillips (2005, §8.2)).
essentially price competition in the sense of Bertrand competition though the decision variables
are making available offer sets. Lin and Sibdari (2006) develop an oligopolistic model to describe
dynamic price competition in discrete time between firms that sell substitutable products. Both
papers show that in general it is hard even to prove the existence of a Nash equilibrium for the
stochastic game, not to mention its uniqueness and structural properties. In this
paper, we also provide examples to show that in a continuous-time price competition model for
gross substitutes, we cannot generally expect the price monotonicity seen in the
monopolistic RM problem.
We provide a sufficient condition for the existence of Nash equilibrium under either price or
quantity competition for both gross substitutes and complements. Based on the characterization,
we design a constructive approach to computing the best-response strategies (the exact Nash
equilibrium; see footnote 4) when the available price or quantity options for each player are
predetermined, discrete and finite. The algorithm requires submodularity to work. Thus for quantity
competition with gross substitutes or for price competition with gross complements, we can obtain a
Nash equilibrium in closed form by providing a recursive expression. For the two types of competition
with the submodular property, the construction not only guarantees performance in practice for small
sales volumes or a short selling horizon, but also is theoretically elegant. Examples of Nash
equilibria solved for multi-player stochastic games are rarely seen in the operations research (OR)
literature due to their stochastic nature. In a seminal work, Vieille (2000) establishes the existence of
equilibrium payoffs in general two-person nonzero-sum undiscounted stochastic games with finite
action and state sets. It is appealing to have concrete examples from classic OR setups to
complement the theoretical literature. The monopolistic dynamic pricing problem initiated by Gallego
and van Ryzin (1994) can potentially be extended to the game context to provide such an example,
because the no-replenishment nature of such RM problems allows the inventory level only to go
down monotonically over the horizon, rendering simple finite-step state changes. This paper follows
the technique introduced by Feng and Xiao (2000a) to construct a Nash equilibrium for the game
version of the RM problem with finite price options. For the other two types of competition with
the supermodular property, i.e., price competition with gross substitutes and quantity competition
with gross complements, we provide examples showing that the construction sometimes also works.
4
We write “exact” here to distinguish the closed-loop or feedback strategies from the open-loop Nash equilibrium.
Under the setup of Gallego and van Ryzin (1994), there is a series of papers studying the
optimal strategy for the monopoly with a menu of available prices. If price is only allowed to
change monotonically, i.e., either a markup or a markdown policy is implemented, the optional sampling
theorem from martingale theory can be used to characterize the optimal solutions. Feng and
Gallego (1995) provide an optimal threshold-type switching time policy when the monopoly is only
allowed at most one price change, from a given initial price to another given price. Feng and Gallego
(2000) further characterize the optimal timing of price changes within a given menu of allowable
price paths that is time-dependent. Feng and Xiao (2000b) identify an optimal pricing policy in a
recursive form under the assumption that the allowable price set is discrete and finite and the price
changes are irreversible. If reversible price changes are allowed, Feng and Xiao (2000a) show that
optimal policies can be constructed in a closed form by directly deriving sufficient optimal conditions
from the associated Bellman equation. Feng and Xiao (2001) further apply the idea to an airline seat
inventory control problem with multiple origins, one hub and one destination. Whether or not reversible
price changes are allowed, the optimal solutions always have the following properties: (1)
An exact solution can be derived in a recursive form; (2) At each inventory level there exists a
sequence of nested time thresholds that guide price changes; (3) The threshold time points shift
monotonically as the inventory level changes. It turns out that, under unilateral concavity and submodularity
of the revenue rates, these properties are retained in the game context for each player. We also
show that a strong version of unilateral concavity and submodularity can guarantee the uniqueness
of the Nash equilibrium.
The algorithm to compute the closed-loop Nash equilibrium under submodularity requires a reasonable
computational effort because the continuous-time analytic formulae must be discretized. However,
cumulative discretization errors remain negligible for a short planning horizon and small
inventory levels. For a long planning horizon or large inventory levels, a fixed pricing policy has
been shown to provide very good performance (see Gallego and Hu (2006)).
The rest of the paper is organized in the following order: §2 discusses quantity competition
with substitutable products. Concretely, §2.1 presents a mathematical formulation of the stochas-
tic game. §2.2 derives the sufficient and necessary condition for a Nash equilibrium in such a
stochastic game. §2.3 constructs the Nash equilibria recursively. §3 studies price competition with
complementary products as a dual problem. Numerical results and counterexamples are placed in
§4, followed by concluding remarks and future research questions.
2 Quantity Competition with Substitutable Products
2.1 The Stochastic Game
We consider an oligopolistic market of $m$ differentiated substitutable products with a set
$S = \{1, 2, \ldots, m\}$ of competitors. At time zero, firm $i \in S$ has inventory
$n_i \in \mathbb{Z}_+ = \mathbb{N} \cup \{0\}$ units of the
substitutable perishable asset and the same finite time $t > 0$ to sell them. We assume the salvage
value of the asset at time $t$, the end of the selling horizon, is zero and that all other costs are sunk.
We assume a discrete sales target menu for each firm that contains finitely many options. For each
firm $i \in S$, there are $K(i) \in \mathbb{N}$ sales target options, and we denote the set of them by
$Q_i = \{\lambda_{i,1}, \lambda_{i,2}, \ldots, \lambda_{i,K(i)}\}$, with
$0 \le \lambda_{i,K(i)} < \cdots < \lambda_{i,2} < \lambda_{i,1} < \infty$. We denote the joint strategy
space by $Q := \prod_{i=1}^{m} Q_i$.
The market is assumed to be an imperfect market where demand is a function of the prices
across the industry, and in Cournot competition each firm tries to set its price to meet the sales
target it has chosen. At any time $s$, the current price vector $p(s) = (p_1(s), p_2(s), \ldots, p_m(s))$
is determined by the vector of non-homogeneous Poisson demand intensities
$\lambda(s) = (\lambda_1(s), \lambda_2(s), \ldots, \lambda_m(s))$ with $\lambda_i(s) \in Q_i$ through a mapping
$p(\lambda) : Q \to \mathbb{R}^m_+$, $\lambda \mapsto p(\lambda) = (p_1(\lambda), p_2(\lambda), \ldots, p_m(\lambda))$. Let
$\lambda_{-i} = (\lambda_1, \ldots, \lambda_{i-1}, \lambda_{i+1}, \ldots, \lambda_m)$, $\forall i \in S$, denote the demand
intensity vector of the other $m - 1$ firms who compete with firm $i$. We denote the range of the
mapping $p(\lambda)$ by $P$. For any given $\lambda_{-i}$, the revenue rate function for player $i$,
$\forall i \in S$, is $r_i(\lambda_i; \lambda_{-i}) := \lambda_i p_i(\lambda)$. We assume the mapping
$p(\lambda)$ is known and that it satisfies the following assumptions:
Assumption 1 (One-to-One Mapping). The mapping $p(\lambda) : Q \to \mathbb{R}^m_+$ has a one-to-one inverse
mapping $\lambda(p) : P \to \mathbb{R}^m_+$.
Assumption 2 (Concavity). $r_i(\lambda_i; \lambda_{-i})$ is increasing and concave in $\lambda_i$ for any $\lambda_{-i}$.
Assumption 3 (Decreasing Differences). $r_i(\lambda_{i,l}; \lambda_{-i}) - r_i(\lambda_{i,r}; \lambda_{-i})$ is
decreasing in $\lambda_{-i}$ for any $\lambda_{i,l} > \lambda_{i,r}$.
Assumptions 1-3 are satisfied by the inverse of most commonly used demand functions,
e.g., the linear demand function and the MultiNomial Logit (MNL) demand function.
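Example 1 (MNL Demand Function). The demand mapping is
$$\lambda_i(p) = M \frac{a_i \exp(-b_i p_i)}{a_0 + \sum_{j=1}^{m} a_j \exp(-b_j p_j)}, \quad \forall i \in S,$$
where $a_0 \ge 0$ is the no-purchase option value and $a_i, b_i \ge 0$, $\forall i \in S$. Its corresponding price
mapping is
$$p_i(\lambda) = \frac{1}{b_i} \ln \frac{a_i \left[ M - \sum_{j=1}^{m} \lambda_j \right]}{a_0 \lambda_i}, \quad \forall i \in S.$$
It is not hard to check that Assumption 2 is satisfied for $r_i(\lambda_i; \lambda_{-i})$. Additionally, since
$$\frac{\partial r_i(\lambda)}{\partial \lambda_j} = -\frac{\lambda_i}{b_i} \cdot \frac{1}{M - \sum_{k=1}^{m} \lambda_k}, \quad \forall j \ne i,$$
is decreasing in $\lambda_i$, Assumption 3 holds.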
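The two mappings of Example 1 can be checked numerically against each other. The following is a minimal sketch, assuming hypothetical parameter values (`a0`, `a`, `b`, `M` and the price vector are illustrative only and do not come from the paper); it maps prices to demand intensities and back again.

```python
import math

def mnl_demand(p, a0, a, b, M):
    """Demand intensities lambda_i(p) under the MNL model of Example 1."""
    weights = [ai * math.exp(-bi * pi) for ai, bi, pi in zip(a, b, p)]
    denom = a0 + sum(weights)
    return [M * w / denom for w in weights]

def mnl_price(lam, a0, a, b, M):
    """Price mapping p_i(lambda) = (1/b_i) ln(a_i [M - sum_j lambda_j] / (a0 lambda_i))."""
    slack = M - sum(lam)  # intensity absorbed by the no-purchase option
    return [math.log(ai * slack / (a0 * li)) / bi for ai, bi, li in zip(a, b, lam)]

# Hypothetical parameters, chosen only for illustration (a0 > 0 is required here).
a0, a, b, M = 1.0, [2.0, 3.0], [0.5, 0.8], 10.0
prices = [4.0, 3.0]

lam = mnl_demand(prices, a0, a, b, M)
recovered = mnl_price(lam, a0, a, b, M)
print(lam, recovered)
```

The round trip recovers the original prices up to floating-point error, which is the one-to-one property required by Assumption 1 for this demand model when $a_0 > 0$.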
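Example 2 (Linear Demand Function). The demand mapping is
$$\lambda_i(p) = a_i - b_i p_i + \sum_{j \ne i} c_{ij} p_j, \quad \forall i \in S,$$
where $a_i, b_i > 0$, $\forall i \in S$, and $c_{ij} \ge 0$, $j \ne i$, $\forall i \in S$. We assume that the matrix $B$
with $B_{ii} = -b_i$ and $B_{ij} = c_{ij}$ for $j \ne i$ is nonsingular; then the price mapping is
$$p_i(\lambda) = \hat{a}_i - \hat{b}_i \lambda_i - \sum_{j \ne i} \hat{c}_{ij} \lambda_j, \quad \forall i \in S.$$
It is easy to see that, for the linear demand function,
$\partial^2 r_i(\lambda_i; \lambda_{-i}) / \partial \lambda_i^2 = -2\hat{b}_i < 0$ and
$\partial r_i(\lambda_i; \lambda_{-i}) / \partial \lambda_j = -\hat{c}_{ij} \lambda_i$, $j \ne i$, is decreasing in $\lambda_i$.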
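On a finite menu, Assumptions 2 and 3 reduce to inequalities between marginal revenue rates, which can be checked directly. The sketch below does this for a duopoly with the linear price mapping of Example 2; the coefficients `a_hat`, `b_hat`, `c_hat` and the menu are hypothetical values chosen for illustration.

```python
def revenue_rate(i, lam, a_hat, b_hat, c_hat):
    """r_i(lambda) = lambda_i * p_i(lambda) under the linear price mapping."""
    cross = sum(c_hat[i][j] * lam[j] for j in range(len(lam)) if j != i)
    return lam[i] * (a_hat[i] - b_hat[i] * lam[i] - cross)

def marginal_rates(menu, lam_other, a_hat, b_hat, c_hat):
    """Marginal revenue rates of firm 0 between adjacent menu entries (duopoly)."""
    r = [revenue_rate(0, [x, lam_other], a_hat, b_hat, c_hat) for x in menu]
    return [(r[l] - r[l + 1]) / (menu[l] - menu[l + 1]) for l in range(len(menu) - 1)]

# Hypothetical coefficients and menu (sales targets listed in decreasing order).
a_hat, b_hat = [10.0, 10.0], [1.0, 1.0]
c_hat = [[0.0, 0.5], [0.5, 0.0]]
menu = [3.0, 2.0, 1.0]

rates_low = marginal_rates(menu, 1.0, a_hat, b_hat, c_hat)   # rival targets 1.0
rates_high = marginal_rates(menu, 3.0, a_hat, b_hat, c_hat)  # rival targets 3.0

# Concavity: the marginal rate increases with the menu index l.
concave = all(x <= y for x, y in zip(rates_low, rates_low[1:]))
# Decreasing differences: a higher rival target lowers every marginal rate.
decreasing_diff = all(h <= l for h, l in zip(rates_high, rates_low))
print(rates_low, rates_high, concave, decreasing_diff)
```

Both checks pass for these values; with other coefficients the menu may need to be pruned so that the revenue rate stays increasing in the firm's own sales target.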
The firms compete for the market by adjusting their own targeted sales levels. At any time $s \in [0, t]$,
firm $i$, $\forall i \in S$, applies its own non-anticipating sales level $\lambda_i(s) \in Q_i$. Let
$$I_{i,j}(s) = \begin{cases} 1, & \text{if } \lambda_{i,j} \in Q_i \text{ is effective at } s, \\ 0, & \text{otherwise.} \end{cases}$$
A non-anticipating policy $u_i$ for firm $i$ is defined to be
$$u_i = \Big\{ (I_{i,1}(s), I_{i,2}(s), \ldots, I_{i,K(i)}(s)) : \sum_{j=1}^{K(i)} I_{i,j}(s) = 1, \ 0 \le s \le t \Big\},$$
where the imposed constraint is to ensure that one and only one sales target for each firm is active
at any given time. Let Ni,j (s), ∀ i ∈ S, j = 1, 2, . . . , K(i) represent the accumulated number of
items sold up to time s at sales target λi,j ∈ Qi . A demand for any firm i ∈ S is realized at sales
target λi,j at time s if dNi,j (s) = 1. We denote by U the joint Markovian allowable sales policy
space: any joint allowable sales policy $u = (u_1, u_2, \ldots, u_m) \in U$ must satisfy that, for all $i \in S$,
$$\sum_{j=1}^{K(i)} \int_0^t I_{i,j}(s) \, dN_{i,j}(s) \le n_i, \quad \text{a.s.},$$
and the sales level $\lambda_i(s)$ targeted by firm $i$ is a function of the elapsed time $s$, its own inventory
level as well as the capacity levels of all other firms at time $s$, i.e.,
$$\lambda_i(s) = \lambda_i(s, n_1(s), \ldots, n_m(s)),$$
where $n_i(s) := n_i - \sum_{j=1}^{K(i)} N_{i,j}(s)$ is the remaining inventory of firm $i \in S$ at time
$s \in [0, t]$. In terms of game theory, we analyze strategies in feedback form, or in other words,
closed-loop strategies.
Given a sales target policy $u \in U$, a joint initial stock vector $n = (n_1, n_2, \ldots, n_m) \in \mathbb{Z}^m_+$
(see footnote 5) and a finite sales horizon $t > 0$, we denote the expected profit for any firm $i \in S$ by
$$J_i(t, n, u) := E\Big[ \sum_{j=1}^{K(i)} \int_0^t I_{i,j}(s) \, p_i(\lambda(s)) \, 1_{\{n_i(s-) > 0\}} \, dN_{i,j}(s) \Big].$$
The goal of each firm i ∈ S is to maximize its total expected profit over [0, t] in the competitive
market. We assume all firms have perfect information of inventory levels about each other. More
specifically, all firms completely observe the joint state vector (ni (s), i ∈ S) at any time s ∈ [0, t],
and act upon that information. A joint policy u∗ = (u∗1 , u∗2 , . . . , u∗m ) ∈ U constitutes a Nash equi-
librium if, whenever any firm modifies its policy away from the equilibrium, its own payoff will not
5
We omit the vector sign above all vectors for simplicity of notation; the meaning should be clear from the
context.
increase. More precisely, $u^* \in U$ is called a Nash equilibrium if
$J_i(t, n, u_i, u^*_{-i}) \le J_i(t, n, u^*_i, u^*_{-i})$, $\forall i \in S$, for any
$(u_i, u^*_{-i}) = (u^*_1, \ldots, u^*_{i-1}, u_i, u^*_{i+1}, \ldots, u^*_m) \in U$. In other words, we require
that, for any $i \in S$, the policy $u^*_i$ provides the optimal solution to the dynamic pricing problem for
firm $i$ while all firms $j \ne i$ use policy $u^*_j$. Generally it is extremely difficult to solve such a
stochastic game for Nash equilibria, as current research is still at the stage of theoretically proving
the existence of approximate equilibria. However, for this particular stochastic game, we can solve for
the Nash equilibrium in closed form and compute it with a reasonable effort.
2.2 Sufficient Equilibrium Condition
2.2.1 The Monopolistic Problem
Before we present a sufficient condition to characterize the Nash equilibrium, we present its reduced
version in one-dimensional state space (m = 1), where there is no competition, to gain some insights.
For the single capacity holder in a monopolistic environment, we drop the subscript i that specifies
a player for the time being and denote the set of potential sales target by Q = {λ1 , λ2 , . . . , λK }
with λK < · · · < λ2 < λ1 . By Assumption 2, all the sales level options are efficient in the sense
that Q forms a maximum concave envelope (see footnote 6). Here the price function is a one-variable mapping of
the demand intensity decision, i.e., pi = p(λi ). Since price is strictly decreasing in sales, we have
p1 < p2 < · · · < pK . The notation ri := pi λi represents the expected revenue rate at λi . We assume
ri > rj if λi > λj (or pi < pj ). Given a sales target policy u ∈ U, an initial stock n ∈ Z+ and a sales
horizon t > 0, we denote the expected revenue by J(t, n, u). The firm’s problem is to find a sales
policy restricting itself to the discrete set of sales levels to maximize the total expected revenue
generated over [0, t], denoted by J ∗ (t, n). The following Lemma is adapted from Feng and Xiao
(2000a, Lemma 2) and it is crucial to our proof of the optimality condition that characterizes the
Nash equilibrium.
6
See Feng and Xiao (2000a, Lemma 3) for how to determine the envelope if given an arbitrary set of predetermined
sales targets (or prices).
Lemma 1. Let $V(t, n)$ be a differentiable function of $t$ for all given $n \ge 0$. If $V(t, n)$ satisfies
$$-\frac{\partial V(t, n)}{\partial t} + \lambda_i [V(t, n - 1) - V(t, n)] + r_i = 0, \qquad (1)$$
where $i$ is the smallest integer $l = 1, 2, \ldots, K - 1$ such that
$$V(t, n) - V(t, n - 1) \le \frac{r_l - r_{l+1}}{\lambda_l - \lambda_{l+1}}, \qquad (2)$$
then
(i) an optimal sales target at $(t, n)$ is $\lambda_i$ and $V(t, n)$ is the value function of the monopolistic
model, i.e., $V(t, n) = J^*(t, n)$.
(ii) if $P$ is a strictly maximum concave envelope, $\lambda_i$ is the unique optimal sales target at $(t, n)$.
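The selection rule in condition (2) amounts to a scan over adjacent menu entries. A minimal sketch follows, assuming a hypothetical three-point menu that lies on a concave envelope; the function name `select_target` and the numbers are illustrative. When no inequality holds, the sketch returns the lowest sales target, in line with the convention the paper uses for times beyond the last threshold.

```python
def select_target(delta_v, lams, revs):
    """Return the 0-based index of the sales target picked by condition (2).

    delta_v : marginal value of one unit, V(t, n) - V(t, n - 1)
    lams    : sales targets in strictly decreasing order
    revs    : matching revenue rates r_l = p_l * lambda_l (concave envelope assumed)
    """
    for l in range(len(lams) - 1):
        ratio = (revs[l] - revs[l + 1]) / (lams[l] - lams[l + 1])
        if delta_v <= ratio:
            return l
    # No inequality holds: fall back to the lowest sales target.
    return len(lams) - 1

# Hypothetical menu: prices 2.0, 2.8, 4.0 at sales targets 3, 2, 1.
lams = [3.0, 2.0, 1.0]
revs = [6.0, 5.6, 4.0]
choice = select_target(1.0, lams, revs)
print(choice)
```

For this menu the index returned by the scan coincides with the maximizer of $r_j - \lambda_j [V(t, n) - V(t, n - 1)]$ over $j$, which is the quantity the HJB equation maximizes.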
Proof. An informal derivation by the Principle of Optimality leads to the following Hamilton-Jacobi-
Bellman (HJB) equation that characterizes the optimal policy:
$$-\frac{\partial J^*(t, n)}{\partial t} + \max_{j = 1, 2, \ldots, K} \{ \lambda_j(t) [J^*(t, n - 1) - J^*(t, n) + p_j] \} = 0, \qquad (3)$$
where
$$\lambda_j(t) = \begin{cases} \lambda_j, & \text{if } I_j(t) = 1, \\ 0, & \text{otherwise.} \end{cases}$$
A rigorous justification can be obtained by using Theorem II.1 in Brémaud (1980). Suppose $i$ is
the smallest integer such that (2) holds. By the maximum concave envelope assumption on $P$ (a
crucial condition; see footnote 7), we have
(i) $\forall j < i$, $\lambda_j > \lambda_i$,
$$V(t, n) - V(t, n - 1) > \frac{r_{i-1} - r_i}{\lambda_{i-1} - \lambda_i} \ge \frac{r_j - r_i}{\lambda_j - \lambda_i}.$$
(ii) $\forall j > i$, $\lambda_j < \lambda_i$,
$$V(t, n) - V(t, n - 1) \le \frac{r_i - r_{i+1}}{\lambda_i - \lambda_{i+1}} \le \frac{r_i - r_j}{\lambda_i - \lambda_j}.$$
By (1), $-\partial V(t, n)/\partial t = \lambda_i [V(t, n) - V(t, n - 1)] - r_i$; thus for all $j \in \{1, 2, \ldots, K\}$,
$$-\frac{\partial V(t, n)}{\partial t} + \lambda_j [V(t, n - 1) - V(t, n)] + r_j = (\lambda_i - \lambda_j)[V(t, n) - V(t, n - 1)] + r_j - r_i \le 0.$$
Hence, $V(t, n)$ satisfies the HJB equation (3) and $\lambda_i$ achieves the optimum at $(t, n)$. In addition, no
other $\lambda_j$, $j \ne i$, can achieve the optimum if $r(\lambda)$ is strictly concave in $\lambda$.
7
The proof of Lemma 2 in Feng and Xiao (2000a) does not make this assumption explicitly.
2.2.2 The Oligopolistic Problem
The multi-dimensional version of the maximum concave envelope is guaranteed by Assumption 2,
which implies that the marginal revenue rate
$$\frac{r_i(\lambda_{i,l}; \lambda_{-i}) - r_i(\lambda_{i,l+1}; \lambda_{-i})}{\lambda_{i,l} - \lambda_{i,l+1}}$$
of player $i$ to switch sales target is increasing in $l$ for any given $\lambda_{-i}$. For an oligopolistic market
($m > 0$), let $n = (n_1, n_2, \ldots, n_m) \in \mathbb{Z}^m_+$ be the joint state vector and
$e_i \in \mathbb{R}^m_+$, $\forall i \in S$, be the $i$th
unit vector. We have the following extension of Lemma 1 to the multi-player game.
Theorem 1. Let $V_i(t, n)$, $i \in S$, be differentiable functions of $t$ for all given $n \in \mathbb{Z}^m_+$.
If $V_i(t, n)$, $\forall i \in S$, satisfy
$$-\frac{\partial V_i(t, n)}{\partial t} + \lambda_{i,j(i)} [V_i(t, n - e_i) - V_i(t, n)] + r_i(\lambda_{i,j(i)}; \vec{\lambda}_{k,j(k)}) = 0, \quad \forall i \in S, \qquad (4)$$
where $j(i) : S \to \mathbb{N}$ is a function mapping $i$ to the smallest integer $l = 1, 2, \ldots, K(i) - 1$ such that
$$V_i(t, n) - V_i(t, n - e_i) \le \frac{r_i(\lambda_{i,l}; \vec{\lambda}_{k,j(k)}) - r_i(\lambda_{i,l+1}; \vec{\lambda}_{k,j(k)})}{\lambda_{i,l} - \lambda_{i,l+1}}, \qquad (5)$$
with $\vec{\lambda}_{k,j(k)} := (\lambda_{k,j(k)}, k \in S \setminus \{i\})$, then an equilibrium sales policy $u^*$
at $(t, n)$ is $(\lambda_{i,j(i)}, i \in S)$
and $V_i(t, n)$, $i \in S$, are the value functions of the Nash equilibrium, i.e., $V_i(t, n) = J_i(t, n, u^*)$, $i \in S$.
Proof. If there exists a set of differentiable functions $V_i(t, n)$ satisfying the system of (4) and (5)
for any $i \in S$ simultaneously, then, by Lemma 1, $V_i(t, n)$, $\forall i \in S$, is the value function of the
best-response problem for firm $i$ when the competitors’ strategy is $\vec{\lambda}_{k,j(k)}$ at $(t, n)$, which
indicates that $V_i(t, n)$, $i \in S$, are value functions of the Nash equilibrium.
Remark 1. Since the system of HJB equations is a necessary and sufficient criterion to characterize
value functions of a Nash equilibrium, if $(\lambda_{i,j(i)}(t, n), i \in S)$ is the unique solution achieving
the optimum in the system (4), the stochastic game has a unique Nash equilibrium.
2.3 Construction of the Nash Equilibrium
The system of equations (4) potentially provides a dynamic programming scheme to construct the
value functions $V_i(t, n)$, $\forall i \in S$, explicitly with boundary conditions
$$V_i(0, n) = 0, \quad \forall i \in S, \ \forall n \in \mathbb{Z}^m_+, \qquad (6)$$
$$V_i(t, (n_1, \ldots, n_{i-1}, 0, n_{i+1}, \ldots, n_m)) = 0, \quad \forall n_j \in \mathbb{Z}_+, \ \forall j \ne i, \ \forall i \in S, \ \forall t \in \mathbb{R}_+. \qquad (7)$$
We first walk through the construction procedure for a duopoly game ($m = 2$) to ease the
illustration and gain some intuition, and then provide a generic algorithm to treat the general
multiple-player case ($m > 2$).
2.3.1 Solution Scheme for Duopoly (m = 2)
• For (n1 , n2 ) = (0, 0), we have V1 (t, (0, 0)) = 0, V2 (t, (0, 0)) = 0, ∀ t ∈ R+ .
• For $(n_1, n_2) = (1, 0)$, when only firm 1 has capacity to sell, we have $V_2(t, (1, 0)) = 0$. Since
$V_1(t, (1, 0)) \to 0$ as $t \downarrow 0$, the inequality
$$V_1(t, (1, 0)) - V_1(t, (0, 0)) \le \frac{r_1(\lambda_{1,1}) - r_1(\lambda_{1,2})}{\lambda_{1,1} - \lambda_{1,2}}$$
holds for $t \downarrow 0$. Hence, in the right neighborhood of $t = 0$, $\lambda_{1,1}$ is the unique optimal sales
target for firm 1. This is consistent with intuition: firm 2 has run out of inventory and, with little time
left, it is necessary for firm 1 to target sales at the highest level $\lambda_{1,1}$.
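Solving for $V_1(t, (1, 0))$ in the HJB equation (4) at $\lambda_{1,1}$ with the boundary condition
$V_1(0, (1, 0)) = 0$ yields
$$V_1(t, (1, 0)) = p_1(\lambda_{1,1}) \left( 1 - e^{-\lambda_{1,1} t} \right), \quad t \downarrow 0.$$
Since $V_1(t, (1, 0))$, $t \downarrow 0$, is strictly increasing and concave in $t$,
$$z_{1,1}(1, 0) := \sup \Big\{ t \ge 0 : V_1(t, (1, 0)) \le \frac{r_1(\lambda_{1,1}) - r_1(\lambda_{1,2})}{\lambda_{1,1} - \lambda_{1,2}} \Big\}$$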
is well-defined with sales target $\lambda_{1,1}$ optimal for $t \in (0, z_{1,1}(1, 0)]$. As $\lambda_{1,2}$ is
effective for $(z_{1,1}(1, 0), t]$ when $t \downarrow z_{1,1}(1, 0)$, we can solve for $V_1(t, (1, 0))$ from (4) with the
boundary condition
$V_1(z_{1,1}(1, 0), (1, 0)) = (r_1(\lambda_{1,1}) - r_1(\lambda_{1,2})) / (\lambda_{1,1} - \lambda_{1,2})$:
$$V_1(t, (1, 0)) = p_1(\lambda_{1,2}) + \lambda_{1,1} \frac{p_1(\lambda_{1,1}) - p_1(\lambda_{1,2})}{\lambda_{1,1} - \lambda_{1,2}} e^{-\lambda_{1,2} (t - z_{1,1}(1, 0))}, \quad t \downarrow z_{1,1}(1, 0).$$
Since $V_1(t, (1, 0))$, $t \downarrow z_{1,1}(1, 0)$, is strictly increasing and concave in $t$,
$$z_{1,2}(1, 0) := \sup \Big\{ t \ge z_{1,1}(1, 0) : V_1(t, (1, 0)) \le \frac{r_1(\lambda_{1,2}) - r_1(\lambda_{1,3})}{\lambda_{1,2} - \lambda_{1,3}} \Big\}$$
is well-defined with sales target $\lambda_{1,2}$ optimal for $t \in (z_{1,1}(1, 0), z_{1,2}(1, 0)]$.
The above procedure is repeated until $z_{1,K(1)-1}(1, 0)$ is calculated by the following recursive
formula, for $i = 2, \ldots, K(1) - 1$:
$$V_1(t, (1, 0)) = p_1(\lambda_{1,i}) + \lambda_{1,i-1} \frac{p_1(\lambda_{1,i-1}) - p_1(\lambda_{1,i})}{\lambda_{1,i-1} - \lambda_{1,i}} e^{-\lambda_{1,i} (t - z_{1,i-1}(1, 0))}, \quad z_{1,i-1}(1, 0) < t \le z_{1,i}(1, 0),$$
$$z_{1,i}(1, 0) := \sup \Big\{ t \ge z_{1,i-1}(1, 0) : V_1(t, (1, 0)) \le \frac{r_1(\lambda_{1,i}) - r_1(\lambda_{1,i+1})}{\lambda_{1,i} - \lambda_{1,i+1}} \Big\}.$$
Then the optimal policy is to apply sales target $\lambda_{1,i}$ when $t \in (z_{1,i-1}(1, 0), z_{1,i}(1, 0)]$,
$i = 1, 2, \ldots, K(1) - 1$, with the convention $z_{1,0}(1, 0) = 0$, and to apply sales target
$\lambda_{1,K(1)}$ when $t > z_{1,K(1)-1}(1, 0)$.
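Because each piece of $V_1(t, (1, 0))$ is an exponential, every threshold in the recursion above can be solved for in closed form. The sketch below does this for a single firm holding one unit, assuming a hypothetical menu (the same illustrative numbers as a three-point concave envelope) and a strictly positive lowest sales target, so that every cut-off level lies below the corresponding price.

```python
import math

def single_unit_thresholds(lams, prices):
    """Switching times z_1 <= ... <= z_{K-1} for one firm with one unit left.

    lams   : sales targets in strictly decreasing order (all > 0, an assumption)
    prices : matching prices, so revenue rates are lams[i] * prices[i]
    """
    revs = [l * p for l, p in zip(lams, prices)]
    z, v = 0.0, 0.0  # current switching time and the value V at that time
    thresholds = []
    for i in range(len(lams) - 1):
        cut = (revs[i] - revs[i + 1]) / (lams[i] - lams[i + 1])
        # On this piece V(t) = p_i + (v - p_i) * exp(-lam_i * (t - z)); solve V(t) = cut.
        z += math.log((prices[i] - v) / (prices[i] - cut)) / lams[i]
        v = cut
        thresholds.append(z)
    return thresholds

# Hypothetical menu: prices 2.0, 2.8, 4.0 at sales targets 3, 2, 1.
lams, prices = [3.0, 2.0, 1.0], [2.0, 2.8, 4.0]
z = single_unit_thresholds(lams, prices)
print(z)
```

With more time remaining than the last threshold, the firm would hold the lowest sales target (the highest price); the thresholds partition the remaining-time axis into the intervals on which each target is applied.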
• For (n1 , n2 ) = (0, 1), we have a similar result as (n1 , n2 ) = (1, 0).
• For $(n_1, n_2) = (1, 1)$, note that $V_1(t, (1, 1)) \to 0$ and $V_2(t, (1, 1)) \to 0$ as $t \downarrow 0$, and that
$V_1(t, (0, 1)) = 0$, $V_2(t, (1, 0)) = 0$; thus the inequalities
$$V_1(t, (1, 1)) - V_1(t, (0, 1)) \le \frac{r_1(\lambda_{1,1}, \lambda_{2,1}) - r_1(\lambda_{1,2}, \lambda_{2,1})}{\lambda_{1,1} - \lambda_{1,2}},$$
$$V_2(t, (1, 1)) - V_2(t, (1, 0)) \le \frac{r_2(\lambda_{1,1}, \lambda_{2,1}) - r_2(\lambda_{1,1}, \lambda_{2,2})}{\lambda_{2,1} - \lambda_{2,2}}$$
hold for $t \downarrow 0$. Hence, by Theorem 1, in the right neighborhood of $t = 0$, $(\lambda_{1,1}, \lambda_{2,1})$ is the
equilibrium sales target. Solving for $V_1(t, (1, 1))$, $V_2(t, (1, 1))$ in the HJB equation (4) at
$(\lambda_{1,1}, \lambda_{2,1})$ with the boundary conditions $V_1(0, (1, 1)) = 0$, $V_2(0, (1, 1)) = 0$, respectively, yields
$$V_1(t, (1, 1)) = p_1(\lambda_{1,1}, \lambda_{2,1}) \left( 1 - e^{-\lambda_{1,1} t} \right), \quad t \downarrow 0,$$
$$V_2(t, (1, 1)) = p_2(\lambda_{1,1}, \lambda_{2,1}) \left( 1 - e^{-\lambda_{2,1} t} \right), \quad t \downarrow 0.$$
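Since both $V_1(t, (1, 1))$ and $V_2(t, (1, 1))$, $t \downarrow 0$, are strictly increasing and concave in $t$,
$$z_{1,1}(1, 1) \leftarrow \sup \Big\{ t \ge 0 : V_1(t, (1, 1)) \le \frac{r_1(\lambda_{1,1}, \lambda_{2,1}) - r_1(\lambda_{1,2}, \lambda_{2,1})}{\lambda_{1,1} - \lambda_{1,2}} \Big\},$$
$$z_{2,1}(1, 1) \leftarrow \sup \Big\{ t \ge 0 : V_2(t, (1, 1)) \le \frac{r_2(\lambda_{1,1}, \lambda_{2,1}) - r_2(\lambda_{1,1}, \lambda_{2,2})}{\lambda_{2,1} - \lambda_{2,2}} \Big\}$$
are well-defined, with $(\lambda_{1,1}, \lambda_{2,1})$ being the equilibrium for $t \in (0, z_{1,1}(1, 1) \wedge z_{2,1}(1, 1)]$.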
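(1) If $z_{1,1}(1, 1) < z_{2,1}(1, 1)$, $(\lambda_{1,2}, \lambda_{2,1})$ should be the equilibrium for $(z_{1,1}(1, 1), t]$ when
$t \downarrow z_{1,1}(1, 1)$. Hence, we can solve for $V_1(t, (1, 1))$, $V_2(t, (1, 1))$ in the HJB equation (4) at
$(\lambda_{1,2}, \lambda_{2,1})$ with the boundary conditions
$$V_1(z_{1,1}(1, 1), (1, 1)) = \frac{r_1(\lambda_{1,1}, \lambda_{2,1}) - r_1(\lambda_{1,2}, \lambda_{2,1})}{\lambda_{1,1} - \lambda_{1,2}},$$
$$V_2(z_{1,1}(1, 1), (1, 1)) = p_2(\lambda_{1,1}, \lambda_{2,1}) \left( 1 - e^{-\lambda_{2,1} z_{1,1}(1, 1)} \right) \le \frac{r_2(\lambda_{1,1}, \lambda_{2,1}) - r_2(\lambda_{1,1}, \lambda_{2,2})}{\lambda_{2,1} - \lambda_{2,2}},$$
yielding, for $t \downarrow z_{1,1}(1, 1)$,
$$V_1(t, (1, 1)) = p_1(\lambda_{1,2}, \lambda_{2,1}) \left[ 1 - e^{-\lambda_{1,2} (t - z_{1,1}(1, 1))} \right] + V_1(z_{1,1}(1, 1), (1, 1)) \, e^{-\lambda_{1,2} (t - z_{1,1}(1, 1))},$$
$$V_2(t, (1, 1)) = p_2(\lambda_{1,2}, \lambda_{2,1}) \left[ 1 - e^{-\lambda_{2,1} (t - z_{1,1}(1, 1))} \right] + V_2(z_{1,1}(1, 1), (1, 1)) \, e^{-\lambda_{2,1} (t - z_{1,1}(1, 1))}.$$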
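We can verify that both $V_1(t, (1, 1))$ and $V_2(t, (1, 1))$, $t \downarrow z_{1,1}(1, 1)$, are strictly increasing and
concave in $t$; justifications are provided in the next section. By the concavity Assumption 2,
$$\frac{r_1(\lambda_{1,1}, \lambda_{2,1}) - r_1(\lambda_{1,2}, \lambda_{2,1})}{\lambda_{1,1} - \lambda_{1,2}} \le \frac{r_1(\lambda_{1,2}, \lambda_{2,1}) - r_1(\lambda_{1,3}, \lambda_{2,1})}{\lambda_{1,2} - \lambda_{1,3}}.$$
By the decreasing differences Assumption 3,
$$\frac{r_2(\lambda_{1,1}, \lambda_{2,1}) - r_2(\lambda_{1,1}, \lambda_{2,2})}{\lambda_{2,1} - \lambda_{2,2}} \le \frac{r_2(\lambda_{1,2}, \lambda_{2,1}) - r_2(\lambda_{1,2}, \lambda_{2,2})}{\lambda_{2,1} - \lambda_{2,2}}.$$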
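Therefore,
\[ z_{1,2}(1,1) \leftarrow \sup\Bigl\{ t \ge z_{1,1}(1,1) : V_1(t,(1,1)) \le \frac{r_1(\lambda_{1,2},\lambda_{2,1}) - r_1(\lambda_{1,3},\lambda_{2,1})}{\lambda_{1,2}-\lambda_{1,3}} \Bigr\}, \]
\[ z_{2,1}(1,1) \leftarrow \sup\Bigl\{ t \ge z_{1,1}(1,1) : V_2(t,(1,1)) \le \frac{r_2(\lambda_{1,2},\lambda_{2,1}) - r_2(\lambda_{1,2},\lambda_{2,2})}{\lambda_{2,1}-\lambda_{2,2}} \Bigr\} \]
are well-defined.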
Firm 1 switches from λ1,2 to λ1,3 right at z1,2(1, 1) if z1,2(1, 1) < z2,1(1, 1). Firm 2 switches from λ2,1 to λ2,2 right at z2,1(1, 1) if z1,2(1, 1) > z2,1(1, 1). Otherwise both firms switch at the same time z1,2(1, 1) = z2,1(1, 1). We can repeat the above procedure to extend V1(t, (1, 1)) and V2(t, (1, 1)) piecewise along the time horizon until z1,K(1)−1(1, 1) and z2,K(2)−1(1, 1) are obtained. Note that both V1(t, (1, 1)) and V2(t, (1, 1)) could have at most K(1) + K(2) segments.
(2) If z1,1(1, 1) ≥ z2,1(1, 1), the construction proceeds in the same way. Note that whenever switching times coincide, both firms switch their sales targets simultaneously.
• For n = (n1, n2) > (1, 1), we can follow the same procedure as for n = (1, 1) to construct the equilibrium value functions and the optimal switching times.
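The first segment of the duopoly construction at n = (1, 1) can be sketched numerically. Because V_i(t, (1, 1)) = p_i(1 − e^{−λ t}) near t = 0, the time at which it reaches a barrier b < p_i is −ln(1 − b/p_i)/λ. The linear inverse demand functions `p1` and `p2` and the two-option menus below are hypothetical illustrations, not data from the paper.

```python
import math

def first_switch_time(price, lam, barrier):
    """Time-to-go at which price * (1 - exp(-lam * t)) first reaches `barrier`."""
    if barrier >= price:
        return math.inf  # the value function never reaches the barrier
    return -math.log(1.0 - barrier / price) / lam

# Hypothetical linear inverse demand (illustration only, not from the paper).
def p1(l1, l2):
    return 10.0 - 2.0 * l1 - l2

def p2(l1, l2):
    return 12.0 - 2.0 * l2 - l1

lam1 = [2.0, 1.0]  # firm 1 menu: lambda_{1,1} > lambda_{1,2}
lam2 = [2.0, 1.0]  # firm 2 menu: lambda_{2,1} > lambda_{2,2}

# Marginal revenue rates that serve as switching barriers (r_i = lambda_i * p_i).
barrier1 = (lam1[0] * p1(lam1[0], lam2[0]) - lam1[1] * p1(lam1[1], lam2[0])) / (lam1[0] - lam1[1])
barrier2 = (lam2[0] * p2(lam1[0], lam2[0]) - lam2[1] * p2(lam1[0], lam2[1])) / (lam2[0] - lam2[1])

z11 = first_switch_time(p1(lam1[0], lam2[0]), lam1[0], barrier1)
z21 = first_switch_time(p2(lam1[0], lam2[0]), lam2[0], barrier2)
first_mover = 1 if z11 < z21 else 2
```

With these made-up inputs firm 1 reaches its barrier first, so case (1) of the construction applies and firm 2's barrier would then be re-evaluated at (λ1,2, λ2,1).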
2.3.2 Generic Algorithm for Oligopoly (m ≥ 2)
We first describe a generic algorithm to construct value functions and optimal switching times
for the multiple-player nonzero-sum Cournot competition, and then provide proofs to justify the
well-definedness of each step. Having gained insights from the case of m = 2, it will not be hard to
understand how the following algorithm works.
Algorithm 1. Generic Algorithm for Oligopoly Quantity Competition with Substitutable Products: To Compute the Equilibrium Switching Times for n, $\forall\, n \in \{z \in \mathbb{Z}^m_+ : \sum_{i=1}^{m} z_i \le L\}$:
Parameter. m; L ≥ 1; K(i), ∀ i ∈ S; Qi = {λi,1, λi,2, ..., λi,K(i)}, ∀ i ∈ S; pi(λ), $\forall\, \lambda \in \prod_{j=1}^{m} Q_j$, ∀ i ∈ S.
Step 0. Initialization.
SET l ← 1;
SET Vi(t, (n1, ..., ni−1, 0, ni+1, ..., nm)) ← 0, ∀ nj ∈ Z+, ∀ j ≠ i, ∀ i ∈ S, ∀ t ∈ [0, +∞).
Step 1. Construction.
FOR any $n \in \{z \in \mathbb{Z}^m_+ : l - 1 \le \sum_{i=1}^{m} z_i \le l\}$, DO the following:
Step 1.1. Initialization of current effective sales index c(i) and value functions around t = 0 for n.
SET c(i) ← 1, ∀ i ∈ S; SET
\[ V_i(t,n) \leftarrow \int_0^t \lambda_{i,c(i)} \bigl[ V_i(s, n-e_i) + p_i(\lambda_{j,c(j)}, j \in S) \bigr] e^{-\lambda_{i,c(i)}(t-s)}\, ds, \quad \forall\, t \in [0,+\infty),\ \forall\, i \in S. \]
Step 1.2. Construction of value functions and switching times zi,j(n) for n.
WHILE Sa ← {i ∈ S : c(i) < K(i)} ≠ ∅, DO the following: SET
\[ z_{i,c(i)}(n) \leftarrow \sup\Bigl\{ t \ge z_{i,c(i)-1}(n) : V_i(t,n) - V_i(t,n-e_i) \le \frac{r_i(\lambda_{j,c(j)}, j \in S) - r_i(\lambda_{i,c(i)+1}, \lambda_{j,c(j)}, j \in S \setminus \{i\})}{\lambda_{i,c(i)} - \lambda_{i,c(i)+1}} \Bigr\}, \quad \forall\, i \in S_a; \]
SET zc ← min{zi,c(i)(n), i ∈ Sa};
SET Sc ← {i ∈ Sa : zi,c(i)(n) = zc};
SET c(i) ← c(i) + 1, ∀ i ∈ Sc; SET
\[ V_i(t,n) \leftarrow \int_{z_c}^t \lambda_{i,c(i)} \bigl[ V_i(s, n-e_i) + p_i(\lambda_{j,c(j)}, j \in S) \bigr] e^{-\lambda_{i,c(i)}(t-s)}\, ds + V_i(z_c,n)\, e^{-\lambda_{i,c(i)}(t-z_c)}, \quad \forall\, t \in [z_c,+\infty),\ \forall\, i \in S. \]
Step 2. SET l ← l + 1;
IF l > L, STOP; OTHERWISE, GOTO Step 1.
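The steps of Algorithm 1 can be approximated on a time grid. The sketch below is not the exact construction: it replaces the integral updates by forward Euler steps of the equivalent differential form dV_i/dt = λ_i[p_i + V_i(t, n − e_i) − V_i(t, n)], and it detects a switch at the first grid point where the gap V_i(t, n) − V_i(t, n − e_i) exceeds the barrier. The function name `switching_times`, the callback signature `price(i, lam)`, and the convention that a stocked-out player sits at its lowest sales target are assumptions made for illustration.

```python
import itertools
import numpy as np

def switching_times(Q, price, L, horizon=10.0, dt=1e-3):
    """Q[i]: decreasing sales targets of player i; price(i, lam) -> p_i(lam).
    Returns V[(i, n)] (values on the grid) and Z[n][i] (switching times)."""
    m = len(Q)
    steps = int(round(horizon / dt))
    zero = np.zeros(steps + 1)
    V, Z = {}, {}

    def rate(i, lam):  # revenue rate r_i = lambda_i * p_i(lambda)
        return lam[i] * price(i, lam)

    for total in range(1, L + 1):  # sweep joint inventories by total stock
        for n in itertools.product(range(total + 1), repeat=m):
            if sum(n) != total:
                continue
            # Assumption: a stocked-out player sits at its lowest sales target.
            c = [0 if n[i] > 0 else len(Q[i]) - 1 for i in range(m)]
            below = []  # V_i(., n - e_i), zero when that state is a boundary
            for i in range(m):
                lower = tuple(n[j] - (j == i) for j in range(m))
                below.append(V.get((i, lower), zero))
            Vn = [np.zeros(steps + 1) for _ in range(m)]
            z = [[] for _ in range(m)]
            for k in range(steps):
                while True:  # Step 1.2: switch every player whose barrier is hit
                    lam = [Q[i][c[i]] for i in range(m)]
                    hit = []
                    for i in range(m):
                        if n[i] == 0 or c[i] == len(Q[i]) - 1:
                            continue
                        nxt = list(lam)
                        nxt[i] = Q[i][c[i] + 1]
                        barrier = (rate(i, lam) - rate(i, nxt)) / (lam[i] - nxt[i])
                        if Vn[i][k] - below[i][k] > barrier:
                            hit.append(i)
                    if not hit:
                        break
                    for i in hit:
                        c[i] += 1
                        z[i].append(k * dt)
                lam = [Q[i][c[i]] for i in range(m)]
                for i in range(m):  # Euler step of dV/dt = lam_i (p_i + V(n - e_i) - V(n))
                    if n[i] > 0:
                        drift = lam[i] * (price(i, lam) + below[i][k] - Vn[i][k])
                        Vn[i][k + 1] = Vn[i][k] + dt * drift
            for i in range(m):
                V[(i, n)] = Vn[i]
            Z[n] = z
    return V, Z

# Single-firm check with a hypothetical linear inverse demand p(lam) = 10 - 2 * lam.
V, Z = switching_times([[2.0, 1.0]], lambda i, lam: 10.0 - 2.0 * lam[i], L=1, horizon=2.0)
```

In the single-firm check the first piece is V(t) = 6(1 − e^{−2t}) and the barrier is 4, so the grid estimate of the switching time should sit close to ln(3)/2. A smaller `dt` tightens the approximation at the cost of run time.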
Remark 2. Lemmas 3 and 4, proved next, guarantee that Vi(t, n) − Vi(t, n − ei) is strictly increasing in t, which together with Assumptions 2-3 guarantees that zi,c(i)(n) is always well-defined.
Remark 3. The idea behind Algorithm 1 is to use a dynamic programming approach to construct value functions that satisfy the system of HJB equations. We denote the index of the current effective sales level for player i by c(i), and zc is the point in time at which some player(s) switch sales level. Vi(t, n), ∀ i ∈ S is then adjusted to such a switch and used to seek the next switching point. Assumptions 2-3 ensure that a switching point will not be altered after being determined, essentially guaranteeing zi,j ≤ zi,j+1, j = 1, 2, ..., K(i) − 2, ∀ i ∈ S. For any n, there are K(i) − 1 switching times for any player i along the whole time horizon, and Vi(t, n), ∀ i ∈ S is piecewise continuous in t but could have at most $\sum_{j=1}^{m} K(j)$ pieces.
Next we rigorously justify that Algorithm 1 is well-defined and indeed generates value functions
and sales strategies of a Nash equilibrium.
Lemma 2 (see footnote 8). If f(s) : R+ → R+ is an increasing and concave function of s with f(z) ≥ aλ (f(z) > aλ) and f′(z) ≤ λ[f(z) − aλ] for fixed λ > 0, z ≥ 0 and a ∈ R, then the function
\[ F(t) := \int_z^t f(s)\, e^{-\lambda(t-s)}\, ds + a\, e^{-\lambda(t-z)} \]
is also a (strictly) increasing and concave function of t for t ≥ z.
Proof. F(t) is the unique solution to the equation F′(t) = −λF(t) + f(t), F(z) = a. Since 0 ≤ f(z) ≤ f(s) ≤ f(t) for 0 ≤ z ≤ s ≤ t,
\[ \frac{\partial F(t)}{\partial t} = -\lambda \int_z^t f(s)\, e^{-\lambda(t-s)}\, ds - a\lambda e^{-\lambda(t-z)} + f(t) \ge [f(t) - a\lambda]\, e^{-\lambda(t-z)} \ge [f(z) - a\lambda]\, e^{-\lambda(t-z)} \ge 0, \]
with the last inequality holding strictly if f(z) > aλ.
Note that −F′(t) is the unique solution to the equation G′(t) = −λG(t) − f′(t), G(z) = aλ − f(z). Since f is increasing and concave, 0 ≤ f′(t) ≤ f′(s) ≤ f′(z) for 0 ≤ z ≤ s ≤ t, and therefore
\[ \frac{\partial}{\partial t}\Bigl( -\frac{\partial F(t)}{\partial t} \Bigr) = \lambda \int_z^t f'(s)\, e^{-\lambda(t-s)}\, ds - (a\lambda - f(z))\lambda e^{-\lambda(t-z)} - f'(t) \]
\[ \ge [\lambda f(z) - a\lambda^2 - f'(t)]\, e^{-\lambda(t-z)} \ge [\lambda f(z) - a\lambda^2 - f'(z)]\, e^{-\lambda(t-z)} \ge 0. \]
Remark 4. If f(s) is strictly concave in s a.s., F(t) is also strictly concave in t a.s.
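A quick numerical sanity check of Lemma 2 is sketched below for one hand-picked instance, chosen here purely for illustration: f(s) = 2 − e^{−s}, λ = 2, a = 0, z = 0. For this choice the integral has the closed form F(t) = 1 − e^{−t}, which is strictly increasing and concave. The code integrates F′ = −λF + f by forward Euler and inspects first and second differences.

```python
import numpy as np

lam, a, z = 2.0, 0.0, 0.0

def f(s):
    # Increasing and concave, with f(0) = 1 and f'(0) = 1.
    return 2.0 - np.exp(-s)

# Hypotheses of Lemma 2 at z = 0: f(z) > a*lam and f'(z) <= lam * (f(z) - a*lam).
hypotheses_hold = f(z) > a * lam and 1.0 <= lam * (f(z) - a * lam)

dt = 1e-3
t = np.arange(0.0, 5.0 + dt, dt)
F = np.empty_like(t)
F[0] = a
for k in range(len(t) - 1):
    # Forward Euler step for F'(t) = -lam * F(t) + f(t), F(z) = a.
    F[k + 1] = F[k] + dt * (-lam * F[k] + f(t[k]))

increasing = bool(np.all(np.diff(F) > 0))       # first differences positive
concave = bool(np.all(np.diff(F, n=2) < 0))     # second differences negative
max_error = float(np.max(np.abs(F - (1.0 - np.exp(-t)))))
```

This only illustrates the statement on one example; it is not a substitute for the proof.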
Lemma 3. Vi (t, ei ) is a strictly increasing, continuously differentiable and concave function of t.
Proof. This is the case of l = 1 in Algorithm 1, which reduces to a monopolistic problem. In view of the case (n1, n2) = (1, 0) in §2.3.1, the algorithm gives Vi(t, ei) = pi(λi,1)(1 − exp(−λi,1 t)) for 0 ≤ t < zi,1(ei), where zi,1(ei) = sup{t ≥ 0 : Vi(t, ei) ≤ (ri(λi,1) − ri(λi,2))/(λi,1 − λi,2)}, and for j = 2, ..., K(i) − 1,
\[ V_i(t,e_i) = p_i(\lambda_{i,j}) + \lambda_{i,j-1}\, \frac{p_i(\lambda_{i,j-1}) - p_i(\lambda_{i,j})}{\lambda_{i,j-1} - \lambda_{i,j}}\, e^{-\lambda_{i,j}(t - z_{i,j-1}(e_i))}, \quad z_{i,j-1}(e_i) < t \le z_{i,j}(e_i), \]
\[ z_{i,j}(e_i) := \sup\Bigl\{ t \ge z_{i,j-1}(e_i) : V_i(t,e_i) \le \frac{r_i(\lambda_{i,j}) - r_i(\lambda_{i,j+1})}{\lambda_{i,j} - \lambda_{i,j+1}} \Bigr\}. \]
The sales target λi,K(i) is effective when t > zi,K(i)−1(ei). It is easy to verify that Vi(t, ei) is strictly increasing in t ≥ 0, and continuously differentiable and concave at t ≠ zi,j(ei), j = 1, 2, ..., K(i) − 1. It remains to show that differentiability also holds at these switching times. As a matter of fact, we have for j = 1, 2, ..., K(i) − 1,
\[ \lim_{t \to z_{i,j}(e_i)-} \frac{\partial V_i(t,e_i)}{\partial t} = \lim_{t \to z_{i,j}(e_i)+} \frac{\partial V_i(t,e_i)}{\partial t} = -\lambda_{i,j}\lambda_{i,j+1}\, \frac{p_i(\lambda_{i,j}) - p_i(\lambda_{i,j+1})}{\lambda_{i,j} - \lambda_{i,j+1}} \ge 0. \]
Footnote 8. A simple version of this lemma in Feng and Xiao (2001, Lemma 3.2) is not sufficient to guarantee our results in Lemma 4 for z > 0.
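The monopolistic pieces in this proof chain together in closed form: on piece j the value function relaxes exponentially from its starting value toward p_j, so each switching time follows from solving one exponential equation. The sketch below assumes decreasing sales targets with matching increasing prices and a concave revenue rate; the function name and the three-option menu are hypothetical.

```python
import math

def monopoly_switching_times(lams, prices):
    """Switching times z_1 <= ... <= z_{K-1} for one firm holding one unit.

    lams: decreasing sales targets; prices: the matching increasing prices.
    """
    z = []
    v_start, t_start = 0.0, 0.0  # boundary condition V(0) = 0
    for j in range(len(lams) - 1):
        barrier = (lams[j] * prices[j] - lams[j + 1] * prices[j + 1]) / (lams[j] - lams[j + 1])
        if barrier <= v_start:
            z.append(t_start)  # barrier already reached: switch immediately
            continue
        # On this piece V(t) = p_j + (v_start - p_j) * exp(-lam_j * (t - t_start)).
        t_hit = t_start - math.log((prices[j] - barrier) / (prices[j] - v_start)) / lams[j]
        z.append(t_hit)
        v_start, t_start = barrier, t_hit
    return z

# Hypothetical menu: revenue rates 12, 10, 7 give barriers 2 and 3.
z = monopoly_switching_times([3.0, 2.0, 1.0], [4.0, 5.0, 7.0])
```

For this menu the first switch solves 4(1 − e^{−3t}) = 2, and the second solves 5 − 3e^{−2(t − z_1)} = 3.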
Lemma 4. If Vi (t, n − ei ) is a strictly increasing, continuously differentiable and strictly concave
function of t, then Vi (t, n) constructed from Vi (t, n − ei ) in Step 1.1.-1.2. of Algorithm 1 satisfies
that
(i) Vi (t, n) is a strictly increasing, continuously differentiable and strictly concave function of t;
(ii) Vi (t, n) − Vi (t, n − ei ) is strictly increasing in t.
Proof. Whenever some switching time zi′,c(i′)(n) attains the minimum among all indices of the set Sa in Algorithm 1, player i′ switches sales target from λi′,c(i′) to λi′,c(i′)+1 and Vi(t, n) is updated for t ≥ zi′,c(i′)(n), ∀ i ∈ Sa. Under Assumptions 2-3, no matter how the competitors lower their sales targets as the time-to-go increases, the barrier that Vi′(t, n) − Vi′(t, n − ei′) must reach for the switch from λi′,c(i′)+1 to λi′,c(i′)+2 is always greater than the current barrier at the switching point zi′,c(i′)(n). This implies zi,j ≤ zi,j+1, j = 1, 2, ..., K(i) − 2, ∀ i ∈ S.
It is not hard to show differentiability, since at any switching time zi,c(i)(n) (zc for short in this proof), we have
\[ \lim_{t \to z_c-} \frac{\partial V_i(t,n)}{\partial t} = \lim_{t \to z_c+} \frac{\partial V_i(t,n)}{\partial t} = -\lambda_{i,c(i)}\lambda_{i,c(i)+1}\, \frac{p_i(\lambda_{j,c(j)}, j \in S) - p_i(\lambda_{i,c(i)+1}, \lambda_{j,c(j)}, j \in S \setminus \{i\})}{\lambda_{i,c(i)} - \lambda_{i,c(i)+1}} \ge 0, \]
where λj,c(j), ∀ j ∈ S \ {i} is understood as the effective sales level of player j at the time zc when player i switches sales target from λi,c(i) to λi,c(i)+1.
Taking the derivative w.r.t. t on both sides of (4) yields
\[ \lambda_{i,c(i)}\, \frac{\partial [V_i(t,n) - V_i(t,n-e_i)]}{\partial t} = -\frac{\partial^2 V_i(t,n)}{\partial t^2} \ge 0, \qquad (8) \]
which reveals that the (strict) increasing property of Vi(t, n) − Vi(t, n − ei) in t is equivalent to the (strict) concavity of Vi(t, n) in t.
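As a worked step, the identity in (8) can be read off from the differential form of the recursion on a single piece, which is the same ordinary differential equation that appears in step (2) of this proof. On such a piece the effective indices c(j) are fixed, so the price term is a constant (an observation made here for illustration, under the notation of Algorithm 1):

```latex
\frac{\partial V_i(t,n)}{\partial t}
  = \lambda_{i,c(i)} \Bigl[ p_i(\lambda_{j,c(j)}, j \in S)
    - \bigl( V_i(t,n) - V_i(t,n-e_i) \bigr) \Bigr]
\quad\Longrightarrow\quad
\frac{\partial^2 V_i(t,n)}{\partial t^2}
  = -\lambda_{i,c(i)} \,
    \frac{\partial \bigl[ V_i(t,n) - V_i(t,n-e_i) \bigr]}{\partial t}.
```

Since λi,c(i) > 0, the second derivative is nonpositive exactly when the gap Vi(t, n) − Vi(t, n − ei) is nondecreasing in t.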
We use induction to show the strict increasing property and concavity of the value function Vi(t, n) in t, since it is constructed piecewise and sequentially in Steps 1.1-1.2 of Algorithm 1.
(1) In Step 1.1, where c(i) = 1, ∀ i ∈ S, in view of Lemma 2, it suffices to verify f(0) ≥ aλ and f′(0) ≤ λ[f(0) − aλ], where f(t) = λi,1[Vi(t, n − ei) + pi(λj,1, j ∈ S)], λ = λi,1 and a = 0. It is easy to verify both inequalities by noticing that f(0) = pi(λj,1, j ∈ S)λi,1 and f′(0) = pi(λj,1, j ∈ S)(λi,1)². By Lemma 2 and Remark 4, Vi(t, n) constructed in Step 1.1 is strictly increasing and concave in t.
(2) Suppose the value function Vi(t, n) constructed up to the point in time zc is strictly increasing and concave. Note that Vi(t, n), t ≥ zc is the unique solution to
\[ \frac{\partial V_i(t,n)}{\partial t} = -\lambda_{i,c(i)} V_i(t,n) + \lambda_{i,c(i)} \bigl[ V_i(t,n-e_i) + p_i(\lambda_{j,c(j)}, j \in S) \bigr], \]
with boundary condition Vi(zc, n) = Vi(zc−, n), the left limit at t = zc of the value function obtained from the previous stage. In view of Lemma 2, it suffices to verify f(zc) ≥ aλ and f′(zc) ≤ λ[f(zc) − aλ], where f(t) = λi,c(i)[Vi(t, n − ei) + pi(λj,c(j), j ∈ S)], λ = λi,c(i) and a = Vi(zc, n). It is easy to see that f(zc) ≥ aλ is equivalent to Vi(zc, n) − Vi(zc, n − ei) ≤ pi(λj,c(j), j ∈ S), which can be shown by the definition of zc, since
\[ V_i(z_c,n) - V_i(z_c,n-e_i) \le \frac{r_i(\lambda_{j,c(j)}, j \in S) - r_i(\lambda_{i,c(i)+1}, \lambda_{j,c(j)}, j \in S \setminus \{i\})}{\lambda_{i,c(i)} - \lambda_{i,c(i)+1}} \le p_i(\lambda_{j,c(j)}, j \in S). \]
It is also easy to see that f′(zc) ≤ λ[f(zc) − aλ] is equivalent to
\[ \frac{\partial V_i(t,n-e_i)}{\partial t}\Big|_{t=z_c} \le \lambda_{i,c(i)} \bigl[ V_i(z_c,n-e_i) - V_i(z_c,n) + p_i(\lambda_{j,c(j)}, j \in S) \bigr] = \frac{\partial V_i(t,n)}{\partial t}\Big|_{t=z_c}, \]
which can be guaranteed by the concavity of Vi(t, n) up to t = zc and the differentiability at t = zc.
Theorem 2. Vi (t, n), i ∈ S constructed in Algorithm 1 are indeed the value functions of a Nash
equilibrium and
(i) Vi (t, n) is a strictly increasing, continuously differentiable and strictly concave function of t;
(ii) Vi (t, n) − Vi (t, n − ei ) is strictly increasing in t.
Proof. Vi (t, n), i ∈ S satisfy Theorem 1 and thus are the value functions of a Nash equilibrium. By
Lemma 3, the statements (i)-(ii) are true for n = ei . Lemma 4 then leads to the conclusion for any
n through induction on n.
Remark 5. In a Nash equilibrium, for player i at the joint inventory level n, sales target λi,1 is
effective for 0 ≤ t < zi,1 , sales target λi,j is effective for zi,j−1 ≤ t < zi,j , ∀ j = 2, . . . , K(i) − 1 and
sales target λi,K(i) is effective if t ≥ zi,K(i)−1 .
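The threshold rule in Remark 5 amounts to a sorted lookup: with time-to-go t, the effective option is indexed by the number of switching times that are less than or equal to t. A minimal sketch follows, where the helper name `effective_sales_target` is hypothetical and the sample inputs are the menu of airline 1 from Example 4 and the (1,1) row of Table 1.

```python
from bisect import bisect_right

def effective_sales_target(t, switching_times, sales_targets):
    """Sales target in force with time-to-go t.

    switching_times: z_{i,1} <= ... <= z_{i,K(i)-1}
    sales_targets:   lambda_{i,1} > ... > lambda_{i,K(i)}
    """
    # bisect_right counts the switching times <= t, matching z_{i,j-1} <= t < z_{i,j}.
    return sales_targets[bisect_right(switching_times, t)]

targets = [4, 3, 2]   # sample menu
z = [0.16, 0.29]      # sample switching times
schedule = [effective_sales_target(t, z, targets) for t in (0.10, 0.16, 0.20, 0.29, 1.0)]
```

Using `bisect_right` rather than `bisect_left` makes a switching time itself belong to the later option, which is the half-open convention stated in the remark.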
Theorem 3. If p(λ) satisfies Assumptions 2 and 3 strictly, Vi(t, n), i ∈ S constructed in Algorithm 1 are the value functions of the unique Nash equilibrium.
Proof. If p(λ) satisfies a strong version of Assumptions 2 and 3, then for any i the marginal revenue rate is strictly increasing between two consecutive evaluations in Step 1.2, no matter which player switched sales level in the previous stage: a strong version of Assumption 2 guarantees that the marginal revenue rate is strictly increasing when player i has just switched his own sales target, and a strong version of Assumption 3 guarantees that the marginal revenue rate is strictly increasing when some player j, j ≠ i has just switched sales target. Hence, the zi,j(n) constructed in Algorithm 1 do not coincide for different j ∈ {1, 2, ..., K(i) − 1}. The solution described in Remark 5 is the unique solution to the system of HJB equations and, in view of Remark 1, it must be the unique Nash equilibrium.
3 Price Competition with Complementary Products
We consider an oligopolistic market of m differentiated complementary products with a set S = {1, 2, ..., m} of competitors. At time zero, firm i ∈ S has inventory ni ∈ Z+ = N ∪ {0} units of the complementary perishable asset and the same finite time t > 0 to sell them. We assume the salvage value of the asset at time t, the end of the selling horizon, is zero and that all other costs are sunk.
We assume a discrete price menu for each firm that contains finitely many options. For each firm i ∈ S, there are K(i) ∈ N price options, and we denote the set of them by Pi = {pi,1, pi,2, ..., pi,K(i)}, with 0 ≤ pi,1 < pi,2 < ··· < pi,K(i) < ∞. We denote the joint strategy space by $P := \prod_{i=1}^{m} P_i$. The market is assumed to be imperfect: demand is a function of the prices across the industry, and in Bertrand competition each firm competes by setting its price. At any time s, the current non-homogeneous Poisson demand intensity vector λ(s) = (λ1(s), λ2(s), ..., λm(s)) is determined by the price vector p(s) = (p1(s), p2(s), ..., pm(s)) with pi(s) ∈ Pi through a mapping $\lambda(p) : P \to \mathbb{R}^m_+$, $p \mapsto \lambda(p) = (\lambda_1(p), \lambda_2(p), \ldots, \lambda_m(p))$. Let p−i = (p1, ..., pi−1, pi+1, ..., pm), ∀ i ∈ S denote the price vector of the other m − 1 firms who compete with firm i. We denote the range of the mapping λ(p) by Q. For any given p−i, the revenue rate function for player i, ∀ i ∈ S is ri(pi; p−i) := pi λi(p). We assume the mapping λ(p) is known and that it satisfies the following assumptions:
Assumption 4 (One-to-One Mapping). The mapping $\lambda(p) : P \to \mathbb{R}^m_+$ has a one-to-one inverse mapping $p(\lambda) : Q \to \mathbb{R}^m_+$.
Assumption 5 (Concavity). ri(pi; p−i) is an increasing and concave function of pi for any p−i.
Assumption 6 (Decreasing Differences). ri(pi,l; p−i) − ri(pi,r; p−i) is decreasing in p−i for any pi,l > pi,r.
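Example 3. The direct demand mapping is
\[ \lambda_i(p) = (\beta\theta)^{1/(1-\beta\theta)}\, \frac{p_i^{1/(\beta-1)}}{\bigl( \sum_{j=1}^{m} p_j^{\beta/(\beta-1)} \bigr)^{(1-\theta)/(1-\beta\theta)}}, \quad \forall\, i \in S,\ -1 \le \beta < 0,\ -1 < \theta < 0, \]
and the inverse demand mapping is given by
\[ p_i(\lambda) = \beta\theta \Bigl( \sum_{j=1}^{m} \lambda_j^{\beta} \Bigr)^{\theta-1} \lambda_i^{\beta-1}, \quad \forall\, i \in S. \]
It is easy to see that such a demand function with constant elasticity satisfies Assumptions 4-6.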
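The two mappings of Example 3 can be checked against each other numerically. The sketch below implements both formulas as written and verifies that applying the inverse mapping to the demand vector recovers the prices; the parameter values β = θ = −0.5 and the price vector are arbitrary choices made for illustration.

```python
import numpy as np

def ces_demand(p, beta, theta):
    """Direct demand lambda_i(p) of Example 3."""
    p = np.asarray(p, dtype=float)
    bt = beta * theta
    denom = np.sum(p ** (beta / (beta - 1.0))) ** ((1.0 - theta) / (1.0 - bt))
    return bt ** (1.0 / (1.0 - bt)) * p ** (1.0 / (beta - 1.0)) / denom

def ces_inverse_price(lam, beta, theta):
    """Inverse demand p_i(lambda) of Example 3."""
    lam = np.asarray(lam, dtype=float)
    return beta * theta * np.sum(lam ** beta) ** (theta - 1.0) * lam ** (beta - 1.0)

beta, theta = -0.5, -0.5          # arbitrary values inside the stated ranges
prices = np.array([2.0, 3.0])     # arbitrary test prices
lam = ces_demand(prices, beta, theta)
recovered = ces_inverse_price(lam, beta, theta)

# Complementarity: raising the other firm's price lowers firm 1's demand.
lam_after_rise = ces_demand([2.0, 4.0], beta, theta)
```

The last line illustrates why these products are complements: the denominator grows with every price, so each firm's demand intensity falls when a competitor raises its price.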
Each firm competes for the market by adjusting its own price. At any time s ∈ [0, t], firm i, ∀ i ∈ S applies its own non-anticipating price pi(s) ∈ Pi. Let
\[ I_{i,j}(s) = \begin{cases} 1, & \text{if } p_{i,j} \in P_i \text{ is effective at } s, \\ 0, & \text{otherwise.} \end{cases} \]
A non-anticipating policy ui for firm i is defined to be
\[ u_i = \Bigl\{ \bigl( I_{i,1}(s), I_{i,2}(s), \ldots, I_{i,K(i)}(s) \bigr) : \sum_{j=1}^{K(i)} I_{i,j}(s) = 1,\ 0 \le s \le t \Bigr\}, \]
where the imposed constraint ensures that one and only one price for each firm is active at any given time. Let Ni,j(s), ∀ i ∈ S, j = 1, 2, ..., K(i) represent the accumulated number of items sold up to time s at price pi,j ∈ Pi. A demand for any firm i ∈ S is realized at price pi,j at time s if dNi,j(s) = 1. We denote by U the joint Markovian allowable pricing policy space: any joint allowable pricing policy u = (u1, u2, ..., um) ∈ U must satisfy, ∀ i ∈ S,
\[ \sum_{j=1}^{K(i)} \int_0^t I_{i,j}(s)\, dN_{i,j}(s) \le n_i, \quad \text{a.s.,} \]
and the price pi(s) set by firm i is a function of the elapsed time s, its own inventory level and the capacity levels of all other firms at time s, i.e.,
\[ p_i(s) = p_i(s, n_1(s), \ldots, n_m(s)), \]
where $n_i(s) := n_i - \sum_{j=1}^{K(i)} N_{i,j}(s)$ is the remaining inventory of firm i ∈ S at time s ∈ [0, t].
Given pricing policy u ∈ U, joint initial stock vector $n = (n_1, n_2, \ldots, n_m) \in \mathbb{Z}^m_+$ (see footnote 9) and a finite sales horizon t > 0, we denote the expected profit for any firm i ∈ S by
\[ J_i(t,n,u) := E\Bigl[ \sum_{j=1}^{K(i)} \int_0^t I_{i,j}(s)\, p_{i,j}(s)\, 1_{\{n_i(s-) > 0\}}\, dN_{i,j}(s) \Bigr]. \]
The goal of each firm i ∈ S is to maximize its total expected profit over [0, t] in the competitive market. We assume all firms have perfect information about each other's inventory levels. More specifically, all firms completely observe the joint state vector (ni(s), i ∈ S) at any time s ∈ [0, t], and act upon that information. A joint policy $u^* = (u_1^*, u_2^*, \ldots, u_m^*) \in U$ constitutes a Nash equilibrium if
\[ J_i(t, n, u_i, u_{-i}^*) \le J_i(t, n, u_i^*, u_{-i}^*), \quad \forall\, i \in S, \]
for any $(u_i, u_{-i}^*) = (u_1^*, \ldots, u_{i-1}^*, u_i, u_{i+1}^*, \ldots, u_m^*) \in U$.
As a dual to the quantity competition with substitutable products, we have the following con-
structive algorithm to compute the Nash equilibrium for the stochastic game. We omit the proofs
of the existence and uniqueness due to its analogy to the quantity competition case.
Algorithm 2. Generic Algorithm for Oligopoly Pricing Competition with Complementary Products: To Compute the Equilibrium Switching Times for n, $\forall\, n \in \{z \in \mathbb{Z}^m_+ : \sum_{i=1}^{m} z_i \le L\}$:
Parameter. m; L ≥ 1; K(i), ∀ i ∈ S; Pi = {pi,1, pi,2, ..., pi,K(i)}, ∀ i ∈ S; λi(p), $\forall\, p \in \prod_{j=1}^{m} P_j$, ∀ i ∈ S.
Step 0. Initialization.
SET l ← 1;
SET Vi(t, (n1, ..., ni−1, 0, ni+1, ..., nm)) ← 0, ∀ nj ∈ Z+, ∀ j ≠ i, ∀ i ∈ S, ∀ t ∈ [0, +∞).
Step 1. Construction.
FOR any $n \in \{z \in \mathbb{Z}^m_+ : l - 1 \le \sum_{i=1}^{m} z_i \le l\}$, DO the following:
Step 1.1. Initialization of current effective price index c(i) and value functions around t = 0 for n.
SET c(i) ← 1, ∀ i ∈ S; SET
\[ V_i(t,n) \leftarrow \int_0^t \lambda_i(p_{j,c(j)}, j \in S) \bigl[ V_i(s, n-e_i) + p_{i,c(i)} \bigr] e^{-\lambda_i(p_{j,c(j)}, j \in S)(t-s)}\, ds, \quad \forall\, t \in [0,+\infty),\ \forall\, i \in S. \]
Step 1.2. Construction of value functions and switching times zi,j(n) for n.
WHILE Sa ← {i ∈ S : c(i) < K(i)} ≠ ∅, DO the following: SET
\[ z_{i,c(i)}(n) \leftarrow \sup\Bigl\{ t \ge z_{i,c(i)-1}(n) : V_i(t,n) - V_i(t,n-e_i) \le \frac{r_i(p_{j,c(j)}, j \in S) - r_i(p_{i,c(i)+1}, p_{j,c(j)}, j \in S \setminus \{i\})}{p_{i,c(i)} - p_{i,c(i)+1}} \Bigr\}, \quad \forall\, i \in S_a; \]
SET zc ← min{zi,c(i)(n), i ∈ Sa};
SET Sc ← {i ∈ Sa : zi,c(i)(n) = zc};
SET c(i) ← c(i) + 1, ∀ i ∈ Sc; SET
\[ V_i(t,n) \leftarrow \int_{z_c}^t \lambda_i(p_{j,c(j)}, j \in S) \bigl[ V_i(s, n-e_i) + p_{i,c(i)} \bigr] e^{-\lambda_i(p_{j,c(j)}, j \in S)(t-s)}\, ds + V_i(z_c,n)\, e^{-\lambda_i(p_{j,c(j)}, j \in S)(t-z_c)}, \quad \forall\, t \in [z_c,+\infty),\ \forall\, i \in S. \]
Step 2. SET l ← l + 1;
IF l > L, STOP; OTHERWISE, GOTO Step 1.
Footnote 9. We omit the vector sign above all vectors for simplicity of notation. Readers should be able to tell vectors from the context.
4 Numerical Results
Example 4 (Quantity Competition). Two airlines compete to sell tickets for the same route.
Airline 1 has sales target or booking limit options λ1,1 = 4, λ1,2 = 3 and λ1,3 = 2 and airline 2 has
options λ2,1 = 3, λ2,2 = 2 and λ2,3 = 1. The demand system is estimated to fit an MNL model:
\[ \lambda_1(p_{1,j}, p_{2,l}) = M\, \frac{a_1 e^{-b_1 p_{1,j}}}{a_0 + a_1 e^{-b_1 p_{1,j}} + a_2 e^{-b_2 p_{2,l}}}, \]
\[ \lambda_2(p_{1,j}, p_{2,l}) = M\, \frac{a_2 e^{-b_2 p_{2,l}}}{a_0 + a_1 e^{-b_1 p_{1,j}} + a_2 e^{-b_2 p_{2,l}}}, \]
with the no-purchase value a0 = 1, a1 = 1, a2 = 1.1, b1 = 1/100, b2 = 1/90 and M = 400/12. This model assumes that each player has +∞ as a null price but does not necessarily have it as a price option. By using Algorithm 1, we compute the equilibrium switching times, which are listed in Table 1 and plotted in Figure 1.
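Since Example 4 specifies sales targets while the MNL model is written in prices, the price vector that induces a given pair of sales targets has to be backed out. A minimal sketch follows, using the parameter values stated in the example. The inversion uses the observation that M − λ1 − λ2 equals M times the no-purchase probability, which gives a_i e^{−b_i p_i} = a0 λ_i / (M − λ1 − λ2). The function names are hypothetical.

```python
import math

A0 = 1.0                    # no-purchase value a0
A = (1.0, 1.1)              # a1, a2
B = (1 / 100, 1 / 90)       # b1, b2
M = 400 / 12                # market size

def mnl_demand(p):
    """Demand intensities (lambda_1, lambda_2) at prices p = (p1, p2)."""
    w = [a * math.exp(-b * pi) for a, b, pi in zip(A, B, p)]
    total = A0 + sum(w)
    return [M * wi / total for wi in w]

def mnl_price(lam):
    """Prices that induce the sales targets lam; requires sum(lam) < M."""
    slack = M - sum(lam)    # M times the no-purchase probability
    return [-math.log(A0 * li / (a * slack)) / b for a, b, li in zip(A, B, lam)]

# Prices behind the highest sales targets of Example 4, (lambda_{1,1}, lambda_{2,1}) = (4, 3).
prices = mnl_price([4.0, 3.0])
round_trip = mnl_demand(prices)
```

Evaluating `mnl_price` over the nine combinations of sales targets would give the price table p_i(λ) that Algorithm 1 takes as a parameter.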

[Figure 1: Equilibrium Switching Times for Airline 1 and Airline 2. Left panel: "Equilibrium Switching Times for Airline 1", surfaces z1,1(n1,n2) and z1,2(n1,n2) plotted against n1, n2 with time t on the vertical axis, together with a sales sample path. Right panel: "Equilibrium Switching Times for Airline 2", surfaces z2,1(n1,n2) and z2,2(n1,n2), together with a sales sample path.]
Example 5 (Price Competition). Two airlines compete to sell tickets for the same route. Airline 1 has price options p1,1 = $200, p1,2 = $300 and p1,3 = $400, and airline 2 has price options p2,1 = $250, p2,2 = $300 and p2,3 = $360. The demand system is assumed to be the same as in Example 1. By using Algorithm 2, we still succeed in computing the equilibrium switching times for this case, which are listed in Table 2 and plotted in Figure 2. However, in general the algorithm might not work.

[Figure 2: Equilibrium Switching Times for Airline 1 and Airline 2. Left panel: "Equilibrium Switching Times for Airline 1", surfaces z1,1(n1,n2) and z1,2(n1,n2) plotted against n1, n2 with time t on the vertical axis, together with a sales sample path. Right panel: "Equilibrium Switching Times for Airline 2", surfaces z2,1(n1,n2) and z2,2(n1,n2), together with a sales sample path.]
n z1,1 (n) z1,2 (n) z2,1 (n) z2,2 (n) n z1,1 (n) z1,2 (n) z2,1 (n) z2,2 (n)
(0,0) 0 0 0 0 (0,3) 0 0 1.24 2.07
(1,0) 0.19 0.32 0 0 (1,3) 0.16 0.28 0.64 1.83
(2,0) 0.47 0.72 0 0 (2,3) 0.42 0.65 0.77 1.93
(3,0) 0.77 1.12 0 0 (3,3) 0.69 1.05 0.91 2.00
(4,0) 1.08 1.53 0 0 (4,3) 0.96 1.45 1.01 2.03
(5,0) 1.38 1.93 0 0 (5,3) 1.30 1.85 1.04 2.07
(0,1) 0 0 0.34 0.64 (0,4) 0 0 1.68 2.78
(1,1) 0.16 0.29 0.26 0.58 (1,4) 0.16 0.28 0.82 2.49
(2,1) 0.45 0.72 0.29 0.61 (2,4) 0.42 0.65 0.95 2.58
(3,1) 0.76 1.12 0.29 0.59 (3,4) 0.69 1.01 1.10 2.66
(4,1) 1.07 1.52 0.29 0.59 (4,4) 0.97 1.43 1.24 2.70
(5,1) 1.37 1.92 0.29 0.59 (5,4) 1.24 1.83 1.34 2.73
(0,2) 0 0 0.78 1.37 (0,5) 0 0 2.13 3.48
(1,2) 0.16 0.28 0.45 1.19 (1,5) 0.16 0.28 1.01 3.15
(2,2) 0.42 0.67 0.59 1.28 (2,5) 0.42 0.65 1.13 3.24
(3,2) 0.71 1.07 0.66 1.34 (3,5) 0.69 1.02 1.27 3.32
(4,2) 1.02 1.52 0.66 1.32 (4,5) 0.97 1.37 1.43 3.37
(5,2) 1.36 1.92 0.66 1.28 (5,5) 1.25 1.81 1.56 3.41
Table 1: Optimal Switching Times for (n1 , n2 ), 0 ≤ n1 ≤ 5, 0 ≤ n2 ≤ 5
5 Conclusion
For sales competition with substitutable products or price competition with complementary products, we obtain a threshold-type optimal control policy in closed form for each player to switch sales target or price, which can be sustained as an exact Nash equilibrium of the stochastic game. Such a policy can be constructed with reasonable computational effort.
For price competition with substitutable products and sales competition with complementary products, the Nash equilibrium generally fails to have a monotone threshold structure. As a variant of price competition with substitutable products, dynamic competition under a Multinomial Logit customer choice model, where each player sells finitely many differentiated products by making offer sets available, has a Nash equilibrium policy (if one exists) that is not necessarily nested by fare, in contrast to its monopolistic version.
n z1,1 (n) z1,2 (n) z2,1 (n) z2,2 (n) n z1,1 (n) z1,2 (n) z2,1 (n) z2,2 (n)
(0,0) 0 0 0 0 (0,3) 0 0 1.97 3.32
(1,0) 0.28 0.92 0 0 (1,3) 0.30 0.98 2.05 3.42
(2,0) 0.68 1.94 0 0 (2,3) 0.72 2.07 2.11 3.48
(3,0) 1.08 2.96 0 0 (3,3) 1.14 3.12 2.14 3.54
(4,0) 1.48 3.97 0 0 (4,3) 1.58 4.16 2.17 3.59
(5,0) 1.89 4.97 0 0 (5,3) 2.01 5.19 2.20 3.62
(0,1) 0 0 0.59 1.08 (0,4) 0 0 2.64 4.41
(1,1) 0.30 0.97 0.64 1.15 (1,4) 0.30 0.98 2.73 4.53
(2,1) 0.72 2.02 0.67 1.18 (2,4) 0.72 2.07 2.79 4.59
(3,1) 1.13 3.05 0.67 1.21 (3,4) 1.14 3.14 2.85 4.65
(4,1) 1.55 4.08 0.67 1.22 (4,4) 1.58 4.19 2.88 4.71
(5,1) 1.96 5.10 0.67 1.22 (5,4) 2.01 5.23 2.91 4.76
(0,2) 0 0 1.28 2.22 (0,5) 0 0 3.31 5.49
(1,2) 0.30 0.98 1.35 2.30 (1,5) 0.30 0.98 3.41 5.63
(2,2) 0.72 2.05 1.39 2.36 (2,5) 0.72 2.07 3.47 5.69
(3,2) 1.14 3.09 1.42 2.40 (3,5) 1.14 3.15 3.53 5.75
(4,2) 1.57 4.12 1.44 2.43 (4,5) 1.58 4.21 3.58 5.81
(5,2) 1.99 5.14 1.44 2.46 (5,5) 2.01 5.25 3.61 5.87
Table 2: Optimal Switching Times for (n1 , n2 ), 0 ≤ n1 ≤ 5, 0 ≤ n2 ≤ 5
References
Bertrand, J. 1883. Théorie mathématique de la richesse sociale. Journal des Savants 67 499–508.
Brander, J. A., A. Zhang. 1990. Market conduct in the airline industry: an empirical investigation. RAND Journal of Economics 21(4) 567–583.
Brémaud, P. 1980. Point Processes and Queues, Martingale Dynamics. Springer-Verlag, New York.
Cournot, A. 1838. Recherches sur les Principes Mathématiques de la Théorie des Richesses. Macmillan, Paris, France.
Feng, Y., G. Gallego. 1995. Optimal stopping times for end of season sales and optimal starting times for promotional fares. Management Sci. 41(8) 1371–1391.
Feng, Y., G. Gallego. 2000. Perishable asset revenue management with Markovian time dependent demand intensities. Management Sci. 46(7) 941–956.
Feng, Y., B. Xiao. 2000a. A continuous-time yield management model with multiple prices and reversible price changes. Management Sci. 46(5) 644–657.
Feng, Y., B. Xiao. 2000b. Optimal policies of yield management with multiple predetermined prices. Oper. Res. 48(2) 332–343.
Feng, Y., B. Xiao. 2001. A dynamic airline seat inventory control model and its optimal policy. Oper. Res. 49(6) 938–949.
Gallego, G., M. Hu. 2006. Dynamic pricing of perishable assets under competition. Working paper.
Gallego, G., G. van Ryzin. 1994. Optimal dynamic pricing of inventories with stochastic demand over finite horizons. Management Sci. 40(8) 999–1020.
Lin, K. Y., S. Y. Sibdari. 2006. Dynamic price competition with discrete customer choices. Working paper.
Phillips, R. L. 2005. Pricing and Revenue Optimization. Stanford University Press, Stanford, California.
Singh, N., X. Vives. 1984. Price and quantity competition in a differentiated duopoly. RAND Journal of Economics 15(4) 546–554.
Talluri, K. 2003. On equilibria in duopolies with finite strategy spaces. Working paper.
Talluri, K. T., G. J. van Ryzin. 2004. The Theory and Practice of Revenue Management. Kluwer Academic Publishers.
Vieille, N. 2000. Two-player stochastic games I: A reduction; Two-player stochastic games II: The case of recursive games; Small perturbations and stochastic games. Israel Journal of Mathematics 119 55–91; 92–126; 127–142.
Wei, W., M. Hansen. 2007. Airlines' competition in aircraft size and service frequency in duopoly markets. Transportation Research Part E 43 409–424.
