Stochastic Games
A (discounted) stochastic game with $N$ players consists of the following elements.
- A state space $\mathcal{X}$.
- For each player $i$ and state $x$, a set $A_i(x)$ of actions available to player $i$ in state $x$.
- For each player $i$, state $x$, and action vector $\boldsymbol{a} \in \displaystyle\prod_i A_i(x)$, a stage payoff $Q_i(\boldsymbol{a} ; x)$.
- For each state $x$ and action vector $\boldsymbol{a} \in \displaystyle\prod_i A_i(x)$, a transition probability $\mathbb{P}\left(x^{\prime} \mid x, \boldsymbol{a}\right)$ that is a distribution on the state space $\mathcal{X}$.
- A discount factor $\delta \in (0, 1)$.
- An initial state $x^0$.
Play proceeds as follows.
- The game starts in state $x^0$.
- At each stage $t$, all players simultaneously choose (possibly mixed) actions $a_i^t$, with possible pure actions given by the set $A_i\left(x^t\right)$.
- The stage payoffs $Q_i\left(\boldsymbol{a}^t ; x^t\right)$ are realized, and the next state is chosen according to $\mathbb{P}\left(\cdot \mid x^t, \boldsymbol{a}^t\right)$.
All players observe the entire past history of play before choosing their actions at stage $t$.
Let $s_i$ denote a strategy for player $i$ in this dynamic game; it is a mapping from histories (including states and actions) to actions. (After any history leading to state $x$, player $i$'s strategy must choose an action in $A_i(x)$.) Write $h^t$ for the history of play up to stage $t$, which includes the current state $x^t$. Given strategies $s_1, \ldots, s_N$, the expected discounted payoff of player $i$ starting from state $x^0$ is: $$ \Pi_i\left(s_1, \ldots, s_N ; x^0\right)=\mathbb{E}\left[\sum_{t=0}^{\infty} \delta^t Q_i\left(s_1\left(h^t\right), \ldots, s_N\left(h^t\right) ; x^t\right)\right] . $$ Here the expectation is over both the randomness in state transitions and the randomization in players' choices of actions after any history.
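As a rough illustration of the payoff formula, the sketch below evaluates the expected discounted payoffs for the special case in which every player's strategy depends only on the current state and is pure. The two-state, two-player game, its payoff numbers, the transition rule, and the names `PAYOFFS`, `transition`, and `evaluate` are all hypothetical and chosen only for illustration. The code assumes $0<\delta<1$ and iterates the recursion $V_i(x)=Q_i(\boldsymbol{a}(x) ; x)+\delta \sum_{x^{\prime}} \mathbb{P}\left(x^{\prime} \mid x, \boldsymbol{a}(x)\right) V_i\left(x^{\prime}\right)$ until it converges.

```python
from typing import Dict, List, Tuple

# Hypothetical two-player game with states "high" and "low"; every number
# below is illustrative and not taken from the text.
# PAYOFFS[x][a] = (Q_1(a; x), Q_2(a; x)) for the action vector a = (a_1, a_2).
PAYOFFS: Dict[str, Dict[Tuple[str, str], Tuple[float, float]]] = {
    "high": {("C", "C"): (3.0, 3.0), ("C", "D"): (0.0, 4.0),
             ("D", "C"): (4.0, 0.0), ("D", "D"): (1.0, 1.0)},
    "low":  {("C", "C"): (1.0, 1.0), ("C", "D"): (0.0, 1.5),
             ("D", "C"): (1.5, 0.0), ("D", "D"): (0.5, 0.5)},
}


def transition(x: str, a: Tuple[str, str]) -> Dict[str, float]:
    """P(. | x, a). In this toy example the next state depends only on the
    action vector: joint cooperation makes the high state more likely."""
    p_high = 0.9 if a == ("C", "C") else 0.2
    return {"high": p_high, "low": 1.0 - p_high}


def evaluate(strategies, delta: float, tol: float = 1e-10,
             max_iter: int = 10_000) -> Dict[str, List[float]]:
    """Expected discounted payoff of each player from each starting state,
    when strategies[i] maps the current state to player i's pure action."""
    states = list(PAYOFFS)
    n = len(strategies)
    values = {x: [0.0] * n for x in states}
    for _ in range(max_iter):
        updated = {}
        for x in states:
            a = tuple(s[x] for s in strategies)   # action vector in state x
            probs = transition(x, a)
            updated[x] = [
                PAYOFFS[x][a][i]
                + delta * sum(probs[y] * values[y][i] for y in states)
                for i in range(n)
            ]
        gap = max(abs(updated[x][i] - values[x][i])
                  for x in states for i in range(n))
        values = updated
        if gap < tol:   # successive iterates agree, so stop
            break
    return values


always_defect = {"high": "D", "low": "D"}
values = evaluate((always_defect, always_defect), delta=0.9)
print(values["high"], values["low"])
```

Strategies that condition on the full history would need a simulation over histories instead; the state-only case is used here because it keeps the recursion finite.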
Markov perfect equilibrium (MPE) refers to a subgame perfect equilibrium (SPE) in which all players use Markov strategies[1], that is, strategies that depend on the history of play only through the current state.
Existence result
An MPE exists in any finite multistage game with observed actions, and in infinite horizon games whose payoffs are continuous at $\infty$.
Example: A Duopoly Model[2]
Two firms are the only producers of a good, the demand for which is governed by a linear inverse demand function $$ \begin{equation} p=a_0-a_1\left(q_1+q_2\right) \label{Markov-perf-ex-demand} \end{equation} $$ Here $p=p_t$ is the price of the good, $q_i=q_{i t}$ is the output of firm $i=1,2$ at time $t$, and $a_0>0, a_1>0$ are demand parameters.
In $\eqref{Markov-perf-ex-demand}$ and what follows,
- the time subscript is suppressed when possible to simplify notation
- $\hat{x}$ denotes a next period value of variable $x$
Each firm recognizes that its output affects total output and therefore the market price.
The one-period payoff function of firm $i$ is price times quantity minus adjustment costs: $$ \begin{equation} \pi_i=p q_i-\gamma\left(\hat{q}_i-q_i\right)^2, \quad \gamma>0 \label{Markov-pref-ex-2} \end{equation} $$ Substituting the inverse demand curve $\eqref{Markov-perf-ex-demand}$ into $\eqref{Markov-pref-ex-2}$ lets us express the one-period payoff as $$ \pi_i\left(q_i, q_{-i}, \hat{q}_i\right)=a_0 q_i-a_1 q_i^2-a_1 q_i q_{-i}-\gamma\left(\hat{q}_i-q_i\right)^2, $$ where $q_{-i}$ denotes the output of the firm other than $i$.
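The substitution can be checked numerically. The following minimal sketch compares the payoff computed directly from price times quantity minus adjustment costs with the expanded expression; the function names and the parameter values ($a_0=10$, $a_1=2$, $\gamma=12$) are assumptions made for illustration only.

```python
def inverse_demand(q1: float, q2: float, a0: float, a1: float) -> float:
    """Market price p = a0 - a1 * (q1 + q2)."""
    return a0 - a1 * (q1 + q2)


def payoff_direct(q_i, q_other, q_i_next, a0, a1, gamma):
    """Price times quantity minus the quadratic adjustment cost."""
    p = inverse_demand(q_i, q_other, a0, a1)
    return p * q_i - gamma * (q_i_next - q_i) ** 2


def payoff_expanded(q_i, q_other, q_i_next, a0, a1, gamma):
    """The same payoff after substituting the inverse demand curve."""
    return (a0 * q_i - a1 * q_i ** 2 - a1 * q_i * q_other
            - gamma * (q_i_next - q_i) ** 2)


# Illustrative parameter values and quantities (assumptions, not from the text).
params = dict(a0=10.0, a1=2.0, gamma=12.0)
direct = payoff_direct(1.0, 1.5, 1.2, **params)
expanded = payoff_expanded(1.0, 1.5, 1.2, **params)
print(direct, expanded)
```

Both functions should agree up to floating-point rounding for any inputs, since the second is an algebraic rearrangement of the first.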
The objective of firm $i$ is to maximize $\sum_{t=0}^{\infty} \beta^t \pi_{i t}$, where $\beta \in (0,1)$ is the discount factor.
Firm $i$ chooses a decision rule that sets next period quantity $\hat{q}_i$ as a function $f_i$ of the current state $\left(q_i, q_{-i}\right)$.
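One standard way to make this precise, sketched here under the assumption that each firm takes the other firm's decision rule $f_{-i}$ as given and that $\beta$ is the discount factor in the firm's objective, is through a pair of Bellman equations. The value function $v_i$ is notation introduced for this sketch. A Markov perfect equilibrium of the duopoly is then a pair of value functions $\left(v_1, v_2\right)$ and decision rules $\left(f_1, f_2\right)$ such that, for each $i$,

$$ v_i\left(q_i, q_{-i}\right)=\max _{\hat{q}_i}\left\{\pi_i\left(q_i, q_{-i}, \hat{q}_i\right)+\beta v_i\left(\hat{q}_i, f_{-i}\left(q_{-i}, q_i\right)\right)\right\} $$

and $f_i\left(q_i, q_{-i}\right)$ attains the maximum on the right-hand side. Each firm's rule is thus a best response to the other's at every state, not only along the equilibrium path.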
See https://python.quantecon.org/markov_perf.html
Cont’d
Two firms set prices and quantities of two goods interrelated through their demand curves.
Relevant variables are defined as follows:
- $I_{i t}=$ inventories of firm $i$ at beginning of $t$
- $q_{i t}=$ production of firm $i$ during period $t$
- $p_{i t}=$ price charged by firm $i$ during period $t$
- $S_{i t}=$ sales made by firm $i$ during period $t$
- $E_{i t}=$ costs of production of firm $i$ during period $t$
- $C_{i t}=$ costs of carrying inventories for firm $i$ during $t$
The firms' cost functions are
- $C_{i t}=c_{i 1}+c_{i 2} I_{i t}+0.5 c_{i 3} I_{i t}^2$
- $E_{i t}=e_{i 1}+e_{i 2} q_{i t}+0.5 e_{i 3} q_{i t}^2$

where $e_{i j}, c_{i j}$ are positive scalars.
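Inventories obey the laws of motion
$$ I_{i, t+1}=(1-\delta) I_{i t}+q_{i t}-S_{i t} $$
where $\delta$ here acts as a depreciation rate on inventories, not as the discount factor of the stochastic game defined earlier.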
Demand is governed by the linear schedule $$ S_t=D p_t+b $$ where
- $S_t=\left[\begin{array}{ll}S_{1 t} & S_{2 t}\end{array}\right]^{\prime}$ and $p_t=\left[\begin{array}{ll}p_{1 t} & p_{2 t}\end{array}\right]^{\prime}$
- $D$ is a $2 \times 2$ negative definite matrix and
- $b$ is a vector of constants
Firm $i$ maximizes the undiscounted long-run average of its per-period profits
$$ \lim _{T \rightarrow \infty} \frac{1}{T} \sum_{t=0}^T\left(p_{i t} S_{i t}-E_{i t}-C_{i t}\right) $$
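To make the pieces of this model concrete, the sketch below computes one period of play for given prices and production levels: sales from the linear demand schedule (with the price vector stacking both firms' prices), each firm's per-period profit $p_{i t} S_{i t}-E_{i t}-C_{i t}$, and next-period inventories. It treats $\delta$ in the law of motion as a depreciation rate, which is an interpretation, and all numerical values and function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def sales(D: Sequence[Sequence[float]], b: Sequence[float],
          p: Sequence[float]) -> List[float]:
    """Linear demand S = D p + b for the two goods."""
    return [D[i][0] * p[0] + D[i][1] * p[1] + b[i] for i in range(2)]


def production_cost(e: Sequence[float], q: float) -> float:
    """E = e1 + e2 * q + 0.5 * e3 * q^2."""
    return e[0] + e[1] * q + 0.5 * e[2] * q ** 2


def carrying_cost(c: Sequence[float], inv: float) -> float:
    """C = c1 + c2 * I + 0.5 * c3 * I^2."""
    return c[0] + c[1] * inv + 0.5 * c[2] * inv ** 2


def step(inv, q, p, D, b, e, c, depreciation) -> Tuple[List[float], List[float]]:
    """Return (per-period profits, next-period inventories) for both firms."""
    S = sales(D, b, p)
    profits = [p[i] * S[i] - production_cost(e[i], q[i])
               - carrying_cost(c[i], inv[i]) for i in range(2)]
    # Law of motion: I' = (1 - delta) * I + q - S, with delta = depreciation.
    next_inv = [(1.0 - depreciation) * inv[i] + q[i] - S[i] for i in range(2)]
    return profits, next_inv


# Illustrative values only (assumptions, not calibrated to anything).
D = [[-1.0, 0.5], [0.5, -1.0]]    # negative definite demand slopes
b = [25.0, 25.0]
e = [(1.0, 2.0, 1.0)] * 2         # (e_i1, e_i2, e_i3) for each firm
c = [(1.0, 1.0, 2.0)] * 2         # (c_i1, c_i2, c_i3) for each firm
profits, next_inv = step(inv=[2.0, 2.0], q=[18.0, 18.0], p=[15.0, 15.0],
                         D=D, b=b, e=e, c=c, depreciation=0.02)
print(profits, next_inv)
```

Averaging the profits returned by repeated calls to `step` over a long horizon gives a finite-$T$ approximation of the firm's objective for fixed pricing and production rules.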
Lack of robustness
Markov perfect equilibria are not stable with respect to small changes in the game itself. A small change in payoffs can cause a large change in the set of Markov perfect equilibria. This is because a state with a tiny effect on payoffs can be used to carry signals, but if its payoff difference from any other state drops to zero, it must be merged with it, eliminating the possibility of using it to carry signals.
Related Papers
- Eric Maskin and Jean Tirole. A Theory of Dynamic Oligopoly, II: Price Competition, Kinked Demand Curves, and Edgeworth Cycles. Econometrica, 56(3):571-599, 1988.
- Eric Maskin and Jean Tirole. A Theory of Dynamic Oligopoly, III: Cournot Competition. European Economic Review, 31(4):947-968, 1987.
- Richard Ericson and Ariel Pakes. Markov-perfect industry dynamics: a framework for empirical work. The Review of Economic Studies, 62(1):53–82, 1995.
- Ulrich Doraszelski and Mark Satterthwaite. Computable markov-perfect industry dynamics. The RAND Journal of Economics, 41(2):215–243, 2010.
- Stephen P Ryan. The costs of environmental regulation in a concentrated industry. Econometrica, 80(3):1019–1061, 2012.
- Jaap H. Abbring, Jeffrey R. Campbell, Jan Tilly, Nan Yang. Very Simple Markov-Perfect Industry Dynamics: Theory. Econometrica, 86(2):721-735, 2018.
- Eric Maskin and Jean Tirole. Markov Perfect Equilibrium: I. Observable Actions. Journal of Economic Theory, 100(2):191-219, 2001.
- Eric Maskin and Jean Tirole. A Theory of Dynamic Oligopoly, I: Overview and Quantity Competition with Large Fixed Costs. Econometrica, 56(3):549-569, 1988.