Dynamic Programming

Deterministic Dynamic Programming

Introductory Example

Consider a standard discounted utility model: $$ \begin{aligned} \max_{\{c_t\}_{t=0}^\infty}\; & U= \sum_{t=0}^\infty \beta^t u(c_t) \\ \text{ s.t. } & \sum_{t=0}^\infty c_t \leq I \end{aligned} $$ The decision maker faces an intertemporal consumption choice problem with a total budget of $I$. The decision variable is a sequence $\{c_t: t=0,\dots, \infty\}$; as usual, we may assume $u(c)=\ln (c)$.

Construct the Lagrangian: $$ \mathcal{L} = \sum_{t=0}^\infty \beta^t u(c_t) + \lambda \left(I - \sum_{t=0}^\infty c_t\right) $$ where $\lambda$ is the Lagrange multiplier, representing the marginal value of the budget constraint.

Differentiating with respect to consumption $c_t$ in each period gives the first-order conditions: $$ \frac{\partial \mathcal{L}}{\partial c_t} = \beta^t u'(c_t) - \lambda = 0, \quad \forall t = 0,1,2,\dots $$

Comparing the first-order conditions of two adjacent periods:

  • Period $t$: $\beta^t u'(c_t) = \lambda$
  • Period $t+1$: $\beta^{t+1} u'(c_{t+1}) = \lambda$

Dividing one equation by the other gives the Euler equation:

$$ u'(c_t) = \beta u'(c_{t+1}) $$

Note that discounted marginal utility is equalized across periods.

When $u(c) = \ln(c)$, $u'(c) = \frac{1}{c}$; substituting into the Euler equation gives $c_{t+1}=\beta c_t$. Using the budget constraint, $$ \sum_{t=0}^\infty c_t = c_0 \sum_{t=0}^\infty \beta^t = \frac{c_0}{1-\beta} = I $$ we solve for:

$$ c_0 = (1-\beta)I, \quad c_t = (1-\beta)\beta^t I $$

Observe that:

  • The optimal consumption path is decreasing over time.
  • The larger $\beta$ is (the more patient the decision maker), the more slowly consumption declines and the lower consumption is in early periods.
  • Let $W_t=I-\sum_{i=0}^{t-1}c_i$ denote the savings remaining at time $t$; then $W_t=\beta W_{t-1}$, i.e., the saving rate is constant (see the numerical check below).
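As a quick sanity check, here is a minimal sketch in Python (with arbitrary illustrative values for $\beta$, $I$, and the truncation horizon) verifying that the closed-form path satisfies the Euler equation, approximately exhausts the budget, and implies $W_t = \beta W_{t-1}$.

```python
# Check the closed-form path c_t = (1 - beta) * beta**t * I
# (beta, I, and the truncation horizon T are arbitrary illustrative values).
import numpy as np

beta, I, T = 0.95, 100.0, 2000
t = np.arange(T)
c = (1 - beta) * beta**t * I                      # candidate optimal path

# Euler equation with log utility: 1/c_t = beta / c_{t+1}  <=>  c_{t+1} = beta * c_t
assert np.allclose(c[1:], beta * c[:-1])

# Budget constraint: the truncated sum approaches I as T grows
print(c.sum())                                    # ~= 100 (up to the beta**T tail)

# Remaining savings W_t = I - sum_{i<t} c_i satisfy W_t = beta * W_{t-1}
W = I - np.concatenate(([0.0], np.cumsum(c)[:-1]))
assert np.allclose(W[1:], beta * W[:-1])
```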

The same problem can also be expressed in dynamic programming form:

  • State variable: remaining wealth $w_t$
  • Control variable: current consumption $c_t$
  • State transition: $w_{t+1} = w_t - c_t$
  • Value function: $V(w) = \max_{\{c_t\}} \sum_{t=0}^\infty \beta^t u(c_t)$

The Bellman equation for this problem is: $$ V(w) = \max_{0 \leq w' \leq w} \left\{ \ln (w-w') + \beta V(w') \right\} $$ Assuming the optimization problem has an interior solution, the first-order condition is $$ \frac{1}{w-w'} = \beta V'(w') $$ and the envelope condition is $$ V'(w) = \frac{1}{w-w'} \implies V'(w') = \frac{1}{w'-w''} $$ Combining the two conditions, $$ {w'-w''}= \beta ({w-w'}) $$ that is, consumption $w-w'$ again falls by the factor $\beta$ each period, consistent with the Euler equation above.
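As a numerical check, the sketch below assumes the well-known closed form $V(w) = \ln(w)/(1-\beta) + A$ with $A = \big[\ln(1-\beta) + \tfrac{\beta}{1-\beta}\ln\beta\big]/(1-\beta)$ (this closed form is not derived in the text; it is taken here as an assumption to verify) and confirms that it satisfies the Bellman equation, with maximizer $w' \approx \beta w$.

```python
# Guess-and-verify check of V(w) = max_{0 <= w' <= w} { ln(w - w') + beta * V(w') }.
# The candidate closed form V(w) = ln(w)/(1-beta) + A is an assumption being checked.
import numpy as np

beta = 0.95
A = (np.log(1 - beta) + beta / (1 - beta) * np.log(beta)) / (1 - beta)
V = lambda w: np.log(w) / (1 - beta) + A

for w in [0.5, 1.0, 5.0, 20.0]:
    w_prime = np.linspace(1e-6, w - 1e-6, 200_000)    # fine grid of choices w'
    rhs = np.log(w - w_prime) + beta * V(w_prime)     # right-hand side of the Bellman equation
    j = rhs.argmax()
    print(w, rhs[j] - V(w), w_prime[j] / w)           # ~0 and ~beta for every w
```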

Define the social planning problem as $$ \begin{aligned} \max_{\left\{k_{t+1}\right\}_{t=0}^{\infty}} & \; \sum_{t=0}^{\infty} \beta^t u\left[g(k_t)-k_{t+1}\right] \\ \text{ s.t. } & 0\leq k_{t+1} \leq g(k_t) \end{aligned} $$ Assuming an interior solution, the first order condition for $k_{t+1}$ is $$ -\beta^t u^{\prime}\left[g\left(k_t\right)-k_{t+1}\right]+\beta^{t+1} u^{\prime}\left[g\left(k_{t+1}\right)-k_{t+2}\right] g^{\prime}\left(k_{t+1}\right)=0 $$ Rearranged, this gives us the familiar Euler equation $$ u^{\prime}\left[g\left(k_t\right)-k_{t+1}\right]=\beta u^{\prime}\left[g\left(k_{t+1}\right)-k_{t+2}\right] g^{\prime}\left(k_{t+1}\right), \quad t=0,1,2 \ldots $$

Consider the analogous finite horizon problem $$ \begin{aligned} \max_{\left\{k_{t+1}\right\}_{t=0}^{T}} & \; \sum_{t=0}^{T} \beta^t u\left[g(k_t)-k_{t+1}\right] \\ \text{ s.t. } & 0\leq k_{t+1} \leq g(k_t) \end{aligned} $$

Solving the Problem Recursively

$$ \begin{equation} v(k)=\max _{0 \leq k^{\prime} \leq g(k)} \; u\left[g(k)-k^{\prime}\right]+\beta v\left(k^{\prime}\right) \label{infi-recursive} \end{equation} $$ When there is an interior solution and $v$ is differentiable, the first-order condition for the maximization problem in $\eqref{infi-recursive}$ is $$ u^{\prime}\left[g(k)-k^{\prime}\right]=\beta v^{\prime}\left(k^{\prime}\right) $$ We also have the envelope condition $$ v^{\prime}(k)=u^{\prime}\left[g(k)-k^{\prime}\right] g^{\prime}(k) $$

A General Dynamic Optimization Problem

Stokey and Lucas write down a general dynamic optimization problem $$ \begin{aligned} \max _{\left\{x_{t+1}\right\}_{t=0}^{\infty}} & \; \sum_{t=0}^{\infty} \beta^t F\left(x_t, x_{t+1}\right) \\ \text{ s.t. } & \; x_{t+1} \in \Gamma(x_t) \end{aligned} $$ whose recursive formulation is the dynamic programming problem $$ v(x)=\max _{y \in \Gamma(x)} F(x, y)+\beta v(y) $$ with optimality conditions $$ \begin{aligned} &\text{FOC}: & F_y(x, y)+\beta v^{\prime}(y)=0 \\ &\text{Envelope}: & v^{\prime}(x)=F_x(x, y) \end{aligned} $$

Blackwell’s Theorem

Let $T: \mathcal{F} \rightarrow \mathcal{F}$ be an operator on a metric space $\left(\mathcal{F}, d_{\infty}\right)$, where $\mathcal{F}$ is a set of bounded functions and $d_{\infty}$ is the sup norm. Assume that $T$ has the following properties:

  1. Monotonicity: For any $f, g \in \mathcal{F}, f \geq g \Rightarrow T f \geq T g$.
  2. Discounting: There exists some $0 \leq \beta<1$ such that for every constant $c>0$ and every $f \in \mathcal{F}$, $T(f+c) \leq T f+\beta c$.

Then $T$ is a contraction mapping with modulus $\beta$.
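To make these conditions concrete, here is a minimal sketch (arbitrary parameter values, grid, and random test functions) that applies a grid-based Bellman operator for the growth model used in the examples below, $(Tv)(k) = \max_{0 \le k' \le k^{\alpha}}\{\ln(k^{\alpha} - k') + \beta v(k')\}$, and checks monotonicity, discounting, and the implied sup-norm contraction numerically.

```python
# Numerical illustration of Blackwell's sufficient conditions for a Bellman operator
# (T v)(k) = max_{0 <= k' <= k**alpha} { ln(k**alpha - k') + beta * v(k') } on a capital grid.
import numpy as np

alpha, beta = 0.3, 0.9
grid = np.linspace(0.05, 0.5, 300)
c = grid[:, None]**alpha - grid[None, :]                  # consumption k**alpha - k'
payoff = np.where(c > 0, np.log(np.maximum(c, 1e-12)), -np.inf)  # mask infeasible choices

def T(v):
    return (payoff + beta * v[None, :]).max(axis=1)

rng = np.random.default_rng(0)
f = rng.normal(size=grid.size)
g = f - np.abs(rng.normal(size=grid.size))                # g <= f pointwise

print(np.all(T(f) >= T(g)))                               # monotonicity: True
print(np.allclose(T(f + 2.0), T(f) + beta * 2.0))         # discounting (with equality here): True
print(np.max(np.abs(T(f) - T(g))), beta * np.max(np.abs(f - g)))  # contraction: first <= second
```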

Examples

Budget Planning

Consider $u(c_t) = \ln c_t, \, g(k_{t})=k_{t}^\alpha$

The finite horizon case $$ v_{t} (k_t) = \max_{k_{t+1}} \{\ln (k_{t}^\alpha - k_{t+1}) + \beta v_{t+1}(k_{t+1})\} $$

FOC: $$ \frac{1}{k_t^\alpha - k_{t+1}} = \beta v^\prime _{t+1}(k_{t+1}) $$ Envelope:

$$ v^\prime _t(k_t) = \frac{\alpha k_t^{\alpha-1}}{k_t^\alpha - k_{t+1}} \quad \overset{t\to t+1}{\Longrightarrow}\quad \beta v^\prime_{t+1}(k_{t+1}) = \beta \frac{\alpha k_{t+1}^{\alpha-1}}{k_{t+1}^\alpha - k_{t+2}} $$

We have the following condition $$ \frac{1}{k_t^\alpha-k_{t+1}}=\beta\left(\frac{1}{k_{t+1}^\alpha-k_{t+2}}\right) \alpha k_{t+1}^{\alpha-1} $$ Define $$ z_t = \frac{k_{t+1}}{k_{t}^\alpha} $$ Then the condition can be written as $$ \frac{1}{k_t^\alpha}\left(\frac{1}{1-z_t}\right)=\frac{\beta}{k_{t+1}^\alpha}\left(\frac{1}{1-z_{t+1}}\right) \alpha k_{t+1}^{\alpha-1} $$ which simplifies to $$ z_{t+1}=1+\alpha \beta-\frac{\alpha \beta}{z_t} $$ Starting from $z_T = 0$, we can carry out the recursion backward explicitly: $$ z_T=0=1+\alpha \beta-\frac{\alpha \beta}{z_{T-1}} \quad \Rightarrow \quad z_{T-1} = \frac{\alpha \beta}{1 + \alpha \beta} $$

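The backward recursion is easy to carry out numerically. The sketch below (with arbitrary values of $\alpha$, $\beta$, and $T$) inverts the map to $z_t = \alpha\beta/(1+\alpha\beta - z_{t+1})$ and shows that the saving rate approaches $\alpha\beta$ as the horizon grows.

```python
# Backward recursion for the saving rate z_t = k_{t+1} / k_t**alpha implied by
# z_{t+1} = 1 + alpha*beta - alpha*beta / z_t with terminal condition z_T = 0.
import numpy as np

alpha, beta, T = 0.3, 0.95, 50            # arbitrary illustrative values
ab = alpha * beta

z = np.empty(T + 1)
z[T] = 0.0
for t in range(T - 1, -1, -1):
    z[t] = ab / (1 + ab - z[t + 1])       # invert the forward recursion

print(z[T - 1], ab / (1 + ab))            # equal: alpha*beta / (1 + alpha*beta)
print(z[0], ab)                           # z_0 approaches alpha*beta as T grows
```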

The infinite horizon case $$ v (k_t) = \max_{k_{t+1}} \{\ln (k_{t}^\alpha - k_{t+1}) + \beta v(k_{t+1})\} $$ We have the following condition $$ \frac{1}{c_t}=\alpha \beta \frac{1}{c_{t+1}} k_{t+1}^{\alpha-1} $$ and budget constraint $$ c_t + k_{t+1} = k_{t}^\alpha $$ Rewriting and combining the two conditions, and guessing a constant saving rate $k_{t+1} = z\,k_t^\alpha$ (the fixed point $z=\alpha\beta$ of the recursion above), gives the policy $$ k_{t+1} = \alpha\beta\, k_t^\alpha, \qquad c_t = (1-\alpha\beta)\, k_t^\alpha $$
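As a check, the following sketch solves the infinite-horizon Bellman equation by value function iteration on a grid (grid bounds, grid size, and tolerance are arbitrary choices) and compares the computed policy with $k' = \alpha\beta k^\alpha$.

```python
# Value function iteration for v(k) = max_{0 <= k' <= k**alpha} { ln(k**alpha - k') + beta * v(k') },
# then compare the numerical policy with the closed form k' = alpha*beta*k**alpha.
import numpy as np

alpha, beta = 0.3, 0.95
grid = np.linspace(0.05, 0.5, 400)
c = grid[:, None]**alpha - grid[None, :]                  # consumption for each (k, k') pair
payoff = np.where(c > 0, np.log(np.maximum(c, 1e-12)), -np.inf)

v = np.zeros_like(grid)
while True:
    v_new = (payoff + beta * v[None, :]).max(axis=1)      # apply the Bellman operator
    if np.max(np.abs(v_new - v)) < 1e-10:
        break
    v = v_new

policy = grid[(payoff + beta * v[None, :]).argmax(axis=1)]
print(np.max(np.abs(policy - alpha * beta * grid**alpha)))  # small, grid-sized error
```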

Discounted Utility Model

A decision maker chooses an allocation $\{x_t\}_{t=1}^T$ to maximize total utility $$ \begin{aligned} \max\; & \sum_{t=1}^T u_t(x_t) \\ \text{s.t.} & \sum_{t=1}^T x_t = I \end{aligned} $$


Appendix

Envelope Theorem

Consider an example of the following minimization problem $$ \min_{x} f(x, \theta) = x^2 - \theta x $$ Let $v(\theta)=\displaystyle \min _{x} f(x, \theta)$, and we want to find $\displaystyle\frac{\mathrm{d}v }{\mathrm{d} \theta}$ (the envelope argument below applies equally to maximization and minimization).

Clearly, $f(x, \theta)$ attains its minimum w.r.t. $x$ at $x^\ast=\theta/2$, and thus $v(\theta) = -\theta^2/4$.

The envelope theorem states that $$ \frac{\mathrm{d}v }{\mathrm{d} \theta} = \frac{\partial f}{\partial \theta}\bigg|_{x=x^\ast} = -x \bigg| _{x=\theta/2} = -\frac{\theta}{2} $$ This can be understood via $$ \frac{\mathrm{d}v }{\mathrm{d} \theta} = \frac{\mathrm{d} f(x^\ast(\theta), \theta)}{\mathrm{d} \theta} = \frac{\partial f}{\partial \theta} + \frac{\partial f}{\partial x}\frac{\partial x^\ast}{\partial \theta} = \frac{\partial f}{\partial \theta}\bigg| _{x=x^\ast(\theta)} $$ where $\partial f / \partial x = 0$ follows from the first-order condition for optimizing $f(x, \theta)$.
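A quick finite-difference check of this formula for the example above (the step size and the value of $\theta$ are arbitrary choices):

```python
# Finite-difference check of the envelope formula for v(theta) = min_x { x**2 - theta*x }:
# dv/dtheta should equal the partial derivative of f w.r.t. theta evaluated at x* = theta/2.
import numpy as np

f = lambda x, theta: x**2 - theta * x
v = lambda theta: f(theta / 2, theta)           # optimized value, using x* = theta/2

theta, h = 1.3, 1e-6
dv = (v(theta + h) - v(theta - h)) / (2 * h)    # numerical dv/dtheta
df_dtheta = -(theta / 2)                        # partial f / partial theta at x = x*
print(dv, df_dtheta)                            # both ~ -0.65
```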

The Envelope Theorem: For the maximization problem $\displaystyle\max _{x \in \mathbb{R}} f(x ; \theta)$, if $f: \mathbb{R}^2 \rightarrow \mathbb{R}$ is a $C^1$ function and the solution function $x^\ast(\theta)$ for the maximization problem is differentiable at $\bar{\theta}$, then the derivative of the value function satisfies $v^{\prime}(\bar\theta)=\displaystyle\frac{\partial f}{\partial \theta}\big(x^\ast(\bar\theta) ; \bar{\theta}\big)$.

Envelope Theorem for Constrained Optimization

Now let’s add a constraint to the maximization problem $$ v(\theta) = \left[ \begin{aligned} \max_x \; & f(x; \theta) \\ \text{ s.t. } & G(x; \theta) \leq 0 \end{aligned} \right] $$ We assume $f, G$ are continuously differentiable functions and $x^\ast(\theta)$ exists and is differentiable.

The first-order condition for the maximization problem is that, for some $\lambda \geq 0$, $$ \frac{\partial f}{\partial x} (x; \theta) = \lambda \frac{\partial G }{\partial x}(x; \theta) $$ The envelope theorem states that $$ \begin{aligned} \frac{\mathrm{d} v}{\mathrm{d} \theta} & = \frac{\partial f}{\partial \theta} + \frac{\partial f}{\partial x} \frac{\partial x^\ast}{\partial \theta} \\ & = \frac{\partial f}{\partial \theta} + \lambda \frac{\partial G}{\partial x} \frac{\partial x^\ast}{\partial \theta} \qquad (\text{FOC}) \\ & = \frac{\partial f}{\partial \theta} -\lambda \frac{\partial G}{\partial \theta} \qquad \qquad \left(\frac{\partial G}{\partial x} \frac{\partial x^\ast}{\partial \theta} = -\frac{\partial G}{\partial \theta}, \text{ from differentiating the binding constraint } G(x^\ast(\theta); \theta) = 0\right) \end{aligned} $$
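A numerical illustration with a hypothetical example (not from the text): take $f(x) = \ln x$ and $G(x; \theta) = x - \theta \leq 0$, so the constraint binds, $x^\ast(\theta) = \theta$, and the FOC gives $\lambda = 1/\theta$.

```python
# Finite-difference check of dv/dtheta = df/dtheta - lambda * dG/dtheta for
# f(x) = ln(x), G(x; theta) = x - theta <= 0 (a hypothetical illustrative example).
import numpy as np

theta, h = 2.0, 1e-6
v = np.log                                      # v(theta) = ln(theta), since x* = theta

dv = (v(theta + h) - v(theta - h)) / (2 * h)    # numerical dv/dtheta
lam = 1.0 / theta                               # FOC: f'(x*) = lambda * G_x  =>  1/theta = lambda
envelope = 0.0 - lam * (-1.0)                   # df/dtheta - lambda * dG/dtheta
print(dv, envelope)                             # both ~ 0.5
```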

Berge's Maximum Theorem

Berge's maximum theorem describes the continuity properties of the optimal value of the objective function and of the set of optimal solutions as the parameters vary.

Let:

  • $X \subseteq \mathbb{R}^n$ be the set of decision variables
  • $\Theta$ be the set of parameters
  • $f: X \times \Theta \to \mathbb{R}$ be the objective function
  • $C: \Theta \rightrightarrows X$ be a constraint correspondence that assigns to each parameter $\theta$ a feasible set $C(\theta) \subseteq X$

Define:

  • Value function: $V(\theta) = \displaystyle\sup_{x \in C(\theta)} f(x, \theta)$
  • Optimal solution set: $X^\ast(\theta) = \{ x \in C(\theta) \mid f(x, \theta) = V(\theta) \}$

If the following conditions hold:

  1. The objective function $f$ is continuous.
  2. Each constraint set $C(\theta)$ is compact (in $\mathbb{R}^n$, this means closed and bounded), and the correspondence $C$ is continuous (both upper hemicontinuous and lower hemicontinuous).

Then:

  • The maximum value function $V(\theta)$ is well-defined and continuous.
  • The optimal policy correspondence $X^\ast(\theta)$ is nonempty, compact-valued, and upper hemicontinuous.

Upper hemicontinuity (uhc) of the correspondence guarantees that the set of optimal solutions does not "suddenly expand" as the parameter varies continuously (its graph cannot "blow up" from a single point into a large set).

However, the theorem does not guarantee that $X^\ast(\theta)$ is lower hemicontinuous. This means the set of optimal solutions may "suddenly shrink" as the parameter changes (for example, collapsing from a line segment to a single point). Consequently, the optimal solution $x^\ast(\theta)$ is not guaranteed to be a continuous function of $\theta$ unless the optimum is unique.
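A textbook illustration (not from the notes above): take $f(x, \theta) = \theta x$ on $C(\theta) = [0,1]$. The value function $V(\theta) = \max\{\theta, 0\}$ is continuous, while the argmax correspondence jumps from $\{0\}$ to $[0,1]$ to $\{1\}$ at $\theta = 0$, so it is upper but not lower hemicontinuous.

```python
# Berge's theorem illustrated with f(x, theta) = theta * x on C(theta) = [0, 1]:
# V(theta) is continuous, while the argmax set jumps at theta = 0 (uhc but not lhc).
import numpy as np

x_grid = np.linspace(0.0, 1.0, 1001)

def value_and_argmax(theta):
    payoffs = theta * x_grid
    best = payoffs.max()
    return best, x_grid[np.isclose(payoffs, best, rtol=0.0, atol=1e-12)]

for theta in [-0.5, -0.01, 0.0, 0.01, 0.5]:
    V, X_star = value_and_argmax(theta)
    print(theta, V, X_star.min(), X_star.max())   # V is continuous; X* jumps only at theta = 0
```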

In optimization problems, continuity of the inputs (the objective function and the constraint set) delivers continuity of the outputs (the optimal value, and upper hemicontinuity of the optimal solution set). The theorem is a cornerstone of comparative statics and of equilibrium existence arguments in many economic models.

Reference: http://irelandp.com/econ7720/notes/bergeslides.pdf

Examples

The indirect utility function in consumer theory (see the sketch below).
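A sketch under standard Cobb-Douglas assumptions (the functional form, parameter values, and the closed-form indirect utility are assumptions, not taken from the notes): continuity of the indirect utility function in prices and income is exactly what Berge's theorem delivers, since the budget set is a compact-valued, continuous correspondence of $(p_1, p_2, m)$.

```python
# Indirect utility for Cobb-Douglas preferences u(x1, x2) = a*ln(x1) + (1-a)*ln(x2),
# computed by brute-force search over the budget line and compared with the closed form
# v(p1, p2, m) = a*ln(a*m/p1) + (1-a)*ln((1-a)*m/p2).
import numpy as np

def indirect_utility(p1, p2, m, a=0.4, n=20_000):
    x1 = np.linspace(1e-6, m / p1, n)             # candidate amounts of good 1
    x2 = np.maximum((m - p1 * x1) / p2, 1e-12)    # spend the rest of the budget on good 2
    return np.max(a * np.log(x1) + (1 - a) * np.log(x2))

a, p1, p2, m = 0.4, 2.0, 3.0, 10.0
closed_form = a * np.log(a * m / p1) + (1 - a) * np.log((1 - a) * m / p2)
print(indirect_utility(p1, p2, m), closed_form)   # agree up to grid error
```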

Benveniste-Scheinkman Theorem
