Tips
This page of the documentation holds miscellaneous tips for using the package.
Deriving the conditional cumulant generating function
The cumulant-generating function is based upon the moment-generating function. If $M_X(t) = \mathbb{E}[\exp(tX)]$ is the moment-generating function of a random variable $X$, then the cumulant-generating function is just $ccgf_X(t) = \log M_X(t) = \log\mathbb{E}[\exp(tX)]$.
As an example, if $X \sim N(\mu, \sigma^2)$, then $M_X(t) = \exp(t\mu + \sigma^2 t^2 / 2)$ and $ccgf_X(t) = t\mu + \sigma^2 t^2 / 2$.
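As a quick numerical sanity check of this formula (this snippet is not part of the package and only uses Julia's standard library):
using Statistics
μ, σ, t = 0.5, 2.0, 0.3
x = μ .+ σ .* randn(10^7)          # draws of X ~ N(μ, σ²)
log(mean(exp.(t .* x)))            # Monte Carlo estimate of ccgf_X(t)
t * μ + σ^2 * t^2 / 2              # closed form t μ + σ² t² / 2, which the estimate should approximate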
Risk-adjusted linearizations imply that the relative entropy measure $\mathcal{V}(\Gamma_5 z_{t + 1} + \Gamma_6 y_{t + 1})$ becomes a vector of conditional cumulant-generating functions for the random variables $A_i(z_t) \varepsilon_{t + 1}$, where $A_i(z_t)$ is the $i$th row vector of $A(z_t) = (\Gamma_5 + \Gamma_6 \Psi)(I - \Lambda(z_t)\Psi)^{-1}\Sigma(z_t)$.
To create a RiskAdjustedLinearization, the user needs to define a function ccgf in the form ccgf(F, A, z) or ccgf(A, z), where A refers to the matrix $A(z_t)$ once it has already been evaluated at $z_t$. In other words, the input A should be seen as an $n_y \times n_\varepsilon$ matrix of real scalars. However, depending on the distributions of the martingale difference sequence $\varepsilon_{t + 1}$, writing the conditional cumulant-generating function may also require knowing the current state $z_t$.
Let us consider two didactic examples. First, assume $\varepsilon_{t + 1}\sim \mathcal{N}(0, I)$. Then we claim
ccgf(A, z) = sum(A.^2, dims = 2) / 2
Based on the definition of $\mathcal{V}(z_t)$, one may be tempted to derive the conditional cumulant-generating function for the random vector $A(z_t) \varepsilon_{t + 1}$. However, this is not actually what we want. Rather, ccgf should just return a vector of conditional cumulant-generating functions for the $n_y$ random variables $X_i = A_i(z_t)\varepsilon_{t + 1}$.
Because the individual components of $\varepsilon_{t + 1}$ are independent and each $\varepsilon_{i, t}$ has a standard Normal distribution, the moment-generating function for $X_i$ is $\exp\left(\frac{1}{2}\sum_{j = 1}^{n_\varepsilon} (t A_{ij})^2\right)$, hence the $i$th cumulant-generating function is $\frac{1}{2}\sum_{j = 1}^{n_\varepsilon} (t A_{ij})^2$. For risk-adjusted linearizations, we evaluate at $t = 1$ since we want the conditional cumulant-generating function $\log\mathbb{E}_t[\exp(A_i(z_t)\varepsilon_{t + 1})]$. This is precisely what the code above achieves.
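Spelling out the evaluation at $t = 1$ for the $i$th entry:
$$\log\mathbb{E}_t\left[\exp(A_i(z_t)\varepsilon_{t + 1})\right] = \sum_{j = 1}^{n_\varepsilon}\log\mathbb{E}_t\left[\exp(A_{ij}(z_t)\varepsilon_{j, t + 1})\right] = \sum_{j = 1}^{n_\varepsilon}\frac{A_{ij}(z_t)^2}{2},$$
which is exactly what sum(A.^2, dims = 2) / 2 computes row by row.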
Second, let us consider a more complicated example. In the Wachter (2013) Example, the ccgf is
function ccgf(F, α, z) # α is used here instead of A
    # the S term in z[S[:p]] is just an ordered dictionary mapping the symbol :p to the desired index of z
    F .= .5 .* α[:, 1].^2 + .5 * α[:, 2].^2 + (exp.(α[:, 3] + α[:, 3].^2 .* δ^2 ./ 2.) .- 1. - α[:, 3]) * z[S[:p]]
end
Observe that the first two quantities .5 .* α[:, 1].^2 + .5 * α[:, 2].^2 resemble what would be obtained from a standard multivariate normal distribution. The remaining terms are more complicated because the Wachter (2013) model involves a Poisson mixture of normal distributions. It will be instructive to spell the details out.
Consumption growth follows an exogenous process driven by a Gaussian shock $\varepsilon_t^c$ and a jump shock $\xi_t$, where $\varepsilon_t^c \sim N(0, 1)$ is iid over time and $\xi_t \mid j_t \sim N(j_t, j_t\delta^2)$, with the number of jumps $j_t \sim Poisson(p_{t - 1})$, hence $\mathbb{E}_t \xi_{t + 1} = \mathbb{E}_t j_{t + 1} = p_t$. Assume that $\varepsilon_t^c$ and $\varepsilon_t^\xi = \xi_t - \mathbb{E}_{t - 1}\xi_t$ are independent. Finally, the intensity $p_t$ follows its own exogenous process, driven by a shock $\varepsilon_t^p \sim N(0, 1)$ that is iid over time and independent of $\varepsilon_t^c$ and $\varepsilon_t^\xi$.
Note that $\xi_t$ and $\mathbb{E}_{t - 1}\xi_t$ are not independent because $\mathbb{E}_{t - 1}\xi_t = p_{t - 1}$ and $j_t \sim Poisson(p_{t - 1})$, hence a higher $p_{t - 1}$ implies $\xi_t$ is more likely to be higher. Re-centering $\xi_t$ by $\mathbb{E}_{t - 1}\xi_t$ creates a martingale difference sequence since $\xi_t \mid j_t$ is normal.
By independence of the components of $\varepsilon_t = [\varepsilon_t^c, \varepsilon_t^p, \varepsilon_t^\xi]^T$, the conditional cumulant-generating function for the $i$th row of the $A(z_t)$ matrix described in this section is
$$\log\mathbb{E}_t[\exp(A_i(z_t)\varepsilon_{t + 1})] = \log\mathbb{E}_t[\exp(A_{i1}(z_t)\varepsilon_{t + 1}^c)] + \log\mathbb{E}_t[\exp(A_{i2}(z_t)\varepsilon_{t + 1}^p)] + \log\mathbb{E}_t[\exp(A_{i3}(z_t)\varepsilon_{t + 1}^\xi)].$$
The first two terms on the RHS are for normal random variables and simplify to $(A_{i1}(z_t)^2 + A_{i2}(z_t)^2) / 2$. To calculate the remaining term, note that $\mathbb{E}_{t}\xi_{t + 1} = p_t$ is already part of the information set at $z_t$, hence
$$\log\mathbb{E}_t[\exp(A_{i3}(z_t)\varepsilon_{t + 1}^\xi)] = \log\mathbb{E}_t[\exp(A_{i3}(z_t)(\xi_{t + 1} - \mathbb{E}_t\xi_{t + 1}))] = \log\mathbb{E}_t[\exp(A_{i3}(z_t)\xi_{t + 1})] - A_{i3}(z_t) p_t.$$
To calculate the cumulant-generating function of $\xi_t$, aside from direct calculation, we can also use the results for mixture distributions in Villa and Escobar (2006) or Bagui et al. (2020). Given random variables $X$ and $Y$, assume the conditional distribution of $X \mid Y$ and the marginal distribution of $Y$ are available. If we can write the moment-generating function for the random variable $X \mid Y$ as
$$M_{X \mid Y}(s) = \exp\left(C_1(s) + C_2(s) Y\right),$$
then the moment-generating function of $X$ is
$$M_X(s) = \mathbb{E}_Y\left[M_{X \mid Y}(s)\right] = \exp(C_1(s)) M_Y(C_2(s)),$$
where $M_Y$ is the moment-generating function of $Y$.
In our case, we have
$$M_{\xi_{t + 1} \mid j_{t + 1}}(s) = \exp\left(s j_{t + 1} + \frac{s^2\delta^2}{2} j_{t + 1}\right) = \exp\left(\left(s + \frac{s^2\delta^2}{2}\right) j_{t + 1}\right),$$
hence $C_1(s) = 0$ and $C_2(s) = s + s^2\delta^2 / 2$. The variable $j_{t + 1}$ has a Poisson distribution with intensity $p_t$, which implies the moment-generating function
$$M_{j_{t + 1}}(s) = \exp\left(p_t(\exp(s) - 1)\right).$$
Thus, as desired,
$$\log\mathbb{E}_t[\exp(s\varepsilon_{t + 1}^\xi)] = \log\left[\exp(C_1(s)) M_{j_{t + 1}}(C_2(s))\right] - s p_t = \left(\exp\left(s + \frac{s^2\delta^2}{2}\right) - 1 - s\right) p_t.$$
Evaluating this expression at $s = A_{i3}(z_t)$ and computing the resulting quantity for each expectational equation yields the ccgf used in the Wachter (2013) Example.
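As a sanity check on the jump term, the closed form above can be verified by simulation (this snippet assumes Distributions.jl is available; the parameter values are arbitrary):
using Distributions, Statistics
p, δ, s = 0.05, 0.2, 0.8                              # arbitrary illustrative values
draws = map(1:10^6) do _
    j = rand(Poisson(p))                              # j_{t+1} ~ Poisson(p_t)
    ξ = j > 0 ? rand(Normal(j, sqrt(j) * δ)) : 0.0    # ξ_{t+1} | j_{t+1} ~ N(j, j δ²)
    exp(s * (ξ - p))                                  # exponential of the re-centered jump shock
end
log(mean(draws))                                      # Monte Carlo estimate of the ccgf
(exp(s + s^2 * δ^2 / 2) - 1 - s) * p                  # closed form derived above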
Writing functions compatible with automatic differentiation
Use an in-place function to avoid type errors.
For example, define the ccgf as ccgf(F, x). You can use the element type of F via eltype(F) to ensure that you don't get a type error from using Float64 instead of Dual inside the function. If ccgf were out-of-place, then depending on how the vector being returned is coded, you may get a type error if elements of the return vector are zero or constant numbers. By having F available, you can guarantee these numbers can be converted to Dual types if needed without always declaring them as Dual types.
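For instance, a minimal sketch of an in-place ccgf for the Gaussian case above (not the package's own code) that uses eltype(F) for an intermediate buffer:
function ccgf(F, A, z)
    # Allocate the buffer with F's element type so it can hold Dual numbers
    # whenever ccgf is called during automatic differentiation
    buffer = zeros(eltype(F), size(A, 1))
    for j in 1:size(A, 2)
        buffer .+= A[:, j].^2 ./ 2    # Gaussian contribution of shock j
    end
    F .= buffer
    return F
end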
Use dualvector or dualarray.
The package provides these two helper functions for the case where you have a function f(x, y), and you need to be able to automatically differentiate with respect to x and y separately. For example, the nonlinear terms of the expectational equation ξ(z, y) take this form. Within ξ, you can pre-allocate the return vector by calling F = RiskAdjustedLinearizations.dualvector(z, y). The dualvector function will infer from z and y whether F should have Dual element types or not, so you can avoid repeatedly writing if-else conditional blocks. The dualarray function generalizes this to arbitrary AbstractMatrix inputs. See the out-of-place function for ξ in examples/wachter_disaster_risk/wachter.jl.
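A minimal sketch of this pattern (the body of ξ is a placeholder, not the Wachter model's equations):
import RiskAdjustedLinearizations
function ξ(z, y)
    # dualvector infers from z and y whether F needs a Dual element type,
    # so the same code works when differentiating with respect to z or y
    F = RiskAdjustedLinearizations.dualvector(z, y)
    # ... fill F with the model's nonlinear expectational terms ...
    return F
end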
Don't pre-allocate the return vector.
Instead of pre-allocating the return vector at the top of an out-of-place function, just concatenate the individual elements at the very end. Julia will figure out the appropriate element type for you. The downside of this approach is that you won't be able to assign names to the specific indices of the return vector (e.g. does this equation define the risk-free interest rate?). For small models, this disadvantage is generally not a problem. See the definition of the out-of-place expected state transition function μ in examples/wachter_disaster_risk/wachter.jl.
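As an illustration of this approach, here is a toy out-of-place function (it is not the μ defined in wachter.jl; the equations are placeholders):
# Compute each entry as a scalar, then concatenate at the end,
# letting Julia infer the element type (Float64 or Dual) from the inputs.
function toy_μ(z, y; ρ = 0.9)
    F1 = ρ * z[1] + y[1]      # placeholder law of motion for the first state
    F2 = ρ * z[2]             # placeholder law of motion for the second state
    return vcat(F1, F2)
end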
Exponentiate all terms to write conditions in levels.
Automatic differentiation will be faster if the equilibrium conditions are written directly in terms of log variables rather than in levels.
However, it may be easier and/or less error prone to write the equilibrium conditions in levels. This approach can be easily accomplished by (1) exponentiating all input arguments at the beginning of the function to convert inputs from logs to levels; (2) writing all equilibrium conditions in the form $1 = F(x)$; and (3) returning the output as $\log(F(x))$.
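For example, a hypothetical condition written in levels but returned in logs might look like:
# Hypothetical single equilibrium condition; β and α are placeholder parameters.
function toy_condition(log_c, log_y; β = 0.99, α = 0.3)
    c = exp(log_c)               # (1) exponentiate inputs to convert logs to levels
    y = exp(log_y)
    F = β * y^α / c              # (2) condition written so that F = 1 in equilibrium
    return log(F)                # (3) return log(F), so the condition reads 0 = log(F)
end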