Updates
Updates are structs that hold instructions about the MCMC updates.
Metropolis–Hastings with random walk proposals
The simplest type of update is a Metropolis–Hastings update with random walk proposals.
ExtensibleMCMC.RandomWalkUpdate — Type

```julia
struct RandomWalkUpdate{TRW,TA,K,TP} <: MCMCParamUpdate
    rw::TRW
    adpt::TA
    coords::K
    prior::TP
end
```

Definition of an MCMC update that uses the Metropolis–Hastings algorithm with random walk proposals. `rw` is the random walk sampler, `adpt` is the struct responsible for adaptation of hyperparameters, `coords` lists the coordinate indices of the global `state` that the given update operates on, and `prior` is a prior for this update step (i.e. for the given subset of the parameter vector).

```julia
function RandomWalkUpdate(
    rw::TRW,
    idx_of_global::K;
    prior::TP=ImproperPrior(),
    adpt::TA=NoAdaptation()
) where {TRW<:RandomWalk,K,TA,TP}
```

Base constructor.
A random walk can make steps according to any viable distribution; below we present the two cases that have been implemented.
Very often we need to update only a subset of the parameters, as in the Metropolis-within-Gibbs algorithm. This is easily done by specifying the indices of the relevant coordinates in the `idx_of_global` argument of the constructor.
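For instance, a Metropolis-within-Gibbs update acting only on the first two coordinates of the global parameter vector might be set up as follows. This is a sketch: it assumes that `UniformRandomWalk` can be constructed directly from its two fields (the half-range `ϵ` and the positivity flags `pos`), as documented below.

```julia
using ExtensibleMCMC

# A random-walk update acting only on coordinates 1 and 2 of the global state.
updt = RandomWalkUpdate(
    UniformRandomWalk([0.5, 0.5], [false, false]), # sampler: half-ranges ϵ, no positivity restrictions
    [1, 2];                                        # idx_of_global: coordinates this update touches
    prior=ImproperPrior(),                         # default: flat prior
    adpt=NoAdaptation(),                           # default: no adaptation
)
```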
Uniform Random Walker
The simplest type of walker is a Uniform random walker.
ExtensibleMCMC.UniformRandomWalk — Type

```julia
mutable struct UniformRandomWalk{T,S} <: RandomWalk
    ϵ::T
    pos::S
end
```

A uniform random walker that moves all unrestricted coordinates according to the additive rule

```math
X_i^\circ = X_i + U_i, \qquad U_i \sim \mathrm{Unif}(-ϵ_i, ϵ_i),
```

and moves all coordinates restricted to be positive according to the rule

```math
X_i^\circ = X_i \exp(U_i), \qquad U_i \sim \mathrm{Unif}(-ϵ_i, ϵ_i).
```

In practice, some coordinates of the parameter vector may be restricted to take only positive values. We may impose this restriction by performing a "multiplicative exponentiated uniform random walk" for such coordinates. Such a random walk is no longer uniform on the original space, but is uniform in the steps of $\log(X_i)$.
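To make the two rules concrete, here is a small standalone sketch (not part of the package API) of a single step of such a walker on a two-coordinate parameter whose second coordinate is restricted to be positive:

```julia
ϵ = 0.5
θ = [1.0, 2.0]
pos = [false, true]            # second coordinate must stay positive
U = ϵ .* (2 .* rand(2) .- 1)   # Unif(-ϵ, ϵ) increments

# additive step for unrestricted coordinates, multiplicative (i.e. additive
# on the log-scale) step for the positive ones
θnew = [p ? x * exp(u) : x + u for (x, u, p) in zip(θ, U, pos)]
@assert θnew[2] > 0            # positivity is preserved by the multiplicative step
```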
Adaptation
The uniform random walker depends on the hyper-parameter $ϵ$, which has a large effect on the learning speed of the MCMC sampler. This hyper-parameter can be learned adaptively in a standard way, by targeting the acceptance rate of the sampler. To turn the adaptation on we have to pass an appropriately initialized `AdaptationUnifRW` as the keyword argument `adpt`:
ExtensibleMCMC.AdaptationUnifRW — Type

```julia
mutable struct AdaptationUnifRW{T} <: Adaptation
    proposed::Int64
    accepted::Int64
    target_accpt_rate::Float64
    adapt_every_k_steps::Int64
    scale::T
    min::T
    max::T
    offset::T
    N::Int64
end
```

A struct containing information about the way in which to adapt the steps of the uniform random walker. `proposed` and `accepted` are the internal counters that keep track of the number of proposed and accepted samples. `target_accpt_rate` is the acceptance rate of the Metropolis–Hastings steps that is supposed to be targeted. `min` is used to enforce the minimal allowable range that the random walker can sample from, and `max` enforces the maximum. `offset` introduces some delay before the extent of the adaptation starts to decelerate, and `scale` is a scaling parameter for the adaptation speed. `N` is the number of coordinates of the random walker.
ExtensibleMCMC.AdaptationUnifRW — Method

```julia
AdaptationUnifRW(θ::K; kwargs...) where K
```

Define an adaptation scheme for a random walker that does updates on a parameter of the shape and type of `θ`. The following named parameters can be specified.

Arguments

- `adapt_every_k_steps=100`: number of steps based on which adaptation happens
- `target_accpt_rate=0.234`: acceptance rate of the MH step that is to be targeted
- `scale=1.0`: scaling of the adaptation
- `min=1e-12`: minimum allowable (half-)range of the uniform sampler
- `max=1e7`: maximum allowable (half-)range of the uniform sampler
- `offset=1e2`: number of adaptation steps after which the shrinking of the adaptive steps is supposed to start
To see a simple example illustrating the dramatic difference that adaptation can make, see the Tutorial on estimating the mean of a bivariate Gaussian random variable.
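For example, adaptation for a two-dimensional uniform walker can be requested as follows. This is a sketch, assuming the constructors behave as documented above:

```julia
using ExtensibleMCMC

θ_init = [0.0, 0.0]                 # parameter of the shape/type to be updated
adpt = AdaptationUnifRW(
    θ_init;
    adapt_every_k_steps=50,         # adapt twice as often as the default
    target_accpt_rate=0.234,
)
updt = RandomWalkUpdate(
    UniformRandomWalk([0.5, 0.5], [false, false]),
    [1, 2];
    adpt=adpt,                      # turn the adaptation on
)
```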
Gaussian Random Walker
Another implemented type of random walker is the Gaussian random walker.
ExtensibleMCMC.GaussianRandomWalk — Type

```julia
mutable struct GaussianRandomWalk{T} <: RandomWalk
    Σ::Symmetric{T,Array{T,2}}
    pos::Vector{Bool}
end
```

A Gaussian random walker that first transforms all coordinates restricted to be positive to a log-scale via

```math
Y_i = \begin{cases} \log X_i & \text{if } \texttt{pos}_i, \\ X_i & \text{otherwise}, \end{cases}
```

then moves according to

```math
Y^\circ \sim N(Y, Σ),
```

and finally, transforms the restricted coordinates back to the original scale via

```math
X_i^\circ = \begin{cases} \exp(Y_i^\circ) & \text{if } \texttt{pos}_i, \\ Y_i^\circ & \text{otherwise}. \end{cases}
```

Restrictions are specified in the vector `pos` (no restrictions correspond to all entries in `pos` being `false`).
It can be used for single-site updates as well as joint updates.
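A sketch of a joint 2-d update in which the second coordinate is restricted to be positive (assuming `GaussianRandomWalk` can be constructed directly from its two fields, the covariance `Σ` and the flags `pos`):

```julia
using ExtensibleMCMC, LinearAlgebra

Σ = Symmetric([0.1  0.02;
               0.02 0.2 ])              # proposal covariance (on the log-scale for coord. 2)
rw = GaussianRandomWalk(Σ, [false, true])  # second coordinate is positive
updt = RandomWalkUpdate(rw, [1, 2])        # joint update of coordinates 1 and 2
```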
Adaptation
The adaptation of `GaussianRandomWalk` may no longer be done as in the case of `UniformRandomWalk` (in principle it could be in 1-d, but this is not implemented), as the entire covariance matrix now needs to be learned. It is instead implemented with Haario-type updates.
ExtensibleMCMC.HaarioTypeAdaptation — Type

```julia
mutable struct HaarioTypeAdaptation{T,TF} <: Adaptation
    mean::Vector{T}
    cov::Matrix{T}
    adapt_every_k_steps::Int64
    scale::Float64
    N::Int64
    M::Int64
    fλ::TF
end
```

A struct containing information about the way in which to adapt the steps of the Gaussian random walker in a Haario-type scheme: the empirical mean and covariance of the chain are updated recursively, and the covariance of the Gaussian proposal is set to a scaled version of the empirical covariance, with $Σ_A$ the covariance matrix that the adaptive scheme aims to learn. `mean` and `cov` are the empirical mean and covariance of the (appropriately log-transformed) local `state` vector. Every `adapt_every_k_steps` number of steps an adaptation is performed. `scale` is a scaling for going from the empirical covariance to the covariance of the Gaussian random walker. `N` is the total number of terms based on which the `mean` and `cov` were computed, `M` is the number of updates since the last call to adaptation, and `fλ` is a function for determining the `λ` weight.
```julia
HaarioTypeAdaptation(
    state::Vector{T};
    adapt_every_k_steps=100,
    scale = 2.38^2,
    f::TF=((x,y,z)->x),
) where {T,TF}
```

Base constructor.
In addition, in order to run `HaarioTypeAdaptation`, the update's transition kernel must be set to `GaussianRandomWalkMix` instead of just `GaussianRandomWalk`, for reasons that follow from the description of the former:
ExtensibleMCMC.GaussianRandomWalkMix — Type

```julia
mutable struct GaussianRandomWalkMix{T} <: RandomWalk
    gsn_A::GaussianRandomWalk{T}
    gsn_B::GaussianRandomWalk{T}
    λ::Float64
end
```

A mixture of two Gaussian random walkers. It performs updates according to

```math
X^\circ = X + Z, \qquad Z \sim (1-λ)\,N(0, Σ_A) + λ\,N(0, Σ_B),
```

where $Σ_A$ and $Σ_B$ are the covariance matrices of `gsn_A` and `gsn_B` respectively (with $X_i$ being already appropriately log-transformed if needed).
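Putting the pieces together, a Haario-type adaptive update might be wired up as follows. This is a sketch: it assumes `GaussianRandomWalk` and `GaussianRandomWalkMix` can be constructed directly from their fields as listed above, and the choice of covariances and `λ` is purely illustrative.

```julia
using ExtensibleMCMC, LinearAlgebra

pos = [false, false]
gsn_A = GaussianRandomWalk(Symmetric(Matrix(0.1I, 2, 2)), pos)   # component to be adapted
gsn_B = GaussianRandomWalk(Symmetric(Matrix(1e-4I, 2, 2)), pos)  # small fixed fallback component
rw = GaussianRandomWalkMix(gsn_A, gsn_B, 0.05)                   # mixture weight λ = 0.05

adpt = HaarioTypeAdaptation(zeros(2); adapt_every_k_steps=100, scale=2.38^2)
updt = RandomWalkUpdate(rw, [1, 2]; adpt=adpt)
```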
(TODO) Metropolis-adjusted Langevin algorithm (MALA)
(TODO) Hamiltonian Monte Carlo
Custom MCMC updates and decorators
It is easy to define your own update functions. If your update updates parameters (if it does not, then it falls under the category of more advanced updates and the rules may differ), you must implement the following for it:
ExtensibleMCMC.log_transition_density — Method

```julia
log_transition_density(updt::MCMCParamUpdate, θ, θ°)
```

Evaluate the log-transition density for an update θ → θ°.

ExtensibleMCMC.proposal! — Method

```julia
proposal!(updt::MCMCParamUpdate, global_ws, ws::LocalWorkspace, step)
```

Sample a proposal value for the MCMC's local `state` variable.

ExtensibleMCMC.set_parameters! — Method

```julia
set_parameters!(
    ::Proposal,
    updt::MCMCParamUpdate,
    global_ws::GlobalWorkspace,
    ws::LocalWorkspace,
)
```

Set the parameters of the proposal Law to the current value held by the proposal `state`.

ExtensibleMCMC.set_parameters! — Method

```julia
set_parameters!(
    ::Previous,
    updt::MCMCParamUpdate,
    global_ws::GlobalWorkspace,
    ws::LocalWorkspace,
)
```

Set the parameters of the currently accepted Law to the current value held by the accepted `state`.
Optional

Additionally, the following methods are implemented generically, but may be overwritten:

ExtensibleMCMC.log_prior — Method

```julia
log_prior(updt::MCMCParamUpdate, θ)
```

Evaluate the log-prior at θ.

ExtensibleMCMC.coords — Method

```julia
coords(updt::MCMCParamUpdate)
```

Return a list of the coordinates that `updt` is concerned with.

ExtensibleMCMC.subidx — Method

```julia
subidx(θ, updt::MCMCParamUpdate)
```

Return a view of the subset of θ that `updt` is concerned with.

ExtensibleMCMC.compute_gradients_and_momenta! — Method

```julia
compute_gradients_and_momenta!(
    updt::MCMCParamUpdate, ws::LocalWorkspace, ::Any
)
```

To be overwritten for methods that are gradient- or momentum-based.
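As an illustration, a minimal custom update might look as follows. This is a hypothetical sketch: the workspace fields used inside `proposal!` (`ws.state`, `ws.state°`) are assumptions about the internals and not part of what is documented on this page.

```julia
using ExtensibleMCMC
import ExtensibleMCMC: log_transition_density, proposal!

# A hypothetical update that proposes from a fixed-variance Gaussian
# on a single coordinate of the global state.
struct MySingleSiteUpdate <: MCMCParamUpdate
    σ::Float64   # proposal standard deviation
    coord::Int   # coordinate of the global state to update
end

# The proposal is symmetric, so the transition density cancels in the
# Metropolis–Hastings ratio and returning a constant is enough.
log_transition_density(updt::MySingleSiteUpdate, θ, θ°) = 0.0

function proposal!(updt::MySingleSiteUpdate, global_ws, ws::LocalWorkspace, step)
    # hypothetical workspace fields: ws.state (accepted), ws.state° (proposal)
    ws.state°[updt.coord] = ws.state[updt.coord] + updt.σ * randn()
end
```

The two `set_parameters!` methods would be implemented analogously, copying the proposal and accepted `state` into the corresponding Law.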