Updates
Updates are structs that hold instructions about the MCMC updates.
Metropolis–Hastings with random walk proposals
The simplest type of update is a Metropolis–Hastings update with random walk proposals.
ExtensibleMCMC.RandomWalkUpdate — Type

```julia
struct RandomWalkUpdate{TRW,TA,K,TP} <: MCMCParamUpdate
    rw::TRW
    adpt::TA
    coords::K
    prior::TP
end
```

Definition of an MCMC update that uses the Metropolis–Hastings algorithm with random walk proposals. `rw` is the random walk sampler, `adpt` is the struct responsible for adaptation of hyperparameters, `coords` lists the coordinate indices of the global `state` that the given update operates on, and `prior` is a prior for this update step (i.e. for the given subset of the parameter vector).

```julia
function RandomWalkUpdate(
    rw::TRW,
    idx_of_global::K;
    prior::TP=ImproperPrior(),
    adpt::TA=NoAdaptation()
) where {TRW<:RandomWalk,K,TA,TP}
```

Base constructor.
A random walk can make steps according to any viable distribution; below we present the two cases that have been implemented.
Very often we need to update only a subset of the parameters, as in the Metropolis-within-Gibbs algorithm. This is easily done by specifying the indices of the relevant coordinates in the `idx_of_global` argument of the constructor.
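For instance, a Metropolis-within-Gibbs update acting only on the first two coordinates of the global parameter vector might be set up as follows. This is a sketch: it assumes that `UniformRandomWalk` can be constructed directly from its two fields (the half-range `ϵ` and the positivity flags `pos`), as documented below.

```julia
using ExtensibleMCMC

# A random-walk update acting only on coordinates 1 and 2 of the global state.
updt = RandomWalkUpdate(
    UniformRandomWalk([0.5, 0.5], [false, false]), # sampler: half-ranges ϵ, no positivity restrictions
    [1, 2];                                        # idx_of_global: coordinates this update touches
    prior=ImproperPrior(),                         # default: flat prior
    adpt=NoAdaptation(),                           # default: no adaptation
)
```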
Uniform Random Walker
The simplest type of walker is a Uniform random walker.
ExtensibleMCMC.UniformRandomWalk — Type

```julia
mutable struct UniformRandomWalk{T,S} <: RandomWalk
    ϵ::T
    pos::S
end
```

A uniform random walker that moves all unrestricted coordinates according to the additive rule

```math
X_i^\circ = X_i + U_i, \qquad U_i \sim \mathrm{Unif}(-ϵ_i, ϵ_i),
```

and moves all coordinates restricted to be positive according to the rule

```math
X_i^\circ = X_i \exp(U_i), \qquad U_i \sim \mathrm{Unif}(-ϵ_i, ϵ_i).
```

In practice, some coordinates of the parameter vector may be restricted to take only positive values. We may impose this restriction by performing a "multiplicative exponentiated uniform random walk" for such coordinates. Such a random walk is no longer uniform on the original space, but is uniform in the steps of $\log(X_i)$.
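To make the two rules concrete, here is a small standalone sketch (not part of the package API) of a single step of such a walker on a two-coordinate parameter whose second coordinate is restricted to be positive:

```julia
ϵ = 0.5
θ = [1.0, 2.0]
pos = [false, true]            # second coordinate must stay positive
U = ϵ .* (2 .* rand(2) .- 1)   # Unif(-ϵ, ϵ) increments

# additive step for unrestricted coordinates, multiplicative (i.e. additive
# on the log-scale) step for the positive ones
θnew = [p ? x * exp(u) : x + u for (x, u, p) in zip(θ, U, pos)]
@assert θnew[2] > 0            # positivity is preserved by the multiplicative step
```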
Adaptation
The uniform random walker depends on the hyper-parameter $ϵ$, which has a large effect on the learning speed of the MCMC sampler. This hyper-parameter can be learned adaptively in a standard way, by targeting the acceptance rate of the sampler. To turn the adaptation on we have to pass an appropriately initialized `AdaptationUnifRW` as the keyword argument `adpt`:
ExtensibleMCMC.AdaptationUnifRW — Type

```julia
mutable struct AdaptationUnifRW{T} <: Adaptation
    proposed::Int64
    accepted::Int64
    target_accpt_rate::Float64
    adapt_every_k_steps::Int64
    scale::T
    min::T
    max::T
    offset::T
    N::Int64
end
```

A struct containing information about the way in which to adapt the steps of the uniform random walker. `proposed` and `accepted` are the internal counters that keep track of the number of proposed and accepted samples. `target_accpt_rate` is the acceptance rate of the Metropolis–Hastings steps that is supposed to be targeted. `min` is used to enforce the minimal allowable range that the random walker can sample from, and `max` enforces the maximum. `offset` introduces some delay before the extent of the adaptation starts to decelerate, and `scale` is a scaling parameter for the adaptation speed. `N` is the number of coordinates of the random walker.
ExtensibleMCMC.AdaptationUnifRW — Method

```julia
AdaptationUnifRW(θ::K; kwargs...) where K
```

Define an adaptation scheme for a random walker that does updates on a parameter of the shape and type of `θ`. The following named parameters can be specified.

Arguments

- `adapt_every_k_steps=100`: number of steps based on which adaptation happens
- `target_accpt_rate=0.234`: acceptance rate of the MH step that is to be targeted
- `scale=1.0`: scaling of the adaptation
- `min=1e-12`: minimum allowable (half-)range of the uniform sampler
- `max=1e7`: maximum allowable (half-)range of the uniform sampler
- `offset=1e2`: number of adaptation steps after which the shrinking of the adaptive steps is supposed to start
To see a simple example illustrating the dramatic difference that adaptation can make, see the Tutorial on estimating the mean of a bivariate Gaussian random variable.
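For example, adaptation for a two-dimensional uniform walker can be requested as follows. This is a sketch, assuming the constructors behave as documented above:

```julia
using ExtensibleMCMC

θ_init = [0.0, 0.0]                 # parameter of the shape/type to be updated
adpt = AdaptationUnifRW(
    θ_init;
    adapt_every_k_steps=50,         # adapt twice as often as the default
    target_accpt_rate=0.234,
)
updt = RandomWalkUpdate(
    UniformRandomWalk([0.5, 0.5], [false, false]),
    [1, 2];
    adpt=adpt,                      # turn the adaptation on
)
```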
Gaussian Random Walker
Another implemented type of random walker is the Gaussian random walker.
ExtensibleMCMC.GaussianRandomWalk — Type

```julia
mutable struct GaussianRandomWalk{T} <: RandomWalk
    Σ::Symmetric{T,Array{T,2}}
    pos::Vector{Bool}
end
```

A Gaussian random walker that first transforms all coordinates restricted to be positive to a log-scale via

```math
Y_i = \begin{cases} \log X_i & \text{if } \texttt{pos}_i, \\ X_i & \text{otherwise}, \end{cases}
```

then moves according to

```math
Y^\circ \sim N(Y, Σ),
```

and finally, transforms the restricted coordinates back to the original scale via

```math
X_i^\circ = \begin{cases} \exp(Y_i^\circ) & \text{if } \texttt{pos}_i, \\ Y_i^\circ & \text{otherwise}. \end{cases}
```

Restrictions are specified in the vector `pos` (no restrictions correspond to all entries in `pos` being `false`).
It can be used for single-site updates as well as joint updates.
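A sketch of a joint 2-d update in which the second coordinate is restricted to be positive (assuming `GaussianRandomWalk` can be constructed directly from its two fields, the covariance `Σ` and the flags `pos`):

```julia
using ExtensibleMCMC, LinearAlgebra

Σ = Symmetric([0.1  0.02;
               0.02 0.2 ])              # proposal covariance (on the log-scale for coord. 2)
rw = GaussianRandomWalk(Σ, [false, true])  # second coordinate is positive
updt = RandomWalkUpdate(rw, [1, 2])        # joint update of coordinates 1 and 2
```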
Adaptation
The adaptation of `GaussianRandomWalk` may no longer be done as in the case of `UniformRandomWalk` (in principle it could be in 1-d, but this is not implemented), as the entire covariance matrix now needs to be learned. It is instead implemented with Haario-type updates.
ExtensibleMCMC.HaarioTypeAdaptation — Type

```julia
mutable struct HaarioTypeAdaptation{T,TF} <: Adaptation
    mean::Vector{T}
    cov::Matrix{T}
    adapt_every_k_steps::Int64
    scale::Float64
    N::Int64
    M::Int64
    fλ::TF
end
```

A struct containing information about the way in which to adapt the steps of the Gaussian random walker in a Haario-type scheme: the empirical mean and covariance of the chain are updated recursively, and the covariance of the Gaussian proposal is set to a scaled version of the empirical covariance, with $Σ_A$ the covariance matrix that the adaptive scheme aims to learn. `mean` and `cov` are the empirical mean and covariance of the (appropriately log-transformed) local `state` vector. Every `adapt_every_k_steps` number of steps an adaptation is performed. `scale` is a scaling for going from the empirical covariance to the covariance of the Gaussian random walker. `N` is the total number of terms based on which the `mean` and `cov` were computed, `M` is the number of updates since the last call to adaptation, and `fλ` is a function for determining the `λ` weight.
```julia
HaarioTypeAdaptation(
    state::Vector{T};
    adapt_every_k_steps=100,
    scale = 2.38^2,
    f::TF=((x,y,z)->x),
) where {T,TF}
```

Base constructor.
In addition, in order to run `HaarioTypeAdaptation`, the update's transition kernel must be set to `GaussianRandomWalkMix` instead of just `GaussianRandomWalk`, for reasons that follow from the description of the former:
ExtensibleMCMC.GaussianRandomWalkMix — Type

```julia
mutable struct GaussianRandomWalkMix{T} <: RandomWalk
    gsn_A::GaussianRandomWalk{T}
    gsn_B::GaussianRandomWalk{T}
    λ::Float64
end
```

A mixture of two Gaussian random walkers. It performs updates according to

```math
X^\circ = X + Z, \qquad Z \sim (1-λ)\,N(0, Σ_A) + λ\,N(0, Σ_B),
```

where $Σ_A$ and $Σ_B$ are the covariance matrices of `gsn_A` and `gsn_B` respectively (with $X_i$ being already appropriately log-transformed if needed).
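Putting the pieces together, a Haario-type adaptive update might be wired up as follows. This is a sketch: it assumes `GaussianRandomWalk` and `GaussianRandomWalkMix` can be constructed directly from their fields as listed above, and the choice of covariances and `λ` is purely illustrative.

```julia
using ExtensibleMCMC, LinearAlgebra

pos = [false, false]
gsn_A = GaussianRandomWalk(Symmetric(Matrix(0.1I, 2, 2)), pos)   # component to be adapted
gsn_B = GaussianRandomWalk(Symmetric(Matrix(1e-4I, 2, 2)), pos)  # small fixed fallback component
rw = GaussianRandomWalkMix(gsn_A, gsn_B, 0.05)                   # mixture weight λ = 0.05

adpt = HaarioTypeAdaptation(zeros(2); adapt_every_k_steps=100, scale=2.38^2)
updt = RandomWalkUpdate(rw, [1, 2]; adpt=adpt)
```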
(TODO) Metropolis-adjusted Langevin algorithm (MALA)
(TODO) Hamiltonian Monte Carlo
Custom MCMC updates and decorators
It is easy to define your own update functions. If your update updates parameters (if it does not, then it falls under the category of more advanced updates and the rules may differ), you must implement the following for it:
ExtensibleMCMC.log_transition_density — Method

```julia
log_transition_density(updt::MCMCParamUpdate, θ, θ°)
```

Evaluate the log-transition density for an update θ → θ°.

ExtensibleMCMC.proposal! — Method

```julia
proposal!(updt::MCMCParamUpdate, global_ws, ws::LocalWorkspace, step)
```

Sample a proposal value for the MCMC's local `state` variable.

ExtensibleMCMC.set_parameters! — Method

```julia
set_parameters!(
    ::Proposal,
    updt::MCMCParamUpdate,
    global_ws::GlobalWorkspace,
    ws::LocalWorkspace,
)
```

Set the parameters of the proposal Law to the current value held by the proposal `state`.

ExtensibleMCMC.set_parameters! — Method

```julia
set_parameters!(
    ::Previous,
    updt::MCMCParamUpdate,
    global_ws::GlobalWorkspace,
    ws::LocalWorkspace,
)
```

Set the parameters of the currently accepted Law to the current value held by the accepted `state`.
Optional

Additionally, the following methods are implemented generically, but may be overwritten:

ExtensibleMCMC.log_prior — Method

```julia
log_prior(updt::MCMCParamUpdate, θ)
```

Evaluate the log-prior at θ.

ExtensibleMCMC.coords — Method

```julia
coords(updt::MCMCParamUpdate)
```

Return a list of the coordinates that `updt` is concerned with.

ExtensibleMCMC.subidx — Method

```julia
subidx(θ, updt::MCMCParamUpdate)
```

Return a view of the subset of θ that `updt` is concerned with.

ExtensibleMCMC.compute_gradients_and_momenta! — Method

```julia
compute_gradients_and_momenta!(
    updt::MCMCParamUpdate, ws::LocalWorkspace, ::Any
)
```

To be overwritten for methods that are gradient- or momentum-based.
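As an illustration, a minimal custom update might look as follows. This is a hypothetical sketch: the workspace fields used inside `proposal!` (`ws.state`, `ws.state°`) are assumptions about the internals and not part of what is documented on this page.

```julia
using ExtensibleMCMC
import ExtensibleMCMC: log_transition_density, proposal!

# A hypothetical update that proposes from a fixed-variance Gaussian
# on a single coordinate of the global state.
struct MySingleSiteUpdate <: MCMCParamUpdate
    σ::Float64   # proposal standard deviation
    coord::Int   # coordinate of the global state to update
end

# The proposal is symmetric, so the transition density cancels in the
# Metropolis–Hastings ratio and returning a constant is enough.
log_transition_density(updt::MySingleSiteUpdate, θ, θ°) = 0.0

function proposal!(updt::MySingleSiteUpdate, global_ws, ws::LocalWorkspace, step)
    # hypothetical workspace fields: ws.state (accepted), ws.state° (proposal)
    ws.state°[updt.coord] = ws.state[updt.coord] + updt.σ * randn()
end
```

The two `set_parameters!` methods would be implemented analogously, copying the proposal and accepted `state` into the corresponding Law.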