The idea behind Workspaces

Markov chain Monte Carlo algorithms often involve computationally expensive routines. Because these routines are typically repeated at every MCMC iteration, the algorithm can be sped up significantly by pre-allocating containers in which all, or most, of the computations are performed. In ExtensibleMCMC.jl these containers are termed Workspaces:

ExtensibleMCMC.Workspace (Type)

Supertype of all workspaces, i.e. of structs that gather in one place the various objects that the MCMC sampler operates on.


There are two types of Workspaces:

  1. GlobalWorkspaces and
  2. LocalWorkspaces
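
The pre-allocation idea described above can be illustrated with a small, self-contained sketch. The function and variable names below (`residuals_alloc`, `residuals!`, `buffer`) are hypothetical and are not part of ExtensibleMCMC.jl; the point is only that a container created once, before the loop, can be reused at every iteration.

```julia
# Hypothetical toy example; none of these names come from ExtensibleMCMC.jl.

# Allocating version: creates a fresh vector on every call.
residuals_alloc(data, θ) = data .- θ

# Pre-allocating version: writes into a buffer owned by the caller.
function residuals!(buffer, data, θ)
    buffer .= data .- θ   # fused broadcast, no temporary array
    return buffer
end

data = collect(1.0:5.0)
buffer = similar(data)          # allocated once, before the MCMC loop

for θ in (0.0, 0.5, 1.0)        # stand-in for MCMC iterations
    residuals!(buffer, data, θ) # reuses the same memory each time
end
```

A Workspace plays the role of `buffer` here, except that it bundles many such containers into one struct.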

Workspaces inheriting from GlobalWorkspace

ExtensibleMCMC.GlobalWorkspace (Type)
GlobalWorkspace{T} <: Workspace

Supertype of all global workspaces. Each MCMC sampler must have a unique global workspace, which contains state, state_history, state_proposal_history, acceptance_history and data. state is the parameter vector being sampled; the other names are self-explanatory.


GlobalWorkspace is the master Workspace that is responsible for:

  • keeping track of the MCMC chain, in particular the most recent state of the chain
  • holding the observed data (or at least a pointer to it)

and optionally:

  • holding containers for doing computations
  • keeping track of some online statistics regarding the chain
  • any other object that conceptually belongs to the scope of the chain and not the scope of the local MCMC updates

We implement a generic version of GlobalWorkspace that may be suitable for simple problems. For more advanced problems the user will need to implement custom GlobalWorkspaces.

ExtensibleMCMC.GenericGlobalWorkspace (Type)
struct GenericGlobalWorkspace{T,TD,TL} <: GlobalWorkspace{T}
    sub_ws::StandardGlobalSubworkspace{T,TD}
    P::TL
    P°::TL
end

Generic global workspace with sub_ws containing the current state and keeping track of its history and some basic statistics. P and P° are the target laws with the accepted and the proposal state, respectively, set as parameters.


where

ExtensibleMCMC.StandardGlobalSubworkspace (Type)
struct StandardGlobalSubworkspace{T,TD} <: GlobalWorkspace{T}
    state::Vector{T}
    state_history::Vector{Vector{Vector{T}}}
    state_proposal_history::Vector{Vector{Vector{T}}}
    data::TD
    stats::GenericChainStats{T}
end

Standard containers expected to be present in every global workspace. state is the currently accepted parameter θ, state_history is a chain of states that have been accepted and state_proposal_history is a chain of states that have been proposed. data are the data passed to an MCMC sampler (usually just a pointer) and stats gathers some basic online information about the chain.

Tip

To see an example of an implementation of a custom Workspace see DiffusionMCMC.jl, where we implemented DiffusionGlobalWorkspace (and DiffusionLocalWorkspace).

Note

StandardGlobalSubworkspace is simply a collection of the most common fields present in a GlobalWorkspace. As such, it doesn't need to be present in custom implementations of GlobalWorkspace; however, including it will often be convenient.
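
A minimal sketch of this pattern, a custom global workspace that embeds a sub-workspace holding the common fields, might look as follows. All types here (`ToyGlobalWorkspace`, `ToySubworkspace`, `MyGlobalWorkspace`) and the accessor `current_state` are hypothetical stand-ins that only mimic the layout described above; they are not the types exported by ExtensibleMCMC.jl, and the history is simplified to one state per iteration.

```julia
# Hypothetical stand-ins; they only mimic the field layout described above
# and are not the types exported by ExtensibleMCMC.jl.
abstract type ToyGlobalWorkspace{T} end

struct ToySubworkspace{T,TD} <: ToyGlobalWorkspace{T}
    state::Vector{T}
    state_history::Vector{Vector{T}}  # simplified: one state per iteration
    data::TD
end

# Custom workspace that keeps the common fields in `sub_ws` and adds
# a problem-specific, pre-allocated buffer next to them.
struct MyGlobalWorkspace{T,TD} <: ToyGlobalWorkspace{T}
    sub_ws::ToySubworkspace{T,TD}
    buffer::Vector{Float64}
end

function MyGlobalWorkspace(data::Vector{Float64}, θinit::Vector{T}, num_steps::Int) where T
    history = [similar(θinit) for _ in 1:num_steps]
    sub_ws = ToySubworkspace(copy(θinit), history, data)
    return MyGlobalWorkspace(sub_ws, zeros(length(data)))
end

# Generic accessor written once against `sub_ws` ...
current_state(ws::ToyGlobalWorkspace) = ws.sub_ws.state
# ... with a more specific method for the sub-workspace itself.
current_state(ws::ToySubworkspace) = ws.state

ws = MyGlobalWorkspace([0.1, 0.2, 0.3], [1.0, 2.0], 10)
```

Because the generic accessor only relies on the `sub_ws` field, any workspace that embeds the sub-workspace gets it for free, which is the convenience the note above refers to.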

Workspaces inheriting from LocalWorkspace

ExtensibleMCMC.LocalWorkspace (Type)
LocalWorkspace{T} <: Workspace

Supertype of all local workspaces. A local workspace should gather any additional objects that are needed by a specific update but are not already in the global workspace. Each MCMC update has its own LocalWorkspace.


Each MCMC update (for instance RandomWalkUpdate) will have its own LocalWorkspace. During updates it will have access to both its own LocalWorkspace and the GlobalWorkspace, but it will not see the LocalWorkspaces of other updates (however, information may be exchanged between LocalWorkspaces prior to each update call). Conceptually, the objects that fall under LocalWorkspace are those that

  • belong only to a local scope (for instance, proposal ϑ° for a subset ϑ of all parameters θ, or the ∇log-likelihood)
  • provide appropriately shaped views to a global view (for instance, a view to a subset of observations that are to be used for computations in this update, or a recipe for how to sub-sample the observations)

Similarly to GenericGlobalWorkspace, we implement a generic version of LocalWorkspace suitable for simple problems. See DiffusionMCMC.jl for a more advanced example.

ExtensibleMCMC.GenericLocalWorkspace (Type)
struct GenericLocalWorkspace{T} <: LocalWorkspace{T}
    sub_ws::StandardLocalSubworkspace{T}
    sub_ws°::StandardLocalSubworkspace{T}
    acceptance_history::Vector{Bool}
end

Generic local workspace with sub_ws containing the subset of state that the corresponding update operates on, holding the currently accepted value of that subset as well as its log-likelihood. sub_ws° corresponds to a proposal state. acceptance_history keeps track of accept/reject decisions.


where

ExtensibleMCMC.StandardLocalSubworkspace (Type)
struct StandardLocalSubworkspace{T} <: LocalWorkspace{T}
    state::Vector{T}
    ll::Vector{Float64}
    ll_history::Vector{Vector{Float64}}
    ∇ll::Vector{Vector{Float64}}
    momentum::Vector{Float64}
end

Standard containers likely to be present in every local workspace. state is the currently accepted subset of parameter θ that the corresponding update operates on, ll is the corresponding log-likelihood and ll_history is the chain of log-likelihoods. A single ll is of the type Vector{Float64} to reflect the fact that a problem might admit a natural factorisation into independent components that may be operated on independently and in parallel, with each entry in ll corresponding to a separate component. ∇ll is the gradient of the log-likelihood (needed by gradient-based algorithms) and momentum is the variable needed for the Hamiltonian dynamics. ∇ll and momentum may simply be left untouched if the problem does not need them.

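
To make the roles of a per-component ll vector and of a paired accepted/proposal workspace more concrete, here is a rough, self-contained sketch of one random-walk Metropolis update. `ToyLocalSub`, `update_ll!` and `rw_step!` are hypothetical names; the unit-variance Gaussian model and the flat prior are assumptions made for illustration only, and the code does not reproduce how ExtensibleMCMC.jl performs its updates.

```julia
using Random

# Hypothetical stand-in for a local sub-workspace; only `state` and `ll`
# mirror the fields described above.
struct ToyLocalSub
    state::Vector{Float64}
    ll::Vector{Float64}      # one log-likelihood entry per independent component
end

# Unit-variance Gaussian log-density of each block of observations,
# written into the pre-allocated `ll` vector.
function update_ll!(ws::ToyLocalSub, blocks)
    μ = ws.state[1]
    for (i, block) in enumerate(blocks)
        ws.ll[i] = sum(x -> -0.5 * (x - μ)^2 - 0.5 * log(2π), block)
    end
    return sum(ws.ll)        # total log-likelihood is the sum over components
end

# One random-walk Metropolis step with a flat prior; `cur` and `prop` play
# the roles of `sub_ws` and `sub_ws°`.
function rw_step!(rng, cur::ToyLocalSub, prop::ToyLocalSub, blocks, ϵ)
    prop.state .= cur.state .+ ϵ .* randn(rng, length(cur.state))
    ll° = update_ll!(prop, blocks)
    accepted = log(rand(rng)) <= ll° - sum(cur.ll)
    if accepted              # copy the proposal into the accepted containers
        cur.state .= prop.state
        cur.ll .= prop.ll
    end
    return accepted
end

blocks = [[0.9, 1.1], [1.2, 0.8, 1.0]]   # two independent components
cur  = ToyLocalSub([0.0], zeros(2))
prop = ToyLocalSub([0.0], zeros(2))
update_ll!(cur, blocks)

rng = MersenneTwister(1)
acceptance_history = [rw_step!(rng, cur, prop, blocks, 0.5) for _ in 1:100]
```

Both containers are allocated once; each step only overwrites their contents, and the loop over `blocks` is where a parallel implementation could process the components independently.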


Custom Workspaces


One of the most important aspects of ExtensibleMCMC.jl is customizability of Workspaces. Below, we describe how to define your own Workspaces.

Custom GlobalWorkspace


Each <CUSTOM>GlobalWorkspace needs to inherit from GlobalWorkspace. Apart from the struct definition, we need to provide a <CUSTOM>Backend inheriting from MCMCBackend, which will help Julia choose suitable initializers.

Additionally, if you plan on re-using some components of ExtensibleMCMC.jl, then the following methods MUST be defined for your <CUSTOM>GlobalWorkspace (using <CUSTOM>GlobalWorkspace in place of GlobalWorkspace and <CUSTOM>Backend in place of MCMCBackend):

ExtensibleMCMC.init_global_workspace (Method)
init_global_workspace(
    ::MCMCBackend,
    num_mcmc_steps,
    updates::Vector{<:MCMCUpdate},
    data,
    θinit::Vector{T};
    kwargs...
) where T

Initialize the <custom>GlobalWorkspace. The <custom>MCMCBackend indicates which GlobalWorkspace constructors to use, updates is a list of MCMC updates, θinit is the initial guess for the parameter and kwargs are the named arguments passed to run!.

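
The way a backend type can steer which workspace gets constructed may be sketched as follows. Everything here (`ToyBackend`, `GenericToyBackend`, `MyBackend`, `init_toy_workspace`, the `scratch_size` keyword) is hypothetical and simplified; it only illustrates the dispatch mechanism and does not reproduce the signature or behaviour of init_global_workspace.

```julia
# Hypothetical names throughout; this only illustrates dispatch on a backend
# type and is not the ExtensibleMCMC.jl implementation.
abstract type ToyBackend end
struct GenericToyBackend <: ToyBackend end
struct MyBackend <: ToyBackend end

struct GenericToyWorkspace
    state::Vector{Float64}
end

struct MyWorkspace
    state::Vector{Float64}
    scratch::Vector{Float64}   # extra pre-allocated container
end

# The backend argument carries no data; its type alone picks the constructor.
init_toy_workspace(::GenericToyBackend, data, θinit; kwargs...) =
    GenericToyWorkspace(copy(θinit))

function init_toy_workspace(::MyBackend, data, θinit; scratch_size=length(data), kwargs...)
    return MyWorkspace(copy(θinit), zeros(scratch_size))
end

data = [0.3, 0.7, 1.1]
ws_generic = init_toy_workspace(GenericToyBackend(), data, [0.0])
ws_custom  = init_toy_workspace(MyBackend(), data, [0.0]; scratch_size=5)
```

Accepting `kwargs...` in every method lets the caller forward one set of named arguments to all backends, with each backend picking out only the ones it understands.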
StatsBase.loglikelihood (Method)
loglikelihood(ws::GlobalWorkspace, ::Proposal)

Evaluate the loglikelihood for the proposal Law and observations stored in a global workspace.

StatsBase.loglikelihood (Method)
loglikelihood(ws::GlobalWorkspace, ::Previous)

Evaluate the loglikelihood for the accepted Law and observations stored in a global workspace.

Note

When overriding loglikelihood, do so via StatsBase.jl or Distributions.jl, where the function originally belongs, i.e.:

using StatsBase
using ExtensibleMCMC
const eMCMC = ExtensibleMCMC
# extend the StatsBase function rather than defining a new, unrelated one
StatsBase.loglikelihood(ws::CUSTOMGlobalWorkspace, ::eMCMC.Proposal) = ...
StatsBase.loglikelihood(ws::CUSTOMGlobalWorkspace, ::eMCMC.Previous) = ...
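
As a self-contained illustration of the tag-dispatch pattern used above, the following sketch defines local stand-ins for the two tag types and a toy workspace. `ToyWorkspace`, `gauss_ll` and `toy_loglikelihood` are hypothetical names, the local `Proposal` and `Previous` structs merely imitate the ones in ExtensibleMCMC.jl, and the unit-variance Gaussian model is an assumption made for the example.

```julia
# Hypothetical stand-ins for the tag types and a workspace; the real ones
# live in ExtensibleMCMC.jl and are not reproduced here.
struct Proposal end
struct Previous end

struct ToyWorkspace
    data::Vector{Float64}
    μ::Vector{Float64}    # accepted mean (one-element vector so it can be mutated)
    μ°::Vector{Float64}   # proposed mean
end

# Shared computation: unit-variance Gaussian log-likelihood.
gauss_ll(data, μ) = sum(x -> -0.5 * (x - μ)^2 - 0.5 * log(2π), data)

# One method per tag; dispatch selects which parameter set is used.
toy_loglikelihood(ws::ToyWorkspace, ::Proposal) = gauss_ll(ws.data, ws.μ°[1])
toy_loglikelihood(ws::ToyWorkspace, ::Previous) = gauss_ll(ws.data, ws.μ[1])

ws = ToyWorkspace([0.0, 1.0], [0.5], [2.0])
ll_prev = toy_loglikelihood(ws, Previous())
ll_prop = toy_loglikelihood(ws, Proposal())
```

In an actual custom workspace the two methods would be added to StatsBase.loglikelihood, as in the snippet above, rather than to a new function.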

If <CUSTOM>GlobalWorkspace contains a field sub_ws::StandardGlobalSubworkspace, then the methods below will work automatically. If it does not, then you should implement them for your <CUSTOM>GlobalWorkspace:

ExtensibleMCMC.state (Method)
state(ws::GlobalWorkspace, step)

Return the state of the chain accepted at iteration step of the Markov chain.

ExtensibleMCMC.state° (Method)
state°(ws::GlobalWorkspace, step)

Return the state of the chain proposed at iteration step of the Markov chain.

ExtensibleMCMC.num_updt (Method)
num_updt(ws::GlobalWorkspace)

Return the total number of MCMC updates that may be performed at each MCMC iteration.

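
For a workspace that does not embed a StandardGlobalSubworkspace, the three accessors have to be written by hand. The sketch below shows what that could look like for a hypothetical `FlatWorkspace` that stores its histories as matrices. The functions are defined locally under the same names purely for illustration; in a real implementation they would presumably be added as methods of the corresponding ExtensibleMCMC functions, and the storage layout is an assumption.

```julia
# Hypothetical custom workspace with its own storage layout; the accessor
# names mirror those listed above but are defined locally, not imported.
struct FlatWorkspace
    accepted::Matrix{Float64}   # column `step` holds the accepted state
    proposed::Matrix{Float64}   # column `step` holds the proposed state
    n_updates::Int
end

state(ws::FlatWorkspace, step)  = view(ws.accepted, :, step)
state°(ws::FlatWorkspace, step) = view(ws.proposed, :, step)
num_updt(ws::FlatWorkspace)     = ws.n_updates

ws = FlatWorkspace([1.0 2.0; 3.0 4.0], [1.5 2.5; 3.5 4.5], 1)
```

Returning views rather than copies keeps the accessors allocation-free, in line with the pre-allocation motivation for Workspaces.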

Custom LocalWorkspace


Each <CUSTOM>LocalWorkspace needs to inherit from LocalWorkspace. Additionally, if you plan on re-using some components of ExtensibleMCMC.jl, then the following methods MUST be defined for your <CUSTOM>LocalWorkspace (using <CUSTOM>LocalWorkspace in place of LocalWorkspace):

ExtensibleMCMC.create_workspace (Method)
create_workspace(
    ::MCMCBackend,
    mcmcupdate,
    global_ws::GlobalWorkspace,
    num_mcmc_steps
) where {T}

Create a local workspace for a given mcmcupdate.


If your <CUSTOM>LocalWorkspace contains sub_ws::StandardLocalSubworkspace and sub_ws°::StandardLocalSubworkspace as its fields, then the methods below will work automatically:

ExtensibleMCMC.ll (Method)
ll(ws::LocalWorkspace, i::Int)

Return log-likelihood of the ith accepted parameter.

Tip

The function names associated with proposals end with the character °, which in Atom can be entered with \degree and has the Unicode code point U+00B0.

Finally, there are some functions that are likely to work for custom LocalWorkspaces, but you are advised to check for compatibility: