The idea behind Workspaces
Markov chain Monte Carlo algorithms often involve computationally expensive routines. As these routines usually need to be repeated at each MCMC iteration, the algorithm may be sped up significantly by pre-allocating suitable containers in which all, or at least the majority, of the computations are performed. In ExtensibleMCMC.jl these containers are termed Workspaces:
ExtensibleMCMC.Workspace — Type
Supertype of all workspaces, i.e. of structs that gather in one place various objects that the MCMC sampler operates on.
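To make the pre-allocation idea concrete, here is a minimal, stand-alone sketch. It does not use ExtensibleMCMC.jl; the names ToyWorkspace and toy_loglik! are hypothetical and only illustrate how a buffer allocated once can be reused at every iteration.

```julia
# Toy illustration of pre-allocation; none of these names come from ExtensibleMCMC.jl.
struct ToyWorkspace
    residuals::Vector{Float64}   # buffer allocated once, reused at every iteration
end

# Convenience constructor: allocate a buffer for n observations.
ToyWorkspace(n::Int) = ToyWorkspace(zeros(n))

# Gaussian log-likelihood (unit variance, constants dropped), computed in the buffer.
function toy_loglik!(ws::ToyWorkspace, data::Vector{Float64}, μ::Float64)
    ws.residuals .= data .- μ    # in-place broadcast, no new array is created
    return -0.5 * sum(abs2, ws.residuals)
end

data = [1.0, 2.0, 3.0]
toy_ws = ToyWorkspace(length(data))
ll_value = toy_loglik!(toy_ws, data, 2.0)
```

Calling toy_loglik! repeatedly with different values of μ overwrites the same residuals buffer instead of allocating a new vector each time.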
There are two types of Workspaces:
- GlobalWorkspaces and
- LocalWorkspaces
Workspaces inheriting from GlobalWorkspace
ExtensibleMCMC.GlobalWorkspace — Type
GlobalWorkspace{T} <: Workspace
Supertype of all global workspaces. Each MCMC sampler must have a unique global workspace, which contains state, state_history, state_proposal_history, acceptance_history and data. state is the parameter vector that the MCMC sampling is done for (the other names are self-explanatory).
GlobalWorkspace is the master Workspace that is responsible for:
- keeping track of the MCMC chain, in particular the most recent state of the chain
- holding the observed data (or at least a pointer to it)
and optionally:
- holding containers for doing computations
- keeping track of some online statistics regarding the chain
- any other object that conceptually belongs to the scope of the chain and not the scope of the local MCMC updates
We implement a generic version of GlobalWorkspace that may be suitable for simple problems. For more advanced problems the user will need to implement custom GlobalWorkspaces.
ExtensibleMCMC.GenericGlobalWorkspace — Type
struct GenericGlobalWorkspace{T,TD,TL} <: GlobalWorkspace{T}
    sub_ws::StandardGlobalSubworkspace{T,TD}
    P::TL
    P°::TL
end
Generic global workspace with sub_ws containing current state and keeping track of its history and some basic statistics. P and P° are the target laws with accepted and proposal state set as parameters.
where
ExtensibleMCMC.StandardGlobalSubworkspace — Type
struct StandardGlobalSubworkspace{T,TD} <: GlobalWorkspace{T}
    state::Vector{T}
    state_history::Vector{Vector{Vector{T}}}
    state_proposal_history::Vector{Vector{Vector{T}}}
    data::TD
    stats::GenericChainStats{T}
end
Standard containers expected to be present in every global workspace. state is the currently accepted parameter θ, state_history is a chain of states that have been accepted and state_proposal_history is a chain of states that have been proposed. data are the data passed to an MCMC sampler (usually just a pointer) and stats gathers some basic online information about the chain.
To see an example of an implementation of a custom Workspace see DiffusionMCMC.jl, where we implemented DiffusionGlobalWorkspace (and DiffusionLocalWorkspace).
StandardGlobalSubworkspace is simply a collection of the most common fields present in a GlobalWorkspace. It therefore does not need to be present in custom implementations of GlobalWorkspace, but including it will often be convenient.
Workspaces inheriting from LocalWorkspace
ExtensibleMCMC.LocalWorkspace — Type
LocalWorkspace{T} <: Workspace
Supertype of all local workspaces. A local workspace should gather any additional objects that are needed by specific updates, but are not already in a global workspace. Each MCMC update has its own LocalWorkspace.
Each MCMC update (for instance RandomWalkUpdate) will have its own LocalWorkspace. During updates it will have access to both its LocalWorkspace and the GlobalWorkspace, but it will not see LocalWorkspaces of other updates (however, information between LocalWorkspaces may be exchanged prior to each update call). Conceptually, the objects that fall under LocalWorkspace are those that
- belong only to a local scope (for instance, proposal ϑ° for a subset ϑ of all parameters θ, or the ∇log-likelihood)
- provide appropriately shaped views to a global view (for instance, a view to a subset of observations that are to be used for computations in this update, or a recipe for how to sub-sample the observations)
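The two kinds of local objects listed above can be sketched with plain Julia arrays. This is only an illustration under simple assumptions (a parameter vector θ, an update acting on coordinates 2 and 4, and a fixed sub-sampling pattern); it does not use any ExtensibleMCMC.jl types, and idx, obs_subset and the numeric values are made up.

```julia
θ = [1.0, 2.0, 3.0, 4.0]        # full parameter vector (global scope)
idx = [2, 4]                    # coordinates handled by this hypothetical update

ϑ = view(θ, idx)                # local view of the subset, shares memory with θ
ϑ° = ϑ .+ [0.5, -0.5]           # local proposal, a fresh 2-element vector

observations = collect(1.0:10.0)
obs_subset = view(observations, 1:2:10)   # every other observation, no copy made

# "Accepting" the proposal writes through the view into the global vector.
ϑ .= ϑ°
```

Because ϑ is a view, the last line changes θ itself, while the proposal ϑ° remains a purely local object.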
Similarly to GenericGlobalWorkspace, we implement a generic version of LocalWorkspace suitable for simple problems. See DiffusionMCMC.jl for a more advanced example.
ExtensibleMCMC.GenericLocalWorkspace — Type
struct GenericLocalWorkspace{T} <: LocalWorkspace{T}
    sub_ws::StandardLocalSubworkspace{T}
    sub_ws°::StandardLocalSubworkspace{T}
    acceptance_history::Vector{Bool}
end
Generic local workspace with sub_ws containing the subset of state that the corresponding update operates on, with the currently accepted value of state as well as its log-likelihood. sub_ws° corresponds to a proposal state. acceptance_history keeps track of accept/reject decisions.
where
ExtensibleMCMC.StandardLocalSubworkspace — Type
struct StandardLocalSubworkspace{T} <: LocalWorkspace{T}
    state::Vector{T}
    ll::Vector{Float64}
    ll_history::Vector{Vector{Float64}}
    ∇ll::Vector{Vector{Float64}}
    momentum::Vector{Float64}
end
Standard containers likely to be present in every local workspace. state is the currently accepted subset of parameter θ that the corresponding update operates on, ll is the corresponding log-likelihood and ll_history is the chain of log-likelihoods. A single ll is of the type Vector{Float64} to reflect the fact that a problem might admit a natural factorisation into independent components that may be operated on independently and in parallel, with each entry in ll corresponding to a separate component. ∇ll is the gradient of the log-likelihood (needed by gradient-based algorithms) and momentum is the variable needed for the Hamiltonian dynamics. ∇ll and momentum may simply be left untouched if the problem does not need them.
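The reason for storing ll as a vector can be illustrated with a small stand-alone sketch. It assumes a toy Gaussian model whose observations split into two independent blocks; gauss_ll, blocks and ll_components are hypothetical names that do not appear in the package.

```julia
# Two independent blocks of observations, each with its own log-likelihood entry.
blocks = [[0.0, 1.0], [2.0]]

# Gaussian log-likelihood with unit variance, additive constants dropped.
gauss_ll(x, μ) = -0.5 * sum(abs2, x .- μ)

# Plays the role of `ll`: one entry per independent component.
ll_components = Vector{Float64}(undef, length(blocks))
for (i, block) in enumerate(blocks)
    ll_components[i] = gauss_ll(block, 1.0)   # entries could be filled in parallel
end

# The log-likelihood of the whole dataset is the sum over components.
total_ll = sum(ll_components)
```

Each entry depends only on its own block, which is what makes per-component or parallel evaluation possible.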
Custom Workspaces
One of the most important aspects of ExtensibleMCMC.jl is customizability of Workspaces. Below, we describe how to define your own Workspaces.
Custom GlobalWorkspace
Each <CUSTOM>GlobalWorkspace needs to inherit from GlobalWorkspace. Apart from the struct definition, we need to provide a <CUSTOM>Backend inheriting from:
ExtensibleMCMC.MCMCBackend — Type
Supertype of all backends for the MCMC sampler.
that will help Julia choose suitable initializers.
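The way a backend type can steer Julia towards a particular initializer is ordinary multiple dispatch. The following stand-alone sketch shows the mechanism only; ToyBackend, ToyGenericBackend, ToyCustomBackend and toy_init are hypothetical stand-ins, not the package's own definitions.

```julia
# Hypothetical stand-ins; these are not the types exported by ExtensibleMCMC.jl.
abstract type ToyBackend end
struct ToyGenericBackend <: ToyBackend end
struct ToyCustomBackend <: ToyBackend end

# One initializer per backend; Julia selects the method from the backend's type.
toy_init(::ToyGenericBackend, θinit::Vector{Float64}) =
    (kind = :generic, state = copy(θinit))
toy_init(::ToyCustomBackend, θinit::Vector{Float64}) =
    (kind = :custom, state = copy(θinit))

generic_ws = toy_init(ToyGenericBackend(), [0.0, 1.0])
custom_ws = toy_init(ToyCustomBackend(), [0.0, 1.0])
```

The backend carries no data; its only job is to select which method, and hence which workspace constructor, gets called.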
Additionally, if you plan on re-using some components of ExtensibleMCMC.jl, then the following methods MUST be defined for your <CUSTOM>GlobalWorkspace (using <CUSTOM>GlobalWorkspace in place of GlobalWorkspace and <CUSTOM>Backend in place of MCMCBackend):
ExtensibleMCMC.init_global_workspace — Method
init_global_workspace(
    ::MCMCBackend,
    num_mcmc_steps,
    updates::Vector{<:MCMCUpdate},
    data,
    θinit::Vector{T};
    kwargs...
) where T
Initialize the <custom>GlobalWorkspace. <custom>MCMCBackend indicates which GlobalWorkspace constructors to use, updates is a list of MCMC updates, θinit is the initial guess for the parameter and kwargs are the named arguments passed to run!.
StatsBase.loglikelihood — Method
loglikelihood(ws::GlobalWorkspace, ::Proposal)
Evaluate the loglikelihood for the proposal Law and observations stored in a global workspace.
StatsBase.loglikelihood — Method
loglikelihood(ws::GlobalWorkspace, ::Previous)
Evaluate the loglikelihood for the accepted Law and observations stored in a global workspace.
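The two loglikelihood methods above differ only in the type of their second argument. A minimal stand-alone mock of this tag-based dispatch is sketched below; ToyProposal, ToyPrevious, ToyLawWorkspace and toy_loglikelihood are hypothetical, and the "laws" are reduced to the mean of a unit-variance Gaussian purely for illustration.

```julia
# Mock tag types; in the package the corresponding tags are `Proposal` and `Previous`.
struct ToyProposal end
struct ToyPrevious end

struct ToyLawWorkspace
    data::Vector{Float64}
    μ::Float64     # parameter of the accepted law (stands in for `P`)
    μ°::Float64    # parameter of the proposal law (stands in for `P°`)
end

# Gaussian log-likelihood with unit variance, additive constants dropped.
gaussian_ll(data, μ) = -0.5 * sum(abs2, data .- μ)

# The tag argument selects which law the data are evaluated under.
toy_loglikelihood(ws::ToyLawWorkspace, ::ToyPrevious) = gaussian_ll(ws.data, ws.μ)
toy_loglikelihood(ws::ToyLawWorkspace, ::ToyProposal) = gaussian_ll(ws.data, ws.μ°)

law_ws = ToyLawWorkspace([1.0, 3.0], 2.0, 1.0)
ll_prev = toy_loglikelihood(law_ws, ToyPrevious())
ll_prop = toy_loglikelihood(law_ws, ToyProposal())
```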
When overriding loglikelihood, do so via StatsBase.jl or Distributions.jl, to which the function name originally belongs, i.e.:
using StatsBase
using ExtensibleMCMC
const eMCMC = ExtensibleMCMC
StatsBase.loglikelihood(ws::CUSTOMGlobalWorkspace, ::eMCMC.Proposal) = ...
StatsBase.loglikelihood(ws::CUSTOMGlobalWorkspace, ::eMCMC.Previous) = ...
If <CUSTOM>GlobalWorkspace contains a field sub_ws::StandardGlobalSubworkspace, then the methods below will work automatically. If it does not, then you should implement them for your <CUSTOM>GlobalWorkspace:
ExtensibleMCMC.num_mcmc_steps — Method
num_mcmc_steps(ws::GlobalWorkspace)
Return the total set number of MCMC iterations.
ExtensibleMCMC.state — Method
state(ws::GlobalWorkspace)
Return currently accepted state of the chain.
ExtensibleMCMC.state — Method
state(ws::GlobalWorkspace, step)
Return the state of the chain accepted at the step iteration of the Markov chain.
ExtensibleMCMC.state° — Method
state°(ws::GlobalWorkspace, step)
Return the state of the chain proposed at the step iteration of the Markov chain.
ExtensibleMCMC.num_updt — Method
num_updt(ws::GlobalWorkspace)
Return the total set number of MCMC updates that may be performed at each MCMC iteration.
ExtensibleMCMC.estim_mean — Method
estim_mean(ws::GlobalWorkspace)
Return the empirical mean of the parameter.
ExtensibleMCMC.estim_cov — Method
estim_cov(ws::GlobalWorkspace)
Return the empirical covariance of the parameter.
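One plausible reason why accessors like these can work automatically is that generic methods are written once against the sub_ws field of the abstract supertype. The stand-alone mock below sketches that pattern under this assumption. All types and functions in it are hypothetical, the history is simplified to one state per iteration (the package stores a nested Vector{Vector{Vector{T}}}), and the package's actual implementation may differ.

```julia
# Stand-alone mock; the types below only mimic the layout described above
# and are NOT the definitions from ExtensibleMCMC.jl.
abstract type ToyGlobalWorkspace{T} end

struct ToySubworkspace{T,TD}
    state::Vector{T}
    state_history::Vector{Vector{T}}   # simplified: one state per iteration
    data::TD
end

struct MyGlobalWorkspace{T,TD} <: ToyGlobalWorkspace{T}
    sub_ws::ToySubworkspace{T,TD}
end

# Generic accessors written once against the `sub_ws` field of the supertype.
toy_state(ws::ToyGlobalWorkspace) = ws.sub_ws.state
toy_state(ws::ToyGlobalWorkspace, step::Int) = ws.sub_ws.state_history[step]
toy_num_mcmc_steps(ws::ToyGlobalWorkspace) = length(ws.sub_ws.state_history)

history = [[0.0, 0.0], [0.1, -0.2]]
my_ws = MyGlobalWorkspace(ToySubworkspace([0.1, -0.2], history, nothing))
```

Any further subtype of ToyGlobalWorkspace that carries a sub_ws field of this kind would inherit the three accessors without additional code. A subtype without that field would need its own methods.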
Custom LocalWorkspace
Each <CUSTOM>LocalWorkspace needs to inherit from LocalWorkspace. Additionally, if you plan on re-using some components of ExtensibleMCMC.jl, then the following methods MUST be defined for your <CUSTOM>LocalWorkspace (using <CUSTOM>LocalWorkspace in place of LocalWorkspace):
ExtensibleMCMC.create_workspace — Method
create_workspace(
    ::MCMCBackend,
    mcmcupdate,
    global_ws::GlobalWorkspace,
    num_mcmc_steps
) where {T}
Create a local workspace for a given mcmcupdate.
ExtensibleMCMC.accepted — Method
accepted(ws::LocalWorkspace, i::Int)
Return boolean for whether the ith update has been accepted.
ExtensibleMCMC.set_accepted! — Method
set_accepted!(ws::LocalWorkspace, i::Int, v)
Set boolean for whether the ith update has been accepted.
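For a custom local workspace that stores its accept/reject decisions in a Vector{Bool}, as GenericLocalWorkspace does with acceptance_history, these two methods can be very short. The sketch below uses a hypothetical stand-alone type and hypothetical function names rather than extending the package's own functions.

```julia
# Hypothetical local workspace holding only the acceptance history.
struct ToyAcceptWorkspace
    acceptance_history::Vector{Bool}
end

# Pre-allocate one flag per MCMC iteration, all initially `false`.
ToyAcceptWorkspace(num_mcmc_steps::Int) =
    ToyAcceptWorkspace(fill(false, num_mcmc_steps))

toy_accepted(ws::ToyAcceptWorkspace, i::Int) = ws.acceptance_history[i]
toy_set_accepted!(ws::ToyAcceptWorkspace, i::Int, v::Bool) =
    (ws.acceptance_history[i] = v)

accept_ws = ToyAcceptWorkspace(3)
toy_set_accepted!(accept_ws, 2, true)

# Fraction of accepted updates so far.
acceptance_rate = count(accept_ws.acceptance_history) / length(accept_ws.acceptance_history)
```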
If your <CUSTOM>LocalWorkspace contains sub_ws::StandardLocalSubworkspace and sub_ws°::StandardLocalSubworkspace as its fields, then the methods below will work automatically:
ExtensibleMCMC.ll — Method
ll(ws::LocalWorkspace)
Return log-likelihood of the currently accepted parameter.
ExtensibleMCMC.ll° — Method
ll°(ws::LocalWorkspace)
Return log-likelihood of the currently proposed parameter.
ExtensibleMCMC.ll — Method
ll(ws::LocalWorkspace, i::Int)
Return log-likelihood of the ith accepted parameter.
ExtensibleMCMC.ll° — Method
ll°(ws::LocalWorkspace, i::Int)
Return log-likelihood of the ith proposed parameter.
ExtensibleMCMC.state — Method
state(ws::LocalWorkspace)
Return the last accepted state.
ExtensibleMCMC.state° — Method
state°(ws::LocalWorkspace)
Return the last proposed state.
The names of functions associated with proposals end with the character ° (Unicode U+00B0), which in Atom can be entered with \degree.
Finally, there are some functions that are likely to work for custom LocalWorkspaces, but you are advised to check for compatibility:
ExtensibleMCMC.create_workspaces — Method
create_workspaces(v::MCMCBackend, mcmc::MCMC)
Create local workspaces, one for each update.
ExtensibleMCMC.llr — Method
llr(ws::LocalWorkspace, i::Int)
Compute log-likelihood ratio at ith mcmc iteration.
ExtensibleMCMC.name_of_update — Method
name_of_update(ws::LocalWorkspace)
Return the name of the update.
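As a rough illustration of what a log-likelihood ratio such as the one returned by llr is typically used for, the stand-alone sketch below computes the ratio from per-component log-likelihoods and feeds it into a Metropolis accept/reject rule. It assumes a symmetric proposal and a flat prior, and toy_llr and toy_accept are hypothetical helpers; the package's own llr may be defined differently.

```julia
using Random

ll_accepted = [-1.0, -0.5]     # per-component log-likelihood at the accepted state
ll_proposed = [-0.75, -0.25]   # per-component log-likelihood at the proposal

# Log-likelihood ratio: proposal minus accepted, summed over components.
toy_llr(ll°, ll) = sum(ll°) - sum(ll)

# Metropolis rule for a symmetric proposal and a flat prior (both assumptions).
toy_accept(rng, llr) = log(rand(rng)) <= llr

rng = MersenneTwister(1)
ratio = toy_llr(ll_proposed, ll_accepted)
is_accepted = toy_accept(rng, ratio)
```

Since the ratio here is positive, the proposal is accepted regardless of the random draw.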