PiecewiseInference.jl

PiecewiseInference.PiecewiseInference – Module


Suite for inverse modelling of dynamical systems characterised by complex dynamics.

PiecewiseInference.jl implements a segmentation method that improves the convergence of local search methods by regularizing the inference problem.

Installation

PiecewiseInference.jl depends on ParametricModels.jl, a non-registered package. To install PiecewiseInference.jl, you therefore first need to add to your Julia installation an extra registry that tracks both ParametricModels.jl and PiecewiseInference.jl.

To proceed, open Julia and type the following:

using Pkg
pkg"registry add https://github.com/vboussange/VBoussangeRegistry.git"

Then install the package:

pkg"add PiecewiseInference"

That's it! This will download the latest version of PiecewiseInference.jl from this git repository, together with all its dependencies.

Getting started

Check out this blog post providing a hands-on tutorial. See also the documentation and the test folder.

Related packages

DiffEqFlux is a package with goals similar to those of PiecewiseInference; it provides the method DiffEqFlux.multiple_shooting, which is close to PiecewiseInference.inference but does not infer initial conditions. PiecewiseInference additionally provides several utility methods for model selection.

Reference

  • Boussange, V., Vilimelis-Aceituno, P., Pellissier, L., Mini-batching ecological data to improve ecosystem models with machine learning, bioRxiv, 46 pages.

PiecewiseInference.InferenceProblem – Method
InferenceProblem(
    model,
    p0;
    p_bij,
    u0_bij,
    loss_param_prior,
    loss_u0_prior,
    loss_likelihood
)

Args

  • model: a model of type AbstractModel.
  • p0: initial guess of the parameters. Should be a named tuple.

Optional

  • p_bij, u0_bij: bijectors used to constrain the parameter values and the initial conditions, respectively.
  • loss_param_prior: a function with argument p::NamedTuple, corresponding to the parameter priors. By default, loss_param_prior(p) = 0.
  • loss_u0_prior: a function with arguments u0_pred, u0_data, corresponding to the priors on the initial conditions. By default, it is the RSS between u0 and the corresponding data point.
  • loss_likelihood: a function matching the predictions against the data, with arguments data, pred, tsteps, where tsteps corresponds to the time steps of data and pred. By default, it is the RSS between data and pred.
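A minimal sketch of constructing an InferenceProblem with custom priors and likelihood; model is assumed to be an AbstractModel built as in the Examples section of inference, and the priors below are illustrative placeholders:

```julia
using PiecewiseInference
using Distributions, Bijectors

p0 = (b = [1., 2.],) # initial parameter guess

# custom priors and likelihood, written as losses (negative log-densities / RSS)
loss_param_prior(p) = -sum(logpdf.(Normal(0., 1.), p.b))        # Gaussian prior on b
loss_u0_prior(u0_pred, u0_data) = sum(abs2, u0_pred .- u0_data) # RSS on initial conditions
loss_likelihood(data, pred, tsteps) = sum(abs2, data .- pred)   # plain RSS

infprob = InferenceProblem(model, p0;
                           p_bij = (bijector(Uniform(1e-3, 5e0)),),
                           u0_bij = bijector(Uniform(1e-3, 5.)),
                           loss_param_prior,
                           loss_u0_prior,
                           loss_likelihood)
```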

PiecewiseInference.AIC – Method
AIC(RSS, k, m)

Calculates the AIC of a model given its RSS, its number of parameters k, and the number of observations m.
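Assuming the conventional least-squares form of the criterion (the exact expression used internally may differ; see the source code), this corresponds to AIC = m * log(RSS / m) + 2 * k.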

PiecewiseInference.AIC – Method
AIC(data, tsteps, infres)

Computes the AIC of the inference result infres given data and the observational noise distribution.

PiecewiseInference.AICc – Method
AICc(aic, k, m)

Calculates the corrected AIC (AICc) of a model given its aic, its number of parameters k, and the number of observations m (for d variables observed over T time steps, m = d * T). The formula is taken from Mangan et al. (2017).
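Assuming the standard small-sample correction used in Mangan et al. (2017), this corresponds to AICc = AIC + 2 * k * (k + 1) / (m - k - 1).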

PiecewiseInference.AICc_TREE – Method

AICc_TREE(RSS, k, m)

Calculates the corrected AIC of a model given its RSS, its number of parameters k, and the number of observations m. The formula is taken from the TREE article.

PiecewiseInference.FIM_strouwen – Method
FIM_strouwen(predict, θ, Σ)

Returns the Fisher information matrix (FIM) associated with predict, evaluated at θ. Adapted from https://arnostrouwen.com/post/dynamic-experimental-design/

  • predict is a function that takes θ as a parameter and returns an array
    • with dim = 1 corresponding to the state variables and
    • dim = 2 corresponding to the time steps
  • θ: the parameter vector at which to evaluate the FIM
  • Σ: the variance-covariance matrix of the observation errors
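A minimal usage sketch, assuming predict returns an array of state variables × time steps as described above; the exponential-decay predictor and noise level below are illustrative placeholders:

```julia
using PiecewiseInference
using LinearAlgebra

tsteps = 0.:0.1:1.

# hypothetical predictor: 2 state variables (dim = 1) × time steps (dim = 2)
predict(θ) = vcat((θ[1] .* exp.(-θ[2] .* tsteps))',
                  (θ[1] .* exp.(-2 .* θ[2] .* tsteps))')

θ = [1.0, 0.3]                # parameter vector at which to evaluate the FIM
Σ = Diagonal(fill(0.1^2, 2))  # variance-covariance matrix of the observation errors

F = FIM_strouwen(predict, θ, Σ)
```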
PiecewiseInference.FIM_yazdani – Method
FIM_yazdani(dudt, u0_true, tspan, p_true, Σ)

Returns the Fisher information matrix (FIM) associated with the problem defined by prob = ODEProblem(dudt, u0_true, tspan, p_true). Σ is the variance-covariance matrix of the observation errors.
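A minimal usage sketch, assuming dudt follows the standard in-place DifferentialEquations.jl signature; the logistic dynamics and values below are illustrative placeholders:

```julia
using PiecewiseInference
using LinearAlgebra

# in-place ODE right-hand side, with the standard DifferentialEquations.jl signature
dudt(du, u, p, t) = (du .= p[1] .* u .* (1 .- u))

u0_true = [0.1, 0.2]          # initial conditions
tspan = (0., 10.)             # time span of the problem
p_true = [0.5]                # parameter vector at which to evaluate the FIM
Σ = Diagonal(fill(0.1^2, 2))  # variance-covariance matrix of the observation errors

F = FIM_yazdani(dudt, u0_true, tspan, p_true, Σ)
```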

PiecewiseInference.R2 – Method
R2(odedata, pred, _)

We distinguish between the R2 for log-transformed values (with `dist::MvLogNormal` as the last argument) and the standard R2, so as to discard non-positive values in the former case.
PiecewiseInference.RSS – Method
RSS(res, data_set, noisedistrib)

Computes the RSS of res given data_set.

Argument:

  • fn: set to identity by default, but can be e.g. log if a lognormal loss is used.
PiecewiseInference.get_evidence – Method
get_evidence(data, tsteps, infprob, p, u0s, ranges)

Provides the evidence p(M|data) = ∫ p(data|M, θ) p(θ) p(M) dθ for the model M stored in infprob given the data, using the MAP estimate p and u0s. Here it is assumed that p(M) = 1.

Relies on Laplace's method.

Note

For now, we do not integrate over initial conditions u0s, but this may be considered.
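For reference, Laplace's method approximates the evidence integral around the MAP estimate θ̂ as p(data|M) ≈ p(data|θ̂, M) p(θ̂) (2π)^(k/2) |H|^(-1/2), where k is the number of parameters and H is the Hessian of the negative log-posterior at θ̂. This is the standard form of the approximation; the exact expression used internally may differ.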

PiecewiseInference.group_ranges_gs – Method
group_ranges_gs(datasize, group_size)

Get ranges that partition data of length datasize into groups of group_size observations. If the data cannot be partitioned evenly, the last group contains the remaining observations. Taken from https://github.com/SciML/DiffEqFlux.jl/blob/80c4247c19860d0422211d6a65283d896eeaa831/src/multiple_shooting.jl#L273-L303

group_ranges(datasize, groupsize)

Arguments:

  • datasize: number of data points to be partitioned
  • groupsize: maximum number of observations in each group

Example:

julia> group_ranges(10, 5)
3-element Vector{UnitRange{Int64}}:
 1:5
 5:9
 9:10
PiecewiseInference.inference – Method
inference(
    infprob;
    data,
    tsteps,
    group_size,
    group_nb,
    ranges,
    optimizers,
    epochs,
    batchsizes,
    verbose_loss,
    plotting,
    info_per_its,
    cb,
    threshold,
    save_pred,
    save_losses,
    u0s_init,
    adtype,
    multi_threading
)

Performs piecewise inference for a given InferenceProblem and data, looping through the optimizers in optimizers. Returns an InferenceResult.

Arguments

  • infprob: An instance of InferenceProblem that defines the model, the parameter constraints and its likelihood function.
  • optimizers: An array of optimizers that will be used to maximize the likelihood (minimize the loss), e.g. [Adam(0.01)].
  • group_size: The size of the segments. It is an alternative to group_nb, and specifies the number of data points in each segment.
  • group_nb: Alternatively to group_size, one can ask for a certain number of segments.
  • ranges: Alternatively to group_size and group_nb, one can directly provide a vector of indices, where each entry corresponds to the indices of a segment. Possibly provided by get_ranges.
  • data : The data to fit.
  • tsteps : The time steps for which the data was recorded.

Optional

  • u0s_init: A vector of initial guesses for the initial conditions of each segment. If not provided, initial guesses are taken from the data.
  • epochs: A vector with the number of epochs for each optimizer in optimizers.
  • batchsizes: A vector of batch sizes, which should match the length of optimizers.
  • verbose_loss: Whether to display the loss during training.
  • info_per_its = 50: The frequency at which to display the training information.
  • plotting: Whether to plot the loss convergence during training.
  • cb: A callback function. Must be of the form cb(θs, p_trained, losses, pred, ranges).
  • threshold: The tolerance for stopping training.
  • save_pred = true: Whether to save the predictions.
  • save_losses = true: Whether to save the losses.
  • adtype = Optimization.AutoForwardDiff(): The automatic differentiation (AD) type to be used. Can be Optimization.AutoForwardDiff() for forward AD or Optimization.AutoZygote() for reverse-mode AD.
  • multi_threading = true: If true, the segments in the piecewise loss are computed in parallel. Currently not supported with adtype = Optimization.AutoZygote().

Examples

```julia
using SciMLSensitivity # provides differential equation sensitivity methods
using UnPack # provides the utility macro @unpack
using OptimizationOptimisers, OptimizationFlux # provide the optimizers
using LinearAlgebra
using ParametricModels
using PiecewiseInference
using OrdinaryDiffEq
using Distributions, Bijectors # used to constrain parameters and initial conditions

@model MyModel
function (m::MyModel)(du, u, p, t)
    @unpack b = p
    du .= 0.1 .* u .* (1. .- b .* u)
end

tsteps = 1.:0.5:100.5
tspan = (tsteps[1], tsteps[end])
p_true = (b = [0.23, 0.5],)
p_init = (b = [1., 2.],)

# defining the model
# pay attention to the semicolon before the parameters for ModelParams
u0 = ones(2)
mp = ModelParams(; p = p_true,
                 tspan,
                 u0,
                 alg = BS3(),
                 sensealg = ForwardDiffSensitivity(),
                 saveat = tsteps)
model = MyModel(mp)
sol_data = ParametricModels.simulate(model)
ode_data = Array(sol_data)

# adding some normally distributed noise
σ_noise = 0.1
ode_data_wnoise = ode_data .+ randn(size(ode_data)) .* σ_noise

# define the InferenceProblem
# first specify which values the parameters can take, with bijectors;
# here, b is constrained to be ∈ [1e-3, 5e0] and u0 ∈ [1e-3, 5.]
p_bij = (bijector(Uniform(1e-3, 5e0)),)
u0_bij = bijector(Uniform(1e-3, 5.))
distrib_noise = MvNormal(ones(2) * σ_noise^2)

# defining loss_likelihood
loss_likelihood(data, pred, tsteps) = sum(logpdf(distrib_noise, data .- pred))

infprob = InferenceProblem(model, p_init; p_bij, u0_bij)

optimizers = [ADAM(0.001)]
epochs = [5000]
group_nb = 2
batchsizes = [1] # batch size used for each optimizer in optimizers (here only one);
                 # you could also have batchsizes = [group_nb]
res = inference(infprob;
                group_nb = group_nb,
                data = ode_data_wnoise,
                tsteps = tsteps,
                epochs = epochs,
                optimizers = optimizers,
                batchsizes = batchsizes)
p_trained = get_p_trained(res)
pred = res.pred
```

PiecewiseInference.iterative_inference – Method
iterative_inference(
    infprob;
    group_sizes,
    group_nbs,
    optimizers_array,
    threshold,
    kwargs...
)

Performs an iterative piecewise MLE, iterating over group_sizes. Stops the iteration when the loss function increases between two iterations.

Returns an array with all the InferenceResults obtained during the iteration. For kwargs, see inference.

Note

  • for now, does not support independent time series (piecewise_ML_indep_TS).
  • at every iteration, initial conditions are initialised from the predictions of the previous iteration

Specific arguments

  • group_sizes: an array of group sizes to test
  • optimizers_array: optimizers_array[i] is an array of optimizers for the training process with group_sizes[i]; see the sketch below
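A minimal usage sketch, assuming infprob, ode_data_wnoise and tsteps are defined as in the Examples section of inference; the group sizes and optimizers below are illustrative:

```julia
group_sizes = [2, 4]                             # segment sizes to test
optimizers_array = [[ADAM(0.01)], [ADAM(0.001)]] # one optimizer array per group size

res_array = iterative_inference(infprob;
                                group_sizes = group_sizes,
                                optimizers_array = optimizers_array,
                                data = ode_data_wnoise,
                                tsteps = tsteps,
                                epochs = [1000])
```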
PiecewiseInference.loss_param_prior_from_dict – Method
loss_param_prior_from_dict(params, param_distrib)

Returns the loglikelihood of params given the prior distributions param_distrib.

  • params: the parameters, in the form of a NamedTuple
  • param_distrib: a Dictionary or a NamedTuple, with entries of the form p::String => d::Distribution
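A minimal sketch, assuming distributions are passed directly as Distribution objects; the prior below is an illustrative placeholder:

```julia
using PiecewiseInference
using Distributions, LinearAlgebra

# hypothetical prior: parameter b is a priori standard multivariate normal
param_distrib = Dict("b" => MvNormal(zeros(2), I))

params = (b = [0.5, 1.0],)
lp = loss_param_prior_from_dict(params, param_distrib)
```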
PiecewiseInference.moments! – Method
moments!(xx, x)

Returns the moments of x as a vector stored in xx.

Arguments

  • xx: the covariate vector
  • x: the state variable vector
PiecewiseInference.piecewise_ML_indep_TS – Method
piecewise_ML_indep_TS(
    infprob;
    data,
    group_size,
    group_nb,
    tsteps,
    save_pred,
    save_losses,
    kwargs...
)

Similar to inference but for independent time series: data is a vector containing the independent arrays corresponding to the time series, and tsteps is a vector where each entry contains the time steps of the corresponding time series.
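A minimal usage sketch with two hypothetical independent time series, reusing infprob, ode_data_wnoise and tsteps from the Examples section of inference; the data splits below are illustrative:

```julia
# two independent time series with their own time steps
data_ts = [ode_data_wnoise[:, 1:100], ode_data_wnoise[:, 101:end]]
tsteps_ts = [tsteps[1:100], tsteps[101:end]]

res = piecewise_ML_indep_TS(infprob;
                            data = data_ts,
                            tsteps = tsteps_ts,
                            group_size = 10,
                            optimizers = [ADAM(0.001)],
                            epochs = [1000])
```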

PiecewiseInference.piecewise_loss – Method
piecewise_loss(
    infprob,
    θ,
    ode_data,
    tsteps,
    ranges,
    idx_rngs
)
piecewise_loss(
    infprob,
    θ,
    ode_data,
    tsteps,
    ranges,
    idx_rngs,
    multi_threading
)

Returns a tuple (loss, pred) based on the segmentation of ode_data into segments, with time steps given by tsteps[ranges[i]]. The initial conditions are assumed to be free parameters for each segment.

Arguments:

  • infprob: the inference problem
  • θ: [u0, p] where p corresponds to the parameters of the ODE function in the optimization space.
  • ode_data: original data to be modelled.
  • tsteps: time steps on which ode_data was calculated.
  • ranges: vector containing the range of each segment.
  • idx_rngs: vector containing the indices of the segments to be included in the loss. Useful for minibatching.
PiecewiseInference.pretty_print – Function
pretty_print(p_1)
pretty_print(p_1, p_2)

Prints the NamedTuple or Dict p_1 in a nice format. If p_2 is provided, its values are also displayed for comparison.