Parameter Inference in dynamical systems

One of the challenges modellers face in the biological sciences is to calibrate models so that they match observations as closely as possible and gain predictive power. This can be done via direct measurements through experimental design, but this process is often costly, time-consuming and sometimes even impossible. Scientific machine learning addresses this problem by applying optimisation techniques originally developed within the field of machine learning to mechanistic models, allowing parameters to be inferred directly from observation data. In this blog post, I shall explain the basics of this approach, and how the Julia ecosystem has efficiently embedded such techniques into ready-to-use packages. This promises exciting perspectives for modellers in all areas of environmental sciences.
🚧 This is work in progress 🚧
Dynamical systems are models that allow us to reproduce, understand and forecast the behaviour of a system. They connect the time variation of the state of the system to the fundamental processes we believe drive it, that is
$$\text{time variation of } \mathbf{u}_t = \sum \text{processes acting on } \mathbf{u}_t$$
where $\mathbf{u}_t$ denotes the state of the system at time $t$. This translates mathematically into
$$\partial_t \mathbf{u}_t = f_\theta(\mathbf{u}_t)$$
where the function $f_\theta$ captures the ensemble of processes considered and depends on the parameters $\theta$.
Eq. (1) is a differential equation that can be integrated with respect to time to obtain the state of the system at time $t$, given an initial state $\mathbf{u}_{t_0}$:
$$\mathbf{u}_t = \mathbf{u}_{t_0} + \int_{t_0}^{t} f_\theta(\mathbf{u}_s)\, ds$$
Dynamical systems have been used for hundreds of years and have successfully captured e.g. the motion of planets (Kepler's second law), the voltage in an electrical circuit, population dynamics (the Lotka-Volterra equations) and morphogenesis (Turing patterns)…
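To make this concrete, here is a minimal sketch of how such a system can be simulated in Julia with DifferentialEquations.jl, using the Lotka-Volterra predator-prey equations as an example. The parameter values and time span below are purely illustrative.

```julia
# Minimal sketch: simulating the Lotka-Volterra predator-prey dynamics with
# DifferentialEquations.jl. Parameter values and time span are purely illustrative.
using DifferentialEquations

function lotka_volterra!(du, u, p, t)
    α, β, δ, γ = p                        # prey growth, predation, reproduction, predator mortality
    du[1] = α * u[1] - β * u[1] * u[2]    # prey
    du[2] = δ * u[1] * u[2] - γ * u[2]    # predator
end

u0 = [1.0, 1.0]                           # initial state u_{t0}
tspan = (0.0, 10.0)
θ = [1.5, 1.0, 3.0, 1.0]                  # the parameters of f_θ
prob = ODEProblem(lotka_volterra!, u0, tspan, θ)
sol = solve(prob, Tsit5(), saveat = 0.1)  # u_t over the whole time span
```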
Such models can be used to forecast the state of the system in the future, or can be used as virtual laboratories. In both cases, one of the requirements is that they reproduce observed patterns, at least at a qualitative level. To do so, the modeller needs to find the true parameter combination $\theta$ that corresponds to the system under consideration. And this is tricky! In this post we address this challenge.
Model calibration
How to determine $\theta$ so that simulations $\approx$ empirical data?
The best way to do that is to design an experiment!
When possible, directly measuring the parameters in a controlled experiment with e.g. physical devices is a great approach. This is a very powerful scientific method, used e.g. in global circulation models, where scientists can measure the water viscosity, the change in water density with respect to temperature, etc. Unfortunately, such direct methods are often not possible for other systems.
An opposite approach, known as inverse modelling, is to infer the parameters indirectly from the empirical data available.
Parameter exploration
One way to find the right parameters is to perform parameter exploration, that is, slicing the parameter space and running the model for all the parameter combinations chosen. Comparing the simulation results to the empirical data available, one can then elect the combination with the highest explanatory power.
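As a toy illustration, a brute-force exploration could look like the following sketch, where the logistic growth model, the parameter ranges and the values are all made up for the sake of the example.

```julia
# Toy parameter exploration: brute-force grid search over the two parameters
# of a logistic growth model. Model, ranges and values are made up for illustration.
using DifferentialEquations

logistic!(du, u, p, t) = (du[1] = p[1] * u[1] * (1 - u[1] / p[2]))  # p = (growth rate r, capacity K)

prob = ODEProblem(logistic!, [0.1], (0.0, 10.0), [1.0, 2.0])
data = Array(solve(prob, Tsit5(), saveat = 1.0))       # pretend these are the observations

# Slice the parameter space, simulate each combination, and score it against the data
grid = [(r, K) for r in 0.5:0.1:1.5, K in 1.0:0.5:3.0]
scores = [sum(abs2, Array(solve(prob, Tsit5(), p = [r, K], saveat = 1.0)) .- data)
          for (r, K) in grid]
best_θ = grid[argmin(scores)]                          # combination with the highest explanatory power
```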
But as the parameter space becomes larger (a higher number of parameters), this quickly becomes intractable: the number of combinations to test grows exponentially with the number of parameters. This problem is often referred to as the curse of dimensionality. It feels very much like being lost in a giant maze. We need a cleverer technique to get out!
A Machine Learning problem
In machine learning, people try to predict a variable $y$ from predictors $x$ by finding suitable parameters $\theta$ of a parametric function $F_\theta$ so that
$$y = F_\theta(x)$$
For example, in computer vision, this function might be designed for the specific task of labelling images, for instance
$$F_\theta(\text{image}) = \text{label}$$

Usually people use neural networks, so that $F_\theta \equiv NN_\theta$, as they are good approximators for high-dimensional functions (see the Universal approximation theorem). One should really see neural networks as functions! For example, feed-forward neural networks are mathematically described by a series of matrix multiplications and nonlinear operations, i.e. $NN_\theta(x) = \sigma_1 \circ f_1 \circ \dots \circ \sigma_n \circ f_n(x)$, where $\sigma_i$ is an activation function and $f_i$ is a linear function $f_i(x) = A_i x + b_i$.
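To make this point concrete, here is a minimal sketch of a feed-forward neural network built with Flux.jl. The layer sizes are arbitrary, and the constructor syntax may vary slightly between Flux versions.

```julia
# A feed-forward neural network is just a parametric function x -> NN_θ(x).
# Minimal sketch with Flux.jl; the layer sizes here are arbitrary.
using Flux

NN = Chain(
    Dense(2 => 16, tanh),   # f_1 followed by the activation σ_1 = tanh
    Dense(16 => 1),         # f_2 (no nonlinearity on the output)
)

x = rand(Float32, 2)
NN(x)                       # evaluate the function at x, returns a length-1 vector
```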
Now notice that a dynamical system can be viewed in exactly the same way: it is a parametric function that maps an initial state to the state of the system at time $t$,
$$\mathbf{u}_t = F_\theta(\mathbf{u}_{t_0})$$
where $F_\theta(\mathbf{u}_{t_0}) \equiv \mathbf{u}_{t_0} + \int_{t_0}^{t} f_\theta(\mathbf{u}_s)\, ds$.
With this perspective in mind, the techniques developed within the field of machine learning to find suitable parameters $\theta$ that best predict $y$ become readily available for our specific need: model calibration!
Parameter inference
The general strategy to find a suitable neural network that can perform the required task is to “train” it, that is, to find the parameters $\theta$ for which its predictions are accurate.
In order to train it, one “scores” how well a combination of parameters $\theta$ performs. A way to do so is to introduce a loss function
$$L(\theta) = \left( F_\theta(x) - y_\text{empirical} \right)^2$$
One can then use an optimisation method to find a local minimum (and, in the best-case scenario, the global minimum) of $L$.
Gradient descent
You ready?
Gradient descent and stochastic gradient descent are “iterative optimisation methods that seek to find a local minimum of a differentiable function” (Wikipedia). Such methods have become widely used with the development of artificial intelligence.
These methods iteratively update $\theta$ using the sensitivity of the loss function to changes in $\theta$, denoted by $\nabla_\theta L(\theta)$:
$$\theta^{(i+1)} = \theta^{(i)} - \lambda \nabla_\theta L(\theta^{(i)})$$
where $\lambda$ is called the learning rate.
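As a toy illustration of this update rule, here is a sketch of plain gradient descent on the made-up quadratic loss $L(\theta) = (\theta - 3)^2$, whose gradient is known analytically.

```julia
# Toy illustration of the gradient descent update rule on the made-up
# quadratic loss L(θ) = (θ - 3)^2, whose gradient is known analytically.
∇L(θ) = 2 * (θ - 3)

function gradient_descent(θ; λ = 0.1, niter = 100)
    for _ in 1:niter
        θ -= λ * ∇L(θ)      # θ^(i+1) = θ^(i) - λ ∇_θ L(θ^(i))
    end
    return θ
end

gradient_descent(0.0)       # ≈ 3.0, the minimiser of L
```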
In practice
The sensitivity with respect to the parameters, $\nabla_\theta L(\theta)$, is in practice obtained by differentiating the code (automatic differentiation).
For some programming languages this can be done automatically, with low computational cost. In particular, Flux.jl makes it possible to efficiently obtain the gradient of any function written in the wonderful Julia language.
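For instance, here is a tiny sketch of what this looks like; the loss function below is made up for illustration.

```julia
# Automatic differentiation with Flux.jl: the gradient of (almost) any Julia
# function can be obtained without writing derivatives by hand.
# The loss function below is made up for illustration.
using Flux

L(θ) = sum(abs2, θ .- [1.0, 2.0])        # a toy loss with minimum at θ = [1, 2]
∇θ = Flux.gradient(L, [0.0, 0.0])[1]     # ≈ [-2.0, -4.0]
```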
The library DiffEqFlux.jl, based on Flux.jl, implements differentiation rules (custom adjoints) to obtain even more efficiently the sensitivity of a loss function that depends on the numerical solution of a differential equation. That is, DiffEqFlux.jl is precisely what we need for parameter inference in dynamical systems. Go and check it out!
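To close, here is a sketch of what parameter inference for the Lotka-Volterra system could look like, in the style of the classical DiffEqFlux.jl tutorials. Note that the exact API (e.g. `sciml_train`) may differ between package versions, so treat this as an outline rather than copy-paste material.

```julia
# Sketch of parameter inference for the Lotka-Volterra system, in the style of
# the classical DiffEqFlux.jl tutorials. The exact API (e.g. `sciml_train`)
# may differ between package versions.
using DifferentialEquations, DiffEqFlux, Flux

function lotka_volterra!(du, u, p, t)
    α, β, δ, γ = p
    du[1] = α * u[1] - β * u[1] * u[2]
    du[2] = δ * u[1] * u[2] - γ * u[2]
end

u0 = [1.0, 1.0]
tspan = (0.0, 10.0)
θ_true = [1.5, 1.0, 3.0, 1.0]
prob = ODEProblem(lotka_volterra!, u0, tspan, θ_true)
data = Array(solve(prob, Tsit5(), saveat = 0.1))        # synthetic "empirical" data

# Loss: squared distance between the simulation with candidate θ and the data
loss(θ) = sum(abs2, Array(solve(prob, Tsit5(), p = θ, saveat = 0.1)) .- data)

θ0 = [1.2, 0.8, 2.5, 0.8]                               # initial guess
res = DiffEqFlux.sciml_train(loss, θ0, ADAM(0.05), maxiters = 300)
res.minimizer                                            # should approach θ_true
```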