class: center, middle, inverse, title-slide

# Probabilistic differential equation solvers as a remedy for discretization-induced bias

### Renny Doig
Supervisor: Dr. Liangliang Wang
Department of Statistics and Actuarial Science, Simon Fraser University

---
class: inverse, center, middle

# 1. The problem: Likelihoods in chaotic systems

---
# Data in dynamical systems

A **dynamical system** is a real-valued process `\(x(t)\)` that evolves continuously through time

- Defined by a system of differential equations, e.g.:
`$$x'(t) = f(x(t),t;\theta_{ode},x_0)$$`

<br>

Observations on such a system typically take the form of:

- `\(n\)` observations at times `\(t_1,\ldots,t_n\)`
- The `\(i\)`-th observation is assumed to be drawn with some observation noise: `\(Y_i\sim g(x(t_i;\theta_{ode}); \theta_{obs})\)`
- So given the initial values of the system and the parameter values, our likelihood is `\(\mathcal L=\prod_ig(y_i; x(t_i))\)`
- Often we cannot compute `\(x(t_i)\)` directly, so we rely on a discretization `\(\hat x_i\)`
- So in practice, we actually evaluate `\(\hat{\mathcal L}=\prod_ig(y_i;\hat x_i)\)`

---
# Chaotic systems

**Chaotic systems** are dynamical systems that exhibit erratic long-term behaviour

- Small shifts in the present state can drastically alter future values
- As a result, small discretization errors now produce large errors later

<br>

To counteract this, adaptive step-size methods are used

- These methods set a tolerance, `\(\epsilon\)`, which limits the permissible local truncation error (LTE) at each time step
- The LTE is the error resulting from a single step of a discretization algorithm
- Adaptive methods adjust the step size as necessary to keep the LTE below `\(\epsilon\)`

---
# The issue: dependence on algorithm parameters

When computing the likelihood of observations made on a chaotic system, we find that the value of the likelihood depends heavily on the `\(\epsilon\)` used to compute `\(\hat x(t)\)`

.pull-left[
To demonstrate this, we look at the **Lorenz system**:
`\begin{aligned} \frac{\textrm dx}{\textrm dt}&=\sigma(y-x)\\ \frac{\textrm dy}{\textrm dt}&=x(\rho-z)-y\\ \frac{\textrm dz}{\textrm dt}&=xy-\beta z \end{aligned}`

<br>

- `\(\sigma=10\)`, `\(\rho=28\)`, `\(\beta=8/3\)`
- `\(x_0=[0,1,0]^T\)`
]

.pull-right[
![](lorenz.gif)
]

---
# The issue: dependence on algorithm parameters

Here we generate `\(y_{1:n}\)` by computing `\(\hat x(t)\)` using `\(\epsilon^*=10^{-6}\)`

- So in effect, `\(\hat x(t;\epsilon^*)\)` is our "true solution"
- To demonstrate the issue, we compute `\(\hat{\mathcal L}\)` using several values of `\(\epsilon\)`
- Below, we can see that when `\(\epsilon\ne\epsilon^*\)`, `\(\hat{\mathcal L}\)` is heavily biased

.center[
![](presentation_files/figure-html/unnamed-chunk-1-1.png)<!-- -->
]

---
class: inverse, center, middle

# 2. The solution: SMC with probabilistic solvers

---
# Probabilistic ODE solvers

A **probabilistic ODE solver** is a variation of traditional discretization methods that adds random noise to each step of the algorithm

- Here we use a probabilistic solver of the form:
`$$\hat x(t+\Delta)=\hat x(t)+\Psi(\hat x(t),t,\Delta)+\mathcal N(0,e_t)$$`
- `\(\Psi(\cdot)\)` is an approximation of `\(x(t+\Delta)-\hat x(t)\)`, dependent on the solver
- `\(e_t\)` is an estimate of the LTE of `\(\hat x(t+\Delta)\)`

<br>

We use the Runge-Kutta-Fehlberg (RKF) method

- It simultaneously provides a `\(\mathcal O(\Delta^4)\)` estimate of `\(\hat x(t+\Delta)\)` and a `\(\mathcal O(\Delta^5)\)` estimate of `\(e_t\)`

<br>

This lets us propagate uncertainty from our discretization into our solution path

---
# Reformulating our problem

By replacing a deterministic RKF solver with a probabilistic RKF solver, we can restate our model as:
`\begin{aligned} \hat X_i|\hat X_{i-1} &\sim h(\hat X_i|\hat X_{i-1})\\ Y_i|\hat X_i &\sim g(Y_i | \hat X_i) \end{aligned}`

- The density `\(h(\cdot)\)` corresponds to the distribution of `\(\hat X_i|\hat X_{i-1}\)` after (potentially) several steps of the probabilistic RKF algorithm
- From this we can jointly express observation and discretization uncertainty: `\(\prod_ig(y_i|x_i)h(x_i|x_{i-1})\)`

<br>
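As a concrete sketch of a single probabilistic solver step (illustrative only: a simpler Heun-Euler embedded pair stands in for RKF45, the LTE estimate is used directly as the Gaussian noise scale, and all names are hypothetical):

```python
# Sketch of one step of a probabilistic solver (assumptions: a
# Heun-Euler embedded pair stands in for RKF45, and the LTE estimate
# e_t is used directly as the noise scale of the Gaussian perturbation).
import numpy as np

def lorenz(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side f(x) of the Lorenz system from the earlier slide."""
    return np.array([
        sigma * (x[1] - x[0]),
        x[0] * (rho - x[2]) - x[1],
        x[0] * x[1] - beta * x[2],
    ])

def probabilistic_step(f, x, dt, rng):
    """One step x(t) -> x(t+dt): deterministic increment Psi plus N(0, e_t) noise."""
    k1 = f(x)
    k2 = f(x + dt * k1)
    psi = dt * (k1 + k2) / 2.0        # 2nd-order (Heun) increment
    e_t = np.abs(psi - dt * k1)       # LTE estimate vs. the embedded 1st-order step
    return x + psi + rng.normal(0.0, e_t)

rng = np.random.default_rng(0)
x = np.array([0.0, 1.0, 0.0])         # x_0 = [0, 1, 0]^T
for _ in range(100):                  # 100 probabilistic steps of size 0.001
    x = probabilistic_step(lorenz, x, 1e-3, rng)
```

Several such steps between observation times compose into one draw from the transition density `\(h(\cdot)\)`.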
Therefore, we can obtain a new estimator of the likelihood by marginalising out the discretization r.v.'s
`$$\hat{\mathcal L}(y_{1:n})=\int_{\mathcal X^n}\prod_{i=1}^n h(x_i|x_{i-1})g(y_i|x_i)\,\mathrm dx_{1:n}$$`

By expressing the likelihood estimation in this way, we can use sequential Monte Carlo to perform this integration

---
# Sequential Monte Carlo

**Sequential Monte Carlo** is a sequential procedure based on importance sampling

- Starting from some initial value, `\(X_0\)`, we sequentially update a collection of `\(K\)` particles `\(\{\hat X_i^{(k)}\}\)`

1. For each particle, propose an `\(\hat X_i\)` from our probabilistic solver, given `\(\hat X_{i-1}\)`
1. Update its unnormalized importance weight `\(w^{(k)}_i=W_{i-1}^{(k)}\cdot g(y_i|\hat X_i)\)` and normalize to get `\(W_i^{(k)}\)`

- From our importance weights we can estimate `\(\hat{\mathcal L}(y_{1:i})\)`
`$$\hat{\mathcal L}(y_{1:i}) = \prod_{j=1}^i\sum_{k=1}^K w_j^{(k)}$$`

---
class: inverse, center, middle

# 3. The result: Improved likelihood estimates

---
# Simulation set-up

Again we generate observations from the Lorenz system under the same parameter settings

- 50 equally-spaced observations with `\(\mathcal N(0,0.1)\)` noise

<br>

**Simulation study \#1:** Compare log-likelihood estimates at each observation time for deterministic and probabilistic solvers, and for SMC with a probabilistic solver

- All use `\(\epsilon=10^{-3}\)`
- Demonstrates the improvement in point estimates and performance through time

<br>

**Simulation study \#2:** Compare SMC to the deterministic solver in terms of dependence on `\(\epsilon\)`

- Estimate using `\(\log_{10}\epsilon=-8,-7,\ldots,-3\)`
- To reflect the randomness of the SMC algorithm, SMC is run 10 times

---
# Simulation study \#1
---
# Simulation study \#2
---
# Key takeaways

Three key points to highlight from this presentation:

1. When system dynamics are chaotic, we cannot assume that a discrete approximation is "good enough"
    - Bias in our likelihood estimate can be non-negligible and may depend on algorithm parameters
1. We can use probabilistic solvers to account for uncertainty due to discretization error
1. Sequential Monte Carlo can be used with these solvers to produce likelihood estimates that are more robust to `\(\epsilon\)` and reflect our uncertainty about the result

---
class: center, middle

# Thank you for your attention!

# Questions?
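
---
# Backup: SMC likelihood sketch

A minimal sketch of the SMC likelihood estimator as a bootstrap particle filter (illustrative only: `propagate` stands in for the probabilistic solver's transition `\(h(\cdot)\)`, multinomial resampling is done at every step, and all names are hypothetical):

```python
# Sketch of the SMC likelihood estimator (bootstrap particle filter).
# Assumptions: `propagate` plays the role of the probabilistic solver's
# transition h, and particles are resampled to equal weights each step.
import numpy as np

def smc_loglik(y, propagate, g_logpdf, x0, K=200, rng=None):
    """Estimate log L(y_{1:n}) = sum_i log( (1/K) sum_k g(y_i | x_i^(k)) )."""
    rng = rng if rng is not None else np.random.default_rng()
    particles = np.tile(x0, (K, 1))
    loglik = 0.0
    for yi in y:
        # 1. propose each particle from the (probabilistic) solver
        particles = np.array([propagate(p, rng) for p in particles])
        # 2. weight by the observation density g(y_i | x_i)
        logw = np.array([g_logpdf(yi, p) for p in particles])
        m = logw.max()
        w = np.exp(logw - m)                 # stabilised weights
        loglik += m + np.log(w.mean())       # incremental likelihood factor
        # 3. resample to equal weights
        idx = rng.choice(K, size=K, p=w / w.sum())
        particles = particles[idx]
    return loglik

# Toy usage: 1-D random-walk "solver" with N(0, 0.1) observation noise
propagate = lambda x, rng: x + rng.normal(0.0, 0.1, size=x.shape)
g_logpdf = lambda yi, p: -0.5 * ((yi - p[0]) ** 2 / 0.1 + np.log(2 * np.pi * 0.1))
ll = smc_loglik(np.zeros(5), propagate, g_logpdf, np.array([0.0]),
                K=100, rng=np.random.default_rng(1))
```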