From 6b7a06b66c7162ffa5be87583a3aec67be3fbfd4 Mon Sep 17 00:00:00 2001 From: edknock Date: Wed, 22 Jan 2025 17:30:40 +0000 Subject: [PATCH 1/7] Start of order of events --- _quarto-book.yml | 1 + interpolation.qmd | 9 +-- order.qmd | 169 ++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 172 insertions(+), 7 deletions(-) create mode 100644 order.qmd diff --git a/_quarto-book.yml b/_quarto-book.yml index d6c662f..3b267eb 100644 --- a/_quarto-book.yml +++ b/_quarto-book.yml @@ -18,6 +18,7 @@ book: - arrays.qmd - interpolation.qmd - data.qmd + - order.qmd - packaging.qmd - part: "Monty" chapters: diff --git a/interpolation.qmd b/interpolation.qmd index a11dc82..827b9fa 100644 --- a/interpolation.qmd +++ b/interpolation.qmd @@ -12,7 +12,7 @@ library(odin2) library(dust2) ``` -## Using `time` within equations +## Using `time` within equations {#sec-interpolation-time} The simplest way to make a model time-dependent is simply to refer to `time` as a variable. For example, suppose we have a continuous-time logistic growth model (see @sec-time-logistic) @@ -183,7 +183,7 @@ dust_system_state(sys) With this approach we can ensure that even very small periods of discontinuity are found by the solver. -## Using interpolation functions +## Using interpolation functions {#sec-interpolation-functions} Sometimes, you will want the time-varying functions of your model to be driven by data. So rather than having the model driven by seasonal change use some approximation of dynamics as we did with a `sin` wave above, you might want to use actual temperature or rainfall data. To do this, you can use odin's "interpolation" functions, which take some series of time points and data and create a continuous function with respect to time. There are several different modes of interpolation available, which you can use to model different sorts of processes. 
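To make the modes concrete, here is a minimal sketch of what the "constant" and "linear" modes compute at a query time `t` (written in Python purely for illustration; odin's interpolation is built into the generated model, and these helper names are invented here):

```python
import bisect

def interpolate_constant(t, times, values):
    # Piecewise-constant lookup: hold the value from the latest time point <= t
    i = bisect.bisect_right(times, t) - 1
    return values[max(i, 0)]

def interpolate_linear(t, times, values):
    # Piecewise-linear lookup: blend the two values bracketing t
    i = min(max(bisect.bisect_right(times, t) - 1, 0), len(times) - 2)
    t0, t1 = times[i], times[i + 1]
    w = (t - t0) / (t1 - t0)
    return (1 - w) * values[i] + w * values[i + 1]

times = [0, 10, 20]
values = [1.0, 3.0, 2.0]
interpolate_constant(5, times, values)  # 1.0 (still holding the value from t = 0)
interpolate_linear(5, times, values)    # 2.0 (halfway between 1.0 and 3.0)
```

This is only a sketch of the modes' shapes; odin itself is stricter about query times falling outside the interpolation grid.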
@@ -368,8 +368,3 @@ daily_doses <- interpolate(daily_doses_time, daily_doses_value, "constant") does not differ whether are interpolating a scalar quantity or an array quantity. In this example we have interpolated a vector quantity. You can interpolate higher-dimensional arrays. Just always make sure that the last dimension of your "value" array corresponds to your "time" vector. - - -## Additional topics not yet covered - -* order of events in interpolation with discrete time models diff --git a/order.qmd b/order.qmd new file mode 100644 index 0000000..d9661da --- /dev/null +++ b/order.qmd @@ -0,0 +1,169 @@ +# Order of events for discrete-time stochastic models {#sec-order} + +```{r} +#| include: false +source("common.R") +``` + +```{r} +library(odin2) +library(dust2) +``` + +Here we have a discrete-time stochastic model + +```{r} +sir <- odin({ + update(S) <- S - n_SI + update(I) <- I + n_SI - n_IR + update(R) <- R + n_IR + update(incidence) <- incidence + n_SI + + beta <- interpolate(beta_time, beta_value, "linear") + beta_time <- parameter(constant = TRUE) + beta_value <- parameter(constant = TRUE) + dim(beta_time, beta_value) <- parameter(rank = 1) + + p_SI <- 1 - exp(-beta * I / N * dt) + p_IR <- 1 - exp(-gamma * dt) + n_SI <- Binomial(S, p_SI) + n_IR <- Binomial(I, p_IR) + + initial(S) <- N - I0 + initial(I) <- I0 + initial(R) <- 0 + initial(incidence, zero_every = 1) <- 0 + + N <- parameter(1000) + I0 <- parameter(10) + gamma <- parameter(0.1) + + + ## Comparison to data + phi <- interpolate(phi_time, phi_value, "constant") + phi_time <- parameter(constant = TRUE) + phi_value <- parameter(constant = TRUE) + dim(phi_time, phi_value) <- parameter(rank = 1) + + cases <- data() + exp_noise <- Exponential(1e6) + cases ~ Poisson(phi * incidence + exp_noise) + +}) +``` + +this model includes many components we have introduced in preceding sections, such as [variables that reset periodically](time.qmd#sec-reset-variables), [use of interpolation 
functions](interpolation.qmd#sec-interpolation-functions) and [comparison to data](data.qmd).

It is useful to understand the order in which events will be evaluated as the model updates from `time = t0` to `time = t0 + dt`. There will first be a series of updates to the variables, followed by a comparison to data (assuming you are comparing the model to data).

## Updates {#sec-order-update}

At the beginning of the update we have `time = t0` and then the updates proceed in the following order:

### Reset any variables that use `zero_every`

Any variable that uses `zero_every = x` will be reset to 0 if `t0` is a multiple of `x`. In our model we have

```r
initial(incidence, zero_every = 1) <- 0
```

so `incidence` will reset to 0 whenever `t0` is an integer.

### Read from variables

All variables will be read at their current values. In our example these are `S`, `I`, `R` and `incidence`.

### Look up interpolation

Any uses of `interpolate` relating to updates are evaluated using `time = t0`. Thus in our example

```r
beta <- interpolate(beta_time, beta_value, "linear")
```

is evaluated to give `beta` its value at `time = t0`.

### Evaluate assignments

Assignment equations are evaluated in an order consistent with their dependencies (circular dependencies between equations will prevent compilation). For instance, in our model it would naturally evaluate

```r
p_SI <- 1 - exp(-beta * I / N * dt)
p_IR <- 1 - exp(-gamma * dt)
```

and then

```r
n_SI <- Binomial(S, p_SI)
n_IR <- Binomial(I, p_IR)
```

Equations be evaluated using the variables that were read in earlier (note that this is after any variables that use `zero_every` have been reset to 0). As with interpolation, any equations that use `time` (see @sec-interpolation-time) will use `time = t0`.

### Write out new values of state

Here we write out new values of variables as given by `update` equations. 
In our example these are

```r
update(S) <- S - n_SI
update(I) <- I + n_SI - n_IR
update(R) <- R + n_IR
update(incidence) <- incidence + n_SI
```

As with the other assignment equations, these are evaluated using values of the variables that were read in earlier. Thus in the equation

```r
update(S) <- S - n_SI
```

the left-hand side is the value that `S` will take at `time = t0 + dt`, while the `S` on the right-hand side is its value at `time = t0`.


### Update time to t0 + dt

The series of updates ends with `time` being increased by `dt`.


## Comparison to data {#sec-order-comparison}

The comparison to data follows on from the above series of updates, and as such you may find it is good practice to structure your `odin` code such that all equations relating to the comparison to data appear in a block at the end.

Equations relating to comparison to data are evaluated after the update of time to `t0 + dt` and will only be evaluated if you have any data at `time = t0 + dt`.

### Read from variables

All variables will be read at their current values, which are the values corresponding to `time = t0 + dt`. Again, in our example the variables are `S`, `I`, `R` and `incidence`, although only `incidence` is used in the comparison to data.

### Look up interpolation

Any uses of `interpolate` that are used in the comparison to data will be evaluated using `time = t0 + dt` (not using `time = t0` as was used earlier in the "update" step). Compilation of `odin` code will detect which interpolations are used in the comparison to data.

In our example we have
```r
phi <- interpolate(phi_time, phi_value, "constant")
```
where `phi` might represent a time-varying ascertainment rate of cases (which could be affected by e.g. whether schools are open, or day of the week effects).

### Evaluate assignments

As with interpolation, compliation of `odin` code will detect which assignments are used only in the comparison to data. 
For instance in our example this would be

```r
exp_noise <- Exponential(1e6)
```

Any use of variables in these equations on the right-hand side will be evaluated using their values at `time = t0 + dt`.


### Compare to data

Finally we evaluate the density given by any relationship equations. In our example this is

```r
cases ~ Poisson(phi * incidence + exp_noise)
```

As with assigments in the comparison to data step, any use of variables in these equations on the right-hand side will be evaluated using their values at `time = t0 + dt`. From 11c9b2e78076289105e20085667b47709897dcbd Mon Sep 17 00:00:00 2001 From: edknock Date: Thu, 23 Jan 2025 13:32:04 +0000 Subject: [PATCH 2/7] add warning note about lags --- order.qmd | 95 ++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 94 insertions(+), 1 deletion(-) diff --git a/order.qmd b/order.qmd index d9661da..d1ff770 100644 --- a/order.qmd +++ b/order.qmd @@ -100,7 +100,7 @@ n_SI <- Binomial(S, p_SI) n_IR <- Binomial(I, p_IR) ```

-Equations be evaluated using the variables that were read in earlier (note that this is after any variables that use `zero_every` have been reset to 0). As with interpolation, any equations that use `time` (see @sec-interpolation-time) will use `time = t0`.
+Equations will be evaluated using the variables that were read in earlier (note that this is after any variables that use `zero_every` have been reset to 0). As with interpolation, any equations that use `time` (see @sec-interpolation-time) will use `time = t0`.

### Write out new values of state

@@ -121,6 +121,99 @@ update(S) <- S - n_SI ``` the left-hand side is the value that `S` will take at `time = t0 + dt`, while the `S` on the right-hand side is its value at `time = t0`.

+::: {.callout-warning}
+You should be careful to avoid introducing lags between state variables in your `update` equations. 
+ +For example for this simple model + +\begin{gather*} +X(t + dt) = X(t) + Normal(0, 1)\\ +Y(t + dt) = Y(t) + Normal(0, 1)\\ +Z(t) = X(t) + Y(t) +\end{gather*} + +it would be incorrect to code this as + +```{r} +m <- odin({ + update(X) <- X + dX + update(Y) <- Y + dY + update(Z) <- X + Y + + dX <- Normal(0, 1) + dY <- Normal(0, 1) + + initial(X) <- 0 + initial(Y) <- 0 + initial(Z) <- 0 +}) +``` + +as we can see that running this results in `Z` being lagged by a time-step relative to `X` and `Y`: + +```{r} +sys <- dust_system_create(m, list(), seed = 1) +dust_system_set_state_initial(sys) +t <- seq(0, 5) +y <- dust_system_simulate(sys, t) +y <- dust_unpack_state(sys, y) + +y$X + y$Y + +y$Z +``` + +This is because + +```r +update(Z) <- X + Y +``` + +corresponds to $Z(t + dt) = X(t) + Y(t)$ and not $Z(t) = X(t) + Y(t)$. Thus we need to account for the updates to X and Y on the right-hand side. + +A correct way to write it could be + +```r +update(Z) <- X + dX + Y + dY +``` + +however in more complex models such an approach may become unwieldy, and further changes to updates for `X` and `Y` must be replicated in the update to `Z`. + +A safer approach is to introduce intermediate variables representing the updated values of `X` and `Y` + +```{r} +m <- odin({ + update(X) <- X_new + update(Y) <- Y_new + update(Z) <- X_new + Y_new + + dX <- Normal(0, 1) + X_new <- X + dX + dY <- Normal(0, 1) + Y_new <- Y + dY + + initial(X) <- 0 + initial(Y) <- 0 + initial(Z) <- 0 +}) + +``` + +which means that subsequent changes to how `X` updates are automatically factored into how `Z` updates. 
We can see this version no longer results in a lag: + +```{r} +sys <- dust_system_create(m, list(), seed = 1) +dust_system_set_state_initial(sys) +t <- seq(0, 5) +y <- dust_system_simulate(sys, t) +y <- dust_unpack_state(sys, y) + +y$X + y$Y + +y$Z +``` +::: + ### Update time to t0 + dt From b7b08086e420051e07dae96acba6acb8a011b40b Mon Sep 17 00:00:00 2001 From: edknock <47318334+edknock@users.noreply.github.com> Date: Thu, 23 Jan 2025 13:57:38 +0000 Subject: [PATCH 3/7] Deal with Rich's comments --- order.qmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/order.qmd b/order.qmd index d1ff770..ce2dd6e 100644 --- a/order.qmd +++ b/order.qmd @@ -259,4 +259,4 @@ Finally we evaluate the density given by any relationship equations. In our exam cases ~ Poisson(phi * incidence + exp_noise) ``` -As with assigments in the comparison to data step, any use of variables in these equations on the right-hand side will be evaluated using their values at `time = t0 + dt`. +As with assignments in the comparison to data step, any use of variables in relationship equations on the right-hand side will be evaluated using their values at `time = t0 + dt`. No actual assignment is done in relationship equations, but the density accumulates over all non-NA data entries. From a8e7ed87e364e0d5e5a70420552829e0fad9efa4 Mon Sep 17 00:00:00 2001 From: edknock <47318334+edknock@users.noreply.github.com> Date: Thu, 23 Jan 2025 13:58:25 +0000 Subject: [PATCH 4/7] Apply suggestions from code review Co-authored-by: Rich FitzJohn --- order.qmd | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/order.qmd b/order.qmd index ce2dd6e..4aa9fda 100644 --- a/order.qmd +++ b/order.qmd @@ -222,7 +222,7 @@ The series of updates ends with `time` being increased by `dt`. 
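The whole update sequence can be condensed into one sketch. This is not dust2's implementation (the generated code is C++ and the draws are stochastic); it is a deterministic Python illustration in which `update_step` and its arguments are invented for the purpose, and the Binomial draws are replaced by their expectations:

```python
import math

def update_step(state, t0, dt, N, gamma, beta_at):
    # 1. Reset variables declared with zero_every = 1 when t0 is an integer
    if t0 == int(t0):
        state = dict(state, incidence=0.0)
    # 2. Read all variables at their current values (a snapshot)
    S, I, R, incidence = state["S"], state["I"], state["R"], state["incidence"]
    # 3. Interpolation used in updates is looked up at time = t0
    beta = beta_at(t0)
    # 4. Assignments are evaluated in dependency order, seeing only the snapshot
    p_SI = 1 - math.exp(-beta * I / N * dt)
    p_IR = 1 - math.exp(-gamma * dt)
    n_SI = S * p_SI  # expectation of Binomial(S, p_SI)
    n_IR = I * p_IR  # expectation of Binomial(I, p_IR)
    # 5. update() writes the state for t0 + dt; right-hand sides still refer
    #    to the snapshot, never to freshly written values
    new_state = {"S": S - n_SI, "I": I + n_SI - n_IR,
                 "R": R + n_IR, "incidence": incidence + n_SI}
    # 6. Finally, time advances to t0 + dt
    return new_state, t0 + dt

state = {"S": 990.0, "I": 10.0, "R": 0.0, "incidence": 5.0}
state, t = update_step(state, 0.0, 0.25, N=1000, gamma=0.1, beta_at=lambda t: 0.2)
# incidence is reset (t0 = 0 is an integer) before accumulating this step's
# n_SI, so it ends the step far below its previous value of 5
```

The sketch makes the snapshot rule concrete: every variable appearing on a right-hand side refers to the value read in step 2, which is exactly why the lagged-`Z` pitfall in the warning above arises when one `update` expression reads another variable rather than its updated value.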
## Comparison to data {#sec-order-comparison}

-The comparison to data follows on from the above series of updates, and as such you may find it is good practice to structure your `odin` code such that all equations relating to the comparison to data appear in a block at the end.
+The comparison to data follows on from the above series of updates, and as such it is common that all equations relating to the comparison to data appear in a block at the end.

Equations relating to comparison to data are evaluated after the update of time to `t0 + dt` and will only be evaluated if you have any data at `time = t0 + dt`.

@@ -235,9 +235,11 @@ All variables will be read at their current values, which are the values corresp

Any uses of `interpolate` that are used in the comparison to data will be evaluated using `time = t0 + dt` (not using `time = t0` as was used earlier in the "update" step). Compilation of `odin` code will detect which interpolations are used in the comparison to data.

In our example we have
+
```r
phi <- interpolate(phi_time, phi_value, "constant")
```
+
where `phi` might represent a time-varying ascertainment rate of cases (which could be affected by e.g. whether schools are open, or day of the week effects).

### Evaluate assignments

From 5e19e3c2d0d18dd2061948dfdac93c9a800b85e4 Mon Sep 17 00:00:00 2001 From: edknock Date: Thu, 23 Jan 2025 14:02:53 +0000 Subject: [PATCH 5/7] fix typo --- odin-monty.Rproj | 1 + 1 file changed, 1 insertion(+) diff --git a/odin-monty.Rproj b/odin-monty.Rproj index 47accff..a634677 100644 --- a/odin-monty.Rproj +++ b/odin-monty.Rproj @@ -1,4 +1,5 @@ Version: 1.0
+ProjectId: 34b55dc2-f297-40ac-9434-891d6e7e617b

RestoreWorkspace: Default SaveWorkspace: Default From 710bfa3d9222ab117a962d00dfd36e6ecf73a6a0 Mon Sep 17 00:00:00 2001 From: edknock Date: Thu, 23 Jan 2025 16:45:29 +0000 Subject: [PATCH 6/7] Revert "fix typo" This reverts commit 5e19e3c2d0d18dd2061948dfdac93c9a800b85e4. 
--- odin-monty.Rproj | 1 - 1 file changed, 1 deletion(-) diff --git a/odin-monty.Rproj b/odin-monty.Rproj index a634677..47accff 100644 --- a/odin-monty.Rproj +++ b/odin-monty.Rproj @@ -1,5 +1,4 @@ Version: 1.0 -ProjectId: 34b55dc2-f297-40ac-9434-891d6e7e617b RestoreWorkspace: Default SaveWorkspace: Default From c9ceefbfbb7bb50d705007130a639b6563f37508 Mon Sep 17 00:00:00 2001 From: edknock Date: Thu, 23 Jan 2025 16:46:27 +0000 Subject: [PATCH 7/7] fix typo --- order.qmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/order.qmd b/order.qmd index 4aa9fda..60dd318 100644 --- a/order.qmd +++ b/order.qmd @@ -244,7 +244,7 @@ where `phi` might represent a time-varying ascertainment rate of cases (which co ### Evaluate assignments -As with interpolation, compliation of `odin` code will detect which assignments are used only in the comparison to data. For instance in our example this would be +As with interpolation, compilation of `odin` code will detect which assignments are used only in the comparison to data. For instance in our example this would be ```r exp_noise <- Exponential(1e6)