diff --git a/_quarto.yml b/_quarto.yml
index db398fc..4cef2c0 100644
--- a/_quarto.yml
+++ b/_quarto.yml
@@ -21,6 +21,10 @@ book:
   - part: "Monty"
     chapters:
       - monty.qmd
+      - model.qmd
+      - montyDSL.qmd
+      - composability.qmd
+      - samplers.qmd
   - part: "Odin & Monty"
     chapters:
      - inference.qmd
diff --git a/composability.qmd b/composability.qmd
new file mode 100644
index 0000000..5b4977f
--- /dev/null
+++ b/composability.qmd
@@ -0,0 +1 @@
+# Composability
\ No newline at end of file
diff --git a/model.qmd b/model.qmd
new file mode 100644
index 0000000..38795a1
--- /dev/null
+++ b/model.qmd
@@ -0,0 +1,7 @@
+# About (monty) models
+
+We often refer to using models.
+
+## SimCity vs Statistical models
+
+## Normalised vs unnormalised densities
\ No newline at end of file
diff --git a/monty.qmd b/monty.qmd
index 8ddca01..fdf6fc1 100644
--- a/monty.qmd
+++ b/monty.qmd
@@ -1,4 +1,40 @@
-# Monty
+# Getting started with monty
+
+`monty` is an R package designed to work with statistical models in a *modular* way. It uses Monte Carlo methods to facilitate drawing from these statistical models, and currently supports building complex models through four different routes:
+
+* simple R code
+* a dedicated DSL
+* odin models
+* composing two monty models.
+
+The simple example in this chapter is built from a simple R function; this is the most basic way of defining a monty model. We will cover more complex examples in the model section. Section ** deals with the monty DSL, while section ** explains how to obtain new monty models by composing two monty models. monty models obtained from odin models are dealt with in the odin & monty chapter.
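The first of these routes, plain R code, amounts to writing down a (log-)density function over named parameters. As a base-R sketch (no monty involved yet; the function and argument names here are purely illustrative), the kind of function we will wrap in the next section looks like this:

```{r}
# Log-density of a two-component Gaussian mixture, written in plain R.
# All names here (mixture_log_density, l, p, m1, m2) are illustrative only.
mixture_log_density <- function(l, p, m1, m2) {
  log(p * dnorm(l, mean = m1) + (1 - p) * dnorm(l, mean = m2))
}

# Evaluate it at a point; a monty model built from such a function exposes
# essentially this computation as its density.
mixture_log_density(6, p = 0.75, m1 = 5, m2 = 7)
```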
+
+## A simple example
+
+```{r}
+library(monty)
+```
+
+We can define a simple Gaussian mixture model of two sub-populations using `monty_model_function()` from the package.
+
+For this we start by defining the density function:
+
+```{r}
+fn <- function(l, p, m1, m2) {
+  # note: monty works with log densities, so we return the log here
+  log(p * dnorm(l, mean = m1) + (1 - p) * dnorm(l, mean = m2))
+}
+```
+
+```{r}
+l_distribution <- monty_model_function(fn, fixed = list(p = 0.75, m1 = 5, m2 = 7))
+```
+
+## Sampling from our example distribution
+
+## Further work
+
 Before showing how to fit odin models to data, we'll focus for a bit on monty itself. If you are anxious to get on and fit the model from @sec-data-filter, you might skip ahead to @sec-inference, where we resume this.
diff --git a/montyDSL.qmd b/montyDSL.qmd
new file mode 100644
index 0000000..6989e89
--- /dev/null
+++ b/montyDSL.qmd
@@ -0,0 +1,109 @@
+# The monty DSL
+
+The `monty` DSL provides a more intuitive way to define statistical models with `monty`. It is currently relatively basic and focuses on providing support for defining priors in Bayesian models. It fully supports differentiation, allowing gradient-based samplers to be used on these models.
+
+`monty` includes a simple probabilistic domain-specific language (DSL) that is inspired by `stan` and [Statistical Rethinking](https://xcelab.net/rm/). It is designed to make some tasks a bit easier, particularly when defining priors for your model. We expect that this DSL is not sufficiently advanced to represent most interesting models, but it may get more clever and flexible in the future. In particular, we do not expect the DSL to be useful in writing likelihood functions for comparison to data; we expect that if your model is simple enough for this, you would be better off using `stan` or some similarly flexible system.
+
+```{r}
+library(monty)
+```
+
+## Some simple examples
+
+In chapter 4 of Statistical Rethinking, we build a regression model of height with parameters $\alpha$, $\beta$ and $\sigma$.
+We can define the prior distribution of this model in monty by running:
+
+```{r}
+prior <- monty_dsl({
+  alpha ~ Normal(178, 20)
+  beta ~ Normal(0, 10)
+  sigma ~ Uniform(0, 50)
+})
+```
+
+This will define a new `monty_model()` object that represents the prior, with all the bits that we might need depending on how we want to use it.
+
+We have model parameters:
+
+```{r}
+prior$parameters
+```
+
+These are defined in the order that they appear in your definition (so `alpha` is first and `sigma` is last).
+
+We can compute the domain for your model:
+
+```{r}
+prior$domain
+```
+
+We can draw samples from the model if we provide a `monty_rng` object:
+
+```{r}
+rng <- monty_rng$new()
+theta <- prior$direct_sample(rng)
+theta
+```
+
+We can compute the (log) density at a point in parameter space:
+
+```{r}
+prior$density(theta)
+```
+
+The computed properties for the model are:
+
+```{r}
+prior$properties
+```
+
+## Calculations in the DSL
+
+Sometimes it will be useful to perform calculations in the code; you can do this with assignments. Most trivially, giving names to numbers may help make code more understandable:
+
+```{r}
+m <- monty_dsl({
+  mu <- 10
+  sd <- 2
+  a ~ Normal(mu, sd)
+})
+```
+
+You can also use this to do things like:
+
+```{r}
+m <- monty_dsl({
+  a ~ Normal(0, 1)
+  b ~ Normal(0, 1)
+  mu <- (a + b) / 2
+  c ~ Normal(mu, 1)
+})
+```
+
+Here `c` is drawn from a normal distribution with a mean that is the average of `a` and `b`.
+
+## Pass in fixed data
+
+You can also pass in a list of data with values that should be available in the DSL code.
+For example, our first example:
+
+```{r}
+prior <- monty_dsl({
+  alpha ~ Normal(178, 20)
+  beta ~ Normal(0, 10)
+  sigma ~ Uniform(0, 50)
+})
+```
+
+might be written as:
+
+```{r}
+fixed <- list(alpha_mean = 178, alpha_sd = 20,
+              beta_mean = 0, beta_sd = 10,
+              sigma_max = 50)
+prior <- monty_dsl({
+  alpha ~ Normal(alpha_mean, alpha_sd)
+  beta ~ Normal(beta_mean, beta_sd)
+  sigma ~ Uniform(0, sigma_max)
+}, fixed = fixed)
+```
+
+Values you pass in this way are **fixed** (hence the name!) and cannot be modified after the model object is created.
\ No newline at end of file
diff --git a/samplers.qmd b/samplers.qmd
new file mode 100644
index 0000000..71c6c4f
--- /dev/null
+++ b/samplers.qmd
@@ -0,0 +1,96 @@
+# Samplers
+
+This chapter describes the samplers available in monty.
+
+```{r}
+library(monty)
+```
+
+## The bendy banana
+
+This example shows HMC outperforming a random walk on a two-dimensional banana-shaped function. Our model takes two parameters `alpha` and `beta`, and is based on two successive simple draws, one conditional on the other, so $\beta \sim Normal(0, 1)$ and $\alpha \sim Normal(\beta^2, \sigma)$, with $\sigma$ the standard deviation of the conditional draw.
+
+We include this example within the package; here we create a model with $\sigma = 0.5$:
+
+```{r}
+m <- monty_example("banana", sigma = 0.5)
+m
+```
+
+We can plot a visualisation of its density by computing the density over a grid.
+Normally this is not possible, of course:
+
+```{r}
+a <- seq(-2, 6, length.out = 1000)
+b <- seq(-2, 2, length.out = 1000)
+z <- outer(a, b, function(alpha, beta) {
+  exp(monty_model_density(m, rbind(alpha, beta)))
+})
+image(a, b, z, xlab = "alpha", ylab = "beta")
+```
+
+In this particular case we can also easily generate samples, so we know what a good sampler will produce:
+
+```{r}
+rng <- monty_rng$new()
+s <- vapply(seq_len(200), function(x) m$direct_sample(rng), numeric(2))
+image(a, b, z, xlab = "alpha", ylab = "beta")
+points(s[1, ], s[2, ], pch = 19, col = "#00000055")
+```
+
+It is also possible to compute the 95% contour of the distribution using the relationship between the standard bivariate normal distribution and the banana-shaped distribution as defined above. We can check that roughly 10 samples (out of 200) fall outside this 95% contour.
+
+```{r}
+theta <- seq(0, 2 * pi, length.out = 10000)
+z95 <- local({
+  sigma <- 0.5
+  r <- sqrt(qchisq(.95, df = 2))
+  x <- r * cos(theta)
+  y <- r * sin(theta)
+  cbind(x^2 + y * sigma, x)
+})
+image(a, b, z, xlab = "alpha", ylab = "beta")
+lines(z95[, 1], z95[, 2])
+points(s[1, ], s[2, ], pch = 19, col = "#00000055")
+```
+
+## Sampling with other samplers
+
+It is not generally possible to directly sample from a density (otherwise MCMC and similar methods would not exist!). In these cases we need a sampler based on the density and, if available, the gradient of the density.
+
+We can start with a basic random-walk sampler:
+
+```{r RW_sampling}
+sampler_rw <- monty_sampler_random_walk(vcv = diag(2) * 0.01)
+res_rw <- monty_sample(m, sampler_rw, 1000)
+plot(t(drop(res_rw$pars)), type = "l", col = "#0000ff66",
+     xlim = range(a), ylim = range(b))
+lines(z95[, 1], z95[, 2])
+```
+
+As we can see, this is not great: it exhibits strong random-walk behaviour as it slowly explores the surface (this is over 1,000 steps).
+Another way to view this is to plot the parameters over the steps of the chain:
+
+```{r}
+matplot(t(drop(res_rw$pars)), lty = 1, type = "l", col = c(2, 4),
+        xlab = "Step", ylab = "Value")
+```
+
+We can probably improve the samples here by finding a better variance-covariance matrix (VCV), but a single VCV will not hold well over the whole surface because the distribution is not very similar to a multivariate normal (that is, the appropriate VCV will change depending on the position in parameter space).
+
+Let's try the Hamiltonian Monte Carlo (HMC) sampler, which uses the gradient to move efficiently in parameter space:
+
+```{r HMC_sampling}
+sampler_hmc <- monty_sampler_hmc(epsilon = 0.1, n_integration_steps = 10)
+res_hmc <- monty_sample(m, sampler_hmc, 1000)
+plot(t(drop(res_hmc$pars)), type = "l", col = "#0000ff33",
+     xlim = range(a), ylim = range(b))
+lines(z95[, 1], z95[, 2])
+```
+
+or viewed over steps:
+
+```{r}
+matplot(t(drop(res_hmc$pars)), lty = 1, type = "l", col = c(2, 4),
+        xlab = "Step", ylab = "Value")
+```
+
+Clearly better!
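
To demystify what the random-walk sampler is doing under the hood, here is a minimal hand-rolled Metropolis sketch in base R. This is an illustration, not monty's actual implementation; the banana log-density is re-derived from the model definition at the top of the chapter, and the function names (`banana_log_density`, `rw_metropolis`) are made up for this example:

```{r}
# Banana log-density, re-derived from the definition above:
# beta ~ Normal(0, 1), alpha ~ Normal(beta^2, sigma)
banana_log_density <- function(theta, sigma = 0.5) {
  dnorm(theta[2], log = TRUE) +
    dnorm(theta[1], mean = theta[2]^2, sd = sigma, log = TRUE)
}

# Minimal random-walk Metropolis: propose a Gaussian step and accept it
# with probability min(1, exp(difference in log-density)).
rw_metropolis <- function(log_density, theta0, n_steps, step_sd = 0.1) {
  out <- matrix(NA_real_, nrow = length(theta0), ncol = n_steps)
  theta <- theta0
  lp <- log_density(theta)
  for (i in seq_len(n_steps)) {
    proposal <- theta + rnorm(length(theta), sd = step_sd)
    lp_new <- log_density(proposal)
    if (log(runif(1)) < lp_new - lp) {
      theta <- proposal
      lp <- lp_new
    }
    out[, i] <- theta  # store the current position (moved or not)
  }
  out
}

set.seed(1)
samples <- rw_metropolis(banana_log_density, c(0, 0), 1000)
dim(samples)  # 2 parameters by 1000 steps, the same shape as res_rw$pars
```

The small `step_sd` plays the same role as the VCV above, and the slow exploration you saw in the plots comes directly from this accept/reject dynamic.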