library(bayesSSM)
library(ggplot2)
library(tidyr)
library(extraDistr)
library(expm) # For matrix exponentiation
set.seed(1405)
Stochastic SIR Model
We consider a stochastic SIR (Susceptible–Infectious–Recovered) model for the spread of an infectious disease in a closed population of size N, meaning there are no births, deaths, or migration. The population is divided into three compartments: susceptible (S), infectious (I), and recovered (R). Initially, all individuals are susceptible except for m infectious individuals.
Note that most of the literature on SIR models assumes a deterministic model, where the compartments are treated as continuous and are described by ordinary differential equations (ODEs).
Each infectious individual makes contact with any given member of the population according to a time-homogeneous Poisson process with rate \lambda/N, where \lambda > 0. Thus, each infectious individual makes contact with any other individual at an overall rate of \lambda, independent of the population size.
The duration of the infectious period for each individual is assumed to follow an exponential distribution: I \sim \text{Exp}(\gamma), where \gamma > 0 is the recovery rate.
Let S(t) and I(t) denote the number of susceptible and infectious individuals at time t \geq 0, respectively. Under the assumptions above, the process \{(S(t), I(t)) : t \geq 0\} forms a continuous-time Markov process, since both infection and recovery events are exponentially distributed and are therefore memoryless. Since N is fixed, we have R(t) = N - S(t) - I(t), and thus the process can be described by the pair (S(t), I(t)).
Transition Rates
-
Infection Event:
(s, i, r) \to (s - 1, i + 1, r)
occurs at rate
\frac{\lambda}{N} \, s \, i,
for s > 0 and i > 0.
-
Recovery Event:
(s, i, r) \to (s, i - 1, r + 1)
occurs at rate
\gamma \, i,
for i > 0.
Partial Observations and Noisy Measurements
Suppose we only observe the initial state and the number of infectious individuals I(t) at discrete time points t = 0, 1, \ldots, T. The true number of infectious individuals is unobserved (latent), and instead we observe a noisy measurement that may either overestimate or underestimate the true count.
We model the observed counts Y_t in this article using a Poisson observation model: Y_t \mid I(t) \sim \operatorname{Pois}(I(t)). To account for over-dispersion, one could instead use a negative binomial observation model.
We are now in a State Space Model (SSM) setup, where the latent state is \bigl(S(t), I(t)\bigr), and the observations are Y_t. Note that due to the memoryless property of the exponential distribution, the latent state (S(t), I(t)) is fully determined by the initial state and the sequence of infection and recovery events that have occurred up to time t, thus a Markov process.
Why Standard MCMC Methods Are Infeasible
If one wanted to use standard MCMC methods to estimate the joint posterior of the parameters and latent states, one would need to compute the exact transition probabilities for the SIR model, which quickly becomes infeasible as the number of infectious individuals increases. We illustrate how this can be done using matrix exponentiation for small populations.
The probability distribution of the state at time t is given by: P(t) = e^{Qt}, where Q is the generator matrix of the Markov process, which contains the transition rates between states.
We define functions below to generate the state space for a small SIR system, build the generator matrix, and compute the transition probabilities at a given time t.
# Generate state space for small SIR system
generate_sir_states <- function(N) {
# Returns all (s, i) pairs where s + i <= N
states <- expand.grid(s = 0:N, i = 0:N)
states <- subset(states, s + i <= N)
states <- states[order(states$s, states$i), ]
states
}
# Build generator matrix Q
build_generator_matrix <- function(states, lambda, gamma, N) {
n_states <- nrow(states)
Q <- matrix(0, n_states, n_states)
for (k in 1:n_states) {
s <- states$s[k]
i <- states$i[k]
rate_infection <- (lambda / N) * s * i
rate_recovery <- gamma * i
# Infection transition
if (s > 0 && i > 0) {
next_state <- c(s - 1, i + 1)
idx <- which(states$s == next_state[1] & states$i == next_state[2])
Q[k, idx] <- rate_infection
}
# Recovery transition
if (i > 0) {
next_state <- c(s, i - 1)
idx <- which(states$s == next_state[1] & states$i == next_state[2])
Q[k, idx] <- rate_recovery
}
# Diagonal: exit rate
Q[k, k] <- -(rate_infection + rate_recovery)
}
Q
}
# Compute transition probabilities at time t
compute_transition_probs <- function(lambda, gamma, N, t, init_state) {
states <- generate_sir_states(N)
Q <- build_generator_matrix(states, lambda, gamma, N)
P <- expm(Q * t) # matrix exponential
init_index <- which(states$s == init_state[1] & states$i == init_state[2])
p_t <- P[init_index, ]
data.frame(
s = states$s,
i = states$i,
prob = p_t
)
}
This can then be run to compute the transition probabilities for a given state and time point. For example, if we have a population of size N=5 with state (S(t), I(t)) = (4, 1) at time t, we can compute the transition probabilities for the next time point t+1 as
# Example with very small population
lambda <- 1.0
gamma <- 0.5
N <- 5 # Small population
t <- 1 # Time point
init_state <- c(4, 1) # (s, i)
transition_probs <- compute_transition_probs(
lambda, gamma, N, t, init_state
)
# Show the most probable next states
transition_probs[order(-transition_probs$prob), ][1:5, ]
#> s i prob
#> 19 4 0 0.27979546
#> 20 4 1 0.27253179
#> 18 3 2 0.14375879
#> 17 3 1 0.08872580
#> 15 2 3 0.06343076
Note, that this scales poorly with larger populations, as the number of states is given by N+2 \choose 2, which for N=100 is already over 5000 states. Thus, this whole approach is only feasible for very small populations.
Simulate data
We will simulate data from the SIR model with the following parameters:
At t=0 we have S(0) = 90, I(0) = 10 and R(0) = 0.
Infection rate \lambda=1.5 and removal rate \gamma=0.5.
We observe the initial state at t=0 complete, and then noisy version of infectious individuals at times t=1, \ldots, 10 representing observations for each day.
We can simulate using the fact that we have two independent exponential distribution, so an event occurs at rate of the sum of the rates.
# --- Simulation settings and true parameters ---
n_total <- 500 # Total population size
init_infected <- 70 # Initially infectious individuals
init_state <- c(n_total - init_infected, init_infected) # (s, i) at time 0
t_max <- 10 # Total number of days to simulate
true_lambda <- 0.5 # True infection parameter
true_gamma <- 0.2 # True removal parameter
# --- Functions for simulating the epidemic ---
epidemic_step <- function(state, lambda, gamma, n_total) {
t <- 0
t_end <- 1
s <- state[1]
i <- state[2]
while (t < t_end && i > 0) {
rate_infection <- (lambda / n_total) * s * i
rate_removal <- gamma * i
rate_total <- rate_infection + rate_removal
if (rate_total <= 0) break
dt <- rexp(1, rate_total)
if (t + dt > t_end) break
t <- t + dt
# Decide which event occurs:
if (runif(1) < rate_infection / rate_total) {
# Infection event
s <- s - 1
i <- i + 1
} else {
# Removal event
i <- i - 1
}
}
c(s, i)
}
simulate_epidemic <- function(
n_total, init_infected, lambda, gamma, t_max) {
states <- matrix(0, nrow = t_max, ncol = 2)
# initial state at t = 0
state <- c(n_total - init_infected, init_infected)
for (t in 1:t_max) {
state <- epidemic_step(state, lambda, gamma, n_total)
states[t, ] <- state
}
states
}
Now, we generate some data:
# Simulate an epidemic dataset
true_states <- simulate_epidemic(
n_total, init_infected, true_lambda, true_gamma, t_max
)
latent_i <- true_states[, 2]
observations <- rpois(length(latent_i), lambda = latent_i)
# Display simulated data: time, susceptible, latent infectious, observed counts
print(data.frame(
time = 1:t_max, s = true_states[, 1], i = true_states[, 2], y = observations
))
#> time s i y
#> 1 1 391 88 103
#> 2 2 348 113 106
#> 3 3 307 132 114
#> 4 4 266 147 136
#> 5 5 221 164 155
#> 6 6 183 171 168
#> 7 7 155 159 154
#> 8 8 137 143 137
#> 9 9 118 139 147
#> 10 10 107 121 123
And plot it:
# Function to create a tidy dataset for ggplot
prepare_data_for_plot <- function(states, observations, t_max) {
# Organize the data into a tidy format
data <- data.frame(
time = 1:t_max,
s = states[, 1],
i = states[, 2],
y = observations
)
# Convert to long format for ggplot
data_long <- data %>%
gather(key = "state", value = "count", -time)
data_long
}
# Function to plot the epidemic data
plot_epidemic_data <- function(data_long, t_max) {
ggplot(data_long, aes(x = time, y = count, color = state)) +
geom_line(linewidth = 1.2) +
scale_color_manual(values = c("s" = "blue", "i" = "red", "y" = "green")) +
labs(
x = "Time (Days)", y = "Count",
title = "Susceptible, Infected, and Observed Counts"
) +
theme_minimal() +
theme(legend.title = element_blank()) +
scale_x_continuous(breaks = 1:t_max) +
theme(
axis.title = element_text(size = 12),
plot.title = element_text(size = 14, hjust = 0.5)
)
}
# Prepare data for plotting
data_long <- prepare_data_for_plot(true_states, observations, t_max)
# Plot the results
plot_epidemic_data(data_long, t_max)
Bayesian Inference with PMMH
We are interested in performing Bayesian inference in this setup, and define the priors as follows: \begin{align*} \lambda &\sim \operatorname{Normal^+}(0, 1^2), \\ \gamma &\sim \operatorname{Normal^+}(0, 2^2), \end{align*} where \operatorname{Normal^+} denotes a truncated at 0 normal distribution.
# Define the log-prior for the parameters
log_prior_lambda <- function(lambda) {
extraDistr::dhnorm(lambda, sigma = 1, log = TRUE)
}
log_prior_gamma <- function(gamma) {
extraDistr::dhnorm(gamma, sigma = 2, log = TRUE)
}
log_priors <- list(
lambda = log_prior_lambda,
gamma = log_prior_gamma
)
We now define the initial state, transition and likelihood functions for the SIR model.
init_fn_epidemic <- function(num_particles) {
# Return a matrix with particles rows; each row is the initial state (s, i)
matrix(
rep(init_state, each = num_particles),
nrow = num_particles,
byrow = FALSE
)
}
transition_fn_epidemic <- function(particles, lambda, gamma, t) {
new_particles <- t(apply(particles, 1, function(state) {
s <- state[1]
i <- state[2]
if (i == 0) {
return(c(s, i))
}
epidemic_step(state, lambda, gamma, n_total)
}))
new_particles
}
log_likelihood_fn_epidemic <- function(y, particles) {
# particles is expected to be a matrix with columns (s, i)
dpois(y, lambda = particles[, 2], log = TRUE)
}
Now we can run the PMMH algorithm to estimate the posterior distribution. For this vignette we only use a small number of iterations (1000) and 2 chains (and we also modify the tuning to only use 100 iterations with a burn-in of 10). In practice, these should be much higher.
result <- bayesSSM::pmmh(
y = observations,
m = 1000,
init_fn = init_fn_epidemic,
transition_fn = transition_fn_epidemic,
log_likelihood_fn = log_likelihood_fn_epidemic,
log_priors = log_priors,
pilot_init_params = list(
c(lambda = 0.5, gamma = 0.5),
c(lambda = 1, gamma = 1)
),
burn_in = 200,
num_chains = 2,
param_transform = list(lambda = "log", gamma = "log"),
tune_control = default_tune_control(pilot_m = 100, pilot_burn_in = 10),
verbose = TRUE,
seed = 1405,
)
#> Running chain 1...
#> Running pilot chain for tuning...
#> Pilot chain posterior mean:
#> lambda gamma
#> 0.5744942 0.2023182
#> Pilot chain posterior covariance (on transformed space):
#> lambda gamma
#> lambda 0.0049477745 0.0008807493
#> gamma 0.0008807493 0.0001576646
#> Using 50 particles for PMMH:
#> Running Particle MCMC chain with tuned settings...
#> Running chain 2...
#> Running pilot chain for tuning...
#> Pilot chain posterior mean:
#> lambda gamma
#> 0.5228747 0.1919012
#> Pilot chain posterior covariance (on transformed space):
#> lambda gamma
#> lambda 0.0018259219 0.0005323758
#> gamma 0.0005323758 0.0001830201
#> Using 50 particles for PMMH:
#> Running Particle MCMC chain with tuned settings...
#> PMMH Results Summary:
#> Parameter Mean SD Median 2.5% 97.5% ESS Rhat
#> lambda 0.57 0.07 0.56 0.42 0.72 109 1.012
#> gamma 0.21 0.02 0.21 0.17 0.24 70 1.020
#> Warning in bayesSSM::pmmh(y = observations, m = 1000, init_fn =
#> init_fn_epidemic, : Some ESS values are below 400, indicating poor mixing.
#> Consider running the chains for more iterations.
#> Warning in bayesSSM::pmmh(y = observations, m = 1000, init_fn = init_fn_epidemic, :
#> Some Rhat values are above 1.01, indicating that the chains have not converged.
#> Consider running the chains for more iterations and/or increase burn_in.
We get convergence warnings as expected, but the posterior is still centered around the true value.
We can access the chains and plot the densities:
chains <- result$theta_chain
For \lambda:
ggplot(chains, aes(x = lambda, fill = factor(chain))) +
geom_density(alpha = 0.5) +
labs(
title = "Density plot of lambda chains",
x = "Value",
y = "Density",
fill = "Chain"
) +
theme_minimal()
And for \gamma:
ggplot(chains, aes(x = gamma, fill = factor(chain))) +
geom_density(alpha = 0.5) +
labs(
title = "Density plot of gamma chains",
x = "Value",
y = "Density",
fill = "Chain"
) +
theme_minimal()