Capstone: Bayesian dose-response with MCMC
Learning objectives
- Fit a 4PL (Emax) dose-response model by Bayesian MCMC with informative priors
- Interpret posterior medians and 95% credible intervals as inference targets
- Diagnose MCMC convergence via acceptance rate, trace plots, and posterior predictive coverage
- Use the posterior to make threshold decisions (ED50, ED90) with calibrated uncertainty
The fourth research capstone fits a 4-parameter logistic (Emax) dose-response model the BAYESIAN way: combine informative priors on the four parameters with the observed dose-response data, then explore the joint posterior via random-walk Metropolis MCMC. Output: posterior medians plus calibrated 95% credible intervals for ED50, Emax, Hill coefficient h, and baseline E0 — all in one coherent uncertainty model.
The Emax model
The 4-parameter logistic (or "Emax") model is the workhorse of pharmacology and toxicology:
- E_0: baseline response at zero dose.
- E_max: maximum incremental response at high dose.
- ED50: dose at which the response is half-maximal — the headline quantity for dose-selection decisions.
- h (Hill coefficient): steepness of the dose-response curve at ED50.
Prior elicitation
Informative priors are CRITICAL for Bayesian dose-response: pharmacology has decades of accumulated knowledge — typical ED50 ranges for similar drugs, typical Hill coefficients (~1 for most receptors, ~2-3 for cooperative effects), Emax bounds from physiological constraints. Encoding this prior knowledge regularises inference, especially with small n.
- E_0 ~ Normal(0, 0.3) — typical pre-dose response near 0.
- E_max ~ Normal(1, σ_user) — physiologically reasonable maximum.
- log(ED50) ~ Normal(log(5), σ_user) — log-Normal prior on positive parameter, encoding "I expect ED50 around 5 dose units, plausibly 1-25."
- log(h) ~ Normal(0, 0.7) — Hill coefficient typically near 1, plausibly 0.3-3.
Random-walk Metropolis MCMC
For each iteration:
- Propose new parameters by adding small Normal perturbations (log-scale for ED50, h).
- Compute the log-posterior at the proposal.
- Accept the proposal with probability min(1, exp(Δlog-posterior)).
- Otherwise stay at current parameter values.
After a 30% burn-in (discarding initial samples), the remaining iterations approximate samples from the posterior. Tune step sizes so the ACCEPTANCE RATE lands in 0.20-0.45 — too low means proposals are too bold; too high means too timid.
Posterior summaries and decisions
The posterior is a JOINT distribution over (E0, Emax, ED50, h). Report:
- POSTERIOR MEDIAN as the point estimate.
- 2.5%-97.5% interval as the 95% CREDIBLE INTERVAL.
- Posterior probability of "interesting" events: e.g., P(ED50 < 10 | data) for go/no-go decisions.
Unlike frequentist confidence intervals, credible intervals have direct probability interpretation: "there is a 95% probability that ED50 lies in [a, b], GIVEN the data and priors."
Try it
- Defaults (ED50 = 5, h = 1.5, n = 24, σ = 0.08). Posterior median ED50 should match the truth within ±0.5. 95% CrI covers the truth (the "calibration OK" readout). Acceptance rate near 0.3.
- Crank n down to 8 (very small study). Posterior CrI for ED50 widens dramatically — sometimes spanning [1, 20]. Bayesian methods don't magically extract more information than the data contain; they just honestly REPORT the resulting uncertainty.
- Crank prior log-SD on ED50 up to 1.5 (very vague prior). With low n, the posterior shifts noticeably toward the data — the prior used to anchor, but vague priors can't. Now crank back to 0.3 (very tight): the posterior is dominated by the prior, ignoring conflicting data. Choose σ_prior to balance prior knowledge with data influence.
- Increase σ_obs to 0.3 (very noisy observations). Posterior widens but stays calibrated. Bayesian inference automatically discounts noisy data.
- Set true h = 0.5 (shallow dose-response). Watch h's posterior recover this. ED50 becomes harder to identify when the curve is shallow — large CrI even with n = 24.
A drug developer reports "the posterior median ED50 is 4.2 with 95% CrI [2.1, 8.5]." Their go/no-go threshold is ED50 < 6 (commercial viability). What posterior probability statement should they compute, and what value would trigger a STOP decision?
What you now know
You can run a complete Bayesian dose-response analysis: elicit priors, run MCMC, summarise posteriors with credible intervals, and translate posteriors into decision probabilities. The same machinery scales to richer pharmacological models (population PK/PD, hierarchical dose-response across populations) — what changes is the model complexity and the choice of MCMC algorithm (HMC / NUTS for high-dimensional models via Stan or PyMC).
References
- Gelman, A. et al. (2013). Bayesian Data Analysis, 3rd ed. Chapman & Hall. (Standard graduate reference.)
- Hill, A.V. (1910). "The possible effects of the aggregation of the molecules of haemoglobin on its dissociation curves." J. Physiol. 40, iv–vii. (The Hill equation.)
- Bonate, P.L. (2011). Pharmacokinetic-Pharmacodynamic Modeling and Simulation, 2nd ed. Springer.
- Carpenter, B. et al. (2017). "Stan: A probabilistic programming language." J. Stat. Soft. 76(1), 1–32.
- Metropolis, N. et al. (1953). "Equation of state calculations by fast computing machines." J. Chem. Phys. 21(6), 1087–1092. (The foundational Metropolis algorithm.)