Simulation
simulate()
Generate simulated observations from a model with random effects and residual error.
pub fn simulate(
model: &CompiledModel,
population: &Population,
params: &ModelParameters,
n_sim: usize,
) -> Vec<SimulationResult>Parameters: - model: Compiled model - population: Template population (dose events and observation times are used; DV values are ignored) - params: True parameter values for simulation - n_sim: Number of simulation replicates
Returns: Vector of SimulationResult, one per observation per subject per replicate.
Example:
let model = parse_model_file(Path::new("model.ferx"))?;
let population = read_nonmem_csv(Path::new("data.csv"), None)?;
// Simulate 1000 replicates
let sims = simulate(&model, &population, &model.default_params, 1000);
for sim in &sims[..5] {
println!("Sim {}, ID {}, TIME {}, IPRED {:.3}, DV {:.3}",
sim.sim, sim.id, sim.time, sim.ipred, sim.dv_sim);
}simulate_with_seed()
Same as simulate() but with a fixed random seed for reproducibility.
pub fn simulate_with_seed(
model: &CompiledModel,
population: &Population,
params: &ModelParameters,
n_sim: usize,
seed: u64,
) -> Vec<SimulationResult>simulate_with_options() — propensity-score matching
Same outputs as simulate(), with options for a seed and for propensity-score matching of the simulated random effects.
pub fn simulate_with_options(
model: &CompiledModel,
population: &Population,
params: &ModelParameters,
n_sim: usize,
opts: &SimulateOptions,
) -> Result<Vec<SimulationResult>, String>
pub struct SimulateOptions {
pub seed: Option<u64>, // None draws from entropy
pub match_method: Option<MatchMethod>, // None disables matching
pub horizon: Option<f64>, // TTE administrative censoring time; None = per-record window
}
pub enum MatchMethod {
Optimal, // global minimum total distance (linear assignment); recommended default
Nearest, // greedy nearest-neighbour in subject order
Rank, // pair by Mahalanobis-norm rank (k-th order statistic to k-th)
}With match_method = None and horizon = None this is identical to simulate_with_seed() (or simulate() when seed is None).
horizon = Some(t) sets an administrative censoring time for time-to-event endpoints, decoupled from the observed event times: it overrides each TTE record’s per-record observation window so a re-simulated event-bearing subject (a competing-risks VPC) censors at the planned study end t rather than drawing unbounded. It has no effect on Gaussian endpoints. See TTE simulation.
Why match?
In real-world data, therapy is often adapted in response to a patient’s own PK — e.g. high-clearance patients are given longer dosing intervals or more frequent sampling. A standard VPC draws each subject’s eta independently of its dosing/sampling design, so this design↔︎eta association is lost and the VPC can show spurious model misspecification even when the model is correct.
Propensity-score matching restores the association. For each replicate:
- Draw a pool of
Netas from \(N(0, \Omega)\) (N= number of subjects). - Match the
Ndrawn etas 1:1 (without replacement) to theNsubjects’ fitted (posthoc) etas under the Mahalanobis distance over the model \(\Omega\): \(d^2(a,b) = (a-b)^\top \Omega^{-1} (a-b)\) using the chosenMatchMethod:Optimal— global minimum of the total matched distance (the linear assignment problem); mirrorsMatchIt(method = "optimal", distance = "mahalanobis"). Best on average in simulation, the recommended default.Nearest— greedy nearest-neighbour in subject order; mirrorsMatchIt(method = "nearest", distance = "mahalanobis").Rank— pair subjects and draws by the rank of their Mahalanobis norm \(\lVert \eta \rVert = \sqrt{\eta^\top \Omega^{-1} \eta}\) (k-th order statistic to k-th).
- Each subject keeps its own observed design but is simulated with the drawn eta that matched its fitted eta — so a high-clearance draw lands on a subject whose adaptive design reflects high clearance.
The fitted etas are computed once via a posthoc (MAP) inner-loop pass over the observed population at params; only the drawn pool and the matching change across replicates.
Requires observed data: every subject must carry observations (so its posthoc eta is defined). The function returns Err if the population is empty or any subject has no observations. The matching is the only correction applied; binning and plotting the VPC is left to the caller (e.g. the vpc R package).
Validation against R
The matching numerics were cross-checked against R on a fixed 8-subject, 2-eta scenario:
- the Mahalanobis cost matrix matches R’s
mahalanobis()under Ω to2e-11; - ferx-core’s optimal assignment is identical to two independent optimal solvers —
clue::solve_LSAPandoptmatch::pairmatch(the engineMatchIt(method = "optimal")uses) — with the same total cost to 10 dp.
A direct MatchIt(method = "optimal", distance = "mahalanobis") call (as in the PAGE-poster workflow) returns a slightly different pairing, because MatchIt scales the Mahalanobis metric by the empirical covariance of the combined etas rather than the model Ω. The optimal-matching algorithm is the same (optmatch); only the metric differs, which is the documented modelling choice here (Ω is exact for draws from N(0, Ω) and needs no extra estimation).
Example:
let pop = read_nonmem_csv(Path::new("rwd.csv"), None)?; // observed data
let fit = fit(&model, &pop, &model.default_params, &opts)?;
// Recover the fitted ModelParameters (theta/Omega/sigma) from the result.
let params = fitted_params_from_result(&fit, &model);
let sims = simulate_with_options(
&model, &pop, ¶ms, 200,
&SimulateOptions { seed: Some(42), match_method: Some(MatchMethod::Optimal) },
)?;
// → 200 replicates; build the pmVPC from `sims` as usual.predict()
Population predictions without random effects (eta = 0). No simulation noise is added.
pub fn predict(
model: &CompiledModel,
population: &Population,
params: &ModelParameters,
) -> Vec<PredictionResult>Returns: Vector of PredictionResult with population-level predictions.
Example:
let preds = predict(&model, &population, &model.default_params);
for p in &preds {
println!("ID {}, TIME {}, PRED {:.3}", p.id, p.time, p.pred);
}simulate_with_uncertainty()
Simulate observations while propagating population parameter uncertainty. For each parameter draw from the uncertainty distribution, etas (from the drawn Omega) and epsilons (from the drawn Sigma) are sampled — giving simulation bands that include both individual variability and uncertainty in the population estimates.
pub fn simulate_with_uncertainty(
model: &CompiledModel,
population: &Population,
fit_result: &FitResult,
opts: &SimulateUncertaintyOptions,
) -> Result<Vec<SimulationResult>, String>pub struct SimulateUncertaintyOptions {
pub n_uncertainty_draws: usize, // parameter sets drawn
pub n_sim_per_draw: usize, // eta/eps replicates per draw
pub method: UncertaintyMethod, // Asymptotic | Sir
pub seed: Option<u64>,
}
pub enum UncertaintyMethod {
Asymptotic, // MVN around ML estimate using FitResult.covariance_matrix
Sir, // resamples from FitResult.sir_resamples_packed
}Parameters: - fit_result: Result of a previous fit() call. Must include covariance_matrix (Asymptotic) or sir_resamples_packed (SIR). - opts.n_uncertainty_draws: number of parameter sets drawn. - opts.n_sim_per_draw: number of subject-level replicates per draw. - opts.method: how to draw the parameter sets.
Returns: Vec<SimulationResult> of length n_uncertainty_draws * n_sim_per_draw * n_subjects * n_obs. Each row carries a draw index (1..=n_uncertainty_draws) and a sim index (1..=n_sim_per_draw).
Prerequisites: - For UncertaintyMethod::Asymptotic: run fit() with run_covariance_step = true (or covariance = true in [fit_options]). - For UncertaintyMethod::Sir: run with sir = true and sir_keep_samples = true so the resampled vectors are kept on the FitResult.
Example:
let fit_opts = FitOptions {
run_covariance_step: true,
..FitOptions::default()
};
let fit = fit(&model, &population, &model.default_params, &fit_opts)?;
let sims = simulate_with_uncertainty(
&model,
&population,
&fit,
&SimulateUncertaintyOptions {
n_uncertainty_draws: 200,
n_sim_per_draw: 10,
method: UncertaintyMethod::Asymptotic,
seed: Some(42),
},
)?;Result Types
pub struct SimulationResult {
pub draw: usize, // Uncertainty draw index (1-indexed; always 1 for simulate())
pub sim: usize, // Replicate number within a draw (1-indexed)
pub id: String, // Subject ID
pub time: f64, // Observation time
pub ipred: f64, // Individual prediction (no residual error)
pub dv_sim: f64, // Simulated observation (with residual error)
}
pub struct PredictionResult {
pub id: String,
pub time: f64,
pub pred: f64, // Population prediction (eta = 0)
}Simulation Process
For each replicate and each subject:
- Sample random effects: \(\eta_i \sim N(0, \Omega)\) using the Cholesky factor \(L\): \(\eta = L \cdot z\), where \(z \sim N(0, I)\)
- Generate predictions through the same time-varying-covariate-aware dispatcher used by estimation and
predict()(compute_predictions_with_tv). PK parameters are recomputed per event from each row’s covariate snapshot, so a covariate that changes over a subject’s record (e.g. body weight) drives the prediction at each time — not just the baseline value. Subjects with no time-varying covariates stay on the single-evaluation fast path (pk_param_fn(theta, eta, covariates)), so their output is unchanged. - Add residual error: \(DV = IPRED + \sqrt{V} \cdot \epsilon\), where \(\epsilon \sim N(0, 1)\) and \(V\) is the residual variance from the error model. FREM covariate pseudo-observations (
FREMTYPE > 0) instead use the additive covariate error \(V = \sigma_\text{EPSCOV}^2\), matching the FREM likelihood.
The same per-event covariate snapshots drive the NPDE/NPD diagnostics and the NCA-based start-value sweep, so all simulation-backed surfaces agree.
Simulation with Uncertainty Process
simulate_with_uncertainty() wraps the steps above in an outer loop over parameter draws:
- Outer loop — for each of
n_uncertainty_draws, draw a population parameter set \((\theta_k, \Omega_k, \Sigma_k)\) from the uncertainty distribution.- Asymptotic: \(x_k = \hat{x} + L_\text{cov} z\) in packed log-space (theta, Cholesky-omega, sigma share one packed vector), then unpack — theta, Omega, and Sigma are perturbed coherently from a single MVN draw.
- SIR: pick a parameter vector at random from
FitResult.sir_resamples_packed(the resampled pool retained from the SIR step). Draws that fall outside the parameter bounds or yield non-positive theta/omega/sigma are rejected and resampled (up to10 × n_uncertainty_drawsattempts).
- Inner loop — for the drawn \((\theta_k, \Omega_k, \Sigma_k)\), run
n_sim_per_drawreplicates of the standard simulation process above. Etas are drawn from \(N(0, \Omega_k)\) and epsilons from the drawn \(\Sigma_k\).
The resulting SimulationResult rows are tagged with both draw and sim so downstream code can compute either marginal bands (over all draws and sims) or hierarchical bands (e.g. median across sims within each draw, then percentiles across draws).