Scaling
Maturity: beta — see Feature Maturity for what this means.
The optional [scaling] block declares how the structural model’s raw output maps to the observed DV. It exists so the user does not have to fold unit conversion or amount-to-concentration arithmetic into the structural model itself — keeping [odes] and [structural_model] readable, and making mixed-unit data (e.g. data in ng/mL when the model thinks in mg/L) straightforward.
The convention is divisive: pred_scaled = pred_raw / scale. This matches the natural reading of obs_scale = V/1000 as “divide amount by V/1000 to get concentration in the user’s units.”
Three forms are supported. Each is optional; omitting [scaling] keeps the historical “raw prediction equals DV” behaviour.
Form A — scalar divisor
Use for fixed unit conversion (e.g. mg/L → mg/mL is a constant 1000).
[scaling]
obs_scale = 1000
Applies to analytical PK models and ODE models alike: every prediction is divided by the constant before reaching the residual error model.
Form B — expression divisor
Use when the scale depends on theta, eta, or a covariate.
[scaling]
obs_scale = WT / 70
Expressions may reference:
- thetas (e.g.
TVV), - etas (e.g.
ETA_CL), - covariates (e.g.
WT,CR), - individual parameters declared in
[individual_parameters](e.g.V,CL) — these are resolved from a subject-static evaluation ofpk_param_fnat scale-evaluation time, soobs_scale = 1000 / Vuses the per-subject V (typical value times the EBE eta).
The scale is evaluated once per subject with subject-level covariates (matching the no-TV path). Time-varying covariate support for expression scales is a Phase 1.5 follow-up.
Form C — explicit output expression
Use when the built-in concentration readout is not the observed quantity — e.g. an ODE state is held as an amount and the observation is a concentration, or the observation is a nonlinear function of the concentration (a protein-binding correction, a free-vs-total switch, a parent/metabolite sum). Form C replaces the default readout entirely.
[structural_model]
ode(states=[depot, central]) # no obs_cmt — Form C provides it
[odes]
d/dt(depot) = -KA * depot
d/dt(central) = KA * depot - CL/V * central # central holds amount
[scaling]
y = central / V
The right-hand side may reference state names (depot, central), individual parameters (CL, V, KA), thetas (TVCL), etas (ETA_CL), and covariates (WT). All five name classes are looked up at evaluation time — states from the ODE solver, individual parameters from the subject-static pk_param_fn, and thetas/etas/covariates from the values supplied by the caller.
Form C on analytical PK models (since #650)
Form C is also accepted on analytical (closed-form) PK models. There the readout replaces the built-in concentration; it can reference the same name classes, with the compartment-amount scope tied to what the closed forms expose:
central— the central compartment amount (always available). The analytic engine computes a concentration internally; the readout sees the amount, i.e.central = concentration × V, soy = central / Vrecovers the concentration and any additive term layers on top.depot— the oral depot amount, for first-order oral models (one_cpt_oral/two_cpt_oral/three_cpt_oral).- Peripheral compartments are not available (the closed forms don’t expose a cross-compartment amount) — referencing
periph/peripheral…is a parse error pointing you to an ODE model, mirroring[initial_conditions].
Individual parameters that are not a structural PK role (e.g. a binding capacity BMAX or dissociation constant KD) may be referenced in an analytic readout and are differentiated exactly — they become first-class differentiable parameters (#650). if/else works, so a per-row covariate can switch the readout.
A free-vs-total fluconazole readout on an analytic 1-cpt model, with a saturable albumin-binding correction on the total-drug rows:
[individual_parameters]
CL = TVCL * exp(ETA_CL)
V = TVV * exp(ETA_V)
BMAX = TVBMAX
KD = TVKD
[structural_model]
pk one_cpt_iv(cl=CL, v=V)
[scaling]
y = if (FREE == 0) central / V + BMAX * (central / V) / (KD + central / V) \
else central / V
Form C replaces the built-in readout, so it cannot be combined with a divisive obs_scale on the same model (fold any unit conversion into y, e.g. y = central / V / 1000); the parser rejects the combination.
Covariates in a Form C expression are read from the per-observation covariate snapshot — the values on each observation’s own data row — so a time-varying covariate drives the readout at the time it is observed (since #535/#538). Because a Form C covariate is therefore an input the prediction depends on, it is a required data column: if the readout references a covariate that is absent from the dataset the fit fails with E_MISSING_COVARIATE rather than silently reading the missing value as 0.0. For time-constant covariates the readout is unchanged.
This matches NONMEM’s per-record $ERROR semantics. Validated against a free/total protein-binding model (NONMEM ADVAN3 TRANS4 whose $ERROR selects the total concentration when FREE==0 and the unbound concentration when FREE==1, with paired assay rows at the same time): evaluated at identical parameters, ferx’s per-record population predictions reproduce NONMEM’s PRED to ~1e-4 relative on both the total-assay and free-assay rows — the two rows at a given time differing only by that per-record covariate.
The analytical Form C readout (#650) inherits this NONMEM anchor: on an IV 1-cpt model the analytic central amount (concentration × V) equals the ODE central state amount, so a saturable-binding total-vs-free readout gives bit-for-bit-equal predictions on the analytic and ODE paths at matched parameters — the analytic-vs-ODE equivalence is checked by analytic_form_c_matches_ode_form_c_binding_readout.
Runtime behaviour on bad scales
If an expression scale (Form B or C) evaluates to a non-positive or non-finite value at runtime — for example WT / 70 when WT is missing (reads 0) or 1 / (TVV - x) near a singularity — every prediction for that subject is set to NaN. The outer NLL then evaluates to NaN and the optimizer rejects the step. This matches established NLM convention (NONMEM’s OBJFN = NaN → step rejection) and surfaces bad scales in the per-subject diagnostics rather than silently producing a mis-scaled fit.
Comparison with NONMEM and nlmixr2
| Need | NONMEM | nlmixr2 | ferx |
|---|---|---|---|
| Scalar unit conversion | S1 = 1000 |
(multiplier in cmt/f) |
[scaling] obs_scale = 1000 |
| Amount-state ODE with concentration DV | S2 = V/1000 plus Y = A(2)/S2 |
cmt(central); f = central/V/1000 |
[scaling] y = central / (V/1000) |
The ferx form is divisive by convention, so an obs_scale = V/1000 reads as “divide raw by V/1000” — matching NONMEM’s S2.
Interaction with gradients
All [scaling] variants on the analytical PK path support the default gradient = auto setting and forced finite differences (gradient = fd):
| Form | auto |
fd |
Notes |
|---|---|---|---|
Scalar obs_scale = K |
exact analytic | exact FD | The constant threads as one entry per observation. |
Expression obs_scale = <expr> |
subject-static analytic | exact FD | See subject-static caveat below. |
Per-CMT obs_scale[CMT=N] |
subject-static analytic | exact FD | One per-observation scale entry per observation, dispatched by subject.obs_cmts[i]. |
Form C y[…] = <expr> (ODE) |
exact analytic (in-scope ODE) | exact FD | Since #410/#439 the ODE sensitivity provider differentiates the Form C readout — uniform y = <expr> and per-CMT y[CMT=N] = <expr> — over Dual2/Dual1 (outer and inner), including covariate references in the readout (#540): a covariate threads in as a constant from the per-observation snapshot, so a free→total protein-binding readout gated on a FREE flag stays analytic. A θ or η referenced directly in the readout (rather than via an [individual_parameters] entry) is also analytic since #486 — the parser desugars each bare THETA(i)/ETA(k) into a hidden individual parameter, so the readout’s ∂y/∂θ/∂y/∂η rides the individual-parameter sensitivity chain; only a readout referencing a neural-network output, or other out-of-scope ODE features (input-rate/SDE, unsupported steady-state combinations, …), still falls back to FD. |
Form C y = <expr> (analytical) |
exact analytic | exact FD | Since #650 the analytic sensitivity provider differentiates an analytical Form C readout over Dual2/Dual1 (outer and inner), on the static dose-superposition path, the time-varying-covariate / oral-infusion event-walk path, and (since #655) IOV subjects (kappa declarations) — so a readout gated on a per-row covariate (a free-vs-total FREE flag) stays analytic, including under IOV. The readout’s central-compartment amount (concentration × V), the individual parameters it references — including non-structural ones like a binding BMAX/KD, which get a free differentiable PK slot — and covariates all flow through the exact η/θ chain; under IOV the readout parameters are BSV-only (a kappa reference is rejected at parse), so only the concentration carries the occasion κ. Falls back to FD (with a parse warning; the prediction stays exact) for: a readout referencing the oral depot amount, a per-CMT y[CMT=N], a direct THETA/ETA or neural-network reference, a [initial_conditions] baseline, or (on the event-walk / IOV paths) a readout whose non-structural parameters overflow the eight structural sensitivity slots. |
Analytic outer gradient handles expression scaling exactly. The analytic sensitivity provider that drives the gradient-based outer optimizers (bfgs, lbfgs, slsqp, …) on analytical 1-/2-/3-compartment models compiles an obs_scale expression to a Dual2-differentiable program and differentiates the scaled prediction f / scale exactly — including its η and θ dependence. So an η-dependent scale like 1000 / V is handled exactly by the analytic FOCE/FOCEI outer gradient with no gradient = fd needed.
Inner EBE loop also handles expression scaling analytically (since #486). The light inner-gradient provider now carries the η-only quotient rule ∂(f/scale)/∂η = (∂f/∂η)/scale − f·(∂scale/∂η)/scale² — the η-block of the outer’s quotient rule, evaluated once per subject over the same Dual2-differentiable scale program — so the per-subject inner EBE loop is exact for an η-dependent expression obs_scale like 1000 / V, not just the outer gradient. The ODE ([odes]) path also serves an η-dependent obs_scale analytically on both loops: on the static walk since #486, on the non-IOV time-varying-covariate walk since #486 (the divisor is subject-static, so one subject-static jet is applied post-walk), for IOV since #575, and for IOV subjects that also have time-varying covariates since #590. The closed-form (analytical 1-/2-/3-compartment) IOV path serves it analytically on both loops since #486 as well, including IOV subjects with time-varying covariates — the same per-occasion post-walk quotient the ODE IOV path uses. The IOV expression-scale quotient follows production semantics: the scale is evaluated once per occasion group with the subject-level covariate snapshot, while the event walk still uses per-event covariate snapshots for the dynamics. Per-CMT obs_scale[CMT=N] falls outside the analytic provider’s scope entirely, and ODE expression scaling combined with LTBS still routes to FD. Result-neutral — estimates and SEs are unchanged.
Interaction with SDE / [diffusion]
In Phase 1, [scaling] is not supported on SDE models. The EKF / Kalman update computes both the predicted mean and the prediction covariance p_obs in the observation space, and the per-observation r_obs callback evaluates the residual variance from that predicted mean. Forms A/B post-multiply only the mean, so the EKF variance would remain in the unscaled space — producing mis-scaled OFVs.
A correct SDE+scaling integration needs the scale factor threaded into both the EKF p_obs propagation (scales by 1/K²) and the residual variance callback. That’s a wider change deferred to Phase 1.5. Until then, the parser rejects any [scaling] block on a model with a [diffusion] block (Forms A, B, and C alike).
Multi-analyte / per-CMT scaling
For models that observe multiple compartments (parent + metabolite, sum-of-moieties, free vs. total, …), specify a separate scale per observed CMT using the obs_scale[CMT=N] (Forms A/B) or y[CMT=N] (Form C) syntax. N is the 1-based CMT index from the data file’s CMT column.
[scaling]
obs_scale[CMT=1] = 1000 # parent in mg/L → mg/mL
obs_scale[CMT=2] = 1 # metabolite already in target units
Form C (ODE) per-CMT:
[structural_model]
ode(states=[depot, parent, metab])
[scaling]
y[CMT=1] = parent / V_parent
y[CMT=2] = metab / V_metab
Coverage rule — every CMT that has at least one observation in the data must have a matching [CMT=N] entry. The parser only checks syntax; the fit-time validation (run automatically at the top of fit()) errors with a list of the missing CMTs:
[scaling]: per-CMT scaling is missing entries for observed CMTs [2, 3].
Every observed CMT must have an `obs_scale[CMT=N]` (or `y[CMT=N]` for ODE) entry.
Mixing rule — the uniform form (obs_scale = K) and the per-CMT form (obs_scale[CMT=N] = K) are mutually exclusive within the same group. The parser rejects mixing them so the user is explicit about intent. The same rule applies to y and y[CMT=N].
Gradients — per-CMT obs_scale[CMT=N] works with both gradient = auto and gradient = fd. The analytic route materialises a per-observation scale array (one entry per observation in the subject) from a subject-static pk_param_fn evaluation. See Interaction with gradients for the subject-static caveat that applies to all expression-form scales.
Since #439, Form C per-CMT (y[CMT=N] = <expr>) is differentiated analytically by the ODE sensitivity provider — each endpoint’s compiled output program is evaluated over Dual2 (outer gradient) and Dual1 (inner EBE η-gradient), dispatched per observation by subject.obs_cmts[i]. So gradient = auto serves it; gradient = fd is no longer required (the fit falls back to FD only for out-of-scope ODE features such as input-rate/SDE, unsupported steady-state combinations, or LTBS).
Since #486, an η-dependent expression obs_scale = <expr> divisor on an ODE model is also differentiated analytically (previously a Form-C readout y = state/V was the only analytic route). The same quotient rule the closed-form provider uses (apply_expression_scale_*) is applied to the ODE prediction jet for both the outer θ/Ω/σ gradient and the inner EBE η-gradient. The static ODE walk evaluates one scale jet per subject; because the divisor is subject-static even under time-varying covariates (production evaluates it at the subject covariate snapshot), the non-IOV time-varying-covariate walk applies the same single subject-static jet post-walk since #486. The ODE IOV walk evaluates one scale jet per occasion group, so an IOV model with time-varying covariates in the dynamics and obs_scale = V stays on the analytic route since #590. Combined with LTBS, an ODE expression obs_scale still falls back to FD because the walk applies LTBS before the η/θ chain.
Numerical validation. The per-CMT analytic gradient inherits the NONMEM validation of the user-ODE sensitivity engine (#410): the gradient machinery — the augmented Dual2/Dual1 RK45 and the η/θ chain — is unchanged, and per-CMT only selects, per observation, which CMT’s compiled Form-C output program to evaluate. Each such program is the same readout already validated for the uniform y = <expr> case. The per-observation routing itself is verified by the ode_provider_percmt_matches_production test (analytic f, ∂f/∂η, ∂f/∂θ vs the production predictor + finite differences on a two-endpoint model) and the ode_provider_percmt_light_matches_full test (the Dual1 inner η-gradient equals the Dual2 outer to 1e-9). A dedicated NONMEM cross-check on a multi-endpoint model is therefore not re-run here; it is covered transitively by #410’s validation plus these routing tests.
The time-varying-covariate analytic gradient (#439) inherits #410’s NONMEM validation on the same basis: it switches the per-event PK parameters at each covariate breakpoint (the event-driven Dual2/Dual1 walk) but the underlying integrator and η/θ chain are unchanged, exactly mirroring how production’s compute_predictions_with_tv switches parameters over f64. The walk is verified against that production predictor + finite differences — first order by ode_provider_tvcov_matches_production and the second-order Hessian blocks (d2f_deta2, d2f_deta_dtheta) by central-differencing the validated first-order gradient — and the Dual1 inner equals the Dual2 outer (ode_provider_tvcov_light_matches_full).