Covariates

Maturity: stable — see Feature Maturity for what this means.

The optional [covariates] block declares which dataset columns are covariates and whether each is continuous or categorical. It declares availability only: listing a covariate does not mean it is used in the structural model. Once declared, ferx-core echoes the covariate columns back on the fit result as the covariate table, which downstream tooling (e.g. the R package) can use for summary statistics and covariate-search workflows.

Syntax

Two line forms are accepted and may be mixed. The type is one of continuous/cont or categorical/cat (case-insensitive):

[covariates]
  WT   continuous
  HT   continuous
  CRCL continuous
  SEX  categorical
  RACE categorical

The equivalent terser TYPE: NAME, ... form:

[covariates]
  continuous: WT, HT, CRCL
  categorical: SEX, RACE

Covariate names are case-sensitive and must match the CSV header exactly. The built-ins TIME and time are not covariates and should not be declared here; they are always available in [individual_parameters] expressions and as direct pk(...=TIME) / pk(...=time) mappings.
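To make the two line forms concrete, the following Python sketch parses the body of a [covariates] block into a name-to-type mapping. It is not ferx's parser: the function name parse_covariates and the treatment of malformed lines are assumptions made for illustration only.

```python
# Hypothetical helper, not part of ferx: parse the body of a [covariates] block.
TYPE_ALIASES = {
    "continuous": "continuous", "cont": "continuous",
    "categorical": "categorical", "cat": "categorical",
}

def parse_covariates(block_text):
    """Return {covariate_name: canonical_type} for both accepted line forms."""
    declared = {}
    for raw in block_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and the [covariates] header itself
        if ":" in line:
            # Terse form: TYPE: NAME, NAME, ...
            type_token, names_part = line.split(":", 1)
            names = [n.strip() for n in names_part.split(",") if n.strip()]
        else:
            # Long form: NAME TYPE
            parts = line.split()
            if len(parts) != 2:
                raise ValueError(f"cannot parse covariate line: {raw!r}")
            names, type_token = [parts[0]], parts[1]
        key = type_token.strip().lower()  # the type is case-insensitive
        if key not in TYPE_ALIASES:
            raise ValueError(f"unknown covariate type: {type_token!r}")
        for name in names:
            declared[name] = TYPE_ALIASES[key]  # names stay case-sensitive
    return declared

# Both forms mixed in one block, with a short alias and mixed-case type.
mixed = """[covariates]
  WT   continuous
  cat: SEX, RACE
  CRCL Cont
"""
declared = parse_covariates(mixed)
print(declared)
```

Because names are kept verbatim, a lookup for wt would not find the WT entry, which mirrors the case-sensitivity rule above.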

Semantics

  • Optional & backward-compatible. When the block is absent, behaviour is unchanged: every non-standard CSV column is auto-detected as a covariate.
  • Authoritative for the table and typing. Only the listed columns appear in the covariate table and carry a declared type; other non-standard columns (e.g. STUDY, DATE) are not tabled.
  • Undeclared-but-used is a warning, not an error. A covariate used in [individual_parameters] but missing from [covariates] is still usable — ferx reads it (leniently) and emits a warning recommending you declare it so its type is recorded and it appears in the table. Declaring a covariate the model does not use is also fine — that is the point.
  • Validation against the data. A declared column that is absent from the dataset is an error (E_MISSING_COVARIATE).
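
The rules above can be summarised as a small decision procedure. The Python sketch below is illustrative only: resolve_covariates is a hypothetical function, the warning text is invented, and only the error code E_MISSING_COVARIATE comes from this page.

```python
# Hypothetical sketch of the declared / used / present-in-data rules.
def resolve_covariates(declared, used, data_columns):
    """declared: names listed in [covariates]; used: names referenced by the
    model; data_columns: CSV header. Returns (tabled, warnings, errors)."""
    data = set(data_columns)
    # A declared column that is absent from the dataset is an error.
    errors = [f"E_MISSING_COVARIATE: {name}" for name in declared if name not in data]
    # Used-but-undeclared is only a warning (the message text is made up here).
    warnings = [f"covariate {name} is used but not declared"
                for name in used if name not in declared]
    # Only declared columns are tabled; STUDY, DATE, etc. are left out.
    tabled = [name for name in declared if name in data]
    return tabled, warnings, errors

tabled, warnings, errors = resolve_covariates(
    declared=["WT", "SEX", "HT"],
    used=["WT", "CRCL"],
    data_columns=["ID", "TIME", "DV", "WT", "SEX", "CRCL", "STUDY"],
)
print(tabled, warnings, errors)
```

In this made-up case SEX is declared but unused, which is accepted silently, while CRCL triggers the warning and HT the error.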

Categorical covariates must be numerically coded

Covariate values are carried as floating-point numbers, so categorical covariates must be encoded as integer levels in the data (e.g. SEX as 0/1, not "M"/"F"). Under a [covariates] block this is enforced: a non-numeric value in a declared covariate is a hard error rather than a silent coercion to 0.0. (In the legacy auto-detect path — no [covariates] block — a non-numeric covariate value fails to parse, is dropped, and the covariate evaluates to 0.0 in the model, preserving prior behaviour.)

Missing values (blank, ., NA) are permitted and recorded as missing.

ferx check reads the data through the same covariate-aware path the fit uses, so the following problems are reported at check time rather than only once the fit starts: a declared column that is absent (E_MISSING_COVARIATE), a declared column that holds non-numeric values (E_COVARIATE_NOT_NUMERIC), and a referenced covariate that is missing from the data.
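
A dataset can be screened for these problems before it reaches ferx. The Python sketch below recodes a text-valued SEX column to integer levels and then applies the strict parsing rule described above. The helper parse_covariate_cell and the F=0/M=1 coding are assumptions for illustration; ferx's own parser may differ in detail (for example in which spellings of the missing tokens it accepts).

```python
import math

# Missing-value tokens named on this page: blank, ".", and "NA".
MISSING_TOKENS = {"", ".", "NA"}

def parse_covariate_cell(column, cell):
    """Strict parse of one covariate cell: missing -> NaN, number -> float,
    anything else raises (mimicking the hard error for declared covariates)."""
    text = cell.strip()
    if text in MISSING_TOKENS:
        return math.nan
    try:
        return float(text)
    except ValueError:
        raise ValueError(
            f"E_COVARIATE_NOT_NUMERIC: column {column!r} has value {cell!r}"
        ) from None

# Illustrative coding choice; any consistent integer levels would do.
SEX_CODES = {"F": "0", "M": "1"}

raw_sex = ["M", "F", "", "M"]
recoded = [SEX_CODES.get(value, value) for value in raw_sex]
values = [parse_covariate_cell("SEX", value) for value in recoded]
print(values)
```

Running the strict parser on the raw column instead of the recoded one raises on the first "M", which is the behaviour to expect for a declared covariate.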

Time-varying covariates

A covariate may change within a subject — e.g. body weight measured at each visit, or a creatinine clearance that drifts over an admission. ferx detects this per subject: when a covariate column holds more than one value across a subject’s records, that subject carries a per-record covariate snapshot (last observation carried forward, LOCF) instead of a single subject-static value, and the structural model is evaluated at each event’s snapshot. A subject whose covariates are constant keeps the cheap static path.
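
The per-subject detection and LOCF fill can be sketched in a few lines of Python. This is not ferx's implementation: covariate_snapshots is a hypothetical function, and two details are assumptions, namely that missing values are ignored when deciding whether a covariate varies, and that records before the first non-missing value stay missing.

```python
import math

def covariate_snapshots(values):
    """values: one subject's covariate values in record order (NaN = missing).
    Returns ("static", value) when at most one distinct value is present,
    otherwise ("time_varying", per_record_values) filled by LOCF."""
    distinct = {v for v in values if not math.isnan(v)}
    if len(distinct) <= 1:
        static = next(iter(distinct)) if distinct else math.nan
        return "static", static
    filled, last = [], math.nan
    for v in values:
        if not math.isnan(v):
            last = v  # a new observation replaces the carried value
        filled.append(last)  # otherwise carry the last observation forward
    return "time_varying", filled

constant = covariate_snapshots([70.0, 70.0, 70.0])
varying = covariate_snapshots([70.0, math.nan, 72.0, math.nan])
print(constant)
print(varying)
```

The first subject keeps a single value, while the second gets one snapshot per record with the gaps filled from the preceding measurement.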

For the analytical 1-/2-/3-compartment models this is handled exactly. A time-varying covariate makes the individual PK parameters switch part-way through a dose’s decay (e.g. CL = TVCL*(WT/70)^THETA_WT*exp(ETA_CL) with a changing WT), so the predictions flow through a state-propagating, event-driven solver that carries the amounts across each covariate change. Under a gradient-based outer optimizer (lbfgs, bfgs, or slsqp) these subjects also drive the exact analytic FOCE/FOCEI gradient rather than falling back to finite differences: each event’s parameter derivatives are taken at that event’s covariate snapshot.

The following are supported on the analytic path:

  • Covariate changes carried by EVID=2 records between observations.
  • Combinations with EVID 3/4 resets, steady-state dosing, and a constant obs_scale divisor.
  • Inter-occasion variability (IOV): the covariate and κ both switch the individual parameters across occasions. Since #590 this includes subject-static expression obs_scale divisors.
  • A non-IOV ODE model with time-varying covariates combined with any of the following (all since #486): an expression-based output scaling (obs_scale = <expr>, applied as a subject-static post-walk quotient), an EVID=2 covariate-only breakpoint, or init(...) initial conditions (seeded at the subject’s first-record covariate snapshot).

The following still fall back to the finite-difference gradient:

  • Time-varying covariates with dose lagtime on the closed-form analytical models.
  • An ODE model that combines init(...) with an EVID 3/4 reset (production re-seeds init at the reset-event snapshot).

See FOCE/FOCEI for the optimizer/gradient interaction.
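
To illustrate what carrying the amounts across a covariate change means, the Python sketch below propagates a one-compartment IV bolus across events, recomputing CL from the covariate expression quoted above at each event. It is a toy under stated assumptions, not ferx's solver: the function names are hypothetical, the parameter values are invented, and each interval is assumed to use the snapshot in force at its start (LOCF).

```python
import math

def individual_cl(tvcl, wt, theta_wt, eta_cl):
    # Same form as the example expression: CL = TVCL*(WT/70)^THETA_WT*exp(ETA_CL)
    return tvcl * (wt / 70.0) ** theta_wt * math.exp(eta_cl)

def propagate_bolus(dose, v, events, tvcl, theta_wt, eta_cl):
    """events: (time, wt) pairs sorted by time; the first is the dose time.
    Returns the amount in the compartment at each event time."""
    prev_time, prev_wt = events[0]
    amount = dose
    amounts = [amount]
    for time, wt in events[1:]:
        # Assumption: the interval uses the covariate value at its start.
        k = individual_cl(tvcl, prev_wt, theta_wt, eta_cl) / v
        amount *= math.exp(-k * (time - prev_time))  # exact decay over the interval
        amounts.append(amount)
        prev_time, prev_wt = time, wt  # switch parameters, keep the state
    return amounts

# Invented values: 100 mg bolus, V = 50 L, TVCL = 5 L/h, THETA_WT = 0.75.
static = propagate_bolus(100.0, 50.0, [(0, 70), (12, 70), (24, 70)], 5.0, 0.75, 0.0)
varying = propagate_bolus(100.0, 50.0, [(0, 70), (12, 140), (24, 140)], 5.0, 0.75, 0.0)
print(static)
print(varying)
```

Both subjects have the same amount at 12 h. After that the second subject clears faster because the larger WT raises CL, yet the amount at the change point is carried over rather than recomputed from the dose.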

Covariate table

When a [covariates] block is present and the fit is launched from a data file, the result carries a covariate table (FitResult.covariate_table) echoing the declared columns:

  • Columns: ID, TIME, EVID, then one column per declared covariate.
  • One row per input dataset record — including dose and other-event rows. (This differs from the sdtab diagnostic table, which has observation rows only.)
  • Missing values are written as empty cells (the in-memory representation uses NaN).

The CLI writes it to {model}-covtab.csv alongside {model}-sdtab.csv whenever the model declares covariates.
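
Downstream scripts can read the file with any CSV reader. The Python sketch below assumes the layout described above (ID, TIME, EVID, then the declared covariates, with empty cells for missing values). The sample rows are invented, and read_covtab is a hypothetical helper rather than part of ferx or its R package.

```python
import csv
import io
import math

# Invented sample in the described layout; a real file would be opened with
# open(path, newline="") instead of io.StringIO.
sample = """ID,TIME,EVID,WT,CRCL
1,0,1,70,
1,1,0,70,95
1,24,0,72,90
"""

def read_covtab(handle, covariates):
    """Read covariate-table rows, mapping empty covariate cells to NaN."""
    rows = []
    for record in csv.DictReader(handle):
        row = {
            "ID": record["ID"],
            "TIME": float(record["TIME"]),
            "EVID": int(float(record["EVID"])),  # tolerate "1" or "1.0"
        }
        for name in covariates:
            cell = record[name]
            row[name] = float(cell) if cell != "" else math.nan
        rows.append(row)
    return rows

rows = read_covtab(io.StringIO(sample), ["WT", "CRCL"])
for row in rows:
    print(row)
```

Because dose rows are included, the first row here has EVID 1 and a missing CRCL, which a summary over observation rows only would need to filter out.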

Example

See examples/two_cpt_oral_cov.ferx, which declares:

[covariates]
  WT   continuous
  CRCL continuous

Both WT and CRCL are used in [individual_parameters] to scale CL and V1, so they are declared here. A covariate used in the model but left out of the block still works, but the parser emits a warning recommending it be declared (so its type is recorded and it appears in the covariate table).