Data Format

ferx-core reads data in NONMEM-compatible CSV format. This is the standard format used across population PK tools.

Required Columns

Column	Type	Description
`ID`	string/numeric	Subject identifier
`TIME`	numeric	Observation/event time on the data clock
`DV`	numeric	Dependent variable (observed concentration)

TIME need not start at zero. A subject’s TIME column may begin at any value (e.g. a calendar/clock time). Integration starts at each subject’s first record — matching NONMEM — so an off-zero start is not integrated over a phantom [0, first record] window. TIME is used exactly as written everywhere it is visible: the model TIME/T builtin, [derived] columns, the survival left-truncation TENTRY, and the sdtab / predict() / simulate() output. ferx does not re-base TIME onto a per-subject elapsed clock (the one exception is stacked reset occasions — see System Resets).

Optional Standard Columns

Column	Type	Default	Description
`EVID`	integer	0	Event ID: 0 = observation, 1 = dose, 2 = other event, 3 = system reset, 4 = reset + dose. If the column is omitted, the record type is inferred from `AMT` — see Inferring doses without an `EVID` column.
`AMT`	numeric	0	Dose amount (for `EVID=1`/`EVID=4`; also the dose-inference signal when `EVID` is absent)
`CMT`	integer	1	Compartment number (1-indexed)
`RATE`	numeric	0	Infusion rate. `0` = bolus, `>0` = constant-rate infusion (duration = `AMT/RATE`). NONMEM’s coded values `-1` (modeled rate via an `R{cmt}` `$PK` parameter) and `-2` (modeled duration via a `D{cmt}` `$PK` parameter) are supported on both analytical and ODE models — see Infusion Doses.
`MDV`	integer	0	Missing DV flag. 1 = DV should be ignored (row excluded from the likelihood)
`II`	numeric	0	Interdose interval for repeated dosing
`SS`	integer	0	Steady-state flag. 1 = assume steady state
`CENS`	integer	0	Censoring flag. `1` = below LLOQ and `DV` carries the LLOQ; `-1` = above ULOQ and `DV` carries the ULOQ; `0` = quantified. Paired with `bloq_method = m3` in `[fit_options]` to enable likelihood-based handling — see BLOQ example.

Missing DV on observation rows. An EVID=0 row contributes to the likelihood only when its DV is present. If the DV is missing (., NA, or blank), mark the row MDV=1. As a safety net, ferx also treats an EVID=0 row with a missing DV as MDV=1 even when the flag is absent — the row is skipped rather than scored as DV=0 — and emits a single W_MISSING_DV warning reporting how many rows were skipped. Set MDV=1 explicitly to silence it, or fix the data if the missing values are an error.
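A pre-processing pass can set the flag explicitly before fitting, which silences the warning. The following is a minimal sketch, not ferx code: the helper name `flag_missing_dv` is hypothetical, and it assumes the rows were read into dicts keyed by upper-case column names and that an `EVID` column is present.

```python
MISSING = {"", ".", "NA"}  # spellings of a missing value named in the text above

def flag_missing_dv(rows):
    """Set MDV=1 on observation rows whose DV is missing; return how many changed."""
    changed = 0
    for row in rows:
        evid = (row.get("EVID") or "").strip()
        # A missing EVID cell falls back to the column default of 0 (observation).
        is_obs = evid in MISSING or int(float(evid)) == 0
        dv_missing = (row.get("DV") or "").strip() in MISSING
        if is_obs and dv_missing and (row.get("MDV") or "").strip() != "1":
            row["MDV"] = "1"
            changed += 1
    return changed
```

The returned count can be compared against the number reported by W_MISSING_DV to confirm that every skipped row was intentional.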

Inferring doses without an `EVID` column

EVID is optional. When the column is absent, ferx infers the record type from AMT, exactly as NONMEM does: a row with a nonzero AMT is treated as a dose (EVID=1), and every other row as an observation (EVID=0). This lets legacy NONMEM datasets that omit EVID — marking dose rows only by a nonzero AMT (often with MDV=1) — administer their doses instead of silently dropping every AMT row and fitting a degenerate, dose-free model.

ID,TIME,DV,MDV,AMT
1,0,.,1,100
1,1,9.5,0,.
1,2,7.3,0,.

Here the first row (AMT=100) is inferred as a dose; the others are observations. Inference keys on AMT only — a dose always carries a nonzero amount (for infusions, AMT is the amount and RATE the rate). When an EVID column is present, its values govern and nothing is inferred.
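The inference rule can be sketched in a few lines of Python. This illustrates the rule as described here and is not the ferx implementation; `infer_evid` and `_num` are hypothetical helper names.

```python
import csv
import io

def _num(text):
    """Parse a CSV cell; '.', 'NA', and blank count as 0 (the column default)."""
    text = (text or "").strip()
    return 0.0 if text in {"", ".", "NA"} else float(text)

def infer_evid(row, header):
    """Return the effective EVID for one row, given the CSV header."""
    if "EVID" in header:
        return int(_num(row.get("EVID")))  # explicit values govern
    return 1 if _num(row.get("AMT")) != 0.0 else 0  # nonzero AMT means dose

data = """ID,TIME,DV,MDV,AMT
1,0,.,1,100
1,1,9.5,0,.
1,2,7.3,0,.
"""
reader = csv.DictReader(io.StringIO(data))
evids = [infer_evid(row, reader.fieldnames) for row in reader]
print(evids)  # [1, 0, 0]
```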

Two non-fatal warnings (collected into the fit’s warnings) guard against a silently dose-free fit:

W_AMT_NOT_DOSED — one or more non-observation rows carry AMT != 0 but were not treated as doses: an EVID column is present and codes a dose row as something other than 1/4 (e.g. a dose row mistyped EVID=0 with MDV=1). Their AMT was ignored. A scored observation (MDV=0) that merely carries a redundant or forward-filled AMT does not trigger this.
W_NO_DOSES — the dataset parsed zero dose events even though scored observations are present (usually a missing AMT/EVID). Not emitted for time-to-event/survival datasets, which legitimately have no PK doses.
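The two conditions can be approximated as a pre-flight check on your own data. This sketch follows the descriptions above only; the function name `dose_warnings` is hypothetical, records are assumed to be dicts with numeric `EVID`, `AMT`, and `MDV`, and the exemption for time-to-event datasets is not modeled.

```python
def dose_warnings(records, has_evid_column):
    """Return the warning codes that the two rules above would raise."""
    warnings = []
    doses = [r for r in records if r["EVID"] in (1, 4)]
    scored = [r for r in records if r["EVID"] == 0 and r["MDV"] == 0]
    ignored_amt = [
        r for r in records
        if has_evid_column
        and r["AMT"] != 0
        and r["EVID"] not in (1, 4)
        # A scored observation carrying a stray AMT does not count.
        and not (r["EVID"] == 0 and r["MDV"] == 0)
    ]
    if ignored_amt:
        warnings.append("W_AMT_NOT_DOSED")
    if not doses and scored:
        warnings.append("W_NO_DOSES")
    return warnings
```

A dose row mistyped as EVID=0 with MDV=1, followed by scored observations, triggers both codes in this sketch.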

Occasion Column (IOV)

When using Inter-Occasion Variability (IOV), add an occasion-index column to the dataset and specify its name with iov_column in [fit_options]. The column:

Contains integer occasion indices (e.g. 1, 2, 3…) — one per row
Applies to both dose rows and observation rows
Is excluded from covariate auto-detection

Example dataset with OCC column:

ID,TIME,DV,EVID,AMT,CMT,MDV,OCC
1,0,.,1,100,1,1,1
1,1,9.5,0,.,.,0,1
1,2,7.3,0,.,.,0,1
1,24,.,1,100,1,1,2
1,25,10.1,0,.,.,0,2
1,26,8.2,0,.,.,0,2

Occasion indices can be any positive integers; they need not start at 1 or be consecutive. Each distinct value denotes a separate occasion with its own kappa EBE.

See IOV documentation for full details.

Covariate Columns

By default, any column not in the standard set above is automatically treated as a covariate. Alternatively, declare covariates explicitly with a [covariates] block in the model file; when present it is authoritative (only listed columns are covariates) and enables type tagging, validation, and the covariate output table. Covariate values are:

Time-constant: The first non-missing value for each subject is used
Time-varying: If values change over time for a subject, Last Observation Carried Forward (LOCF) is applied per event (NONMEM-equivalent: [individual_parameters] is re-evaluated at each dose and observation row using that row’s covariate values)
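For reference, LOCF over one subject's rows can be sketched as follows. This is a generic illustration rather than ferx code; `locf` is a hypothetical helper, and missing cells are assumed to have been parsed to `None`.

```python
def locf(values):
    """Carry the last non-missing value forward over one subject's rows.

    Leading missing values stay None here; how rows before the first
    recorded value are filled is outside the scope of this sketch.
    """
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled

print(locf([70.0, None, None, 72.5, None]))  # [70.0, 70.0, 70.0, 72.5, 72.5]
```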

Covariate names are case-sensitive: a column name in the data file must match the name used in [individual_parameters] expressions (and in a [covariates] block) exactly. (Standard NONMEM columns like ID/TIME are matched case-insensitively; covariate columns are not.)

Time-varying covariate scope

Time-varying covariates are supported for all analytical structural models and ODE-defined models:

1-compartment IV (one_cpt_iv) — bolus and/or infusion per dose
1-compartment oral (one_cpt_oral)
2-compartment IV (two_cpt_iv) — bolus and/or infusion per dose
2-compartment oral (two_cpt_oral)
3-compartment IV (three_cpt_iv) — bolus and/or infusion per dose
3-compartment oral (three_cpt_oral)
All ODE-defined models (via [odes])

For oral models, the bolus dose into compartment 1 is interpreted as the depot (NONMEM ADVAN2/ADVAN4/ADVAN12 convention) and observation read-out reads the central compartment.

The analytic Dual2 gradient path is event-driven for all analytical models. Subjects with time-varying covariates fall back to finite differences for both the inner-loop gradient and the H-matrix Jacobian; analytic sensitivities for these subjects are a planned extension.

Infusion routing on the event-driven path:

IV models: central infusion (cmt=1) for all 1/2/3-cpt; peripheral infusion for 2-cpt (cmt=2) and 3-cpt (cmt=2 → periph1, cmt=3 → periph2). Steady-state amounts per channel are computed by linear superposition over the channels.
Oral models: central infusion (cmt=2) is supported; peripheral infusion is rare clinically and still panics with a clear message (tracked as a follow-up).

Event Types (EVID)

EVID	Meaning
0	Observation record. `DV` is used for estimation.
1	Dosing record. `AMT` is administered to compartment `CMT`.
2	Other event (typically a covariate-change marker). The compartment state is unchanged but the rate matrix is refreshed from this row’s covariate values — matching NONMEM’s `$PK runs at every record` semantic. Only meaningful when at least one covariate is time-varying; for time-constant data EVID=2 rows are skipped (would be no-ops).
3	System reset. All compartment amounts are set to zero at this time, and any ongoing infusion is turned off. No dose is given and `DV` is ignored.
4	Reset and dose. Like EVID=3 (zero every compartment, stop ongoing infusions) followed immediately by a dose of `AMT` into compartment `CMT`.

Record order at a shared `TIME` (pre-dose troughs)

When an observation and a dose carry the same TIME, their order in the data file decides whether the sample is pre- or post-dose — exactly as in NONMEM, which processes records top-to-bottom:

Observation row before the dose row → a pre-dose trough: the dose has not yet been given, so the prediction excludes it.
Observation row after the dose row → a post-dose sample: the dose is already on board and contributes to the prediction.

This is the common trough pattern in maintenance-dosing data (a level drawn immediately before the next infusion):

ID,TIME,DV,EVID,AMT,CMT,MDV
1,0,.,1,300,1,1
1,14,18.6,0,.,.,0     # trough — listed BEFORE the day-14 dose → pre-dose
1,14,.,1,300,1,1      # day-14 dose
1,42,24.7,0,.,.,0     # trough before the day-42 dose
1,42,.,1,300,1,1

ferx reproduces the NONMEM prediction for both orderings. (Internally a pre-dose observation is ordered just ahead of its coincident dose; the TIME reported in sdtab/covtab and by predict()/simulate() is the value you wrote.) You do not need to nudge trough times by a small epsilon to force pre-dose evaluation, as is sometimes done when porting NONMEM datasets.
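If you pre-sort a dataset yourself, the property to preserve is that rows sharing a TIME keep their file order. A minimal Python sketch, assuming records are dicts with `TIME` and `EVID` (the helper names are hypothetical and this is not how ferx orders events internally):

```python
def order_events(records):
    """Sort by TIME only; Python's sort is stable, so ties keep file order."""
    return sorted(records, key=lambda r: r["TIME"])

def is_pre_dose(ordered, i):
    """True if a dose at the same TIME is listed after observation i."""
    t = ordered[i]["TIME"]
    return any(r["EVID"] in (1, 4) and r["TIME"] == t for r in ordered[i + 1:])

records = [
    {"TIME": 0.0, "EVID": 1},
    {"TIME": 14.0, "EVID": 0},  # trough, listed before the day-14 dose
    {"TIME": 14.0, "EVID": 1},  # day-14 dose
    {"TIME": 42.0, "EVID": 0},  # trough before the day-42 dose
    {"TIME": 42.0, "EVID": 1},
]
ordered = order_events(records)
print([r["EVID"] for r in ordered])  # [1, 0, 1, 0, 1]
print(is_pre_dose(ordered, 1))       # True
```

Sorting on a compound key such as (TIME, EVID) would reorder the ties and turn every trough into a post-dose sample, so the key here deliberately uses TIME alone.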

One exception: a steady-state dose (SS=1) carries its own periodic pre-arrival tail, so an observation at the SS dose TIME reads the steady-state trough regardless of row order.

NONMEM comparison

The infliximab population-PK benchmark (run55, ADVAN3 TRANS4, 42 subjects, 182 of 183 observations drawn as troughs at a dosing time) reproduces the NONMEM METHOD=1 INTER fit once record order is honored:

Parameter	NONMEM	ferx (FOCEI)
OFV	662.2	664.0
CL (L/day)	0.199	0.198
V1 (L)	4.94	5.04
ATI effect on CL	0.722	0.763
Maintenance-phase CL multiplier	1.40	1.41

(The small OFV offset is the time-conditional residual-error term the ferx translation omits, not a structural difference.) Scoring the same model at the NONMEM estimates without record-order handling gave OFV 3751 — every trough mispredicted as a post-dose peak.

System Resets (EVID=3 / EVID=4)

A reset record empties every compartment at its TIME, as if the subject’s drug history started over from that point. EVID=3 is a pure reset; EVID=4 resets and then administers the row’s dose into the freshly emptied system. This matches NONMEM’s reset-event semantics and is useful for, e.g., modelling washout between treatment cycles or re-using one subject record for independent dosing episodes.

ID,TIME,DV,EVID,AMT,CMT,MDV
1,0,.,1,100,1,1
1,1,9.5,0,.,.,0
1,4,6.1,0,.,.,0
1,24,.,3,.,.,1
1,24,.,1,100,1,1
1,25,9.4,0,.,.,0
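The effect of the reset on the dataset above can be traced with a toy event walk. This sketch assumes a one-compartment IV bolus model with a hypothetical elimination rate constant `k`, ignores infusions and compartment numbers, and is only meant to show the order of operations (decay, then reset, then dose); it is not the ferx prediction engine.

```python
import math

def central_amounts(records, k):
    """Walk (time, evid, amt) records for a one-compartment IV bolus model.

    Returns the amount in the compartment just after each record is applied.
    """
    amount, t_prev, out = 0.0, None, []
    for time, evid, amt in records:
        if t_prev is not None:
            amount *= math.exp(-k * (time - t_prev))  # first-order elimination
        if evid in (3, 4):
            amount = 0.0  # reset empties the system
        if evid in (1, 4):
            amount += amt  # bolus into the compartment
        out.append(amount)
        t_prev = time
    return out

# The six records above as (TIME, EVID, AMT); k = 0.1 is an arbitrary value.
records = [(0, 1, 100.0), (1, 0, 0.0), (4, 0, 0.0),
           (24, 3, 0.0), (24, 1, 100.0), (25, 0, 0.0)]
amounts = central_amounts(records, k=0.1)
```

Because the EVID=3 row zeroes the compartment before the second dose, the amount one time unit after the second dose equals the amount one time unit after the first; without the reset, the residual from the first dose would be added on top.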

Stacked occasions with a restarting clock

A common NONMEM idiom is to stack several independent dosing episodes under one subject ID, each opened by an EVID=4 record whose TIME restarts at 0 (so the records are not globally time-ordered):

ID,TIME,DV,EVID,AMT,RATE,MDV
1,0,.,4,100,100,1
1,1,9.4,0,.,.,0
1,8,2.1,0,.,.,0
1,0,.,4,100,100,1
1,1,9.6,0,.,.,0
1,8,2.0,0,.,.,0

NONMEM processes records sequentially, so the second EVID=4 begins a fresh occasion that re-uses the first occasion’s wall-clock. ferx-core reproduces this: each restarting occasion is shifted onto a single monotonic internal timeline (the reset zeros every compartment at the boundary, so no drug carries across), and the subject keeps one shared set of random effects across the occasions — exactly matching NONMEM’s EVID=4 semantics. The two occasions are not merged or double-dosed.

Diagnostics are reported on the raw data clock: the TIME column of {model}-sdtab.csv (and {model}-covtab.csv) echoes the value you wrote, and TAFD / TAD reset per occasion (time after that occasion’s first / most recent dose). The monotonic shift is purely internal to the prediction engine.
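The idea of the internal shift can be illustrated as below. The offset rule is an assumption made for this sketch (a restarting occasion begins at the previous record's internal time); the text above does not specify the rule ferx-core actually uses, and `shift_to_monotonic` is a hypothetical name.

```python
def shift_to_monotonic(records):
    """records: list of (time, evid). Returns (raw_time, internal_time) pairs.

    Assumed rule, not taken from ferx: when a reset record's TIME falls
    before the previous record, the new occasion starts at the previous
    record's internal time.
    """
    out, offset, prev_internal = [], 0.0, None
    for time, evid in records:
        internal = time + offset
        if evid in (3, 4) and prev_internal is not None and internal < prev_internal:
            offset = prev_internal - time
            internal = prev_internal
        out.append((time, internal))
        prev_internal = internal
    return out

stacked = [(0, 4), (1, 0), (8, 0), (0, 4), (1, 0), (8, 0)]
shifted = shift_to_monotonic(stacked)
print([internal for _, internal in shifted])  # [0.0, 1.0, 8.0, 8.0, 9.0, 16.0]
```

The first element of each pair is the raw TIME that diagnostics echo; only the second element is monotonic.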

[derived] integrals and stacked resets. Integral columns with an absolute window ([from, to] or a periodic anchor) are evaluated per occasion in raw TIME coordinates. Each occasion is integrated independently, so crossover designs produce correct per-occasion AUC values. Occasions with no observations inside the window return NaN for that row.

Grid-based integrals (step = <dt> or step = auto) evaluate on the internal shifted clock rather than raw TIME, so expressions referencing TIME inside a grid integral will see the shifted value. Use observation-based integrals (step = obs or data_based = true) when raw TIME must appear inside the integrand expression.

Notes:

Resets force the event-driven analytical / ODE prediction path — dose superposition cannot express a mid-record reset — so any analytical or ODE model supports them with no configuration.
Reset-bearing subjects use the analytic gradient where in scope and otherwise fall back to finite-difference gradients. This changes only how the gradient is computed, not the results.
Resets are not supported on the EKF/SDE path ([diffusion] models). A reset row on an SDE model emits a warning and is ignored.

Example Dataset

ID,TIME,DV,EVID,AMT,CMT,RATE,MDV,WT,CRCL
1,0,.,1,100,1,0,1,70,95
1,0.5,9.49,0,.,.,.,0,70,95
1,1,14.42,0,.,.,.,0,70,95
1,2,17.56,0,.,.,.,0,70,95
1,4,15.23,0,.,.,.,0,70,95
1,8,10.15,0,.,.,.,0,70,95
2,0,.,1,150,1,0,1,85,110
2,0.5,14.2,0,.,.,.,0,85,110
2,1,21.3,0,.,.,.,0,85,110

Key points:

Dose records (EVID=1) have MDV=1 and DV=. (missing)
Observation records (EVID=0) have MDV=0 and a valid DV
Covariates (WT, CRCL) are included as extra columns
Missing values can be represented as . or left empty

Infusion Doses

For IV infusions, set RATE to the infusion rate (amount per time unit):

ID,TIME,DV,EVID,AMT,CMT,RATE,MDV
1,0,.,1,500,1,50,1

This administers 500 units at 50 units per time unit, so the infusion lasts 500/50 = 10 time units (hours, if TIME is recorded in hours).

NONMEM coded `RATE` values

NONMEM overloads the RATE column with negative codes that change its meaning:

`RATE`	NONMEM meaning	ferx-core
`0`	Bolus — route set by the dose compartment	✅ supported
`> 0`	Constant-rate infusion (duration = `AMT/RATE`)	✅ supported
`-1`	Infusion rate is modeled — defined by `R{cmt}` in `$PK`	✅ supported on ODE and analytical models (#324)
`-2`	Infusion duration is modeled — defined by `D{cmt}` in `$PK`	✅ supported on ODE and analytical models (#324, #394)

RATE = -2 makes the infusion duration a model parameter: declare an individual parameter D{cmt} (D1 for a dose into compartment 1, etc.) and ferx infuses AMT over that duration — rate AMT / D{cmt}, evaluated per iteration and occasion. Supported on both the analytical pk(...) engine and ode(...) models; see Modeled infusion duration for the DSL and semantics. A RATE=-2 dose with no matching D{cmt} parameter (on either engine) is a loud error — never a silent bolus.

On an analytical oral model (pk one_cpt_oral / two_cpt_oral / three_cpt_oral), a D1 into the depot (compartment 1) gives a zero-order absorption model: drug is released into the depot at a constant rate over D1, then absorbed first-order into central via KA (#400). D2 into the same model is a depot-bypassing infusion straight into central. Both stay on the closed-form analytical engine — no ode(...) block needed. (Compartment amounts in sdtab/[derived] are not available for analytical depot-zero-order subjects; the predictions themselves are exact — use an ode(...) model if you need the per-compartment amounts.) Infusion into an oral peripheral compartment is still unsupported and needs an ode(...) model.

RATE = -1 makes the infusion rate a model parameter: declare an individual parameter R{cmt} (R1 for a dose into compartment 1, etc.) and ferx infuses AMT at that rate — duration AMT / R{cmt}, evaluated per iteration and occasion (the mirror of -2). Supported on both engines; a RATE=-1 dose with no matching R{cmt} is the same loud E_MODELED_RATE_NO_PARAM error as the duration case. R{cmt}/D{cmt} are recognised compartment-indexed parameter names (like NONMEM’s reserved $PK names); recognising one only reserves it when a coded RATE actually targets that compartment.

Bioavailability and infusion shape (F ≠ 1). ferx applies F to an infusion the NONMEM way (#419), holding whichever quantity you specified and scaling the other so total exposure is F·AMT either way:

rate-defined (RATE>0 data and RATE=-1 → R{cmt}): the rate is held and the duration is scaled to F·AMT/RATE. So a RATE=-1 dose behaves exactly like its explicit RATE = R{cmt} twin.

duration-defined (RATE=-2 → D{cmt}): the duration is held at D{cmt} and the rate is scaled to F·AMT/D{cmt}.

The two modes therefore produce a different infusion shape under F ≠ 1 (same total exposure). At F = 1 they coincide. (Earlier versions scaled the rate for every infusion, which diverged from NONMEM for rate-defined infusions; #419.)
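The two rules reduce to a small piece of arithmetic, sketched here in Python. The function `infusion_shape` is a hypothetical illustration of the rules as stated above, not a ferx API.

```python
def infusion_shape(amt, f, rate=None, duration=None):
    """Return the (rate, duration) actually infused for bioavailability f.

    Pass `rate` for a rate-defined dose (RATE>0, or RATE=-1 via R{cmt}),
    or `duration` for a duration-defined dose (RATE=-2 via D{cmt}).
    """
    if (rate is None) == (duration is None):
        raise ValueError("give exactly one of rate or duration")
    if rate is not None:
        return rate, f * amt / rate      # rate held, duration scaled
    return f * amt / duration, duration  # duration held, rate scaled

print(infusion_shape(500, 0.5, rate=50))      # (50, 5.0)
print(infusion_shape(500, 0.5, duration=10))  # (25.0, 10)
```

Both calls deliver F·AMT = 250 units in total (rate × duration), but the first does so in half the time at the full rate and the second over the full duration at half the rate.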

Any other negative or non-finite RATE on a dose row is rejected. Earlier versions silently treated all coded forms as a bolus, producing wrong predictions with no warning (#324). Note that -1/-2 are driven by $PK parameters, not by a separate DURATION data column.

A runnable demo of the supported forms — a bolus (RATE=0) and a constant-rate infusion (RATE>0) mixed in one dataset — is in examples/dose_rate.ferx (data: data/dose_rate.csv).

Steady-State Dosing

For steady-state simulations, set SS=1 and II to the dosing interval:

ID,TIME,DV,EVID,AMT,CMT,SS,II,MDV
1,0,.,1,100,1,1,12,1
1,0.5,25.3,0,.,.,.,.,0

This assumes the subject is already at steady state on 100 units every 12 hours when the dose at TIME=0 is given, so the observation at TIME=0.5 is predicted from the steady-state profile.

SS=1 is supported on every prediction path: analytical (1-/2-/3-cpt with or without time-varying covariates) and ODE. SS=1 also composes with LAGTIME — the lagged SS curve at time t equals the un-lagged curve at t - lagtime. See Steady-State Doses for the full reference, including the data-validation warnings emitted for malformed rows (missing II, overlapping infusions).
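To see what the steady-state assumption means numerically, consider the simplest case: a one-compartment IV bolus model with first-order elimination. The closed form below is the standard geometric-series result for repeated bolus doses. The rate constant `k` and the helper names are assumptions for illustration, and this sketch does not cover the other models and dose types ferx supports.

```python
import math

def ss_amount(amt, k, ii, t):
    """Steady-state amount t time units after an SS bolus dose (0 <= t < ii)."""
    return amt * math.exp(-k * t) / (1.0 - math.exp(-k * ii))

def superposed_amount(amt, k, ii, t, n_prior):
    """Same quantity by summing the SS dose plus n_prior earlier doses."""
    return sum(amt * math.exp(-k * (t + j * ii)) for j in range(n_prior + 1))

closed_form = ss_amount(100.0, 0.1, 12.0, 0.5)
brute_force = superposed_amount(100.0, 0.1, 12.0, 0.5, n_prior=200)
print(math.isclose(closed_form, brute_force, rel_tol=1e-9))  # True
```

The check confirms that an SS=1 record stands in for a long run of identical earlier doses spaced II apart, which is why those doses do not need to appear as rows.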

Multiple Doses

Multiple doses are supported as separate rows:

ID,TIME,DV,EVID,AMT,CMT,MDV
1,0,.,1,100,1,1
1,0.5,9.49,0,.,.,0
1,12,.,1,100,1,1
1,12.5,15.2,0,.,.,0
1,24,.,1,100,1,1
1,24.5,18.1,0,.,.,0

Column Name Case

Standard column names are matched case-insensitively: ID, Id, and id are all recognized. Covariate columns are case-sensitive and keep the case used in the CSV header, so they must match the names used in the model file exactly.
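The header rule can be mirrored when preparing data, for example to check which names will be treated as covariates. A small sketch, assuming the standard set is the columns listed on this page (`normalize_header` is a hypothetical helper, not part of ferx):

```python
STANDARD = {"ID", "TIME", "DV", "EVID", "AMT", "CMT",
            "RATE", "MDV", "II", "SS", "CENS"}

def normalize_header(header):
    """Upper-case standard columns; leave every other name untouched."""
    return [h.upper() if h.upper() in STANDARD else h for h in header]

print(normalize_header(["id", "Time", "dv", "wt", "CrCL"]))
# ['ID', 'TIME', 'DV', 'wt', 'CrCL']
```

In this example `wt` and `CrCL` keep their spelling, so a model expression written with `WT` would not match the `wt` column.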