The [data_selection] block excludes records from the dataset at read time without modifying the CSV. It is the ferx equivalent of NONMEM’s $DATA IGNORE=. This example drops observations below a surrogate LLOQ of 1.0 mg/L.
Model
library(ferx)
ex <- ferx_example("warfarin_data_selection")
ferx_model_show(ex$model)
# model: warfarin_data_selection.ferx
# One-compartment oral PK model (warfarin) with data-selection filtering
#
# Demonstrates [data_selection]: excludes observations below a surrogate LLOQ
# (DV < 1.0 mg/L) at read time without modifying the CSV.
# Equivalent to NONMEM $DATA IGNORE=.
[parameters]
theta TVCL(0.134, 0.001, 10.0)
theta TVV(8.1, 0.1, 500.0)
theta TVKA(1.0, 0.01, 50.0)
omega ETA_CL ~ 0.07
omega ETA_V ~ 0.02
omega ETA_KA ~ 0.40
sigma PROP_ERR ~ 0.01 (sd)
[individual_parameters]
CL = TVCL * exp(ETA_CL)
V = TVV * exp(ETA_V)
KA = TVKA * exp(ETA_KA)
[structural_model]
pk one_cpt_oral(cl=CL, v=V, ka=KA)
[error_model]
DV ~ proportional(PROP_ERR)
[data_selection]
# Drop observations below a surrogate LLOQ of 1.0 mg/L.
ignore = DV < 1.0
[fit_options]
method = foce
maxiter = 300
gradient = fd
The relevant block is:
[data_selection]
ignore = DV < 1.0
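To make the effect of this condition concrete, the base R sketch below applies the same test to a small made-up data frame. It illustrates the semantics only and is not how ferx implements the filter. The toy values, the assumption that a dose row has a missing DV, and the choice to keep rows whose DV is NA are all assumptions made here.

```r
# Toy dataset (hypothetical values, not the warfarin example data)
toy <- data.frame(
  ID   = c(1, 1, 1, 1),
  TIME = c(0, 24, 72, 120),
  DV   = c(NA, 6.2, 2.4, 0.9)  # assumption: the dose row at TIME 0 has no DV
)

# Flag rows that match the ignore condition; rows with a missing DV never match
drop <- !is.na(toy$DV) & toy$DV < 1.0

kept     <- toy[!drop, ]  # rows that would reach the estimator
excluded <- toy[drop, ]   # rows that would be filtered out

nrow(kept)     # 3
excluded$TIME  # 120
```

The source file on disk is never touched in this sketch either: the filter only decides which rows are passed on.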
R-side preview
Before fitting, use ferx_selection() to inspect which records the filter would drop:
sel <- ferx_selection(ex$data, ignore = "DV < 1.0")
cat("Total records:", nrow(read.csv(ex$data)), "\n")
cat("Retained: ", nrow(sel), "\n")
cat("Excluded obs: ", sel$exclusions$n_obs_excluded, "\n")
The excluded rows carry a .exclude_reason column:
ferx_selection_excluded(sel)[, c("ID", "TIME", "DV", ".exclude_reason")]
   ID TIME     DV  .exclude_reason
84  7  120 0.9761 ignore: DV < 1.0
96  8  120 0.8700 ignore: DV < 1.0
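A quick sanity check on an exclusion table like this one is to confirm that every listed row really satisfies the condition, and to count rows per reason. The sketch below rebuilds the two printed rows by hand so that it runs without ferx. In a real session the `excl` object would presumably come from ferx_selection_excluded(sel) instead; that substitution is an assumption and was not run here.

```r
# The two excluded rows, retyped from the printed output
excl <- data.frame(
  ID   = c(7, 8),
  TIME = c(120, 120),
  DV   = c(0.9761, 0.8700),
  .exclude_reason = "ignore: DV < 1.0"
)

# Every excluded observation should fall below the threshold
all_below <- all(excl$DV < 1.0)

# Number of excluded rows per reason
n_by_reason <- table(excl$.exclude_reason)

all_below
n_by_reason
```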
Fit
fit <- ferx_fit(ex$model, ex$data, verbose = FALSE)
print(fit)
============================================================
NONLINEAR MIXED EFFECTS MODEL ESTIMATION
============================================================
Model: warfarin_data_selection Dataset: warfarin
Method: FOCE | Gradient: FD | Subjects: 10 | Obs: 108
STATUS: CONVERGED 96 iterations 0.2s
OFV: -266.9150 AIC: -252.9150 BIC: -234.1401
DATA SELECTION
------------------------------------------------------------
Records read: 120 Obs excl.: 2 Doses excl.: 0 Other excl.: 0
Fired ignore: DV < 1.0
MODEL STRUCTURE (auto-derived)
------------------------------------------------------------
Structural: 1-cpt oral (TVCL, TVV, TVKA)
IIV: ETA_CL, ETA_V, ETA_KA
IOV: none
Residual: proportional
THETA
------------------------------------------------------------
Parameter Estimate SE %RSE
----------------------------------------------------
TVCL 0.132905 0.006664 5.0
TVV 7.730678 0.233804 3.0
TVKA 0.722044 0.124469 17.2
OMEGA (between-subject variability)
------------------------------------------------------------
ETA_CL [log-normal] = 0.028609 CV% = 17.0 SE = 0.012757
ETA_V [log-normal] = 0.009509 CV% = 9.8 SE = 0.004259
ETA_KA [log-normal] = 0.349301 CV% = 64.7 SE = 0.161022
SIGMA (residual error)
------------------------------------------------------------
PROP_ERR [proportional] = 0.010776 (var = 0.000116, CV% = 1.1) SE = 0.000952 [initial specified as SD]
SHRINKAGE
------------------------------------------------------------
ETA_CL: -0.3% ETA_V: 0.1% ETA_KA: -0.0% EPS: 17.9%
DIAGNOSTICS
------------------------------------------------------------
Covariance: computed Cond: 2.6 DW: 2.65 [negative autocorrelation] IWRES lag-1 r: -0.373
RUN INFO
------------------------------------------------------------
Gradient (requested): fd (used: fd)
ferx v0.1.6 (core v0.1.6)
SETTINGS (model file / call-time override)
------------------------------------------------------------
method foce [model only]
maxiter 300 [model only]
gradient fd [model only]
------------------------------------------------------------
1 warning -- call ferx_warnings(fit) for details
============================================================
The print output includes a DATA SELECTION section showing how many records were excluded and which conditions fired.
Exclusion details in fit$exclusions
fit$exclusions$n_records_total  # total records read
fit$exclusions$n_obs_excluded   # excluded observations
fit$exclusions$fired_ignore     # which ignore conditions fired
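These counts can be reconciled with the header of the printed fit. The sketch below uses the numbers copied from that output rather than a live fit object. Reading the leftover records as one dose record per subject is an inference from the arithmetic, not something the output states.

```r
# Values copied from the print(fit) output
n_records_total <- 120  # "Records read"
n_obs_excluded  <- 2    # "Obs excl."
n_obs_used      <- 108  # "Obs" in the header line
n_subjects      <- 10   # "Subjects"

# Records that are neither fitted observations nor excluded observations
n_non_obs <- n_records_total - n_obs_excluded - n_obs_used

n_non_obs               # 10
n_non_obs / n_subjects  # 1, consistent with one dose record per subject
```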
Comparison with the unfiltered fit
The filter removes two observations, both at TIME = 120 (the latest time point). Because these are the lowest-concentration samples, the effect on the estimates is small: TVCL changes in the fourth decimal place and TVV is unchanged at that precision:
fit_all <- ferx_fit(ferx_example("warfarin")$model,
                    ferx_example("warfarin")$data,
                    verbose = FALSE)
rbind(
  unfiltered = round(c(TVCL = fit_all$theta["TVCL"],
                       TVV  = fit_all$theta["TVV"]), 4),
  filtered   = round(c(TVCL = fit$theta["TVCL"],
                       TVV  = fit$theta["TVV"]), 4)
)
           TVCL.TVCL TVV.TVV
unfiltered    0.1330  7.7307
filtered      0.1329  7.7307
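Expressing the difference as a percentage makes its size easier to judge. The sketch below works from the rounded values in the table, so the result is only as precise as four decimals allow. The object names are illustrative.

```r
# Estimates retyped from the comparison table (rounded to 4 decimals)
est <- rbind(
  unfiltered = c(TVCL = 0.1330, TVV = 7.7307),
  filtered   = c(TVCL = 0.1329, TVV = 7.7307)
)

# Percent change of the filtered fit relative to the unfiltered fit
pct_change <- 100 * (est["filtered", ] - est["unfiltered", ]) / est["unfiltered", ]

round(pct_change, 2)
```

The change in TVCL is under 0.1% in magnitude, well inside the standard error of 0.006664 reported for TVCL in the filtered fit.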
R-side filtering as an alternative to [data_selection]
Pass a ferx_selection() result directly to ferx_fit() instead of writing a [data_selection] block. Both approaches produce the same fit:
filtered_data <- ferx_selection(ex$data, ignore = "DV < 1.0")
fit2 <- ferx_fit(ex$model, filtered_data)
When both sources supply conditions, they are merged, and duplicate expressions are removed automatically.
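The merge can be pictured as a union of condition strings. The base R sketch below is a conceptual illustration only, with hypothetical variable names. It assumes duplicates are detected by exact string match; the text does not say how ferx compares expressions, for example whether "DV<1.0" and "DV < 1.0" count as the same condition.

```r
# Conditions from the two sources (hypothetical variable names)
model_conditions <- c("DV < 1.0")  # from the [data_selection] block
call_conditions  <- c("DV < 1.0")  # from ferx_selection(..., ignore = ...)

# Union of both sources; exact-match duplicates collapse to one entry
merged <- unique(c(model_conditions, call_conditions))

merged          # "DV < 1.0"
length(merged)  # 1
```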