Fit options

The [fit_options] block in a .ferx file, and the arguments to ferx_fit(), control the estimation method, optimizer, convergence settings, and diagnostics. All keys have defaults — the block can be omitted entirely for a standard FOCEI fit.

Convergence troubleshooting

The most common reason to open this page is a fit that didn’t converge or gave surprising results. Work through this list in order:

1. Is the STATUS line red / does fit$converged return FALSE?

  • Check ferx_warnings(fit) — boundary parameters, high condition number, and unconverged subjects all appear there with guidance.
  • Run ferx_check_init() first: if ofv_drop is near zero, the initial values are the problem. Fix them or use inits_from_nca = "nca_sweep".
  • If OFV is dropping but hit the iteration cap, increase maxiter.
  • Plot the trace (optimizer_trace = TRUE then ferx_plot_trace(fit)) — is the OFV still declining at the last iteration, or has it stalled?

2. OFV converged but optimizer stalled above expected minimum

  • Switch from gradient-based (slsqp) to derivative-free (bobyqa) — the FOCE surface has small discontinuities from EBE re-estimation that trip gradient-based methods.
  • If using slsqp or lbfgs, try reconverge_gradient_interval = 1 to recover the full gradient surface (costs ~5–6× per outer iteration).
  • Try global_search = TRUE to escape a local basin.
  • Try n_starts = 8 for multi-start (≈ same wall time as one run on 8 cores).

3. Some subjects always fail their EBE inner loop

  • Raise inner_maxiter (e.g. 200L → 500L) and tighten inner_tol (e.g. 1e-4 → 1e-6).
  • Allow a fraction of failures: max_unconverged_frac = 0.1 (default). Raise to 0.2 for large sparse datasets.
  • Exclude very sparse subjects from the count: min_obs_for_convergence_check = 3.

4. IOV model stalls

  • Use a SAEM → FOCEI chain: method = [saem, focei]. The IOV variance surface is weakly identified and SAEM finds it more reliably than a cold FOCEI start.
  • Do not use slsqp with IOV without reconverge_gradient_interval = 1 — the fixed-EBE gradient cannot see the IOV variance direction.

5. SAEM doesn’t look converged

  • Inspect the trace (optimizer_trace = TRUE): the population parameters should settle to stable averages by the end of phase 2.
  • Increase n_convergence (default 250 → 400+) if the averaging phase is too short.
  • Increase omega_burnin (default 20) if Ω collapses in early iterations on sparse data.
  • SAEM with HMC (n_leapfrog = 3) mixes faster for correlated ETAs.

Which method should I use?

Situation Recommended
Starting out / standard PK focei (default)
Fast feedback during model building gn or gn_hybrid
Complex model, many ETAs, convergence issues c("saem", "focei") chain
Need the most accurate marginal log-likelihood c("focei", "imp")
Compare models on marginal likelihood imp as final chain stage

The method chaining guide covers these in depth with worked R code.


Estimation methods

focei — FOCE with Interaction (default)

Linearises around the EBE η̂; evaluates residual variance at η̂ (not η = 0). Correct for proportional and combined error models. The right choice for most NLME fits.

[fit_options]
  method     = focei
  covariance = true

Works with: all outer optimizers · BLOQ M3 · IOV · SAEM/GN chains · SIR


foce — FOCE (no interaction)

Evaluates residual variance at η = 0. Equivalent to FOCEI for additive error; slightly less accurate for proportional or combined. Use when reproducing a NONMEM FOCE result.

Works with: same as FOCEI.


gn — Gauss-Newton

BHHH Hessian approximation with Levenberg-Marquardt damping. Converges in 10–30 iterations on well-conditioned problems — the fastest method for feedback during model building. Does not accept an outer optimizer key.

[fit_options]
  method     = gn
  covariance = false   # skip covariance step for fast iterations

GN-specific option:

Key Default Description
gn_lambda 0.01 Levenberg-Marquardt damping. Increase (e.g. 0.1) when GN steps are too aggressive on a rough surface.

Works with: IOV · BLOQ M3 · multi-start · NCA inits

Does not accept: optimizer, global_search, stagnation_guard, reconverge_gradient_interval


gn_hybrid — Gauss-Newton then FOCEI polish

GN finds the basin; FOCEI refines to the MLE. Useful when GN is fast but its approximate Hessian gives slightly suboptimal final parameter estimates.

[fit_options]
  method     = gn_hybrid
  covariance = true

Works with: all outer optimizers (applied to the FOCEI polish stage) · everything GN accepts.


saem — Stochastic Approximation EM

Replaces MAP optimization with Metropolis-Hastings sampling for the E-step. More robust to multi-modal ETA posteriors and high-shrinkage scenarios. Run time is controlled by iteration counts, not maxiter.

[fit_options]
  method        = saem
  n_exploration = 150
  n_convergence = 250
  covariance    = true

SAEM-specific options:

Key Default Description
n_exploration 150 Phase 1 iterations — stochastic, step size γ = 1. Sampler explores the posterior.
n_convergence 250 Phase 2 iterations — step size anneals to 0. Running average converges.
n_mh_steps 10 Metropolis-Hastings proposals per subject per iteration. More = better mixing, slower per iteration.
adapt_interval 50 How often (iterations) to adapt the MH proposal scale.
omega_burnin 20 Hold Ω fixed for this many iterations while the sampler warms up. Guards against Ω collapse on sparse data. Set 0 to disable.
seed / saem_seed 12345 RNG seed for the MH sampler. Independent of multi_start_seed.
n_leapfrog / saem_n_leapfrog 0 Leapfrog steps for HMC proposals. 0 = Metropolis-Hastings. A positive value replaces MH with HMC — better mixing for correlated ETAs. The HMC η-sampler uses the analytic Dual2 sensitivity provider.

Works with: IOV · BLOQ M3 · method chains

Does not accept: optimizer, maxiter, global_search, reconverge_gradient_interval


imp — Importance Sampling (chain-terminal stage)

Computes the marginal log-likelihood by Monte Carlo importance sampling around the FOCEI/SAEM solution. Does not update parameters — purely diagnostic. Must be the last stage of a chain.

[fit_options]
  method = [focei, imp]

or in R: ferx_fit(model, data, method = c("focei", "imp")).

The marginal -2LL is more accurate than the FOCEI Laplace approximation and is comparable across non-nested models in the same family.

IMP-specific options (pass via settings = list(...) in R):

Key Default Description
is_samples 1000 Monte Carlo draws per subject. Halving the MC SE requires 4× more samples.
is_proposal_df 5 Student-t proposal degrees of freedom. Smaller = heavier tails = more robust to misspecified covariance, but less efficient.
is_seed (engine) RNG seed for IS step.
is_low_ess_threshold 0.1 ESS fraction below which a subject is flagged in fit$importance_sampling$low_ess_subject_ids.

Works with: any method as the preceding stage · IOV (kappa is fixed at EBE mode; a note appears in print(fit))


Method chains

Pass a list to run stages in sequence. Each stage is seeded from the previous converged parameters; only the final stage produces covariance/diagnostics.

[fit_options]
  method = [saem, focei]

In R: ferx_fit(model, data, method = c("saem", "focei")).

Common chains and their purpose:

Chain Why
[saem, focei] SAEM finds the basin robustly; FOCEI polishes to the MLE
[gn, focei] Fast GN warm-up then accurate FOCEI
[focei, imp] FOCEI fit + exact marginal log-likelihood
[saem, imp] Most robust: SAEM basin + exact marginal LL
[saem, focei, imp] Full pipeline: robust start → MLE → exact marginal LL

Options that apply to any stage in the chain are accepted without a warning (e.g. both n_exploration and maxiter are valid in a [saem, focei] chain).


Outer optimizer

Applies to FOCE / FOCEI / GN-hybrid (not pure GN, not SAEM).

[fit_options]
  optimizer = bobyqa    # default
Optimizer Algorithm When to use
bobyqa NLopt BOBYQA — derivative-free (default) General use. Robust to small surface discontinuities.
slsqp NLopt SLSQP — gradient-based Smooth, well-conditioned problems. Faster per iteration. Pair with reconverge_gradient_interval = 1 on IOV or difficult models.
trust_region Newton trust-region with Steihaug CG Ill-conditioned problems; guaranteed descent each step.
lbfgs / nlopt_lbfgs L-BFGS Moderate-sized parameter spaces.
mma NLopt Method of Moving Asymptotes When SLSQP stalls.
bfgs Hand-rolled BFGS Small parameter spaces.

Trust-region tuning:

Key Default Description
steihaug_max_iters ceil(sqrt(n_params)) ≥ 5 CG subproblem budget. Increase for ill-conditioned models. Only consumed with optimizer = trust_region.

Gradient-based optimizer tuning:

Key Default Description
stagnation_guard true Terminate NLopt early when OFV plateau is numerically flat. Set false to let SLSQP/L-BFGS run to their own xtol/ftol or maxiter.
reconverge_gradient_interval 0 Re-solve EBEs during the population gradient every N evaluations. 0 = fixed-EBE gradient (cheap). 1 = reconverge every evaluation (~5–6× cost, recovers the full surface). Helps gradient-based optimizers past ill-conditioned plateaus. IOV models always reconverge regardless of this setting.

Global pre-search (before local optimizer):

Key Default Description
global_search false Run NLopt CRS2-LM global search before local refinement.
global_maxeval 200 * (n_params + 1) OFV evaluation budget for the global stage.

Convergence and iteration control

Key Methods Default Description
maxiter FOCE/FOCEI/GN/GN-hybrid 500 Maximum outer iterations. Not used by SAEM (use n_exploration/n_convergence).
inner_maxiter all 200 Per-subject EBE iteration cap.
inner_tol all 1e-4 Gradient-norm convergence tolerance for the inner EBE loop.
max_unconverged_frac all 0.1 Fraction of subjects (with ≥ min_obs_for_convergence_check obs) allowed unconverged EBEs before the step is rejected. Raise for sparse populations.
min_obs_for_convergence_check all 2 Subjects with fewer observations are excluded from the convergence-guard count.

Special data features

Key Methods Default Description
iov_column all Name of the occasion column (e.g. OCC). Required for IOV / kappa models.
bloq_method all drop BLOQ handling: drop excludes CENS=1 rows; m3 uses Beal’s M3 likelihood (requires CENS column, DV = LLOQ on censored rows).

Gradient method

Controls how the inner-loop (per-subject EBE) gradient is computed. Set in the model file with gradient = ... or in R with the gradient argument.

Value Description
auto (default) Exact analytic Dual2 sensitivities when the model is in scope (analytical PK path); finite differences otherwise. Right choice for almost all uses.
fd Force finite differences. Always available. Required for SDE ([diffusion]) models and Form-C per-CMT scaling.

The legacy ad value is no longer accepted and returns an E_AD_RETIRED error — use auto or fd instead.

Set FERX_TIME_GRADIENTS=1 to print per-call timings at the end of a fit.


Starting values and robustness

Key / argument Methods Default Description
inits_from_nca all false Derive starting values from NCA before the optimizer. true = "nca_sweep". Options: "nca", "nca_sweep", "nca_ebe". Most useful with gn or trust_region.
n_starts FOCE/FOCEI/GN 1 Independent fits from perturbed initial values (multi-start). Best OFV wins.
start_sigma FOCE/FOCEI/GN 0.3 Log-space perturbation for multi-start (≈ 30% CV).
multi_start_seed FOCE/FOCEI/GN 42 RNG seed for start-point perturbations.
scale_params all false Apply per-coordinate scaling to the outer optimizer. Leave false (default) — see note.
mu_referencing all true Auto-detect mu-referencing from [individual_parameters]. Disable only for debugging.

Scaling divides each log/Cholesky-packed coordinate by its initial absolute value, so the optimizer works in a near-unit-magnitude space. This sounds helpful but is counterproductive for standard NLME: a coordinate like log(V) = log(20) ≈ 3 gets a scale factor of 3, so the optimizer’s unit step becomes a 3-unit move in log-space — a 20× multiplicative jump in V. That overshoots and, through the uniform gradient cap, starves the step in Ω dimensions. Leave at false unless you have a specific reason (e.g. A/B testing the scaled trajectory on an unusual model).


Uncertainty quantification

SIR (post-fit, requires covariance = true)

Key Default Description
sir false Enable Sampling Importance Resampling after the fit.
sir_samples 1000 Candidate draws (M). More → smoother CI, slower.
sir_resamples 250 Retained draws (m). Rule of thumb: m ≈ M/4.
sir_seed (engine) RNG seed, independent of saem_seed.

IMP (chain stage — see above)

IMP is the more accurate alternative when you need the marginal log-likelihood. SIR gives you non-parametric parameter uncertainty intervals. They serve different purposes and can be combined:

[fit_options]
  method        = [focei, imp]
  sir           = true
  sir_samples   = 2000
  sir_resamples = 500
  covariance    = true

Diagnostics and output

Key / argument Default Description
optimizer_trace false Write per-iteration CSV to a temp file. Read with ferx_trace(fit); plot with ferx_plot_trace(fit).
threads (all CPUs) Thread count for the Rust backend. Cap on shared machines or to avoid SMT contention: parallel::detectCores(logical = FALSE).
covariance true Compute covariance matrix and SEs. Set false during exploratory iterations to save time.

Settings precedence

Three layers can set the same option. The order of precedence (highest first):

  1. Dedicated ferx_fit() arguments (method, covariance, bloq_method, gradient, threads, sir, optimizer_trace, scale_params, inits_from_nca)
  2. settings = list(...) — fine-grained overrides in R
  3. [fit_options] in the .ferx file — model-file defaults

A warning() is emitted when a call-time value overrides the model file. Inspect fit$model_file_settings and fit$call_settings to audit.


Complete worked examples

Standard FOCEI (default):

[fit_options]
  method     = focei
  covariance = true

Fast exploratory GN:

[fit_options]
  method     = gn
  covariance = false

SAEM → FOCEI chain (complex model):

[fit_options]
  method        = [saem, focei]
  n_exploration = 200
  n_convergence = 300
  omega_burnin  = 30
  covariance    = true

FOCEI + exact marginal log-likelihood:

[fit_options]
  method   = [focei, imp]
  is_samples = 2000
  covariance = true

IOV model:

[fit_options]
  method     = focei
  iov_column = OCC
  covariance = true

IOV with SAEM (robust):

[fit_options]
  method        = [saem, focei]
  iov_column    = OCC
  n_exploration = 150
  n_convergence = 250
  omega_burnin  = 20
  covariance    = true

BLOQ M3:

[fit_options]
  method      = focei
  bloq_method = m3
  covariance  = true

Multi-start (8 parallel starts):

[fit_options]
  method           = focei
  n_starts         = 8
  start_sigma      = 0.3
  multi_start_seed = 42
  covariance       = true

Tight convergence + optimizer trace:

[fit_options]
  method          = focei
  optimizer       = slsqp
  maxiter         = 500
  inner_maxiter   = 100
  inner_tol       = 1e-6
  optimizer_trace = true
  covariance      = true

FOCEI + SIR uncertainty:

[fit_options]
  method        = focei
  covariance    = true
  sir           = true
  sir_samples   = 2000
  sir_resamples = 500

SAEM with HMC proposals:

[fit_options]
  method       = saem
  n_leapfrog   = 3
  omega_burnin = 30
  covariance   = false

Options by method — quick reference

Option foce focei gn gn_hybrid saem imp
maxiter
optimizer
global_search
reconverge_gradient_interval
stagnation_guard
gn_lambda
n_exploration / n_convergence
n_mh_steps / omega_burnin
n_leapfrog
is_samples / is_proposal_df
inner_maxiter / inner_tol
iov_column
bloq_method
n_starts / start_sigma
sir / sir_samples
covariance

(Chains: an option is accepted if it applies to any stage.)


See also