Fit options
The [fit_options] block in a .ferx file, and the arguments to ferx_fit(), control the estimation method, optimizer, convergence settings, and diagnostics. All keys have defaults — the block can be omitted entirely for a standard FOCEI fit.
Convergence troubleshooting
The most common reason to open this page is a fit that didn’t converge or gave surprising results. Work through this list in order:
1. Is the STATUS line red / does fit$converged return FALSE?
- Check
ferx_warnings(fit)— boundary parameters, high condition number, and unconverged subjects all appear there with guidance. - Run
ferx_check_init()first: ifofv_dropis near zero, the initial values are the problem. Fix them or useinits_from_nca = "nca_sweep". - If OFV is dropping but hit the iteration cap, increase
maxiter. - Plot the trace (
optimizer_trace = TRUEthenferx_plot_trace(fit)) — is the OFV still declining at the last iteration, or has it stalled?
2. OFV converged but optimizer stalled above expected minimum
- Switch from gradient-based (
slsqp) to derivative-free (bobyqa) — the FOCE surface has small discontinuities from EBE re-estimation that trip gradient-based methods. - If using
slsqporlbfgs, tryreconverge_gradient_interval = 1to recover the full gradient surface (costs ~5–6× per outer iteration). - Try
global_search = TRUEto escape a local basin. - Try
n_starts = 8for multi-start (≈ same wall time as one run on 8 cores).
3. Some subjects always fail their EBE inner loop
- Raise
inner_maxiter(e.g.200L → 500L) and tighteninner_tol(e.g.1e-4 → 1e-6). - Allow a fraction of failures:
max_unconverged_frac = 0.1(default). Raise to0.2for large sparse datasets. - Exclude very sparse subjects from the count:
min_obs_for_convergence_check = 3.
4. IOV model stalls
- Use a SAEM → FOCEI chain:
method = [saem, focei]. The IOV variance surface is weakly identified and SAEM finds it more reliably than a cold FOCEI start. - Do not use
slsqpwith IOV withoutreconverge_gradient_interval = 1— the fixed-EBE gradient cannot see the IOV variance direction.
5. SAEM doesn’t look converged
- Inspect the trace (
optimizer_trace = TRUE): the population parameters should settle to stable averages by the end of phase 2. - Increase
n_convergence(default 250 → 400+) if the averaging phase is too short. - Increase
omega_burnin(default 20) if Ω collapses in early iterations on sparse data. - SAEM with HMC (
n_leapfrog = 3) mixes faster for correlated ETAs.
Which method should I use?
| Situation | Recommended |
|---|---|
| Starting out / standard PK | focei (default) |
| Fast feedback during model building | gn or gn_hybrid |
| Complex model, many ETAs, convergence issues | c("saem", "focei") chain |
| Need the most accurate marginal log-likelihood | c("focei", "imp") |
| Compare models on marginal likelihood | imp as final chain stage |
The method chaining guide covers these in depth with worked R code.
Estimation methods
focei — FOCE with Interaction (default)
Linearises around the EBE η̂; evaluates residual variance at η̂ (not η = 0). Correct for proportional and combined error models. The right choice for most NLME fits.
[fit_options]
method = focei
covariance = true
Works with: all outer optimizers · BLOQ M3 · IOV · SAEM/GN chains · SIR
foce — FOCE (no interaction)
Evaluates residual variance at η = 0. Equivalent to FOCEI for additive error; slightly less accurate for proportional or combined. Use when reproducing a NONMEM FOCE result.
Works with: same as FOCEI.
gn — Gauss-Newton
BHHH Hessian approximation with Levenberg-Marquardt damping. Converges in 10–30 iterations on well-conditioned problems — the fastest method for feedback during model building. Does not accept an outer optimizer key.
[fit_options]
method = gn
covariance = false # skip covariance step for fast iterations
GN-specific option:
| Key | Default | Description |
|---|---|---|
gn_lambda |
0.01 |
Levenberg-Marquardt damping. Increase (e.g. 0.1) when GN steps are too aggressive on a rough surface. |
Works with: IOV · BLOQ M3 · multi-start · NCA inits
Does not accept: optimizer, global_search, stagnation_guard, reconverge_gradient_interval
gn_hybrid — Gauss-Newton then FOCEI polish
GN finds the basin; FOCEI refines to the MLE. Useful when GN is fast but its approximate Hessian gives slightly suboptimal final parameter estimates.
[fit_options]
method = gn_hybrid
covariance = true
Works with: all outer optimizers (applied to the FOCEI polish stage) · everything GN accepts.
saem — Stochastic Approximation EM
Replaces MAP optimization with Metropolis-Hastings sampling for the E-step. More robust to multi-modal ETA posteriors and high-shrinkage scenarios. Run time is controlled by iteration counts, not maxiter.
[fit_options]
method = saem
n_exploration = 150
n_convergence = 250
covariance = true
SAEM-specific options:
| Key | Default | Description |
|---|---|---|
n_exploration |
150 |
Phase 1 iterations — stochastic, step size γ = 1. Sampler explores the posterior. |
n_convergence |
250 |
Phase 2 iterations — step size anneals to 0. Running average converges. |
n_mh_steps |
10 |
Metropolis-Hastings proposals per subject per iteration. More = better mixing, slower per iteration. |
adapt_interval |
50 |
How often (iterations) to adapt the MH proposal scale. |
omega_burnin |
20 |
Hold Ω fixed for this many iterations while the sampler warms up. Guards against Ω collapse on sparse data. Set 0 to disable. |
seed / saem_seed |
12345 |
RNG seed for the MH sampler. Independent of multi_start_seed. |
n_leapfrog / saem_n_leapfrog |
0 |
Leapfrog steps for HMC proposals. 0 = Metropolis-Hastings. A positive value replaces MH with HMC — better mixing for correlated ETAs. The HMC η-sampler uses the analytic Dual2 sensitivity provider. |
Works with: IOV · BLOQ M3 · method chains
Does not accept: optimizer, maxiter, global_search, reconverge_gradient_interval
imp — Importance Sampling (chain-terminal stage)
Computes the marginal log-likelihood by Monte Carlo importance sampling around the FOCEI/SAEM solution. Does not update parameters — purely diagnostic. Must be the last stage of a chain.
[fit_options]
method = [focei, imp]
or in R: ferx_fit(model, data, method = c("focei", "imp")).
The marginal -2LL is more accurate than the FOCEI Laplace approximation and is comparable across non-nested models in the same family.
IMP-specific options (pass via settings = list(...) in R):
| Key | Default | Description |
|---|---|---|
is_samples |
1000 |
Monte Carlo draws per subject. Halving the MC SE requires 4× more samples. |
is_proposal_df |
5 |
Student-t proposal degrees of freedom. Smaller = heavier tails = more robust to misspecified covariance, but less efficient. |
is_seed |
(engine) | RNG seed for IS step. |
is_low_ess_threshold |
0.1 |
ESS fraction below which a subject is flagged in fit$importance_sampling$low_ess_subject_ids. |
Works with: any method as the preceding stage · IOV (kappa is fixed at EBE mode; a note appears in print(fit))
Method chains
Pass a list to run stages in sequence. Each stage is seeded from the previous converged parameters; only the final stage produces covariance/diagnostics.
[fit_options]
method = [saem, focei]
In R: ferx_fit(model, data, method = c("saem", "focei")).
Common chains and their purpose:
| Chain | Why |
|---|---|
[saem, focei] |
SAEM finds the basin robustly; FOCEI polishes to the MLE |
[gn, focei] |
Fast GN warm-up then accurate FOCEI |
[focei, imp] |
FOCEI fit + exact marginal log-likelihood |
[saem, imp] |
Most robust: SAEM basin + exact marginal LL |
[saem, focei, imp] |
Full pipeline: robust start → MLE → exact marginal LL |
Options that apply to any stage in the chain are accepted without a warning (e.g. both n_exploration and maxiter are valid in a [saem, focei] chain).
Outer optimizer
Applies to FOCE / FOCEI / GN-hybrid (not pure GN, not SAEM).
[fit_options]
optimizer = bobyqa # default
| Optimizer | Algorithm | When to use |
|---|---|---|
bobyqa |
NLopt BOBYQA — derivative-free (default) | General use. Robust to small surface discontinuities. |
slsqp |
NLopt SLSQP — gradient-based | Smooth, well-conditioned problems. Faster per iteration. Pair with reconverge_gradient_interval = 1 on IOV or difficult models. |
trust_region |
Newton trust-region with Steihaug CG | Ill-conditioned problems; guaranteed descent each step. |
lbfgs / nlopt_lbfgs |
L-BFGS | Moderate-sized parameter spaces. |
mma |
NLopt Method of Moving Asymptotes | When SLSQP stalls. |
bfgs |
Hand-rolled BFGS | Small parameter spaces. |
Trust-region tuning:
| Key | Default | Description |
|---|---|---|
steihaug_max_iters |
ceil(sqrt(n_params)) ≥ 5 |
CG subproblem budget. Increase for ill-conditioned models. Only consumed with optimizer = trust_region. |
Gradient-based optimizer tuning:
| Key | Default | Description |
|---|---|---|
stagnation_guard |
true |
Terminate NLopt early when OFV plateau is numerically flat. Set false to let SLSQP/L-BFGS run to their own xtol/ftol or maxiter. |
reconverge_gradient_interval |
0 |
Re-solve EBEs during the population gradient every N evaluations. 0 = fixed-EBE gradient (cheap). 1 = reconverge every evaluation (~5–6× cost, recovers the full surface). Helps gradient-based optimizers past ill-conditioned plateaus. IOV models always reconverge regardless of this setting. |
Global pre-search (before local optimizer):
| Key | Default | Description |
|---|---|---|
global_search |
false |
Run NLopt CRS2-LM global search before local refinement. |
global_maxeval |
200 * (n_params + 1) |
OFV evaluation budget for the global stage. |
Convergence and iteration control
| Key | Methods | Default | Description |
|---|---|---|---|
maxiter |
FOCE/FOCEI/GN/GN-hybrid | 500 |
Maximum outer iterations. Not used by SAEM (use n_exploration/n_convergence). |
inner_maxiter |
all | 200 |
Per-subject EBE iteration cap. |
inner_tol |
all | 1e-4 |
Gradient-norm convergence tolerance for the inner EBE loop. |
max_unconverged_frac |
all | 0.1 |
Fraction of subjects (with ≥ min_obs_for_convergence_check obs) allowed unconverged EBEs before the step is rejected. Raise for sparse populations. |
min_obs_for_convergence_check |
all | 2 |
Subjects with fewer observations are excluded from the convergence-guard count. |
Special data features
| Key | Methods | Default | Description |
|---|---|---|---|
iov_column |
all | — | Name of the occasion column (e.g. OCC). Required for IOV / kappa models. |
bloq_method |
all | drop |
BLOQ handling: drop excludes CENS=1 rows; m3 uses Beal’s M3 likelihood (requires CENS column, DV = LLOQ on censored rows). |
Gradient method
Controls how the inner-loop (per-subject EBE) gradient is computed. Set in the model file with gradient = ... or in R with the gradient argument.
| Value | Description |
|---|---|
auto (default) |
Exact analytic Dual2 sensitivities when the model is in scope (analytical PK path); finite differences otherwise. Right choice for almost all uses. |
fd |
Force finite differences. Always available. Required for SDE ([diffusion]) models and Form-C per-CMT scaling. |
The legacy ad value is no longer accepted and returns an E_AD_RETIRED error — use auto or fd instead.
Set FERX_TIME_GRADIENTS=1 to print per-call timings at the end of a fit.
Starting values and robustness
| Key / argument | Methods | Default | Description |
|---|---|---|---|
inits_from_nca |
all | false |
Derive starting values from NCA before the optimizer. true = "nca_sweep". Options: "nca", "nca_sweep", "nca_ebe". Most useful with gn or trust_region. |
n_starts |
FOCE/FOCEI/GN | 1 |
Independent fits from perturbed initial values (multi-start). Best OFV wins. |
start_sigma |
FOCE/FOCEI/GN | 0.3 |
Log-space perturbation for multi-start (≈ 30% CV). |
multi_start_seed |
FOCE/FOCEI/GN | 42 |
RNG seed for start-point perturbations. |
scale_params |
all | false |
Apply per-coordinate scaling to the outer optimizer. Leave false (default) — see note. |
mu_referencing |
all | true |
Auto-detect mu-referencing from [individual_parameters]. Disable only for debugging. |
scale_params defaults to false
Scaling divides each log/Cholesky-packed coordinate by its initial absolute value, so the optimizer works in a near-unit-magnitude space. This sounds helpful but is counterproductive for standard NLME: a coordinate like log(V) = log(20) ≈ 3 gets a scale factor of 3, so the optimizer’s unit step becomes a 3-unit move in log-space — a 20× multiplicative jump in V. That overshoots and, through the uniform gradient cap, starves the step in Ω dimensions. Leave at false unless you have a specific reason (e.g. A/B testing the scaled trajectory on an unusual model).
Uncertainty quantification
SIR (post-fit, requires covariance = true)
| Key | Default | Description |
|---|---|---|
sir |
false |
Enable Sampling Importance Resampling after the fit. |
sir_samples |
1000 |
Candidate draws (M). More → smoother CI, slower. |
sir_resamples |
250 |
Retained draws (m). Rule of thumb: m ≈ M/4. |
sir_seed |
(engine) | RNG seed, independent of saem_seed. |
IMP (chain stage — see above)
IMP is the more accurate alternative when you need the marginal log-likelihood. SIR gives you non-parametric parameter uncertainty intervals. They serve different purposes and can be combined:
[fit_options]
method = [focei, imp]
sir = true
sir_samples = 2000
sir_resamples = 500
covariance = true
Diagnostics and output
| Key / argument | Default | Description |
|---|---|---|
optimizer_trace |
false |
Write per-iteration CSV to a temp file. Read with ferx_trace(fit); plot with ferx_plot_trace(fit). |
threads |
(all CPUs) | Thread count for the Rust backend. Cap on shared machines or to avoid SMT contention: parallel::detectCores(logical = FALSE). |
covariance |
true |
Compute covariance matrix and SEs. Set false during exploratory iterations to save time. |
Settings precedence
Three layers can set the same option. The order of precedence (highest first):
- Dedicated
ferx_fit()arguments (method,covariance,bloq_method,gradient,threads,sir,optimizer_trace,scale_params,inits_from_nca) settings = list(...)— fine-grained overrides in R[fit_options]in the.ferxfile — model-file defaults
A warning() is emitted when a call-time value overrides the model file. Inspect fit$model_file_settings and fit$call_settings to audit.
Complete worked examples
Standard FOCEI (default):
[fit_options]
method = focei
covariance = true
Fast exploratory GN:
[fit_options]
method = gn
covariance = false
SAEM → FOCEI chain (complex model):
[fit_options]
method = [saem, focei]
n_exploration = 200
n_convergence = 300
omega_burnin = 30
covariance = true
FOCEI + exact marginal log-likelihood:
[fit_options]
method = [focei, imp]
is_samples = 2000
covariance = true
IOV model:
[fit_options]
method = focei
iov_column = OCC
covariance = true
IOV with SAEM (robust):
[fit_options]
method = [saem, focei]
iov_column = OCC
n_exploration = 150
n_convergence = 250
omega_burnin = 20
covariance = true
BLOQ M3:
[fit_options]
method = focei
bloq_method = m3
covariance = true
Multi-start (8 parallel starts):
[fit_options]
method = focei
n_starts = 8
start_sigma = 0.3
multi_start_seed = 42
covariance = true
Tight convergence + optimizer trace:
[fit_options]
method = focei
optimizer = slsqp
maxiter = 500
inner_maxiter = 100
inner_tol = 1e-6
optimizer_trace = true
covariance = true
FOCEI + SIR uncertainty:
[fit_options]
method = focei
covariance = true
sir = true
sir_samples = 2000
sir_resamples = 500
SAEM with HMC proposals:
[fit_options]
method = saem
n_leapfrog = 3
omega_burnin = 30
covariance = false
Options by method — quick reference
| Option | foce | focei | gn | gn_hybrid | saem | imp |
|---|---|---|---|---|---|---|
maxiter |
✓ | ✓ | ✓ | ✓ | — | — |
optimizer |
✓ | ✓ | — | ✓ | — | — |
global_search |
✓ | ✓ | — | ✓ | — | — |
reconverge_gradient_interval |
✓ | ✓ | — | ✓ | — | — |
stagnation_guard |
✓ | ✓ | — | ✓ | — | — |
gn_lambda |
— | — | ✓ | ✓ | — | — |
n_exploration / n_convergence |
— | — | — | — | ✓ | — |
n_mh_steps / omega_burnin |
— | — | — | — | ✓ | — |
n_leapfrog |
— | — | — | — | ✓ | — |
is_samples / is_proposal_df |
— | — | — | — | — | ✓ |
inner_maxiter / inner_tol |
✓ | ✓ | ✓ | ✓ | ✓ | — |
iov_column |
✓ | ✓ | ✓ | ✓ | ✓ | — |
bloq_method |
✓ | ✓ | ✓ | ✓ | ✓ | — |
n_starts / start_sigma |
✓ | ✓ | ✓ | ✓ | — | — |
sir / sir_samples |
✓ | ✓ | ✓ | ✓ | ✓ | — |
covariance |
✓ | ✓ | ✓ | ✓ | ✓ | — |
(Chains: an option is accepted if it applies to any stage.)