Environmental linkages and priors: a formula-driven API
Source:vignettes/environmental-linkages-and-priors.Rmd
environmental-linkages-and-priors.RmdOverview
Environmental drivers (temperature, prey density, climate indices)
enter stock assessments through several distinct process modules:
recruitment, natural mortality, growth, catchability, selectivity.
Rceattle exposes a long-format linkage table where each
row corresponds to exactly one estimated coefficient that ties an
environmental term (or any user-specified design column) to a process
parameter. Users describe the linkages they want with
linkage_spec() – a formula-driven helper – and pass the
result into the relevant build_*() function. Internally,
fit_mod() pools every spec into a single
data.frame with one row per beta, paired with
a shared design matrix X. The TMB template iterates the
table once, accumulates per-process offsets, and adds them to the
underlying linear predictors.
This vignette covers the user-facing API, the formula syntax it
supports, prior and bound options, and the per-species variations. The
linkage path is wired for growth (von Bertalanffy and
Richards), natural mortality (log_M1), and
recruitment (log_R0,
log_alpha, log_beta).
A first example: temperature on K
Suppose we want sex- and species-specific von Bertalanffy growth,
with the von Bertalanffy K parameter responding to a
temperature index by species:
library(Rceattle)
data(whamGrowthData)
plot_data(whamGrowthData)
# Use logistic selectivity for all fleets in this example.
whamGrowthData$fleet_control$Selectivity <- "Logistic"
whamGrowthData$fleet_control$Selectivity_dimension <- "Length"
# 1. Attach an env_data table to the data list. Must include `Year`
# plus whatever covariates show up in your formulas. Here `temp` is
# a placeholder; in real use it would be a standardised SST series,
# PDO index, etc.
yrs <- whamGrowthData$styr:whamGrowthData$endyr
whamGrowthData$env_data <- data.frame(
Year = yrs,
temp = rnorm(length(yrs), 0, 0.1)
)
# 2. Build a growth specification with one linkage per parameter that
# you want to vary. linkage_spec() captures (formula, by, link,
# init, bounds, priors) for one process parameter.
growth_spec <- build_growth(
fun = "vonBertalanffy",
linkages = list(
log_K = linkage_spec(
formula = ~ temp, # 2 columns: (Intercept) and `temp`
by = ~ species + sex, # one beta per (species, sex, column)
priors = list(temp = normal(0, 0.1)) # Tight prior
)
)
)
# 3. Fit as usual.
fit <- fit_mod(
data_list = whamGrowthData,
growthFun = growth_spec,
estimateMode = 0,
msmMode = 0,
fit_control = fit_control(verbose = 0)
)
summary(fit)The linkage_spec() API
linkage_spec(
formula,
param = NULL,
by = ~ species,
species = NULL,
sex = NULL,
link = c("identity", "log", "logit"),
init = NULL,
bounds = NULL,
priors = NULL,
re_group = NA_character_,
est_phase = 1L
)-
formula– a one-sided R formula that describes the linear predictor for the target parameter. Anythingmodel.matrix()understands works:~ 1,~ temp,~ temp + PDO,~ temp:PDO,~ poly(temp, 2), factor predictors, etc. -
param– the target parameter name (e.g."log_K"). When the spec is registered inside abuild_*()linkages list keyed by parameter name,paramis filled in automatically from the list key. -
by– a one-sided formula naming the stratifying factors that should each get their own coefficients. Allowed names arespecies,sex, andage_bin. The default is~species(one coefficient set per species). Pass~species + sexfor per-(species, sex) coefficients,~species + age_binfor an age-binned M response, orNULLto share a single coefficient set across every species/sex. -
species– optional integer vector of 1-based species ids this spec applies to.NULL(default) means every species in the model. Use this to register different formulas for different species under the same target parameter – see Per-species formulas below. -
sex– optional vector of sex ids this spec applies to. May be supplied as integers (1L= female,2L= male) or as the strings"Females"/"Males"(case-insensitive;"female","male","f","m"also accepted).NULL(default) means every sex in the model. Only meaningful whenbyincludessex; otherwise the filter is a no-op. Use this to register different formulas or priors per sex under the same target parameter – see Per-sex formulas below. -
link– the link function applied to the predictor (NOT YET IMPLEMENTED) -
init,bounds– optional named lists keyed by design-matrix column name.init = list(temp = 0.05)sets the temp slope’s initial value to 0.05;bounds = list(temp = c(-2, 2))constrains it during optimisation. -
priors– a named list of prior objects keyed by design-matrix column. Each entry is either a singleRceattle_prior(applied to every species/sex row that uses that column) or a named list of priors keyed by species id (one prior per species). Insidelinkage_spec()(and only there) the unprefixed short-namesnormal(),lognormal(),gamma(), andbeta()resolve to the correspondingprior_*constructors via a private data mask – see Priors below. -
re_group– random-effects grouping label (currently a placeholder). -
est_phase– estimation phase ordinal.0fixes the coefficient at its initial value;1+makes it estimable. Phased fitting throughfit_control(phase = TRUE)honors the values you pick.
Priors
Five families are available:
prior_normal(mean, sd)
prior_lognormal(meanlog, sdlog)
prior_gamma(shape, rate)
prior_beta(shape1, shape2)
# Family "none" is implicit -- no prior contribution.Inside linkage_spec(priors = ...) the constructors are
available without the prior_ prefix so
user code stays close to mathematical notation:
linkage_spec(
formula = ~ temp + PDO,
by = ~ species,
priors = list(
temp = normal(0, 1), # informative prior, ~95% in (-2, 2)
PDO = normal(0, 5) # weak prior
)
)The unprefixed forms are scoped to the
priors = ... argument and do not mask
base::gamma() or base::beta() at the package
level. Outside that argument you can still call the math functions
directly:
gamma(5) # 24, the math gamma function
beta(2, 3) # 1/12, the math beta function
linkage_spec(
formula = ~ temp,
priors = list(temp = gamma(2, 3)) # interpreted as prior_gamma(shape=2, rate=3)
)For programmatic prior assembly (e.g. species-specific priors built
in a loop) call the prior_* constructors directly:
priors <- lapply(species_specific_sds, function(s) prior_normal(0, s))
names(priors) <- c("temp", "PDO", "salinity")
linkage_spec(formula = ~ temp + PDO + salinity, priors = priors)Each prior’s negative log-density contributes to slot 20 of the joint
NLL (fit$quantities$jnll_comp[20, ] in R):
fit$quantities$jnll_comp["Linkage-table priors", ]Putting a prior on the parameter itself
Bayesian-style informative priors can be applied directly to the base
growth parameters (log_k, log_L1,
log_Linf, log_m), natural mortality
(log_M1), or recruitment parameters (log_R0,
log_alpha, log_beta).
A prior attached to an (Intercept) row is
re-targeted to the base parameter. The same syntax is
used for any other column.
# Example: prior on log_K for von Bertalanffy growth
growth_spec <- build_growth(
fun = "vonBertalanffy",
linkages = list(
log_K = linkage_spec(
formula = ~ 1, # intercept only -- no env effects
by = ~ species + sex,
init = list(`(Intercept)` = log(0.3)), # start at K = 0.3
priors = list(`(Intercept)` = normal(log(0.3), 0.1))
)
)
)The same pattern works for any linkage target –
log_Linf, log_L1, log_m
(Richards), log_M1, log_R0,
log_alpha, log_beta. You can also key the
prior by species id for species- specific priors:
rec_spec <- build_srr(
srr_fun = 0,
linkages = list(
log_R0 = linkage_spec(
formula = ~ 1,
by = ~ species,
init = list(`(Intercept)` = log(1000)),
priors = list(`(Intercept)` = list(
`1` = normal(log(1000), 0.2),
`2` = normal(log(1500), 0.3),
`3` = normal(log(800), 0.25)
))
)
)
)The prior contribution appears in slot 20 of the joint NLL:
fit$quantities$jnll_comp["Linkage-table priors", ]You can inspect the linked parameter values via the linkage table, the estimated slope coefficients, and the base-parameter array:
fit$data_list$linkage_table[fit$data_list$linkage_table$param == "log_K", ]
fit$estimated_params$beta_linkage # (Intercept) rows are 0; slopes are estimated
fit$estimated_params$log_growth_pars # base log_K / log_L1 / log_Linf / log_mThe same approach works for log_M1 when
you want a prior on natural mortality itself:
data("GOApollock") # Using GOA pollock as an example
# One-species model, normal(log(0.4), 0.1) prior on log_M1
# applied via the linkage intercept. The structural M1_model is
# kept at "sex_age_invariant" (= 1).
M1_spec <- build_M1(
M1_model = "sex_age_invariant",
linkages = list(
log_M1 = linkage_spec(
formula = ~ 1,
by = ~ species,
init = list(`(Intercept)` = log(0.34)),
priors = list(`(Intercept)` = normal(log(0.34), 0.01))
)
)
)
# Fit the model - no manual mapping needed
fit <- fit_mod(
data_list = GOApollock,
M1Fun = M1_spec,
estimateMode = 0,
fit_control = fit_control(phase = TRUE)
)The linkage prior is applied verbatim, with no lognormal bias-correction term.
Species- and (species, sex)-specific priors
Each entry under priors = list(...) may be:
- a single prior – applies to every species/sex row that uses that design column,
- a named list of priors keyed by species id – one prior per specified species, or
- a named list of priors keyed by species id and sex – one prior per specified species and sex, or
- a two-level nested list keyed first by specified species, then by specified sex id.
# Species-only keying
linkage_spec(
formula = ~ temp,
by = ~ species,
priors = list(
temp = list(
`1` = normal(0, 0.1), # tight prior for species 1
#`2` = normal(0, 0.5), # No prior species 2
`3` = normal(0, 1.0) # weakest for species 3
# any species not listed gets no prior
)
)
)For sexually dimorphic stocks, the same temp column can
take different priors per (species, sex). At each level the value can be
either an Rceattle_prior (applies to all sub-strata) or a
list that drills down further:
linkage_spec(
formula = ~ temp,
by = ~ species + sex,
priors = list(
temp = list(
# Species 1: different tightness per sex.
`1` = list(`1` = normal(0, 0.1), # sp 1, sex 1 (e.g. females)
`2` = normal(0, 0.2)), # sp 1, sex 2 (e.g. males)
# Species 2: scalar applies to every sex.
`2` = normal(0, 0.5)
# Any (species, sex) combination not listed gets no prior.
)
)
)The same syntax works for any prior family. This is the recommended pattern when a covariate has different prior strength across stocks (e.g. when one species has prior literature on its temp sensitivity but another doesn’t), or when within a species the sexes are expected to respond differently. Built programmatically, it’s just:
sd_by_species <- c(`1` = 0.1, `2` = 0.5, `3` = 1.0)
priors_per_sp <- lapply(sd_by_species, function(s) prior_normal(0, s))
linkage_spec(
formula = ~ temp,
by = ~ species,
priors = list(temp = priors_per_sp)
)Per-species formulas
If species have qualitatively different env relationships – e.g.
species 1 responds only to temperature, species 2 also to PDO – register
multiple specs against the same parameter and use the
species argument on each to scope it to a subset of
stocks:
build_growth(
fun = "vonBertalanffy",
linkages = list(
log_K = list(
linkage_spec(
formula = ~ temp,
by = ~ species,
species = 1L, # only sp 1
priors = list(temp = normal(0, 0.5))
),
linkage_spec(
formula = ~ temp + PDO,
by = ~ species,
species = 2L, # only sp 2
priors = list(temp = normal(0, 1),
PDO = normal(0, 1))
)
)
)
)Linkages can accepts both forms under each parameter key:
linkages = list(log_K = single_spec) # one formula for all sp
linkages = list(log_K = list(spec_a, spec_b)) # different formulas per spInspecting the result, fit$linkages$log_K is either an
Rceattle_linkage_spec or a list of them, and
fit$data_list$linkage_table shows the materialized rows
with species and X_col.
This pattern composes with species-specific priors – each spec carries its own priors, and within a spec the priors can themselves be species-keyed if you want even finer control:
linkage_spec(
formula = ~ temp,
by = ~ species + sex,
species = c(1L, 2L),
priors = list(
temp = list(
`1` = normal(0, 0.3),
`2` = normal(0, 0.7)
)
)
)Per-sex formulas
The sex argument on linkage_spec() mirrors
species: it scopes a spec to a subset of sex ids. Combined
with the multi-spec list form, this lets you register separate specs
per sex against the same parameter – typical use cases are
sex-dimorphic M priors or distinct env responses for females
vs. males:
# Female and male M1 each get their own intercept-only prior.
M1_spec <- build_M1(
M1_model = "sex_specific",
linkages = list(
log_M1 = list(
linkage_spec(
formula = ~ 1,
by = ~ species + sex,
sex = "Females", # or sex = 1L
init = list(`(Intercept)` = log(0.12)),
priors = list(`(Intercept)` = normal(log(0.12), 0.0001))
),
linkage_spec(
formula = ~ 1,
by = ~ species + sex,
sex = "Males", # or sex = 2L
init = list(`(Intercept)` = log(0.20)),
priors = list(`(Intercept)` = normal(log(0.20), 0.0001))
)
)
)
)The sex filter is a no-op when by does not
include sex. If you want different formulas per
sex (rather than just different priors or inits), use the multi-spec
form – one spec with sex = 1L and its own RHS, another with
sex = 2L and a different RHS.
If you only need different priors per (species, sex) for a single
shared formula, the nested-priors form (see Species- and (species,
sex)-specific priors above) is simpler – one spec, one set of
design columns, prior values keyed by species then
sex. Reach for the per-sex multi-spec form when you need
different RHS terms, different init/bounds, or
different est_phase between sexes.
Polynomial and basis-expansion formulas
The formula passed to linkage_spec() is fed straight to
stats::model.matrix(),
so any expression model.matrix() understands works –
polynomial bases, splines, interactions, and arithmetic
transformations.
Polynomials. Use poly() for orthogonal
polynomial bases (the default; numerically well-conditioned), or
poly(..., raw = TRUE) for raw monomials. Each basis order
shows up as a separate design column with its own coefficient:
# 4th-order orthogonal polynomial in temp, by species.
linkage_spec(
formula = ~ poly(temp, 4),
by = ~ species
)
# design columns produced:
# (Intercept), poly(temp, 4)1, poly(temp, 4)2,
# poly(temp, 4)3, poly(temp, 4)4poly() orthogonalises the basis against the rows of
env_data, so the columns are zero-mean over the model years
and the linear slope, quadratic curvature, etc. are estimated
independently – generally the right choice if you don’t have a specific
scientific reason to prefer raw monomials.
Arbitrary arithmetic. Use I() to insert
a literal expression (otherwise temp^2 is interpreted as a
formula crossing operator and reduces to temp):
# Linear + raw quadratic in temp.
linkage_spec(
formula = ~ temp + I(temp^2),
by = ~ species,
priors = list(`I(temp^2)` = normal(0, 0.1)) # tight prior on curvature
)The design column name is the deparsed call (I(temp^2)),
so when you key init, bounds, or
priors against it you have to use the same string –
backticks let you write it as a name.
Splines. Natural cubic splines via
splines::ns() work out-of-the-box and are often a better
choice than high-order polynomials when you want flexibility without
strong tail behaviour:
linkage_spec(
formula = ~ splines::ns(temp, df = 4),
by = ~ species
)Interactions. temp:PDO adds a single
product column; temp * PDO is shorthand for
temp + PDO + temp:PDO:
linkage_spec(
formula = ~ temp * PDO,
by = ~ species,
priors = list(`temp:PDO` = normal(0, 0.5))
)Mixing forms across species. Combined with the per-species formula machinery, you can give one species a polynomial response and another a linear one without duplicating columns:
build_growth(
fun = "vonBertalanffy",
linkages = list(
log_K = list(
linkage_spec(formula = ~ poly(temp, 3),
by = ~ species, species = 1L),
linkage_spec(formula = ~ temp,
by = ~ species, species = 2L)
)
)
)A practical caution: same-named columns must be numerically identical
across specs that use them. With deterministic formulas evaluated
against the same env_data this is automatic; it can only
fail if you’ve built specs against different env_data
frames.
What happens internally
When you call fit_mod() with linkages:
build_growth(fun, linkages)returns a list whoselinkageselement is a named list ofRceattle_linkage_specobjects (or, for per-species formulas, lists of them).fit_mod()stashesgrowth_fun/growth_linkagesandM1_linkagesondata_list, runsdata_check(), then callspool_linkages(spec_groups = list(growth = ..., M = ...)).pool_linkages()materialises every spec againstdata_list$env_dataand the species/sex/age strata, unions the per-spec design columns by name into a single shared design matrixX, and remaps each row’s localX_colto the global column index. The resultingRceattle_linkage_tableandlinkage_Xmatrix are stored ondata_list.encode_linkage_for_tmb()turns the table into the parallel integer/numeric vectors that the TMB template consumes (linkage_process,linkage_param,linkage_species,linkage_sex,linkage_age_bin,linkage_X_col,linkage_link,linkage_prior_family,linkage_prior_p1,linkage_prior_p2).-
The TMB template iterates the linkage rows once and builds per-process offset tensors:
-
growth_linkage_offsetof shape[nspp, max_sex, n_growth_params, nyrs]viarceattle_apply_linkages()insrc/TMB/linkage.hpp, added togrowth_parameterson the log scale. -
M_linkage_offsetof shape[nspp, max_sex, max_age, nyrs]viarceattle_apply_M_linkages(), added toln_M1inside theM1_at_agecompute. -
recruitment_linkage_offsetof shape[nspp, n_rec_params, nyrs]viarceattle_apply_recruitment_linkages(), multiplied against(R0, exp(rec_pars(*, alpha)), exp(rec_pars(*, beta)))each year at everycalculate_recruitment()call site (hindcast, BRPs, dynamic BRPs, projections, expected R). Year 0 is treated like any other year – the(Intercept)row is held at zero, so the offset is purely the slope contribution.
All three accumulators read from the same encoded table and the same
beta_linkageparameter vector; rows dispatch by theirprocesscolumn. With no linkages the offsets stay at zero so existing fits are identical-by-construction. -
Priors and bounds flow through the same table:
build_bounds()reads each row’slower/upperintobounds$lower$beta_linkage/bounds$upper$beta_linkage, and the cpp adds-d<family>(beta, p1, p2, log = TRUE)to slot 19 of the joint NLL for every row whoseprior_familyis not"none".
Natural mortality
build_M1() supports the same linkages
argument. The valid parameter key is log_M1; the offset
enters additively (on the log scale) inside the M1_at_age
compute, so a coefficient of 0.3 on a normalised
temperature index multiplies M1 by
exp(0.3 * temp[yr]) each year.
Why log_M1 and not log_M?
CEATTLE decomposes total natural mortality into
M = M1 + M2. M1 is residual / non-predation
mortality and is a parameter (ln_M1 in the cpp,
with optional random-effect deviates and a direct prior via
M_prior / M_prior_sd). M2 is
predation mortality and is a derived quantity each iteration –
a function of predator abundance, suitability, and predator ration.
There is no scalar ln_M2 parameter to attach a linkage
coefficient to. Environmental effects on predation propagate naturally
via upstream parameters: a temp effect on predator growth
(log_K), a temp effect on recruitment (forthcoming), or
env-driven inputs to the bioenergetics ration calculation. The linkage
system therefore exposes log_M1 as the single M-side
target, with M2 covered by upstream propagation.
yrs <- GOApollock$styr:GOApollock$endyr
GOApollock$env_data <- data.frame(
Year = yrs,
temp = rnorm(length(yrs), 0, 0.1)
)
M1_spec <- build_M1(
M1_model = "sex_age_invariant", # or M1_model = 1 (back-compat)
linkages = list(
log_M1 = linkage_spec(
formula = ~ temp,
by = ~ species,
priors = list(temp = normal(0, 0.1))
)
)
)
fit <- fit_mod(
data_list = GOApollock,
M1Fun = M1_spec,
estimateMode = 0,
msmMode = 0
)The structural M choices (M1_model) and the
random-effects choices (M1_re) accept either integer codes
or readable strings interchangeably:
build_M1(M1_model = "sex_age_specific", M1_re = "ar1_age") # = 3, 4
build_M1(M1_model = 3, M1_re = 4) # equivalentlinkages on M is age-aware: by default a row’s
age_bin is NA and the offset broadcasts across
every age. To pin an env effect to a specific age slice, set
by = ~ species + age_bin and supply
strata$age_bin when calling materialize directly
(fit_mod() provides this automatically). For most
operational uses the default broadcast is what you want.
The new linkage path coexists with the legacy
M1_indices = c(...) column-index argument for
M1_model = 4 / 5: both add to
ln_M1 on the log scale and either may be used. New code
should prefer linkages – it composes naturally with priors,
bounds, and per-species formulas.
Inspecting the result:
fit$data_list$linkage_table # rows have process == "M"
fit$quantities$M_linkage_offset # [nspp, max_sex, max_age, nyrs]
fit$quantities$jnll_comp["Linkage-table priors", ]Growth and M linkages share the same table – one fit
can carry linkage rows for both processes, and the underlying
coefficient vector beta_linkage indexes both.
Recruitment
build_srr() accepts the same linkages
argument with three allowed parameter keys:
-
log_R0– offset onlog(R0). Meaningful for anysrr_fun; the offset is added to the equilibrium / mean recruitment on the log scale, including the first model year. This is the right target if you want “recruitment responds to temp” and aren’t fitting an SRR. -
log_alpha– offset on the alpha parameter (productivity) of a Beverton-Holt or Ricker SRR. Only does work whensrr_fun %in% c(2, 3, 4, 5)– the SRRs that consume alpha. -
log_beta– offset on the beta parameter (density-dependence) of BH or Ricker. Same applicability aslog_alpha.
# Mean-recruitment with a temperature linkage on log_R0.
rec_spec <- build_srr(
srr_fun = "mean", # = 0
linkages = list(
log_R0 = linkage_spec(
formula = ~ temp,
priors = list(temp = normal(0, 0.5))
)
)
)
# Beverton-Holt with env effects on alpha (and species-keyed
# priors so each stock has its own sensitivity).
rec_spec_bh <- build_srr(
srr_fun = "BevertonHolt", # = 2
linkages = list(
log_alpha = linkage_spec(
formula = ~ temp,
priors = list(
temp = list(`1` = normal(0, 0.3),
`2` = normal(0, 0.7))
)
)
)
)
fit <- fit_mod(
data_list = data_list,
recFun = rec_spec,
estimateMode = 0,
msmMode = 0
)The structural choice (srr_fun) accepts either integer
codes or readable strings interchangeably:
build_srr(srr_fun = "mean") # = 0
build_srr(srr_fun = "BevertonHolt") # = 2
build_srr(srr_fun = "Ricker") # = 4Inspecting the result:
fit$data_list$linkage_table # rows have process == "recruitment"
fit$quantities$recruitment_linkage_offset # [nspp, n_rec_params, nyrs]
# n_rec_params = 3 (R0, alpha, beta)Practical note: a linkage row’s age_bin is ignored for
recruitment (recruitment is at age 0 by definition). The
sex_ratio data input partitions recruits across sexes
downstream.
A multi-process example
The same pattern composes when more than one parameter is linked:
growth_spec <- build_growth(
fun = "Richards",
linkages = list(
log_K = linkage_spec(
formula = ~ temp,
by = ~ species + sex,
priors = list(temp = normal(0, 1))
),
log_Linf = linkage_spec(
formula = ~ temp + PDO,
by = ~ species,
priors = list(temp = normal(0, 0.5),
PDO = normal(0, 0.5))
),
log_m = linkage_spec(
formula = ~ 1, # intercept only -- no env effect
by = ~ species
)
)
)The pooler will:
- compute one shared
X = model.matrix(~ temp + PDO, env_data)of three columns, - emit 4 (sp × sex) + 3 (Linf cols) × 3 (sp) + 3 (m intercepts) = 16 rows in the linkage table,
- match same-named columns (
(Intercept),temp) across specs so there’s no duplication inX.
Per-process logic (vB has no m; only Richards does) is
enforced at build time:
build_growth(
fun = "vonBertalanffy",
linkages = list(log_m = linkage_spec(~ temp))
)
#> Error: linkages$log_m is only valid with fun = 'Richards'; ...Growth, M, and recruitment can all be linked in the same fit – each
build_*() output carries its own linkages list and
fit_mod() pools them against a single shared design
matrix:
fit <- fit_mod(
data_list = data_list,
growthFun = build_growth(
fun = "vonBertalanffy",
linkages = list(
log_K = linkage_spec(formula = ~ temp)
)
),
M1Fun = build_M1(
M1_model = "sex_age_invariant",
linkages = list(
log_M1 = linkage_spec(formula = ~ temp)
)
),
recFun = build_srr(
srr_fun = "mean",
linkages = list(
log_R0 = linkage_spec(formula = ~ temp)
)
),
estimateMode = 0,
msmMode = 0
)
# Linkage table contains rows for all three processes:
table(fit$data_list$linkage_table$process)
#> growth M recruitment
#> 4 4 4Inspecting and debugging
A few diagnostics worth knowing:
# The materialised table -- one row per estimated coefficient.
fit$data_list$linkage_table
# The shared design matrix -- one column per unique RHS term.
fit$data_list$linkage_X
# Per-(species, sex, growth_param, year) offsets actually applied to
# growth_parameters this iteration. Useful for verifying the linkage
# fired as you intended.
str(fit$quantities$growth_linkage_offset)
str(fit$quantities$M_linkage_offset)
str(fit$quantities$recruitment_linkage_offset)
# Per-row coefficient values in the same order as `linkage_table`.
fit$estimated_params$beta_linkage
# The prior NLL contribution from slot 19 of jnll_comp.
fit$quantities$jnll_comp["Linkage-table priors", ]Behind the scens: Automatic base-parameter handling
Any pooled linkage table is stored on the fitted model:
fit$data_list$linkage_tableFor three species × two sexes × ~ temp (intercept +
slope) you get 12 rows: an (Intercept) beta and a
temp-slope beta per (species, sex) combination. The
(Intercept) beta is fixed at 0 (mapped out) and the
base parameter (log_K, log_M1,
log_R0, …) carries the level. The slope rows give the
year-by-year offset that multiplies the base. See Automatic
base-parameter handling below.
The shared design matrix is stored alongside:
head(fit$data_list$linkage_X)The base parameter (log_R0, log_M1,
log_K, …) and the linkage table cooperate cleanly:
any priors or initial values are applied to the base
parameter, and the linkage rows hold year-by-year
offsets that is added to the base parameter on the log scale.
The exact behavior depends on whether the formula carries an
intercept:
| Formula | Base parameter | Slope rows |
|---|---|---|
~ 1 (intercept only) |
estimable; init from spec or default | n/a |
~ temp (intercept + slope) |
estimable; init from spec or default | estimated |
~ 0 + temp (slope only) |
mapped NA, value from spec or default | estimated |
Intercept-bearing formulas (~ 1,
~ temp) – the base parameter remains estimable;
for example log_R0 for species s lives on
rec_pars[s, 1] exactly as it does without any linkages. The
(Intercept) row of the table exists for bookkeeping and as
a hook for init/prior (see below) but its
beta_linkage slot is held at 0 and not
estimated. The recruitment / mortality / growth modules read
base + offset(yr) each year – with the intercept fixed at
zero, the offset is exactly the slope contribution.
Slope-only formulas (~ 0 + temp) –
there is no (Intercept) row, so the base parameter is
mapped out (NA) and stays at the default initial value, or the value
supplied by the inits argument in fit_mod, or
the value supplied by init = list("(Intercept)"= ...) on
the spec.
init and the base parameter
init = list((Intercept)= X) on the linkage
spec flows to the base parameter rather than to the
linkage row. The linkage row’s beta_linkage slot stays at
0; the base parameter starts at X. This is the
recommended way to set a sensible starting value for intercept-bearing
linkages – it composes naturally with phasing (the base parameter goes
through the existing per-process phase logic) and keeps the optimiser
grounded.
# Start log_M1 at log(0.06) without estimating an extra intercept beta.
build_M1(
M1_model = "sex_age_invariant",
linkages = list(
log_M1 = linkage_spec(
formula = ~ 1,
init = list(`(Intercept)` = log(0.06))
)
)
)Stratification
Stratification (by) determines which
base-parameter slots are affected: by = ~ species produces
one (Intercept) row per species; by = NULL (a
fully shared linkage) produces a single shared row; and
by = ~ species + sex produces one per (species, sex). Each
(Intercept) row that the user supplies an init
for pushes that init into the matching base-parameter slot.
This rule applies uniformly to every linkage target
(log_K, log_L1, log_Linf,
log_m, log_M1, log_R0,
log_alpha, log_beta).
Single-sex models and by = ~ ... + sex
If your model is single-sex (nsex = 1 for every species
in scope), asking for by = ~ species + sex produces a
warning: there is only one sex level available, so the sex stratum
collapses to a single shared level. The result is equivalent to
by = ~ species. Models with mixed nsex (some
single-sex, some two-sex species) emit one sex level per species
automatically – single-sex species get one row per parameter, two-sex
species get two.
Pooling
Inside pool_linkages() each spec is materialized
independently and the resulting rows are concatenated into a single
linkage table. The shared design matrix unions all RHS terms across
specs, so species 1’s rows reference only (Intercept) and
temp, while species 2’s rows can additionally reference
PDO – there’s no duplication of the temp column.
pool_linkages() unions the design columns by name so the
global matrix X contains both temp and the
three poly(temp, 3)* columns; species 1’s rows reference
the polynomial columns, species 2’s only the linear
temp.
On the roadmap
-
Random-effects grouping – the
re_groupcolumn on the linkage table is stored end-to-end; the cpp will grow a hierarchical penalty so coefficients sharing are_groupvalue are partially pooled. -
Age-dependent linkages on processes other than M –
the
age_binstratum is consumed by the M accumulator today (a row’sage_bin == NAbroadcasts across ages; specific values pin the offset to that age slice). The schema is reserved for other processes that may eventually vary by age. -
Selectivity / catchability linkages – the encoder
already reserves
process = "sel"andprocess = "q"codes, but no cpp accumulator consumes them yet. -
Legacy retirement (v4.4.0) –
srr_indices,M1_indices,srr_fun in c(1, 3, 5), andM1_model in c(4, 5)retire in the next minor release; see Scheduled removal (v4.4.0) inNEWS.mdfor the migration table.
See also
-
?linkage_spec,?build_growth,?build_M1,?build_srr,?prior_normal -
vignette("model-parameterizations", package = "Rceattle")for the full set of process options.