CysticFibrosis (Harun 2019) • nlmixr2lib

Model and source

Citation: Harun SN, Wainwright CE, Grimwood K, Hennig S; Australasian Cystic Fibrosis Bronchoalveolar Lavage (ACFBAL) study group. Aspergillus and progression of lung disease in children with cystic fibrosis. Thorax 2019;74(2):125-131. doi:10.1136/thoraxjnl-2018-211550.
Description: Non-linear mixed-effects disease-progression model of forced expiratory volume in 1 second (FEV1) percent predicted versus age in children with cystic fibrosis (Harun 2019). The model describes the typical sigmoid-Emax decline of FEV1% predicted from a baseline at age 5 years to an asymptote, with covariate effects of BMI z-score and severe air trapping at age 5 on the baseline, and time-varying hospitalisation due to pulmonary exacerbation on the maximum drop and the half-effect age.
Article: https://doi.org/10.1136/thoraxjnl-2018-211550

Population

The disease-progression model was developed from the longitudinal sub-cohort of the Australasian Cystic Fibrosis Bronchoalveolar Lavage (ACFBAL) study (n=79 children with classic cystic fibrosis who were re-consented into the Cystic Fibrosis Follow-up After Bronchoscopy (CF FAB) study). Children were recruited from new-born screening at eight clinical sites across Australia (New South Wales, Northern Territory, Queensland, South Australia, Victoria, Australian Capital Territory) and New Zealand. The sub-cohort had a median age of 9.1 years (IQR 7.6 to 10.8) at the time of FEV1% measurement, with 8.5 years of follow-up per patient (IQR 7.8 to 8.8) yielding 2651 FEV1% predicted observations (median 33 per child) collected between ages 5 and 14 years. Forty-six point eight percent of the sub-cohort were female; all participants had classic CF (two CFTR mutations, sweat chloride > 60 mmol/L, pancreatic insufficiency, or meconium ileus). The age-5 FEV1% predicted measurements were obtained from the ACFBAL study; subsequent FEV1% measurements at ages 6 to 14 were obtained from the Australian Cystic Fibrosis Data Registry (ACFDR). FEV1% predicted values were transformed to percentage predicted using the Global Lung Function Initiative (GLI) 2012 reference equations. Population characteristics are reported in Tables 1 and 2 of Harun 2019 (n=79 column of Table 1).

The same information is available programmatically via readModelDb("Harun_2019_cysticFibrosis")$population after the model is loaded.

Source trace

The model is a sigmoid-Emax disease-progression model of FEV1% predicted versus chronological age. Equation 1 of Harun 2019 (page 127):

$\mathrm{FEV}_1\%(t) \;=\; \mathrm{FEV}_1\%_{\mathrm{baseline}} \;-\;\frac{\Delta\mathrm{max}_{\mathrm{FEV}_1\%} \cdot t^{\gamma}} {t_{50\%\mathrm{max}}^{\gamma} + t^{\gamma}}$

where t is the subject’s chronological age in years.

Equation / parameter	Value	Source location
`lfev1pp_baseline` (typical baseline FEV1% at age 5)	99.7 % predicted	Supplement Table E3 final-model column (page 12); also $THETA(1) in supplement NMTRAN block (page 20)
`ldmax` (maximum lifetime change in FEV1% predicted, FIXED)	40 % predicted	Supplement Table E3; FIXED to literature value from Harun 2016 systematic review (PMID 26597232; cited as supplement reference 6) because the data could not support estimation of dmax
`lt50max` (age at which 50% of dmax occurs)	8.38 years	Supplement Table E3; $THETA(3)
`lhill` (Hill coefficient on age)	3.08 (unitless)	Supplement Table E3; $THETA(4)
`e_bmi_baseline` (BMI z-score effect on baseline FEV1%)	0.0382	Supplement Table E3 (rounded to 0.038); $THETA(6) = 0.0382
`e_at_baseline` (severe air trapping at age 5 effect on baseline FEV1%)	-0.0417	Supplement Table E3 (rounded to -0.04); $THETA(9) = -0.0417
`e_hpe_dmax` (hospitalisation due to PE effect on dmax)	-0.22	Supplement Table E3; $THETA(7)
`e_hpe_t50max` (hospitalisation due to PE effect on t50max)	-0.235	Supplement Table E3 (rounded to -0.24); $THETA(8) = -0.235
BSV Hill coefficient (proportional, $OMEGA(1,1))	0.272	Supplement Table E3 (sqrt(0.272) ~= 52.2%); $OMEGA on page 20
BSV baseline FEV1% (proportional, $OMEGA(2,2))	0.0172	Supplement Table E3 (sqrt(0.0172) ~= 13.1%); $OMEGA on page 20 \| \| BSV dmax + t50max ($ OMEGA BLOCK(2))
`addSd` (additive residual error)	9.32 % predicted	Supplement Table E3 final
Sigmoid-Emax disease-progression form	n/a	Equation 1 (page 127); supplement Equation E2 (page 9)
Linear-deviation covariate form `(1 + e * cov)`	n/a	Supplement page 13 (“Structural parameters after inclusion of the influencing factors”)
Proportional IIV stochastic model (Eq E3) for baseline / dmax / Hill	n/a	Supplement pages 8 to 9 (“A proportional stochastic model was used to describe the BSV (Equations E3) for parameters alpha, dmax_FEV1%, and gamma”)
Exponential IIV stochastic model (Eq E2) for t50max	n/a	Supplement page 8 (“an exponential (Equation E2) stochastic model was used to describe between-subject variability (BSV) for the parameter t_50%max”)
Additive residual error model (Eq E4)	n/a	Supplement page 9; supplement Results paragraph “Residual errors for linear and nonlinear models were best described by an additive model” (page 10)

Errata

No published erratum or corrigendum was found for this paper as of the model extraction date (2026-05-08). One notational caveat is worth flagging for readers:

The parameter labelled “FEV1% baseline” in Table E3 (99.7 % predicted) is the model’s intercept at t = 0 years, not the typical FEV1% at age 5 (which would be the lower boundary of the observed age range). The sigmoid-Emax formula evaluated at t = 5 years gives FEV1% (5) = 99.7 - 40 * 5^3.08 / (8.38^3.08 + 5^3.08) ~= 92.9 % predicted, which is closer to the typical age-5 measurement actually observed in the cohort (Table 2 reports the median FEV1% predicted across the longitudinal sub-cohort = 88.6 % predicted, IQR 76.1 to 97.5, pooled across all visits). Treat lfev1pp_baseline as the algebraic intercept of the sigmoid, not the empirical age-5 value.

Virtual cohort

Original individual-level FEV1% data are not publicly available. The simulations below use a virtual cohort whose covariate distributions approximate the modelled population: ages drawn uniformly across the observed 5 to 14 year range, BMI z-score drawn from a normal distribution centred on the cohort median (the source paper does not report the variance of BMI z-score; we use SD = 1, the population reference SD), severe air trapping at age 5 drawn from a Bernoulli distribution with p = 0.449 (Table 1, n=79 column: 70/156 of the full ACFBAL cohort had air trapping > 0%, but the n=79 sub-cohort is closer to 36/79 = 45.6%), and hospitalisation indicator drawn from a Bernoulli distribution with p = 0.13 (Table 1, median 13 hospitalisations per child over follow-up; we approximate the per-visit hospitalisation rate as ~13%).

set.seed(20260508)

n_subj <- 200

# Per-subject time-fixed covariates
cohort <- data.frame(
  ID          = seq_len(n_subj),
  AIR_TRAP_5Y = rbinom(n_subj, 1, 0.456)
)

cat(sprintf(
  "Cohort: n=%d; AIR_TRAP_5Y=1 in %d (%.1f%%)\n",
  n_subj, sum(cohort$AIR_TRAP_5Y), 100 * mean(cohort$AIR_TRAP_5Y)
))
#> Cohort: n=200; AIR_TRAP_5Y=1 in 83 (41.5%)

Replication: typical-value disease-progression curve

We first reproduce the typical disease-progression curve (no IIV, no residual error, all covariates at their reference values: BMIZ = 0, AIR_TRAP_5Y = 0, HOSPRA = 0). The expected curve at canonical ages: 99.7% (t = 0), ~92.9% (t = 5), 79.7% (t = 8.38, the half-effect age), 66.5% (t = 14). The paper does not display this exact curve in Figure 1 (which shows the observed FEV1% distribution rather than the typical-value trajectory), but the parameters in Table E3 are sufficient to replicate it analytically.

mod <- readModelDb("Harun_2019_cysticFibrosis")
mod_fixed <- rxode2::zeroRe(mod)
#> ℹ parameter labels from comments will be replaced by 'label()'

age_grid <- seq(0, 14, by = 0.1)
ev_typical <- data.frame(
  id          = 1L,
  time        = age_grid,
  amt         = 0,
  evid        = 0L,
  BMIZ        = 0,
  AIR_TRAP_5Y = 0L,
  HOSPRA      = 0L
)

sim_typical <- rxode2::rxSolve(mod_fixed, events = ev_typical) |>
  as.data.frame()
#> ℹ omega/sigma items treated as zero: 'etalhill', 'etalfev1pp_baseline', 'etaldmax', 'etalt50max'

ggplot(sim_typical, aes(x = time, y = fev1pp)) +
  geom_line(linewidth = 0.8, colour = "steelblue") +
  geom_hline(yintercept = 99.7, linetype = "dotted", colour = "grey50") +
  geom_hline(yintercept = 99.7 - 40, linetype = "dotted", colour = "grey50") +
  geom_vline(xintercept = 8.38, linetype = "dotted", colour = "grey50") +
  scale_x_continuous(breaks = c(0, 5, 8.38, 14)) +
  scale_y_continuous(breaks = c(60, 80, 99.7), limits = c(50, 105)) +
  labs(
    x = "Age (years)",
    y = "FEV1 (% predicted)",
    title = "Typical-value sigmoid-Emax disease-progression curve",
    caption = "BMIZ = 0, AIR_TRAP_5Y = 0, HOSPRA = 0; reference covariate values."
  ) +
  theme_bw()

A spot-check at canonical ages confirms the implementation matches the analytic formula:

checkpoints <- tibble::tibble(
  age = c(0, 5, 8.38, 14),
  expected = c(
    99.7,
    99.7 - 40 *  5^3.08 / (8.38^3.08 +  5^3.08),
    99.7 - 40 *  8.38^3.08 / (8.38^3.08 +  8.38^3.08),
    99.7 - 40 * 14^3.08 / (8.38^3.08 + 14^3.08)
  )
)

actual <- approx(sim_typical$time, sim_typical$fev1pp,
                 xout = checkpoints$age, rule = 2)$y

knitr::kable(
  cbind(checkpoints, actual = actual,
        diff_pct = 100 * (actual - checkpoints$expected) / checkpoints$expected),
  digits  = 3,
  caption = "Typical-value FEV1% predicted at canonical ages: model output vs analytic formula."
)

Typical-value FEV1% predicted at canonical ages: model output vs analytic formula.
age	expected	actual
0.00	99.700	99.700
5.00	92.928	92.928
8.38	79.700	79.700
14.00	66.528	66.528


stopifnot(max(abs(actual - checkpoints$expected)) < 0.5)

Replication: covariate scenarios on the typical-value trajectory

The covariate effects in Table E3 are best understood graphically. Below we overlay typical-value trajectories for the four covariate-scenario corners: (reference) BMIZ = 0, AIR_TRAP_5Y = 0, HOSPRA = 0; (severe air trapping) AIR_TRAP_5Y = 1; (high BMI z-score) BMIZ = 2; (always hospitalised) HOSPRA = 1.

scenarios <- list(
  list(label = "reference",
       BMIZ = 0, AIR_TRAP_5Y = 0L, HOSPRA = 0L),
  list(label = "AIR_TRAP_5Y = 1",
       BMIZ = 0, AIR_TRAP_5Y = 1L, HOSPRA = 0L),
  list(label = "BMIZ = +2",
       BMIZ = 2, AIR_TRAP_5Y = 0L, HOSPRA = 0L),
  list(label = "BMIZ = -2",
       BMIZ = -2, AIR_TRAP_5Y = 0L, HOSPRA = 0L),
  list(label = "HOSPRA = 1 (always)",
       BMIZ = 0, AIR_TRAP_5Y = 0L, HOSPRA = 1L)
)

scenario_df <- bind_rows(lapply(seq_along(scenarios), function(i) {
  s <- scenarios[[i]]
  data.frame(
    id          = i,
    time        = age_grid,
    amt         = 0,
    evid        = 0L,
    BMIZ        = s$BMIZ,
    AIR_TRAP_5Y = s$AIR_TRAP_5Y,
    HOSPRA      = s$HOSPRA,
    label       = s$label,
    stringsAsFactors = FALSE
  )
}))

sim_scen <- rxode2::rxSolve(mod_fixed, events = scenario_df,
                            keep = c("label")) |>
  as.data.frame()
#> ℹ omega/sigma items treated as zero: 'etalhill', 'etalfev1pp_baseline', 'etaldmax', 'etalt50max'
#> Warning: multi-subject simulation without without 'omega'

ggplot(sim_scen, aes(x = time, y = fev1pp,
                     colour = label, linetype = label)) +
  geom_line(linewidth = 0.8) +
  scale_x_continuous(breaks = c(0, 5, 8.38, 14)) +
  labs(
    x = "Age (years)",
    y = "FEV1 (% predicted)",
    colour = "Scenario", linetype = "Scenario",
    title  = "Covariate-scenario overlays on the typical-value trajectory"
  ) +
  theme_bw()

The expected qualitative behaviour from Table E3:

Severe air trapping at age 5 lowers the algebraic baseline by 4.17% (99.7 -> 95.5) and shifts the entire curve down without changing dmax, t50max, or hill.
High BMI z-score raises the algebraic baseline; BMI z-score = +2 raises the baseline by 2 * 3.82% = 7.64% (99.7 -> 107.3).
HOSPRA = 1 reduces both dmax (40 -> 31.2 = 40 * (1 - 0.22)) and t50max (8.38 -> 6.41 years = 8.38 * (1 - 0.235)), shifting the curve down and to the left (faster onset, smaller asymptotic drop).

Replication: stochastic VPC against Figure E3 of the supplement

Supplement Figure E3 shows the prediction- and variability-corrected visual predictive check (pcVPC) of the final model. We cannot reproduce the pcVPC’s reference curves without the original observations, but we can generate the model’s own simulated 5th / 50th / 95th percentile bands and visually compare them against the published pcVPC qualitatively.

set.seed(20260509)

n_vpc <- 200
ages <- seq(5, 14, by = 0.5)

vpc_events <- expand.grid(
  id   = seq_len(n_vpc),
  time = ages
) |>
  arrange(id, time) |>
  mutate(
    amt         = 0,
    evid        = 0L,
    BMIZ        = rnorm(n(), 0, 1),
    AIR_TRAP_5Y = rep(rbinom(n_vpc, 1, 0.456),
                      each = length(ages)),
    HOSPRA      = rbinom(n(), 1, 0.13)
  )

sim_vpc <- rxode2::rxSolve(mod, events = vpc_events,
                           nStud = 1, addCov = TRUE) |>
  as.data.frame()
#> ℹ parameter labels from comments will be replaced by 'label()'

vpc_summary <- sim_vpc |>
  group_by(time) |>
  summarise(
    Q05 = quantile(sim, 0.05, na.rm = TRUE),
    Q50 = quantile(sim, 0.50, na.rm = TRUE),
    Q95 = quantile(sim, 0.95, na.rm = TRUE),
    .groups = "drop"
  )

ggplot(vpc_summary, aes(x = time)) +
  geom_ribbon(aes(ymin = Q05, ymax = Q95), fill = "steelblue", alpha = 0.25) +
  geom_line(aes(y = Q50), colour = "steelblue", linewidth = 0.8) +
  scale_x_continuous(breaks = seq(5, 14, by = 1)) +
  labs(
    x = "Age (years)",
    y = "FEV1 (% predicted)",
    title  = "Simulated 5th / 50th / 95th percentile FEV1% bands",
    caption = paste0(
      "n = ", n_vpc, " virtual subjects; covariate distributions ",
      "approximate the published cohort. Compare qualitatively against ",
      "supplement Figure E3 (5th / 50th / 95th observed and predicted ",
      "percentiles)."
    )
  ) +
  theme_bw()

The simulated median FEV1% band sits at roughly 90% predicted at age 5 and drifts to roughly 65% predicted at age 14, consistent with the published pcVPC’s 50th-percentile reference line, and the 5th-95th band spans the range of the published observations (roughly 30% to 120% across the same age window per supplement Figure 1A).

Endogenous-style sanity checks

This is a non-PK disease-progression model, so PKNCA-style validation does not apply. Instead we run the steady-state-style sanity checks:

# Initial condition: at age t = 0, fev1pp must equal lfev1pp_baseline
init_check <- approx(sim_typical$time, sim_typical$fev1pp,
                     xout = 0, rule = 2)$y
stopifnot(abs(init_check - 99.7) < 0.01)
cat(sprintf("Sanity 1 (intercept at t=0): fev1pp(0) = %.3f (expected 99.7)\n",
            init_check))
#> Sanity 1 (intercept at t=0): fev1pp(0) = 99.700 (expected 99.7)

# Asymptote: at t -> infinity, fev1pp must approach lfev1pp_baseline - dmax
asymptote <- approx(sim_typical$time, sim_typical$fev1pp,
                    xout = 14, rule = 2)$y
analytic_asym <- 99.7 - 40
cat(sprintf(
  "Sanity 2 (approach to asymptote at t=14): fev1pp(14) = %.3f (analytic limit at infinity = %.3f; t=14 is not yet at the asymptote because t50max=8.38 and the Hill is 3.08)\n",
  asymptote, analytic_asym
))
#> Sanity 2 (approach to asymptote at t=14): fev1pp(14) = 66.528 (analytic limit at infinity = 59.700; t=14 is not yet at the asymptote because t50max=8.38 and the Hill is 3.08)

# Half-effect age check: at t = t50max, fev1pp must equal baseline - dmax/2
half_eff <- approx(sim_typical$time, sim_typical$fev1pp,
                   xout = 8.38, rule = 2)$y
stopifnot(abs(half_eff - (99.7 - 20)) < 0.5)
cat(sprintf("Sanity 3 (half-effect age at t=8.38): fev1pp(8.38) = %.3f (expected baseline - dmax/2 = %.3f)\n",
            half_eff, 99.7 - 20))
#> Sanity 3 (half-effect age at t=8.38): fev1pp(8.38) = 79.700 (expected baseline - dmax/2 = 79.700)

All three sanity checks pass within tolerance, confirming the model file implements the published equation correctly.

Assumptions and deviations

Observation variable name: The observation is named fev1pp (FEV1 % predicted) rather than the canonical Cc. This generates a warning from checkModelConventions() – the canonical convention is PK-centric (Cc = central-compartment concentration) and not appropriate for a non-PK disease-progression model whose endpoint is a percentage predicted lung-function value. The same pattern is used by other non-PK models in the package (e.g., oncology_xenograft_simeoni_2004 uses tumorVol). Justified deviation; documented here per the convention workflow.
Concentration units string: units$concentration = "% predicted" does not contain the conventional mass/volume slash, generating a unit-format warning. This is correct for the FEV1% endpoint (which is a unitless percentage of a reference value, not a concentration); documented here per the convention workflow.
IIV form: The model file uses the paper’s exact IIV parameterisation: proportional (1 + eta) for baseline FEV1%, dmax, and the Hill coefficient (Eq E3 of the supplement); exponential exp(eta) for t50max (Eq E2 of the supplement). The OMEGA matrix values (variances and the dmax-t50max covariance) are the NMTRAN $OMEGA values from the supplement (page 20) and the bootstrap-median values from Table E3 are within rounding. Note that even though etaldmax and etalt50max form a correlated block in OMEGA, they enter the model via different functional forms (proportional and exponential respectively); this is the form of the published model.
Maximum lifetime change in FEV1% predicted (ldmax): FIXED to 40% predicted, the literature value from Harun 2016 (Paediatr Respir Rev 20:55-66, PMID 26597232; cited as supplement reference 6). The data could not support estimation of dmax because the longitudinal cohort was followed only between ages 5 and 14, well below the age at which full lifetime FEV1% decline manifests. The model file marks this with fixed().
Severe air trapping covariate (AIR_TRAP_5Y): Time-fixed per subject (the indicator captures the single end-of-study HRCT scan at age 5 in the ACFBAL study); applies only to baseline FEV1% predicted, not to dmax or t50max.
Hospitalisation due to PE (HOSPRA): Time-varying per-visit indicator; the source NMTRAN reads this from a per-row column. In the packaged model, the operator supplies HOSPRA at every observation row.
BMI z-score (BMIZ): Carried as the source paper’s per-visit z-score with reference 0 (population mean for the subject’s age and sex). The source paper does not state which growth reference standard was used to compute the z-score; ACFBAL paediatric CF cohorts conventionally use the WHO 2007 Growth Reference. The covariate column is registered as canonical BMIZ (distinct from the adult-PK canonical BMI in kg/m^2). The Harun 2019 source NMTRAN names the column BMI but the data dictionary describes it as a z-score; the canonical column is BMIZ and covariateData[[BMIZ]]$source_name is "BMI" to record the rename.
Time variable: The model treats t as the subject’s chronological age in years. The source NMTRAN converts a per-row TIME column from hours to years via TIME1 = ((TIME/24)/365); in the nlmixr2 data convention the operator carries age directly in the time column.
Aspergillus: The paper’s primary stated aim was to test whether positive Aspergillus bronchoalveolar lavage cultures predict FEV1% trajectory; after adjustment for BMI z-score and hospitalisation, Aspergillus did NOT improve the model fit (supplement Table E4 lines 5, 16; main paper Discussion). Aspergillus does NOT appear in the final model and is therefore not a covariate of this packaged model.
VPC reference: The stochastic-VPC chunk shows the model’s simulated percentile bands but cannot overlay the published pcVPC reference curves because the original observations are not on disk; the comparison is therefore qualitative.