We provide a range of diagnostics for assessing the validity of
gamlss.longitudinal fits, grouped by whether they
are marginal diagnostics (similar to those in many other marginal
modelling packages) or diagnostics for the joint fit, which includes
dependence.
We review the key diagnostics for an example dataset as part of the detailed workflow in the Detailed worked example. This page is a reference that outlines the available methods and briefly describes how they operate.
In general we suggest the following checks for any model:
- check_model(fit) provides a basic set of numerical checks on the marginal and dependence fit with a simple pass/fail output, and is a good starting point for review
- summary(fit) is an important starting point for reviewing the fit likelihood, the coefficients and their significance to the model
- plot_terms(fit) provides a visual assessment of the fitted model coefficients and their confidence intervals
- plot(fit) provides a faceted plot of visual marginal fit diagnostics
- plot_copula_diagnostics(fit) provides a plot of visual joint distribution and dependence fit diagnostics
Basic checks
check_model() runs a brief automated model assessment and returns a
compact set of basic check statuses. It is deliberately minimal and
should be used alongside other diagnostics. The conditions that raise a
flag are:
| Area | Quantity checked | Threshold / condition | Overall result |
|---|---|---|---|
| Convergence | Model convergence based on fit$convergence$converged | Not TRUE | basic_checks=FAIL |
| Marginal fit | PIT Kolmogorov-Smirnov p-value vs Uniform(0, 1) | ks_p_value < 0.05 | basic_checks=FAIL |
| Tail fit | Maximum of lower/upper PIT tail ratios across thresholds 0.05 and 0.10 | max(lower_ratio, upper_ratio) > 2 | basic_checks=FAIL |
| Copula fit | Absolute lag-1 Rosenblatt normal-score residual correlation after fitted copula | abs(lag1_cor) > 0.25 | basic_checks=FAIL |
| Variance calculation | Variance-covariance method from summary | vcov_method == "numderiv" | basic_checks=REVIEW |
check_model() will return basic_checks as:
- “failed” if any row is FAIL;
- “review” if no row is FAIL but at least one is REVIEW;
- “passed” if all rows are PASS.
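As a rough illustration of how the thresholds in the table and the aggregation rule above fit together, the following base-R sketch applies the marginal and tail checks to simulated PIT values and then combines the row statuses. It is not the package's implementation: the helper names (tail_ratio, overall_status), the definition of a tail ratio as the observed tail proportion divided by the nominal threshold, and the simulated data are all assumptions made for illustration.

```r
# Illustrative sketch only: helper names and the tail-ratio definition are
# assumptions, not the gamlss.longitudinal internals.
set.seed(1)
pit <- runif(500)  # simulated PIT values from a hypothetical well-calibrated margin

# Marginal fit: Kolmogorov-Smirnov test of the PIT values against Uniform(0, 1)
ks_p_value <- ks.test(pit, "punif")$p.value

# Tail fit (assumed definition): observed tail proportion / nominal tail probability
tail_ratio <- function(pit, p) {
  c(lower = mean(pit < p) / p, upper = mean(pit > 1 - p) / p)
}
ratios <- sapply(c(0.05, 0.10), function(p) tail_ratio(pit, p))
max_tail_ratio <- max(ratios)

# Row statuses using the thresholds from the table
statuses <- c(
  marginal = if (ks_p_value < 0.05) "FAIL" else "PASS",
  tail     = if (max_tail_ratio > 2) "FAIL" else "PASS"
)

# Aggregation rule: any FAIL -> "failed"; else any REVIEW -> "review"; else "passed"
overall_status <- function(statuses) {
  if (any(statuses == "FAIL")) {
    "failed"
  } else if (any(statuses == "REVIEW")) {
    "review"
  } else {
    "passed"
  }
}

overall_status(statuses)
overall_status(c("PASS", "REVIEW"))
overall_status(c("PASS", "REVIEW", "FAIL"))
```

Because a single FAIL dominates the result, it is usually worth inspecting the individual rows rather than relying on the overall status alone.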
Note that a PASS result from check_model
doesn’t indicate a good model fit, just that there aren’t any major,
easy-to-test-numerically issues with the overall shape of the model fit
for the margin or the dependence.
check <- check_model(fit)
check
check$basic_checks
check$checks
check$basic_checks_result
check$warnings

Marginal distribution checks
The marginal fit affects both the margin and the copula, as the
copula relies on the results from the margin fit to strip out marginal
effects from dependence. The most straightforward way to review marginal
fit is with plot(fit).
plot(fit) creates diagnostic plots for:
- Probability integral transform (PIT) histogram (checking whether the fitted CDF values of the observations are approximately uniform)
- Quantile-quantile (QQ) plot of the fitted versus empirical distribution (points should follow the diagonal line)
- Wormplot, which can be viewed as a ‘higher resolution’ version of a QQ-plot which includes reference ranges and should have points within the reference ranges, ideally without persistent trends away from the centre line
- Rootogram, showing the difference in the binned distribution for the observed versus fitted distribution (these should centre on zero without significant trending deviation)
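
To make the quantities behind the PIT histogram, QQ plot and worm plot concrete, here is a minimal base-R sketch. It assumes a simple normal margin with known parameters; mu_hat and sigma_hat are hypothetical stand-ins for fitted values, not objects exposed by the package, and the data are simulated. The rootogram is not covered here.

```r
# Illustrative sketch; mu_hat and sigma_hat stand in for fitted parameter values.
set.seed(42)
n <- 300
mu_hat <- 2
sigma_hat <- 1.5
y <- rnorm(n, mean = mu_hat, sd = sigma_hat)  # simulated response

# PIT: fitted CDF evaluated at each observation; roughly Uniform(0, 1) if the margin fits
pit <- pnorm(y, mean = mu_hat, sd = sigma_hat)

# Normalised quantile residuals: roughly standard normal if the margin fits
z <- qnorm(pit)

# QQ plot coordinates: theoretical normal quantiles versus sorted residuals
theo <- qnorm(ppoints(n))
obs <- sort(z)

# Worm plot coordinates: the QQ plot with the diagonal subtracted (detrended)
worm <- obs - theo

op <- par(mfrow = c(1, 3))
hist(pit, breaks = 10, main = "PIT histogram", xlab = "PIT")
plot(theo, obs, main = "QQ plot", xlab = "Theoretical", ylab = "Observed")
abline(0, 1)
plot(theo, worm, main = "Worm plot", xlab = "Theoretical", ylab = "Deviation")
abline(h = 0)
par(op)
```

Subtracting the diagonal is what gives the worm plot its higher resolution: deviations that look small against the full range of a QQ plot occupy the whole vertical axis once detrended.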
The default plot() function plots the diagnostics over
the whole distribution, but there is the option to facet the plots by
time to assess fit across the longitudinal timepoints to review any
issues over repeated observations.
rootogram(fit, by_time = TRUE)
pithist(fit, by_time = TRUE)
qqrplot(fit, by_time = TRUE)
wormplot(fit, by_time = TRUE)

We also provide the option to facet each of the diagnostics by a
covariate using by="covariate_name". When faceting by a variable, you
must also supply the original dataset, because the fitted object does
not expose all covariates.
pithist(fit, by = "treatment", data = dataset)
qqrplot(fit, by = "treatment", data = dataset)
wormplot(fit, by = "treatment", data = dataset)
rootogram(fit, by = "treatment", data = dataset)

If any of the above diagnostics are of concern, the marginal fit
needs to be adjusted. Review the options for margin_dist, or the
formulas for the available parameters: mu.formula,
sigma.formula, nu.formula and/or
tau.formula.
Dependence structure diagnostics
The copula structure specified by copula_dist should
capture the shape of the dependence after the marginal shape is removed,
with theta.formula and zeta.formula allowing
the shape to vary with time or other covariates.
As with the margin, we expect any structure remaining in the residuals after the dependence fit is applied to be essentially random.
plot_copula_diagnostics(fit) provides a detailed set of
diagnostics for the dependence structure, including:
- Fitted copula overlaid on a histogram of the dependence (fitted contours should follow the density of the histogram)
- Quantiles (deciles) of Kendall’s tau (rank correlation) for the fitted versus observed copula (observed and fitted should broadly be in line)
- Rosenblatt z-scores by timepoint and an overall Rosenblatt normal QQ plot; the z-scores should be normally distributed (they take into account the correlation structure for timepoints beyond the first)
- Rosenblatt lag plot showing Rosenblatt z-scores for each timepoint against the lagged timepoint, which should show zero correlation, as well as residual dependence by lag, which should be close to zero
- Sorted empirical versus fitted copula probabilities scatterplot, which should closely track the centre line
- Tail co-occurrence and exceedance, essentially the fitted versus observed probability of observations landing in the upper and lower tails of the dependence structure
plot_copula_diagnostics(fit, data = dat)

Any concerns in these diagnostics indicate lack of fit for the
dependence or, potentially, that the marginal fit is affecting the dependence fit.
If the marginal checks are reasonable, then adjustments to the copula should
include assessing alternative copula shapes,
i.e. copula_dist, or adjusting the parameter covariates in
theta.formula or zeta.formula.
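
For intuition about the Rosenblatt and Kendall's tau diagnostics listed above, the following base-R sketch works through the simplest case: two timepoints linked by a Gaussian copula with a known correlation rho. The Gaussian copula, the value of rho, the variable names and the simulated data are assumptions chosen for illustration; the package's own computation for its supported copula families may differ.

```r
# Illustrative sketch: Rosenblatt transform for a bivariate Gaussian copula.
set.seed(7)
n <- 1000
rho <- 0.6  # assumed (known) copula correlation

# Simulate normal scores at two timepoints with correlation rho
z1 <- rnorm(n)
z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)

# Copula-scale observations (what remains once the margins are removed)
u1 <- pnorm(z1)
u2 <- pnorm(z2)

# Rosenblatt transform: the first timepoint is unchanged; the second is
# replaced by its conditional CDF given the first under the assumed copula
r1 <- u1
r2 <- pnorm((qnorm(u2) - rho * qnorm(u1)) / sqrt(1 - rho^2))

# Rosenblatt z-scores: independent standard normal if the copula fits
e1 <- qnorm(r1)
e2 <- qnorm(r2)

# Residual lag-1 correlation: close to zero when the dependence is captured
lag1_cor <- cor(e1, e2)

# Kendall's tau: observed versus the value implied by the Gaussian copula
tau_observed <- cor(u1, u2, method = "kendall")
tau_fitted <- 2 / pi * asin(rho)

c(lag1_cor = lag1_cor, tau_observed = tau_observed, tau_fitted = tau_fitted)
```

Repeating the transform with a deliberately wrong value of rho (for example 0) leaves lag1_cor far from zero, which is the pattern the Rosenblatt lag plot is designed to reveal.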