Simulates a balanced longitudinal dataset from a GAMLSS marginal distribution and a first-order copula dependence model between consecutive time points. Parameter specifications may be constants, time-indexed vectors, subject-by-time matrices, or functions of the generated design data.
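As a rough illustration of the idea, the base-R sketch below simulates first-order dependence with a Gaussian copula and Normal margins. It is not the package's implementation: the variable names, the choice of a Gaussian copula, and the fixed values of rho, mu, and sigma are assumptions made for the example.

```r
# Minimal base-R sketch (assumed, not the package's code):
# first-order Gaussian copula dependence with Normal margins.
set.seed(1)
n <- 200L             # number of subjects
times <- seq_len(3)   # observed time values
rho <- 0.6            # copula parameter linking consecutive time points
mu <- 0               # marginal mean
sigma <- 1            # marginal standard deviation

n_times <- length(times)
z <- matrix(NA_real_, nrow = n, ncol = n_times)
z[, 1] <- rnorm(n)
for (j in seq_len(n_times)[-1]) {
  # Each time point depends only on the previous one (first-order dependence)
  z[, j] <- rho * z[, j - 1] + sqrt(1 - rho^2) * rnorm(n)
}

u <- pnorm(z)                         # uniforms on the copula scale
y <- qnorm(u, mean = mu, sd = sigma)  # marginal quantile transform

# Long format: one row per subject-time observation, time varying fastest
sim <- data.frame(
  subject  = rep(seq_len(n), each = n_times),
  time     = rep(times, times = n),
  response = as.vector(t(y))
)
```

The same two steps (draw dependent uniforms, then apply the marginal quantile function) carry over to other margins by swapping qnorm() for the quantile function of the chosen family.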

Usage

simulate_longitudinal_dataset(
  n = 100,
  times = seq_len(3),
  margin_dist = gamlss.dist::NO(),
  copula_dist = "N",
  margin_params = list(mu = 0, sigma = 1),
  copula_params = list(theta = 0),
  covariates = NULL,
  seed = NULL,
  subject_var = "subject",
  time_var = "time",
  response_var = "response",
  include_truth = TRUE,
  u_bounds = NULL
)

Arguments

n

Number of subjects.

times

Vector of observed time values.

margin_dist

A gamlss.dist family object, for example NO() or GA(mu.link = "log", sigma.link = "log").

copula_dist

Copula family code. Supported codes are "N", "C", "F", "G", "J", and "t".

margin_params

Named list of marginal distribution parameter specifications. Names should match the quantile function arguments, such as mu, sigma, nu, and tau.
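The sketch below shows one way the different specification forms (constant, time-indexed vector, subject-by-time matrix, function of the design data) could be expanded to one value per row of the long-format data. The helper expand_param() is hypothetical and written only for illustration; the package's internal handling may differ.

```r
# Hypothetical helper (not part of the package): expand a parameter
# specification to one value per subject-time row, time varying fastest.
expand_param <- function(spec, data, n, n_times) {
  if (is.function(spec)) {
    spec <- spec(data)                  # function of the design data
  }
  if (is.matrix(spec)) {
    stopifnot(all(dim(spec) == c(n, n_times)))
    return(as.vector(t(spec)))          # subject-by-time matrix
  }
  if (length(spec) == 1L) {
    return(rep(spec, n * n_times))      # constant
  }
  if (length(spec) == n_times) {
    return(rep(spec, times = n))        # time-indexed vector
  }
  if (length(spec) == n * n_times) {
    return(spec)                        # already one value per row
  }
  stop("Unsupported parameter specification.")
}

n <- 4L
times <- seq_len(3)
design <- data.frame(
  subject = rep(seq_len(n), each = length(times)),
  time    = rep(times, times = n)
)

mu_const <- expand_param(0, design, n, length(times))
mu_time  <- expand_param(c(1, 2, 3), design, n, length(times))
mu_mat   <- expand_param(matrix(1:12, nrow = n, byrow = TRUE),
                         design, n, length(times))
mu_fun   <- expand_param(function(d) 1 + 0.5 * d$time,
                         design, n, length(times))
```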

copula_params

Named list of copula parameter specifications. Use theta or par for the primary copula parameter, tau to specify Kendall's tau instead, and zeta or par2 for the t-copula degrees of freedom.
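For reference, Kendall's tau maps to the primary copula parameter in closed form for some families. The sketch below uses the standard relationships for the Gaussian, Clayton, and Gumbel copulas. It assumes that the codes "N", "C", and "G" denote those families, and tau_to_theta() is a hypothetical helper, not a function of the package; Frank and Joe have no closed-form inverse and are left out.

```r
# Hypothetical helper: convert Kendall's tau to the copula parameter
# using the standard closed-form relationships.
tau_to_theta <- function(tau, family = c("N", "C", "G")) {
  family <- match.arg(family)
  switch(family,
    N = sin(pi * tau / 2),    # Gaussian: theta is the correlation
    C = 2 * tau / (1 - tau),  # Clayton (requires tau < 1)
    G = 1 / (1 - tau)         # Gumbel (requires 0 <= tau < 1)
  )
}

theta_n <- tau_to_theta(0.5, "N")
theta_c <- tau_to_theta(0.5, "C")
theta_g <- tau_to_theta(0.5, "G")
```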

covariates

Optional data frame or function returning covariates. A data frame may have either n rows for subject-level covariates or n * length(times) rows for long-format covariates. A function is called with the base long-format data. It may return only new covariate columns or the input data with new columns added; columns already present in the base design are ignored.
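The three accepted shapes can be sketched in base R as follows. The base design here is built by hand with the default column names subject and time; the covariate names group, dose, and time_centered are made up for the example.

```r
n <- 4L
times <- seq_len(3)

# Assumed shape of the base long-format design (default column names)
base_design <- data.frame(
  subject = rep(seq_len(n), each = length(times)),
  time    = rep(times, times = n)
)

# Subject-level covariates: one row per subject (n rows)
subject_cov <- data.frame(group = rep(c("a", "b"), length.out = n))

# Long-format covariates: one row per subject-time observation
long_cov <- data.frame(
  dose = seq(0.1, 1.2, length.out = n * length(times))
)

# Function form: receives the base design, returns only new columns
make_covariates <- function(data) {
  data.frame(time_centered = data$time - mean(data$time))
}
new_cols <- make_covariates(base_design)
```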

seed

Optional random seed.

subject_var, time_var, response_var

Column names for the subject, time, and response variables.

include_truth

If TRUE, include simulated uniforms and true parameter columns.

u_bounds

Optional length-two numeric vector giving lower and upper bounds used to clamp simulated uniforms before applying the marginal quantile function. The default NULL leaves uniforms unchanged.
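Clamping matters because quantile functions return infinite values at exactly 0 or 1. A minimal base-R sketch of the clamping step, with bounds chosen arbitrarily for illustration:

```r
u <- c(0, 1e-12, 0.5, 1)        # uniforms, including boundary values
u_bounds <- c(1e-6, 1 - 1e-6)   # illustrative lower and upper bounds

# Clamp each uniform into [lower, upper]
u_clamped <- pmin(pmax(u, u_bounds[1]), u_bounds[2])

q_raw     <- qnorm(u)           # infinite at u = 0 and u = 1
q_clamped <- qnorm(u_clamped)   # finite everywhere
```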

Value

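A long-format data frame with one row per subject-time observation, that is, n * length(times) rows. When include_truth is TRUE, it also contains the simulated uniforms and the true parameter columns.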