Simulate all model parameters and sample source-specific data from a multivariate Gaussian with full covariance structure.

simulate_data(
  n,
  m,
  k,
  p1,
  p2,
  mus_mean = 10,
  mus_std = 2,
  gammas_mean = 1,
  gammas_std = 0.1,
  betas_mean = 1,
  betas_std = 0.1,
  sigmas_lb = 0,
  sigmas_ub = 1,
  taus_std = 0.1,
  log_file = "Unico.log",
  verbose = FALSE
)

Arguments

n

A positive integer indicating the number of observations to simulate.

m

A positive integer indicating the number of features to simulate.

k

A positive integer indicating the number of sources to simulate.

p1

A non-negative integer indicating the number of source-specific covariates to simulate.

p2

A non-negative integer indicating the number of non-source-specific covariates to simulate.

mus_mean

A numerical value indicating the average of the source-specific means.

mus_std

A positive value indicating the variation of the source-specific means across different sources.

gammas_mean

A numerical value indicating the average effect sizes of the source-specific covariates.

gammas_std

A non-negative numerical value indicating the variation of the effect sizes of the source-specific covariates.

betas_mean

A numerical value indicating the average effect sizes of the non-source-specific covariates.

betas_std

A non-negative numerical value indicating the variation of the effect sizes of the non-source-specific covariates.

sigmas_lb

A numerical value indicating the lower bound of the uniform distribution from which the entries of matrix A are sampled. A is used to construct the feature-specific k by k variance-covariance matrix.

sigmas_ub

A numerical value indicating the upper bound of the uniform distribution from which the entries of matrix A are sampled. A is used to construct the feature-specific k by k variance-covariance matrix.

taus_std

A non-negative numerical value indicating the variation of the measurement noise across different features.

log_file

A path to an output log file. Note that if the file log_file already exists then logs will be appended to the end of the file. Set log_file to NULL to prevent output from being saved into a file; note that if verbose == FALSE then no output file will be generated regardless of the value of log_file.

verbose

A logical value indicating whether to print logs.
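
The descriptions of sigmas_lb and sigmas_ub say that a matrix A with uniformly sampled entries is used to construct each feature-specific covariance matrix, but do not state the construction. The sketch below assumes the common choice of multiplying A by its transpose, which always yields a symmetric positive semi-definite matrix; this is an illustration under that assumption and may differ from what simulate_data does internally.

```r
set.seed(1)
k <- 3
sigmas_lb <- 0
sigmas_ub <- 1

# Sample the entries of a k by k matrix A from Uniform(sigmas_lb, sigmas_ub).
A <- matrix(runif(k * k, min = sigmas_lb, max = sigmas_ub), nrow = k, ncol = k)

# Assumed construction (not confirmed by this page): A %*% t(A) is
# symmetric and positive semi-definite, hence a valid covariance matrix.
sigma <- A %*% t(A)

# Sanity checks: symmetry and non-negative eigenvalues (up to rounding).
is_symmetric <- isSymmetric(sigma)
min_eigenvalue <- min(eigen(sigma, symmetric = TRUE)$values)
```

Under this construction, widening the interval between sigmas_lb and sigmas_ub increases the typical magnitude of the variances and covariances.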

Value

A list of simulated model parameters, covariates, observed mixture, and source-specific data.

X

An m by n matrix of the simulated mixture for m features and n observations.

W

An n by k matrix of the weights/proportions of the k sources for each of the n observations.

C1

An n by p1 matrix of the simulated covariates that affect the source-specific values.

C2

An n by p2 matrix of the simulated covariates that affect the mixture values.

Z

A k by m by n tensor of the source-specific values for each of the k sources.

mus

An m by k matrix of the mean of each of the m features for each of the k sources.

gammas

An m by k*p1 matrix of the effect sizes of the p1 covariates in C1 on each of the m features in X, where the first p1 columns are the source-specific effects of the p1 covariates on the first source, the following p1 columns are the source-specific effects on the second source and so on.

betas

An m by p2 matrix of the effect sizes of the p2 covariates in C2 on the mixture values of each of the m features.

sigmas

An m by k by k tensor of the variance-covariance matrix of each of the m features.

taus

An m by 1 matrix of the feature-specific variance of the measurement noise for all m features.
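
The column layout of gammas described above (p1 columns per source, sources in order) can be indexed as in the following sketch. The matrix here is a stand-in with the documented m by k*p1 shape, and source_cols is a hypothetical helper, not part of the package; it assumes p1 is at least 1.

```r
m <- 2
k <- 3
p1 <- 2

# Stand-in for the gammas element of the returned list (values are arbitrary).
gammas <- matrix(seq_len(m * k * p1), nrow = m, ncol = k * p1)

# Hypothetical helper: column indices holding the effects of the p1
# covariates on source h, following the documented block layout.
source_cols <- function(h, p1) ((h - 1) * p1 + 1):(h * p1)

# Effects of the p1 covariates on the second source, one row per feature.
gammas_h2 <- gammas[, source_cols(2, p1), drop = FALSE]
```

Using drop = FALSE keeps the result a matrix even when p1 is 1.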

Details

Simulate data based on the generative model described in function Unico.
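
As a rough illustration of the kind of generative process involved, the base R sketch below simulates a single feature: source-specific values are drawn from a multivariate Gaussian with a full k by k covariance, then mixed using per-observation proportions, with measurement noise added. All variable names and distribution choices here are assumptions made for illustration, covariates are omitted, and the authoritative model is the one documented in function Unico.

```r
set.seed(42)
n <- 5   # observations
k <- 3   # sources

# Source-specific means for one feature (mirrors the mus_mean / mus_std defaults).
mu <- rnorm(k, mean = 10, sd = 2)

# An arbitrary positive definite k by k covariance, chosen for illustration.
sigma <- matrix(0.1, k, k) + diag(0.5, k)

# Proportions: non-negative rows summing to 1 (normalized uniforms, an assumption).
W <- matrix(runif(n * k), n, k)
W <- W / rowSums(W)

# Draw n samples from N(mu, sigma) via the Cholesky factor:
# chol() returns upper triangular R with t(R) %*% R equal to sigma.
R <- chol(sigma)
Z <- matrix(rnorm(n * k), n, k) %*% R + matrix(mu, n, k, byrow = TRUE)

# Observed mixture: proportion-weighted sum over sources plus measurement noise.
tau <- 0.1
x <- rowSums(W * Z) + rnorm(n, mean = 0, sd = tau)
```

Here each row of Z holds the k source-specific values for one observation, which is a different orientation from the k by m by n tensor returned by simulate_data.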

Examples

data = simulate_data(n=100, m=2, k=3, p1=1, p2=1, taus_std=0, log_file=NULL)
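
A slightly longer sketch that compares the dimensions of each returned element against those listed under Value. It assumes the Unico package is installed and that simulate_data is available from its namespace; the comparison is skipped otherwise.

```r
n <- 100; m <- 2; k <- 3; p1 <- 1; p2 <- 1

# Dimensions as documented in the Value section.
expected <- list(
  X = c(m, n), W = c(n, k), C1 = c(n, p1), C2 = c(n, p2),
  Z = c(k, m, n), mus = c(m, k), gammas = c(m, k * p1),
  betas = c(m, p2), sigmas = c(m, k, k), taus = c(m, 1)
)

if (requireNamespace("Unico", quietly = TRUE)) {
  sim <- Unico::simulate_data(n = n, m = m, k = k, p1 = p1, p2 = p2,
                              taus_std = 0, log_file = NULL)
  # Report, element by element, whether the dimensions match the documentation.
  for (name in names(expected)) {
    ok <- identical(as.numeric(dim(sim[[name]])), as.numeric(expected[[name]]))
    cat(sprintf("%-7s dims match documentation: %s\n", name, ok))
  }
}
```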