A function for simulating data

Usage

create.sim.data(
  sim = 1,
  te = 2,
  jitter = 1,
  cov_min = 0,
  cov_max = 5,
  wt_X1 = 1,
  wt_X2 = 1,
  n = 200,
  mean1 = 0,
  mean2 = 0,
  sd1 = 1,
  sd2 = 1,
  rho = 0.2,
  weight_t1 = -1,
  weight_t2 = 1,
  weight_y0 = 5,
  weight_y1 = -1,
  weight_y2 = 1,
  relX1 = "Mediator",
  relX2 = "Confounder"
)

Arguments

sim

an integer referencing the simulation - 1, 2, 3, or 4

te

a number representing the treatment effect; specify for all simulations

jitter

the standard deviation passed to rnorm() when adding jitter to the outcome variable

cov_min

Simulation 2 - minimum value for uniform distribution

cov_max

Simulation 2 - maximum value for uniform distribution

wt_X1

Simulation 2 - the effect of the treatment on X1

wt_X2

Simulation 2 - the effect of the treatment on X2

n

the number of observations to generate for simulations 2, 3, and 4

mean1

Simulation 3 and 4 - X1 mean value for a normal distribution

mean2

Simulation 3 and 4 - X2 mean value for a normal distribution

sd1

Simulation 3 and 4 - X1 standard deviation value for a normal distribution

sd2

Simulation 3 and 4 - X2 standard deviation value for a normal distribution

rho

Simulation 3 and 4 - correlation coefficient for relationship between X1 and X2

weight_t1

Simulation 3 and 4 - the weight of X1 on the treatment variable t

weight_t2

Simulation 3 and 4 - the weight of X2 on the treatment variable t

weight_y0

Simulation 3 and 4 - the unaffected value of the outcome y

weight_y1

Simulation 3 and 4 - the weight of X1 on the outcome variable y

weight_y2

Simulation 3 and 4 - the weight of X2 on the outcome variable y

relX1

Simulation 3 - the causal relationship of X1 in the data. Can be 'mediator', 'confounder', 'collider', 'ancestor to y', 'ancestor to t'

relX2

Simulation 3 - the causal relationship of X2 in the data. Can be 'mediator', 'confounder', 'collider', 'ancestor to y', 'ancestor to t'
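
The roles of the Simulation 3 and 4 arguments may be easier to see in code. The following is a minimal sketch in base R, not the create.sim.data() source. It assumes both covariates act as confounders, that treatment is assigned through a logistic model, and that the outcome is linear. The names z1, z2, p_treat, trt and sketch are hypothetical and used only for illustration.

```r
# Illustrative sketch only: every modelling choice below is an assumption,
# not the documented behaviour of create.sim.data().
set.seed(42)

# Values copied from the defaults shown under Usage
n <- 200
te <- 2
jitter <- 1
mean1 <- 0; mean2 <- 0
sd1 <- 1; sd2 <- 1
rho <- 0.2
weight_t1 <- -1; weight_t2 <- 1
weight_y0 <- 5; weight_y1 <- -1; weight_y2 <- 1

# Two independent standard normal draws
z1 <- rnorm(n)
z2 <- rnorm(n)

# Mix the draws so that X1 and X2 have population correlation rho
X1 <- mean1 + sd1 * z1
X2 <- mean2 + sd2 * (rho * z1 + sqrt(1 - rho^2) * z2)

# Assumed logistic treatment assignment driven by both covariates
p_treat <- plogis(weight_t1 * X1 + weight_t2 * X2)
trt <- rbinom(n, size = 1, prob = p_treat)

# Assumed linear outcome: baseline, covariate effects, treatment effect,
# and normal noise with standard deviation `jitter`
y <- weight_y0 + weight_y1 * X1 + weight_y2 * X2 + te * trt +
  rnorm(n, mean = 0, sd = jitter)

sketch <- data.frame(t = trt, X1 = X1, X2 = X2, y = y)

# Adjusting for both confounders should give an estimate close to te
fit <- lm(y ~ t + X1 + X2, data = sketch)
coef(fit)["t"]
```

Under these assumptions, leaving X1 or X2 out of the regression would bias the coefficient on t, which is the kind of situation matching methods are meant to address.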

Value

A data frame with two covariates, X1 and X2, a random error variable, a treatment variable, and an outcome variable, for the purposes of matching

Examples

# generate a simulated dataframe
d <- create.sim.data(sim = 1, te = 2)
head(d)
#>   t Allocation     X1     X2      y  error
#> 1 0    Control  1.259 -1.185 -1.411 -1.484
#> 2 0    Control -0.759  0.179 -1.726 -1.145
#> 3 0    Control  0.895 -0.768  0.239  0.112
#> 4 0    Control -1.569 -0.436 -2.057 -0.052
#> 5 0    Control -0.413  0.147 -1.780 -1.514
#> 6 0    Control  1.381 -1.013  0.317 -0.051