
Simulate MR Data with Known Causal Effect
Generates a complete dataset with SNP genotypes, an exposure, confounders, and a binary outcome, in which the true causal effect of the exposure on the outcome is known and recoverable. Useful for testing and vignette examples without requiring a database connection.
Usage
simulateMRData(
n = 5000,
nSnps = 10,
trueEffect = 0.5,
confoundingStrength = 0.3,
snpEffectRange = c(0.1, 0.5),
seed = 42
)
Arguments
- n
Number of individuals.
- nSnps
Number of SNPs (instruments).
- trueEffect
True causal effect of exposure on outcome (log-OR scale).
- confoundingStrength
Strength of confounding (coefficient of confounder on both exposure and outcome).
- snpEffectRange
Range of SNP-exposure effect sizes.
- seed
Random seed for reproducibility.
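To make the roles of these arguments concrete, the following is a minimal base-R sketch of one data-generating process that is consistent with the descriptions above. It is not the package's implementation: the function name sketchSimulateMR, the allele frequency of 0.3, the confounder distributions, the unit-variance exposure noise, and the snpEffects return element are all assumptions made for illustration.

```r
# Hypothetical sketch only; NOT the actual simulateMRData() source.
sketchSimulateMR <- function(n = 5000, nSnps = 10, trueEffect = 0.5,
                             confoundingStrength = 0.3,
                             snpEffectRange = c(0.1, 0.5), seed = 42) {
  set.seed(seed)

  # Genotypes as allele counts (0/1/2); allele frequency 0.3 is an assumption
  geno <- matrix(rbinom(n * nSnps, size = 2, prob = 0.3), nrow = n, ncol = nSnps)
  colnames(geno) <- paste0("snp_rs", seq_len(nSnps))

  # Per-SNP effects on the exposure, drawn uniformly from snpEffectRange
  snpEffects <- runif(nSnps, min = snpEffectRange[1], max = snpEffectRange[2])

  # Two confounders (assumed: one standard normal, one binary)
  confounder1 <- rnorm(n)
  confounder2 <- rbinom(n, size = 1, prob = 0.5)
  confounding <- confoundingStrength * (confounder1 + confounder2)

  # Exposure depends on the SNPs, the confounders, and random noise
  exposure <- as.vector(geno %*% snpEffects) + confounding + rnorm(n)

  # Binary outcome from a logistic model; trueEffect is on the log-OR scale
  outcome <- rbinom(n, size = 1, prob = plogis(trueEffect * exposure + confounding))

  data <- data.frame(person_id = seq_len(n), outcome = outcome, geno,
                     confounder_1 = confounder1, confounder_2 = confounder2,
                     exposure = exposure)
  list(data = data, snpEffects = snpEffects, trueEffect = trueEffect)
}

sim <- sketchSimulateMR(n = 1000, nSnps = 5, trueEffect = 0.3)
str(sim$data)
```

Because the SNPs influence the outcome only through the exposure in this sketch, they behave as valid instruments, while the confounders enter both the exposure and the outcome with the same coefficient.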
Value
A list with elements:
- data
Data frame with person_id, outcome (0/1), snp_<sanitized rsID> columns, confounder_1, confounder_2, exposure.
- instrumentTable
Data frame mimicking getMRInstruments() output.
- trueEffect
The true causal effect used in simulation.
Examples
simData <- simulateMRData(n = 1000, nSnps = 5, trueEffect = 0.3)
head(simData$data)
#>   person_id outcome snp_rs1 snp_rs2 snp_rs3 snp_rs4 snp_rs5 confounder_1
#> 1         1       1       1       1       0       0       1    1.3709584
#> 2         2       0       1       0       1       0       0   -0.5646982
#> 3         3       1       1       0       2       1       0    0.3631284
#> 4         4       1       0       0       1       1       0    0.6328626
#> 5         5       1       0       2       0       0       0    0.4042683
#> 6         6       1       0       1       0       0       0   -0.1061245
#>   confounder_2   exposure
#> 1            1  3.3068319
#> 2            0 -0.4567887
#> 3            1  1.8067824
#> 4            1  0.7664246
#> 5            1  3.0807866
#> 6            1 -0.1501653
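
As a rough illustration of what a "recoverable" effect means, the sketch below simulates comparable data in base R and applies a simple two-stage predictor-substitution estimate. Everything here is an assumption for illustration: the data are generated inline rather than by simulateMRData(), the allele frequency and noise model are invented, and the two-stage approach is a generic technique rather than the estimator this package uses.

```r
# Illustrative check only; the data model and estimator are assumptions.
set.seed(42)
n <- 5000
nSnps <- 10
trueEffect <- 0.5

# Simulate genotypes, one confounder, an exposure, and a binary outcome
geno <- matrix(rbinom(n * nSnps, size = 2, prob = 0.3), nrow = n)
colnames(geno) <- paste0("snp_rs", seq_len(nSnps))
snpEffects <- runif(nSnps, min = 0.1, max = 0.5)
confounder <- rnorm(n)
exposure <- as.vector(geno %*% snpEffects) + 0.3 * confounder + rnorm(n)
outcome <- rbinom(n, size = 1,
                  prob = plogis(trueEffect * exposure + 0.3 * confounder))

# Stage 1: predict the exposure from the SNPs alone
stage1 <- lm(exposure ~ geno)
predictedExposure <- fitted(stage1)

# Stage 2: logistic regression of the outcome on the predicted exposure
stage2 <- glm(outcome ~ predictedExposure, family = binomial())
ivEstimate <- unname(coef(stage2)["predictedExposure"])

cat("True effect (log-OR):", trueEffect, "\n")
cat("Two-stage estimate:  ", round(ivEstimate, 3), "\n")
```

With a logistic outcome, the two-stage estimate is only approximate because odds ratios are non-collapsible, so it should land near the true value rather than match it exactly.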