This function simulates trait data from a genotype matrix.

simulate_traits(
  genotype_matrix,
  n_causal = 1000,
  n_trait_specific = 10,
  n_pleiotropic = 10,
  H2 = 0.6,
  d = 2,
  rho = 0.8,
  marginal_correlation = 0.3,
  epistatic_correlation = 0.3,
  group_ratio_trait = 1,
  group_ratio_pleiotropic = 1,
  maf_threshold = 0.01,
  seed = 67132,
  logLevel = "INFO",
  logFile = NULL
)

Arguments

genotype_matrix

Genotype matrix with samples as rows, and SNPs as columns.

n_causal

Number of SNPs that are causal.

n_trait_specific

Number of causal SNPs with single-trait epistatic effects.

n_pleiotropic

Number of SNPs with epistatic effects on all traits.

H2

Broad-sense heritability. Can be a vector.

d

Number of traits.

rho

Proportion of heritability explained by additivity.

marginal_correlation

Correlation between the additive effects of the traits.

epistatic_correlation

Correlation between the epistatic effects of the traits.

group_ratio_trait

Ratio of the sizes of the trait-specific groups that interact, e.g., a ratio of 1:3 corresponds to the value 3.

group_ratio_pleiotropic

Ratio of the sizes of the pleiotropic groups that interact, e.g., a ratio of 1:3 corresponds to the value 3.

maf_threshold

Numeric threshold for the minor allele frequency. SNPs with a minor allele frequency below this threshold are not included in the causal SNPs.

seed

Random seed for simulation.

logLevel

String defining the log level for the logging package.

logFile

String defining the name of the log file for the logging output.
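
The H2 and rho arguments together determine how the trait variance is split. As a minimal sketch, assuming the total trait variance is scaled to 1 and that a vector H2 supplies one value per trait (both are assumptions, not confirmed by this page), the shares can be worked out as follows. The helper partition_variance is hypothetical and not part of the package.

```r
# Hypothetical helper: split a total trait variance of 1 into additive,
# epistatic, and residual shares. Not part of the package.
partition_variance <- function(H2 = 0.6, rho = 0.8) {
  stopifnot(all(H2 >= 0 & H2 <= 1), all(rho >= 0 & rho <= 1))
  rbind(
    additive  = H2 * rho,        # share of heritability due to additivity
    epistatic = H2 * (1 - rho),  # remaining heritable share
    residual  = 1 - H2           # non-heritable share
  )
}

# One column per trait (assumed interpretation of a vector H2).
shares <- partition_variance(H2 = c(0.6, 0.4), rho = 0.8)
shares
```

With the defaults H2 = 0.6 and rho = 0.8, the additive share is 0.48, the epistatic share is 0.12, and the residual share is 0.4.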

Value

A list containing the trait data, the genotype data, the causal SNPs, and summary statistics.

Details

This function takes a genotype matrix and simulates trait data under the following model: beta_i ~ MN(0, V_i, I), i in {additive, epistatic, residual}.

The effect sizes follow a matrix normal distribution with no correlation between the samples but with covariance between the effects on different traits.
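
To make the matrix normal draw concrete, here is a base R sketch of sampling effect sizes with independent rows and correlated columns (traits). It is an illustration, not the package's implementation. It assumes that the trait covariance V has unit variances and a common off-diagonal correlation r, in the spirit of marginal_correlation, and the names m, r, V, Z, and B are chosen only for this sketch.

```r
set.seed(1)

m <- 10000  # number of effects to draw; rows are independent
d <- 2      # number of traits
r <- 0.3    # assumed correlation between traits

# Trait covariance: unit variances, correlation r off the diagonal.
V <- matrix(r, nrow = d, ncol = d)
diag(V) <- 1

# Z has i.i.d. standard normal entries. chol(V) returns an upper
# triangular factor U with t(U) %*% U == V, so each row of Z %*% U
# has covariance V while the rows stay independent.
Z <- matrix(rnorm(m * d), nrow = m, ncol = d)
B <- Z %*% chol(V)

# The empirical covariance across rows should be close to V.
round(cov(B), 2)
```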

Examples

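p <- 200  # number of SNPs
f <- 10   # number of causal SNPs
g <- 4    # number of trait-specific and of pleiotropic SNPs
n <- 100  # number of samples
d <- 3    # number of traits
# Random genotype matrix with samples as rows and SNPs as columns
X <- matrix(
    runif(p * n),
    ncol = p
)
data <- simulate_traits(
    X, n_causal = f, n_trait_specific = g, n_pleiotropic = g, d = d, maf_threshold = 0,
    logLevel = "ERROR"
)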
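
The example above fills X with uniform random numbers and sets maf_threshold = 0. The following sketch builds a genotype matrix of allele counts and checks which SNPs would pass a minor allele frequency threshold. It assumes genotypes are coded as 0, 1, or 2 copies of one allele. The helper minor_allele_frequency is hypothetical, and whether the function itself compares with >= or > is not stated here.

```r
# Hypothetical helper: minor allele frequency per SNP (column),
# assuming genotypes are coded as 0/1/2 copies of one allele.
minor_allele_frequency <- function(X) {
  freq <- colMeans(X) / 2
  pmin(freq, 1 - freq)
}

set.seed(67132)
n <- 100  # number of samples
p <- 200  # number of SNPs

# Draw one allele frequency per SNP, then binomial allele counts.
# matrix() fills column by column, so each column shares one frequency.
allele_freq <- runif(p, min = 0.05, max = 0.5)
X <- matrix(
  rbinom(n * p, size = 2, prob = rep(allele_freq, each = n)),
  nrow = n, ncol = p
)

maf <- minor_allele_frequency(X)
keep <- maf >= 0.01  # assumed comparison for illustration
sum(keep)            # number of SNPs passing the threshold
```

A matrix built this way can be passed to simulate_traits in place of X from the example above, with maf_threshold set to the value used in the check.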