This function runs a multivariate version of the MArginal ePIstasis Test (mvMAPIT) under the following model variations:

mvmapit(
  X,
  Y,
  Z = NULL,
  C = NULL,
  threshold = 0.05,
  accuracy = 1e-08,
  test = c("normal", "davies", "hybrid"),
  cores = 1,
  variantIndex = NULL,
  logLevel = "WARN",
  logFile = NULL
)

Arguments

X

is the p x n genotype matrix where p is the number of variants and n is the number of samples. Must be a matrix and not a data.frame.

Y

is the d x n matrix of d quantitative or continuous traits for n samples.

Z

is the matrix q x n matrix of covariates. Must be a matrix and not a data.frame.

C

is an n x n covariance matrix detailing environmental effects and population structure effects.

threshold

is a parameter detailing the value at which to recalibrate the Z test p values. If nothing is defined by the user, the default value will be 0.05 as recommended by the Crawford et al. (2017).

accuracy

is a parameter setting the davies function numerical approximation accuracy. This parameter is not needed for the normal test. Smaller p-values than the accuracy will be zero.

test

is a parameter defining what hypothesis test should be run. Takes on values 'normal', 'davies', and 'hybrid'. The 'hybrid' test runs first the 'normal' test and then the 'davies' test on the significant variants.

cores

is a parameter detailing the number of cores to parallelize over. It is important to note that this value only matters when the user has installed OPENMP on their operating system.

variantIndex

is a vector containing indices of variants to be included in the computation.

logLevel

is a string parameter defining the log level for the logging package.

logFile

is a string parameter defining the name of the log file for the logging output. Default is stdout.

Value

A list of P values and PVEs

Details

(1) Standard Model: y = m+g+e where m ~ MVN(0,omega^2K), g ~ MVN(0,sigma^2G), e ~ MVN(0,tau^2M). Recall from Crawford et al. (2017) that m is the combined additive effects from all other variants, represents the additive effect of the k-th variant under the polygenic background of all other variants; K is the genetic relatedness matrix computed using genotypes from all variants other than the k-th; g is the summation of all pairwise interaction effects between the k-th variant and all other variants; G represents a relatedness matrix computed based on pairwise interaction terms between the k-th variant and all other variants. Here, we also denote D = diag(x_k) to be an n × n diagonal matrix with the genotype vector x_k as its diagonal elements. It is important to note that both K and G change with every new marker k that is considered. Lastly; M is a variant specific projection matrix onto both the null space of the intercept and the corresponding genotypic vector x_k.

(2) Standard + Covariate Model: y = Wa+m+g+e where W is a matrix of covariates with effect sizes a.

(3) Standard + Common Environment Model: y = m+g+c+e i where c ~ MVN(0,eta^2C) controls for extra environmental effects and population structure with covariance matrix C.

(4) Standard + Covariate + Common Environment Model: y = Wa+m+g+c+e

This function will consider the following three hypothesis testing strategies which are featured in Crawford et al. (2017): (1) The Normal or Z test (2) Davies Method (3) Hybrid Method (Z test + Davies Method)

Examples

set.seed(837)
p <- 200
n <- 100
d <- 2
X <- matrix(
    runif(p * n),
    ncol = p
)
Y <- matrix(
    runif(d * n),
    ncol = d
)
mapit <- mvmapit(
    t(X),
    t(Y),
    test = "normal", cores = 1, logLevel = "INFO"
)
#> 2023-10-16 16:39:04.622865 INFO:mvmapit:Running normal C++ routine.