Estimates the memory (in gigabytes) required to run the Sparse Marginal Epistasis (SME) routine for a given number of samples, number of SNPs, and analysis configuration (number of genotype blocks, number of random vectors, and chunk size).

Usage

approximate_memory_requirements(
  n_samples,
  n_snps,
  n_blocks,
  n_randvecs,
  chunksize
)

Arguments

n_samples

Integer. The number of samples in the dataset.

n_snps

Integer. The total number of SNPs in the dataset.

n_blocks

Integer. The number of genotype blocks used to partition SNPs. Affects the size of encoded genotype segments.

n_randvecs

Integer. The number of random vectors used for stochastic trace estimation. Affects memory for operations involving random vectors.

chunksize

Integer. The number of focal SNPs processed per chunk.

Value

Numeric. The approximate memory requirement (in gigabytes) for the SME routine.

Details

The function calculates memory usage by summing the contributions from various components used in the SME routine, including:

  • Variance component estimates (vc_estimates)

  • Phenotype-related matrices

  • Random vector-based computations

  • Genotype objects and block statistics

  • Gene-by-gene interaction masks

The estimate is derived from the data dimensions and the working storage the routine needs. Treat it as a guideline for provisioning resources for the analysis, not as an exact measurement.
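As a rough illustration of how a sum of this kind is built, the sketch below estimates the footprint of dense double-precision matrices from their dimensions and adds them up. The helper `dense_gb()` and the two example components are hypothetical and are not the package's actual accounting; the function itself may count different objects and may use a different gigabyte convention.

```r
# Hypothetical helper: size in GB of a dense double-precision matrix.
# Assumes 8 bytes per element and 1 GB = 1024^3 bytes.
dense_gb <- function(n_rows, n_cols) {
  n_rows * n_cols * 8 / 1024^3
}

n_samples <- 1e5
n_randvecs <- 100

# Illustrative components only, not the package's real breakdown.
components <- c(
  phenotype      = dense_gb(n_samples, 1),
  random_vectors = dense_gb(n_samples, n_randvecs)
)

# The total is the sum of the per-component sizes.
total_gb <- sum(components)
print(round(components, 4))
print(round(total_gb, 4))
```

The point of the sketch is only that each term scales with a product of the input dimensions, which is why increasing n_samples or n_randvecs raises the estimate.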

Examples

n_samples <- 1e5
n_snps <- 1e6
n_blocks <- 100
n_randvecs <- 100
chunksize <- 10
approximate_memory_requirements(n_samples,
                                n_snps,
                                n_blocks,
                                n_randvecs,
                                chunksize)
#> [1] 6.447136
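
One possible use of the estimate is to choose a chunk size that fits a memory budget. The sketch below is an assumption about how a caller might do this, not part of the package: `pick_chunksize()` is a hypothetical helper that takes the estimator as an argument, evaluates it for each candidate chunk size, and returns the largest candidate whose estimate stays within the budget.

```r
# Hypothetical helper: return the largest candidate chunksize whose
# estimated memory fits within budget_gb, or NA if none fits.
# `estimator` is expected to take the same five arguments, in the same
# order, as approximate_memory_requirements().
pick_chunksize <- function(estimator, budget_gb, candidates,
                           n_samples, n_snps, n_blocks, n_randvecs) {
  estimates <- vapply(
    candidates,
    function(cs) estimator(n_samples, n_snps, n_blocks, n_randvecs, cs),
    numeric(1)
  )
  fits <- candidates[estimates <= budget_gb]
  if (length(fits) == 0) NA_real_ else max(fits)
}

# Usage, assuming the package that provides
# approximate_memory_requirements() is attached:
# pick_chunksize(approximate_memory_requirements, budget_gb = 16,
#                candidates = c(10, 50, 100, 250),
#                n_samples = 1e5, n_snps = 1e6,
#                n_blocks = 100, n_randvecs = 100)
```

Passing the estimator as an argument keeps the helper independent of the package, and filtering the candidates avoids assuming that the estimate grows monotonically with chunksize.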