Fit conditional margins to standard Gaussian distributions for a list of genes.
Source:R/fit_marginals.R
fit_marginals.RdThis function applies marginal fitting and transforms each gene's expression to follow a standard Gaussian distribution.
Usage
fit_marginals(
gene_list,
count_mat,
cov_mat,
formula1,
family1,
DT = TRUE,
epsilon = 1e-06,
ncores = ncores,
seed = 123
)Arguments
- gene_list
A vector of gene names or indices (row names or row numbers of
count_mat) to process.- count_mat
A matrix of raw gene expression counts. Rows correspond to genes, columns to observations (cells/spots).
- cov_mat
A matrix or data frame of covariates used for marginal modeling (e.g., spatial coordinates or experimental annotations).
- formula1
A formula object or string (e.g.,
"~ covariate1 + covariate2") specifying the model structure for the mean.- family1
A string specifying the distribution family to be used for modeling. Supported values include
"gaussian","poisson","nb", or"zinb".- DT
Logical; if
TRUE, applies a discrete transformation suitable for count data. Default isTRUE.- epsilon
A small numeric constant to avoid boundary issues (e.g.,
0or1values in uniform distribution). Default is1e-6.- ncores
Integer specifying the number of cores to use for parallel processing via
parallel::mclapply.- seed
Random seed for reproducibility. Default is
123.
Value
A list containing two matrices:
- marginal
A matrix of transformed values from each gene, transformed to follow a standard normal distribution.
- residual
A matrix of values transformed to standard uniform distribution (before applying the Gaussian quantile function).
Examples
data(test_data)
# Fit standardized marginals for gene expressions
marginal_res <- fit_marginals(
gene_list = test_data$gene_list,
count_mat = test_data$count_mat,
cov_mat = test_data$cov_mat,
formula1 = "layer_annotations",
family1 = "nb",
DT = TRUE,
epsilon = 1e-6,
ncores = 2
)