This function estimates the abundances of spatial ecotypes (SEs) in bulk tumor gene expression data using a provided basis matrix (`W`) from a non-negative matrix factorization (NMF) model.
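To make the role of the basis matrix concrete, here is a minimal base-R sketch of estimating abundances when `W` is held fixed. It is not the package's implementation: the simulated matrices, the variable names (`V`, `H`, `H_true`), and the choice of multiplicative updates are all assumptions made for illustration.

```r
# Illustrative only: simulated data, not the pretrained model shipped with the package.
set.seed(1)
n_genes <- 50; n_se <- 3; n_samples <- 4

W <- matrix(runif(n_genes * n_se), n_genes, n_se)            # hypothetical basis (genes x SEs)
H_true <- matrix(runif(n_se * n_samples), n_se, n_samples)   # hypothetical abundances (SEs x samples)
V <- W %*% H_true                                            # simulated expression (genes x samples)

# Estimate H with W fixed, using the standard multiplicative update
# for the Frobenius-norm NMF objective. Updates keep H non-negative.
H <- matrix(1, n_se, n_samples)
err_start <- norm(V - W %*% H, "F")
eps <- 1e-12
for (i in seq_len(200)) {
  H <- H * (crossprod(W, V) / (crossprod(W) %*% H + eps))
}
err_end <- norm(V - W %*% H, "F")
```

Each column of `H` plays the role of one sample's estimated abundances. A non-negative least squares solver would be another reasonable way to solve the same fixed-`W` problem.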
Usage
DeconvoluteSE(
  dat,
  scale = TRUE,
  W = NULL,
  nsample.per.run = 500,
  sum2one = TRUE,
  ncores = 8
)
Arguments
- dat
A numeric matrix of gene expression data, e.g. a TPM matrix for bulk RNA-seq data or a logCPM matrix for Visium data. In the example below, genes are in rows and samples are in columns.
- scale
Logical. If `TRUE`, the input data is scaled before making predictions.
- W
A numeric matrix representing the basis matrix (`W`) from a pretrained NMF model.
- nsample.per.run
Integer specifying the maximum number of samples to process per prediction run. Processing samples in smaller batches limits memory usage.
- sum2one
Logical. If `TRUE`, normalizes the predicted SE abundances so that they sum to 1 for each sample.
- ncores
Integer specifying the number of CPU cores to use for parallel processing.
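The effect of `sum2one` and the batching idea behind `nsample.per.run` can be sketched in base R. This is an illustration under assumptions, not the function's internal code: the matrix `abund`, the variable `chunk_size`, and the chunking scheme are hypothetical.

```r
# Hypothetical abundance matrix: 3 samples (rows) x 4 SEs (columns).
abund <- matrix(c(2, 1, 1, 0,
                  1, 1, 1, 1,
                  0, 3, 0, 1),
                nrow = 3, byrow = TRUE,
                dimnames = list(paste0("sample", 1:3), paste0("SE", 1:4)))

# Rescale each row so its entries sum to 1, as described for sum2one = TRUE.
abund_norm <- abund / rowSums(abund)

# Split sample indices into batches of at most chunk_size samples,
# the general idea behind capping the number of samples per run.
chunk_size <- 2
idx <- seq_len(nrow(abund))
chunks <- split(idx, ceiling(idx / chunk_size))
```

With three samples and a batch size of two, `chunks` holds two batches: samples 1 and 2, then sample 3.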
Examples
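library(data.table)      # provides fread()
library(SpatialEcoTyper) # provides DeconvoluteSE()
# Load the example bulk expression matrix
bulkdata <- fread("https://spatialecotyper.stanford.edu/inc/inc.public.vignettes.php?file=SKCM_RNASeqV2.geneExp.tsv",
                  sep = "\t", header = TRUE, data.table = FALSE)
# Use the first column as row names, then drop it and convert to a matrix
rownames(bulkdata) <- bulkdata[, 1]
bulkdata <- as.matrix(bulkdata[, -1])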
# Predict SE abundances in bulk tumors
se_abundances <- DeconvoluteSE(dat = bulkdata)
head(se_abundances[, 1:5])
#>                      NonSE        SE1        SE2        SE3        SE4
#> TCGA.3N.A9WB.06 0.19380234 0.05372667 0.16735600 0.12581663 0.07216666
#> TCGA.3N.A9WC.06 0.05293689 0.13901340 0.08427617 0.08372047 0.04243859
#> TCGA.3N.A9WD.06 0.27200075 0.17114795 0.06288761 0.05641549 0.10166721
#> TCGA.BF.A1PU.01 0.16899510 0.13618971 0.06501306 0.09269615 0.17313296
#> TCGA.BF.A1PV.01 0.12078283 0.06921834 0.05970319 0.16813008 0.11059648
#> TCGA.BF.A1PX.01 0.21398415 0.04019509 0.08906215 0.04499893 0.08561680
