This function uses a pretrained NMF model to recover cell states or spatial ecotypes (SEs) in new data. It takes the factorization matrix W of the pretrained NMF model and a numeric gene expression matrix, and returns a matrix of prediction scores.
Arguments
- W
Factorization matrix W of a pretrained NMF model.
- testdat
Numeric matrix containing the new data for which NMF scores are to be predicted.
- scale
Logical indicating whether to scale the input data.
- ncell.per.run
Integer specifying the maximum number of cells per NMF prediction run to avoid memory issues.
- sum2one
Logical. If `TRUE`, normalizes the predicted scores so that they sum to 1 for each sample.
- ncores
Integer specifying the number of CPU cores to use for parallel processing.
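The `ncell.per.run` argument caps how many cells are handled in a single prediction run. A minimal sketch of that batching idea follows. It is not the package's implementation: `chunk_columns` is a hypothetical helper, the toy matrix is random, and the sketch assumes that cells or samples are stored in the columns of `testdat`.

```r
# Hypothetical helper, not part of SpatialEcoTyper: split cell indices into
# batches of at most `ncell.per.run` cells.
chunk_columns <- function(n_cells, ncell.per.run) {
  idx <- seq_len(n_cells)
  unname(split(idx, ceiling(idx / ncell.per.run)))
}

# Toy expression matrix: 4 genes x 10 cells (assumes cells are in columns).
set.seed(1)
testdat <- matrix(runif(40), nrow = 4,
                  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:10)))

batches <- chunk_columns(ncol(testdat), ncell.per.run = 4)
lengths(batches)  # batch sizes: 4, 4, 2

# Each batch is a column subset that could be scored on its own and the
# per-batch results combined afterwards.
batch_sizes <- vapply(batches,
                      function(j) ncol(testdat[, j, drop = FALSE]),
                      integer(1))
```

Smaller batches lower peak memory at the cost of more runs. How the package distributes batches across `ncores` is not shown here.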
Value
A matrix representing the NMF prediction scores, with rows as samples/cells and columns as SE groups.
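To illustrate what `sum2one = TRUE` means for the returned matrix, here is a minimal sketch of row-wise normalization on made-up scores. The values and the object names `scores` and `normalized` are illustrative assumptions, and the package may handle edge cases (such as rows that are all zero) differently.

```r
# Toy non-negative scores: 3 samples (rows) x 3 groups (columns).
# The values are made up for illustration.
scores <- matrix(c(2, 1, 1,
                   0, 3, 1,
                   5, 5, 0),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("sample", 1:3),
                                 c("NonSE", "SE1", "SE2")))

# Divide each row by its total so that every sample's scores sum to 1.
normalized <- scores / rowSums(scores)
rowSums(normalized)
```

After normalization, each row can be read as the relative abundance of the groups within that sample.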
Examples
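library(data.table)      # provides fread()
library(SpatialEcoTyper)
# Download the bulk expression data and use the first column as row names
bulkdata <- fread("https://spatialecotyper.stanford.edu/inc/inc.public.vignettes.php?file=SKCM_RNASeqV2.geneExp.tsv",
                  sep = "\t", header = TRUE, data.table = FALSE)
rownames(bulkdata) <- bulkdata[, 1]
bulkdata <- as.matrix(bulkdata[, -1])
# Load the pretrained factorization matrix W shipped with the package
W <- readRDS(file.path(system.file("extdata", package = "SpatialEcoTyper"), "Bulk_SE_Recovery_W.rds"))
# Predict SE abundances in bulk tumors
preds <- NMFpredict(W = W, testdat = bulkdata, scale = TRUE)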
head(preds[, 1:5])
#> NonSE SE1 SE2 SE3 SE4
#> TCGA.3N.A9WB.06 0.10378493 0.08326065 0.25368726 0.094176154 0.07931696
#> TCGA.3N.A9WC.06 0.03110314 0.06302208 0.05111014 0.173416088 0.02882935
#> TCGA.3N.A9WD.06 0.46412765 0.15760965 0.07094936 0.007223942 0.05750158
#> TCGA.BF.A1PU.01 0.12913097 0.18470256 0.09710288 0.069662022 0.16008800
#> TCGA.BF.A1PV.01 0.10221472 0.05277487 0.05740510 0.159600536 0.12876903
#> TCGA.BF.A1PX.01 0.28552813 0.01531421 0.07115704 0.010845747 0.07130197
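One possible follow-up is to label each sample with its highest-scoring group. The sketch below rebuilds the first three rows of the printed output by hand so that it runs without the package. `preds_head` and `top_group` are illustrative names, and because only the five displayed columns are included, the result on the full `preds` matrix could differ.

```r
# First three rows of the printed output, re-entered by hand.
preds_head <- matrix(
  c(0.10378493, 0.08326065, 0.25368726, 0.094176154, 0.07931696,
    0.03110314, 0.06302208, 0.05111014, 0.173416088, 0.02882935,
    0.46412765, 0.15760965, 0.07094936, 0.007223942, 0.05750158),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("TCGA.3N.A9WB.06", "TCGA.3N.A9WC.06", "TCGA.3N.A9WD.06"),
                  c("NonSE", "SE1", "SE2", "SE3", "SE4")))

# For each sample (row), pick the column with the largest score.
top_group <- colnames(preds_head)[apply(preds_head, 1, which.max)]
names(top_group) <- rownames(preds_head)
top_group
```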
