This function uses a pretrained NMF model to recover cell states or spatial ecotypes (SEs) in new data. It takes the factorization matrix W of the pretrained model and a numeric gene expression matrix, and returns NMF prediction scores for each sample or cell.
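To make the idea of predicting with a pretrained NMF model concrete, the following is a minimal sketch of projecting new data onto a fixed basis matrix. It is not the package's implementation: it assumes a genes-by-samples input, uses the standard multiplicative update for the coefficient matrix under a Frobenius loss, and every name in it (`W_toy`, `V`, `H`) is hypothetical toy data.

```r
# Illustrative sketch only: NOT the internal code of NMFpredict().
# Assumption: W is genes x factors and the new data V is genes x samples.
set.seed(1)
n_genes <- 50; n_factors <- 3; n_samples <- 8
W_toy  <- matrix(runif(n_genes * n_factors), n_genes, n_factors)     # hypothetical pretrained basis
H_true <- matrix(runif(n_factors * n_samples), n_factors, n_samples)
V      <- W_toy %*% H_true                                           # toy "new" expression data

# Random non-negative starting coefficients
H <- matrix(runif(n_factors * n_samples), n_factors, n_samples)
rel_err0 <- norm(V - W_toy %*% H, "F") / norm(V, "F")

# Multiplicative updates: W stays fixed, only H is re-estimated
WtV <- crossprod(W_toy, V)
WtW <- crossprod(W_toy)
for (i in seq_len(500)) {
  H <- H * WtV / (WtW %*% H + 1e-12)
}

rel_err <- norm(V - W_toy %*% H, "F") / norm(V, "F")
scores  <- t(H)   # samples x factors, the same orientation as the returned matrix
```

Because only the coefficients are updated, the factors keep the meaning they had in the pretrained model, which is what allows scores from different datasets to be compared.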
Arguments
- W
- Matrix representing the factorization matrix W of a pretrained NMF model. 
- testdat
- Numeric matrix containing the new data for which NMF scores are to be predicted. 
- scale
- Logical indicating whether to scale the input data. 
- ncell.per.run
- Integer specifying the maximum number of cells per NMF prediction run to avoid memory issues. 
- sum2one
- Logical. If `TRUE`, normalizes the predicted scores so that they sum to 1 for each sample. 
- ncores
- Integer specifying the number of CPU cores to use for parallel processing. 
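As a rough illustration of what `sum2one` and `ncell.per.run` describe, the sketch below normalizes a toy score matrix so each row sums to 1 and splits the columns of a toy genes-by-cells matrix into batches of bounded size. The object names, the batch size of 3, and the assumption that cells are stored in columns are all hypothetical; the function itself may handle both steps differently.

```r
# Illustrative sketch only; toy data and hypothetical names throughout.
set.seed(42)
scores <- matrix(runif(7 * 3), nrow = 7,
                 dimnames = list(paste0("cell", 1:7), c("NonSE", "SE1", "SE2")))

# sum2one-style normalization: divide each row by its total
scores_norm <- scores / rowSums(scores)

# ncell.per.run-style batching: assume cells are columns of the input
testdat_toy <- matrix(runif(10 * 7), nrow = 10,
                      dimnames = list(paste0("gene", 1:10), paste0("cell", 1:7)))
ncell_per_run <- 3                      # hypothetical batch size
idx <- seq_len(ncol(testdat_toy))
batches <- split(idx, ceiling(idx / ncell_per_run))
chunk_sizes <- lengths(batches)         # number of cells in each batch
```

Each batch could then be scored separately and the results row-bound, which keeps peak memory proportional to the batch size rather than to the full dataset.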
Value
A matrix of NMF prediction scores, with samples or cells as rows and SE groups as columns.
Examples
library(googledrive)
drive_deauth() # no Google sign-in is required
drive_download(as_id("14QvmgISxaArTzWt_UHvf55aAYN2zm84Q"), "SKCM_RNASeqV2.geneExp.rds",
                    overwrite = TRUE)
#> File downloaded:
#> • SKCM_RNASeqV2.geneExp.rds <id: 14QvmgISxaArTzWt_UHvf55aAYN2zm84Q>
#> Saved locally as:
#> • SKCM_RNASeqV2.geneExp.rds
bulkdata <- readRDS("SKCM_RNASeqV2.geneExp.rds")
# Load the pretrained factorization matrix shipped with the package
W <- readRDS(system.file("extdata", "Bulk_SE_Recovery_W.rds", package = "SpatialEcoTyper"))
# Predict SE abundances in bulk tumors
preds <- NMFpredict(W = W, testdat = bulkdata, scale = TRUE)
head(preds[, 1:5])
#>                      NonSE        SE1        SE2         SE3        SE4
#> TCGA-3N-A9WB-06 0.10378493 0.08326065 0.25368726 0.094176154 0.07931696
#> TCGA-3N-A9WC-06 0.03110314 0.06302208 0.05111014 0.173416088 0.02882935
#> TCGA-3N-A9WD-06 0.46412765 0.15760965 0.07094936 0.007223942 0.05750158
#> TCGA-BF-A1PU-01 0.12913097 0.18470256 0.09710288 0.069662022 0.16008800
#> TCGA-BF-A1PV-01 0.10221472 0.05277487 0.05740510 0.159600536 0.12876903
#> TCGA-BF-A1PX-01 0.28552813 0.01531421 0.07115704 0.010845747 0.07130197
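A common follow-up is to label each sample with its highest-scoring group. The sketch below does this on a small made-up matrix with the same layout as the returned scores (samples in rows, groups in columns); `toy_preds` and its values are hypothetical and are not taken from the output above.

```r
# Illustrative sketch only; toy_preds is hypothetical, not real NMFpredict() output.
toy_preds <- matrix(
  c(0.10, 0.60, 0.30,
    0.70, 0.20, 0.10,
    0.25, 0.25, 0.50),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("sampleA", "sampleB", "sampleC"), c("NonSE", "SE1", "SE2"))
)

# Column with the largest score in each row; ties go to the first column
dominant <- colnames(toy_preds)[max.col(toy_preds, ties.method = "first")]
names(dominant) <- rownames(toy_preds)
dominant
```

With real output, the same two lines could be applied to `preds` directly, bearing in mind that a sample with several similar scores may not have a meaningful single label.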
