Recovers SEs from scRNA-seq or single-cell spatial transcriptomics data by assigning each cell to an SE or to NonSE.
Usage
RecoverSE(
  dat,
  celltypes,
  scale = TRUE,
  ncell.per.run = 500,
  Ws = NULL,
  min.score = 0.6,
  ncores = 8
)
Arguments
- dat
A numeric (sparse) matrix of gene expression data, from single-cell spatial transcriptomics or scRNA-seq data.
- celltypes
Character vector specifying the cell type annotations for the cells in the gene expression matrix `dat`. With the default model, the expected cell types are B, CD4T, CD8T, NK, Plasma, Macrophage, DC, Fibroblast, and Endothelial.
- scale
Logical indicating whether to perform unit-variance normalization (default: TRUE). Change it with caution.
- ncell.per.run
Integer specifying the maximum number of cells per NMF prediction run, used to avoid memory issues (default: 500).
- Ws
A list of cell-type-specific W matrices used to recover SE-specific cell states (default: NULL). Each element in the list should be named after the corresponding cell type and contain a W matrix from an NMF model.
- min.score
A numeric threshold (0-1) specifying the minimum prediction score for SE classification (default: 0.6); cells with lower scores are assigned to NonSE.
- ncores
Integer specifying the number of CPU cores to use for parallel processing (default: 8).
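The arguments above can be illustrated with a minimal sketch of how the inputs might be prepared. It assumes that `dat` has genes in rows and cells in columns (this page does not state the orientation), uses made-up gene names and counts, and calls `RecoverSE` only if the function is already available in the session. The chunking shown for `ncell.per.run` only illustrates how a per-run cell limit bounds memory use; it is not the package's internal code.

```r
library(Matrix)  # sparse matrix classes; a recommended package shipped with R

set.seed(1)
n_genes <- 50
n_cells <- 1200

# Toy sparse expression matrix with hypothetical counts.
# Assumed layout: genes in rows, cells in columns.
dat <- Matrix(
  rpois(n_genes * n_cells, lambda = 0.3),
  nrow = n_genes, ncol = n_cells, sparse = TRUE,
  dimnames = list(paste0("Gene", seq_len(n_genes)),
                  paste0("Cell", seq_len(n_cells)))
)

# One annotation per cell, drawn from the cell types the default model expects
expected_types <- c("B", "CD4T", "CD8T", "NK", "Plasma",
                    "Macrophage", "DC", "Fibroblast", "Endothelial")
celltypes <- sample(expected_types, n_cells, replace = TRUE)

# Basic sanity checks before calling the function
stopifnot(length(celltypes) == ncol(dat))
stopifnot(all(celltypes %in% expected_types))

# Illustration only: splitting cells into runs of at most ncell.per.run cells
ncell.per.run <- 500
chunks <- split(seq_len(n_cells), ceiling(seq_len(n_cells) / ncell.per.run))
chunk_sizes <- lengths(chunks)

# Call the function only if it has been loaded into the session
if (exists("RecoverSE", mode = "function")) {
  res <- RecoverSE(dat, celltypes,
                   ncell.per.run = ncell.per.run,
                   min.score = 0.6, ncores = 2)
}
```

Lowering `ncell.per.run` should reduce peak memory at the cost of more prediction runs, and `ncores` is best kept at or below the number of cores actually available.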
Value
A data frame with five columns: CID (cell ID), CellType (cell type), InitSE (raw prediction), SE (processed prediction), and PredScore (prediction probability for the assigned SE). InitSE records the raw prediction, in which cells are assigned to the 11 initial SEs discovered from MERSCOPE data. In the SE column, cells are assigned to "NonSE" if (1) they were initially assigned to one of the two initial SEs that lack SE-specific cell states, or whose cell states are not specific to the SE or not conserved across cancers, or (2) their prediction score is below the minimum probability threshold (`min.score`, 0.6 by default).
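The relationship between InitSE, PredScore, and SE described above can be sketched in base R. This is an illustration under stated assumptions rather than the package's implementation: the cell IDs, scores, and SE labels are made up, `excluded_init_se` is a hypothetical placeholder for the two initial SEs that are mapped to NonSE (their actual names are not given here), and the real function may also relabel the remaining SEs.

```r
# Hypothetical raw predictions; all values below are invented for illustration
pred <- data.frame(
  CID       = paste0("Cell", 1:6),
  CellType  = c("B", "CD4T", "NK", "DC", "Plasma", "Fibroblast"),
  InitSE    = c("InitSE1", "InitSE2", "InitSE3", "InitSE1", "InitSE4", "InitSE2"),
  PredScore = c(0.92, 0.41, 0.77, 0.65, 0.85, 0.59),
  stringsAsFactors = FALSE
)

min.score <- 0.6

# Placeholder label for initial SEs that are always mapped to NonSE (assumption)
excluded_init_se <- c("InitSE4")

# A cell becomes NonSE if its initial SE is excluded
# or its prediction score is below min.score
is_nonse <- pred$InitSE %in% excluded_init_se | pred$PredScore < min.score
pred$SE <- ifelse(is_nonse, "NonSE", pred$InitSE)

# Match the column order described in the Value section
pred <- pred[, c("CID", "CellType", "InitSE", "SE", "PredScore")]

# Summarize how many cells fall into each final label
se_counts <- table(pred$SE)
```

Comparing `table(pred$InitSE)` with `se_counts` on real output is a quick way to see how many cells were moved to NonSE by the score threshold.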
