Identify Spatial EcoTypes from Single-cell Spatial Data (A Single Sample)
Source: R/SpatialEcoTyper.R
This function identifies spatially distinct cellular ecosystems (SEs) from a single sample.
Usage
SpatialEcoTyper(
  normdata,
  metadata,
  outprefix = "SE",
  radius = 50,
  resolution = 0.5,
  nfeatures = 3000,
  min.cts.per.region = 1,
  npcs = 20,
  k.sn = 50,
  k = 20,
  min.cells = 5,
  min.features = 10,
  iterations = 5,
  minibatch = 5000,
  ncores = 1,
  grid.size = round(radius * 1.4),
  filter.region.by.celltypes = NULL
)
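The default `grid.size = round(radius * 1.4)` in the signature above can be read geometrically. The sketch below is an interpretation, not the package's code: it assumes microregions are discs of the given radius centred on the nodes of a square grid, which this page does not state explicitly. Under that assumption, the point farthest from any node lies at half the cell diagonal, `grid.size / sqrt(2)`, so a factor of 1.4 (just under `sqrt(2)`) keeps every location within one radius of a node. `max_dist_to_node` is a hypothetical helper.

```r
# Hypothetical helper (not part of SpatialEcoTyper): on a square grid with
# spacing `grid.size`, the farthest any point can be from its nearest grid
# node is half the diagonal of a grid cell.
max_dist_to_node <- function(grid.size) grid.size / sqrt(2)

radius <- 50

# Default from the Usage block: round(radius * 1.4)
default_grid <- round(radius * 1.4)
covered_default <- max_dist_to_node(default_grid) <= radius

# A coarser grid (illustrative value) leaves gaps between discs of this radius
coarse_grid <- 100
covered_coarse <- max_dist_to_node(coarse_grid) <= radius

print(c(default_grid = default_grid, coarse_grid = coarse_grid))
print(c(covered_default = covered_default, covered_coarse = covered_coarse))
```

If the assumption holds, this matches the note on `grid.size` in the arguments: larger values speed up the analysis but can leave cells between bins unassigned.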
Arguments
- normdata
A matrix representing normalized gene expression data, where rows correspond to genes and columns correspond to cells.
- metadata
A data frame containing metadata associated with each cell. Must include spatial coordinates (e.g., X and Y) as well as cell type annotations. The row names of metadata must match the column names of normdata.
- outprefix
Character string specifying the prefix for output file names.
- radius
Numeric specifying the radius (in the same units as spatial coordinates) for defining spatial neighborhoods around each cell. Default is 50.
- resolution
Numeric specifying the resolution for Louvain clustering (default: 0.5).
- nfeatures
Integer specifying the number of top variable features (default: 3000) used for PCA.
- min.cts.per.region
Integer specifying the minimum number of cell types required for a microregion (default: 1).
- npcs
Integer specifying the number of principal components (PCs) (default: 20).
- k.sn
Integer specifying the number of spatial nearest neighbors (default: 50) used to construct the similarity network.
- k
Integer specifying the number of spatial nearest neighbors (default: 20) used to construct spatial meta-cells.
- min.cells
Minimum number of cells / spatial-meta-cells (default: 5) expressing a feature/gene.
- min.features
Minimum number of features (default: 10) detected in a cell / spatial-meta-cell.
- iterations
Integer specifying the number of iterations (default: 5) for SNF analysis.
- minibatch
Integer specifying the number of columns to process in each minibatch in the SNF analysis. Default is 5000. This option splits the matrix into smaller chunks (minibatch), thus reducing memory usage.
- ncores
Integer specifying the number of CPU cores to use for parallel processing (default: 1).
- grid.size
Numeric specifying the grid size for spatial discretization of coordinates. By default, it is derived from the specified radius as round(radius * 1.4), in the same units as the spatial coordinates. Increasing grid.size downsamples microregions and speeds up the analysis, but it may exclude cells located between bins from the SE discovery analysis.
- filter.region.by.celltypes
A character vector specifying the cell types to include in the analysis. Only spatial microregions that contain at least one of the specified cell types will be analyzed, while regions lacking these cell types will be excluded from the SE discovery process. If NULL, all spatial microregions will be included, regardless of cell type composition.
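The requirements on `normdata` and `metadata` described above can be checked before calling the function. The following is a minimal sketch in base R; `check_inputs` is a hypothetical helper, not part of SpatialEcoTyper. The column names `X`, `Y`, and `CellType` are taken from the example on this page, and reading "must match" as "same cell IDs in the same order" is an assumption that may be stricter than the package requires. The toy matrix and data frame are invented for illustration.

```r
# Hypothetical helper (not part of SpatialEcoTyper): verify that metadata has
# the expected columns and that its row names line up with normdata's columns.
check_inputs <- function(normdata, metadata,
                         required = c("X", "Y", "CellType")) {
  missing_cols <- setdiff(required, colnames(metadata))
  if (length(missing_cols) > 0) {
    stop("metadata is missing column(s): ",
         paste(missing_cols, collapse = ", "))
  }
  if (is.null(colnames(normdata))) {
    stop("normdata must have cell IDs as column names")
  }
  # Assumption: cell IDs must agree in both content and order
  if (!identical(rownames(metadata), colnames(normdata))) {
    stop("row names of metadata must match column names of normdata")
  }
  invisible(TRUE)
}

# Toy inputs, invented for illustration: 2 genes x 3 cells
toy_norm <- matrix(c(0, 1.2, 0, 0.7, 0, 2.1), nrow = 2,
                   dimnames = list(c("geneA", "geneB"),
                                   c("cell_1", "cell_2", "cell_3")))
toy_meta <- data.frame(X = c(10, 20, 30), Y = c(5, 15, 25),
                       CellType = c("Fibroblast", "Fibroblast", "CD8T"),
                       row.names = c("cell_1", "cell_2", "cell_3"))

ok <- check_inputs(toy_norm, toy_meta)
```

If the cell IDs agree but the order differs, reordering with `metadata[colnames(normdata), ]`, as done in the Examples section, resolves the mismatch.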
Value
A list containing two elements:
- obj
A Seurat object constructed from the fused similarity network of spatial microregions
- metadata
Updated metadata, with a new column (`SE`) added
Examples
# See https://digitalcytometry.github.io/spatialecotyper/docs/articles/SingleSample.html
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(ggplot2))
suppressPackageStartupMessages(library(parallel))
suppressPackageStartupMessages(library(Seurat))
suppressPackageStartupMessages(library(data.table))
suppressPackageStartupMessages(library(googledrive))
suppressPackageStartupMessages(library(R.utils))
library(SpatialEcoTyper)
drive_deauth() # Disable Google sign-in requirement
# overwrite = TRUE lets the example be re-run when the files already exist locally
drive_download(as_id("13Rc5Rsu8jbnEYYfUse-xQ7ges51LcI7n"), "HumanMelanomaPatient1_subset_counts.tsv.gz", overwrite = TRUE)
drive_download(as_id("12xcZNhpT-xbhcG8kX1QAdTeM9TKeFAUW"), "HumanMelanomaPatient1_subset_scmeta.tsv", overwrite = TRUE)
# Load single-cell gene expression matrix. Rows are genes, columns are cells.
scdata <- fread("HumanMelanomaPatient1_subset_counts.tsv.gz",
                sep = "\t", header = TRUE, data.table = FALSE)
rownames(scdata) <- scdata[, 1] # set genes as row names
scdata <- as.matrix(scdata[, -1])
normdata <- NormalizeData(scdata)
head(normdata[, 1:5])
#> HumanMelanomaPatient1__cell_3655 HumanMelanomaPatient1__cell_3657
#> PDK4 0.000000 4.221184
#> TNFRSF17 0.000000 0.000000
#> ICAM3 0.000000 0.000000
#> FAP 3.552438 0.000000
#> GZMB 0.000000 0.000000
#> TSC2 0.000000 0.000000
#> HumanMelanomaPatient1__cell_3658 HumanMelanomaPatient1__cell_3660
#> PDK4 3.208945 0
#> TNFRSF17 0.000000 0
#> ICAM3 0.000000 0
#> FAP 0.000000 0
#> GZMB 0.000000 0
#> TSC2 0.000000 0
#> HumanMelanomaPatient1__cell_3661
#> PDK4 0
#> TNFRSF17 0
#> ICAM3 0
#> FAP 0
#> GZMB 0
#> TSC2 0
# Load single-cell metadata. Three columns are required: X, Y, and CellType. Others are optional.
scmeta <- read.table("HumanMelanomaPatient1_subset_scmeta.tsv",
                     sep = "\t", header = TRUE, row.names = 1)
scmeta <- scmeta[colnames(scdata), ] # match the cell ids in scdata and scmeta
head(scmeta)
#> X Y CellType CellTypeName
#> HumanMelanomaPatient1__cell_3655 1894.706 -6367.766 Fibroblast Fibroblasts
#> HumanMelanomaPatient1__cell_3657 1942.480 -6369.602 Fibroblast Fibroblasts
#> HumanMelanomaPatient1__cell_3658 1963.007 -6374.026 Fibroblast Fibroblasts
#> HumanMelanomaPatient1__cell_3660 1981.600 -6372.266 Fibroblast Fibroblasts
#> HumanMelanomaPatient1__cell_3661 1742.939 -6374.851 Fibroblast Fibroblasts
#> HumanMelanomaPatient1__cell_3663 1921.683 -6383.309 Fibroblast Fibroblasts
#> Region Dist2Interface
#> HumanMelanomaPatient1__cell_3655 Stroma -883.1752
#> HumanMelanomaPatient1__cell_3657 Stroma -894.8463
#> HumanMelanomaPatient1__cell_3658 Stroma -904.1115
#> HumanMelanomaPatient1__cell_3660 Stroma -907.8909
#> HumanMelanomaPatient1__cell_3661 Stroma -874.2712
#> HumanMelanomaPatient1__cell_3663 Stroma -903.6559
se_results <- SpatialEcoTyper(normdata, scmeta,
                              outprefix = "Melanoma1_subset",
                              radius = 50, ncores = 2)
#> 2024-11-07 01:13:47.274967 Remove 88 genes expressed in fewer than 5 cells
#> 2024-11-07 01:13:47.599961 Construct spatial meta cells for each cell type
#> Total spatial microregions: 2315
#> Total spatial meta cells: 7711
#> 2024-11-07 01:13:57.050384 PCA for each cell type
#> 2024-11-07 01:13:58.415943 Construct cell-type-specific similarity network
#> 2024-11-07 01:13:59.127738 Similarity network fusion
#> 2024-11-07 01:13:59.129295 Normalize networks ...
#> 2024-11-07 01:13:59.226963 Calculate the local transition matrix ...
#> 2024-11-07 01:13:59.686673 Perform the diffusion ...
#> 2024-11-07 01:13:59.687456 Iteration: 1
#> 2024-11-07 01:14:05.423195 Iteration: 2
#> 2024-11-07 01:14:21.987019 Iteration: 3
#> 2024-11-07 01:14:48.162873 Iteration: 4
#> 2024-11-07 01:15:11.952116 Iteration: 5
#> 2024-11-07 01:15:38.286144 Create Seurat object and perform clustering analysis
#> Warning: Data is of class dgeMatrix. Coercing to dgCMatrix.
#> Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
#> To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
#> This message will be shown once per session
#> 2024-11-07 01:15:49.269557 The Seurat object is saved into Melanoma1_subset_SpatialEcoTyper_results.rds
# Extract the Seurat object and updated single-cell metadata
obj <- se_results$obj # A Seurat object
obj
#> An object of class Seurat
#> 2315 features across 2315 samples within 1 assay
#> Active assay: RNA (2315 features, 2315 variable features)
#> 3 layers present: counts, data, scale.data
#> 2 dimensional reductions calculated: pca, umap
scmeta <- se_results$metadata %>% arrange(SE) # Updated single-cell metadata, with the SE annotation added
head(scmeta)
#> X Y CellType CellTypeName
#> HumanMelanomaPatient1__cell_3661 1742.939 -6374.851 Fibroblast Fibroblasts
#> HumanMelanomaPatient1__cell_3664 1706.253 -6383.428 Fibroblast Fibroblasts
#> HumanMelanomaPatient1__cell_3666 1761.403 -6387.970 Fibroblast Fibroblasts
#> HumanMelanomaPatient1__cell_3668 1841.514 -6389.663 Fibroblast Fibroblasts
#> HumanMelanomaPatient1__cell_3670 1741.536 -6395.796 Fibroblast Fibroblasts
#> HumanMelanomaPatient1__cell_3674 1907.099 -6411.773 Fibroblast Fibroblasts
#> Region Dist2Interface SE
#> HumanMelanomaPatient1__cell_3661 Stroma -874.2712 SE0
#> HumanMelanomaPatient1__cell_3664 Stroma -880.4758 SE0
#> HumanMelanomaPatient1__cell_3666 Stroma -888.6271 SE0
#> HumanMelanomaPatient1__cell_3668 Stroma -896.8484 SE0
#> HumanMelanomaPatient1__cell_3670 Stroma -895.0259 SE0
#> HumanMelanomaPatient1__cell_3674 Stroma -928.7076 SE0
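A common next step is to summarise the cell type composition of each SE from the updated metadata. The sketch below uses base R only and a small toy data frame invented for illustration; it mirrors just the `CellType` and `SE` columns shown above and does not reproduce real results. With the actual output, `se_results$metadata` would take the place of `toy_meta`.

```r
# Toy stand-in for the updated metadata (values invented for illustration)
toy_meta <- data.frame(
  CellType = c("Fibroblast", "Fibroblast", "CD8T",
               "Macrophage", "CD8T", "Fibroblast"),
  SE = c("SE0", "SE0", "SE1", "SE1", "SE1", "SE0"),
  row.names = paste0("cell_", 1:6)
)

# Count cells per SE (rows) and cell type (columns)
se_by_type <- table(toy_meta$SE, toy_meta$CellType)

# Convert counts to within-SE fractions; each row sums to 1
se_frac <- prop.table(se_by_type, margin = 1)

print(se_by_type)
print(round(se_frac, 2))
```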