This function identifies spatially distinct cellular ecosystems (SEs) from a single sample.

Usage

SpatialEcoTyper(
  normdata,
  metadata,
  outprefix = "SE",
  radius = 50,
  resolution = 0.5,
  nfeatures = 500,
  min.cts.per.region = 2,
  npcs = 20,
  min.cells = 5,
  min.features = 10,
  iterations = 10,
  minibatch = 5000,
  ncores = 1,
  grid.size = round(radius * 1.4),
  filter.region.by.celltypes = NULL,
  k = 20,
  k.sn = 50,
  dropcell = TRUE
)

Arguments

normdata

A matrix representing normalized gene expression data, where rows correspond to genes and columns correspond to cells.

metadata

A data frame containing metadata associated with each cell. Must include spatial coordinates (e.g., X and Y) as well as cell type annotations. The row names of the metadata must match the column names of the normdata.
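
The row-name requirement can be checked before calling the function. The following is a minimal sketch on toy data, not part of the package: `toy_norm` and `toy_meta` are hypothetical objects, and the column names X, Y, and CellType follow the Examples section of this page.

```r
# Toy inputs (hypothetical): 3 genes x 4 cells, plus per-cell metadata
toy_norm <- matrix(runif(12), nrow = 3,
                   dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4)))
toy_meta <- data.frame(X = c(10, 20, 30, 40),
                       Y = c(5, 15, 25, 35),
                       CellType = c("Fibroblast", "Fibroblast", "T", "B"),
                       row.names = paste0("cell", c(4, 2, 1, 3)))

# Columns used in the Examples section: coordinates and cell type labels
required <- c("X", "Y", "CellType")
stopifnot(all(required %in% colnames(toy_meta)))

# Every cell in the expression matrix needs a metadata row
stopifnot(all(colnames(toy_norm) %in% rownames(toy_meta)))

# Reorder metadata rows to follow the expression matrix columns
toy_meta <- toy_meta[colnames(toy_norm), , drop = FALSE]
ids_match <- identical(rownames(toy_meta), colnames(toy_norm))
```

Running the checks up front surfaces mismatched or missing cell IDs before the longer analysis starts.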

outprefix

Character string specifying the prefix for output file names (default: "SE").

radius

Numeric specifying the radius (in the same units as spatial coordinates) for defining spatial neighborhoods around each cell. Default is 50.

resolution

Numeric specifying the resolution for Louvain clustering (default: 0.5).

nfeatures

Integer specifying the number of top variable features (default: 500) used for the analysis.

min.cts.per.region

Integer specifying the minimum number of cell types required for a spatial neighborhood (default: 2).

npcs

Integer specifying the number of principal components (PCs) (default: 20).

min.cells

Minimum number of cells / spatial-meta-cells (default: 5) expressing a feature/gene.

min.features

Minimum number of features (default: 10) detected in a cell / spatial-meta-cell.

iterations

Integer specifying the number of iterations (default: 10) for SNF analysis.

minibatch

Integer specifying the number of columns to process in each minibatch in the SNF analysis. Default is 5000. This option splits the matrix into smaller chunks (minibatch), thus reducing memory usage.
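
The idea of processing columns in minibatches can be illustrated with base R. This is only a sketch of the chunking idea under assumed sizes; `n_cols` is a hypothetical column count, and the package's internal splitting may differ.

```r
n_cols <- 12000     # hypothetical number of columns in the matrix
minibatch <- 5000   # documented default

# Assign each column index to a chunk of at most `minibatch` columns
chunk_id <- ceiling(seq_len(n_cols) / minibatch)
chunks <- split(seq_len(n_cols), chunk_id)

# Number of columns handled in each chunk
chunk_sizes <- lengths(chunks)
```

A smaller `minibatch` produces more, smaller chunks, which lowers peak memory at the cost of more passes.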

ncores

Integer specifying the number of CPU cores to use for parallel processing (default: 1).

grid.size

Numeric specifying the grid size for spatial discretization of coordinates, in the same units as the spatial coordinates. By default, it is derived from the specified radius as round(radius * 1.4). Increasing grid.size downsamples spatial neighborhoods and speeds up the analysis, but it may exclude cells located between bins from the SE discovery analysis.
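
As a rough illustration of the default and of what a coarser grid does, the sketch below computes round(radius * 1.4) and bins a few X coordinates. The floor-based binning is an assumption made for illustration only; the package's actual discretization scheme is not described on this page.

```r
radius <- 50
grid_size <- round(radius * 1.4)   # default grid.size for radius = 50

# X coordinates taken from the example metadata on this page
x <- c(1894.706, 1942.480, 1963.007)

# Assumed binning rule (hypothetical): integer bin index along one axis
bin_x <- floor(x / grid_size)

# A larger grid merges more cells into the same bin
bin_x_coarse <- floor(x / (2 * grid_size))
n_bins <- length(unique(bin_x))
n_bins_coarse <- length(unique(bin_x_coarse))
```

Under this assumed rule, doubling the grid size places all three cells in a single bin, which is the downsampling effect described above.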

filter.region.by.celltypes

A character vector specifying the cell types to include in the analysis. Only spatial neighborhoods that contain at least one of the specified cell types will be analyzed, while regions lacking these cell types will be excluded from the SE discovery process. If NULL, all spatial neighborhoods will be included, regardless of cell type composition.
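
The filtering rule (keep a neighborhood if it contains at least one of the specified cell types) can be sketched on a toy table. The neighborhood IDs and cell type labels here are hypothetical, and this is not the package's implementation.

```r
# Hypothetical assignment of cells to spatial neighborhoods
cells <- data.frame(
  neighborhood = c("n1", "n1", "n2", "n2", "n3"),
  CellType = c("Fibroblast", "CD8T", "B", "Macrophage", "CD8T")
)

# Cell types of interest (hypothetical label)
keep_types <- c("CD8T")

# TRUE for neighborhoods containing at least one cell of a kept type
has_type <- tapply(cells$CellType %in% keep_types, cells$neighborhood, any)
kept <- names(has_type)[has_type]
```

In a real call, the same vector would be passed as filter.region.by.celltypes, using labels that appear in the CellType column of the metadata.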

k

Integer specifying the number of spatial nearest neighbors (default: 20) used to construct spatial meta-cells.

k.sn

Integer specifying the number of nearest neighbors (default: 50) used to construct the similarity network.

dropcell

Logical. If TRUE, cells that cannot be assigned to any spatial ecotype (outside the radius) will be removed from the returned metadata. Default is TRUE.

Value

A list containing two elements:

obj

A Seurat object constructed from the fused similarity network of spatial neighborhoods.

metadata

Updated metadata, with a new column (`SE`) added

Examples

# See https://digitalcytometry.github.io/spatialecotyper/docs/articles/SingleSample.html
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(ggplot2))
suppressPackageStartupMessages(library(parallel))
suppressPackageStartupMessages(library(Seurat))
suppressPackageStartupMessages(library(data.table))
suppressPackageStartupMessages(library(googledrive))
suppressPackageStartupMessages(library(R.utils))
library(SpatialEcoTyper)

drive_deauth() # Disable Google sign-in requirement
# overwrite = TRUE avoids an error when the files already exist locally
drive_download(as_id("13Rc5Rsu8jbnEYYfUse-xQ7ges51LcI7n"), "HumanMelanomaPatient1_subset_counts.tsv.gz", overwrite = TRUE)
drive_download(as_id("12xcZNhpT-xbhcG8kX1QAdTeM9TKeFAUW"), "HumanMelanomaPatient1_subset_scmeta.tsv", overwrite = TRUE)

# Load single-cell gene expression matrix. Rows are genes, columns are cells.
scdata <- fread("HumanMelanomaPatient1_subset_counts.tsv.gz",
                sep = "\t",header = TRUE, data.table = FALSE)
rownames(scdata) <- scdata[, 1]  # set genes as row names
scdata <- as.matrix(scdata[, -1])
normdata <- NormalizeData(scdata)
head(normdata[, 1:5])
#>          HumanMelanomaPatient1__cell_3655 HumanMelanomaPatient1__cell_3657
#> PDK4                             0.000000                         4.221184
#> TNFRSF17                         0.000000                         0.000000
#> ICAM3                            0.000000                         0.000000
#> FAP                              3.552438                         0.000000
#> GZMB                             0.000000                         0.000000
#> TSC2                             0.000000                         0.000000
#>          HumanMelanomaPatient1__cell_3658 HumanMelanomaPatient1__cell_3660
#> PDK4                             3.208945                                0
#> TNFRSF17                         0.000000                                0
#> ICAM3                            0.000000                                0
#> FAP                              0.000000                                0
#> GZMB                             0.000000                                0
#> TSC2                             0.000000                                0
#>          HumanMelanomaPatient1__cell_3661
#> PDK4                                    0
#> TNFRSF17                                0
#> ICAM3                                   0
#> FAP                                     0
#> GZMB                                    0
#> TSC2                                    0

# Load single-cell metadata. Three columns are required: X, Y, and CellType. Other columns are optional.
scmeta <- read.table("HumanMelanomaPatient1_subset_scmeta.tsv",
                     sep = "\t",header = TRUE, row.names = 1)
scmeta <- scmeta[colnames(scdata), ] # match the cell ids in scdata and scmeta
head(scmeta)
#>                                         X         Y   CellType CellTypeName
#> HumanMelanomaPatient1__cell_3655 1894.706 -6367.766 Fibroblast  Fibroblasts
#> HumanMelanomaPatient1__cell_3657 1942.480 -6369.602 Fibroblast  Fibroblasts
#> HumanMelanomaPatient1__cell_3658 1963.007 -6374.026 Fibroblast  Fibroblasts
#> HumanMelanomaPatient1__cell_3660 1981.600 -6372.266 Fibroblast  Fibroblasts
#> HumanMelanomaPatient1__cell_3661 1742.939 -6374.851 Fibroblast  Fibroblasts
#> HumanMelanomaPatient1__cell_3663 1921.683 -6383.309 Fibroblast  Fibroblasts
#>                                  Region Dist2Interface
#> HumanMelanomaPatient1__cell_3655 Stroma      -883.1752
#> HumanMelanomaPatient1__cell_3657 Stroma      -894.8463
#> HumanMelanomaPatient1__cell_3658 Stroma      -904.1115
#> HumanMelanomaPatient1__cell_3660 Stroma      -907.8909
#> HumanMelanomaPatient1__cell_3661 Stroma      -874.2712
#> HumanMelanomaPatient1__cell_3663 Stroma      -903.6559

se_results <- SpatialEcoTyper(normdata, scmeta,
                              outprefix = "Melanoma1_subset",
                              radius = 50, ncores = 2)
#> 2025-10-19 18:19:17.031955 Remove 88 genes expressed in fewer than 5 cells
#> 2025-10-19 18:19:17.604179 Construct spatial meta cells for each cell type
#> 		Total spatial neighborhoods: 2315
#> 		Total spatial meta cells: 7711
#> 2025-10-19 18:19:25.67173 PCA for each cell type
#> 2025-10-19 18:19:26.949807 Construct cell-type-specific similarity network
#> 		Remove 446 spatial neighborhoods with fewer than 2 cell types 
#> 2025-10-19 18:19:27.52615 Similarity network fusion
#> 2025-10-19 18:19:27.528548 Normalize networks ...
#> 2025-10-19 18:19:27.605614 Calculate the local transition matrix ...
#> 2025-10-19 18:19:33.784168 Perform the diffusion ...
#> 	2025-10-19 18:19:33.785057 Iteration: 1
#> 	2025-10-19 18:19:36.563198 Iteration: 2
#> 	2025-10-19 18:19:45.570064 Iteration: 3
#> 	2025-10-19 18:19:57.462043 Iteration: 4
#> 	2025-10-19 18:20:07.823087 Iteration: 5
#> 	2025-10-19 18:20:18.082665 Iteration: 6
#> 	2025-10-19 18:20:28.770468 Iteration: 7
#> 	2025-10-19 18:20:39.140914 Iteration: 8
#> 	2025-10-19 18:20:49.38285 Iteration: 9
#> 	2025-10-19 18:20:59.673851 Iteration: 10
#> 2025-10-19 18:21:13.627206 Create Seurat object and perform clustering analysis
#> Warning: Data is of class dgeMatrix. Coercing to dgCMatrix.
#> Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
#> To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
#> This message will be shown once per session
#> 2025-10-19 18:21:20.767875 The Seurat object is saved into Melanoma1_subset_SpatialEcoTyper_results.rds
# Extract the Seurat object and updated single-cell metadata
obj <- se_results$obj # A Seurat object
obj
#> An object of class Seurat 
#> 1869 features across 1869 samples within 1 assay 
#> Active assay: RNA (1869 features, 1869 variable features)
#>  3 layers present: counts, data, scale.data
#>  2 dimensional reductions calculated: pca, umap
scmeta <- se_results$metadata %>% arrange(SE) # Updated single-cell metadata, with the SE annotation added
head(scmeta)
#>                                         X         Y   CellType CellTypeName
#> HumanMelanomaPatient1__cell_3655 1894.706 -6367.766 Fibroblast  Fibroblasts
#> HumanMelanomaPatient1__cell_3661 1742.939 -6374.851 Fibroblast  Fibroblasts
#> HumanMelanomaPatient1__cell_3663 1921.683 -6383.309 Fibroblast  Fibroblasts
#> HumanMelanomaPatient1__cell_3664 1706.253 -6383.428 Fibroblast  Fibroblasts
#> HumanMelanomaPatient1__cell_3666 1761.403 -6387.970 Fibroblast  Fibroblasts
#> HumanMelanomaPatient1__cell_3668 1841.514 -6389.663 Fibroblast  Fibroblasts
#>                                  Region Dist2Interface  SE
#> HumanMelanomaPatient1__cell_3655 Stroma      -883.1752 SE0
#> HumanMelanomaPatient1__cell_3661 Stroma      -874.2712 SE0
#> HumanMelanomaPatient1__cell_3663 Stroma      -903.6559 SE0
#> HumanMelanomaPatient1__cell_3664 Stroma      -880.4758 SE0
#> HumanMelanomaPatient1__cell_3666 Stroma      -888.6271 SE0
#> HumanMelanomaPatient1__cell_3668 Stroma      -896.8484 SE0