Integrate Multiple Spatial Transcriptomics Datasets to Identify Conserved Spatial Ecotypes
Source: R/MultiSpatialEcoTyper.R
This function performs SpatialEcoTyper analysis on multiple spatial transcriptomics datasets. It normalizes the input data, performs SpatialEcoTyper analysis on each dataset, and integrates the results across samples.
Usage
MultiSpatialEcoTyper(
  data_list,
  metadata_list,
  outdir = "./",
  normalization.method = "None",
  nmf_ranks = 10,
  nrun.per.rank = 30,
  min.coph = 0.95,
  radius = 50,
  min.cts.per.region = 1,
  nfeatures = 3000,
  min.features = 10,
  Region = NULL,
  subresolution = 30,
  minibatch = 5000,
  ncores = 1,
  seed = 1,
  filter.region.by.celltypes = NULL,
  ...
)
Arguments
- data_list
A named list of expression matrices, one per sample. Rows correspond to genes and columns correspond to cells. List names are used as sample names; if the list is unnamed, the samples are named 'Sample1' through 'SampleN'.
- metadata_list
A named list of metadata data frames, one per sample, describing the cells in the corresponding expression matrix. Each row should correspond to a column (cell) in that matrix. Each data frame must include at least three columns: X, Y, and CellType.
- outdir
Directory where the results will be saved. Defaults to "./" (the current directory), with results written to a subdirectory named "SpatialEcoTyper_results_" followed by the current date.
- normalization.method
Method for normalizing the expression data. Options include "None" (default), "SCT", or other methods compatible with Seurat's `NormalizeData` function.
- nmf_ranks
Integer or integer vector specifying the number of clusters (default: 10). When a vector is supplied, the function tests every supplied value and selects the optimal one, which increases run time.
- nrun.per.rank
An integer specifying the number of runs per rank for NMF (default: 30).
- min.coph
Numeric specifying the minimum cophenetic coefficient required for a rank to be considered optimal (default: 0.95).
- radius
Numeric specifying the radius (in the same units as spatial coordinates) for defining spatial neighborhoods around each cell. Default is 50.
- min.cts.per.region
Integer specifying the minimum number of cell types required for a microregion.
- nfeatures
An integer specifying the maximum number of top variable genes to select for each cell type.
- min.features
An integer specifying the minimum number of shared features (genes) required across samples.
- Region
Character string specifying the column name in metadata data frames containing region annotations (default: NULL). Pathologist annotation is recommended if available.
- subresolution
Numeric specifying the resolution for clustering within each sample.
- minibatch
Integer specifying the number of columns to process in each minibatch during the SNF analysis (default: 5000). The matrix is split into smaller chunks (minibatches) to reduce memory usage.
- ncores
Integer specifying the number of cores for parallel processing. Default is 1.
- seed
An integer used to seed the random number generator for NMF analysis.
- filter.region.by.celltypes
A character vector specifying the cell types to include in the analysis. Only spatial microregions that contain at least one of the specified cell types will be analyzed, while regions lacking these cell types will be excluded from the SE discovery process. If NULL, all spatial microregions will be included, regardless of cell type composition.
- ...
Additional arguments passed to the `SpatialEcoTyper` function.
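Examples
The inputs described above can be sketched with simulated data. This is a minimal illustration, not an official example: `make_toy_sample` and `check_inputs` are hypothetical helpers written for this sketch, the package namespace `SpatialEcoTyper` is an assumption, and the toy data only demonstrates the expected shapes (it is far too small for a meaningful analysis). The call itself is disabled by default through the `run_analysis` flag.

```r
set.seed(1)

# Hypothetical helper: simulate one sample with genes in rows and cells in columns.
make_toy_sample <- function(prefix, n_genes = 20, n_cells = 50) {
  counts <- matrix(
    rpois(n_genes * n_cells, lambda = 2),
    nrow = n_genes, ncol = n_cells,
    dimnames = list(paste0("Gene", seq_len(n_genes)),
                    paste0(prefix, "_Cell", seq_len(n_cells)))
  )
  # One metadata row per cell, with the three required columns.
  meta <- data.frame(
    X = runif(n_cells, 0, 500),
    Y = runif(n_cells, 0, 500),
    CellType = sample(c("Fibroblast", "Macrophage", "TCell"),
                      n_cells, replace = TRUE),
    row.names = colnames(counts)
  )
  list(counts = counts, meta = meta)
}

samples <- list(SampleA = make_toy_sample("A"), SampleB = make_toy_sample("B"))
data_list <- lapply(samples, function(s) s$counts)    # named list of matrices
metadata_list <- lapply(samples, function(s) s$meta)  # named list of data frames

# Hypothetical helper: check the documented input requirements up front.
check_inputs <- function(data_list, metadata_list) {
  stopifnot(length(data_list) == length(metadata_list))
  for (i in seq_along(data_list)) {
    meta <- metadata_list[[i]]
    stopifnot(
      all(c("X", "Y", "CellType") %in% colnames(meta)),
      nrow(meta) == ncol(data_list[[i]])
    )
  }
  invisible(TRUE)
}
check_inputs(data_list, metadata_list)

# Set to TRUE to run the analysis (assumes the function is exported
# from a package named SpatialEcoTyper; real data is needed for useful results).
run_analysis <- FALSE
if (run_analysis && requireNamespace("SpatialEcoTyper", quietly = TRUE)) {
  SpatialEcoTyper::MultiSpatialEcoTyper(
    data_list, metadata_list,
    outdir = tempdir(),
    radius = 50,
    nmf_ranks = 10,
    ncores = 1,
    seed = 1
  )
}
```

Validating the lists before the call surfaces mismatched cell counts or missing X, Y, or CellType columns early, before any time is spent on the NMF runs.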