Aggregate multiple recovery model weight matrices by identifying consistently selected marker genes across models and averaging their weights.
Arguments
- model_list
A list of recovery model objects. Each element should contain a named list of weight matrices indexed by cell type, derived from `NMFGenerateWList`.
- delta.threshold
Minimum weight delta required for a gene to be considered associated with a spatial ecotype (SE). Default is `0.01`.
- min.model.fraction
Minimum fraction of models in which a gene must be selected to be retained. Default is `0.5`.
Details
This function is designed for aggregating recovery models generated from repeated training runs or cross-validation folds. For each cell type, genes are retained if they are consistently selected across models based on:
- a minimum delta (`delta.threshold`) that a gene must reach to count as selected in a model
- a minimum fraction of models (`min.model.fraction`) in which the gene must be selected
Both features of each pair (`__pos` and `__neg`) are retained for every selected gene. Final weights are the mean across models, computed over non-missing values only.
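The selection and averaging steps described above can be sketched on toy data for a single cell type. This is an illustrative sketch, not the package implementation. The helpers `toy_w`, `selected_genes`, and `aggregate_toy` are hypothetical. The row naming (`<gene>__pos` / `<gene>__neg`) and the one-column-per-ecotype layout are assumptions. The definition of "delta" used here (top weight minus runner-up across ecotypes on the `__pos` row) is a stand-in, and the package may define it differently.

```r
# Build a toy weight matrix: rows "<gene>__pos"/"<gene>__neg", columns = ecotypes.
# (Assumed layout; all weights start at 0.1, then the "__pos" SE1 weights are set.)
toy_w <- function(genes, pos_top) {
  rows <- c(paste0(genes, "__pos"), paste0(genes, "__neg"))
  w <- matrix(0.1, nrow = length(rows), ncol = 2,
              dimnames = list(rows, c("SE1", "SE2")))
  w[paste0(genes, "__pos"), "SE1"] <- pos_top
  w
}

# Three hypothetical models for one cell type; the third lacks gene B.
toy_models <- list(
  toy_w(c("A", "B"), c(0.50, 0.105)),
  toy_w(c("A", "B"), c(0.40, 0.30)),
  toy_w(c("A", "C"), c(0.60, 0.102))
)

# Stand-in "delta": largest minus second-largest weight across ecotypes,
# computed on the "__pos" row of each gene.
selected_genes <- function(w, delta_threshold) {
  pos <- w[grepl("__pos$", rownames(w)), , drop = FALSE]
  delta <- vapply(rownames(pos), function(r) {
    s <- sort(pos[r, ], decreasing = TRUE)
    unname(s[1] - s[2])
  }, numeric(1))
  sub("__pos$", "", names(delta)[delta >= delta_threshold])
}

aggregate_toy <- function(models, delta_threshold = 0.01,
                          min_model_fraction = 0.5) {
  picks <- lapply(models, selected_genes, delta_threshold = delta_threshold)
  # Fraction of models in which each gene was selected.
  frac <- table(unlist(picks)) / length(models)
  keep <- names(frac)[as.vector(frac) >= min_model_fraction]
  if (length(keep) == 0) return(NULL)
  # Keep both features of each pair for every retained gene.
  rows <- c(paste0(keep, "__pos"), paste0(keep, "__neg"))
  # Align every model to the same rows (NA where a gene is absent) ...
  stacked <- simplify2array(lapply(models, function(w) {
    out <- matrix(NA_real_, length(rows), ncol(w),
                  dimnames = list(rows, colnames(w)))
    have <- intersect(rows, rownames(w))
    out[have, ] <- w[have, ]
    out
  }))
  # ... then average across models over non-missing values only.
  apply(stacked, c(1, 2), mean, na.rm = TRUE)
}

res_default <- aggregate_toy(toy_models)
res_relaxed <- aggregate_toy(toy_models, min_model_fraction = 0.3)
```

With the default settings only gene A passes the delta threshold in at least half of the toy models. Lowering `min_model_fraction` to `0.3` also retains gene B, whose weights are then averaged over the two models that contain it.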
Examples
if (FALSE) { # \dontrun{
# Train 30 recovery models; each call returns a list of weight matrices by cell type
Ws_list <- lapply(1:30, function(ii) {
  NMFGenerateWList(scdata, scmeta,
                   CellType = "CellType",
                   SE = "SE", Sample = "Sample",
                   nfeature = 300,
                   nfeature.per.se = 50,
                   ncores = 8)
})
# Argument names follow the Arguments section above
aggregated_models <- AggregateRecoverModels(
  model_list = Ws_list,
  delta.threshold = 0.01,
  min.model.fraction = 0.5
)
} # }
