scparadise.scnoah.balance
- scparadise.scnoah.balance(adata, celltype_keys, min_keep_frac=0.5, max_oversample_factor=7.0, min_oversample_cells=5, random_state=0)
Balance cell types in an AnnData object. Returns adata_balanced, with an updated data matrix and with adata_balanced.obs containing the annotation levels given in celltype_keys. The output matches the type of the input data: if you pass raw counts, the function returns counts; if you pass normalized data, it returns normalized data.
- Parameters:
adata (AnnData) – Input dataset to be balanced.
celltype_keys (list) – List of cell type annotation keys in adata.obs. Example: ['lineage', 'cell type', 'cell state']
min_keep_frac (float (default: 0.5)) – Safety lower bound for the fraction of original cells retained in large classes.
max_oversample_factor (float (default: 7.0)) – Upper bound on how much a small class may be expanded relative to its original size.
min_oversample_cells (int (default: 5)) – Minimum size a cell type must have before substantial generation of new cells is allowed.
random_state (int (default: 0)) – Seed for random number generators to ensure reproducibility.
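To make the two bounds above concrete, here is a minimal sketch of one plausible way min_keep_frac and max_oversample_factor could limit a class size. It is an illustration only, not scParadise's actual implementation: the helper clamp_target_size, the target size, and the cell counts are all hypothetical, and min_oversample_cells is left out.

```python
import math


def clamp_target_size(n_cells, target, min_keep_frac=0.5, max_oversample_factor=7.0):
    """Hypothetical helper: clamp a desired class size to the two bounds."""
    # A large class is never reduced below this fraction of its original size.
    lower = math.ceil(n_cells * min_keep_frac)
    # A small class is never expanded beyond this multiple of its original size.
    upper = math.floor(n_cells * max_oversample_factor)
    return max(lower, min(target, upper))


# Made-up class sizes and a made-up common target size.
counts = {"T cell": 10000, "B cell": 1200, "pDC": 40}
target = 1200

balanced = {name: clamp_target_size(n, target) for name, n in counts.items()}
print(balanced)  # {'T cell': 5000, 'B cell': 1200, 'pDC': 280}
```

Under these assumptions the large class stops shrinking at half of its original size, and the rare class stops growing at seven times its original size, so neither reaches the common target.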
- Returns:
Balanced dataset with preserved cell type hierarchy and var features.
- Return type:
AnnData
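A minimal usage sketch based on the signature above. The wrapper balance_example, the file path in the comment, and the assumption that adata.obs contains the three example columns are hypothetical; the keyword values are simply the documented defaults.

```python
def balance_example(adata):
    """Hypothetical wrapper: balance an AnnData object with the documented defaults."""
    # Imported lazily so the sketch can be defined without scParadise installed.
    from scparadise import scnoah

    adata_balanced = scnoah.balance(
        adata,
        # Annotation columns assumed to exist in adata.obs (taken from the example above).
        celltype_keys=["lineage", "cell type", "cell state"],
        min_keep_frac=0.5,
        max_oversample_factor=7.0,
        min_oversample_cells=5,
        random_state=0,
    )
    return adata_balanced


# Example call (hypothetical file path):
# import anndata
# adata = anndata.read_h5ad("my_dataset.h5ad")
# adata_balanced = balance_example(adata)
# print(adata_balanced.obs["cell type"].value_counts())
```

Comparing value_counts() on the input and output annotations is a quick way to check how the class sizes changed.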