scparadise.scnoah.undersample

scparadise.scnoah.undersample(adata, celltype_keys, target_per_class=None, min_keep_frac=0.5, random_state=0)

Undersample some cell types in an AnnData object. Returns the subsetted adata_undersampled object.

Parameters:
  • adata (AnnData) – Input dataset to be undersampled.

  • celltype_keys (list) – List of cell type annotations in adata.obs. Example: ['lineage', 'cell type', 'cell state']

  • target_per_class (int (default: None)) – Global target number of cells per cell type. If None, it is computed as the average number of cells per cell type at the annotation level with the most cell types.

  • min_keep_frac (float (default: 0.5)) – Lower bound on the fraction of each cell type's original size that is preserved after undersampling.

  • random_state (int (default: 0)) – Seed for random number generators to ensure reproducibility.
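
The interaction between target_per_class and min_keep_frac described above can be illustrated with a minimal sketch. This is not the scparadise implementation: the helper names default_target and cells_to_keep are hypothetical, and both the rounding of the average and the rule keep = min(n, max(target, ceil(min_keep_frac * n))) are assumptions inferred from the parameter descriptions.

```python
import math
from collections import Counter


def default_target(labels_by_level):
    """Hypothetical helper: average cells per cell type at the
    annotation level that has the most distinct cell types."""
    finest = max(labels_by_level, key=lambda k: len(set(labels_by_level[k])))
    counts = Counter(labels_by_level[finest])
    # Rounding to the nearest integer is an assumption.
    return finest, round(sum(counts.values()) / len(counts))


def cells_to_keep(n_cells, target, min_keep_frac=0.5):
    """Hypothetical helper: cells retained for one cell type.
    Assumes the target is a cap, min_keep_frac is a floor, and
    a cell type is never upsampled beyond its original size."""
    floor = math.ceil(min_keep_frac * n_cells)
    return min(n_cells, max(target, floor))


# Toy annotations standing in for two adata.obs columns.
labels_by_level = {
    "lineage": ["T"] * 110 + ["B"] * 10,
    "cell type": ["CD4 T"] * 100 + ["CD8 T"] * 10 + ["B naive"] * 10,
}

level, target = default_target(labels_by_level)
keep = {
    cell_type: cells_to_keep(n, target)
    for cell_type, n in Counter(labels_by_level[level]).items()
}
print(level, target, keep)
```

In this toy case the abundant type is reduced only to half its size because the min_keep_frac floor exceeds the global target, while the rare types are left untouched. With the documented signature, an actual call would take the form scparadise.scnoah.undersample(adata, celltype_keys=['lineage', 'cell type'], min_keep_frac=0.5, random_state=0).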

Returns:

adata_undersample containing the undersampled cell types. Cell type hierarchy is preserved.

Return type:

AnnData