scparadise.scnoah.get_frac

scparadise.scnoah.get_frac(adata=None, path=None, path_save=None, stratify=None, fraction=0.1, shuffle=True, random_state=0)

Get a fraction of an AnnData object. Specify either an AnnData object or a path to one. The function returns a portion of the AnnData object; if stratify is specified, the portion maintains the ratio of cell types.

Parameters:
  • adata (AnnData) – Annotated data matrix. If None, the AnnData object is read from path (e.g., “/Data/adata.h5ad”).

  • path (str, path object (default: None)) – Path to the AnnData object if AnnData is not loaded into RAM.

  • path_save (str, path object (default: None)) – Path to save the fraction of the AnnData object.

  • stratify (str (default: None)) – Key in the adata.obs dataframe. If specified, the fraction keeps the same cell ratio for this key as the original adata.

  • fraction (float or int (default: 0.1)) – If a float is specified, it must be between 0.0 and 1.0 and represents the fraction of the dataset to include in adata_fraction. If an integer is specified, it must be less than the number of cells in the dataset; that number of cells is randomly selected from adata.

  • shuffle (bool (default: True)) – Whether to shuffle the data before subsetting. If shuffle is False, stratify is not used and the ratio is not maintained.

  • random_state (int (default: 0)) – Controls the data shuffling and splitting. Pass an int for reproducible output across multiple function calls.
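How stratify, fraction, and random_state interact can be illustrated without AnnData. The following is a minimal pure-Python sketch of stratified subsampling, not the implementation used by get_frac: the helper name stratified_fraction, the toy labels, and the per-group rounding rule are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_fraction(labels, fraction=0.1, random_state=0):
    """Return sorted indices of a subset that keeps per-label proportions.

    Hypothetical helper for illustration only; not part of scparadise.
    """
    rng = random.Random(random_state)  # fixed seed gives a reproducible subset

    # Group cell indices by label (the role played by adata.obs[stratify]).
    groups = defaultdict(list)
    for index, label in enumerate(labels):
        groups[label].append(index)

    # A float is read as a share of all cells, an int as an absolute cell count.
    n_total = len(labels)
    share = fraction if isinstance(fraction, float) else fraction / n_total

    selected = []
    for members in groups.values():
        # Assumption: each group's quota is rounded to the nearest whole cell.
        quota = round(len(members) * share)
        selected.extend(rng.sample(members, quota))
    return sorted(selected)


# Toy labels standing in for a cell-type column: 60% / 30% / 10%.
labels = ["T cell"] * 60 + ["B cell"] * 30 + ["NK cell"] * 10
subset = stratified_fraction(labels, fraction=0.1)
counts = {name: sum(labels[i] == name for i in subset) for name in set(labels)}
```

In this toy case the 10-cell subset keeps the 6 : 3 : 1 composition of the full label list, and calling the helper again with the same random_state returns the same indices.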

Returns:

A fraction of the AnnData object, with the same ratio of cell types as the original if stratify is specified. This fraction can also be saved as adata_fraction.h5ad.
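A call using the signature documented above might look like the sketch below. It assumes scparadise is installed and that the submodule can be imported as `from scparadise import scnoah`. The wrapper name subsample_by_cell_type and the obs key "cell_type" are hypothetical and should be replaced with whatever fits your data.

```python
def subsample_by_cell_type(adata, key="cell_type"):
    """Return a stratified 10% subset of `adata` (sketch; requires scparadise)."""
    # Imported lazily so that defining this helper does not require scparadise.
    from scparadise import scnoah

    return scnoah.get_frac(
        adata=adata,     # already-loaded AnnData object, so `path` stays None
        stratify=key,    # hypothetical adata.obs column holding cell-type labels
        fraction=0.1,    # float: keep 10% of the cells
        shuffle=True,    # per the docs, stratify is not used when shuffle is False
        random_state=0,  # fixed seed for reproducible output
    )
```

To work from a file instead of an object in memory, the documented alternative is to leave adata as None and pass path.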