scNoah#

Benchmarking cell type annotation and modality prediction.

Balance dataset#

Balance a dataset using your own annotation before model training by oversampling or undersampling selected cell types.

scnoah.balance

Balance cell types in an AnnData object.

scnoah.oversample

Oversample selected cell types in an AnnData object.

scnoah.undersample

Undersample selected cell types in an AnnData object.
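The idea behind balancing can be sketched in plain Python, without relying on the scnoah API (whose exact signatures are not shown here). The helper `resample_indices` and its `target` parameter are hypothetical names introduced only for this illustration: cell types with more than `target` cells are undersampled without replacement, and rarer ones are oversampled by duplicating randomly chosen cells.

```python
import random
from collections import Counter, defaultdict


def resample_indices(labels, target, seed=0):
    """Return cell indices so that every label occurs exactly `target` times.

    Hypothetical helper for illustration; not part of scnoah.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for i, lab in enumerate(labels):
        by_label[lab].append(i)

    picked = []
    for idx in by_label.values():
        if len(idx) >= target:
            # Undersample: draw `target` cells without replacement.
            picked.extend(rng.sample(idx, target))
        else:
            # Oversample: keep every cell, then add duplicates drawn at random.
            picked.extend(idx + rng.choices(idx, k=target - len(idx)))
    return sorted(picked)


labels = ["T cell"] * 6 + ["B cell"] * 3 + ["NK cell"] * 1
balanced = resample_indices(labels, target=3)
counts = Counter(labels[i] for i in balanced)
print(counts)
```

The returned indices contain repeats for oversampled cell types, so any downstream object built from them will hold duplicated cells.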

Annotation metrics#

Test the quality of an annotation method using the confusion matrix, accuracy, and balanced accuracy, together with cell-type-specific metrics: precision, recall (also called sensitivity), specificity, f1-score, geometric mean, and index balanced accuracy of the geometric mean.

scnoah.report_classif_full

Returns metrics (precision, recall (also called sensitivity), specificity, f1-score, geometric mean, and index balanced accuracy of the geometric mean) of predicted cell types.

scnoah.report_classif_sens_spec

Returns specificity and recall (also called sensitivity) metrics of predicted cell types.

scnoah.conf_matrix

Compute confusion matrix to evaluate the accuracy of a classification.

scnoah.pred_status

Find correct and incorrect predictions.
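To make the cell-type-specific metrics concrete, here is a minimal one-vs-rest sketch in plain Python. It does not call scnoah; `per_class_metrics` is a hypothetical helper, and classes with no predictions or no true cells are assigned 0.0 by assumption. The index balanced accuracy is left out of this sketch.

```python
import math


def per_class_metrics(y_true, y_pred):
    """One-vs-rest metrics per cell type (hypothetical helper, not scnoah)."""
    n = len(y_true)
    report = {}
    for c in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        tn = n - tp - fp - fn
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0  # sensitivity
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[c] = {
            "precision": precision,
            "recall": recall,
            "specificity": specificity,
            "f1": f1,
            "gmean": math.sqrt(recall * specificity),
        }
    return report


y_true = ["T", "T", "T", "B", "B", "NK"]
y_pred = ["T", "T", "B", "B", "B", "T"]
report = per_class_metrics(y_true, y_pred)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
# Balanced accuracy: mean recall over the cell types present in y_true.
true_classes = sorted(set(y_true))
balanced_accuracy = sum(report[c]["recall"] for c in true_classes) / len(true_classes)
```

In this toy example the rare "NK" type is never predicted correctly, which lowers balanced accuracy well below plain accuracy.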

Regression metrics#

Test modality prediction method quality using error metrics (RMSE, MedianAE, MeanAE), EVS, R² score and PC. Also, visualise metrics on cell embeddings.

RMSE - Root mean squared error

MeanAE - Mean absolute error

MedianAE - Median absolute error

EVS - Explained variance score

R² score - Coefficient of determination

PC - Pearson coefficient

For the error metrics (RMSE, MedianAE, MeanAE), a lower value indicates a better prediction.
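The metrics listed above follow their standard textbook definitions, which can be sketched for a single protein in plain Python. This is an illustration only: `regression_report` is a hypothetical helper, not the scnoah implementation, and it assumes the true values are not all identical.

```python
import math
import statistics


def regression_report(y_true, y_pred):
    """Standard regression metrics for one protein (hypothetical helper)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    abs_errors = [abs(e) for e in errors]

    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n
    mean_err = sum(errors) / n

    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    cov = sum((t - mean_true) * (p - mean_pred)
              for t, p in zip(y_true, y_pred))

    return {
        "RMSE": math.sqrt(ss_res / n),
        "MeanAE": sum(abs_errors) / n,
        "MedianAE": statistics.median(abs_errors),
        # EVS uses the variance of the errors, so a constant bias is ignored.
        "EVS": 1 - sum((e - mean_err) ** 2 for e in errors) / ss_tot,
        "R2": 1 - ss_res / ss_tot,
        "PC": cov / math.sqrt(ss_tot * ss_pred),
    }


metrics = regression_report([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

For EVS, R² score and PC, values closer to 1 indicate a better prediction, in contrast to the error metrics.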

scnoah.report_reg

Returns multiple metrics for cell surface protein prediction.

scnoah.regres_status

Compute regression status of cells to visualize on UMAP.

scnoah.pearson_coef_prot

Compute Pearson correlation coefficient of predicted protein.

Count cells#

Count the number of cells of each type per sample or condition.

scnoah.cell_counter

Count cell types in samples.
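Conceptually, this is a cross-tabulation of sample labels against cell type labels. A minimal sketch in plain Python, where `count_cell_types` is a hypothetical helper and the two input lists stand in for per-cell annotation columns:

```python
from collections import Counter, defaultdict


def count_cell_types(samples, cell_types):
    """Count cells of each type within each sample (hypothetical helper)."""
    table = defaultdict(Counter)
    for sample, cell_type in zip(samples, cell_types):
        table[sample][cell_type] += 1
    return {sample: dict(c) for sample, c in table.items()}


samples = ["s1", "s1", "s1", "s2", "s2"]
cell_types = ["T cell", "T cell", "B cell", "B cell", "NK cell"]
cell_counts = count_cell_types(samples, cell_types)
```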

Explanations#

Get explanations of gene importance in scAdam model predictions.

scnoah.explain

Identify the genes that are most important for determining cell type using a model.

scnoah.feature_importance

Get a dataframe with gene importances for a specific cell type.

Managing large datasets#

Get a fraction of a large dataset.

scnoah.get_frac

Get a fraction of an AnnData object.

scnoah.get_samples

Get samples from an AnnData object.
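Taking a fraction of a dataset amounts to drawing a random subset of cell indices. The sketch below assumes simple random sampling without replacement; whether scnoah samples this way or stratifies by cell type is not stated here. `sample_fraction` is a hypothetical helper.

```python
import random


def sample_fraction(n_cells, frac, seed=0):
    """Return sorted indices of a random fraction of cells (hypothetical helper)."""
    if not 0 < frac <= 1:
        raise ValueError("frac must be in (0, 1]")
    k = max(1, round(n_cells * frac))
    # Draw k distinct indices; a fixed seed keeps the subset reproducible.
    return sorted(random.Random(seed).sample(range(n_cells), k))


subset = sample_fraction(1000, 0.1)
```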

Difference between clusters#

Calculate the integral of absolute density difference and the mutual information between two clusterings.

scnoah.clust_diff

Calculates the integral of absolute density difference and the mutual information between two clusterings.
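As an illustration of the second metric only, mutual information between two clusterings can be computed from the joint and marginal label frequencies. This sketch uses the standard definition with natural logarithms; `mutual_information` is a hypothetical helper, and scnoah may normalise or scale the value differently.

```python
import math
from collections import Counter


def mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two clusterings of the same cells."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    mi = 0.0
    for (a, b), n_ab in joint.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) ), written with raw counts.
        mi += (n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


# Same partition with swapped label names: maximal agreement.
mi_same = mutual_information([0, 0, 1, 1], [1, 1, 0, 0])
# Clusterings that carry no information about each other.
mi_indep = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Because mutual information depends only on how cells are grouped, relabelling the clusters does not change the value.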