scparadise.scadam.warm_start
- scparadise.scadam.warm_start(adata, path_model, celltype_keys=None, layer=None, path='', model_name='scAdam_model_warm_start', test_size=0.2, eval_metric=['accuracy', 'balanced_accuracy'], strategy='linear_offset', batch_size=128, epochs=100, patience=10, lr=5e-05, weight_decay=0.0001, use_augmentation=True, aug_probability=0.5, prob=0.15, noise_std=0.1, dropout_aug=0.1, alpha=0.2, adaptive_loss=True, freeze_transformer=False, allow_unseen_labels=False, unknown_detection=True, device='auto', random_state=0, return_model=False, verbose=True)
Warm-start fine-tuning of an existing scAdam model on new data. Warm-start training initializes a model with the parameters learned by a previously trained model rather than starting from scratch.
- adata: AnnData
New dataset with cell type annotations in adata.obs.
- path_model: str, path object
Path to a model folder containing pretrained scAdam model.
- path: str, path object
Path to create a model folder containing the training history, cell annotation dictionary, and genes used for scAdam model warm start training.
- celltype_keys: list (default: None)
List of cell type annotation columns in adata.obs. Example: ['lineage', 'cell type', 'cell state']
- layer: str (default: None)
If specified, use adata.layers[layer] for expression values instead of adata.X.
- model_name: str (default: 'scAdam_model_warm_start')
Name of the folder in which the model is saved.
- test_size: float or int (default: 0.2)
If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If int, represents the absolute number of test cells.
- batch_size: int (default: 128)
Number of examples per batch.
- epochs: int (default: 100)
Maximum number of epochs for scAdam model training.
- patience: int (default: 10)
Number of consecutive epochs without improvement before early stopping is triggered. If patience is set to 0, no early stopping is performed. If patience is enabled, the weights from the best epoch are automatically loaded at the end of training.
- eval_metric: str or list (default: ['accuracy', 'balanced_accuracy'])
Available evaluation metrics: 'accuracy', 'balanced_accuracy', 'f1_score'. The last metric in the list is used as the target metric and for early stopping.
- freeze_transformer: bool (default: False)
If True, freezes the transformer backbone (gene embedding + transformer blocks) and trains only the classifier head. If False, fine-tunes the full model.
- allow_unseen_labels: bool (default: False)
If False, raises an error if the new data contains labels that are not present in the pretrained LabelEncoder for any annotation level. If True, unseen labels are mapped to a fallback known label so that the label space stays unchanged.
- unknown_detection: bool (default: True)
If True, trains an unknown cell detector, which identifies unknown cells when the model is used for prediction on new data.
- verbose: bool (default: True)
Show progress bar for each epoch during training.
- return_model: bool (default: False)
If True, return the model after training.
- Returns:
Saves the fine-tuned scAdam model for cell type annotation. If return_model is True, the model is also returned.
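A minimal usage sketch, restricted to the parameters documented above. The folder paths, the annotation column names, and the helper names `warm_start_kwargs` and `run_warm_start` are hypothetical placeholders; `adata` is assumed to be an AnnData object whose `adata.obs` contains the listed annotation columns. The values are illustrative, not recommendations.

```python
def warm_start_kwargs(path_model, out_dir, celltype_keys):
    """Collect keyword arguments for scparadise.scadam.warm_start.

    Only parameters that appear in the documented signature are used.
    """
    return {
        "path_model": path_model,              # folder with the pretrained scAdam model
        "celltype_keys": list(celltype_keys),  # annotation columns in adata.obs
        "path": out_dir,                       # where the new model folder is created
        "model_name": "scAdam_model_warm_start",
        "test_size": 0.2,
        # The last metric in the list is the early-stopping target.
        "eval_metric": ["accuracy", "balanced_accuracy"],
        "epochs": 100,
        "patience": 10,
        "freeze_transformer": True,            # train only the classifier head
        "allow_unseen_labels": False,
        "return_model": True,
    }


def run_warm_start(adata, **kwargs):
    # Imported lazily so the helper above works without scparadise installed.
    from scparadise import scadam
    return scadam.warm_start(adata, **kwargs)


# Hypothetical paths and annotation columns.
kwargs = warm_start_kwargs(
    "models/pretrained_scAdam", "models/", ["lineage", "cell type"]
)
# model = run_warm_start(adata, **kwargs)  # needs scparadise and an AnnData object
```

Setting `freeze_transformer=True` here is one possible choice when the new dataset is small; leaving it at the default `False` fine-tunes the full model.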
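The interaction between `patience` and `eval_metric` can be illustrated with a small sketch. This is not scAdam's training loop; it only mirrors the documented behaviour (stop after `patience` consecutive epochs without improvement, `patience=0` disables early stopping, the best epoch is kept). The function name `best_epoch_with_patience`, the example scores, and the choice of strict improvement as the criterion are assumptions.

```python
def best_epoch_with_patience(scores, patience):
    """Return (best_epoch, last_epoch_run) for per-epoch target-metric scores.

    Higher scores are better. With patience == 0 every epoch is run.
    """
    best_score = float("-inf")
    best_epoch = -1
    stale = 0  # consecutive epochs without improvement
    for epoch, score in enumerate(scores):
        if score > best_score:
            best_score, best_epoch, stale = score, epoch, 0
        else:
            stale += 1
            if patience > 0 and stale >= patience:
                return best_epoch, epoch  # early stop; best epoch is kept
    return best_epoch, len(scores) - 1


eval_metric = ["accuracy", "balanced_accuracy"]
target_metric = eval_metric[-1]  # the last metric drives early stopping

# Hypothetical per-epoch values of the target metric on the test split.
scores = [0.70, 0.75, 0.74, 0.75, 0.73, 0.80]
print(target_metric, best_epoch_with_patience(scores, patience=3))
print(target_metric, best_epoch_with_patience(scores, patience=0))
```

With `patience=3` the loop stops at epoch 4 and keeps epoch 1, so the later improvement at epoch 5 is never reached; with `patience=0` all epochs run and epoch 5 is selected.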
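The two modes of `allow_unseen_labels` can be sketched in plain Python. This is an illustration of the documented behaviour, not scAdam's code: the function `encode_labels` is hypothetical, and because the documentation does not say how the fallback label is chosen, the sketch arbitrarily uses the first known class.

```python
def encode_labels(labels, known_classes, allow_unseen_labels=False, fallback=None):
    """Map labels to integer codes of a fixed, pretrained label space."""
    index = {name: i for i, name in enumerate(known_classes)}
    unseen = sorted(set(labels) - set(index))
    if unseen and not allow_unseen_labels:
        raise ValueError(f"Labels not present in the pretrained encoder: {unseen}")
    if fallback is None:
        fallback = known_classes[0]  # assumption: arbitrary fallback choice
    # Unseen labels reuse the fallback code, so no new class is added.
    return [index.get(label, index[fallback]) for label in labels]


known = ["B cell", "T cell"]  # hypothetical classes of the pretrained model
codes = encode_labels(["T cell", "NK cell"], known, allow_unseen_labels=True)
print(codes)
```

Because unseen labels are folded into an existing class, the classifier head keeps its original output size in both modes.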