scparadise.scadam.hyperparameter_tuning

scparadise.scadam.hyperparameter_tuning(adata, path='', celltype_l1=None, celltype_l2=None, celltype_l3=None, celltype_l4=None, celltype_l5=None, model_name='model_annotation_tuning', storage='model_annotation_tuning.db', study_name='study', load_if_exists=True, accelerator='auto', tune_params='auto', random_state=0, num_trials=100, verbose=0, n_d=None, n_a=None, n_steps=None, n_shared=None, cat_emb_dim=None, n_independent=None, gamma=None, momentum=None, lr=None, lambda_sparse=None, patience=None, max_epochs=None, batch_size=None, virtual_batch_size=None, mask_type=None, optimizer_fn=<class 'torch.optim.adamw.AdamW'>, scheduler_fn=<class 'torch.optim.lr_scheduler.StepLR'>, loss_fn=CrossEntropyLoss(), step_size=10, gamma_scheduler=0.95, eval_metric=['accuracy'], direction='maximize', drop_last=True)

Tune model hyperparameters with Optuna, an automatic hyperparameter optimization framework.
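A minimal call sketch, assuming the package is importable as `scparadise` and that `adata.obs` contains two annotation columns. The column names `celltype_l1` and `celltype_l2`, the illustrative values, and the wrapper `run_tuning` are hypothetical; only the keyword names come from the signature above. The import sits inside the wrapper so the snippet loads even where the package is not installed.

```python
# Keyword arguments taken from the documented signature; values are illustrative.
tuning_kwargs = {
    "path": "",                               # folder is created under this path
    "celltype_l1": "celltype_l1",             # hypothetical column in adata.obs
    "celltype_l2": "celltype_l2",             # hypothetical column in adata.obs
    "model_name": "model_annotation_tuning",
    "storage": "model_annotation_tuning.db",  # persistent study storage
    "study_name": "study",
    "load_if_exists": True,                   # resume an interrupted study
    "num_trials": 20,
    "eval_metric": ["balanced_accuracy"],     # last metric is the tuning target
    "direction": "maximize",                  # matches balanced_accuracy
    "random_state": 0,
}


def run_tuning(adata, **overrides):
    """Hypothetical wrapper: call hyperparameter_tuning with the kwargs above."""
    import scparadise  # assumed import name; imported lazily on purpose

    kwargs = {**tuning_kwargs, **overrides}
    # The return value is not described in this section, so it is passed through as-is.
    return scparadise.scadam.hyperparameter_tuning(adata, **kwargs)
```

Calling `run_tuning(adata, num_trials=50)` would override a single setting while keeping the rest.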

Parameters:
  • adata (AnnData) – Annotated data matrix.

  • path (str, path object) – Path where a folder is created to store the best hyperparameters, the dictionary of cell annotations, and the genes used for hyperparameter optimization.

  • celltype_l1 (str, (default: None)) – First level of cell annotation. Key in adata.obs dataframe.

  • celltype_l2 (str, (default: None)) – Second level of cell annotation. Key in adata.obs dataframe.

  • celltype_l3 (str, (default: None)) – Third level of cell annotation. Key in adata.obs dataframe.

  • celltype_l4 (str, (default: None)) – Fourth level of cell annotation. Key in adata.obs dataframe.

  • celltype_l5 (str, (default: None)) – Fifth level of cell annotation. Key in adata.obs dataframe.

  • model_name (str, (default: 'model_annotation_tuning')) – Name of the folder in which to save the best hyperparameters, the dictionary of cell annotations, and the genes used for hyperparameter optimization.

  • storage (str, (default: 'model_annotation_tuning.db')) – Database URL. If this argument is set to None, in-memory (RAM) storage is used and the study is not persistent. In-memory storage is not recommended for saving optimization progress.

  • study_name (str, (default: 'study')) – Study’s name. If this argument is set to None, a unique name is generated automatically.

  • load_if_exists (bool, (default: True)) – Controls what happens when a study named study_name already exists in the storage. If False, a DuplicatedStudyError is raised. If True, study creation is skipped and the existing study is returned, which allows hyperparameter tuning to continue after an interruption (for example, a keyboard interrupt or a Windows update).

  • accelerator (str, (default: 'auto')) – Type of accelerator to use for model training (‘cpu’ or ‘cuda’). Set to ‘auto’ for automatic selection.

  • tune_params (str or dict, (default: 'auto')) – Dictionary of tunable hyperparameters, each mapped to its lowest value, its highest value and, for integer parameters, a step. Example: tune_params = {"n_d": [8, 64, 4]} (first: lowest value, second: highest value, third: step).

  • random_state (int, (default: 0)) – Controls data shuffling, splitting into folds, and model training. Pass an int for reproducible output across multiple function calls.

  • num_trials (int, (default: 100)) – Number of trials to tune hyperparameters.

  • verbose (int (0 or 1), bool (True or False), (default: 0)) – Show a progress bar for each epoch during training. Set to 1 or True to show progress for every epoch, 0 or False to suppress it.

  • n_d (int, (default: None)) – Width of the decision prediction layer. Bigger values give more capacity to the model, with the risk of overfitting. Values typically range from 8 to 128. If given, then used for trial 0. If not specified in the list of tunable hyperparameters, then this value is used for all trials.

  • n_a (int, (default: None)) – Width of the attention embedding for each mask. Values typically range from 8 to 128. If given, then used for trial 0. If not specified in the list of tunable hyperparameters, then this value is used for all trials.

  • n_steps (int, (default: None)) – Number of steps in the architecture. Values typically range from 3 to 10. If given, then used for trial 0. If not specified in the list of tunable hyperparameters, then this value is used for all trials.

  • n_shared (int, (default: None)) – Number of shared Gated Linear Units at each step. Values typically range from 1 to 10. If given, then used for trial 0. If not specified in the list of tunable hyperparameters, then this value is used for all trials.

  • cat_emb_dim (int, (default: None)) – Embedding size for each categorical feature. Values typically range from 1 to 10. If given, then used for trial 0. If not specified in the list of tunable hyperparameters, then this value is used for all trials.

  • n_independent (int, (default: None)) – Number of independent Gated Linear Unit layers at each step. Values typically range from 1 to 10. If given, then used for trial 0. If not specified in the list of tunable hyperparameters, then this value is used for all trials.

  • gamma (float, (default: None)) – Coefficient for feature reuse in the masks. A value close to 1 makes mask selection least correlated between layers. Values typically range from 1.0 to 2.0. If given, then used for trial 0. If not specified in the list of tunable hyperparameters, then this value is used for all trials.

  • momentum (float, (default: None)) – Momentum for batch normalization. Values typically range from 0.01 to 0.4. If given, then used for trial 0. If not specified in the list of tunable hyperparameters, then this value is used for all trials.

  • lr (float, (default: None)) – Learning rate: the step size at each iteration while moving toward a minimum of the loss function. A large initial learning rate of 0.02 with decay is a good option. If given, then used for trial 0. If not specified in the list of tunable hyperparameters, then this value is used for all trials.

  • lambda_sparse (float, (default: None)) – Extra sparsity loss coefficient. The bigger this coefficient is, the sparser the model will be in terms of feature selection. Depending on the difficulty of the problem, reducing this value could help. If given, then used for trial 0. If not specified in the list of tunable hyperparameters, then this value is used for all trials.

  • patience (int, (default: None)) – Number of consecutive epochs without improvement before early stopping is performed. If patience is set to 0, no early stopping is performed. Values typically range from 5 to 20. Note that if patience is enabled, the weights from the best epoch are automatically loaded at the end of training. If given, then used for trial 0. If not specified in the list of tunable hyperparameters, then this value is used for all trials.

  • max_epochs (int, (default: None)) – Maximum number of epochs for training. Values typically range from 5 to 100. If given, then used for trial 0. If not specified in the list of tunable hyperparameters, then this value is used for all trials.

  • batch_size (int, (default: None)) – Number of examples per batch. Values typically range from 2 to 10 times virtual_batch_size. If given, then used for trial 0. If not specified in the list of tunable hyperparameters, then this value is used for all trials.

  • virtual_batch_size (int, (default: None)) – Size of the mini-batches used for “Ghost Batch Normalization”. ‘virtual_batch_size’ should divide ‘batch_size’. Typical values: 128, 256, 512, 1024. If given, then used for trial 0. If not specified in the list of tunable hyperparameters, then this value is used for all trials.

  • mask_type (str, (default: None)) – Either “sparsemax” or “entmax”. This is the masking function used for selecting features. If given, then used for trial 0. If not specified in the list of tunable hyperparameters, then this value is used for all trials.

  • optimizer_fn (func, (default: torch.optim.AdamW)) – PyTorch optimizer function.

  • scheduler_fn (func, (default: torch.optim.lr_scheduler.StepLR)) – PyTorch scheduler used to change the learning rate during training.

  • loss_fn (torch loss function, (default: torch.nn.CrossEntropyLoss())) – Loss function for training.

  • step_size (int, (default: 10)) – Period of scheduler learning rate decay (the step_size argument of StepLR).

  • gamma_scheduler (float, (default: 0.95)) – Multiplicative factor of scheduler learning rate decay. step_size and gamma_scheduler form the dictionary of parameters applied to scheduler_fn.

  • eval_metric (list, (default: ['accuracy'])) – List of evaluation metrics (‘accuracy’, ‘balanced_accuracy’, ‘logloss’). The last metric is used as the target and for early stopping.

  • direction (str, (default: 'maximize')) – Direction of the Optuna optimization: ‘maximize’ for ‘accuracy’ and ‘balanced_accuracy’, ‘minimize’ for ‘logloss’. Applies only to the last evaluation metric given in the eval_metric list.

  • drop_last (bool, (default: True)) – If True, the last incomplete batch is dropped when the dataset size is not divisible by the batch size. If False and the dataset size is not divisible by the batch size, the last batch is smaller.
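
Several of the parameters above interact: a tune_params entry defines a grid of integer values, virtual_batch_size should divide batch_size, drop_last changes the number of batches per epoch, step_size and gamma_scheduler define the learning rate decay, and direction has to match the last entry of eval_metric. The sketch below works through these relationships in plain Python. The helper names (`int_grid`, `batches_per_epoch`, `step_lr`, `direction_for`) are hypothetical and not part of scparadise. Two assumptions are made: the [lowest, highest, step] range includes both bounds (as in Optuna's suggest_int), and the scheduler is stepped once per epoch.

```python
def int_grid(low, high, step):
    """Integer values covered by a [low, high, step] entry (bounds assumed inclusive)."""
    return list(range(low, high + 1, step))


def batches_per_epoch(n_cells, batch_size, drop_last=True):
    """Number of batches per epoch, with or without the last incomplete batch."""
    full, remainder = divmod(n_cells, batch_size)
    if remainder and not drop_last:
        return full + 1  # keep the smaller final batch
    return full


def step_lr(lr, epoch, step_size=10, gamma_scheduler=0.95):
    """StepLR-style decay: multiply lr by gamma_scheduler every step_size epochs."""
    return lr * gamma_scheduler ** (epoch // step_size)


def direction_for(eval_metric):
    """Direction implied by the last metric in eval_metric, per the list above."""
    directions = {
        "accuracy": "maximize",
        "balanced_accuracy": "maximize",
        "logloss": "minimize",
    }
    return directions[eval_metric[-1]]


# tune_params = {"n_d": [8, 64, 4]} covers 8, 12, ..., 64.
n_d_values = int_grid(8, 64, 4)

# virtual_batch_size should divide batch_size.
batch_size, virtual_batch_size = 1024, 128
divides = batch_size % virtual_batch_size == 0

# 1000 cells with batch_size=256: 3 full batches plus 232 leftover cells.
kept = batches_per_epoch(1000, 256, drop_last=False)
dropped = batches_per_epoch(1000, 256, drop_last=True)

# With the defaults, the learning rate at epoch 25 has decayed twice.
lr_epoch_25 = step_lr(0.02, 25)

# The last metric decides the direction.
direction = direction_for(["accuracy", "logloss"])
```

Checks of this kind can be run before starting a long study, since a mismatch between direction and the last metric would make Optuna optimize in the wrong direction.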