renate.training.training module
- renate.training.training.run_training_job(mode, config_space, metric, backend, updater='ER', max_epochs=50, task_id='default_task', chunk_id=None, input_state_url=None, output_state_url=None, working_directory='renate_working_dir', dependencies=None, config_file=None, requirements_file=None, role=None, instance_type='ml.c5.xlarge', instance_count=1, instance_max_time=259200, max_time=None, max_num_trials_started=None, max_num_trials_completed=None, max_num_trials_finished=None, max_cost=None, n_workers=1, scheduler=None, scheduler_kwargs=None, seed=0, accelerator='auto', devices=1, strategy='ddp', precision='32', deterministic_trainer=False, gradient_clip_val=None, gradient_clip_algorithm=None, job_name='renate')
Starts a model update, including hyperparameter optimization.
- Parameters:
  - mode (Literal['min', 'max']) – Declares the type of optimization problem: min or max.
  - config_space (Dict[str, Any]) – Details on defining your own search space are provided in the Syne Tune documentation.
  - metric (str) – Name of the metric to optimize.
  - backend (Literal['local', 'sagemaker']) – Whether to run jobs locally (local) or on SageMaker (sagemaker).
  - updater (str) – Updater used for the model update.
  - max_epochs (int) – The maximum number of epochs used to train the model. For comparability between methods, epochs are interpreted as “finetuning-equivalent”. That is, one epoch is defined as len(current_task_dataset) / batch_size update steps.
  - task_id (str) – Unique identifier for the current task.
  - chunk_id (Optional[int]) – Unique identifier for the current data chunk.
  - input_state_url (Optional[str]) – Path to the Renate model state.
  - output_state_url (Optional[str]) – Path where the Renate model state will be stored.
  - working_directory (Optional[str]) – Path to the working directory.
  - dependencies (Optional[List[str]]) – (SageMaker backend only) List of strings containing absolute or relative paths to files and directories that will be uploaded as part of the SageMaker training job.
  - config_file (Optional[str]) – File containing the definition of model_fn and data_module_fn.
  - requirements_file (Optional[str]) – (SageMaker backend only) Path to requirements.txt containing environment dependencies.
  - role (Optional[str]) – (SageMaker backend only) An AWS IAM role (either name or full ARN).
  - instance_type (str) – (SageMaker backend only) SageMaker instance type for each worker.
  - instance_count (int) – (SageMaker backend only) Number of instances for each worker.
  - instance_max_time (float) – (SageMaker backend only) Requested maximum wall-clock time for each worker.
  - max_time (Optional[float]) – Stopping criterion: wall-clock time.
  - max_num_trials_started (Optional[int]) – Stopping criterion: trials started.
  - max_num_trials_completed (Optional[int]) – Stopping criterion: trials completed.
  - max_num_trials_finished (Optional[int]) – Stopping criterion: trials finished.
  - max_cost (Optional[float]) – (SageMaker backend only) Stopping criterion: SageMaker cost.
  - n_workers (int) – Number of workers running in parallel.
  - scheduler (Union[str, Type[TrialScheduler], None]) – Defaults to random search. To change it, provide either a string (random, bo, asha or rush) or a scheduler class, together with its corresponding scheduler_kwargs if required. For the latter option, see details at .
  - scheduler_kwargs (Optional[Dict]) – Only required if a custom scheduler is provided.
  - seed (int) – Seed used for ensuring reproducibility.
  - accelerator (Literal['auto', 'cpu', 'gpu', 'tpu']) – Type of accelerator to use.
  - devices (int) – Number of devices to use per worker (set in n_workers).
  - strategy (str) – Name of the distributed training strategy to use. More details
  - precision (str) – Type of bit precision to use. More details
  - deterministic_trainer (bool) – When true, the Trainer behaves deterministically, including on GPU.
  - gradient_clip_val (Optional[float]) – The value at which to clip gradients. Passing None disables clipping. More details
  - gradient_clip_algorithm (Optional[str]) – The gradient clipping algorithm to use. Can be norm or value. More details
  - job_name (str) – Prefix for the name of the SageMaker training job.
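
The max_epochs entry defines an epoch as len(current_task_dataset) / batch_size update steps. As a small worked example, the sketch below converts an epoch budget into a number of update steps. The helper name `finetuning_equivalent_steps` is hypothetical and not part of Renate, and rounding a partial batch up to a full step is an assumption, since the description only gives the ratio.

```python
import math


def finetuning_equivalent_steps(dataset_size: int, batch_size: int, max_epochs: int) -> int:
    """Hypothetical helper (not part of Renate): update steps for an epoch budget."""
    if dataset_size < 0 or max_epochs < 0:
        raise ValueError("dataset_size and max_epochs must be non-negative")
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # One "finetuning-equivalent" epoch is len(current_task_dataset) / batch_size steps.
    # Assumption: a trailing partial batch still counts as one update step.
    steps_per_epoch = math.ceil(dataset_size / batch_size)
    return steps_per_epoch * max_epochs


# 1000 samples with batch size 32 give 32 steps per epoch (31.25 rounded up).
total_steps = finetuning_equivalent_steps(dataset_size=1000, batch_size=32, max_epochs=50)
print(total_steps)
```

Under this reading, the step budget depends only on the size of the current task's data, so methods that also draw from a memory buffer get the same number of update steps as plain fine-tuning.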
- Return type:
  Optional[Tuner]
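
A minimal sketch of how a call might be assembled for the local backend, based only on the signature above. The helper `build_job_kwargs`, the hyperparameter names in the search space, the metric name `val_accuracy`, and the file paths are all illustrative assumptions rather than values taken from Renate. The actual call is wrapped in `launch` and imports Renate lazily, so the argument checks can be exercised without the library installed.

```python
from typing import Any, Dict

# Allowed values, taken from the Literal types in the signature above.
VALID_MODES = ("min", "max")
VALID_BACKENDS = ("local", "sagemaker")


def build_job_kwargs(
    mode: str, metric: str, backend: str, config_space: Dict[str, Any], **extra: Any
) -> Dict[str, Any]:
    """Hypothetical helper (not part of Renate): check and collect the required arguments."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if backend not in VALID_BACKENDS:
        raise ValueError(f"backend must be one of {VALID_BACKENDS}, got {backend!r}")
    kwargs: Dict[str, Any] = {
        "mode": mode,
        "config_space": dict(config_space),
        "metric": metric,
        "backend": backend,
    }
    kwargs.update(extra)  # optional keyword arguments such as max_epochs or seed
    return kwargs


job_kwargs = build_job_kwargs(
    mode="max",
    metric="val_accuracy",  # assumed metric name
    backend="local",
    config_space={"learning_rate": 0.05, "batch_size": 32},  # illustrative keys
    updater="ER",
    max_epochs=50,
    config_file="renate_config.py",  # placeholder path
    output_state_url="./state_dump/",  # placeholder path
    seed=0,
)


def launch(kwargs: Dict[str, Any]):
    """Start the job; requires Renate to be installed."""
    from renate.training.training import run_training_job

    return run_training_job(**kwargs)
```

Calling `launch(job_kwargs)` would then start the update. Checking `mode` and `backend` up front is a convenience of this sketch, intended to surface typos before any job is started.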