renate.training.training module

renate.training.training.run_training_job(mode, config_space, metric, backend, updater='ER', max_epochs=50, task_id='default_task', chunk_id=None, input_state_url=None, output_state_url=None, working_directory='renate_working_dir', dependencies=None, config_file=None, requirements_file=None, role=None, instance_type='ml.c5.xlarge', instance_count=1, instance_max_time=259200, max_time=None, max_num_trials_started=None, max_num_trials_completed=None, max_num_trials_finished=None, max_cost=None, n_workers=1, scheduler=None, scheduler_kwargs=None, seed=0, accelerator='auto', devices=1, strategy='ddp', precision='32', deterministic_trainer=False, gradient_clip_val=None, gradient_clip_algorithm=None, job_name='renate')

Starts updating the model, including hyperparameter optimization.

Parameters:

mode (Literal['min', 'max']) – Declares the type of optimization problem: min or max.
config_space (Dict[str, Any]) – Details on defining your own search space are provided in the Syne Tune documentation.
metric (str) – Name of the metric to optimize.
backend (Literal['local', 'sagemaker']) – Whether to run jobs locally (local) or on SageMaker (sagemaker).
updater (str) – Updater used for the model update.
max_epochs (int) – The maximum number of epochs used to train the model. For comparability between methods, epochs are interpreted as “finetuning-equivalent”. That is, one epoch is defined as len(current_task_dataset) / batch_size update steps.
task_id (str) – Unique identifier for the current task.
chunk_id (Optional[int]) – Unique identifier for the current data chunk.
input_state_url (Optional[str]) – Path to the Renate model state.
output_state_url (Optional[str]) – Path where the Renate model state will be stored.
working_directory (Optional[str]) – Path to the working directory.
dependencies (Optional[List[str]]) – (SageMaker backend only) List of strings containing absolute or relative paths to files and directories that will be uploaded as part of the SageMaker training job.
config_file (Optional[str]) – File containing the definition of model_fn and data_module_fn.
requirements_file (Optional[str]) – (SageMaker backend only) Path to requirements.txt containing environment dependencies.
role (Optional[str]) – (SageMaker backend only) An AWS IAM role (either name or full ARN).
instance_type (str) – (SageMaker backend only) SageMaker instance type for each worker.
instance_count (int) – (SageMaker backend only) Number of instances for each worker.
instance_max_time (float) – (SageMaker backend only) Requested maximum wall-clock time for each worker.
max_time (Optional[float]) – Stopping criterion: wall-clock time.
max_num_trials_started (Optional[int]) – Stopping criterion: trials started.
max_num_trials_completed (Optional[int]) – Stopping criterion: trials completed.
max_num_trials_finished (Optional[int]) – Stopping criterion: trials finished.
max_cost (Optional[float]) – (SageMaker backend only) Stopping criterion: SageMaker cost.
n_workers (int) – Number of workers running in parallel.
scheduler (Union[str, Type[TrialScheduler], None]) – Defaults to random search. To change it, provide either a string (random, bo, asha or rush) or a scheduler class together with its scheduler_kwargs, if required. For the latter option, see details at .
scheduler_kwargs (Optional[Dict]) – Only required if a custom scheduler is provided.
seed (int) – Seed used for ensuring reproducibility.
accelerator (Literal['auto', 'cpu', 'gpu', 'tpu']) – Type of accelerator to use.
devices (int) – Number of devices to use per worker (the number of workers is set in n_workers).
strategy (str) – Name of the distributed training strategy to use.
precision (str) – Type of bit precision to use.
deterministic_trainer (bool) – When true, the Trainer behaves deterministically, including on GPU.
gradient_clip_val (Optional[float]) – The value at which to clip gradients. Passing None disables gradient clipping.
gradient_clip_algorithm (Optional[str]) – The gradient clipping algorithm to use. Can be norm or value.
job_name (str) – Prefix for the name of the SageMaker training job.
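
The “finetuning-equivalent” epoch definition given for max_epochs can be made concrete with a little arithmetic. The sketch below is not part of Renate: the helper names are hypothetical, and rounding a partial batch up to a full update step is an assumption, since the description only states len(current_task_dataset) / batch_size.

```python
import math


def update_steps_per_epoch(num_current_task_samples: int, batch_size: int) -> int:
    """Update steps in one finetuning-equivalent epoch.

    Only the size of the current task's dataset enters the calculation.
    Rounding up for a final partial batch is an assumption of this sketch.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(num_current_task_samples / batch_size)


def total_update_steps(
    num_current_task_samples: int, batch_size: int, max_epochs: int = 50
) -> int:
    """Upper bound on update steps implied by max_epochs."""
    return max_epochs * update_steps_per_epoch(num_current_task_samples, batch_size)


# Illustrative numbers: 10,000 samples in the current task, batch size 32.
steps = update_steps_per_epoch(10_000, 32)
budget = total_update_steps(10_000, 32, max_epochs=50)
print(steps, budget)
```

Because the step count depends only on the current task's dataset, different update methods are compared under the same update budget.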

Return type:

Optional[Tuner]
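
A minimal sketch of how the arguments above might be assembled for a call. The parameter names come from the signature on this page; the config_space keys, the metric name, the file paths, and the helpers build_job_kwargs and launch are hypothetical placeholders, not values taken from the Renate documentation. The config_space here uses constant values only, where Syne Tune search domains would normally go.

```python
from typing import Any, Dict, Optional


def build_job_kwargs(backend: str = "local", role: Optional[str] = None) -> Dict[str, Any]:
    """Collect keyword arguments for run_training_job (hypothetical helper)."""
    if backend not in ("local", "sagemaker"):
        raise ValueError(f"backend must be 'local' or 'sagemaker', got {backend!r}")
    kwargs: Dict[str, Any] = {
        "mode": "max",
        # Hypothetical keys with constant values; see the Syne Tune
        # documentation for defining real search domains.
        "config_space": {"learning_rate": 0.01, "batch_size": 32},
        "metric": "val_accuracy",  # hypothetical metric name
        "backend": backend,
        "updater": "ER",
        "max_epochs": 50,
        "config_file": "renate_config.py",  # hypothetical path
        "output_state_url": "./renate_state/",  # hypothetical path
        "max_num_trials_finished": 4,  # stopping criterion
        "n_workers": 1,
        "seed": 0,
    }
    if backend == "sagemaker":
        # These arguments are documented as "SageMaker backend only".
        kwargs.update(
            role=role,
            instance_type="ml.c5.xlarge",
            instance_count=1,
            job_name="renate",
        )
    return kwargs


def launch(kwargs: Dict[str, Any]):
    """Start the job; imported lazily so this file loads without Renate."""
    from renate.training.training import run_training_job

    return run_training_job(**kwargs)


local_kwargs = build_job_kwargs()
print(sorted(local_kwargs))
```

Calling launch(local_kwargs) would then start the update with the local backend, provided Renate is installed and the placeholder config file exists.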

renate.training.training.submit_remote_job(dependencies, role, instance_type, instance_count, instance_max_time, job_name, optional_dependencies=None, **job_kwargs)

Executes the training job on SageMaker.

See renate.training.training.run_training_job for a description of the arguments.

Return type:

str