`lr_scheduler`: `list[list[LRScheduler]]` (Optional). Each scheduler is configured through `name`, `interval`, and `kwargs`:
- `name`: The name of the learning rate scheduler strategy. Valid learning rate scheduler strategies are:
  - `constant_with_warmup`: Uses a constant learning rate preceded by a warmup period that increases the learning rate from 0 to `base_lr`. The number of warmup steps can be specified through `kwargs` via `warmup_ratio_or_steps`.
  - `linear_with_warmup`: Decays the learning rate linearly from `base_lr` to 0, preceded by a warmup period that increases the learning rate from 0 to `base_lr`. The number of warmup steps can be specified through `kwargs` via `warmup_ratio_or_steps`.
  - `exponential`: Multiplies the learning rate by a decay factor `gamma` at each scheduling interval. `gamma` can be specified through `kwargs`.
  - `cosine_with_warmup`: Adjusts the learning rate between `base_lr` and 0 following a cosine function, preceded by a warmup period that increases the learning rate from 0 to `base_lr`. The number of warmup steps can be specified through `kwargs` via `warmup_ratio_or_steps`.
  - `cosine_with_warmup_restarts`: Adjusts the learning rate between `base_lr` and 0 following a cosine function, with several hard restarts, preceded by a warmup period that increases the learning rate from 0 to `base_lr`. The number of hard restarts can be configured through `kwargs` via `num_cycles` (3 by default), and the number of warmup steps via `warmup_ratio_or_steps`.
- `interval`: Specifies whether learning rate scheduling is applied per optimization step (`step`) or per epoch (`epoch`).
- `kwargs`: Additional arguments depending on the chosen LR scheduler strategy; see the strategy descriptions above. Illustrative configuration and schedule sketches follow this list.
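For concreteness, a single scheduler entry under this schema might look as follows. The YAML layout, the double nesting implied by `list[list[LRScheduler]]`, and the value `0.1` are assumptions for illustration, not documented defaults:

```yaml
lr_scheduler:
  - - name: cosine_with_warmup           # one of the strategies listed above
      interval: step                     # advance the schedule every optimization step
      kwargs:
        warmup_ratio_or_steps: 0.1       # illustrative value (warmup ratio or step count)
```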
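To make the schedule shapes concrete, here is a minimal PyTorch sketch of the warmup-based strategies written as `LambdaLR` multiplier functions; each returns a factor that scales `base_lr`. The bodies mirror the descriptions above, with the linear warmup ramp and the hard-restart formula being assumptions about the exact math; this is a sketch, not this library's implementation:

```python
import math

import torch
from torch.optim.lr_scheduler import LambdaLR


def warmup_factor(step: int, warmup_steps: int) -> float:
    """Linear ramp of the multiplier from 0 to 1 over the warmup period."""
    return step / max(1, warmup_steps)


def constant_with_warmup(step: int, warmup_steps: int) -> float:
    # After warmup, hold the learning rate at base_lr (multiplier 1.0).
    return warmup_factor(step, warmup_steps) if step < warmup_steps else 1.0


def linear_with_warmup(step: int, warmup_steps: int, total_steps: int) -> float:
    # After warmup, decay linearly from base_lr to 0 over the remaining steps.
    if step < warmup_steps:
        return warmup_factor(step, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


def cosine_with_warmup(step: int, warmup_steps: int, total_steps: int) -> float:
    # After warmup, follow half a cosine wave from base_lr down to 0.
    if step < warmup_steps:
        return warmup_factor(step, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


def cosine_with_warmup_restarts(
    step: int, warmup_steps: int, total_steps: int, num_cycles: int = 3
) -> float:
    # After warmup, run `num_cycles` cosine decays; each decay ends near 0,
    # then the rate "hard restarts" back at base_lr.
    if step < warmup_steps:
        return warmup_factor(step, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * ((progress * num_cycles) % 1.0)))


# Wiring one of these into an optimizer; base_lr is the optimizer's lr.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)  # base_lr = 1e-3
scheduler = LambdaLR(
    optimizer,
    lr_lambda=lambda s: cosine_with_warmup(s, warmup_steps=100, total_steps=1_000),
)
```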
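Finally, `interval` only changes where the scheduler is advanced. A sketch of the distinction, using PyTorch's `ExponentialLR` to stand in for the `exponential` strategy (it multiplies the learning rate by `gamma` on every `scheduler.step()` call); the training loop itself is illustrative:

```python
import torch
from torch.optim.lr_scheduler import ExponentialLR

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scheduler = ExponentialLR(optimizer, gamma=0.9)  # `gamma` would come from `kwargs`

interval = "epoch"  # the documented choices: "step" or "epoch"
for epoch in range(3):
    for _ in range(10):  # stand-in for iterating over a data loader
        optimizer.zero_grad()
        loss = model(torch.randn(8, 4)).sum()
        loss.backward()
        optimizer.step()
        if interval == "step":  # interval: step -> decay after every optimizer step
            scheduler.step()
    if interval == "epoch":  # interval: epoch -> decay once per epoch
        scheduler.step()
```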