# Better performance optimizers
This document introduces some third-party optimizers supported by MMEngine, which may bring faster convergence or better final performance.
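Every example in this document configures the optimizer through the `optim_wrapper` field of the `Runner`. If you want to sanity-check such a config without launching a full training run, MMEngine's `build_optim_wrapper` accepts the same dict; below is a minimal sketch, using the built-in `Adam` as a stand-in for the third-party optimizer types introduced later.

```python
import torch.nn as nn
from mmengine.optim import build_optim_wrapper

model = nn.Linear(8, 2)  # any nn.Module works for a quick check
# The same dict you would pass as Runner(optim_wrapper=...).
optim_wrapper = build_optim_wrapper(
    model, dict(optimizer=dict(type='Adam', lr=1e-3)))
print(optim_wrapper)
```

Once one of the packages below is installed, replacing `type='Adam'` with e.g. `type='Lion'` builds the third-party optimizer the same way.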
## D-Adaptation

[D-Adaptation](https://github.com/facebookresearch/dadaptation) provides `DAdaptAdaGrad`, `DAdaptAdam` and `DAdaptSGD` optimizers.
> **Note**
> The optimizers provided by D-Adaptation require mmengine v0.6.0 or later.
**Installation**

```bash
pip install dadaptation
```
**Usage**

Take `DAdaptAdaGrad` as an example.
```python
from mmengine.runner import Runner

# `ResNet18` and `train_dataloader_cfg` are assumed to be defined
# beforehand, as in MMEngine's getting-started examples.
runner = Runner(
    model=ResNet18(),
    work_dir='./work_dir',
    train_dataloader=train_dataloader_cfg,
    # To view the input parameters of DAdaptAdaGrad, refer to
    # https://github.com/facebookresearch/dadaptation/blob/main/dadaptation/dadapt_adagrad.py
    optim_wrapper=dict(
        optimizer=dict(type='DAdaptAdaGrad', lr=0.001, momentum=0.9)),
    train_cfg=dict(by_epoch=True, max_epochs=3),
)
runner.train()
```
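The same optimizer can also be constructed directly from the `dadaptation` package, outside MMEngine. A minimal sketch, using the top-level import shown in the repository's README; note that D-Adaptation treats `lr` as a multiplier on its adaptively estimated step size, and the upstream README suggests leaving it at 1.0:

```python
import torch
from dadaptation import DAdaptAdaGrad

model = torch.nn.Linear(10, 2)
# lr=1.0 keeps D-Adaptation's own step-size estimate unscaled.
optimizer = DAdaptAdaGrad(model.parameters(), lr=1.0, momentum=0.9)

loss = model(torch.randn(4, 10)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```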
## Lion-Pytorch

[lion-pytorch](https://github.com/lucidrains/lion-pytorch) provides the `Lion` optimizer.
> **Note**
> The Lion optimizer provided by lion-pytorch requires mmengine v0.6.0 or later.
**Installation**

```bash
pip install lion-pytorch
```
**Usage**
```python
from mmengine.runner import Runner

runner = Runner(
    model=ResNet18(),
    work_dir='./work_dir',
    train_dataloader=train_dataloader_cfg,
    # To view the input parameters of Lion, refer to
    # https://github.com/lucidrains/lion-pytorch/blob/main/lion_pytorch/lion_pytorch.py
    optim_wrapper=dict(
        optimizer=dict(type='Lion', lr=1e-4, weight_decay=1e-2)),
    train_cfg=dict(by_epoch=True, max_epochs=3),
)
runner.train()
```
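lion-pytorch is likewise a drop-in PyTorch optimizer. A minimal sketch outside MMEngine; the lion-pytorch README suggests an `lr` roughly 3 to 10 times smaller than AdamW's, paired with a correspondingly larger weight decay:

```python
import torch
from lion_pytorch import Lion

model = torch.nn.Linear(10, 2)
optimizer = Lion(model.parameters(), lr=1e-4, weight_decay=1e-2)

loss = model(torch.randn(4, 10)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```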
## Sophia

[Sophia](https://github.com/kyegomez/Sophia) provides `Sophia`, `SophiaG`, `DecoupledSophia` and `Sophia2` optimizers.
> **Note**
> The optimizers provided by Sophia require mmengine v0.7.4 or later.
**Installation**

```bash
pip install Sophia-Optimizer
```
**Usage**
```python
from mmengine.runner import Runner

runner = Runner(
    model=ResNet18(),
    work_dir='./work_dir',
    train_dataloader=train_dataloader_cfg,
    # To view the input parameters of SophiaG, refer to
    # https://github.com/kyegomez/Sophia/blob/main/Sophia/Sophia.py
    optim_wrapper=dict(
        optimizer=dict(type='SophiaG', lr=2e-4, betas=(0.965, 0.99),
                       rho=0.01, weight_decay=1e-1)),
    train_cfg=dict(by_epoch=True, max_epochs=3),
)
runner.train()
```
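For direct use outside MMEngine, a minimal sketch; the import path here is an assumption based on the pip package layout, so check the repository linked above if it fails:

```python
import torch
from Sophia import SophiaG  # import path assumed; verify against the repo

model = torch.nn.Linear(10, 2)
optimizer = SophiaG(model.parameters(), lr=2e-4, betas=(0.965, 0.99),
                    rho=0.01, weight_decay=1e-1)

loss = model(torch.randn(4, 10)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```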
## bitsandbytes

[bitsandbytes](https://github.com/TimDettmers/bitsandbytes) provides `AdamW8bit`, `Adam8bit`, `Adagrad8bit`, `PagedAdam8bit`, `PagedAdamW8bit`, `LAMB8bit`, `LARS8bit`, `RMSprop8bit`, `Lion8bit`, `PagedLion8bit` and `SGD8bit` optimizers.
> **Note**
> The optimizers provided by bitsandbytes require mmengine v0.9.0 or later.
**Installation**

```bash
pip install bitsandbytes
```
**Usage**

Take `AdamW8bit` as an example.
```python
from mmengine.runner import Runner

runner = Runner(
    model=ResNet18(),
    work_dir='./work_dir',
    train_dataloader=train_dataloader_cfg,
    # To view the input parameters of AdamW8bit, refer to
    # https://github.com/TimDettmers/bitsandbytes/blob/main/bitsandbytes/optim/adamw.py
    optim_wrapper=dict(
        optimizer=dict(type='AdamW8bit', lr=1e-4, weight_decay=1e-2)),
    train_cfg=dict(by_epoch=True, max_epochs=3),
)
runner.train()
```
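The 8-bit optimizers can also be used directly through the `bnb.optim` namespace. A minimal sketch; note that bitsandbytes requires a CUDA GPU, and the memory savings from 8-bit optimizer states only become noticeable on much larger models than this toy example:

```python
import torch
import bitsandbytes as bnb

model = torch.nn.Linear(10, 2).cuda()  # bitsandbytes requires CUDA
# 8-bit states cut optimizer memory roughly 4x compared with fp32 states.
optimizer = bnb.optim.AdamW8bit(model.parameters(), lr=1e-4, weight_decay=1e-2)

loss = model(torch.randn(4, 10, device='cuda')).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```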
## transformers

[transformers](https://github.com/huggingface/transformers) provides the `Adafactor` optimizer.
> **Note**
> The Adafactor optimizer provided by transformers requires mmengine v0.9.0 or later.
**Installation**

```bash
pip install transformers
```
**Usage**

Take `Adafactor` as an example.
```python
from mmengine.runner import Runner

runner = Runner(
    model=ResNet18(),
    work_dir='./work_dir',
    train_dataloader=train_dataloader_cfg,
    # To view the input parameters of Adafactor, refer to
    # https://github.com/huggingface/transformers/blob/v4.33.2/src/transformers/optimization.py#L492
    optim_wrapper=dict(
        optimizer=dict(type='Adafactor', lr=1e-5, weight_decay=1e-2,
                       scale_parameter=False, relative_step=False)),
    train_cfg=dict(by_epoch=True, max_epochs=3),
)
runner.train()
```
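The `scale_parameter=False, relative_step=False` pair in the config above matters: transformers' `Adafactor` raises a `ValueError` when an explicit `lr` is combined with `relative_step=True`, so both flags must be disabled to use a manual learning rate. A minimal sketch outside MMEngine:

```python
import torch
from transformers.optimization import Adafactor

model = torch.nn.Linear(10, 2)
# An explicit lr requires relative_step=False; scale_parameter=False
# additionally stops Adafactor from rescaling lr by parameter norms.
optimizer = Adafactor(model.parameters(), lr=1e-5, weight_decay=1e-2,
                      scale_parameter=False, relative_step=False)

loss = model(torch.randn(4, 10)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```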