EvoLved Sign Momentum (Lion) is a stochastic optimization algorithm for deep learning that builds on the idea of sign-based momentum updates. Proposed in a 2023 research paper ("Symbolic Discovery of Optimization Algorithms" by Chen et al.), Lion aims to address the limitations of traditional optimizers such as Stochastic Gradient Descent (SGD) and AdamW.

Lion resembles AdamW but is simpler and more memory-efficient, since it keeps track of only a single momentum buffer instead of AdamW's two moment estimates. Like sign momentum optimization, Lion uses only the sign of its update direction rather than the magnitude, so every parameter moves by the same step size, scaled by the learning rate.
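
To make this concrete, here is a minimal sketch of a single Lion step for one parameter tensor in plain PyTorch. The function name, default hyperparameter values, and toy tensors are illustrative (this is not the library's API), but the update rule itself follows the pseudocode in the paper:

```python
import torch

def lion_step(param, grad, momentum, lr=1e-4, beta1=0.9, beta2=0.99, wd=1e-2):
    """One illustrative Lion update for a single tensor (not the library code)."""
    # direction: sign of an interpolation between the momentum buffer and the gradient
    update = (beta1 * momentum + (1 - beta1) * grad).sign()
    # decoupled weight decay, as in AdamW
    param.mul_(1 - lr * wd)
    # every coordinate moves by exactly lr; only the sign differs
    param.add_(update, alpha=-lr)
    # the momentum buffer is updated with a second interpolation factor
    momentum.mul_(beta2).add_(grad, alpha=1 - beta2)

# toy usage
p, g, m = torch.randn(5), torch.randn(5), torch.zeros(5)
lion_step(p, g, m)
```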

The "EvoLved" in Lion's name refers to how the algorithm was discovered rather than to anything Lion does during training: the authors ran an evolutionary program search in which a population of candidate update rules was maintained, mutated, and selected over time, and Lion is the simple sign-momentum rule that this search converged on.

Figure: AdamW vs. Lion.

In the experiments reported by its developers, Lion outperformed traditional optimization algorithms such as SGD and Adam on several benchmark datasets. Lion was also found to be more robust to changes in hyperparameters and more resistant to overfitting.

Notably, the developers highlight that Lion likely performs no better than AdamW when the batch size is below 64, so they suggest increasing the batch size: Lion's advantage over AdamW grows as the batch size increases.

Overall, Lion is a promising optimizer that has shown competitive performance against other state-of-the-art algorithms on various deep learning tasks, including image classification, object detection, and natural language processing.

To get a deeper understanding of the optimizer, please refer to the original paper.
At the time of writing, Lion is not a built-in option in the major deep learning frameworks. However, you can quickly test it on your data through framework-compatible third-party libraries such as lion-pytorch:
```python
# pip install lion-pytorch

import torch
from torch import nn

# toy model
model = nn.Linear(10, 1)

# import Lion and instantiate it with the model's parameters
from lion_pytorch import Lion

opt = Lion(model.parameters(), lr=1e-4, weight_decay=1e-2)

# forward and backward passes
loss = model(torch.randn(10))
loss.backward()

# optimizer step
opt.step()
opt.zero_grad()
```

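A practical note from the paper: because sign-based updates have a larger norm than AdamW's, the authors recommend a learning rate roughly 3-10x smaller than you would use with AdamW, paired with a weight decay roughly 3-10x larger, so that the effective regularization strength (learning rate × weight decay) stays comparable.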