Lion

EvoLved Sign Momentum (Lion) is a stochastic optimization algorithm for deep learning that builds on sign-based momentum optimization. Proposed in a 2023 research paper, Lion aims to address the limitations of established optimizers such as Stochastic Gradient Descent (SGD) and AdamW.

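Lion is similar to AdamW with slight adjustments. Like sign momentum optimization, Lion uses only the sign of its update direction (an interpolation of the momentum and the current gradient) rather than the magnitude, so every parameter moves by the same step size. Lion also tracks only a single momentum buffer, which makes it more memory-efficient than AdamW.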
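To make the sign-based update concrete, here is a minimal pure-Python sketch of a single Lion step as described in the original paper. The function names `sign` and `lion_step`, and the use of plain lists instead of tensors, are assumptions made for illustration; this is not the code of any particular library.

```python
def sign(x):
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def lion_step(params, grads, momentum, lr=1e-4, beta1=0.9, beta2=0.99,
              weight_decay=0.0):
    """Apply one Lion update to lists of floats (illustrative sketch only)."""
    new_params, new_momentum = [], []
    for p, g, m in zip(params, grads, momentum):
        # Interpolate between the momentum and the current gradient.
        c = beta1 * m + (1 - beta1) * g
        # Only the sign of c is used; weight decay is decoupled, as in AdamW.
        new_params.append(p - lr * (sign(c) + weight_decay * p))
        # The momentum is updated with a second, slower coefficient.
        new_momentum.append(beta2 * m + (1 - beta2) * g)
    return new_params, new_momentum


params, momentum = lion_step(
    params=[1.0, -2.0, 0.5],
    grads=[0.3, -100.0, 0.0],
    momentum=[0.0, 0.0, 0.0],
    lr=0.1,
)
print(params)    # each parameter with a non-zero gradient moves by exactly lr
print(momentum)
```

Note that the gradients 0.3 and -100.0 produce steps of the same size: the magnitude of the gradient influences the momentum, but not the size of the current step.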

Lion itself was discovered through an evolutionary program search, in which a population of candidate update rules was maintained, mutated, and selected over time. The resulting optimizer does not maintain a population of models during training; it is a single, simple update rule.

In experiments conducted by the developers, Lion was shown to outperform traditional optimization algorithms such as SGD and Adam on several benchmark datasets. Lion was also found to be more robust to changes in hyperparameters and more resistant to overfitting.

Notably, the developers point out that Lion likely performs no better than AdamW when the batch size is smaller than 64. They therefore suggest using larger batches, since Lion's advantage over AdamW grows as the batch size increases.

Overall, Lion is a promising optimization algorithm that has shown competitive performance compared to other state-of-the-art optimization algorithms on various deep learning tasks, including image classification, object detection, and natural language processing.

Major Parameters

To get a deeper understanding of the optimizer, please refer to the original paper.
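As a rough starting point, the original paper reports that a suitable learning rate for Lion is typically 3-10x smaller than for AdamW, and that the decoupled weight decay should be 3-10x larger so that the effective regularization strength stays similar. The sketch below turns that rule of thumb into a small helper; the name `adamw_to_lion_hparams` and the `factor` argument are hypothetical, and the beta values are the defaults given in the paper.

```python
def adamw_to_lion_hparams(adamw_lr, adamw_weight_decay, factor=3.0):
    """Hypothetical helper: scale AdamW settings into a Lion starting point.

    `factor` is assumed to lie roughly in the 3-10 range suggested by the paper.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return {
        # Lion takes sign-based steps, so it needs a smaller learning rate.
        "lr": adamw_lr / factor,
        # Scaling weight decay up keeps lr * weight_decay unchanged.
        "weight_decay": adamw_weight_decay * factor,
        # Default momentum coefficients from the paper.
        "betas": (0.9, 0.99),
    }


hparams = adamw_to_lion_hparams(adamw_lr=1e-3, adamw_weight_decay=1e-2, factor=10.0)
print(hparams)
```

The result is only an initial guess and still needs tuning on your own task.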

Code Implementation

Currently, Lion is not a built-in optimizer in core PyTorch (torch.optim). However, you can quickly test it on your data through third-party, framework-compatible libraries such as lion-pytorch.

  

```python
# pip install lion-pytorch

import torch
from torch import nn
from lion_pytorch import Lion

# toy model
model = nn.Linear(10, 1)

# instantiate Lion with the model parameters
opt = Lion(model.parameters(), lr=1e-4, weight_decay=1e-2)

# forward and backward pass; the summed model output stands in for a real loss
loss = model(torch.randn(10)).sum()
loss.backward()

# optimizer step, then clear the gradients for the next iteration
opt.step()
opt.zero_grad()
```

Further Resources

SGD;
AdamW.
