RMSprop

RMSprop is another optimization technique that uses a different effective learning rate for each parameter. It adapts each learning rate by keeping an exponential moving average of the squared gradient and dividing the parameter update by the square root of that average.

Mathematically, the exponential moving average of the gradient squared is given as follows,

$$$S_{dw}=\alpha \cdot S_{dw} +(1-\alpha) \cdot (\partial w)^2$$$

Here $$w$$ is one of the parameters, $$\partial w$$ is the gradient of the objective function with respect to $$w$$, and $$\alpha$$ is the smoothing constant.

The value of $$\alpha$$ is usually set to 0.99.
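The moving average can be sketched in a few lines of plain Python. This is only an illustration: the function name update_squared_grad_average, the constant gradient of 2.0, and the choice to start the average at zero are assumptions made for the example.

```python
def update_squared_grad_average(s_dw, grad, alpha=0.99):
    """Return the new exponential moving average of the squared gradient."""
    return alpha * s_dw + (1 - alpha) * grad ** 2


# Assumption for illustration: the average starts at zero and the
# gradient stays constant at 2.0 for three updates.
s_dw = 0.0
for grad in [2.0, 2.0, 2.0]:
    s_dw = update_squared_grad_average(s_dw, grad)

print("S_dw after three updates:", s_dw)
```

With alpha close to 1, the average reacts slowly: after three updates it is still far below the squared gradient of 4.0.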

Then $$S_{dw}$$ is used to update the parameter:

$$$w=w-\eta \cdot \frac{\partial w}{\sqrt{S_{dw}}}$$$

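Here $$\eta$$ is the base learning rate. In practice, a small constant (the eps argument of the optimizer) is added to the denominator for numerical stability.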
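The two formulas combine into a single update step. The sketch below applies one step to the toy objective f(w) = w^2, whose gradient is 2w. The function name rmsprop_step, the toy objective, and the zero starting value of the average are assumptions for illustration. The small eps term in the denominator is not part of the formula above; it is added here only to avoid division by zero.

```python
import math


def rmsprop_step(w, grad, s_dw, lr=0.01, alpha=0.99, eps=1e-8):
    """Apply one RMSprop update to a single parameter.

    Returns the updated parameter and the updated moving average.
    """
    # Update the moving average of the squared gradient.
    s_dw = alpha * s_dw + (1 - alpha) * grad ** 2
    # Scale the gradient by the root of the average before stepping.
    w = w - lr * grad / (math.sqrt(s_dw) + eps)
    return w, s_dw


# Toy objective (assumed for this example): f(w) = w**2, gradient 2 * w.
w, s_dw = 1.0, 0.0
w, s_dw = rmsprop_step(w, grad=2.0 * w, s_dw=s_dw)
print("w after one step:", w)
```

Because the average starts at zero, the very first step is about ten times the base learning rate when alpha is 0.99: the parameter moves from 1.0 to roughly 0.9.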

Notice some implications of such an update. If the gradient of the objective function with respect to $$w$$ has been very large, the update shrinks, since we are dividing by a large value. Similarly, if the gradient with respect to $$w$$ has been small, the update is scaled up.
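This scaling effect can be checked numerically. The sketch below feeds a constant large gradient and a constant small gradient through the update and records the size of each step. The helper name rmsprop_step_sizes and the gradient values 100.0 and 0.01 are assumptions chosen for illustration.

```python
import math


def rmsprop_step_sizes(grad, steps, lr=0.01, alpha=0.99, eps=1e-8):
    """Return the absolute size of each RMSprop step for a constant gradient."""
    s_dw = 0.0
    sizes = []
    for _ in range(steps):
        s_dw = alpha * s_dw + (1 - alpha) * grad ** 2
        sizes.append(lr * abs(grad) / (math.sqrt(s_dw) + eps))
    return sizes


lr = 0.01
steep = rmsprop_step_sizes(grad=100.0, steps=500, lr=lr)
flat = rmsprop_step_sizes(grad=0.01, steps=500, lr=lr)

# Step sizes that plain gradient descent would take with the same lr.
sgd_steep = lr * 100.0
sgd_flat = lr * 0.01

print("RMSprop steps:", steep[-1], flat[-1])
print("Plain gradient descent steps:", sgd_steep, sgd_flat)
```

In this setup both RMSprop step sizes end up close to the base learning rate, whereas the plain gradient descent steps differ by four orders of magnitude.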

Let us clarify with the help of contour lines:

Comparison of stochastic gradient descent and RMSprop

We can see that the higher gradient in the vertical direction and the lower gradient in the horizontal direction slow down the overall search for the optimal solution. RMSprop addresses this problem by rescaling the step in each direction, which yields a better search “trajectory”.

  

```python
# Import the libraries.
import torch
import torch.nn as nn

# Random inputs and targets.
x = torch.randn(10, 3)
y = torch.randn(10, 2)

# Build a fully connected layer.
linear = nn.Linear(3, 2)

# Build the MSE loss function.
criterion = nn.MSELoss()

# Optimization method using RMSprop.
optimizer = torch.optim.RMSprop(linear.parameters(), lr=0.01, alpha=0.99,
                                eps=1e-08, weight_decay=0, momentum=0, centered=False)

# Forward pass.
pred = linear(x)

# Compute loss.
loss = criterion(pred, y)
print('loss:', loss.item())

# Backward pass: clear old gradients, then compute new ones.
optimizer.zero_grad()
loss.backward()

# Update the parameters with one RMSprop step.
optimizer.step()
```

