Optimization Functions
class slugnet.optimizers.SGD(lr=0.001, clip=-1, decay=0.0, lr_min=0.0, lr_max=inf)

Bases: slugnet.optimizers.Optimizer
Optimize model parameters using standard stochastic gradient descent.
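As an illustration of the update rule this class applies, here is a minimal NumPy sketch of vanilla SGD with optional gradient clipping. The sgd_update helper is a hypothetical stand-in, not slugnet's actual API, and the interpretation of clip as a symmetric elementwise bound (disabled when non-positive, matching the default of -1) is an assumption.

```python
import numpy as np

def sgd_update(params, grads, lr=0.001, clip=-1):
    """Illustrative vanilla SGD step (not slugnet's actual implementation).

    params, grads: lists of equally-shaped numpy arrays.
    clip: if positive, gradients are clipped to [-clip, clip]
          before the update (an assumption about the argument).
    """
    new_params = []
    for p, g in zip(params, grads):
        if clip > 0:
            g = np.clip(g, -clip, clip)  # elementwise gradient clipping
        new_params.append(p - lr * g)    # theta <- theta - lr * grad
    return new_params

# Example: one step on a single weight matrix
W = np.ones((2, 2))
dW = np.full((2, 2), 0.5)
W, = sgd_update([W], [dW], lr=0.1)
```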
class slugnet.optimizers.RMSProp(rho=0.9, epsilon=1e-06, *args, **kwargs)

Bases: slugnet.optimizers.Optimizer
RMSProp scales the learning rate by dividing by a moving average of the root mean squared (RMS) gradients. See [1] for a further description.
Parameters:
- rho (float) – Gradient moving average decay factor.
- epsilon (float) – Small value added for numerical stability.
rho should be between 0 and 1. A value of rho close to 1 will decay the moving average slowly, while a value close to 0 will decay it quickly. Using the step size $\eta$ and the decay factor $\rho$, the learning rate $\eta_t$ is calculated as:

$$r_t = \rho \, r_{t-1} + (1 - \rho) \, g^2$$
$$\eta_t = \frac{\eta}{\sqrt{r_t + \epsilon}}$$
[1] Tieleman, T. and Hinton, G. (2012): Neural Networks for Machine Learning, Lecture 6.5 - rmsprop. Coursera. http://www.youtube.com/watch?v=O3sxAc4hxZU (formula @5:20)
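The update above can be sketched as follows. This is an illustrative NumPy implementation of the cited formula, not slugnet's actual code: rmsprop_update is a hypothetical helper, the base step size lr (presumably passed through *args/**kwargs to the Optimizer base class) and the zero initialization of the accumulator are assumptions.

```python
import numpy as np

def rmsprop_update(params, grads, caches, lr=0.001, rho=0.9, epsilon=1e-06):
    """Illustrative RMSProp step per the formula above (not slugnet's code).

    caches holds the moving average r of squared gradients, one array
    per parameter, assumed to start at zero.
    """
    new_params, new_caches = [], []
    for p, g, r in zip(params, grads, caches):
        r = rho * r + (1.0 - rho) * g ** 2  # r_t = rho * r_{t-1} + (1 - rho) * g^2
        new_params.append(p - lr * g / np.sqrt(r + epsilon))  # step by eta / sqrt(r_t + eps)
        new_caches.append(r)
    return new_params, new_caches

# Example: one step on a single weight matrix
W = np.ones((2, 2))
dW = np.full((2, 2), 0.5)
cache = [np.zeros_like(W)]
(W,), cache = rmsprop_update([W], [dW], cache, lr=0.01)
```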