Optimization Functions

class slugnet.optimizers.SGD(lr=0.001, clip=-1, decay=0.0, lr_min=0.0, lr_max=inf)

Bases: slugnet.optimizers.Optimizer

Optimize model parameters using common stochastic gradient descent.

update(params, grads)

Update parameters.

Parameters:
params (list) – A list of parameters in the model.
grads (list) – A list of gradients in the model.
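
As a rough illustration of what an update of this kind does, the sketch below applies a plain gradient-descent step in place with NumPy. It is not slugnet's actual implementation: the function name `sgd_step` is hypothetical, treating a positive `clip` as an element-wise bound on the gradients is an assumption about that argument, and the `decay`, `lr_min`, and `lr_max` arguments are left out entirely.

```python
import numpy as np


def sgd_step(params, grads, lr=0.001, clip=-1):
    """Hypothetical helper: one in-place SGD step over parallel lists of arrays."""
    for p, g in zip(params, grads):
        if clip > 0:
            # Assumption: a positive clip bounds each gradient entry to [-clip, clip].
            g = np.clip(g, -clip, clip)
        # Plain gradient descent: move against the gradient, scaled by lr.
        p -= lr * g


params = [np.array([1.0, 2.0])]
grads = [np.array([10.0, -10.0])]
sgd_step(params, grads, lr=0.1, clip=1.0)
```

Because the arrays are modified in place, the caller's `params` list reflects the new values after the call.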

class slugnet.optimizers.RMSProp(rho=0.9, epsilon=1e-06, *args, **kwargs)

Bases: slugnet.optimizers.Optimizer

RMSProp updates.

Scales the learning rate by dividing it by a moving average of the root mean squared (RMS) gradients. See [1] for further description.

Parameters:
rho (float) – Gradient moving average decay factor.
epsilon (float) – Small value added for numerical stability.

rho should be between 0 and 1. A value of rho close to 1 decays the moving average slowly, and a value close to 0 decays it quickly. Using the step size $\eta$ and a decay factor $\rho$, the learning rate $\eta_t$ is calculated from the gradient $g$ as:

$r_t = \rho r_{t-1} + (1-\rho) g^2$

$\eta_t = \frac{\eta}{\sqrt{r_t + \epsilon}}$

[1]	Tieleman, T. and Hinton, G. (2012): Neural Networks for Machine Learning, Lecture 6.5 - rmsprop. Coursera. http://www.youtube.com/watch?v=O3sxAc4hxZU (formula @5:20)
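
The two RMSProp formulas can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not slugnet's code: the function name `rmsprop_step` and the explicit `cache` list (one running average $r$ per parameter array, initialized to zeros) are hypothetical, and the final parameter update `p -= eta_t * g` is the usual way the scaled learning rate is applied rather than something stated in this page.

```python
import numpy as np


def rmsprop_step(params, grads, cache, lr=0.001, rho=0.9, epsilon=1e-6):
    """Hypothetical helper: one in-place RMSProp step.

    cache holds the running average r for each parameter array.
    """
    for p, g, r in zip(params, grads, cache):
        # r_t = rho * r_{t-1} + (1 - rho) * g^2, updated in place.
        r *= rho
        r += (1.0 - rho) * g ** 2
        # eta_t = eta / sqrt(r_t + epsilon), computed per element.
        eta_t = lr / np.sqrt(r + epsilon)
        # Assumption: the scaled rate is applied as an ordinary descent step.
        p -= eta_t * g


params = [np.array([1.0])]
grads = [np.array([2.0])]
cache = [np.zeros(1)]
rmsprop_step(params, grads, cache)
```

With `rho` near 1, `cache` changes slowly from step to step, so a single large gradient has little effect on the effective learning rate.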

update(params, grads)

Update parameters.

Parameters:
params (list) – A list of parameters in the model.
grads (list) – A list of gradients in the model.
