
Conversation

@abhinavarora
Contributor

Fixes #4377

Yang and others added 5 commits September 26, 2017 13:34
moment1_out = beta1 * moment1 + (1 - beta1) * grad
moment2_out = beta2 * moment2 + (1 - beta2) * grad * grad
moment1_hat = moment1_out / (1 - beta1^t)
moment2_hat = moment2_out / (1 - beta2^t)
param_out = param - learning_rate * moment1_hat / (sqrt(moment2_hat) + epsilon)
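
The formulas above are the standard bias-corrected Adam update. Below is a minimal NumPy sketch of a single step following those equations; the function name `adam_step`, its signature, and the 1-based step counter `t` are illustrative assumptions, not the operator's actual interface.

```python
import numpy as np

def adam_step(param, grad, moment1, moment2, t,
              learning_rate=1e-3, beta1=0.9, beta2=0.999, epsilon=1e-8):
    """One Adam step with explicit bias correction (illustrative only)."""
    # Update biased first and second moment estimates.
    moment1_out = beta1 * moment1 + (1 - beta1) * grad
    moment2_out = beta2 * moment2 + (1 - beta2) * grad * grad
    # Correct the bias introduced by zero-initialized moments.
    moment1_hat = moment1_out / (1 - beta1 ** t)
    moment2_hat = moment2_out / (1 - beta2 ** t)
    # Apply the parameter update.
    param_out = param - learning_rate * moment1_hat / (np.sqrt(moment2_hat) + epsilon)
    return param_out, moment1_out, moment2_out
```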
@abhinavarora
Contributor Author

Adam Updates:

moment1_out = beta1 * moment1 + (1 - beta1) * grad
moment2_out = beta2 * moment2 + (1 - beta2) * grad * grad
beta1_pow_out = beta1_pow * beta1
beta2_pow_out = beta2_pow * beta2
learning_rate_t = learning_rate * sqrt(1 - beta2_pow_out) / (1 - beta1_pow_out)
param_out = param - learning_rate_t * moment1_out / (sqrt(moment2_out) + epsilon)
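
This reformulation folds both bias corrections into the step size by accumulating the powers `beta1_pow` and `beta2_pow` across steps, so no bias-corrected moment tensors need to be materialized. A hedged NumPy sketch of that form; the names and signature are again illustrative and do not reflect the operator's actual inputs, outputs, or attributes.

```python
import numpy as np

def adam_step_folded(param, grad, moment1, moment2, beta1_pow, beta2_pow,
                     learning_rate=1e-3, beta1=0.9, beta2=0.999, epsilon=1e-8):
    """One Adam step with bias correction folded into the learning rate (illustrative only)."""
    moment1_out = beta1 * moment1 + (1 - beta1) * grad
    moment2_out = beta2 * moment2 + (1 - beta2) * grad * grad
    # Accumulate beta^t across steps instead of passing the step count t.
    beta1_pow_out = beta1_pow * beta1
    beta2_pow_out = beta2_pow * beta2
    # Fold both bias corrections into the effective learning rate.
    learning_rate_t = learning_rate * np.sqrt(1 - beta2_pow_out) / (1 - beta1_pow_out)
    param_out = param - learning_rate_t * moment1_out / (np.sqrt(moment2_out) + epsilon)
    return param_out, moment1_out, moment2_out, beta1_pow_out, beta2_pow_out
```

Apart from the slightly different placement of `epsilon` in the denominator, this produces the same step as the explicitly bias-corrected form in the earlier comment, which is the efficiency trick noted in the Adam paper.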
@abhinavarora self-assigned this Oct 12, 2017

@tonyyang-svail left a comment


This looks really good. Great work!

@abhinavarora merged commit 1168003 into PaddlePaddle:develop Oct 12, 2017
@abhinavarora deleted the adam_op branch October 12, 2017 20:36