TY - GEN
T1 - Nostalgic Adam: Weighting more of the past gradients when designing the adaptive learning rate
T2 - 28th International Joint Conference on Artificial Intelligence, IJCAI 2019
AU - Huang, Haiwen
AU - Wang, Chang
AU - Dong, Bin
N1 - Funding Information:
This work would not have been possible without the support of BICMR and the School of Mathematical Sciences, Peking University. Bin Dong is supported in part by the Beijing Natural Science Foundation (Z180001).
PY - 2019
Y1 - 2019
AB - First-order optimization algorithms have proven prominent in deep learning. In particular, algorithms such as RMSProp and Adam are extremely popular. However, recent works have pointed out the lack of “long-term memory” in Adam-like algorithms, which could hamper their performance and lead to divergence. In our study, we observe that there are benefits to weighting the past gradients more heavily when designing the adaptive learning rate. We therefore propose an algorithm called Nostalgic Adam (NosAdam) with theoretically guaranteed convergence at the best known rate. NosAdam can be regarded as a fix to the non-convergence issue of Adam, as an alternative to the recent work of [Reddi et al., 2018]. Our preliminary numerical experiments show that NosAdam is a promising alternative to Adam. The proofs, code, and other supplementary materials have been released.
UR - http://www.scopus.com/inward/record.url?scp=85074931251&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85074931251&partnerID=8YFLogxK
DO - 10.24963/ijcai.2019/355
M3 - Conference contribution
AN - SCOPUS:85074931251
T3 - IJCAI International Joint Conference on Artificial Intelligence
SP - 2556
EP - 2562
BT - Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI 2019
A2 - Kraus, Sarit
PB - International Joint Conferences on Artificial Intelligence
Y2 - 10 August 2019 through 16 August 2019
ER -