
0. Reference

Adam: A Method for Stochastic Optimization (Kingma & Ba, 2014)
https://arxiv.org/abs/1412.6980

"We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, ..."

1. Introduction

1.1. First-order Optimizer VS Second-order Optimizer
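As a quick illustration of the distinction this subsection draws, here is a minimal sketch contrasting the two update rules on a toy quadratic objective. The objective, matrices, learning rate, and variable names below are hypothetical examples for illustration, not from the paper: a first-order method steps along the negative gradient, while a second-order (Newton) method rescales the gradient by the inverse Hessian.

```python
import numpy as np

# Toy objective: f(x) = 0.5 * x^T A x - b^T x (hypothetical example).
A = np.array([[3.0, 0.5],
              [0.5, 1.0]])   # symmetric positive definite
b = np.array([1.0, -2.0])

def grad(x):
    return A @ x - b          # first-order information: the gradient

def hessian(x):
    return A                  # second-order information: the Hessian (constant here)

x_first = np.zeros(2)
x_second = np.zeros(2)
lr = 0.1                      # step size for the first-order method

for _ in range(100):
    # First-order step: move along the negative gradient, scaled by a fixed lr.
    x_first -= lr * grad(x_first)
    # Second-order (Newton) step: solve H d = g and move by -d,
    # i.e. rescale the gradient by the inverse Hessian.
    x_second -= np.linalg.solve(hessian(x_second), grad(x_second))

x_star = np.linalg.solve(A, b)  # analytic minimizer, for comparison
print("first-order :", x_first)
print("second-order:", x_second)
print("true optimum:", x_star)
```

On this quadratic the Newton step lands on the optimum in a single iteration, while the first-order method needs many small steps; the trade-off is that forming and inverting the Hessian is far more expensive at the scale of deep-learning models, which is why first-order methods like Adam dominate in practice.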