0. Reference
https://arxiv.org/abs/1503.04069
LSTM: A Search Space Odyssey
"Several variants of the Long Short-Term Memory (LSTM) architecture for recurrent neural networks have been proposed since its inception in 1995. In recent years, these networks have become the state-of-the-art models for a variety of machine learning problems…"

1. Introduction
- Among RNN architectures, LSTM is an effective model for learning sequential data…
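Since the preview cuts off here, a minimal sketch of the vanilla LSTM step that the paper's search space is built around may help. All names, shapes, and initial values below are illustrative assumptions, not the paper's code.

```python
# Minimal sketch of one vanilla LSTM step in NumPy (illustrative, not the paper's code).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """x: (input_dim,), h_prev/c_prev: (hidden_dim,),
    W: (4*hidden_dim, input_dim), U: (4*hidden_dim, hidden_dim), b: (4*hidden_dim,)."""
    z = W @ x + U @ h_prev + b
    H = h_prev.shape[0]
    i = sigmoid(z[0:H])        # input gate
    f = sigmoid(z[H:2*H])      # forget gate
    o = sigmoid(z[2*H:3*H])    # output gate
    g = np.tanh(z[3*H:4*H])    # candidate cell update
    c = f * c_prev + i * g     # cell state: additive update
    h = o * np.tanh(c)         # hidden state
    return h, c

# Usage: run over a short random sequence (assumed dimensions).
rng = np.random.default_rng(0)
input_dim, hidden_dim, T = 3, 4, 5
W = rng.normal(scale=0.1, size=(4 * hidden_dim, input_dim))
U = rng.normal(scale=0.1, size=(4 * hidden_dim, hidden_dim))
b = np.zeros(4 * hidden_dim)
h, c = np.zeros(hidden_dim), np.zeros(hidden_dim)
for t in range(T):
    h, c = lstm_step(rng.normal(size=input_dim), h, c, W, U, b)
print(h)
```

The paper's variants differ mainly in which of the gates i, f, o (and the peephole connections, not shown here) are kept or removed.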
0. Reference
https://ieeexplore.ieee.org/abstract/document/6795963
Long Short-Term Memory
"Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing…"

1. Introduction
- Conventional RNN, BPTT, RTRL…
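The preview breaks off at the error-decay analysis, so a tiny numeric sketch of that decay may be useful. The 1-D linear recurrence and the constants below are illustrative assumptions, not the paper's setup.

```python
# Sketch of decaying (or exploding) error backflow in a plain RNN:
# for the recurrence h_t = w * h_{t-1}, the backpropagated gradient
# dh_T/dh_0 = w**T shrinks or blows up exponentially in T.
# LSTM's additive cell update (c = f*c_prev + i*g) gives a path where the
# gradient is carried nearly unchanged when f is close to 1.
for w in (0.9, 1.0, 1.1):
    for T in (10, 50, 100):
        print(f"w={w}, T={T:>3}: dh_T/dh_0 = {w**T:.3e}")
```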
0. Reference
https://arxiv.org/abs/2209.03032
Machine Learning Students Overfit to Overfitting
"Overfitting and generalization is an important concept in Machine Learning as only models that generalize are interesting for general applications. Yet some students have trouble learning this important concept through lectures and exercises. In this paper…"

1. Introduction
- This paper is about overfitting…
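As a quick reminder of the phenomenon the post covers, here is a small sketch of overfitting: a high-degree polynomial fit to a few noisy points drives training error toward zero while error on fresh data from the same curve grows. The data generator and degrees are illustrative assumptions.

```python
# Sketch: train vs. test error as model capacity grows (illustrative setup).
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    x = rng.uniform(-1, 1, n)
    return x, np.sin(3 * x) + rng.normal(scale=0.1, size=n)

x_tr, y_tr = make_data(10)    # small training set
x_te, y_te = make_data(100)   # fresh data from the same distribution

for degree in (1, 3, 9):
    coef = np.polyfit(x_tr, y_tr, degree)  # least-squares polynomial fit
    mse = lambda x, y: np.mean((np.polyval(coef, x) - y) ** 2)
    print(f"degree {degree}: train MSE={mse(x_tr, y_tr):.4f}, "
          f"test MSE={mse(x_te, y_te):.4f}")
```

At degree 9 the fit nearly interpolates the 10 training points, so train MSE collapses while test MSE rises: the gap is the overfitting signal.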
0. Reference
https://arxiv.org/abs/1803.08494
Group Normalization
"Batch Normalization (BN) is a milestone technique in the development of deep learning, enabling various networks to train. However, normalizing along the batch dimension introduces problems --- BN's error increases rapidly when the batch size becomes small…"

1. Introduction
- Batch Normalization is one of the most commonly used techniques in deep learning.
- However, this…
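Because the entry cuts off before describing the method, a minimal sketch of the Group Normalization computation may help: statistics are taken per sample over groups of channels, so they do not depend on the batch dimension at all. The tensor shape and group count below are illustrative assumptions; the learnable scale/shift parameters are omitted.

```python
# Sketch of Group Normalization: per-sample, per-group statistics (no batch dependence).
import numpy as np

def group_norm(x, num_groups, eps=1e-5):
    """x: (N, C, H, W); normalize each sample within C // num_groups channel groups."""
    N, C, H, W = x.shape
    x = x.reshape(N, num_groups, C // num_groups, H, W)
    mean = x.mean(axis=(2, 3, 4), keepdims=True)  # stats over one group of one sample
    var = x.var(axis=(2, 3, 4), keepdims=True)
    x = (x - mean) / np.sqrt(var + eps)
    return x.reshape(N, C, H, W)                  # gamma/beta scaling omitted here

# Works identically at batch size 1 -- exactly the regime where BatchNorm's
# batch statistics become unreliable.
x = np.random.default_rng(0).normal(size=(1, 8, 4, 4))
y = group_norm(x, num_groups=4)
print(y.mean(), y.std())  # roughly 0 and 1
```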