2016 LayerNormalization

(Ba et al., 2016) ⇒ Lei Jimmy Ba, Jamie Ryan Kiros, and Geoffrey E. Hinton. (2016). “Layer Normalization.” In: arXiv eprint. abs/1607.06450.

Subject Headings: Neural Network Layer Normalization; GPT-2

Notes

Link(s):
- ArXiv: https://arxiv.org/abs/1607.06450
- DBLP: https://dblp.org/rec/html/journals/corr/BaKH16

Cited By

Google Scholar: ~ 1,851 Citations, Retrieved: 2020-06-25.
Semantic Scholar: ~ 971 Citations, Retrieved: 2020-06-25.

Quotes

Abstract

Training state-of-the-art, deep neural networks is computationally expensive. One way to reduce the training time is to normalize the activities of the neurons. A recently introduced technique called batch normalization uses the distribution of the summed input to a neuron over a mini-batch of training cases to compute a mean and variance which are then used to normalize the summed input to that neuron on each training case. This significantly reduces the training time in feed-forward neural networks. However, the effect of batch normalization is dependent on the mini-batch size and it is not obvious how to apply it to recurrent neural networks. In this paper, we transpose batch normalization into layer normalization by computing the mean and variance used for normalization from all of the summed inputs to the neurons in a layer on a single training case. Like batch normalization, we also give each neuron its own adaptive bias and gain which are applied after the normalization but before the non-linearity. Unlike batch normalization, layer normalization performs exactly the same computation at training and test times. It is also straightforward to apply to recurrent neural networks by computing the normalization statistics separately at each time step. Layer normalization is very effective at stabilizing the hidden state dynamics in recurrent networks. Empirically, we show that layer normalization can substantially reduce the training time compared with previously published techniques.
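The computation described in the abstract can be sketched in NumPy as follows. This is a minimal illustration, not the authors' code: the function name `layer_norm`, the small constant `eps` added for numerical stability, and the example shapes are assumptions that do not appear in the abstract.

```python
import numpy as np

def layer_norm(a, gain, bias, eps=1e-5):
    """Layer-normalize summed inputs `a` of shape (cases, units).

    The mean and variance are computed over the units of a layer,
    separately for each training case (the last axis), rather than
    over a mini-batch. `eps` is an assumed stability constant.
    """
    mu = a.mean(axis=-1, keepdims=True)    # one mean per training case
    var = a.var(axis=-1, keepdims=True)    # one variance per training case
    a_hat = (a - mu) / np.sqrt(var + eps)  # normalized summed inputs
    # Per-neuron adaptive gain and bias, applied after normalization.
    return gain * a_hat + bias

rng = np.random.default_rng(0)
a = rng.normal(loc=3.0, scale=2.0, size=(4, 8))  # 4 cases, 8 hidden units
gain = np.ones(8)   # per-neuron gain (learned in practice)
bias = np.zeros(8)  # per-neuron bias (learned in practice)

# The non-linearity is applied after the gain and bias.
h = np.tanh(layer_norm(a, gain, bias))
```

Because the statistics are taken over the units of a single case, the result for a case is the same whether it is processed alone or inside a mini-batch, which is why the computation is identical at training and test time. For a recurrent network, the same function would be called on the summed inputs at each time step.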

References

BibTeX

@article{2016_LayerNormalization,
  author    = {Lei Jimmy Ba and
               Jamie Ryan Kiros and
               Geoffrey E. Hinton},
  title     = {Layer Normalization},
  journal   = {arXiv eprint},
  volume    = {abs/1607.06450},
  year      = {2016},
  url       = {http://arxiv.org/abs/1607.06450},
  archivePrefix = {arXiv},
  eprint    = {1607.06450},
}

