Simple Unidirectional Recurrent Neural Network (SRN)
A Simple Unidirectional Recurrent Neural Network (SRN) is a unidirectional recurrent neural network composed of simple (ungated) RNN units, in which each hidden state is computed from the current input and the hidden state of the previous time step.
- AKA: Elman Network.
- Example(s):
- …
- Counter-Example(s):
- a Bidirectional Recurrent Neural Network;
- a GRU Network, with GRU units;
- a LSTM Network, with LSTM units;
- a Stacked Recurrent Neural Network.
- See: RNN Hidden State, Gated Recurrent Hidden State.
References
2018
- (Jurafsky & Martin, 2018) ⇒ Daniel Jurafsky, and James H. Martin (2018). "Chapter 9 -- Sequence Processing with Recurrent Networks". In: Speech and Language Processing (3rd ed. draft). Draft of September 23, 2018.
- QUOTE: The sequential nature of simple recurrent networks can be illustrated by unrolling the network in time as is shown in Fig. 9.4. In figures such as this, the various layers of units are copied for each time step to illustrate that they will have differing values over time. However the weights themselves are shared across the various timesteps. Finally, the fact that the computation at time [math]\displaystyle{ t }[/math] requires the value of the hidden layer from time [math]\displaystyle{ t-1 }[/math] mandates an incremental inference algorithm that proceeds from the start of the sequence to the end as shown in Fig. 9.5.
Figure 9.4 A simple recurrent neural network shown unrolled in time. Network layers are copied for each timestep, while the weights U, V and W are shared in common across all timesteps.
Figure 9.5 Forward inference in a simple recurrent network.
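The incremental inference procedure summarized in Fig. 9.5 can be sketched in a few lines of Python. The sketch below follows the U, V, W weight naming of Fig. 9.4; the tanh and softmax activations and the toy dimensions are illustrative assumptions, not details taken from the source.

```python
# Minimal sketch of forward inference in a simple recurrent network,
# using the shared weights U (hidden-to-hidden), W (input-to-hidden),
# and V (hidden-to-output) named in Fig. 9.4. Activations and sizes
# are assumptions for illustration.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def srn_forward(xs, U, W, V):
    """Run inference incrementally from the start of the sequence to the end."""
    h = np.zeros(U.shape[0])          # h_0: initial hidden state
    ys = []
    for x in xs:                      # the step at time t needs h from time t-1
        h = np.tanh(U @ h + W @ x)    # new hidden state
        ys.append(softmax(V @ h))     # output distribution at time t
    return ys

# Toy usage: 4-dimensional inputs, 3 hidden units, 2 output classes.
rng = np.random.default_rng(0)
U, W, V = rng.normal(size=(3, 3)), rng.normal(size=(3, 4)), rng.normal(size=(2, 3))
outputs = srn_forward([rng.normal(size=4) for _ in range(5)], U, W, V)
```

Because the same U, W, and V are reused at every step, the loop above makes the weight sharing across timesteps explicit.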
2017
- (Sammut & Webb, 2017) ⇒ Claude Sammut, and Geoffrey I. Webb. (2017). “Simple Recurrent Network.” In: Encyclopedia of Machine Learning and Data Mining.
- QUOTE: The simple recurrent network is a specific version of the Backpropagation neural network that makes it possible to process sequential input and output (Elman, 1990). It is typically a three-layer network where a copy of the hidden layer activations is saved and used (in addition to the actual input) as input to the hidden layer in the next time step. The previous hidden layer is fully connected to the hidden layer. Because the network has no recurrent connections per se (only a copy of the activation values), the entire network (including the weights from the previous hidden layer to the hidden layer) can be trained with the backpropagation algorithm as usual. It can be trained to read a sequence of inputs into a target output pattern, to generate a sequence of outputs from a given input pattern, or to map an input sequence to an output sequence (as in predicting the next input). ...
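The "copy of the hidden layer" mechanism described above can be sketched as follows; the weight names (W_xh, W_ch, W_hy), the tanh activation, and the layer sizes are assumptions chosen for illustration, not details from the encyclopedia entry.

```python
# Sketch of the Elman-style context layer: the previous hidden activations
# are copied and fed back as extra input to the hidden layer, so each time
# step looks like an ordinary feedforward pass that standard backpropagation
# can handle. Names and sizes are illustrative assumptions.
import numpy as np

def elman_step(x, context, W_xh, W_ch, W_hy):
    # Hidden layer sees the actual input plus the copied previous activations.
    h = np.tanh(W_xh @ x + W_ch @ context)
    y = W_hy @ h                       # output layer (linear here)
    return y, h                        # h becomes the next step's context

# Toy usage: 5 inputs, 3 hidden units, 2 outputs.
rng = np.random.default_rng(1)
W_xh, W_ch, W_hy = rng.normal(size=(3, 5)), rng.normal(size=(3, 3)), rng.normal(size=(2, 3))
context = np.zeros(3)                  # empty memory at the first time step
for x in [rng.normal(size=5) for _ in range(4)]:
    y, context = elman_step(x, context, W_xh, W_ch, W_hy)
```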
2015
- (Hexahedria, 2015) ⇒ Daniel Johnson (2015). "Recurrent Neural Networks". In: Composing Music With Recurrent Neural Networks.
- QUOTE: Notice that in the basic feedforward network, there is a single direction in which the information flows: from input to output. But in a recurrent neural network, this direction constraint does not exist. There are a lot of possible networks that can be classified as recurrent, but we will focus on one of the simplest and most practical.
Basically, what we can do is take the output of each hidden layer, and feed it back to itself as an additional input. Each node of the hidden layer receives both the list of inputs from the previous layer and the list of outputs of the current layer in the last time step. (So if the input layer has 5 values, and the hidden layer has 3 nodes, each hidden node receives as input a total of 5+3=8 values.)
We can show this more clearly by unwrapping the network along the time axis:
In this representation, each horizontal line of layers is the network running at a single time step. Each hidden layer receives both input from the previous layer and input from itself one time step in the past.
The power of this is that it enables the network to have a simple version of memory, with very minimal overhead. This opens up the possibility of variable-length input and output: we can feed in inputs one-at-a-time, and let the network combine them using the state passed from each time step.
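The 5 + 3 = 8 arithmetic in the quote above corresponds to concatenating the current input with the hidden layer's own previous output. The sketch below shows that concatenation; the single weight matrix W_h, the tanh activation, and the sequence length are illustrative assumptions.

```python
# Sketch of the concatenation described above: with 5 input values and 3
# hidden nodes, each hidden node receives 5 + 3 = 8 values (the current
# inputs plus the hidden layer's outputs from the previous time step).
import numpy as np

n_in, n_hidden = 5, 3
rng = np.random.default_rng(2)
W_h = rng.normal(size=(n_hidden, n_in + n_hidden))   # each row holds 8 weights

def hidden_step(x, h_prev):
    combined = np.concatenate([x, h_prev])            # 5 + 3 = 8 values
    return np.tanh(W_h @ combined)                    # new hidden activations

# Feed a variable-length sequence in one element at a time; the state h
# carries the network's memory forward between steps.
h = np.zeros(n_hidden)
for x in [rng.normal(size=n_in) for _ in range(6)]:
    h = hidden_step(x, h)
```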
1990
- (Elman, 1990) ⇒ Jeffrey L. Elman. (1990). “Finding Structure in Time.” In: Cognitive Science, 14(2).
- QUOTE: Time underlies many interesting human behaviors. Thus, the question of how to represent time in connectionist models is very important. One approach is to represent time implicitly by its effects on processing rather than explicitly (as in a spatial representation). The current report develops a proposal along these lines first described by Jordan (1986) which involves the use of recurrent links in order to provide networks with a dynamic memory. In this approach, hidden unit patterns are fed back to themselves; the internal representations which develop thus reflect task demands in the context of prior internal states. A set of simulations is reported which range from relatively simple problems (temporal version of XOR) to discovering syntactic / semantic features for words. The networks are able to learn interesting internal representations which incorporate task demands with memory demands; indeed, in this approach the notion of memory is inextricably bound up with task processing. These representations reveal a rich structure, which allows them to be highly context-dependent, while also expressing generalizations across classes of items. These representations suggest a method for representing lexical categories and the type / token distinction.