Convolutional (CNN/CNN)-based Encoder-Decoder Neural Network

AKA: CNN Encoder-Decoder Network.
Context:
- It can be trained by a Encoder-Decoder CNN Training System that implements a Encoder-Decoder CNN Training Algorithm to solve a Encoder-Decoder CNN Training System.
- It can be instantiated in a Trained CNN-based Encoder/Decoder Network.
- …
Example(s):
Counter-Example(s):
- a Recurrent Encoder-Decoder Neural Network,
- a Transformer Neural Network.
See: Sequence-to-Sequence Learning Task, Artificial Neural Network, Bidirectional Neural Network, Convolutional Neural Network, Neural Machine Translation Task, Deep Learning, Natural Language Processing.

References

(Chollampatt & Ng, 2018) ⇒ Shamil Chollampatt, and Hwee Tou Ng. (2018). “A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction.” In: Proceedings of the Thirty-Second Conference on Artificial Intelligence (AAAI-2018).
- QUOTE: The encoder and decoder are made up of [math]\displaystyle{ L }[/math] layers each. The architecture of the network is shown in Figure 1. The source token embeddings, [math]\displaystyle{ s_1, \cdots, s_m }[/math], are linearly mapped to get input vectors of the first encoder layer, [math]\displaystyle{ h^0_1 , \cdots, h^0_m, }[/math] where [math]\displaystyle{ h^0_i \in R^h }[/math] and [math]\displaystyle{ h }[/math] is the input and output dimension of all encoder and decoder layers.

Figure 1: Architecture of our multilayer convolutional model with seven encoder and seven decoder layers (only one encoder and one decoder layer are illustrated in detail).

=== 2018b ===

(Saxena, 2018) ⇒ Rohan Saxena (April, 2018). "What is an Encoder/Decoder in Deep Learning?".
- QUOTE: In a CNN, an encoder-decoder network typically looks like this (a CNN encoder and a CNN decoder):
  Image Credits: (Badrinarayanan et al., 2017)
  
  This is a network to perform semantic segmentation of an image. The left half of the network maps raw image pixels to a rich representation of a collection of feature vectors. The right half of the network takes these features, produces an output and maps the output back into the “raw” format (in this case, image pixels). …