Convolutional (CNN/CNN)-based Encoder-Decoder Neural Network
A Convolutional (CNN/CNN)-based Encoder-Decoder Neural Network is an encoder-decoder neural network that consists of a encoder neural network and a decoder neural network in which one or both are convolutional neural networks.
- AKA: CNN Encoder-Decoder Network.
- Context:
- It can be trained by a Encoder-Decoder CNN Training System that implements a Encoder-Decoder CNN Training Algorithm to solve a Encoder-Decoder CNN Training System.
- It can be instantiated in a Trained CNN-based Encoder/Decoder Network.
- …
- Example(s):
- Counter-Example(s):
- See: Sequence-to-Sequence Learning Task, Artificial Neural Network, Bidirectional Neural Network, Convolutional Neural Network, Neural Machine Translation Task, Deep Learning, Natural Language Processing.
References
2018
- (Chollampatt & Ng, 2018) ⇒ Shamil Chollampatt, and Hwee Tou Ng. (2018). “A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction.” In: Proceedings of the Thirty-Second Conference on Artificial Intelligence (AAAI-2018).
- QUOTE: The encoder and decoder are made up of [math]\displaystyle{ L }[/math] layers each. The architecture of the network is shown in Figure 1. The source token embeddings, [math]\displaystyle{ s_1, \cdots, s_m }[/math], are linearly mapped to get input vectors of the first encoder layer, [math]\displaystyle{ h^0_1 , \cdots, h^0_m, }[/math] where [math]\displaystyle{ h^0_i \in R^h }[/math] and [math]\displaystyle{ h }[/math] is the input and output dimension of all encoder and decoder layers.
=== 2018b ===
- (Saxena, 2018) ⇒ Rohan Saxena (April, 2018). "What is an Encoder/Decoder in Deep Learning?".
- QUOTE: In a CNN, an encoder-decoder network typically looks like this (a CNN encoder and a CNN decoder):
Image Credits: (Badrinarayanan et al., 2017)
This is a network to perform semantic segmentation of an image. The left half of the network maps raw image pixels to a rich representation of a collection of feature vectors. The right half of the network takes these features, produces an output and maps the output back into the “raw” format (in this case, image pixels). …
- QUOTE: In a CNN, an encoder-decoder network typically looks like this (a CNN encoder and a CNN decoder):
2017
- (Badrinarayanan et al., 2017) ⇒ Vijay Badrinarayanan, Alex Kendall, and Roberto Cipolla. (2017). “SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation.” In: IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(12). doi:10.1109/TPAMI.2016.2644615
- QUOTE: SegNet has an encoder network and a corresponding decoder network, followed by a final pixelwise classification layer. This architecture is illustrated in Fig. 2. The encoder network consists of 13 convolutional layers which correspond to the first 13 convolutional layers in the VGG16 network (Simonyan & Zisserman, 2014) designed for object classification. ...
Each encoder layer has a corresponding decoder layer and hence the decoder network has 13 layers. The final decoder output is fed to a multi-class soft-max classifier to produce class probabilities for each pixel independently.
Figure 2. An illustration of the SegNet architecture. There are no fully connected layers and hence it is only convolutional. A decoder upsamples its input using the transferred pool indices from its encoder to produce a sparse feature map(s). It then performs convolution with a trainable filter bank to densify the feature map. The final decoder output feature maps are fed to a soft-max classifier for pixel-wise classification.
- QUOTE: SegNet has an encoder network and a corresponding decoder network, followed by a final pixelwise classification layer. This architecture is illustrated in Fig. 2. The encoder network consists of 13 convolutional layers which correspond to the first 13 convolutional layers in the VGG16 network (Simonyan & Zisserman, 2014) designed for object classification.