VGG Convolutional Neural Network

AKA: VGGNet.
Context
- It can be produced by a VGG Training System that implements a VGG Algorithm to solve a VGG Training Task.
- It can solve the classification task of the ILSVRC-2014 challenge.
- It ranges from being an 11-layer to being a 19-layer deep convolutional neural network.
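
The 11-to-19 layer range can be illustrated with a minimal sketch that counts weight layers. It assumes configurations A, B, D and E from Table 1 of (Simonyan & Zisserman, 2014); configuration C (16 layers with some 1x1 convolutions) is left out. The names VGG_CONFIGS, NUM_FC_LAYERS and weight_layer_count are illustrative and do not come from any library.

```python
# Assumed layer configurations, following Table 1 of Simonyan & Zisserman (2014).
# An integer is the number of output channels of a 3x3 convolution;
# "M" marks a 2x2 max-pooling layer (which has no weights).
VGG_CONFIGS = {
    "A": [64, "M", 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"],
    "B": [64, 64, "M", 128, 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"],
    "D": [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
          512, 512, 512, "M", 512, 512, 512, "M"],
    "E": [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
          512, 512, 512, 512, "M", 512, 512, 512, 512, "M"],
}

# Every configuration ends with three fully connected layers.
NUM_FC_LAYERS = 3


def weight_layer_count(config):
    """Count convolutional plus fully connected layers; pooling is excluded."""
    conv_layers = sum(1 for item in config if item != "M")
    return conv_layers + NUM_FC_LAYERS


depths = {name: weight_layer_count(cfg) for name, cfg in VGG_CONFIGS.items()}
print(depths)  # {'A': 11, 'B': 13, 'D': 16, 'E': 19}
```

Under these assumptions, the counts correspond to the VGG11, VGG13, VGG16 and VGG19 models listed under Example(s).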
Example(s):
- a VGG 11-layer model (VGG11) such as:
  - torchvision.models.vgg11(pretrained=False, **kwargs),
  - torchvision.models.vgg11_bn(pretrained=False, **kwargs),
- a VGG 13-layer model (VGG13) such as:
  - torchvision.models.vgg13(pretrained=False, **kwargs),
  - torchvision.models.vgg13_bn(pretrained=False, **kwargs),
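- a VGG 16-layer model (VGG16) such as:
  - torchvision.models.vgg16(pretrained=False, **kwargs),
  - torchvision.models.vgg16_bn(pretrained=False, **kwargs),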
- a VGG 19-layer model (VGG19) such as:
  - torchvision.models.vgg19(pretrained=False, **kwargs),
  - torchvision.models.vgg19_bn(pretrained=False, **kwargs),
- VGG-F [1]
Counter-Example(s):
- an AlexNet,
- a GoogLeNet,
- a ResNet,
- an HMAX,
- a LeNet,
- a NeoCognitron.
See: Convolution Function, Neural Network Layer, Neural Network Unit, Neural Network Convolutional Layer, Neural Network Pooling Layer, Neural Network Activation Function, Neural Network Weight, Artificial Neural Network, Supervised Machine Learning System, Machine Learning Classification System, Unsupervised Machine Learning System, Reinforcement Learning System, Deep Learning System.

References

(VGG, 2018) ⇒ http://www.robots.ox.ac.uk/~vgg/research/very_deep/ Retrieved:2018-07-29
- QUOTE: The very deep ConvNets were the basis of our ImageNet ILSVRC-2014 submission, where our team (VGG) secured the first and the second places in the localisation and classification tasks respectively. After the competition, we further improved our models, which has lead to the following ImageNet classification results:

(Mishra & Cheng, 2018) ⇒ Akshay Mishra, Hong Cheng (2017, 2018). Advanced CNN Architectures: http://slazebni.cs.illinois.edu/spring17/lec04_advanced_cnn.pdf Retrieved:2018-07-29

(CS231N, 2018) ⇒ https://cs231n.github.io/convolutional-networks/#case Retrieved:2018-09-30
- QUOTE: There are several architectures in the field of Convolutional Networks that have a name. The most common are:
  - (...)
  - VGGNet. The runner-up in ILSVRC 2014 was the network from Karen Simonyan and Andrew Zisserman that became known as the VGGNet. Its main contribution was in showing that the depth of the network is a critical component for good performance. Their final best network contains 16 CONV/FC layers and, appealingly, features an extremely homogeneous architecture that only performs 3x3 convolutions and 2x2 pooling from the beginning to the end. Their pretrained model is available for plug and play use in Caffe. A downside of the VGGNet is that it is more expensive to evaluate and uses a lot more memory and parameters (140M). Most of these parameters are in the first fully connected layer, and it was since found that these FC layers can be removed with no performance downgrade, significantly reducing the number of necessary parameters.
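
The parameter figure in the quote above can be checked with a short calculation. This is a sketch under stated assumptions: the 16-layer configuration D, a 224x224 RGB input, 4096-unit fully connected layers and 1000 output classes, with biases included. The function name count_parameters and its arguments are illustrative.

```python
# Assumed 16-layer configuration (D): integers are output channels of a
# 3x3 convolution, "M" is a 2x2 max-pool with stride 2.
VGG16 = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
         512, 512, 512, "M", 512, 512, 512, "M"]


def count_parameters(config, in_channels=3, image_size=224,
                     fc_width=4096, num_classes=1000):
    """Return (conv parameter total, list of per-layer FC parameter counts)."""
    conv_params = 0
    channels, size = in_channels, image_size
    for item in config:
        if item == "M":
            size //= 2  # pooling halves height and width, adds no weights
        else:
            # 3x3 kernel weights for every input/output channel pair, plus biases
            conv_params += 3 * 3 * channels * item + item
            channels = item
    fc_inputs = [channels * size * size, fc_width, fc_width]
    fc_outputs = [fc_width, fc_width, num_classes]
    fc_params = [i * o + o for i, o in zip(fc_inputs, fc_outputs)]
    return conv_params, fc_params


conv_params, fc_params = count_parameters(VGG16)
total = conv_params + sum(fc_params)
first_fc_share = fc_params[0] / total

print(total)         # 138357544
print(fc_params[0])  # 102764544
print(round(first_fc_share, 2))  # 0.74
```

Under these assumptions the total is about 138M, which is consistent with the rounded 140M in the quote, and roughly three quarters of the parameters sit in the first fully connected layer.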

(Li, Johnson & Yeung, 2017) ⇒ Fei-Fei Li, Justin Johnson, and Serena Yeung (2017). Lecture 9: CNN Architectures
- QUOTE: Small filters, Deeper networks
  8 layers (AlexNet) -> 16 - 19 layers (VGG16Net)
  Only 3×3 CONV stride 1, pad 1 and 2×2 MAX POOL stride 2
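
The effect of these layer settings on feature-map size can be worked through with the standard output-size formula. This is a minimal sketch, assuming a 224x224 input and five pooling stages; the helper names are illustrative.

```python
def conv_output_size(size, kernel=3, stride=1, padding=1):
    """Spatial size after a convolution: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size, kernel=2, stride=2):
    """Spatial size after max-pooling without padding."""
    return (size - kernel) // stride + 1


def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field of a stack of stride-1 convolutions of equal kernel size."""
    return 1 + num_layers * (kernel - 1)


# A 3x3 convolution with stride 1 and padding 1 keeps the spatial size.
assert conv_output_size(224) == 224

# Each 2x2 max-pool with stride 2 halves it; five pools take 224 down to 7.
sizes = [224]
for _ in range(5):
    sizes.append(pool_output_size(sizes[-1]))
print(sizes)  # [224, 112, 56, 28, 14, 7]

# Stacked 3x3 layers cover the same receptive field as one larger filter.
print(stacked_receptive_field(2))  # 5
print(stacked_receptive_field(3))  # 7
```

Because only the pooling layers change the resolution, depth can be increased by adding 3x3 convolutions without altering the feature-map sizes between pooling stages.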

(Vedaldi, Lenc, & Henriques, 2016) ⇒ Andrea Vedaldi, Karel Lenc, and Joao Henriques (2016). VGG CNN Practical: Image Regression
- QUOTE: This is an Oxford Visual Geometry Group computer vision practical (Release 2016a).

(Simonyan & Zisserman, 2014) ⇒ Karen Simonyan, and Andrew Zisserman (2014). "Very deep convolutional networks for large-scale image recognition". arXiv preprint arXiv:1409.1556.
- QUOTE: As can be seen from Table 7, our very deep ConvNets significantly outperform the previous generation of models, which achieved the best results in the ILSVRC-2012 and ILSVRC-2013 competitions. Our result is also competitive with respect to the classification task winner (GoogLeNet with 6.7% error) and substantially outperforms the ILSVRC-2013 winning submission Clarifai, which achieved 11.2% with outside training data and 11.7% without it. This is remarkable, considering that our best result is achieved by combining just two models – significantly less than used in most ILSVRC submissions. In terms of the single-net performance, our architecture achieves the best result (7.0% test error), outperforming a single GoogLeNet by 0.9%. Notably, we did not depart from the classical ConvNet architecture of LeCun et al. (1989), but improved it by substantially increasing the depth.