Dec 24, 2018 Original article

VGG16 architecture diagram

### Network details

Advantages of 3×3 convolutional layers

First, we incorporate three non-linear rectification layers instead of a single one, which makes the decision function more discriminative. Second, we decrease the number of parameters: assuming that both the input and the output of a three-layer 3 × 3 convolution stack have $C$ channels, the stack is parametrised by $3(3^2 C^2) = 27C^2$ weights; at the same time, a single 7 × 7 conv. layer would require $7^2 C^2 = 49C^2$ parameters, i.e. 81% more. This can be seen as imposing a regularisation on the 7 × 7 conv. filters, forcing them to have a decomposition through the 3 × 3 filters (with non-linearity injected in between).
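The parameter comparison above can be checked with a few lines of arithmetic. This is a minimal sketch: the helper names `conv_stack_weights` and `stacked_receptive_field` and the value `C = 64` are illustrative choices, biases are ignored as in the quoted formula, and the receptive-field helper assumes stride-1 convolutions.

```python
def conv_stack_weights(kernel_size: int, channels: int, num_layers: int = 1) -> int:
    """Weights (no biases) in a stack of square conv layers with C input and C output channels."""
    return num_layers * (kernel_size ** 2) * channels ** 2


def stacked_receptive_field(kernel_size: int, num_layers: int) -> int:
    """Effective receptive field of stacked stride-1 convolutions (assumed stride 1)."""
    return 1 + num_layers * (kernel_size - 1)


C = 64  # illustrative channel count, not taken from the text

stack_3x3 = conv_stack_weights(3, C, num_layers=3)  # 27 * C^2
single_7x7 = conv_stack_weights(7, C)               # 49 * C^2
extra_percent = (single_7x7 / stack_3x3 - 1) * 100  # roughly 81

print(stack_3x3, single_7x7, round(extra_percent))
print(stacked_receptive_field(3, 3))  # same spatial extent as one 7x7 layer
```

The ratio $49/27 \approx 1.81$ does not depend on $C$, which is why the "81% more" figure holds for any channel count.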

### Network training and testing

#### Training

The VGG networks are trained by optimising the multinomial logistic regression objective using mini-batch gradient descent with momentum. The batch size is set to 256 and the momentum to 0.9.
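The multinomial logistic regression objective is the softmax cross-entropy loss averaged over the mini-batch. A minimal pure-Python sketch follows; the function names are hypothetical, and a real training run would use a framework's built-in loss rather than this hand-written version.

```python
import math


def softmax_cross_entropy(logits, label):
    """Negative log-probability of the true class under a softmax over the logits."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def batch_loss(batch_logits, batch_labels):
    """Mean loss over one mini-batch (the text uses 256 examples per batch)."""
    losses = [softmax_cross_entropy(z, y) for z, y in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)


# Toy mini-batch of two examples with four classes (illustrative values only).
toy_logits = [[0.0, 0.0, 0.0, 0.0], [5.0, 0.0, 0.0, 0.0]]
toy_labels = [2, 0]
print(batch_loss(toy_logits, toy_labels))
```

With uniform logits the per-example loss is $\log K$ for $K$ classes, which is a handy sanity check at the start of training.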

The training was regularised by weight decay (the L2 penalty multiplier set to $5 \cdot 10^{-4}$) and dropout regularisation for the first two fully-connected layers (dropout ratio set to 0.5).
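As a rough illustration of how these hyperparameters enter a single update, the sketch below applies one momentum step with an L2 penalty and one dropout mask. It is not the authors' implementation: the function names and the learning rate `lr=0.1` are illustrative assumptions, and the dropout shown is the "inverted" variant (rescaling at training time), which is one common convention.

```python
import random


def sgd_momentum_step(weights, velocity, grads, lr, momentum=0.9, weight_decay=5e-4):
    """One SGD-with-momentum update; weight decay is added to the gradient as an L2 term."""
    new_velocity = [
        momentum * v - lr * (g + weight_decay * w)
        for w, v, g in zip(weights, velocity, grads)
    ]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


def dropout(activations, rng, ratio=0.5):
    """Inverted dropout: zero each unit with probability `ratio`, rescale the survivors."""
    keep = 1.0 - ratio
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Illustrative values: one weight, zero initial velocity, assumed learning rate 0.1.
w, v = sgd_momentum_step([1.0], [0.0], [0.5], lr=0.1)
dropped = dropout([1.0] * 8, random.Random(0))
print(w, v, dropped)
```

At test time the dropout function is simply skipped, since the rescaling during training already keeps the expected activation unchanged.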

It is worth noting that after the paper submission we found that it is possible to initialise the weights without pre-training by using the random initialisation procedure of Glorot & Bengio (2010, "Understanding the difficulty of training deep feedforward neural networks").
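For reference, the Glorot & Bengio scheme in its uniform form draws each weight from $U(-a, a)$ with $a = \sqrt{6 / (n_{in} + n_{out})}$. The sketch below applies it to a single conv layer; the function name is hypothetical, and counting the kernel area in both fan-in and fan-out is a common framework convention assumed here, not something stated in the text.

```python
import math
import random


def glorot_uniform_conv(kernel_size, in_channels, out_channels, rng):
    """Sample conv weights from U(-limit, limit) with the Glorot/Xavier limit."""
    # Assumed convention: fan counts include the kernel area.
    fan_in = kernel_size * kernel_size * in_channels
    fan_out = kernel_size * kernel_size * out_channels
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    count = kernel_size * kernel_size * in_channels * out_channels
    weights = [rng.uniform(-limit, limit) for _ in range(count)]
    return weights, limit


rng = random.Random(0)  # fixed seed so the sketch is reproducible
weights, limit = glorot_uniform_conv(3, 64, 64, rng)  # illustrative 3x3 layer, 64 -> 64 channels
print(len(weights), limit)
```

The limit shrinks as layers get wider, which keeps activation and gradient variances roughly constant across layers and is what makes training from scratch feasible.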

#### Testing

1. 一文读懂VGG网络 ("Understanding the VGG Network in One Article"): https://zhuanlan.zhihu.com/p/41423739
