(1)What do you think applying this filter to a grayscale image will do?
\begin{bmatrix} 0 & 1 & -1 & 0 \\ 1 & 3 & -3 & -1 \\ 1 & 3 & -3 & -1 \\ 0 & 1 & -1 & 0 \end{bmatrix}
[A] Detect 45 degree edges. [B] Detect horizontal edges. [C] Detect vertical edges. [D] Detect image contrast. Answer: C. Explanation: Run the following code and observe the result yourself.
import torch
from torch.nn import functional as F

if __name__ == '__main__':
    # An 8x8 "image": the boundary between the 1s and the 0s starts at
    # roughly 45 degrees and then becomes a vertical edge
    x = torch.tensor([[[
        [1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 0],
        [1, 1, 1, 1, 1, 1, 0, 0],
        [1, 1, 1, 1, 1, 0, 0, 0],
        [1, 1, 1, 1, 0, 0, 0, 0],
        [1, 1, 1, 1, 0, 0, 0, 0],
        [1, 1, 1, 1, 0, 0, 0, 0],
        [1, 1, 1, 1, 0, 0, 0, 0],
    ]]], dtype=torch.float)
    # The 4x4 filter from the question, shaped (out_channels, in_channels, H, W)
    kernel = torch.tensor([[[
        [0, 1, -1, 0],
        [1, 3, -3, -1],
        [1, 3, -3, -1],
        [0, 1, -1, 0],
    ]]], dtype=torch.float)
    output = F.conv2d(input=x, weight=kernel)
    print(output)
Note: you may find that this kernel also responds to 45-degree edges, but it clearly detects vertical edges much better than 45-degree ones. Further reading: how to determine the detection direction of an edge-detection operator.
(2)Suppose your input is a 300 by 300 color (RGB) image, and you are not using a convolutional network. If the first hidden layer has 100 neurons, each one fully connected to the input, how many parameters does this hidden layer have (including the bias parameters)? [A] 9,000,001 [B] 9,000,100 [C] 27,000,001 [D] 27,000,100 Answer: D. Explanation: First flatten the image: the input x has shape (300*300*3, n) and the first hidden layer has shape (100, n), so W has shape (100, 300*300*3) and b has shape (100, 1). The number of parameters in this hidden layer is therefore
300\times300\times3\times100+100=27,000,100
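As a quick sanity check, here is a minimal sketch (not part of the original solution) that counts the parameters of an equivalent fully connected layer with torch.nn.Linear:

import torch.nn as nn

# A fully connected layer from the flattened 300x300x3 image to 100 neurons
fc = nn.Linear(in_features=300 * 300 * 3, out_features=100)
# weight: 100 x 270,000 entries; bias: 100 entries
print(sum(p.numel() for p in fc.parameters()))  # 27000100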
(3)Suppose your input is a 300 by 300 color (RGB) image, and you use a convolutional layer with 100 filters that are each 5x5. How many parameters does this hidden layer have (including the bias parameters)? [A] 2501 [B] 2600 [C] 7500 [D] 7600 Answer: D. Explanation: There are 100 filters, each of dimension 5x5x3 (a filter has the same number of channels as its input), and each filter has one bias term, so the total parameter count is
100\times(5\times5\times3+1)=7600
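The same check for the convolutional case, as a minimal sketch using torch.nn.Conv2d:

import torch.nn as nn

# 100 filters over a 3-channel input, each 5x5, plus one bias per filter
conv = nn.Conv2d(in_channels=3, out_channels=100, kernel_size=5)
print(sum(p.numel() for p in conv.parameters()))  # 7600 = 100*3*5*5 + 100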
(4)You have an input volume that is 63x63x16, and convolve it with 32 filters that are each 7x7, using a stride of 2 and no padding. What is the output volume? [A] 29x29x32 [B] 29x29x16 [C] 16x16x32 [D] 16x16x16 Answer: A. Explanation: Convolving one filter with the input produces an output of dimension
\lfloor \frac{n+2p-f}{s}+1 \rfloor \times \lfloor \frac{n+2p-f}{s}+1 \rfloor \times 1, i.e. 29\times29\times1. The channel count is 1 because a filter has the same number of channels as the input image. Stacking the outputs of the 32 filters gives a total output dimension of 29\times29\times32.
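A minimal sketch verifying the output shape with F.conv2d on random data (the values themselves don't matter, only the shape):

import torch
from torch.nn import functional as F

x = torch.randn(1, 16, 63, 63)   # one 63x63x16 input volume (NCHW layout)
w = torch.randn(32, 16, 7, 7)    # 32 filters, each 7x7x16
out = F.conv2d(x, w, stride=2)   # no padding
print(out.shape)                 # torch.Size([1, 32, 29, 29]), i.e. 29x29x32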
(5)You have an input volume that is 15x15x8, and pad it using “pad=2”. What is the dimension of the resulting volume (after padding)? [A] 17x17x10 [B] 17x17x8 [C] 19x19x8 [D] 19x19x12 Answer: C. Explanation: Padding is not applied to the channel dimension, which rules out A and D; it is applied on both left/right and top/bottom, so each spatial dimension grows from 15 to 15+2*2=19, ruling out B. The answer is C.
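A minimal sketch of the padding with F.pad; pad=(2, 2, 2, 2) pads left/right/top/bottom of the last two (spatial) dimensions only:

import torch
from torch.nn import functional as F

x = torch.randn(1, 8, 15, 15)          # one 15x15x8 input volume (NCHW layout)
padded = F.pad(x, pad=(2, 2, 2, 2))    # channels are untouched
print(padded.shape)                    # torch.Size([1, 8, 19, 19]), i.e. 19x19x8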
(6)You have an input volume that is 63x63x16, and convolve it with 32 filters that are each 7x7, and stride of 1. You want to use a “same” convolution. What is the padding? [A] 1 [B] 2 [C] 3 [D] 7 Answer: C. Explanation: A “same” convolution keeps the output the same size as the input. Setting \lfloor \frac{n+2p-f}{s}+1 \rfloor = n and substituting n=63, f=7, s=1 gives p=3.
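Checking with F.conv2d that padding=3 indeed keeps the spatial size (a minimal sketch on random data):

import torch
from torch.nn import functional as F

x = torch.randn(1, 16, 63, 63)
w = torch.randn(32, 16, 7, 7)
out = F.conv2d(x, w, stride=1, padding=3)
print(out.shape)  # torch.Size([1, 32, 63, 63]) -- spatial size unchanged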
(7)You have an input volume that is 32x32x16, and apply max pooling with a stride of 2 and a filter size of 2. What is the output volume? [A] 32x32x8 [B] 15x15x16 [C] 16x16x16 [D] 16x16x8 Answer: C. Explanation: The output-size formula for convolutional layers also applies to pooling layers: \lfloor \frac{n+2p-f}{s}+1 \rfloor = \lfloor \frac{32+2\times0-2}{2}+1 \rfloor = 16. Pooling is applied to each channel independently, so the number of channels is unchanged.
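A minimal shape check with F.max_pool2d:

import torch
from torch.nn import functional as F

x = torch.randn(1, 16, 32, 32)                  # one 32x32x16 input volume
out = F.max_pool2d(x, kernel_size=2, stride=2)
print(out.shape)                                # torch.Size([1, 16, 16, 16])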
(8)Because pooling layers do not have parameters, they do not affect the backpropagation (derivatives) calculation. [A] True [B] False Answer: B. Explanation: Although a pooling layer has no parameters to update, it still affects backpropagation, since gradients must be routed back through the pooling operation. Further reading: backpropagation through pooling layers.
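A minimal sketch (not from the referenced reading) showing that max pooling routes the gradient only to the element that produced the maximum:

import torch
from torch.nn import functional as F

x = torch.tensor([[[[1., 2.],
                    [3., 4.]]]], requires_grad=True)
out = F.max_pool2d(x, kernel_size=2)  # one 2x2 window; the max is 4
out.sum().backward()
print(x.grad)
# tensor([[[[0., 0.],
#           [0., 1.]]]]) -- the gradient flows only through the max location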
(9)In lecture we talked about “parameter sharing” as a benefit of using convolutional networks. Which of the following statements about parameter sharing in ConvNets are true? (Check all that apply.) [A]It allows gradient descent to set many of the parameters to zero, thus making the connections sparse. [B]It allows a feature detector to be used in multiple locations throughout the whole input image/input volume. [C]It reduces the total number of parameters, thus reducing overfitting. [D]It allows parameters learned for one task to be shared even for a different task (transfer learning). Answer: B, C. Explanation: The lecture defines parameter sharing as: “A feature detector that’s useful in one part of the image is probably useful in another part of the image,” so B is correct. Compared with fully connected layers, convolutions use far fewer parameters (compare questions (2) and (3)), which helps reduce overfitting, so C is also correct.
(10)In lecture we talked about “sparsity of connections” as a benefit of using convolutional layers. What does this mean? [A]Regularization causes gradient descent to set many of the parameters to zero. [B]Each filter is connected to every channel in the previous layer. [C]Each activation in the next layer depends on only a small number of activations from the previous layer. [D]Each layer in a convolutional network is connected only to two other layers. Answer: C. Explanation: The lecture defines sparsity of connections as: “in each layer, each output value depends only on a small number of inputs.”