(1) If you have 10,000,000 examples, how would you split the train/dev/test set?
[A] 98% train, 1% dev, 1% test
[B] 33% train, 33% dev, 33% test
[C] 60% train, 20% dev, 20% test
Answer: A
Explanation: see video 1.1, Train/dev/test sets.

(2) The dev and test set should:
[A] Come from the same distribution.
[B] Come from different distributions.
[C] Be identical to each other (same (x,y) pairs).
[D] Have the same number of examples.
Answer: A

(3) If your Neural Network model seems to have high variance, which of the following would be promising things to try?
[A] Add regularization.
[B] Make the Neural Network deeper.
[C] Get more test data.
[D] Get more training data.
Answer: A, D
Explanation: B is a way to reduce high bias; C affects neither variance nor bias.

(4) You are working on an automated check-out kiosk for a supermarket, and are building a classifier for apples, bananas and oranges. Suppose your classifier obtains a training set error of 0.5% and a dev set error of 7%. Which of the following are promising things to try to improve your classifier? (Check all that apply)
[A] Increase the regularization parameter lambda.
[B] Decrease the regularization parameter lambda.
[C] Get more training data.
[D] Use a bigger neural network.
Answer: A, C
Explanation: the gap between the training error (0.5%) and the dev error (7%) indicates high variance.

(5) What is weight decay?
[A] A technique to avoid vanishing gradient by imposing a ceiling on the values of the weights.
[B] A regularization technique (such as L2 regularization) that results in gradient descent shrinking the weights on every iteration.
[C] The process of gradually decreasing the learning rate during training.
[D] Gradual corruption of the weights in the neural network if it is trained on noisy data.
Answer: B

(6) What happens when you increase the regularization hyperparameter lambda?
[A] Weights are pushed toward becoming smaller (closer to 0).
[B] Weights are pushed toward becoming bigger (further from 0).
[C] Doubling lambda should roughly result in doubling the weights.
[D] Gradient descent taking bigger steps with each iteration (proportional to lambda).
Answer: A
Explanation: increasing $\lambda$ makes the regularization term $\frac{\lambda}{2m}\|\omega\|_{2}^{2}$ of the cost function $J(\omega, b)=\frac{1}{m}\sum_{i=1}^{m}\mathcal{L}\left(\hat{y}^{(i)}, y^{(i)}\right)+\frac{\lambda}{2m}\|\omega\|_{2}^{2}$ larger. Since gradient descent drives $J(\omega, b)$ to be as small as possible, the weights $\omega$ shrink over the course of training, as illustrated by the sketch below.
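To make the weight-decay effect in (5) and (6) concrete, here is a minimal NumPy sketch (not part of the original quiz; the function name and toy numbers are illustrative only). The L2 term $\frac{\lambda}{2m}\|\omega\|_{2}^{2}$ adds $\frac{\lambda}{m}\omega$ to the gradient, so each update effectively multiplies the weights by roughly $(1-\frac{\alpha\lambda}{m})$, shrinking them more when $\lambda$ is larger:

```python
import numpy as np

def l2_regularized_update(w, dw_data, lam, m, alpha):
    """One gradient-descent step with L2 regularization (illustrative helper).

    w        -- weight matrix
    dw_data  -- gradient of the unregularized loss w.r.t. w
    lam      -- regularization hyperparameter lambda
    m        -- number of training examples
    alpha    -- learning rate
    """
    dw = dw_data + (lam / m) * w   # the (lambda / (2m)) * ||w||^2 term contributes (lambda / m) * w
    return w - alpha * dw          # equivalently: (1 - alpha * lam / m) * w - alpha * dw_data

# Toy check: with a zero data gradient, a larger lambda shrinks the weights faster.
w = np.ones((3, 3))
for lam in (0.0, 10.0):
    w_new = l2_regularized_update(w, np.zeros_like(w), lam, m=100, alpha=0.1)
    print(lam, np.linalg.norm(w_new))  # norm stays 3.0 for lam=0, drops to 2.97 for lam=10
```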
(7) With the inverted dropout technique, at test time:
[A] You apply dropout (randomly eliminating units) and do not keep the $\frac{1}{keep\_prob}$ factor in the calculations used in training.
[B] You do not apply dropout (do not randomly eliminate units) and do not keep the $\frac{1}{keep\_prob}$ factor in the calculations used in training.
[C] You do not apply dropout (do not randomly eliminate units), but keep the $\frac{1}{keep\_prob}$ factor in the calculations used in training.
[D] You apply dropout (randomly eliminating units), but keep the $\frac{1}{keep\_prob}$ factor in the calculations used in training.
Answer: B
Key phrase: "at test time".
Explanation: at test time all neurons are used; dropping units during prediction would make the results unstable. During training the activations were already divided by keep_prob so that their expected value is unchanged, so no further scaling is needed at test time (see the sketch after this question).
(8) Increasing the parameter keep_prob from 0.5 to 0.6 will likely cause the following: (Check the two that apply)
[A] Increasing the regularization effect.
[B] Reducing the regularization effect.
[C] Causing the neural network to end up with a higher training set error.
[D] Causing the neural network to end up with a lower training set error.
Answer: B, D
Explanation: raising keep_prob from 0.5 to 0.6 reduces the number of units that are dropped, so the regularization effect is weaker and the network fits the training set better.

(9) Which of these techniques are useful for reducing variance (reducing overfitting)? (Check all that apply.)
[A] Xavier initialization
[B] Gradient Checking
[C] Exploding gradient
[D] Vanishing gradient
[E] Dropout
[F] L2 regularization
[G] Data augmentation
Answer: E, F, G

(10) Why do we normalize the inputs x?
[A] Normalization is another word for regularization; it helps to reduce variance.
[B] It makes it easier to visualize the data.
[C] It makes the parameter initialization faster.
[D] It makes the cost function faster to optimize.
Answer: D
Explanation: normalizing the inputs gives every feature a similar scale, which makes the cost surface more symmetric and lets gradient descent converge faster, as shown in the sketch below.
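As a minimal sketch of the input normalization in (10) (not part of the quiz; the function name, shapes, and toy data are illustrative, and the (n_x, m) examples-as-columns convention is assumed), we subtract the training-set mean and divide by the training-set standard deviation, reusing the same statistics for the test set:

```python
import numpy as np

def normalize_inputs(X_train, X_test, eps=1e-8):
    """Zero-center and scale features using statistics of the training set only.

    X_train, X_test -- arrays of shape (n_x, m): features in rows, examples in columns.
    The same mu and sigma computed on the training set must be reused on the test set.
    """
    mu = X_train.mean(axis=1, keepdims=True)
    sigma = X_train.std(axis=1, keepdims=True) + eps
    return (X_train - mu) / sigma, (X_test - mu) / sigma

# Toy data with features on very different scales:
X_train = np.random.randn(3, 1000) * np.array([[1.0], [50.0], [0.01]]) + 5.0
X_train_norm, _ = normalize_inputs(X_train, X_train)
print(X_train_norm.mean(axis=1))  # ~0 for every feature
print(X_train_norm.std(axis=1))   # ~1 for every feature
```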