If you have 10,000,000 examples, how would you split the train/dev/test set?
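As context for this question: with millions of examples, the dev and test sets only need to be large enough to give a reliable estimate of performance, so the split skews heavily toward training (e.g. roughly 98/1/1 rather than the classic 60/20/20). A minimal sketch of such a split, assuming a hypothetical helper `train_dev_test_split` with illustrative fraction defaults:

```python
import numpy as np

def train_dev_test_split(X, y, dev_frac=0.01, test_frac=0.01, seed=0):
    """Shuffle, then carve off small dev/test sets; with ~10M examples
    a ~98/1/1 split is a common choice (dev_frac and test_frac are
    illustrative, not prescribed by the quiz)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_dev = int(len(X) * dev_frac)
    n_test = int(len(X) * test_frac)
    dev_idx = idx[:n_dev]
    test_idx = idx[n_dev:n_dev + n_test]
    train_idx = idx[n_dev + n_test:]
    return ((X[train_idx], y[train_idx]),
            (X[dev_idx], y[dev_idx]),
            (X[test_idx], y[test_idx]))

# Small toy dataset standing in for the 10,000,000 examples.
X = np.arange(10_000).reshape(1_000, 10)
y = np.arange(1_000)
train, dev, test = train_dev_test_split(X, y)
```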
The dev and test sets should:
If your Neural Network model seems to have high variance, which of the following would be promising things to try?
You are working on an automated check-out kiosk for a supermarket, and are building a classifier for apples, bananas and oranges. Suppose your classifier obtains a training set error of 0.5%, and a dev set error of 7%. Which of the following are promising things to try to improve your classifier?
What is weight decay?
What happens when you increase the regularization hyperparameter lambda?
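For reference on the two questions above: weight decay is L2 regularization, where the penalty term (lambda / 2m) * sum of ||W||^2 is added to the cost, and its gradient adds a (lambda / m) * W term that shrinks the weights each update. A minimal sketch (function names are illustrative, not from the course code):

```python
import numpy as np

def l2_regularized_cost(cross_entropy_cost, weights, lambd, m):
    """Cost plus the L2 penalty (lambda / (2m)) * sum of squared weights."""
    l2 = (lambd / (2 * m)) * sum(np.sum(np.square(W)) for W in weights)
    return cross_entropy_cost + l2

def weight_decay_update(W, dW, lambd, m, lr):
    """Gradient step with the extra (lambda / m) * W term from the L2
    penalty; the weights are multiplied by (1 - lr * lambda / m) each
    step, i.e. they 'decay' toward zero as lambda grows."""
    return W - lr * (dW + (lambd / m) * W)
```

Increasing lambda makes the decay factor stronger, pushing weights closer to zero and the model toward higher bias / lower variance.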
With the inverted dropout technique, at test time:
Increasing the parameter keep_prob from (say) 0.5 to 0.6 will likely cause the following:
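For reference on the two dropout questions above: inverted dropout zeroes each unit with probability 1 - keep_prob during training and divides the survivors by keep_prob, so the expected activation is unchanged and no scaling is needed at test time. A minimal sketch (the function name is illustrative):

```python
import numpy as np

def inverted_dropout_forward(A, keep_prob, rng):
    """Training-time forward pass for one layer's activations A.
    Each unit is kept with probability keep_prob; survivors are scaled
    by 1 / keep_prob so E[output] == A. At test time dropout is simply
    not applied (no mask, no scaling)."""
    D = (rng.random(A.shape) < keep_prob).astype(A.dtype)
    return (A * D) / keep_prob, D

rng = np.random.default_rng(0)
A = np.ones((1000, 100))
A_drop, mask = inverted_dropout_forward(A, 0.8, rng)
```

Raising keep_prob from 0.5 to 0.6 drops fewer units, i.e. weakens the regularization effect.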
Which of these techniques are useful for reducing variance (reducing overfitting)?
Why do we normalize the inputs x?
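As background for this question: normalizing subtracts the per-feature mean and divides by the per-feature standard deviation, putting all features on a similar scale so the cost surface is less elongated and gradient descent converges faster. A minimal sketch (the function name and the small epsilon guard are illustrative):

```python
import numpy as np

def normalize_inputs(X):
    """Zero-center each feature and scale to unit variance.
    Returns mu and sigma so the same transform can be reused
    on the dev and test sets."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0) + 1e-8  # epsilon guards against constant features
    return (X - mu) / sigma, mu, sigma

X = np.arange(12.0).reshape(4, 3)
X_norm, mu, sigma = normalize_inputs(X)
```

Note that mu and sigma computed on the training set should be reused to normalize the dev and test sets, rather than recomputed on each set.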