
When we eliminated the validation sets from the training samples, we opted to use a fixed number of epochs to train. An epoch is one step of the training in which all the available training samples are shown to the NN. This can be done incrementally, one sample at a time (MATLAB: adapt() method), or in one bulk as batch learning (MATLAB: train() method). The advantage of batch learning is efficiency; the advantage of incremental learning is that it is proven that, in the limit, a NN trained this way can approximate any nonlinear function, while this is not guaranteed in the batch case. Nevertheless, we use batch learning.
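
To make the two training modes concrete, here is a minimal MATLAB sketch, assuming the Neural Network Toolbox (the variables X and T stand for input and target matrices and are illustrative):

    % Batch learning: train() presents all samples per epoch, for a fixed number of epochs.
    net = feedforwardnet(2);           % 2 hidden neurons, as in our experiments
    net.divideFcn = 'dividetrain';     % no validation set; all samples are used for training
    net.trainParam.epochs = 2;         % fixed number of epochs
    net = train(net, X, T);

    % Incremental learning: adapt() updates the weights sample by sample.
    % Data in sequence (cell array) format triggers per-sample updates; depending on the
    % toolbox version, per-weight learning functions/rates may need to be set first.
    net2 = feedforwardnet(2);
    net2 = configure(net2, X, T);      % set input/output sizes before adapting
    [net2, Y, E] = adapt(net2, con2seq(X), con2seq(T));   % one pass = one epoch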

We repeat two illustrations here from a previous post.

The first image shows the 10 cases of the NN surface after 1 epoch of training. The first little stamp
image shows the unprocessed output that you have already seen in the last post: the average
next-day %gains for the days Mon…Fri.
In the second stamp image, the same can be seen, but the mean of the period is subtracted from the
columns.
We expect the NN to learn an inverted bell shape. However, note that in the 1-epoch case,
only 4 out of the 10 NNs could learn the inverted bell shape. Training the NN for only 1 epoch is clearly
not enough. Unfortunately (as I realize now), this happened many times in our former experiments.
Imagine that the validation set (20% of the samples) was used for termination, and an outlier was
randomly put into the validation set. We start the learning epochs. The training-set RMSE
improves gradually, but the validation RMSE increases at every step. The default MATLAB behavior
is that after the validation error has increased for 6 consecutive epochs, it rolls back those 6 epochs
of learning and returns the NN that existed 6 epochs before. If the validation RMSE increases from
the very start, from the 1st epoch to the 7th epoch, MATLAB rolls back to the 1st epoch and finalizes
that as the output net of the training. So, in many cases in our past, the final NN was equivalent to a
NN trained for 1 epoch only.
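
For reference, this rollback behavior corresponds to the toolbox's validation-failure setting; a minimal sketch of such a setup (the ratios mirror the 20% validation split mentioned above; X and T are illustrative):

    net = feedforwardnet(2);
    net.divideFcn = 'dividerand';        % random split of the samples
    net.divideParam.trainRatio = 0.8;    % 80% of the samples for training
    net.divideParam.valRatio   = 0.2;    % 20% for validation (early stopping)
    net.divideParam.testRatio  = 0.0;
    net.trainParam.max_fail = 6;         % roll back after 6 consecutive validation failures (default)
    [net, tr] = train(net, X, T);
    disp(tr.best_epoch);                 % the epoch the training rolled back to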

The second image is the same, but after 2 epochs of training. Note that 9 out of the 10 NNs could learn
the inverted bell shape.

We don’t know in advance what the optimal number of epochs is. Usually, the optimal nEpoch should
increase as the input dimension or the complexity of the target function increases:
it is more difficult to search a 20-dimensional weight space than a 2-dimensional one. In our case we
have 2 neurons, so the state space is 2-dimensional. Articles on 20-dimensional weight spaces report
that 1500 epochs were needed. We note that ultimately, a better way is to stop training at a specified
error threshold, but that also depends on the task we try to solve.
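
In MATLAB such a threshold stop can be expressed through the performance goal; a short sketch (the cap of 1500 epochs and the 1e-3 goal are illustrative values, not recommendations):

    net = feedforwardnet(2);
    net.divideFcn = 'dividetrain';     % no validation set, as in our experiments
    net.trainParam.epochs = 1500;      % upper bound on the number of epochs
    net.trainParam.goal   = 1e-3;      % stop as soon as the training MSE falls below this
    net = train(net, X, T);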

In pseudocode, epoch-based backpropagation training looks like this:

Until all training examples produce correct output or the MSE ceases to decrease, do:
  Begin Epoch
    For each training example:
      – Compute the network output
      – Compute the error
      – Backpropagate this error from layer to layer and adjust the weights to decrease this error
  End Epoch
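
As a sketch, the same loop written with the toolbox's incremental adapt() call (maxEpochs and the convergence test are illustrative; per-weight learning rates may need tuning):

    net = feedforwardnet(2);
    net.divideFcn = 'dividetrain';             % all samples used for training
    net = configure(net, X, T);                % set input/output sizes before adapting
    Xseq = con2seq(X);  Tseq = con2seq(T);     % sequence format -> per-sample updates
    prevMse = Inf;
    for epoch = 1:maxEpochs                    % Begin Epoch
        [net, Y, E] = adapt(net, Xseq, Tseq);  % one incremental pass over all samples
        err = cell2mat(E);                     % collect per-sample errors
        curMse = mean(err(:).^2);              % MSE of this epoch
        if curMse >= prevMse, break; end       % MSE ceased to decrease
        prevMse = curMse;                      % End Epoch
    end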
