Chapter 9: Introduction to Neural Nets
Neural networks are a series of algorithms that make predictions of highly complex
systems by attempting to identify the underlying relationships in the data. Unlike
regression, neural networks do not need any specification of the underlying
relationships. Rather, the algorithms take a portion of the data (training data set) to
develop the relationships between the dependent variable and the set of
independent variables. This is referred to as the training or learning process. Based
on the learned patterns and relationships, the network makes predictions and tests
those predictions on the remaining data (testing data set). The process is actually
iterative: training, testing, more training, more testing, and so on until the net is
fine-tuned. Each set of parameter values used in training & testing is called a trial, and
hundreds of thousands of trials are investigated before the neural net is finished.
Neural networks are extremely flexible in their learning. For example, neural
networks are used to improve mail-order marketing campaigns by looking for
patterns of those customers who purchase versus those that do not, then making
predictions as to who should receive the catalogues. If a pattern or relationship
exists, a neural network will often figure it out.
The downside of a neural network is that, unlike regression, the network does not
give us an explicit equation relating the dependent and independent variables,
such as Sales = 5*(Advertising) - 2*(Price). Neural networks are more like the
proverbial black box, where we throw in all of our data and out pops the answer.
We have no idea how the answer was produced. We can determine, however, how
good the answer is. We do this by looking at how well the net predicts values
outside the training data.
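As a rough illustration of judging a model by its predictions outside the training data, the sketch below scores a predictor on held-out cases using root-mean-square error. The function names, the choice of RMSE as the error measure, and the three test cases are our own assumptions for illustration; they are not taken from NeuralTools. The explicit equation from the text stands in for the model, but any black-box predictor could be passed in the same way.

```python
import math

def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def holdout_error(predict, test_inputs, test_targets):
    """Score any predictor, black box or not, on cases it was not trained on."""
    predictions = [predict(case) for case in test_inputs]
    return rmse(test_targets, predictions)

def sales_model(case):
    """Stand-in predictor: the explicit equation Sales = 5*(Advertising) - 2*(Price)."""
    advertising, price = case
    return 5 * advertising - 2 * price

# Hypothetical held-out cases (made-up numbers, for illustration only).
test_inputs = [(10, 5), (20, 10), (30, 5)]
test_targets = [41, 78, 143]

error = holdout_error(sales_model, test_inputs, test_targets)
print(f"Hold-out RMSE: {error:.3f}")
```

The smaller the hold-out error, the more we trust the model, even when we cannot see how it arrives at its answers.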
Because a neural net’s inner workings are unknown, model validation techniques
are important in letting us know if our model is useful. Because of this, model
validation is usually automated. We use Palisade’s NeuralTools, which is an Excel
add-in, to train & test the models to determine the best neural net. As we saw in
Chapter 5, "best" needs to be determined by considering not only how well the
model fits the estimation data (training data), but also how well it fits the
validation data (testing data). NeuralTools uses an iterative procedure, cycling
between the training data and testing data, to find the best-fitting net. NeuralTools
randomly selects 80% of the observations to use as the training data and uses the
remaining 20% as the testing data.
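A random 80/20 split of this kind can be sketched in a few lines of Python. This is only an illustration of the idea, not NeuralTools' own procedure: the function name, the seed parameter, and the 100 observation IDs are hypothetical.

```python
import random

def train_test_split(cases, test_fraction=0.20, seed=None):
    """Randomly divide cases into (training, testing) lists."""
    rng = random.Random(seed)      # a seed makes the split reproducible
    shuffled = list(cases)         # copy so the caller's data is untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# 100 hypothetical observation IDs.
cases = list(range(1, 101))
train, test = train_test_split(cases, test_fraction=0.20, seed=1)
print(len(train), "training cases;", len(test), "testing cases")
```

Every observation lands in exactly one of the two sets, so the testing data is never seen during training.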
After training, NeuralTools will test the sensitivity of the net to determine whether
good or bad testing results were obtained simply due to good or bad luck. We mentioned
in Chapter 5 that, as the validation data set is relatively small (10 to 20% of the data
set), it could be simple chance that we see a good or a bad fit. Specifically,
NeuralTools will train & test the neural nets on validation data sets of sizes 10%,
20%, and 30% to determine if the results are stable under different selections of
testing cases. If they are, then we have more confidence in our model.
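The idea behind this stability check can be sketched as follows. To keep the example short and runnable, a straight-line least-squares fit stands in for the neural net, and the data are synthetic; both are our assumptions, as are the function names. The point is only the pattern: refit with 10%, 20%, and 30% of the cases held out and compare the testing errors.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept (stand-in for training a net)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def holdout_rmse(data, test_fraction, seed):
    """Hold out a random fraction of the cases, fit on the rest, return testing RMSE."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    test, train = shuffled[:n_test], shuffled[n_test:]
    slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
    squared = [(y - (slope * x + intercept)) ** 2 for x, y in test]
    return math.sqrt(sum(squared) / len(squared))

# Synthetic data (an assumption for illustration): a line plus bounded noise.
noise = random.Random(0)
data = [(x, 2 * x + 1 + noise.uniform(-1, 1)) for x in range(50)]

errors = {f: holdout_rmse(data, f, seed=42) for f in (0.10, 0.20, 0.30)}
spread = max(errors.values()) - min(errors.values())
for fraction, err in errors.items():
    print(f"{fraction:.0%} held out: testing RMSE = {err:.3f}")
print(f"Spread across the three splits: {spread:.3f}")
```

If the three testing errors are close to one another, the fit is unlikely to be an accident of which cases happened to be held out.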
Example 9.1
Let's use neural nets to predict the number of subscribers to mobile devices in the
USA and compare the neural net results to those of the Pearl and Gompertz
S-Curves from Chapter 8. Whereas with the Pearl and Gompertz models we
assumed the fitted curve should be S-shaped, neural nets require no such
assumption. Will the neural net produce an S-shaped curve all on its own?
Step 1: Data Set Management in NeuralTools
1.1 Open the file USA Mobile Cellular.xlsx and open the program
NeuralTools.
1.2 As with StatTools, the first step is to identify the data. Highlight a data
cell, for example B6, and click on the Data Set Manager button in
NeuralTools’ menu ribbon.
1.3 Figure 9.1 shows the popup dialog box, where we named the data US
Mobile Subs. Each variable is given a type, with Numeric meaning that
the variable is a measurement and Category meaning that the variable is
qualitative, specifying categories such as color or gender. Using the
pull-down menu as shown in Figure 9.1, choose Independent Numeric
for Year and Dependent Numeric for Subscriptions. Click OK.
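For readers who think in code, the variable typing done in Step 1 amounts to attaching a role (independent or dependent) and a kind (numeric or category) to each column. The sketch below is our own hypothetical representation of that information; the class and field names are not part of NeuralTools.

```python
from dataclasses import dataclass
from enum import Enum

class Role(Enum):
    INDEPENDENT = "Independent"
    DEPENDENT = "Dependent"

class Kind(Enum):
    NUMERIC = "Numeric"      # the variable is a measurement
    CATEGORY = "Category"    # the variable is qualitative (color, gender, ...)

@dataclass(frozen=True)
class Variable:
    name: str
    role: Role
    kind: Kind

    @property
    def label(self):
        """Type label in the style of the pull-down menu, e.g. 'Independent Numeric'."""
        return f"{self.role.value} {self.kind.value}"

# The data set from Step 1, as named in the text.
us_mobile_subs = [
    Variable("Year", Role.INDEPENDENT, Kind.NUMERIC),
    Variable("Subscriptions", Role.DEPENDENT, Kind.NUMERIC),
]

dependents = [v.name for v in us_mobile_subs if v.role is Role.DEPENDENT]
for v in us_mobile_subs:
    print(f"{v.name}: {v.label}")
```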
With the dependent variable and independent variables identified, we now train &
test the neural nets. Be warned that this step can take hours or even
days to complete. The iterative training & testing continues until the
user-specified time limit is reached, a specified number of trials has been run, or
the accuracy is within a specified tolerance. Moreover, to find the
best-fitting net, six different network structures are trained & tested, increasing the
completion time of this step.
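The three stopping conditions can be pictured as a simple loop. The sketch below is a generic illustration under our own assumptions, not NeuralTools' internal logic: the function and parameter names are hypothetical, and the "trial" is a stand-in whose error shrinks as 1/k rather than a real training run.

```python
import time

def run_trials(run_trial, max_seconds, max_trials, tolerance):
    """Run trials until the time limit, the trial limit, or the error tolerance is hit."""
    start = time.monotonic()
    best_error = float("inf")
    trials = 0
    while True:
        if time.monotonic() - start >= max_seconds:
            reason = "time limit"
            break
        if trials >= max_trials:
            reason = "trial limit"
            break
        if best_error <= tolerance:
            reason = "tolerance"
            break
        trials += 1
        best_error = min(best_error, run_trial(trials))
    return best_error, trials, reason

def fake_trial(k):
    """Stand-in for one train & test trial: the error shrinks as 1/k."""
    return 1.0 / k

best, n, why = run_trials(fake_trial, max_seconds=60, max_trials=1000, tolerance=0.01)
print(f"Stopped after {n} trials ({why}); best error = {best}")
```

Whichever condition is met first ends the run, which is why a generous time limit alone does not guarantee a long run, and a tight tolerance alone does not guarantee a short one.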
Selecting the configuration of the networks to be trained & tested usually requires
some knowledge of the underlying mathematics of neural nets. NeuralTools,
however, has an automated feature that finds the best-fitting net by training &
testing a variety of configurations based on the structure of the data. We show
how to choose the best net in Step 2.
Figure 9.2: The NeuralTools Training dialog box
Step 2: Training & Testing Neural Nets
2.1 Click on the Train button in NeuralTools’ ribbon and the NeuralTools —
Training dialog box shown in Figure 9.2 pops up.
2.2 In the Train tab, make sure that US Mobile Subs is the selected data set.
2.3 Under When Training is Completed, click Automatically Test on
Randomly Selected Cases. As shown in Figure 9.2, we are using 20,
meaning that a random sample of 20% of the total data set will be
selected and used to test the accuracy of the net being trained. As there
are 29 data points in the data, 23 values will be used for training and 6
values for testing/validation.
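The arithmetic behind the 23/6 split can be checked quickly. We assume here that the number of testing cases is rounded to the nearest whole number; that assumption reproduces the figures above, but the exact rounding rule NeuralTools uses is not stated in the text. The function name is our own.

```python
def split_counts(n_cases, test_percent):
    """Return (training count, testing count), rounding the testing count to the nearest case."""
    n_test = round(n_cases * test_percent / 100)
    return n_cases - n_test, n_test

n_train, n_test = split_counts(29, 20)   # 20% of 29 is 5.8, which rounds to 6
print(f"{n_train} training cases, {n_test} testing cases")
```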