Tutorial: Predicting with SAS Enterprise Miner
Matthew Beauregard
March 16, 2005
1 Important note

SAS is a very large, very complex piece of software. Enterprise Miner, while making up only a small part of the SAS system, is itself very large and complex. Until you understand what you are doing, follow these instructions carefully and in order. It is very possible to make a mistake from which you cannot recover, except by building your system all over again.

Note to FIT Linux lab users: apart from not following the instructions, the two most common reasons for strange errors in SAS are that you are out of free space on the network storage, or that you cancelled the VMware login box after starting Windows. To check your free space, ssh to charlie and type quota -v. Also, a helpdesk technician from Technical Services alleges that the Desktop is temporary local storage, not network storage, so you might try using that as working space.
2 Startup
1. Download the data files from http://www-staff.it.uts.edu.au/~mbeaureg/topics/prediction_in_sas/data
and extract the contents.
2. Run The SAS System and choose Solutions → Analysis → Enterprise Miner.
3. Choose File → New → Project, name it NN and click Create.
4. Right-click the empty pane on the right, choose Add Node and add an Input Data Source, a Data
Partition, a Replacement and a Neural Network. Arrange these left to right.

5. Hover your mouse cursor over the right edge of the Input Data Source until it becomes crosshairs. Drag a connecting arrow to the left edge of the Data Partition. Repeat to connect the other nodes in a line.

6. In the Explorer window (left) double-click Libraries, then right-click an empty area. Choose New.
7. Enter TUTORIAL as the name and click Browse. Navigate into the folder you extracted from the zip file
and click OK, then OK again.
8. Double-click Input Data Source, then click Select. Choose the TUTORIAL library and ORGANICS data.
Click OK.
3 Walkthrough: finding organics buyers
The ORGANICS dataset contains information about customers of a supermarket. The target variable is ORGYN,
a boolean that says whether or not the customer is interested in purchasing organic food products.
1. If the Input Data Source window is not open, double-click that node.
2. Choose the Variables tab and right-click input beside ORGYN. Choose Set Model Role → target.
Close the window. Save changes.
3. Double-click the Data Partition node. Set the percentages to 60% train, 20% validation and 20% test.
Close the window. Save changes.
4. Double-click the Neural Network node. Choose the Basic tab. Set the Runtime limit to 10 minutes.
5. Click the triangle beside Multilayer Perceptron. Choose the hidden neurons preset for Moderate noise
data. Choose OK, close the window, save changes. Call your model NN.
6. Ensure that the Neural Network node is selected (dotted outline) and choose Actions → Run.

Enterprise Miner will traverse all the preprocessing nodes before displaying a training/validation performance graph. Training will cease after about 15 iterations because the model becomes perfect. Once calculation is complete, view the results.
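
The 60% / 20% / 20% split set in step 3 can be pictured as a random shuffle followed by slicing. The following is a minimal Python sketch of that idea only; it is not what Enterprise Miner runs internally, and the function name, the integer-percentage parameters and the seed value are all hypothetical.

```python
import random

def partition(rows, train_pct=60, validation_pct=20, seed=12345):
    """Shuffle rows, then slice into train/validation/test sets.

    Hypothetical helper for illustration: the test set takes whatever
    remains after the train and validation slices.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n_train = len(shuffled) * train_pct // 100
    n_valid = len(shuffled) * validation_pct // 100
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test

# Example with 1000 stand-in row identifiers.
train_rows, valid_rows, test_rows = partition(range(1000))
print(len(train_rows), len(valid_rows), len(test_rows))
```

The model is fitted on the training rows only. The validation rows are what the second curve on the performance graph is measured on, and the test rows are held back for a final check.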

3.1 Questions
1. From the Tables tab, what is the misclassification rate on validation data?
2. Examine the training graph on the Plot tab. Is there any difference between performance on the
training and validation data sets?
3. What features in the data might lead to this perfect performance?
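
For reference, the misclassification rate asked about in question 1 is the fraction of cases whose predicted class differs from the actual class. A minimal Python sketch of that calculation follows; the function name and the toy labels are made up for illustration and do not come from the ORGANICS data.

```python
def misclassification_rate(actual, predicted):
    """Return the fraction of cases where the prediction differs from the truth."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one case")
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)

# Toy example: five cases, one of them predicted incorrectly.
actual = [1, 0, 0, 1, 0]
predicted = [1, 0, 1, 1, 0]
rate = misclassification_rate(actual, predicted)
print(rate)  # 0.2
```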
4 Walkthrough: organics buyers again
1. Close the Results window if it is open, and open the Input Data Source node (which may now be called
TUTORIAL.ORGANICS). In the Variables tab, set the Model Role for ORGANICS to rejected. Close the
window, save changes.
2. Run the Neural Network node again. When the Neural Network Monitor appears, click Continue.
You may stop training after about 12 iterations.
3. View the results.
4.1 Questions
1. Describe the training performance plot. What do you think would have happened if we allowed training
to continue?
2. Is this a useful predictive model?
5 Walkthrough: multiple models

1. Add two Tree nodes, connected from your Data Partition to a Control Point.
2. Connect your neural network to the Control Point.
3. Add an Assessment node connected from the Control Point.
4. Configure one of the Trees to have a maximum of 4 branches rather than 2. Call this tree 4way Tree
when prompted. Close, save.
5. Click on the tree's label in the diagram and rename it to 4way Tree.