
Name: Nguyen Thi Mai Chi

Student ID: 1622431

LAB 4 REPORT

PREDICTIVE MODELING USING REGRESSION

1. What variable is the most important?

Profit is the most important variable.

2. How many predictors/inputs are in the logistic regression model?

There are 10 predictors/inputs in the logistic regression model.

3. What is the validation misclassification rate for the logistic regression model?

The validation misclassification rate for the logistic regression model is 0.0038.
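
For reference, the misclassification rate is the share of cases whose predicted class differs from the observed class. The following is a minimal Python sketch under stated assumptions: the function name and the 0/1 labels are hypothetical and only illustrate the calculation. The value reported above comes from the SAS output, not from this code.

```python
def misclassification_rate(actual, predicted):
    """Fraction of cases whose predicted class differs from the observed class."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = sum(1 for a, p in zip(actual, predicted) if a != p)
    return errors / len(actual)


# Hypothetical labels for illustration only (1 = unit down, 0 = unit up).
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

rate = misclassification_rate(actual, predicted)
print(rate)  # 2 wrong out of 8 cases -> 0.25
```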

4. How many true positives does the logistic regression model have? (Recall: a true
positive is a case where the unit was predicted to be down and was actually observed
to be down.)

True positives for the logistic regression model:

- Training Frequency: 35001
- Valid Frequency: 23015
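
The definition in the question can be sketched as a small confusion-matrix count. This is an illustrative Python example, not the SAS procedure: the function name `confusion_counts` and the sample labels are assumptions, with 1 standing for "unit down".

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary target."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            # Predicted down: correct if the unit was actually down.
            counts["TP" if a == positive else "FP"] += 1
        else:
            # Predicted up: a miss if the unit was actually down.
            counts["FN" if a == positive else "TN"] += 1
    return counts


# Hypothetical labels for illustration only (1 = unit down, 0 = unit up).
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
```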

5. What other variable(s) entered/left the regression model?

- Variables that entered the regression model:

  - Expenses
  - Profit

- Variable that left the regression model: Expenses


6. What is the R² for the regression model?

R-Square for the regression model is 0.205029, which means the model explains about
20.5% of the variation in the target.
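
As a reminder of how this statistic is defined, R-Square is one minus the ratio of the residual sum of squares to the total sum of squares. A minimal Python sketch follows, assuming hypothetical observed and fitted values (they are not taken from the lab data):

```python
def r_squared(observed, fitted):
    """R-Square = 1 - SS_res / SS_tot for observed values and model predictions."""
    if len(observed) != len(fitted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R-Square is undefined")
    return 1 - ss_res / ss_tot


# Hypothetical values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.5, 1.5, 3.5, 3.5]

r2 = r_squared(observed, fitted)
print(round(r2, 4))  # SS_res = 1.0, SS_tot = 5.0 -> 0.8
```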

7. Do any of the measure variables used to build the model have a strong
correlation? If so, which one(s)?

The following pairs of measure variables used to build the model are strongly correlated:

- Expenses – Profit
- Facility Age – Unit Capacity
- Profit – Revenue
- Unit Age – Unit Reliability
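
The strength of each pairing is typically judged with the Pearson correlation coefficient, where values near +1 or -1 indicate a strong linear relationship. The sketch below is a plain Python illustration; the `revenue` and `profit` numbers are made up for the example and are not the lab data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only.
revenue = [10.0, 20.0, 30.0, 40.0, 50.0]
profit = [2.0, 3.0, 7.0, 8.0, 10.0]

r = pearson_r(revenue, profit)
print(round(r, 3))  # close to +1, i.e. a strong positive correlation
```

Strongly correlated inputs carry overlapping information, which is one reason a stepwise procedure may add one variable of a pair and later drop the other.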

8. How does this regression model compare to your decision tree model from Lab 3?

Both the regression model and the decision tree model from Lab 3 are valuable tools in
SAS for making predictions, but they have different strengths and weaknesses:

Regression Models:

- Strengths:

  - Easier to interpret: the equation shows how each variable contributes to the
    prediction.
  - Works well for continuous target variables.
  - Can handle large datasets efficiently.

- Weaknesses:

  - Assumes linear relationships between variables, so it may not be accurate for
    complex relationships.
  - Sensitive to outliers.

Decision Trees:

- Strengths:

  - Can handle non-linear relationships without data transformation.
  - Less sensitive to outliers.
  - Easy to visualize and understand the decision-making process.

- Weaknesses:

  - Can be less accurate than regression models for some problems.
  - Prone to overfitting if not carefully pruned (reduced in complexity).
  - Can be difficult to interpret complex trees.

Here's a table summarizing the key points:

Feature                  Regression Model   Decision Tree
-----------------------  -----------------  -------------------------
Relationship Assumption  Linear             Non-linear
Target Variable          Continuous         Continuous or Categorical
Outlier Sensitivity      High               Low
Interpretability         Easier             More complex
Overfitting Risk         Lower              Higher
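
The linear versus non-linear contrast can be illustrated with a toy example. The Python sketch below is an assumption-laden illustration, not the SAS models from the labs: it fits a one-variable least-squares line and a depth-1 tree (a single split) to made-up step-shaped data and compares their sums of squared errors. It shows only one case where a split fits better and does not imply that trees are more accurate in general.

```python
def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sse(y, fitted):
    """Sum of squared errors between observed and fitted values."""
    return sum((b - f) ** 2 for b, f in zip(y, fitted))


def fit_stump(x, y):
    """Depth-1 regression tree: pick the split threshold with the lowest SSE.

    Returns (threshold, left_mean, right_mean).
    """
    values = sorted(set(x))
    if len(values) < 2:
        raise ValueError("need at least two distinct x values to split")
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2  # candidate split halfway between neighbours
        left = [b for a, b in zip(x, y) if a <= threshold]
        right = [b for a, b in zip(x, y) if a > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sse(left, [left_mean] * len(left)) + sse(right, [right_mean] * len(right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]


# Hypothetical step-shaped data for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [0, 0, 0, 10, 10, 10]

intercept, slope = fit_line(x, y)
line_sse = sse(y, [intercept + slope * a for a in x])

threshold, left_mean, right_mean = fit_stump(x, y)
stump_sse = sse(y, [left_mean if a <= threshold else right_mean for a in x])

print(threshold)             # split chosen between x = 3 and x = 4
print(line_sse > stump_sse)  # the single split fits this step exactly
```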
