
Pripoae Serbanescu Mihai

Neural Networks Report HW2

For this Neural Networks homework I worked with the Energy Efficiency Data Set, which has 8 attributes and 2 decisions; in my program I used the second decision (Cooling Load) with all eight attributes.
The attributes and decisions were:

X1 (attribute): Relative Compactness
X2 (attribute): Surface Area
X3 (attribute): Wall Area
X4 (attribute): Roof Area
X5 (attribute): Overall Height
X6 (attribute): Orientation
X7 (attribute): Glazing Area
X8 (attribute): Glazing Area Distribution
Y1 (decision): Heating Load
Y2 (decision): Cooling Load

All the attributes were real or integer numbers.
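A quick way to verify this is to inspect the column types with pandas (this assumes the same ENB2012_data.csv file that the code section below loads):

import pandas as pd

dataset = pd.read_csv("ENB2012_data.csv")
print(dataset.dtypes)   # X1-X8, Y1, Y2 should all show float64 or int64
print(dataset.head())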


The decision I chose was to predict the Cooling Load based on all the attributes given.
I used 2 models to make this prediction:
- Quantile Regression
- Linear Regression
Linear Regression fits a linear model with coefficients w to minimize the residual sum of squares between the observed targets in the dataset and the targets predicted by the linear approximation.
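As a minimal sketch of what minimizing the residual sum of squares means (the toy data below is illustrative, not taken from the energy dataset):

import numpy as np
from sklearn.linear_model import LinearRegression

# toy data: y = 3x + 2 plus a little noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(50, 1))
y = 3 * x.ravel() + 2 + rng.normal(0, 0.5, size=50)

model = LinearRegression().fit(x, y)
residuals = y - model.predict(x)
print("coefficient w:", model.coef_)        # close to 3
print("intercept:", model.intercept_)       # close to 2
print("residual sum of squares:", np.sum(residuals ** 2))  # the quantity OLS minimizes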

Quantile regression estimates the median (or another quantile) of the target conditional on the attributes, while ordinary least squares (OLS) estimates the conditional mean.
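The difference is easiest to see on skewed data. In the sketch below (toy data again, purely illustrative, using a recent scikit-learn) the noise has mean 3 but median about 2.1, and the two models pick up exactly that difference in their intercepts:

import numpy as np
from sklearn.linear_model import LinearRegression, QuantileRegressor

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200).reshape(-1, 1)
y = 2 * x.ravel() + rng.exponential(scale=3, size=200)  # right-skewed noise

ols = LinearRegression().fit(x, y)                        # fits the conditional mean
med = QuantileRegressor(quantile=0.5, alpha=0).fit(x, y)  # fits the conditional median
print("OLS intercept (tracks the mean of the noise, ~3):", ols.intercept_)
print("Quantile intercept (tracks the median, ~2.1):", med.intercept_)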
In my case, Linear Regression performs better than Quantile Regression with a 20% test split (test_size=0.2, so 80% of the data is used for training), scoring 85% compared to 73%. The error is also a lot smaller with Linear Regression, standing at 13.6 compared to 24.91. When changing the split ratio the results do change, but Linear Regression still tops Quantile Regression by a similar margin.
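To check that claim, one can rerun both models over several split ratios; the sketch below (assuming the same CSV and column naming as the main code) prints the R^2 of each model for each test size:

import pandas as pd
from sklearn.linear_model import LinearRegression, QuantileRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

dataset = pd.read_csv("ENB2012_data.csv")
x = dataset.iloc[:, :8]   # attributes X1-X8
y = dataset["Y2"]         # Cooling Load

for test_size in (0.2, 0.4, 0.6, 0.8):
    x_tr, x_te, y_tr, y_te = train_test_split(x, y, test_size=test_size, random_state=1)
    lin = LinearRegression().fit(x_tr, y_tr)
    qua = QuantileRegressor().fit(x_tr, y_tr)
    print(test_size,
          "linear:", r2_score(y_te, lin.predict(x_te)),
          "quantile:", r2_score(y_te, qua.predict(x_te)))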
I used logarithmic axes for the visual representation, with the blue diagonal line representing perfect regression. The closer the red points are to the line, the more accurate the model is. If the points are under the blue line the model is under-predicting, and if they are above the blue line it is over-predicting.

LINEAR REGRESSION (figure: predicted vs. true values, log-log scatter)

QUANTILE REGRESSION (figure: predicted vs. true values, log-log scatter)

If we look closely at the two graphs, we can observe that the points for Linear Regression are more scattered than those for Quantile Regression, whose predictions fall along a much narrower, more linear band. While the Linear Regression graph looks more scattered, it gives more precision and smaller errors on the predictions made.
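A likely cause of the flat, narrow band (my reading, not something stated in the assignment) is regularization: sklearn's QuantileRegressor applies an L1 penalty with alpha=1.0 by default, which shrinks the coefficients and compresses the range of predicted values. A short sketch to check this, comparing the default against an unpenalized fit:

import pandas as pd
from sklearn.linear_model import QuantileRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

dataset = pd.read_csv("ENB2012_data.csv")
x = dataset.iloc[:, :8]
y = dataset["Y2"]   # Cooling Load
x_tr, x_te, y_tr, y_te = train_test_split(x, y, test_size=0.2, random_state=1)

for alpha in (1.0, 0.0):   # 1.0 is the sklearn default
    m = QuantileRegressor(alpha=alpha).fit(x_tr, y_tr)
    print("alpha =", alpha, "R^2 =", r2_score(y_te, m.predict(x_te)))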

The CODE:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import QuantileRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

dataset = pd.read_csv("ENB2012_data.csv")
x = dataset.iloc[:, :8]   # attributes X1-X8
y = dataset["Y2"]         # decision Y2, the Cooling Load (selecting by name is safer than a positional index)
x_tr, x_te, y_tr, y_te = train_test_split(x, y, test_size=0.2, random_state=1)

# Linear Regression
LinearRegression_model = LinearRegression()
LinearRegression_model.fit(x_tr, y_tr)
y_pr = LinearRegression_model.predict(x_te)
print("\n Linear Regression")
PredictedValues1 = pd.DataFrame({'Actual': y_te, 'Predicted': y_pr})
print(PredictedValues1)
score = r2_score(y_te, y_pr)
print("R^2 score is", score)
print("Mean squared error is", mean_squared_error(y_te, y_pr))
print("Root mean squared error is", np.sqrt(mean_squared_error(y_te, y_pr)))

plt.figure(figsize=(10, 10))
plt.scatter(y_te, y_pr, c='crimson')
plt.yscale('log')
plt.xscale('log')
# endpoints of the diagonal y = x line; points on it are perfect predictions
p1 = max(max(y_te), max(y_tr))
p2 = min(min(y_te), min(y_tr))
plt.plot([p1, p2], [p1, p2], 'b-')
plt.xlabel('True Values', fontsize=15)
plt.ylabel('Predictions', fontsize=15)
plt.axis('equal')
plt.show()

# Quantile Regression
print("\n Method2")
QuantileRegressor_Model = QuantileRegressor()
Pripoae Serbanescu Mihai

QuantileRegressor_Model.fit(x_tr, y_tr)
y_pr = QuantileRegressor_Model.predict(x_te)
PredictedValues2 = pd.DataFrame({'Actual': y_te, 'Predicted': y_pr})
print(PredictedValues2)
score = r2_score(y_te, y_pr)
print("R^2 score is", score)
print("Mean squared error is", mean_squared_error(y_te, y_pr))
print("Root mean squared error is", np.sqrt(mean_squared_error(y_te, y_pr)))

plt.figure(figsize=(10, 10))
plt.scatter(y_te, y_pr, c='crimson')
plt.yscale('log')
plt.xscale('log')
p1 = max(max(y_te), max(y_tr))
p2 = min(min(y_te), min(y_tr))
plt.plot([p1, p2], [p1, p2], 'b-')
plt.xlabel('True Values', fontsize=15)
plt.ylabel('Predictions', fontsize=15)
plt.axis('equal')
plt.show()

I used read_csv to open and process the dataset. I stored the attributes in x and the decision in y, choosing only the Cooling Load decision (Y2). I created each model using x_tr and y_tr, then generated the score and error. I then generated a plot to visualize the results and displayed it.
