
Desarrollo Solemne 3 - Carola Araya D..ipynb - Colaboratory

!pip install pingouin

import pandas as pd
import numpy as np
import scipy.stats as ss
import matplotlib.pyplot as plt
import seaborn as sns
import pingouin as pg
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.formula.api import ols
from statsmodels.stats.diagnostic import het_breuschpagan
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from google.colab import drive

drive.mount("/content/drive")  # mount Google Drive so the Excel file path resolves

data = pd.read_excel("/content/drive/MyDrive/Colab Notebooks/solemne 3/Datos_solemne_3-1.xlsx")
data.head()

      y  Gravedad_crudo  Presión_vapor  Temperatura10  Temperatura100
0   6.9            38.4            6.1            220             235
1  14.4            40.3           48.0            231             307
2   7.4            40.0            6.1            217             212
3   8.5            31.8            0.2            316             365
4   8.0            40.8            3.5            210             218

data = data.dropna()
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 25 entries, 0 to 24
Data columns (total 5 columns):
 #   Column          Non-Null Count  Dtype
---  ------          --------------  -----
 0   y               25 non-null     float64
 1   Gravedad_crudo  25 non-null     float64
 2   Presión_vapor   25 non-null     float64
 3   Temperatura10   25 non-null     int64
 4   Temperatura100  25 non-null     int64
dtypes: float64(3), int64(2)
memory usage: 1.1 KB
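Before fitting, a quick look at the pairwise correlations can flag predictors that move together (the two temperature columns are likely candidates). A minimal sketch using the libraries already imported above; the figure size is an arbitrary choice:

# Correlation matrix of the numeric columns; strongly correlated predictors
# hint at the multicollinearity that statsmodels warns about further down.
plt.figure(figsize=(5, 4))
sns.heatmap(data.corr(), annot=True, fmt=".2f", cmap="coolwarm")
plt.title("Matriz de correlación")
plt.show()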

mod01 = smf.ols("y~Gravedad_crudo+Presión_vapor+Temperatura10+Temperatura100", data=data)
mod01 = mod01.fit()
print(mod01.summary())

                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.937
Model:                            OLS   Adj. R-squared:                  0.925
Method:                 Least Squares   F-statistic:                     74.87
Date:                Thu, 01 Jun 2023   Prob (F-statistic):           9.59e-12
Time:                        01:41:01   Log-Likelihood:                -53.911
No. Observations:                  25   AIC:                             117.8
Df Residuals:                      20   BIC:                             123.9
Df Model:                           4
Covariance Type:            nonrobust
==================================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
----------------------------------------------------------------------------------
Intercept          3.9216      8.857      0.443      0.663     -14.553      22.397
Gravedad_crudo     0.1875      0.132      1.418      0.172      -0.088       0.463
Presión_vapor     -0.0047      0.038     -0.122      0.904      -0.084       0.075
Temperatura10     -0.1702      0.021     -8.139      0.000      -0.214      -0.127
Temperatura100     0.1474      0.010     15.380      0.000       0.127       0.167
==============================================================================
Omnibus:                        1.290   Durbin-Watson:                   1.777
Prob(Omnibus):                  0.525   Jarque-Bera (JB):                1.191
Skew:                           0.413   Prob(JB):                        0.551
Kurtosis:                       2.322   Cond. No.                     7.79e+03
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 7.79e+03. This might indicate that there are
strong multicollinearity or other numerical problems.
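The large condition number flagged in the notes can be followed up directly with variance inflation factors. A hedged sketch on the four predictors of mod01; the helper name X is just for illustration:

# Variance inflation factors; values well above ~10 usually indicate
# problematic collinearity among the predictors.
from statsmodels.stats.outliers_influence import variance_inflation_factor

X = sm.add_constant(data[["Gravedad_crudo", "Presión_vapor", "Temperatura10", "Temperatura100"]])
vif = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
    index=X.columns,
)
print(vif)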


Model equation:

y = 3.9216 + 0.1875 * Gravedad_crudo - 0.0047 * Presión_vapor - 0.1702 * Temperatura10 + 0.1474 * Temperatura100
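The same numbers can be pulled from the fitted model instead of transcribed by hand; a small sketch using the mod01 object defined above:

# Rounded coefficient estimates of mod01, indexed by term name.
print(mod01.params.round(4))

# Fitted value for the first observation, as a quick check of the equation.
print(mod01.predict(data.iloc[[0]]))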

mod02 = smf.ols("y~Gravedad_crudo+Temperatura10+Temperatura100", data=data)
mod02 = mod02.fit()
print(mod02.summary())

                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.937
Model:                            OLS   Adj. R-squared:                  0.928
Method:                 Least Squares   F-statistic:                     104.7
Date:                Thu, 01 Jun 2023   Prob (F-statistic):           8.57e-13
Time:                        01:41:32   Log-Likelihood:                -53.920
No. Observations:                  25   AIC:                             115.8
Df Residuals:                      21   BIC:                             120.7
Df Model:                           3
Covariance Type:            nonrobust
==================================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
----------------------------------------------------------------------------------
Intercept          3.8014      8.593      0.442      0.663     -14.069      21.672
Gravedad_crudo     0.1865      0.129      1.447      0.163      -0.082       0.454
Temperatura10     -0.1690      0.018     -9.249      0.000      -0.207      -0.131
Temperatura100     0.1468      0.008     17.476      0.000       0.129       0.164
==============================================================================
Omnibus:                        1.312   Durbin-Watson:                   1.773
Prob(Omnibus):                  0.519   Jarque-Bera (JB):                1.166
Skew:                           0.374   Prob(JB):                        0.558
Kurtosis:                       2.251   Cond. No.                     7.74e+03
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 7.74e+03. This might indicate that there are
strong multicollinearity or other numerical problems.
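Because mod02 is mod01 with Presión_vapor dropped, the two nested fits can be compared with a partial F-test. A minimal sketch using statsmodels' anova_lm, with the smaller model listed first:

# Partial F-test for dropping Presión_vapor; a large p-value supports
# keeping the simpler model mod02.
print(sm.stats.anova_lm(mod02, mod01))

# Information criteria as a complementary comparison.
print("AIC:", mod01.aic, mod02.aic)
print("BIC:", mod01.bic, mod02.bic)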

# Store the residuals and fitted values of mod02 for the diagnostic plots below.
data["resid"] = mod02.resid
data["predict"] = mod02.predict()

plt.figure(figsize=(3,3))
sns.histplot(data=data, x="y", kde=True)
plt.title("Histograma y")
plt.show()
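For the regression assumptions it is the residuals, rather than y itself, whose distribution matters most; a companion sketch reusing the resid column created above:

# Histogram of the mod02 residuals with a kernel density overlay.
plt.figure(figsize=(3,3))
sns.histplot(data=data, x="resid", kde=True)
plt.title("Histograma residuos")
plt.show()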

sns.set()

plt.figure(figsize=(3,3))
pg.qqplot(data["y"], "norm")
plt.show()
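The same Q-Q check is arguably more informative when applied to the residuals of the fitted model; a minimal sketch:

# Q-Q plot of the mod02 residuals against a normal distribution.
plt.figure(figsize=(3,3))
pg.qqplot(data["resid"], dist="norm")
plt.show()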


ss.kstest(data["y"], "norm")

KstestResult(statistic=0.997444869669572, pvalue=3.0644774280471976e-65, statistic_location=2.8, statistic_sign=-1)
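The near-zero p-value here comes from comparing the raw y values against a standard normal N(0, 1), which is the default reference for ss.kstest. A hedged sketch of the usual workaround, standardizing first (or using the Lilliefors variant, which accounts for the estimated mean and standard deviation):

# KS test after standardizing y; the classical KS p-value is still only
# approximate when the parameters are estimated from the same data.
z = (data["y"] - data["y"].mean()) / data["y"].std(ddof=1)
print(ss.kstest(z, "norm"))

# Lilliefors-corrected version from statsmodels.
from statsmodels.stats.diagnostic import lilliefors
print(lilliefors(data["y"], dist="norm"))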

ss.normaltest(data["y"])

NormaltestResult(statistic=1.5858242685723607, pvalue=0.452525060848027)
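With only 25 observations, the Shapiro-Wilk test is a common companion check; a one-line sketch (also worth running on the residuals):

# Shapiro-Wilk normality test on y.
print(ss.shapiro(data["y"]))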

from statsmodels.stats.diagnostic import het_breuschpagan

bp_test= het_breuschpagan(mod02.resid, mod02.model.exog)
bp_test

(3.987442707411526,
0.26282328717345105,
1.3283532586357687,
0.2917992311613149)
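het_breuschpagan returns four numbers: the LM statistic, its p-value, the F statistic, and the F p-value. A small sketch that labels them so the ~0.26 and ~0.29 p-values are easy to read off:

# Label the Breusch-Pagan output; p-values above the usual 0.05 threshold
# do not reject homoscedasticity of the mod02 residuals.
labels = ["LM statistic", "LM p-value", "F statistic", "F p-value"]
print(dict(zip(labels, bp_test)))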

plt.figure(figsize=(5,5))
sns.scatterplot(data=data, x="predict", y="y")
plt.axhline(y=0, color="Blue", linestyle="--")
plt.show()
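The horizontal reference line at 0 is most meaningful on a residuals-versus-fitted plot rather than on y versus the predictions; a hedged sketch of that version, using the columns created earlier:

# Residuals against fitted values; a patternless cloud around the zero
# line supports the linearity and constant-variance assumptions.
plt.figure(figsize=(5,5))
sns.scatterplot(data=data, x="predict", y="resid")
plt.axhline(y=0, color="blue", linestyle="--")
plt.show()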



