Pregunta1 - Pablo David - Jupyter Notebook

6/7/22, 22:18 Pregunta1 - Pablo David - Jupyter Notebook
Logistic Regression: Proposed Exercise

Surname: Pablo Mamani
Given name: David Alcides
Importing Libraries
In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
Load the data

In [2]:
ad_data=pd.read_csv('airfoil_self_noise.csv')
Check the first rows of ad_data
In [3]:
ad_data.head()
Out[3]:
   Frequency  Angle of attack  Chord length  Free-stream velocity  Suction side displacement thickness  Scaled sound pressure level
0        800              0.0        0.3048                  71.3                             0.002663                      126.201
1       1000              0.0        0.3048                  71.3                             0.002663                      125.201
2       1250              0.0        0.3048                  71.3                             0.002663                      125.951
3       1600              0.0        0.3048                  71.3                             0.002663                      127.591
4       2000              0.0        0.3048                  71.3                             0.002663                      127.461
localhost:8888/notebooks/Tareas Inteligencia artificial/parcial/Pregunta1 - Pablo David.ipynb 1/14

In [4]:
ad_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1503 entries, 0 to 1502
Data columns (total 6 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Frequency 1503 non-null int64
1 Angle of attack 1503 non-null float64
2 Chord length 1503 non-null float64
3 Free-stream velocity 1503 non-null float64
4 Suction side displacement thickness 1503 non-null float64
5 Scaled sound pressure level 1503 non-null float64
dtypes: float64(5), int64(1)
memory usage: 70.6 KB
In [5]:
ad_data.describe()
Out[5]:
          Frequency  Angle of attack  Chord length  Free-stream velocity  Suction side displacement thickness  Scaled sound pressure level
count   1503.000000      1503.000000   1503.000000           1503.000000                          1503.000000                  1503.000000
mean    2886.380572         6.782302      0.136548             50.860745                             0.011140                   124.835943
std     3152.573137         5.918128      0.093541             15.572784                             0.013150                     6.898657
min      200.000000         0.000000      0.025400             31.700000                             0.000401                   103.380000
25%      800.000000         2.000000      0.050800             39.600000                             0.002535                   120.191000
50%     1600.000000         5.400000      0.101600             39.600000                             0.004957                   125.721000
75%     4000.000000         9.900000      0.228600             71.300000                             0.015576                   129.995500
max    20000.000000        22.200000      0.304800             71.300000                             0.058411                   140.987000
Exploratory Data Analysis

Let's use seaborn to explore the data!
Try to recreate the plots shown below!
** Create a histogram of the scaled sound pressure level **

In [6]:
sns.set_palette("GnBu_d")
sns.set_style("whitegrid")
sns.displot(ad_data['Scaled sound pressure level'])
Out[6]:
<seaborn.axisgrid.FacetGrid at 0x2425c926b80>

In [7]:
sns.jointplot(data=ad_data)
Out[7]:
<seaborn.axisgrid.JointGrid at 0x2425ea24a30>

In [8]:
sns.jointplot(x='Frequency', y='Scaled sound pressure level', data=ad_data)
Out[8]:
<seaborn.axisgrid.JointGrid at 0x242599a2280>
Create a pairplot showing the pairwise relationships between all columns of ad_data.

In [9]:
sns.pairplot(data=ad_data)
Out[9]:
<seaborn.axisgrid.PairGrid at 0x2425ee1a280>

** Create an lmplot of "Frequency" vs. "Scaled sound pressure level" **
In [10]:
sns.lmplot(x='Frequency', y='Scaled sound pressure level',data=ad_data)
Out[10]:
<seaborn.axisgrid.FacetGrid at 0x24261214eb0>
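lmplot draws a scatter plot and overlays a straight line fitted by simple linear regression. As a minimal sketch of that kind of fit, the snippet below computes the least-squares slope and intercept by hand for the five rows returned by ad_data.head(). The helper name fit_line is hypothetical, and five rows are far too few to say anything about the full dataset.

```python
# Least-squares line through (x, y) pairs: an illustration of the kind of
# fit that sns.lmplot overlays, not seaborn's own implementation.

def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# First five rows of ad_data (Frequency, Scaled sound pressure level).
frequency = [800, 1000, 1250, 1600, 2000]
sound_level = [126.201, 125.201, 125.951, 127.591, 127.461]

slope, intercept = fit_line(frequency, sound_level)
print(f"slope={slope:.6f}, intercept={intercept:.3f}")
```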
Training and Test Data

In [11]:
from sklearn.model_selection import train_test_split
In [12]:
X = ad_data[['Frequency', 'Angle of attack', 'Chord length', 'Free-stream velocity', 'Sucti

y = ad_data['Scaled sound pressure level']

In [13]:
Out[13]:
      Frequency  Angle of attack  Chord length  Free-stream velocity  Suction side displacement thickness
0           800              0.0        0.3048                  71.3                             0.002663
1          1000              0.0        0.3048                  71.3                             0.002663
2          1250              0.0        0.3048                  71.3                             0.002663
3          1600              0.0        0.3048                  71.3                             0.002663
4          2000              0.0        0.3048                  71.3                             0.002663
...         ...              ...           ...                   ...                                  ...
1498       2500             15.6        0.1016                  39.6                             0.052849
1499       3150             15.6        0.1016                  39.6                             0.052849
1500       4000             15.6        0.1016                  39.6                             0.052849
1501       5000             15.6        0.1016                  39.6                             0.052849
1502       6300             15.6        0.1016                  39.6                             0.052849
1503 rows × 5 columns
In [14]:
Out[14]:
0 126.201
1 125.201
2 125.951
3 127.591
4 127.461
...
1498 110.264
1499 109.254
1500 106.604
1501 106.224
1502 104.204
Name: Scaled sound pressure level, Length: 1503, dtype: float64
Creating and training model 1

In [15]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
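The random_state argument fixes the shuffle, so the same rows land in the test set on every run. A minimal pure-Python sketch of a seeded 70/30 shuffle split follows. seeded_split is a hypothetical helper that illustrates the idea; it does not reproduce the exact rows scikit-learn selects, and rounding the test share up is an assumption of this sketch.

```python
import math
import random

def seeded_split(n_rows, test_size, seed):
    """Shuffle row indices with a fixed seed and cut off a test portion."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)    # same seed -> same order
    n_test = math.ceil(test_size * n_rows)  # test share rounded up (assumption)
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = seeded_split(1503, 0.3, seed=0)
print(len(train_idx), len(test_idx))

# Re-running with the same seed reproduces the split; a new seed changes it.
assert seeded_split(1503, 0.3, seed=0) == (train_idx, test_idx)
assert seeded_split(1503, 0.3, seed=101)[1] != test_idx
```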
Linear Regression
** Train and fit a linear regression model on the training set. **
In [16]:
from sklearn.linear_model import LinearRegression
In [17]:
lm=LinearRegression()
In [18]:
lm.fit(X_train,y_train)
Out[18]:
LinearRegression()
In [19]:
X_train.head()
Out[19]:
     Frequency  Angle of attack  Chord length  Free-stream velocity  Suction side displacement thickness
820        500              8.4        0.0508                  39.6                             0.005662
879       2000             15.4        0.0508                  71.3                             0.026427
946       3150             19.7        0.0508                  71.3                             0.034118
862       1000             11.2        0.0508                  39.6                             0.015048
704       2500             12.6        0.1524                  71.3                             0.048316
EVALUATION OF MODEL 1

In [20]:
print(lm.intercept_)
132.76846988060998
In [21]:
print(lm.coef_)
[-1.26290565e-03 -4.06458339e-01 -3.69221121e+01 1.02692608e-01
-1.51813495e+02]
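The intercept and coefficients define the fitted linear equation: a prediction is the intercept plus each coefficient multiplied by its feature value. As a sketch, the snippet below applies the printed values to row 0 of ad_data. The coefficients are copied from the rounded output above, so the result only approximates what lm.predict would return for that row.

```python
# Values printed by lm.intercept_ and lm.coef_ (rounded as displayed).
intercept = 132.76846988060998
coefficients = [-1.26290565e-03, -4.06458339e-01, -3.69221121e+01,
                1.02692608e-01, -1.51813495e+02]

# Row 0 of ad_data, in the column order shown for X:
# Frequency, Angle of attack, Chord length, Free-stream velocity,
# Suction side displacement thickness.
row = [800, 0.0, 0.3048, 71.3, 0.002663]

# Intercept plus the dot product of coefficients and feature values.
prediction = intercept + sum(c * v for c, v in zip(coefficients, row))
print(round(prediction, 2))  # the measured value for this row is 126.201
```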
Predictions and Evaluations
** Now predict the values for the test data. **

In [22]:
predictions = lm.predict(X_test)
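Compare the predictions with the actual values. A classification report does not apply here because the target is continuous.
In [23]:
from sklearn.metrics import classification_report  # unused: only meaningful for classifiers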
In [24]:
plt.scatter(y_test, predictions)
Out[24]:
<matplotlib.collections.PathCollection at 0x2426244f580>

In [25]:
sns.displot((y_test-predictions),bins=30)
Out[25]:
<seaborn.axisgrid.FacetGrid at 0x24261705940>
Regression Evaluation Metrics

In [26]:
from sklearn import metrics
In [27]:
print('MAE:', metrics.mean_absolute_error(y_test, predictions))

print('MSE:', metrics.mean_squared_error(y_test, predictions))
print('RMSE:', np.sqrt(metrics.mean_squared_error(y_test, predictions)))
MAE: 3.6747310654156187
MSE: 22.395946643814117
RMSE: 4.732435593202946
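The three metrics are closely related: MAE averages the absolute errors, MSE averages the squared errors, and RMSE is the square root of MSE, which brings it back to the units of the target. A minimal pure-Python sketch with made-up toy values (not taken from the dataset) is shown below; regression_errors is a hypothetical helper name.

```python
import math

def regression_errors(y_true, y_pred):
    """Return (MAE, MSE, RMSE) for two equal-length sequences."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return mae, mse, math.sqrt(mse)

# Toy values chosen for easy arithmetic; the errors are 1, 0 and -2.
mae, mse, rmse = regression_errors([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
print(mae, mse, rmse)
```

The same relation can be checked against the printed results: squaring the RMSE of 4.7324 gives approximately the MSE of 22.396.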
Creating and training model 2

In [28]:
X_train2, X_test2, y_train2, y_test2 = train_test_split(X,y,test_size=0.30,random_state=101

modeloLineal2 = LinearRegression()
modeloLineal2.fit(X_train2,y_train2)
Out[28]:
LinearRegression()

EVALUATION OF MODEL 2

In [29]:
print("Intercept: ",modeloLineal2.intercept_)
Intercept: 132.9105746266391
In [30]:
print("Coefficients:",modeloLineal2.coef_)
Coefficients: [-1.20820533e-03 -4.21277489e-01 -3.59496141e+01 9.33410044e-02
-1.47089444e+02]
MODEL PREDICTIONS

In [31]:
predicciones2 = modeloLineal2.predict(X_test2)
In [32]:
plt.scatter(y_test2,predicciones2)
Out[32]:
<matplotlib.collections.PathCollection at 0x2426255aaf0>

In [33]:
sns.displot((y_test2-predicciones2),bins=30)
Out[33]:
<seaborn.axisgrid.FacetGrid at 0x242624cb640>
Regression Evaluation Metrics 2

In [34]:
print('MAE:', metrics.mean_absolute_error(y_test2, predicciones2))

print('MSE:', metrics.mean_squared_error(y_test2, predicciones2))
print('RMSE:', np.sqrt(metrics.mean_squared_error(y_test2, predicciones2)))
MAE: 3.510346177232778
MSE: 21.213057157693157
RMSE: 4.605763471748539
To reduce the error metrics, compare the two models: the MAE, MSE and RMSE of Model 2 (3.51, 21.21 and 4.61) are all lower than those of Model 1 (3.67, 22.40 and 4.73), each measured on its own test set. We can therefore state that Model 2 is better than Model 1.
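To make the comparison concrete, the sketch below takes the metrics printed for both models and computes the relative reduction that Model 2 achieves on each one. The dictionary names are hypothetical; the numbers are copied from the outputs of In [27] and In [34].

```python
# Metrics as printed in the notebook output.
model_1 = {"MAE": 3.6747310654156187, "MSE": 22.395946643814117,
           "RMSE": 4.732435593202946}
model_2 = {"MAE": 3.510346177232778, "MSE": 21.213057157693157,
           "RMSE": 4.605763471748539}

# Relative reduction of each error metric, as a percentage of Model 1.
reduction = {name: 100 * (model_1[name] - model_2[name]) / model_1[name]
             for name in model_1}

for name, pct in reduction.items():
    print(f"{name}: {pct:.2f}% lower for Model 2")
```

Both models use the same estimator and the same 30% test share, and differ in the random_state passed to train_test_split, so these percentages compare results on two different test sets.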

Good job!
