Introduction
Let's try the algorithms we know to predict whether an individual would have survived the Titanic.
https://colab.research.google.com/drive/1GUZTuXUwPp7JscTOsZ5sNaIOexFxBdfS#scrollTo=uqHIyGCnX1ZJ&printMode=true 1/17
6/2/23, 11:22 PM Arboles y K-nn.ipynb - Colaboratory
Requirement already satisfied: pandas>=0.25 in /usr/local/lib/python3.10/dist-packages (from seaborn) (1.5.3)
... (remaining "Requirement already satisfied" lines truncated in the export)
import pandas as pd
import numpy as np
import seaborn as sns  # visualization
import matplotlib.pyplot as plt  # visualization
from matplotlib import rcParams
%matplotlib inline
sns.set(color_codes=True)
pd.set_option('display.max_columns', None)
rcParams['figure.figsize'] = 12,8
Pandas is the most important Python library for handling tabular data. In this particular case I am reading the dataset from
the internet, but Google Colab also has options to upload local files or to connect to Google Drive.
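The cell that actually reads the CSV did not survive the export. A minimal sketch of the idea, reading an inline sample with the same column names in place of the original URL (which is not preserved here):

```python
import io

import pandas as pd

# Stand-in for the notebook's source: the original cell read the CSV from a
# URL via pd.read_csv("<url>"); that URL was lost in the export, so this
# sketch reads an equivalent inline sample with the same columns.
csv_sample = io.StringIO(
    "Survived,Pclass,Name,Sex,Age,Siblings/Spouses Aboard,Parents/Children Aboard,Fare\n"
    "0,3,Mr. Owen Harris Braund,male,22.0,1,0,7.25\n"
    "1,3,Miss. Laina Heikkinen,female,26.0,0,0,7.925\n"
)
df = pd.read_csv(csv_sample)
print(df.shape)  # (2, 8)
```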
     Survived Pclass                                               Name     Sex   Age  Siblings/Spouses Aboard  Parents/Children Aboard     Fare
1           1      1  Mrs. John Bradley (Florence Briggs Thayer) Cum...  female  38.0                        1                        0  71.2833
2           1      3                              Miss. Laina Heikkinen  female  26.0                        0                        0   7.9250
3           1      1        Mrs. Jacques Heath (Lily May Peel) Futrelle  female  35.0                        1                        0  53.1000
882         0      2                               Rev. Juozas Montvila    male  27.0                        0                        0  13.0000
We rename the columns
We change the data types
df['Pclass'] = df['Pclass'].astype(str)
df = df.rename(columns={"Pclass": "Boarding_class", "Siblings/Spouses Aboard": "siblings", "Parents/Children Aboard": "parent_children"})
As we saw earlier, an outlier (atypical data point) is a point or group of points that differs from the rest. Outliers can
hurt model performance and keep us from seeing the variable's real distribution.
We will check this for each model we build, but to begin with, decision trees are insensitive to outliers because of the way
they split the data and how each branch's calculations are made (based on the class). For the moment we won't take any action.
State of the "class"
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 887 entries, 0 to 886
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Survived 887 non-null int64
1 Boarding_class 887 non-null object
2 Name 887 non-null object
3 Sex 887 non-null object
4 Age 887 non-null float64
5 siblings 887 non-null int64
6 parent_children 887 non-null int64
7 Fare 887 non-null float64
dtypes: float64(2), int64(3), object(3)
memory usage: 55.6+ KB
sns.countplot(x=df.Survived)
Before creating our train and test sets, we have to deal with the categorical variables; most models do not support them, so
we need to convert them into something they can work with. We have the following:
Name = For now we'll leave it until we train the model; we'll set it aside to help interpret the results later.
Boarding_class and Sex = we'll transform them with one-hot encoding
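The encoding cell itself was lost in the export; judging by the resulting columns (female, male, class_1..class_3), it presumably used pd.get_dummies, roughly along these lines (the prefix arguments are an assumption chosen to match those names):

```python
import pandas as pd

# Toy frame standing in for df; column names follow the notebook.
df = pd.DataFrame({
    "Boarding_class": ["3", "1", "3"],
    "Sex": ["male", "female", "female"],
    "Age": [22.0, 38.0, 26.0],
})

# One-hot encode Sex and Boarding_class; prefixes chosen so the dummy
# columns come out as female/male and class_1..class_3.
dummies = pd.get_dummies(
    df[["Sex", "Boarding_class"]],
    prefix=["", "class"],
    prefix_sep=["", "_"],
)
df_model = pd.concat([df, dummies], axis=1)
print(df_model.columns.tolist())
```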
  Survived Boarding_class                                               Name     Sex   Age  siblings  parent_children     Fare  female
0        0              3                             Mr. Owen Harris Braund    male  22.0         1                0   7.2500       0
1        1              1  Mrs. John Bradley (Florence Briggs Thayer) Cum...  female  38.0         1                0  71.2833       1
2        1              3                              Miss. Laina Heikkinen  female  26.0         0                0   7.9250       1
3        1              1        Mrs. Jacques Heath (Lily May Peel) Futrelle  female  35.0         1                0  53.1000       1
4                                                                      Mr...
Now we separate the features from the class into different dataframes; we also remove the columns we no longer need (the
class and the already-encoded Boarding_class and Sex). Then we apply the scikit-learn function to generate the train and test sets.
features = df_model.columns.tolist()
features.remove("Survived")
features.remove("Boarding_class")
features.remove("Sex")
X = df_model.loc[:, features]
y = df_model.loc[:, ["Survived"]]
                                                Name   Age  siblings  parent_children     Fare  female  male  class_1  class_2  class_3
0                             Mr. Owen Harris Braund  22.0         1                0   7.2500       0     1        0        0        1
1  Mrs. John Bradley (Florence Briggs Thayer) Cum...  38.0         1                0  71.2833       1     0        1        0        0
2                              Miss. Laina Heikkinen  26.0         0                0   7.9250       1     0        0        0        1
3        Mrs. Jacques Heath (Lily May Peel) Futrelle  35.0         1                0  53.1000       1     0        1        0        0
4                            Mr. William Henry Allen  35.0         0                0   8.0500       0     1        0        0        1
     Survived
0           0
1           1
2           1
3           1
4           0
..        ...
882         0
883         1
884         0
885         1
886         0

887 rows × 1 columns

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, train_size=.75, stratify=y)

Let's look at the parameters:

X = our feature set
y = our label column
random_state = the train/test split is done randomly; this value acts as a seed so the same split can be reproduced later
train_size = the size of the training set, in this case 75% of the total
stratify = since we have an imbalanced class, this parameter keeps a similar ratio of positives in both sets
Let's check the classes in the output
X_train.set_index("Name", inplace=True)
X_train.info()
<class 'pandas.core.frame.DataFrame'>
Index: 665 entries, Mrs. William (Margaret Norton) Rice to Mr. Reginald Charles Coleridge
Data columns (total 9 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Age 665 non-null float64
1 siblings 665 non-null int64
2 parent_children 665 non-null int64
3 Fare 665 non-null float64
4 female 665 non-null uint8
5 male 665 non-null uint8
6 class_1 665 non-null uint8
7 class_2 665 non-null uint8
8 class_3 665 non-null uint8
dtypes: float64(2), int64(2), uint8(5)
memory usage: 29.2+ KB
X_train
                                                    Age  siblings  parent_children     Fare  female  male  class_1  class_2  class_3
Name
Mrs. William (Margaret Norton) Rice                39.0         0                5  29.1250       1     0        0        0        1
Mrs. Jacques Heath (Lily May Peel) Futrelle        35.0         1                0  53.1000       1     0        1        0        0
Mrs. Lizzie (Elizabeth Anne Wilkinson) Faunthorpe  29.0         1                0  26.0000       1     0        0        1        0
Mrs. (Lutie Davis) Parrish                         50.0         0                1  26.0000       1     0        0        1        0
...                                                 ...       ...              ...      ...     ...   ...      ...      ...      ...
Mr. Youssef Samaan                                 16.0         2                0  21.6792       0     1        0        0        1
X_test.set_index("Name", inplace=True)
X_test.info()
<class 'pandas.core.frame.DataFrame'>
Index: 222 entries, Miss. Annie Jessie Harper to Miss. Margit Elizabeth Skoog
Data columns (total 9 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Age 222 non-null float64
1 siblings 222 non-null int64
2 parent_children 222 non-null int64
3 Fare 222 non-null float64
4 female 222 non-null uint8
5 male 222 non-null uint8
6 class_1 222 non-null uint8
7 class_2 222 non-null uint8
8 class_3 222 non-null uint8
dtypes: float64(2), int64(2), uint8(5)
memory usage: 9.8+ KB
Of the 887 records, 665 are in train and 222 in test. Now let's look at the labels.
sns.countplot(x=y_train.Survived)
sns.countplot(x=y_test.Survived)
<Axes: xlabel='Survived', ylabel='count'>
As we can see, the ratio between the survival values was preserved from the original set to the training and test sets.
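One quick way to verify this numerically (not part of the original notebook, and using toy labels rather than the Titanic data) is to compare normalized value counts on each split:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy imbalanced labels (60/40) standing in for the notebook's y.
y = pd.Series([0] * 60 + [1] * 40, name="Survived")
X = pd.DataFrame({"feature": range(100)})

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, random_state=0, train_size=.75, stratify=y
)

# With stratify=y, the class ratio is preserved in both splits.
print(y_tr.value_counts(normalize=True).round(2).to_dict())  # {0: 0.6, 1: 0.4}
print(y_te.value_counts(normalize=True).round(2).to_dict())  # {0: 0.6, 1: 0.4}
```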
Decision tree
Eager learning - the final model is a binary tree that no longer depends on the training data (unlike K-nn)
criterion = The formula applied to pick the best attribute at each decision. For now we use "entropy", which is the one we
saw in class
random_state = Again, the seed for the random function, to ensure the reproducibility of the results
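The cell that creates base_tree was lost in the export; judging from the estimator repr printed after fitting, DecisionTreeClassifier(criterion='entropy', random_state=0), it was presumably:

```python
from sklearn import tree  # used further down for plot_tree / export_graphviz
from sklearn.tree import DecisionTreeClassifier

# Matches the repr shown after fit():
# DecisionTreeClassifier(criterion='entropy', random_state=0)
base_tree = DecisionTreeClassifier(criterion="entropy", random_state=0)
```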
base_tree.fit(X_train, y_train)
▾ DecisionTreeClassifier
DecisionTreeClassifier(criterion='entropy', random_state=0)
tree.plot_tree(base_tree)
[...,
 Text(0.3407188221709007, 0.875, 'x[0] <= 13.0\nentropy = 0.802\nsamples = 291\nvalue = [220, 71]'),
 Text(0.21939953810623555, 0.825, 'x[1] <= 2.5\nentropy = 0.99\nsamples = 34\nvalue = [15, 19]'),
 Text(0.20092378752886836, 0.775, 'x[2] <= 0.5\nentropy = 0.297\nsamples = 19\nvalue = [1, 18]'),
 ...]
(plot_tree returns one Text annotation per tree node; the full list is omitted)
Since the image is too small to inspect, we'll export it and use a graph viewer to review it.
fn=X_train.columns.tolist()
cn=['Deceased', 'Survived']
tree.export_graphviz(base_tree,
out_file="base_tree.dot",
feature_names = fn,
class_names=cn,
filled = True)
As you can see, the more variables there are, the more complex the tree becomes and the harder it is to visualize. It is also
possible that, by generating so many nodes, it is overfitting.
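A quick way to probe that overfitting suspicion (not part of the original notebook; synthetic data stands in for the Titanic features) is to compare accuracy on train vs. test, since an unpruned tree typically memorizes its training data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: same sample count and feature count as the notebook.
X, y = make_classification(n_samples=887, n_features=9, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, random_state=0, train_size=.75, stratify=y
)

clf = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X_tr, y_tr)

# A large gap between these two scores is the classic overfitting signature.
print(clf.score(X_tr, y_tr))  # 1.0 for an unpruned tree
print(clf.score(X_te, y_te))
```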
y_pred = base_tree.predict(X_test)

from sklearn.metrics import confusion_matrix

# The lines computing the matrix and building the heatmap axes were cut off
# in the export; reconstructed here so the cell runs end to end.
cm = confusion_matrix(y_test, y_pred)
ax = sns.heatmap(cm, annot=True, fmt='d')
ax.set_title('Confusion Matrix\n\n');
ax.set_xlabel('\nPredicted')
ax.set_ylabel('Actual ');
ax.xaxis.set_ticklabels(['0','1'])
ax.yaxis.set_ticklabels(['0','1'])
plt.show()
From the matrix we can see that, of the 86 survival cases, 57 are predicted correctly and 29 are not. Likewise, of the 136 who
did not survive, 16 are predicted as survivors.
Now let's look at the Accuracy metric
print(classification_report(y_test, y_pred))
The classification report from the scikit-learn library shows us the most common classification metrics:
Accuracy
Precision
Recall
F1
These last ones are computed per class, so the report gives us two averages:
Macro Average: the mean of the per-class metrics. For example, with Precision it is (0.81 + 0.78) / 2 = 0.79
Weighted Average: the mean of the metric, but with each value multiplied by the class's share of the support; in this case,
class 0 has 136/222 = 0.61 and class 1 has 86/222 = 0.39, so the average is (136/222) * 0.81 + (86/222) * 0.78 = 0.7984 ≈ 0.80.
The latter is not useful in problems with imbalanced classes like ours, since it gives more weight to the majority class's
metric, which is usually not what we want. It is more useful in multiclass problems.
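Both averages can be reproduced by hand (a self-contained sketch with toy labels, not the notebook's predictions):

```python
import numpy as np
from sklearn.metrics import precision_score

# Toy imbalanced ground truth (6 negatives, 4 positives) and predictions.
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
y_pred = np.array([0, 0, 0, 0, 0, 1, 1, 1, 0, 0])

# Per-class precision, then the two averaging schemes.
per_class = precision_score(y_true, y_pred, average=None)
macro = per_class.mean()                     # plain mean over classes
support = np.bincount(y_true) / len(y_true)  # class proportions
weighted = (per_class * support).sum()       # support-weighted mean

assert np.isclose(macro, precision_score(y_true, y_pred, average="macro"))
assert np.isclose(weighted, precision_score(y_true, y_pred, average="weighted"))
```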
For more information about the metrics, you can check the following link
So, our tree has 80% Accuracy, and the remaining metrics average out close to that, so we can keep using it even though the
class is imbalanced.
This first model will be our baseline; let's see whether we can improve it a bit with hyperparameters.
We are going to run a parameter search. Since we don't want to use the test set for any model that isn't final, and our
training set is already fairly small to split again into train and test, we will use Cross Validation.
base_tree.get_params()
{'ccp_alpha': 0.0,
'class_weight': None,
'criterion': 'entropy',
'max_depth': None,
'max_features': None,
'max_leaf_nodes': None,
'min_impurity_decrease': 0.0,
'min_samples_leaf': 1,
'min_samples_split': 2,
'min_weight_fraction_leaf': 0.0,
'random_state': 0,
'splitter': 'best'}
This is the list of hyperparameters of the classification tree; for more information about them, check the scikit-learn documentation.
Now we define our list of values for them, the "grid" over which the search will run. We include the baseline's values, since
some combination might include them.
parameters={"splitter":["best","random"],
"max_depth" : [1,3,5,7,9,11,12, None],
"max_features":["auto","log2","sqrt",None],
"max_leaf_nodes":[None,10,50,80,90],
"min_samples_leaf":[1,2,4,9],
"min_samples_split": [1,3,5],
"min_weight_fraction_leaf": [0.0, 0.1,0.5,0.9],
}
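The cell defining tuning_model was lost in the export; the "[CV i/3] END ..." log lines suggest a GridSearchCV with cv=3 and a verbose setting, roughly along these lines (the reduced grid here keeps the sketch small; in the notebook the `parameters` dict above would be passed instead):

```python
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Reduced stand-in for the `parameters` grid defined above.
param_grid = {"max_depth": [3, 5, None], "min_samples_leaf": [1, 4]}

tuning_model = GridSearchCV(
    DecisionTreeClassifier(criterion="entropy", random_state=0),
    param_grid=param_grid,
    cv=3,       # matches the [CV 1/3]..[CV 3/3] log lines
    verbose=3,  # prints one END line per fold and parameter combination
)
```

Note that min_samples_split=1 in the grid above is below scikit-learn's minimum of 2, so those candidates fail to fit and are scored as nan under GridSearchCV's default error handling.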
tuning_model.fit(X_train, y_train)
[CV 1/3] END max_depth=None, max_features=None, max_leaf_nodes=80, min_samples_leaf=4, min_s
[CV 2/3] END max_depth=None, max_features=None, max_leaf_nodes=80, min_samples_leaf=4, min_s
[CV 3/3] END max_depth=None, max_features=None, max_leaf_nodes=80, min_samples_leaf=4, min_s
... (verbose log truncated: one [CV i/3] END line per fold and parameter combination)