
For each model family below, the sheet lists: task, properties, main sklearn models, key sklearn hyperparameters, whether you should scale features, multi-target/label support, whether training is deterministic, whether the fitted model has predict_proba(), whether it has feature_importances_, how to increase regularization (or get a similar effect), the typical loss function, typical evaluation metrics, and the training, prediction, and space complexity.

Linear regression
  Task: Regression
  Properties: Linear, deterministic regressor
  Main sklearn models: linear_model.LinearRegression (no regularization), linear_model.Lasso (L1 regularization), linear_model.Ridge (L2 regularization), linear_model.ElasticNet (L1 and L2), linear_model.ElasticNetCV, linear_model.SGDRegressor
  Key sklearn hyperparameters: alpha
  Should scale features: Yes
  Multi-target/label: Yes
  Deterministic: Yes
  Has predict_proba(): No
  Has feature_importances_: No (use coef_, and only if the data is scaled)
  To increase regularization (or similar effect): Increase alpha (usually a squared L2 and/or L1 penalty)
  Typical loss function: Mean squared error
  Typical evaluation metric: R², MSE, RMSE
  Training complexity: O(p²n + p³) for n examples and p model parameters
  Prediction complexity: O(p)
  Space complexity: O(p)
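
For instance, a minimal sketch of a scaled Ridge fit (the load_diabetes toy dataset and alpha=1.0 are illustrative assumptions, not taken from the sheet):

```python
# Illustrative only: Ridge regression with scaled features so coef_ is comparable across features.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))  # larger alpha => stronger L2 penalty
model.fit(X, y)
print(model.score(X, y))                 # R² on the training data
print(model.named_steps["ridge"].coef_)  # per-feature weights (meaningful because data is scaled)
```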

Logistic regression
  Task: Classification
  Properties: Binary classifier (multiclass via OVR), deterministic (depends on solver)
  Main sklearn models: linear_model.LogisticRegression, linear_model.SGDClassifier
  Key sklearn hyperparameters: penalty, C
  Should scale features: Yes (some sources say No, but this is not the case in my experience)
  Multi-target/label: No
  Deterministic: Yes
  Has predict_proba(): Yes
  Has feature_importances_: No
  To increase regularization (or similar effect): Decrease C (usually an L2 penalty), or increase alpha for SGDClassifier
  Typical loss function: Cross-entropy, aka log loss, aka logistic loss, aka deviance
  Typical evaluation metric: Likelihood ratio, weighted F1
  Training complexity: O(np)
  Prediction complexity: O(p)
  Space complexity: O(p)
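
A minimal sketch of a scaled, L2-penalized logistic regression scored with weighted F1 (the breast-cancer toy dataset, the split, and C=1.0 are illustrative assumptions):

```python
# Illustrative only: logistic regression in a scaling pipeline; smaller C => stronger regularization.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = make_pipeline(StandardScaler(), LogisticRegression(penalty="l2", C=1.0, max_iter=1000))
clf.fit(X_train, y_train)
proba = clf.predict_proba(X_test)  # class probabilities are available
print(f1_score(y_test, clf.predict(X_test), average="weighted"))
```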

Ridge regression classification
  Task: Classification
  Properties: Linear decision boundary, binary classifier
  Main sklearn models: linear_model.RidgeClassifier
  Key sklearn hyperparameters: alpha
  Should scale features: Yes
  Multi-target/label: No
  Deterministic: Depends on solver
  Has predict_proba(): No
  Has feature_importances_: No
  To increase regularization (or similar effect): Increase alpha
  Typical loss function: Penalized least squares on targets encoded as ±1 (not cross-entropy)
  Typical evaluation metric: Weighted F1
  Training complexity: O(p²n + p³) for n examples and p model parameters
  Prediction complexity: O(p)
  Space complexity: O(p)
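
A minimal sketch of RidgeClassifier (dataset and alpha value are illustrative assumptions); note that it exposes decision_function() rather than predict_proba():

```python
# Illustrative only: RidgeClassifier with scaled features; alpha controls the L2 penalty.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import RidgeClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
clf = make_pipeline(StandardScaler(), RidgeClassifier(alpha=1.0))
clf.fit(X, y)
print(clf.predict(X)[:5])            # hard class labels
print(clf.decision_function(X)[:5])  # signed distances; no probabilities available
```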

k-nearest neighbours
  Task: Classification or Regression
  Properties: Instance-based, non-parametric, multiclass classifier
  Main sklearn models: neighbors.KNeighborsClassifier, neighbors.KNeighborsRegressor
  Key sklearn hyperparameters: n_neighbors
  Should scale features: Yes
  Multi-target/label: Yes
  Deterministic: Yes
  Has predict_proba(): Yes
  Has feature_importances_: No
  To increase regularization (or similar effect): Increase n_neighbors
  Typical loss function: None
  Typical evaluation metric: Weighted F1 (classification) or R², MSE, RMSE (regression)
  Training complexity: O(1) for brute force, or O(n·log(n)·p) for a k-d tree, for n examples and p features
  Prediction complexity: For brute force, O(npk) for k neighbours and n training examples of dimension p (number of features ~ number of parameters); for a k-d tree, O(k·log(n))
  Space complexity: O(1) for brute force, or O(npk) for a k-d tree
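
A minimal sketch of k-nearest neighbours with scaled features (the dataset and n_neighbors=15 are illustrative assumptions); a larger n_neighbors smooths the decision boundary, which is the closest thing kNN has to regularization:

```python
# Illustrative only: kNN classification; "training" just stores the examples.
from sklearn.datasets import load_breast_cancer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
clf = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=15))
clf.fit(X, y)
print(clf.predict_proba(X)[:3])  # empirical class frequencies among the neighbours
```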

Linear support vector machine
  Task: Classification or Regression
  Properties: Linear decision boundary, binary classifier (multiclass via OVR)
  Main sklearn models: svm.SVC, svm.SVR, svm.LinearSVC, svm.LinearSVR, svm.NuSVC, svm.NuSVR, linear_model.SGDClassifier
  Key sklearn hyperparameters: C, or alpha for SGDClassifier
  Should scale features: Yes
  Multi-target/label: No
  Deterministic: Only if probability=False
  Has predict_proba(): Only if probability=True
  Has feature_importances_: No
  To increase regularization (or similar effect): Decrease C (squared L2 penalty), or increase alpha for SGDClassifier
  Typical loss function: Hinge loss
  Typical evaluation metric: Weighted F1 (classification) or R², MSE, RMSE (regression)
  Training complexity: O(n²p) for n examples and p model parameters
  Prediction complexity: O(sp) for s support vectors and p model parameters
  Space complexity: O(sp) for s support vectors and p model parameters
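
A minimal sketch of a linear SVM via LinearSVC (dataset and C=0.1 are illustrative assumptions); smaller C means a stronger squared-L2 penalty, and LinearSVC has no predict_proba():

```python
# Illustrative only: linear SVM with scaled features.
from sklearn.datasets import load_breast_cancer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = load_breast_cancer(return_X_y=True)
clf = make_pipeline(StandardScaler(), LinearSVC(C=0.1))
clf.fit(X, y)
print(clf.decision_function(X)[:5])  # signed margins; no probabilities from LinearSVC
```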

Nonlinear support vector machine
  Task: Classification or Regression
  Properties: Kernel-based, non-linear decision boundary, binary classifier (multiclass via OVR)
  Main sklearn models: svm.SVC, svm.SVR, svm.NuSVC, svm.NuSVR
  Key sklearn hyperparameters: kernel, C
  Should scale features: Yes
  Multi-target/label: No
  Deterministic: Only if probability=False
  Has predict_proba(): Only if probability=True
  Has feature_importances_: No
  To increase regularization (or similar effect): Decrease C (squared L2 penalty)
  Typical loss function: Hinge loss
  Typical evaluation metric: Weighted F1 (classification) or R², MSE, RMSE (regression)
  Training complexity: O(n²p + n³) for n examples and p model parameters
  Prediction complexity: O(sp) for s support vectors and p model parameters
  Space complexity: O(sp) for s support vectors and p model parameters
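
A minimal sketch of an RBF-kernel SVC (dataset and hyperparameter values are illustrative assumptions); setting probability=True enables predict_proba() but makes fitting slower and non-deterministic:

```python
# Illustrative only: kernel SVM with scaled features and probability estimates enabled.
from sklearn.datasets import load_breast_cancer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, probability=True))
clf.fit(X, y)
print(clf.predict_proba(X)[:3])  # available only because probability=True
```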

Decision tree
  Task: Classification or Regression
  Properties: Non-parametric, multiclass classifier
  Main sklearn models: tree.DecisionTreeClassifier, tree.DecisionTreeRegressor
  Key sklearn hyperparameters: max_features, max_depth, min_samples_leaf, min_samples_split
  Should scale features: No
  Multi-target/label: Yes
  Deterministic: Yes
  Has predict_proba(): Yes
  Has feature_importances_: Yes
  To increase regularization (or similar effect): Decrease max_depth, max_features, or increase min_samples_split, min_samples_leaf
  Typical loss function: Gini (per split, not global, so not strictly a loss function per se)
  Typical evaluation metric: Weighted F1 (classification) or R², MSE, RMSE (regression)
  Training complexity: O(nzp) for n examples and p model parameters, if depth is limited to z
  Prediction complexity: O(z) for max depth z
  Space complexity: O(z)
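
A minimal sketch of a depth-limited decision tree (dataset and hyperparameter values are illustrative assumptions); no feature scaling is needed, and shrinking max_depth or raising min_samples_leaf acts as regularization:

```python
# Illustrative only: a shallow decision tree with split-based feature importances.
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=4, min_samples_leaf=5, random_state=0)
clf.fit(X, y)
print(clf.feature_importances_)  # per-feature importance derived from the splits
```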

Random forest
  Task: Classification or Regression
  Properties: Stochastic, ensemble multiclass classifier
  Main sklearn models: ensemble.RandomForestClassifier, ensemble.RandomForestRegressor
  Key sklearn hyperparameters: n_estimators, max_features, max_depth, min_samples_leaf, min_samples_split
  Should scale features: No
  Multi-target/label: Yes
  Deterministic: No
  Has predict_proba(): Yes
  Has feature_importances_: Yes
  To increase regularization (or similar effect): Decrease max_depth, max_features, or increase min_samples_split, min_samples_leaf
  Typical loss function: Gini (per split, not global, so not strictly a loss function per se)
  Typical evaluation metric: Weighted F1 (classification) or R², MSE, RMSE (regression)
  Training complexity: O(nzpt) for n examples, p model parameters, max depth z, and t trees
  Prediction complexity: O(zt)
  Space complexity: O(zt)
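
A minimal sketch of a random forest (dataset and hyperparameter values are illustrative assumptions); random_state pins down the otherwise stochastic bootstrap and feature subsampling:

```python
# Illustrative only: a random forest averaging predict_proba over its trees.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
clf = RandomForestClassifier(n_estimators=200, max_depth=8, max_features="sqrt", random_state=0)
clf.fit(X, y)
print(clf.predict_proba(X)[:3])  # probabilities averaged over the trees
print(clf.feature_importances_)
```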

Extremely randomized trees
  Task: Classification or Regression
  Properties: Stochastic, ensemble multiclass classifier (ExtraTrees is to ExtraTree as RandomForest is to DecisionTree)
  Main sklearn models: ensemble.ExtraTreesClassifier, ensemble.ExtraTreesRegressor
  Key sklearn hyperparameters: n_estimators, max_features, max_depth, min_samples_leaf, min_samples_split
  Should scale features: No
  Multi-target/label: Yes
  Deterministic: No
  Has predict_proba(): Yes
  Has feature_importances_: Yes
  To increase regularization (or similar effect): Decrease max_depth, max_features, or increase min_samples_split, min_samples_leaf
  Typical loss function: Gini (per split, not global, so not strictly a loss function per se)
  Typical evaluation metric: Weighted F1 (classification) or R², MSE, RMSE (regression)
  Training complexity: O(nzpt) for n examples, p model parameters, max depth z, and t trees
  Prediction complexity: O(zt)
  Space complexity: O(zt)
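
A minimal sketch of extremely randomized trees (dataset and hyperparameter values are illustrative assumptions); the interface mirrors the random forest, but split thresholds are drawn at random, which typically speeds up training:

```python
# Illustrative only: ExtraTrees with the same knobs as a random forest.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import ExtraTreesClassifier

X, y = load_breast_cancer(return_X_y=True)
clf = ExtraTreesClassifier(n_estimators=200, max_depth=8, random_state=0)
clf.fit(X, y)
print(clf.feature_importances_)
```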

Gradient boosted trees
  Task: Classification or Regression
  Properties: Stochastic, ensemble multiclass classifier
  Main sklearn models: ensemble.GradientBoostingClassifier, ensemble.GradientBoostingRegressor, ensemble.HistGradientBoostingClassifier, ensemble.HistGradientBoostingRegressor
  Key sklearn hyperparameters: n_estimators, max_features, max_depth, min_samples_leaf, min_samples_split
  Should scale features: No
  Multi-target/label: No
  Deterministic: No
  Has predict_proba(): Yes
  Has feature_importances_: Yes
  To increase regularization (or similar effect): Decrease max_depth, max_features, or increase min_samples_split, min_samples_leaf
  Typical loss function: Cross-entropy, aka log loss, aka deviance
  Typical evaluation metric: Weighted F1 (classification) or R², MSE, RMSE (regression)
  Training complexity: O(nzpt) for n examples, p model parameters, max depth z, and t trees
  Prediction complexity: O(zt)
  Space complexity: O(zt)
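
A minimal sketch of gradient boosted trees (dataset and hyperparameter values are illustrative assumptions); shallow trees and a modest learning rate are the usual regularization levers:

```python
# Illustrative only: gradient boosting with shallow trees; feature_importances_ is available after fit.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_breast_cancer(return_X_y=True)
clf = GradientBoostingClassifier(n_estimators=200, max_depth=3, learning_rate=0.1, random_state=0)
clf.fit(X, y)
print(clf.predict_proba(X)[:3])
print(clf.feature_importances_)
```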

Multilayer perceptron
  Task: Classification or Regression
  Properties: Deep feedforward artificial neural network
  Main sklearn models: neural_network.MLPClassifier, neural_network.MLPRegressor
  Key sklearn hyperparameters: hidden_layer_sizes, activation, alpha, learning_rate_init, max_iter
  Should scale features: Yes
  Multi-target/label: Yes
  Deterministic: No
  Has predict_proba(): Yes
  Has feature_importances_: No
  To increase regularization (or similar effect): Increase alpha (L2 penalty)
  Typical loss function: Cross-entropy, aka log loss, aka deviance (classification) or squared error (regression)
  Typical evaluation metric: Weighted F1 (classification) or R², MSE, RMSE (regression)
  Training complexity: 😬
  Prediction complexity: O(p) for p parameters (weights)
  Space complexity: O(p)
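
A minimal sketch of a small multilayer perceptron (dataset, layer sizes, and learning-rate values are illustrative assumptions); features are scaled and alpha is the L2 penalty on the weights:

```python
# Illustrative only: a small MLP classifier in a scaling pipeline.
from sklearn.datasets import load_breast_cancer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
clf = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(64, 32), activation="relu",
                  alpha=1e-3, learning_rate_init=1e-3, max_iter=500, random_state=0),
)
clf.fit(X, y)
print(clf.predict_proba(X)[:3])
```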
