Professional Documents
Culture Documents
Pedram Jahangiry
Utah State University: Huntsman School of Business
Machine Learning
Model Model Type and use case Description Pros Cons Hyperparameters
Type
Linear - Parametric - highly interpretable (giving significancy results) - validity of linear regression assumptions none
Linear regression Used for regression only
Finds the "best fit" through all the data points. - very fast training because of closed form solution - cannot capture complex relationships
- no hyperparameter tuning required
- interpretable for low values of d (giving significancy
Linear - Parametric Extending linear regression model to capture - need to choose the right polynomial degree d: degree of polynomial
Polynomial regression non-linearities
results)
Used for regression only - Can capture polynomial relationships - notorious tail behavior (sensitive to outliers)
Linear method that penalizes irrelevant - Can be used for feature selection (reducing the penalty (how much to penalize the
Linear - Parametric - Requires feature scaling
Penalized regression: features using regularization dimension of the feature space) parameters)
Ridge, LASSO and Elastic L1 regularization: LASSO
L2 regularization: Ridge L1 ratio: ration between L1 and L2
Net Used for regression only - interpretable
regularization
combining L1 and L2: Elastic net
- the same as penalized regression if
Linear - Parametric - probabilitsitc model (the outputs are probabilities) - validity of linear regression assumptions
regularization is used
Basically the adaptation of linear regression
Logistic regression to classification problems.
- highly interpretable (giving significancy results) - sensitive to extreme values
Used for classification only - easy to understand - cannot capture complex relationships
- fast and efficient
Non-linear - Non-parametric - Intuitive and simple - Choice of K - K value
Make prediction for a new observation by - Easy to implement for multi class problem - Slow (memory based approach) - distance metrics
finding similarities (“nearness”) between it - Few parameters/hyper parameters - Curse of dimensionality
KNN Used for both regression and
and its k nearest neighbors in the existing - No assumption (non parametric) - Hard to interpret
classification
dataset. - Requires feature scaling
- Not good with multiple categorical features
Supervised
Machine Learning
Model Model Type and use case Description Pros Cons Hyperparameters
Type
- Reducing the number of features to the most relevant
- Hard to interpret None
predictors is very useful in general.
Non- Parametirc
- Dimension reduction facilitates the data visualization
- Requires feature scaling
in two or three dimensions.
Principal components are vectors that define a
- Before training another supervised or unsupervised
new coordinate system in which the first axis
Principle Component learning model, it can be performed as part of EDA to
goes in the direction of the highest variance in
identify patterns and detect correlations .
Unsupervised
- Simple to understand - Need to choose k before running the algorithm K: Number of clusters
Cab be both Parametric and Non-
Parametirc - Uses a measure of similarity to detect - The k means algorithm is fast and works well on very
- Requires feature scaling - Distance metrics
groups within data set: K means is an large datasets
K-Means algorithm that repeatedly partitions - Can help visualize the data and facilitate detecting
observations into a fixed, pre-specified - Poor performance with clusters of irregular shapes
trends or outliers.
Used for clustering the data number of clusters - Not applicable for categorical data
- Unable to handle noisy data
Cab be both Parametric and Non- - The optimal number of clusters can be obrained by - The choice of distance metrics and linkage
- Uses a measure of similarity to detect - Distance metrics
Parametirc the model itself, methods can be tricky
groups within data set: Hierarchical clustering
Hierarchical clustering is an iterative procedure used to build a
Used for clustering the data hierarchy of clusters - Requires feature scaling - Linkage methods