Linear / Logistic Regression
Python Libraries: sklearn — linear_model.LogisticRegression
Description: Linear regression algorithms are all about finding the best-fit line through the data. Logistic regression is an adaptation of the linear regression algorithm so that it can forecast problems where the data is classified into groups. If you want to classify data, such as deciding whether an object belongs to a category, or if you want to find the probability of an event happening, then use logistic regression (a code sketch follows this overview). It is a linear classification model, so it finds the relationship between the independent and dependent variables; outcome probabilities are modelled using the logistic function.
Advantages: Fast to train and forecast. Good for small classification problems. Easy to understand.
Disadvantages: Not very accurate; don't use it for non-linear data. Not flexible enough to adapt to complex data. The model occasionally ends up overfitting.

Nearest Neighbours
Python Libraries: sklearn — neighbors.KNeighborsClassifier
Description: If you have known data properties (such as products that customers have bought) and want to use your data to forecast new events (such as finding which products to recommend to a new customer based on similar products that existing customers have bought), then use nearest neighbours (a code sketch follows this overview). It finds the sample data that is closest in distance to the target object; Euclidean distance can be used to measure the distance to new data points.
Advantages: Simple and adaptable to the problem. Accurate. Easy to understand. Spatial trees can be used to improve space issues.
Disadvantages: Memory intensive. Costly: all training data may be involved in the decision making. Slow performance due to I/O operations. Choosing the wrong distance measure degrades results.

Random Forest
Python Libraries: sklearn — ensemble.RandomForestClassifier
Description: If you have a large data set and the forecast is based on multiple decisions, then use random forest. With random forest you can split the data, give it to multiple decision trees, combine the trees into a forest and use a majority vote to find the best possible decision. An example could be finding the best-selling TV brand for next year based on different categories such as price, TVs sold last year, warranty, screen size, etc. Random forest is a type of ensemble, which is a combination of decisions (outcomes) gathered from different algorithms. A large number of decision trees are created to form a random forest; each decision tree forecasts a value, and the average of the forecasted values is chosen. There are two stages: first create the trees, then get the trees to forecast. Each tree in the ensemble is built from a sample drawn with replacement (i.e., a bootstrap sample) from the training set. In addition, when splitting a node during the construction of a tree, the chosen split is no longer the best split among all features; instead, it is the best split among a random subset of the features.
Advantages: High accuracy. A good starting point for solving a problem. Flexible and can fit a variety of different data well. Fast to execute. Easy to use. Useful for both regression and classification problems. Can handle missing values. High performing.
Disadvantages: Can produce inaccurate results. Slow at training. Prone to overfitting. Not suitable for small samples. A small change in the training data changes the model. Occasionally too simple for very complex problems.
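The following is a minimal sketch of the logistic regression workflow described above, using scikit-learn's LogisticRegression. The synthetic dataset from make_classification, the train/test split and the default hyperparameters are illustrative assumptions, not part of the original cheat sheet.

```python
# Minimal logistic regression sketch with scikit-learn (illustrative data).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic two-class dataset, purely for demonstration.
X, y = make_classification(n_samples=500, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression()
model.fit(X_train, y_train)

# Predicted class labels and class probabilities
# (the probabilities are the output of the logistic function).
print(model.predict(X_test[:5]))
print(model.predict_proba(X_test[:5]))
print("accuracy:", model.score(X_test, y_test))
```

predict returns the class an object is assigned to, while predict_proba returns the event probabilities mentioned in the description.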
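A similar sketch for nearest neighbours with KNeighborsClassifier. The Iris dataset and n_neighbors=5 are illustrative choices; Euclidean distance, mentioned above, is passed explicitly even though it matches scikit-learn's default (Minkowski with p=2).

```python
# Minimal k-nearest-neighbours sketch with scikit-learn (illustrative data).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# n_neighbors and the distance metric are the main knobs to tune.
knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean")
knn.fit(X_train, y_train)  # k-NN mostly stores the training data here

print(knn.predict(X_test[:5]))
print("accuracy:", knn.score(X_test, y_test))
```

Because prediction searches the stored training set for the closest samples, this is where the memory cost and slow prediction noted in the disadvantages come from.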
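Finally, a minimal random forest sketch with RandomForestClassifier. The dataset and the settings n_estimators=100 and max_features="sqrt" are illustrative assumptions; by default each tree is fitted on a bootstrap sample and considers a random subset of features at each split, matching the two-stage description above.

```python
# Minimal random forest sketch with scikit-learn (illustrative data).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,    # number of decision trees in the forest
    max_features="sqrt", # size of the random feature subset tried at each split
    random_state=0,
)
forest.fit(X_train, y_train)

# The forest combines the individual trees' predictions into one answer.
print(forest.predict(X_test[:5]))
print("accuracy:", forest.score(X_test, y_test))
```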
