Professional Documents
Culture Documents
In this module we are generating Vehicle information all the vehicle sales details, vehicle number
registration details, vehicle owner details, vehicle insurance details
A generation data set is one of a collection of successive, historically related, cataloged data sets,
known as a generation data group (GDG). The system keeps track of each data set in a generation
data group as it is created, so that new data sets can be chronologically ordered and old ones easily
retrieved.
2. DATA PREPROCESING
Data pre-processing is a process of preparing the raw data and making it suitable for a machine
learning model. It is the first and crucial step while creating a machine learning model. When
creating a machine learning project, it is not always a case that we come across the clean and
formatted data
Segmentation, the technique of splitting customers into separate groups depending on their
attributes or behavior, makes this possible. Customer segmentation in machine learning can help
you save money on marketing initiatives by reducing waste.
Random forest is a supervised learning algorithm which is used for both classification as well as
regression. But however, it is mainly used for classification problems. As we know that a forest is
made up of trees and more trees means more robust forest. Similarly, random forest algorithm
creates decision trees on data samples and then gets the prediction from each of them and finally
selects the best solution by means of voting. It is an ensemble method which is better than a single
decision tree because it reduces the over-fitting by averaging the result
5. PREDICTION