• Pre-processing is about gaining insight into the data and selecting a model that is thought to best fit it.
Feature Engineering
• Feature engineering (also called feature extraction or feature discovery) is the
process of using domain knowledge to extract features from raw data.
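As a minimal sketch of extracting features from raw data, the snippet below turns a raw order record into a few model-ready values. The record fields (`timestamp`, `price`, `quantity`) and the helper `engineer_features` are hypothetical and chosen only for illustration; the domain knowledge assumed here is that time of day, weekends, and order totals matter.

```python
from datetime import datetime

# Hypothetical raw records; the field names are illustrative assumptions.
raw_orders = [
    {"timestamp": "2024-03-02 14:30:00", "price": 20.0, "quantity": 3},
    {"timestamp": "2024-03-04 09:15:00", "price": 5.5, "quantity": 10},
]


def engineer_features(order):
    """Derive features from one raw order record."""
    ts = datetime.strptime(order["timestamp"], "%Y-%m-%d %H:%M:%S")
    return {
        "hour": ts.hour,                              # time-of-day effect
        "is_weekend": ts.weekday() >= 5,              # Saturday is 5, Sunday is 6
        "total": order["price"] * order["quantity"],  # combined raw fields
    }


features = [engineer_features(order) for order in raw_orders]
```

Each derived value encodes something the raw string or number did not expose directly, which is the point of the step.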
Data Cleaning & Visualization
• Data cleaning: the process of fixing or removing incorrect, corrupted, incorrectly formatted, duplicate, or incomplete data
within a dataset.
• Visualization: the representation of data through the use of common graphics, such as charts, plots, infographics,
and even animations. These visual displays of information communicate complex data relationships and
data-driven insights in a way that is easy to understand.
• Data cleaning can also extend to the values inside the files, for example transforming data from a string or
character representation to a numeric one. This step is known as encoding.
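The cleaning and encoding steps above can be sketched in plain Python. The rows, the column names, and the helpers `clean` and `label_encode` are assumptions made for illustration. Integer label encoding is only one of several encoding schemes, and dropping bad rows is only one cleaning policy (fixing them is another).

```python
# Hypothetical raw rows; values arrive as strings, as they would from a CSV file.
raw_rows = [
    {"name": "Ann", "age": "34", "city": "Cairo"},
    {"name": "Ann", "age": "34", "city": "Cairo"},   # duplicate
    {"name": "Bob", "age": "", "city": "Giza"},      # incomplete
    {"name": "Cid", "age": "abc", "city": "Giza"},   # incorrectly formatted
    {"name": "Dee", "age": " 41 ", "city": "Giza"},  # stray whitespace, fixable
]


def clean(rows):
    """Drop duplicate, incomplete, and badly formatted rows; fix whitespace."""
    seen = set()
    cleaned = []
    for row in rows:
        age_text = row["age"].strip()
        if not age_text.isdigit():
            continue  # missing or non-numeric age
        key = (row["name"], age_text, row["city"])
        if key in seen:
            continue  # exact duplicate of an earlier row
        seen.add(key)
        cleaned.append({"name": row["name"], "age": int(age_text), "city": row["city"]})
    return cleaned


def label_encode(values):
    """Map each distinct string to an integer code (sorted for stable codes)."""
    mapping = {value: code for code, value in enumerate(sorted(set(values)))}
    return [mapping[value] for value in values], mapping


cleaned_rows = clean(raw_rows)
city_codes, city_mapping = label_encode([row["city"] for row in cleaned_rows])
```

Integer codes imply an ordering between categories, so for unordered categories a one-hot representation is often preferred.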
Feature Selection
• In machine learning and statistics, feature selection,
also known as variable selection, attribute selection
or variable subset selection, is the process of
selecting a subset of relevant features for use in
model construction.
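One simple way to select a subset of features is a variance filter: a feature that never changes carries no information for the model, so it can be dropped. The sketch below assumes a hypothetical column-oriented dataset and a hypothetical helper `select_by_variance`. It illustrates a single filter method and is not the only approach to feature selection.

```python
from statistics import pvariance

# Hypothetical dataset stored column by column.
columns = {
    "height_cm": [150.0, 160.0, 170.0, 180.0],
    "constant_flag": [1.0, 1.0, 1.0, 1.0],  # identical for every row
    "weight_kg": [50.0, 65.0, 70.0, 90.0],
}


def select_by_variance(columns, threshold=0.0):
    """Keep the names of features whose population variance exceeds the threshold."""
    return [name for name, values in columns.items() if pvariance(values) > threshold]


selected = select_by_variance(columns)
```

Here `constant_flag` is removed because its variance is zero, while the two varying features are kept.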
Model Accuracy & Precision
• Model Accuracy: an evaluation metric that measures the number of correct predictions made by a model in
relation to the total number of predictions made.
Model accuracy generally tends to improve as the amount of training data grows, although the relationship is not strictly proportional.
• Precision: the quality of the positive predictions made by the model. Precision is the number of true
positives divided by the total number of positive predictions (true positives plus false positives).
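Both metrics can be computed directly from their definitions. The sketch below uses made-up labels, and the function names `accuracy` and `precision` are illustrative. Returning 0.0 when the model makes no positive predictions is an assumed convention, since the ratio is undefined in that case.

```python
def accuracy(y_true, y_pred):
    """Correct predictions divided by all predictions."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision(y_true, y_pred, positive=1):
    """True positives divided by all positive predictions."""
    true_positives = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    predicted_positives = sum(p == positive for p in y_pred)
    if predicted_positives == 0:
        return 0.0  # assumed convention: no positive predictions were made
    return true_positives / predicted_positives


# Made-up labels for illustration: 1 is the positive class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 1, 1, 0]

acc = accuracy(y_true, y_pred)     # 5 correct out of 8 predictions
prec = precision(y_true, y_pred)   # 3 true positives out of 5 positive predictions
```

The two numbers differ because accuracy counts every prediction, while precision looks only at the cases the model labeled positive.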
Bias & Variance Tradeoff
• Bias: the difference between the average prediction
of the machine learning model and the correct value.
• Variance: the amount by which the performance of
a predictive model changes when it is trained on
different subsets of the training data.
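The two definitions can be estimated empirically by training the same model on many random subsets and looking at its predictions at one test point. The sketch below is a toy illustration under stated assumptions: the target function `true_fn`, the noise level, the test point `x0`, and both models (a constant mean predictor and a 1-nearest-neighbor predictor) are hypothetical choices, and it measures the spread of the predictions rather than of a performance score.

```python
import random
from statistics import mean, pvariance

rng = random.Random(0)  # fixed seed so the run is repeatable


def true_fn(x):
    """Assumed ground-truth relationship for this toy example."""
    return x * x


# Hypothetical noisy dataset: 200 evenly spaced inputs with Gaussian noise.
xs = [i / 199 for i in range(200)]
data = [(x, true_fn(x) + rng.gauss(0.0, 0.2)) for x in xs]


def predict_mean(train, x0):
    """Very simple model: ignores x0 and predicts the mean training target."""
    return mean(y for _, y in train)


def predict_nearest(train, x0):
    """Very flexible model: copies the target of the closest training point."""
    return min(train, key=lambda point: abs(point[0] - x0))[1]


def bias_and_variance(predict, x0, trials=500, subset_size=20):
    """Train on random subsets and summarize the predictions at x0."""
    preds = [predict(rng.sample(data, subset_size), x0) for _ in range(trials)]
    bias = mean(preds) - true_fn(x0)  # average prediction minus the correct value
    variance = pvariance(preds)       # spread of predictions across subsets
    return bias, variance


x0 = 0.9
bias_mean, var_mean = bias_and_variance(predict_mean, x0)
bias_nn, var_nn = bias_and_variance(predict_nearest, x0)
```

In a setup like this one would typically expect the mean predictor to show the larger bias and the nearest-neighbor predictor the larger variance, which is the tradeoff the slide title refers to.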
Thanks