Machine Learning
Machine learning = tool that can be used to enhance humans’ abilities to solve problems and
make informed inferences on a wide range of problems.
Statistics = science concerned with developing and studying methods for collecting, analyzing,
interpreting and presenting empirical data.
Overfitting = building a model (for example, a regression) that fits one specific set of data too
closely. It performs very well on the data it was fit to but poorly when predicting new data.
AI = ability of a machine to perform cognitive functions we associate with human minds, such as
perceiving, reasoning, learning, and problem solving. Examples of technologies that enable AI
to solve business problems are robotics and autonomous vehicles, computer vision, language,
virtual agents, and machine learning.
Machine Learning algorithms detect patterns and learn how to make predictions and
recommendations by processing data and experiences, rather than by receiving explicit
programming instruction. The algorithms also adapt in response to new data and experiences to
improve efficacy over time.
Data governance is a collection of processes, roles, policies, standards, and metrics that
ensure the effective and efficient use of information in enabling an organization to achieve its
goals.
Statistics is the science concerned with developing and studying methods for collecting,
analyzing, interpreting and presenting empirical data.
Two fundamental ideas in the field of statistics are uncertainty and variation.
Probability is a mathematical language used to discuss uncertain events.
Statisticians attempt to understand and, where possible, control the sources of variation in any
situation.
Machine learning is a tool that can be used to enhance humans’ abilities to solve
problems and make informed inferences on a wide range of problems.
Machine Learning is also a category of algorithms that allows software applications to become
more accurate in predicting outcomes without being explicitly programmed.
The processes involved in machine learning require searching through data to look for patterns
and adjusting program actions accordingly.
Machines that learn are useful to humans because, with all of their processing power, they can
find patterns in big data more quickly, including patterns that human beings would otherwise
have missed.
When using machine learning we risk overfitting, that is, treating random correlations in the
data (which do not imply causation) as real patterns. Cross-validation or backtesting guards
against overfitting by assessing how stable the model is on data it was not fit to:
● Divide data into Training & Testing subsamples
● Fit model using Training data – Assess using Testing data
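
The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic (y = 2x plus random noise), the 70/30 split is an arbitrary choice, and `fit_line` and `fit_memorizer` are hypothetical helper names, not part of any library. The "memorizer" is deliberately overfit so the gap between Training and Testing error is visible.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b

def fit_memorizer(xs, ys):
    """Overfit on purpose: predict the y of the nearest training x."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    """Mean squared error of a model on the given data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic data (an assumption for illustration): y = 2x + noise.
rng = random.Random(0)
xs = [i / 10 for i in range(100)]
ys = [2 * x + rng.gauss(0, 1) for x in xs]

# Divide data into Training and Testing subsamples (70/30, chosen arbitrarily).
idx = list(range(len(xs)))
rng.shuffle(idx)
train_idx, test_idx = idx[:70], idx[70:]
train_x = [xs[i] for i in train_idx]
train_y = [ys[i] for i in train_idx]
test_x = [xs[i] for i in test_idx]
test_y = [ys[i] for i in test_idx]

# Fit using Training data only, then assess on both subsamples.
line = fit_line(train_x, train_y)
memorizer = fit_memorizer(train_x, train_y)

for name, model in [("line", line), ("memorizer", memorizer)]:
    print(name,
          "train MSE:", round(mse(model, train_x, train_y), 3),
          "test MSE:", round(mse(model, test_x, test_y), 3))
```

The memorizer reproduces every Training point exactly, so its Training error is zero, yet it still makes errors on the Testing data. That gap between the two errors is the signal of overfitting that the split is meant to expose.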
SUPERVISED
- Data scientists determine which variables, or features, the model should analyze and
use to develop predictions
- Once training is complete, the algorithm will apply what was learned to new data
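
As a rough sketch of the two points above, the following Python trains on labeled examples and then applies what was learned to new data. Everything here is an assumption made for illustration: the two features (weight in grams, diameter in cm), the made-up fruit measurements, and the nearest-centroid method, which is just one simple supervised technique among many.

```python
from collections import defaultdict

def train(examples):
    """Learn one mean feature vector (centroid) per label.

    examples: list of (features, label) pairs chosen by the analyst.
    """
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}

def predict(centroids, features):
    """Assign the label whose centroid is closest to the new example."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Made-up labeled training data: (weight_g, diameter_cm) -> label.
training_data = [
    ((150.0, 7.0), "apple"),
    ((170.0, 7.5), "apple"),
    ((160.0, 7.2), "apple"),
    ((8.0, 2.0), "cherry"),
    ((10.0, 2.2), "cherry"),
    ((9.0, 2.1), "cherry"),
]

model = train(training_data)

# Apply what was learned to new, unlabeled data.
print(predict(model, (155.0, 7.1)))  # → apple
print(predict(model, (11.0, 2.3)))   # → cherry
```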
Or UNSUPERVISED
DEEP LEARNING
Deep learning is a type of machine learning that can process a wider range of data resources,
requires less data preprocessing by humans, and can often produce more accurate results than
traditional machine-learning approaches.
In deep learning, interconnected layers of software-based calculators known as “neurons” form
a neural network. The network can ingest vast amounts of input data and process them through
multiple layers that learn increasingly complex features of the data at each layer. The network
can then make a determination about the data, learn if its determination is correct, and use what
it has learned to make determinations about new data. For example, once it learns what an
object looks like, it can recognize the object in a new image.
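
To make the idea of layered "neurons" concrete, here is a minimal Python sketch of data passing through a two-layer network. It is an assumption-laden toy: the weights are hand-picked (so the network computes XOR of two binary inputs) rather than learned, and the names `layer`, `network`, `HIDDEN_W`, and so on are hypothetical. A real deep learning system would learn such weights from data and would have far more layers and neurons.

```python
def relu(z):
    """Common activation function: pass positive values, clip negatives to 0."""
    return max(0.0, z)

def layer(inputs, weights, biases, activation):
    """One layer: each neuron takes a weighted sum of the inputs plus a bias."""
    return [activation(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# Hand-picked weights (not learned), chosen so the network computes XOR.
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]  # two hidden neurons, two inputs each
HIDDEN_B = [0.0, -1.0]
OUTPUT_W = [[1.0, -2.0]]             # one output neuron, two hidden inputs
OUTPUT_B = [0.0]

def network(x1, x2):
    """Forward pass: input -> hidden layer -> output layer."""
    hidden = layer([float(x1), float(x2)], HIDDEN_W, HIDDEN_B, relu)
    output = layer(hidden, OUTPUT_W, OUTPUT_B, lambda z: z)
    return output[0]

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", network(a, b))
```

No single neuron with a weighted sum can compute XOR on its own; the hidden layer first builds intermediate features that the output neuron then combines, which is a small-scale version of layers learning increasingly complex features.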