
CH-2

Haimanot D. (MSc.)
binary_world@aau.edu.et
¡ What is ML?
¡ Why ML?
¡ How ML solves problems
¡ ML methods
¡ Machine learning steps
¡ Machine learning evaluation
¡ What can ML do?
¡ Application areas of ML
¡ ML is an application of AI that provides
systems the ability to learn and improve
automatically from experience without being
explicitly programmed.
§ The process of learning begins with observations or
data in order to look for patterns in data and make
better decisions in the future based on the examples
that we provide.
§ As intelligence requires knowledge, it is necessary
for the computers to acquire knowledge.
¡ The primary aim of ML is to allow computers
to learn automatically, without human
intervention or assistance, and adjust actions
accordingly.
¡ As the amount of data generated by the typical modern
business increases, so does the prominence of data
scientists hired by organizations to help them turn raw
data into valuable business information.
¡ Data extraction is the act of retrieving specific data
from unstructured or poorly structured data sources
for further processing and investigation.
¡ Machine learning is changing the world through
better forecasting.
¡ Data-driven decisions are more profitable. Every minute,
§ Americans use 2,657,700 GB of data
§ Instagram users post 46,750 photos
§ 15,220,700 texts are sent in the form of Email/SMS and
§ Google conducts 3,607,080 searches.
¡ 2.5 quintillion bytes of data are produced every day
(2.5 × 10^18 bytes).
¡ Data scientists must possess a combination of analytic,
machine learning, data mining and statistical skills, as
well as experience with algorithms and coding.
¡ Mathematics and statistical knowledge
enable a data scientist to view the data through
a quantitative lens. There are textures, dimensions,
and correlations in data that can be expressed
mathematically.
¡ Technology and hacking skills are required because
data scientists use technology to wrangle
enormous data sets and work with complex
algorithms, which requires tools far more
sophisticated than Excel.
§ Data scientists need to be able to code,
prototype quick solutions, and integrate
with complex data systems through different
programs.
¡ Domain expertise is also important: a data
scientist acts as a tactical business consultant
who works closely with the data.
¡ Data science is the study of where information comes from, what it
represents and how it can be turned into a valuable resource in
the creation of business and IT strategies.
§ Mining large amounts of structured and unstructured data
to identify patterns can help an organization rein in costs,
increase efficiencies, recognize new market opportunities
and increase the organization's competitive advantage.
¡ Along with managing and interpreting large amounts of data,
many data scientists are also tasked with creating data
visualization models that help illustrate the business value of
digital information.
¡ Data scientists/experts draw the digital information they are
studying from a growing list of channels and sources,
including
§ Smartphones,
§ Internet of things (IoT) devices,
§ Social media,
§ Surveys,
§ Purchases,
§ Internet searches and behavior
¡ By sorting through these large data sets, data scientists can identify
patterns and solve problems through the analysis of big data.
¡ Data are raw facts and figures that on their own have no
meaning. (e.g. readings from sensors, survey facts, etc)
¡ Data can be numbers, words, letters, images, sound etc.
Yes, Yes, No, Yes, No, Yes, No, Yes
42, 63, 96, 74, 56, 86,?
111192, 111234

¡ None of the above data have any meaning until they
are given a CONTEXT and PROCESSED into a
usable form.
§ Thus we need to process data into information to
make it meaningful and important.
¡ To achieve its aims the organisation will need to process
data into information.
¡ Data needs to be turned into meaningful information and
presented in its most useful format
¡ Data must be processed in a context in order to give it
meaning.
[Figure: temperature (Celsius) plotted against time, 08:00 to 10:30]
¡ To turn data into information it needs to be processed.
Data → Processing → Information
¡ Information is data that has been processed by a
computer system to give it meaning.
¡ Processed can mean:
§ Having calculations performed on it
§ Converted to give it meaning
§ Organized in some way
Raw Data: Yes, Yes, No, Yes, No, Yes, No, Yes, No, Yes, Yes
Context: ????
Processing
Information: ????

Raw Data: 35.8, 36.2, 37.0, 38.4, 37.1, 35.8, 36.2, 37.0, 38.4, 37.0, 38.4, 37.1
Context: ??????????
Processing
Information: ??????????

Raw Data: 030219, 100519
Context: ????
Processing
Information: ????
Data vs. Information
¡ Meaning
§ Data: raw, unorganized facts that need to be processed. Data can be
something simple and seemingly random and useless until it is
organized.
§ Information: when data is processed, organized, structured or
presented in a given context so as to make it useful, it is
called information.
¡ Example
§ Data: each student's test score is one piece of data.
§ Information: the average score of a class or of the entire student
body, which can be derived from the given raw data.
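The test-score example above can be sketched in a few lines of Python. This is only an illustration: the scores and the function name `class_average` are made up and do not come from the slides.

```python
from statistics import mean

# Hypothetical raw data: each student's test score is one piece of data.
test_scores = [72, 85, 90, 64, 79]

def class_average(scores):
    """Process raw scores into one piece of information: the class average."""
    return mean(scores)

average = class_average(test_scores)
print(f"Class average: {average}")
```

On their own the five numbers say little; the average, computed in the context of "one class, one test", is something a teacher can act on.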
¡ Machine learning focuses on how to learn the
rules from examples automatically and
apply them to new instances, using either a
supervised or an unsupervised technique.
§ ML algorithms are often categorized as
supervised or unsupervised.

Data or Information → Machine Learning system → Output based on input data
¡ ML may follow one of the following
techniques
§ Supervised
§ Unsupervised
§ Semi-supervised
§ Reinforcement
¡ Supervised learning applies what has been learned from past
labeled data to predict future events.
§ Starting from the analysis of a known training data set, the
learning algorithm produces an inferred function to
make predictions about the output values. After sufficient
training, the system is able to provide targets for any
new input.
¡ The learning algorithm can also compare its output
with the correct, intended output and find errors in
order to modify the model accordingly.
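A minimal sketch of learning from labeled data, assuming a 1-nearest-neighbour rule as the "inferred function" (one simple choice among many, not a method named in the slides). The temperature readings, labels and function names are hypothetical.

```python
# Hypothetical labeled training data: (body temperature in Celsius, label).
training_data = [
    (36.2, "normal"),
    (36.8, "normal"),
    (37.0, "normal"),
    (38.4, "fever"),
    (39.1, "fever"),
    (38.9, "fever"),
]

def predict(temperature, examples):
    """Return the label of the training example closest to the input."""
    nearest = min(examples, key=lambda pair: abs(pair[0] - temperature))
    return nearest[1]

def error_rate(examples, labeled_inputs):
    """Compare predictions with the correct labels; return the share of errors."""
    wrong = sum(1 for x, label in labeled_inputs if predict(x, examples) != label)
    return wrong / len(labeled_inputs)

# New, previously unseen inputs whose correct labels are known for checking.
new_inputs = [(36.5, "normal"), (38.7, "fever")]

print(predict(36.5, training_data))
print(predict(38.7, training_data))
print(error_rate(training_data, new_inputs))
```

The error rate is the signal a real learning algorithm would use to modify its model; this sketch only measures it.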
¡ Unsupervised learning is used when the data or
information used to train is neither classified nor labeled.
¡ Unsupervised learning studies how systems
can infer a function to describe a hidden
structure from unlabeled data.
¡ The system doesn’t figure out the right output,
but it explores the data and can draw
inferences from datasets to describe hidden
structures from unlabeled data.
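As one possible illustration of finding hidden structure in unlabeled data, the sketch below runs a small k-means clustering on one-dimensional points. K-means is an assumption chosen for the example; the data, starting centers and the name `k_means_1d` are hypothetical.

```python
def k_means_1d(points, centers, iterations=10):
    """Group unlabeled 1-D points around the given starting centers."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical unlabeled data containing two hidden groups.
points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
centers, clusters = k_means_1d(points, centers=[0.0, 10.0])
print(centers)
print(clusters)
```

No labels are supplied anywhere: the two groups emerge only from how close the points are to each other.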
¡ Semi-supervised learning falls somewhere in
between supervised and unsupervised learning.
§ It uses both labeled and unlabeled data for training.
§ Typically a small amount of labeled data and a large amount
of unlabeled data. Systems that use this method are able to
considerably improve learning accuracy.
¡ Semi-supervised learning is chosen when the
acquired labeled data requires skilled and
relevant resources in order to learn from it.
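One common way to combine a few labeled points with many unlabeled ones is self-training: label the most confident unlabeled point, add it to the labeled set, and repeat. The sketch below assumes a nearest-centroid rule for confidence; the technique, the data and the function names are illustrative assumptions, not taken from the slides.

```python
def centroids(labeled):
    """Mean value of the points carrying each label."""
    groups = {}
    for value, label in labeled:
        groups.setdefault(label, []).append(value)
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}

def self_train(labeled, unlabeled):
    """Label the unlabeled points one at a time, most confident first."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        centers = centroids(labeled)
        # The most confident point is the one closest to any centroid.
        value, label = min(
            ((v, lab) for v in remaining for lab in centers),
            key=lambda pair: abs(pair[0] - centers[pair[1]]),
        )
        labeled.append((value, label))
        remaining.remove(value)
    return labeled

# Hypothetical data: two labeled points and a larger set of unlabeled ones.
labeled = [(1.0, "low"), (9.0, "high")]
unlabeled = [1.5, 2.0, 2.5, 3.0, 7.5, 8.0, 8.5]

result = self_train(labeled, unlabeled)
print(result)
print(centroids(result))
```

Only two points needed a human label; the rest were labeled by the model itself, which is the appeal when labeling is expensive.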
¡ A reinforcement learning (RL) agent interacts with its
environment by producing actions and discovering errors or rewards.
¡ Trial-and-error search and delayed reward are the
most relevant characteristics of RL. The two main types
of reward are:
§ Positive reward
▪ Encourages the agent to keep performing a particular sequence of actions.
§ Negative reward
▪ Penalizes the agent for performing certain actions and pushes
the algorithm to correct itself and stop.
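A minimal sketch of trial-and-error learning with positive and negative rewards, assuming an epsilon-greedy agent in a two-action environment. For simplicity the reward here is immediate rather than delayed, and the environment, reward values and names are hypothetical.

```python
import random

# Hypothetical environment: "right" earns a positive reward, "left" a negative one.
REWARDS = {"left": -1.0, "right": 1.0}

def run_agent(steps=200, epsilon=0.1, seed=0):
    """Learn action values by trial and error with epsilon-greedy choices."""
    rng = random.Random(seed)
    actions = list(REWARDS)
    values = {a: 0.0 for a in actions}   # estimated reward per action
    counts = {a: 0 for a in actions}     # how often each action was tried

    for step in range(steps):
        if step < len(actions):
            action = actions[step]                 # try every action once first
        elif rng.random() < epsilon:
            action = rng.choice(actions)           # explore: random action
        else:
            action = max(actions, key=values.get)  # exploit: best estimate so far
        reward = REWARDS[action]                   # feedback from the environment
        counts[action] += 1
        # Incremental average of the rewards seen for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts

values, counts = run_agent()
best_action = max(values, key=values.get)
print(values, counts, best_action)
```

The negative reward pushes the estimate for "left" down, so the agent mostly stops choosing it, while the positive reward keeps it choosing "right".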
¡ Machine learning has 7 basic steps
§ Gathering data
§ Data preparation or representation
§ Model selection
§ Training
§ Evaluation
§ Parameter tuning
§ Prediction
¡ The quantity and
quality of your data
dictate how accurate
your model is.
¡ The outcome of this
step is generally a
representation of the data.
§ Pre-collected data sets are
available from sources such as
Kaggle and UCI.
¡ Clean the data, which may require removing duplicates,
correcting errors, dealing with missing values,
normalization, data type conversions, etc.
¡ Randomize data, which erases the effects of the
particular order in which we collected and/or
otherwise prepared our data
¡ Visualize data to help detect relevant relationships
between variables or class imbalances, or perform
other exploratory analysis
¡ Split into training and evaluation sets
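The clean, randomize and split steps can be sketched as follows. This is one simple approach under stated assumptions: rows with missing values are dropped rather than repaired, the split is 80/20, and the rows and the function name `prepare` are hypothetical.

```python
import random

def prepare(rows, eval_fraction=0.2, seed=42):
    """Clean, shuffle and split rows into training and evaluation sets."""
    # Clean: drop rows with missing values, then drop exact duplicates.
    complete = [row for row in rows if None not in row]
    unique = list(dict.fromkeys(complete))  # keeps the first occurrence of each row

    # Randomize: remove any effect of the order in which rows were collected.
    random.Random(seed).shuffle(unique)

    # Split: hold back a fraction of the rows for evaluation.
    n_eval = int(len(unique) * eval_fraction)
    return unique[n_eval:], unique[:n_eval]

# Hypothetical raw rows (feature, label) with one duplicate and one missing value.
raw_rows = [(1.0, "a"), (2.0, "b"), (2.0, "b"), (3.0, None), (4.0, "a"),
            (5.0, "b"), (6.0, "a"), (7.0, "b"), (8.0, "a"), (9.0, "b"),
            (10.0, "a"), (11.0, "b")]

train_rows, eval_rows = prepare(raw_rows)
print(len(train_rows), len(eval_rows))
```

Visualization is left out of the sketch because it needs a plotting library; the fixed seed only makes the shuffle repeatable.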
¡ Different machine
learning algorithms
exist for different tasks
and applications:
§ Clustering,
§ Classification,
§ Regression and
§ Dimensionality reduction.
¡ Select the right one.
¡ The goal of training is to answer a question or
make a prediction correctly as often as
possible.
§ Linear regression example: for y = m*x + b, the
algorithm needs to learn values for m (or W) and b,
where x is the input and y is the output.
¡ Each iteration of the process is a training step.
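The linear regression example can be sketched with plain gradient descent on the mean squared error. Gradient descent is an assumption here (the slides do not name the optimizer), and the data, learning rate and step count are illustrative.

```python
def train_linear_regression(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = m * x + b by gradient descent on the mean squared error."""
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):  # each pass through this loop is one training step
        errors = [(m * x + b) - y for x, y in zip(xs, ys)]
        grad_m = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b

# Hypothetical training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

m, b = train_linear_regression(xs, ys)
print(round(m, 3), round(b, 3))
```

Each step nudges m and b in the direction that reduces the prediction error, so the learned values approach the slope and intercept that generated the data.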

¡ Uses a metric or a combination of metrics to "measure" the
objective performance of the model.
§ Test the model against previously unseen data.
¡ Good train/evaluation split? 80/20, 70/30, or similar,
depending on domain, data availability, dataset particulars,
etc.
¡ Tune the model's parameters to improve
its performance.
¡ Simple model hyperparameters may include:
number of training steps, learning rate,
initialization values and distribution, etc.
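A small sketch of tuning one hyperparameter, the learning rate, by trying a few candidates and keeping the one with the lowest error on held-out evaluation data. The candidate values, the fixed budget of 100 training steps, the data and the function names are all assumptions made for illustration.

```python
def fit(xs, ys, learning_rate, steps):
    """Gradient descent for y = m * x + b; returns the learned (m, b)."""
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [(m * x + b) - y for x, y in zip(xs, ys)]
        m -= learning_rate * 2 / n * sum(e * x for e, x in zip(errors, xs))
        b -= learning_rate * 2 / n * sum(errors)
    return m, b

def mean_squared_error(m, b, xs, ys):
    """Average squared difference between predictions and true outputs."""
    return sum(((m * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data from y = 2x + 1, split into training and evaluation parts.
train_x, train_y = [0.0, 1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 5.0, 7.0, 9.0]
eval_x, eval_y = [5.0, 6.0], [11.0, 13.0]

# Try each candidate learning rate with the same number of training steps.
results = {}
for learning_rate in [0.001, 0.01, 0.1]:
    m, b = fit(train_x, train_y, learning_rate, steps=100)
    results[learning_rate] = mean_squared_error(m, b, eval_x, eval_y)

best_learning_rate = min(results, key=results.get)
print(results)
print(best_learning_rate)
```

With so few training steps the smaller learning rates have not converged yet, which is why the comparison on evaluation data favours the largest stable candidate.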
¡ Further (test set) data, which have been withheld
from the model until this point (and for which
class labels are known), are used to test the
model; this gives a better approximation of how
the model will perform in the real world.
¡ The evaluation metric to use depends heavily on the
task at hand. However, the following are the major
evaluation metrics for ML.
§ Confusion matrix
§ Accuracy
§ Precision
§ Recall
§ Specificity
§ F1 score
§ ROC (Receiver Operating Characteristics) curve
¡ Confusion matrix:
§ TP (true positive): predicted positive and actually positive.
§ FP (false positive): predicted positive and actually negative.
§ FN (false negative): predicted negative and actually positive.
§ TN (true negative): predicted negative and actually negative.
¡ Accuracy: the share of all predictions that are correct,
(TP + TN) / (TP + FP + FN + TN).
¡ Precision: how often the model is right when it predicts
positive, TP / (TP + FP).
¡ Recall / Sensitivity / TPR: how many of the actual positives
the model found, TP / (TP + FN).
¡ Specificity: similar to recall, but for the negative
instances, TN / (TN + FP).
¡ F1 score: the harmonic mean of precision and recall,
2 * Precision * Recall / (Precision + Recall).
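The metrics above can be computed directly from the four confusion-matrix counts. The sketch below assumes binary labels encoded as 1 (positive) and 0 (negative) and that every denominator is non-zero; the example labels and function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN and TN for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def evaluate(y_true, y_pred):
    """Return the main evaluation metrics; assumes no denominator is zero."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }

# Hypothetical true labels and model predictions for ten instances.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

print(confusion_counts(y_true, y_pred))
print(evaluate(y_true, y_pred))
```

Here the model misses one positive and raises one false alarm, so precision and recall are both 3/4 while specificity is 5/6.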
¡ Predicting iceberg paths: this occasionally requires icebergs to
be towed to avoid collisions
¡ Oil well drilling optimization: how to dig as few test wells as
possible to detect the entire area where oil can be found
¡ Predicting solar flares: timing, duration, intensity and
localization
¡ Predicting Earthquakes
¡ Predicting very local or global weather; reconstructing past
weather (like 200 million years old)
¡ Predicting Mars weather to identify best time and spots to land.
¡ Predict riots based on tweets
¡ Designing metrics to predict student success, or employee
attrition
¡ Predicting book sales, determining correct price, price
elasticity and whether a specific book should be accepted or
rejected by a publisher, based on projected ROI
¡ Predicting volcano risk, to evacuate populations or cancel
flights, while minimizing expenses caused by these decisions
¡ Predicting 500-year floods, to build dams
¡ Predict death, and health expenditures, to compute your
premiums (based on which population segment you belong to)
¡ Predicting reproduction rate in animal populations
¡ Predicting food reserves each year (fish, meat, crops including
crop failures caused by diseases or other problems).
¡ Predicting consumption of electricity, gas, water and other
modern products.
¡ Predicting longevity of a product, or a customer
¡ Predicting duration, extent and severity of drought or fires
¡ Predicting racial and religious mix in a population, detecting
change point to adapt policies accordingly
¡ Predicting new flu viruses to design efficient vaccines each
year
¡ Road constructions and traffic lights designed to optimize
highway traffic.
¡ Google algorithm to predict duration of a road trip, doing
much better than GPS systems not connected to the Internet.
¡ Spell checks, especially for people writing in multiple
languages
¡ Distinguishing between noise and signal on millions of
pictures or videos, to identify patterns
¡ Automated piloting (drones, cars without pilots)
¡ Customized, patient-specific medications and diets
¡ Predicting and legally manipulating elections
¡ Sport bets
¡ Predicting oil demand, oil reserves, oil price, impact of coal
usage
¡ Predicting chances that a container in a port contains a nuclear
bomb
¡ Computing correct average time-to-crime statistics for an
average gun (using censored models to compensate for the
bias caused by new guns not having a criminal history
attached to them)
¡ ML applications in different areas:
§ Drug discovery and manufacturing.
§ Fraud detection.
§ Product recommendations in retail.
§ Improved customer service in retail.
§ Ride pricing, e.g. how Uber determines the price of your ride.
§ Minimizing the wait time once you book a car.
§ Personalization, from news feeds to targeted ads (e.g. Netflix).
§ Text classification and summarization.
