Professional Documents
Culture Documents
A computer program is said to learn from experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks in T, as measured by P, improves with experience E
E Experience
T Task
P Performance
How do machines learn?
The basic machine learning process can be divided into three parts.
1. Data Input: Past data or information is utilized as a basis for future decision making
2. Abstraction: The input data is represented in a broader way through the underlying
algorithm
3. Generalization: The abstracted representation is generalized to form a framework for
making decisions
Machine learning paradigm :
During the machine learning process, knowledge is fed in the form of input data. However, the data cannot be used
in the original shape and form. abstraction helps in deriving a conceptual map based on the input data.
The other key part is to tune up the abstracted knowledge to a form which can be used to take future
decisions. This is achieved as a part of generalization.
Consider the benefits of solving the problem. What capabilities does it enable?
It is important to clearly understand the benefits of solving the problem. These benefits can be articulated to sell the
project
Step 3: How would I solve the problem?
Try to explore how to solve the problem manually. Detail out step-by-step data collection, data preparation, and
program design to solve the problem. Collect all these details and update the previous sections of the problem
definition, especially the assumptions..
The three different types of machine learning:
supervised learning is to learn from past information which is about the task machine
has to execute.
Example: machine is getting images of different objects as input and the task is to segregate the images by
either shape or colour of the object.
The image is round or triangular, or blue or green in colour. The tag is called ‘ label’ and we say that the
training data is ’labelled data’.
When we are trying to predict a categorical or nominal variable, the problem is
known as a classification problem.
classification problem
Whereas when we are trying to predict a real-valued variable, the problem falls
under the category of regression.
regression.
Classification
In which category the machine should put an image of unknown
category, also called a test data
Further which depends on the information it gets
from the past data, which we have called as training data.
Here machine has to map a new image or test
data to a set of images to which it is similar to and
assign the same label or category to the test data.
Ex: Naïve Bayes, Decision tree, and k-Nearest Neighbour
one typical classification problems include:
•Image classification
•Prediction of disease
•Win–loss prediction of games
•Prediction of natural calamity like earthquake, flood, etc.
•Recognition of handwriting
Example: Bank Fraudulent data on the past transaction data, specifically the ones
labelled as fraudulent, all new incoming transactions are marked or labelled as normal
or suspicious. The suspicious transactions are subsequently segregated for a closer
review.
Supervised Learning
Supervised
Where the labelled training data is passed to a machine learning
Learning algorithm for fitting a predictive model that can make predictions on
new,.
Regression for predicting continuous outcomes:
A second type of supervised learning is the prediction of continuous
outcomes.
which is also called regression analysis. In regression analysis, we
are given a number of predictor (explanatory) variables and
a continuous response variable (outcome), and we try to find a
relationship between
those variables that allows us to predict an outcome.
Your Title Here
where ‘x’ is the predictor variable and ‘y’ is the target variable.
Given a feature
variable, x, and a target variable, y, we fit a straight line to this data that
minimizes the distance—most commonly the average squared distance—
between the data points and the fitted line.
The variable you want to predict is called the dependent variable. The
variable you are using to predict the other variable's value is called the
independent variable.
Your Title Here
Unsupervised learning :
In unsupervised learning, there is no labelled training data to learn from and no
prediction to be made. In unsupervised learning, the objective is to take a dataset as
input and try to find natural groupings or patterns within the data elements or records
Therefore, unsupervised
learning is often termed as descriptive model and the process
of unsupervised learning is referred as pattern discovery or
knowledge discovery.
Clustering is the main type of unsupervised learning
Your Title Here
objective of clustering to discover the intrinsic grouping of unlabelled data and form clusters
Your Title Here
What is Reinforcement Learning?
Reinforcement Learning is a feedback-based Machine learning technique in
which an agent learns to behave in an environment by performing the
actions and seeing the results of actions. For each good action, the agent gets
positive feedback, and for each bad action, the agent gets negative feedback
or penalty.
In Reinforcement Learning, the agent learns automatically using feedbacks
without any labeled data, unlike supervised learning.
Since there is no labeled data, so the agent is bound to learn by its
experience only.
RL solves a specific type of problem where decision making is sequential,
and the goal is long-term, such as game-playing, robotics, etc.
Your Title Here
Example: Suppose there is an AI agent present
within a maze environment, and his goal is to
find the diamond. The agent interacts with the
environment by performing some actions, and
based on those actions, the state of the agent
gets changed, and it also receives a reward or
penalty as feedback.
Reinforcement l earning
Training a logistic regression model with
Your Title Here
scikit-learn(sklearn) Example:1
Example: There is a dataset given which contains the information of
various users obtained from the social networking sites. There is a
car making company that has recently launched a new SUV car. So
the company wanted to check how many users from the dataset,
will purchase the car.
• # importing libraries
• import pandas as pd
• import numpy as np
• import matplotlib.pyplot as plt
• #importing datasets
• data_set= pd.read_csv('...user_data.csv')
• #Extracting Independent and dependent Variable
Your Title Here • x= data_set.iloc[:, [2,3]].values
• y= data_set.iloc[:, 4].values
• In the above code, [2, 3] for x because independent variables are age and
salary, which are at index 2, 3.
• And 4 for y variable because dependent variable is at index 4.
Your Title Here # Splitting the dataset into training and test set.
#feature Scaling -because Age and Estimated Salary values lie in different
ranges. If we don’t scale the features then Estimated Salary feature will dominate
Age feature when the model finds the nearest neighbor to a data point in data
space.
#Predicting the test set result After training the model, it time to
use it to do prediction on testing data.
y_pred= classifier.predict(x_test)
#Creating the Confusion matrix
from sklearn.metrics import confusion_matrix
cm= confusion_matrix(ytest, y_pred)
print ("Confusion Matrix : \n", cm)
Your Title Here