
ARTIFICIAL INTELLIGENCE AND DATA ANALYTICS

UNIT -

Artificial Intelligence

Artificial Intelligence (AI) refers to the field of computer science that focuses on creating intelligent machines capable of performing tasks that typically require human intelligence. AI involves the development of computer systems or programs that can perceive, understand, reason, learn, and make decisions in a manner similar to human cognitive abilities.

AI systems are designed to analyse vast amounts of data, recognize patterns, and make predictions or take actions based on that analysis. They rely on various techniques and algorithms, including machine learning, natural language processing, computer vision, expert systems, and neural networks, among others.

Machine learning is a subset of AI that enables systems to learn from data without explicit programming. It involves training algorithms on large datasets to recognize patterns and make accurate predictions or decisions. Deep learning, a type of machine learning, uses neural networks with multiple layers to process and understand complex data.

AI has numerous applications across various industries and domains. It is used in areas such as healthcare, finance, transportation, robotics, manufacturing, customer service, and more. Examples of AI applications include virtual assistants (e.g., Siri, Alexa), autonomous vehicles, fraud detection systems, image recognition software, and recommendation engines.

Types of Artificial Intelligence:

Fig: Types of Artificial Intelligence - based on capabilities (Narrow AI, General AI, Strong/Super AI) and based on functionality (Reactive Machines, Limited Memory, Theory of Mind, Self-Awareness).

Artificial Intelligence can be divided into various types. There are mainly two types of categorization: one based on the capabilities of AI and the other based on the functionality of AI. The diagram above shows the types of AI.
AI type-1: Based on Capabilities

1. Weak AI or Narrow AI:
o Narrow AI is a type of AI which is able to perform a dedicated task with intelligence. The most common and currently available AI is Narrow AI in the world of Artificial Intelligence.
o Narrow AI cannot perform beyond its field or limitations, as it is only trained for one specific task. Hence it is also termed as weak AI. Narrow AI can fail in unpredictable ways if it goes beyond its limits.
o Apple Siri is a good example of Narrow AI, but it operates with a limited pre-defined range of functions.
o IBM's Watson supercomputer also comes under Narrow AI, as it uses an Expert system approach combined with Machine learning and natural language processing.
o Some examples of Narrow AI are playing chess, purchasing suggestions on e-commerce sites, self-driving cars, speech recognition, and image recognition.

2. General AI:
o General AI is a type of intelligence which could perform any intellectual task with efficiency like a human.
o The idea behind general AI is to make a system which could be smarter and think like a human on its own.
o Currently, there is no system in existence which could come under general AI and perform any task as perfectly as a human.
o Researchers worldwide are now focused on developing machines with General AI.
o As systems with general AI are still under research, it will take a lot of effort and time to develop such systems.

3. Super AI:
o Super AI is a level of intelligence of systems at which machines could surpass human intelligence, and can perform any task better than a human with cognitive properties. It is an outcome of general AI.
o Some key characteristics of strong AI include the ability to think, reason, solve puzzles, make judgments, plan, learn, and communicate on its own.
o Super AI is still a hypothetical concept of Artificial Intelligence. The development of such systems in the real world is still a world-changing task.
Artificial Intelligence type-2: Based on Functionality

1. Reactive Machines
o Purely reactive machines are the most basic types of Artificial Intelligence.
o Such AI systems do not store memories or past experiences for future actions.
o These machines only focus on current scenarios and react to them with the best possible action.
o IBM's Deep Blue system is an example of reactive machines.
o Google's AlphaGo is also an example of reactive machines.

2. Limited Memory
o Limited memory machines can store past experiences or some data for a short period of time.
o These machines can use stored data for a limited time period only.
o Self-driving cars are one of the best examples of Limited Memory systems. These cars can store the recent speed of nearby cars, the distance of other cars, speed limits, and other information to navigate the road.

3. Theory of Mind
o Theory of Mind AI should understand human emotions, people, and beliefs, and be able to interact socially like humans.
o This type of AI machine is still not developed, but researchers are making lots of efforts and improvements towards developing such AI machines.

4. Self-Awareness
o Self-awareness AI is the future of Artificial Intelligence. These machines will be super intelligent and will have their own consciousness, sentiments, and self-awareness.
o These machines will be smarter than the human mind.
o Self-Awareness AI does not yet exist in reality; it is still a hypothetical concept.

History of Artificial Intelligence

The history of Artificial Intelligence (AI) dates back several decades and is marked by significant milestones and breakthroughs. Here is a chronological overview of the history of AI:

1. 1940s-1950s: The Foundations
• In the 1940s, the first electronic computers were developed, providing a basis for computational research.
• In 1943, neurophysiologist Warren McCulloch and logician Walter Pitts proposed a model of artificial neurons, laying the foundation for neural networks.
• In 1950, mathematician and computer scientist Alan Turing published the paper "Computing Machinery and Intelligence," introducing the idea of machine intelligence and the Turing Test.
• In 1956, John McCarthy coined the term "Artificial Intelligence" and organized the Dartmouth Conference, which marked the official birth of AI as a field of study.

2. 1950s-1960s: Early AI Research and Symbolic Approaches
• Early AI research focused on symbolic approaches, aiming to mimic human reasoning using logical rules and symbols.
• In 1951, the Ferranti Mark I computer was used to run the first AI program, designed to play chess.
• In the late 1950s, programs like the Logic Theorist and the General Problem Solver showcased the potential of AI in problem-solving and reasoning.
• In 1961, the concept of machine learning was introduced by Arthur Samuel, who developed a program that could improve its own performance in checkers.
3. 1960s-1970s: Knowledge-Based Systems and Expert Systems
• Research shifted towards knowledge-based systems, aiming to encode human knowledge into computer programs.
• The development of expert systems, such as DENDRAL for chemistry and MYCIN for medical diagnosis, demonstrated AI's ability to emulate human expertise.
• In 1963, the first chatbot, ELIZA, was developed by Joseph Weizenbaum, simulating human-like conversation.
• The early 1970s witnessed the development of rule-based languages like PROLOG, which facilitated the creation of expert systems.

4. 1980s-1990s: AI Winter and the Rise of Machine Learning
• In the 1980s, AI faced a period of reduced funding and limited progress known as the "AI winter."
• Machine learning gained prominence as a subfield of AI, with algorithms that enabled computers to learn from data.
• Neural networks experienced a resurgence, with breakthroughs like the backpropagation algorithm for training multi-layer networks.
• The emergence of statistical techniques and algorithms like support vector machines and Bayesian networks contributed to advancements in AI.

5. 2000s-Present: Big Data, Deep Learning, and AI Boom
• The availability of large datasets and increased computational power fueled advancements in AI.
• Deep learning, powered by neural networks with multiple layers, revolutionized AI in areas such as image recognition, natural language processing, and speech synthesis.
• AI applications gained widespread adoption, including virtual assistants (e.g., Siri, Alexa), recommendation systems, autonomous vehicles, and robotics.
• Breakthroughs like IBM's Watson winning the game show Jeopardy! in 2011 showcased AI's ability to process and understand natural language.
Turing Test in AI

The Turing Test is a well-known concept in the field of Artificial Intelligence (AI), introduced by mathematician and computer scientist Alan Turing in his 1950 paper "Computing Machinery and Intelligence." The test is designed to determine whether a machine can exhibit intelligent behavior indistinguishable from that of a human.

Fig: The Turing Test - a Human Evaluator communicates with a Machine (AI System) and a Human Participant through a Communication Channel.

In the diagram, we have three main entities: the Human Evaluator, the Machine (AI System), and the Human Participant. They are connected through a Communication Channel, which enables them to exchange messages.

The Turing Test begins with the Human Evaluator engaging in a conversation through the Communication Channel. The Evaluator interacts with both the Machine and the Human Participant without knowing which is which. The goal of the Evaluator is to determine which entity is the human and which is the machine.

The Machine attempts to generate responses that are indistinguishable from human responses, to convince the Evaluator that it is the human participant. The Human Participant, on the other hand, tries to respond naturally as a human would.

Through a series of conversations and interactions, the Human Evaluator evaluates the responses of both entities and decides whether the machine can simulate human-like behavior effectively. If the Evaluator consistently fails to identify the machine, the machine is considered to have passed the Turing Test and demonstrated intelligence.

This diagram visually represents the key components and the flow of communication in the Turing Test, where the machine aims to convince the human evaluator of its human-like intelligence.

Turing Test

Fig: The Turing Test - Player A (Computer), Player B (Human responder), Player C (Interrogator).

Consider that Player A is a computer, Player B is a human, and Player C is an interrogator. The interrogator is aware that one of them is a machine, but he needs to identify this on the basis of questions and their responses.

The conversation between all players is via keyboard and screen, so the result does not depend on the machine's ability to convert words into speech.

The test result does not depend on each correct answer, but only on how closely the responses resemble a human answer. The computer is permitted to do everything possible to force a wrong identification by the interrogator.

The questions and answers can be like:

Interrogator: Are you a computer?

Player A (Computer): No

Interrogator: Multiply two large numbers such as (256896489 * 456725896)

Player A: Long pause and then gives the wrong answer.

In this game, if the interrogator is not able to identify which is the machine and which is the human, then the computer passes the test successfully, and the machine is said to be intelligent and able to think like a human.

"In 1991, the New York businessman Hugh Loebner announced the prize competition, offering a $100,000 prize for the first computer to pass the Turing test. However, no AI program has, to date, come close to passing an undiluted Turing test."
Goals of Artificial Intelligence

Following are the main goals of Artificial Intelligence:
1. Replicate human intelligence
2. Solve knowledge-intensive tasks
3. An intelligent connection of perception and action
4. Building a machine which can perform tasks that require human intelligence, such as:
• Proving a theorem
• Playing chess
• Planning some surgical operation
• Driving a car in traffic
5. Creating some system which can exhibit intelligent behaviour, learn new things by itself, demonstrate, explain, and advise its user.

Importance of Artificial Intelligence

Artificial Intelligence's importance and its subsequent components have been known for a long time. They are being seen as tools and techniques to make this world better. And it's not like you have to go far to be able to use these fancy tech gadgets. You can look around, and most of your work is probably smoothed out by artificial intelligence.

Its importance lies in making our life easier. These technologies are a great asset to humans and are programmed to minimize human effort as much as possible. They can operate in an automated fashion; therefore, manual intervention is the last thing that is sought or seen during the operation of parts involving this technology.

These machines speed up your tasks and processes with guaranteed accuracy and precision, making them a useful and valuable tool. Apart from making the world an error-free place with their simple and everyday techniques, these technologies and applications are not only related to our ordinary, everyday life; they also hold importance for other domains.
What is an AI Technique?

In the real world, knowledge has some unwelcome properties:
• Its volume is huge, next to unimaginable.
• It is not well-organized or well-formatted.
• It keeps changing constantly.

An AI technique is a manner to organize and use the knowledge efficiently in such a way that:
• It should be perceivable by the people who provide it.
• It should be easily modifiable to correct errors.
• It should be useful in many situations even though it is incomplete or inaccurate.

AI techniques elevate the speed of execution of the complex programs they are equipped with.

Perception involves interpreting sights, sounds, smells and touch. Action includes the ability to navigate through the world and manipulate objects. If we want to build robots that live in the world, we must understand these processes.

• Perception is a process to interpret, acquire, select and then organize the sensory information that is captured from the real world.
• For example: Human beings have sensory receptors such as touch, taste, smell, sight and hearing. The information received from these receptors is transmitted to the human brain to organize the received information.
• According to the received information, action is taken by interacting with the environment to manipulate and navigate the objects.
• Perception and action are very important concepts in the field of Robotics. The following figure shows a complete autonomous robot.

Fig: Autonomous Robot - Perception, Cognition and Action interacting with the physical world.

There is one important difference between an artificial intelligence program and a robot:
• The AI program performs in a computer-simulated environment, while the robot performs in the physical world.
• For example: In chess, an AI program can make a move by searching different nodes and has no facility to touch or sense the physical world. However, a chess-playing robot can make a move and grasp the pieces by interacting with the physical world.

Representation Requirements

A good knowledge representation system must have properties such as:
• Representational Accuracy: It should represent all kinds of required knowledge.
• Inferential Adequacy: It should be able to manipulate the representational structures to produce new knowledge corresponding to the existing structure.
• Inferential Efficiency: The ability to direct the inferential knowledge mechanism into the most productive directions by storing appropriate guides.
• Acquisitional Efficiency: The ability to acquire new knowledge easily using automatic methods.

What is a Knowledge Representation?

The object of a knowledge representation is to express knowledge in a computer-tractable form, so that it can be used to enable our AI agents to perform well.

A knowledge representation language is defined by two aspects:

1. Syntax: The syntax of a language defines which configurations of the components of the language constitute valid sentences.

2. Semantics: The semantics defines which facts in the world the sentences refer to, and hence the statement about the world that each sentence makes.

This is a very general idea, and not restricted to natural language. Suppose the language is arithmetic; then 'x', '≥' and 'y' are components (or symbols or words) of the language. The syntax says that 'x ≥ y' is a valid sentence in the language, but '≥ x y' is not. The semantics say that 'x ≥ y' is false if y is bigger than x, and true otherwise.
Requirements of a Knowledge Representation

A good knowledge representation system for any particular domain should possess the following properties:

1. Representational Adequacy - the ability to represent all the different kinds of knowledge that might be needed in that domain.

2. Inferential Adequacy - the ability to manipulate the representational structures to derive new structures (corresponding to new knowledge) from existing structures.

3. Inferential Efficiency - the ability to incorporate additional information into the knowledge structure which can be used to focus the attention of the inference mechanisms in the most promising directions.

4. Acquisitional Efficiency - the ability to acquire new information easily. Ideally the agent should be able to control its own knowledge acquisition, but direct insertion of information by a 'knowledge engineer' would be acceptable.

Finding a system that optimises these for all possible domains is not going to be feasible.

Practical Aspects of Good Representations

In practice, the theoretical requirements for good knowledge representations can usually be achieved by dealing appropriately with a number of practical requirements:

1. The representations need to be complete - so that everything that could possibly need to be represented can easily be represented.

2. They must be computable - implementable with standard computing procedures.

3. They should make the important objects and relations explicit and accessible - so that it is easy to see what is going on, and how the various components interact.

4. They should suppress irrelevant detail - so that rarely used details don't introduce unnecessary complications, but are still available when needed.

5. They should expose any natural constraints - so that it is easy to express how one object or relation influences another.

6. They should be transparent - so you can easily understand what is being said.

7. The implementation needs to be concise and fast - so that information can be stored, retrieved and manipulated rapidly.

Components of a Good Representation

For analysis purposes it is useful to be able to break any knowledge representation down into four fundamental components:

1. The lexical part - that determines which symbols or words are used in the representation's vocabulary.

2. The structural or syntactic part - that describes the constraints on how the symbols can be arranged, i.e. a grammar.

3. The semantic part - that establishes a way of associating real-world meanings with the representations.

4. The procedural part - that specifies the access procedures that enable ways of creating and modifying representations and answering questions using them, i.e. how we can generate and compute things with the representation.

Intelligent Agents:

An intelligent agent is an autonomous entity which acts upon an environment using sensors and actuators for achieving goals. An intelligent agent may learn from the environment to achieve its goals. A thermostat is an example of an intelligent agent.

Following are the main four rules for an AI agent:
• Rule 1: An AI agent must have the ability to perceive the environment.
• Rule 2: The observation must be used to make decisions.
• Rule 3: Decisions should result in an action.
• Rule 4: The action taken by an AI agent must be a rational action.

What is an Agent?

An agent can be anything that perceives its environment through sensors and acts upon that environment through actuators. An agent runs in the cycle of perceiving, thinking, and acting. An agent can be:

• Human-Agent: A human agent has eyes, ears, and other organs which work as sensors, and hands, legs and a vocal tract which work as actuators.
• Robotic Agent: A robotic agent can have cameras, an infrared range finder and NLP as sensors, and various motors as actuators.
• Software Agent: A software agent can have keystrokes and file contents as sensory input, and act on those inputs and display output on the screen.

Fig: An agent interacting with its environment through Sensors (percepts) and Effectors (actions).

Hence the world around us is full of agents, such as a thermostat, cellphone or camera, and even we are agents ourselves.

• Sensor: A sensor is a device which detects the change in the environment and sends the information to other electronic devices. An agent observes its environment through sensors.
• Actuators: Actuators are the components of machines that convert energy into motion. The actuators are only responsible for moving and controlling a system. An actuator can be an electric motor, gears, rails, etc.
• Effectors: Effectors are the devices which affect the environment. Effectors can be legs, wheels, arms, fingers, wings, fins and a display screen.

Agent Environment in AI

An environment is everything in the world which surrounds the agent, but it is not a part of the agent itself. An environment can be described as a situation in which an agent is present. The environment is where the agent lives and operates, and it provides the agent with something to sense and act upon. An environment is mostly said to be non-deterministic.

Properties of Environment

The environment has multifold properties:

• Discrete / Continuous - If there are a limited number of distinct, clearly defined states of the environment, the environment is discrete (for example, chess); otherwise it is continuous (for example, driving).

• Observable / Partially Observable - If it is possible to determine the complete state of the environment at each time point from the percepts, it is observable; otherwise it is only partially observable.

• Static / Dynamic - If the environment does not change while an agent is acting, then it is static; otherwise it is dynamic.

• Single agent / Multiple agents - The environment may contain other agents which may be of the same or different kind as that of the agent.

• Accessible / Inaccessible - If the agent's sensory apparatus can have access to the complete state of the environment, then the environment is accessible to that agent.

• Deterministic / Non-deterministic - If the next state of the environment is completely determined by the current state and the actions of the agent, then the environment is deterministic; otherwise it is non-deterministic.

• Episodic / Non-episodic - In an episodic environment, each episode consists of the agent perceiving and then acting. The quality of its action depends just on the episode itself. Subsequent episodes do not depend on the actions in the previous episodes. Episodic environments are much simpler because the agent does not need to think ahead.

Characteristics of intelligent agents
• Intelligent agents have some level of individualism that allows them to perform certain tasks on their own.
• They can learn even as tasks are carried out.
• They can interact with other entities like agents, humans, and systems.
• New rules can be accommodated.
• They exhibit goal-oriented habits.

Types of AI Agents

Agents can be grouped into five classes based on their degree of perceived intelligence and capability. All these agents can improve their performance and generate better action over time. These are given below:

o Simple Reflex Agent
o Model-based Reflex Agent
o Goal-based Agents
o Utility-based Agent
o Learning Agent

1. Simple Reflex Agent:
o The Simple reflex agents are the simplest agents. These agents take decisions on the basis of the current percepts and ignore the rest of the percept history.
o These agents only succeed in a fully observable environment.
o The Simple reflex agent does not consider any part of the percept history during its decision and action process.
o The Simple reflex agent works on the Condition-action rule, which means it maps the current state to an action, such as a Room Cleaner agent: it works only if there is dirt in the room (a minimal code sketch follows the figure below).
o Problems for the simple reflex agent design approach:
  • They have very limited intelligence.
  • They do not have knowledge of non-perceptual parts of the current state.
  • They are mostly too big to generate and to store.
  • They are not adaptive to changes in the environment.

Fig: Simple Reflex Agent - Sensors (percepts) → "What the world is like now" → Condition-action rules → "What action I should do now" → Actuators (action).
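The following is a minimal sketch in Python of the condition-action idea described above, using a made-up two-location vacuum-cleaner world (the location names, percept format and actions are illustrative assumptions, not part of the original notes):

```python
# A simple reflex agent: maps the current percept directly to an action via
# condition-action rules, keeping no percept history.

def simple_reflex_vacuum_agent(percept):
    """percept is a (location, status) pair, e.g. ('A', 'Dirty')."""
    location, status = percept
    if status == 'Dirty':          # rule 1: clean if the current square is dirty
        return 'Suck'
    elif location == 'A':          # rule 2: move right if A is already clean
        return 'Right'
    else:                          # rule 3: move left if B is already clean
        return 'Left'

if __name__ == "__main__":
    for percept in [('A', 'Dirty'), ('A', 'Clean'), ('B', 'Dirty'), ('B', 'Clean')]:
        print(percept, '->', simple_reflex_vacuum_agent(percept))
```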

2. Model-based Reflex Agent
o The Model-based agent can work in a partially observable environment and track the situation.
o A model-based agent has two important factors:
  • Model: It is knowledge about "how things happen in the world," so it is called a Model-based agent.
  • Internal State: It is a representation of the current state based on percept history.
o These agents have the model, "which is knowledge of the world," and based on the model they perform actions.
o Updating the agent state requires information about:
  a. How the world evolves.
  b. How the agent's actions affect the world.

Fig: Model-based Reflex Agent - Sensors update an internal State using "How the world evolves" and "What my actions do"; Condition-action rules then decide "What action I should do now", which is passed to the Actuators.

3. Goal-based Agents
o The knowledge of the current state of the environment is not always sufficient for an agent to decide what to do.
o The agent needs to know its goal, which describes desirable situations.
o Goal-based agents expand the capabilities of the model-based agent by having the "goal" information.
o They choose an action so that they can achieve the goal.
o These agents may have to consider a long sequence of possible actions before deciding whether the goal is achieved or not. Such considerations of different scenarios are called searching and planning, which makes an agent proactive.

Fig: Goal-based Agent - the internal state, the model ("How the world evolves", "What my actions do"), the prediction "What it will be like if I do action A" and the Goals determine "What action I should do now", which is passed to the Actuators.

4. Utility-based Agents
o These agents are similar to the goal-based agents but provide an extra component of utility measurement, which makes them different by providing a measure of success at a given state.
o Utility-based agents act based not only on goals but also on the best way to achieve the goal.
o The Utility-based agent is useful when there are multiple possible alternatives, and an agent has to choose in order to perform the best action.
o The utility function maps each state to a real number to check how efficiently each action achieves the goals.

Fig: Utility-based Agent - like the goal-based agent, but a Utility function ("How happy I will be in such a state") is used to choose "What action I should do now".

5. Learning Agents
o A learning agent in AI is the type of agent which can learn from its past experiences, or it has learning capabilities.
o It starts to act with basic knowledge and then is able to act and adapt automatically through learning.
o A learning agent has mainly four conceptual components, which are:

Fig: Learning Agent - a Critic (feedback against a performance standard), a Learning element, a Performance element and a Problem generator, connected to the environment through Sensors and Effectors.

a. Learning element: It is responsible for making improvements by learning from the environment.
b. Critic: The learning element takes feedback from the critic, which describes how well the agent is doing with respect to a fixed performance standard.
c. Performance element: It is responsible for selecting external action.
d. Problem generator: This component is responsible for suggesting actions that will lead to new and informative experiences.

Hence, learning agents are able to learn, analyze performance, and look for new ways to improve the performance.

1.3 Problem Solving

Problem-solving in AI: The problem of AI is directly associated with the nature of humans and their activities. So we need a number of finite steps to solve a problem, which makes the work easy for humans.

These are the following steps which are required to solve a problem:

• Goal Formulation: This is the first and simplest step in problem-solving. It organizes finite steps to formulate a target/goal which requires some action to achieve the goal. Today the formulation of the goal is based on AI agents.

• Problem Formulation: It is one of the core steps of problem-solving, which decides what action should be taken to achieve the formulated goal. In AI this core part is dependent upon a software agent, which consists of the following components to formulate the associated problem (a minimal sketch of these components follows the list):

✓ Initial State: It is the starting state or initial step of the agent towards its goal.
✓ Actions: It is the description of the possible actions available to the agent.
✓ Transition Model: It describes what each action does.
✓ Goal Test: It determines if the given state is a goal state.
✓ Path Cost: It assigns a numeric cost to each path that follows the goal. The problem-solving agent selects a cost function, which reflects its performance measure. Remember, an optimal solution has the lowest path cost among all the solutions.
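A minimal sketch in Python of how these five components might be grouped together, using a made-up route-finding problem (the class name, graph and costs are illustrative assumptions, not a standard library API):

```python
# A toy problem formulation: initial state, actions, transition model,
# goal test and step cost bundled into one class.

class RouteProblem:
    def __init__(self, graph, initial, goal):
        self.graph = graph              # adjacency dict: state -> {neighbour: step_cost}
        self.initial_state = initial
        self.goal = goal

    def actions(self, state):
        """Possible actions: move to any neighbouring city."""
        return list(self.graph[state].keys())

    def result(self, state, action):
        """Transition model: the action names the city we move to."""
        return action

    def goal_test(self, state):
        return state == self.goal

    def step_cost(self, state, action):
        """Path cost is accumulated from these per-step costs."""
        return self.graph[state][action]

graph = {'A': {'B': 1, 'C': 4}, 'B': {'C': 2, 'D': 5}, 'C': {'D': 1}, 'D': {}}
problem = RouteProblem(graph, initial='A', goal='D')
print(problem.actions('A'), problem.goal_test('D'))
```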
Search Problem

Artificial Intelligence is the study of building agents that act rationally. Most of the time, these agents perform some kind of search algorithm in the background in order to achieve their tasks.

A search problem consists of:
• A State Space: the set of all possible states where you can be.
• A Start State: the state from where the search begins.
• A Goal Test: a function that looks at the current state and returns whether or not it is the goal state.

The Solution to a search problem is a sequence of actions, called the plan, that transforms the start state into the goal state. This plan is achieved through search algorithms.

Types of search algorithms:

State space search is divided into:
• Blind / Uninformed search: DFS, BFS
• Heuristic (Informed) search: Hill Climbing, Best First Search

Uninformed/Blind Search:

The uninformed search does not contain any domain knowledge such as closeness or the location of the goal. It operates in a brute-force way, as it only includes information about how to traverse the tree and how to identify leaf and goal nodes. Uninformed search applies a way in which the search tree is searched without any information about the search space, like initial state operators and a test for the goal, so it is also called blind search. It examines each node of the tree until it achieves the goal node.

1. Breadth-first search
2. Depth-first search

1. Breadth-first Search:

• Breadth-first search is the most common search strategy for traversing a tree or graph. This algorithm searches breadthwise in a tree or graph, so it is called breadth-first search.
• The BFS algorithm starts searching from the root node of the tree and expands all successor nodes at the current level before moving to nodes of the next level.
• The breadth-first search algorithm is an example of a general-graph search algorithm.
• Breadth-first search is implemented using a FIFO queue data structure, as in the sketch below.
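A minimal sketch in Python of breadth-first search using a FIFO queue (collections.deque); the adjacency-dict graph is a made-up example, not the tree in the figure further below:

```python
# Breadth-first search: shallowest paths are expanded first (FIFO queue).

from collections import deque

def breadth_first_search(graph, start, goal):
    frontier = deque([[start]])          # FIFO queue of paths
    explored = set()
    while frontier:
        path = frontier.popleft()        # take the shallowest path first
        node = path[-1]
        if node == goal:
            return path
        if node not in explored:
            explored.add(node)
            for neighbour in graph.get(node, []):
                frontier.append(path + [neighbour])
    return None                          # no solution exists

graph = {'S': ['A', 'B'], 'A': ['C'], 'B': ['C', 'D'], 'C': ['G'], 'D': ['G'], 'G': []}
print(breadth_first_search(graph, 'S', 'G'))    # ['S', 'A', 'C', 'G']
```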

Advantages:
• BFS will provide a solution if any solution exists.
• If there is more than one solution for a given problem, then BFS will provide the minimal solution, which requires the least number of steps.

Disadvantages:
• It requires lots of memory, since each level of the tree must be saved into memory to expand the next level.
• BFS needs lots of time if the solution is far away from the root node.

Example:

In the below tree structure, we have shown the traversing of the tree using the BFS algorithm from the root node S to the goal node K. The BFS search algorithm traverses in layers, so it will follow the path which is shown by the dotted arrow, and the traversed path will be:

Fig: Breadth First Search tree (levels 0 to 4).

S ---> A ---> B ---> C ---> D ---> G ---> H ---> E ---> F ---> I ---> K

Time Complexity: The time complexity of the BFS algorithm can be obtained by the number of nodes traversed in BFS until the shallowest node, where d = the depth of the shallowest solution and b = the number of successor nodes at every state:

T(b) = 1 + b^2 + b^3 + ... + b^d = O(b^d)

Space Complexity: The space complexity of the BFS algorithm is given by the memory size of the frontier, which is O(b^d).

Completeness: BFS is complete, which means that if the shallowest goal node is at some finite depth, then BFS will find a solution.

Optimality: BFS is optimal if the path cost is a non-decreasing function of the depth of the node.
2. Depth-first Search

• Depth-first search is a recursive algorithm for traversing a tree or graph data structure.
• It is called the depth-first search because it starts from the root node and follows each path to its greatest depth node before moving on to the next path.
• DFS uses a stack data structure for its implementation.
• The process of the DFS algorithm is similar to the BFS algorithm (see the sketch below).
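A minimal sketch in Python of depth-first search using an explicit stack (LIFO); the graph is a made-up example, not the tree in the figure further below:

```python
# Depth-first search: the deepest path on the stack is expanded first.

def depth_first_search(graph, start, goal):
    stack = [[start]]                    # stack of paths (LIFO)
    visited = set()
    while stack:
        path = stack.pop()               # take the deepest path first
        node = path[-1]
        if node == goal:
            return path
        if node not in visited:
            visited.add(node)
            for neighbour in graph.get(node, []):
                stack.append(path + [neighbour])
    return None

graph = {'S': ['A', 'C'], 'A': ['B', 'D'], 'B': [], 'C': ['G'], 'D': [], 'G': []}
print(depth_first_search(graph, 'S', 'G'))
```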

Advantages:

• DFS requires very little memory, as it only needs to store a stack of the nodes on the path from the root node to the current node.
• It takes less time to reach the goal node than the BFS algorithm (if it traverses along the right path).

Disadvantages:

• There is the possibility that many states keep re-occurring, and there is no guarantee of finding the solution.
• The DFS algorithm goes for deep-down searching, and sometimes it may go into an infinite loop.

Example:

In the below search tree, we have shown the flow of depth-first search, and it will follow the order: Root node ---> Left node ---> Right node.

It will start searching from root node S and traverse A, then B, then D and E; after traversing E it will backtrack the tree, as E has no other successor and the goal node is still not found. After backtracking it will traverse node C and then G, and here it will terminate, as it has found the goal node.

Fig: Depth First Search tree (levels 0 to 2).

Completeness: The DFS search algorithm is complete within finite state space, as it will expand every node within a limited search tree.

Time Complexity: The time complexity of DFS will be equivalent to the number of nodes traversed by the algorithm. It is given by:

T(n) = 1 + n^2 + n^3 + ... + n^m = O(n^m)

where m = the maximum depth of any node, and this can be much larger than d (the shallowest solution depth).

Space Complexity: The DFS algorithm needs to store only a single path from the root node, hence the space complexity of DFS is equivalent to the size of the fringe set, which is O(bm).

Optimality: The DFS search algorithm is non-optimal, as it may generate a large number of steps or a high cost to reach the goal node.

Informed Search Algorithms

The informed search algorithm contains an array of knowledge, such as how far we are from the goal, the path cost, how to reach the goal node, etc. This knowledge helps agents to explore less of the search space and find the goal node more efficiently.

The informed search algorithm is more useful for large search spaces. Informed search algorithms use the idea of a heuristic, so they are also called Heuristic search.

Heuristic function: A heuristic is a function which is used in Informed Search, and it finds the most promising path. It takes the current state of the agent as its input and produces an estimation of how close the agent is to the goal. The heuristic method, however, might not always give the best solution, but it is guaranteed to find a good solution in reasonable time. The heuristic function estimates how close a state is to the goal. It is represented by h(n), and it calculates the cost of an optimal path between the pair of states. The value of the heuristic function is always positive.

Admissibility of the heuristic function is given as:

h(n) <= h*(n)

Here h(n) is the heuristic (estimated) cost and h*(n) is the true cost of an optimal path to the goal. Hence the heuristic cost should be less than or equal to the true cost for the heuristic to be admissible.

Pure Heuristic Search:

Pure heuristic search is the simplest form of heuristic search algorithms. It expands nodes based on their heuristic value h(n). It maintains two lists, an OPEN and a CLOSED list. In the CLOSED list, it places those nodes which have already been expanded, and in the OPEN list, it places nodes which have not yet been expanded.

On each iteration, the node n with the lowest heuristic value is expanded; all its successors are generated and n is placed in the CLOSED list. The algorithm continues until a goal state is found.

In the informed search we will discuss two main algorithms, which are given below:

1. Best First Search Algorithm (Greedy Search)

The greedy best-first search algorithm always selects the path which appears best at that moment. It is the combination of depth-first search and breadth-first search algorithms. It uses the heuristic function and search. Best-first search allows us to take the advantages of both algorithms. With the help of best-first search, at each step, we can choose the most promising node. In the best first search algorithm, we expand the node which is closest to the goal node, and the closest cost is estimated by the heuristic function, i.e.

f(n) = h(n)

where h(n) = the estimated cost from node n to the goal.

The greedy best-first algorithm is implemented by a priority queue.
Best first search algorithm:

o Step 1: Place the starting node into the OPEN list.
o Step 2: If the OPEN list is empty, stop and return failure.
o Step 3: Remove the node n from the OPEN list which has the lowest value of h(n), and place it in the CLOSED list.
o Step 4: Expand the node n, and generate the successors of node n.
o Step 5: Check each successor of node n and find whether any node is a goal node or not. If any successor node is a goal node, then return success and terminate the search, else proceed to Step 6.
o Step 6: For each successor node, the algorithm checks its evaluation function f(n), and then checks whether the node has been in either the OPEN or CLOSED list. If the node has not been in either list, then add it to the OPEN list.
o Step 7: Return to Step 2.
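A minimal sketch in Python of this procedure using a priority queue (heapq) keyed on h(n); the graph and heuristic values are made-up assumptions, chosen only so that the result mirrors the S ---> B ---> F ---> G path of the worked example further below:

```python
# Greedy best-first search: always expand the OPEN node with the lowest h(n).

import heapq

def greedy_best_first_search(graph, h, start, goal):
    open_list = [(h[start], [start])]        # priority queue keyed on h(n)
    closed = set()
    while open_list:
        _, path = heapq.heappop(open_list)   # node with the lowest heuristic value
        node = path[-1]
        if node == goal:
            return path
        if node in closed:
            continue
        closed.add(node)
        for neighbour in graph.get(node, []):
            if neighbour not in closed:
                heapq.heappush(open_list, (h[neighbour], path + [neighbour]))
    return None

graph = {'S': ['A', 'B'], 'A': ['E', 'F'], 'B': ['F'], 'F': ['G'], 'E': [], 'G': []}
h = {'S': 13, 'A': 12, 'B': 4, 'E': 8, 'F': 2, 'G': 0}   # illustrative h-values
print(greedy_best_first_search(graph, h, 'S', 'G'))      # ['S', 'B', 'F', 'G']
```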

Advantages:
• Best first search can switch between BFS and DFS, gaining the advantages of both algorithms.
• This algorithm is more efficient than the BFS and DFS algorithms.

Disadvantages:
• It can behave as an unguided depth-first search in the worst case scenario.
• It can get stuck in a loop, as DFS can.
• This algorithm is not optimal.

Example:

Consider the below search problem; we will traverse it using greedy best-first search. At each iteration, each node is expanded using the evaluation function f(n) = h(n), which is given in the below table.

Table: heuristic value h(n) for each node of the search graph.

In this search example, we are using two lists, the OPEN and CLOSED lists. Following are the iterations for traversing the above example.

Fig: Search graph for the greedy best-first search example.

Expand the nodes of S and put them in the CLOSED list:

Initialization: Open [A, B], Closed [S]
Iteration 1: Open [A], Closed [S, B]
Iteration 2: Open [E, F, A], Closed [S, B]
             Open [E, A], Closed [S, B, F]
Iteration 3: Open [I, G, E, A], Closed [S, B, F]
             Open [I, E, A], Closed [S, B, F, G]

Hence the final solution path will be: S ---> B ---> F ---> G

Time Complexity: The worst case time complexity of Greedy best first search is O(b^m).

Space Complexity: The worst case space complexity of Greedy best first search is O(b^m), where m is the maximum depth of the search space.

Complete: Greedy best-first search is also incomplete, even if the given state space is finite.

Optimal: The Greedy best first search algorithm is not optimal.

Constraint Satisfaction Problem

Another type of problem-solving technique is known as the Constraint Satisfaction technique. From the name, it is understood that constraint satisfaction means solving a problem under certain constraints or rules.

Constraint satisfaction is a technique where a problem is solved when its values satisfy certain constraints or rules of the problem. Such a type of technique leads to a deeper understanding of the problem structure as well as its complexity.

Constraint satisfaction depends on three components, namely:
• X: It is a set of variables.
• D: It is a set of domains where the variables reside. There is a specific domain for each variable.
• C: It is a set of constraints which are followed by the set of variables.

In constraint satisfaction, domains are the spaces where the variables reside, following the problem-specific constraints. These are the three main elements of the constraint satisfaction technique. The constraint value consists of a pair of {scope, rel}. The scope is a tuple of variables which participate in the constraint, and rel is a relation which includes a list of values which the variables can take to satisfy the constraints of the problem.
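A minimal sketch in Python of the three components X, D and C for a tiny made-up map-colouring problem (the region names and constraints are illustrative assumptions):

```python
# X: variables, D: a domain per variable, C: constraints (here: pairs must differ).

X = ['WA', 'NT', 'SA']                               # variables
D = {v: ['red', 'green', 'blue'] for v in X}         # a domain for each variable
C = [('WA', 'NT'), ('WA', 'SA'), ('NT', 'SA')]       # constrained pairs of variables

def satisfies(assignment):
    """True if the (possibly partial) assignment violates no constraint."""
    return all(assignment[a] != assignment[b]
               for a, b in C if a in assignment and b in assignment)

print(satisfies({'WA': 'red', 'NT': 'green'}))       # consistent partial assignment
print(satisfies({'WA': 'red', 'NT': 'red'}))         # violates ('WA', 'NT')
```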

Solving Constraint Satisfaction Problems

The requirements to solve a constraint satisfaction problem (CSP) are:
• A state-space
• The notion of the solution.

A state in the state-space is defined by assigning values to some or all variables, such as {X1 = v1, X2 = v2, and so on}.

An assignment of values to a variable can be done in three ways:
• Consistent or Legal Assignment: An assignment which does not violate any constraint or rule is called a consistent or legal assignment.
• Complete Assignment: An assignment where every variable is assigned a value, and the solution to the CSP remains consistent. Such an assignment is known as a complete assignment.
• Partial Assignment: An assignment which assigns values to some of the variables only. Such types of assignments are called partial assignments.

Types of Domains in CSP

There are the following two types of domains which are used by the variables:
• Discrete Domain: It is an infinite domain which can have one state for multiple variables. For example, a start state can be allocated infinite times for each variable.
• Finite Domain: It is a finite domain which can have continuous states describing one domain for one specific variable. It is also called a continuous domain.

Constraint Types in CSP

With respect to the variables, basically there are the following types of constraints:
• Unary Constraints: It is the simplest type of constraint, which restricts the value of a single variable.
• Binary Constraints: It is the constraint type which relates two variables, e.g. a variable x2 must take a value which lies between x1 and x3.
• Global Constraints: It is the constraint type which involves an arbitrary number of variables.

Constraint Satisfaction Problems as Search Problems

A CSP can be viewed as a standard search problem as follows:

• Initial state: the empty assignment {}, in which all variables are unassigned.
• Successor function: a value can be assigned to any unassigned variable, provided that it does not conflict with previously assigned variables.
• Goal test: the current assignment is complete.
• Path cost: a constant cost (e.g., 1) for every step.

Backtracking Search for CSPs

A variant of depth-first search called backtracking search uses less memory: only one successor is generated at a time rather than all successors, so only O(m) memory is needed rather than O(bm).

The term backtracking search is used for a depth-first search that chooses values for one variable at a time and backtracks when a variable has no legal values left to assign (see the sketch below).
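A minimal sketch in Python of backtracking search on the same kind of made-up map-colouring CSP as above (variable names and domains are illustrative assumptions):

```python
# Backtracking search: assign one variable at a time; undo and try the next
# value when no consistent extension exists.

X = ['WA', 'NT', 'SA']
D = {v: ['red', 'green', 'blue'] for v in X}
C = [('WA', 'NT'), ('WA', 'SA'), ('NT', 'SA')]       # connected regions must differ

def consistent(assignment):
    return all(assignment[a] != assignment[b]
               for a, b in C if a in assignment and b in assignment)

def backtracking_search(assignment):
    if len(assignment) == len(X):                    # complete, consistent assignment
        return assignment
    var = next(v for v in X if v not in assignment)  # choose an unassigned variable
    for value in D[var]:
        assignment[var] = value
        if consistent(assignment):
            result = backtracking_search(assignment)
            if result is not None:
                return result
        del assignment[var]                          # backtrack
    return None

print(backtracking_search({}))   # e.g. {'WA': 'red', 'NT': 'green', 'SA': 'blue'}
```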
8-queens problem

The goal of the 8-queens problem is to place 8 queens on the chessboard such that no queen attacks any other. (A queen attacks any piece in the same row, column or diagonal.) The figure shows an attempted solution that fails: the queen in the rightmost column is attacked by the queen at the top left.

Fig: An attempted 8-queens solution that fails.

An incremental formulation involves operators that augment the state description, starting with an empty state; for the 8-queens problem, this means each action adds a queen to the state. A complete-state formulation starts with all 8 queens on the board and moves them around. In either case the path cost is of no interest, because only the final state counts.

The first incremental formulation one might try is the following:

o States: Any arrangement of 0 to 8 queens on the board is a state.
o Initial state: No queens on the board.
o Successor function: Add a queen to any empty square.
o Goal test: 8 queens are on the board, none attacked.

In this formulation, we have 64 x 63 x ... x 57 ≈ 3 x 10^14 possible sequences to investigate. A better formulation would prohibit placing a queen in any square that is already attacked:

o States: Arrangements of n queens (0 <= n <= 8), one per column in the leftmost n columns, with no queen attacking another, are states.
o Successor function: Add a queen to any square in the leftmost empty column such that it is not attacked by any other queen.

This formulation reduces the 8-queens state space from 3 x 10^14 to just 2,057, and solutions are easy to find (see the sketch below). For 100 queens, the initial formulation has roughly 10^400 states, whereas the improved formulation has about 10^52 states. This is a huge reduction, but the improved state space is still too big for the algorithms to handle.
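A minimal sketch in Python of the improved column-by-column formulation, solved with simple backtracking (the function names are illustrative, not from the original notes):

```python
# Place queens one per column, left to right, only on squares that are not
# attacked by queens already placed; backtrack from dead ends.

def attacks(placed, row):
    """True if a queen in the next column at 'row' is attacked by a placed queen."""
    col = len(placed)
    return any(r == row or abs(r - row) == abs(c - col)
               for c, r in enumerate(placed))

def solve_queens(n, placed=()):
    if len(placed) == n:                 # all n queens placed, none attacking
        return placed
    for row in range(n):
        if not attacks(placed, row):
            result = solve_queens(n, placed + (row,))
            if result is not None:
                return result
    return None                          # dead end: backtrack

print(solve_queens(8))   # rows of the queens in columns 0..7, e.g. (0, 4, 7, 5, 2, 6, 1, 3)
```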

The 8-puzzle

An 8-puzzle consists of a 3x3 board with eight numbered tiles and a blank space. A tile adjacent to the blank space can slide into the space. The object is to reach the goal state, as shown in the figure.

Fig: A typical 8-puzzle start state and goal state.

A typical instance of the 8-puzzle formulation is as follows:

o States: A state description specifies the location of each of the eight tiles and the blank in one of the nine squares.
o Initial state: Any state can be designated as the initial state. It can be noted that any given goal can be reached from exactly half of the possible initial states.
o Successor function: This generates the legal states that result from trying the four actions (blank moves Left, Right, Up or Down).
o Goal test: This checks whether the state matches the goal configuration shown in the figure (other goal configurations are possible).
o Path cost: Each step costs 1, so the path cost is the number of steps in the path.

The 8-puzzle belongs to the family of sliding-block puzzles, which are often used as test problems for new search algorithms in AI. This general class is known to be NP-complete.

• The 8-puzzle has 9!/2 = 181,440 reachable states and is easily solved.
• The 15-puzzle (on a 4 x 4 board) has around 1.3 trillion states, and random instances are solved optimally in a few milliseconds by the best search algorithms.
• The 24-puzzle (on a 5 x 5 board) has around 10^25 states, and random instances are still quite difficult to solve optimally with current machines and algorithms.
Machine Learning

Learning is a continuous process of improvement over experience. Machine learning is building machines that can adapt and learn from experience without being explicitly programmed.

Definition

• A computer program which learns from experience is called a machine learning program or simply a learning program.
• Such a program is sometimes also referred to as a learner.

In machine learning,
• There is a learning algorithm.
• Data, called the training data set, is fed to the learning algorithm.
• The learning algorithm draws inferences from the training data set.
• It generates a model, which is a function that maps input to the output.

Fig: Machine Learning Model - the training data set is fed to the learning algorithm, which produces a model; the model maps input (testing data set) to output.
"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."

The above definition basically focuses on three parameters, which are also the main components of any learning algorithm, namely Task (T), Performance (P) and Experience (E). In this context, we can simplify this definition as:

ML is a field of AI consisting of learning algorithms that:
• Improve their performance (P)
• At executing some task (T)
• Over time with experience (E)

Based on the above, the following diagram represents a Machine Learning Model.

Fig: Machine Learning Model - Task (T), Performance (P) and Experience (E).
Task (T)

From the perspective of a problem, we may define the task T as the real-world problem to be solved. The problem can be anything, like finding the best house price in a specific location or finding the best marketing strategy, etc. On the other hand, if we talk about machine learning, the definition of task is different, because it is difficult to solve ML-based tasks by a conventional programming approach. A task T is said to be an ML-based task when it is based on the process the system must follow for operating on data points. Examples of ML-based tasks are Classification, Regression, Structured annotation, Clustering, Transcription, etc.

Experience (E)

As the name suggests, it is the knowledge gained from the data points provided to the algorithm or model. Once provided with the dataset, the model will run iteratively and will learn some inherent pattern. The learning thus acquired is called experience (E). Making an analogy with human learning, we can think of this situation as one in which a human being is learning or gaining some experience from various attributes like situations, relationships, etc. Supervised, unsupervised and reinforcement learning are some ways to learn or gain experience. The experience gained by our ML model or algorithm will be used to solve the task T.

Performance (P)

An ML algorithm is supposed to perform the task and gain experience with the passage of time. The measure which tells whether the ML algorithm is performing as per expectation or not is its performance (P). P is basically a quantitative metric that tells how a model is performing the task T using its experience E. There are many metrics that help to understand the ML performance, such as accuracy score, F1 score, confusion matrix, precision, recall, sensitivity, etc.
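A minimal sketch in Python of computing a few of these performance metrics with scikit-learn, on made-up true and predicted labels (assuming scikit-learn is installed):

```python
# Comparing ground-truth labels with a model's predictions using common metrics.

from sklearn.metrics import accuracy_score, f1_score, confusion_matrix

y_true = [1, 0, 1, 1, 0, 1, 0, 0]     # ground-truth labels (illustrative)
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]     # labels predicted by some model

print("Accuracy:", accuracy_score(y_true, y_pred))
print("F1 score:", f1_score(y_true, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_true, y_pred))
```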

Examples

i) Handwriting recognition learning problem
• Task T: Recognising and classifying handwritten words within images
• Performance P: Percent of words correctly classified
• Training experience E: A dataset of handwritten words with given classifications

ii) A robot driving learning problem
• Task T: Driving on highways using vision sensors
• Performance measure P: Average distance traveled before an error
• Training experience E: A sequence of images and steering commands recorded while observing a human driver

iii) A chess learning problem
• Task T: Playing chess
• Performance measure P: Percent of games won against opponents
• Training experience E: Playing practice games against itself

Classes of Learning

Machine learning is a subset of AI, which enables the machine to automatically learn from data, improve performance from past experiences, and make predictions.

Based on the methods and way of learning, machine learning is divided into mainly three types, which are:

1. Supervised Machine Learning
2. Unsupervised Machine Learning
3. Reinforcement Learning

1. Supervised Machine Learning

As its name suggests, supervised machine learning is based on supervision. It means that in the supervised learning technique, we train the machines using the "labelled" dataset, and based on the training, the machine predicts the output. Here, the labelled data specifies that some of the inputs are already mapped to the output. More precisely, we can say: first, we train the machine with the input and corresponding output, and then we ask the machine to predict the output using the test dataset.

Let's understand supervised learning with an example. Suppose we have an input dataset of cat and dog images. So, first, we will provide the training to the machine to understand the images, such as the shape & size of the tail of the cat and dog, the shape of the eyes, colour, height (dogs are taller, cats are smaller), etc. After completion of training, we input the picture of a cat and ask the machine to identify the object and predict the output. Now, the machine is well trained, so it will check all the features of the object, such as height, shape, colour, eyes, ears, tail, etc., and find that it's a cat. So, it will put it in the Cat category. This is the process of how the machine identifies the objects in Supervised Learning.

The main goal of the supervised learning technique is to map the input variable (x) with the output variable (y). Some real-world applications of supervised learning are Risk Assessment, Fraud Detection, Spam filtering, etc.

Categories of Supervised Machine Learning

Supervised machine learning can be classified into two types of problems, which are given below:

• Classification
• Regression

a) Classification

Classification algorithms are used to solve classification problems in which the output variable is categorical, such as "Yes" or "No", Male or Female, Red or Blue, etc. The classification algorithms predict the categories present in the dataset. Some real-world examples of classification algorithms are Spam Detection, Email filtering, etc. A small example follows the list below.

Some popular classification algorithms are given below:

o Random Forest Algorithm
o Decision Tree Algorithm
o Logistic Regression Algorithm
o Support Vector Machine Algorithm
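A minimal sketch in Python of supervised classification with one of the algorithms listed above (a decision tree from scikit-learn); the tiny cat/dog feature dataset is made up for illustration:

```python
# Train on labelled examples, then evaluate predictions on held-out test data.

from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# features: [height_cm, tail_length_cm]; labels: 0 = cat, 1 = dog (illustrative)
X = [[25, 30], [24, 28], [23, 29], [60, 35], [55, 33], [58, 36]]
y = [0, 0, 0, 1, 1, 1]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)
model = DecisionTreeClassifier().fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```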

b) Regression

Regression algorithms are used to solve regression problems in which there is a linear relationship between input and output variables. These are used to predict continuous output variables, such as market trends, weather prediction, etc. A small example follows the list below.

Some popular regression algorithms are given below:

o Simple Linear Regression Algorithm
o Multivariate Regression Algorithm
o Decision Tree Algorithm
o Lasso Regression
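A minimal sketch in Python of simple linear regression with scikit-learn, fitted to made-up (x, y) points with a roughly linear relationship:

```python
# Fit a line to continuous data and predict a continuous output.

from sklearn.linear_model import LinearRegression

X = [[1], [2], [3], [4], [5]]          # single input variable
y = [1.2, 1.9, 3.1, 3.9, 5.1]          # continuous output variable

model = LinearRegression().fit(X, y)
print("slope:", model.coef_[0], "intercept:", model.intercept_)
print("prediction for x=6:", model.predict([[6]])[0])
```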

Advantages and Disadvantages of Supervised Learning

Advantages:

Since supervised learning works with the labelled dataset, we can have an exact idea about the classes of objects.
These algorithms are helpful in predicting the output on the basis of prior experience.

Disadvantages:

These algorithms are not able to solve complex tasks.


It may predict the wrong output if the test data is different from the training data.
It requires lots of computational time to train the algorithm.

2. Unsupervised Machine Learning

Unsupervised learning is different from the Supervised learning technique; as its


name suggests, there is no need for supervision. It means, in unsupervised machine learning,
the machine is trained using the unlabeled dataset, and the machine predicts the output
without any supervision.

In unsupervised learning, the models are trained with data that is neither classified nor labelled, and the model acts on that data without any supervision.

The main aim of the unsupervised learning algorithm is to group or categorise the unsorted dataset according to the similarities, patterns, and differences. Machines are instructed to find the hidden patterns from the input dataset.

Let's take an example to understand it more precisely: suppose there is a basket of fruit images, and we input it into the machine learning model. The images are totally unknown to the model, and the task of the machine is to find the patterns and categories of the objects.

So, now the machine will discover its patterns and differences, such as colour difference and shape difference, and predict the output when it is tested with the test dataset.

Categories of Unsupervised Machine Learning

Unsupervised learning can be further classified into two types, which are given below:

• Clustering
• Association

1) Clustering

The clustering technique is used when we want to find the inherent groups in the data. It is a way to group the objects into a cluster such that the objects with the most similarities remain in one group and have fewer or no similarities with the objects of other groups. An example of the clustering algorithm is grouping the customers by their purchasing behaviour (a small example follows the list below).

Some of the popular clustering algorithms are given below:

• K-Means Clustering algorithm
• Mean-shift algorithm
• DBSCAN Algorithm
• Principal Component Analysis
• Independent Component Analysis
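A minimal sketch in Python of clustering with k-means from scikit-learn, grouping made-up, unlabelled 2-D points into two clusters:

```python
# Unsupervised grouping: no labels are given, only the raw points.

from sklearn.cluster import KMeans

X = [[1, 2], [1, 4], [1, 0],          # points near one region
     [10, 2], [10, 4], [10, 0]]       # points near another region

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("cluster labels:", kmeans.labels_)
print("cluster centres:", kmeans.cluster_centers_)
```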

2) Association

Association rule learning is an unsupervised learning technique which finds interesting relations among variables within a large dataset. The main aim of this learning algorithm is to find the dependency of one data item on another data item and map those variables accordingly so that it can generate maximum profit. This algorithm is mainly applied in Market Basket analysis, Web usage mining, continuous production, etc.

Some popular algorithms of Association rule learning are the Apriori Algorithm, Eclat, and the FP-growth algorithm.
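A minimal sketch in Python of the support and confidence measures that these algorithms build on, computed over a made-up set of market-basket transactions (this illustrates the underlying idea, not an implementation of Apriori itself):

```python
# Support: how often an itemset occurs; confidence: how often the rule holds.

transactions = [
    {'bread', 'butter', 'milk'},
    {'bread', 'butter'},
    {'bread', 'jam'},
    {'milk', 'butter'},
    {'bread', 'butter', 'jam'},
]

n = len(transactions)
support_bread = sum('bread' in t for t in transactions) / n
support_both = sum({'bread', 'butter'} <= t for t in transactions) / n
confidence = support_both / support_bread       # estimate of P(butter | bread)

print("support({bread, butter}) =", support_both)
print("confidence(bread -> butter) =", confidence)
```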

Advantages and Disadvantages of Unsupervised Learning Algorithms

Advantages:
• These algorithms can be used for more complicated tasks compared to the supervised ones, because these algorithms work on the unlabeled dataset.
• Unsupervised algorithms are preferable for various tasks, as getting the unlabeled dataset is easier as compared to the labelled dataset.

Disadvantages:
• The output of an unsupervised algorithm can be less accurate, as the dataset is not labelled and the algorithms are not trained with the exact output beforehand.
• Working with unsupervised learning is more difficult, as it works with an unlabelled dataset that does not map to the output.

Reinforcement Learning

Reinforcement learning works on a feedback-based process, in which an AI agent (a software component) automatically explores its surroundings by hit and trial, taking actions, learning from experience, and improving its performance. The agent gets rewarded for each good action and gets punished for each bad action; hence the goal of a reinforcement learning agent is to maximize the rewards.

In reinforcement learning, there is no labelled data as in supervised learning, and agents learn from their experiences only.

The reinforcement learning process is similar to that of a human being; for example, a child learns various things by experience in his day-to-day life. An example of reinforcement learning is playing a game, where the game is the environment, the moves of an agent at each step define states, and the goal of the agent is to get a high score. The agent receives feedback in terms of punishment and rewards.

Due to its way of working, reinforcement learning is employed in different fields such as game theory, operations research, information theory, and multi-agent systems.

A reinforcement learning problem can be formalized using a Markov Decision Process (MDP). In MDP, the agent constantly interacts with the environment and performs actions; at each action, the environment responds and generates a new state.
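The sketch below illustrates this agent-environment loop with a toy Q-learning update (Q-learning is one concrete RL algorithm, not something prescribed by the text); the 3-state chain environment, rewards, and all constants are invented for illustration only.

# A toy MDP feedback loop: the agent acts, the environment returns a new
# state and a reward, and the agent updates its action-value table.
import random

n_states, n_actions = 3, 2          # states 0..2, actions: 0 = left, 1 = right
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.1, 0.9, 0.2

def step(state, action):
    """Illustrative environment: moving right from the last state pays +1."""
    if action == 1 and state == n_states - 1:
        return 0, 1.0                                   # reward, episode restarts
    next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    return next_state, 0.0

state = 0
for _ in range(1000):
    # epsilon-greedy: mostly exploit the best known action, sometimes explore
    if random.random() < epsilon:
        action = random.randrange(n_actions)
    else:
        action = max(range(n_actions), key=lambda a: Q[state][a])
    next_state, reward = step(state, action)
    # move Q towards (reward + discounted best future value)
    Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
    state = next_state

print(Q)   # "go right" should end up with the higher value in every state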

Categories of Reinforcement Learning

Reinforcement learning is categorized mainly into two types of methods/algorithms:

o Positive Reinforcement Learning: Positive reinforcement learning specifies increasing the tendency that the required behaviour will occur again by adding something. It enhances the strength of the agent's behaviour and positively impacts it.

o Negative Reinforcement Learning: Negative reinforcement learning works exactly opposite to positive RL. It increases the tendency that the specific behaviour will occur again by avoiding the negative condition.

Advantages

o It helps in solving complex real-world problems which are difficult to solve by general techniques.
o The learning model of RL is similar to the learning of human beings; hence highly accurate results can be found.
o It helps in achieving long-term results.

Disadvantages

o RL algorithms are not preferred for simple problems.
o RL algorithms require huge amounts of data and computation.
o Too much reinforcement learning can lead to an overload of states, which can weaken the results.

Process of Machine Learning

A machine learning workflow refers to the series of stages or steps involved in the process of building a successful machine learning system.

The various stages involved in the machine learning workflow are

1. Data Collection
2. Data Preparation
3. Choosing Learning Algorithm
4. Training Model
5. Evaluating Model
6. Predictions
[Figure: Machine Learning Workflow. Data Collection -> Data Preparation -> Choosing Learning Algorithm -> Training Model -> Evaluating Model -> Predictions]

1. Data Collection
In this stage,

• Data is collected from different sources.
• The type of data collected depends upon the type of project.
• Data may be collected from various sources such as files, databases, etc.
• The quality and quantity of gathered data directly affect the accuracy of the desired system.

2. Data Preparation
In this stage,
• Data preparation is done to clean the raw data.
• Data collected from the real world is transformed into a clean dataset.
• Raw data may contain missing values, inconsistent values, duplicate instances, etc.
• So, raw data cannot be directly used for building a model.

Different methods of cleaning the dataset are

• Ignoring the missing values
• Removing instances having missing values from the dataset
• Estimating the missing values of instances using mean, median or mode
• Removing duplicate instances from the dataset
• Normalizing the data in the dataset

This is the most time-consuming stage in the machine learning workflow (a minimal cleaning sketch is given below).
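The following is a small cleaning sketch with pandas; the tiny DataFrame and column names are invented purely to illustrate the steps listed above.

# A minimal data-preparation sketch: de-duplicate, impute, normalize.
import pandas as pd

raw = pd.DataFrame({
    "age":    [25, 30, None, 30, 45],
    "salary": [50000, 60000, 55000, 60000, None],
})

clean = raw.drop_duplicates()                        # remove duplicate instances
clean = clean.fillna(clean.mean(numeric_only=True))  # estimate missing values with the mean
# min-max normalization so every column lies between 0 and 1
clean = (clean - clean.min()) / (clean.max() - clean.min())

print(clean)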
3. Choosing Learning Algorithm
In this stage,

• The best-performing learning algorithm is researched.
• It depends upon the type of problem that needs to be solved and the type of data we have.
• If the problem is to classify and the data is labeled, classification algorithms are used.
• If the problem is to perform a regression task and the data is labeled, regression algorithms are used.
• If the problem is to create clusters and the data is unlabeled, clustering algorithms are used.
The following chart provides an overview of learning algorithms:

[Chart: Machine Learning algorithms. Supervised Learning splits into Classification (K-Nearest Neighbor, Naive Bayes, Decision Trees, Support Vector Machine, Logistic Regression, Boosting, Ensemble Methods) and Regression (Linear Regression, Support Vector Regression, Decision Trees, Gaussian Process Regression). Unsupervised Learning includes Gaussian Mixtures, K-Means Clustering, Hierarchical Clustering, and Spectral Clustering.]

4. Training Model
In this stage,

• The model is trained to improve its ability.
• The dataset is divided into a training dataset and a testing dataset.
• The training/testing split is usually of the order of 80/20 or 70/30.
• It also depends upon the size of the dataset.
• The training dataset is used for the training purpose.
• The testing dataset is used for the testing purpose.
• The training dataset is fed to the learning algorithm.
• The learning algorithm finds a mapping between the input and the output and generates the model (see the split sketch below).

[Figure: Training Data Set -> Learning Algorithm -> Model]
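A minimal sketch of the 80/20 split described above, using scikit-learn; the feature matrix X and labels y here are synthetic placeholders.

# Split synthetic data into a training set and a kept-aside testing set.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((100, 4))                   # 100 hypothetical samples, 4 features
y = (X[:, 0] + X[:, 1] > 1).astype(int)    # a made-up binary label

# 80/20 split: the training set is fed to the learning algorithm,
# the testing set is kept aside for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

print(X_train.shape, X_test.shape)         # (80, 4) (20, 4)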

5. Evaluating Model

In this stage,

• The model is evaluated to test if it is any good.
• The model is evaluated using the kept-aside testing dataset.
• It allows testing the model against data that has never been used before for training.
• Metrics such as accuracy, precision, recall, etc. are used to test the performance (a short sketch follows below).
• If the model does not perform well, the model is re-built using different hyperparameters.
• The accuracy may be further improved by tuning the hyperparameters.

[Figure: Testing Data Set -> Model -> Output]
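A minimal evaluation sketch under the same assumptions as the split above (synthetic data, scikit-learn): a model is fitted on 80% of the data and then scored on the held-out 20% with accuracy, precision, and recall.

# Evaluate a model on the kept-aside testing dataset.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score

rng = np.random.default_rng(0)
X = rng.random((200, 4))
y = (X[:, 0] + X[:, 1] > 1).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LogisticRegression().fit(X_train, y_train)

y_pred = model.predict(X_test)                       # test against unseen data
print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
print("recall   :", recall_score(y_test, y_pred))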
6. Predictions
In this stage,

• The built system is finally used to do something useful in the real world.
• Here, the true value of machine learning is realized.
Common types of Machine Learning algorithms

Linear Regression in Machine Learning

Linear regression is one of the easiest and most popular machine learning algorithms. It is a statistical method that is used for predictive analysis. Linear regression makes predictions for continuous/real or numeric variables such as sales, salary, age, product price, etc.

The linear regression algorithm shows a linear relationship between a dependent (y) variable and one or more independent (x) variables, hence it is called linear regression. Since linear regression shows a linear relationship, it finds how the value of the dependent variable changes according to the value of the independent variable.

The linear regression model provides a sloped straight line representing the relationship between the variables. Consider the image below:

[Figure: scatter of data points with the line of regression; dependent variable Y on the vertical axis, independent variable X on the horizontal axis]

Mathematically, we can represent a linear regression as:

y = B0 + B1x + ε

Here,

Y = Dependent Variable (Target Variable)
X = Independent Variable (Predictor Variable)
B0 = intercept of the line (gives an additional degree of freedom)
B1 = Linear regression coefficient (scale factor to each input value)
ε = random error

The values of the x and y variables are the training dataset for the linear regression model representation (a small fitting sketch is given below).
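A minimal sketch of fitting y = B0 + B1x by ordinary least squares with NumPy; the data points are invented for illustration.

# Fit a straight line to a handful of (x, y) points and predict a new value.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])     # independent variable
y = np.array([2.1, 4.1, 6.2, 7.9, 10.1])    # dependent variable

b1, b0 = np.polyfit(x, y, deg=1)            # slope (B1) and intercept (B0)
print(f"y = {b0:.2f} + {b1:.2f} * x")

y_pred = b0 + b1 * 6.0                      # prediction for a new x value
print("prediction for x = 6:", round(y_pred, 2))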

Logistic Regression in Machine Learning

o Logistic regression is one of the most popular machine learning algorithms, which comes under the supervised learning technique. It is used for predicting a categorical dependent variable using a given set of independent variables.
o Logistic regression predicts the output of a categorical dependent variable. Therefore the outcome must be a categorical or discrete value. It can be either Yes or No, 0 or 1, True or False, etc., but instead of giving the exact values 0 and 1, it gives probabilistic values which lie between 0 and 1.
o Logistic regression is much similar to linear regression except in how they are used. Linear regression is used for solving regression problems, whereas logistic regression is used for solving classification problems.
o In logistic regression, instead of fitting a regression line, we fit an "S"-shaped logistic function, which predicts two maximum values (0 or 1).
o The curve from the logistic function indicates the likelihood of something, such as whether the cells are cancerous or not, or whether a mouse is obese or not based on its weight, etc.
o Logistic regression is a significant machine learning algorithm because it has the ability to provide probabilities and classify new data using continuous and discrete datasets.
o Logistic regression can be used to classify observations using different types of data and can easily determine the most effective variables for the classification.

The image below shows the logistic function:

[Figure: the S-shaped logistic (sigmoid) curve, crossing 0.5 at z = 0]

Logistic Function (Sigmoid Function):

o The sigmoid function is a mathematical function used to map the predicted values to probabilities.
o It maps any real value into another value within a range of 0 and 1.

o The value of the logistic regression must be between 0 and 1, and it cannot go beyond this limit, so it forms a curve like the "S" form. The S-form curve is called the sigmoid function or the logistic function.
o The linear function is basically used as an input to another function, g, in the following relation:

h_θ(x) = g(θ^T x), where 0 ≤ h_θ ≤ 1

Here, g is the logistic or sigmoid function, which can be given as follows:

g(z) = 1 / (1 + e^(-z)), where z = θ^T x

o In logistic regression, we use the concept of the threshold value, which defines the probability of either 0 or 1. Values above the threshold value tend to 1, and values below the threshold value tend to 0 (see the sketch below).
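The following is a minimal sketch of the sigmoid function and the threshold rule described above; the input values and the 0.5 threshold are illustrative.

# Map real values to probabilities with the sigmoid, then apply a threshold.
import numpy as np

def sigmoid(z):
    """Maps any real value into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-6, -2, 0, 2, 6], dtype=float)
p = sigmoid(z)
print(p)                                  # probabilities between 0 and 1

threshold = 0.5
labels = (p >= threshold).astype(int)     # above the threshold -> 1, below -> 0
print(labels)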

2.2 Neural Network

A neural net is an artificial representation of the human brain that tries to simulate its learning process. An artificial neural network (ANN) is often called a "Neural Network" or simply a Neural Net (NN).

Traditionally, the term neural network referred to a network of biological neurons in the nervous system that process and transmit information.

An artificial neural network is an interconnected group of artificial neurons that uses a mathematical or computational model for information processing, based on a connectionist approach to computation.

Artificial neural networks are made of interconnecting artificial neurons which may share some properties of biological neural networks.

An artificial neural network is a network of simple processing elements (neurons) which can exhibit complex global behaviour, determined by the connections between the processing elements and the element parameters.

Mathematical Model of a Neuron

A very simplified model of real neurons is known as a Threshold Logic Unit (TLU). The model is said to have:

- A set of synapses (connections) that brings in activations from other neurons.
- A processing unit that sums the inputs, and then applies a non-linear activation function (i.e. squashing / transfer / threshold function).
- An output line that transmits the result to other neurons.

The McCulloch-Pitts Neuron: this is a simplified model of real neurons, known as a Threshold Logic Unit.

[Figure: inputs 1 to n feeding a summing unit with a threshold, producing a single output. Simplified Model of a Real Neuron (Threshold Logic Unit)]
- A set of input connections brings in activations from other neurons.
- A processing unit sums the inputs, and then applies a non-linear activation function (i.e. squashing / transfer / threshold function).
- An output line transmits the result to other neurons.

In other words:
- The input to a neuron arrives in the form of signals.
- The signals build up in the cell.
- Finally the cell discharges (the cell fires) through the output.
- The cell can start building up signals again.

The equation for the output of a McCulloch-Pitts neuron as a function of its 1 to n inputs is written as

Output = sgn( Σ_{i=1..n} Input_i - φ )

where φ is the neuron's activation threshold.

If Σ_{i=1..n} Input_i ≥ φ, then Output = 1
If Σ_{i=1..n} Input_i < φ, then Output = 0
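A minimal sketch of such a Threshold Logic Unit: it sums its inputs and fires (outputs 1) only when the sum reaches the activation threshold φ; the threshold of 2 below is illustrative and makes the unit behave like a 2-input AND gate.

# A Threshold Logic Unit (McCulloch-Pitts neuron) as a tiny Python function.
def tlu(inputs, threshold):
    total = sum(inputs)                     # processing unit sums the activations
    return 1 if total >= threshold else 0   # step ("squashing") function

# With threshold = 2, the unit only fires when both inputs are active.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", tlu([a, b], threshold=2))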
Association Learning

Association learning is a rule-based machine learning and data mining technique that finds important relations between variables or features in a data set. Unlike conventional association algorithms that measure degrees of similarity, association rule learning identifies hidden correlations in databases by applying some measure of interestingness to generate an association rule for new searches.

Association rule learning works on the concept of an If-Then statement, such as if A then B.

If A Then B

Here the If element is called the antecedent, and the Then statement is called the consequent. These types of relationships, where we can find some association or relation between two items, are known as single cardinality. It is all about creating rules, and if the number of items increases, then the cardinality also increases accordingly. So, to measure the associations between thousands of data items, there are several metrics. These metrics are given below:

o Support
o Confidence
o Lift

Support
Support is the frequency of A, or how frequently an item appears in the dataset. It is defined as the fraction of the transactions T that contain the itemset X. For an itemset X and transactions T, it can be written as:

Supp(X) = Freq(X) / T

Confidence

Confidence indicates how often the rule has been found to be true, i.e. how often the items X and Y occur together in the dataset when the occurrence of X is already given. It is the ratio of the transactions that contain X and Y to the number of transactions that contain X.

Confidence = Freq(X, Y) / Freq(X)
Lift

Lift is the strength of a rule, defined by the formula below (a small worked example of all three metrics follows the list below):

Lift = Supp(X, Y) / (Supp(X) x Supp(Y))

It is the ratio of the observed support to the expected support if X and Y were independent of each other. It has three possible values:

o If Lift = 1: the probability of occurrence of the antecedent and the consequent is independent of each other.
o Lift > 1: it determines the degree to which the two itemsets are dependent on each other.
o Lift < 1: it tells us that one item is a substitute for the other item, which means one item has a negative effect on another.
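The sketch below computes support, confidence, and lift for the rule "bread -> milk" over a tiny set of invented transactions (each transaction is a set of purchased items).

# Compute the three association-rule metrics over toy market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
T = len(transactions)

def support(*items):
    """Fraction of transactions that contain every item in `items`."""
    return sum(all(i in t for i in items) for t in transactions) / T

supp_x  = support("bread")
supp_y  = support("milk")
supp_xy = support("bread", "milk")

confidence = supp_xy / supp_x                 # Freq(X, Y) / Freq(X)
lift       = supp_xy / (supp_x * supp_y)      # Supp(X, Y) / (Supp(X) x Supp(Y))

print("support(bread -> milk)   :", supp_xy)
print("confidence(bread -> milk):", confidence)
print("lift(bread -> milk)      :", lift)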

Regression

Regression is a supervised learning technique which helps in finding the correlation between variables and enables us to predict a continuous output variable based on one or more predictor variables. It is mainly used for prediction, forecasting, time series modelling, and determining the causal-effect relationship between variables.

In regression, we plot a graph between the variables which best fits the given data points; using this plot, the machine learning model can make predictions about the data. In simple words, "Regression shows a line or curve that passes through all the data points on a target-predictor graph in such a way that the vertical distance between the data points and the regression line is minimum." The distance between the data points and the line tells whether a model has captured a strong relationship or not.

Some examples of regression are:

o Prediction of rain using temperature and other factors
o Determining market trends
o Prediction of road accidents due to rash driving

Some important types of regression are given below:

o Linear Regression
o Logistic Regression

Classification

Classification is a supervised learning technique that is used to identify the category of new observations on the basis of training data. In classification, a program learns from the given dataset or observations and then classifies a new observation into a number of classes or groups, such as Yes or No, 0 or 1, Spam or Not Spam, cat or dog, etc. Classes can be called targets/labels or categories.

Unlike regression, the output variable of classification is a category, not a value, such as "Green or Blue", "fruit or animal", etc. Since the classification algorithm is a supervised learning technique, it takes labeled input data, which means it contains inputs with the corresponding outputs.

In a classification algorithm, a discrete output function (y) is mapped to the input variable (x):

y = f(x), where y = categorical output

[Figure: data points separated into Class A and Class B by a decision boundary]

The algorithm which implements the classification on a dataset is known as a classifier. There are two types of classification:

o Binary Classifier: If the classification problem has only two possible outcomes, then it is called a binary classifier.
Examples: YES or NO, MALE or FEMALE, SPAM or NOT SPAM, CAT or DOG, etc. (a minimal binary-classifier sketch follows this list).
o Multi-class Classifier: If a classification problem has more than two outcomes, then it is called a multi-class classifier.
Examples: classification of types of crops, classification of types of music.
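A minimal binary-classifier sketch using a scikit-learn decision tree; the tiny "spam / not spam" feature set (two word counts) is invented for illustration.

# Learn labels from a handful of examples and classify a new observation.
from sklearn.tree import DecisionTreeClassifier

X_train = [[3, 0], [0, 2], [4, 1], [0, 3]]   # [count of "offer", count of "meeting"]
y_train = ["spam", "not spam", "spam", "not spam"]

clf = DecisionTreeClassifier().fit(X_train, y_train)
print(clf.predict([[2, 0]]))                 # likely ['spam']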
Natural Language Processing

NLP stands for Natural Language Processing, which is a part of Computer Science, human language, and Artificial Intelligence. It is the technology that enables machines to understand, analyse, manipulate, and interpret human languages. It helps developers to organize knowledge for performing tasks such as translation, automatic summarization, Named Entity Recognition (NER), speech recognition, relationship extraction, and topic segmentation.

[Figure: NLP at the intersection of Computer Science, Artificial Intelligence, and Human Language]

Components of NLP
There are the following two components of NLP:
1. Natural Language Understanding (NLU)
2. Natural Language Generation (NLG)

1. Natural Language Understanding (NLU)

Natural Language Understanding (NLU) helps the machine to understand and analyse human language by extracting metadata from content such as concepts, entities, keywords, emotion, relations, and semantic roles.
NLU is mainly used in business applications to understand the customer's problem in both spoken and written language.
NLU involves the following tasks:
o It is used to map the given input into a useful representation.
o It is used to analyse different aspects of the language.
2. Natural Language Generation (NLG)

Natural Language Generation (NLG) acts as a translator that converts computerized data into a natural language representation. It mainly involves Text planning, Sentence planning, and Text Realization.

Advantages

o NLP helps users to ask questions about any subject and get a direct response within seconds.
o NLP offers exact answers to the question, meaning it does not offer unnecessary and unwanted information.
o NLP helps computers to communicate with humans in their languages.
o It is very time efficient.
o Most companies use NLP to improve the efficiency of documentation processes and the accuracy of documentation, and to identify information from large databases.

Disadvantages
A list of disadvantages of NLP is given below:

o NLP may not show context.
o NLP is unpredictable.
o NLP may require more keystrokes.
o NLP is unable to adapt to a new domain and has a limited function; that is why NLP is built for a single and specific task only.

Automatic Speech Recognition

Automatic Speech Recognition, or ASR, is the use of machine learning or Artificial Intelligence (AI) technology to process human speech into readable text. Automatic speech recognition is a technology that converts speech to text in real time. ASR may also be called speech-to-text or, simply, a transcription system.

Speech recognition occurs when a computer receives audio input from a person speaking, processes that input by breaking down the various components of speech, and then transcribes that speech to text.

ASR systems are typically composed of three major components: the lexicon, the acoustic model, and the language model, which together decode an audio signal and provide the most appropriate transcription.

[Figure: Lexicon, Acoustic model, and Language model feeding a Decoding step, which outputs the most likely words spoken]

Lexicon

The lexicon is the primary step in decoding speech. The lexical design for an ASR system involves creating a comprehensive vocabulary, including the fundamental elements of both spoken language (the audio input the ASR system receives) and written language (the text the system sends out).

Acoustic Model

Acoustic modelling involves separating an audio signal into small time frames. Acoustic models analyse each frame and provide the probability of different phonemes being used in that section of audio. Simply put, acoustic models aim to predict which sound is spoken in each frame.

Language Model

ASR systems employ natural language processing (NLP) to help computers understand the context of what a speaker says. Language models recognize the intent of spoken phrases and use that knowledge to compose word sequences. They operate in a similar way to acoustic models, by using deep neural networks trained on text data to estimate the probability of which word comes next in a phrase.

A common language model that speech-recognition software uses to translate spoken words into text formats is N-gram probability, which is used in NLP.

An N-gram is a string of words. For example, "contact center" is a 2-gram, and "omnichannel contact center" is a 3-gram. N-gram probability works by predicting the next word in a sequence, based on known previous words and standard grammar rules.
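The following is a minimal bigram (2-gram) sketch: the probability of the next word is estimated from counts in a tiny, invented text corpus.

# Estimate P(next word | previous word) from bigram counts.
from collections import Counter

corpus = "contact center agents answer calls in the contact center".split()

unigrams = Counter(corpus)
bigrams  = Counter(zip(corpus, corpus[1:]))

def next_word_prob(prev, word):
    """P(word | prev) = count(prev, word) / count(prev)."""
    return bigrams[(prev, word)] / unigrams[prev]

print(next_word_prob("contact", "center"))   # 1.0: "contact" is always followed by "center"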

Together, the lexicon, acoustic model, and language model enable ASR systems
to make
close-to-accurate predictions about the words and sentences in an audio input.

Figuring out the speech-recognition accuracy of an ASR system requires calculating the word error rate (WER).

The formula for WER is:

WER = (substitutions + insertions + deletions) / the number of words spoken

While WER is a helpful metric to know, it's important to note that the utility of speech-recognition software shouldn't be judged on this metric alone. Variables such as a speaker's pronunciation of certain words, a speaker's recording or microphone quality, and background sounds can affect the WER of a speech-recognition tool. In many cases, even with the mentioned errors present, the decoded audio input may still prove valuable to a user.
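A short sketch of the WER formula above: substitutions, insertions, and deletions are counted together via a word-level edit distance between the reference transcript and the ASR hypothesis, then divided by the number of words actually spoken. The example sentences are invented.

# Compute WER as word-level edit distance / number of reference words.
def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # classic dynamic-programming edit distance over words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution (or match)
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # 1 deletion / 6 words ~ 0.17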

Robotics

Robotics is a separate field within Artificial Intelligence that studies the creation of intelligent robots or machines.

A robot is a machine that may look like a human and is capable of performing out-of-the-box actions and replicating certain human movements automatically by means of commands given to it using programming. Examples: Drug Compounding Robots, Automotive Industry Robots, Order Picking Robots, Industrial Floor Scrubbers and Sage Automation Gantry Robots, etc.


Several components construct a robot; these components are as follows:

o Actuators: Actuators are the devices that are responsible for moving and controlling a system or machine. They help to achieve physical movements by converting energy such as electrical, hydraulic, and air. Actuators can create linear as well as rotary motion.

o Power Supply: It is an electrical device that supplies electrical power to an electrical load. The primary function of the power supply is to convert electrical current to power the load.

o Electric Motors: These are the devices that convert electrical energy into mechanical energy and are required for the rotational motion of the machines.

o Pneumatic Air Muscles: Air muscles are soft pneumatic devices that are ideally suited for robotics. They can contract and extend, and operate by pressurized air filling a pneumatic bladder. Whenever air is introduced, they can contract up to 40%.

o Muscle Wires: These are made up of a nickel-titanium alloy called Nitinol and are very thin in shape. They can also extend and contract when a specific amount of heat and electric current is supplied, and can be formed and bent into different shapes when in their martensitic form. They can contract by 5% when electrical current passes through them.

o Piezo Motors and Ultrasonic Motors: Piezoelectric motors, or piezo motors, are electrical devices that receive an electric signal and apply a directional force to an opposing ceramic plate. They help a robot move in the desired direction. These are the best-suited electrical motors for industrial robots.

o Sensors: They provide abilities such as seeing, hearing, touch, and movement, like humans. Sensors are the devices or machines which help to detect events or changes in the environment and send data to the computer processor. These devices are usually equipped with other electronic devices. Similar to human organs, electrical sensors also play a crucial role in Artificial Intelligence and robotics. AI algorithms control robots by sensing the environment, and sensors provide real-time information to the computer processors.

[Figure: Components of a Robot - Actuators, Power Supply, Electric Motors, Pneumatic Air Muscles, Muscle Wires, Piezo Motors and Ultrasonic Motors, Sensors]
