I. Artificial Intelligence:
Type 1: Reactive machines:
An example is Deep Blue, the IBM chess program that beat Garry Kasparov in the
1990s. Deep Blue can identify pieces on the chessboard and make predictions, but it
has no memory and cannot use past experiences to inform future ones. It analyzes
possible moves -- its own and its opponent's -- and chooses the most strategic one.
Deep Blue and Google's AlphaGo were designed for narrow purposes and cannot
easily be applied to other situations.
Type 2: Limited memory:
These AI systems can use past experiences to inform future decisions. Some of the
decision-making functions in self-driving cars are designed this way. Observations
inform actions happening in the not-so-distant future, such as a car changing lanes.
These observations are not stored permanently.
Type 3: Theory of mind:
This psychology term refers to the understanding that others have their own beliefs,
desires and intentions that impact the decisions they make. This kind of AI does not
yet exist.
Type 4: Self-awareness:
In this category, AI systems have a sense of self and consciousness. Machines with
self-awareness understand their current state and can use that information to infer what
others are feeling. This type of AI does not yet exist.
3. Examples of AI technology:
AI is incorporated into a variety of different types of technology. Here are some examples.
II. Artificial Neural Networks:
Picture the neuron as an eyeless head. It is connected to other neurons by the
tentacle-like branches around it, called dendrites, and by the tail, which is called the
axon. Through these flow the electrical signals that form our perception of the world
around us.
Strangely enough, at the moment a signal is passed between an axon and a dendrite, the
two don’t actually touch. A gap exists between them. To continue its journey, the signal
must act like a stuntman jumping across a deep canyon on a dirtbike. The junction the
signal jumps across is called the synapse.
The inputs on the left side represent the incoming signals to the main neuron in the middle.
In a human neuron, these would include signals such as smell or touch.
In your neural network, these inputs are independent variables. They travel down the
synapses and emerge on the other side as output values. It is a like-for-like process, for
the most part.
The main difference between the biological process and its artificial counterpart is the level
of control you exert over the input values; the independent variables on the left-hand side.
An ANN usually involves a large number of processors operating in parallel and arranged in
tiers. The first tier receives the raw input information -- analogous to optic nerves in human
visual processing. Each successive tier receives the output from the tier preceding it, rather
than from the raw input -- in the same way neurons further from the optic nerve receive
signals from those closer to it. The last tier produces the output of the system.
Each processing node has its own small sphere of knowledge, including what it has seen and
any rules it was originally programmed with or developed for itself. The tiers are highly
interconnected, which means each node in tier n will be connected to many nodes in tier n-1
-- its inputs -- and in tier n+1, which provides input data for those nodes. There may be one
or multiple nodes in the output layer, from which the answer it produces can be read.
Artificial neural networks are notable for being adaptive, which means they modify
themselves as they learn from initial training and subsequent runs provide more information
about the world. The most basic learning model is centered on weighting the input streams,
which is how each node weights the importance of input data from each of its predecessors.
Inputs that contribute to getting right answers are weighted higher.
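A minimal sketch of this weighting scheme, with made-up signals, weights and threshold (none of these numbers come from the text):

```python
def node_output(inputs, weights, threshold=0.5):
    """Weighted sum of predecessor signals, then a simple threshold test."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

signals = [0.5, 0.9, 0.2]   # outputs arriving from three predecessor nodes
weights = [0.3, 0.8, 0.1]   # the second predecessor is weighted highest
print(node_output(signals, weights))  # → 1 (weighted sum 0.89 clears 0.5)
```

The second input dominates the sum because its weight is largest, which is exactly the sense in which "inputs that contribute to getting right answers are weighted higher."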
c) How do neural networks learn?
Typically, an ANN is initially trained or fed large amounts of data. Training consists of
providing input and telling the network what the output should be. For example, to build a
network that identifies the faces of actors, the initial training might be a series of pictures,
including actors, non-actors, masks, statuary and animal faces. Each input is accompanied by
the matching identification, such as actors' names, "not actor" or "not human" information.
Providing the answers allows the model to adjust its internal weightings to learn how to do
its job better.
For example, if nodes David, Dianne and Dakota tell node Ernie the current input image is a
picture of Brad Pitt, but node Durango says it is Betty White, and the training program
confirms it is Pitt, Ernie will decrease the weight it assigns to Durango's input and increase
the weight it gives to that of David, Dianne and Dakota.
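The adjustment can be sketched as follows; the node names come from the text, but the multiplicative update rule and its 10% step size are assumptions for illustration:

```python
weights = {"David": 1.0, "Dianne": 1.0, "Dakota": 1.0, "Durango": 1.0}
votes = {"David": "Pitt", "Dianne": "Pitt", "Dakota": "Pitt", "Durango": "White"}
confirmed = "Pitt"  # the training program's answer

for node, vote in votes.items():
    if vote == confirmed:
        weights[node] *= 1.1  # agreed with the label: weight goes up
    else:
        weights[node] *= 0.9  # disagreed: weight goes down

print(weights)  # David/Dianne/Dakota rise to 1.1; Durango falls to 0.9
```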
In defining the rules and making determinations -- that is, the decision of each node on what
to send to the next tier based on inputs from the previous tier -- neural networks use several
principles. These include gradient-based training, fuzzy logic, genetic algorithms and
Bayesian methods. They may be given some basic rules about object relationships in the
space being modeled.
For example, a facial recognition system might be instructed, "Eyebrows are found above
eyes," or, "Moustaches are below a nose. Moustaches are above and/or beside a
mouth." Preloading rules can make training faster and make the model more powerful
sooner. However, it also builds in assumptions about the nature of the problem space, which
may prove to be either irrelevant and unhelpful or incorrect and counterproductive, making
the decision about what, if any, rules to build in very important.
Here are some of the most important types of neural networks and their applications.
Feedforward Neural Network:
This is one of the simplest types of artificial neural networks. In a feedforward neural network,
the data passes through the different input nodes until it reaches the output node.
In other words, data moves in only one direction from the first tier onwards until it reaches
the output node. This is also known as a front propagated wave, which is usually achieved by
using a classifying activation function.
Unlike in more complex types of neural networks, there is no back propagation and data
moves in one direction only. A feedforward neural network may have a single layer or it may
have hidden layers.
In a feedforward neural network, the sum of the products of the inputs and their weights is
calculated. This is then fed to the output. Here is an example of a single layer feedforward
neural network.
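Since the figure is not reproduced here, a minimal sketch of that sum-of-products computation, with made-up weights and a step function standing in for the classifying activation:

```python
def step(z):
    """Classifying activation: maps the weighted sum to a class label."""
    return 1 if z > 0 else 0

def forward(inputs, weight_rows, biases):
    """Single feedforward tier: data moves in one direction only. Each output
    node takes the sum of the products of the inputs and its own weights."""
    return [step(sum(x * w for x, w in zip(inputs, row)) + b)
            for row, b in zip(weight_rows, biases)]

outputs = forward([1.0, 0.5],                 # input values
                  [[0.4, 0.6], [-0.7, 0.2]],  # one weight row per output node
                  [0.0, 0.1])                 # one bias per output node
print(outputs)  # → [1, 0]
```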
Feedforward neural networks are used in technologies like face recognition and computer
vision. This is because the target classes in these applications are hard to classify.
A simple feedforward neural network is equipped to deal with data that contains a lot of
noise. Feedforward neural networks are also relatively simple to maintain.
Radial Basis Function Neural Network:
A radial basis function considers the distance of any point relative to the center. Such neural
networks have two layers. In the inner layer, the features are combined with the radial basis
function.
The output of the inner layer is then weighted and summed to produce the final output of
the network. Here is a diagram which represents a radial basis function neural network.
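A small sketch of the two-layer computation, assuming a Gaussian radial basis function; the centres, weights and width are made-up values:

```python
import math

def rbf_output(x, centres, weights, sigma=1.0):
    """Inner layer: a Gaussian of the distance from x to each centre.
    Outer layer: a weighted sum of those activations."""
    activations = [math.exp(-((x - c) ** 2) / (2 * sigma ** 2)) for c in centres]
    return sum(w * a for w, a in zip(weights, activations))

# A point sitting exactly on the first centre activates it fully (1.0),
# while the second centre, two units away, responds only weakly.
print(rbf_output(0.0, centres=[0.0, 2.0], weights=[1.0, 1.0]))  # → ≈1.1353
```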
The radial basis function neural network is applied extensively in power restoration systems.
In recent decades, power systems have become bigger and more complex.
This increases the risk of a blackout. This neural network is used in the power restoration
systems in order to restore power in the shortest possible time.
Multilayer Perceptron:
A multilayer perceptron has three or more layers. It is used to classify data that cannot be
separated linearly. It is a type of artificial neural network that is fully connected. This is
because every single node in a layer is connected to each node in the following layer.
This type of neural network is applied extensively in speech recognition and machine
translation technologies.
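As a sketch of why the extra layer matters, the following hand-built multilayer perceptron separates XOR, a classic example of data that a single linear boundary cannot split; all weights here are assumptions chosen for illustration:

```python
def step(z):
    return 1 if z > 0 else 0

def mlp(x1, x2):
    # Hidden layer: two fully connected nodes.
    h1 = step(x1 + x2 - 0.5)        # fires if either input is on (OR-like)
    h2 = step(x1 + x2 - 1.5)        # fires only if both are on (AND-like)
    # Output layer, fully connected to both hidden nodes: OR and not AND.
    return step(h1 - 2 * h2 - 0.5)

print([mlp(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # → [0, 1, 1, 0]
```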
Convolutional Neural Network
A convolutional neural network (CNN) uses a variation of the multilayer perceptron. A CNN
contains one or more convolutional layers. These layers can be either completely
interconnected or pooled.
Before passing the result to the next layer, the convolutional layer applies a convolution
operation to the input. Thanks to this convolution operation, the network can be much
deeper while having far fewer parameters.
Due to this ability, convolutional neural networks show very effective results in image and
video recognition, natural language processing, and recommender systems.
Convolutional neural networks also show great results in semantic parsing and paraphrase
detection. They are also applied in signal processing and image classification.
CNNs are also being used in image analysis and recognition in agriculture, where weather
features are extracted from satellites like Landsat to predict the growth and yield of a piece of
land. Here is an image of what a Convolutional Neural Network looks like.
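A minimal sketch of the convolution operation itself (strictly, the cross-correlation most CNN libraries compute): one small kernel is reused at every position of the input, which is why the parameter count stays low; the image and kernel values below are made up:

```python
def convolve2d(image, kernel):
    """Valid 2D convolution: no padding, stride 1, no kernel flip."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

image = [[1, 2, 0],
         [0, 1, 3],
         [4, 0, 1]]
kernel = [[1, -1],
          [1, -1]]   # just 4 parameters, shared across all positions
print(convolve2d(image, kernel))  # → [[-2, 0], [3, -3]]
```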
Recurrent Neural Network (RNN) – Long Short Term Memory
A Recurrent Neural Network is a type of artificial neural network in which the output of a
particular layer is saved and fed back to the input. This helps predict the outcome of the layer.
The first layer is formed in the same way as it is in the feedforward network, that is, with the
sum of the products of the weights and features. However, in subsequent layers, the recurrent
neural network process begins.
From each time-step to the next, each node will remember some information that it had in
the previous time-step. In other words, each node acts as a memory cell while computing and
carrying out operations. The neural network begins with the front propagation as usual but
remembers the information it may need to use later.
If the prediction is wrong, the system self-learns and works towards making the right
prediction during the backpropagation. This type of neural network is very effective in text-
to-speech conversion technology. Here’s what a recurrent neural network looks like.
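A minimal sketch of the recurrent step, with made-up weights and a tanh activation: the state computed at one time-step is fed back in at the next, so information lingers after the input stops:

```python
import math

def rnn_run(sequence, w_in=0.5, w_rec=0.9):
    """Run a one-node recurrent network over a sequence of inputs."""
    state = 0.0
    states = []
    for x in sequence:
        # The new state mixes the current input with the remembered state.
        state = math.tanh(w_in * x + w_rec * state)
        states.append(state)
    return states

# After the single pulse at t=0, the node keeps a fading memory of it.
states = rnn_run([1.0, 0.0, 0.0])
print(states)
```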
Modular Neural Network:
A modular neural network has a number of different networks that function independently
and perform sub-tasks. The different networks do not really interact with or signal each other
during the computation process. They work independently towards achieving the output.
As a result, a large and complex computational process can be done significantly faster by
breaking it down into independent components. The computation speed increases because
the networks are not interacting with or even connected to each other. Here’s a visual
representation of a Modular Neural Network.
Sequence-To-Sequence Models:
A sequence-to-sequence model consists of two recurrent neural networks working together:
an encoder that processes the input sequence and a decoder that generates the output
sequence. It is applied extensively in machine translation, chatbots and text summarization.
3. Advantages of artificial neural networks:
Parallel processing abilities mean the network can perform more than one job at a
time.
The ability to learn and model nonlinear, complex relationships helps model the real
life relationships between input and output.
Fault tolerance means the corruption of one or more cells of the ANN will not stop
the generation of output.
Gradual corruption means the network will slowly degrade over time, instead of a
problem destroying the network instantly.
The ability to produce output even with incomplete knowledge, with the loss of
performance depending on how important the missing information is.
No restrictions are placed on the input variables, such as how they should be
distributed.
Machine learning means the ANN can learn from events and make decisions based
on the observations.
The ability to learn hidden relationships in the data without commanding any fixed
relationship means an ANN can better model highly volatile data and non-constant
variance.
The ability to generalize and infer unseen relationships on unseen data means ANNs
can predict the output of unseen data.
4. Disadvantages of artificial neural networks:
The lack of rules for determining the proper network structure means the
appropriate artificial neural network architecture can only be found through trial and
error and experience.
The network works with numerical information; therefore, all problems must be
translated into numerical values before they can be presented to the ANN.
The lack of explanation behind the solutions it produces is one of the biggest
disadvantages of ANNs. The inability to explain why or how a solution was reached
generates a lack of trust in the network.
Chatbots
Natural language processing, translation and language generation
Stock market prediction
Delivery driver route planning and optimization
Drug discovery and development
III. Bayesian Network:
1. Introduction and definition:
Bayesian networks are a graphical approach to modeling, using probability. In a network,
nodes are used to represent variables, and links to indicate that one node influences
another. This allows the relationship between variables to be visualized easily. Each node in
a Bayesian network requires a probability distribution to be specified (conditional on its
parents), and Bayes Server uses advanced algorithms to combine these distributions, in
order to answer queries (questions/predictions).
Bayes Server is a tool for modeling Bayesian networks, Dynamic Bayesian networks and
Decision graphs.
Bayesian networks are widely used in the fields of Artificial Intelligence, Machine Learning,
Data Science, Big data, and Time Series Analysis.
A Bayesian network is a type of graph called a Directed Acyclic Graph or DAG. A DAG is a
graph with directed links and one which contains no directed cycles. A directed cycle in a
graph is a path starting and ending at the same node where the path taken can only be along
the direction of links.
Notation:
It is useful to introduce some simple mathematical notation for variables and probability
distributions.
Variables are represented with upper-case letters (A,B,C) and their values with lower-case
letters (a,b,c). If A = a we say that A has been instantiated.
A set of variables is denoted by a bold upper-case letter (X), and a particular instantiation by
a bold lower-case letter (x). For example if X represents the variables A,B,C then x is the
instantiation a,b,c. The number of variables in X is denoted |X|. The number of possible
states of a discrete variable A is denoted |A|.
P(A) is used to denote the probability of A. For example if A is discrete with states {True,
False} then P(A) might equal [0.2, 0.8]. I.e. 20% chance of being True, 80% chance of being
False.
Joint probability
A joint probability refers to the probability of more than one variable occurring together,
such as the probability of A and B, denoted P(A,B).
NB: If two variables are independent (i.e. unrelated) then P(A,B) = P(A)P(B).
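A quick numerical check of that rule, using two made-up independent variables:

```python
p_a = {"True": 0.2, "False": 0.8}  # P(A)
p_b = {"True": 0.5, "False": 0.5}  # P(B)

# Because A and B are independent, every joint entry factorises.
joint = {(a, b): p_a[a] * p_b[b] for a in p_a for b in p_b}

print(joint[("True", "True")])  # → 0.1, i.e. P(A)P(B) = 0.2 * 0.5
```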
Conditional probability
Conditional probability is the probability of a variable (or set of variables) given another
variable (or set of variables), denoted P(A|B).
Marginal probability
A probabilistic graphical model (PGM), or simply “graphical model” for short, is a way of
representing a probabilistic model with a graph structure.
The nodes in the graph represent random variables and the edges that connect the nodes
represent the relationships between the random variables.
There are many different types of graphical models, although the two most commonly
described are the Markov random field and the Bayesian Network.
The Markov random field is a graphical model where the edges of the graph are
undirected, meaning the graph may contain cycles. Bayesian Networks are more restrictive,
where the edges of the graph are directed, meaning they can only be navigated in one
direction. This means that cycles are not possible, and the structure can be more generally
referred to as a directed acyclic graph (DAG).
2. How to Develop and Use a Bayesian Network :
Designing a Bayesian Network requires defining at least three things: the random variables
in the domain, the conditional (dependence) relationships between them, and the
probability distribution of each variable given its parents.
It may be possible for an expert in the problem domain to specify some or all of these
aspects in the design of the model.
In many cases, an expert can specify the architecture or topology of the graphical model, but
the probability distributions must be estimated from data from the domain.
Both the probability distributions and the graph structure itself can be estimated from data,
although it can be a challenging process. As such, it is common to use learning algorithms for
this purpose; for example, assuming a Gaussian distribution for continuous random variables
and using gradient ascent to estimate the distribution parameters.
Once a Bayesian Network has been prepared for a domain, it can be used for reasoning, e.g.
making decisions.
Reasoning is achieved via inference with the model for a given situation. For example, the
outcome for some events is known and plugged into the random variables. The model can
be used to estimate the probability of causes for the events or possible further outcomes.
Practical examples of using Bayesian Networks in practice include medicine (symptoms and
diseases), bioinformatics (traits and genes), and speech recognition (utterances and time).
Consider a problem with three random variables: A, B, and C. A is dependent upon B, and C
is dependent upon B.
Notice that the conditional dependence is stated alongside a conditional independence.
That is, A is conditionally independent of C given B:
P(A | C, B) = P(A | B)
We can see that B is unaffected by A and C and has no parents; we can simply state the
conditional independence of B from A and C as:
P(B | A, C) = P(B)
We can also write the joint probability of A and C given B, or conditioned on B, as the
product of two conditional probabilities:
P(A, C | B) = P(A | B) * P(C | B)
The model summarizes the joint probability P(A, B, C), calculated as:
P(A, B, C) = P(A | B) * P(C | B) * P(B)
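Using contrived probabilities, here is a sketch of inference by enumeration for this network (B the parent of A and C); every probability value below is made up:

```python
p_b = {True: 0.6, False: 0.4}          # P(B)
p_a_given_b = {True: 0.7, False: 0.2}  # P(A=True | B)
p_c_given_b = {True: 0.9, False: 0.3}  # P(C=True | B)

def p_abc(a, b, c):
    """Joint probability P(A, B, C) = P(A|B) * P(C|B) * P(B)."""
    pa = p_a_given_b[b] if a else 1 - p_a_given_b[b]
    pc = p_c_given_b[b] if c else 1 - p_c_given_b[b]
    return pa * pc * p_b[b]

# Query: P(B=True | A=True), by summing out C and normalising.
num = sum(p_abc(True, True, c) for c in (True, False))
den = sum(p_abc(True, b, c) for b in (True, False) for c in (True, False))
print(round(num / den, 4))  # → 0.84
```

Observing A = True raises the belief in B = True from the prior 0.6 to 0.84, which is exactly the kind of belief update the model is built for.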
Also notice that the graph is useful even at this point where we don’t know the probability
distributions for the variables.
You might want to extend this example by using contrived probabilities for discrete events
for each random variable and practice some simple inference for different scenarios.
To update beliefs about the states of certain variables when some other variables have been
observed (e.g., X54 = no).