Algorithms for
Artificial Intelligence
Neil C. Rowe, U.S. Naval Postgraduate School
Digital Object Identifier 10.1109/MC.2022.3169360
Date of current version: 4 July 2022

Artificial intelligence (AI) is usually defined as behavior by machines that would appear intelligent if it were done by a human. That does not necessarily require reasoning as brains do. AI is a large subarea of computer science with a long history starting in 1956. Over the years, it has borrowed many algorithms from other areas of computer science, statistics, and operations research toward the goals of creating many kinds of intelligent behavior. It has also invented some algorithms of its own.

TYPES OF AI ALGORITHMS
Many AI algorithms are described in introductory textbooks such as the one by Russell and Norvig [1]. Algorithms

become primary only with the increases in computation speeds in the last 20 years. However, machine learning is unnecessary for AI. You can still program intelligent behavior directly if a reasonably good simple solution for a task will suffice, something possible, for instance, for many help desk tasks.

Both AI modeling and machine learning can be distinguished further on whether they focus on logical reasoning, numerical reasoning, or some combination of both. Early work focused on logical reasoning, but current work is focused on numerical reasoning, though both are necessary to cover the full range of human reasoning abilities.

Some researchers in AI and allied areas of psychology use AI algorithms to model brains (human or animal) [2] and other aspects of biology such as “artificial life” and immune systems [3]. We touch only briefly on these approaches here. This work has important differences from most of AI as biological phenomena are limited in speed and space in ways that are unlike semiconductor
technology. For instance, brains have short-term and long-term memory of a limited size, and brains have a distinctive top-level process called “consciousness.” Modeling of brains and biology is, however, important in simulating the behavior of people and living creatures as in computer games, as well as suggesting good ideas for AI. Some have argued with a religious-like fervor that brain modeling is the only way to achieve true AI, but this claim is disputed by the generally more impressive reasoning abilities of algorithms without any biological or psychological analogs.

Most problems addressed by AI are NP-hard, so their computations are exponential in problem size. That is because most tasks have features of nondeterminism in which considerable trial and error in the testing of combinations is necessary. However, AI practitioners have been quite clever in finding criteria to considerably reduce the rate of the “combinatorial explosion” with problem size for many practical problems so that implementations of many AI algorithms appear polynomial in average performance [4]. A good example is the graph–subgraph isomorphism problem (determining whether one graph is a subgraph of another), which arises in many AI areas such as finding objects in pictures and matching natural-language questions to natural-language documents. It can often be done in polynomial average time when items have rich features to exploit in the matching [5]. A related problem is that because they are NP-hard, many AI implementations consume large amounts of energy, and “deep learning” methods with artificial neural networks on large data sets, such as picture libraries, can be especially energy intensive [6].

IMPLEMENTING BEHAVIOR

Logical methods
Logical models are necessary whenever there are absolute constraints on the solutions to a task, as often happens in modeling policies and laws. Computers and digital devices perform logical operations in their machine language, so they have an inherent ability for logical reasoning. Most logical AI algorithms represent knowledge in the form of if–then rules and apply them to Boolean values representing facts to make inferences. Inference can go either forward from facts to conclusions (“forward chaining”) or backward from conclusions to facts (“backward chaining”). Backward chaining is analogous to conventional program execution. The logic used can be either Boolean algebra (propositional calculus), limited forms of predicate calculus, or full predicate calculus. A limited predicate calculus used in the Prolog programming language requires the conclusions of rules to be single unnegated universally quantified facts and can run much faster than full predicate calculus by reducing the combinatorial explosion [7]. Reasoning with full predicate calculus can be done with the classic method of resolution theorem proving, but other logical inference methods like modus ponens are used, too.

Some AI tasks require the knowledge of many facts, such as answering unrestricted human questions about a subject. Algorithms are then necessary to store the facts, index them, and efficiently retrieve them. Logical methods can then provide encyclopedic capabilities to answer many kinds of questions from such facts. Ontologies (representations of interconnected facts) have been produced as input data for such purposes for many common applications.

If a set of if–then rules must be run frequently, methods of “compiling” the rules can improve their speed. This can involve creating an equivalent of a customized gate array or semiconductor chip or creating a decision tree of yes–no questions. Special methods called “constraint programming” have been developed for tasks with many logical conditions on a solution, which focus on efficient constraint selection and application; especially difficult scheduling problems can be solved with such methods. Constraint-based tasks appear to be well suited for quantum computing, though implementations are proceeding quite slowly.

Planning methods
A special class of logical reasoning is needed for reasoning about actions in time, the subarea of “planning.” For instance, repairing a vehicle or doing surgery often requires a precise sequence of steps done in a precise order. Tree traversal algorithms are helpful for these problems because the space of possible states often resembles a tree. Rules can be used to guide choices, what is known as a “heuristic search.” Numbers can also be used to guide choices, as in best-first search or branch-and-bound search. A classic and versatile search algorithm is the A* algorithm, which combines best-first and branch-and-bound ideas by adding the costs incurred to a state to a prediction of the future costs to a goal state. Good tasks for A* search are route planning and resource allocation. Alternatives are cross-entropy optimization [8] and random searches such as Monte Carlo tree search and simulated annealing.

A subclass of planning addresses extended interactions between two opposing agents, also the subject of game theory in operations research. This has become especially important recently for its applications to cybersecurity planning. Traditional game modeling involves adversaries who alternate turns in planning, each trying to thwart the other’s goals or increase the other’s costs. To plan in adversarial situations, it is important to look ahead several moves; a classic heuristic called alpha–beta pruning can be used to prove that certain subtrees can be ignored if the adversary always makes their best choice. Some degree of randomness in exploration can also be helpful in problems such as planning business strategies.
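
As a rough illustration of the alpha–beta pruning mentioned above, the following Python sketch searches a small game tree written as nested lists. The tree encoding, the function name alphabeta, the optional visited list, and the example payoffs are all assumptions made for this illustration and do not come from the article.

```python
import math


def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf, visited=None):
    """Return the minimax value of a game tree given as nested lists.

    A leaf is a number (payoff to the maximizing player); an inner node is
    a list of child nodes. If a list is passed as `visited`, every leaf
    that is actually evaluated is appended to it.
    """
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)
        return node

    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, best)
            if alpha >= beta:
                # The minimizing adversary already has a better option
                # elsewhere, so the remaining children cannot matter.
                break
        return best

    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, best)
        if alpha >= beta:
            # The maximizing player already has a better option elsewhere.
            break
    return best


# Hypothetical two-ply game: we choose a branch, the adversary picks a leaf.
TREE = [[3, 5], [2, 9], [0, 1]]
seen = []
value = alphabeta(TREE, visited=seen)
print(value, seen)
```

On this tree the search returns 3 and never evaluates the leaves 9 and 1: once the adversary can hold the second branch to 2 and the third to 0, both are already worse for us than the 3 guaranteed by the first branch.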
JULY 2022 99
ALGORITHMS
and pipelined architectures originally designed for fast graphics (“graphical processing units”) have greatly contributed to the recent success of artificial neural networks. However, distributed processing plays a more intrinsic role in algorithms that model organizations or societies (“social AI”); they model intelligent communications between “agents” (intelligences). Social AI differs from protocols for digital networks in that it can negotiate in human-like ways rather than issue requests and reports [3]. A synergistic effect can occur where the network of communicating AI agents is smarter than any one of them alone. One simple form of this is “swarm intelligence” [9], which can model coordinated herding and flocking actions by animals as well as analogous phenomena in military operations.

MACHINE LEARNING
Recent attention to AI has been focused on the subarea of machine learning and the related area of data mining [10]. With machine learning, a carefully designed starting model for intelligent behavior is unnecessary; instead, an adequate model is optimized automatically. Most algorithms for machine learning have been around for a long time but have not been sufficiently fast to be useful until recently; my AI textbook, which came out in 1988 [11], said little about machine learning.

Machine learning can be supervised (training on examples), unsupervised (using some less direct feedback), or some combination of both. Training examples in supervised learning must specify what should be concluded from the case; for instance, a training set for medical AI could be patient records with their diagnoses. Unsupervised learning often uses numeric feedback, such as the degree of quantitative success or distance to a goal state. Most machine learning is supervised because that is much faster.

Logical learning
The caching of previous results is a simple and often effective method of supervised learning. A new case can be assumed to have similar conclusions to similar, previously seen cases; this is “case-based” or “instance-based” reasoning. For instance, a medical system may remember the pattern of symptoms in rare cases and what the eventual diagnoses were. However, a distance metric between the current situation and stored situations is needed, and this is not always easy to define; if more than one situation is similar, an average or consensus of conclusions of the “nearest neighbors” should be taken. Stored cases should be indexed to enable quickly finding the nearest matches.

A still-used supervised learning algorithm from the late 1950s learns a decision tree incrementally from Boolean data. Decision trees are often a good model for tasks governed by policies, such as filling out tax forms. Starting with an empty tree, it considers each training case in order, following the tree according to the features of the case until it gets to a leaf node. If this process does not lead to a correct conclusion according to the training set, the leaf node is changed to query a previously unasked feature that will distinguish this case from the previous ones. Cycling through the cases until no further changes are made to the tree ensures that the tree handles the cases in its training set correctly if they are not contradictory.

However, the tree built may be quite unbalanced; entropy calculations can help balance it by considering the features whose querying reduces the entropy the most. The accuracy of the tree on new cases can still be a problem, however, since the tree does not generalize from its experience. A popular improvement called a “random forest” builds a set of trees with random orders of questions and takes a majority vote on their predictions; this seems to perform well on many practical problems such as classifying objects in images. The multiple trees may require more energy to run than alternatives like neural networks, which, however, require more energy to train.

An alternative model for logical data is a set of if–then rules interpreted as an implicit conjunction. For an incremental approach, rules can be modified as necessary to conform to each new case. This may require adding terms to the premises (often negated terms representing exception conditions), deleting terms, or joining two rules together. New rules should be created for cases sufficiently different from any seen before. As with decision trees, cases should be cycled through until the rules no longer change. An alternative is to learn a set of broader but imperfect (“heuristic”) rules since humans use many such rules.

The “set covering” approach, a “greedy” algorithm, uses conditional probabilities of the conclusions given the premises to rank rules. It first finds the best one-premise rules, then combines them by taking conjunctions or disjunctions of the “if” parts to get new rules, and repeats to build larger and larger rules of a minimum conditional probability. For instance, a person may think they get sick after eating seafood, but they may discover after further tries that the combination of seafood with garlic and high amounts of fat is more likely to cause the symptoms.

Numerical learning
The classic way to fit numeric models to data is through regression methods from statistics, and these are widely used for the simpler numeric models in AI. However, artificial neural networks have their own specialized learning algorithm called “backpropagation,” which adjusts the weights in networks incrementally to training data. It uses the chain rule for finding derivatives and optimizes the weights by working backward from the final layer to the input layer. Its idea is that weights should be increased for a case proportional to the correct output minus that actual output (which can be negative). The degree of blame or credit should also be proportional to the variable value multiplied by the weight on that case, but since a nonlinear function was imposed after the weight was applied, it should also be proportional to the slope of the nonlinear function then. These three factors should be multiplied.
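
The product of three factors can be sketched for the simplest case, a single sigmoid neuron, where the update reduces to what is often called the delta rule. This is an illustrative Python sketch under stated assumptions, not a full backpropagation implementation: the function names, learning rate, epoch count, and the Boolean OR training set are all chosen for the example. In a multilayer network, the same product is passed backward through the weights of the later layers by the chain rule.

```python
import math


def sigmoid(z):
    """Logistic nonlinearity applied after the weighted sum."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, inputs):
    """Output of one sigmoid neuron for the given input values."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, inputs)))


def train_neuron(cases, rate=0.5, epochs=5000):
    """Fit one sigmoid neuron to (inputs, target) pairs with the delta rule."""
    weights = [0.0] * len(cases[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in cases:
            out = predict(weights, bias, inputs)
            error = target - out        # factor 1: correct minus actual output
            slope = out * (1.0 - out)   # factor 3: slope of the sigmoid here
            for i, x in enumerate(inputs):
                # factor 2 is the input value x feeding this weight
                weights[i] += rate * error * x * slope
            bias += rate * error * slope  # bias acts as a weight on input 1
    return weights, bias


# Assumed toy training set: Boolean OR, which one neuron can represent.
OR_CASES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_neuron(OR_CASES)
for inputs, target in OR_CASES:
    print(inputs, target, round(predict(weights, bias, inputs), 3))
```

Each weight moves only when all three factors are nonzero: there is an error, the input was active, and the neuron is not saturated on the flat part of the sigmoid.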
as Weka and Scikit-Learn, that offer a variety of algorithms to use on the same data. That is because it appears that most models provide representational (or “epistemological”) adequacy for their tasks, meaning that most models are usable for most AI tasks with proper adjustments.

However, there are exceptions. Logical models cannot adequately represent, without significant complexity, the adding of effects from many weak factors in a linear model. Similarly, some numerical models cannot adequately represent some logical constraints; for instance, a linear weighted sum of numeric factors cannot represent an exclusive-or relationship. The latter example raises questions about the epistemological adequacy of artificial neural networks, which are based on weighted sums though they include nonlinearity. However, epistemological adequacy is not always necessary. A good example is natural-language processing. Despite the fact that natural languages have been shown to require at least context-sensitive grammars, many useful neural networks for speech understanding provide adequate performance from models equivalent to regular grammars (finite-state machines) since the occasional context-sensitive features can be approximated adequately.

The choice of a single AI algorithm may not be appropriate for some tasks anyway. Many successes have been achieved with “ensemble learning,” of which “bagging” is one common form, where multiple learning algorithms or their variants are run simultaneously on data; random forests are an example previously mentioned [14]. If the conclusions are categorical, a majority vote can be taken of the methods; if the conclusions are numeric, a weighted average can be taken. Ensemble learning appears to be the best countermeasure to the threat of an adversary manipulating training sets to force learning to reach incorrect conclusions, as could occur with learning of new kinds of cyberattacks.

AI has adopted and modified algorithms from many areas of computer science. Much of this variety is unnecessary for practitioners to examine since many tasks are addressed equally well by most algorithms. Nonetheless, there are important tasks for which one algorithm is better, such as neural networks for computer vision and speech processing. Despite the current vogue for numerical algorithms, logical algorithms remain a better fit for problems of finding combinations and imposing logical constraints. Though most of the problems addressed are NP-hard, the parameters of the complexity with input size can vary considerably, and in many cases, the tasks are polynomial on the average. So the performance of AI algorithms generally needs to be determined by experiment.

DISCLAIMER
The views expressed are those of the author and do not necessarily represent those of the U.S. Government.

REFERENCES
1. S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, 4th ed. New York, NY, USA: Pearson Education, 2020.
2. J. Crowder, J. Carbone, and S. Fries, Artificial Psychology: Psychological Modeling and Testing of AI Systems. Cham, Switzerland: Springer Nature, 2020.
3. D. Floreano and C. Mattiussi, Bio-Inspired Artificial Intelligence: Theories, Methods, and Technologies. Cambridge, MA, USA: MIT Press, 2008.
4. T. Roughgarden, Algorithms Illuminated (Part IV): Algorithms for NP-Hard Problems. New York, NY, USA: Soundlikeyourself Publishing, 2020.
5. J. Choi, Y. Yoon, and B.-R. Moon, “An efficient genetic algorithm for subgraph isomorphism,” in Proc. 14th Annu. Conf. Genetic Evol. Comput., Jul. 2012, pp. 361–368, doi: 10.1145/2330163.2330216.
6. J. Finks. “Solving Big AI’s big energy problem.” The Next Web. https://thenextweb.com/news/solving-big-ais-big-energy-problem (Accessed: Apr. 21, 2021).
7. M. Bramer, Logic Programming with Prolog. Cham, Switzerland: Springer Nature, 2013.
8. D. Drusinsky and J. B. Michael, “Multiagent pathfinding under rigid, optimization, and uncertainty constraints,” Computer, vol. 54, no. 7, pp. 111–118, Jul. 2021, doi: 10.1109/MC.2021.3074264.
9. P. Tarasewich and P. R. McMullen, “Swarm intelligence: Power in numbers,” Commun. ACM, vol. 45, no. 8, pp. 62–67, Aug. 2002, doi: 10.1145/545151.545152.
10. I. Witten, E. Frank, M. Hall, and C. Pal, Data Mining: Practical Machine Learning Tools and Techniques, 4th ed. Cambridge, MA, USA: Morgan Kaufmann, 2017.
11. N. Rowe, Artificial Intelligence through Prolog. Englewood Cliffs, NJ, USA: Prentice-Hall, 1988.
12. “Libratus Poker AI Beats Humans for $1.76m; Is End Near?” PokerListings. https://www.pokerlistings.com/libratus-poker-ai-smokes-humans-for-1-76m-is-this-the-end-42839 (Accessed: Apr. 20, 2021).
13. M. Celebi and K. Aydin, Eds. Unsupervised Learning Algorithms. Cham, Switzerland: Springer Nature, 2016.
14. H. M. Gomes, J. P. Barddal, F. Enembreck, and A. Bifet, “A survey on ensemble learning for data stream classification,” ACM Comput. Surv., vol. 50, no. 2, pp. 1–36, Mar. 2017, doi: 10.1145/3054925.

NEIL C. ROWE is a professor of computer science at the U.S. Naval Postgraduate School, Monterey, California, 93943, USA. Contact him at ncrowe@nps.edu.