You are on page 1of 54

Artificial Intelligence Engines: A

Tutorial Introduction to the


Mathematics of Deep Learning 1st
Edition James Stone
Visit to download the full and correct content document:
https://textbookfull.com/product/artificial-intelligence-engines-a-tutorial-introduction-to-
the-mathematics-of-deep-learning-1st-edition-james-stone/
More products digital (pdf, epub, mobi) instant
download maybe you interests ...

Introduction to Deep Learning From Logical Calculus to


Artificial Intelligence 1st Edition Sandro Skansi

https://textbookfull.com/product/introduction-to-deep-learning-
from-logical-calculus-to-artificial-intelligence-1st-edition-
sandro-skansi/

Pro Deep Learning with TensorFlow: A Mathematical


Approach to Advanced Artificial Intelligence in Python
1st Edition Santanu Pattanayak

https://textbookfull.com/product/pro-deep-learning-with-
tensorflow-a-mathematical-approach-to-advanced-artificial-
intelligence-in-python-1st-edition-santanu-pattanayak/

MATLAB Deep Learning: With Machine Learning, Neural


Networks and Artificial Intelligence 1st Edition Phil
Kim

https://textbookfull.com/product/matlab-deep-learning-with-
machine-learning-neural-networks-and-artificial-intelligence-1st-
edition-phil-kim/

Artificial Intelligence and Deep Learning in Pathology


1st Edition Stanley Cohen Md (Editor)

https://textbookfull.com/product/artificial-intelligence-and-
deep-learning-in-pathology-1st-edition-stanley-cohen-md-editor/
Artificial Intelligence With an Introduction to Machine
Learning 2nd Edition Richard E. Neapolitan

https://textbookfull.com/product/artificial-intelligence-with-an-
introduction-to-machine-learning-2nd-edition-richard-e-
neapolitan/

Cracking the Code: Introduction to Machine Learning for


Novices: Building a Foundation for Artificial
Intelligence 1st Edition Sarah Parker

https://textbookfull.com/product/cracking-the-code-introduction-
to-machine-learning-for-novices-building-a-foundation-for-
artificial-intelligence-1st-edition-sarah-parker/

Introduction to Deep Learning Using R: A Step-by-Step


Guide to Learning and Implementing Deep Learning Models
Using R Taweh Beysolow Ii

https://textbookfull.com/product/introduction-to-deep-learning-
using-r-a-step-by-step-guide-to-learning-and-implementing-deep-
learning-models-using-r-taweh-beysolow-ii/

Introduction to Artificial Intelligence for Security


Professionals 1st Edition The Cylance Data Science Team

https://textbookfull.com/product/introduction-to-artificial-
intelligence-for-security-professionals-1st-edition-the-cylance-
data-science-team/

Artificial Intelligence for Coronavirus Outbreak Simon


James Fong

https://textbookfull.com/product/artificial-intelligence-for-
coronavirus-outbreak-simon-james-fong/
Reviews of Artificial Intelligence Engines

“Artificial Intelligence Engines will introduce you to the rapidly growing


field of deep learning networks: how to build them, how to use them;
and how to think about them. James Stone will guide you from the
basics to the outer reaches of a technology that is changing the world.”
Professor Terrence Sejnowski, Director of the Computational
Neurobiology Laboratory, the Salk Institute, USA. Author of The Deep
Learning Revolution, MIT Press, 2018.

“This book manages the impossible: it is a fun read, intuitive and


engaging, lighthearted and delightful, and cuts right through the hype
and turgid terminology. Unlike many texts, this is not a shallow
cookbook for some particular deep learning program-du-jure. Instead, it
crisply and painlessly imparts the principles, intuitions and background
needed to understand existing machine-learning systems, learn new
tools, and invent novel architectures, with ease.”
Professor Barak Pearlmutter, Brain and Computation Laboratory,
National University of Ireland Maynooth, Ireland.

“This text provides an engaging introduction to the mathematics


underlying neural networks. It is meant to be read from start to
finish, as it carefully builds up, chapter by chapter, the essentials of
neural network theory. After first describing classic linear networks
and nonlinear multilayer perceptrons, Stone gradually introduces a
comprehensive range of cutting edge technologies in use today. Written
in an accessible and insightful manner, this book is a pleasure to read,
and I will certainly be recommending it to my students.”

Dr Stephen Eglen, Cambridge Computational Biology Institute,


Cambridge University, UK.
James V Stone is an Honorary Associate Professor at the University of
Sheffield, UK.
Artificial Intelligence Engines
A Tutorial Introduction to the

Mathematics of Deep Learning

James V Stone
Title: Artificial Intelligence Engines
Author: James V Stone

©2020 Sebtel Press

All rights reserved. No part of this book may be reproduced or


transmitted in any form without written permission from the author.
The author asserts his moral right to be identified as the author of this
work in accordance with the Copyright, Designs and Patents Act 1988.

First Ebook Edition, 2020


From second printing of paperback.
Typeset in LATEX∂ 2ε .

ISBN 9781916279117

Cover background by Okan Caliskan, reproduced with permission.


For Teleri, my bright star
Contents
List of Pseudocode Examples i
Online Code Examples ii
Preface iii

1 Artificial Neural Networks 1


1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 What is an Artificial Neural Network? . . . . . . . . . . 2
1.3 The Origins of Neural Networks . . . . . . . . . . . . . . 5
1.4 From Backprop to Deep Learning . . . . . . . . . . . . . 8
1.5 An Overview of Chapters . . . . . . . . . . . . . . . . . 9

2 Linear Associative Networks 11


2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Setting One Connection Weight . . . . . . . . . . . . . . 12
2.3 Learning One Association . . . . . . . . . . . . . . . . . 14
2.4 Gradient Descent . . . . . . . . . . . . . . . . . . . . . . 16
2.5 Learning Two Associations . . . . . . . . . . . . . . . . 18
2.6 Learning Many Associations . . . . . . . . . . . . . . . . 23
2.7 Learning Photographs . . . . . . . . . . . . . . . . . . . 25
2.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 26

3 Perceptrons 27
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.2 The Perceptron Learning Algorithm . . . . . . . . . . . 28
3.3 The Exclusive OR Problem . . . . . . . . . . . . . . . . 32
3.4 Why Exclusive OR Matters . . . . . . . . . . . . . . . . 34
3.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 36

4 The Backpropagation Algorithm 37


4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 37
4.2 The Backpropagation Algorithm . . . . . . . . . . . . . 39
4.3 Why Use Sigmoidal Hidden Units? . . . . . . . . . . . . 48
4.4 Generalisation and Over-fitting . . . . . . . . . . . . . . 49
4.5 Vanishing Gradients . . . . . . . . . . . . . . . . . . . . 52
4.6 Speeding Up Backprop . . . . . . . . . . . . . . . . . . . 55
4.7 Local and Global Mimima . . . . . . . . . . . . . . . . . 57
4.8 Temporal Backprop . . . . . . . . . . . . . . . . . . . . 59
4.9 Early Backprop Achievements . . . . . . . . . . . . . . . 62
4.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5 Hopfield Nets 63
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.2 The Hopfield Net . . . . . . . . . . . . . . . . . . . . . . 64
5.3 Learning One Network State . . . . . . . . . . . . . . . 65
5.4 Content Addressable Memory . . . . . . . . . . . . . . . 66
5.5 Tolerance to Damage . . . . . . . . . . . . . . . . . . . . 68
5.6 The Energy Function . . . . . . . . . . . . . . . . . . . . 68
5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 70

6 Boltzmann Machines 71
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 71
6.2 Learning in Generative Models . . . . . . . . . . . . . . 72
6.3 The Boltzmann Machine Energy Function . . . . . . . . 74
6.4 Simulated Annealing . . . . . . . . . . . . . . . . . . . . 76
6.5 Learning by Sculpting Distributions . . . . . . . . . . . 77
6.6 Learning in Boltzmann Machines . . . . . . . . . . . . . 78
6.7 Learning by Maximising Likelihood . . . . . . . . . . . . 80
6.8 Autoencoder Networks . . . . . . . . . . . . . . . . . . . 84
6.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 84

7 Deep RBMs 85
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 85
7.2 Restricted Boltzmann Machines . . . . . . . . . . . . . . 87
7.3 Training Restricted Boltzmann Machines . . . . . . . . 87
7.4 Deep Autoencoder Networks . . . . . . . . . . . . . . . . 93
7.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 96

8 Variational Autoencoders 97
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 97
8.2 Why Favour Independent Features? . . . . . . . . . . . 98
8.3 Overview of Variational Autoencoders . . . . . . . . . . 100
8.4 Latent Variables and Manifolds . . . . . . . . . . . . . . 103
8.5 Key Quantities . . . . . . . . . . . . . . . . . . . . . . . 104
8.6 How Variational Autoencoders Work . . . . . . . . . . . 106
8.7 The Evidence Lower Bound . . . . . . . . . . . . . . . . 110
8.8 An Alternative Derivation . . . . . . . . . . . . . . . . . 115
8.9 Maximising the Lower Bound . . . . . . . . . . . . . . . 115
8.10 Conditional Variational Autoencoders . . . . . . . . . . 119
8.11 Applications . . . . . . . . . . . . . . . . . . . . . . . . . 119
8.12 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 120
9 Deep Backprop Networks 121
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 121
9.2 Convolutional Neural Networks . . . . . . . . . . . . . . 122
9.3 LeNet1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
9.4 LeNet5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
9.5 AlexNet . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
9.6 GoogLeNet . . . . . . . . . . . . . . . . . . . . . . . . . 130
9.7 ResNet . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
9.8 Ladder Autoencoder Networks . . . . . . . . . . . . . . 132
9.9 Denoising Autoencoders . . . . . . . . . . . . . . . . . . 135
9.10 Fooling Neural Networks . . . . . . . . . . . . . . . . . . 136
9.11 Generative Adversarial Networks . . . . . . . . . . . . . 137
9.12 Temporal Deep Neural Networks . . . . . . . . . . . . . 140
9.13 Capsule Networks . . . . . . . . . . . . . . . . . . . . . . 141
9.14 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 142

10 Reinforcement Learning 143


10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 143
10.2 What’s the Problem? . . . . . . . . . . . . . . . . . . . . 146
10.3 Key Quantities . . . . . . . . . . . . . . . . . . . . . . . 147
10.4 Markov Decision Processes . . . . . . . . . . . . . . . . . 148
10.5 Formalising the Problem . . . . . . . . . . . . . . . . . . 149
10.6 The Bellman Equation . . . . . . . . . . . . . . . . . . . 150
10.7 Learning State-Value Functions . . . . . . . . . . . . . . 152
10.8 Eligibility Traces . . . . . . . . . . . . . . . . . . . . . . 155
10.9 Learning Action-Value Functions . . . . . . . . . . . . . 157
10.10 Balancing a Pole . . . . . . . . . . . . . . . . . . . . . . 164
10.11 Applications . . . . . . . . . . . . . . . . . . . . . . . . 166
10.12 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 168

11 The Emperor’s New AI? 169


11.1 Artificial Intelligence . . . . . . . . . . . . . . . . . . . 169
11.2 Yet Another Revolution? . . . . . . . . . . . . . . . . . 170

Further Reading 173


Appendices
A Glossary 175
B Mathematical Symbols 181
C A Vector and Matrix Tutorial 183
D Maximum Likelihood Estimation 187
E Bayes’ Theorem 189
References 191
Index 199
List of Pseudocode Examples

Each example below refers to a text box that summarises a particular


method or neural network.

Linear associative network gradient descent: One weight page 16.


Linear associative network gradient descent: Two weights page 22.
Perceptron classification page 35.
Backprop: Short version page 43.
Backprop: Long version page 47.
Hopfield net: Learning page 66.
Hopfield net: Recall page 67.
Boltzmann machine: Simulated annealing page 77.
Boltzmann machine learning algorithm page 82.
Restricted Boltzmann machine learning algorithm page 93.
Variational autoencoder learning algorithm page 118.
SARSA: On-policy learning of action-value functions page 160.
Q-Learning: Off-policy learning of action-value functions page 161.
Learning to balance a pole page 165.

i
Online Code Examples

The Python and MatLab code examples below can be obtained from
the online GitHub repository at
https://github.com/jgvfwstone/ArtificialIntelligenceEngines
Please see the file README.txt in that repository.
The computer code has been collated from various sources (with
permission). It is intended to provide small scale transparent examples,
rather than an exercise in how to program artificial neural networks.
The examples below are written in Python.

Linear network
Perceptron
Backprop network
Hopfield net
Restricted Boltzmann machine
Variational autoencoder
Convolutional neural network
Reinforcement learning

ii
Preface
This book is intended to provide an account of deep neural
networks that is both informal and rigorous. Each neural network
learning algorithm is introduced informally with diagrams, before a
mathematical account is provided. In order to cement the reader’s
understanding, each algorithm is accompanied by a step-by-step
summary written in pseudocode. Thus, each algorithm is approached
from several convergent directions, which should ensure that the
diligent reader gains a thorough understanding of the inner workings of
deep learning artificial neural networks. Additionally, technical terms
are described in a comprehensive glossary, and (for the complete novice)
a tutorial appendix on vectors and matrices is provided.
Unlike most books on deep learning, this is not a ‘user manual’ for
any particular software package. Such books often place high demands
on the novice, who has to learn the conceptual infrastructure of neural
network algorithms, whilst simultaneously learning the minutiae of
how to operate an unfamiliar software package. Instead, this book
concentrates on key concepts and algorithms.
Having said that, readers familiar with programming can benefit
from running working examples of neural networks, which are available
at the website associated with this book. Simple examples were written
by the author, and these are intended to be easy to understand rather
than efficient to run. More complex examples have been borrowed (with
permission) from around the internet, but mainly from the PyTorch
repository. A list of the online code examples that accompany this
book is given opposite.

Who Should Read This Book? The material in this book should
be accessible to anyone with an understanding of basic calculus. The
tutorial style adopted ensures that any reader prepared to put in the
effort will be amply rewarded with a solid grasp of the fundamentals of
deep learning networks.

iii
The Author. My education in artificial neural networks began
with a lecture by Geoff Hinton on Boltzmann machines in 1984 at
Sussex University, where I was studying for an MSc in Knowledge
Based Systems. I was so inspired by this lecture that I implemented
Hinton and Sejnowski’s Boltzmann machine for my MSc project.
Later, I enjoyed extended visits to Hinton’s laboratory in Toronto
and to Sejnowski’s laboratory in San Diego, funded by a Wellcome
Trust Mathematical Biology Fellowship. I have published research
30;125
papers on human memory , temporal backprop 116 , backprop 71 ,
evolution 120 , optimisation 115;117 , independent component analysis 119 ,
and using neural networks to test principles of brain function 118;124 .

Acknowledgements. Thanks to Steve Snow for discussions on


variational autoencoders and the physics of Boltzmann machines. For
debates about neural networks over many years, I am indebted to my
friend Raymond Lister. For helping with Python frameworks, thanks to
Royston Sellman. For reading one or more chapters, I am very grateful
to Olivier Codol, Peter Dunne, Lancelot Da Costa, Stephen Eglen,
George Farmer, Charles Fox, Nikki Hunkin, Raymond Lister, Gonzalo
de Polavieja, and Paul Warren. Special thanks to Eleni Vasilaki and
Luca Manneschi for providing guidance on reinforcement learning. For
allowing me to use their computer code, I’d like to thank Jose Antonio
Martin H (reinforcement learning) and Peter Dunne (for his version
of Hinton’s RBM code). Finally, thanks to Alice Yew for meticulous
copy-editing and proofreading.

Corrections. Please email corrections to j.v.stone@sheffield.ac.uk.


A list of corrections can be found at:
https://jim-stone.staff.shef.ac.uk/AIEngines

iv
Once the machine thinking method has started, it would not take
long to outstrip our feeble powers.
A Turing, 1951.
Chapter 1

Artificial Neural Networks


If you want to understand a really complicated device, like a brain,
you should build one.
G Hinton, 2018.

1.1. Introduction

Deep neural networks are the computational engines that power


modern artificial intelligence. Behind every headline regarding
the latest achievements of artificial intelligence lies a deep neural
network, whether this involves diagnosing cancer, recognising objects
(Figure 1.1), learning to ride a bicycle 90 , or learning to fly 28;93
(Figure 1.2). In essence, deep neural networks rely on a sophisticated
form of pattern recognition, but the secret of their success is that they
are capable of learning.

30

25

20
Error rate %

15

10

0
2010 2011 2012 2013 2014 2015 2016 2017 human

(a) (b)
Figure 1.1. Classifying images. (a) The percentage error on the annual
ILSVRC image classification competition has fallen dramatically since 2010.
(b) Example images. The latest competition involves classifying about 1.5
million images into 1,000 object classes. Reproduced with permission from
Wu and Gu (2015).

1
Artificial Neural Networks

A striking indication of the dramatic progress made over recent


years can be seen in Figure 1.1, which shows performance on an image
classification task. Note that the performance of deep neural networks
surpassed human performance in 2015.

1.2. What is an Artificial Neural Network?

An artificial neural network is a set of interconnected model neurons


or units, as shown in Figure 1.3. For brevity, we will use the term
neural network to refer to artificial neural networks. By analogy to the
brain, the most common forms of neural networks receive inputs via an
array of sensory units. With repeated presentations, a neural network

(a)

(b) (c)
Figure 1.2. Learning to soar using reinforcement learning. (a) Glider used
for learning. (b) Before learning, flight is disorganised, and glider descends.
(c) After learning, glider ascends to 0.6 km. Note the different scales on the
vertical axes in (b) and (c). Reproduced with permission: (a) from Guilliard
et al. (2018); (b,c) from Reddy et al. (2016).

2
1.2. What is an Artificial Neural Network?

can learn to recognise those inputs so as to classify them or to perform


some action. For example, if the input to a neural network is an image
of a teapot then the output should indicate that the image contains a
teapot. However, this simple description hides the many complexities
that make object recognition extraordinarily difficult. Indeed, the fact
that we humans are so good at object recognition makes it hard to
appreciate just how difficult it is for a neural network to recognise
objects. A teapot can vary in colour and shape, and it can appear at
many different orientations and sizes, as in Figure 1.4. These variations
do not usually fool humans, but they can cause major problems for
neural networks.
Let’s assume that we have wired up a neural network so that its input
is an image, and that we want its output to indicate which object is in
the image. Specifically, we wire up the network so that each pixel in
the image acts as input to one input unit, and we connect all of these
input units to a single output unit. The output or state of the output
unit is determined by the image through the strengths or weights of
the connections from the different input units to the output unit. In
essence, the bigger the weight, the more a given input unit affects the
state of the output unit. To make things really simple, we assume
that the object is either a teapot or a cup, and that the state of the
output unit should be 1 when the image contains a teapot and 0 when
it contains a cup.
In order to get a feel for the magnitude of the problem confronting
a neural network, consider a square (black and white) image that is
1,000 × 1,000 pixels. If each pixel can adopt 256 grey-levels then the

Input
layer Output
layer

Input w1
Output
Input w2

Figure 1.3. A neural network with two input units and one output unit. The
connections from the input units to the output unit have weights w1 and w2 .

3
Artificial Neural Networks

number of possible images is 2561,000,000 , which is much more than the


number of atoms in the universe (about 1087 ). Of course, most of these
images look like random noise, and only a tiny fraction depict anything
approaching a natural scene — and only a fraction of those depict a
teapot. The point is that if a network is to discriminate between teapots
and non-teapots then it implicitly must learn to assign a probability to
each of the 2561,000,000 possible images, which is the probability that
each image contains a teapot.
The neural network has to learn to recognise objects in its input
image by adjusting the connection weights between units (here,
between input and output units). The process of adjusting weights
is implemented by a learning algorithm (the word algorithm just means
a definite sequence of logical steps that yields a specified outcome).
Now, suppose the input image contains a teapot but the output unit
state is 0 (indicating that the image depicts a cup). This classification
error occurs because one or more weights have the wrong values.
However, it is far from obvious which weights are wrong and how they
should be adjusted to correct the error. This general problem is called
the credit assignment problem, but it should really be called the blame
assignment problem, or even the whodunnit problem.

(a) (b) (c)

(d) (e) (f)


Figure 1.4. Teapots come in various shapes (a vs b), and the single teapot
in (b)–(f) can appear with many different orientations, sizes, and colours
(represented as shades of grey here).

4
1.3. The Origins of Neural Networks

In practice, knowing whodunnit, and what to do with whoever-did-


do-it, represents the crux of the problem to be solved by neural network
learning algorithms. In order to learn, artificial (and biological) neural
networks must solve this credit/blame assignment problem.
Most solutions to this problem have their roots in an idea proposed
by Donald Hebb 34 in 1949, which is known as Hebb’s postulate:

When an axon of cell A is near enough to excite cell B and


repeatedly or persistently takes part in firing it, some growth process
or metabolic change takes place in one or both cells such that A’s
efficiency, as one of the cells firing B, is increased.

1.3. The Origins of Neural Networks

The development of modern neural networks (see Figure 1.5) occurred


in a series of step-wise dramatic improvements, interspersed with
periods of stasis (often called neural network winters). As early as 1943,
before commercial computers existed, McCulloch and Pitts 77 began to
explore how small networks of artificial neurons could mimic brain-like
processes. The cross-disciplinary nature of this early research is evident
from the fact that McCulloch and Pitts also contributed to a classic
paper on the neurophysiology of vision (What the frog’s eye tells the
frog’s brain 68 , 1959), which was published not in a biology journal, but
in the Proceedings of the Institute of Radio Engineers.
In subsequent years, the increasing availability of computers allowed
neural networks to be tested on simple pattern recognition tasks. But
progress was slow, partly because most research funding was allocated
to more conventional approaches that did not attempt to mimic the
neural architecture of the brain, and partly because artificial neural
network learning algorithms were limited.
A major landmark in the history of neural networks was Frank
Rosenblatt’s perceptron (1958), which was explicitly modelled on the
neuronal structures in the brain. The perceptron learning algorithm
allowed a perceptron to learn to associate inputs with outputs in a
manner apparently similar to learning in humans. Specifically, the
perceptron could learn an association between a simple input ‘image’

5
Artificial Neural Networks

and a desired output, where this output indicated whether or not the
object was present in the image.
Perceptrons, and neural networks in general, share three key
properties with human memory. First, unlike conventional computer
memory, neural network memory is content addressable, which means
that recall is triggered by an image or a sound. In contrast, computer
memory can be accessed only if the specific location (address) of
the required information is known. Second, a common side-effect
of content addressable memory is that, given a learned association
between an input image and an output, recall can be triggered by an
input image that is similar (but not identical) to the original input of a
particular input/output association. This ability to generalise beyond
learned associations is a critical human-like property of artificial neural
networks. Third, if a single weight or unit is destroyed, this does
not remove a particular learned association; instead, it degrades all
associations to some extent. This graceful degradation is thought to
resemble human memory.
Despite these human-like qualities, the perceptron was dealt a severe
blow in 1969, when Minsky and Papert 79 famously proved that it
could not learn associations unless they were of a particularly simple
kind (i.e. linearly separable, as described in Chapter 2). This marked
the beginning of the first neural network winter, during which neural
network research was undertaken by only a handful of scientists.

Reinforcement
Learning

Perceptron Hopfield Network

Backprop Network Boltzmann Machine

Deep Neural Network

Figure 1.5. A simplified taxonomy of artificial neural networks.

6
1.3. The Origins of Neural Networks

The modern era of neural networks began in 1982 with the Hopfield
net 47 . Although Hopfield nets are not practically very useful, Hopfield
introduced a theoretical framework based on statistical mechanics,
which laid the foundations for Ackley, Hinton, and Sejnowki’s
Boltzmann machine 1 in 1985. Unlike a Hopfield net, in which the states
of all units are specified by the associations being learned, a Boltzmann
machine has a reservoir of hidden units, which can be used to learn
complex associations. The Boltzmann machine is important because it
facilitated a conceptual shift away from the idea of a neural network
as a passive associative machine towards the view of a neural network
as a generative model. The only problem is that Boltzmann machines
learn at a rate that is best described as glacial. But on a practical level,
the Boltzmann machine demonstrated that neural networks could learn
to solve complex toy (i.e. small-scale) problems, which suggested that
they could learn to solve almost any problem (at least in principle, and
at least eventually).
The impetus supplied by the Boltzmann machine gave rise to a more
tractable method devised in 1986 by Rumelhart, Hinton, and Williams,
the backpropagation learning algorithm 97 . A backpropagation network
consists of three layers of units: an input layer, which is connected with
connection weights to a hidden layer, which in turn is connected to an
output layer, as shown in Figure 1.6. The backpropagation algorithm
is important because it demonstrated the potential of neural networks
to learn sophisticated tasks in a human-like manner. Crucially, for the
first time, a backpropagation neural network called NETtalk learned to
‘speak’, inasmuch as it translated text to phonemes (the basic elements
of speech), which a voice synthesizer then used to produce speech.

Input Hidden Output


layer layer layer

x1
y
x2

Figure 1.6. A neural network with two input units, two hidden units, and
one output unit.

7
Artificial Neural Networks

In parallel with the evolution of neural networks, reinforcement


learning was developed throughout the 1980s and 1990s, principally
by Sutton and Barto (2018). Reinforcement learning is an inspired
fusion of game playing by computers, as developed by Shannon
(1950) and Samuel (1959), optimal control theory, and stimulus–
response experiments in psychology. Early results showed that hard,
albeit small-scale, problems such as balancing a pole can be solved
using feedback in the form of simple reward signals (Michie and
Chambers, 1968; Barto, Sutton, and Anderson, 1983). More recently,
reinforcement learning has been combined with deep learning to
produce impressive skill acquisition, such as in the case of a glider
that learns to gain height on thermals 28 (see Figure 1.2).

1.4. From Backprop to Deep Learning

In principle, a backprop network with just two hidden layers of units


can associate pretty much any inputs with any outputs 12 , which means
that it should be able to perform most tasks. However, getting a
backprop to learn the tasks that it should be able to perform in theory is
problematic. A plausible solution is to add more units to each layer and
more layers of hidden units, because this should improve learning (at
least in theory). In practice, it was found that conventional backprop
networks struggled to learn if they had deep learning architectures like
that shown in Figure 1.7.

Input Hidden Hidden Hidden


layer layer 1 layer 2 layer 3
Output
layer

Figure 1.7. A deep network with three hidden layers.

8
1.5. An Overview of Chapters

With the benefit of hindsight, the field of artificial intelligence


stems from the research originally done on Hopfield nets, Boltzmann
machines, the backprop algorithm, and reinforcement learning.
However, the evolution of backprop networks into deep learning
networks had to wait for three related developments. To quote one
of the inventors of backprop, the reason for the revolution in neural
networks is because:

In simple terms, computers became millions of times faster, data


sets got thousands of times bigger, and researchers discovered many
technical refinements that made neural nets easier to train.
G Hinton, 2018.

1.5. An Overview of Chapters

In order to appreciate modern deep learning networks, some familiarity


with their historical antecedents is required. Accordingly, Chapter 2
provides a tutorial on the simplest neural network, a linear associative
network. This embodies many of the essential ingredients (e.g.
gradient descent) necessary for understanding more sophisticated
neural networks. In Chapter 3, the role of the perceptron in exposing
the limitations of linear neural networks is described.
In Chapter 4, the backpropagation or backprop algorithm is
introduced and demonstrated on the simplest nonlinear problem (i.e.
the exclusive OR problem). The application of recurrent backprop
neural networks to the problem of recognising sequences (e.g. speech) is
also explored. The properties of backprop are examined in the context
of the cross-entropy error function and in relation to problems of over-
fitting, generalisation, and vanishing gradients.
To understand the stochastic nature of deep learning neural
networks, Chapter 5 examines Hopfield nets, which have their roots in
statistical mechanics. In Chapter 6, the limitations of Hopfield nets are
addressed in the form of Boltzmann machines, which represent the first
generative models. The glacial learning rates of Boltzmann machines
are overcome by restricted Boltzmann machines in Chapter 7. A recent

9
Artificial Neural Networks

innovation in the form of variational autoencoders promises a further


step-change in generative models, as described in Chapter 8.
The use of convolution in backprop networks in the early 1990s
foreshadowed the rise of deep convolutional networks, ladder networks,
denoising autoencoders, generative adversarial networks, temporal deep
networks, and capsule networks, which are described in Chapter 9.
In Chapter 10, the reinforcement learning algorithm is introduced
and demonstrated on examples such as pole balancing. Ingenious
variants of backprop were devised between 2000 and 2016, which, when
combined with reinforcement learning, allowed deep networks to learn
increasingly complex tasks. A particularly striking result of these
developments was a deep network called AlphaGo 108 , which learned
to play the game of Go so well that it beat the world’s best human
players in 2016.
Finally, the prospect of an artificial intelligence revolution is
discussed in Chapter 11, where we consider the two extremes of
naive optimism and cynical pessimism that inevitably accompany any
putative revolution.

10
Chapter 2

Linear Associative Networks

It is no good poking around in the brain without some idea of what


one is looking for. That would be like trying to find a needle in a
haystack without having any idea what needles look like. The theorist
is the man who might reasonably be asked for his opinion about the
appearance of needles.
HC Longuet-Higgins, 1969.

2.1. Introduction

In order to understand the deep learning networks considered in later


chapters, we will sneak up on them slowly, so that what appears
mysterious at a distance will seem obvious by the time it is viewed
close up. Historically, the simplest form of artificial neural network
consists of an array of input units that project to an array of output
units. Usually, every input unit is connected to every output unit via a

Input Output
layer layer

Figure 2.1. A neural network with three input units, which are connected to
three output units via a matrix of nine connection weights.

11
2 Linear Associative Networks

matrix of connection weights, as shown in Figure 2.1. Depending on the


network under consideration, the unit states can have either binary or
real values. The capabilities of these networks were explored between
the late 1950s and early 1970s as perceptrons by Rosenblatt (1958),
adalines by Widrow and Hoff (1960), holophones by Longuet-Higgins
(1968), correlographs by Longuet-Higgins, Willshaw, and Buneman
(1970), and correlation matrix memories by Kohonen (1972).

2.2. Setting One Connection Weight

All of the networks mentioned above are variations on the theme of


linear associative networks. A linear associative network can have
many input and output units, which do not have thresholds, such that
each output is a linear function of the input. To simplify matters, we
dispense with all except one unit in the input layer and one unit in the
output layer, as in Figure 2.2.
The state of the input unit is represented as x, and the state of the
output unit is represented as y. Suppose we wish the network to learn
a single association, such that when the input unit’s state is set to
x = 0.8, the output unit’s state is y = 0.2. The desired target value
of the output unit state is represented as y, to differentiate it from the
actual output unit state y.
The input and output units are connected by a single connection.
Crucially, the strength, or weight, of this connection is the only thing
that determines the output unit state for a given input unit state,
because the output unit state is the input unit state multiplied by the
value of the connection weight. More formally, if the input state is x
and if the weight of the connection from the input unit to the output

Input Output
layer layer

w y
x

Figure 2.2. A neural network with two layers, each containing one artificial
neuron or unit. The state x of the unit in the input layer affects the state y
of the unit in the output layer via a connection weight w (Equation 2.1).

12
2.2. Setting One Connection Weight

unit is w, then the total input u to the output unit is

u = w × x. (2.1)

We will omit the multiplication sign from now on, and write simply
u = wx. In general, the state y of a unit is governed by an activation
function (i.e. input/output function)

y = f (u). (2.2)

However, in a linear associative network, the activation function of each


unit is usually the identity function, which means that the state of the
output unit is equal to its input, y = u = wx, as shown in Figure 2.3.
How should we adjust the weight w so that the input unit state
x = 0.8 yields an output unit state y = 0.2? In this example, the
optimal weight is

w∗ = y/x = 0.25. (2.3)

The optimal weight has an asterisk superscript, which is standard


notation. We can check that this gives the desired output unit state:
y = w∗ × x = 0.25 × 0.8 = 0.2. If learning in a neural network is
so easy, then what is all the fuss about? Well, we cheated in five ways:

1. we calculated the optimal weight value manually, rather than


using a learning algorithm

0.5
Output y=u

-0.5

-1
-1 -0.5 0 0.5 1
Input u

Figure 2.3. A linear activation function, y = u.

13
2 Linear Associative Networks

2. the network only ‘learned’ one association, rather than


simultaneously learning many associations
3. the network had only one input unit, rather than the many units
required to represent images at its input
4. the network used only two layers of units, rather than the many
layers necessary for learning complex associations
5. because each unit has a linear activation function, the network
provides a linear mapping between input and output layers.

The final point introduces the idea that the network, when considered
as a whole, defines a function F that maps unit states in the input
layer to unit states in the output layer. The nature of this function
depends on the activation functions of the individual units and on the
connection weights. Crucially, if the activation functions of the units
are linear then the function F implemented by the network is also
linear, no matter how many layers or how many units the network
contains. In Chapter 3, we will see how the linearity of the network
function F represents a fundamental limitation.
Next, we will incrementally remove each of the cheats listed above,
so that by the time we get to Chapter 4 we will have a nonlinear three-
layer neural network, which can learn many associations.

2.3. Learning One Association

We begin by describing an algorithm for learning a single association in


the network shown in Figure 2.2. For simplicity, we refer to a network
with one input unit and one output unit as a 1-1 network.
Suppose, as above, we wish the network to learn to associate an input
value of x = 0.8 with a target state of y = 0.2. Usually, we have no
idea of the correct value for the weight w, so we may as well begin by
choosing its value at random. Suppose we choose a value w = 0.4, so
the output state is y = 0.4 × 0.8 = 0.32. The difference between the
output y and the target value y is defined here as the delta term:

δ = y − y = 0.32 − 0.2 = 0.12. (2.4)

14
2.3. Learning One Association

A standard measure of the error in y is half the squared difference:

E = 1
2 (wx − y)2 (2.5)
1 2
= 2 (y − y) (2.6)
= 1
2 δ2 . (2.7)

Below we will explore a method for adjusting the weight w


automatically so that δ = 0. However, in the simple case under
consideration here, we can use Equation 2.5 to calculate the value of
E for values of w between 0 and 0.5, and then plot a graph of E
versus w, as in Figure 2.4. This defines a quadratic curve called the
error function, which has its lowest point at w∗ = 0.25, a point that
corresponds to an error of E = 0. Because we can plot the entire error
function, we can see that the solution lies at w∗ = 0.25. This is not
possible with more complex networks, because we cannot plot the entire
error function. Nevertheless, we will continue to explore an automatic
method for finding w∗ using this simple network, which will hold good
for cases where we cannot see the whole error function.
So if we cannot see the whole error function, how can we find the
value of w that minimises E? The following trial-and-error method is
not perfect, but it motivates the better methods described below. We
begin by choosing the value of w at random, as in Figure 2.4 where
we have chosen w = 0.4. How could we decide how to change w in
order to reduce E? Well, even though we don’t know which value of w
minimises the value of E, there are only two possible ways to alter w:
we can make it either bigger or smaller, which corresponds to the right
and left directions in Figure 2.4. If we make w a little bigger then this
increases E, so that is a bad idea. But if we make w a little smaller
then this decreases E, so that is a good idea. And if we repeatedly
make w smaller then E keeps on decreasing until E = 0.
This method works well if there is only one weight, because there
are only two directions to choose from. But as the number of weights
gets bigger, the number of possible combinations of directions grows
exponentially, so this method becomes inefficient. Fortunately, there
exists a more efficient method, called gradient descent.

15
Another random document with
no related content on Scribd:
had been burned, and who had taken shelter in a fortified house.[54]
But he fought with disadvantage against an enemy who must be
hunted before every battle. Some flourishing towns were burned.
John Monoco, a formidable savage, boasted that “he had burned
Medfield and Lancaster, and would burn Groton, Concord,
Watertown and Boston;” adding, “what me will, me do.” He did burn
Groton, but before he had executed the remainder of his threat he
was hanged, in Boston, in September, 1676.[55]
A still more formidable enemy was removed, in the same year, by
the capture of Canonchet, the faithful ally of Philip, who was soon
afterwards shot at Stonington. He stoutly declared to the
Commissioners that “he would not deliver up a Wampanoag, nor the
paring of a Wampanoag’s nail,” and when he was told that his
sentence was death, he said “he liked it well that he was to die
before his heart was soft, or he had spoken anything unworthy of
himself.”[56]
We know beforehand who must conquer in that unequal struggle.
The red man may destroy here and there a straggler, as a wild beast
may; he may fire a farmhouse, or a village; but the association of the
white men and their arts of war give them an overwhelming
advantage, and in the first blast of their trumpet we already hear the
flourish of victory. I confess what chiefly interests me, in the annals
of that war, is the grandeur of spirit exhibited by a few of the Indian
chiefs. A nameless Wampanoag who was put to death by the
Mohicans, after cruel tortures, was asked by his butchers, during the
torture, how he liked the war?—he said, “he found it as sweet as
sugar was to Englishmen.”[57]
The only compensation which war offers for its manifold mischiefs,
is in the great personal qualities to which it gives scope and
occasion. The virtues of patriotism and of prodigious courage and
address were exhibited on both sides, and, in many instances, by
women. The historian of Concord has preserved an instance of the
resolution of one of the daughters of the town. Two young farmers,
Abraham and Isaac Shepherd, had set their sister Mary, a girl of
fifteen years, to watch whilst they threshed grain in the barn. The
Indians stole upon her before she was aware, and her brothers were
slain. She was carried captive into the Indian country, but, at night,
whilst her captors were asleep, she plucked a saddle from under the
head of one of them, took a horse they had stolen from Lancaster,
and having girt the saddle on, she mounted, swam across the
Nashua River, and rode through the forest to her home.[58]
With the tragical end of Philip, the war ended. Beleaguered in his
own country, his corn cut down, his piles of meal and other provision
wasted by the English, it was only a great thaw in January, that,
melting the snow and opening the earth, enabled his poor followers
to come at the ground-nuts, else they had starved. Hunted by
Captain Church, he fled from one swamp to another; his brother, his
uncle, his sister, and his beloved squaw being taken or slain, he was
at last shot down by an Indian deserter, as he fled alone in the dark
of the morning, not far from his own fort.[59]
Concord suffered little from the war. This is to be attributed no
doubt, in part, to the fact that troops were generally quartered here,
and that it was the residence of many noted soldiers. Tradition finds
another cause in the sanctity of its minister. The elder Bulkeley was
gone. In 1659,[60] his bones were laid at rest in the forest. But the
mantle of his piety and of the people’s affection fell upon his son
Edward,[61] the fame of whose prayers, it is said, once saved
Concord from an attack of the Indian.[62] A great defence
undoubtedly was the village of Praying Indians, until this settlement
fell a victim to the envenomed prejudice against their countrymen.
The worst feature in the history of those years, is, that no man spake
for the Indian. When the Dutch, or the French, or the English royalist
disagreed with the Colony, there was always found a Dutch, or
French, or tory party,—an earnest minority,—to keep things from
extremity. But the Indian seemed to inspire such a feeling as the wild
beast inspires in the people near his den. It is the misfortune of
Concord to have permitted a disgraceful outrage upon the friendly
Indians settled within its limits, in February, 1676, which ended in
their forcible expulsion from the town.[63]
This painful incident is but too just an example of the measure
which the Indians have generally received from the whites. For them
the heart of charity, of humanity, was stone. After Philip’s death, their
strength was irrecoverably broken. They never more disturbed the
interior settlements, and a few vagrant families, that are now
pensioners on the bounty of Massachusetts, are all that is left of the
twenty tribes.

“Alas! for them—their day is o’er,


Their fires are out from hill and shore,
No more for them the wild deer bounds,
The plough is on their hunting grounds;
The pale man’s axe rings in their woods,
The pale man’s sail skims o’er their floods,
Their pleasant springs are dry.”[64]

I turn gladly to the progress of our civil history. Before 1666,


15,000 acres had been added by grants of the General Court to the
original territory of the town,[65] so that Concord then included the
greater part of the towns of Bedford, Acton, Lincoln and Carlisle.
In the great growth of the country, Concord participated, as is
manifest from its increasing polls and increased rates. Randolph at
this period writes to the English government, concerning the country
towns; “The farmers are numerous and wealthy, live in good houses;
are given to hospitality; and make good advantage by their corn,
cattle, poultry, butter and cheese.”[66] Edward Bulkeley was the
pastor, until his death, in 1696. His youngest brother, Peter, was
deputy from Concord, and was chosen speaker of the house of
deputies in 1676. The following year, he was sent to England, with
Mr. Stoughton, as agent for the Colony; and on his return, in 1685,
was a royal councillor. But I am sorry to find that the servile
Randolph speaks of him with marked respect.[67] It would seem that
his visit to England had made him a courtier. In 1689, Concord
partook of the general indignation of the province against Andros. A
company marched to the capital under Lieutenant Heald, forming a
part of that body concerning which we are informed, “the country
people came armed into Boston, on the afternoon (of Thursday, 18th
April) in such rage and heat, as made us all tremble to think what
would follow; for nothing would satisfy them but that the governor
must be bound in chains or cords, and put in a more secure place,
and that they would see done before they went away; and to satisfy
them he was guarded by them to the fort.”[68] But the Town Records
of that day confine themselves to descriptions of lands, and to
conferences with the neighboring towns to run boundary lines. In
1699, so broad was their territory, I find the selectmen running the
lines with Chelmsford, Cambridge and Watertown.[69] Some
interesting peculiarities in the manners and customs of the time
appear in the town’s books. Proposals of marriage were made by the
parents of the parties, and minutes of such private agreements
sometimes entered on the clerk’s records.[70] The public charity
seems to have been bestowed in a manner now obsolete. The town
lends its commons as pastures, to poor men; and “being informed of
the great present want of Thomas Pellit, gave order to Stephen
Hosmer to deliver a town cow, of a black color, with a white face,
unto said Pellit, for his present supply.”[71]
From the beginning to the middle of the eighteenth century, our
records indicate no interruption of the tranquillity of the inhabitants,
either in church or in civil affairs. After the death of Rev. Mr.
Estabrook, in 1711, it was propounded at the town-meeting, “whether
one of the three gentlemen lately improved here in preaching,
namely, Mr. John Whiting, Mr. Holyoke and Mr. Prescott, shall be
now chosen in the work of the ministry? Voted affirmatively.”[72] Mr.
Whiting, who was chosen, was, we are told in his epitaph, “a
universal lover of mankind.” The charges of education and of
legislation, at this period, seem to have afflicted the town; for they
vote to petition the General Court to be eased of the law relating to
providing a schoolmaster; happily, the Court refused; and in 1712,
the selectmen agreed with Captain James Minott, “for his son
Timothy to keep the school at the school-house for the town of
Concord, for half a year beginning 2d June; and if any scholar shall
come, within the said time, for larning exceeding his son’s ability, the
said Captain doth agree to instruct them himself in the tongues, till
the above said time be fulfilled; for which service, the town is to pay
Captain Minott ten pounds.”[73] Captain Minott seems to have served
our prudent fathers in the double capacity of teacher and
representative. It is an article in the selectmen’s warrant for the town-
meeting, “to see if the town will lay in for a representative not
exceeding four pounds.” Captain Minott was chosen, and after the
General Court was adjourned received of the town for his services,
an allowance of three shillings per day. The country was not yet so
thickly settled but that the inhabitants suffered from wolves and wild-
cats, which infested the woods; since bounties of twenty shillings are
given as late as 1735, to Indians and whites, for the heads of these
animals, after the constable has cut off the ears.[74]
Mr. Whiting was succeeded in the pastoral office by Rev. Daniel
Bliss, in 1738. Soon after his ordination, the town seems to have
been divided by ecclesiastical discords. In 1741, the celebrated
Whitfield preached here, in the open air, to a great congregation.[75]
Mr. Bliss heard that great orator with delight, and by his earnest
sympathy with him, in opinion and practice, gave offence to a part of
his people. Party and mutual councils were called, but no grave
charge was made good against him. I find, in the Church Records,
the charges preferred against him, his answer thereto, and the result
of the Council. The charges seem to have been made by the lovers
of order and moderation against Mr. Bliss, as a favorer of religious
excitements. His answer to one of the counts breathes such true
piety that I cannot forbear to quote it. The ninth allegation is “That in
praying for himself, in a church-meeting, in December last, he said,
‘he was a poor vile worm of the dust, that was allowed as Mediator
between God and this people.’” To this Mr. Bliss replied, “In the
prayer you speak of, Jesus Christ was acknowledged as the only
Mediator between God and man; at which time, I was filled with
wonder, that such a sinful and worthless worm as I am, was allowed
to represent Christ, in any manner, even so far as to be bringing the
petitions and thank-offerings of the people unto God, and God’s will
and truths to the people; and used the word Mediator in some
differing light from that you have given it; but I confess I was soon
uneasy that I had used the word, lest some would put a wrong
meaning thereupon.”[76] The Council admonished Mr. Bliss of some
improprieties of expression, but bore witness to his purity and fidelity
in his office. In 1764, Whitfield preached again at Concord, on
Sunday afternoon; Mr. Bliss preached in the morning, and the
Concord people thought their minister gave them the better sermon
of the two. It was also his last.[77]
The planting of the colony was the effect of religious principle. The
Revolution was the fruit of another principle,—the devouring thirst for
justice. From the appearance of the article in the Selectmen’s
warrant, in 1765, “to see if the town will give the Representative any
instructions about any important affair to be transacted by the
General Court, concerning the Stamp Act,”[78] to the peace of 1783,
the Town Records breathe a resolute and warlike spirit, so bold from
the first as hardly to admit of increase.
It would be impossible on this occasion to recite all these patriotic
papers. I must content myself with a few brief extracts. On the 24th
January, 1774, in answer to letters received from the united
committees of correspondence, in the vicinity of Boston, the town
say:
“We cannot possibly view with indifference the past and present
obstinate endeavors of the enemies of this, as well as the mother
country, to rob us of those rights, that are the distinguishing glory
and felicity of this land; rights, that we are obliged to no power, under
heaven, for the enjoyment of; as they are the fruit of the heroic
enterprises of the first settlers of these American colonies. And
though we cannot but be alarmed at the great majority, in the British
parliament, for the imposition of unconstitutional taxes on the
colonies, yet, it gives life and strength to every attempt to oppose
them, that not only the people of this, but the neighboring provinces
are remarkably united in the important and interesting opposition,
which, as it succeeded before, in some measure, by the blessing of
heaven, so, we cannot but hope it will be attended with still greater
success, in future.
“Resolved, That these colonies have been and still are illegally
taxed by the British parliament, as they are not virtually represented
therein.
“That the purchasing commodities subject to such illegal taxation
is an explicit, though an impious and sordid resignation of the
liberties of this free and happy people.
“That, as the British parliament have empowered the East India
Company to export their tea into America, for the sole purpose of
raising a revenue from hence; to render the design abortive, we will
not, in this town, either by ourselves, or any from or under us, buy,
sell, or use any of the East India Company’s tea, or any other tea,
whilst there is a duty for raising a revenue thereon in America;
neither will we suffer any such tea to be used in our families.
“That all such persons as shall purchase, sell, or use any such tea,
shall, for the future, be deemed unfriendly to the happy constitution
of this country.
“That, in conjunction with our brethren in America, we will risk our
fortunes, and even our lives, in defence of his majesty, King George
the Third, his person, crown and dignity; and will, also, with the same
resolution, as his freeborn subjects in this country, to the utmost of
our power, defend all our rights inviolate to the latest posterity.
“That, if any person or persons, inhabitants of this province, so
long as there is a duty on tea, shall import any tea from the India
House, in England, or be factors for the East India Company, we will
treat them, in an eminent degree, as enemies to their country, and
with contempt and detestation.
“That we think it our duty, at this critical time of our public affairs, to
return our hearty thanks to the town of Boston, for every rational
measure they have taken for the preservation or recovery of our
invaluable rights and liberties infringed upon; and we hope, should
the state of our public affairs require it, that they will still remain
watchful and persevering; with a steady zeal to espy out everything
that shall have a tendency to subvert our happy constitution.”[79]
On the 27th June, near three hundred persons, upwards of twenty-
one years of age, inhabitants of Concord, entered into a covenant,
“solemnly engaging with each other, in the presence of God, to
suspend all commercial intercourse with Great Britain, until the act
for blocking the harbor of Boston be repealed; and neither to buy nor
consume any merchandise imported from Great Britain, nor to deal
with those who do.”[80]
In August, a County Convention met in this town, to deliberate
upon the alarming state of public affairs, and published an admirable
report.[81] In September, incensed at the new royal law which made
the judges dependent on the crown, the inhabitants assembled on
the common, and forbade the justices to open the court of sessions.
This little town then assumed the sovereignty. It was judge and jury
and council and king. On the 26th of the month, the whole town
resolved itself into a committee of safety, “to suppress all riots,
tumults, and disorders in said town, and to aid all untainted
magistrates in the execution of the laws of the land.” It was then
voted, to raise one or more companies of minute-men, by enlistment,
to be paid by the town whenever called out of town; and to provide
arms and ammunition, “that those who are unable to purchase them
themselves, may have the advantage of them, if necessity calls for
it.” In October, the Provincial Congress met in Concord. John
Hancock was President. This body was composed of the foremost
patriots, and adopted those efficient measures whose progress and
issue belong to the history of the nation.[82]
The clergy of New England were, for the most part, zealous
promoters of the Revolution. A deep religious sentiment sanctified
the thirst for liberty. All the military movements in this town were
solemnized by acts of public worship. In January, 1775, a meeting
was held for the enlisting of minute-men. Reverend William
Emerson, the chaplain of the Provincial Congress, preached to the
people. Sixty men enlisted and, in a few days, many more. On 13th
March, at a general review of all the military companies, he preached
to a very full assembly, taking for his text, 2 Chronicles xiii. 12, “And,
behold, God himself is with us for our captain, and his priests with
sounding trumpets to cry alarm against you.”[83] It is said that all the
services of that day made a deep impression on the people, even to
the singing of the psalm.
A large amount of military stores had been deposited in this town,
by order of the Provincial Committee of Safety. It was to destroy
those stores that the troops who were attacked in this town, on the
19th April, 1775, were sent hither by General Gage.
The story of that day is well known. In these peaceful fields, for the
first time since a hundred years, the drum and alarm-gun were
heard, and the farmers snatched down their rusty firelocks from the
kitchen walls, to make good the resolute words of their town
debates. In the field where the western abutment of the old bridge
may still be seen, about half a mile from this spot, the first organized
resistance was made to the British arms. There the Americans first
shed British blood. Eight hundred British soldiers, under the
command of Lieutenant-Colonel Francis Smith, had marched from
Boston to Concord; at Lexington had fired upon the brave handful of
militia, for which a speedy revenge was reaped by the same militia in
the afternoon. When they entered Concord, they found the militia
and minute-men assembled under the command of Colonel Barrett
and Major Buttrick. This little battalion, though in their hasty council
some were urgent to stand their ground, retreated before the enemy
to the high land on the other bank of the river, to wait for
reinforcement. Colonel Barrett ordered the troops not to fire, unless
fired upon. The British following them across the bridge, posted two
companies, amounting to about one hundred men, to guard the
bridge, and secure the return of the plundering party. Meantime, the
men of Acton, Bedford, Lincoln and Carlisle, all once included in
Concord, remembering their parent town in the hour of danger,
arrived and fell into the ranks so fast, that Major Buttrick found
himself superior in number to the enemy’s party at the bridge. And
when the smoke began to rise from the village where the British
were burning cannon-carriages and military stores, the Americans
resolved to force their way into town. The English beginning to pluck
up some of the planks of the bridge, the Americans quickened their
pace, and the British fired one or two shots up the river (our ancient
friend here, Master Blood,[84] saw the water struck by the first ball);
then a single gun, the ball from which wounded Luther Blanchard
and Jonas Brown, and then a volley, by which Captain Isaac Davis
and Abner Hosmer of Acton were instantly killed. Major Buttrick
leaped from the ground, and gave the command to fire, which was
repeated in a simultaneous cry by all his men. The Americans fired,
and killed two men and wounded eight. A head-stone and a foot-
stone, on this bank of the river, mark the place where these first
victims lie.[85] The British retreated immediately towards the village,
and were joined by two companies of grenadiers, whom the noise of
the firing had hastened to the spot. The militia and minute-men—
every one from that moment being his own commander—ran over
the hills opposite the battle-field, and across the great fields, into the
east quarter of the town, to waylay the enemy, and annoy his retreat.
The British, as soon as they were rejoined by the plundering
detachment, began that disastrous retreat to Boston, which was an
omen to both parties of the event of the war.
In all the anecdotes of that day’s events we may discern the
natural action of the people. It was not an extravagant ebullition of
feeling, but might have been calculated on by any one acquainted
with the spirits and habits of our community. Those poor farmers who
came up, that day, to defend their native soil, acted from the simplest
instincts. They did not know it was a deed of fame they were doing.
These men did not babble of glory. They never dreamed their
children would contend who had done the most. They supposed they
had a right to their corn and their cattle, without paying tribute to any
but their own governors. And as they had no fear of man, they yet
did have a fear of God. Captain Charles Miles, who was wounded in
the pursuit of the enemy, told my venerable friend who sits by me,
that “he went to the services of that day, with the same seriousness
and acknowledgment of God, which he carried to church.”[86]
The presence of these aged men who were in arms on that day
seems to bring us nearer to it. The benignant Providence which has
prolonged their lives to this hour gratifies the strong curiosity of the
new generation. The Pilgrims are gone; but we see what manner of
persons they were who stood in the worst perils of the Revolution.
We hold by the hand the last of the invincible men of old, and
confirm from living lips the sealed records of time.
And you, my fathers, whom God and the history of your country
have ennobled, may well bear a chief part in keeping this peaceful
birthday of our town. You are indeed extraordinary heroes. If ever
men in arms had a spotless cause, you had. You have fought a good
fight. And having quit you like men in the battle, you have quit
yourselves like men in your virtuous families; in your cornfields; and
in society. We will not hide your honorable gray hairs under perishing
laurel-leaves, but the eye of affection and veneration follows you.
You are set apart—and forever—for the esteem and gratitude of the
human race. To you belongs a better badge than stars and ribbons.
This prospering country is your ornament, and this expanding nation
is multiplying your praise with millions of tongues.[87]
The agitating events of those days were duly remembered in the
church. On the second day after the affray, divine service was
attended, in this house, by 700 soldiers. William Emerson, the
pastor, had a hereditary claim to the affection of the people, being
descended in the fourth generation from Edward Bulkeley, son of
Peter. But he had merits of his own. The cause of the Colonies was
so much in his heart that he did not cease to make it the subject of
his preaching and his prayers, and is said to have deeply inspired
many of his people with his own enthusiasm. He, at least, saw
clearly the pregnant consequences of the 19th April. I have found
within a few days, among some family papers, his almanac of 1775,
in a blank leaf of which he has written a narrative of the fight;[88] and
at the close of the month, he writes, “This month remarkable for the
greatest events of the present age.” To promote the same cause, he
asked, and obtained of the town, leave to accept the commission of
chaplain to the Northern army, at Ticonderoga, and died, after a few
months, of the distemper that prevailed in the camp.[89]
In the whole course of the war the town did not depart from this
pledge it had given. Its little population of 1300 souls behaved like a
party to the contest. The number of its troops constantly in service is
very great. Its pecuniary burdens are out of all proportion to its
capital. The economy so rigid, which marked its earlier history, has
all vanished. It spends profusely, affectionately, in the service.
“Since,” say the plaintive records, “General Washington, at
Cambridge, is not able to give but 24s. per cord for wood, for the
army; it is Voted, that this town encourage the inhabitants to supply
the army, by paying two dollars per cord, over and above the
General’s price, to such as shall carry wood thither;”[90] and 210
cords of wood were carried. A similar order is taken respecting hay.
Whilst Boston was occupied by the British troops, Concord
contributed to the relief of the inhabitants, £70, in money; 225
bushels of grain; and a quantity of meat and wood. When, presently,
the poor of Boston were quartered by the Provincial Congress on the
neighboring country, Concord received 82 persons to its hospitality.
In the year 1775, it raised 100 minute-men, and 74 soldiers to serve
at Cambridge. In March, 1776, 145 men were raised by this town to
serve at Dorchester Heights.[91] In June, the General Assembly of
Massachusetts resolved to raise 5000 militia for six months, to
reinforce the Continental army. “The numbers,” say they, “are large,
but this Court has the fullest assurance that their brethren, on this
occasion, will not confer with flesh and blood, but will, without
hesitation, and with the utmost alacrity and despatch, fill up the
numbers proportioned to the several towns.”[92] On that occasion,
Concord furnished 67 men, paying them itself, at an expense of
£622. And so on, with every levy, to the end of the war. For these
men it was continually providing shoes, stockings, shirts, coats,
blankets and beef. The taxes, which, before the war, had not much
exceeded £200 per annum, amounted, in the year 1782, to $9544, in
silver.[93]
The great expense of the war was borne with cheerfulness, whilst
the war lasted; but years passed, after the peace, before the debt
was paid. As soon as danger and injury ceased, the people were left
at leisure to consider their poverty and their debts. The Town
Records show how slowly the inhabitants recovered from the strain
of excessive exertion. Their instructions to their representatives are
full of loud complaints of the disgraceful state of public credit, and
the excess of public expenditure. They may be pardoned, under
such distress, for the mistakes of an extreme frugality. They fell into
a common error, not yet dismissed to the moon, that the remedy
was, to forbid the great importation of foreign commodities, and to
prescribe by law the prices of articles. The operation of a new
government was dreaded, lest it should prove expensive, and the
country towns thought it would be cheaper if it were removed from
the capital. They were jealous lest the General Court should pay
itself too liberally, and our fathers must be forgiven by their charitable
posterity, if, in 1782, before choosing a representative, it was “Voted,
that the person who should be chosen representative to the General
Court should receive 6s. per day, whilst in actual service, an account
of which time he should bring to the town, and if it should be that the
General Court should resolve, that, their pay should be more than
6s., then the representative shall be hereby directed to pay the
overplus into the town treasury.”[94] This was securing the prudence
of the public servants.
But whilst the town had its own full share of the public distress, it
was very far from desiring relief at the cost of order and law. In 1786,
when the general sufferings drove the people in parts of Worcester
and Hampshire counties to insurrection, a large party of armed
insurgents arrived in this town, on the 12th September, to hinder the
sitting of the Court of Common Pleas. But they found no
countenance here.[95] The same people who had been active in a
County Convention to consider grievances, condemned the
rebellion, and joined the authorities in putting it down.[96] In 1787, the
admirable instructions given by the town to its representative are a
proud monument of the good sense and good feeling that prevailed.
The grievances ceased with the adoption of the Federal Constitution.
The constitution of Massachusetts had been already accepted. It
was put to the town of Concord, in October, 1776, by the Legislature,
whether the existing house of representatives should enact a
constitution for the State? The town answered No.[97] The General
Court, notwithstanding, draughted a constitution, sent it here, and
asked the town whether they would have it for the law of the State?
The town answered No, by a unanimous vote. In 1780, a constitution
of the State, proposed by the Convention chosen for that purpose,
was accepted by the town with the reservation of some articles.[98]
And, in 1788, the town, by its delegate, accepted the new
Constitution of the United States, and this event closed the whole
series of important public events in which this town played a part.
From that time to the present hour, this town has made a slow but
constant progress in population and wealth, and the arts of peace. It
has suffered neither from war, nor pestilence, nor famine, nor
flagrant crime. Its population, in the census of 1830, was 2020 souls.
The public expenses, for the last year, amounted to $4290; for the
present year, to $5040.[99] If the community stints its expense in
small matters, it spends freely on great duties. The town raises, this
year, $1800 for its public schools; besides about $1200 which are
paid, by subscription, for private schools. This year, it expends $800
for its poor; the last year it expended $900. Two religious societies,
of differing creed, dwell together in good understanding, both
promoting, we hope, the cause of righteousness and love.[100]
Concord has always been noted for its ministers. The living need no
praise of mine. Yet it is among the sources of satisfaction and
gratitude, this day, that the aged with whom is wisdom, our fathers’
counsellor and friend, is spared to counsel and intercede for the
sons.[101]
Such, fellow citizens, is an imperfect sketch of the history of
Concord. I have been greatly indebted, in preparing this sketch, to
the printed but unpublished History of this town, furnished me by the
unhesitating kindness of its author, long a resident in this place. I
hope that History will not long remain unknown. The author has done
us and posterity a kindness, by the zeal and patience of his
research, and has wisely enriched his pages with the resolutions,
addresses and instructions to its agents, which from time to time, at
critical periods, the town has voted.[102] Meantime, I have read with
care the Town Records themselves. They must ever be the fountains
of all just information respecting your character and customs. They
are the history of the town. They exhibit a pleasing picture of a
community almost exclusively agricultural, where no man has much
time for words, in his search after things; of a community of great
simplicity of manners, and of a manifest love of justice. For the most
part, the town has deserved the name it wears. I find our annals
marked with a uniform good sense. I find no ridiculous laws, no
eavesdropping legislators, no hanging of witches, no ghosts, no
whipping of Quakers, no unnatural crimes. The tone of the Records
rises with the dignity of the event. These soiled and musty books are
luminous and electric within. The old town clerks did not spell very
correctly, but they contrive to make pretty intelligible the will of a free
and just community. Frugal our fathers were,—very frugal,—though,
for the most part, they deal generously by their minister, and provide
well for the schools and the poor. If, at any time, in common with
most of our towns, they have carried this economy to the verge of a
vice, it is to be remembered that a town is, in many respects, a
financial corporation. They economize, that they may sacrifice. They
stint and higgle on the price of a pew, that they may send 200
soldiers to General Washington to keep Great Britain at bay. For
splendor, there must somewhere be rigid economy. That the head of
the house may go brave, the members must be plainly clad, and the
town must save that the State may spend. Of late years, the growth
of Concord has been slow. Without navigable waters, without mineral
riches, without any considerable mill privileges, the natural increase
of her population is drained by the constant emigration of the youth.
Her sons have settled the region around us, and far from us. Their
wagons have rattled down the remote western hills. And in every
part of this country, and in many foreign parts, they plough the earth,
they traverse the sea, they engage in trade and in all the
professions.[103]
Fellow citizens; let not the solemn shadows of two hundred years,
this day, fall over us in vain. I feel some unwillingness to quit the
remembrance of the past. With all the hope of the new I feel that we
are leaving the old. Every moment carries us farther from the two
great epochs of public principle, the Planting, and the Revolution of
the colony. Fortunate and favored this town has been, in having
received so large an infusion of the spirit of both of those periods.
Humble as is our village in the circle of later and prouder towns that
whiten the land, it has been consecrated by the presence and
activity of the purest men. Why need I remind you of our own
Hosmers, Minotts, Cumings, Barretts, Beattons, the departed
benefactors of the town? On the village green have been the steps
of Winthrop and Dudley; of John Eliot, the Indian apostle, who had a
courage that intimidated those savages whom his love could not
melt; of Whitfield, whose silver voice melted his great congregation
into tears; of Hancock, and his compatriots of the Provincial
Congress; of Langdon, and the college over which he presided. But
even more sacred influences than these have mingled here with the
stream of human life. The merit of those who fill a space in the
world’s history, who are borne forward, as it were, by the weight of
thousands whom they lead, sheds a perfume less sweet than do the
sacrifices of private virtue. I have had much opportunity of access to
anecdotes of families, and I believe this town to have been the
dwelling-place, in all times since its planting, of pious and excellent
persons, who walked meekly through the paths of common life, who
served God, and loved man, and never let go the hope of
immortality. The benediction of their prayers and of their principles
lingers around us. The acknowledgment of the Supreme Being
exalts the history of this people. It brought the fathers hither. In a war
of principle, it delivered their sons. And so long as a spark of this
faith survives among the children’s children so long shall the name of
Concord be honest and venerable.
III
LETTER TO MARTIN VAN BUREN,
PRESIDENT OF THE UNITED STATES

A PROTEST AGAINST THE REMOVAL OF THE


CHEROKEE INDIANS FROM THE STATE OF
GEORGIA

“Say, what is Honour? ’Tis the finest sense


Of justice which the human mind can frame,
Intent each lurking frailty to disclaim,
And guard the way of life from all offence,
Suffered or done.”

Wordsworth.

LETTER
TO MARTIN VAN BUREN,
PRESIDENT OF THE UNITED STATES

Concord, Mass., April 23, 1838.


Sir: The seat you fill places you in a relation of credit and
nearness to every citizen. By right and natural position, every citizen
is your friend. Before any acts contrary to his own judgment or
interest have repelled the affections of any man, each may look with
trust and living anticipation to your government. Each has the
highest right to call your attention to such subjects as are of a public
nature, and properly belong to the chief magistrate; and the good
magistrate will feel a joy in meeting such confidence. In this belief
and at the instance of a few of my friends and neighbors, I crave of
your patience a short hearing for their sentiments and my own: and
the circumstance that my name will be utterly unknown to you will
only give the fairer chance to your equitable construction of what I
have to say.
Sir, my communication respects the sinister rumors that fill this
part of the country concerning the Cherokee people. The interest
always felt in the aboriginal population—an interest naturally growing
as that decays—has been heightened in regard to this tribe. Even in
our distant State some good rumor of their worth and civility has
arrived. We have learned with joy their improvement in the social
arts. We have read their newspapers. We have seen some of them
in our schools and colleges. In common with the great body of the
American people, we have witnessed with sympathy the painful
labors of these red men to redeem their own race from the doom of
eternal inferiority, and to borrow and domesticate in the tribe the arts
and customs of the Caucasian race. And notwithstanding the
unaccountable apathy with which of late years the Indians have been
sometimes abandoned to their enemies, it is not to be doubted that it
is the good pleasure and the understanding of all humane persons in
the Republic, of the men and the matrons sitting in the thriving
independent families all over the land, that they shall be duly cared
for; that they shall taste justice and love from all to whom we have
delegated the office of dealing with them.
The newspapers now inform us that, in December, 1835, a treaty
contracting for the exchange of all the Cherokee territory was
pretended to be made by an agent on the part of the United States
with some persons appearing on the part of the Cherokees; that the
fact afterwards transpired that these deputies did by no means
represent the will of the nation; and that, out of eighteen thousand
souls composing the nation, fifteen thousand six hundred and sixty-
eight have protested against the so-called treaty. It now appears that
the government of the United States choose to hold the Cherokees
to this sham treaty, and are proceeding to execute the same. Almost
the entire Cherokee Nation stand up and say, “This is not our act.
Behold us. Here are we. Do not mistake that handful of deserters for
us;” and the American President and the Cabinet, the Senate and
the House of Representatives, neither hear these men nor see them,
and are contracting to put this active nation into carts and boats, and
to drag them over mountains and rivers to a wilderness at a vast
distance beyond the Mississippi. And a paper purporting to be an
army order fixes a month from this day as the hour for this doleful
removal.
In the name of God, sir, we ask you if this be so. Do the
newspapers rightly inform us? Men and women with pale and
perplexed faces meet one another in the streets and churches here,
and ask if this be so. We have inquired if this be a gross
misrepresentation from the party opposed to the government and
anxious to blacken it with the people. We have looked in the
newspapers of different parties and find a horrid confirmation of the
tale. We are slow to believe it. We hoped the Indians were
misinformed, and that their remonstrance was premature, and will
turn out to be a needless act of terror.
The piety, the principle that is left in the United States, if only in its
coarsest form, a regard to the speech of men,—forbid us to entertain
it as a fact. Such a dereliction of all faith and virtue, such a denial of
justice, and such deafness to screams for mercy were never heard
of in times of peace and in the dealing of a nation with its own allies
and wards, since the earth was made. Sir, does this government
think that the people of the United States are become savage and
mad? From their mind are the sentiments of love and a good nature
wiped clean out? The soul of man, the justice, the mercy that is the
heart’s heart in all men, from Maine to Georgia, does abhor this
business.
In speaking thus the sentiments of my neighbors and my own,
perhaps I overstep the bounds of decorum. But would it not be a
higher indecorum coldly to argue a matter like this? We only state
the fact that a crime is projected that confounds our understandings
by its magnitude,—a crime that really deprives us as well as the
Cherokees of a country? for how could we call the conspiracy that
should crush these poor Indians our government, or the land that
was cursed by their parting and dying imprecations our country, any
more? You, sir, will bring down that renowned chair in which you sit
into infamy if your seal is set to this instrument of perfidy; and the
name of this nation, hitherto the sweet omen of religion and liberty,
will stink to the world.
You will not do us the injustice of connecting this remonstrance
with any sectional and party feeling. It is in our hearts the simplest
commandment of brotherly love. We will not have this great and
solemn claim upon national and human justice huddled aside under
the flimsy plea of its being a party act. Sir, to us the questions upon
which the government and the people have been agitated during the
past year, touching the prostration of the currency and of trade,
seem but motes in comparison. These hard times, it is true, have
brought the discussion home to every farmhouse and poor man’s
house in this town; but it is the chirping of grasshoppers beside the
immortal question whether justice shall be done by the race of
civilized to the race of savage man,—whether all the attributes of
reason, of civility, of justice, and even of mercy, shall be put off by
the American people, and so vast an outrage upon the Cherokee
Nation and upon human nature shall be consummated.
One circumstance lessens the reluctance with which I intrude at
this time on your attention my conviction that the government ought
to be admonished of a new historical fact, which the discussion of
this question has disclosed, namely, that there exists in a great part
of the Northern people a gloomy diffidence in the moral character of
the government.
On the broaching of this question, a general expression of
despondency, of disbelief that any good will accrue from a
remonstrance on an act of fraud and robbery, appeared in those men
to whom we naturally turn for aid and counsel. Will the American
government steal? Will it lie? Will it kill?—We ask triumphantly. Our
counsellors and old statesmen here say that ten years ago they
would have staked their lives on the affirmation that the proposed
Indian measures could not be executed; that the unanimous country

You might also like