Deep Learning
Ian Goodfellow
Yoshua Bengio
Aaron Courville
Contents

Website
Acknowledgments
Notation

1 Introduction
  1.1 Who Should Read This Book?
  1.2 Historical Trends in Deep Learning

2 Linear Algebra
  2.1 Scalars, Vectors, Matrices and Tensors
  2.2 Multiplying Matrices and Vectors
  2.3 Identity and Inverse Matrices
  2.4 Linear Dependence and Span
  2.5 Norms
  2.6 Special Kinds of Matrices and Vectors
  2.7 Eigendecomposition
  2.8 Singular Value Decomposition
  2.9 The Moore-Penrose Pseudoinverse
  2.10 The Trace Operator
  2.11 The Determinant
  2.12 Example: Principal Components Analysis

4 Numerical Computation
  4.1 Overflow and Underflow
  4.2 Poor Conditioning
  4.3 Gradient-Based Optimization
  4.4 Constrained Optimization
  4.5 Example: Linear Least Squares

12 Applications
  12.1 Large-Scale Deep Learning
  12.2 Computer Vision
  12.3 Speech Recognition
  12.4 Natural Language Processing
  12.5 Other Applications

14 Autoencoders
  14.1 Undercomplete Autoencoders
  14.2 Regularized Autoencoders
  14.3 Representational Power, Layer Size and Depth
  14.4 Stochastic Encoders and Decoders
  14.5 Denoising Autoencoders
  14.6 Learning Manifolds with Autoencoders
  14.7 Contractive Autoencoders
  14.8 Predictive Sparse Decomposition
  14.9 Applications of Autoencoders

Bibliography

Index
Website

www.deeplearningbook.org
Acknowledgments
This book would not have been possible without the contributions of many people.
We would like to thank those who commented on our proposal for the book
and helped plan its contents and organization: Guillaume Alain, Kyunghyun Cho,
Çağlar Gülçehre, David Krueger, Hugo Larochelle, Razvan Pascanu and Thomas
Rohée.
We would like to thank the people who offered feedback on the content of the
book itself. Some offered feedback on many chapters: Martín Abadi, Ishaq Aden-Ali,
Guillaume Alain, Ion Androutsopoulos, Laura Ball, Fred Bertsch, Olexa Bilaniuk,
Ufuk Can Biçici, Matko Bošnjak, John Boersma, François Brault, Greg Brockman,
Alexandre de Brébisson, Pierre Luc Carrier, Sarath Chandar, Pawel Chilinski,
Mark Daoust, Oleg Dashevskii, Laurent Dinh, Stephan Dreseitl, Gudmundur
Einarsson, Hannes von Essen, Jim Fan, Miao Fan, Meire Fortunato, Frédéric
Francis, Nando de Freitas, Çağlar Gülçehre, Jurgen Van Gael, Yaroslav Ganin,
Javier Alonso García, Aydin Gerek, Stefan Heil, Jonathan Hunt, Gopi Jeyaram,
Chingiz Kabytayev, Lukasz Kaiser, Varun Kanade, Asifullah Khan, Akiel Khan,
John King, Diederik P. Kingma, Dominik Laupheimer, Yann LeCun, Minh Lê, Max
Marion, Rudolf Mathey, Matías Mattamala, Abhinav Maurya, Vincent Michalski,
Kevin Murphy, Oleg Mürk, Hung Ngo, Roman Novak, Augustus Q. Odena, Simon
Pavlik, Karl Pichotta, Eddie Pierce, Kari Pulli, Roussel Rahman, Tapani Raiko,
Anurag Ranjan, Johannes Roith, Mihaela Rosca, Halis Sak, César Salgado, Grigory
Sapunov, Yoshinori Sasaki, Mike Schuster, Julian Serban, Nir Shabat, Ken Shirriff,
Andre Simpelo, Scott Stanley, David Sussillo, Ilya Sutskever, Carles Gelada Sáez,
Graham Taylor, Valentin Tolmer, Massimiliano Tomassoli, An Tran, Shubhendu
Trivedi, Alexey Umnov, Vincent Vanhoucke, Robert Viragh, Marco Visentini-
Scarzanella, Martin Vita, David Warde-Farley, Dustin Webb, Shan-Conrad Wolf,
Kelvin Xu, Wei Xue, Ke Yang, Li Yao, Zygmunt Zając and Ozan Çağlayan.
We would also like to thank those who provided us with useful feedback on
individual chapters:
• Chapter 16, Structured Probabilistic Models for Deep Learning: Deng Qingyu,
Harry Braviner, Timothy Cogan, Diego Marez, Anton Varfolom and Victor
Xie.
• Chapter 18, Confronting the Partition Function: Sam Bowman and Jin Kim.
We are also grateful to have been able to work on this book and receive
feedback and guidance from colleagues. We would especially like
to thank Ian’s former manager, Greg Corrado, and his current manager, Samy
Bengio, for their support of this project. Finally, we would like to thank Geoffrey
Hinton for encouragement when writing was difficult.
Notation
This section provides a concise reference describing the notation used throughout
this book. If you are unfamiliar with any of the corresponding mathematical
concepts, we describe most of these ideas in chapters 2–4.
Indexing

a_i        Element i of vector a, with indexing starting at 1
a_{−i}     All elements of vector a except for element i
A_{i,j}    Element i, j of matrix A
A_{i,:}    Row i of matrix A
A_{:,i}    Column i of matrix A
A_{i,j,k}  Element (i, j, k) of a 3-D tensor A
A_{:,:,i}  2-D slice of a 3-D tensor
a_i        Element i of the random vector a
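In code, this 1-based notation maps onto 0-based array indexing. A minimal sketch with plain Python lists (the matrix values are arbitrary):

```python
# Hypothetical 2x3 matrix A; the book's 1-based A_{i,j} corresponds to
# Python's 0-based A[i-1][j-1].
A = [[1, 2, 3],
     [4, 5, 6]]

def elem(A, i, j):
    """Book-style 1-based element A_{i,j}."""
    return A[i - 1][j - 1]

def row(A, i):
    """Book-style row A_{i,:}."""
    return A[i - 1]

def col(A, j):
    """Book-style column A_{:,j}."""
    return [r[j - 1] for r in A]

print(elem(A, 2, 3))  # A_{2,3}
print(row(A, 1))      # A_{1,:}
print(col(A, 2))      # A_{:,2}
```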
Calculus

dy/dx                   Derivative of y with respect to x
∂y/∂x                   Partial derivative of y with respect to x
∇_x y                   Gradient of y with respect to x
∇_X y                   Matrix of derivatives of y with respect to X
∇_X y                   Tensor containing derivatives of y with respect to a tensor X
∂f/∂x                   Jacobian matrix J ∈ R^{m×n} of f : R^n → R^m
∇_x² f(x) or H(f)(x)    The Hessian matrix of f at input point x
∫ f(x) dx               Definite integral over the entire domain of x
∫_S f(x) dx             Definite integral with respect to x over the set S
Functions

f : A → B       The function f with domain A and range B
f ∘ g           Composition of the functions f and g
f(x; θ)         A function of x parametrized by θ. (Sometimes we write f(x)
                and omit the argument θ to lighten notation)
log x           Natural logarithm of x
σ(x)            Logistic sigmoid, 1 / (1 + exp(−x))
ζ(x)            Softplus, log(1 + exp(x))
||x||_p         L^p norm of x
||x||           L^2 norm of x
x^+             Positive part of x, i.e., max(0, x)
1_condition     Is 1 if the condition is true, 0 otherwise
Sometimes we use a function f whose argument is a scalar but apply it to a
vector, matrix, or tensor. This denotes the application of f to the array
element-wise: for example, if C = σ(X), then C_{i,j,k} = σ(X_{i,j,k}) for all
valid values of i, j and k.
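These definitions translate directly into code. A minimal pure-Python sketch of the sigmoid, the softplus, and element-wise application of a scalar function to a nested array:

```python
import math

def sigmoid(x):
    # Logistic sigmoid: sigma(x) = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + math.exp(-x))

def softplus(x):
    # Softplus: zeta(x) = log(1 + exp(x))
    return math.log1p(math.exp(x))

def elementwise(f, X):
    # Apply scalar f to a (possibly nested) list element-wise,
    # as in C_{i,j,k} = sigma(X_{i,j,k}).
    if isinstance(X, list):
        return [elementwise(f, x) for x in X]
    return f(X)

print(sigmoid(0.0))                      # 0.5
print(elementwise(sigmoid, [[0.0], [0.0]]))
```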
Chapter 1
Introduction
Inventors have long dreamed of creating machines that think. This desire dates
back to at least the time of ancient Greece. The mythical figures Pygmalion,
Daedalus, and Hephaestus may all be interpreted as legendary inventors, and
Galatea, Talos, and Pandora may all be regarded as artificial life (Ovid and Martin,
2004; Sparkes, 1996; Tandy, 1997).
When programmable computers were first conceived, people wondered whether
such machines might become intelligent, over a hundred years before one was
built (Lovelace, 1842). Today, artificial intelligence (AI) is a thriving field with
many practical applications and active research topics. We look to intelligent
software to automate routine labor, understand speech or images, make diagnoses
in medicine and support basic scientific research.
In the early days of artificial intelligence, the field rapidly tackled and solved
problems that are intellectually difficult for human beings but relatively straight-
forward for computers—problems that can be described by a list of formal, math-
ematical rules. The true challenge to artificial intelligence proved to be solving
the tasks that are easy for people to perform but hard for people to describe
formally—problems that we solve intuitively, that feel automatic, like recognizing
spoken words or faces in images.
This book is about a solution to these more intuitive problems. This solution is
to allow computers to learn from experience and understand the world in terms of
a hierarchy of concepts, with each concept defined through its relation to simpler
concepts. By gathering knowledge from experience, this approach avoids the need
for human operators to formally specify all the knowledge that the computer needs.
The hierarchy of concepts enables the computer to learn complicated concepts by
building them out of simpler ones. If we draw a graph showing how these concepts
are built on top of each other, the graph is deep, with many layers. For this reason,
we call this approach to AI deep learning.
Many of the early successes of AI took place in relatively sterile and formal
environments and did not require computers to have much knowledge about
the world. For example, IBM’s Deep Blue chess-playing system defeated world
champion Garry Kasparov in 1997 (Hsu, 2002). Chess is of course a very simple
world, containing only sixty-four locations and thirty-two pieces that can move
in only rigidly circumscribed ways. Devising a successful chess strategy is a
tremendous accomplishment, but the challenge is not due to the difficulty of
describing the set of chess pieces and allowable moves to the computer. Chess
can be completely described by a very brief list of completely formal rules, easily
provided ahead of time by the programmer.
Ironically, abstract and formal tasks that are among the most difficult mental
undertakings for a human being are among the easiest for a computer. Computers
have long been able to defeat even the best human chess player but only recently
have begun matching some of the abilities of average human beings to recognize
objects or speech. A person’s everyday life requires an immense amount of
knowledge about the world. Much of this knowledge is subjective and intuitive,
and therefore difficult to articulate in a formal way. Computers need to capture
this same knowledge in order to behave in an intelligent way. One of the key
challenges in artificial intelligence is how to get this informal knowledge into a
computer.
Several artificial intelligence projects have sought to hard-code knowledge
about the world in formal languages. A computer can reason automatically about
statements in these formal languages using logical inference rules. This is known as
the knowledge base approach to artificial intelligence. None of these projects has
led to a major success. One of the most famous such projects is Cyc (Lenat and
Guha, 1989). Cyc is an inference engine and a database of statements in a language
called CycL. These statements are entered by a staff of human supervisors. It is an
unwieldy process. People struggle to devise formal rules with enough complexity
to accurately describe the world. For example, Cyc failed to understand a story
about a person named Fred shaving in the morning (Linde, 1992). Its inference
engine detected an inconsistency in the story: it knew that people do not have
electrical parts, but because Fred was holding an electric razor, it believed the
entity “FredWhileShaving” contained electrical parts. It therefore asked whether
Fred was still a person while he was shaving.
The difficulties faced by systems relying on hard-coded knowledge suggest
that AI systems need the ability to acquire their own knowledge, by extracting
patterns from raw data. This capability is known as machine learning. The
introduction of machine learning enabled computers to tackle problems involving
knowledge of the real world and make decisions that appear subjective. A simple
machine learning algorithm called logistic regression can determine whether to
recommend cesarean delivery (Mor-Yosef et al., 1990). A simple machine learning
algorithm called naive Bayes can separate legitimate e-mail from spam e-mail.
The performance of these simple machine learning algorithms depends heavily
on the representation of the data they are given. For example, when logistic
regression is used to recommend cesarean delivery, the AI system does not examine
the patient directly. Instead, the doctor tells the system several pieces of relevant
information, such as the presence or absence of a uterine scar. Each piece of
information included in the representation of the patient is known as a feature.
Logistic regression learns how each of these features of the patient correlates with
various outcomes. However, it cannot influence how features are defined in any
way. If logistic regression were given an MRI scan of the patient, rather than
the doctor’s formalized report, it would not be able to make useful predictions.
Individual pixels in an MRI scan have negligible correlation with any complications
that might occur during delivery.
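As an illustrative sketch only (synthetic data and hypothetical feature names, not a clinical model): logistic regression learns one weight per hand-designed feature, but it cannot change what the features are.

```python
import math

# Each example is ([feature_1, feature_2], label); the features here
# stand in for items from a doctor's formalized report. Data is made up.
data = [([1, 1], 1), ([1, 0], 1), ([0, 1], 1), ([0, 0], 0),
        ([0, 0], 0), ([1, 1], 1), ([0, 0], 0), ([0, 1], 0)]

w, b, lr = [0.0, 0.0], 0.0, 0.5
for _ in range(2000):
    for x, y in data:
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        p = 1.0 / (1.0 + math.exp(-z))   # predicted probability
        g = p - y                        # gradient of log-loss w.r.t. z
        w = [wi - lr * g * xi for wi, xi in zip(w, x)]
        b -= lr * g

def predict(x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

print(predict([1, 1]))  # both features present: high probability
print(predict([0, 0]))  # neither present: low probability
```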
This dependence on representations is a general phenomenon that appears
throughout computer science and even daily life. In computer science, operations
such as searching a collection of data can proceed exponentially faster if the collec-
tion is structured and indexed intelligently. People can easily perform arithmetic
on Arabic numerals but find arithmetic on Roman numerals much more time
consuming. It is not surprising that the choice of representation has an enormous
effect on the performance of machine learning algorithms. For a simple visual
example, see figure 1.1.
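The search example can be made concrete: the same data in two representations, where only the sorted ("indexed") form supports fast lookup. A minimal sketch using Python's standard library:

```python
import bisect

# The data is identical in both cases; only the representation differs.
unsorted_data = [19, 3, 42, 7, 25, 11]
sorted_data = sorted(unsorted_data)   # the structured representation

def contains_linear(xs, target):
    # O(n): must inspect elements one by one
    return any(x == target for x in xs)

def contains_binary(xs, target):
    # O(log n): exploits the sorted structure
    i = bisect.bisect_left(xs, target)
    return i < len(xs) and xs[i] == target

print(contains_linear(unsorted_data, 25))  # True
print(contains_binary(sorted_data, 25))    # True
print(contains_binary(sorted_data, 4))     # False
```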
Many artificial intelligence tasks can be solved by designing the right set of
features to extract for that task, then providing these features to a simple machine
learning algorithm. For example, a useful feature for speaker identification from
sound is an estimate of the size of the speaker’s vocal tract. This feature gives a
strong clue as to whether the speaker is a man, woman, or child.
For many tasks, however, it is difficult to know what features should be
extracted. For example, suppose that we would like to write a program to detect
cars in photographs. We know that cars have wheels, so we might like to use the
presence of a wheel as a feature. Unfortunately, it is difficult to describe exactly
what a wheel looks like in terms of pixel values. A wheel has a simple geometric
shape, but its image may be complicated by shadows falling on the wheel, the sun
glaring off the metal parts of the wheel, the fender of the car or an object in
the foreground obscuring part of the wheel, and so on.

One solution to this problem is to use machine learning to discover not only
the mapping from representation to output but also the representation itself.
This approach is known as representation learning. When designing features or
algorithms for learning features, our goal is usually to separate the factors
of variation that explain the observed data. In this
context, we use the word “factors” simply to refer to separate sources of influence;
the factors are usually not combined by multiplication. Such factors are often not
quantities that are directly observed. Instead, they may exist as either unobserved
objects or unobserved forces in the physical world that affect observable quantities.
They may also exist as constructs in the human mind that provide useful simplifying
explanations or inferred causes of the observed data. They can be thought of as
concepts or abstractions that help us make sense of the rich variability in the data.
When analyzing a speech recording, the factors of variation include the speaker’s
age, their sex, their accent and the words they are speaking. When analyzing an
image of a car, the factors of variation include the position of the car, its color,
and the angle and brightness of the sun.
A major source of difficulty in many real-world artificial intelligence applications
is that many of the factors of variation influence every single piece of data we are
able to observe. The individual pixels in an image of a red car might be very close
to black at night. The shape of the car’s silhouette depends on the viewing angle.
Most applications require us to disentangle the factors of variation and discard the
ones that we do not care about.
Of course, it can be very difficult to extract such high-level, abstract features
from raw data. Many of these factors of variation, such as a speaker’s accent,
can be identified only using sophisticated, nearly human-level understanding of
the data. When it is nearly as difficult to obtain a representation as to solve the
original problem, representation learning does not, at first glance, seem to help us.
Deep learning solves this central problem in representation learning by intro-
ducing representations that are expressed in terms of other, simpler representations.
Deep learning enables the computer to build complex concepts out of simpler con-
cepts. Figure 1.2 shows how a deep learning system can represent the concept of
an image of a person by combining simpler concepts, such as corners and contours,
which are in turn defined in terms of edges.
The quintessential example of a deep learning model is the feedforward deep
network, or multilayer perceptron (MLP). A multilayer perceptron is just a
mathematical function mapping some set of input values to output values. The
function is formed by composing many simpler functions. We can think of each
application of a different mathematical function as providing a new representation
of the input.
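As a sketch of this idea (with made-up, untrained weights): an MLP is just a composition of simpler functions, each providing a new representation of its input.

```python
import math

def f1(x):
    # First "layer": an affine map followed by a nonlinearity (ReLU),
    # producing a new two-dimensional representation of the scalar x.
    return [max(0.0, 2.0 * x - 1.0), max(0.0, -x + 0.5)]

def f2(h):
    # Second "layer": combines the features into a single score.
    return h[0] - h[1]

def f3(z):
    # Output "layer": squashes the score into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def mlp(x):
    # The whole network is the composition f3(f2(f1(x))).
    return f3(f2(f1(x)))

print(mlp(1.0))
```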
The idea of learning the right representation for the data provides one per-
spective on deep learning. Another perspective on deep learning is that depth
enables the computer to learn a multistep computer program. Each layer of the
representation can be thought of as the state of the computer’s memory after
[Figure: layers from the visible layer (input pixels) through hidden layers to
the output layer (object identity: car, person, animal).]
Figure 1.2: Illustration of a deep learning model. It is difficult for a computer to understand
the meaning of raw sensory input data, such as this image represented as a collection
of pixel values. The function mapping from a set of pixels to an object identity is very
complicated. Learning or evaluating this mapping seems insurmountable if tackled directly.
Deep learning resolves this difficulty by breaking the desired complicated mapping into a
series of nested simple mappings, each described by a different layer of the model. The
input is presented at the visible layer, so named because it contains the variables that
we are able to observe. Then a series of hidden layers extracts increasingly abstract
features from the image. These layers are called “hidden” because their values are not given
in the data; instead the model must determine which concepts are useful for explaining
the relationships in the observed data. The images here are visualizations of the kind
of feature represented by each hidden unit. Given the pixels, the first layer can easily
identify edges, by comparing the brightness of neighboring pixels. Given the first hidden
layer’s description of the edges, the second hidden layer can easily search for corners and
extended contours, which are recognizable as collections of edges. Given the second hidden
layer’s description of the image in terms of corners and contours, the third hidden layer
can detect entire parts of specific objects, by finding specific collections of contours and
corners. Finally, this description of the image in terms of the object parts it contains can
be used to recognize the objects present in the image. Images reproduced with permission
from Zeiler and Fergus (2014).
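The caption's remark that a first layer can identify edges by comparing the brightness of neighboring pixels can be sketched directly. The tiny grayscale "image" below is assumed for illustration:

```python
# A 3x4 toy image: a dark region (0) meeting a bright region (9).
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

def horizontal_edges(img):
    # Edge strength = absolute brightness difference between each pair
    # of horizontally adjacent pixels.
    return [[abs(row[x + 1] - row[x]) for x in range(len(row) - 1)]
            for row in img]

edges = horizontal_edges(image)
print(edges[0])  # strong response only where dark meets bright
```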
executing another set of instructions in parallel. Networks with greater depth can
execute more instructions in sequence. Sequential instructions offer great power
because later instructions can refer back to the results of earlier instructions. Ac-
cording to this view of deep learning, not all the information in a layer’s activations
necessarily encodes factors of variation that explain the input. The representation
also stores state information that helps to execute a program that can make sense
of the input. This state information could be analogous to a counter or pointer
in a traditional computer program. It has nothing to do with the content of the
input specifically, but it helps the model to organize its processing.
There are two main ways of measuring the depth of a model. The first view is
based on the number of sequential instructions that must be executed to evaluate
the architecture. We can think of this as the length of the longest path through
a flow chart that describes how to compute each of the model’s outputs given
its inputs. Just as two equivalent computer programs will have different lengths
depending on which language the program is written in, the same function may
be drawn as a flowchart with different depths depending on which functions we
allow to be used as individual steps in the flowchart. Figure 1.3 illustrates how this
choice of language can give two different measurements for the same architecture.
[Figure 1.3: two computational graphs for the same model, σ(wᵀx): one in which
logistic regression counts as a single step, and one built from element-wise
operations (×, +, σ) as individual steps.]
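Figure 1.3's point can be made concrete in code: the same function, σ(wᵀx), counts as depth one or depth three depending on which operations are allowed as individual steps. A minimal sketch:

```python
import math

w, x = [2.0, -1.0], [0.5, 0.25]

def logistic_regression(w, x):
    # "Logistic regression" as the language: one step, depth 1.
    return 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))

def depth_three(w, x):
    # Element-wise operations as the language: depth 3.
    products = [wi * xi for wi, xi in zip(w, x)]  # step 1: multiply
    total = sum(products)                         # step 2: add
    return 1.0 / (1.0 + math.exp(-total))         # step 3: sigmoid

print(logistic_regression(w, x) == depth_three(w, x))  # True
```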
1.1 Who Should Read This Book?
[Figure: nested sets, with deep learning inside representation learning, inside
machine learning, inside AI.]
Figure 1.4: A Venn diagram showing how deep learning is a kind of representation learning,
which is in turn a kind of machine learning, which is used for many but not all approaches
to AI. Each section of the Venn diagram includes an example of an AI technology.
The other target audience is software engineers who do not have a machine
learning or statistics background but want to rapidly acquire one and begin
using deep learning in
their product or platform. Deep learning has already proved useful in many soft-
ware disciplines, including computer vision, speech and audio processing, natural
language processing, robotics, bioinformatics and chemistry, video games, search
engines, online advertising and finance.
This book has been organized into three parts to best accommodate a variety
of readers. Part I introduces basic mathematical tools and machine learning
concepts. Part II describes the most established deep learning algorithms, which
are essentially solved technologies. Part III describes more speculative ideas that
are widely believed to be important for future research in deep learning.
[Figure 1.5: flowcharts for rule-based systems (hand-designed program), classic
machine learning (hand-designed features, learned mapping from features to
output), representation learning (learned features), and deep learning (learned
simple features plus additional layers of more abstract features).]
Figure 1.5: Flowcharts showing how the different parts of an AI system relate to each
other within different AI disciplines. Shaded boxes indicate components that are able to
learn from data.
Readers should feel free to skip parts that are not relevant given their interests
or background. Readers familiar with linear algebra, probability, and fundamental
machine learning concepts can skip part I, for example, while those who just want
to implement a working system need not read beyond part II. To help choose which
[Figure 1.6: chapter dependency flowchart, including 1. Introduction; 2. Linear
Algebra; 3. Probability and Information Theory; 6. Deep Feedforward Networks;
11. Practical Methodology; 12. Applications; 18. Partition Function; and
19. Inference.]
Figure 1.6: The high-level organization of the book. An arrow from one chapter to another
indicates that the former chapter is prerequisite material for understanding the latter.
chapters to read, figure 1.6 provides a flowchart showing the high-level organization
of the book.
We do assume that all readers come from a computer science background. We
assume familiarity with programming, a basic understanding of computational
performance issues, complexity theory, introductory level calculus and some of the
terminology of graph theory.
1.2 Historical Trends in Deep Learning

It is easiest to understand deep learning with some historical context. Rather
than providing a detailed history, we identify a few key trends:

• Deep learning has had a long and rich history, but has gone by many names,
reflecting different philosophical viewpoints, and has waxed and waned in
popularity.
• Deep learning has become more useful as the amount of available training
data has increased.
• Deep learning models have grown in size over time as computer infrastructure
(both hardware and software) for deep learning has improved.
We expect that many readers of this book have heard of deep learning as an exciting
new technology, and are surprised to see a mention of “history” in a book about an
emerging field. In fact, deep learning dates back to the 1940s. Deep learning only
appears to be new, because it was relatively unpopular for several years preceding
its current popularity, and because it has gone through many different names, only
recently being called “deep learning.” The field has been rebranded many times,
reflecting the influence of different researchers and different perspectives.
A comprehensive history of deep learning is beyond the scope of this textbook.
Some basic context, however, is useful for understanding deep learning. Broadly
speaking, there have been three waves of development: deep learning known as
cybernetics in the 1940s–1960s, deep learning known as connectionism in the
1980s–1990s, and the current resurgence under the name deep learning beginning
in 2006.
wisdom beyond all praise in his recognition of American standards,
when addressing American audiences. As the hour for his departure
drew nigh, he was asked to write, and did write, a “Parting Wish for
the Women of America,” giving graceful expression to the sentiments
he knew he was expected to feel. The skill with which he modified
and popularized an alien point of view revealed the seasoned
lecturer. He told his readers that “God has sent woman to love the
world,” and to build up a “spiritual civilization.” He condoled with
them because they were “passing through great sufferings in this
callous age.” His heart bled for them, seeing that their hearts “are
broken every day, and victims are snatched from their arms to be
thrown under the car of material progress.” The Occidental sentiment
which regards man simply as an offspring, and a fatherless offspring
at that (no woman, says Olive Schreiner, could look upon a battle-
field without thinking, “So many mothers’ sons!”), came as naturally
to Sir Rabindranath as if he had been to the manner born. He was
content to see the passion and pain, the sorrow and heroism of men,
as reflections mirrored in a woman’s soul. The ingenious gentlemen
who dramatize Biblical narratives for the American stage, and who
are hampered at every step by the obtrusive masculinity of the East,
might find a sympathetic supporter in this accomplished and
accommodating Hindu.
The story of Joseph and his Brethren, for example, is perhaps the
best tale ever told the world,—a tale of adventure on a heroic scale,
with conflicting human emotions to give it poignancy and power. It
deals with pastoral simplicities, with the splendours of court, and with
the “high finance” which turned a free landholding people into
tenantry of the crown. It is a story of men, the only lady introduced
being a disedifying dea ex machina, whose popularity in Italian art
has perhaps blinded us to the brevity of her Biblical rôle. But when
this most dramatic narrative was cast into dramatic form, Joseph’s
splendid loyalty to his master, his cold and vigorous chastity, were
nullified by giving him an Egyptian sweetheart. Lawful marriage with
this young lady being his sole solicitude, the advances of Potiphar’s
wife were less of a temptation than an intrusion. The keynote of the
noble old tale was destroyed, to assure to woman her proper place
as the guardian of man’s integrity.
Still more radical was the treatment accorded to the parable of the
“Prodigal Son,” which was expanded into a pageant play, and acted
with a hardy realism permitted only to the strictly ethical drama. The
scriptural setting of the story was preserved, but its patriarchal
character was sacrificed to modern sentiment which refuses to be
interested in the relation of father and son. Therefore we beheld the
prodigal equipped with a mother and a trusting female cousin, who,
between them, put the poor old gentleman out of commission,
reducing him to his proper level of purveyor-in-ordinary to the
household. It was the prodigal’s mother who bade her reluctant
husband give their wilful son his portion. It was the prodigal’s mother
who watched for him from the house-top, and silenced the voice of
censure. It was the prodigal’s mother who welcomed his return, and
persuaded father and brother to receive him into favour. The whole
duty of man in that Syrian household was to obey the impelling word
of woman, and bestow blessings and bags of gold according to her
will.
The expansion of the maternal sentiment until it embraces, or
seeks to embrace, humanity, is the vision of the emotional, as
opposed to the intellectual, feminist. “The Mother State of which we
dream” offers no attraction to many plain and practical workers, and
is a veritable nightmare to others. “Woman,” writes an enthusiast in
the “Forum,” “means to be, not simply the mother of the individual,
but of society, of the State with its man-made institutions, of art and
science, of religion and morals. All life, physical and spiritual,
personal and social, needs to be mothered.”
“Needs to be mothered”! When men proffer this welter of
sentiment in the name of women, how is it possible to say
convincingly that the girl student standing at the gates of knowledge
is as humble-hearted as the boy; that she does not mean to mother
medicine, or architecture, or biology, any more than the girl in the
banker’s office means to mother finance? Her hopes for the future
are founded on the belief that fresh opportunities will meet a sure
response; but she does not, if she be sane, measure her untried
powers by any presumptive scale of valuation. She does not
consider the advantages which will accrue to medicine, biology, or
architecture by her entrance—as a woman—into any one of these
fields. Their need for her maternal ministration concerns her less
than her need for the magnificent heritage they present.
It has been said many times that the craving for material profit is
not instinctive in women. If it is not instinctive, it will be acquired,
because every legitimate incentive has its place in the progress of
the world. The demand that women shall be paid men’s wages for
men’s work may represent a desire for justice rather than a desire for
gain; but money fairly earned is sweet in the hand, and to the heart.
An open field, an even start, no handicap, no favours, and the same
goal for all. This is the worker’s dream of paradise. Women have
long known that lack of citizenship was an obstacle in their path.
Self-love has prompted them to overrate their imposed, and
underrate their inherent, disabilities. “Whenever you see a woman
getting a high salary, make up your mind that she is giving twice the
value received,” writes an irritable correspondent to the “Survey”;
and this pretension paralyzes effort. To be satisfied with ourselves is
to be at the end of our usefulness.
M. Émile Faguet, that most radical and least sentimental of French
feminists, would have opened wide to women every door of which
man holds the key. He would have given them every legal right and
burden which they are physically fitted to enjoy and to bear. He was
as unvexed by doubts as he was uncheered by illusions. He had no
more fear of the downfall of existing institutions than he had hope for
the regeneration of the world. The equality of men and women, as he
saw it, lay, not in their strength, but in their weakness; not in their
intelligence, but in their stupidity; not in their virtues, but in their
perversity. Yet there was no taint of pessimism in his rational refusal
to be deceived. No man saw more clearly, or recognized more justly,
the art with which his countrywomen have cemented and upheld a
social state at once flexible and orderly, enjoyable and inspiriting.
That they have been the allies, and not the rulers, of men in building
this fine fabric of civilization was also plain to his mind. Allies and
equals he held them, but nothing more. “La femme est parfaitement
l’égale de l’homme, mais elle n’est que son égale.” (“Woman is fully
the equal of man, but she is only his equal.”)
Naturally to such a man the attitude of Americans toward women
was as unsympathetic as was the attitude of Dahomeyans. He did
not condemn it (possibly he did not condemn the Dahomeyans,
seeing that the civic and social ideals of France and Dahomey are in
no wise comparable); but he explained with careful emphasis that
the French woman, unlike her American sister, is not, and does not
desire to be, “un objet sacro-saint,” a sacrosanct object. The
reverence for women in the United States he assumed to be a
national trait, a sort of national institution among a proud and
patriotic people. “L’idolâtrie de la femme est une chose américaine
par excellence.” (“The idolatry of woman is an American thing par
excellence.”)
The superlative complacency of American women is due largely to
the oratorical adulation of American men,—an adulation that has no
more substance than has the foam on beer. I have heard a
candidate for office tell his female audience that men are weak and
women are strong, that men are foolish and women are wise, that
men are shallow and women are deep, that men are submissive
tools whom women, the leaders of the race, must instruct to vote for
him. He did not believe a word that he said, and his hearers did not
believe that he believed it; yet the grossness of his flattery kept pace
with the hypocrisy of his self-depreciation. The few men present
wore an attitude of dejection, not unlike that of the little boy in
“Punch” who has been told that he is made of “slugs and snails and
puppy-dogs’ tails.”
The story goes that, after the bloody victory of the Scots under
Kenneth MacAlpine, in 860, only two Picts who knew the secret of
the brew survived the general slaughter. Some say they were father
and son, some say they were master and man. When they were
offered their lives in exchange for the recipe, the older captive said
he dared not reveal it while the younger lived, lest he be slain in
revenge. So the Scots tossed the lad into the sea, and waited
expectantly. Then the last of the Picts cried, “I only know!” and
leaped into the ocean and was drowned. It is a brave tale. One
wonders if a man would die to save the secret of making milk-toast.
From the pages of history the prohibition-bred youth may glean
much off-hand information about the wine which the wide world
made and drank at every stage of civilization and decay. If, after the
fashion of his kind, he eschews history, there are left to him
encyclopædias, with their wealth of detail, and their paucity of
intrinsic realities. Antiquarians also may be trusted to supply a
certain number of papers on “leather drinking-vessels,” and “toasts
of the old Scottish gentry.” But if the youth be one who browses
untethered in the lush fields of English literature, taking prose and
verse, fiction and fact, as he strays merrily along, what will he make
of the hilarious company in which he finds himself? What of Falstaff,
and the rascal, Autolycus, and of Sir Toby Belch, who propounded
the fatal query which has been answered in 1919? What of Herrick’s
“joy-sops,” and “capring wine,” and that simple and sincere
“Thanksgiving” hymn which takes cognizance of all mercies?
time it was, until the gilt began to wear off the gingerbread. But
Evelyn, though he feasted as became a loyal gentleman, and
admitted that canary carried to the West Indies and back for the
good of its health was “incomparably fine,” yet followed Saint
Chrysostom’s counsel. He drank, and compelled his household to
drink, with sobriety. There is real annoyance expressed in the diary
when he visits a hospitable neighbour, and his coachman is so well
entertained in the servants’ hall that he falls drunk from the box, and
cannot pick himself up again.
Poor Mr. Pepys was ill fitted by a churlish fate for the simple
pleasures that he craved. To him, as to many another Englishman,
wine was precious only because it promoted lively conversation. His
“debauches” (it pleased him to use that ominous word) were very
modest ones, for he was at all times prudent in his expenditures. But
claret gave him a headache, and Burgundy gave him the stone, and
late suppers, even of bread and butter and botargo, gave him
indigestion. Therefore he was always renouncing the alleviations of
life, only to be lured back by his incorrigible love of companionship.
There is a serio-comic quality in his story of the two bottles of wine
he sent for to give zest to his cousin Angler’s supper at the Rose
Tavern, and which were speedily emptied by his cousin Angler’s
friends: “And I had not the wit to let them know at table that it was I
who paid for them, and so I lost my thanks.”
If the young prohibitionist be light-hearted enough to read Dickens,
or imaginative enough to read Scott, or sardonic enough to read
Thackeray, he will find everybody engaged in the great business of
eating and drinking. It crowds love-making into a corner, being,
indeed, a pleasure which survives all tender dalliance, and restores
to the human mind sanity and content. I am convinced that if Mr.
Galsworthy’s characters ate and drank more, they would be less
obsessed by sex, and I wish they would try dining as a restorative.
The older novelists recognized this most expressive form of
realism, and knew that, to be accurate, they must project their minds
into the minds of their characters. It is because of their sympathy and
sincerity that we recall old Osborne’s eight-shilling Madeira, and Lord
Steyne’s White Hermitage, which Becky gave to Sir Pitt, and the
brandy-bottle clinking under her bed-clothes, and the runlet of canary
which the Holy Clerk of Copmanhurst found secreted conveniently in
his cell, and the choice purl which Dick Swiveller and the
Marchioness drank in Miss Sally Brass’s kitchen. We hear
Warrington’s great voice calling for beer, we smell the fragrant fumes
of burning rum and lemon-peel when Mr. Micawber brews punch, we
see the foam on the “Genuine Stunning” which the child David calls
for at the public house. No writer except Peacock treats his
characters, high and low, as royally as does Dickens; and Peacock,
although British publishers keep issuing his novels in new and
charming editions, is little read on this side of the sea. Moreover, he
is an advocate of strong drink, which is very reprehensible, and
deprives him of candour as completely as if he had been a
teetotaller. We feel and resent the bias of his mind; and although he
describes with humour that pleasant middle period, “after the
Jacquerie were down, and before the march of mind was up,” yet the
only one of his stories which is innocent of speciousness is “The
Misfortunes of Elphin.”
Now to the logically minded “The Misfortunes of Elphin” is a
temperance tract. The disaster which ruins the countryside is the
result of shameful drunkenness. The reproaches levelled by Prince