Constant Initialization
When applying constant initialization, all weights in the neural network are initialized with a
constant value, C. Typically C will equal zero or one.
To visualize this in pseudocode, let’s consider an arbitrary layer of a neural network that has 64 inputs and 32 outputs (excluding any biases for notational convenience). To initialize these weights via NumPy and zero initialization (the default used by Caffe, a popular deep learning framework) we would execute:
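A minimal NumPy sketch of this zero initialization, using the 64×32 layer from the example above:

```python
import numpy as np

# Constant (zero) initialization: every weight in the 64x32 layer
# starts at the same constant value C = 0
W = np.zeros((64, 32))
```

Swapping `np.zeros` for `np.ones` gives constant initialization with C = 1 instead.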
Although constant initialization is easy to grasp and understand, the problem with using this
method is that it’s near impossible for us to break the symmetry of activations (Heinrich, 2015
(https://github.com/NVIDIA/DIGITS/blob/master/examples/weight-init/README.md)).
Therefore, it is rarely used as a neural network weight initializer.
Again, let’s presume that for a given layer in a neural network we have 64 inputs and 32 outputs. We then wish to initialize our weights in the range lower=-0.05 and upper=0.05. Applying the following Python + NumPy code will allow us to achieve the desired initialization:
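A NumPy sketch of this uniform initialization, using the 64×32 layer described above:

```python
import numpy as np

# Uniform initialization: each weight is drawn with equal probability
# from the interval [-0.05, 0.05]
W = np.random.uniform(low=-0.05, high=0.05, size=(64, 32))
```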
Executing the code above, NumPy will randomly generate 64×32 = 2,048 values from the range [−0.05, 0.05], where each value in this range has equal probability.
We then have a normal distribution, where we define the probability density for the Gaussian distribution as:

p(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}\, \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right) (1)
The most important parameters here are µ (the mean) and σ (the standard deviation). The square of the standard deviation, σ², is called the variance.
When using the Keras library the RandomNormal class draws random values from a normal
distribution with µ = 0 and σ = 0.05. We can mimic this behavior using NumPy below:
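A NumPy sketch that mimics those Keras RandomNormal defaults for our running 64×32 layer:

```python
import numpy as np

# Normal initialization with mean 0 and standard deviation 0.05,
# matching the defaults of Keras's RandomNormal initializer
W = np.random.normal(loc=0.0, scale=0.05, size=(64, 32))
```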
Both uniform and normal distributions can be used to initialize the weights in neural networks;
however, we normally impose various heuristics to create “better” initialization schemes (as we’ll
discuss in the remaining sections).
Here, the authors define a parameter F_in (called “fan in,” or the number of inputs to the layer) along with F_out (the “fan out,” or number of outputs from the layer). Using these values we can apply uniform initialization by:
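One common fan-in-based choice can be sketched as follows; the sqrt(3 / F_in) limit below is the value Keras documents for its lecun_uniform initializer, so treat the exact constant as an assumption here:

```python
import numpy as np

F_in = 64   # fan in: number of inputs to the layer
F_out = 32  # fan out: number of outputs from the layer

# Uniform initialization with a limit that depends only on the fan in
limit = np.sqrt(3 / float(F_in))
W = np.random.uniform(low=-limit, high=limit, size=(F_in, F_out))
```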
We can use a normal distribution as well. The Keras library draws from a truncated normal distribution when constructing the lower and upper limits, along with a zero mean:
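A sketch of the truncated-normal variant. NumPy has no built-in truncated normal, so the snippet resamples out-of-range values by hand; the sqrt(1 / F_in) standard deviation matches what Keras documents for lecun_normal and is an assumption here:

```python
import numpy as np

F_in = 64
F_out = 32

# Zero-mean normal draw with a fan-in-dependent standard deviation
stddev = np.sqrt(1 / float(F_in))
W = np.random.normal(loc=0.0, scale=stddev, size=(F_in, F_out))

# Truncate by resampling any value beyond two standard deviations,
# mirroring the truncation Keras applies
mask = np.abs(W) > 2 * stddev
while mask.any():
    W[mask] = np.random.normal(loc=0.0, scale=stddev, size=mask.sum())
    mask = np.abs(W) > 2 * stddev
```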
For the normal distribution, the limit value is constructed by averaging F_in and F_out together and then taking the square root (Jones, 2016 (https://andyljones.tumblr.com/post/110998971763/an-explanation-of-xavier-initialization)). A zero-center (µ = 0) is then used:
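A NumPy sketch of this zero-centered Glorot/Xavier normal initialization:

```python
import numpy as np

F_in = 64
F_out = 32

# Glorot/Xavier normal initialization: zero mean with a standard
# deviation derived from the average of the fan in and fan out,
# i.e. sqrt(1 / ((F_in + F_out) / 2)) = sqrt(2 / (F_in + F_out))
limit = np.sqrt(2 / float(F_in + F_out))
W = np.random.normal(loc=0.0, scale=limit, size=(F_in, F_out))
```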
Glorot/Xavier initialization can also be done with a uniform distribution where we place stronger restrictions on limit:
>>> F_in = 64
>>> F_out = 32
>>> limit = np.sqrt(6 / float(F_in + F_out))
>>> W = np.random.uniform(low=-limit, high=limit, size=(F_in, F_out))
Learning tends to be quite efficient using this initialization method and I recommend it for most
neural networks.
We typically use this method when we are training very deep neural networks that use a ReLU-
like activation function (in particular, a “PReLU,” or Parametric Rectified Linear Unit).
To initialize the weights in a layer using He et al. initialization with a uniform distribution, we set limit to sqrt(6 / F_in), where F_in is the number of input units in the layer:
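A NumPy sketch of He et al. uniform initialization for the running 64×32 layer:

```python
import numpy as np

F_in = 64   # number of input units in the layer
F_out = 32

# He et al. uniform initialization: the limit depends only on the fan in
limit = np.sqrt(6 / float(F_in))
W = np.random.uniform(low=-limit, high=limit, size=(F_in, F_out))
```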
On the other hand, the default Xavier initialization for Keras uses np.sqrt(6 / (F_in + F_out)) (Keras contributors, 2016 (https://keras.io/initializers/#glorot_uniform)). Neither method is “more correct” than the other, but you should read the documentation of your respective deep learning library.
Summary
In this tutorial, we reviewed the fundamentals of neural networks. Specifically, we focused on the
history of neural networks and the relation to biology.
From there, we moved on to artificial neural networks, such as the Perceptron algorithm. While important from a historical standpoint, the Perceptron algorithm has one major flaw — it cannot accurately classify nonlinearly separable points. In order to work with more challenging datasets we need both (1) nonlinear activation functions and (2) multi-layer networks.
To train multi-layer networks we must use the backpropagation algorithm. We then implemented
backpropagation by hand and demonstrated that when used to train multi-layer networks with
nonlinear activation functions, we can model nonlinearly separable datasets, such as XOR.
Finally, we reviewed the four key ingredients when working with any neural network, including
the dataset, loss function, model/architecture, and optimization method.
Unfortunately, as some of our results demonstrated (e.g., CIFAR-10), standard neural networks fail to obtain high classification accuracy when working with challenging image datasets that exhibit variations in translation, rotation, viewpoint, etc. In order to obtain reasonable accuracy on these datasets, we’ll need to work with a special type of feedforward neural network called a Convolutional Neural Network (CNN), which we will cover in a separate tutorial.