# Deep Learning Fundamentals in Python

## Description

Deep learning is making waves. At the time of this writing (March 2016), Google’s AlphaGo program just beat 9-dan professional Go player Lee Sedol at the game of Go, a Chinese board game.

Experts in the field of Artificial Intelligence thought we were 10 years away from achieving a victory against a top professional Go player, but progress seems to have accelerated!

While deep learning is a complex subject, it is not any more difficult to learn than any other machine learning algorithm. I wrote this book to introduce you to the basics of neural networks. You will get along fine with undergraduate-level math and programming skill.

All the materials in this book can be downloaded and installed for free. We will use the Python programming language, along with the numerical computing library Numpy. I will also show you in the later chapters how to build a deep network using Theano and TensorFlow, which are libraries built specifically for deep learning and can accelerate computation by taking advantage of the GPU.

Unlike other machine learning algorithms, deep learning is particularly powerful because it automatically learns features. That means you don’t need to spend your time trying to come up with and test “kernels” or “interaction effects” - something only statisticians love to do. Instead, we will let the neural network learn these things for us. Each layer of the neural network learns a different abstraction than the previous layers. For example, in image classification, the first layer might learn different strokes, and in the next layer put the strokes together to learn shapes, and in the next layer put the shapes together to form facial features, and in the next layer have a high level representation of faces.

Do you want a gentle introduction to this “dark art”, with practical code examples that you can try right away and apply to your own data? Then this book is for you.

What do I mean by “fundamentals”?

When students first hear about deep learning, they often are introduced to the field via some hyped up news article about convolutional neural networks or LSTMs. While this is a fine eventual goal, this is not the place to start when you’re first learning about deep learning.

All of deep learning depends on one fundamental algorithm, the “secret sauce”, if you will. That is what you will learn in this book. You will learn how we get there from basic undergraduate math. You will learn how it can be modified for speed improvements. You will learn how to code it in Numpy, Theano, and TensorFlow.

But the most fundamental, important thing, is understanding what “it” is and how “it” works.

What happens when you skip over these important fundamentals?

If you’re reading this book, you probably have some experience with software and programming in a team. More often than not, there is someone on the team who:

* Talks about machine learning endlessly, but is barely able to use scikit-learn.

* Can possibly plug-and-play into some pre-written deep learning code, so that it at least runs without errors, but has no idea how to make it work for the problem at hand.

If you are on a software team, and you don’t know who “that guy” is, YOU could be “that guy”! My goal in this book is to make sure you are not “that guy”.

I want you to know how deep learning works on a mathematical and algorithmic level.

A true computer scientist can take an algorithm, transform it into pseudocode, and transform that into real, working code.

At the very highest level, all we are doing is “minimizing cost”. Even business people can understand this very intuitive idea. All businesses try to minimize their costs and maximize their profits.

In this book, I will show you how to take an intuitive objective like “minimize cost”, and how that eventually results in deep learning. It is nothing more than a little bit of math and Python programming.

Le

## About the author

The LazyProgrammer is a data scientist, big data engineer, and full stack software engineer. He is especially interested in deep learning and neural networks. Some also refer to this as AI, or artificial intelligence. He graduated with a master’s degree with a thesis on machine learning for brain-computer interfaces. This research would help those who are non-mobile or non-vocal communicate with their caregivers.

The LazyProgrammer got his start in machine learning and data science by learning about computational neuroscience and neural engineering. The physics aspect has always interested him, but the practical nature of machine learning and data science has made up a majority of his work.

After spending years in online advertising and the media, working to build and improve big data pipelines and using machine learning to increase revenue via CTR (click-through rate) optimization and conversion tracking, he began to work for himself. This allowed the LazyProgrammer to focus 100% of his effort on deepening his knowledge of machine learning and data science. He works with startups and larger companies to set up data pipelines and engineer predictive models that result in meaningful insights and data-driven decision making.

The LazyProgrammer also loves to teach. He has helped many adults looking to change their career path and dive into the startup and tech world. Students at General Assembly, the Flatiron School, and App Academy have all benefitted from his help. He has also helped many graduate students at various Ivy League schools and other colleges through their machine learning and data science programs.

The LazyProgrammer loves to give away free tutorials and other material. You can get a FREE 6-week introduction to machine learning course by signing up for his newsletter at: https://lazyprogrammer.me

The LazyProgrammer also has a collection of Udemy courses that teach topics like machine learning, data science, and deep learning. You can find them here:

* https://www.udemy.com/data-science-natural-language-processing-in-python
* https://www.udemy.com/data-science-linear-regression-in-python
* https://www.udemy.com/data-science-logistic-regression-in-python
* https://www.udemy.com/data-science-deep-learning-in-python
* https://www.udemy.com/data-science-deep-learning-in-theano-tensorflow

## Inside the book

### Top quotes

Instead, we will let the neural network learn these things for us. Each layer of the neural network learns a different abstraction than the previous layers.

A true computer scientist can take an algorithm, transform it into pseudocode, and transform that into real, working code.

You should recall from your calculus studies that to find the minimum of a quadratic (or any other smooth function), we can take its derivative and set it to 0.

Recall that with linear regression, using the squared error was the same as taking the log-likelihood of the data (for which the error was assumed to be Gaussian).

When you’re coding in MATLAB or Python, it is more efficient to use “vectorized” operations, so understanding vectors and matrices is vital.

### Book Preview

### Deep Learning Fundamentals in Python - LazyProgrammer

*Chapter 1: Linear Regression Review*

A great place to start when you’re learning about machine learning is linear regression. It is called regression because we will be trying to predict a real number. In later chapters, we will be doing classification, which means we will be trying to predict a category.

This is something a lot of us have done by hand in our high school math studies, sometimes called finding the line of best fit.

Let’s review what the problem is.

We’ve plotted some data points on a scatterplot.

We see that they form what looks like a line.

We then take a ruler, and try to draw a line that goes through the middle of all these points. There should be some points on one side of the line, and some points on the other side.

Simple, right?

Let’s generalize this concept.

We are given a bunch of 2-dimensional points:

(x1, y1), (x2, y2), (x3, y3), ..., (xN, yN)

We are given N (x, y) pairs in total.

We would like to find a line, such that the line goes through the middle of the given points.

Recall that the equation for a line is:

y_hat = ax + b

We call a the slope, and b the y-intercept (where it crosses the y-axis when x = 0).

Is there a better way to do this rather than just eye-balling it? Of course! We can use math.

A common way to measure error is the squared error. Let’s call it J.

J = sum[i=1..N]{ (y(i) - y_hat(i))² }

It is difficult to represent mathematical equations with the e-book format (there is no LaTeX available), so I will describe what this equation means.

For each data point we are given, y(i), I will calculate its prediction y_hat(i) = ax(i) + b.

I will subtract y_hat(i) from y(i) and square the difference. I will do this for all i from 1 to N, and sum all the squared differences together.

That is my error.
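The description above can be sketched in a few lines of NumPy. This is only an illustration: the function name `squared_error` and the sample data are assumptions made for the example, not something taken from the book.

```python
import numpy as np

def squared_error(x, y, a, b):
    """Sum of squared differences between targets y and predictions a*x + b."""
    y_hat = a * x + b                 # prediction for every data point at once
    return np.sum((y - y_hat) ** 2)   # square each difference, then sum over i

# Hypothetical data lying near the line y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])

print(squared_error(x, y, a=2.0, b=1.0))  # small error: this line is close to the data
print(squared_error(x, y, a=0.0, b=0.0))  # much larger error: a poor line
```

A line that passes near the points gives a small J, and a line far from them gives a large J, which is why minimizing J is a reasonable way to pick a and b.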

Notice that this is actually a 2-dimensional quadratic equation, where a is the first dimension, and b is the second dimension.

You should recall from your calculus studies that to find the minimum of a quadratic (or any other smooth function), we can take its derivative and set it to 0.

In particular, we want to find dJ/da and dJ/db.

We will set them to 0, so that dJ/da = 0 and dJ/db = 0, and then solve for a and b.
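If you want to check your derivatives before solving, one option is to compare them against a numerical approximation. The sketch below assumes the partial derivatives work out to dJ/da = -2*sum(x(i)*(y(i) - y_hat(i))) and dJ/db = -2*sum(y(i) - y_hat(i)), which follows from applying the chain rule to J. The function names and data are hypothetical.

```python
import numpy as np

def cost(x, y, a, b):
    """Squared error J for the line y_hat = a*x + b."""
    return np.sum((y - (a * x + b)) ** 2)

def gradient(x, y, a, b):
    """Analytic partial derivatives (dJ/da, dJ/db) of the squared error."""
    residual = y - (a * x + b)
    dJ_da = -2.0 * np.sum(x * residual)
    dJ_db = -2.0 * np.sum(residual)
    return dJ_da, dJ_db

def numerical_gradient(x, y, a, b, eps=1e-6):
    """Central finite differences, used only as a sanity check."""
    dJ_da = (cost(x, y, a + eps, b) - cost(x, y, a - eps, b)) / (2 * eps)
    dJ_db = (cost(x, y, a, b + eps) - cost(x, y, a, b - eps)) / (2 * eps)
    return dJ_da, dJ_db

# Hypothetical data and an arbitrary (non-optimal) choice of a and b
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])

print(gradient(x, y, 0.5, 0.0))
print(numerical_gradient(x, y, 0.5, 0.0))  # should agree closely with the line above
```

If the two printed pairs disagree, the hand-derived derivative is the likely culprit.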

I would recommend doing this at home on your own. You should arrive at the solution:

a = [ N*sum(x(i)*y(i)) - sum(x(i))*sum(y(i)) ] / [ N*sum(x(i)²) - (sum(x(i)))² ]

b = mean(y) - a*mean(x)

Where sum() is the sum over all i from i=1 to N. And mean() is the sample mean (sum all the items and divide by N).
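As a rough sketch, the closed-form solution can be coded directly in NumPy. It uses the standard least-squares form, with N*sum(x(i)²) - (sum(x(i)))² in the denominator of a, and b = mean(y) - a*mean(x). The function name `fit_line` and the test data are assumptions made for this example.

```python
import numpy as np

def fit_line(x, y):
    """Closed-form least-squares slope a and intercept b for y_hat = a*x + b."""
    N = len(x)
    denominator = N * np.sum(x ** 2) - np.sum(x) ** 2
    a = (N * np.sum(x * y) - np.sum(x) * np.sum(y)) / denominator
    b = np.mean(y) - a * np.mean(x)
    return a, b

# Hypothetical data: points that lie exactly on y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

a, b = fit_line(x, y)
print(a, b)  # approximately 2.0 and 1.0
```

Because the points lie exactly on a line, the fit should recover that line's slope and intercept, which makes this a convenient sanity check for your own derivation.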

Sometimes, instead of calling the actual outputs y and the predicted outputs y_hat, we call the actual outputs t and the predicted outputs y.

Confusing, I know.

We also take the convention that T refers to an N-sized vector that contains all the N individual ts, and Y refers to an N-sized vector that contains all the individual ys.

**Extending linear regression to multiple dimensions**

In real machine learning problems we of course have more than one input feature, so each x(i) becomes a vector.

When x is 1-D, we get a line. When x is 2-D, we get a plane. When x is 3-D or higher, we get a hyperplane.

When you’re coding in MATLAB or Python, it is more efficient to use vectorized operations, so understanding vectors and matrices is vital.

As an example, suppose I wanted to do the dot product between w = [1, 2, 3], and x = [4, 5, 6].

As you know from linear algebra, the answer is 1*4 + 2*5 + 3*6.

You might write it like this in code:

```python
w = [1, 2, 3]
x = [4, 5, 6]

answer = 0
for i in range(3):  # use xrange(3) instead if you are on Python 2
    answer += w[i] * x[i]
```

This is slow!

It is much better to use a library like numpy and call the dot
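A minimal sketch of the vectorized version, assuming NumPy is installed and using its `np.dot` function on the same w and x as above:

```python
import numpy as np

w = np.array([1, 2, 3])
x = np.array([4, 5, 6])

# One library call replaces the explicit Python loop
answer = np.dot(w, x)  # equivalently: w.dot(x)
print(answer)  # 32, i.e. 1*4 + 2*5 + 3*6
```

For three elements the difference is negligible, but for large vectors the library call is typically far faster than looping in Python.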

## Reviews

### What people think about Deep Learning Fundamentals in Python

3.8

### Reader reviews

- (3/5) Good content, but as equations are written as ordinary text, it becomes hard to read.
- (1/5) Book is difficult to read due to poor typesetting of the equations. Don't bother with it.
- (5/5) The book is well written and has a good structure that helps you understand the subject better, because it shows the connection between deep learning and more simple methods like linear regression. Some of the formulas can be a bit tricky to understand, so make sure you remember basic linear algebra stuff about multiplying arrays and also how you work with Numpy.