
Getting Started with TensorFlow

Getting Started with Free Lab

1. Open CloudxLab
2. If already enrolled, go to “My Lab”
3. Else click on "Start Free Lab"
a. And complete the enrollment process
b. You might have to sign up using a credit card or college ID
4. Go to “My Lab”
5. Open Jupyter

TensorFlow
Getting Started with TensorFlow
1. Open jupyter from My Lab
2. Click on "New" -> Terminal
3. Type:
a. git clone https://github.com/cloudxlab/ml.git
4. In the Jupyter file browser, go into the folder
ml/deep_learning
5. Open tensorflow.ipynb

TensorFlow
Getting Started with TensorFlow
● Check if TensorFlow is available

>>> import tensorflow


>>> print(tensorflow.__version__)

TensorFlow
Lazy Evaluation Example - The waiter takes orders patiently
(Illustration: the customer keeps adding to the order - "Soup and a cheese burger, and a plate of noodles for me please." The waiter patiently notes it all down: "Ok. One cheese burger, two soups, two plates of noodles. Anything else, sir?")

The chef is able to optimize because multiple orders are clubbed together.

Basics of RDD
Instant Evaluation

(Illustration: the moment the customer says "Cheese burger...", the waiter replies "Let me get a cheese burger for you. I'll be right back!" and rushes off. "And soup?")

The soup order will only be taken once the waiter is back.

Basics of RDD
Actions: Lazy Evaluation - Optimization - Scala

Without combining the steps:

def Map1(x:String):String = x.trim();
def Map2(x:String):String = x.toUpperCase();

var lines = sc.textFile(...)
var lines1 = lines.map(Map1);
var lines2 = lines1.map(Map2);
lines2.collect()

With the steps combined into one function:

def Map3(x:String):String = {
  var y = x.trim();
  return y.toUpperCase();
}

lines = sc.textFile(...)
lines2 = lines.map(Map3);
lines2.collect()

Because evaluation is lazy, nothing runs until collect() is called, so the engine can club the two map steps together - just as if we had written Map3 ourselves.

Basics of RDD
TensorFlow
● Powerful open source library
○ For numerical computations
○ Fine-tuned for large-scale Machine Learning

TensorFlow
TensorFlow
● Developed by the Google Brain team
● It powers many of Google’s large-scale services, such as
○ Google Cloud Speech
○ Google Photos and
○ Google Search

TensorFlow
TensorFlow - Principle
● First define graph of computations

Operation

TensorFlow
TensorFlow - Principle
● Then TensorFlow runs this graph efficiently using optimized C++ code

Operation

TensorFlow
TensorFlow - Parallel Computation
● Also, the graph can be broken into multiple chunks
● Each chunk can run in parallel across multiple
○ CPUs and
○ GPUs

TensorFlow
TensorFlow - Parallel Computation
● TensorFlow also supports distributed computing
○ We’ll cover Distributed TensorFlow later in the course
● Can train a network on a training set composed of
○ Billions of instances with millions of features each

TensorFlow
TensorFlow - Highlights - Runs Everywhere
● Runs on desktop and mobile platforms such as
○ Windows
○ Linux
○ macOS
○ iOS and
○ Android

TensorFlow
TensorFlow - Highlights - TF.Learn
● Provides a Python API called TF.Learn (tensorflow.contrib.learn)
● TF.Learn is compatible with Scikit-Learn
● Train machine learning models with just a few lines of code

TensorFlow
TensorFlow - Highlights - Other APIs
● Many high-level APIs are built on top of TensorFlow such as
○ Keras and
○ Pretty Tensor

TensorFlow
TensorFlow - Highlights - Flexibility
● Python API offers flexibility to create all sorts of computations
○ Including any neural network architecture we can think of

TensorFlow
TensorFlow - Highlights - C++ API
● Includes highly efficient C++ implementations of many ML operations
● Also it has a C++ API to define our own high-performance operations

TensorFlow
TensorFlow - Highlights - TensorBoard
● Has a great visualization tool called TensorBoard
○ It allows us to browse through the computation graph
○ Helps in viewing learning curves and more details

TensorFlow
TensorFlow - Highlights - Open Source
● Last but not least
○ It has a dedicated team of passionate and helpful developers
○ And a growing community contributing to improve it
○ It is one of the most popular open source projects on GitHub

TensorFlow
Deep Learning Libraries

Open source Deep Learning libraries (not an exhaustive list)

TensorFlow
Creating Our First Graph and Running It in a
Session

TensorFlow
Creating & Running a Graph
>>> import tensorflow as tf
>>> x = tf.Variable(3, name="x") #x = 3
>>> y = tf.Variable(4, name="y") # y = 4
>>> f = x*x*y + y + 2

TensorFlow
Creating & Running a Graph
>>> import tensorflow as tf
>>> x = tf.Variable(3, name="x")
>>> y = tf.Variable(4, name="y")
>>> f = x*x*y + y + 2

● This code does not perform any computation


● It just creates a computation graph
● Even the variables are not initialized up to this point

TensorFlow
Creating & Running a Graph

Then how do we start the computation?

TensorFlow
Creating & Running a Graph
● To evaluate this graph
○ We need to open a TensorFlow session and
○ Use it to initialize the variables and evaluate f

TensorFlow
Creating & Running a Graph - TF Session
● A TensorFlow session takes care of
○ Placing the operations onto devices
■ Such as CPUs and GPUs
○ and Running them
○ It also holds all the variable values

TensorFlow
Creating & Running a Graph - TF Session
>>> sess = tf.Session()        # Creates the session
>>> sess.run(x.initializer)    # Initialize the variables
>>> sess.run(y.initializer)
>>> result = sess.run(f)       # Evaluates f
>>> print(result)
42
>>> sess.close()               # Closes the session to free up resources

TensorFlow
Creating & Running a Graph - TF Session
Better way
with tf.Session() as sess:
    x.initializer.run()
    y.initializer.run()
    result = f.eval()
result

Output - 42

TensorFlow
Creating & Running a Graph - TF Session
Better way
with tf.Session() as sess:
    x.initializer.run()
    y.initializer.run()
    result = f.eval()
result

Important - the session is automatically closed at the end of the block

TensorFlow
Creating & Running a Graph - TF Session
Another Better way
● Instead of manually running the initializer for every single variable
○ We can use the global_variables_initializer() function

init = tf.global_variables_initializer() # prepare an init node


with tf.Session() as sess:
    init.run() # actually initialize all the variables
    result = f.eval()

TensorFlow
Creating & Running a Graph - TF Session
Another Better way
● Instead of manually running the initializer for every single variable
○ We can use the global_variables_initializer() function

init = tf.global_variables_initializer() # prepare an init node


with tf.Session() as sess:
    init.run() # actually initialize all the variables
    result = f.eval()

Important -
● It does not actually perform the initialization immediately
● It creates a node in the graph that will initialize all variables when it is run

TensorFlow
Creating & Running a Graph - Summary
TensorFlow Program

Build the computation graph  →  Construction Phase

Run the computation graph  →  Execution Phase

TensorFlow
Creating & Running a Graph - Summary
Construction Phase
● Typically, a computation graph is built
● Representing the ML model and
● The computations required to train it

TensorFlow
Creating & Running a Graph - Summary
Execution Phase
● Runs a loop that evaluates a training step repeatedly
● For example, one step per mini-batch
● Gradually improving the model parameters
● We will see an example shortly

TensorFlow
Managing Graphs

TensorFlow
Managing Graphs
● Any node we create is added to the default graph

>>> x1 = tf.Variable(1)
>>> x1.graph is tf.get_default_graph()

Output -
True

TensorFlow
Managing Graphs
● At times we may want to manage multiple independent graphs
● We can do so by creating a new Graph and
○ Temporarily making it the default graph
○ Inside a with block

graph = tf.Graph()
with graph.as_default():
    x2 = tf.Variable(2)

print(x2.graph is graph)
print(x2.graph is tf.get_default_graph())

Output-
True
False

TensorFlow
Managing Graphs
Important
● Use the reset_graph() function defined in the notebook to
● Reset the default graph (a typical definition is sketched below)
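
The exact helper lives in the notebook; a minimal sketch of what reset_graph() typically looks like (the seed value is illustrative):

import numpy as np
import tensorflow as tf

def reset_graph(seed=42):
    # Clear the default graph and fix the random seeds for reproducible runs
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)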

TensorFlow
Lifecycle of a Node Value

TensorFlow
Lifecycle of a Node Value
Important
● When we evaluate a node
○ TensorFlow automatically determines the set of nodes that it depends on and
○ It evaluates these nodes first

TensorFlow
Lifecycle of a Node Value
Example
● Consider the following code

w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3
with tf.Session() as sess:
    print(y.eval()) # 10
    print(z.eval()) # 15

TensorFlow
Lifecycle of a Node Value
Example
● Consider the following code

w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3
with tf.Session() as sess:
    print(y.eval()) # 10
    print(z.eval()) # 15

Evaluating y: w = 3  →  x = 3 + 2 = 5  →  y = 5 + 5 = 10

TensorFlow
Lifecycle of a Node Value
Example
● Consider the following code

w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3
with tf.Session() as sess:
    print(y.eval()) # 10
    print(z.eval()) # 15

Evaluating z: w = 3  →  x = 3 + 2 = 5  →  z = 5 * 3 = 15

TensorFlow
Lifecycle of a Node Value
Important
● The preceding code evaluates w and x twice
● Node values are dropped between graph runs
● Except variable values, which are maintained by the
○ Session across graph runs
● A variable starts its life when its initializer is run, and
○ It ends when the session is closed

TensorFlow
Lifecycle of a Node Value
Important
● To evaluate y and z without evaluating w and x twice
○ Ask TensorFlow to evaluate both y and z in just one graph run

with tf.Session() as sess:
    y_val, z_val = sess.run([y, z])
    print(y_val) # 10
    print(z_val) # 15

TensorFlow
Ops
● TensorFlow operations are called ops
● Ops can take any number of inputs and produce any number of outputs
● For example,
○ Addition and multiplication ops
■ Each take two inputs and
■ Produce one output
○ Constants and variables take no input
■ They are called source ops

TensorFlow
Tensors
● The inputs and outputs are
○ Multidimensional arrays, called tensors
○ Hence the name TensorFlow
● Like NumPy arrays, tensors have a type and a shape
● In the Python API tensors are
○ Simply represented by NumPy ndarrays
● Tensors typically contain floats
○ But we can also make them carry strings (arbitrary byte arrays)

TensorFlow
Linear Regression with TensorFlow

TensorFlow
Linear Regression

(Illustration: 3 of one item plus 5 of another cost $12; 9 of the first plus 1 of the second cost $21. What is the price of each item?)

TensorFlow
Linear Regression

(Dividing each equation through: 1 of the first item + 5/3 of the second = $12/3, and 1 of the first item + 1/9 of the second = $21/9. What is the price of each item?)

Did you notice that the values are not integers?

TensorFlow
Linear Regression

1 of the first item + 5/3 of the second = $12/3
1 of the first item + 1/9 of the second = $21/9

In matrix form:

[ 1  5/3 ] [ θ1 ]   [ 12/3 ]
[ 1  1/9 ] [ θ2 ] = [ 21/9 ]

TensorFlow
Linear Regression

[ 1  5/3 ] [ θ1 ]   [ 12/3 ]
[ 1  1/9 ] [ θ2 ] = [ 21/9 ]
     X        Θ         y

TensorFlow
Linear Regression

[ 1  5/3 ] [ θ1 ]   [ 12/3 ]
[ 1  1/9 ] [ θ2 ] = [ 21/9 ]
     X        Θ         y

The optimum value of theta can be calculated using the Normal Equation:

Θ = (X^T · X)^(-1) · X^T · y

TensorFlow
Linear Regression - NumPy

[ 1  5/3 ] [ θ1 ]   [ 12/3 ]
[ 1  1/9 ] [ θ2 ] = [ 21/9 ]
     X        Θ         y

The optimum value of theta can be calculated using the Normal Equation:

Θ = (X^T · X)^(-1) · X^T · y

np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)

TensorFlow
Linear Regression - NumPy

[ 1  5/3 ] [ θ1 ]   [ 12/3 ]
[ 1  1/9 ] [ θ2 ] = [ 21/9 ]
     X        Θ         y

The optimum value of theta can be calculated using the Normal Equation:

Θ = (X^T · X)^(-1) · X^T · y

np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)

[[2.21428571], [1.07142857]]
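
For reference, a small self-contained NumPy sketch of this calculation (the array values simply restate the two equations above):

import numpy as np

# Each row is one equation: [1, 5/3] and [1, 1/9]
X = np.array([[1.0, 5.0/3.0],
              [1.0, 1.0/9.0]])
y = np.array([[12.0/3.0],
              [21.0/9.0]])

# Normal Equation: theta = (X^T . X)^(-1) . X^T . y
theta = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)
print(theta)  # ~ [[2.21428571], [1.07142857]]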

TensorFlow
Linear Regression
● Now, let's calculate theta using the Normal Equation in TensorFlow
● We will use the housing dataset from the end-to-end project

Θ = (X^T · X)^(-1) · X^T · y

TensorFlow
Linear Regression
>>> import numpy as np
>>> from sklearn.datasets import fetch_california_housing

>>> reset_graph()
>>> housing = fetch_california_housing()

Fetches housing data

TensorFlow
Linear Regression
>>> import numpy as np
>>> from sklearn.datasets import fetch_california_housing

>>> reset_graph()
>>> housing = fetch_california_housing()
>>> m, n = housing.data.shape
>>> housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]

Adds an extra bias input feature (x0 = 1) to all training instances (it does so using NumPy, so it runs immediately)

TensorFlow
Linear Regression
>>> import numpy as np
>>> from sklearn.datasets import fetch_california_housing

>>> reset_graph()
>>> housing = fetch_california_housing()
>>> m, n = housing.data.shape
>>> housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]

>>> X = tf.constant(housing_data_plus_bias)
>>> y = tf.constant(housing.target.reshape(-1, 1))

Creates two TensorFlow constant nodes, X and y, to hold this data


and the targets

TensorFlow
Linear Regression
>>> import numpy as np
>>> from sklearn.datasets import fetch_california_housing

>>> reset_graph()
>>> housing = fetch_california_housing()
>>> m, n = housing.data.shape
>>> housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]

>>> X = tf.constant(housing_data_plus_bias)
>>> y = tf.constant(housing.target.reshape(-1, 1))
>>> XT = tf.transpose(X)
>>> theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)

Matrix operations provided by TensorFlow to define theta

TensorFlow
Linear Regression
Run the previous code:

with tf.Session() as sess:
    theta_value = theta.eval()
theta_value

Output

array([[ -3.74651413e+01],
[ 4.35734153e-01],
[ 9.33829229e-03],
[ -1.06622010e-01],
[ 6.44106984e-01],
[ -4.25131839e-06],
[ -3.77322501e-03],
[ -4.26648885e-01],
[ -4.40514028e-01]], dtype=float32)

TensorFlow
Linear Regression

Compare the different ways of calculating theta (using NumPy and Scikit-Learn) with the code shown in the notebook

TensorFlow
Linear Regression
The main benefit of the previous code versus computing the Normal Equation directly with NumPy is that
○ TensorFlow automatically runs it on
○ Your GPU card if you have one
○ Provided TensorFlow with GPU support is installed

TensorFlow
Implementing Gradient Descent

TensorFlow
Implementing Gradient Descent
● Let's try using Batch Gradient Descent instead of the Normal Equation
● In the next couple of slides we will
○ First manually compute the gradients
○ Then use TensorFlow's autodiff feature
○ And then use a couple of TensorFlow's out-of-the-box optimizers

TensorFlow
Implementing Gradient Descent
Manually computing Gradients using Batch Gradient Descent

Gradient vector of the cost function
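
The formula on this slide is an image in the original deck; for reference, the standard gradient vector of the MSE cost (matching the NumPy code that follows) is:

$$ \nabla_{\theta}\,\mathrm{MSE}(\theta) \;=\; \frac{2}{m}\, X^{T} (X\theta - y) $$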

TensorFlow
Implementing Gradient Descent
Manually computing Gradients using Batch Gradient Descent

Gradient Descent step
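
The Gradient Descent step shown on this slide (as an image in the original deck) is the standard update, with learning rate η (eta in the code):

$$ \theta^{(\text{next step})} \;=\; \theta \;-\; \eta \,\nabla_{\theta}\,\mathrm{MSE}(\theta) $$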

TensorFlow
Implementing Gradient Descent
Manually computing Gradients using Batch Gradient Descent
using NumPy

# Using NumPy
# X_b is the bias-augmented feature matrix and y the targets (defined in the notebook)
eta = 0.1  # learning rate
n_iterations = 1000
m = 100
theta = np.random.randn(2, 1)  # random initialization
for iteration in range(n_iterations):
    gradients = 2/m * X_b.T.dot(X_b.dot(theta) - y)
    theta = theta - eta * gradients
print(theta)

TensorFlow
Implementing Gradient Descent
Manually computing Gradients using Batch Gradient Descent
using TF
Step 1 - Normalize the feature vectors using Scikit-Learn

>>> from sklearn.preprocessing import StandardScaler
>>> scaler = StandardScaler()
>>> scaled_housing_data = scaler.fit_transform(housing.data)
>>> scaled_housing_data_plus_bias = np.c_[np.ones((m, 1)), scaled_housing_data]
>>> print(scaled_housing_data_plus_bias.mean(axis=0))
>>> print(scaled_housing_data_plus_bias.mean(axis=1))
>>> print(scaled_housing_data_plus_bias.mean())
>>> print(scaled_housing_data_plus_bias.shape)

TensorFlow
Implementing Gradient Descent
Manually computing Gradients using Batch Gradient Descent
using TF

Step 2- Manually computing the gradient

Please check the code and explanation in the notebook
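
The notebook has the full code; a sketch of the usual approach, assuming the scaled data from Step 1 (names such as y_pred, error and training_op are illustrative):

n_epochs = 1000
learning_rate = 0.01

X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
# Manually derived gradient of the MSE: (2/m) * X^T . error
gradients = 2/m * tf.matmul(tf.transpose(X), error)
training_op = tf.assign(theta, theta - learning_rate * gradients)

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())
        sess.run(training_op)
    best_theta = theta.eval()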

TensorFlow
Implementing Gradient Descent Using autodiff

TensorFlow
Using autodiff - Automatic Differentiation
Problem with manually calculating gradients
● The previous code requires mathematically deriving the gradients from the cost function (MSE)
● In the case of Linear Regression, mathematically deriving the gradient is reasonably easy
● But doing the same with deep neural networks is quite a headache
○ It would be tedious and
○ Error-prone and
○ The code might not be very efficient

TensorFlow
Using autodiff - Automatic Differentiation
Different options to find partial derivatives
● Let’s say we have a function f(x,y)
● And we need its partial derivatives with respect to x and y
○ To perform Gradient Descent or
○ Some other optimization algorithm

TensorFlow
Using autodiff - Automatic Differentiation
Different options to find partial derivatives
● There are four options to find partial derivatives
○ Symbolic Differentiation
○ Numerical Differentiation
○ Forward-mode autodiff
○ Reverse-mode autodiff
● We’ll cover these options in detail later in the course
● TensorFlow implements Reverse-mode autodiff

TensorFlow
Using autodiff - Automatic Differentiation
Problem with manually calculating gradients

● Let’s say our function is


○ f(x)= exp(exp(exp(x)))
● As per calculus, its derivative f′(x) will be
○ f′(x) = exp(x) × exp(exp(x)) × exp(exp(exp(x)))

TensorFlow
Using autodiff - Automatic Differentiation
Problem with manually calculating gradients

● If we code f(x) and f′(x) separately then
○ The exp function will be called 9 times
○ This is inefficient

f(x) = exp(exp(exp(x))) → 3 calls
f′(x) = exp(x) × exp(exp(x)) × exp(exp(exp(x))) → 6 calls
Total calls → 9

TensorFlow
Using autodiff - Automatic Differentiation
Problem with manually calculating gradients

● A more efficient solution is to write a function which


○ First computes exp(x)
○ Then exp(exp(x))
○ Then exp(exp(exp(x)))
○ And returns all the three
○ This way exp function will be called just three times, once in each step

TensorFlow
Using autodiff - Automatic Differentiation
● TensorFlow’s autodiff feature
○ Efficiently and automatically computes the gradient for us
○ Replace the gradients = … in the Gradient Descent code in the
previous section with the below line
○ gradients = tf.gradients(mse, [theta])[0]
○ And it will compute the gradients for us

Check the code in the notebook

TensorFlow
Using autodiff - Automatic Differentiation
gradients = tf.gradients(mse, [theta])[0]

● The gradients() function takes


○ An op (in this case mse) and
○ A list of variables (in this case just theta)
● And computes the gradient with regard to each variable in the list

TensorFlow
Understanding Differentiation
Manual

● Slope at any point


● Here the curve is:
○ y = x*x
● Here we are computing the slope at x = 5:
○ dy/dx = d(x^2)/dx = 2x
○ At x = 5, the slope is 2*5 = 10

TensorFlow
Understanding Differentiation
Using tensorFlow Autodiff

tf.reset_default_graph()
x = tf.constant(5.0)
y = tf.square(x)
z = tf.gradients(y, x)
with tf.Session() as s:
    print(z[0].eval())
------
10.0

TensorFlow
Using autodiff - Automatic Differentiation
One more example on autodiff gradient computation

Find the partial derivatives of the following function with regards to `a` and
`b` at the point (0.2, 0.3)

def my_func(a, b):
    z = 0
    for i in range(100):
        z = a * np.cos(z + i) + z * np.sin(b - i)
    return z

Check the code in the notebook
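
The notebook has the exact code; a sketch of how the same function can be rebuilt with TensorFlow ops so that tf.gradients() can differentiate it (variable names are illustrative):

a = tf.Variable(0.2, name="a")
b = tf.Variable(0.3, name="b")
z = tf.constant(0.0, name="z0")
for i in range(100):
    # Same recurrence as my_func, but built from TensorFlow ops
    z = a * tf.cos(z + i) + z * tf.sin(b - i)

grads = tf.gradients(z, [a, b])  # partial derivatives w.r.t. a and b
init = tf.global_variables_initializer()
with tf.Session() as sess:
    init.run()
    print(z.eval())
    print(sess.run(grads))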

TensorFlow
Implementing Gradient Descent Using an
Optimizer

TensorFlow
Implementing Gradient Descent
Using an Optimizer

● As we have seen TensorFlow computes the gradient for us


● It also provides a number of optimizers out of the box such as
○ Gradient Descent optimizer
○ Momentum optimizer
● We will cover these optimizers in detail later in the course

TensorFlow
Implementing Gradient Descent
Using an Optimizer - GradientDescentOptimizer

Replace the preceding gradients = ... and training_op = ... lines with the
following code

>>> optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
>>> training_op = optimizer.minimize(mse)

Please check the code and explanation in the notebook

TensorFlow
Implementing Gradient Descent
Using an Optimizer - MomentumOptimizer
It adds a momentum term to Gradient Descent, making the update rule slightly different.

>>> optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.9)

Please check the code and explanation in the notebook

TensorFlow
Feeding Data to the Training Algorithm

TensorFlow
Feeding Data to the Training Algorithm
Feeding and Placeholder nodes

● Feeding allows us to substitute different values for one or more tensors at run time
● Placeholders are nodes whose value is fed in at execution time
○ They don’t actually perform any computation
○ They just output the data we tell them to output at runtime
○ Typically used to pass the training data to TensorFlow during training
○ Mostly inputs and labels

TensorFlow
Feeding Data to the Training Algorithm
Placeholder nodes - Example

>>> A = tf.placeholder(tf.float32, shape=(None, 3))
>>> B = A + 5
>>> with tf.Session() as sess:
...     B_val_1 = B.eval(feed_dict={A: [[1, 2, 3]]})
...     B_val_2 = B.eval(feed_dict={A: [[4, 5, 6], [7, 8, 9]]})

TensorFlow
Feeding Data to the Training Algorithm
Placeholder nodes - Example

>>> A = tf.placeholder(tf.float32, shape=(None, 3))
>>> B = A + 5
>>> with tf.Session() as sess:
...     B_val_1 = B.eval(feed_dict={A: [[1, 2, 3]]})
...     B_val_2 = B.eval(feed_dict={A: [[4, 5, 6], [7, 8, 9]]})

● Creates a placeholder node “A”
● “A” must have rank 2 (two-dimensional) and
● It must have exactly three columns
● It can have any number of rows

TensorFlow
Feeding Data to the Training Algorithm
Placeholder nodes - Example

>>> A = tf.placeholder(tf.float32, shape=(None, 3))
>>> B = A + 5
>>> with tf.Session() as sess:
...     B_val_1 = B.eval(feed_dict={A: [[1, 2, 3]]})
...     B_val_2 = B.eval(feed_dict={A: [[4, 5, 6], [7, 8, 9]]})

● Create a node “B”

TensorFlow
Feeding Data to the Training Algorithm
Placeholder nodes - Example

>>> A = tf.placeholder(tf.float32, shape=(None, 3))
>>> B = A + 5
>>> with tf.Session() as sess:
...     B_val_1 = B.eval(feed_dict={A: [[1, 2, 3]]})
...     B_val_2 = B.eval(feed_dict={A: [[4, 5, 6], [7, 8, 9]]})

● While evaluating B
○ Pass a feed_dict to the eval() method
○ feed_dict specifies the value of A

Check the code in the notebook

TensorFlow
Feeding Data to the Training Algorithm
Placeholder nodes - Mini-batch Gradient Descent

● Let’s implement Mini-batch Gradient Descent using TensorFlow
● We need to replace X and y at every iteration with the next mini-batch
○ Check the complete code in the notebook

X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")

Check the complete code in the notebook
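
The notebook has the complete program; a sketch of the execution phase with these placeholders, assuming theta, training_op and init are defined as in the earlier Gradient Descent code (fetch_batch here just samples random rows and is illustrative):

n_epochs = 10
batch_size = 100
n_batches = int(np.ceil(m / batch_size))

def fetch_batch(epoch, batch_index, batch_size):
    # Illustrative: pick a random mini-batch from the scaled training data
    np.random.seed(epoch * n_batches + batch_index)
    indices = np.random.randint(m, size=batch_size)
    X_batch = scaled_housing_data_plus_bias[indices]
    y_batch = housing.target.reshape(-1, 1)[indices]
    return X_batch, y_batch

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
    best_theta = theta.eval()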

TensorFlow
Saving and Restoring Models

TensorFlow
Saving & Restoring Models
● We should save the model parameters to disk once the model is trained
● It helps in
○ Using the trained models without retraining them
■ Remember training takes a lot of time
○ Using models in other programs

TensorFlow
Saving & Restoring Models - Checkpoints
● We should also save checkpoints to disks
○ At regular intervals during training
○ So that if the computer crashes during training
○ Then we can continue from the last saved checkpoint
○ Rather than start over from scratch

TensorFlow
Saving & Restoring Models - Save Models
How to save models in TensorFlow?
Step 1 - Create a Saver node
At the end of the construction phase (after all the variable nodes are created)

>>> saver = tf.train.Saver() # Create Saver node

Check the complete code in the notebook

TensorFlow
Saving & Restoring Models - Save Models
How to save models in TensorFlow?
Step 2 - In the execution phase
● Just call its save() method whenever you want to save the model
● By passing the session and path of the checkpoint file

>>> save_path = saver.save(sess, "/tmp/my_model_final.ckpt") # Save model

Check the complete code in the notebook

TensorFlow
Saving & Restoring Models - Restore Models
How to restore models in TensorFlow?
● In the execution phase,
○ Do not initialize variables using init
○ Just call the restore() method of the Saver object

with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_model_final.ckpt")
    best_theta_restored = theta.eval()

Check the complete code in the notebook

TensorFlow
Saving & Restoring Models - Restore Models
How to restore models in TensorFlow?
● Check if the restored theta is the same as the saved theta

np.allclose(best_theta, best_theta_restored)

Output - True

Check the complete code in the notebook

TensorFlow
Saving & Restoring Models - Restore Models
How to restore models in TensorFlow?
● Save or restore theta under a different name, such as "weights"

saver = tf.train.Saver({"weights": theta})

Check the complete code in the notebook

TensorFlow
Visualizing the Graph and Training Curves
Using TensorBoard

TensorFlow
TensorBoard
● In the previous example
○ We used the print() function to
○ Visualize progress during training
● Training a massive deep neural network can be
○ Complex and
○ Confusing
● A better way to visualize is to use TensorBoard

TensorFlow
TensorBoard
● TensorBoard is a suite of web applications
○ For inspecting and understanding
○ TensorFlow runs and graphs

TensorFlow
TensorBoard
● TensorBoard makes it easy to
○ Understand,
○ Debug and
○ Optimize
○ TensorFlow programs

TensorFlow
TensorBoard
● If we feed training stats to TensorBoard
○ It displays nice interactive visualizations of these stats
○ In the web browser
● If we feed the graph definition to TensorBoard
○ It displays a great interface to browse through it
■ Very useful to identify errors in the graph
■ Find bottlenecks

TensorFlow
TensorBoard
● We can visualize using TensorBoard in two ways
○ Inside Jupyter (Using IFrame)
○ Using TensorBoard Server

TensorFlow
Visualize Graph Inside Jupyter
● We can visualize the graph inside the Jupyter instead of using
TensorBoard

Check the complete code in the notebook

TensorFlow
Using TensorBoard Server
● Let’s visualize our previous TensorFlow program using TensorBoard
server
● We have to tweak our program a bit
○ So that it writes
○ The graph definition and training stats to
○ A log directory that TensorBoard will read from

TensorFlow
Using TensorBoard Server
● We use a different log directory for every run of the program
● Otherwise, TensorBoard merges stats from different runs
○ Which will mess up the visualizations
● The simplest solution is to
○ Include a timestamp in the log directory name
● Let’s visualize using TensorBoard

TensorFlow
Using TensorBoard Server
Step 1 - Add the following code at the beginning of the program

>>> from datetime import datetime


>>> now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
>>> root_logdir = "tf_logs"
>>> logdir = "{}/run-{}/".format(root_logdir, now)

Includes the timestamp in the log directory name

TensorFlow
Using TensorBoard Server
Step 2 - Add the following code at the very end of the construction phase

>>> mse_summary = tf.summary.scalar('MSE', mse)


>>> file_writer = tf.summary.FileWriter(logdir,tf.get_default_graph())

TensorFlow
Using TensorBoard Server
Step 2 - Add the following code at the very end of the construction phase

>>> mse_summary = tf.summary.scalar('MSE', mse)


>>> file_writer = tf.summary.FileWriter(logdir,tf.get_default_graph())

● Creates a node in the graph that will evaluate the MSE and
○ Write it to a TensorBoard-compatible binary log string
○ Called a summary

TensorFlow
Using TensorBoard Server
Step 2 - Add the following code at the very end of the construction phase

>>> mse_summary = tf.summary.scalar('MSE', mse)


>>> file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())

● Creates a FileWriter that
○ Writes summaries to logfiles in the log directory and
○ Writes the graph definition to a binary logfile called an events file

TensorFlow
Using TensorBoard Server
Step 2 - Add the following code at the very end of the construction phase

>>> mse_summary = tf.summary.scalar('MSE', mse)


>>> file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())

● The first parameter, logdir
○ Indicates the path of the log directory
● The second parameter is the
○ Graph we want to visualize
○ (This parameter is optional)

TensorFlow
Using TensorBoard Server
Step 3 - Update the execution phase
[...]
for batch_index in range(n_batches):
    X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
    if batch_index % 10 == 0:
        summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
        step = epoch * n_batches + batch_index
        file_writer.add_summary(summary_str, step)
    sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
[...]

TensorFlow
Using TensorBoard Server
Step 3 - Update the execution phase
[...]
for batch_index in range(n_batches):
    X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
    if batch_index % 10 == 0:
        summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
        step = epoch * n_batches + batch_index
        file_writer.add_summary(summary_str, step)
    sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
[...]

● Evaluate the mse_summary node regularly during training


○ e.g., every 10 mini-batches

TensorFlow
Using TensorBoard Server
Step 3 - Update the execution phase
[...]
for batch_index in range(n_batches):
    X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
    if batch_index % 10 == 0:
        summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
        step = epoch * n_batches + batch_index
        file_writer.add_summary(summary_str, step)
    sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
[...]

Writes the summary from the previous step to the event file

TensorFlow
Using TensorBoard Server
Step 4 - Close the FileWriter at the end of the program

>>> file_writer.close()

TensorFlow
Using TensorBoard Server
Step 5 - Run the Program

● It will create log directory


○ And events files in that directory
○ Containing both the graph definition and
○ MSE Values

TensorFlow
Using TensorBoard Server
Step 6 - Check the files

>>> cd ~/ml/deep_learning/ # Go to the deep learning directory


>>> ls -lh tf_logs/run*

TensorFlow
Using TensorBoard Server
Step 7 - Fire up the TensorBoard server

1. source activate py36
2. cd ~/ml
3. tensorboard --logdir deep_learning/tf_logs/ --port 6006
a. If you get the error “port already in use”
b. Change the port to 6007, 6008 and so on
c. We have opened ports 6006 to 6016 in the firewall for TensorBoard
4. Note the port you used
5. Find the hostname of your console - whether it is e, f, or g.cloudxlab.com
6. Open <hostname>:<port> in the browser

TensorFlow
Name Scopes

TensorFlow
Name Scopes
● When dealing with complex neural networks
○ The graph can become cluttered with thousands of nodes
● We can avoid this by
○ Creating name scopes to group related nodes

TensorFlow
Name Scopes
Example

>>> reset_graph()
>>> a1 = tf.Variable(0, name="a") # name == "a"
>>> a2 = tf.Variable(0, name="a") # name == "a_1" (since "a" already exists)

● When we create a node


○ TensorFlow checks if its name already exists
○ If yes, then it appends
■ An underscore followed by
■ An index to make the name unique

TensorFlow
Name Scopes
Example

with tf.name_scope("param"):       # scope name == "param"
    a3 = tf.Variable(0, name="a")  # name == "param/a"

with tf.name_scope("param"):       # scope name == "param_1"
    a4 = tf.Variable(0, name="a")  # name == "param_1/a"

with tf.name_scope("param"):       # scope name == "param_2"
    a5 = tf.Variable(0, name="a")  # name == "param_2/a"

TensorFlow
Name Scopes
Example

for node in (a1, a2, a3, a4, a5):
    print(node.op.name)

Output -

a
a_1
param/a
param_1/a
param_2/a

Check the complete code in the notebook

TensorFlow
Name Scopes
● In our previous TensorBoard example
○ We can define
○ “error” and “mse” ops within a
○ name scope called “loss”

>>> with tf.name_scope("loss") as scope:
...     error = y_pred - y
...     mse = tf.reduce_mean(tf.square(error), name="mse")
>>> print(error.op.name) # loss/sub
>>> print(mse.op.name)   # loss/mse

Check the complete code in the notebook

TensorFlow
Name Scopes

A collapsed name scope in TensorBoard (the mse and error nodes now appear inside the loss name scope)
TensorFlow
Modularity

TensorFlow
Modularity
● Let’s say we want to add the output of
○ Two rectified linear units (ReLU)
● A ReLU computes
○ A linear function of the inputs and
○ Outputs the result if it is positive, and 0 otherwise

Rectified Linear Unit (ReLU)

TensorFlow
Modularity

1. How do we add the output of two ReLUs?
1.1. Create a graph for the first ReLU
1.2. Create a graph for the second ReLU
1.3. Add them

TensorFlow
Modularity
Question - What is the problem with the code in notebook?
n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
w1 = tf.Variable(tf.random_normal((n_features, 1)), name="weights1")
w2 = tf.Variable(tf.random_normal((n_features, 1)), name="weights2")
b1 = tf.Variable(0.0, name="bias1")
b2 = tf.Variable(0.0, name="bias2")
z1 = tf.add(tf.matmul(X, w1), b1, name="z1")
z2 = tf.add(tf.matmul(X, w2), b2, name="z2")
relu1 = tf.maximum(z1, 0., name="relu1")
relu2 = tf.maximum(z1, 0., name="relu2")
output = tf.add(relu1, relu2, name="output")

TensorFlow
Modularity
● The code in the notebook is
○ Repetitive
○ Error-prone and
○ Hard to maintain
● Also, what if we had to add the output of five ReLUs?

TensorFlow
Modularity
● TensorFlow lets us stay DRY
○ Don’t Repeat Yourself
● Create a function to build a ReLU
● Check the code in the notebook (a minimal sketch is shown below)
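
A minimal sketch of such a relu() function and how it could be used to add five ReLUs (shapes and names follow the earlier code; the exact version is in the notebook):

def relu(X):
    # Builds one ReLU: a linear combination of the inputs, clipped at 0
    w_shape = (int(X.get_shape()[1]), 1)
    w = tf.Variable(tf.random_normal(w_shape), name="weights")
    b = tf.Variable(0.0, name="bias")
    z = tf.add(tf.matmul(X, w), b, name="z")
    return tf.maximum(z, 0., name="relu")

n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")  # add the outputs of all five ReLUs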

TensorFlow
Modularity
● Check the graph by running
○ tensorboard --logdir logs/relu2 --port 6006

TensorFlow
Modularity
● If name already exists, it appends an “_” followed by an
index to make the name unique.
● So the first ReLU contains nodes named "weights",
"bias", "z", and "relu" (plus many more nodes with their
default name, such as "MatMul");
● The second ReLU contains nodes named "weights_1",
"bias_1", and so on; the third ReLU contains nodes
named "weights_2", "bias_2", and so on.
● TensorBoard identifies such series and collapses them
together to reduce clutter

TensorFlow
Modularity

● Using name scopes, you can make the graph much clearer.
● Simply move all the content of the relu() function inside a name scope.

TensorFlow
Sharing Variables

TensorFlow
Sharing Variables - Classic Way
● How do we share a variable between various components?
● Classic way is to
○ First create the variable
○ Then pass it as a parameter to the functions

TensorFlow
Sharing Variables - Classic Way
● For example
○ To control the ReLU threshold (currently 0) using a shared threshold
variable for all ReLUs
○ Create variable first, and then pass it to the relu() function

Check the complete code in the notebook

TensorFlow
Sharing Variables - Classic Way
● For example
○ To control the ReLU threshold (currently 0) using a shared threshold
variable for all ReLUs
○ Create variable first, and then pass it to the relu() function

Step 1 - Create the shared variable

Step 2 - Pass it to the relu() function (a sketch of this pattern is shown below)
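
A sketch of this classic approach (the threshold variable is created once and passed to every call; names are illustrative):

def relu(X, threshold):
    w_shape = (int(X.get_shape()[1]), 1)
    w = tf.Variable(tf.random_normal(w_shape), name="weights")
    b = tf.Variable(0.0, name="bias")
    z = tf.add(tf.matmul(X, w), b, name="z")
    return tf.maximum(z, threshold, name="max")

threshold = tf.Variable(0.0, name="threshold")   # Step 1 - create the shared variable
X = tf.placeholder(tf.float32, shape=(None, 3), name="X")
relus = [relu(X, threshold) for i in range(5)]   # Step 2 - pass it to every relu()
output = tf.add_n(relus, name="output")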


TensorFlow
Question - What is the problem with previous
approach?

TensorFlow
Sharing Variables
Answer-
● If there are many shared parameters, then
○ It becomes painful to pass them
○ As parameters all the time

TensorFlow
Sharing Variables - Solutions
1. Create a Python dictionary
a. Containing all the variables and
b. Pass it around to every function
2. Set the shared variable
a. As an attribute of the relu() function
b. Check the complete code in the notebook

TensorFlow
Sharing Variables - get_variable()
● TensorFlow offers cleaner and more modular option
● Use the get_variable() function to
○ Create the shared variable if it does not exist
○ Or reuse it if it already exists

TensorFlow
Sharing Variables - get_variable()
● For example
○ The following code creates a variable named "relu/threshold" (a sketch is shown below)

The scalar 0.0 is the initial value
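
A sketch of what such a call looks like (the variable gets its full name "relu/threshold" from the enclosing variable scope):

with tf.variable_scope("relu"):
    threshold = tf.get_variable("threshold", shape=(),
                                initializer=tf.constant_initializer(0.0))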

TensorFlow
Sharing Variables - get_variable()
● Note that this code will raise an exception
○ If the variable has already been created
○ By an earlier call to get_variable()
● This prevents reusing variables by mistake

TensorFlow
Sharing Variables - Reuse Variables
● If we want to reuse a variable
○ We explicitly say so by setting the
○ Variable scope’s reuse attribute to True (see the sketch below)
● Note that here we don’t have to specify
○ The shape
○ Or the initializer
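
A sketch of reusing the variable by setting reuse=True on the scope:

with tf.variable_scope("relu", reuse=True):
    threshold = tf.get_variable("threshold")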

TensorFlow
Sharing Variables - Reuse Variables
● This code fetches the existing "relu/threshold" variable
● Or raises an exception
○ If it does not exist
○ Or if it was not created using get_variable()

TensorFlow
Sharing Variables - Reuse Variables
Important
● Once reuse is set to True
○ It cannot be set back to False within the block
● Only variables created by
○ get_variable() can be reused

TensorFlow
Sharing Variables - Reuse Variables
● An alternative way to reuse variables is to
○ Use the reuse_variables() function (see the sketch below)
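
A sketch of the alternative, calling reuse_variables() inside the scope:

with tf.variable_scope("relu") as scope:
    scope.reuse_variables()
    threshold = tf.get_variable("threshold")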

TensorFlow
Sharing Variables - Reuse Variables
● Now let’s come back to the original problem of sharing variables
● And make the relu() function
○ Access the threshold variable
○ Without having to pass it as a parameter

TensorFlow
Sharing Variables - Reuse Variables
Steps - 1
● This code first defines the relu() function

TensorFlow
Sharing Variables - Reuse Variables
Steps - 2
● Then creates the relu/threshold variable
○ As a scalar that
○ Will later be initialized to 0.0

TensorFlow
Sharing Variables - Reuse Variables
Steps - 3
● Then builds five ReLUs by calling the relu() function

TensorFlow
Sharing Variables - Reuse Variables
Steps - 4
● The relu() function reuses the relu/threshold variable (the full pattern is sketched below)
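
Putting Steps 1-4 together, a sketch of the whole pattern (the exact code is in the notebook):

def relu(X):
    with tf.variable_scope("relu", reuse=True):
        threshold = tf.get_variable("threshold")   # Step 4 - reuse the shared variable
        w_shape = (int(X.get_shape()[1]), 1)
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        z = tf.add(tf.matmul(X, w), b, name="z")
        return tf.maximum(z, threshold, name="max")

X = tf.placeholder(tf.float32, shape=(None, 3), name="X")
with tf.variable_scope("relu"):                     # Step 2 - create the variable once
    threshold = tf.get_variable("threshold", shape=(),
                                initializer=tf.constant_initializer(0.0))
relus = [relu(X) for relu_index in range(5)]        # Step 3 - build five ReLUs
output = tf.add_n(relus, name="output")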

TensorFlow
Summary

TensorFlow
Summary
● In this chapter
○ We learnt the basics of TensorFlow
● We will discuss more advanced topics
○ During the course like operations related to
■ Deep neural networks
■ Convolutional neural networks, and
■ Recurrent neural networks

TensorFlow
Summary
● We will also learn how to scale up TensorFlow using
○ Multithreading
○ Queues
○ Multiple GPUs and
○ Multiple servers

TensorFlow
Questions?
https://discuss.cloudxlab.com
reachus@cloudxlab.com
