CS 831
Dr. Sanjay Chatterji
1. Introduction to Keras
• Keras is a deep learning framework written in Python that runs on top of
Theano or TensorFlow.
• Keras is a high level neural network API that supports fast experimentation
and can quickly translate your ideas into results.
• Powerful
• Easy to use
• Free
• Open source
Other Deep Learning Tools
TensorFlow is not the only game in town. These are some of the best supported alternatives. Most of these are
written in C++.
• TensorFlow - Google's deep learning API.
• MXNet - Apache Foundation's deep learning API. Can be used through Keras.
• Theano - Python, from the academics that created deep learning.
• Keras - Also by Google; a higher-level framework that allows TensorFlow, MXNet and Theano to be used
interchangeably.
• Torch - Lua based. It has been used for some of the most advanced deep learning projects in the world.
• PaddlePaddle - Baidu's deep learning API.
• Deeplearning4J - Java based. GPU support in Java!
• Computational Network Toolkit (CNTK) - Microsoft. Support for Windows/Linux, command line only.
GPU support.
• H2O - Java based. Supports all major platforms. Limited support for computer vision. No GPU support.
2. Keras design principles
(1) User-friendliness
Keras provides a consistent and concise API that greatly reduces the
user workload in common use cases, while providing clear and practical
feedback on errors.
Keras has no separate model configuration file format. Models are
described in Python code, which makes them more compact, easier to
debug, and easier to extend.
(3) Extensibility
It is easy to add new modules by simply writing new classes or functions that mimic
existing modules.
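As a minimal sketch of this principle, a new layer can be written as a small class that follows the same pattern as the built-in layers. The `ScaleBy` layer below is a hypothetical example, not part of Keras, and the `tensorflow.keras` import path is an assumption about the installed environment.

```python
import numpy as np
from tensorflow import keras


class ScaleBy(keras.layers.Layer):
    """Hypothetical custom layer: multiplies its input by a fixed factor."""

    def __init__(self, factor, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, inputs):
        # Same contract as built-in layers: take a tensor, return a tensor.
        return inputs * self.factor


# The custom layer is stacked exactly like a built-in one.
model = keras.Sequential([keras.Input(shape=(3,)), ScaleBy(2.0)])
x = np.ones((4, 3), dtype="float32")
y = model.predict(x, verbose=0)
print(y.shape)
```

Because the class only overrides `__init__` and `call`, it can be mixed freely with `Dense`, `Dropout`, and the other layers used later in these slides.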
(4) Modularity
[Figure: structure of a Keras program. Data preprocessing: sequential data, text, and image preprocessing. Step 2, build the network: input layer, hidden layers, output layer; commonly used layers include convolution, pooling, local connection, and recurrent layers. Step 3, compile: optimization function, loss function, performance evaluation. Step 4, train: callback functions. Step 5, prediction.]
Hyperparameters: Activation
• Activation functions (for neurons) are applied on a per-layer basis.
• The input 2D image is flattened to a 1D vector.
• Dropout (with rate 0.2) is applied to the first hidden layer.
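The points above (per-layer activations, flattening, and dropout on the first hidden layer) can be sketched as a small model. The 28x28 input size, the 128 hidden units, and the 10 output classes are assumptions chosen for illustration, as is the `tensorflow.keras` import path.

```python
import numpy as np
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(28, 28)),                   # assumed 28x28 grayscale image
    keras.layers.Flatten(),                        # 2D image -> 1D vector of 784 values
    keras.layers.Dense(128, activation="relu"),    # first hidden layer with its own activation
    keras.layers.Dropout(0.2),                     # drops 20% of hidden outputs during training
    keras.layers.Dense(10, activation="softmax"),  # output layer with a different activation
])

images = np.random.rand(5, 28, 28).astype("float32")
probs = model.predict(images, verbose=0)
print(probs.shape)
```

Dropout is only active during training, so `predict` returns deterministic class probabilities for a given set of weights.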
6. Keras main concepts
(1) Symbolic computing
The underlying libraries of Keras use Theano or TensorFlow, which are also known
as the back ends of Keras. Both Theano and TensorFlow are "symbolic" libraries.
(2) Tensor
Tensors can be thought of as natural extensions of vectors and matrices, and can represent
a wide range of data types. The order of a tensor is also called its number of dimensions.
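A short NumPy sketch makes the idea of tensor order concrete. The shapes below are arbitrary examples; the 4D layout (batch, height, width, channels) is one common convention for image batches, not the only one.

```python
import numpy as np

scalar = np.array(5.0)               # order 0
vector = np.array([1.0, 2.0, 3.0])   # order 1
matrix = np.ones((2, 3))             # order 2
images = np.zeros((32, 28, 28, 3))   # order 4: (batch, height, width, channels)

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("images", images)]:
    # ndim is the order of the tensor; shape lists the size along each axis.
    print(name, "order:", t.ndim, "shape:", t.shape)
```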
(3) Model
Keras models come in two types: Sequential models, which are more widely used, and
functional models.
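To illustrate the difference, the sketch below builds the same small network both ways. The layer sizes are arbitrary, and the `tensorflow.keras` import path is an assumption about the environment.

```python
from tensorflow import keras

# Sequential model: a plain stack of layers.
seq_model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Functional model: layers are called on tensors, then wrapped in a Model.
inputs = keras.Input(shape=(4,))
hidden = keras.layers.Dense(8, activation="relu")(inputs)
outputs = keras.layers.Dense(1, activation="sigmoid")(hidden)
func_model = keras.Model(inputs=inputs, outputs=outputs)

# Both describe the same architecture, so the parameter counts match.
print(seq_model.count_params(), func_model.count_params())
```

The functional form is more verbose here, but it also allows multiple inputs, multiple outputs, and shared layers, which a plain stack cannot express.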
Define Model.
Fit Model.
Evaluate Model.
Load data
• Peptides in amino acid sequence
• Encode AA using BLOSUM : independent variable (X)
• Binding affinity : dependent variable (y)
• MNIST dataset
Define model
from keras.models import Sequential
from keras.layers import Dense
import numpy as np
# fix random seed for reproducibility
np.random.seed(7)
# create model
model = Sequential()
model.add(Dense(12, input_dim=INPUT_DIMENSIONS, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
Compile model
# Compile model
model.compile(loss='binary_crossentropy',
              optimizer='adam', metrics=['accuracy'])
Fit Model
# Fit the model
model.fit(X, Y,
          epochs=EPOCHS,
          batch_size=BATCH_SIZE,
          validation_data=(X_val, y_val))
Evaluate model
# evaluate the model
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1],
scores[1]*100))
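The define, compile, fit, and evaluate steps above leave `X`, `Y`, `INPUT_DIMENSIONS`, `EPOCHS`, and `BATCH_SIZE` undefined. The following is a minimal end-to-end sketch that fills them in with synthetic data and small assumed values, so the whole workflow can be run as is. The data, the sizes, and the `tensorflow.keras` import path are all assumptions made for illustration.

```python
import numpy as np
from tensorflow import keras

rng = np.random.default_rng(7)
INPUT_DIMENSIONS = 8   # assumed number of features
EPOCHS = 5             # assumed; kept small so the sketch runs quickly
BATCH_SIZE = 16        # assumed

# Synthetic binary classification data: label is 1 when the features sum to > 0.
X = rng.normal(size=(200, INPUT_DIMENSIONS)).astype("float32")
Y = (X.sum(axis=1) > 0).astype("float32")
X_train, Y_train = X[:160], Y[:160]
X_val, y_val = X[160:], Y[160:]

# Define model
model = keras.Sequential([
    keras.Input(shape=(INPUT_DIMENSIONS,)),
    keras.layers.Dense(12, activation="relu"),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Compile model
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

# Fit model
history = model.fit(X_train, Y_train, epochs=EPOCHS, batch_size=BATCH_SIZE,
                    validation_data=(X_val, y_val), verbose=0)

# Evaluate model on the held-out data
loss, accuracy = model.evaluate(X_val, y_val, verbose=0)
print("loss: %.3f, accuracy: %.2f%%" % (loss, accuracy * 100))
```

Evaluating on held-out data rather than on the training set gives a more honest estimate of how the model generalizes.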
Avoid overfitting
• Regularization
from keras.regularizers import l2
model.add(Dense(number_of_neurons, activation='relu',
                kernel_regularizer=l2(0.001)))
• Dropout
from keras.layers import Dropout
model.add(Dropout(0.2))
• Batch normalization
from keras.layers import BatchNormalization
model.add(BatchNormalization())
8. Keras coding example (regression model)
import numpy as np
np.random.seed(1337)
from keras.models import Sequential
from keras.layers import Dense
import matplotlib.pyplot as plt
Create data set
# Create data set
X = np.linspace(-1, 1, 200)
np.random.shuffle(X) # Randomize the data set
Y = 0.5 * X + 2 + np.random.normal(0, 0.05, (200, )) # Suppose the real model is : Y=0.5X+2
plt.scatter(X, Y) # Draw data set
plt.show()
X_train, Y_train = X[:160], Y[:160] # Put the first 160 points in the training set
X_test, Y_test = X[160:], Y[160:] # Put the last 40 points in the test set
Define a model
# Define a model
'''
Keras has two types of models, Sequential and functional.
The more common is Sequential,
which is single-input-single-output
'''
model = Sequential()
'''
The model is built layer by layer through the add() method.
Dense is a fully connected layer.
The first layer needs to define the input,
while subsequent layers do not need to specify it
'''
model.add(Dense(units=1, input_dim=1))
'''
1. Training is required after defining the model,
but we need to specify some training parameters before training
2. Select the loss function and optimizer through the compile() method
3. Here, mean square error is used as the loss function,
and stochastic gradient descent is used as the optimization method
'''
model.compile(loss='mse', optimizer='sgd')
Train the model
# Train the model
print('Training -----------')
for step in range(301):
    # Keras has a number of functions to start training; here we use train_on_batch()
    cost = model.train_on_batch(X_train, Y_train)
    if step % 100 == 0:
        print('train cost: ', cost)
Test the model
# Test the model
print('\nTesting ------------')
cost = model.evaluate(X_test, Y_test, batch_size=40)
print('test cost:', cost)
'''
Check the trained network parameters.
Since our network has only one layer,
with one input and one output per sample,
the first layer learns the model Y=WX+b, where W and b are the trained parameters
'''
W, b = model.layers[0].get_weights()
print('Weights=', W, '\nbiases=', b)
Output forecast
# plotting the prediction
Y_pred = model.predict(X_test)
plt.scatter(X_test, Y_test)
plt.plot(X_test, Y_pred)
plt.show()
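Since the data were generated from Y=0.5X+2, the trained W and b should move toward 0.5 and 2. As a reference point, the sketch below computes the closed-form least-squares fit on the same training split with NumPy. It uses `np.polyfit` in place of Keras, so it shows the values gradient descent is heading toward rather than what the network reaches after 301 steps.

```python
import numpy as np

# Recreate the same data set and split as in the example above.
np.random.seed(1337)
X = np.linspace(-1, 1, 200)
np.random.shuffle(X)
Y = 0.5 * X + 2 + np.random.normal(0, 0.05, (200,))
X_train, Y_train = X[:160], Y[:160]

# Ordinary least squares fit of a degree-1 polynomial: returns [slope, intercept].
slope, intercept = np.polyfit(X_train, Y_train, deg=1)
print("least-squares W = %.3f, b = %.3f" % (slope, intercept))
```

If the Keras weights are still far from these values, training longer or adjusting the optimizer's learning rate are the usual things to try.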
Full code
import numpy as np
np.random.seed(1337)
from keras.models import Sequential
from keras.layers import Dense
import matplotlib.pyplot as plt

# Create data set
X = np.linspace(-1, 1, 200)
np.random.shuffle(X) # Randomize the data set
Y = 0.5 * X + 2 + np.random.normal(0, 0.05, (200, )) # Suppose the real model is : Y=0.5X+2
plt.scatter(X, Y) # Draw data set
plt.show()
X_train, Y_train = X[:160], Y[:160] # Put the first 160 points in the training set
X_test, Y_test = X[160:], Y[160:] # Put the last 40 points in the test set

# Define a model
model = Sequential()
model.add(Dense(units=1, input_dim=1))
model.compile(loss='mse', optimizer='sgd')

# Train the model
print('Training -----------')
for step in range(301):
    # Keras has a number of functions to start training; here we use train_on_batch()
    cost = model.train_on_batch(X_train, Y_train)
    if step % 100 == 0:
        print('train cost: ', cost)

# Test the model
print('\nTesting ------------')
cost = model.evaluate(X_test, Y_test, batch_size=40)
print('test cost:', cost)
W, b = model.layers[0].get_weights()
print('Weights=', W, '\nbiases=', b)

# plotting the prediction
Y_pred = model.predict(X_test)
plt.scatter(X_test, Y_test)
plt.plot(X_test, Y_pred)
plt.show()
Regression model training results
Embeddings in Word Prediction
Code Example (1): The Embedding Layer in Keras
Turns positive integers (indexes) into dense vectors of fixed size, e.g. [[4], [20]] ->
[[0.25, 0.1], [0.6, -0.2]]. This layer is used to learn an embedding from scratch.
model.compile('rmsprop', 'mse')
output_array = model.predict(input_array)
assert output_array.shape == (32, 10, 64)
Arguments
• input_dim: int > 0. Size of the vocabulary, i.e. maximum integer index + 1.
• output_dim: int >= 0. Dimension of the dense embedding.
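The mapping [[4], [20]] -> two dense vectors can be reproduced with a small sketch. It assumes a vocabulary of 50 indexes and 2-dimensional vectors (both chosen only for illustration) and the `tensorflow.keras` import path. The vectors are randomly initialized, so the actual numbers will differ from the ones quoted above.

```python
import numpy as np
from tensorflow import keras

# Assumed sizes: vocabulary of 50 indexes, 2-dimensional embedding vectors.
embedding = keras.layers.Embedding(input_dim=50, output_dim=2)
model = keras.Sequential([keras.Input(shape=(1,), dtype="int32"), embedding])

input_array = np.array([[4], [20]])
output_array = model.predict(input_array, verbose=0)   # shape (2, 1, 2)

# The layer is a lookup table: row i of the weight matrix is the vector for index i.
table = embedding.get_weights()[0]                     # shape (50, 2)
print(output_array[0, 0], table[4])
print(output_array[1, 0], table[20])
```

Each printed pair is identical, which shows that the layer does no arithmetic on the index itself. Training changes the rows of the table, not the lookup.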
Code Example (2): Embedding in a FeedForward
Network for Text Classification
model = keras.Sequential([
    keras.layers.Embedding(encoder.vocab_size, 16),
    keras.layers.GlobalAveragePooling1D(),
    keras.layers.Dense(1, activation='sigmoid')])
1. The first layer is an Embedding layer. This layer takes the integer-encoded
vocabulary and looks up the embedding vector for each word-index. These vectors
are learned as the model trains. The vectors add a dimension to the output array.
The resulting dimensions are: (batch, sequence, embedding).
2. Next, a GlobalAveragePooling1D layer returns a fixed-length output vector for each
example by averaging over the sequence dimension.
3. This fixed-length output vector is piped through a fully-connected (Dense) layer with
16 hidden units.
4. The last layer is densely connected with a single output node. Using
the sigmoid activation function, this value is a float between 0 and 1, representing a
probability, or confidence level.
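The shape changes described above can be checked with a runnable sketch that mirrors the three layers in the listing. `VOCAB_SIZE` is a hypothetical stand-in for `encoder.vocab_size`, the input batch is random integers rather than encoded text, and the `tensorflow.keras` import path is an assumption.

```python
import numpy as np
from tensorflow import keras

VOCAB_SIZE = 1000   # hypothetical stand-in for encoder.vocab_size

model = keras.Sequential([
    keras.Input(shape=(None,), dtype="int32"),     # variable-length sequences of word indexes
    keras.layers.Embedding(VOCAB_SIZE, 16),        # -> (batch, sequence, 16)
    keras.layers.GlobalAveragePooling1D(),         # -> (batch, 16), averaged over the sequence
    keras.layers.Dense(1, activation="sigmoid"),   # -> (batch, 1), value between 0 and 1
])

batch = np.random.randint(VOCAB_SIZE, size=(4, 12))   # 4 fake texts, 12 tokens each
probs = model.predict(batch, verbose=0)
print(probs.shape)
```

Averaging over the sequence dimension is what lets the same model accept texts of different lengths.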
Code Example (3): Embedding in a RNN Network
for Text Classification
• With one Bidirectional layer
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(encoder.vocab_size, 64),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
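A runnable version of this model, under stated assumptions, is sketched below. `VOCAB_SIZE` is a hypothetical replacement for `encoder.vocab_size`, and the input is random integers rather than encoded text. With the default `merge_mode`, `Bidirectional` concatenates the forward and backward LSTM outputs, so the following Dense layer receives 128 features.

```python
import numpy as np
import tensorflow as tf

VOCAB_SIZE = 1000   # hypothetical stand-in for encoder.vocab_size

model = tf.keras.Sequential([
    tf.keras.Input(shape=(None,), dtype="int32"),
    tf.keras.layers.Embedding(VOCAB_SIZE, 64),                 # -> (batch, sequence, 64)
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),   # forward + backward -> (batch, 128)
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),            # -> (batch, 1)
])

batch = np.random.randint(VOCAB_SIZE, size=(4, 12))   # 4 fake texts, 12 tokens each
probs = model.predict(batch, verbose=0)
print(probs.shape)
```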