A Gentle Introduction to LSTM Autoencoders



by Jason Brownlee on November 5, 2018 in Long Short-Term Memory Networks



Last Updated on August 27, 2020


An LSTM Autoencoder is an implementation of an autoencoder for sequence data using an Encoder-Decoder LSTM architecture.

Once fit, the encoder part of the model can be used to encode or compress sequence data that in turn may be used in data visualizations or as a feature vector input to a supervised learning model.

In this post, you will discover the LSTM Autoencoder model and how to implement it in Python using
Keras.
After reading this post, you will know:
Autoencoders are a type of self-supervised learning model that can learn a compressed representation of input data.
LSTM Autoencoders can learn a compressed representation of sequence data and have been used on video, text, audio, and time series sequence data.
How to develop LSTM Autoencoder models in Python using the Keras deep learning library.

Kick-start your project with my new book Long Short-Term Memory Networks With Python, including
step-by-step tutorials and the Python source code files for all examples.

Let’s get started.

A Gentle Introduction to LSTM Autoencoders.
Photo by Ken Lund, some rights reserved.

Overview
This post is divided into six sections; they are:

1. What Are Autoencoders?
2. A Problem with Sequences
3. Encoder-Decoder LSTM Models
4. What Is an LSTM Autoencoder?
5. Early Application of LSTM Autoencoder
6. How to Create LSTM Autoencoders in Keras
What Are Autoencoders?
An autoencoder is a neural network model that seeks to learn a compressed representation of an input.

They are an unsupervised learning method, although technically, they are trained using supervised
learning methods, referred to as self-supervised. They are typically trained as part of a broader model
that attempts to recreate the input.

For example:

X = model.predict(X)

The design of the autoencoder model purposefully makes this challenging by restricting the architecture to a bottleneck at the midpoint of the model, from which the reconstruction of the input data is performed.

There are many types of autoencoders, and their use varies, but perhaps the most common use is as a learned or automatic feature extraction model.

In this case, once the model is fit, the reconstruction aspect of the model can be discarded and the model up to the point of the bottleneck can be used. The output of the model at the bottleneck is a fixed-length vector that provides a compressed representation of the input data.

Input data from the domain can then be provided to the model and the output of the model at the bottleneck can be used as a feature vector in a supervised learning model, for visualization, or more generally for dimensionality reduction.
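To make the bottleneck idea concrete before turning to sequences, here is a minimal sketch (not from the original post) of a plain autoencoder in Keras, assuming 8 input features compressed to a 3-element code:

# a plain (non-LSTM) autoencoder with a bottleneck, for illustration only
from keras.models import Model
from keras.layers import Input, Dense

visible = Input(shape=(8,))
bottleneck = Dense(3, activation='relu')(visible)    # compressed representation
reconstructed = Dense(8)(bottleneck)                 # recreate the 8 inputs
autoencoder = Model(inputs=visible, outputs=reconstructed)
autoencoder.compile(optimizer='adam', loss='mse')
# autoencoder.fit(X, X, ...) trains the model to recreate its own input;
# afterwards, keep only the encoder half for feature extraction:
encoder = Model(inputs=visible, outputs=bottleneck)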
A Problem with Sequences

Sequence prediction problems are challenging, not least because the length of the input sequence can vary.

This is challenging because machine learning algorithms, and neural networks in particular, are designed to work with fixed-length inputs.

Another challenge with sequence data is that the temporal ordering of the observations can make it challenging to extract features suitable for use as input to supervised learning models, often requiring deep expertise in the domain or in the field of signal processing.
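On the variable-length challenge in particular, one common workaround (which Jason also suggests in the comments below) is to zero-pad all sequences to a common length and add a masking layer so the padded steps are ignored. A minimal sketch, with hypothetical data:

from numpy import array
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Masking, LSTM

# two sequences of different lengths (made-up values)
sequences = [[0.1, 0.2, 0.3], [0.1, 0.2, 0.3, 0.4, 0.5]]
# zero-pad at the end to the longest length
padded = pad_sequences(sequences, padding='post', dtype='float32')
padded = padded.reshape((len(sequences), padded.shape[1], 1))

model = Sequential()
# mask the zero-padded time steps so the LSTM skips them
model.add(Masking(mask_value=0.0, input_shape=(padded.shape[1], 1)))
model.add(LSTM(100, activation='relu'))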

Finally, many predictive modeling problems involving sequences require a prediction that itself is also a sequence. These are called sequence-to-sequence, or seq2seq, prediction problems.
You can learn more about sequence prediction problems here:
Making Predictions with Sequences

Encoder-Decoder LSTM Models


Recurrent neural networks, such as the Long Short-Term Memory, or LSTM, network, are specifically designed to support sequences of input data.

They are capable of learning the complex dynamics within the temporal ordering of input sequences as well as using an internal memory to remember information across long input sequences.

The LSTM network can be organized into an architecture called the Encoder-Decoder LSTM that allows the model both to support variable-length input sequences and to predict or output variable-length output sequences.

This architecture is the basis for many advances in complex sequence prediction problems such as
speech recognition and text translation.

In this architecture, an encoder LSTM model reads the input sequence step-by-step. After reading in the entire input sequence, the hidden state or output of this model represents an internal learned representation of the entire input sequence as a fixed-length vector. This vector is then provided as an input to the decoder model that interprets it as each step in the output sequence is generated.

You can learn more about the encoder-decoder architecture here:

Encoder-Decoder Long Short-Term Memory Networks

What Is an LSTM Autoencoder?


An LSTM Autoencoder is an implementation of an autoencoder for sequence data using an Encoder-Decoder LSTM architecture.
For a given dataset of sequences, an encoder-decoder LSTM is configured to read the input sequence, encode it, decode it, and recreate it. The performance of the model is evaluated based on the model’s ability to recreate the input sequence.
Once the model achieves a desired level of performance recreating the sequence, the decoder part of the model may be removed, leaving just the encoder model. This model can then be used to encode input sequences to a fixed-length vector.

The resulting vectors can then be used in a variety of applications, not least as a compressed representation of the sequence as an input to another supervised learning model.

Early Application of LSTM Autoencoder


One of the early and widely cited applications of the LSTM Autoencoder was in the 2015 paper titled “Unsupervised Learning of Video Representations using LSTMs.”

LSTM Autoencoder Model.
Taken from “Unsupervised Learning of Video Representations using LSTMs.”

In the paper, Nitish Srivastava, et al. describe the LSTM Autoencoder as an extension or application of the Encoder-Decoder LSTM.

They use the model with video input data to both reconstruct sequences of frames of video as well as to
predict frames of video, both of which are described as an unsupervised learning task.
The input to the model is a sequence of vectors (image patches or features). The encoder LSTM reads in this sequence. After the last input has been read, the decoder LSTM takes over and outputs a prediction for the target sequence.

— Unsupervised Learning of Video Representations using LSTMs, 2015.

More than simply using the model directly, the authors explore some interesting architecture choices that may help inform future applications of the model.
They designed the model in such a way as to recreate the target sequence of video frames in reverse order, claiming that it makes the optimization problem solved by the model more tractable.

The target sequence is same as the input sequence, but in reverse order. Reversing the target sequence makes the optimization easier because the model can get off the ground by looking at low range correlations.
— Unsupervised Learning of Video Representations using LSTMs, 2015.
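In terms of the Keras reconstruction example later in this post, this reversal would amount to a one-line change to the training target; a hypothetical sketch, not the paper’s code:

# train the decoder to output the input sequence in reverse order;
# sequence has shape [samples, timesteps, features]
seq_rev = sequence[:, ::-1, :]  # flip along the time axis
model.fit(sequence, seq_rev, epochs=300, verbose=0)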

They also explore two approaches to training the decoder model, specifically a version conditioned on the previous output generated by the decoder, and another without any such conditioning.
The decoder can be of two kinds – conditional or unconditioned. A conditional decoder receives the last generated output frame as input […]. An unconditioned decoder does not receive that input.

— Unsupervised Learning of Video Representations using LSTMs, 2015.

A more elaborate autoencoder model was also explored where two decoder models were used for the one encoder: one to predict the next frame in the sequence and one to reconstruct frames in the sequence, referred to as a composite model.

… reconstructing the input and predicting the future can be combined to create a composite
 […]. Here the encoder LSTM is asked to come up with a state from which we can both
predict the next few frames as well as reconstruct the input.

— Unsupervised Learning of Video Representations using LSTMs, 2015.

LSTM Autoencoder Model With Two Decoders.
Taken from “Unsupervised Learning of Video Representations using LSTMs.”

The models were evaluated in many ways, including using the encoder to seed a classifier. It appears that rather than using the output of the encoder as an input for classification, they chose to seed a standalone LSTM classifier with the weights of the encoder model directly. This is surprising given the complication of the implementation.

We initialize an LSTM classifier with the weights learned by the encoder LSTM from this
 model.

— Unsupervised Learning of Video Representations using LSTMs, 2015.
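In Keras terms, this weight-seeding idea might look something like the sketch below; the layer sizes, n_classes, and the fitted autoencoder handle are assumptions for illustration, not taken from the paper:

from keras.models import Sequential
from keras.layers import LSTM, Dense

n_in, n_classes = 9, 10  # hypothetical input length and number of classes
# assume `autoencoder` is a fitted reconstruction model whose first layer
# is an LSTM(100) encoder, as in the examples later in this post
classifier = Sequential()
classifier.add(LSTM(100, activation='relu', input_shape=(n_in, 1)))
classifier.add(Dense(n_classes, activation='softmax'))
# seed the classifier's LSTM with the trained encoder weights
classifier.layers[0].set_weights(autoencoder.layers[0].get_weights())
classifier.compile(optimizer='adam', loss='categorical_crossentropy')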

The composite model without conditioning on the decoder was found to perform the best in their experiments.

The best performing model was the Composite Model that combined an autoencoder and a
 future predictor. The conditional variants did not give any significant improvements in terms
of classification accuracy after fine-tuning, however they did give slightly lower prediction
errors.

— Unsupervised Learning of Video Representations using LSTMs, 2015.
Many other applications of the LSTM Autoencoder have been demonstrated, not least with sequences of text, audio data, and time series.

How to Create LSTM Autoencoders in Keras
Prediction in Keras
Creating an LSTM Autoencoder in Keras can be achieved by implementing an Encoder-Decoder LSTM architecture and configuring the model to recreate the input sequence.

Let’s look at a few examples to make this concrete.

Reconstruction LSTM Autoencoder


The simplest LSTM autoencoder is one that learns to reconstruct each input sequence.

For these demonstrations, we will use a dataset of one sample of nine time steps and one feature:

[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

We can start off by defining the sequence and reshaping it into the preferred shape of [samples, timesteps, features].
# define input sequence
sequence = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
# reshape input into [samples, timesteps, features]
n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))
Next, we can define the encoder-decoder LSTM architecture that expects input sequences with nine
time steps and one feature and outputs a sequence with nine time steps and one feature.

# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_in,1)))
model.add(RepeatVector(n_in))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')

Next, we can fit the model on our contrived dataset.

# fit model
model.fit(sequence, sequence, epochs=300, verbose=0)

The complete example is listed below.

The configuration of the model, such as the number of units and training epochs, was completely
arbitrary.

# lstm autoencoder recreate sequence
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.utils import plot_model
# define input sequence
sequence = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
# reshape input into [samples, timesteps, features]
n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_in,1)))
model.add(RepeatVector(n_in))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(sequence, sequence, epochs=300, verbose=0)
plot_model(model, show_shapes=True, to_file='reconstruct_lstm_autoencoder.png')
# demonstrate recreation
yhat = model.predict(sequence, verbose=0)
print(yhat[0,:,0])

Running the example fits the autoencoder and prints the reconstructed input sequence.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

The results are close enough, with very minor rounding errors.

[0.10398503 0.20047213 0.29905337 0.3989646  0.4994707  0.60005534
 0.70039135 0.80031013 0.8997728 ]
A plot of the architecture is created for reference.
LSTM Autoencoder for Sequence Reconstruction

Prediction LSTM Autoencoder


We can modify the reconstruction LSTM Autoencoder to instead predict the next step in the sequence.

In the case of our small contrived problem, we expect the output to be the sequence:

[0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

This means that the model will expect each input sequence to have nine time steps and the output
sequence to have eight time steps.

# reshape input into [samples, timesteps, features]
n_in = len(seq_in)
seq_in = seq_in.reshape((1, n_in, 1))
# prepare output sequence
seq_out = seq_in[:, 1:, :]
n_out = n_in - 1

The complete example is listed below.

# lstm autoencoder predict sequence
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.utils import plot_model
# define input sequence
seq_in = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
# reshape input into [samples, timesteps, features]
n_in = len(seq_in)
seq_in = seq_in.reshape((1, n_in, 1))
# prepare output sequence
seq_out = seq_in[:, 1:, :]
n_out = n_in - 1
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_in,1)))
model.add(RepeatVector(n_out))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
plot_model(model, show_shapes=True, to_file='predict_lstm_autoencoder.png')
# fit model
model.fit(seq_in, seq_out, epochs=300, verbose=0)
# demonstrate prediction
yhat = model.predict(seq_in, verbose=0)
print(yhat[0,:,0])
Start Machine Learning ×
Running the example prints the output sequence that predicts the next time step for each input time step.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

We can see that the model is accurate, barring some minor rounding errors.
[0.1657285  0.28903174 0.40304852 0.5096578  0.6104322  0.70671254
 0.7997272  0.8904342 ]

A plot of the architecture is created for reference.


LSTM Autoencoder for Sequence Prediction

Composite LSTM Autoencoder


Finally, we can create a composite LSTM Autoencoder that has a single encoder and two decoders, one for reconstruction and one for prediction.

We can implement this multi-output model in Keras using the functional API. You can learn more about the functional API in this post:

How to Use the Keras Functional API for Deep Learning

First, the encoder is defined.


# define encoder
visible = Input(shape=(n_in,1))
encoder = LSTM(100, activation='relu')(visible)

Then the first decoder that is used for reconstruction.

# define reconstruct decoder
decoder1 = RepeatVector(n_in)(encoder)
decoder1 = LSTM(100, activation='relu', return_sequences=True)(decoder1)
decoder1 = TimeDistributed(Dense(1))(decoder1)

Then the second decoder that is used for prediction.

# define predict decoder
decoder2 = RepeatVector(n_out)(encoder)
decoder2 = LSTM(100, activation='relu', return_sequences=True)(decoder2)
decoder2 = TimeDistributed(Dense(1))(decoder2)

We then tie the whole model together.

# tie it together
model = Model(inputs=visible, outputs=[decoder1, decoder2])
The complete example is listed below.
# lstm autoencoder reconstruct and predict sequence
from numpy import array
from keras.models import Model
from keras.layers import Input
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.utils import plot_model
# define input sequence
seq_in = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
# reshape input into [samples, timesteps, features]
n_in = len(seq_in)
seq_in = seq_in.reshape((1, n_in, 1))
# prepare output sequence
seq_out = seq_in[:, 1:, :]
n_out = n_in - 1
# define encoder
visible = Input(shape=(n_in,1))
encoder = LSTM(100, activation='relu')(visible)
# define reconstruct decoder
decoder1 = RepeatVector(n_in)(encoder)
decoder1 = LSTM(100, activation='relu', return_sequences=True)(decoder1)
decoder1 = TimeDistributed(Dense(1))(decoder1)
# define predict decoder
decoder2 = RepeatVector(n_out)(encoder)
decoder2 = LSTM(100, activation='relu', return_sequences=True)(decoder2)
decoder2 = TimeDistributed(Dense(1))(decoder2)
# tie it together
model = Model(inputs=visible, outputs=[decoder1, decoder2])
model.compile(optimizer='adam', loss='mse')
plot_model(model, show_shapes=True, to_file='composite_lstm_autoencoder.png')
# fit model
model.fit(seq_in, [seq_in,seq_out], epochs=300, verbose=0)
# demonstrate prediction
yhat = model.predict(seq_in, verbose=0)
print(yhat)

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

Running the example both reconstructs and predicts the output sequence, using both decoders.

[array([[[0.10736275],
        [0.20335874],
        [0.30020815],
        [0.3983948 ],
        [0.4985725 ],
        [0.5998295 ],
        [0.700336  ],
        [0.8001949 ],
        [0.89984304]]], dtype=float32),

 array([[[0.16298929],
        [0.28785267],
        [0.4030449 ],
        [0.5104638 ],
        [0.61162543],
        [0.70776784],
        [0.79992455],
        [0.8889787 ]]], dtype=float32)]
Picked for you:
A plot of the architecture is created for reference.
Composite LSTM Autoencoder for Sequence Reconstruction and Prediction

Keep Standalone LSTM Encoder


Regardless of the method chosen (reconstruction, prediction, or composite), once the autoencoder has been fit, the decoder can be removed and the encoder can be kept as a standalone model.

The encoder can then be used to transform input sequences to a fixed-length encoded vector.

We can do this by creating a new model that has the same inputs as our original model, and outputs directly from the end of the encoder model, before the RepeatVector layer.

# connect the encoder LSTM as the output layer
model = Model(inputs=model.inputs, outputs=model.layers[0].output)

A complete example of doing this with the reconstruction LSTM autoencoder is listed below.

# lstm autoencoder recreate sequence
from numpy import array
from keras.models import Sequential
from keras.models import Model
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.utils import plot_model
# define input sequence
sequence = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
# reshape input into [samples, timesteps, features]
n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_in,1)))
model.add(RepeatVector(n_in))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(sequence, sequence, epochs=300, verbose=0)
# connect the encoder LSTM as the output layer
model = Model(inputs=model.inputs, outputs=model.layers[0].output)
plot_model(model, show_shapes=True, to_file='lstm_encoder.png')
# get the feature vector for the input sequence
yhat = model.predict(sequence)
print(yhat.shape)
print(yhat)

Running the example creates a standalone encoder model that could be used or saved for later use.

We demonstrate the encoder by predicting the sequence and getting back the 100-element output of the encoder.
Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

Obviously, this is overkill for our tiny nine-step input sequence.


[[0.03625513 0.04107533 0.10737951 0.02468692 0.06771207 0.
  0.0696108  0.         0.         0.0688471  0.         0.
  0.         0.         0.         0.         0.         0.03871286
  0.         0.         0.05252134 0.         0.07473809 0.02688836
  0.         0.         0.         0.         0.         0.0460703
  0.         0.         0.05190025 0.         0.         0.11807001
  0.         0.         0.         0.         0.         0.
  0.         0.14514188 0.         0.         0.         0.
  0.02029926 0.02952124 0.         0.         0.         0.
  0.         0.08357017 0.08418129 0.         0.         0.
  0.         0.         0.09802645 0.07694854 0.         0.03605933
  0.         0.06378153 0.         0.05267526 0.02744672 0.
  0.06623861 0.         0.         0.         0.08133873 0.09208347
  0.03379713 0.         0.         0.         0.07517676 0.08870222
  0.         0.         0.         0.         0.03976351 0.09128518
  0.08123557 0.         0.08983088 0.0886112  0.         0.03840019
  0.00616016 0.0620428  0.         0.        ]]

A plot of the architecture is created for reference.

Standalone Encoder LSTM Model
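As a closing note on this example, the standalone encoder can be saved for later use and its output fed to any downstream supervised learner; a minimal sketch (file name hypothetical):

# persist the standalone encoder for later use
model.save('lstm_encoder.h5')
# the 100-element output can serve as a feature vector elsewhere
features = model.predict(sequence)  # shape (1, 100)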

Further Reading

This section provides more resources on the topic if you are looking to go deeper.

Making Predictions with Sequences
Encoder-Decoder Long Short-Term Memory Networks
Autoencoder, Wikipedia
Unsupervised Learning of Video Representations using LSTMs, ArXiv, 2015.
Unsupervised Learning of Video Representations using LSTMs, PMLR, PDF, 2015.
Unsupervised Learning of Video Representations using LSTMs, GitHub Repository.
Building Autoencoders in Keras, 2016.
How to Use the Keras Functional API for Deep Learning
Summary
In this post, you discovered the LSTM Autoencoder model and how to implement it in Python using Keras.

Specifically, you learned:

Autoencoders are a type of self-supervised learning model that can learn a compressed representation of input data.
LSTM Autoencoders can learn a compressed representation of sequence data and have been used on video, text, audio, and time series sequence data.
How to develop LSTM Autoencoder models in Python using the Keras deep learning library.
Do you have any questions?


Ask your questions in the comments below and I will do my best to answer.

Develop LSTMs for Sequence Prediction Today!

Develop Your Own LSTM models in Minutes

...with just a few lines of python code

Discover how in my new Ebook:
Long Short-Term Memory Networks with Python

It provides self-study tutorials on topics like:
CNN LSTMs, Encoder-Decoder LSTMs, generative models, data preparation, making predictions and much more...

Finally Bring LSTM Recurrent Neural Networks to Your Sequence Predictions Projects

Skip the Academics. Just Results.

SEE WHAT'S INSIDE


About Jason Brownlee


Jason Brownlee, PhD is a machine learning specialist who teaches developers how to get results
with modern machine learning methods via hands-on tutorials.
View all posts by Jason Brownlee →

261 Responses to A Gentle Introduction to LSTM Autoencoders
samaksh kumar November 5, 2018 at 11:35 am #

Nice Explained……….

Jason Brownlee November 5, 2018 at 2:27 pm #

Thanks.

ramar October 11, 2019 at 8:31 pm #

Very Good and Detailed representation of LSTM.

I have a csv file which contains 3000 values, when i run it in Google colab or jupyter notebook it was much slow What may be the reason?

Jason Brownlee October 12, 2019 at 6:56 am #

Thanks.

Perhaps try running on a faster machine, like EC2?
Perhaps try using a more efficient implementation?
Perhaps try using less training data?

Ali Alwehaibi November 6, 2018 at 8:16 am #

Thanks for the great posts! I have learn a lot from them.
Can this approach for classification problems such as sentiment analysis?

Jason Brownlee November 6, 2018 at 2:16 pm #

Perhaps.

TJ Chen November 7, 2018 at 1:21 pm #

Hi Jason,
Thanks for the posts, I really enjoy reading this.
I’m trying to use this method to do time series data anomaly detection and I got few questions here:
When you reshape the sequence into [samples, timesteps, features], samples and features always equal to 1. What is the guidance to choose the value here? If the input sequences have variable length, how to set timesteps, always choose max length?

Also, if the input is two dimension tabular data with each row has different length, how will you do the reshape or normalization?
Thanks in advance!
A Gentle Introduction to LSTM
Autoencoders
REPLY 
Jason Brownlee November 7, 2018 at 2:49 pm #

The time steps should provide enough history to make a prediction, the features are the
Loving the Tutorials?
observations recorded at each time step.
The LSTMs with Python EBook is
More on preparing data for LSTMs here:
where you'll find the Really Good stuff.
https://machinelearningmastery.com/faq/single-faq/how-do-i-prepare-my-data-for-an-lstm

>> SEE WHAT'S INSIDE

Soheila November 8, 2018 at 5:52 pm #

Hi,
I am wondering why the output of encoder has a much higher dimension(100), since we usually use encoders to create lower dimensions!

Could you please bring examples if I am wrong?

And what about variable length of samples? You keep saying that LSTM is useful for variable length. So how does it deal with a training set like:

dataX[0] = [1,2,3,4]
dataX[1] = [2,5,7,8,4]
dataX[2] = [0,3]

I am really confused with my second question and I’d be very thankful for your help!

Jason Brownlee November 9, 2018 at 5:19 am #

The model reproduces the output, e.g. a 1D vector with 9 elements.

You can pad the variable length inputs with 0 and use a masking layer to ignore the padded values.

SB December 25, 2018 at 5:31 pm #

“I am wondering why the output of encoder has a much higher dimension(100), since we usually use encoders to create lower dimensions!”, I have the same question, can you please explain more?
Jason Brownlee December 26, 2018 at 6:40 am #

It is a demonstration of the architecture only, feel free to change the model configuration for your specific problem.

J.V May 14, 2019 at 7:10 am #

Great article. But reading through it I thought you were tackling the most important problem with sequences – that is they have variable lengths. Turns out it wasn’t. Any chance you could write a tutorial on using a mask to neutralise the padded value? This seems to be more difficult than the rest of the model.
Jason Brownlee May 14, 2019 at 7:54 am #

Yes, I believe I have many tutorials on the topic.

Perhaps start here:
https://machinelearningmastery.com/handle-missing-timesteps-sequence-prediction-problems-python/

Ufanc November 9, 2018 at 6:17 pm #

I really likes your posts and they are important.I got a lot of knowledge from your post.
Today, am going to ask your help. I am doing research on local music classifications. the key features of the music is it sequence and it uses five keys out of the seven keys, we call it scale.

1. C – E – F – G – B. This is a major 3rd, minor 2nd, major 2nd, major 3rd, and minor 2nd
2. C – Db – F – G – Ab. This is a minor 2nd, major 3rd, major 2nd, minor 2nd, and major 3rd.
3. C – Db – F – Gb – A. This is a minor 2nd, major 3rd, minor 2nd, minor 3rd, and a minor 3rd.
4. C – D – E – G – A. This is a major 2nd, major 2nd, minor 3rd, major 2nd, and a minor 3rd

it is not dependent on range, rythm, melody and other features.

This key has to be in order. Otherwise it will be out of scale.

So, which tools /algorithm do i need to use for my research purpose and also any sampling mechanism to take 30 sec sample music from each track without affecting the sequence of the keys ?

Regards

Jason Brownlee November 10, 2018 at 5:59 am #

Perhaps try a suite of models and discover what works best for your specific dataset.

More here:
https://machinelearningmastery.com/faq/single-faq/what-algorithm-config-should-i-use

REPLY 
lungen
How to Use November 9, 2018 at 7:43
the TimeDistributed pm #in
Layer
Email Address
Keras
Hi, can you please explain the use of repeat vector between encoder and decoder?
Encoder is encoding 1-feature time-series into fixed length 100 vector. In my understanding, decoder
START MY EMAIL COURSE
shouldAtake thisIntroduction
Gentle 100-lengthtovector
LSTMand transform it into 1-feature time-series.
So, encoder is like
Autoencoders many-to-one lstm, and decoder is one-to-many (even though that ‘one’ is a vector of
length 100). Is this understanding correct?

Jason Brownlee November 10, 2018 at 6:01 am #

The RepeatVector repeats the internal representation of the input n times for the number of required output steps.
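For example, the shape transformation can be checked directly; a minimal sketch:

from keras.models import Sequential
from keras.layers import Dense, RepeatVector

model = Sequential()
model.add(Dense(100, input_shape=(5,)))  # 2D output: (None, 100)
model.add(RepeatVector(9))               # repeated for 9 output steps
print(model.output_shape)                # (None, 9, 100)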

Abraham May 10, 2019 at 4:02 am #

Hi Jason?

What is the intuition behind “representing of the input n times for the number of required output steps?” Here n times denotes, let say as in simple LSTM AE, 9 i.e. output step number.
I understand from repeatvector that here sequence are being read and transformed into a single vector(9×100) which is the same 100 dim vector, then the model uses that vector to reconstruct the original sequence. Is it right?
What about using any number except for 9 for the number of required output steps?
Thanks from now on.

Jason Brownlee May 10, 2019 at 8:19 am #

To provide input for the LSTM on each output time step for one sample.

rekha November 10, 2018 at 4:10 am #

Which model is most suited for stock market prediction

Jason Brownlee November 10, 2018 at 6:10 am #

None, a time series of prices is a random walk as far as I’ve read.

More here:
https://machinelearningmastery.com/faq/single-faq/can-you-help-me-with-machine-learning-for-finance-or-the-stock-market

MJ November 10, 2018 at 4:41 pm #

Hi,
thanks for the instructive post!

I am trying to repeat your first example (Reconstruction LSTM Autoencoder) using a different syntax of Keras; here is the code:

import numpy as np
from keras.layers import Input, LSTM, RepeatVector
from keras.models import Model

timesteps = 9
input_dim = 1
latent_dim = 100

# input placeholder
inputs = Input(shape=(timesteps, input_dim))

# “encoded” is the encoded representation of the input
encoded = LSTM(latent_dim, activation=’relu’)(inputs)

# “decoded” is the lossy reconstruction of the input
decoded = RepeatVector(timesteps)(encoded)
decoded = LSTM(input_dim, activation=’relu’, return_sequences=True)(decoded)

sequence_autoencoder = Model(inputs, decoded)
encoder = Model(inputs, encoded)

# compile model
sequence_autoencoder.compile(optimizer=’adadelta’, loss=’mse’)

# run model
sequence_autoencoder.fit(sequence, sequence, epochs=300, verbose=0)

# prediction
sequence_autoencoder.predict(sequence, verbose=0)

I did not know why, but I always get a poor result than the model using your code.
So my question is: is there any difference between the two method (syntax) under the hood? or they are actually the same ?
Thanks.
Jason Brownlee November 11, 2018 at 5:58 am #

If you have trouble with the code in the tutorial, confirm that your version of Keras is 2.2.4 or higher and TensorFlow is up to date.
J Hogue November 29, 2018 at 5:01 am #

I feel like a bit more description could go into how to setup the LSTM autoencoder. Particularly how to tune the bottleneck. Right now when I apply this to my data its basically just returning the mean for everything, which suggests the its too aggressive but I’m not clear on where to change things.
Jason Brownlee November 29, 2018 at 7:48 am #

Thanks for the suggestion.
Dimitre Oliveira December 10, 2018 at 11:59 am #
Hi Jason, thanks for the wonderful article, I took some time and wrote a kernel on Kaggle
inspired by your content, showing regular time-series approach using LSTM and another one using a
MLP but with features encoded by and LSTM autoencoder, as shown here, for anyone interested here’s
the link: https://www.kaggle.com/dimitreoliveira/time-series-forecasting-with-lstm-autoencoders

I would love some feedback.


Jason Brownlee December 10, 2018 at 2:17 pm #

Well done!

Simranjit Singh December 27, 2018 at 10:53 pm #

Hey! I am trying to compact the data single row of 217 rows. After running the program it is
returning nan values for prediction Can you guide me where did i do wrong?
Jason Brownlee December 28, 2018 at 5:57 am #

I have some advice here:


https://machinelearningmastery.com/faq/single-faq/can-you-read-review-or-debug-my-code


Net December 31, 2018 at 9:32 pm #

Dear Jason
After building and training the model above, how to evaluate the model? (like model.evaluate(train_x, train_y…) in common LSTM)?
Thanks a lot
Jason Brownlee January 1, 2019 at 6:15 am #

The model is evaluated by its ability to reconstruct the input. You can use the evaluate function or perform the evaluation of the predictions manually.
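For instance, with the reconstruction model defined earlier in the post; a minimal sketch:

# reconstruction MSE of the fitted autoencoder on its own input
loss = model.evaluate(sequence, sequence, verbose=0)
print(loss)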
Andy Hung January 2, 2019 at 6:28 am #

I have learned a lot from your website. Autoencoder can be used as dimension reduction. Is it possible to merge multiple time-series inputs into one using RNN autoencoder? My data shape is (9500, 20, 5) => (sample size, time steps, features). How to encode-decode into (9500, 20, 1)?
Thank you very much,

Jason Brownlee January 2, 2019 at 6:44 am #

Perhaps, that may require some careful design. It might be easier to combine all data to a single input.

Andy Hung January 3, 2019 at 3:47 am #

Thank you for your reply. Will (9500,100,1) => (9500,20,1) be easier?

Jason Brownlee January 3, 2019 at 6:14 am #

Perhaps test it and see.


Jimmy Joe January 5, 2019 at 9:41 am #

Hi Jason,
I’m a regular reader of your website, I learned a lot from your posts and books!
This one is also very informative, but there’s one thing I can’t fully understand: if the encoder input is [0.1, 0.2, …, 0.9] and the expected decoder output is [0.2, 0.3, …, 0.9], that’s basically a part of the input sequence. I’m not sure why you say it’s “predicting next step for each input step”. Could you please explain? Is an autoencoder a good fit for multi-step time series prediction?
Another question: does training the composite autoencoder imply that the error is averaged for both expected outputs ([seq_in, seq_out])?
Jason Brownlee January 6, 2019 at 10:15 am #
I am demonstrating two ways to learn the encoding, by reproducing the input and by
predicting the next step in the output.
Remember, the model outputs a single step at a time in order to construct a sequence.

Good question, I assume the reported error is averaged over both outputs. I’m not sure.

Junetae Kim January 27, 2019 at 4:12 pm #

Hi, I am JT.
First of all, thanks for your post that provides an excellent explanation of the concept of LSTM AE models and codes.
If I understand your AE model correctly, features from your LSTM AE vector layer [shape (,100)] does not seem to be time dependent.

So, I have tried to build a time-dependent AE layer by modifying your codes.

Could you check my codes whether my codes are correct to build an AE model that include a time-wise AE layer, if you don’t mind?

My codes are below.

from numpy import array
from keras.models import Model
from keras.layers import Input
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed

## Data generation
# define input sequence
seq_in = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
# reshape input into [samples, timesteps, features]
n_in = len(seq_in)
seq_in = seq_in.reshape((1, n_in, 1))
# prepare output sequence
seq_out = array([3, 5, 7, 9, 11, 13, 15, 17, 19])
seq_out = seq_out.reshape((1, n_in, 1))

## Model specification
# define encoder
visible = Input(shape=(n_in,1))
encoder = LSTM(60, activation=’relu’, return_sequences=True)(visible)

# AE Vector
AEV = LSTM(30, activation=’relu’, return_sequences=True)(encoder)

# define reconstruct decoder
decoder1 = LSTM(60, activation=’relu’, return_sequences=True)(AEV)
decoder1 = TimeDistributed(Dense(1))(decoder1)

# define predict decoder
decoder2 = LSTM(30, activation=’relu’, return_sequences=True)(AEV)
decoder2 = TimeDistributed(Dense(1))(decoder2)

# tie it together
model = Model(inputs=visible, outputs=[decoder1, decoder2])
model.summary()
model.compile(optimizer=’adam’, loss=’mse’)

# fit model
model.fit(seq_in, [seq_in,seq_out], epochs=2000, verbose=2)

## The model that feeds seq_in to predict seq_out
hat1 = model.predict(seq_in)

## The model that feeds seq_in to predict AE Vector values
model2 = Model(inputs=model.inputs, outputs=model.layers[2].output)
hat_ae = model2.predict(seq_in)

## The model that feeds AE Vector values to predict seq_out
input_vec = Input(shape=(n_in,30))
dec2 = model.layers[4](input_vec)
dec2 = model.layers[6](dec2)
model3 = Model(inputs=input_vec, outputs=dec2)
hat_ = model3.predict(hat_ae)

Thank you very much

Jason Brownlee January 28, 2019 at 7:11 am #

I’m happy to answer questions, but I don’t have the capacity to review and debug your
code, sorry.

Anirban Ray January 28, 2019 at 5:33 pm #

Thanks for the nice post. Being a beginner in machine learning, your posts are really helpful.

I want to build an auto-encoder for data-set of names of a large number of people. I want to encode the entire field instead of doing it character-wise, for example [“Neil Armstrong”] instead of [“N”, “e”, “i”, “l”, ” “, “A”, “r”, “m”, “s”, “t”, “r”, “o”, “n”, “g”] or [“Neil”, “Armstrong”]. How can I do it?

Jason Brownlee January 29, 2019 at 6:09 am #

Wrap your list of strings in one more list/array.
How to Develop an Encoder-Decoder You can master applied Machine Learning
Model with Attention in 15,
Keras REPLY 
Benjamin February 2019 at 11:05 am # without math or fancy degrees.
Find out how in this free and practical course.
Hey, thanks for the post, I have found it helpful… although I am confused about one, in my opinion, major point.

– If autoencoders are used to obtain a compressed representation of the input, what is the purpose of taking the output after the encoder part if it is now 100 elements instead of 9? I’m struggling to find a meaning of the 100-element data and how one could use this 100-element data to predict anomalies. It sort of seems like doing the exact opposite of what was stated in the explanation prior to the example. An explanation would be greatly appreciated.

– In the end I’m trying to really understand how, after learning the weights by minimizing the reconstruction error of the training set using the AE, to then use this trained model to predict anomalies in the cross-validation and test sets.
Jason Brownlee February 15, 2019 at 2:22 pm #

It is just a demonstration, perhaps I could have given a better example.

For example, you could scale up the input to be 1,000 inputs that are summarized with 10 or 100 values.

Ahmad May 19, 2019 at 10:22 pm #

Hi Jason, Benjamin is right. The last example you provided uses a standalone LSTM encoder. The input sequence is 9 elements, but the output of the encoder is 100 elements, despite the first part of the tutorial explaining that the encoder part compresses the input sequence and can be used as a feature vector. I am also confused about how an output of 100 elements can be used as a feature representation of 9 input elements. A more detailed explanation would help. Thank you!

Jason Brownlee May 20, 2019 at 6:29 am #

Thanks for the suggestion.


Cloudy February 17, 2019 at 7:51 pm #

Thank you for your great post.


As you mentioned in the first section, “Once fit, the encoder part of the model can be used to encode or compress sequence data that in turn may be used as a feature vector input to a supervised learning model”. I fed the feature vector (encoder part), with n_dimensions=50, to a feedforward neural network with 1 hidden layer:
def get_model(n_dimensions):
    inputs = Input(shape=(timesteps, input_dim))
    encoded = LSTM(n_dimensions, return_sequences=False, name="encoder")(inputs)
    decoded = RepeatVector(timesteps)(encoded)
    decoded = LSTM(input_dim, return_sequences=True, name='decoder')(decoded)
    decoded = TimeDistributed(Dense(features_n))(decoded)

    autoencoder = Model(inputs, decoded)
    encoder = Model(inputs, encoded)
    mid = Dense(num_units, activation='relu')(encoded)  # FFNN (1 in, 1 hid, 1 out)
    out = Dense(num_classes, activation='softmax')(mid)  # FFNN
    full_model = Model(inputs, out)
    return autoencoder, encoder, full_model

autoencoder, encoder, full_model = get_model(n_dimensions)
history = autoencoder.fit(train_x, train_x, batch_size=100, epochs=epochs, validation_data=(val_x, val_y))
train_encoded = encoder.predict(train_x)
val_encoded = encoder.predict(train_x)
history_class = full_model.fit(train_encoded, train_y, epochs=3, batch_size=256, validation_data=(val_encoded, val_y))

Error when fit(): ValueError: Error when checking input: expected input_1 to have 3 dimensions, but got array with shape (789545, 50).

I mix the autoencoder with an FFNN. Is my method right? Can you help me shape the feature vector before feeding it to the FFNN?

Jason Brownlee February 18, 2019 at 6:30 am #
Change the LSTM to not return sequences in the decoder.
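For example, a minimal sketch of one way to avoid the shape mismatch (an assumption about the intent, with hypothetical sizes): give the classifier its own 2-D input so it can consume the encoder's output directly:

from keras.models import Model
from keras.layers import Input, Dense

n_dimensions, num_units, num_classes = 50, 32, 10  # hypothetical sizes

# classifier that consumes the 2-D encoded vectors, shape (samples, n_dimensions)
enc_in = Input(shape=(n_dimensions,))
mid = Dense(num_units, activation='relu')(enc_in)
out = Dense(num_classes, activation='softmax')(mid)
classifier = Model(enc_in, out)
classifier.compile(loss='categorical_crossentropy', optimizer='adam')

# classifier.fit(encoder.predict(train_x), train_y, ...)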

Cloudy February 19, 2019 at 6:14 pm #

Thanks for your quick reply.


In your post, the LSTM layer at the decoder is set with “return_sequences=True”, and I followed that and then got the error you saw. Actually, I thought the decoder is not a stacked LSTM (only 1 LSTM layer), so “return_sequences=False” would be suitable. I changed it as you recommended. Another error:
decoded = TimeDistributed(Dense(features_n))(decoded)
File “/usr/local/lib/python3.4/dist-packages/keras/engine/topology.py”, line 592, in __call__
self.build(input_shapes[0])
File “/usr/local/lib/python3.4/dist-packages/keras/layers/wrappers.py”, line 164, in build
assert len(input_shape) >= 3
AssertionError
Can you give me any advice? Thank you.

Jason Brownlee February 20, 2019 at 7:52 am #

I’m not sure about this error, sorry. Perhaps post the code and error to StackOverflow, or try debugging?
Cloudy February 22, 2019 at 7:45 pm #
Hi Jason,
I found another way to build full_model. I don’t use autoencoder.predict(train_x) as the input to full_model. I used the original inputs, saved the weights of the encoder part of the autoencoder model, then set those weights on the encoder part of full_model. Something like this:
autoencoder.save_weights('autoencoder.h5')
for l1, l2 in zip(full_model.layers[:a], autoencoder.layers[0:a]):  # a: the number of layers in the encoder part
    l1.set_weights(l2.get_weights())
# train full_model:
full_model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
history_class = full_model.fit(train_x, train_y, epochs=2, batch_size=256, validation_data=(val_x, val_y))
My full_model runs, but the results are so bad. Hic, train 0%, test/val: 100%.


Jason Brownlee February 23, 2019 at 6:31 am #

Interesting, sounds like more debugging might be required.


Anshuman Singh Bhadauria February 21, 2019 at 12:20 am #

Hi Jason,

Thank you for putting in the effort of writing the posts, they are very helpful.

Is it possible to learn representations of multiple time series at the same time? By multiple time-series I
don’t mean multivariate.

For example, if I have time series data from 10 sensors, how can I feed them simultaneously to obtain 10
representations, not a combined one.


Best,
Anshuman

Jason Brownlee February 21, 2019 at 8:13 am #
Yes, each series would be a different sample used to train one model.
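For example, a minimal sketch of that framing (hypothetical shapes): ten sensors, each a univariate series, become ten samples for one model:

from numpy import random

series = random.rand(10, 1000)  # hypothetical: 10 sensors, 1000 steps each
# reshape to [samples, timesteps, features]: each sensor's series is one sample
X = series.reshape((10, 1000, 1))
# encoding each sample afterwards yields one representation per sensor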
JasOlean March 5, 2019 at 9:42 pm #
Hi Jason,

I use the MinMaxScaler function to normalize my training and testing data.
After that I did some training to get a model.h5 file.
And then I use this file to predict on my testing data.

After that I got some prediction results in the range (0,1).

I reverse back to my original data using the inverse_transform function from MinMaxScaler.
But when I compare my original data (before scaling) with my prediction data, the x,y coordinates are swapped, like this:

Ori_data = [7.6291,112.74,43.232,96.636,61.033,87.311,91.55,115.28,121.22,136.48,119.52,80.53,172.08,77.987,199.21,94.94,228.03,110.2,117.83,104.26,174.62,103.42,211.92,109.35,204.29,122.91,114.44,125.46,168.69,124.61,194.97,134.78,173.77,141.56,104.26,144.11,125.46,166.99,143.26,185.64,165.3,205.14]

Predicted_data = [94.290375, 220.07372, 112.91617, 177.89548, 133.5322, 149.65489, 161.85602, 99.74797, 178.18903, 60.718987, 86.012276, 113.3682, 111.641655, 90.18026, 134.16464, 82.28861, 155.12575, 78.26058, 99.82883, 145.162, 98.78825, 98.62861, 130.25414, 62.43494, 143.52762, 74.574684, 99.36809, 169.79303, 107.395615, 131.40468, 124.29078, 114.974014, 135.11014, 107.4492, 90.64477, 188.39305, 121.55309, 174.63484, 138.58575, 167.6933, 144.91512, 162.34071]

When I visualize these prediction data on my image, the direction is changed by 90 degrees (i.e. the original data is horizontal but the prediction data is vertical).
Why do I face this and how can I fix it?

Jason Brownlee March 6, 2019 at 7:54 am #

You must ensure that the columns match when calling transform() and inverse_transform().

See this tutorial:


https://machinelearningmastery.com/machine-learning-data-transforms-for-time-series-forecasting/
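As an illustration, a minimal sketch (hypothetical two-column data, not from the post) of keeping the column order consistent in both directions:

from numpy import array
from sklearn.preprocessing import MinMaxScaler

# hypothetical (x, y) coordinate pairs
data = array([[7.6, 112.7], [43.2, 96.6], [61.0, 87.3]])

scaler = MinMaxScaler()
scaled = scaler.fit_transform(data)  # fit on columns in [x, y] order

# stand-in for model predictions, still in [x, y] order
predictions = scaled

# inverse_transform assumes the same column order used during fit;
# swapping x and y here is what makes the points look rotated by 90 degrees
restored = scaler.inverse_transform(predictions)
print(restored)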



saria March 11, 2019 at 6:09 am #

Hi Jason,

Thank you so much for your great post. I wish you had done this with a real data set like the 20 Newsgroups data set.
At first it is not clear what the different ways of preparing the data for different objectives are.
My understanding is that with an LSTM Autoencoder we can prepare data in different ways based on the goal. Am I correct?
Or can you please give me a link that prepares text data like 20 Newsgroups for this kind of model?
Again, thanks for your awesome material.
Jason Brownlee March 11, 2019 at 6:58 am #
If you are working with text data, perhaps start here:
https://machinelearningmastery.com/start-here/#nlp



saria March 11, 2019 at 3:06 pm #
Thank you so much Jason for the link. I have already gone through lots of material, including, in detail, the mini course at the mentioned link months ago.
My problem is mainly the label data here.
For example, in your code, in the reconstruction part, you have given the sequence as both data and label. However, in the prediction part you have given seq_in and seq_out as the data and the label, and their difference is that seq_out looks one timestep forward.
My question, according to your example, is: if I want to use this LSTM autoencoder for the purpose of topic modeling, do I need to follow the reconstruction part, since I don’t need any prediction?



saria March 12, 2019 at 1:31 am #

I think I got my answer. Thanks Jason

Jason Brownlee March 12, 2019 at 6:43 am #

No. Perhaps this model would be more useful as a starting point:


https://machinelearningmastery.com/develop-word-embedding-model-predicting-movie-review-sentiment/



saria March 11, 2019 at 6:35 am #

By different objectives I meant, for example, if we use this architecture for topic modeling, or sequence generation, or …, should the data preparation be different?



Mingkuan Wu March 14, 2019 at 9:27 am #

Thanks for your post! When you use RepeatVectors(), I guess you are using the unconditional
decoder, am I right?
Jason Brownlee March 14, 2019 at 9:32 am #
I guess so. Are you referring to a specific model in comparison?
rekha March 18, 2019 at 4:47 am #

Thanks for the post. Can this be formulated as a sequence prediction research problem?
Jason Brownlee March 18, 2019 at 6:08 am #

The demonstration is a sequence prediction problem.

Kristian March 20, 2019 at 11:32 pm #

Hi Jason,

What a fantastic tutorial!

I have a question about the loss function used in the composite model.
Say you have different loss functions for the reconstruction and the prediction/classification parts, and pre-train the reconstruction part.

In Keras, would it be possible to combine these two loss functions into one when training the model, such that the model does not lose or diminish its reconstruction ability while training the prediction/classification part?

If so, could you please point me in the right direction.

Kind regards

Kristian



Jason Brownlee March 21, 2019 at 8:16 am #

Yes, great question!

You can specify a list of loss functions to use for each output of the network.
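For example, a minimal sketch (hypothetical data and layer sizes) of a composite model compiled with one loss per output; loss_weights balances the objectives so one head does not dominate:

from numpy import array
from keras.models import Model
from keras.layers import Input, LSTM, Dense, RepeatVector, TimeDistributed

n_in = 9
seq_in = array([0.1 * i for i in range(1, n_in + 1)]).reshape((1, n_in, 1))
labels = array([[1.0]])  # hypothetical classification target

visible = Input(shape=(n_in, 1))
encoder = LSTM(100, activation='relu')(visible)

# reconstruction head
dec1 = RepeatVector(n_in)(encoder)
dec1 = LSTM(100, activation='relu', return_sequences=True)(dec1)
dec1 = TimeDistributed(Dense(1))(dec1)

# classification head
dec2 = Dense(1, activation='sigmoid')(encoder)

model = Model(inputs=visible, outputs=[dec1, dec2])

# one loss per output, in the order the outputs were defined
model.compile(optimizer='adam',
              loss=['mse', 'binary_crossentropy'],
              loss_weights=[1.0, 0.5])
model.fit(seq_in, [seq_in, labels], epochs=10, verbose=0)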
El March 26, 2019 at 4:54 am #

Dear,
How to Develop an Encoder-Decoder
Would Model
it makeforsense to set statefull = true on the LSTMs layers of an encoder decoder?
Sequence-to-Sequence

Thanks
Prediction in Keras
Jason Brownlee March 26, 2019 at 8:12 am #

It really depends on whether you want control over when the internal state is reset, or not.
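For example, a minimal sketch of taking that control (hypothetical data and sizes): with stateful=True, state persists across batches until you reset it yourself:

from numpy import random
from keras.models import Sequential
from keras.layers import LSTM, Dense

X = random.rand(8, 9, 1)  # hypothetical sequences
y = random.rand(8, 1)

# stateful LSTMs require a fixed batch size
model = Sequential()
model.add(LSTM(32, stateful=True, batch_input_shape=(1, 9, 1)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

for epoch in range(5):
    model.fit(X, y, epochs=1, batch_size=1, shuffle=False, verbose=0)
    model.reset_states()  # you decide when the internal state is cleared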



saria March 28, 2019 at 3:34 am #
Thank you, Jason, but still, I have not got the answer to my question.
Let’s put it another way: what is the latent space in this model? Is it only a compressed version of the input data?
Do you think that if I use a many-to-one architecture, I will have one word representation for each sequence of data?
Why am I able to print out the clusters of the topics in an autoencoder easily, but when it comes to this architecture I am lost!

Jason Brownlee March 28, 2019 at 8:22 am #

In some sense, yes, but a one-value representation is an aggressive projection/compression of the input and may not be useful.

What problem are you having exactly?

JohnAlex March 28, 2019 at 2:01 pm #

Hi Jason,
I appreciate your tutorial!
Now I’m implementing the paper “Unsupervised Learning of Video Representations using LSTMs”, but my result is not very good. The predicted pictures are blurred, not as good as the paper’s result.
(You can see my result here:


https://i.loli.net/2019/03/28/5c9c374d68af2.jpg
https://i.loli.net/2019/03/28/5c9c37af98c65.jpg)

I don’t think there is any difference between my Keras model and the paper’s model. But the problem has confused me for 2 weeks, and I cannot find a good solution. I would really appreciate your help!

This is my Keras model’s code:
Input_seqlen = 10
Output_seqlen = 10
Pic_size = 64
Channels = 1
Num_units = 2048

def Basic_Encoder_Decoder(input_img_shape=(Input_seqlen, Pic_size, Pic_size, Channels)):
    encoder_x = keras.layers.Input(shape=input_img_shape[1:])
    encoder_flatten = keras.layers.Flatten()(encoder_x)
    conv_model = keras.Model(encoder_x, encoder_flatten)

    decode_in = keras.layers.Input(shape=(Num_units,))
    decode_dense = keras.layers.Dense(Pic_size*Pic_size, activation='sigmoid')(decode_in)
    decode_reshape = keras.layers.Reshape(target_shape=(Pic_size, Pic_size, Channels))(decode_dense)
    decode_model = keras.Model(decode_in, decode_reshape)

    # TensorShape([Dimension(None), Dimension(10), Dimension(64), Dimension(64), Dimension(1)])
    Encoder_inp = keras.layers.Input(shape=input_img_shape)
    Encode_seq = keras.layers.TimeDistributed(conv_model)(Encoder_inp)

    Encoder_Lstm, Encoder_h, Encoder_c = keras.layers.LSTM(units=Num_units,
        return_sequences=False, return_state=True, dropout=0.2)(Encode_seq)

    copylayer = keras.layers.RepeatVector(Output_seqlen)(Encoder_h)
    decoder_lstm, _, _ = keras.layers.LSTM(units=Num_units, return_sequences=True,
        return_state=True, dropout=0.2, activation='relu')(copylayer)
    Decode_seq = keras.layers.TimeDistributed(decode_model)(decoder_lstm)

    cae = keras.Model(Encoder_inp, Decode_seq)

    rms = keras.optimizers.RMSprop(lr=0.001)

    def ae_loss(inp, outp):
        inp = K.flatten(inp)
        outp = K.flatten(outp)
        xent_loss = keras.losses.binary_crossentropy(inp, outp)
        return xent_loss
    cae.compile(optimizer=rms, loss=ae_loss)


Jason Brownlee March 28, 2019 at 2:43 pm #

Sounds like a great project!

Sorry, I don’t have the capacity to debug your code, I have some suggestions here though:
https://machinelearningmastery.com/faq/single-faq/can-you-read-review-or-debug-my-code

JohnAlex March 28, 2019 at 10:33 pm #


Thanks for your reply.

And I want to know what may cause the output image to be blurred, in your experience? Thank you~

Jason Brownlee March 29, 2019 at 8:34 am #
In what context exactly?

Birish April 5, 2019 at 2:26 am #
How can I use the cell state of this “Standalone LSTM Encoder” model as an input layer for another model? Suppose in your code for “Keep Standalone LSTM Encoder” you had the “return_state=True” option on the encoder LSTM layer and created the model like:

model = Model(inputs=model.inputs, outputs=[model.layers[0].output, hidden_state, cell_state])

Then one can retrieve the cell state by: model.outputs[2]
The problem is that this will return a “Tensor”, and Keras complains that it only accepts an “Input Layer” as an input for ‘Model()’. How can I feed this cell state to another model as input?
Jason Brownlee April 5, 2019 at 6:20 am #

I think it would be odd to use the cell state as an input, I’m not sure I follow what you want to do.
Nevertheless, you can use Keras to evaluate the tensor, get the data, create a numpy array, and provide it as input to the model.

Also, this may help:
https://machinelearningmastery.com/return-sequences-and-return-states-for-lstms-in-keras/
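For example, a minimal sketch (hypothetical shapes) of pulling the states out with return_state=True and handing the cell state to a second model as plain numpy data:

from numpy import random
from keras.models import Model
from keras.layers import Input, LSTM, Dense

X = random.rand(4, 9, 1)  # hypothetical sequences

# encoder that also returns its final hidden and cell states
inputs = Input(shape=(9, 1))
outputs, state_h, state_c = LSTM(100, return_state=True)(inputs)
encoder = Model(inputs, [outputs, state_h, state_c])

_, h, c = encoder.predict(X)  # h and c are numpy arrays of shape (4, 100)

# a second model that consumes the cell state as an ordinary input
vec_in = Input(shape=(100,))
out = Dense(1)(vec_in)
downstream = Model(vec_in, out)
yhat = downstream.predict(c)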

Birish April 5, 2019 at 4:21 pm #

That’s the approach used in this paper: https://arxiv.org/pdf/1709.01907.pdf

“After the encoder-decoder is pre-trained, it is treated as an intelligent feature-extraction blackbox. Specifically, the last LSTM cell states of the encoder are extracted as learned embedding. Then, a prediction network is trained to forecast the next one or more timestamps using the learned embedding as features.”

They trained an LSTM autoencoder and fed the last cell states of the last encoder layer to another model. Did I misunderstand it?



Jason Brownlee April 6, 2019 at 6:40 am #

Sounds odd, perhaps confirm with the authors that they are not referring to hidden
states (outputs) instead?

George April 11, 2019 at 1:07 am #

Hello Jason,

Is there any way to stack the LSTM autoencoder?
for example:

model = Sequential()
model.add(LSTM(units, activation=activation, input_shape=(n_in, d)))
model.add(RepeatVector(n_in))
model.add(LSTM(units/2, activation=activation))
model.add(RepeatVector(n_in))
model.add(LSTM(units/2, activation=activation, return_sequences=True))
model.add(LSTM(units, activation=activation, return_sequences=True))
model.add(TimeDistributed(Dense(d)))
Is this a correct approach?
Do you see any benefits from stacking the autoencoder?

Jason Brownlee April 11, 2019 at 6:43 am #
I have never seen something like this.

Leland Hepworth September 14, 2019 at 5:53 am #

Hi George,
Stacked encoders/decoders with a narrowing bottleneck are used in a tutorial on the Keras website in the section “Deep autoencoder”:

https://blog.keras.io/building-autoencoders-in-keras.html

The tutorial claims that the deeper architecture gives slightly better results than the more shallow
model definition in the previous example. This tutorial uses simple dense layers in its models, so I
wonder if something similar could be done with LSTM layers.
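A minimal sketch of that idea with LSTM layers (an untested assumption, mirroring the narrowing-bottleneck pattern):

from keras.models import Sequential
from keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

n_in, d = 9, 1
model = Sequential()
# encoder: 64 units down to a 16-unit bottleneck
model.add(LSTM(64, activation='relu', return_sequences=True, input_shape=(n_in, d)))
model.add(LSTM(16, activation='relu'))
model.add(RepeatVector(n_in))
# decoder: mirror the encoder back out
model.add(LSTM(16, activation='relu', return_sequences=True))
model.add(LSTM(64, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(d)))
model.compile(optimizer='adam', loss='mse')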

Jason Brownlee September 14, 2019 at 6:25 am #

Thanks for sharing.

Rojin April 12, 2019 at 3:36 am #

I have a theoretical question about autoencoders. I know that autoencoders are supposed to reconstruct the input at the output, and by doing so they will learn a lower-dimensional representation of the input. Now I want to know if it is possible to use autoencoders to construct something else at the output (let’s say something that is a modified version of the input).



Jason Brownlee April 12, 2019 at 7:53 am #
Sure.
Perhaps check out conditional generative models, VAEs, GANs.

Rojin April 12, 2019 at 10:35 am #
Thanks for the response. I will check those out. I thought about denoising autoencoders, but was not sure if they are applicable to my situation.
Let’s say that I have two versions of a feature vector: one is X, and the other one is X’, which has some meaningful noise (technically not noise, but meaningful information). Now my question is whether it is appropriate to use denoising autoencoders in this case to learn the transition from X to X’?
Loving the Tutorials?

The LSTMs with Python EBook is


where you'll find theJason
Really Brownlee
Good stuff. April 12, 2019 at 2:43 pm # REPLY 

Conditional GANs do this for image-to-image translation.

Nick April 16, 2019 at 4:18 pm #
Hi Jason, could you explain the difference between RepeatVector and return_sequences?
It looks like they both repeat a vector several times, but what’s the difference?
Can we just use return_sequences in the last LSTM encoder layer and not use RepeatVector before the first LSTM decoder layer?

Jason Brownlee April 17, 2019 at 6:53 am #

Yes, they are very different in operation, but similar in effect.

The “return_sequences” argument returns the LSTM layer outputs for each input time step.

The “RepeatVector” layer copies the output from the LSTM for the last input time step and repeats it n times.
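A minimal sketch (hypothetical sizes) that makes the two behaviours visible:

from numpy import random
from keras.models import Sequential
from keras.layers import LSTM, RepeatVector

X = random.rand(1, 9, 1)

# return_sequences=True: one output per input time step -> (1, 9, 100)
m1 = Sequential([LSTM(100, return_sequences=True, input_shape=(9, 1))])
print(m1.predict(X).shape)

# RepeatVector: the final output repeated n times -> (1, 5, 100)
m2 = Sequential([LSTM(100, input_shape=(9, 1)), RepeatVector(5)])
print(m2.predict(X).shape)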

Nick April 18, 2019 at 3:44 am #

Thank you, Jason, now I understand the difference between them. But here is another question: can we do it like this:
'''
encoder = LSTM(100, activation='relu', input_shape=(n_in,1), return_sequences=True)
(no RepeatVector layer here, but return_sequences is True in the encoder layer)
decoder = LSTM(100, activation='relu', return_sequences=True)(encoder)
decoder = TimeDistributed(Dense(1))(decoder)
'''

If yes, what’s the difference between this one and the one you shared (with a RepeatVector layer between encoder and decoder, but return_sequences False in the encoder layer)?

Jason Brownlee April 18, 2019 at 8:54 am #
The repeat vector allows the decoder to use the same representation when creating each output time step.

A stacked LSTM is different in that it will read all input time steps before formulating an
output, and in your case, will output an activation for each input time step.
There’s no “best” way, test a suite of models for your problem and use whatever works best.


Nick April 18, 2019 at 2:53 pm #

Thank you for answering my questions.



TS May 6, 2019 at 3:27 pm #

Dear Sir,

One point I would like to mention is the unconditioned model that Srivastava et al. use.
a) They do not supply any inputs to the decoder model. Is this tutorial only using the conditioned model?

b) Even if we are using either of the 2 models mentioned in the paper, we should be passing the hidden state, or maybe even the cell state, of the encoder model to the decoder's first time step, and not to all the time steps.

The tutorial over here shows that the repeat vector supplies inputs to all the time steps in the decoder model, which should not be the case in either of the models.

Also, the target time steps in the auto-reconstruction decoder model should have been reversed.

Please correct me if I am wrong in understanding the paper. I await your clarification. Thank you in advance.



Jason Brownlee May 7, 2019 at 6:11 am #

Perhaps.

You can consider the implementation inspired by the paper, not a direct re-implementation.
TS May 9, 2019 at 1:16 am #
Thank you for the clarification. Thank you for the post, it helped.

Jason Brownlee May 9, 2019 at 6:46 am #
You’re welcome.

Xiaoyang Ruan May 22, 2020 at 9:48 am #

My understanding is that the RepeatVector function utilizes a more “dense” representation of the original inputs. For an encoder LSTM with 100 hidden units, all information is compressed into a 100-element vector (which is then duplicated by RepeatVector for the desired number of output timesteps). With return_sequences=True it is a totally different scenario: you end up with 100 x (input time steps) latent variables. It is more like a sparse autoencoder. Correct me if I am wrong.

Taraka Rama April 18, 2019 at 5:47 pm #

Hi Jason,

The blog is very interesting. A paper that I published sometime ago uses LSTM autoencoders for
German and Dutch dialect analysis.

Best,
Taraka

Jason Brownlee April 19, 2019 at 6:04 am #

Thanks.

Taraka Rama April 18, 2019 at 5:48 pm #
Hi Jason,

(Forgot to paste the paper link)
The blog is very interesting. A paper that I published some time ago uses LSTM autoencoders for German and Dutch dialect analysis.

https://www.aclweb.org/anthology/W16-4803

Best,
Taraka
Jason Brownlee April 19, 2019 at 6:05 am #
Thanks for sharing.

Geralt Xu May 4, 2019 at 8:01 pm #

Hi Jason,

Thanks for the tutorial, it really helps.


Here is a question about the connection between the Encoder and Decoder.
In your implementation, you copy the H-dimension hidden vector from the Encoder T times, and convey it as a T*H time series into the Decoder.
Why choose this way? I’m wondering, since there are other ways to do it:

Take the hidden vector as the initial state at the first time step of the Decoder, with a zero input series.

Can this way work?

Best,
Geralt

Jason Brownlee May 5, 2019 at 6:26 am #

Because it is an easy way to achieve the desired effect from the paper using the Keras
library.

No, I don’t think your approach is in the spirit of the paper. Try it and see what happens!



Atefeh May 6, 2019 at 11:29 am #

Hello Mr. Jason,
I want to start handwritten isolated character recognition with RNN and LSTM. I mean, we have a number of character images and I want code to recognize each character.
Would you please help me to find basic Python code for this purpose, so I could start the work?
Thank you.
Short-Term Memory Networks in Keras

How to Develop an Encoder-Decoder REPLY 


Jason Brownlee May 6, 2019 at 2:33 pm #
Sounds like a great problem.
Perhaps a CNN-LSTM model would be a good fit!

Xinyang May 13, 2019 at 12:37 pm #
Hi, Dr Brownlee,

Thanks for your post. Here I want to use an LSTM to predict a time series. For example, a series like (1 2 3 4 5 6 7 8 9), and use this series for training. Then the output series is the series of multi-step predictions until it reaches the ideal value, like this: (9.9 10.8 11.9 12 13.1).



Jason Brownlee May 13, 2019 at 2:32 pm #
See this post:
https://machinelearningmastery.com/how-to-develop-lstm-models-for-time-series-forecasting/

Xinyang May 13, 2019 at 12:57 pm #

Sorry, maybe I didn’t make it clear. Here I want to use an LSTM to predict a time series. The sequence may be like this: [10,20,30,40,50,60,70], and use it for training, with a time_step of 3. When the input is [40,50,60], we want the output to be 70. When the model finishes training, the prediction begins: when the input is [50,60,70], the output may be 79, which is then used for the next step of prediction; the input is [60,70,79] and the output might be 89. The iteration is over when a certain condition is satisfied (like the output >= 100).
So how can I realize the prediction process above, and where can I find the code?
Please, I hope to get your reply.

Jason Brownlee May 13, 2019 at 2:32 pm #

Yes, you can get started with time series forecasting with LSTMs in this post:
https://machinelearningmastery.com/how-to-develop-lstm-models-for-time-series-forecasting/

I have more advanced posts here:


https://machinelearningmastery.com/start-here/#deep_learning_time_series
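The recursive part of the question can be sketched independently of those tutorials (a toy stand-in model; in practice it would be trained first):

from numpy import array
from keras.models import Sequential
from keras.layers import LSTM, Dense

# a stand-in one-step forecaster; in practice this would be trained first
model = Sequential()
model.add(LSTM(10, input_shape=(3, 1)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

history = [50.0, 60.0, 70.0]
for _ in range(100):  # hard cap so an untrained model cannot loop forever
    x = array(history[-3:]).reshape((1, 3, 1))
    yhat = float(model.predict(x, verbose=0)[0, 0])
    history.append(yhat)  # feed the prediction back in as the next input
    if yhat >= 100.0:     # stopping condition from the question
        break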

Xinyang May 13, 2019 at 5:55 pm #

Thanks for your quick reply.
And I still have a question. The multi-step LSTM model uses the last three time steps as input and forecasts the next two time steps. But in my case, I want to predict the capacity decline trend of a lithium-ion battery: for example, take the data of the declining capacity curve (cycle number < 160) as the training data, then predict the future trend of capacity until it reaches a certain value (maybe <= 0.7 Ah), the failure threshold, which might be reached at a cycle number of 250 or so. Between cycle numbers 160 and 220, around 90 data points need to be predicted. So I have no idea how to define time steps and samples: if the output time steps are defined as 60 (220-160=60), then how should I define the time steps of the input? It seems unreasonable.

I am extremely hoping to get your reply. Thank you so much.

Jason Brownlee May 14, 2019 at 7:41 am #

You can define the model with any number of inputs or outputs that you require.

If you are having trouble working with numpy arrays, perhaps this will help:
https://machinelearningmastery.com/index-slice-reshape-numpy-arrays-machine-learning-python/
Kishore Surendra May 15, 2019 at 8:34 pm #
Dear Prof,

I have a list as follows :

[5206, 1878, 1224, 2, 329, 89, 106, 901, 902, 149, 8]


When I’m passing it as an input to the reconstruction LSTM (with an added LSTM and repeat vector
layer and 1000 epochs), I get the following predicted output:

[5066.752 1615.2777 1015.1887 714.63916 292.17035 250.14038 331.69427 356.30664 373.15497 365.38977 335.48383]

While some values are almost accurate, most of the others have large deviations from the original.

What can be the reason for this, and how do you suggest I fix this ?

Jason Brownlee May 16, 2019 at 6:30 am #

No model is perfect.
You can expect error in any model, and variance in neural nets over different training runs, more
here:
https://machinelearningmastery.com/faq/single-faq/why-do-i-get-different-results-each-time-i-run-the-code
Kishore Surendra May 21, 2019 at 11:57 pm #

Thanks, professor.
If I have varying numbers such as 2 and 1000 in the same list, is it better to normalize the list by dividing each element by the highest element, and then passing the resulting sequence as an input to the autoencoder?
Jason Brownlee May 22, 2019 at 8:10 am #
Yes, normalizing input is a good idea in general:
https://machinelearningmastery.com/how-to-improve-neural-network-stability-and-modeling-performance-with-data-scaling/
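As a tiny illustration of the idea in the question (dividing by the largest value is a special case of min-max scaling with a minimum of 0):

from numpy import array

seq = array([5206.0, 1878.0, 1224.0, 2.0, 329.0, 89.0])
normalized = seq / seq.max()       # values now in [0, 1]
restored = normalized * seq.max()  # invert after prediction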
A Gentle Introduction to LSTM
Autoencoders

REPLY 
Harsh May 15, 2019 at 11:17 pm #

Hi, I have two questions, and would be grateful if you can help –
1) The above sequence is very small. What if the length of the vector is around 800? I tried, but it's taking too long. What do you suggest?

2) Also, is it possible to encode a matrix into a vector?

thanks

Jason Brownlee May 16, 2019 at 6:32 am #

Perhaps reduce the size of the sequence?


Perhaps try running on a faster computer?
Perhaps try an alternate approach?

Harsh May 16, 2019 at 6:30 pm #


Thanks for your quick response… I have a confusion: right now when you mention ‘training’, it is only one vector… how can I truly train it with batches of multiple vectors?

Jason Brownlee May 17, 2019 at 5:51 am #
If you are new to Keras, perhaps start with this tutorial:
https://machinelearningmastery.com/5-step-life-cycle-neural-network-models-keras/
snowbear May 26, 2019 at 5:51 pm #

Hello Jason, I really appreciate your informative posts. But I have two questions.
Question 1. Does model.add(LSTM(100, activation='relu', input_shape=(n_in,1))) mean that you are creating an LSTM layer with a 100-unit hidden state?
The LSTM structure needs a hidden state (h_t) and cell state (c_t) in addition to the input_t, right? So the number 100 there means that with data whose shape is (9,1) (timesteps = 9, input_feature_number = 1), the LSTM layer produces a 100-unit-long hidden state (h_t)?

Question 2. How small did it get in terms of ‘dimension reduction’? Can you tell me how much smaller the (9, 1) data got when reduced to the latent vector?
Jason Brownlee May 27, 2019 at 6:46 am #
100 refers to 100 units/nodes in the first hidden layer, perhaps this will help:
https://machinelearningmastery.com/faq/single-faq/what-is-the-difference-between-samples-timesteps-and-features-for-lstm-input

You can experiment with different sized bottlenecks to see what works well/best for your specific dataset.

hassam May 30, 2019 at 3:07 am #
Hi Jason! Can this approach be used for sentence correction, i.e. spelling or grammatical mistakes in the input text?
For example, I have a huge corpus of unlabelled text, and I trained it using the autoencoder technique. I want to build a model that takes an input sentence (of variable length) and outputs the most probable or corrected sentence based on the training data distribution. Is this possible?

Jason Brownlee May 30, 2019 at 9:05 am #

Perhaps, I’d encourage you to review the literature first.

John June 11, 2019 at 9:35 am #

How do I shape the data for the autoencoder if I have multiple samples?

Jason Brownlee June 11, 2019 at 2:22 pm #
Perhaps this will help:
https://machinelearningmastery.com/faq/single-faq/what-is-the-difference-between-samples-timesteps-and-features-for-lstm-input
Jose Luis July 14, 2019 at 7:48 am #
Hi Jason, thanks for your great articles! I have a project where I get several hundred galaxy spectra (a graphic where I have a continuous range of frequencies on the x axis and the number of received photons from each galaxy on the y axis; it’s something like a continuous histogram). I need to do unsupervised clustering with all these spectra. Do you think this LSTM autoencoder could be a good option? (Each spectrum has 4000 frequency-flux pairs.)
each galaxy inLayer
the yinaxis; it’s something like a continuos histogram). I need to
Email Address
Keras
make an unsupervised clustering with all this spectra. Do you thing this LSTM autoencoder can be a
good option I can use? (Each spectrum has 4000 pairs frecuency-flux).
Jason Tutorials?
July 14, 2019 at 8:18 am #
REPLY 

The LSTMs
Perhapswith Python EBook
try it and is
evaluate the result?
where you'll find the Really Good stuff.

>> SEE WHAT'S INSIDE


Xing Wang Tong July 17, 2019 at 10:30 pm #

Hello, and thanks for your tutorial… do you have a similar tutorial with LSTM but with multiple
features?
The reason I ask for multiple features is because I built multiple autoencoder models with different
structures but all had timesteps = 30… during training the loss, the rmse, the val_loss and the val_rmse
seem all to be within acceptable range ~ 0.05, but when I do prediction and plot the prediction with the
original data in one graph, it seems that they both are totally different.

I used MinMaxScaler so I tried to plot the original data and the predictions before I inverse the transform
and after, but still the original data and the prediction aren’t even close. So, I think I am having trouble
plotting the prediction correctly

Jason Brownlee July 18, 2019 at 8:27 am #

You could adapt the examples in this post:


https://machinelearningmastery.com/how-to-develop-lstm-models-for-time-series-forecasting/

sara July 18, 2019 at 1:15 am #

I would like to thank you for the great post, though I wish you had included a more sophisticated model.

For example, the same thing with 2 features rather than one feature.

Jason Brownlee July 18, 2019 at 8:30 am #

Thanks for the suggestion.
The examples here will be helpful:
https://machinelearningmastery.com/how-to-develop-lstm-models-for-time-series-forecasting/

Shiva July 25, 2019 at 4:30 am #
Hi Jason,
Thanks for the tutorial.
I have a sequence A B C. Each of A, B, and C is a vector of length 25.
My samples are like this: A B C label, A’ B’ C’ label’, ….
How should I reshape the data?
What is the size of the input dimension?
Jason Brownlee July 25, 2019 at 7:58 am #

Sorry, I don’t follow.

What is the problem that you are having exactly?

Shiva July 25, 2019 at 7:39 pm #

My dataset is an array with the shape (10,3,25) (3 features, and each feature is a vector of 25 values).
Is it necessary to reshape it?
And what is the value of input_shape for this array?

Jason Brownlee July 26, 2019 at 8:20 am #


Perhaps read this first to confirm your data is in the correct format:
https://machinelearningmastery.com/faq/single-faq/what-is-the-difference-between-samples-timesteps-and-features-for-lstm-input

Shiva July 27, 2019 at 9:24 pm #
Thank you, Jason.

Jason Brownlee July 28, 2019 at 6:43 am #
You’re welcome.
Felix August 12, 2019 at 4:11 am #
Hi Jason,

Thank you for the great work.

I have one doubt about the layer concept. Does LSTM(100) mean a hidden layer of 100 neurons, where the output of the first LSTM layer across all these 100 units is taken as the final state value? Is that correct?

Jason Brownlee August 12, 2019 at 6:39 am #
Yes.

Felix August 13, 2019 at 5:56 am #

Hello Jason,
Thank you for the quick response, and I appreciate your kindness in responding to my doubt. Still, I am confused by the diagram provided by Keras:

https://github.com/MohammadFneish7/Keras_LSTM_Diagram

There they explain that the output of each layer will be the “No of Y variables we are predicting * timesteps”.

My doubt is: is the output size the “Y – predicted value” or the “Hidden Values”?

Thanks

Jason Brownlee August 13, 2019 at 6:14 am #

Perhaps ask the authors of the diagram about it?

I have some general advice here that might help:


https://machinelearningmastery.com/faq/single-faq/what-is-the-difference-between-samples-timesteps-and-features-for-lstm-input

Felix August 13, 2019 at 6:55 am #


Thank you Jason for the reply.
I have gone through your post and I am clear about the input format to the initial LSTM layer.
I have the below doubt about the internal structure of Keras. Suppose I have the code below.
step_size = 3
model = Sequential()
model.add(LSTM(32, input_shape=(2, step_size), return_sequences=True))
model.add(LSTM(18))
model.add(Dense(1))
model.add(Activation('linear'))

I am getting the summary below.

_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_1 (LSTM) (None, 2, 32) 4608
_________________________________________________________________
lstm_2 (LSTM) (None, 18) 3672
_________________________________________________________________
dense_1 (Dense) (None, 1) 19
_________________________________________________________________
activation_1 (Activation) (None, 1) 0
=================================================================
Total params: 8,299
Trainable params: 8,299
Non-trainable params: 0
_________________________________________________________________
None

And I have the following internal layer weight matrix shapes.

Layer 1
(3, 128)
(32, 128)
(128,)

Layer 2
(32, 72)
(18, 72)
(72,)

Layer 3
(18, 1)
(1,)
I cannot find any relation between the output size and the matrix size in each layer. But in each layer the parameter count specified is the total of the weight matrix sizes. Can you please help me get an idea of how these numbers arise?
Jason Brownlee August 13, 2019 at 2:36 pm #

I believe this will help you understand the input shape to an LSTM model:
https://machinelearningmastery.com/faq/single-faq/what-is-the-difference-between-samples-timesteps-and-features-for-lstm-input
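For what it's worth, the parameter counts in the summary above can be reproduced by hand: a Keras LSTM layer has 4 * (input_dim + units + 1) * units weights (four gates, each with input weights, recurrent weights, and a bias). The per-layer matrices listed, e.g. (3, 128), (32, 128), and (128,) for lstm_1, are the same numbers split into input kernel, recurrent kernel, and bias, with 128 = 4 * 32:

def lstm_params(input_dim, units):
    # four gates, each with input weights, recurrent weights and a bias
    return 4 * (input_dim + units + 1) * units

print(lstm_params(3, 32))   # 4608, matches lstm_1
print(lstm_params(32, 18))  # 3672, matches lstm_2
print(18 * 1 + 1)           # 19, the Dense layer (weights + bias)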
Felix August 13, 2019 at 5:07 pm #

Hi Jason,

Thanks for the reply.
I have gone through the post https://machinelearningmastery.com/faq/single-faq/what-is-the-difference-between-samples-timesteps-and-features-for-lstm-input

I am able to understand the structure of the input data into the first LSTM layer. But I am not able to identify the matrix structure in the first layer and the connection with the second layer. Can you please give me more guidelines to understand the matrix dimensions in the layers?
Thanks

Jason Brownlee August 14, 2019 at 6:35 am #

If the LSTM has return_sequences=False then the output of an LSTM layer is one value for each node, e.g. LSTM(50) returns a vector with 50 elements.

If the LSTM has return_sequences=True, such as when we stack LSTM layers, then the return will be a sequence for each node, where the length of the sequence is the length of the input to the layer, e.g. with LSTM(50, input_shape=(100,1)) the output will be (100,50), or 100 time steps for 50 nodes.

Does that help?
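A quick way to check both cases (a minimal sketch):

from numpy import random
from keras.models import Sequential
from keras.layers import LSTM

X = random.rand(1, 100, 1)

m1 = Sequential([LSTM(50, input_shape=(100, 1))])
print(m1.predict(X).shape)  # (1, 50): one value per node

m2 = Sequential([LSTM(50, return_sequences=True, input_shape=(100, 1))])
print(m2.predict(X).shape)  # (1, 100, 50): a sequence per node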


Felix August 14, 2019 at 6:52 am #

Thank you Jason for the reply.


I really appreciate the time and effort you took to give me the answer. It helped me a lot. Thank you very much. You are teaching the whole world. Great!!!
Jason Brownlee August 14, 2019 at 2:07 pm #
Thanks, I’m glad it helped.
Hossein September 4, 2019 at 6:15 pm #

Hi, I am a student and I want to forecast a time series (electrical load) for the next 24 hours.
I want to do it using an autoencoder combined with an LSTM.
I am looking for a suitable topology and structure for it. Is it possible to help me?
Best regards
Jason Brownlee September 5, 2019 at 6:50 am #

Perhaps some of the tutorials here will help as a first step:
https://machinelearningmastery.com/start-here/#deep_learning_time_series
Shreeram Bhattarai September 18, 2019 at 11:19 pm #

Hi,
I have a question regarding the composite model. In your tutorial, you feed all the data into the LSTM encoder, and decoder 1 tries to reconstruct whatever has been passed to the encoder. The other decoder tries to predict the next sequence.

My question is: once the encoder has seen all the data, does it make sense for the prediction branch? Since it has already seen all the data, surely it can predict well enough, right?

I don't understand how the encoder part works. Does it work differently for the two branches? Does the encoder create a single encoded latent space from which both branches do their jobs?

Could you please help me figure it out. Thank you.

Jason Brownlee September 19, 2019 at 6:01 am #

Perhaps focus on the samples aspect of the data: the model receives a sample and predicts the output, then the next sample is processed and predicts an output, and so on.

It just so happens that when we train the model, we provide all the samples in the dataset together.

Does that help?


Shreeram Bhattarai September 25, 2019 at 9:52 pm #

Thanks for your reply, but it is still not clear to me.


For example:
we have data with 10 time steps of size 120, i.e. (N, 10, 120), where N is the number of samples.
f5 = first 5 time steps
l5 = last 5 time steps

While training:

Option 1:
seq_in = (N, f5, 120)
seq_out = (N, l5, 120)
model.fit(seq_in, [seq_in,seq_out], epochs=300, verbose=0)

Option 2:
seq_in = (N, 10, 120)
seq_out = (N, l5, 120)

model.fit(seq_in, [seq_in,seq_out], epochs=300, verbose=0)

Could you please help me understand the difference between the above options? Which is the correct way to train the network? Thank you.
Jason Brownlee September 26, 2019 at 6:39 am #

I don’t follow, sorry.

len(f5) == 5?

Then you're asking the difference between (N,5,120) and (N,10,120)?

The difference is the number of time steps.

If you are new to array shapes, this will help:


https://machinelearningmastery.com/faq/single-faq/what-is-the-difference-between-samples-timesteps-and-features-for-lstm-input
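For illustration, a minimal NumPy sketch of the two options discussed above (N, f5 and l5 are the hypothetical names from the comment):

import numpy as np

N = 32                             # hypothetical number of samples
data = np.random.rand(N, 10, 120)  # (samples, time steps, features)
f5 = data[:, :5, :]                # first 5 time steps -> (N, 5, 120)
l5 = data[:, 5:, :]                # last 5 time steps -> (N, 5, 120)
# option 1: seq_in = f5; option 2: seq_in = data (all 10 steps)
print(f5.shape, l5.shape, data.shape)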

Shreeram Bhattarai September 26, 2019 at 6:28 pm #

Sorry for the inconvenience.


I am trying to ask whether we have to pass all time steps (in this case 10), or pass the first 5 time steps to predict the next 5 steps. (I have data of 10 time steps; my wish is to train a network with two decoders. The first decoder should return the reconstruction of the input, and the second decoder should predict the next values.)

The question is: if I pass all 10 time steps to the network, then it will see all the time steps, which means it encodes all the seen data, and from the encoding space the two decoders will try to reconstruct and predict. It seems that both decoders look similar, so what is the significance of the reconstruction branch decoder? How does it help the prediction decoder in the composite model?

Thank you once again.


Jason Brownlee September 27, 2019 at 7:51 am #
Yes, the goal is not to train a predictive model, it is to train an effective encoding of the input.

Using a prediction model as a decoder does not guarantee a better encoding, it is just an alternate strategy to try that may be useful on some problems.
If you want to use an LSTM for time series prediction, you can start here:
https://machinelearningmastery.com/start-here/#deep_learning_time_series
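For readers following this thread, here is a minimal sketch of such a composite model, using the hypothetical shapes from the comment above (10 input steps, 120 features, 5 predicted steps); it mirrors the composite example in the post:

from keras.models import Model
from keras.layers import Input, LSTM, RepeatVector, TimeDistributed, Dense

n_in, n_out, n_features = 10, 5, 120
visible = Input(shape=(n_in, n_features))
encoder = LSTM(100, activation='relu')(visible)
# reconstruction decoder: reproduce the full input sequence
decoder1 = RepeatVector(n_in)(encoder)
decoder1 = LSTM(100, activation='relu', return_sequences=True)(decoder1)
decoder1 = TimeDistributed(Dense(n_features))(decoder1)
# prediction decoder: predict the next n_out steps
decoder2 = RepeatVector(n_out)(encoder)
decoder2 = LSTM(100, activation='relu', return_sequences=True)(decoder2)
decoder2 = TimeDistributed(Dense(n_features))(decoder2)
model = Model(inputs=visible, outputs=[decoder1, decoder2])
model.compile(optimizer='adam', loss='mse')
# model.fit(seq_in, [seq_in, seq_out], epochs=300, verbose=0)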

Shreeram Bhattarai September 27, 2019 at 6:13 pm #

Thank you very much for your answers.

Marvi Waheed September 23, 2019 at 8:07 am #

Hi Jason,

I get NaN values when I apply the reconstruction autoencoder to my data (1,1000,1).
What can be the reason for it, and how can I resolve it?
I am exploring how reshaping data works for LSTMs and have tried dividing my data into batches of 5 with 200 timesteps each, but wanted to check how (1,1000,1) works.

Jason Brownlee September 23, 2019 at 10:04 am #

Sorry to hear that, this might help with reshaping data:


https://machinelearningmastery.com/faq/single-faq/what-is-the-difference-between-samples-timesteps-and-features-for-lstm-input

Marvi Waheed September 23, 2019 at 5:15 pm #

Thanks for replying.

Can you identify the LSTM model used for reconstruction? Is it one-to-one or many-to-one?
Where can I find explicit examples of LSTM models on the website?

Jason Brownlee September 24, 2019 at 7:41 am #

You can get started with LSTMs here:
https://machinelearningmastery.com/start-here/#lstm
Including tutorials, a mini-course, and my book.
Sounak Ray September 29, 2019 at 2:26 am #
Hello,

I had a question. If I am using the Masking layer in the first part of the network, does the RepeatVector() layer support masking? If it does not support masking and replicates each timestep with the same value, then our output loss will not be computed properly, because ideally the MSE loss for each example should not include the timesteps where we had zero padding.
Could you please share how to ignore the zero-padded values while calculating the MSE loss function.

Jason Brownlee September 29, 2019 at 6:14 am #
Masking is only needed for input.
The bottleneck will have an internal representation of the input, after masking.

Masked values are skipped from input.
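For example, a minimal sketch of masking zero-padded input (the shapes are illustrative):

from keras.models import Sequential
from keras.layers import Masking, LSTM

# time steps whose features are all 0.0 are skipped by downstream layers
model = Sequential()
model.add(Masking(mask_value=0.0, input_shape=(100, 1)))
model.add(LSTM(32))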

Sounak Ray September 29, 2019 at 8:34 pm #

Hello,

But if the reconstructed timesteps corresponding to the padded part are not zero, then the mean squared error loss will be very large, I suppose? Can you tell me if I am wrong here, because my MSE loss becomes "nan" after a certain number of epochs. And is it best to do post-padding or pre-padding?

Thanks,
Sounak Ray.

Jason Brownlee September 30, 2019 at 6:08 am #

Correct.

The goal is not to create a great predictive model, it is to learn a great intermediate
representation.
Sorry to hear that you are getting NaNs, I have some suggestions here that might help:
https://machinelearningmastery.com/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me

James December 8, 2019 at 3:55 am #
Hi Jason, thanks for the article. I'm struggling with the same problem as Sounak: the mask actually gets lost when the LSTM has return_sequences=False (also, the RepeatVector does not explicitly support masking because it actually changes the timestep dimension). Since the mask cannot be passed to the end of the model, the loss will also be calculated for those padded timesteps (I've validated this on a simple example), which is not preferred.

Jason Brownlee December 8, 2019 at 6:17 am #

I wonder if you can do experiments to see if it makes a difference to the bottleneck representation that is learned?

Alireza Hadj October 10, 2019 at 4:58 am #

Hi Jason,
I really enjoy your posts. Thanks for sharing your expertise. Really appreciate it!

I also have a question regarding this post. In the "Prediction Autoencoder", shouldn't you split the time sequence in half and try to predict the second half by feeding the first half to the encoder? The way you have implemented the decoder does not truly predict the sequence, because the entire sequence has been summarized and given to it by the encoder. Is that true, or am I missing something here?

Jason Brownlee October 10, 2019 at 7:05 am #

You can try that – there are many ways to frame a sequence prediction problem, but that is
not the model used in this example.

Recall, we are not developing a prediction model, instead an autoencoder.



Marvi Waheed October 21, 2019 at 5:22 pm #

Hello,

I'm working on data reconstruction where the input is columns [0:8] of the dataset and the required output is the 9th column. However, the LSTM autoencoder model returns the same value as output after 10 to 15 timesteps. I have applied the model to different datasets but face a similar issue.
What parameter adjustments must I make to obtain unique reconstructed values?

Jason Brownlee October 22, 2019 at 5:44 am #
Perhaps try using a different model architecture or different training hyperparameters?
Xi Zhu October 26, 2019 at 5:38 am #
Fantastic! I hope you are getting paid for your time here.

Jason Brownlee October 26, 2019 at 5:47 am #

Thanks.

Yes, some readers purchase ebooks to support me:


https://machinelearningmastery.com/products/
wysohn October 29, 2019 at 4:47 pm #

Hello,

Thank you for the amazing article!

I’ve read comments regarding the RepeatVector(), yet I’m still skeptical if I understood it correctly.
We are merely copying the last output of the encoder LSTM and feeding it to each cell of the decoder LSTM to produce the unconditional sequence. Is that correct?

Also, I’m curious that what happens to the internal state of the encoder LSTM. Is it just discarded and will
never be used for the decoder? I wonder if using the internal state of the final LSTM cell of the encoder
for the initial internal state of LSTM of the decoder would have any kind of benefit. Or is it just completely
unnecessary since all we want is to train the encoder?

Thank you for your time!

Jason Brownlee October 30, 2019 at 5:57 am #

Correct.
The internal state from the encoder is discarded. The decoder uses state to create the
output.

The construction of each output step is conditional on the bottleneck vector and the state from creating the prior output step.
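For anyone who wants to run the experiment suggested above, here is a minimal sketch of passing the encoder's final states to the decoder with the functional API (the shapes and sizes are illustrative, not from the post):

from keras.models import Model
from keras.layers import Input, LSTM, RepeatVector, TimeDistributed, Dense

n_steps, n_features, n_units = 9, 1, 100
inputs = Input(shape=(n_steps, n_features))
enc_out, state_h, state_c = LSTM(n_units, return_state=True)(inputs)
# initialize the decoder with the encoder's final states instead of discarding them
dec_in = RepeatVector(n_steps)(enc_out)
dec_out = LSTM(n_units, return_sequences=True)(dec_in, initial_state=[state_h, state_c])
outputs = TimeDistributed(Dense(n_features))(dec_out)
model = Model(inputs, outputs)
model.compile(optimizer='adam', loss='mse')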

Syed November 12, 2019 at 7:55 am #

Really appreciate your hard work, and the tutorials are great. I have learned a lot. Can you please write a tutorial on the teacher forcing method in the encoder-decoder architecture? That would be really helpful.
Jason Brownlee November 12, 2019 at 2:01 pm #

Thanks!
Yes, I believe all of my tutorials for the encoder-decoder use teacher forcing.

Chrysostome November 17, 2019 at 1:27 am #

I would like to know what the point of doing an autoencoder is.


It seems equivalent to building an encoder-decoder on one side and the prediction model on the other side, as the two outputs don't seem to be used by each other.
Maybe it would be more meaningful to use the decoder as a discriminator for the prediction, like a GAN.

Jason Brownlee November 17, 2019 at 7:15 am #

It can be used as a feature extraction model for sequence data.

E.g. you could fit a decoder or any model and make predictions.
Meenal December 9, 2019 at 5:09 am #

Is there any way of building an overfitted autoencoder (does overfitting need to be taken care of while training an autoencoder)?
And how can one justify that the encoded features obtained are the best possible compression for reconstruction of the original?

Also, can you please explain the TimeDistributed layer in terms of the input to this layer? What is the use of the TimeDistributed layer? Is this layer only useful when working with an LSTM layer?

Thanks for all your posts and books, they are very useful in understanding concepts and applying them.

Jason Brownlee December 9, 2019 at 6:55 am #

Good question!
Yes. E.g. an autoencoder that does well on a training set but cannot reconstruct test data well.
More on the time distributed layer here:
https://machinelearningmastery.com/timedistributed-layer-for-long-short-term-memory-networks-in-python/
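As a quick illustration, a minimal sketch of the TimeDistributed wrapper (the layer sizes are arbitrary): the same Dense layer, with the same weights, is applied to each of the 10 time steps.

from keras.models import Sequential
from keras.layers import LSTM, TimeDistributed, Dense

model = Sequential()
model.add(LSTM(32, return_sequences=True, input_shape=(10, 1)))
model.add(TimeDistributed(Dense(1)))
print(model.output_shape)  # (None, 10, 1)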

Gideon Prior December 12, 2019 at 8:04 am #

I am having trouble seeing the bottleneck. Is it the 100-unit layer after the input? Should this normally (without a trivial data set as in your example) be much smaller than the number of time steps?
Jason Brownlee December 12, 2019 at 1:41 pm #

Yes, the output of the first hidden layer (the encoder) is the encoded representation.

Jenna December 16, 2019 at 12:53 am #

Hi Jason,
Thank you so much for writing this great post. But I have a question that is really confusing me. Here it is.
As the Encoder-Decoder LSTM can benefit training with variable output lengths, I'm wondering if it can support variable multi-step output. I am trying to vary the length of the output steps with the "Multiple Parallel Input and Multi-Step Output" example from another post, https://machinelearningmastery.com/how-to-develop-lstm-models-for-time-series-forecasting/, so the output sequence is like:
[[[ 40 45 85]
[ 0 0 0]
[ 0 0 0]] Start Machine Learning
[[ 50 55 105]
[ 60 65 125]
[ 70 75 145]]

[[ 60 65 125]
[ 70 75 145]
[ 0 0 0]]

[[ 70 75 145]
[ 80 85 165]
[ 0 0 0]]

[[ 80 85 165]
[ 0 0 0]
[ 0 0 0]]]
But my prediction results turned out to be not good. Could you give me some guidance? Is the padding value 0 not suitable? Is it that the Encoder-Decoder LSTM cannot support a variable number of steps?
Thanks again.

Jason Brownlee December 16, 2019 at 6:18 am #
Yes, but you must pad the values. If you cannot use padding with 0, perhaps try -1.
Alternately, you can use a dynamic LSTM and process one time step at a time. This will show you how:
https://machinelearningmastery.com/develop-encoder-decoder-model-sequence-sequence-prediction-keras/
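For the padding itself, a minimal sketch using the Keras pad_sequences utility with -1 as the pad value (the toy sequences are illustrative):

from keras.preprocessing.sequence import pad_sequences

seqs = [[40, 45, 85], [50, 55], [60]]
padded = pad_sequences(seqs, padding='post', value=-1.0, dtype='float32')
print(padded)
# [[40. 45. 85.]
#  [50. 55. -1.]
#  [60. -1. -1.]]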

Jenna December 16, 2019 at 8:49 pm #

Thank you for suggesting that I process one time step at a time. I suddenly realize there is no need to make the output time steps variable, since we can predict the output step by step. Did I get it right? Besides, I think there is no real difference between the two Encoder-Decoder models from these two posts, except for predicting different timesteps and using different Keras functions. Is this understanding correct?
Hope to hear from you. Thanks again.
Jason Brownlee December 17, 2019 at 6:34 am #
Perhaps.

jackson January 10, 2020 at 7:21 pm #
I have tried your model with my input. The loss was converging before 10 epochs as expected. However, the loss became bigger after a point in the 10th epoch.
5536/42706 [==>...........................] - ETA: 39s - loss: 0.4187
5600/42706 [==>...........................] - ETA: 39s - loss: 0.4190
5664/42706 [==>...........................] - ETA: 39s - loss: 0.4189
5728/42706 [===>..........................] - ETA: 39s - loss: 0.4188
5792/42706 [===>..........................] - ETA: 39s - loss: 0.4189
5856/42706 [===>..........................] - ETA: 39s - loss: 0.4184
5920/42706 [===>..........................] - ETA: 38s - loss: 0.4185
5984/42706 [===>..........................] - ETA: 38s - loss: 0.4188
6048/42706 [===>..........................] - ETA: 38s - loss: 7.7892
6112/42706 [===>..........................] - ETA: 38s - loss: 8.6366
6176/42706 [===>..........................] - ETA: 38s - loss: 8.5517
6240/42706 [===>..........................] - ETA: 38s - loss: 8.4680
6304/42706 [===>..........................] - ETA: 38s - loss: 8.3862
6368/42706 [===>..........................] - ETA: 38s - loss: 8.3056
6432/42706 [===>..........................] - ETA: 38s - loss: 8.2270
6496/42706 [===>..........................] - ETA: 38s - loss: 8.1499
6560/42706 [===>..........................] - ETA: 38s - loss: 8.0738
6624/42706 [===>..........................] - ETA: 38s - loss: 7.9993
6688/42706 [===>..........................] - ETA: 38s - loss: 7.9269
6752/42706 [===>..........................] - ETA: 38s - loss: 7.8556
6816/42706 [===>..........................] - ETA: 38s - loss: 7.7855
6880/42706 [===>..........................] - ETA: 37s - loss: 7.7169
6944/42706 [===>..........................] - ETA: 37s - loss: 7.6496
7008/42706 [===>..........................] - ETA: 37s - loss: 7.5831
7072/42706 [===>..........................] - ETA: 37s - loss: 7.5183
7136/42706 [====>.........................] - ETA: 37s - loss: 7.4546
7200/42706 [====>.........................] - ETA: 37s - loss: 7.3912
7264/42706 [====>.........................] - ETA: 37s - loss: 7.3297
7328/42706 [====>.........................] - ETA: 37s - loss: 7.2693
7392/42706 [====>.........................] - ETA: 37s - loss: 7.2094
7456/42706 [====>.........................] - ETA: 37s - loss: 7.1505
7520/42706 [====>.........................] - ETA: 37s - loss: 7.0928
7584/42706 [====>.........................] - ETA: 37s - loss: 7.0363
7648/42706 [====>.........................] - ETA: 37s - loss: 6.9807
7712/42706 [====>.........................] - ETA: 37s - loss: 6.9260
7776/42706 [====>.........................] - ETA: 37s - loss: 6.8724
7840/42706 [====>.........................] - ETA: 37s - loss: 6.8196
7904/42706 [====>.........................] - ETA: 36s - loss: 6.7676
7968/42706 [====>.........................] - ETA: 36s - loss: 6.7163
8032/42706 [====>.........................] - ETA: 36s - loss: 6.6655
8096/42706 [====>.........................] - ETA: 36s - loss: 6.6160
8160/42706 [====>.........................] - ETA: 36s - loss: 6.5667
8224/42706 [====>.........................] - ETA: 36s - loss: 6.5184
8288/42706 [====>.........................] - ETA: 36s - loss: 6.4707
8352/42706 [====>.........................] - ETA: 36s - loss: 6.4239
8416/42706 [====>.........................] - ETA: 36s - loss: 6.3782
8480/42706 [====>.........................] - ETA: 36s - loss: 2378.7514
8544/42706 [=====>........................] - ETA: 36s - loss: 27760.9716
8608/42706 [=====>........................] - ETA: 36s - loss: 27755.8645
8672/42706 [=====>........................] - ETA: 36s - loss: 27978.9607
8736/42706 [=====>........................] - ETA: 36s - loss: 28032.9492
8800/42706 [=====>........................] - ETA: 35s - loss: 28025.2542
8864/42706 [=====>........................] - ETA: 35s - loss: 27902.1603
8928/42706 [=====>........................] - ETA: 35s - loss: 27837.8133
8992/42706 [=====>........................] - ETA: 35s - loss: 27830.6104
9056/42706 [=====>........................] - ETA: 35s - loss: 27731.7000
9120/42706 [=====>........................] - ETA: 35s - loss: 27630.7813
9184/42706 [=====>........................] - ETA: 35s - loss: 27768.5311
9248/42706 [=====>........................] - ETA: 35s - loss: 28076.0159

Jason Brownlee January 11, 2020 at 7:23 am #

Nice work.
Perhaps try fitting the model again to see if you get a different result?

sampath January 18, 2020 at 8:46 am #

Hi Jason,
I am trying to implement an LSTM autoencoder using the encoder-decoder architecture. What if I want to use the functional API of Keras and NOT have my decoder get the inputs from the previous step, i.e. my decoder LSTM will not have any input, just the hidden and cell state initialized from the encoder? (Because I want my encoder output to preserve all the information necessary to reconstruct the signal without giving any inputs to the decoder.) Is something like this possible in Keras?
Find out how in this free and practical course.

How to Use the TimeDistributed Layer in


Email Address
Keras REPLY 
Jason Brownlee January 18, 2020 at 8:56 am #

Yes, I believe that is the normal architecture described in the above tutorial.
If not, perhaps I don’t understand what you’re trying to achieve.

sampath January 18, 2020 at 8:59 am #

I mean with the functional API (the above mentioned is a sequential API).
This is the code from one of your articles:
def define_models(n_input, n_output, n_units):
	# define training encoder
	encoder_inputs = Input(shape=(None, n_input))
	encoder = LSTM(n_units, return_state=True)
	encoder_outputs, state_h, state_c = encoder(encoder_inputs)
	encoder_states = [state_h, state_c]
	# define training decoder
	decoder_inputs = Input(shape=(None, n_output))
	decoder_lstm = LSTM(n_units, return_sequences=True, return_state=True)
	decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
	decoder_dense = Dense(n_output, activation='softmax')
	decoder_outputs = decoder_dense(decoder_outputs)
	model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
	# define inference encoder
	encoder_model = Model(encoder_inputs, encoder_states)
	# define inference decoder
	decoder_state_input_h = Input(shape=(n_units,))
	decoder_state_input_c = Input(shape=(n_units,))
	decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
	decoder_outputs, state_h, state_c = decoder_lstm(decoder_inputs, initial_state=decoder_states_inputs)
	decoder_states = [state_h, state_c]
	decoder_outputs = decoder_dense(decoder_outputs)
	decoder_model = Model([decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states)
	# return all models
	return model, encoder_model, decoder_model


What if I want my 'decoder_lstm' to not have any inputs? (In this code, it is given 'decoder_inputs' as input.)

Jason Brownlee January 18, 2020 at 9:02 am #

Ah I see. Thanks.

Some experimentation will be required, I don't have an example for you.

sampath January 18, 2020 at 9:04 am #
The reason I want to use the functional API is because I want to use a stacked LSTM (multiple layers), and I want the hidden state from all layers at the last time step of the encoder. This is possible only with the functional API, right?

Jason Brownlee January 19, 2020 at 7:03 am #

Most likely, yes.
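A minimal sketch of collecting the final states from all stacked encoder layers with the functional API (the shapes and sizes are illustrative):

from keras.models import Model
from keras.layers import Input, LSTM

n_steps, n_features, n_units = 10, 1, 64
inputs = Input(shape=(n_steps, n_features))
x, h1, c1 = LSTM(n_units, return_sequences=True, return_state=True)(inputs)
x, h2, c2 = LSTM(n_units, return_state=True)(x)
# h1/c1 and h2/c2 are the final states of the two stacked layers
encoder = Model(inputs, [h1, c1, h2, c2])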


Rajnish Pandey February 3, 2020 at 11:53 pm #

Hey @Jason Brownlee, I am working on textual data; could you please explain this concept with regard to text? I am calculating errors with GloVe pre-trained vectors, but my result is not up to the mark.
Thank you in advance
Jason Brownlee February 4, 2020 at 7:55 am #

Perhaps start here:


https://machinelearningmastery.com/start-here/#nlp

Nattachai February 23, 2020 at 7:57 pm #

Hi Jason,
I am working on time series data.
Can I use an RNN autoencoder as a time series representation, like SAX or PAA?

Thank you

Jason Brownlee February 24, 2020 at 7:39 am #

Perhaps try it and see?

David March 2, 2020 at 1:20 pm #
Great article! Thanks!

Jason Brownlee March 3, 2020 at 5:54 am #
Thanks, I’m happy it helped.
Leung Lau March 9, 2020 at 3:40 pm #
Hi Jason, I have a question. Is the last 100x1 vector you printed at the end of the article the feature vector of the sequence? Can this vector later be used for, for example, sequence classification or regression? Thanks!
Jason Brownlee March 10, 2020 at 5:38 am #
In most of the examples we are reconstructing the input sequence of numeric values.
Regression, but not really.
The final example is the feature vector.
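For reference, a minimal sketch of that final example: fit a reconstruction autoencoder on a toy sequence, then keep only the encoder and use its 100-element output as a feature vector.

from numpy import array
from keras.models import Sequential, Model
from keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

seq = array([0.1 * i for i in range(1, 10)]).reshape((1, 9, 1))
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(9, 1)))
model.add(RepeatVector(9))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
model.fit(seq, seq, epochs=300, verbose=0)

# keep the encoder only and predict the feature vector
encoder = Model(inputs=model.inputs, outputs=model.layers[0].output)
features = encoder.predict(seq)
print(features.shape)  # (1, 100)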

Han March 21, 2020 at 5:44 pm #

Hello, Dr. Jason, thanks for this useful tutorial!


I built a convolutional autoencoder (CAE); the reconstructed image from the decoder is better than the original image, and I think that if a classifier took a better image it would provide a good output.

So I want to classify the input, whether it is a bag, shoes, etc.


Is it better to:
1- delete the decoder and make the encoder a classifier? (If I do this, will it be like a normal CNN?)
2- or do the same as the "Composite LSTM Autoencoder" in this tutorial with my CAE
3- or take the output of the decoder (the better image) into a classifier

I do not know, and I am really new to the AI world; your reply would be very useful to me.
Thank you.

Jason Brownlee March 22, 2020 at 6:52 am #

You would keep the encoder and use the output of the encoder as input to a new classifier
model.

Han March 22, 2020 at 2:41 pm #

So I take the output of the encoder (maybe an 8x8 matrix) and make it the input to a model that takes the same size (8x8)? No need to connect both CNNs (encoder, classifier)?
Jason Brownlee March 23, 2020 at 6:11 am #
You can connect them if you want, or use the encoder as a feature extractor.
Typically extracted features are a 1d vector, e.g. a bottleneck layer.

Rekha March 28, 2020 at 1:05 am #
Is it possible to use autoencoders for LSTM time series prediction?

Jason Brownlee March 28, 2020 at 6:21 am #

Sure. They could extract features, then feed these features into another model to make predictions.

Rekha March 28, 2020 at 2:17 pm #

Will there be a blog post on autoencoders for LSTM time series prediction on machinelearningmastery.com?

Jason Brownlee March 29, 2020 at 5:49 am #

The above tutorial is exactly this.

Augustus Van Dusen March 29, 2020 at 5:13 am #

Jason, I ran the Prediction LSTM Autoencoder from this post and saw the following error
message:

2020-03-28 14:01:53.115186: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:697] Iteration = 0, topological sort failed with message: The graph couldn't be sorted in topological order.
2020-03-28 14:01:53.120793: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:697] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order.
2020-03-28 14:01:53.127457: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] layout failed: Invalid argument: The graph couldn't be sorted in topological order.
2020-03-28 14:01:53.190262: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] remapper failed: Invalid argument: The graph couldn't be sorted in topological order.
2020-03-28 14:01:53.194523: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] arithmetic_optimizer failed: Invalid argument: The graph couldn't be sorted in topological order.
2020-03-28 14:01:53.198763: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:697] Iteration = 0, topological sort failed with message: The graph couldn't be sorted in topological order.
2020-03-28 14:01:53.204018: E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:697] Iteration = 1, topological sort failed with message: The graph couldn't be sorted in topological order.

However, the code ran and the answer was equivalent to your answer. Have you seen this error? If so, do you know what it means?
Thanks.
Thanks. Find out how in this free and practical course.

Jason Brownlee March 29, 2020 at 6:05 am #
I have not seen these warnings before, sorry.
Perhaps try searching/posting on stackoverflow?

Chad Paik April 19, 2020 at 11:50 am #
Hello Dr. Brownlee,

I was wondering why you use the RepeatVector layer after the LSTM to match the time step size, when you can obtain the same shape of tensor by using return_sequences=True on the LSTM layer?

Thank you!

Jason Brownlee April 19, 2020 at 1:18 pm #

Perhaps that is a new addition?

Where is that in the Keras API?


https://keras.io/layers/recurrent/

Chad Paik April 25, 2020 at 8:26 am #

Hello Dr. Brownlee,


To clarify what I meant, please refer to the following code snippet I ran on TensorFlow 2.0 with eager execution enabled. (I wanted to post a screenshot, but I couldn't reply with a picture.)

inputs = np.random.random([2, 10, 1]).astype(np.float32)

x = LSTM(4, return_sequences=False)(inputs)
x = RepeatVector(10)(x)
x = LSTM(8, return_sequences=True)(x)
x = TimeDistributed(Dense(5))(x)
print(f"input1: {inputs.shape}")
print(f"output1: {x.shape}")

x = LSTM(4, return_sequences=True)(inputs)
x = LSTM(8, return_sequences=True)(x)
x = TimeDistributed(Dense(5))(x)
print(f"input2: {inputs.shape}")
print(f"output2: {x.shape}")

input1: (2, 10, 1)
output1: (2, 10, 5)
input2: (2, 10, 1)
output2: (2, 10, 5)
I have compared two architectures: the first one emulates your code, with RepeatVector after the first LSTM layer, and in the second I used return_sequences=True and did not use the RepeatVector layer. The output shapes of both networks are the same.

Going back to my original question, is there a reason why you used RepeatVector layer instead
of putting return_sequences=True on the first LSTM layer?

I hope this clarifies. Thank you!

Jason Brownlee April 25, 2020 at 1:21 pm #

Yes, it results in a different architecture called an encoder-decoder model:


https://machinelearningmastery.com/encoder-decoder-long-short-term-memory-networks/


Kyle May 4, 2020 at 9:08 pm #

I actually had the same question as Chad. That second article you linked to
does the same thing with ‘RepeatVector’ though. It seems like unless we’re using
‘return_sequence’ with the first LSTM layer (instead of using ‘repeatvector’), this
example only works when there’s a one-to-one pairing of single value outputs to input
sequences. For example, if multiple sequences could lead to a 0.9 value, I don’t see
how this could work since the encoder only uses the last frame of the sequence with
return_sequence=False. If the only argument for using “RepeatVector” is that we have to
do that to make it fit and not throw an error, then why not use return_sequence and not
throw away useful information that the encoder presumably would need? Seems like the

https://machinelearningmastery.com/lstm-Autoencoders/ 63/74
10/22/2020 A Gentle Introduction to LSTM Autoencoders

proper way to do this would be as Chad outlined above (i.e. with


Never miss a tutorial:
return_sequences=True and without simply repeating the last output so it fits).

Jason Brownlee May 5, 2020 at 6:25 am #

Not quite.

The output of the encoder is the bottleneck: it is the internal representation of the entire input sequence. We then condition each step of the output on this representation and the previously generated output step.
Kyle May 5, 2020 at 11:49 pm #
Couldn't respond in the proper spot in the thread, so sorry this is out of order, but looking into it some more, I think I see. Is it basically that, while the output of the encoder is just one element (it doesn't return the full sequence), that value could be a very precise number that would then correspond to a full sequence, which the decoder half would learn? So two different sequences ending in 0.9 could be encoded as different floats here, like 3.47 and 5.72 (chosen at random for illustrative purposes), for instance? I was experimenting with this a bit on my own, and indeed if I use return_sequences=True, there's very little memory that actually gets saved in the encoding, which makes it kinda pointless. What I really want to do is encode sequences of images into small vectors, building on the autoencoder examples here: https://blog.keras.io/building-autoencoders-in-keras.html. This has all been very helpful, so thank you.
Loving the Tutorials?

Jason Brownlee May 6, 2020 at 6:25 am #

The bottleneck is typically a vector, not a single number.

Jameson April 20, 2020 at 11:17 pm #
Hello Jason,

In this post (https://machinelearningmastery.com/develop-encoder-decoder-model-sequence-sequence-prediction-keras/), the state vectors of the decoder network are initialized with the last layer's state vectors of the encoder network, right (let's call it type 1)? However, here you only feed the decoder network's input using the output of the encoder network (repeating the output values; let's call it type 2). The initial states of the decoder network are zeroed (automatically?), similar to the initial values of the state vectors of the encoder network?

So, what is the difference between these encoder-decoder networks in terms of usage (e.g. when to choose type 1 over type 2)?

Why didn't you build the network you explain here with type 1? Or the one in section 9 of your book (where you give an example similar to what you are explaining here, the basic calculator)?
Best
Jason Brownlee April 21, 2020 at 5:57 am #
The difference in the architecture is exactly as you say, architectural.
I find both approaches are pretty much the same in practice, although the RepeatVector method is way simpler to implement. This is the approach I recommend and use in my tutorials.

Theodor Marcu April 24, 2020 at 12:50 am #
Hi Jason, it's unclear why both you and Francois Chollet (https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html) fit a training model, but don't reuse it (and the associated learned weights) in the inference step. From reading your code, I don't understand how the fitted model (the train variable) is used in the infenc/infdec models, since the train variable is never used/called again.

Jason Brownlee April 24, 2020 at 5:47 am #
The model/weights for the encoder and decoder are used during inference. Perhaps review
the code again?

Anthony The Koala May 31, 2020 at 3:40 pm #
Dear Dr Jason,
In the exercises which involve the plot_model function

from keras.utils import plot_model

That is, in order to produce a graphical model *.png file using plot_model, the Python interface may throw an error.
For Windows OS users, in order to get the graphical model via a *.png file, you will have to:
* Install the GraphViz binaries
* Set the environment path for the GraphViz program, e.g.: path=c:\program files (x86)\graphviz 2.38\bin; rest of path;
* In a command window, do the following pip:

pip install graphviz --upgrade

Your plot_model function should work.

Thank you,
Anthony of Sydney

Jason Brownlee June 1, 2020 at 6:16 am #

Great tip!

Baqar June 9, 2020 at 3:20 pm #
Hi, there’s still an error with graphviz installation. An error something like this:

stdout, stderr: b" b"'C:\\Program' is not recognized as an internal or external command, operable program or batch file.\r\n"
This can, however, be resolved using the solution provided here:
https://github.com/conda-forge/graphviz-feedstock/issues/43
How to Develop an Encoder-Decoder You can master applied Machine Learning
Thanks
Model with Attention in Keras without math or fancy degrees.
Find out how in this free and practical course.

How to Use the TimeDistributed Layer in


Iraj June 1, 2020 at 8:33 am #

Hi, and thank you for a great post.


My question is: in the composite version you have presented, it seems the forecasting works independently from the reconstruction layers. How can I make a change so that it first reconstructs the input sequence, and the forecast layers then take the extracted features and do the forecasting? Can I simply fit decoder1 first, then take the output of the encoder as input to decoder2 and forecast? I don't want to save the reconstruction phase.
Thank you
Jason Brownlee June 1, 2020 at 1:40 pm #
You’re welcome.

Why reconstruct then forecast, why not use input directly to forecast?

For example, see this:


https://machinelearningmastery.com/how-to-develop-lstm-models-for-time-series-forecasting/
Iraj June 1, 2020 at 3:31 pm #

Yes, I have seen that link as well.


I thought forecasting on extracted features may be more accurate.

Jason Brownlee June 2, 2020 at 6:10 am #


It really depends on the specifics of the model and the data. I recommend controlled experiments in order to discover what works best for your specific dataset.

Iraj June 1, 2020 at 3:45 pm #

Let me rephrase my question. I'm not sure whether the input of the forecasting part is the extracted features or the raw inputs! If it is the raw inputs, how can I use the extracted features as the input of decoder2?
Thank you

Jason Brownlee June 2, 2020 at 6:11 am #
The input to the decoder is the extracted features.


You can define a multi-input model using the functional API and have the input flow anywhere you like:
https://machinelearningmastery.com/keras-functional-api-deep-learning/
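For illustration, a minimal sketch of forecasting directly on the extracted features (the layer sizes and shapes are illustrative, not from the post):

from keras.models import Model
from keras.layers import Input, LSTM, Dense

n_steps, n_features = 10, 1
inputs = Input(shape=(n_steps, n_features))
features = LSTM(100, activation='relu')(inputs)  # encoder / feature extractor
forecast = Dense(1)(features)                    # forecasting head on the features
model = Model(inputs, forecast)
model.compile(optimizer='adam', loss='mse')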

Killian June 13, 2020 at 3:27 am #
Thanks for the excellent (as usual) post Jason.

“Regardless of the method chosen (reconstruction, prediction, or composite), once the autoencoder has
been fit, the decoder can be removed and the encoder can be kept as a standalone model.”
If your data are 2D images from a video, it may make more sense to use a 2D convolutional LSTM, as outlined in this post: https://towardsdatascience.com/prototyping-an-anomaly-detection-system-for-videos-step-by-step-using-lstm-convolutional-4e06b7dcdd29. If using this method, is it possible to extract the compressed features from the last layer of the encoder (the "bottleneck") as you have below?
>> SEE WHAT'S INSIDE

model = Model(inputs=model.inputs, outputs=model.layers[0].output)

Jason Brownlee June 13, 2020 at 6:10 am #

Perhaps try it?

weiL June 24, 2020 at 12:07 am #

Hi Jason,

Is it possible to make a Conv1D+LSTM autoencoder? I think I saw an example in PyTorch, but I am not sure if there is an example in Keras?

Jason Brownlee June 24, 2020 at 6:33 am #

I don’t see why not. Try experimenting.

Ananthakrishnan September 6, 2020 at 3:05 am #
Hi sir,

I have used the same algorithm mentioned here for sequence reconstruction.
But I have a total of 1 sample, 2205 time steps and 1 feature.
I am getting the reconstructed values as NaN while using the 'relu' activation function.
Instead of 'relu', if I use 'tanh', the reconstruction works fine without NaN, but I get the same reconstructed values everywhere, which is also an error.
Kindly help me to get the correct reconstruction values.

Jason Brownlee September 6, 2020 at 6:07 am #
Perhaps try scaling your data prior to modeling?

Ananthakrishnan September 6, 2020 at 2:32 pm #
Thank you very much for your response.
Ananthakrishnan September 7, 2020 at 12:01 am #

Hi sir,

I have tried scaling my data with a technique called normalization. Now I am not getting NaN errors, but the reconstructed values are all the same, which is an error. If I feed a total of 10 values to the LSTM autoencoder with the relu activation function, reconstruction works very well.

In my case, I need to feed in the whole 2205 time steps. What can I do to get the correct reconstruction? Kindly waiting for your reply.

Thank you

Jason Brownlee September 7, 2020 at 8:33 am #

2205 time steps is probably too much, perhaps try splitting your long sequence into
subsequences:


https://machinelearningmastery.com/handle-long-sequences-long-short-term-memory-recurrent-neural-networks/
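For this specific case, a minimal NumPy sketch of splitting the 2205-step sequence into subsequences (the 147 x 15 split is just one choice):

import numpy as np

data = np.arange(2205, dtype='float32')  # one long univariate sequence
samples = data.reshape((147, 15, 1))     # (samples, time steps, features)
print(samples.shape)  # (147, 15, 1)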

Ananthakrishnan September 28, 2020 at 10:44 pm #

Sir,
I split my data (2205 time steps) into 147 different arrays. Each array now contains 15 elements, so 147 * 15 = 2205.

Now I am able to input only one array at a time to the model (sequence reconstruction).
How do I input these 147 arrays (each containing 15 elements) at once to the above mentioned model (sequence reconstruction), so that I get the 147 reconstructed arrays at once?
Waiting for your valuable reply.
Thank you.


Jason Brownlee September 29, 2020 at 5:37 am #


This will help you prepare data for LSTMs:
https://machinelearningmastery.com/faq/single-faq/what-is-the-difference-between-samples-timesteps-and-features-for-lstm-input


Jason Brownlee September 7, 2020 at 8:25 am #
You’re welcome.
Ananthakrishnan September 7, 2020 at 5:44 pm #

Thank you very much sir


Jason Brownlee September 8, 2020 at 6:47 am #

You’re welcome.

Abhijeet Parida September 17, 2020 at 5:18 am #

A nice and super easy to read article explaining the LSTM autoencoder.


I have a clarification regarding autoencoders being self-supervised. The task of an autoencoder is to learn the compressed representation. The task does not use any kind of label and so is completely unsupervised, as opposed to self-supervised.

If the task of the autoencoder were to learn to recreate the image, then you could call it self-supervised, as you provide a created label (the same image itself) where none existed.
So I feel that the statement regarding autoencoder being self-supervised is not entirely correct.
Jason Brownlee September 17, 2020 at 6:54 am #
Thanks.
How so?
Abhishek Mane September 25, 2020 at 7:44 am #

Hi Jason,
I'm trying something different as part of my Master's research.

I'm working on predicting hourly traffic for individual bike stations (like Lime Bike or Citi Bike).
So I have this data which has a start point and end point entry and the time. I convert this into a time series for each station based on the hourly traffic at the station.

My goal with the AE-LSTM is to use all the stations' hourly data, like an RBM or AE-LSTM, where the model predicts the next hour's traffic for all stations. (So it takes into account neighbouring stations' previous-hour data and the current station's last 24 hourly time steps of traffic data.)
Now I tried to use the model from this tutorial, but I'm stuck with an error:
"ValueError: Error when checking target: expected time_distributed_5 to have 3 dimensions, but got array with shape (11221, 175)"

My input data shape is (11221, 23, 175) and my output should be something like (11221, 175).

The last LSTM layer generates the output size but due to the TimeDistributed layer I get an error.

Any thoughts you may have would be really helpful.


Jason Brownlee September 25, 2020 at 7:45 am #

You may need to reshape your target to be [11221, 175, 1], try that and let me know how
you go.

Abhishek Mane October 2, 2020 at 3:21 am #

Hi Jason,

That did not work for some reason, but after reading articles on TimeDistributedDense, I think TimeDistributedDense is more important in one-to-many and many-to-many cases (predicting multiple time steps).

So I tried with just stacked LSTM layers and a final Dense layer. It works, but I'm not sure if this method will give me good results.
I'm not sure what the RepeatVector layer actually does, but I did not include it in the LSTM-and-Dense-only architecture.
Can this architecture be called an AE-LSTM?

--------------------------------------------------------------------------------

The old AE-LSTM (with TimeDistributedDense):

model = Sequential()
model.add(LSTM(64, dropout=0.2, recurrent_dropout=0.2, input_shape=(X0_train.shape[1], X0_train.shape[2]), activation='relu'))
model.add(keras.layers.RepeatVector(n=X0_train.shape[1]))
model.add(LSTM(64, dropout=0.2, recurrent_dropout=0.2, activation='relu', return_sequences=True))
model.add(keras.layers.TimeDistributed(Dense(X0_train.shape[2])))

# Compile model
model.compile(loss='mse', optimizer='adam')

--------------------------------------------------------------------------------

Model: “sequential_1”
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_1 (LSTM)                (None, 64)                61440
_________________________________________________________________
repeat_vector_1 (RepeatVecto (None, 23, 64)            0
_________________________________________________________________
lstm_2 (LSTM)                (None, 23, 64)            33024
_________________________________________________________________
time_distributed_1 (TimeDist (None, 23, 175)           11375
=================================================================
Total params: 105,839
Trainable params: 105,839
Non-trainable params: 0

--------------------------------------------------------------------------------

and finally the error:

ValueError: Error when checking target: expected time_distributed_1 to have shape (23, 175) but got array with shape (175, 1)



Abhishek Mane October 2, 2020 at 5:02 am #

My goal here is to predict only the next hour’s values, so I think the Dense layer is good for my case.
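A minimal sketch of that many-to-one variant (stacked LSTMs plus a final Dense layer) with the shapes described above; the layer sizes and variable names are assumptions, not code from the tutorial:

from keras.models import Sequential
from keras.layers import LSTM, Dense

# input: 23 hourly timesteps for 175 stations; output: next hour for all 175
model = Sequential()
model.add(LSTM(64, input_shape=(23, 175), return_sequences=True))
model.add(LSTM(64))    # last LSTM returns a single vector, not a sequence
model.add(Dense(175))  # one prediction per station
model.compile(loss='mse', optimizer='adam')

# model.fit(X0_train, y_train) with X0_train.shape == (11221, 23, 175)
# and y_train.shape == (11221, 175); no 3D target is needed here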

Jason Brownlee October 2, 2020 at 6:03 am #

I’m eager to help, but I don’t have the capacity to review/debug your code, see this:
https://machinelearningmastery.com/faq/single-faq/can-you-read-review-or-debug-my-code
Abhishek Mane October 2, 2020 at 6:59 am #
Thank you very much Jason.

It meant a lot that you got back to me.

What I really wanted to know was what exactly the TimeDistributedDense and the RepeatVector layers do.

I found out after some deep searching.


Thanks a lot.
Sincerely,
Abhishek

Jason Brownlee October 2, 2020 at 8:11 am #

The repeat vector repeats the output of the encoder n times, where n is the number of outputs required from the decoder.

The time distributed wrapper allows you to use the same decoder for each step of the output instead of outputting a vector directly.
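A minimal sketch to make those shapes concrete (the layer sizes here are arbitrary, chosen only for illustration):

from keras.models import Sequential
from keras.layers import LSTM, Dense, RepeatVector, TimeDistributed

model = Sequential()
model.add(LSTM(8, input_shape=(5, 3)))     # encoder output: (None, 8)
model.add(RepeatVector(5))                 # repeated 5 times: (None, 5, 8)
model.add(LSTM(8, return_sequences=True))  # decoder output: (None, 5, 8)
model.add(TimeDistributed(Dense(3)))       # same Dense per step: (None, 5, 3)
model.summary()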

Taemin October 2, 2020 at 4:12 pm #

Hi,

I am figuring out the “prediction autoencoder LSTM.” I am wondering which part is the prediction, because the input is [1 2 3 … 9] and the output is [around 2, around 3, …, around 9]. I am interested in the value after 9, but this system doesn’t show that result, so it looks like just reconstruction.
So, I would appreciate it if you would let me know which part is the prediction part in this system.

Jason Brownlee October 3, 2020 at 6:05 am #

The first input is 1 and the first output is 2, etc.


Taemin October 3, 2020 at 2:30 pm #
Thank you for your reply.
But, I may have to ask again more specifically.

So, in your example, your input sequence is
[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9] and the output sequence is
[0.1657285 0.28903174 0.40304852 0.5096578 0.6104322 0.70671254 0.7997272 0.8904342].
According to your answer, if 0.1657285 is the prediction after the input 0.1, what is the prediction after the input 0.9?
Because the last output is 0.8904342, which is the prediction after 0.8, I don’t see the prediction after the input 0.9.
How to Develop an Encoder-Decoder You can master applied Machine Learning
Thank
Model you.
with Attention in Keras without math or fancy degrees.
Find out how in this free and practical course.

Jason Brownlee October 4, 2020 at 6:50 am #

We don’t predict an output for the input of 0.9, because we don’t know the answer.
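To make the reconstruction-versus-prediction distinction concrete, here is a minimal sketch of how the prediction variant shifts its training target (the array follows the 9-step example above; the variable names are placeholders):

import numpy as np

# the 9-step input sequence from the example above
seq = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

# reconstruction autoencoder: the target is the input itself (9 values)
recon_target = seq

# prediction autoencoder: the target is the input shifted one step ahead,
# so there are only 8 targets and no known answer after the final 0.9
pred_target = seq[1:]     # [0.2, 0.3, ..., 0.9]
print(pred_target.shape)  # (8,)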
