Understanding LSTM units vs. cells

Asked 4 years, 5 months ago Active 5 months ago Viewed 37k times

29

I have been studying LSTMs for a while. I understand at a high level how everything works.
However, when implementing them with TensorFlow, I noticed that BasicLSTMCell requires a
number of units (i.e. num_units) parameter.

From this very thorough explanation of LSTMs, I've gathered that a single LSTM unit is one of the
following

[image]

which is actually a GRU unit.

I assume that the num_units parameter of BasicLSTMCell refers to how many of these we
want to hook up to each other in a layer.

That leaves the question: what is a "cell" in this context? Is a "cell" equivalent to a layer in a
normal feed-forward neural network?

neural-networks terminology lstm rnn tensorflow

edited Nov 10 '17 at 23:18 asked Oct 23 '16 at 23:37
user124589

I am still confused. I was reading colah.github.io/posts/2015-08-Understanding-LSTMs and I understand
that well. How does the term cell apply with respect to that article? It seems that an LSTM cell in the
article is a vector as in Tensorflow, right? – Charlie Parker Apr 19 '17 at 4:44

The units in Keras is the dimension of the output space, which is equal to the length of the delay
(time_step) the network is recurring to. keras.layers.LSTM(units, activation='tanh', ....)
keras.io/layers/recurrent – notilas May 28 '19 at 23:12
6 Answers

22

The terminology is unfortunately inconsistent. num_units in TensorFlow is the number of hidden
states, i.e. the dimension of h_t in the equations you gave.

Also, from
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/api_docs/python/functions_and_classes/shard9/tf.nn.rnn_cell.RNNCell.md :

The definition of cell in this package differs from the definition used in the literature. In the
literature, cell refers to an object with a single scalar output. The definition in this package
refers to a horizontal array of such units.
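
To make the "horizontal array of units" reading concrete, here is a minimal pure-Python sketch of a single LSTM time step. It assumes the standard LSTM equations (input, forget and output gates plus a candidate value) and is not TensorFlow's implementation; the names init_lstm, lstm_step and affine are made up for this illustration, and the weights are random.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, x, b):
    """Return W @ x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b_k for row, b_k in zip(W, b)]


def init_lstm(input_dim, num_units, seed=0):
    """Hypothetical helper: one (W, b) pair per gate, each producing num_units values."""
    rng = random.Random(seed)
    params = {}
    for gate in ("i", "f", "o", "g"):
        W = [[rng.uniform(-0.1, 0.1) for _ in range(input_dim + num_units)]
             for _ in range(num_units)]
        params[gate] = (W, [0.0] * num_units)
    return params


def lstm_step(params, x_t, h_prev, c_prev):
    """One time step: every gate, the cell state and the hidden state have length num_units."""
    z = list(x_t) + list(h_prev)  # concatenation [x_t, h_prev]

    def gate(name, activation):
        W, b = params[name]
        return [activation(v) for v in affine(W, z, b)]

    i = gate("i", sigmoid)    # input gate
    f = gate("f", sigmoid)    # forget gate
    o = gate("o", sigmoid)    # output gate
    g = gate("g", math.tanh)  # candidate cell values
    c_t = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    h_t = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c_t)]
    return h_t, c_t


input_dim, num_units = 3, 5
params = init_lstm(input_dim, num_units)
h, c = [0.0] * num_units, [0.0] * num_units
h, c = lstm_step(params, [1.0, -0.5, 0.25], h, c)
print(len(h), len(c))  # prints: 5 5
```

Under these assumptions, each of the num_units entries of h plays the role of one scalar-output "unit" in the literature's sense, and the whole vector is what the package calls a cell.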

"LSTM layer" is probably more explicit, example:
def lstm_layer(tparams, state_below, options, prefix='lstm', mask=None):
    # The leading axis of state_below is the time axis.
    nsteps = state_below.shape[0]
    # A 3-D input carries a batch of samples; otherwise assume one sample.
    if state_below.ndim == 3:
        n_samples = state_below.shape[1]
    else:
        n_samples = 1

    assert mask is not None
    […]

edited Oct 24 '16 at 1:36 answered Oct 24 '16 at 1:30
Franck Dernoncourt
38.7k 26 142 264

Ah I see, so then a "cell" is a num_unit sized horizontal array of interconnected LSTM cells. Makes
sense. So then it would be analogous to a hidden layer in a standard feed-forward network then? –
user124589 Oct 24 '16 at 1:32

*LSTM state units – user124589 Oct 24 '16 at 1:33

@rec That's correct – Franck Dernoncourt Oct 24 '16 at 1:33

1 @Sycorax for example, if the input of the neural network is a timeseries with 10 time steps, the horizontal
dimension has 10 elements. – Franck Dernoncourt Oct 24 '16 at 3:36

1 I am still confused. I was reading colah.github.io/posts/2015-08-Understanding-LSTMs and I understand
that well. How does the term cell apply with respect to that article? It seems that an LSTM cell in the
article is a vector as in Tensorflow, right? – Charlie Parker Apr 19 '17 at 4:44

7

Most LSTM/RNN diagrams just show the hidden cells but never the units of those cells. Hence the
confusion. Each hidden layer has hidden cells, as many as the number of time steps. Further,
each hidden cell is made up of multiple hidden units, like in the diagram below. Therefore, the
dimensionality of a hidden layer matrix in RNN is (number of time steps, number of hidden units).
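
The (number of time steps, number of hidden units) shape can be checked with a small sketch. For brevity it assumes a plain tanh recurrence standing in for a full LSTM cell; make_rnn_cell and unroll are hypothetical helper names, and the weights are random.

```python
import math
import random


def make_rnn_cell(input_dim, num_units, seed=0):
    """Hypothetical helper: build a tanh recurrent cell with one fixed weight matrix."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.5, 0.5) for _ in range(input_dim + num_units)]
         for _ in range(num_units)]

    def cell(x_t, h_prev):
        z = list(x_t) + list(h_prev)  # concatenation [x_t, h_prev]
        return [math.tanh(sum(w * v for w, v in zip(row, z))) for row in W]

    return cell


def unroll(cell, sequence, num_units):
    """Apply the same cell at every time step and collect one hidden vector per step."""
    h = [0.0] * num_units
    states = []
    for x_t in sequence:
        h = cell(x_t, h)
        states.append(h)
    return states


time_steps, input_dim, num_units = 10, 2, 4
sequence = [[math.sin(t), math.cos(t)] for t in range(time_steps)]
cell = make_rnn_cell(input_dim, num_units)
states = unroll(cell, sequence, num_units)
print(len(states), len(states[0]))  # prints: 10 4
```

In this sketch the rows of states correspond to the "hidden cells" (one per time step) and the columns to the hidden units; feeding a shorter or longer sequence changes only the number of rows, since the same weights are reused at every step.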

answered Jan 30 '19 at 10:10
Garima Jain
169 1 1

4

Although the issue is almost the same as the one I addressed in this answer, I'd like to illustrate it
in a graph, since it also confused me a bit today in the seq2seq model (thanks to
@Franck Dernoncourt's answer). In this simple encoder diagram:

[image]

Each h_i above is the same cell at a different time step (the cell is either a GRU or an LSTM, as in
your question), and the weight vectors (not the bias) in the cell all have the same size
(num_units/num_hidden or state_size or output_size).

An RNN is a special type of graphical model where nodes form a directed list, as explained in section 4
of this paper: Supervised Neural Networks for the Classification of Structures. We can think of
num_units as the number of tags in a CRF (although a CRF is undirected), and the matrices (the W's in
the graph in the question) are all shared across all time steps, like the transition matrix in a CRF.

edited Mar 15 '20 at 7:49 answered Apr 24 '17 at 13:43
Lerner Zhang
3,858 1 23 42

I believe the num_units = n in this figure – notilas May 28 '19 at 23:10

@notilas No, please don't. num_units is the dimension of the h_i. – Lerner Zhang Mar 5 '20 at 4:25

1

In keras.layers.LSTM(units, activation='tanh', ....), units refers to the dimensionality
(length) of the hidden state, i.e. the length of the activation vector passed on to the next
LSTM cell/unit. The "next LSTM cell/unit" here is the green box with the gates etc. from
http://colah.github.io/posts/2015-08-Understanding-LSTMs/

That LSTM cell/unit (the green box) is NOT the same as the units in
keras.layers.LSTM(units, activation='tanh', ....)

The units are also sometimes called the latent dimensions. Here is a detailed explanation of
the units LSTM parameter:

https://zhuanlan.zhihu.com/p/58854907
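
One way to see that units is a vector length and not a count of time steps is to count trainable parameters. The sketch below assumes the standard LSTM layout (four gates, each with an input kernel, a recurrent kernel and a bias, no peephole connections); lstm_param_count is a hypothetical helper written for this illustration, not a Keras function.

```python
def lstm_param_count(input_dim, units):
    """Parameter count for a standard LSTM layer with biases and no peepholes."""
    # Each gate maps the concatenation [x_t, h_prev] (input_dim + units values)
    # to `units` outputs, plus one bias per output.
    per_gate = units * (input_dim + units) + units
    return 4 * per_gate  # input, forget and output gates plus the candidate


print(lstm_param_count(input_dim=3, units=5))  # 4 * (5 * 8 + 5) = 180
```

The sequence length does not appear anywhere in the formula, which is consistent with units describing the size of the hidden state and with the weights being reused at every time step.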

edited Nov 9 '20 at 5:00 answered Nov 9 '20 at 4:55
Utpal Mattoo
149 1 5

0

Quoting from TF's tutorial on RNNs:

In addition to the built-in RNN layers, the RNN API also provides cell-level APIs. Unlike
RNN layers, which processes whole batches of input sequences, the RNN cell only
processes a single timestep.

answered Jun 29 '20 at 17:26
mostafa.elhoushi
139 4

-1

In my opinion, a cell means a node, such as a hidden cell (also called a hidden node). For a
multilayer LSTM model, the number of cells can be computed as time_steps*num_layers, and
num_units is equal to time_steps.

answered Jun 7 '18 at 8:41
user210864
1
