TensorFlow LSTM-Cell Output

(using Python)

I have a question about TensorFlow's LSTM implementation. There are currently several LSTM implementations in TF, but I use:

cell = tf.contrib.rnn.BasicLSTMCell(n_units) 
  • where n_units is the number of "parallel" LSTM cells.

Then, to get my output, I call (a minimal runnable sketch follows after this list):

  rnn_outputs, rnn_states = tf.nn.dynamic_rnn(cell, x, initial_state=initial_state, time_major=False)
  • where (since time_major=False ) x has the shape (batch_size, time_steps, input_length)
  • where batch_size is my batch size
  • where time_steps is the number of time steps my RNN will step through
  • where input_length is the length of one of my input vectors (one vector fed into the network at one specific time step in one particular batch)
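A minimal self-contained sketch of this setup, assuming TensorFlow 1.x (where tf.contrib.rnn is available); the sizes are arbitrary example values, and I pass dtype instead of an explicit initial_state so the cell starts from its default zero state:

    import numpy as np
    import tensorflow as tf

    batch_size, time_steps, input_length = 4, 10, 8   # arbitrary example sizes
    n_units = 16

    x = tf.placeholder(tf.float32, [None, time_steps, input_length])
    cell = tf.contrib.rnn.BasicLSTMCell(n_units)
    # dtype=tf.float32 makes dynamic_rnn create a zero initial state itself
    rnn_outputs, rnn_states = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32, time_major=False)

    print(rnn_outputs.shape)  # (?, 10, 16) == (batch_size, time_steps, n_units)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        out = sess.run(rnn_outputs,
                       {x: np.zeros((batch_size, time_steps, input_length), np.float32)})
        print(out.shape)  # (4, 10, 16)

Note that the static shape already shows the point of the question: the last dimension of rnn_outputs is n_units, not input_length.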

I expect rnn_outputs to have the shape (batch_size, time_steps, n_units, input_length) , since I did not specify a different output size. The tf.nn.dynamic_rnn documentation tells me that the output has the shape (batch_size, time_steps, cell.output_size) . The tf.contrib.rnn.BasicLSTMCell documentation says the cell has an output_size property, whose default value is n_units (the number of LSTM cells I use).

So each LSTM cell outputs only a single scalar for each time step? I would expect it to output a vector of the same length as the input vector. That does not seem to be the case as I currently understand it, so I'm confused. Can you tell me whether this is really so, or how I could change things so that one LSTM cell outputs a vector of the input vector's size?

1 answer

I think the main confusion is about the terminology of the LSTM cell argument num_units . Unfortunately, it does not mean, as the name might suggest, "the number of LSTM cells" (which would then have to equal your number of time steps). It is actually the number of dimensions in the hidden state (the cell state plus the hidden state vector). The dynamic_rnn() call returns a tensor of shape [batch_size, time_steps, output_size] where,

  output_size = num_units   if num_proj is None in the LSTM cell (the default)
  output_size = num_proj    if num_proj is defined
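A sketch illustrating both cases; note that num_proj is an argument of tf.contrib.rnn.LSTMCell (not BasicLSTMCell), the sizes are arbitrary example values, and the scope names only keep the two RNNs' variables separate:

    import tensorflow as tf

    x = tf.placeholder(tf.float32, [None, 10, 8])  # (batch, time_steps, input_length)

    # Without num_proj: output_size == num_units
    cell_a = tf.contrib.rnn.LSTMCell(num_units=16)
    out_a, _ = tf.nn.dynamic_rnn(cell_a, x, dtype=tf.float32, scope="rnn_a")
    print(out_a.shape)  # (?, 10, 16)

    # With num_proj: the hidden state is projected, so output_size == num_proj
    cell_b = tf.contrib.rnn.LSTMCell(num_units=16, num_proj=8)
    out_b, _ = tf.nn.dynamic_rnn(cell_b, x, dtype=tf.float32, scope="rnn_b")
    print(out_b.shape)  # (?, 10, 8)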

Now, typically, you would extract the output of the last time step and project it to your desired output size with a manual matmul + bias operation, or use the num_proj argument of the LSTM cell.
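A sketch of the manual projection route, assuming TF 1.x; out_dim is a hypothetical target size (for example, the input_length from the question):

    import tensorflow as tf

    x = tf.placeholder(tf.float32, [None, 10, 8])
    cell = tf.contrib.rnn.BasicLSTMCell(16)
    rnn_outputs, _ = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)

    # take only the output of the last time step: (batch_size, n_units)
    last_output = rnn_outputs[:, -1, :]

    out_dim = 8  # hypothetical target size, e.g. input_length
    W = tf.get_variable("W", [16, out_dim])
    b = tf.get_variable("b", [out_dim], initializer=tf.zeros_initializer())
    prediction = tf.matmul(last_output, W) + b  # shape: (batch_size, out_dim)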
I went through the same confusion myself and had to dig quite deep to clear it up. I hope this answer clears some of it up for you.


Source: https://habr.com/ru/post/1015299/