Unable to stack LSTMs using MultiRNNCell and dynamic_rnn

I am trying to build a multivariate time series forecasting model. I followed this temperature prediction tutorial: http://nbviewer.jupyter.org/github/addfor/tutorials/blob/master/machine_learning/ml16v04_forecasting_with_LSTM.ipynb

I want to extend my model to a multi-layer LSTM model using the following code:

    cell = tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True)
    cell = tf.contrib.rnn.MultiRNNCell([cell] * num_layers, state_is_tuple=True)
    output, _ = tf.nn.dynamic_rnn(cell=cell, inputs=features, dtype=tf.float32)

but I have an error:

    ValueError: Dimensions must be equal, but are 256 and 142 for
    'rnn/while/rnn/multi_rnn_cell/cell_0/cell_0/lstm_cell/MatMul_1'
    (op: 'MatMul') with input shapes: [?,256], [142,512].

When I tried this:

    cell = []
    for i in range(num_layers):
        cell.append(tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True))
    cell = tf.contrib.rnn.MultiRNNCell(cell, state_is_tuple=True)
    output, _ = tf.nn.dynamic_rnn(cell=cell, inputs=features, dtype=tf.float32)

I no longer get the error, but the predictions are really bad.

I define hidden = 128.

features = tf.reshape(features, [-1, n_steps, n_input]) has shape (?, 1, 14) in the single-layer case.

My data is as follows: x.shape = (594, 14), y.shape = (591, 1).

I am confused about how to stack LSTM cells in TensorFlow. My version of tensorflow is 0.14.

1 answer

This is a very interesting question. Initially, I thought the two snippets would produce the same thing (i.e. stack two LSTM cells).

code 1

    cell = tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True)
    cells = [cell] * num_layers
    print(cells)  # the list that MultiRNNCell will wrap
    cell = tf.contrib.rnn.MultiRNNCell(cells, state_is_tuple=True)

code 2

    cells = []
    for i in range(num_layers):
        cells.append(tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True))
    print(cells)  # two independently constructed cells
    cell = tf.contrib.rnn.MultiRNNCell(cells, state_is_tuple=True)

However, if you print the list of cells in both cases, it produces something like the following:

code 1

    [<tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D7084E0>,
     <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D7084E0>]

code 2

    [<tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D7084E0>,
     <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D708B00>]

If you look closely at the results (see the identity-check sketch after this list):

  • For code 1, it prints a list of two LSTM cell objects where one is a copy of the other (the two objects have the same memory address).
  • For code 2, it prints a list of two different LSTM cell objects (the two objects have different memory addresses).
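You can verify this object identity directly. Below is a minimal sketch (TF 1.x contrib API, as in the question; variable names are illustrative):

    import tensorflow as tf

    hidden, num_layers = 128, 2

    # Code 1 pattern: the list holds num_layers references to ONE cell object.
    cell = tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True)
    cells_shared = [cell] * num_layers
    print(cells_shared[0] is cells_shared[1])      # True - the same object twice

    # Code 2 pattern: each iteration constructs a NEW, independent cell object.
    cells_separate = [tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True)
                      for _ in range(num_layers)]
    print(cells_separate[0] is cells_separate[1])  # False - two distinct objects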

Stacking two LSTM cells looks something like this:

[Diagram: two stacked LSTM cells, with the output of LSTM cell 1 feeding the input of LSTM cell 2]

Therefore, thinking about the big picture (the actual TensorFlow implementation may differ), what it does is:

  • First, it maps the input to the hidden units of LSTM cell 1 (in your case, 14 to 128).
  • Second, it maps the hidden units of LSTM cell 1 to the hidden units of LSTM cell 2 (in your case, 128 to 128); the shape arithmetic after this list shows how this produces exactly the numbers in your error.
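To see where your exact numbers come from, here is a back-of-the-envelope check (assuming the standard LSTM kernel shape [input_dim + num_units, 4 * num_units]; plain Python, no graph needed):

    n_input, hidden = 14, 128

    # Standard LSTM kernel shape: [input_dim + num_units, 4 * num_units].
    layer1_kernel = (n_input + hidden, 4 * hidden)  # (142, 512)
    layer2_kernel = (hidden + hidden, 4 * hidden)   # (256, 512)
    print(layer1_kernel, layer2_kernel)

    # Reusing one cell object forces one kernel to serve both layers:
    # layer 2's [?, 256] input cannot be multiplied by layer 1's
    # [142, 512] kernel - exactly the shapes in the ValueError above.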

Therefore, when you try to perform these two operations with the same copy of the LSTM cell, an error occurs, since the required weight-matrix dimensions differ between the layers.

However, if you make the number of hidden units equal to the number of input units (14 inputs and 14 hidden units in your case), there is no error (since the weight-matrix dimensions are the same), even though you use the same LSTM cell.
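As a sanity check of that special case, a minimal sketch (TF 1.x contrib API; exact behavior may vary across TF 1.x versions, and the two layers would share one set of weights, which is rarely what you want):

    import tensorflow as tf

    n_steps, n_input = 1, 14
    features = tf.placeholder(tf.float32, [None, n_steps, n_input])

    # Both layers need a kernel of shape [14 + 14, 4 * 14] = [28, 56],
    # so reusing one cell object happens to build without a shape error.
    cell = tf.contrib.rnn.LSTMCell(n_input, state_is_tuple=True)
    stacked = tf.contrib.rnn.MultiRNNCell([cell] * 2, state_is_tuple=True)
    output, _ = tf.nn.dynamic_rnn(cell=stacked, inputs=features, dtype=tf.float32)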

Therefore, I think your second approach is correct if your goal is to stack two LSTM cells.
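For completeness, here is a consolidated sketch of that second approach under the shapes from your question (a minimal reconstruction, not a full forecasting script):

    import tensorflow as tf

    n_steps, n_input, hidden, num_layers = 1, 14, 128, 2

    features = tf.placeholder(tf.float32, [None, n_steps, n_input])  # (?, 1, 14)

    # One NEW cell per layer, so each layer gets its own weight matrices.
    cells = [tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True)
             for _ in range(num_layers)]
    cell = tf.contrib.rnn.MultiRNNCell(cells, state_is_tuple=True)
    output, _ = tf.nn.dynamic_rnn(cell=cell, inputs=features, dtype=tf.float32)
    # output shape: (?, 1, 128)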


Source: https://habr.com/ru/post/1273498/

