How to feed an RNN's output back into its input in TensorFlow

Suppose I have a trained RNN (for example, a language model) and I want to see what it generates on its own. How should I feed its output back into its input?

I have read the related questions on this topic.

It is theoretically clear to me that in TensorFlow we use truncated backpropagation through time, so we need to set the maximum number of steps we want to unroll. In addition, we keep a dimension for batches, so to train on a sine wave I have to feed inputs of shape [None, num_steps, 1].

The following code works:

    import numpy as np
    import tensorflow as tf
    import matplotlib.pyplot as plt

    tf.reset_default_graph()
    n_samples = 100
    state_size = 5

    lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(state_size, forget_bias=1.)
    def_x = np.sin(np.linspace(0, 10, n_samples))[None, :, None]
    zero_x = np.zeros(n_samples)[None, :, None]
    X = tf.placeholder_with_default(zero_x, [None, n_samples, 1])
    output, last_states = tf.nn.dynamic_rnn(inputs=X, cell=lstm_cell, dtype=tf.float64)
    pred = tf.contrib.layers.fully_connected(output, 1, activation_fn=tf.tanh)

    Y = np.roll(def_x, 1)
    loss = tf.reduce_sum(tf.pow(pred - Y, 2)) / (2 * n_samples)
    opt = tf.train.AdamOptimizer().minimize(loss)

    sess = tf.InteractiveSession()
    tf.global_variables_initializer().run()

    # Initial run before training: raw LSTM outputs, the target sine wave,
    # and the untrained predictions.
    plt.plot(output.eval()[0])
    plt.show()
    plt.plot(def_x.squeeze())
    plt.plot(pred.eval().squeeze())
    plt.show()

    steps = 1001
    for i in range(steps):
        p, l, _ = sess.run([pred, loss, opt])

The size of the LSTM state can vary; I also experimented with feeding the network a sine wave and with feeding zeros, and in both cases it converged in about 500 iterations. So far, my understanding is that in this case the graph consists of n_samples LSTM cells sharing their parameters, and it is I who feeds the input to it as a time series. However, when generating samples the network explicitly depends on its previous output, so I cannot simply run the unrolled model in one go. I tried to compute the state and the output at every step:

    with tf.variable_scope('sine', reuse=True):
        X_test = tf.placeholder(tf.float64)
        X_reshaped = tf.reshape(X_test, [1, -1, 1])
        output, last_states = tf.nn.dynamic_rnn(lstm_cell, X_reshaped, dtype=tf.float64)
        pred = tf.contrib.layers.fully_connected(output, 1, activation_fn=tf.tanh)

    test_vals = [0.]
    for i in range(1000):
        # Re-run the whole sequence generated so far and keep the last prediction.
        val = pred.eval({X_test: np.array(test_vals)[None, :, None]})
        test_vals.append(val[0, -1, 0])

However, in this model there seems to be no continuity between the time steps. What is going on here?

Do I need to initialize a zero array of, say, 100 time steps and assign each run's result to the corresponding slot? Something like feeding the network with this:

run 0: input_feed = [0, 0, 0, ..., 0]; res1 = result

run 1: input_feed = [res1, 0, 0, ..., 0]; res2 = result

run 2: input_feed = [res1, res2, 0, ..., 0]; res3 = result

etc...
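
For concreteness, here is a minimal sketch of that scheme against the test graph above (reusing its `X_test` and `pred`); it is illustrative only, and note that it re-runs the entire prefix on every step, so it is quadratic in the sequence length:

    # Sketch of the "fill a zero array slot by slot" scheme described above.
    import numpy as np

    input_feed = np.zeros(100)
    for t in range(99):
        # Full forward pass over the whole (partially zero) sequence.
        res = pred.eval({X_test: input_feed[None, :, None]})
        # The prediction at step t becomes the input at step t + 1;
        # the zeros beyond t cannot influence it, since the RNN is causal.
        input_feed[t + 1] = res[0, t, 0]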

How can I make this trained network use its own output as the input for the next time step?

+7
3 answers

If I understand you correctly, you want to find a way to feed the output of time step t as the input to time step t+1, right? There is a relatively simple workaround for this that you can use at test time:

  • Make sure your input placeholders can accept a dynamic sequence length, i.e. the size of the time dimension is None.
  • Make sure you use tf.nn.dynamic_rnn (which you already do in the posted example).
  • Pass an initial state to dynamic_rnn.
  • Then, at test time, loop over your sequence and feed each step separately (i.e. with a maximum sequence length of 1), carrying over the internal state of the RNN between steps. See the pseudo-code below (the variable names refer to your code snippet).

That is, change the model definition to something like this:

    lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(state_size, forget_bias=1.)

    def_x = np.sin(np.linspace(0, 10, n_samples))[None, :, None]
    zero_x = np.zeros(n_samples)[None, :, None]

    # [batch_size, seq_length, input dimension]; the time dimension is None
    # so we can feed sequences of any length, including single steps.
    X = tf.placeholder_with_default(zero_x, [None, None, 1])
    batch_size = tf.shape(X)[0]
    initial_state = lstm_cell.zero_state(batch_size, dtype=tf.float64)

    output, last_states = tf.nn.dynamic_rnn(inputs=X, cell=lstm_cell,
                                            dtype=tf.float64,
                                            initial_state=initial_state)
    pred = tf.contrib.layers.fully_connected(output, 1, activation_fn=tf.tanh)

Then you can run inference as follows:

    fetches = {'final_state': last_states, 'prediction': pred}

    toy_initial_input = np.array([[[1]]])  # put suitable data here
    seq_length = 20  # put whatever is reasonable here for you

    # Get the output for the first time step.
    feed_dict = {X: toy_initial_input}
    eval_out = sess.run(fetches, feed_dict)
    outputs = [eval_out['prediction']]
    next_state = eval_out['final_state']

    for i in range(1, seq_length):
        # Feed the previous prediction and carry the RNN state over.
        feed_dict = {X: outputs[-1], initial_state: next_state}
        eval_out = sess.run(fetches, feed_dict)
        outputs.append(eval_out['prediction'])
        next_state = eval_out['final_state']

    # outputs now contains the sequence you want

Note that this also works for batches; however, it gets a bit more complicated if sequences in the same batch end at different lengths.

If you want to do this kind of prediction not only at test time but also during training, that is possible too, just a bit more complicated to implement.
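
For what it's worth, here is one hedged sketch of how that could look, building on the definitions above; `Y_step`, `step_loss`, and `step_opt` are hypothetical additions, and since gradients do not flow across `sess.run` calls, this amounts to truncated backpropagation with a window of a single step:

    # Hypothetical per-step training ops added next to the graph above.
    Y_step = tf.placeholder(tf.float64, [None, None, 1])  # target for the fed step
    step_loss = tf.reduce_mean(tf.square(pred - Y_step))
    # GradientDescentOptimizer creates no slot variables, so the already
    # trained graph does not need to be re-initialized.
    step_opt = tf.train.GradientDescentOptimizer(0.01).minimize(step_loss)

    # Training loop that feeds the model's own previous prediction back in.
    state = sess.run(lstm_cell.zero_state(1, tf.float64))
    inp = np.zeros((1, 1, 1))
    for t in range(n_samples - 1):
        target = def_x[:, t + 1:t + 2, :]  # next ground-truth value
        _, p, state = sess.run(
            [step_opt, pred, last_states],
            {X: inp, initial_state: state, Y_step: target})
        inp = p  # the prediction becomes the next input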

+5

You can feed your network's output state (the last state) back in as the input state (the initial state) of the next step. One way to do this is to:

  • use zero-initialized variables to hold the input state at each time step;
  • every time you finish a truncated sequence and get some output state, update those state variables with the output state you just obtained.

The second step can be done, for example, with tf.assign ops that copy the new state into the variables.
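
A minimal sketch of this idea, assuming a single-element batch and an LSTM cell as in the question (the names here are illustrative):

    import tensorflow as tf

    state_size = 5
    lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(state_size, forget_bias=1.)
    X = tf.placeholder(tf.float64, [1, None, 1])

    # Non-trainable variables that persist the LSTM state across session runs.
    state_c = tf.Variable(tf.zeros([1, state_size], dtype=tf.float64), trainable=False)
    state_h = tf.Variable(tf.zeros([1, state_size], dtype=tf.float64), trainable=False)
    init_state = tf.nn.rnn_cell.LSTMStateTuple(state_c, state_h)

    output, last_states = tf.nn.dynamic_rnn(lstm_cell, X, initial_state=init_state)

    # Copy the final state of this run back into the variables, so the next
    # run continues where this one left off.
    update_state = tf.group(
        tf.assign(state_c, last_states.c),
        tf.assign(state_h, last_states.h))

    # Per truncated chunk:
    # sess.run([output, update_state], feed_dict={X: chunk})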

0

I know I'm a bit late to the party, but I think this gist can be useful:

https://gist.github.com/CharlieCodex/f494b27698157ec9a802bc231d8dcf31

It lets you automatically feed the network's output back in as its input, optionally passing it through a filter. To make the shapes match up, the per-step processing can be set up as a tf.layers.Dense layer.
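
As a small illustration of that shape-matching point (not the gist's actual code), a Dense layer can project the cell output back down to the input dimension so it can be fed straight back in:

    import tensorflow as tf

    state_size = 5  # illustrative: the LSTM's output dimension
    cell_output = tf.placeholder(tf.float32, [None, state_size])

    # Project the state_size-dimensional cell output down to the 1-D input
    # space, so the result can be fed back in as the next time step's input.
    to_input = tf.layers.Dense(units=1, activation=tf.tanh)
    next_input = to_input(cell_output)  # shape [batch, 1]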

Feel free to ask any questions!

0
