How to build a simple RNN with a loop on a graph in TensorFlow?

I just started playing with TensorFlow and I'm trying to implement a very simple RNN. The RNN takes x as input and produces y as output, and it consists of a single layer that takes both x and its own previous output as input. Here is a picture of what I mean:

Simple RNN

The problem is that I don't see any way to use the TensorFlow API to build a graph with a cycle in it. Whenever I define a tensor, I have to specify how it is computed, which means its inputs must already be defined. So there is a chicken-and-egg problem.

I don't even know whether it makes sense to define a graph with a cycle in it (what gets computed first? Would I have to specify an initial value for the softmax node?). I toyed with the idea of using a variable to hold the previous output, then manually fetching the y value and storing it back into the variable each time after feeding in a training example. But that would be very slow unless there were some way to express this procedure inside the graph itself (?).
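The manual feedback idea above can be sketched framework-agnostically in plain NumPy (this is an illustration only, not TensorFlow code; all names and sizes are made up). The key point is that the previous output lives in an ordinary variable outside the computation, and each step is a separate call, which is exactly the per-example round trip that would make this slow in a real framework:

```python
import numpy as np

# Toy weights: input size 3, hidden size 4 (illustrative values only).
rng = np.random.default_rng(1)
W_x = rng.standard_normal((3, 4)) * 0.1
W_h = rng.standard_normal((4, 4)) * 0.1

prev_output = np.zeros((1, 4))  # the "variable" holding the last y

for step in range(5):  # one round trip per training example
    x = rng.standard_normal((1, 3))
    # One forward step; in a real framework this would be a separate
    # session/graph call each time, which is what makes the approach slow.
    y = np.tanh(x @ W_x + prev_output @ W_h)
    prev_output = y  # manually store y back into the variable

print(prev_output.shape)  # (1, 4)
```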

I know the TensorFlow tutorials show example RNN implementations, but they cheat by pulling an LSTM module out of the library that already has the loop in it. In general the tutorials are good at helping you build specific things, but they could do a better job of explaining how this beast actually works.

So, TensorFlow experts, is there any way to build this thing? How can I do it?

1 answer

In fact, both the forward and backward passes in essentially all machine learning frameworks assume that your computation graph has no cycles. The usual way to implement a recurrent network is to unroll it in time for a fixed number of steps (say, 50), thereby transforming a network that has cycles into one that has none.
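Here is a minimal NumPy sketch of what unrolling means (illustrative names and sizes, not TensorFlow API calls). The Python loop below plays the role of graph construction: it produces a plain feed-forward chain of step computations, so the resulting computation is acyclic even though the network is recurrent:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One step of a vanilla RNN: new state from current input and previous output."""
    return np.tanh(x_t @ W_x + h_prev @ W_h + b)

def unrolled_rnn(xs, h0, W_x, W_h, b):
    """Unroll the recurrence over a fixed number of steps.

    Each iteration appends one more acyclic "step node" to the chain;
    no cycle ever appears in the computation itself.
    """
    h = h0
    outputs = []
    for x_t in xs:  # e.g. 50 unrolled steps in the tutorial's setting
        h = rnn_step(x_t, h, W_x, W_h, b)
        outputs.append(h)
    return outputs

# Toy dimensions: input size 3, hidden size 4, 5 time steps.
rng = np.random.default_rng(0)
xs = [rng.standard_normal((1, 3)) for _ in range(5)]
W_x = rng.standard_normal((3, 4)) * 0.1
W_h = rng.standard_normal((4, 4)) * 0.1
b = np.zeros(4)
h0 = np.zeros((1, 4))

outs = unrolled_rnn(xs, h0, W_x, W_h, b)
print(len(outs), outs[-1].shape)  # 5 (1, 4)
```

Backpropagation through this unrolled chain is just ordinary backprop through a feed-forward graph, which is why frameworks never need to support cyclic graphs for RNN training.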

For example, in the documentation you refer to:

https://www.tensorflow.org/versions/r0.7/tutorials/recurrent/index.html

they mention:

To make the learning process tractable, it is common practice to truncate the gradients for backpropagation to a fixed number (num_steps) of unrolled steps.

What this effectively means is that they create num_steps LSTM cells, where each one takes the x value for the current time step and the output of the previous LSTM cell as input.

The BasicLSTMCell they use, which you might think contains the loop, doesn't actually have one. An LSTM cell is just an implementation of a single LSTM step (a block with two inputs [input and memory] and two outputs [output and memory], which uses gates to compute the outputs from the inputs), not the whole LSTM network.
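To make that concrete, a single LSTM step can be written as a pure function with no loop anywhere. This is a minimal NumPy sketch of the standard LSTM equations, not the BasicLSTMCell source; the names (lstm_step, W, b) and sizes are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step: inputs are x_t plus the previous output h and memory c;
    outputs are the new h and new c. The "cell" is just this function."""
    z = np.concatenate([x_t, h_prev], axis=-1) @ W + b  # all four gates at once
    i, f, o, g = np.split(z, 4, axis=-1)                # input, forget, output, candidate
    c_new = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)  # update the memory
    h_new = sigmoid(o) * np.tanh(c_new)                    # compute the output
    return h_new, c_new

# Toy usage: input size 3, hidden size 4.
rng = np.random.default_rng(0)
W = rng.standard_normal((3 + 4, 4 * 4)) * 0.1  # weights for all gates stacked
b = np.zeros(4 * 4)
x = rng.standard_normal((1, 3))
h, c = np.zeros((1, 4)), np.zeros((1, 4))
h, c = lstm_step(x, h, c, W, b)
print(h.shape, c.shape)  # (1, 4) (1, 4)
```

The "network" only appears when you call this step num_steps times in a chain, which is exactly the unrolling the tutorial describes.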


Source: https://habr.com/ru/post/1243251/
