I am training an RNN; the first epoch took 7.5 hours. But as training proceeds, TensorFlow gets slower and slower: the second epoch took 55 hours. I profiled the code, and the calls that become slower over time are mostly:
session.run([var1, var2, ...], feed_dict=feed) and tensor.eval(feed_dict=feed).
For example, a single line such as session.run([var1, var2, ...], feed_dict=feed) takes 0.1 seconds when the program starts, but the time spent on that same line keeps increasing as training proceeds; after 10 hours it takes up to 10 seconds.
I have run into this several times. What causes it, and how can I avoid it?
Does this line of code: self.shapes = [numpy.zeros(g[1].get_shape(), numpy.float32) for g in self.compute_gradients] add nodes to the TensorFlow graph? I suspect this may be the reason. This line is called periodically, many times, and self is not a tf.train.Optimizer object.
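For what it's worth, a common cause of this symptom is creating new graph ops inside the training loop, so that every session.run has to work against an ever-growing graph. The snippet below is a toy, pure-Python illustration of that mechanism (the Graph class and op names are made up for the example; it is not real TensorFlow code), contrasting the leaky pattern with building the ops once outside the loop:

```python
# Toy model (NOT real TensorFlow): why creating graph nodes inside the
# training loop makes each step slower and slower.

class Graph:
    """Minimal stand-in for a dataflow graph that only ever grows."""
    def __init__(self):
        self.ops = []

    def add_op(self, name):
        self.ops.append(name)

    def run(self):
        # Pretend a step's cost is proportional to the graph size.
        return len(self.ops)


# Anti-pattern: an op-creating call inside the loop, e.g. something
# like tf.zeros(...) or optimizer.compute_gradients(...) per step.
leaky = Graph()
costs_leaky = []
for step in range(5):
    leaky.add_op("zeros_%d" % step)   # new node every iteration
    costs_leaky.append(leaky.run())

# Fix: build the ops once, then only run them inside the loop.
fixed = Graph()
fixed.add_op("zeros")                 # built once, outside the loop
costs_fixed = [fixed.run() for _ in range(5)]

print(costs_leaky)  # grows every step: [1, 2, 3, 4, 5]
print(costs_fixed)  # stays constant:  [1, 1, 1, 1, 1]
```

In real TensorFlow you can check for this by watching len(tf.get_default_graph().get_operations()) across iterations, or by calling graph.finalize() after building the model so that any later attempt to add a node raises an error.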