After reading this GitHub issue, I feel like I'm missing something in my understanding of queues:
https://github.com/tensorflow/tensorflow/issues/3009
I thought that when data is loaded into the queue, it would be transferred to the GPU ahead of time while the current batch is being computed, so there would be practically no bandwidth bottleneck as long as the computation takes longer than loading the next batch.
But the link above suggests that the graph incurs an expensive copy from the queue (numpy → TF), and that it would be faster to load the files inside the graph and do the preprocessing there instead. That doesn't make sense to me. Why would it matter whether I load a 256x256 image from a file or from a massive numpy array? If anything, I would expect the numpy version to be faster. What am I missing?