I am seeing an extremely large amount of (CPU) RAM usage with TensorFlow, even though every variable is allocated on the GPU device and all computations run there. Even so, RAM usage exceeds VRAM usage by at least a factor of two. I am trying to understand why this is, and whether it can be fixed or is inevitable.
Question
So my main question is: does TensorFlow allocate and maintain a copy of every GPU variable in (CPU) RAM? If so, at what stage is that copy allocated (see below)? And why is it useful to keep it in host memory?
Additional Information
There are three stages at which I see RAM usage increase dramatically (a simplified sketch follows below):
- First, when defining the graph (I add VGG-19 with rather large loss functions through which many activation maps are passed). This adds 2 GB to RAM usage.
- Second, defining the optimizer (I use Adam) adds 250 MB.
- Third, initializing the global variables adds 750 MB.
After that it remains stable and runs very fast (everything is on the GPU). (The numbers mentioned here are for tiny 8x8x3 input images with batch size 1. If I use anything larger than 1x16x16x3, the process gets killed because it overflows my 8 GB of RAM + 6 GB of swap.)
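For context, here is a stripped-down sketch of what those three stages look like; the tiny stand-in graph and the psutil-based RSS measurement are illustrative placeholders, not my actual script:

```python
import os
import psutil
import tensorflow as tf

def rss_mb():
    """Resident set size of this process in MB (host RAM, not VRAM)."""
    return psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2

print('before graph definition: %.0f MB' % rss_mb())

# Stage 1: graph definition (stand-in for the real model + VGG-19-based losses).
with tf.device('/gpu:0'):
    x = tf.placeholder(tf.float32, [1, 8, 8, 3])
    w = tf.Variable(tf.random_normal([3, 3, 3, 64]))
    y = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')
    loss = tf.reduce_mean(tf.square(y))
print('after graph definition: %.0f MB' % rss_mb())

# Stage 2: optimizer definition (Adam adds its slot variables here).
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)
print('after optimizer definition: %.0f MB' % rss_mb())

# Stage 3: global variable initialization inside a session.
config = tf.ConfigProto(log_device_placement=True)
with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    print('after global variable init: %.0f MB' % rss_mb())
```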
Please note that I verified variable placement using tf.ConfigProto(log_device_placement=True) and GPU usage using tf.RunMetadata with visualization in TensorBoard.
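For reference, this is roughly how the placement logging and the runtime trace were hooked up (the stand-in graph and the './logs' directory are illustrative):

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in graph; the real model is much larger.
x = tf.placeholder(tf.float32, [1, 8, 8, 3])
w = tf.Variable(tf.random_normal([3, 3, 3, 64]))
loss = tf.reduce_mean(tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME'))
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)

# Log where every op/variable is placed (printed to stderr at session creation/run).
config = tf.ConfigProto(log_device_placement=True)

with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())

    # Collect a full runtime trace for one step and export it for TensorBoard.
    run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
    run_metadata = tf.RunMetadata()
    sess.run(train_op,
             feed_dict={x: np.zeros([1, 8, 8, 3], np.float32)},
             options=run_options,
             run_metadata=run_metadata)

    writer = tf.summary.FileWriter('./logs', sess.graph)
    writer.add_run_metadata(run_metadata, 'step_0')
    writer.close()
```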
Thanks for any help.
System information:
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
- OS Platform and Distribution: Linux Ubuntu 17.10
- TensorFlow installed from (source or binary):
- TensorFlow version: 1.7
- Python version: 3.6.3
- GCC/Compiler version: 6.4.0
- CUDA/cuDNN version: 9.0
- GPU model: NVidia GeForce Titan Xp