I have a single GPU available for deployment, but I need to serve several models on it. I don't want the first deployed model to allocate the full GPU memory, because then I can't deploy the subsequent models. During training this can be controlled with the gpu_memory_fraction parameter. I use the following command to deploy a model:
tensorflow_model_server --port=9000 --model_name=<name of model> --model_base_path=<path where exported models are stored> &> <log file path>
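
For context, during training I limit the memory roughly like this (a minimal TF 1.x-style sketch; the exact tf.GPUOptions field is per_process_gpu_memory_fraction, and the 0.3 fraction is just an example value):

import tensorflow as tf

# Limit this process to ~30% of the GPU's memory instead of letting it grab all of it.
# (per_process_gpu_memory_fraction is the tf.GPUOptions field; 0.3 is illustrative.)
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.3)
config = tf.ConfigProto(gpu_options=gpu_options)

with tf.Session(config=config) as sess:
    # ... build the graph and run training as usual ...
    pass
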
Is there a flag I can set on tensorflow_model_server to control how much GPU memory it allocates?
Thanks.