Before creating other threads, make sure that you have finished building the graph.
Calling finalize() on the graph can help you with this.
def __init__(self, model_path):
    self.cnn_model = load_model(model_path)
    self.session = K.get_session()
    self.graph = tf.get_default_graph()
    self.graph.finalize()
Update 1: finalize() makes your graph read-only so that it can be safely used across multiple threads. As a side effect, it helps you detect unintended behavior, and sometimes memory leaks, because it throws an exception as soon as something tries to modify the graph.
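To see the read-only behavior in isolation, here is a minimal sketch. It assumes a TensorFlow build that ships `tensorflow.compat.v1` (1.14 or later); the graph and the `labels` placeholder are illustrative names, not part of the code from the question.

```python
import tensorflow.compat.v1 as tf

# Build a tiny graph explicitly so the example does not touch the default graph.
graph = tf.Graph()
with graph.as_default():
    labels = tf.placeholder(tf.int32, shape=[None], name="labels")
    one_hot = tf.one_hot(labels, depth=3)

graph.finalize()  # from here on the graph is read-only

try:
    with graph.as_default():
        tf.one_hot(labels, depth=3)  # tries to add new nodes to the graph
    rejected = False
except RuntimeError as err:
    rejected = True
    print("finalized graph rejected the new op:", err)
```

Any code path that silently adds nodes will fail in the same way, which is what makes finalize() useful as a safety net.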
Imagine that you have a thread that does, for example, one-hot encoding of your inputs (bad example):
def preprocessing(self, data):
    one_hot_data = tf.one_hot(data, depth=self.num_classes)
    return self.session.run(one_hot_data)
If you print the number of nodes in the graph, you will notice that it increases over time:
# number of nodes in the tf graph
print(len(list(tf.get_default_graph().as_graph_def().node)))
But if you define the graph up front, that no longer happens (slightly better code):
def preprocessing(self, data):
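The idea is to create the one-hot op once and only feed data into it at call time. A minimal standalone sketch, assuming `tensorflow.compat.v1` is available; the placeholder `inputs`, the op `one_hot_op`, and the depth of 4 are hypothetical choices for illustration:

```python
import tensorflow.compat.v1 as tf

graph = tf.Graph()
with graph.as_default():
    # Created once, up front (hypothetical names).
    inputs = tf.placeholder(tf.int32, shape=[None], name="inputs")
    one_hot_op = tf.one_hot(inputs, depth=4)
    session = tf.Session(graph=graph)
graph.finalize()  # any later attempt to add nodes would raise

def preprocessing(data):
    # Runs the pre-built op; no new nodes are added here.
    return session.run(one_hot_op, feed_dict={inputs: data})

nodes_before = len(graph.as_graph_def().node)
for _ in range(3):
    result = preprocessing([0, 2])
nodes_after = len(graph.as_graph_def().node)
session.close()

print(nodes_before, nodes_after)
```

Because the graph is finalized before `preprocessing` is ever called, the node count stays the same however often the function runs.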
Update 2: According to this thread, you need to call model._make_predict_function() on the Keras model before starting any threads.
Keras builds the GPU function the first time predict() is called. That way, if you never call predict(), you save some time and resources. However, the first call to predict() is slightly slower than every later call.
Updated code:
def __init__(self, model_path):
    self.cnn_model = load_model(model_path)
    self.cnn_model._make_predict_function()
Update 3: I did a proof of concept of a warm-up, because _make_predict_function() does not seem to work as expected. First I created a dummy model:
import tensorflow as tf
from keras.layers import *
from keras.models import *

model = Sequential()
model.add(Dense(256, input_shape=(2,)))
model.add(Dense(1, activation='softmax'))
model.compile(loss='mean_squared_error', optimizer='adam')
model.save("dummymodel")
Then in another script I loaded this model and ran it on multiple threads:
import tensorflow as tf
from keras import backend as K
from keras.models import load_model
import threading as t
import numpy as np

K.clear_session()

class CNN:
    def __init__(self, model_path):
        self.cnn_model = load_model(model_path)
        self.cnn_model.predict(np.array([[0,0]]))  # warm-up
With the warm-up and finalize lines commented out, I was able to reproduce your first problem.
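For reference, the ordering that matters (warm up first, start threads second) can be sketched as follows. This assumes the `tensorflow.keras` API and uses a small stand-in model with hypothetical layer sizes instead of the saved dummy model. On TensorFlow 2.x the original graph error may not occur at all, so treat this only as an illustration of the pattern.

```python
import threading

import numpy as np
from tensorflow import keras

# Small stand-in model (hypothetical; not the model from the question).
model = keras.Sequential([
    keras.Input(shape=(2,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Warm-up: one prediction in the main thread, so the predict function
# is built before any worker thread touches the model.
model.predict(np.zeros((1, 2)), verbose=0)

results = {}

def worker(idx):
    # Each thread predicts on its own single-row batch.
    batch = np.full((1, 2), float(idx))
    results[idx] = model.predict(batch, verbose=0)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for th in threads:
    th.start()
for th in threads:
    th.join()

print({idx: out.shape for idx, out in sorted(results.items())})
```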