I am trying to limit the number of cores that a tf session uses, but it does not work. This is how I initialize the session:
sess = tf.Session(config=tf.ConfigProto(inter_op_parallelism_threads=1,
                                        intra_op_parallelism_threads=1,
                                        use_per_session_threads=True))
The system has 12 cores / 24 threads, and I see that 40-60% of them are used at any given time. The system also has 8 GPUs, but I built the entire graph using tf.device('/cpu:0').
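Roughly, the whole graph construction is wrapped in a single device block (a sketch, not my literal code):

import tensorflow as tf

with tf.device('/cpu:0'):
    # every op of the graph (placeholders, LSTM cell, loss, optimizer)
    # is created inside this block, so nothing should be placed on the GPUs
    pass  # graph construction code goes here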
UPDATE. To clarify, the graph itself is a simple LSTM-RNN, very close to the examples in the tf source code. For completeness, here is the full graph definition:
node_input = tf.placeholder(tf.float32, [n_steps, batch_size, input_size], name='input')
list_input = [tf.reshape(i, (batch_size, input_size)) for i in tf.split(0, n_steps, node_input)]
node_target = tf.placeholder(tf.float32, [n_steps, batch_size, output_size], name='target')
node_target_flattened = tf.reshape(tf.transpose(node_target, perm=[1, 0, 2]), [-1, output_size])
node_max_length = tf.placeholder(tf.int32, name='batch_max_length')
node_cell_initializer = tf.random_uniform_initializer(-0.1, 0.1)
node_cell = LSTMCell(state_size, input_size, initializer=node_cell_initializer)
node_initial_state = node_cell.zero_state(batch_size, tf.float32)
nodes_output, nodes_state = rnn(node_cell, list_input, initial_state=node_initial_state, sequence_length=node_max_length)
node_output_flattened = tf.reshape(tf.concat(1, nodes_output), [-1, state_size])
node_softmax_w = tf.Variable(tf.random_uniform([state_size, output_size]), name='softmax_w')
node_softmax_b = tf.Variable(tf.zeros([output_size]), name='softmax_b')
node_logit = tf.matmul(node_output_flattened, node_softmax_w) + node_softmax_b
node_cross_entropy = tf.nn.softmax_cross_entropy_with_logits(node_logit, node_target_flattened, name='cross_entropy')
node_loss = tf.reduce_mean(node_cross_entropy, name='loss')
node_optimizer = tf.train.AdamOptimizer().minimize(node_loss)
node_op_initializer = tf.initialize_all_variables()
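One training step is then roughly the following (a sketch; batch_input, batch_target and batch_len stand in for my actual data, which is not shown here):

sess.run(node_op_initializer)
_, loss_value = sess.run([node_optimizer, node_loss],
                         feed_dict={node_input: batch_input,
                                    node_target: batch_target,
                                    node_max_length: batch_len})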
It is important to note that if I pass the appropriate parameters the first time I call tf.Session, then the session does run on one core only. The problem is that in subsequent runs I cannot change this behavior, even though I use use_per_session_threads, which is supposed to apply session-specific settings. That is, even after I close the session with sess.close() and start a new one with different parameters, the original behavior remains unchanged unless I restart the Python kernel (which is very expensive, because it takes almost an hour).
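Concretely, the sequence that fails looks like this (a sketch; the thread counts in the second config are only an example of "new parameters"):

import tensorflow as tf

config_one_core = tf.ConfigProto(inter_op_parallelism_threads=1,
                                 intra_op_parallelism_threads=1,
                                 use_per_session_threads=True)
sess = tf.Session(config=config_one_core)  # first session of the process: uses one core, as expected
# ... build and run the graph ...
sess.close()

config_more_cores = tf.ConfigProto(inter_op_parallelism_threads=8,
                                   intra_op_parallelism_threads=8,
                                   use_per_session_threads=True)
sess = tf.Session(config=config_more_cores)  # new session: the new thread settings are ignored
                                             # until the Python kernel is restarted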