Update: I misunderstood the question. This is a new answer.
For this purpose, you need to update only the connections between the hidden layer and the second output block, while keeping the connections between the hidden layer and the first output block unchanged.
The first approach is to introduce two sets of variables: one for the connections between the hidden layer and the first output block, and one for the rest. Then you can combine the two outputs with tf.stack and pass var_list to minimize() so that derivatives are computed and applied only for the variables you want to update. Like this (just to illustrate, not verified; use with caution):
out1 = tf.matmul(hidden, W_h_to_out1) + b_h_to_out1
out2 = tf.matmul(hidden, W_h_to_out2) + b_h_to_out2
out = tf.stack([out1, out2])
out = tf.transpose(tf.reshape(out, [2, -1]))
loss = some_function_of(out)
optimizer = tf.train.GradientDescentOptimizer(0.1)
# Only W_h_to_out2 and b_h_to_out2 are updated; the first block's weights stay fixed.
train_op_second_unit = optimizer.minimize(loss, var_list=[W_h_to_out2, b_h_to_out2])
Another approach is to use a mask. It is easier to implement and more flexible when you work with some frameworks (for example slim, Keras, etc.), and I recommend this method. The idea is to mask the first output block out of the loss while leaving the second output block untouched. This can be done with a binary variable: multiply an entry by 1 if you want to keep it, and by 0 to drop it. Here is the code:
import tensorflow as tf
import numpy as np
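# --- A minimal sketch of the mask idea; the layer sizes, variable names and
# --- squared-error loss below are illustrative assumptions, not verified code.
x = tf.placeholder(tf.float32, [None, 4])
y = tf.placeholder(tf.float32, [None, 2])
mask = tf.placeholder(tf.float32, [2])        # feed [0., 1.] to train only the second block

W_in = tf.Variable(tf.random_normal([4, 8]))
b_in = tf.Variable(tf.zeros([8]))
hidden = tf.nn.relu(tf.matmul(x, W_in) + b_in)

W_out = tf.Variable(tf.random_normal([8, 2]))
b_out = tf.Variable(tf.zeros([2]))
out = tf.matmul(hidden, W_out) + b_out        # both output blocks

# The block multiplied by 0 contributes nothing to the loss, so the column of
# W_out (and the entry of b_out) feeding it gets zero gradient and is left
# unchanged by plain SGD. (The hidden-layer weights still receive gradients
# through the unmasked block.)
loss = tf.reduce_mean(mask * tf.square(out - y))
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    batch_x = np.random.rand(16, 4).astype(np.float32)
    batch_y = np.random.rand(16, 2).astype(np.float32)
    sess.run(train_op, {x: batch_x, y: batch_y, mask: [0., 1.]})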
======================== Below is the old answer ========================
To take derivatives with respect to different variables, you can pass var_list to minimize() to decide which variables to update. Here is an example:
import tensorflow as tf
import numpy as np
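# --- Illustrative sketch: the variables and loss are arbitrary, chosen only
# --- to show the effect of var_list.
a = tf.Variable(1.0, name='a')
b = tf.Variable(2.0, name='b')
loss = tf.square(a + b - 5.0)

optimizer = tf.train.GradientDescentOptimizer(0.1)
train_a_only = optimizer.minimize(loss, var_list=[a])   # b is never updated
train_b_only = optimizer.minimize(loss, var_list=[b])   # a is never updated

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_a_only)
    print(sess.run([a, b]))   # only a has changed; b is still 2.0
    sess.run(train_b_only)
    print(sess.run([a, b]))   # now only b has changed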