I am trying to use the grad_loss parameter in optimizer.minimize(loss, grad_loss=...) to modify the gradients of a network using a set of existing gradients. I followed the comments here: Using grads_ys parameter in tf.gradients - TensorFlow
and I would like to run a toy example in which I recreate the default value of 1 for grad_ys, as described in the documentation.
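For reference, here is my understanding of that default with plain tf.gradients, as a minimal TF 1.x-style sketch (the toy variables x and y are my own); if I read the linked answer correctly, the two calls below should compute the same gradients:

import tensorflow as tf

x = tf.Variable([1.0, 2.0])
y = tf.reduce_sum(tf.square(x))  # scalar "loss"

# Default behaviour: grad_ys=None is treated as a ones tensor shaped like y.
g_default = tf.gradients(y, [x])
# Explicitly recreating that default with an all-ones tensor.
g_explicit = tf.gradients(y, [x], grad_ys=[tf.ones_like(y)])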
Here is the corresponding code segment:
# Compute the per-variable gradients of the loss.
grads_and_vars = optimizer.compute_gradients(loss_op)
vars_with_grad = [v for g, v in grads_and_vars if g is not None]

# Build a list of all-ones tensors, one per gradient.
grad_loss = []
for grad, var in grads_and_vars:
    grad_loss.append(tf.ones_like(grad))

# Pass the ones as grad_loss -- this is the line that fails.
train_op = optimizer.minimize(loss_op, grad_loss=grad_loss)
The first part extracts the gradients using compute_gradients . The last line minimizes the loss_op loss, but tries to use tensors filled with 1s as the gradients. As far as I understand, this should behave the same as running minimize without the grad_loss parameter.
Unfortunately, this fails, since grad_loss is expected to be a tensor (with a dtype), not a list. Looking at gradients_impl.py , I see that grad_loss is expected to have the same dimensions as loss (which in this case is a scalar).
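If that reading is right, then (untested guess on my part) merely reproducing the default would mean passing a single ones tensor shaped like the scalar loss, rather than a per-variable list:

# Untested guess: a single tensor with the shape/dtype of the scalar loss,
# not one tensor per variable.
train_op = optimizer.minimize(loss_op, grad_loss=tf.ones_like(loss_op))

Even if that works, though, it does not tell me how to build grad_loss out of the per-variable gradients, which is what I am actually after.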
I would appreciate any help with this simple example: how can I add terms to the gradients this way?
EDIT: I think the question boils down to the definition of grad_loss : "A Tensor holding the gradient computed for loss ." How can I create such a tensor from the set of gradients obtained with compute_gradients ?
Thanks.