I am trying to run a very simple gradient descent on the function y = x ** 2, using the following code:
import theano
from theano import tensor as T
x = theano.shared(2)
y = x ** 2
dy_dx = T.grad(y, x)
learning_rate = 1
updates = [(x, x - learning_rate * dy_dx)]
fn = theano.function([], [y], updates=updates)
But when I try to compile the "fn" function, I get the following error:
TypeError: ('An update must have the same type as the original shared
variable (shared_var=<TensorType(int64, scalar)>,
shared_var.type=TensorType(int64, scalar),
update_val=Elemwise{sub,no_inplace}.0,
update_val.type=TensorType(float64, scalar)).', 'If the difference is
related to the broadcast pattern, you can call the
tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove
broadcastable dimensions.')
I thought the learning_rate variable might be the problem, since it may not have the same type as the shared variable x, but if I change the code as follows:
updates = [(x, x - dy_dx)]
I still get the same error.
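Reading the error again, the mismatch is between int64 (the shared variable) and float64 (the update value). The following is only a rough plain-Python analogy, not Theano's actual type machinery. It assumes the gradient is float-valued, as the float64 in the message suggests, and the helper names `grad` and `step` are made up for illustration. It shows why dropping learning_rate alone would not change the type of the update when x starts out as an integer:

```python
def grad(x):
    # Gradient of y = x ** 2. Returned as a float to mirror the float64
    # reported in the error message (an assumption, not Theano behaviour).
    return 2.0 * x


def step(x, dy_dx, learning_rate=None):
    # One gradient-descent update. learning_rate=None mirrors the
    # `x - dy_dx` variant that leaves the learning rate out entirely.
    if learning_rate is None:
        return x - dy_dx
    return x - learning_rate * dy_dx


x_int = 2      # integer start, like theano.shared(2)
x_float = 2.0  # float start, for comparison

# Integer x with a float gradient: the update is a float either way.
print(type(step(x_int, grad(x_int), 1)).__name__)  # float
print(type(step(x_int, grad(x_int))).__name__)     # float

# Float x: the update has the same type as x.
print(type(step(x_float, grad(x_float), 1)).__name__)  # float
```

If the analogy carries over, the type of x itself (integer versus float) is what would need to change, for example by creating the shared variable from 2.0 instead of 2. I have not confirmed that in Theano here.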
I'm stuck :( Any ideas?