I have a nasty optimization problem in TensorFlow that calls for a non-linear optimizer: TensorFlow's built-in first-order optimizers (Gradient Descent, AdaGrad, Adam) seem to do much worse than using scipy as an external optimizer (CG, BFGS) on the same graph.
That would be fine, except that in production I need to use minibatches of my training dataset for the optimization. I implemented this so that every time the loss/gradient function is called, a new minibatch is drawn to compute it. (I am using a modified version of https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/opt/python/training/external_optimizer.py.) In practice, this means that the loss is a noisy function of the input parameters.
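For context, the stock contrib interface (TF 1.x) is used roughly like this; the toy graph below is just for illustration, not my model, and my modified version feeds a fresh minibatch on every loss/gradient evaluation:

import tensorflow as tf

# Toy stand-in for the model: minimize ||x - 1||^2.
x = tf.Variable(tf.zeros(10))
loss = tf.reduce_sum(tf.square(x - 1.0))

# Extra keyword arguments are forwarded to scipy.optimize.minimize.
optimizer = tf.contrib.opt.ScipyOptimizerInterface(
    loss, method='CG', options={'maxiter': 1000})

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    optimizer.minimize(sess)  # one full call into scipy.optimize.minimize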
Scipy seems to have a problem with this, terminating any call to scipy.minimize after just a few iterations, for example:
Warning: Desired error not necessarily achieved due to precision loss.
Current function value: 71.329124
Iterations: 2
Function evaluations: 28
Gradient evaluations: 16
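To show this isn't specific to my graph, here is a toy reproduction (the quadratic and the noise level are made up for illustration): a loss whose value and gradient change between calls, like a resampled minibatch, should trip the same early exit.

import numpy as np
import scipy.optimize

rng = np.random.default_rng(0)

def noisy_quadratic(x):
    # Stand-in for a minibatch loss: the true objective is ||x||^2, but
    # every evaluation sees fresh noise, so two calls at the same x
    # return different (loss, gradient) pairs.
    noise = rng.normal(scale=0.1, size=x.shape)
    return np.sum((x + noise) ** 2), 2.0 * (x + noise)

res = scipy.optimize.minimize(noisy_quadratic, np.ones(10),
                              jac=True, method='BFGS')
print(res.message, res.nit)  # typically exits after only a few iterations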
In contrast, if I run the optimization on the full dataset (which is feasible now, but won't be later), it converges to ~0.1 in a single call to scipy.minimize (running roughly 1000 iterations before exiting).
Has anyone encountered this issue? Is there a fix (an easy one preferred, but hacky is OK too) to stop scipy from bailing out of these optimizations early? Something like a min_iter keyword would be perfect, but as far as I know nothing of the sort is implemented.
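To make the min_iter idea concrete: the hacky fix I imagine is a restart loop that keeps re-launching minimize from its last iterate until an iteration floor is reached. A rough sketch (minimize_with_restarts and min_total_iter are my own invention, not anything scipy provides):

import scipy.optimize

def minimize_with_restarts(fun, x0, min_total_iter=1000, **kwargs):
    # Hypothetical workaround: scipy has no minimum-iteration option,
    # so restart from the last iterate until we hit the floor.
    x, total, res = x0, 0, None
    while total < min_total_iter:
        res = scipy.optimize.minimize(fun, x, **kwargs)
        x = res.x
        total += max(res.nit, 1)  # count at least 1 so the loop always advances
    return res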
Hope that made sense. Thanks!
EDIT: I was asked to post the code, but the full program is several hundred lines long, so I will give a short example:
...
import numpy as np

def minibatch_loss_function(model, inputs, outputs, batch_size=10):
    # Draw a fresh random minibatch on every call, so scipy sees a
    # (slightly different) noisy loss/gradient at each evaluation.
    minibatch_mask = np.random.choice(len(inputs), batch_size, replace=False)
    minib_inputs = inputs[minibatch_mask]
    minib_outputs = outputs[minibatch_mask]
    return (loss(model, minib_inputs, minib_outputs),
            gradients(model, minib_inputs, minib_outputs))
...
import scipy.optimize

training_input, training_output = training_data(n_examples)
# args must be a tuple; jac=True tells minimize the function returns (loss, gradient).
scipy.optimize.minimize(minibatch_loss_function, initial_params,  # placeholder x0
                        jac=True, args=(training_input, training_output))