I'll walk through the basics of creating an adversarial image in TF. To apply this to a model you have already trained, you may need some adaptations.
The code blocks work well as cells in a Jupyter notebook if you want to try this interactively. If you are not using a notebook, you will need to add plt.show() calls for the plots to display and remove the %matplotlib inline statement. The code is essentially the simple MNIST tutorial from the TF documentation; I will point out the important differences.
The first block is just setup, nothing special...
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

# Imports used throughout this walkthrough
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data

%matplotlib inline
Get the MNIST data (the download occasionally fails, so you may need to fetch it manually from web.archive.org and put it in this directory). Unlike the tutorial, we do not use one-hot encoding, because TF now has more convenient loss functions that no longer need one-hot labels.
mnist = input_data.read_data_sets('/tmp/tensorflow/mnist/input_data')
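To make the one-hot remark concrete, here is a small side-by-side sketch (my illustration, not part of the original walkthrough; the tensors and names are made up just for this example):

# Illustration only: the "sparse" loss takes integer class labels directly,
# while the older pattern required one-hot encoded labels.
example_logits = tf.constant([[2.0, 0.5, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]])
int_labels = tf.constant([0])                      # plain class index
onehot_labels = tf.one_hot(int_labels, depth=10)   # [[1, 0, ..., 0]]

sparse_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
    logits=example_logits, labels=int_labels)
dense_loss = tf.nn.softmax_cross_entropy_with_logits(
    logits=example_logits, labels=onehot_labels)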
In the next block we do something "special". The input image tensor is defined as a variable, because later we want to optimize with respect to the input image. Normally you would use a placeholder here. This limits us a bit, because we need a fixed shape, so we can only feed one example at a time. Not what you want to do in production, but for tutorial purposes it is fine (and you can get around it with a bit of code). The labels are a placeholder, as usual.
input_images = tf.get_variable("input_image", shape=[1, 784], dtype=tf.float32)
input_labels = tf.placeholder(shape=[1], name='input_label', dtype=tf.int32)
Our model is standard logistic regression, as in the tutorial. The softmax is only used to visualize the results; the loss function takes the raw logits.
W = tf.get_variable("weights", shape=[784, 10], dtype=tf.float32,
                    initializer=tf.random_normal_initializer())
b = tf.get_variable("biases", shape=[1, 10], dtype=tf.float32,
                    initializer=tf.zeros_initializer())
logits = tf.matmul(input_images, W) + b
softmax = tf.nn.softmax(logits)
The loss is standard cross-entropy. The thing to note in the training step is the explicit list of variables we pass in: we defined the input image as a trainable variable, but we do not want to optimize the image while training the logistic regression, only the weights and biases, so we state explicitly which variables to update.
loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
    logits=logits, labels=input_labels, name='xentropy')
mean_loss = tf.reduce_mean(loss)
train_step = tf.train.AdamOptimizer(learning_rate=0.1).minimize(mean_loss, var_list=[W, b])
Start a session ...
sess = tf.Session()
sess.run(tf.global_variables_initializer())
Training is slower than it should be because of the batch size of 1. As I said, not what you want to do in production, but this is just for learning the basics...
for step in range(10000):
    batch_xs, batch_ys = mnist.train.next_batch(1)
    loss_v, _ = sess.run([mean_loss, train_step],
                         feed_dict={input_images: batch_xs, input_labels: batch_ys})
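If you want to check that training actually produced a usable model before moving on, a quick sanity check like the following works (my addition, not part of the original walkthrough; it reuses the batch-size-1 graph, so it evaluates one example at a time):

# Optional sanity check: estimate accuracy on a few hundred test examples.
predicted = tf.argmax(logits, axis=1)
correct = 0
n_eval = 500
for _ in range(n_eval):
    test_x, test_y = mnist.test.next_batch(1)
    pred = sess.run(predicted, feed_dict={input_images: test_x})
    correct += int(pred[0] == test_y[0])
print("accuracy on {} test samples: {:.1%}".format(n_eval, correct / n_eval))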
At this point we should have a model that is good enough to demonstrate how to create an adversarial image. First, we grab an image labeled "2", because 2s are easy, so even our suboptimal classifier should get them right (if it doesn't, run this cell again ;) this step is random, so I can't guarantee it works).
We set our input image variable to this example.
sample_label = -1
while sample_label != 2:
    sample_image, sample_label = mnist.test.next_batch(1)
sample_label
plt.imshow(sample_image.reshape(28, 28), cmap='gray')
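The block above only selects and displays the sample; to actually set the input image variable to this example, as described, one straightforward way (using the session we already have) is:

# Copy the chosen sample into the trainable input image variable, so that
# running the graph without a feed_dict uses this image.
sess.run(input_images.assign(sample_image))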
Now we are going to "break" the classification. We want to change the image so that, in the eyes of the network, it looks more like a different digit, without changing the network itself. The code for this looks essentially identical to what we had before: we define a "fake" label and the same cross-entropy loss as before, and we get an optimizer to minimize the fake loss, but this time with a var_list consisting only of the input image, so we won't change the logistic regression weights:
fake_label = tf.placeholder(tf.int32, shape=[1])
fake_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=fake_label)
adversarial_step = tf.train.GradientDescentOptimizer(learning_rate=1e-3).minimize(
    fake_loss, var_list=[input_images])
The following block is meant to be run interactively several times, so you can watch the image and the scores change (here we push it toward the label 8):
sess.run(adversarial_step, feed_dict={fake_label: np.array([8])})
plt.imshow(sess.run(input_images).reshape(28, 28), cmap='gray')
sess.run(softmax)
The first time you run this block, the scores will probably still point strongly to 2, but that changes over time, and after a couple of runs you will see something like the image below. Note that the image still looks like a 2 with some noise in the background, but the score for "2" is about 3%, while the score for "8" is above 96%.
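If you prefer to read off just the two relevant scores instead of scanning the full softmax vector, a small convenience snippet (my addition, not from the original) is:

# Print the current scores for the true class (2) and the target class (8).
probs = sess.run(softmax)[0]
print("score for 2: {:.1%}, score for 8: {:.1%}".format(probs[2], probs[8]))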
Note that we never computed the gradient explicitly; we don't need to, because the TF optimizer takes care of computing the gradients and applying the updates to the variables. If you want the gradient itself, you can get it with tf.gradients(fake_loss, input_images).
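As a sketch of what that looks like (my illustration, doing by hand what GradientDescentOptimizer does above):

# Compute the gradient of the fake loss with respect to the input image
# and apply one plain gradient-descent step manually.
image_grad = tf.gradients(fake_loss, input_images)[0]      # shape [1, 784]
manual_step = input_images.assign(input_images - 1e-3 * image_grad)

sess.run(manual_step, feed_dict={fake_label: np.array([8])})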

The same approach works for more complex models: train your model as usual, using placeholders with large batches or an input pipeline built on TF readers, and when you want to craft an adversarial image, rebuild the network with an input image variable as its input. As long as the variable names stay the same (which they will if you use the same functions to build the network), you can restore from your network checkpoint and then apply the steps from this post to produce the adversarial image. You may need to play with the learning rate, etc.
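To make that workflow concrete, here is a rough sketch under a few assumptions: build_network() stands in for whatever function you already use to build your model, and /tmp/my_model.ckpt for your checkpoint path; neither appears in this post.

# Rebuild the graph with a trainable image variable in place of the usual
# input placeholder, restore the trained weights, then run the same
# adversarial optimization as above.
tf.reset_default_graph()

input_images = tf.get_variable("input_image", shape=[1, 784], dtype=tf.float32)
fake_label = tf.placeholder(tf.int32, shape=[1])

logits = build_network(input_images)   # hypothetical helper; must create variables with the same names as in training
fake_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=fake_label)
adversarial_step = tf.train.GradientDescentOptimizer(learning_rate=1e-3).minimize(
    fake_loss, var_list=[input_images])

# Restore only the network weights; the new image variable is not in the checkpoint.
model_vars = [v for v in tf.global_variables() if v.name.split(':')[0] != "input_image"]
saver = tf.train.Saver(var_list=model_vars)

sess = tf.Session()
sess.run(tf.global_variables_initializer())
saver.restore(sess, "/tmp/my_model.ckpt")   # hypothetical checkpoint path

From there, the assign-and-step loop from earlier applies unchanged.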