How to freeze / lock weights of one TensorFlow variable (for example, one CNN kernel of one layer)

I have a TensorFlow CNN model that works well, and we would like to implement this model in hardware, i.e. on an FPGA. It is a relatively small network, but it would be ideal if it were smaller. To that end, I examined the kernels and found that some carry quite strong weights while others do almost nothing (the kernel values are close to zero). This happens specifically in layer 2, corresponding to the tf.Variable() named "W_conv2". W_conv2 has shape [3, 3, 32, 32]. I would like to freeze / lock the values of W_conv2[:, :, 29, 13] at zero so that the rest of the network can be trained to compensate. Setting the values of this kernel to zero effectively removes the kernel from the hardware implementation, thereby achieving the goal stated above.

I found similar questions, and the suggested answers usually revolve around one of two approaches:

Suggestion 1:

  tf.Variable(some_initial_value, trainable=False)

Implementing this suggestion freezes the entire variable, whereas I want to freeze only a slice, specifically W_conv2[:, :, 29, 13].

Suggestion 2:

  optimizer = tf.train.RMSPropOptimizer(0.001).minimize(loss, var_list=var_list)

Again, this suggestion does not allow the use of slices. For example, if I try the inverse of my stated goal (optimizing only one kernel of one variable), as follows:

  optimizer = tf.train.RMSPropOptimizer(0.001).minimize(loss, var_list=[W_conv2[:, :, 0, 0]])

I get the following error:

  NotImplementedError: ('Trying to optimize unsupported type ', <tf.Tensor 'strided_slice_2228:0' shape=(3, 3) dtype=float32>) 

So slicing a tf.Variable() is not possible in the way I tried here. The only thing I have tried that comes close to what I want is to use .assign(), but it is extremely inefficient and cumbersome, and feels like a caveman approach, since I implemented it as follows (after training the model):

  for _ in range(10000):
      # get a new batch of data
      # reset the values of W_conv2[:,:,29,13] = 0 each time through
      for m in range(3):
          for n in range(3):
              assign_op = W_conv2[m, n, 29, 13].assign(0)
              sess.run(assign_op)
      # re-train the rest of the network
      _, loss_val = sess.run([optimizer, loss], feed_dict={ dict_stuff_here })
      print(loss_val)

The model was prototyped in Keras and then moved to TensorFlow, since Keras did not offer a mechanism to achieve the desired result. I'm starting to think that TensorFlow simply doesn't allow this kind of slicing, but that is hard to believe; it just needs the right implementation.


Source: https://habr.com/ru/post/1015351/
