TensorFlow: gathering matrix columns is very slow

For two matrices A (1000 x 100) and B (100 x 1000), instead of computing their product directly in TensorFlow, i.e. tf.dot(A,B), I want to first select 10 columns (at random) from A and 10 rows from B, and then use tf.dot(A_s,B_s).

Naturally, the second multiplication should be much faster, since the number of required multiplications drops by a factor of 10.
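To make that factor concrete, here is a small count of scalar multiplications for both products. It assumes the textbook algorithm, where an (m x k) by (k x n) product costs m*k*n multiplications; the helper name `matmul_mults` is only illustrative.

```python
def matmul_mults(m, k, n):
    """Scalar multiplications in a naive (m x k) @ (k x n) product."""
    return m * k * n

# Full product: A (1000 x 100) times B (100 x 1000).
full = matmul_mults(1000, 100, 1000)

# Sampled product: A_s (1000 x 10) times B_s (10 x 1000).
sampled = matmul_mults(1000, 10, 1000)

ratio = full // sampled
print(full, sampled, ratio)
```

This counts only the multiplication itself; the cost of building A_s and B_s is not included.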

However, in practice, selecting the given columns of matrix A in TensorFlow to create A_s turns out to be an extremely inefficient process.

Given the indices of the required columns in idx, I tried the following solutions to create A_s. The solutions are ranked by performance:

  1. A_s = tf.transpose(tf.gather(tf.unstack(A, axis=1), idx)):

tf.dot(A_s,B_s) is 5 times slower than tf.dot(A,B), because creating A_s is too expensive.

  2.


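     import tensorflow as tf
     from keras import backend as K  # assumed: K is the Keras backend

     # params is the matrix A; indices holds the column indices (idx).
     p_shape = K.shape(params)
     p_flat = K.reshape(params, [-1])
     # Flat positions of the selected entries: row * n_cols + column.
     i_flat = K.reshape(K.reshape(
        K.arange(0, p_shape[0]) * p_shape[1], [-1, 1]) + indices, [-1])
     indices = [i_flat]
     v = K.transpose(indices)
     updates = i_flat * 0 - 1
     shape = tf.to_int32([p_shape[0] * p_shape[1]])  # TF 1.x API
     # scatter is 0 at the selected positions and 1 everywhere else.
     scatter = tf.scatter_nd(v, updates, shape) + 1
     # Partition 0 keeps exactly the selected entries, in flat order.
     out_temp = tf.dynamic_partition(p_flat,
                     partitions=scatter, num_partitions=2)[0]
     A_s = tf.reshape(out_temp, [p_shape[0], -1])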

6-7 times slower.

  3.


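      # params is the matrix A; indices holds the column indices.
      p_shape = K.shape(params)
      # indexing='ij' yields (row, column) pairs in row-major order,
      # which is what the final reshape to [rows, -1] expects.
      X,Y =  tf.meshgrid((tf.range(0, p_shape[0])),indices, indexing='ij')
      idx = K.concatenate([K.expand_dims(
           K.reshape((X),[-1]),1), 
           K.expand_dims(K.reshape((Y),[-1]),1)],axis=1)
      A_s = tf.reshape(tf.gather_nd(params, idx), [p_shape[0], -1])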

10-12 times slower.
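For anyone who wants to reproduce the setup, here is a minimal sketch, assuming TensorFlow 2.x in eager mode. The random matrix contents and the seed are placeholders, and I assume the same 10 indices select the columns of A and the rows of B. It uses tf.matmul for the product. The selection via the axis argument of tf.gather is not one of the three attempts above; it is only there to check that attempt 1 picks the right columns, with no claim about its speed.

```python
import numpy as np
import tensorflow as tf

# Shapes from the question; contents and seed are placeholders.
rng = np.random.default_rng(0)
A = tf.constant(rng.standard_normal((1000, 100)).astype(np.float32))
B = tf.constant(rng.standard_normal((100, 1000)).astype(np.float32))

# Assumption: the same 10 indices pick columns of A and rows of B.
idx = tf.constant(rng.choice(100, size=10, replace=False).astype(np.int32))

# Attempt 1 from the list: unstack the columns, gather, transpose back.
A_s_unstack = tf.transpose(tf.gather(tf.unstack(A, axis=1), idx))

# The same selection written with tf.gather's axis argument.
A_s = tf.gather(A, idx, axis=1)  # shape (1000, 10)
B_s = tf.gather(B, idx, axis=0)  # shape (10, 1000)

# Both ways of building A_s should agree element for element.
same = bool(tf.reduce_all(tf.equal(A_s, A_s_unstack)))

product = tf.matmul(A_s, B_s)    # shape (1000, 1000)
print(same, A_s.shape, B_s.shape, product.shape)
```

To compare speed, each selection would need to be timed separately from the multiplication, ideally after a warm-up run.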



Source: https://habr.com/ru/post/1674555/

