How to reduce a fully connected (`InnerProduct`) layer using truncated SVD

Question

In Girshick, R., Fast R-CNN (ICCV 2015), section “3.1 Truncated SVD for Faster Detection,” the author proposes using SVD to reduce the size and computation time of a fully connected layer.

Given a trained model (deploy.prototxt and weights.caffemodel), how can I use this trick to replace a fully connected layer with a truncated one?

+6

deep-learning machine-learning neural-network caffe linear-algebra

Shai Nov 08 '16 at 7:02

2 answers

Actually, Ross Girshick's py-fast-rcnn repo includes an implementation of the SVD step: compress_net.py.

By the way, you usually need to fine-tune the compressed model to restore accuracy (or compress it in a more sophisticated way; see, for example, “Accelerating Very Deep Convolutional Networks for Classification and Detection,” Zhang et al.).

Also, for me, scipy.linalg.svd was faster than np.linalg.svd.

+2

rkellerm Aug 24 '17 at 9:13

Shai · Accepted Answer · 2016-11-08T08:02:19+0000

Some linear algebra background
Singular value decomposition (SVD) factors any matrix W into three matrices:

 W = USV*

where U and V are unitary matrices (orthogonal when W is real), V* is the conjugate transpose of V, and S is diagonal with non-negative entries in non-increasing order along the diagonal. One useful property of SVD is that it makes it easy to approximate W with a matrix of lower rank: if you truncate S to keep only its k leading elements (instead of all elements on the diagonal), then

 W_app = U S_trunc V*

is a rank-k approximation of W (in fact the best one in both the spectral and Frobenius norms, by the Eckart-Young theorem).
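To make the truncation concrete, here is a minimal NumPy sketch. The matrix size, the random data, and the helper name truncated_svd are assumptions for illustration only; they do not come from any real model.

```python
import numpy as np

def truncated_svd(W, k):
    """Return the rank-k approximation of W and all of its singular values."""
    # full_matrices=False gives the compact SVD: W = U @ diag(s) @ Vh
    U, s, Vh = np.linalg.svd(W, full_matrices=False)
    # keep the k leading singular triplets; U[:, :k] * s[:k] scales the columns of U
    W_app = (U[:, :k] * s[:k]) @ Vh[:k, :]
    return W_app, s

rng = np.random.default_rng(0)
W = rng.standard_normal((50, 30))  # hypothetical weight matrix
k = 5
W_app, s = truncated_svd(W, k)

rank = np.linalg.matrix_rank(W_app)
# in the spectral norm, the error equals the first discarded singular value
spectral_error = np.linalg.norm(W - W_app, 2)
```

Comparing spectral_error with s[k] is a quick way to see how much is lost for a given k.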

Using SVD to approximate a fully connected layer
Suppose we have a model deploy_full.prototxt with a fully connected layer

    # ... some layers here
    layer {
      name: "fc_orig"
      type: "InnerProduct"
      bottom: "in"
      top: "out"
      inner_product_param {
        num_output: 1000
        # more params...
      }
      # some more...
    }
    # more layers...

In addition, we have trained_weights_full.caffemodel - trained parameters for the deploy_full.prototxt model.

Copy deploy_full.prototxt to deploy_svd.prototxt and open it in the editor of your choice. Replace the fully connected layer with these two layers:

    layer {
      name: "fc_svd_U"
      type: "InnerProduct"
      bottom: "in"  # same input
      top: "svd_interim"
      inner_product_param {
        num_output: 20  # approximate with k = 20 rank matrix
        bias_term: false
        # more params...
      }
      # some more...
    }
    # NO activation layer here!
    layer {
      name: "fc_svd_V"
      type: "InnerProduct"
      bottom: "svd_interim"
      top: "out"  # same output
      inner_product_param {
        num_output: 1000  # original number of outputs
        # more params...
      }
      # some more...
    }
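Why no activation between the two layers? Two stacked linear layers with nothing in between collapse into a single linear layer whose weight matrix is the product of the two, which is exactly what a low-rank factorization needs. The small NumPy sketch below uses made-up matrices (A, B, b, and x are purely illustrative) to show that inserting a ReLU breaks this equivalence.

```python
import numpy as np

# hypothetical tiny layers: 3 inputs -> 2 interim units -> 3 outputs
A = np.array([[1.0, -2.0, 0.0],
              [0.0, 1.0, 1.0]])   # first layer weights, no bias
B = np.array([[1.0, 1.0],
              [2.0, -1.0],
              [0.0, 3.0]])        # second layer weights
b = np.array([0.5, 0.0, -1.0])    # second layer bias
x = np.array([1.0, 1.0, 0.0])     # one input vector

two_layers = B @ (A @ x) + b               # stacked pair, no activation
one_layer = (B @ A) @ x + b                # single layer with weights B @ A
with_relu = B @ np.maximum(A @ x, 0) + b   # ReLU on the interim blob

linear_match = np.allclose(two_layers, one_layer)
relu_match = np.allclose(with_relu, one_layer)
```

Here linear_match is true and relu_match is false, because A @ x has a negative entry that the ReLU zeroes out.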

Now for a bit of net surgery in Python:

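    import caffe
    import numpy as np

    orig_net = caffe.Net('deploy_full.prototxt', 'trained_weights_full.caffemodel', caffe.TEST)
    # weights are copied by layer name, so the two new layers keep their
    # default initialization until they are overwritten below
    svd_net = caffe.Net('deploy_svd.prototxt', 'trained_weights_full.caffemodel', caffe.TEST)

    # get the original weight matrix; Caffe stores it as (num_output, input_dim)
    W = np.array(orig_net.params['fc_orig'][0].data)

    # SVD decomposition: W = U @ diag(s) @ Vh
    # full_matrices=False avoids building a huge (input_dim x input_dim) matrix
    k = 20  # same as num_output of fc_svd_U
    U, s, Vh = np.linalg.svd(W, full_matrices=False)

    # assign weights to the svd net, keeping only the k leading singular values.
    # The first layer maps input_dim -> k, so it gets the k leading rows of Vh;
    # the second maps k -> num_output, so it gets the k leading columns of U
    # scaled by s. (Despite its name, fc_svd_U therefore holds the Vh factor.)
    svd_net.params['fc_svd_U'][0].data[...] = Vh[:k, :]         # (k, input_dim)
    svd_net.params['fc_svd_V'][0].data[...] = U[:, :k] * s[:k]  # (num_output, k)
    svd_net.params['fc_svd_V'][1].data[...] = orig_net.params['fc_orig'][1].data  # same bias

    # save the new weights
    svd_net.save('trained_weights_svd.caffemodel')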
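The shapes involved in the surgery can be sanity-checked without Caffe. The sketch below assumes Caffe's InnerProduct weight layout of (num_output, input_dim) and emulates the forward pass as x @ W.T + b. The sizes, the random nearly low-rank weights, and the helper name split_fc are all hypothetical.

```python
import numpy as np

def split_fc(W, k):
    """Split a (n_out, n_in) weight matrix into weights for two stacked layers."""
    U, s, Vh = np.linalg.svd(W, full_matrices=False)
    first = Vh[:k, :]           # (k, n_in): maps the input to the interim blob
    second = U[:, :k] * s[:k]   # (n_out, k): maps the interim blob to the output
    return first, second

rng = np.random.default_rng(0)
n_in, n_out, k = 64, 100, 20
# hypothetical weights: a rank-k signal plus small noise
W = rng.standard_normal((n_out, k)) @ rng.standard_normal((k, n_in))
W += 1e-3 * rng.standard_normal((n_out, n_in))
b = rng.standard_normal(n_out)
x = rng.standard_normal((5, n_in))  # batch of 5 inputs

first, second = split_fc(W, k)
y_full = x @ W.T + b                    # original layer
y_svd = (x @ first.T) @ second.T + b    # two-layer replacement, bias on the second
rel_err = np.linalg.norm(y_full - y_svd) / np.linalg.norm(y_full)
```

With real trained weights the relative error depends on how quickly the singular values decay, so it is worth measuring it for a few values of k before committing to one.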

Now we have deploy_svd.prototxt with trained_weights_svd.caffemodel, which together approximate the original network with far fewer multiplications and weights.
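To get a feel for the savings, the weight counts (and, per input, the multiply counts) of the two variants can be compared directly. In the sketch below the input size of 4096 is an assumption, since the question does not state it; the 1000 outputs and k = 20 come from the example above. Biases are ignored.

```python
def fc_cost(n_in, n_out):
    """Weights (and multiplies per input) of one fully connected layer."""
    return n_in * n_out

def svd_fc_cost(n_in, n_out, k):
    """Weights of the two-layer replacement: n_in -> k, then k -> n_out."""
    return k * n_in + n_out * k

n_in, n_out, k = 4096, 1000, 20  # n_in is a hypothetical input size

full = fc_cost(n_in, n_out)               # 4096 * 1000
compressed = svd_fc_cost(n_in, n_out, k)  # 20 * (4096 + 1000)
ratio = full / compressed
```

Under these assumed sizes the replacement needs roughly 40 times fewer weights. The saving shrinks as k grows and disappears once k reaches n_in * n_out / (n_in + n_out).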
