TensorFlow profiling using tfprof

I am trying to profile TensorFlow compute and memory usage and found that tfprof is the right tool for my purpose. However, I could not get the FLOPs of all the operators.

Here is what I did, following the tfprof tutorial, using the cifar10 tutorial from the TensorFlow repository (tensorflow/models/image/cifar10/cifar10_train.py):

from tensorflow.tools.tfprof import tfprof_log_pb2

run_metadata = tf.RunMetadata()
_, loss_value = sess.run(
    [train_op, loss],
    options=tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE),
    run_metadata=run_metadata)
op_log = tfprof_log_pb2.OpLog()  # TODO: add op information
tf.contrib.tfprof.tfprof_logger.write_op_log(
    tf.get_default_graph(), log_dir="/tmp/log_dir",
    op_log=op_log, run_meta=run_metadata)
tf.contrib.tfprof.model_analyzer.print_model_analysis(
    tf.get_default_graph(), run_metadata=run_metadata, op_log=op_log,
    tfprof_options=tf.contrib.tfprof.model_analyzer.FLOAT_OPS_OPTIONS)

And the result:

Parsing GraphDef...
Parsing RunMetadata...
Parsing OpLog...
Preparing Views...

=========================Options=============================
-max_depth                  10000
-min_bytes                  0
-min_micros                 0
-min_params                 0
-min_float_ops              1
-device_regexes             .*
-order_by                   float_ops
-account_type_regexes       .*
-start_name_regexes         .*
-trim_name_regexes
-show_name_regexes          .*
-hide_name_regexes
-account_displayed_op_only  true
-select                     float_ops
-viz                        false
-dump_to_file

==================Model Analysis Report======================
_TFProfRoot (0/5.23b flops)
  conv2/Conv2D (3.77b/3.77b flops)
  conv1/Conv2D (707.79m/707.79m flops)
  gradients/local3/MatMul_grad/MatMul (226.49m/226.49m flops)
  gradients/local3/MatMul_grad/MatMul_1 (226.49m/226.49m flops)
  local3/MatMul (226.49m/226.49m flops)
  gradients/local4/MatMul_grad/MatMul (18.87m/18.87m flops)
  gradients/local4/MatMul_grad/MatMul_1 (18.87m/18.87m flops)
  local4/MatMul (18.87m/18.87m flops)
  conv1/BiasAdd (4.72m/4.72m flops)
  conv2/BiasAdd (1.18m/1.18m flops)
  gradients/softmax_linear/MatMul_grad/MatMul (491.52k/491.52k flops)
  gradients/softmax_linear/MatMul_grad/MatMul_1 (491.52k/491.52k flops)
  softmax_linear/MatMul (491.52k/491.52k flops)
======================End of Report==========================
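As a sanity check, the reported counts can be reproduced by hand: a 2-D convolution costs 2 * out_h * out_w * k_h * k_w * c_in * c_out * batch flops (one multiply and one add per multiply-accumulate), and a dense layer costs 2 * batch * dim_in * dim_out. Plugging in the cifar10 tutorial's default shapes (batch 128, 24x24 distorted images, 5x5 kernels) reproduces the conv1, conv2, and local3 lines above. This is a sketch of the counting convention, not tfprof's actual code:

```python
def conv2d_flops(out_h, out_w, k_h, k_w, c_in, c_out, batch):
    # 2 flops (one multiply + one add) per multiply-accumulate
    return 2 * out_h * out_w * k_h * k_w * c_in * c_out * batch

def matmul_flops(batch, dim_in, dim_out):
    return 2 * batch * dim_in * dim_out

# conv1: 5x5x3 -> 64 filters, SAME padding keeps the 24x24 spatial size
print(conv2d_flops(24, 24, 5, 5, 3, 64, 128))   # 707788800 ~ 707.79m
# conv2: after 2x2 max-pooling -> 12x12, 5x5x64 -> 64 filters
print(conv2d_flops(12, 12, 5, 5, 64, 64, 128))  # 3774873600 ~ 3.77b
# local3: flattened 6*6*64 = 2304 inputs -> 384 units
print(matmul_flops(128, 2304, 384))             # 226492416 ~ 226.49m
```

The same formula also matches local4 (384 -> 192 units, 18.87m) and softmax_linear (192 -> 10, 491.52k).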

However, the result does not contain all ops, such as max pooling, ReLU, and the gradients of the conv layers. Perhaps the flop statistics for these ops are undefined (no RegisterStatistics("flops") entry). Therefore, to provide the runtime information, as described in the tfprof tutorial, I tried to create an OpLog (see the code above).

However, I'm not sure how to add the op information (how can I get the entry name for an op?). Is there a way to add ALL the ops it is missing?

Or is there another tool besides tfprof? Perhaps a profiling tool from NVIDIA?

1 answer

You are right that the other operators have no flops reported because they have no RegisterStatistics("flops") defined. You are welcome to contribute one.

I'm not sure if NVIDIA has tools for this.


Source: https://habr.com/ru/post/1264369/
