Understanding TensorFlow profiling results

This example shows how to profile TensorFlow programs. I used this tool to profile my program, a simple LSTM, and got the following results:

/gpu:0/stream:all Compute(pid 5)

MatMul_AllCompute

/job:localhost/replica:0/task:0/gpu:0 Compute(pid 3)

MatMul_GpuCompute

My questions are:

a) What is the meaning of each line?

b) In particular, what is the difference between /gpu:0/stream:all Compute(pid 5) and /job:localhost/replica:0/task:0/gpu:0 Compute(pid 3)?

c) Why are their running times different, namely 0.072 ms and 0.094 ms?

1 answer

Here's an update from one of the engineers:

The '/gpu:0/stream:*' timelines are a hardware trace of CUDA kernel execution times.

The '/gpu:0' timelines are the TF software trace of ops being enqueued on the CUDA stream.
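To compare the two rows op by op, one option is to load the trace file and sum the durations per row. Below is a minimal sketch, assuming the trace is in the Chrome trace-event JSON format, where "M" events name each pid and "X" events carry a duration in microseconds. `SAMPLE_TRACE` is a hand-written stand-in, not real profiler output; it reuses the two timings from the question, and which timing belongs to which row is my assumption. `durations_by_row` is a hypothetical helper name.

```python
import json
from collections import defaultdict

# Hand-written sample in Chrome trace-event format (NOT real profiler output).
# Assumption: 94 us is the /job:.../gpu:0 row and 72 us is the stream:all row.
SAMPLE_TRACE = json.dumps({
    "traceEvents": [
        {"ph": "M", "name": "process_name", "pid": 3,
         "args": {"name": "/job:localhost/replica:0/task:0/gpu:0 Compute"}},
        {"ph": "M", "name": "process_name", "pid": 5,
         "args": {"name": "/gpu:0/stream:all Compute"}},
        {"ph": "X", "name": "MatMul", "pid": 3, "tid": 0, "ts": 100, "dur": 94},
        {"ph": "X", "name": "MatMul", "pid": 5, "tid": 0, "ts": 120, "dur": 72},
    ]
})


def durations_by_row(trace_json):
    """Return {row name: {op name: total duration in microseconds}}."""
    events = json.loads(trace_json)["traceEvents"]
    # Metadata events ("ph": "M") map each pid to the label shown in the viewer.
    row_names = {e["pid"]: e["args"]["name"] for e in events
                 if e.get("ph") == "M" and e.get("name") == "process_name"}
    totals = defaultdict(lambda: defaultdict(int))
    for e in events:
        if e.get("ph") == "X":  # complete events carry "ts" and "dur"
            row = row_names.get(e["pid"], "pid %d" % e["pid"])
            totals[row][e["name"]] += e["dur"]
    return {row: dict(ops) for row, ops in totals.items()}


if __name__ == "__main__":
    for row, ops in sorted(durations_by_row(SAMPLE_TRACE).items()):
        for op, dur_us in sorted(ops.items()):
            print("%-50s %-8s %.3f ms" % (row, op, dur_us / 1000.0))
```

With a real trace, you would read the JSON file written by the profiler and pass its contents to `durations_by_row` instead of `SAMPLE_TRACE`.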


Source: https://habr.com/ru/post/1674673/

