Understanding TensorFlow profiling results

Question

This example shows how to profile TensorFlow programs. I used this tool to profile my program, a simple LSTM, and the results are shown as:

/gpu:0/stream:all Compute(pid 5)

/job:localhost/replica:0/task:0/gpu:0 Compute(pid 3)

My question is:

a) What is the meaning of each line?

b) In particular, what is the difference between /gpu:0/stream:all Compute(pid 5) and /job:localhost/replica:0/task:0/gpu:0 Compute(pid 3)?

c) Why are their running times different, namely 0.072ms and 0.094ms?


pgplus1628 Apr 12 '17 at 14:32

1 answer

Pete Warden · Accepted Answer · 2017-04-18T01:22:57+0000

Here's an update from one of the engineers:

The '/gpu:0/stream:*' timelines are hardware traces of CUDA kernel execution times.

The '/gpu:0' lines show the TF software device enqueueing ops on the CUDA stream.
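To see how the rows in the question relate to the underlying trace file, here is a minimal sketch that reads a trace in Chrome trace-event format (the JSON format loaded in chrome://tracing) and sums event durations per row. It assumes each row is a "process" named by a `process_name` metadata event, with durations stored in microseconds. The sample trace is hand-written for illustration, not real profiler output: the op name `MatMul` and the timestamps are hypothetical, and pairing 0.072ms with the stream row and 0.094ms with the device row simply follows the order in the question.

```python
import json
from collections import defaultdict

# Hypothetical, hand-written trace in Chrome trace-event format.
# The process names and durations mirror the question; the op name,
# timestamps, and which duration belongs to which row are assumptions.
SAMPLE_TRACE = """
{"traceEvents": [
  {"ph": "M", "name": "process_name", "pid": 3,
   "args": {"name": "/job:localhost/replica:0/task:0/gpu:0 Compute"}},
  {"ph": "M", "name": "process_name", "pid": 5,
   "args": {"name": "/gpu:0/stream:all Compute"}},
  {"ph": "X", "name": "MatMul", "pid": 3, "tid": 0, "ts": 100, "dur": 94},
  {"ph": "X", "name": "MatMul", "pid": 5, "tid": 0, "ts": 120, "dur": 72}
]}
"""


def total_duration_by_process(trace_json):
    """Return {process name: total duration in ms} over complete ('X') events."""
    events = json.loads(trace_json)["traceEvents"]
    names = {}                      # pid -> human-readable row name
    totals_us = defaultdict(int)    # pid -> summed duration in microseconds
    for ev in events:
        if ev.get("ph") == "M" and ev.get("name") == "process_name":
            # Metadata event: gives the label shown for this pid's row.
            names[ev["pid"]] = ev["args"]["name"]
        elif ev.get("ph") == "X":
            # Complete event: 'dur' is its duration in microseconds.
            totals_us[ev["pid"]] += ev.get("dur", 0)
    return {
        names.get(pid, "pid %d" % pid): us / 1000.0
        for pid, us in totals_us.items()
    }


if __name__ == "__main__":
    for name, ms in sorted(total_duration_by_process(SAMPLE_TRACE).items()):
        print("%s: %.3f ms" % (name, ms))
```

The point of the sketch is that the same op can appear once per row, each with its own timestamp and duration, because each row is recorded from a different vantage point. To use it on a real profile, pass the contents of the generated trace file to `total_duration_by_process` instead of `SAMPLE_TRACE`.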