Communication between devices

As the TensorFlow whitepaper states, communication between devices in TensorFlow is achieved by adding "Send" and "Receive" nodes to the graph at device boundaries.

My understanding is that the device (assuming only CPU devices are involved) is responsible for executing the operation, while the data (for example, a tensor produced by an operation, or a variable's buffer) lives in memory. What I don't understand is how data is physically transferred from one device to another. I assume the transfer is done through shared memory. Is that right?

I would appreciate any explanation or pointers to the relevant code showing how the data is transferred. PS: Figure 4 of the TensorFlow whitepaper shows the communication mechanism between devices.

1 answer

In TensorFlow, communication between devices is implemented through an interface called Rendezvous, which has several different implementations depending on the deployment. A comment on that interface describes the general idea:

// A Rendezvous is an abstraction for passing a Tensor
// from a producer to a consumer, where the consumer may safely
// request the Tensor before or after it has been produced.  A
// producer never blocks when using a Rendezvous.  A consumer has the
// choice of making a blocking call or providing a callback: in either
// case, the consumer receives the Tensor as soon as it is available.

When a TensorFlow graph is partitioned across devices, matching pairs of Send and Recv ops are inserted at the points where an edge crosses a device boundary. Each Send/Recv pair shares a "rendezvous key" (generated automatically from, among other things, the source and destination devices and the name of the tensor being transferred) so that the two ops can find each other. The Send op implementation is simple: it calls Rendezvous::Send(), passing in its rendezvous key and the single input tensor, then returns immediately without blocking. The Recv op implementation is slightly more involved: it registers a callback that is invoked when the tensor with the given rendezvous key becomes available. That callback is responsible for "producing" the output of the Recv op and unblocking the subsequent computation.

The different Rendezvous implementations handle the actual data movement:

- IntraProcessRendezvous handles transfers between devices in the same process. If the transfer is between two CPU devices in the same process, it can be achieved with a simple Tensor assignment; otherwise TensorFlow invokes a device-specific routine to copy the data between CPU and GPU devices.

- The BaseRemoteRendezvous class and its subclasses handle communication across processes, where the sender and receiver may be on different workers. The main implementation in the open-source distribution is RpcRemoteRendezvous, which uses gRPC to perform the remote transfer.


Source: https://habr.com/ru/post/1659735/
