You should use UDP; it is already pretty fast. At least it was fast enough for W32/SQLSlammer to spread across the whole internet.
For your original question, have a look at the Linux system calls (vm)splice and tee.
From the man page:
The three system calls splice(2), vmsplice(2), and tee(2) provide user-space programs with full control over an arbitrary kernel buffer, implemented within the kernel using the same type of buffer that is used for a pipe. In overview, these system calls perform the following tasks:
splice(2)
moves data from the buffer to an arbitrary file descriptor, or vice versa, or from one buffer to another.
tee(2)
"copies" the data from one buffer to another.
vmsplice(2)
"copies" data from user space into the buffer.
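Though we talk of copying, actual copies are generally avoided. The kernel does this by implementing a pipe buffer as a set of reference-counted pointers to pages of kernel memory. The kernel creates "copies" of pages in a buffer by creating new pointers (for the output buffer) referring to the pages, and increasing the reference counts for the pages: only pointers are copied, not the pages of the buffer.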
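To make the three calls a bit more concrete, here is a minimal sketch in C of how they could be chained together on Linux. The function name `splice_demo` and its parameters are made up for illustration; it only uses pipes, and it assumes the message fits in a single pipe write (at most `PIPE_BUF` bytes), so treat it as a starting point rather than a finished implementation.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <limits.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (not part of any library):
 *   1. vmsplice(2) maps `len` bytes of user memory into pipe A,
 *   2. tee(2) duplicates pipe A's contents into pipe B without consuming them,
 *   3. splice(2) moves pipe A's contents into pipe C.
 * The data is then read back from B and C so the caller can inspect it.
 * Returns 0 on success, -1 on any failure. */
int splice_demo(const char *msg, size_t len, char *out_tee, char *out_splice)
{
    int a[2] = {-1, -1}, b[2] = {-1, -1}, c[2] = {-1, -1};
    int rc = -1;

    /* Assumption: keep the message small so no call transfers partially. */
    if (len == 0 || len > PIPE_BUF)
        return -1;

    if (pipe(a) != 0 || pipe(b) != 0 || pipe(c) != 0)
        goto done;

    /* User space -> pipe A. The memory must stay unmodified until consumed. */
    struct iovec iov = { .iov_base = (void *)msg, .iov_len = len };
    if (vmsplice(a[1], &iov, 1, 0) != (ssize_t)len)
        goto done;

    /* Pipe A -> pipe B; pipe A still holds the data afterwards. */
    if (tee(a[0], b[1], len, 0) != (ssize_t)len)
        goto done;

    /* Pipe A -> pipe C; this consumes the data from pipe A. */
    if (splice(a[0], NULL, c[1], NULL, len, SPLICE_F_MOVE) != (ssize_t)len)
        goto done;

    /* Read both copies back into user space for inspection. */
    if (read(b[0], out_tee, len) != (ssize_t)len)
        goto done;
    if (read(c[0], out_splice, len) != (ssize_t)len)
        goto done;

    rc = 0;

done:
    /* Close every descriptor that was actually opened. */
    for (int i = 0; i < 2; i++) {
        if (a[i] >= 0) close(a[i]);
        if (b[i] >= 0) close(b[i]);
        if (c[i] >= 0) close(c[i]);
    }
    return rc;
}
```

Calling `splice_demo("hello", 5, buf1, buf2)` with two caller-provided buffers of at least 5 bytes should leave the same 5 bytes in both. In a real program the destination of splice(2) would more likely be a socket or a file descriptor than a third pipe; the pipe is used here only to keep the sketch self-contained.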