Memory barriers in DirectX and OpenGL

I am trying to sort out the memory barrier functions in DirectX and OpenGL. My ultimate goal is to implement the HLSL memory barrier functions in GLSL. I find the documentation for both DX and GL rather obscure. There are six synchronization functions in HLSL:

  • GroupMemoryBarrier()
  • GroupMemoryBarrierWithGroupSync()
  • DeviceMemoryBarrier()
  • DeviceMemoryBarrierWithGroupSync()
  • AllMemoryBarrier()
  • AllMemoryBarrierWithGroupSync()

I think I understand the WithGroupSync suffix: it means "block execution of all threads in the group until all of them have reached this call." I am also almost 100% sure that this is what the barrier() function does in GLSL.
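To show what I mean by that, here is a minimal GLSL sketch of how I would expect barrier() to be used. The buffer name, binding, and group size are made up for illustration, and I am assuming the dispatch covers exactly the number of elements in the buffer. I added memoryBarrierShared() before barrier() to be on the safe side; I am not sure it is strictly required.

```glsl
#version 430
layout(local_size_x = 64) in;

// Hypothetical buffer, named here only for illustration.
layout(std430, binding = 0) buffer Data { float values[]; };

// One element per invocation in the work group.
shared float tile[64];

void main() {
    uint lid = gl_LocalInvocationID.x;
    uint gid = gl_GlobalInvocationID.x;

    // Each invocation writes its own slot in group-shared memory.
    tile[lid] = values[gid];

    // Make the shared write visible, then block until every invocation
    // in the work group has reached this point (my reading of WithGroupSync).
    memoryBarrierShared();
    barrier();

    // Reading a neighbour's slot should now be safe.
    values[gid] = tile[(lid + 1u) % 64u];
}
```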

I'm less sure what device memory, group memory, and all memory refer to. My current guess is that

  • Group memory is group-shared memory only.
  • Device memory is GPU memory (e.g. textures, buffers)
  • All memory is device memory and group-shared memory combined
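To make those categories concrete, this is roughly how I picture them in a GLSL compute shader. All names and bindings are made up, and the classification in the comments is my guess, not something I found in the documentation.

```glsl
#version 430
layout(local_size_x = 64) in;

// "Group memory" (my guess): visible only within one work group.
shared float groupData[64];

// "Device memory" (my guess): resources that live in GPU memory.
layout(std430, binding = 0) buffer Results { float results[]; };
layout(r32f, binding = 0) uniform image2D outputImage;

void main() {
    uint lid = gl_LocalInvocationID.x;
    uint gid = gl_GlobalInvocationID.x;

    groupData[lid] = float(lid);                 // group memory write
    results[gid] = groupData[lid];               // device memory write (buffer)
    imageStore(outputImage, ivec2(int(gid), 0),  // device memory write (image)
               vec4(groupData[lid]));
}
```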

What I really don't understand is how this maps to the GLSL sync functions:

  • groupMemoryBarrier(). The documentation reads: "groupMemoryBarrier waits on the completion of all memory accesses performed by an invocation of a compute shader relative to the same access performed by other invocations in the same work group and then returns with no other effect." Main question:

    • Although it has the same name as the HLSL function, it looks like it waits for all memory accesses to complete, not just group-shared memory accesses.
  • memoryBarrier(). The documentation reads: "memoryBarrier waits on the completion of all memory accesses resulting from the use of image variables or atomic counters and then returns with no other effect." Questions:

    • Does it really wait only for image and atomic counter accesses and ignore buffer and shared memory? (The existence of memoryBarrierBuffer and memoryBarrierShared suggests that those memory types are synchronized separately.)
    • What is the difference between groupMemoryBarrier() and memoryBarrier() in a compute shader? I can only imagine that the latter waits for ALL accesses in ALL threads to complete, which would have a huge performance impact and is therefore not allowed in HLSL.
  • memoryBarrierBuffer, memoryBarrierImage and memoryBarrierAtomicCounter. The documentation says: "memoryBarrier* waits on the completion of all memory accesses resulting from the use of buffer / image / atomic counter variables and then returns with no other effect." Question:
    • I cannot tell whether this applies only to threads in one work group or whether it synchronizes all threads in all work groups.

The following shows how I understand the mapping of HLSL functions to GLSL:

  • GroupMemoryBarrier() = groupMemoryBarrier() + memoryBarrierShared()
  • GroupMemoryBarrierWithGroupSync() = GroupMemoryBarrier() + barrier()
  • DeviceMemoryBarrier() = memoryBarrierBuffer() + memoryBarrierImage() + memoryBarrierAtomicCounter()
  • DeviceMemoryBarrierWithGroupSync() = DeviceMemoryBarrier() + barrier()
  • AllMemoryBarrier() = all barrier functions
  • AllMemoryBarrierWithGroupSync() = AllMemoryBarrier() + barrier()
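Written out as GLSL helpers, my tentative mapping would look roughly like this. The hlsl_ prefixed names are made up to avoid clashing with the built-ins. This only restates my guess from the list above and I have not verified that it is correct. I am also assuming that calling barrier() from a helper function is fine in a compute shader as long as the control flow is uniform.

```glsl
#version 430
layout(local_size_x = 64) in;

// Hypothetical wrappers; names and mapping are my own guess.
void hlsl_GroupMemoryBarrier() {
    groupMemoryBarrier();
    memoryBarrierShared();
}

void hlsl_DeviceMemoryBarrier() {
    memoryBarrierBuffer();
    memoryBarrierImage();
    memoryBarrierAtomicCounter();
}

// "All barrier functions", taken literally.
void hlsl_AllMemoryBarrier() {
    memoryBarrier();
    groupMemoryBarrier();
    memoryBarrierShared();
    memoryBarrierBuffer();
    memoryBarrierImage();
    memoryBarrierAtomicCounter();
}

// WithGroupSync variants: the memory barrier followed by barrier().
void hlsl_GroupMemoryBarrierWithGroupSync()  { hlsl_GroupMemoryBarrier();  barrier(); }
void hlsl_DeviceMemoryBarrierWithGroupSync() { hlsl_DeviceMemoryBarrier(); barrier(); }
void hlsl_AllMemoryBarrierWithGroupSync()    { hlsl_AllMemoryBarrier();    barrier(); }

void main() {
    // Example call site; must be reached by every invocation in the group.
    hlsl_AllMemoryBarrierWithGroupSync();
}
```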

I would really appreciate any help in sorting this out.
