I am trying to sort out the memory barrier functions in DirectX and OpenGL. My ultimate goal is to implement the HLSL memory barrier functions in GLSL. I find the documentation for both DX and GL rather obscure. There are six synchronization functions in HLSL:
- GroupMemoryBarrier()
- GroupMemoryBarrierWithGroupSync()
- DeviceMemoryBarrier()
- DeviceMemoryBarrierWithGroupSync()
- AllMemoryBarrier()
- AllMemoryBarrierWithGroupSync()
I think I do understand the WithGroupSync suffix, which means "block the execution of all threads in the group until all of them reach this call." I am also almost 100% sure that this is what the barrier() function does in GLSL.
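To make that reading of barrier() concrete, here is a minimal sketch of the usual shared-memory pattern in a GLSL compute shader. The buffer name `Data`, the array `tile`, and the group size of 64 are illustrative assumptions, and the sketch assumes the dispatch covers the buffer exactly.

```glsl
#version 430
// Hypothetical group size; chosen only for illustration.
layout(local_size_x = 64) in;

// Hypothetical storage buffer, assumed to hold a multiple of 64 floats.
layout(std430, binding = 0) buffer Data { float values[]; };

// Memory shared by all invocations of one work group.
shared float tile[64];

void main() {
    uint lid = gl_LocalInvocationID.x;
    uint gid = gl_GlobalInvocationID.x;

    // Each invocation writes one element of the shared array.
    tile[lid] = values[gid];

    // Order the shared-memory write, then wait until every invocation
    // in the work group has reached this point.
    memoryBarrierShared();
    barrier();

    // Now it should be safe to read an element written by another invocation.
    values[gid] = tile[63u - lid];
}
```

If my understanding is correct, the `memoryBarrierShared(); barrier();` pair here plays the role that GroupMemoryBarrierWithGroupSync() plays in HLSL.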
What I am not sure about is what device memory, group memory, and all memory are. My current idea is that:
- Group memory is groupshared memory only.
- Device memory is GPU memory (e.g. textures, buffers).
- All memory is the union of device memory and groupshared memory.
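For reference, this is how I picture those memory classes on the GLSL side. It is only a sketch of my current understanding, not something confirmed by the documentation; all resource names and bindings are made up for illustration.

```glsl
#version 430
layout(local_size_x = 64) in;

// "Group" memory as I understand it: visible only within one work group.
shared uint groupData[64];

// "Device" memory as I understand it: resources living in GPU memory.
layout(std430, binding = 0) buffer Results { uint results[]; };   // buffer variable
layout(r32ui, binding = 0) uniform uimage2D image;                // image variable
layout(binding = 0, offset = 0) uniform atomic_uint counter;      // atomic counter

void main() {
    uint lid = gl_LocalInvocationID.x;

    // Access to group memory.
    groupData[lid] = lid;

    // Accesses to the three kinds of device memory.
    uint ticket = atomicCounterIncrement(counter);
    imageStore(image, ivec2(int(lid), 0), uvec4(ticket));
    results[gl_GlobalInvocationID.x] = groupData[lid] + ticket;
}
```

Under this reading, "all memory" would cover every access in `main()` above.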
What I really don't understand is how this maps to the GLSL sync functions:
groupMemoryBarrier(). The documentation reads: "groupMemoryBarrier waits on the completion of all memory accesses performed by a compute shader invocation relative to the same access performed by other invocations in the same work group and then returns with no other effect." Main question:
- Although it has the same name as the HLSL function, it looks like it waits for all memory transactions to complete, not only those to group shared memory. Is that right?
memoryBarrier(). The documentation reads: "memoryBarrier waits on the completion of all memory accesses resulting from the use of image variables or atomic counters and then returns with no other effect." Questions:
- Does it really wait only for image and atomic counter operations to complete, ignoring buffer and shared memory? (The existence of memoryBarrierBuffer and memoryBarrierShared suggests that other types of memory are synchronized separately.)
- What is the difference between groupMemoryBarrier() and memoryBarrier() in a compute shader? I can only imagine that the latter waits for ALL transactions on all threads to complete, which would have a huge performance impact and is therefore not available in HLSL.
memoryBarrierBuffer(), memoryBarrierImage() and memoryBarrierAtomicCounter(). The documentation says: "memoryBarrier* waits on the completion of all memory accesses resulting from the use of buffer / image / atomic counter variables and then returns with no other effect." Question:
- I cannot tell whether this applies only to threads in one work group or whether synchronization is performed across all threads in all work groups.
The following shows how I understand the mapping of HLSL functions to GLSL:
- GroupMemoryBarrier() = groupMemoryBarrier() + memoryBarrierShared()
- GroupMemoryBarrierWithGroupSync() = GroupMemoryBarrier() + barrier()
- DeviceMemoryBarrier() = memoryBarrierBuffer() + memoryBarrierImage() + memoryBarrierAtomicCounter()
- DeviceMemoryBarrierWithGroupSync() = DeviceMemoryBarrier() + barrier()
- AllMemoryBarrier() = all barrier functions
- AllMemoryBarrierWithGroupSync() = AllMemoryBarrier() + barrier()
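Written out as GLSL helper functions, the mapping I have in mind would look roughly like the sketch below. The `hlsl*` function names are my own invention, and the mapping itself is exactly the guess described above; I do not know whether it is correct, or whether some of the calls are redundant.

```glsl
#version 430
layout(local_size_x = 64) in;

// Hypothetical wrappers: the hlsl* names are made up, and each body is
// only my tentative mapping, not a verified equivalence.

void hlslGroupMemoryBarrier() {
    groupMemoryBarrier();
    memoryBarrierShared();
}

void hlslDeviceMemoryBarrier() {
    memoryBarrierBuffer();
    memoryBarrierImage();
    memoryBarrierAtomicCounter();
}

// "All barrier functions", taken literally.
void hlslAllMemoryBarrier() {
    memoryBarrier();
    groupMemoryBarrier();
    memoryBarrierShared();
    memoryBarrierBuffer();
    memoryBarrierImage();
    memoryBarrierAtomicCounter();
}

// The WithGroupSync variants add an execution barrier on top.
void hlslGroupMemoryBarrierWithGroupSync()  { hlslGroupMemoryBarrier();  barrier(); }
void hlslDeviceMemoryBarrierWithGroupSync() { hlslDeviceMemoryBarrier(); barrier(); }
void hlslAllMemoryBarrierWithGroupSync()    { hlslAllMemoryBarrier();    barrier(); }

void main() {
    // Example call site; barrier() must be reached in uniform control flow.
    hlslAllMemoryBarrierWithGroupSync();
}
```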
I would really appreciate any help in sorting this out.