I always thought that divergence within a warp was caused only by branching constructs, for example `if`, `else`, `for`, `switch`, etc. However, I recently read an article that says:
"You can clearly see that the number of divergent branches taken by threads in each non-coalesced version of the algorithm is at least two times higher than in the fully coalesced strategy. As a rule, this is the result of additional non-coalesced accesses to global memory. This thread divergence therefore leads to many memory accesses that have to be serialized, which increases the total number of instructions executed.
You may notice that the number of warp serializations for the version using non-coalesced accesses is seven to sixteen times higher than for its coalesced counterpart. Indeed, thread divergence caused by non-coalesced accesses leads to numerous memory accesses that have to be serialized, increasing the number of instructions executed."
It seems that, according to the author, non-coalesced memory accesses can themselves cause divergent branches. Is that true? My question is: how many causes of divergence within a warp are there? (See the sketch below for the two patterns I have in mind.) Thanks in advance.
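For concreteness, here is a toy kernel I wrote myself to illustrate the two patterns I am asking about. It is only a sketch, not code from the article; the stride of 32 is an arbitrary value I chose to break coalescing.

    #include <cstdio>
    #include <cuda_runtime.h>

    // Toy kernel (my own sketch, not from the article) showing the two
    // patterns in question.
    __global__ void demo(const float *in, float *out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;

        // (1) Classic branch divergence: threads of the same warp take
        //     different paths, so the warp executes both paths one after
        //     the other.
        if (threadIdx.x % 2 == 0)
            out[i] = in[i] * 2.0f;
        else
            out[i] = in[i] + 1.0f;

        // (2) Non-coalesced (strided) global load: consecutive threads
        //     read addresses 32 floats apart, so the warp's single load
        //     is split into many memory transactions that are serviced
        //     separately, even though there is no "if" here.
        int stride = 32;               // arbitrary stride, for illustration
        int j = (i * stride) % n;
        out[i] += in[j];
    }

    int main()
    {
        const int n = 1 << 20;
        float *in, *out;
        cudaMalloc(&in, n * sizeof(float));
        cudaMalloc(&out, n * sizeof(float));
        cudaMemset(in, 0, n * sizeof(float));
        demo<<<(n + 255) / 256, 256>>>(in, out, n);
        cudaDeviceSynchronize();
        printf("status: %s\n", cudaGetErrorString(cudaGetLastError()));
        cudaFree(in);
        cudaFree(out);
        return 0;
    }

Pattern (1) is what I have always understood as the cause of divergence; pattern (2) is what the quoted article seems to say also leads to serialization. Is (2) really counted as branch divergence, or is it a separate effect (warp serialization) that the profiler just reports alongside it?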