The short answer is that they are synchronized automatically... and in the same way that all other instructions are synchronized (for example, by pipeline hazard checking). On processors that can issue multiple instructions per cycle, NEON instructions can be issued alongside non-NEON instructions.
NEON is part of the core and uses the same caches as ordinary load/store instructions. However, this also means that on some processors it can be inefficient to mix NEON and non-NEON loads and stores, or to move data between NEON and general-purpose registers.
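As a rough illustration of the register-transfer point, here is a minimal sketch in C using NEON intrinsics. The function name `sum_u32` is hypothetical and not from any library. It keeps the running sum in a NEON register for the whole loop and moves the result to a general-purpose register only once at the end, rather than extracting lanes on every iteration. Whether this actually matters depends on the processor, so treat it as a pattern to measure, not a guaranteed win. A scalar fallback is included so the code also builds where NEON is not available.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: sums n unsigned 32-bit values (wrapping on overflow). */
uint32_t sum_u32(const uint32_t *data, size_t n)
{
    uint32_t total = 0;
    size_t i = 0;

#if defined(__ARM_NEON)
    /* Accumulate four lanes at a time; everything stays in NEON registers. */
    uint32x4_t acc = vdupq_n_u32(0);
    for (; i + 4 <= n; i += 4) {
        acc = vaddq_u32(acc, vld1q_u32(data + i));
    }

    /* Reduce the four lanes to one inside the NEON unit. */
    uint32x2_t half = vadd_u32(vget_low_u32(acc), vget_high_u32(acc));
    half = vpadd_u32(half, half);

    /* The only NEON to general-purpose register transfer in the function. */
    total = vget_lane_u32(half, 0);
#endif

    /* Scalar tail (or the whole array when NEON is not available). */
    for (; i < n; i++) {
        total += data[i];
    }
    return total;
}
```

The version to avoid on processors where such transfers are slow would call `vgetq_lane_u32` inside the loop and add each lane to a scalar variable, forcing a NEON to general-purpose move on every iteration.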