Reverse vector order in ARM NEON intrinsics

I am trying to reverse the element order of a 128-bit vector (uint16x8_t).

For example, if I have

abcdefgh 

I would like to receive

 hgfedcba 

Is there an easy way to do this with NEON? I tried with VREV, but it does not work.

1 answer

You want the vrev64.16 instruction; however, it only reverses elements within each 64-bit double word and does not swap the two double-word (D) registers that make up a quad-word (Q) register. You need an additional vswp for that.
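To see why the two steps are needed, here is a minimal scalar sketch in portable C that models the effect of each instruction on eight 16-bit elements. The function names `reverse_within_groups` and `swap_halves` are hypothetical helpers made up for illustration, not NEON intrinsics. With the question's example, abcdefgh becomes dcbahgfe after the first step and hgfedcba after the second.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a NEON intrinsic): reverse the order of the
   16-bit elements inside each consecutive group of `group` elements.
   With group = 4 this models what vrev64.16 does to each 64-bit half
   of a 128-bit register. `in` and `out` must not overlap. */
void reverse_within_groups(const uint16_t *in, uint16_t *out,
                           size_t n, size_t group)
{
    assert(group > 0 && n % group == 0);
    for (size_t base = 0; base < n; base += group) {
        for (size_t i = 0; i < group; i++) {
            out[base + i] = in[base + group - 1 - i];
        }
    }
}

/* Hypothetical helper: exchange the low and high halves of an n-element
   array, modelling vswp applied to the two D registers of a Q register.
   `in` and `out` must not overlap. */
void swap_halves(const uint16_t *in, uint16_t *out, size_t n)
{
    assert(n % 2 == 0);
    size_t half = n / 2;
    for (size_t i = 0; i < half; i++) {
        out[i] = in[half + i];
        out[half + i] = in[i];
    }
}
```

Calling `reverse_within_groups(in, mid, 8, 4)` followed by `swap_halves(mid, out, 8)` yields the fully reversed array, which is the same composition the NEON sequence performs in registers.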

With intrinsics,

 q = vrev64q_u16(q) 

should do the trick of reversing the elements inside each double word; you then need to swap the two double words of the quad register. This becomes cumbersome because there is no direct vswp intrinsic, which forces you to use something like

 q = vcombine_u16(vget_high_u16(q), vget_low_u16(q)) 

which actually ends up compiled to a vswp instruction.
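The two intrinsic calls can be wrapped into a small reusable helper. The sketch below is one possible packaging, not part of the original answer: the names `reverse_u16x8` and `reverse8_u16` are hypothetical, and the scalar fallback is an assumption added only so the snippet also compiles on targets without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* Hypothetical helper: reverse all eight 16-bit lanes of a Q register. */
static inline uint16x8_t reverse_u16x8(uint16x8_t q)
{
    q = vrev64q_u16(q);                                     /* reverse inside each 64-bit half */
    return vcombine_u16(vget_high_u16(q), vget_low_u16(q)); /* swap the two halves */
}
#endif

/* Hypothetical wrapper: reverse eight uint16_t values from src into dst.
   src and dst may point to the same buffer. */
void reverse8_u16(const uint16_t *src, uint16_t *dst)
{
    assert(src != NULL && dst != NULL);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    vst1q_u16(dst, reverse_u16x8(vld1q_u16(src)));
#else
    /* Scalar fallback (an assumption for non-NEON builds): copy first so
       that in-place use with src == dst stays correct. */
    uint16_t tmp[8];
    for (size_t i = 0; i < 8; i++) {
        tmp[i] = src[i];
    }
    for (size_t i = 0; i < 8; i++) {
        dst[i] = tmp[7 - i];
    }
#endif
}
```

Keeping the register-level helper separate from the load/store wrapper lets the reversal be reused inside a larger NEON kernel without a round trip through memory.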

The following is an example.

 #include <stdio.h>
 #include <stdlib.h>
 #include <arm_neon.h>

 int main()
 {
     uint16_t s[] = {0x101, 0x102, 0x103, 0x104, 0x105, 0x106, 0x107, 0x108};
     uint16_t *t = malloc(sizeof(uint16_t) * 8);
     for (int i = 0; i < 8; i++) { t[i] = 0; }

     uint16x8_t a = vld1q_u16(s);
     a = vrev64q_u16(a);                                   /* reverse inside each 64-bit half */
     a = vcombine_u16(vget_high_u16(a), vget_low_u16(a));  /* swap the two halves */
     vst1q_u16(t, a);

     for (int i = 0; i < 8; i++) { printf("0x%3x ", t[i]); }
     printf("\n");

     free(t);
     return 0;
 }

which generates the assembly as shown below

 vld1.16 {d16-d17}, [sp:64]
 movs r4, #0
 vrev64.16 q8, q8
 vswp d16, d17
 vst1.16 {d16-d17}, [r5]

and displays

 $ rev
 0x108 0x107 0x106 0x105 0x104 0x103 0x102 0x101

Source: https://habr.com/ru/post/1501813/

