How to efficiently compare a string with a short string literal

To avoid the overhead of calling strcmp, I compare strings with short string literals as shown in the following code example:

    #ifdef LITTLE_ENDIAN //little-endian-addressing
    #define BytesAsDWord_M(a, b, c, d)\
        ((ulong) ((a) | ((b) << 8) | ((ulong) (c) << 16) | ((ulong) (d) << 24)))
    #define BytesAsWord_M(a, b)((ushort) ((a) | ((b) << 8)))
    #else //LITTLE_ENDIAN //little-endian-addressing
    #define BytesAsDWord_M(a, b, c, d)\
        ((ulong) ((d) | ((c) << 8) | ((b) << 16) | ((a) << 24)))
    #define BytesAsWord_M(a, b) ((ushort) ((b) | ((a) << 8)))
    #endif //LITTLE_ENDIAN //little-endian-addressing

    bool AbsCompare(char* chr_p) //compare string with "abs"
    {
        if (*((ulong*) &chr_p[1]) == BytesAsDWord_M('a', 'b', 's', '\0'))
            return true;
        return false;
    }

gcc compiles this example as long as I compile with no optimization options enabled. When optimization is enabled, I get a warning:

"dereferencing type-punned pointer will break strict-aliasing rules"

Even with -O3, strcmp is not compiled into code as compact as the hand-written comparison, as this example shows:

    //abstest.c
    #include <string.h>

    typedef unsigned long ulong;
    typedef unsigned short ushort;

    #if BYTE_ORDER == LITTLE_ENDIAN //little-endian-addressing
    #define BytesAsDWord_M(a, b, c, d)\
        ((ulong) ((a) | ((b) << 8) | ((ulong) (c) << 16) | ((ulong) (d) << 24)))
    #define BytesAsWord_M(a, b)((ushort) ((a) | ((b) << 8)))
    #else //BYTE_ORDER == LITTLE_ENDIAN //little-endian-addressing
    #define BytesAsDWord_M(a, b, c, d)\
        ((ulong) ((d) | ((c) << 8) | ((b) << 16) | ((a) << 24)))
    #define BytesAsWord_M(a, b) ((ushort) ((b) | ((a) << 8)))
    #endif //BYTE_ORDER == LITTLE_ENDIAN //little-endian-addressing

    int AbsCompare1(char* chr_p)
    {
        return *(ulong*) chr_p == BytesAsDWord_M('a', 'b', 's', '\0');
    }

    int AbsCompare2(char* chr_p)
    {
        return strcmp(chr_p, "abs");
    }

    int main(int argc __attribute__((unused)), char ** argv)
    {
        int i;
        int j;

        i = AbsCompare1(argv[0]);
        j = AbsCompare2(argv[0]);
        return i + j;
    }

objdump -d -Mintel abstest:

    080483d0 <AbsCompare1>:
     80483d0:  55                    push   ebp
     80483d1:  89 e5                 mov    ebp,esp
     80483d3:  8b 45 08              mov    eax,DWORD PTR [ebp+0x8]
     80483d6:  5d                    pop    ebp
     80483d7:  81 38 61 62 73 00     cmp    DWORD PTR [eax],0x736261
     80483dd:  0f 94 c0              sete   al
     80483e0:  0f b6 c0              movzx  eax,al
     80483e3:  c3                    ret

    080483f0 <AbsCompare2>:
     80483f0:  55                    push   ebp
     80483f1:  0f b6 0d 5c 85 04 08  movzx  ecx,BYTE PTR ds:0x804855c
     80483f8:  89 e5                 mov    ebp,esp
     80483fa:  8b 55 08              mov    edx,DWORD PTR [ebp+0x8]
     80483fd:  0f b6 02              movzx  eax,BYTE PTR [edx]
     8048400:  29 c8                 sub    eax,ecx
     8048402:  75 2b                 jne    804842f <AbsCompare2+0x3f>
     8048404:  0f b6 42 01           movzx  eax,BYTE PTR [edx+0x1]
     8048408:  0f b6 0d 5d 85 04 08  movzx  ecx,BYTE PTR ds:0x804855d
     804840f:  29 c8                 sub    eax,ecx
     8048411:  75 1c                 jne    804842f <AbsCompare2+0x3f>
     8048413:  0f b6 42 02           movzx  eax,BYTE PTR [edx+0x2]
     8048417:  0f b6 0d 5e 85 04 08  movzx  ecx,BYTE PTR ds:0x804855e
     804841e:  29 c8                 sub    eax,ecx
     8048420:  75 0d                 jne    804842f <AbsCompare2+0x3f>
     8048422:  0f b6 42 03           movzx  eax,BYTE PTR [edx+0x3]
     8048426:  0f b6 15 5f 85 04 08  movzx  edx,BYTE PTR ds:0x804855f
     804842d:  29 d0                 sub    eax,edx
     804842f:  5d                    pop    ebp
     8048430:  c3                    ret

Is there a way to compare against this short literal directly, without the detour of embedding chr_p in a union, especially since I want to compare chr_p at arbitrary indices such as "&chr_p[1]"?

1 answer

No. Did you know that the compiler already uses its built-in knowledge of strcmp? It will do what you want (eliminate the call overhead) without you resorting to type punning in the source. Such code transformations are usually performed in the code generator, after the compiler has taken full advantage of alias analysis.
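Applied to the offset case from the question, this advice amounts to calling strcmp on the offset pointer and leaving the rest to the optimizer. A minimal sketch, where the helper name is_abs_at and its signature are hypothetical and not taken from the question; whether the call is actually inlined depends on the compiler, target, and flags:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (name and signature are illustrative only):
 * returns true if the string beginning at s[offset] is exactly "abs".
 * The caller must ensure that offset lies within the string. */
bool is_abs_at(const char *s, size_t offset)
{
    /* No casts and no endianness macros: the compiler is free to
     * expand this strcmp call inline if its code generator supports it. */
    return strcmp(s + offset, "abs") == 0;
}
```

For example, is_abs_at(chr_p, 1) covers the "&chr_p[1]" comparison from the question.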

If I compile the following program with gcc -O3, no call to strcmp appears in the output.

    #include <string.h>

    int main(int argc, char ** argv)
    {
        return strcmp(argv[0], "abs");
    }

My x86 assembly output, for example, looked like this (gcc version 4.3.2 (Debian 4.3.2-1.1); I know it's old):

    main:
        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ecx
        movl    4(%ecx), %eax
        movl    (%eax), %edx
        movzbl  (%edx), %eax
        subl    $97, %eax
        jne     .L2
        movzbl  1(%edx), %eax
        subl    $98, %eax
        jne     .L2
        movzbl  2(%edx), %eax
        subl    $115, %eax
        jne     .L2
        movzbl  3(%edx), %eax
    .L2:
        popl    %ecx
        popl    %ebp
        leal    -4(%ecx), %esp
        ret

Basically, strcmp was inlined and unrolled. Of course, this depends heavily on the code generator for your target: if it is not yet mature enough, it may still emit a call to strcmp. Even so, ask yourself whether it is worth burdening yourself with ugly code when the code generator may well support this later, at which point you are still stuck with that code.
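If a single word-sized comparison is still wanted, one commonly used way to avoid the strict-aliasing warning is to copy the bytes with memcpy instead of casting the pointer. This is a different technique from the one recommended above, shown only as a sketch. The helper name is_abs_word is hypothetical, uint32_t is an assumption made so that exactly four bytes are compared (unsigned long is wider than that on some platforms), and the caller must guarantee that four bytes are readable at s, just as the AbsCompare function in the question assumes.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (illustrative name): returns true if the four
 * bytes at s are 'a', 'b', 's', '\0', i.e. the string at s is "abs".
 * Precondition: at least four bytes must be readable starting at s. */
bool is_abs_word(const char *s)
{
    uint32_t got;
    uint32_t want;

    /* memcpy is the aliasing-safe way to reinterpret bytes as an integer. */
    memcpy(&got, s, sizeof got);
    /* "abs" occupies exactly four bytes including its terminating NUL. */
    memcpy(&want, "abs", sizeof want);

    return got == want;
}
```

Because both words are filled from memory in the same byte order, no endianness macros are needed, and s may point at any offset such as &chr_p[1]. Compilers commonly reduce a fixed-size memcpy like this to a plain load, but that is worth checking in the generated code for your target.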


Source: https://habr.com/ru/post/1479318/

