Hello to all C coders.
I first looked for questions similar to mine, but could not find any.
How do I fetch / compare 4 bytes in a portable manner (without memcpy / memcmp, of course)?
I never studied C, and because of this I am living proof that, without knowing the basics, everything afterwards becomes an unpleasant mess. In any case, once you are (already) writing words, it is too late to say "start with the alphabet."
ulHashPattern = *(unsigned long *)(pbPattern);
for (a=0; a < ASIZE; a++) bm_bc[a]=cbPattern;
for (j=0; j < cbPattern-1; j++) bm_bc[pbPattern[j]]=cbPattern-j-1;
i=0;
while (i <= cbTarget-cbPattern) {
    if ( *(unsigned long *)&pbTarget[i] == ulHashPattern ) {
This snippet works as is with the 32-bit Windows compiler. I want all such 4-vs-4 comparisons to work under 64-bit Windows and Linux as well. I often need to transfer 2, 4, or 8 bytes; in the example above I need exactly 4 bytes from some pbTarget offset. Here's the real question: what type should I use instead of unsigned long? (I think something close to UINT16, UINT32, UINT64.) In other words, which 3 types do I need to represent 2, 4, and 8 bytes ALWAYS, regardless of the environment?
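A minimal sketch of what such 2-, 4- and 8-byte types might look like, assuming the compiler ships a C99-style <stdint.h> (not every compiler does). Where uint16_t, uint32_t and uint64_t are defined, they are exact-width types with no padding bits. The typedef names check_u16, check_u32, check_u64 and the function fixed_widths_in_bits are hypothetical, used only for illustration.

```c
#include <assert.h>
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* uint16_t, uint32_t, uint64_t (assumed to be available) */

/* Compile-time checks: the array size becomes -1 (a compile error)
   if a type does not have the expected width in bits. */
typedef char check_u16[sizeof(uint16_t) * CHAR_BIT == 16 ? 1 : -1];
typedef char check_u32[sizeof(uint32_t) * CHAR_BIT == 32 ? 1 : -1];
typedef char check_u64[sizeof(uint64_t) * CHAR_BIT == 64 ? 1 : -1];

/* Hypothetical helper: report the widths of the three types in bits. */
static void fixed_widths_in_bits(unsigned out[3])
{
    out[0] = (unsigned)(sizeof(uint16_t) * CHAR_BIT);
    out[1] = (unsigned)(sizeof(uint32_t) * CHAR_BIT);
    out[2] = (unsigned)(sizeof(uint64_t) * CHAR_BIT);
}
```

On platforms with 8-bit bytes these widths correspond to 2, 4 and 8 bytes, independent of whether the target is 32-bit or 64-bit.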
I believe that this basic question causes a lot of problems, so it needs to be clarified.
Update 2012-Jan-16:
@ Richard J. Ross III
I am confused! I don't know whether Linux uses 1] or 2], that is, whether _STD_USING is defined on Linux. In other words, which group is portable: the types uint8_t, ..., uint64_t or _CSTD uint8_t, ..., _CSTD uint64_t?
1] Excerpt from MVS 10.0 stdint.h
typedef unsigned char uint8_t;
typedef unsigned short uint16_t;
typedef unsigned int uint32_t;
typedef _ULonglong uint64_t;
2] Excerpt from MVS 10.0 stdint.h
#if defined(_STD_USING)
...
using _CSTD uint8_t; using _CSTD uint16_t;
using _CSTD uint32_t; using _CSTD uint64_t;
...
There are no problems with Microsoft C 32bit:
; 3401 : if ( *(_CSTD uint32_t *)&pbTarget[i] == *(_CSTD uint32_t *)(pbPattern) )
  01360 8b 04 19     mov eax, DWORD PTR [ecx+ebx]
  01363 8b 7c 24 14  mov edi, DWORD PTR _pbPattern$GSCopy$[esp+1080]
  01367 3b 07        cmp eax, DWORD PTR [edi]
  01369 75 2c        jne SHORT $LN80@Railgun_Qu@6
But when the target is 64-bit code, this is what happens:
D:\_KAZE_Simplicius_Simplicissimus_Septupleton_r2-_strstr_SHORT-SHOWDOWN_r7>cl /Ox /Tcstrstr_SHORT-SHOWDOWN.c /Fastrstr_SHORT-SHOWDOWN /w /FAcs
Microsoft (R) C/C++ Optimizing Compiler Version 15.00.30729.01 for x64
Copyright (C) Microsoft Corporation. All rights reserved.
strstr_SHORT-SHOWDOWN.c
strstr_SHORT-SHOWDOWN.c(1925) : fatal error C1083: Cannot open include file: 'stdint.h': No such file or directory
D:\_KAZE_Simplicius_Simplicissimus_Septupleton_r2-_strstr_SHORT-SHOWDOWN_r7>
What about Linux: is stdint.h always present there?
I did not give up and commented the include out: //#include <stdint.h>. Then the compilation went fine:
; 3401 : if ( !memcmp(&pbTarget[i], &ulHashPattern, 4) )
  01766 49 63 c4     movsxd rax, r12d
  01769 42 39 2c 10  cmp DWORD PTR [rax+r10], ebp
  0176d 75 38        jne SHORT $LN1@Railgun_Qu@6
; 3401 : if ( *(unsigned long *)&pbTarget[i] == ulHashPattern )
  01766 49 63 c4     movsxd rax, r12d
  01769 42 39 2c 10  cmp DWORD PTR [rax+r10], ebp
  0176d 75 38        jne SHORT $LN1@Railgun_Qu@6
This very "unsigned long *" bothers me, since gcc -m64 will fetch a QWORD, not a DWORD, right?
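To make that concern concrete, here is a hedged sketch. Under the usual data models, unsigned long occupies 32 bits on 64-bit Windows (LLP64) and 64 bits on 64-bit Linux (LP64), so the first helper returns different values on the two systems. The second helper is one possible way to read exactly 4 bytes without memcpy and without depending on the width of any type; it fixes the byte order as little-endian, which is an assumption of this sketch. Both function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* uint32_t (assumed to be available) */

/* Storage size of unsigned long in bits: typically 32 under LLP64
   (64-bit Windows) and 64 under LP64 (64-bit Linux). */
static unsigned unsigned_long_bits(void)
{
    return (unsigned)(sizeof(unsigned long) * CHAR_BIT);
}

/* Hypothetical helper: assemble exactly 4 bytes starting at p into a
   32-bit value, least significant byte first. Works for any alignment
   of p and does not cast the pointer. */
static uint32_t fetch4_le(const unsigned char *p)
{
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

A comparison such as fetch4_le(&pbTarget[i]) == fetch4_le(pbPattern) then always covers 4 bytes. Whether a given compiler folds the shifts into a single load is not guaranteed and would need to be checked in the generated listing.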
@Mysticial
Just wanted to show three different translations made by Microsoft CL 32bit v16:
1]
; 3400 : if ( !memcmp(&pbTarget[i], pbPattern, 4) )
  01360 8b 04 19     mov eax, DWORD PTR [ecx+ebx]
  01363 8b 7c 24 14  mov edi, DWORD PTR _pbPattern$GSCopy$[esp+1080]
  01367 3b 07        cmp eax, DWORD PTR [edi]
  01369 75 2c        jne SHORT $LN84@Railgun_Qu@6
2]
; 3400 : if ( !memcmp(&pbTarget[i], &ulHashPattern, 4) )
  01350 8b 44 24 14  mov eax, DWORD PTR _ulHashPattern$[esp+1076]
  01354 39 04 2a     cmp DWORD PTR [edx+ebp], eax
  01357 75 2e        jne SHORT $LN83@Railgun_Qu@6
3]
; 3401 : if ( *(uint32_t *)&pbTarget[i] == ulHashPattern )
  01350 8b 44 24 14  mov eax, DWORD PTR _ulHashPattern$[esp+1076]
  01354 39 04 2a     cmp DWORD PTR [edx+ebp], eax
  01357 75 2e        jne SHORT $LN79@Railgun_Qu@6
The initial goal was to fetch 4 bytes (with a single mov instruction, i.e. *(uint32_t *)&pbTarget[i]) and compare them against a 4-byte register variable, that is, one RAM access and one compare in a single instruction. In the end, thanks to inlining, I was only able to reduce memcmp()'s 3 RAM accesses (when applied to pbPattern, which points to 4 or more bytes) to 2. Now, if I want to use memcmp() on the first 4 bytes of pbPattern held in ulHashPattern (as in 2]), ulHashPattern must not be declared register, whereas 3] has no such restriction.
; 3400 : if ( !memcmp(&pbTarget[i], &ulHashPattern, 4) )
The above line gives an error (ulHashPattern is defined as: register unsigned long ulHashPattern;):
strstr_SHORT-SHOWDOWN.c(3400) : error C2103: '&' on register variable
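One possible way around C2103, sketched under the assumption that memcpy is acceptable here: the address is taken of a temporary inside a small helper, never of the register variable itself. The helpers load_u32 and find_first4 are hypothetical names; only pbTarget, cbTarget, pbPattern and ulHashPattern come from the snippet above. The value is read in native byte order, so it is only meaningful when both sides are loaded the same way.

```c
#include <assert.h>
#include <stdint.h>   /* uint32_t (assumed to be available) */
#include <string.h>   /* memcpy, size_t */

/* Hypothetical helper: copy the 4 bytes at p into a uint32_t.
   The '&' is applied to the local v, not to the caller's variable. */
static uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: index of the first offset in pbTarget whose
   4 bytes equal the first 4 bytes of pbPattern, or -1 if none. */
static long find_first4(const unsigned char *pbTarget, size_t cbTarget,
                        const unsigned char *pbPattern)
{
    /* Declared register as in the post; its address is never taken. */
    register uint32_t ulHashPattern = load_u32(pbPattern);
    size_t i;

    for (i = 0; i + 4 <= cbTarget; i++) {
        if (load_u32(&pbTarget[i]) == ulHashPattern)
            return (long)i;
    }
    return -1;
}
```

Whether the compiler keeps ulHashPattern in a register, and whether it inlines the memcpy into a single load, remains its decision; the assembly listing is the only way to confirm it.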
Yes, you are right: memcmp() saves the day (but with a limitation). Fragment 2] is identical to 3], my dirty style. Obviously, my habit of not calling a function when the same thing can be coded by hand is a thing of the past, but I like it.
However, I'm not quite happy with the compilers: I declared ulHashPattern as a register variable, yet it is loaded from RAM every time?! Maybe I am missing something, but this very line (mov eax, DWORD PTR _ulHashPattern$[esp+1076]) degrades performance. Ugly code, in my opinion.