
Duplicate Literals and Hard Coding

I see the following pattern quite often:

 b->last = ngx_cpymem(b->last, "</pre><hr>", sizeof("</pre><hr>") - 1);

Note that the string literal is used twice. The excerpt is from the nginx code base.

The compiler should be able to merge these literals when they occur within a single compilation unit.

My questions:

  1. Do production-grade compilers (VC++, GCC, LLVM/Clang) remove this redundancy when the duplicates occur within a single compilation unit?
  2. Does the (static) linker eliminate such duplicates when linking object files?
  3. If 2 applies, does this optimization also happen during dynamic linking?
  4. If 1 and 2 apply, do they apply to all literals?

These questions matter because a positive answer lets the programmer be verbose without losing efficiency. Think of huge static data models hard-wired into a program (for example, the rules of a decision support system).

Edit

Two points of clarification:

  • The code above was written by a recognized "master" programmer, who wrote nginx single-handedly.

  • I did not ask which of the possible mechanisms for hard-coding literals is better, so please stay on topic.

Edit 2

Here is another example from the nginx source:

static ngx_conf_bitmask_t  ngx_http_gzip_proxied_mask[] = {
   { ngx_string("off"), NGX_HTTP_GZIP_PROXIED_OFF },
   { ngx_string("expired"), NGX_HTTP_GZIP_PROXIED_EXPIRED },
   { ngx_string("no-cache"), NGX_HTTP_GZIP_PROXIED_NO_CACHE },
   { ngx_string("no-store"), NGX_HTTP_GZIP_PROXIED_NO_STORE },
   { ngx_string("private"), NGX_HTTP_GZIP_PROXIED_PRIVATE },
   { ngx_string("no_last_modified"), NGX_HTTP_GZIP_PROXIED_NO_LM },
   { ngx_string("no_etag"), NGX_HTTP_GZIP_PROXIED_NO_ETAG },
   { ngx_string("auth"), NGX_HTTP_GZIP_PROXIED_AUTH },
   { ngx_string("any"), NGX_HTTP_GZIP_PROXIED_ANY },
   { ngx_null_string, 0 }
};

Some of the same literals then appear again:

static ngx_str_t  ngx_http_gzip_no_cache = ngx_string("no-cache");
static ngx_str_t  ngx_http_gzip_no_store = ngx_string("no-store");
static ngx_str_t  ngx_http_gzip_private = ngx_string("private");

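To make the duplication in these initializers concrete, here is a minimal sketch of a macro in the style of ngx_string. The names my_str_t, MY_STRING and MY_NULL_STRING are hypothetical stand-ins, not the nginx definitions; the assumption is only that such a macro pairs a literal with its length. After expansion the literal appears twice, once under sizeof and once as the data pointer, yet only the pointer needs storage in the binary.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counted-string type; an assumption for illustration,
   not the actual nginx declaration. */
typedef struct {
    size_t               len;
    const unsigned char *data;
} my_str_t;

/* The literal is named twice in the expansion: sizeof(lit) is folded
   to a constant at compile time, (lit) becomes the stored pointer. */
#define MY_STRING(lit)  { sizeof(lit) - 1, (const unsigned char *) (lit) }
#define MY_NULL_STRING  { 0, NULL }

static my_str_t my_no_cache = MY_STRING("no-cache");
static my_str_t my_no_store = MY_STRING("no-store");

/* Compares two counted strings by length and content. */
static int my_str_eq(const my_str_t *a, const my_str_t *b)
{
    return a->len == b->len && memcmp(a->data, b->data, a->len) == 0;
}
```

Whether "no-cache" in a table and "no-cache" in a separate variable end up sharing storage is exactly the merging question asked above; this sketch does not settle it.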

+3
5 answers

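Note that sizeof("</pre><hr>") never places a copy of the string in the output file: the whole sizeof expression is evaluated at compile time to the integer constant 11.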
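The point about sizeof can be checked at compile time. The sketch below assumes a C11 compiler (for _Static_assert); append_pre_hr is a hypothetical helper that only mirrors the shape of the nginx line in the question using plain memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* sizeof on a string literal is an integer constant expression:
   10 visible characters plus the terminating NUL. */
_Static_assert(sizeof("</pre><hr>") == 11, "size includes the NUL");

/* Hypothetical helper: copies the literal without its NUL and returns
   the new end of the buffer, like the ngx_cpymem call in the question. */
static char *append_pre_hr(char *dst)
{
    size_t n = sizeof("</pre><hr>") - 1;   /* folded to 10 at compile time */

    memcpy(dst, "</pre><hr>", n);
    return dst + n;
}
```

Only the literal passed to memcpy has to exist as data; the one under sizeof contributes nothing but the constant.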

+8

, ( #define ) . , , ( , , ).


+8

Tested in C with GCC 4.4.3:

#include <stdlib.h>
#include <string.h>
#include <stdio.h>

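int main(void)
{
    /* sizeof("teststring") counts the terminating NUL, so the copy is a complete C string. */
    char *n = malloc(sizeof("teststring"));
    if (n == NULL) return 1;
    memcpy(n, "teststring", sizeof("teststring"));
    printf("%s\n", n);
    free(n);
    return 0;
}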

Compile it, then search the binary for the literal:

strings a.out|grep teststring


+5
  1. Yes for GCC; it should also be true for the others.
  2. Possibly, with the GNU toolchain: see GCC's -fmerge-constants and -fmerge-all-constants options, which depend on assembler and linker support.
  3. No.
  4. Not sure.
+4

I wrote a small code example and compiled it:

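#include <stdio.h>
#include <string.h>

void func(void)
{
    char ps1[128];
    char ps2[128];

    /* The same literal is written twice on purpose. */
    strcpy(ps1, "string_is_the_same");
    strcpy(ps2, "string_is_the_same");

    /* Use both buffers so that neither copy can be dropped. */
    printf("%s %s\n", ps1, ps2);
}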

As a result, the assembly output contains only one instance of the string_is_the_same literal, even without optimization. However, I am not sure whether the strings are still deduplicated when they are placed in different source files, and therefore in different object files.
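The same experiment can be done at run time instead of by reading the assembly. This is a sketch under the assumption that comparing the addresses of two identical literals is a fair proxy for merging; first_copy, second_copy and literals_share_storage are hypothetical names. The C standard leaves it unspecified whether identical literals are distinct objects, so the result depends on the compiler and its flags.

```c
#include <string.h>

/* Two separate occurrences of the same literal in one translation unit. */
static const char *first_copy(void)  { return "string_is_the_same"; }
static const char *second_copy(void) { return "string_is_the_same"; }

/* Returns 1 if both occurrences were pooled at a single address,
   0 if the compiler kept two distinct copies. Either is conforming. */
static int literals_share_storage(void)
{
    return first_copy() == second_copy();
}

/* The contents are equal in both cases; only the storage may differ. */
static int literals_have_same_text(void)
{
    return strcmp(first_copy(), second_copy()) == 0;
}
```

Calling literals_share_storage() from a program built with and without optimization shows what a given compiler does within one file. It says nothing about literals in different object files, which is the linker's part of the question.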

+4

Source: https://habr.com/ru/post/1752055/

