Get the start and end address of a text section in an executable file

Question

I need to get the start and end address of an executable's text section. How can I get it?

I can get the start address from _init or _start, but what about the end address? Should I take the end of the text section to be the last address before the start of the .rodata section?

Or should I edit the default ld script, add my own symbols to mark the start and end of the text section, and pass it to GCC when compiling? In that case, where should I place the new symbols, and should I take the init and fini sections into account?

What is a good way to get the start and end address of a text section?

+15

c gcc ld

phoxis 10 sept. '11 at 7:56

4 answers

It is incorrect to speak of "the" text segment, because there may be more than one (guaranteed in the usual case where you have shared libraries, but it is still possible for a single ELF binary to have several PT_LOAD segments with the same flags anyway).

The following sample program dumps all the information returned by dl_iterate_phdr. You are interested in any segment of type PT_LOAD with the flag PF_X (note that PT_GNU_STACK will also carry that flag if an executable stack is requested from the linker, e.g. with -z execstack, so you really do need to check both the type and the flag).

    #define _GNU_SOURCE
    #include <link.h>
    #include <stddef.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Human-readable name for a program header type. */
    const char *type_str(ElfW(Word) type)
    {
        switch (type) {
        case PT_NULL: return "PT_NULL"; // should not be seen at runtime, only in the file!
        case PT_LOAD: return "PT_LOAD";
        case PT_DYNAMIC: return "PT_DYNAMIC";
        case PT_INTERP: return "PT_INTERP";
        case PT_NOTE: return "PT_NOTE";
        case PT_SHLIB: return "PT_SHLIB";
        case PT_PHDR: return "PT_PHDR";
        case PT_TLS: return "PT_TLS";
        case PT_GNU_EH_FRAME: return "PT_GNU_EH_FRAME";
        case PT_GNU_STACK: return "PT_GNU_STACK";
        case PT_GNU_RELRO: return "PT_GNU_RELRO";
        case PT_SUNWBSS: return "PT_SUNWBSS";
        case PT_SUNWSTACK: return "PT_SUNWSTACK";
        default:
            if (PT_LOOS <= type && type <= PT_HIOS) {
                return "Unknown OS-specific";
            }
            if (PT_LOPROC <= type && type <= PT_HIPROC) {
                return "Unknown processor-specific";
            }
            return "Unknown";
        }
    }

    /* Human-readable form of the read/write/execute permission bits. */
    const char *flags_str(ElfW(Word) flags)
    {
        switch (flags & (PF_R | PF_W | PF_X)) {
        case 0 | 0 | 0: return "none";
        case 0 | 0 | PF_X: return "x";
        case 0 | PF_W | 0: return "w";
        case 0 | PF_W | PF_X: return "wx";
        case PF_R | 0 | 0: return "r";
        case PF_R | 0 | PF_X: return "rx";
        case PF_R | PF_W | 0: return "rw";
        case PF_R | PF_W | PF_X: return "rwx";
        }
        __builtin_unreachable();
    }

    /* Called once per loaded object (main program, shared libraries, vDSO). */
    static int callback(struct dl_phdr_info *info, size_t size, void *data)
    {
        int j;
        (void)data;

        printf("object \"%s\"\n", info->dlpi_name);
        printf(" base address: %p\n", (void *)info->dlpi_addr);
        /* The later fields only exist if the C library's struct is large enough. */
        if (size > offsetof(struct dl_phdr_info, dlpi_adds)) {
            printf(" adds: %llu\n", info->dlpi_adds);
        }
        if (size > offsetof(struct dl_phdr_info, dlpi_subs)) {
            printf(" subs: %llu\n", info->dlpi_subs);
        }
        if (size > offsetof(struct dl_phdr_info, dlpi_tls_modid)) {
            printf(" tls modid: %zu\n", info->dlpi_tls_modid);
        }
        if (size > offsetof(struct dl_phdr_info, dlpi_tls_data)) {
            printf(" tls data: %p\n", info->dlpi_tls_data);
        }
        printf(" segments: %d\n", info->dlpi_phnum);

        for (j = 0; j < info->dlpi_phnum; j++) {
            const ElfW(Phdr) *hdr = &info->dlpi_phdr[j];

            printf(" segment %2d\n", j);
            printf(" type: 0x%08X (%s)\n", hdr->p_type, type_str(hdr->p_type));
            printf(" file offset: 0x%08zX\n", hdr->p_offset);
            printf(" virtual addr: %p\n", (void *)hdr->p_vaddr);
            printf(" physical addr: %p\n", (void *)hdr->p_paddr);
            printf(" file size: 0x%08zX\n", hdr->p_filesz);
            printf(" memory size: 0x%08zX\n", hdr->p_memsz);
            printf(" flags: 0x%08X (%s)\n", hdr->p_flags, flags_str(hdr->p_flags));
            printf(" align: %zu\n", hdr->p_align);
            if (hdr->p_memsz) {
                /* Runtime range = load base + link-time virtual address. */
                printf(" derived address range: %p to %p\n",
                       (void *) (info->dlpi_addr + hdr->p_vaddr),
                       (void *) (info->dlpi_addr + hdr->p_vaddr + hdr->p_memsz));
            }
        }
        return 0;
    }

    int main(void)
    {
        dl_iterate_phdr(callback, NULL);
        exit(EXIT_SUCCESS);
    }

+7

o11c Jun 23 '16 at 7:14

.rodata is not guaranteed to always come immediately after .text. You can use objdump -h file and readelf --sections file to find out more. With objdump you get both the size and the file offset of each section.

+2

Emil Romanus 10 sept. '11 at 8:03

On Linux, consider using the nm(1) tool to check which symbols the object file provides. Looking through that symbol list, you should recognize both symbols that Matthew Slattery gave in his answer.

+2

sholsapp Jan 18 '12 at 10:54

Matthew Slattery · Accepted Answer · 2011-09-10T17:18:42+0000

The standard GNU binutils linker scripts for ELF-based platforms typically define quite a few different symbols that can be used to find the beginning and end of various sections.

The end of the text section is usually referenced by a choice of three different symbols: etext, _etext or __etext; the start can be found as __executable_start. (Note that these symbols are usually exported using the PROVIDE() mechanism, which means that they will be overridden if something else in your executable defines them rather than merely referencing them. In particular, this means that _etext or __etext are likely to be safer choices than etext.)

Example:

    $ cat etext.c
    #include <stdio.h>

    extern char __executable_start;
    extern char __etext;

    int main(void)
    {
      printf("0x%lx\n", (unsigned long)&__executable_start);
      printf("0x%lx\n", (unsigned long)&__etext);
      return 0;
    }
    $ gcc -Wall -o etext etext.c
    $ ./etext
    0x8048000
    0x80484a0
    $

I don't believe that any of these symbols are specified by any standard, so this should not be regarded as portable (I have no idea whether GNU binutils provides them for all ELF-based platforms, or whether the set of provided symbols has changed across binutils versions), although I guess that if a) you are doing something that needs this information, and b) you are considering hacked linker scripts as an option, then portability is not too much of a concern!

To see the exact set of symbols you get when building a particular thing on a particular platform, give the --verbose flag to ld (or -Wl,--verbose to gcc) to print the linker script that will be used (there are in fact several different default linker scripts, which vary depending on linker options and the type of object you are building).
