Is it possible to determine if a symbol is a variable or function in C?

I am implementing some limited remote debugging features for an application written in C running on a Linux server. The goal is to communicate with the application and look up the value of an arbitrary variable or call an arbitrary function.

I can look up symbols through dlsym() calls, but I cannot determine whether the returned address refers to a function or a variable. Is there a way to get this information from the symbol table?

+6
5 answers

You can read the file /proc/self/maps and parse the first three fields of each line:

 <begin-addr>-<end-addr> rwxp ... 

Then you look at the line containing the address you are looking for and check the permissions:

  • r-x : this is code;
  • rw- : this is writable data;
  • r-- : this is read-only data;
  • any other combination: something strange (rwxp: generated code, ...).

For example, the following program:

    #include <stdio.h>

    void foo() {}

    int x;

    int main()
    {
        int y;
        printf("%p\n%p\n%p\n", foo, &x, &y);
        scanf("%*s");
        return 0;
    }

... on my system produces this result:

    0x400570
    0x6009e4
    0x7fff4c9b4e2c

... and these are the corresponding lines from /proc/<pid>/maps :

    00400000-00401000 r-xp 00000000 00:1d 641656 /tmp/a.out
    00600000-00601000 rw-p 00000000 00:1d 641656 /tmp/a.out
    ....
    7fff4c996000-7fff4c9b7000 rw-p 00000000 00:00 0 [stack]
    ....

Thus, the addresses are: code, data, and data.
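The lookup described in this answer could be sketched roughly as follows. The names classify_address, kind_from_perms and the addr_kind enum are made up for illustration, and the sketch assumes a Linux /proc filesystem and map lines shorter than the read buffer.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical result codes for this sketch. */
enum addr_kind { ADDR_UNKNOWN, ADDR_CODE, ADDR_DATA_RW, ADDR_DATA_RO, ADDR_OTHER };

/* Map a permission string such as "r-xp" to one of the kinds above. */
static enum addr_kind kind_from_perms(const char *perms)
{
    int r = perms[0] == 'r';
    int w = perms[1] == 'w';
    int x = perms[2] == 'x';

    if (r && !w && x)  return ADDR_CODE;     /* r-x: code */
    if (r && w && !x)  return ADDR_DATA_RW;  /* rw-: writable data */
    if (r && !w && !x) return ADDR_DATA_RO;  /* r--: read-only data */
    return ADDR_OTHER;                       /* anything else */
}

/* Find the mapping of the current process that contains addr. */
enum addr_kind classify_address(const void *addr)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f)
        return ADDR_UNKNOWN;

    uintptr_t target = (uintptr_t)addr;
    enum addr_kind kind = ADDR_UNKNOWN;
    char line[512];

    while (fgets(line, sizeof line, f)) {
        unsigned long begin, end;
        char perms[5] = "";

        /* Each line starts with "<begin>-<end> <perms>". */
        if (sscanf(line, "%lx-%lx %4s", &begin, &end, perms) != 3)
            continue;
        if (target >= begin && target < end) {
            kind = kind_from_perms(perms);
            break;
        }
    }
    fclose(f);
    return kind;
}
```

Passing the pointer returned by dlsym() to classify_address() would then give ADDR_CODE for a function and one of the data kinds for a variable, under the same assumptions about the memory layout as above.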

+2

On x86 platforms, you can check for the instructions used to set up the stack frame of a function, provided you can read its address space. Usually these are:

    push ebp
    mov ebp, esp

I'm not sure about x64 platforms, but I think it looks like:

    push rbp
    mov rbp, rsp

This is the function prologue used by the C calling convention.

However, remember that compiler optimization can remove these instructions. If you want this to work, you may need to add a flag to disable that optimization. I believe that for GCC, -fno-omit-frame-pointer will do the trick.
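A rough sketch of such a check is below. The function name looks_like_prologue is made up for illustration, and the byte patterns are the usual encodings of the two instructions shown above (plus an optional leading endbr64, which some toolchains emit first). It is only a heuristic: data can contain the same bytes by coincidence, and optimized functions may not start this way.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the first n bytes at code begin with a frame-pointer prologue. */
int looks_like_prologue(const unsigned char *code, size_t n)
{
    /* push ebp; mov ebp, esp (32-bit, both encodings of the mov) */
    static const unsigned char x86_a[] = { 0x55, 0x89, 0xe5 };
    static const unsigned char x86_b[] = { 0x55, 0x8b, 0xec };
    /* push rbp; mov rbp, rsp (64-bit, both encodings of the mov) */
    static const unsigned char x64_a[] = { 0x55, 0x48, 0x89, 0xe5 };
    static const unsigned char x64_b[] = { 0x55, 0x48, 0x8b, 0xec };
    /* endbr64, which may precede the prologue */
    static const unsigned char endbr64[] = { 0xf3, 0x0f, 0x1e, 0xfa };

    /* Skip a leading endbr64 if present. */
    if (n >= sizeof endbr64 && memcmp(code, endbr64, sizeof endbr64) == 0) {
        code += sizeof endbr64;
        n -= sizeof endbr64;
    }

    if (n >= sizeof x64_a &&
        (memcmp(code, x64_a, sizeof x64_a) == 0 ||
         memcmp(code, x64_b, sizeof x64_b) == 0))
        return 1;

    if (n >= sizeof x86_a &&
        (memcmp(code, x86_a, sizeof x86_a) == 0 ||
         memcmp(code, x86_b, sizeof x86_b) == 0))
        return 1;

    return 0;
}
```

To use it on an address returned by dlsym(), you would cast the pointer to const unsigned char * and pass a small length such as 8, after making sure that many bytes are actually readable at that address.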

+3

One possible solution is to extract the symbol table for the application by parsing the output of the nm utility. nm includes symbol type information. Symbols of type T (global text) are functions.

The problem with this solution is that you have to make sure your symbol table matches the target (especially if you intend to use it to extract addresses, although using it in combination with dlsym() will be safer). The method I used to ensure this is to generate the symbol table as a post-processing step of the build process.
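As an illustration, one line of default nm output ("<hex address> <type letter> <name>") could be classified along these lines. The function classify_nm_line and the sym_kind enum are hypothetical names. Besides T, the sketch also treats t as a function and the D, B, R and C families as data, which goes a bit beyond what is stated above; undefined and weak symbols are simply reported as unknown.

```c
#include <stdio.h>

/* Hypothetical result codes for this sketch. */
enum sym_kind { SYM_UNKNOWN, SYM_FUNCTION, SYM_DATA };

/*
 * Classify one line of nm output and copy the symbol name into name.
 * Lines without an address (for example undefined symbols) do not match
 * the pattern and yield SYM_UNKNOWN.
 */
enum sym_kind classify_nm_line(const char *line, char name[256])
{
    char type;

    /* %*x skips the address; the space before %c skips the separator. */
    if (sscanf(line, "%*x %c %255s", &type, name) != 2)
        return SYM_UNKNOWN;

    switch (type) {
    case 'T': case 't':            /* text (code) section */
        return SYM_FUNCTION;
    case 'D': case 'd':            /* initialized data */
    case 'B': case 'b':            /* uninitialized data (BSS) */
    case 'R': case 'r':            /* read-only data */
    case 'C':                      /* common symbol */
        return SYM_DATA;
    default:
        return SYM_UNKNOWN;
    }
}
```

In a build-time post-processing step, each line of the nm output for the final binary would be fed through this function and the results stored next to the executable.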

+2

I think this is not a very reliable method, but it may work:

Take the address of a well-known function such as main() and the address of a well-known global variable.

Now take the address of the unknown symbol and calculate the absolute value of the difference between this address and each of the other two. The smaller difference tells you whether the unknown address is closer to the function or to the global variable, which means that it is probably another function or another global variable.

This method works provided that the compiler / linker packs all global variables into a specific memory block and all functions in another memory block. For example, the Microsoft compiler puts all global variables in front (lower addresses in virtual memory).
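In code, the comparison could look something like this. The names guess_by_distance and addr_distance are invented for the sketch, and it inherits the assumption that code and globals sit in separate blocks.

```c
#include <stdint.h>

/* Hypothetical result codes for this sketch. */
enum guess { GUESS_FUNCTION, GUESS_VARIABLE };

/* Absolute difference between two addresses, without signed overflow. */
static uintptr_t addr_distance(uintptr_t a, uintptr_t b)
{
    return a > b ? a - b : b - a;
}

/*
 * Guess the kind of an unknown address by comparing its distance to a
 * known function and to a known global variable. Ties go to "function".
 */
enum guess guess_by_distance(uintptr_t unknown,
                             uintptr_t known_func,
                             uintptr_t known_var)
{
    if (addr_distance(unknown, known_func) <= addr_distance(unknown, known_var))
        return GUESS_FUNCTION;
    return GUESS_VARIABLE;
}
```

The known addresses would come from something like (uintptr_t)&main and (uintptr_t)&some_global, cast the same way as the address returned by dlsym().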

I assume that you will not want to check local variables, since their addresses cannot usefully be returned by a function (as soon as the function returns, the local variable is gone).

+1

This can be done by combining dlsym() and dladdr1().

    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <link.h>
    #include <stdio.h>

    /* Return the ELF symbol type (STT_*) of the symbol at sym, or 0 (STT_NOTYPE). */
    int symbolType(void *sym)
    {
        ElfW(Sym) *pElfSym = NULL;
        Dl_info i;

        /* RTLD_DL_SYMENT asks dladdr1() for the matching symbol table entry. */
        if (dladdr1(sym, &i, (void **)&pElfSym, RTLD_DL_SYMENT) && pElfSym)
            /* ELF32_ST_TYPE and ELF64_ST_TYPE are the same macro (st_info & 0xf). */
            return ELF32_ST_TYPE(pElfSym->st_info);
        return 0;
    }

    int main(int argc, char *argv[])
    {
        for (int i = 1; i < argc; ++i) {
            printf("Symbol [%s]: ", argv[i]);

            /* Symbols defined in the executable itself are only visible to
               dlsym() if it was linked with -rdynamic. */
            void *mySym = dlsym(RTLD_DEFAULT, argv[i]);

            /* This will not work with symbols that have a value of 0,
               but that is not going to be very common. */
            if (!mySym)
                puts("not found!");
            else {
                int type = symbolType(mySym);
                switch (type) {
                case STT_FUNC:   puts("Function");    break;
                case STT_OBJECT: puts("Data");        break;
                case STT_COMMON: puts("Common data"); break;
                /* get all the other types from the elf.h header file */
                default: printf("Dunno! [%d]\n", type);
                }
            }
        }
        return 0;
    }

+1

Source: https://habr.com/ru/post/958520/

