Is the load address common to all C programs on Linux?

Question

Let's say I have prog1.c, which is built as prog1.out. There is linker information in prog1.out that says where the ELF will be loaded. These addresses are virtual addresses. The loader looks at this information and runs the program as a process. Each section, such as the data segment (DS) and BSS, is loaded at the virtual address the linker specified. Now suppose I have prog2.out, which has the same load address, BSS, DS, and so on. Will the two conflict? I know they do not conflict, but then is there a performance problem? The two processes have the same virtual addresses, so are those mapped to different physical addresses? I am confused about how two processes that have the same virtual addresses can be protected from each other.

c memory-management linux

dexterous_stranger 18 sept. '13 at 7:00

3 answers

There is no fixed relationship between virtual addresses and physical addresses (on ordinary machines). A mapping is always applied; that is part of what virtual addresses are for.

Any "performance problem", if there is one at all, is there all the time and is hidden inside the hardware. It is the same whether or not there would have been an actual collision (had these addresses been physical).

How can it protect two processes that have the same virtual addresses?

It does not, and it does not need to. The memory contents of each process are independent of the others, and this is guaranteed by the OS and the hardware working together. You can think of "virtual memory" as memory that is private to each process.

Please note that this whole discussion has very little to do with the contents of the object code or with the linking/loading process: memory can be mapped at any address at run time, and the linker/loader need not know about it.

Elazar 18 sept. '13 at 7:03

Each process "thinks" that it runs alone on the computer, and it knows only about virtual addresses. The memory management unit (MMU) then translates each virtual address to a physical address. Of course, the mapping must ensure that the same virtual address in two different processes does not end up at the same physical address (unless the memory is deliberately shared). The mapping is very fast because the processor has (most of the time) dedicated hardware support for the MMU.

The system must also work out where the data actually resides, for example main memory (probably the typical case), the level 1 cache, the level 2 cache, the page file on disk, and so on. And, as I already said, the process itself is unaware of all this; it "thinks" that it works on one flat block of memory and does not need to worry about any collisions.

Lucas 18 sept. '13 at 7:11

Marco · Accepted Answer · 2013-09-19T04:57:56+0000

The point is that when a process uses a memory address, it is talking about a virtual address, which may differ from the corresponding physical address. This means that two processes can refer to the same address without mixing up their data, because the data will be in two different physical locations.
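
The claim above can be observed from user space. The following is only a minimal sketch, assuming Linux (or another POSIX system with `fork`) and CPython's `ctypes`; the function name `same_address_different_data` is made up for this illustration. The child process gets a copy of the parent's address space, so the variable sits at the same virtual address in both, yet a write in the child does not change what the parent sees.

```python
import ctypes
import os


def same_address_different_data():
    """Fork and return (parent_addr, child_addr, parent_value, child_value)."""
    counter = ctypes.c_int(1)                 # an int at some virtual address
    parent_addr = ctypes.addressof(counter)
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:                              # child process
        os.close(read_fd)
        counter.value = 2                     # modifies only the child's copy
        report = f"{ctypes.addressof(counter)} {counter.value}"
        os.write(write_fd, report.encode())
        os._exit(0)

    os.close(write_fd)                        # parent process
    with os.fdopen(read_fd) as pipe:
        child_addr, child_value = map(int, pipe.read().split())
    os.waitpid(pid, 0)
    return parent_addr, child_addr, counter.value, child_value


if __name__ == "__main__":
    p_addr, c_addr, p_val, c_val = same_address_different_data()
    print("parent:", hex(p_addr), p_val)
    print("child: ", hex(c_addr), c_val)
```

Both lines should show the same address with different values, which is only possible if that one virtual address is backed by two different physical locations.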

Below I describe how a virtual address is translated to a physical address on typical computers (other architectures differ slightly, but the idea is the same).

Understanding memory address translation on the Intel x86 architecture (three-level paging)

So, on the one hand you have a virtual memory address, and you want to get to the physical memory address (that is, the actual address in RAM). The workflow is basically this:

Virtual Address → [Segmentation Unit] → [Paging Unit] → Physical Address

Each operating system can decide how it uses segmentation and paging. For example, Linux uses a flat segmentation model, which means segmentation is effectively ignored, so we will ignore it here too.

Now our virtual address goes through something called the paging unit and is somehow translated into a physical address. Here is how that happens.

Memory is divided into blocks of a certain size (pages); on Intel this size can be 4 KB or 4 MB.

Each process has a set of tables in memory that tell the computer how to translate its memory addresses. These tables are organized hierarchically, and the memory address you want to access is in fact decomposed into indexes into these tables.

I know this sounds confusing, but bear with me for a few more sentences. You can follow my description with this image:

There is an internal CPU register called CR3 that stores the base address of the first table (we will call this table the page directory, and each of its entries a directory entry). When a process runs, its CR3 value is loaded (among other things).

So, now you want to access, say, the memory address 0x00C30404.

The paging unit says "OK, let's get the page directory base", looks at the CR3 register and finds out where the page directory is located; call this address the PDB (Page Directory Base).

Now you want to know which directory entry to use. As I said, the address is decomposed into a bunch of indexes. The most significant 10 bits (bits 22 through 31) are the index into the page directory. In this case, 0x00C30404 is 0000 0000 1100 0011 0000 0100 0000 0100 in binary, and its 10 most significant bits are 0000 0000 11, which is 0x3. This means that we want to look up entry 3 of the page directory.

What do we do now?

Remember that these tables are hierarchical: each directory entry holds, among other things, the address of the next table, called the page table. (This table may be different for each directory entry.)

So now we have another table. The next 10 bits of our address tell us which entry of this table we will access (let's call these page table entries).

The next 10 bits are 00 0011 0000, which is the number 0x30. This means that we must access entry 0x30 (decimal 48) of the page table.

And finally, this page table entry contains the address of the desired page frame (remember that memory is divided into 4 KB blocks). The least significant 12 bits of our address are the offset within this page frame; note that the page frame address is an actual physical memory address.
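
The decomposition walked through above can be checked with a few shifts and masks. This is a small sketch, assuming the 32-bit layout with 4 KB pages described in the text (10 bits of directory index, 10 bits of table index, 12 bits of offset); the function name `split_virtual_address` is made up for the example.

```python
def split_virtual_address(vaddr):
    """Split a 32-bit virtual address into (directory index, table index, offset)."""
    dir_index = (vaddr >> 22) & 0x3FF     # bits 22..31: entry in the page directory
    table_index = (vaddr >> 12) & 0x3FF   # bits 12..21: entry in the page table
    offset = vaddr & 0xFFF                # bits 0..11: byte offset in the 4 KB page
    return dir_index, table_index, offset


if __name__ == "__main__":
    d, t, o = split_virtual_address(0x00C30404)
    print(hex(d), hex(t), hex(o))         # prints: 0x3 0x30 0x404
```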

This is what is called three-level paging; with PAE or in 64-bit mode it is very similar, but with additional levels of tables.

You might think that it is a real bummer to make all these memory accesses just to fetch one variable. And it is true. The computer has mechanisms to avoid repeating all these steps; one of them is the TLB (Translation Lookaside Buffer), which caches recently completed translations so that memory can be reached quickly.
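
The caching idea can be sketched as follows. This is a toy model and not how the hardware is built: a real TLB has a small fixed number of entries and an eviction policy, which are left out here. `TinyTLB` and `fake_walk` are hypothetical names, and the mapping inside `fake_walk` is invented purely so that the example has something to cache.

```python
class TinyTLB:
    """Caches page-number -> frame-number translations in front of a slow walk."""

    def __init__(self, walk):
        self.walk = walk        # slow path: virtual page number -> frame number
        self.entries = {}       # translations seen so far
        self.misses = 0         # how many times the slow path was needed

    def translate(self, vaddr):
        page_number, offset = vaddr >> 12, vaddr & 0xFFF
        if page_number not in self.entries:          # miss: walk the tables once
            self.misses += 1
            self.entries[page_number] = self.walk(page_number)
        return (self.entries[page_number] << 12) | offset   # hit: no table access


def fake_walk(page_number):
    """Stand-in for the full table walk; the mapping is made up."""
    return page_number + 0x500


if __name__ == "__main__":
    tlb = TinyTLB(fake_walk)
    print(hex(tlb.translate(0x00C30404)), hex(tlb.translate(0x00C30408)))
    print("misses:", tlb.misses)
```

Two accesses to the same page cost a single walk; the second one is answered from the cache.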

In addition, each entry in these structures carries some permission properties, such as "Is this page writable?" or "Is this page executable?".
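
As a rough illustration of such permission properties, the low bits of a 32-bit x86 page table entry are flags: bit 0 is "present", bit 1 is "writable" and bit 2 is "user accessible", while the upper 20 bits hold the frame address. The sketch below assumes that layout; the helper names and the sample entry values are made up. An "executable" check is not shown because the no-execute bit only exists in the wider entries used with PAE and 64-bit mode.

```python
PRESENT = 1 << 0    # the entry maps a page
WRITABLE = 1 << 1   # writes to the page are allowed
USER = 1 << 2       # the page is accessible from user mode


def can_write(entry):
    """True if the entry is present and allows writes."""
    return bool(entry & PRESENT) and bool(entry & WRITABLE)


def frame_base(entry):
    """Physical base address of the 4 KB frame stored in the entry."""
    return entry & 0xFFFFF000


if __name__ == "__main__":
    read_only = 0x00500000 | PRESENT | USER
    read_write = 0x00500000 | PRESENT | WRITABLE | USER
    print(can_write(read_only), can_write(read_write), hex(frame_base(read_write)))
```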

So now that you understand how paging works, it is easy to understand how Linux handles process memory:

Each process has its own CR3 value, and when the process is scheduled to run, the CR3 register is loaded with it.
Each process has its own virtual address space (the tables may be incomplete; a process can start with, for example, just one 4 KB page).
Each process also has some other pages mapped (the kernel's memory), so when it is interrupted, the interrupt handler starts in the same task and handles the interrupt inside that task, executing the necessary code.
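
To tie the points above back to the original question, here is a toy model of two processes that each have their own CR3 value. Everything in it is an assumption made for illustration: "physical memory" is a Python dictionary, the table and frame addresses are invented, and each entry holds only an address with no flag bits. A lookup that finds no entry raises `KeyError`, loosely standing in for a page fault.

```python
# Toy "physical memory": table base address -> {entry index: physical address}.
TABLES = {
    0x1000: {0x3: 0x2000},         # process A page directory -> page table at 0x2000
    0x2000: {0x30: 0x00500000},    # process A page table -> frame at 0x00500000
    0x3000: {0x3: 0x4000},         # process B page directory -> page table at 0x4000
    0x4000: {0x30: 0x00900000},    # process B page table -> frame at 0x00900000
}

CR3_A = 0x1000   # loaded into CR3 when process A is scheduled
CR3_B = 0x3000   # loaded into CR3 when process B is scheduled


def translate(cr3, vaddr):
    """Walk the page directory, then the page table, then add the offset."""
    dir_index = (vaddr >> 22) & 0x3FF
    table_index = (vaddr >> 12) & 0x3FF
    offset = vaddr & 0xFFF
    page_table_base = TABLES[cr3][dir_index]
    frame = TABLES[page_table_base][table_index]
    return frame + offset


if __name__ == "__main__":
    print(hex(translate(CR3_A, 0x00C30404)))   # prints: 0x500404
    print(hex(translate(CR3_B, 0x00C30404)))   # prints: 0x900404
```

The same virtual address resolves to a different physical address depending on which CR3 value is in effect, which is why two programs linked at identical addresses do not collide.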

Some motivations for this scheme

You do not need to keep all of the processes' memory in RAM at the same time; some of it can be saved to disk.
You can safely isolate each process's memory and give it its own permissions.
A process can use 10 MB of RAM, but that memory does not have to be contiguous in physical memory.

(Did you like my explanation? I am a specialist in the field of computer organization and would like some feedback :P)
