Layout of the Process Address Space

The virtual address space is populated by a number of regions. How they are distributed is architecture-specific, but all approaches have the following elements in common:

□ The binary code of the program currently running. This code is normally referred to as text and the area of virtual memory in which it is located as a text segment.1

□ The code of dynamic libraries used by the program.

□ The heap where global variables and dynamically generated data are stored.

□ The stack used to hold local variables and to implement function and procedure calls.

□ Sections with environment variables and command-line arguments.

□ Memory mappings that map the contents of files into the virtual address space.

1 This is not the same as a hardware segment, which is featured in some architectures and acts as a separate address space. It is simply the linear address space area used to hold the data.

Recall from Chapter 2 that each process in the system is equipped with an instance of struct mm_struct that can be accessed via the task structure. This instance holds memory management information for the process:

    struct mm_struct {
    ...
            unsigned long (*get_unmapped_area) (struct file *filp,
                                    unsigned long addr, unsigned long len,
                                    unsigned long pgoff, unsigned long flags);
    ...
            unsigned long mmap_base;        /* base of mmap area */
            unsigned long task_size;        /* size of task vm space */
    ...
            unsigned long start_code, end_code, start_data, end_data;
            unsigned long start_brk, brk, start_stack;
            unsigned long arg_start, arg_end, env_start, env_end;
    ...
    };

The start and end of the virtual address space area consumed by the executable code are marked by start_code and end_code. Similarly, start_data and end_data mark the region that contains initialized data. Notice that the size of these areas does not change once an ELF binary has been mapped into the address space.

The start address of the heap is kept in start_brk, while brk denotes the current end of the heap area. While the start is constant during the lifetime of a process, heap size and thus the value of brk will vary.

The positions of the argument list and the environment are described by arg_start and arg_end and by env_start and env_end, respectively. Both regions reside in the topmost area of the stack.

mmap_base denotes the starting point for memory mappings in the virtual address space, and get_unmapped_area is invoked to find a suitable place for a new mapping in the mmap area.

task_size (variable names don't lie) stores the task size of the corresponding process. For native applications, this will usually be TASK_SIZE. However, 64-bit architectures are often binary-compatible with their predecessors. If a 32-bit binary is executed on a 64-bit machine, then task_size describes the effective task size visible to the binary.

The individual architectures can influence the layout of the virtual address space by several configuration options:

□ If an architecture wants to choose between different possibilities for how the mmap area is arranged, it needs to set HAVE_ARCH_PICK_MMAP_LAYOUT and provide the function arch_pick_mmap_layout.

□ When a new memory mapping is created, the kernel needs to find a suitable place for it unless a specific address has been specified by the user. If the architecture wants to choose the proper location itself, it must set the pre-processor symbol HAVE_ARCH_UNMAPPED_AREA and define the function arch_get_unmapped_area accordingly.

□ New locations for memory mappings are usually found by starting the search from lower memory locations and progressing toward higher addresses. For the opposite direction, from higher toward lower addresses, the kernel provides the default function arch_get_unmapped_area_topdown; if an architecture wants to provide a specialized implementation of this search, it needs to set the pre-processor symbol HAVE_ARCH_UNMAPPED_AREA_TOPDOWN.

□ Usually, the stack grows from top to bottom. Architectures that handle this differently need to set the configuration option CONFIG_STACK_GROWSUP.2 In the following, only stacks that grow from top to bottom are considered.

Finally, we need to consider the task flag PF_RANDOMIZE. If it is set, the kernel does not choose fixed locations for the stack and the starting point for memory mappings, but varies them randomly each time a new process is started. This complicates, for instance, exploiting security holes that are caused by buffer overflows. If an attacker cannot rely on a fixed address where the stack can be found, it will be much harder to construct malicious code that deliberately manipulates stack entries after access to the memory region has been gained by a buffer overflow.

Figure 4-1 illustrates how the aforementioned components are distributed across the virtual address space on most architectures.

[Figure: regions labeled MMAP, Heap, and Text; marker mm->mmap_base (TASK_UNMAPPED_BASE); legend: gap, already used.]

Figure 4-1: Composition of the linear process address space.

2 Currently only PA-RISC processors require this option. The constants in the kernel thus have a slight tendency toward a situation where the stack grows downward, albeit the PA-RISC code is not quite satisfied with that, as we can read in include/asm-parisc/a.out.h:

    /* XXX: STACK_TOP actually should be STACK_BOTTOM for parisc.
     * prumpf */

The funny thing is that "prumpf" is not a grumpy sign of discontent, but an abbreviation for a developer, Philipp Rumpf :-)

How the text segment is mapped into the virtual address space is determined by the ELF standard (see Appendix E for more information about this binary format). A specific starting address is specified for each architecture: IA-32 systems start at 0x08048000, leaving a gap of roughly 128 MiB between the lowest possible address and the start of the text mapping that is used to catch NULL pointers. Other architectures keep a similar hole: UltraSparc machines use 0x100000000 as the starting point of the text segment, while AMD64 uses 0x0000000000400000. The heap starts directly above the text segment and grows upward.

The stack starts at STACK_TOP, but the value is decremented by a small random amount if PF_RANDOMIZE is set. STACK_TOP must be defined by each architecture, and most set it to TASK_SIZE, so the stack starts at the highest possible address of the user address space. The argument list and environment of a process are stored as initial stack elements.

The region for memory mappings starts at mm_struct->mmap_base, which is usually set to TASK_UNMAPPED_BASE, a constant that every architecture needs to define. In nearly all cases, TASK_SIZE/3 is chosen. Note that the start of the mmap region is not randomized if the default kernel approach is used.

The described address space layout works very well on machines that provide a large virtual address space. However, problems can arise on 32-bit machines. Consider the situation on IA-32: The virtual address space ranges from 0 to 0xC0000000, so 3 GiB are available for each user process. TASK_UNMAPPED_BASE starts at 0x40000000, that is, at 1 GiB. Unfortunately, this implies that the heap can only consume roughly 1 GiB before it crashes right into the mmap area, which is clearly not a desirable situation.

The problem is caused by the memory mapping region that is located in the middle of the virtual address space. This is why a new virtual address space layout for IA-32 machines (in addition to the classical one, which can still be used) was introduced during the development of kernel 2.6.7. It is illustrated in Figure 4-2.

[Figure: markers TASK_SIZE, STACK_TOP-randomized_variable, mm->mmap_base, 0x08048000, and 0; areas labeled Gap and Random offset.]

Figure 4-2: Layout of the virtual address space on IA-32 machines when the mmap region is expanded from top to bottom.

The idea is to limit the maximal stack size to a fixed value. Since the stack is bounded, the region into which memory mappings are installed can then be started immediately below the end of the stack. In contrast to the classical approach, it now expands from top to bottom. Since the heap is still located in the lower region of the virtual address space and grows upward, both mmap region and heap can expand until there is really no portion of the virtual address space left. To ensure that the stack does not collide with the mmap region, a safety gap is installed between both.
