7.3.1 Linear Addresses of Noncontiguous Memory Areas

To find a free range of linear addresses, we can look in the area starting from PAGE_OFFSET (usually 0xc0000000, the beginning of the fourth gigabyte). Figure 7-7 shows how the linear addresses of the fourth gigabyte are used:

• The beginning of the area includes the linear addresses that map the first 896 MB of RAM (see Section 2.5.4); the linear address that corresponds to the end of the directly mapped physical memory is stored in the high_memory variable.

• The end of the area contains the fix-mapped linear addresses (see Section 2.5.6).

• Starting from PKMAP_BASE (0xfe000000), we find the linear addresses used for the persistent kernel mapping of high-memory page frames (see Section 7.1.6 earlier in this chapter).

• The remaining linear addresses can be used for noncontiguous memory areas. A safety interval of size 8 MB (macro VMALLOC_OFFSET) is inserted between the end of the physical memory mapping and the first memory area; its purpose is to "capture" out-of-bounds memory accesses. For the same reason, additional safety intervals of size 4 KB are inserted to separate noncontiguous memory areas.

Figure 7-7. The linear address interval starting from PAGE_OFFSET

The VMALLOC_START macro defines the starting address of the linear space reserved for noncontiguous memory areas, while VMALLOC_END defines its ending address.
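As a rough illustration of how the 8 MB safety interval determines where the noncontiguous area begins, the following user-space sketch computes a start address from the value of high_memory. The rounding formula is an assumption modeled on the 2.4 i386 headers, and vmalloc_start() and safety_gap() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define VMALLOC_OFFSET (8u * 1024u * 1024u)   /* 8 MB safety interval */

/* Assumed rounding rule: skip at least VMALLOC_OFFSET bytes past the end of
 * the direct mapping, then round up to the next 8 MB boundary. */
static uint32_t vmalloc_start(uint32_t high_memory)
{
    return (high_memory + 2u * VMALLOC_OFFSET - 1u) & ~(VMALLOC_OFFSET - 1u);
}

/* Size of the unmapped hole between the direct mapping and the first area. */
static uint32_t safety_gap(uint32_t high_memory)
{
    uint32_t start = vmalloc_start(high_memory);

    assert(start >= high_memory);   /* no wraparound for sane inputs */
    return start - high_memory;
}
```

With 896 MB of directly mapped RAM, high_memory is 0xf8000000 and the sketch yields 0xf8800000, which leaves exactly 8 MB unmapped. When high_memory is not aligned to 8 MB, the hole is somewhat larger.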

7.3.2 Descriptors of Noncontiguous Memory Areas

Each noncontiguous memory area is associated with a descriptor of type struct vm_struct:

    struct vm_struct {
        unsigned long flags;
        void * addr;
        unsigned long size;
        struct vm_struct * next;
    };

These descriptors are inserted in a simple list by means of the next field; the address of the first element of the list is stored in the vmlist variable. Accesses to this list are protected by means of the vmlist_lock read/write spin lock. The addr field contains the linear address of the first memory cell of the area; the size field contains the size of the area plus 4,096 (which is the size of the previously mentioned inter-area safety interval).

The get_vm_area( ) function creates new descriptors of type struct vm_struct; its parameter size specifies the size of the new memory area. The function is essentially equivalent to the following:

    struct vm_struct * get_vm_area(unsigned long size, unsigned long flags)
    {
        unsigned long addr;
        struct vm_struct **p, *tmp, *area;

        area = (struct vm_struct *) kmalloc(sizeof(*area), GFP_KERNEL);
        if (!area)
            return NULL;
        size += PAGE_SIZE;              /* room for the safety interval */
        addr = VMALLOC_START;
        write_lock(&vmlist_lock);
        for (p = &vmlist; (tmp = *p) ; p = &tmp->next) {
            /* stop at the first hole large enough for the new area */
            if (size + addr <= (unsigned long) tmp->addr)
                break;
            addr = tmp->size + (unsigned long) tmp->addr;
            if (addr + size > VMALLOC_END) {
                write_unlock(&vmlist_lock);
                kfree(area);
                return NULL;
            }
        }
        /* insert before *p; this is the list tail if no hole was found */
        area->flags = flags;
        area->addr = (void *) addr;
        area->size = size;
        area->next = *p;
        *p = area;
        write_unlock(&vmlist_lock);
        return area;
    }
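The first-fit search over an address-ordered list can be tried out in user space. The sketch below is a simplified model, not kernel code: the sim_ names and the tiny address range are made up for illustration, malloc() stands in for kmalloc(), and locking is left out.

```c
#include <assert.h>
#include <stdlib.h>

#define SIM_PAGE_SIZE     4096UL
#define SIM_VMALLOC_START 0x10000000UL   /* made-up bounds for the model */
#define SIM_VMALLOC_END   0x10010000UL   /* room for 16 pages in total   */

struct sim_area {
    unsigned long addr;       /* first address of the area          */
    unsigned long size;       /* requested size plus one guard page */
    struct sim_area *next;    /* next area, in ascending addr order */
};

static struct sim_area *sim_list;   /* plays the role of vmlist */

/* First-fit search: reserve size bytes plus a guard page, or return NULL. */
static struct sim_area *sim_get_area(unsigned long size)
{
    struct sim_area **p, *tmp, *area = malloc(sizeof(*area));
    unsigned long addr = SIM_VMALLOC_START;

    if (!area)
        return NULL;
    size += SIM_PAGE_SIZE;
    for (p = &sim_list; (tmp = *p) != NULL; p = &tmp->next) {
        assert(tmp->addr >= addr);       /* list is sorted, no overlaps */
        if (addr + size <= tmp->addr)
            break;                       /* hole before tmp is big enough */
        addr = tmp->addr + tmp->size;    /* otherwise skip past tmp */
    }
    if (addr + size > SIM_VMALLOC_END) { /* ran out of address space */
        free(area);
        return NULL;
    }
    area->addr = addr;
    area->size = size;
    area->next = *p;
    *p = area;
    return area;
}

/* Unlink and free the area starting at addr; returns 0 on success. */
static int sim_free_area(unsigned long addr)
{
    struct sim_area **p, *tmp;

    for (p = &sim_list; (tmp = *p) != NULL; p = &tmp->next) {
        if (tmp->addr == addr) {
            *p = tmp->next;
            free(tmp);
            return 0;
        }
    }
    return -1;
}
```

Because the list stays sorted by address, freeing an area in the middle leaves a hole that a later request of suitable size reuses before any address past the last area is considered.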

The function first calls kmalloc( ) to obtain a memory area for the new descriptor. It then scans the list of descriptors of type struct vm_struct looking for an available range of linear addresses that includes at least size+4096 addresses. If such an interval exists, the function initializes the fields of the descriptor and terminates by returning the address of the descriptor, whose addr field holds the initial address of the noncontiguous memory area. Otherwise, when addr + size exceeds VMALLOC_END, get_vm_area( ) releases the descriptor and returns NULL.

7.3.3 Allocating a Noncontiguous Memory Area

The vmalloc( ) function allocates a noncontiguous memory area to the kernel. The parameter size denotes the size of the requested area. If the function is able to satisfy the request, it then returns the initial linear address of the new area; otherwise, it returns a null pointer:

    void * vmalloc(unsigned long size)
    {
        void * addr;
        struct vm_struct *area;

        size = (size + PAGE_SIZE - 1) & PAGE_MASK;
        area = get_vm_area(size, VM_ALLOC);
        if (!area)
            return NULL;
        addr = area->addr;
        if (vmalloc_area_pages((unsigned long) addr, size,
                               GFP_KERNEL | __GFP_HIGHMEM, 0x63)) {
            vfree(addr);    /* undo the partial allocation */
            return NULL;
        }
        return addr;
    }

The function starts by rounding up the value of the size parameter to a multiple of 4,096 (the page frame size). Then vmalloc( ) invokes get_vm_area( ), which creates a new descriptor whose addr field holds the linear address assigned to the memory area. The flags field of the descriptor is initialized with the VM_ALLOC flag, which means that the linear address range is going to be used for a noncontiguous memory allocation (we'll see in Chapter 13 that vm_struct descriptors are also used to remap memory on hardware devices). Then the vmalloc( ) function invokes vmalloc_area_pages( ) to request noncontiguous page frames and terminates by returning the initial linear address of the noncontiguous memory area.
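The rounding step is easy to check in isolation. The sketch below reuses the usual 4 KB page definitions; page_round_up() and area_footprint() are hypothetical helper names introduced here only to show the arithmetic, including the extra guard page that get_vm_area( ) adds.

```c
#include <assert.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Round a byte count up to the next multiple of the page size. */
static unsigned long page_round_up(unsigned long size)
{
    unsigned long rounded = (size + PAGE_SIZE - 1) & PAGE_MASK;

    assert(rounded % PAGE_SIZE == 0);
    return rounded;
}

/* Linear addresses consumed by an area: rounded size plus one guard page. */
static unsigned long area_footprint(unsigned long size)
{
    return page_round_up(size) + PAGE_SIZE;
}
```

A request for 10,000 bytes, for example, is rounded up to 12,288 bytes (three pages) and occupies 16,384 linear addresses once the guard page is counted.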

The vmalloc_area_pages( ) function uses four parameters:

address

The initial linear address of the area.

size

The size of the area.

gfp_mask

The allocation flags passed to the buddy system allocator function. It is always set to GFP_KERNEL | __GFP_HIGHMEM.

prot

The protection bits of the allocated page frames. It is always set to 0x63, which corresponds to Present, Accessed, Read/Write, and Dirty.

The function starts by assigning the linear address of the end of the area to the end local variable:

    end = address + size;

The function then uses the pgd_offset_k macro to derive the entry in the master kernel Page Global Directory related to the initial linear address of the area; it then acquires the kernel Page Table spin lock:

    dir = pgd_offset_k(address);
    spin_lock(&init_mm.page_table_lock);

The function then executes the following cycle:

    while (address < end) {
        pmd_t *pmd = pmd_alloc(&init_mm, dir, address);
        ret = -ENOMEM;
        if (!pmd)
            break;
        if (alloc_area_pmd(pmd, address, end - address, gfp_mask, prot))
            break;
        address = (address + PGDIR_SIZE) & PGDIR_MASK;
        dir++;
        ret = 0;
    }
    spin_unlock(&init_mm.page_table_lock);
    return ret;

In each cycle, it first invokes pmd_alloc( ) to create a Page Middle Directory for the new area and writes its physical address in the right entry of the kernel Page Global Directory. It then calls alloc_area_pmd( ) to allocate all the Page Tables associated with the new Page Middle Directory. It then adds the constant 2^22 (the size of the range of linear addresses spanned by a single Page Middle Directory) to the current value of address, rounding the result down to a 2^22 boundary, and it increments the pointer dir to the Page Global Directory.

The cycle is repeated until all Page Table entries referring to the noncontiguous memory area are set up.
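To see how many times this cycle runs for a given area, the address update can be replayed in user space. The sketch below assumes 32-bit linear addresses without PAE, so that PGDIR_SIZE is 2^22; pgd_index_of() and pgd_entries_spanned() are hypothetical names, and the extra test on address guards against wraparound at the top of the address space.

```c
#include <assert.h>
#include <stdint.h>

#define PGDIR_SHIFT 22                       /* i386 without PAE */
#define PGDIR_SIZE  (UINT32_C(1) << PGDIR_SHIFT)
#define PGDIR_MASK  (~(PGDIR_SIZE - 1))

/* Index of the Page Global Directory entry that maps address. */
static unsigned pgd_index_of(uint32_t address)
{
    return address >> PGDIR_SHIFT;
}

/* Count the PGD entries visited for [address, address + size). */
static unsigned pgd_entries_spanned(uint32_t address, uint32_t size)
{
    uint32_t end = address + size;
    unsigned count = 0;

    assert(end >= address);                  /* the range must not wrap */
    while (address && address < end) {       /* address == 0: we wrapped */
        count++;
        address = (address + PGDIR_SIZE) & PGDIR_MASK;
    }
    return count;
}
```

A two-page area that straddles a 4 MB boundary therefore needs two iterations, even though it is far smaller than the range covered by one Page Global Directory entry.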

The alloc_area_pmd( ) function executes a similar cycle for all the Page Tables that a Page Middle Directory points to:

    while (address < end) {
        pte_t * pte = pte_alloc(&init_mm, pmd, address);
        if (!pte)
            return -ENOMEM;
        if (alloc_area_pte(pte, address, end - address, gfp_mask, prot))
            return -ENOMEM;
        address = (address + PMD_SIZE) & PMD_MASK;
        pmd++;
    }

The pte_alloc( ) function (see Section 2.5.2) allocates a new Page Table and updates the corresponding entry in the Page Middle Directory. Next, alloc_area_pte( ) allocates all the page frames corresponding to the entries in the Page Table. The value of address is increased by 2^22 (the size of the linear address interval spanned by a single Page Table), rounded down to a 2^22 boundary, and the cycle is repeated.

The main cycle of alloc_area_pte( ) is:

    while (address < end) {
        unsigned long page;

        /* drop the lock while allocating: the allocator may block */
        spin_unlock(&init_mm.page_table_lock);
        page = page_alloc(gfp_mask);
        spin_lock(&init_mm.page_table_lock);
        if (!page)
            return -ENOMEM;
        set_pte(pte, mk_pte(page, prot));
        address += PAGE_SIZE;
        pte++;
    }

Each page frame is allocated through page_alloc( ). The physical address of the new page frame is written into the Page Table by the set_pte and mk_pte macros. The cycle is repeated after adding the constant 4,096 (the length of a page frame) to address.
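The overall effect of the three nested cycles can be imitated with a toy two-level table in user space. This is only a sketch under simplifying assumptions: the Page Middle Directory is folded away, the nested loops are flattened into one, page frames are represented by made-up frame numbers, and every toy_ name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SHIFT   12
#define PAGE_SIZE    (UINT32_C(1) << PAGE_SHIFT)
#define PGDIR_SHIFT  22
#define PTRS_PER_PGD 1024
#define PTRS_PER_PTE 1024
#define PROT_BITS    UINT32_C(0x63)   /* Present, Read/Write, Accessed, Dirty */

typedef struct { uint32_t pte[PTRS_PER_PTE]; } toy_page_table;

static toy_page_table *toy_pgd[PTRS_PER_PGD];  /* toy master directory */
static uint32_t toy_next_frame = 0x100;        /* fake frame numbers   */
static unsigned toy_tables_allocated;

/* Map every page of [address, address + size); 0 on success, -1 on failure. */
static int toy_map_area(uint32_t address, uint32_t size)
{
    uint32_t end = address + size;

    assert(end >= address && address % PAGE_SIZE == 0);
    for (; address < end; address += PAGE_SIZE) {
        unsigned pgd_i = address >> PGDIR_SHIFT;
        unsigned pte_i = (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);

        if (!toy_pgd[pgd_i]) {                 /* allocate table on demand */
            toy_pgd[pgd_i] = calloc(1, sizeof(toy_page_table));
            if (!toy_pgd[pgd_i])
                return -1;
            toy_tables_allocated++;
        }
        /* frame number in the high bits, protection bits in the low 12 */
        toy_pgd[pgd_i]->pte[pte_i] =
            (toy_next_frame++ << PAGE_SHIFT) | PROT_BITS;
    }
    return 0;
}

/* Return the toy entry for an address, or 0 if nothing is mapped there. */
static uint32_t toy_lookup(uint32_t address)
{
    toy_page_table *pt = toy_pgd[address >> PGDIR_SHIFT];

    return pt ? pt->pte[(address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)] : 0;
}
```

Mapping three pages that start one page below a 4 MB boundary allocates two toy Page Tables, and each resulting entry carries 0x63 in its low-order bits.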

Notice that the Page Tables of the current process are not touched by vmalloc_area_pages( ). Therefore, when a process in Kernel Mode accesses the noncontiguous memory area, a Page Fault occurs, since the entries in the process's Page Tables corresponding to the area are null. However, the Page Fault handler checks the faulting linear address against the master kernel Page Tables (that is, the init_mm.pgd Page Global Directory and its child Page Tables; see Section 2.5.5). Once the handler discovers that a master kernel Page Table includes a non-null entry for the address, it copies its value into the corresponding process's Page Table entry and resumes normal execution of the process. This mechanism is described in Section 8.4.
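The lazy synchronization described above can be modeled in a few lines of user-space C. The sketch is a deliberate simplification: it copies entries at Page Global Directory granularity, treats each entry as an opaque nonzero value, and uses hypothetical names (toy_dir, toy_fault, toy_access) that do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define PGDIR_SHIFT  22
#define PTRS_PER_PGD 1024

/* A toy Page Global Directory: each entry is just an opaque value. */
typedef struct { uint32_t entry[PTRS_PER_PGD]; } toy_dir;

static toy_dir master_dir;          /* stands in for init_mm.pgd */
static unsigned faults_handled;

/* Fault path: copy the master entry into the process directory, if any. */
static int toy_fault(toy_dir *proc, uint32_t address)
{
    unsigned i = address >> PGDIR_SHIFT;

    if (!master_dir.entry[i])
        return -1;                  /* genuinely unmapped: a real error */
    proc->entry[i] = master_dir.entry[i];
    faults_handled++;
    return 0;
}

/* Access path: succeed directly, or fault once and retry. */
static int toy_access(toy_dir *proc, uint32_t address)
{
    unsigned i = address >> PGDIR_SHIFT;

    if (!proc->entry[i] && toy_fault(proc, address) != 0)
        return -1;
    assert(proc->entry[i] == master_dir.entry[i]);
    return 0;
}
```

In this model the first access to an area costs one fault per process, and later accesses through the same directory entry proceed without faulting.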
