The Memory Descriptor

Last Updated on Tue, 19 Jan 2021 | Linux Kernel Reference

All information related to the process address space is included in a data structure called a memory descriptor. This structure of type mm_struct is referenced by the mm field of the process descriptor. The fields of a memory descriptor are listed in Table 8-2.

Table 8-2. The fields of the memory descriptor
Type	Field	Description
struct vm_area_struct *	mmap	Pointer to the head of the list of memory region objects
rb_root_t	mm_rb	Pointer to the root of the red-black tree of memory region objects
struct vm_area_struct *	mmap_cache	Pointer to the last referenced memory region object
pgd_t *	pgd	Pointer to the Page Global Directory
atomic_t	mm_users	Secondary usage counter
atomic_t	mm_count	Main usage counter
int	map_count	Number of memory regions
struct rw_semaphore	mmap_sem	Memory regions' read/write semaphore
spinlock_t	page_table_lock	Memory regions' and Page Tables' spin lock
struct list_head	mmlist	Pointers to adjacent elements in the list of memory descriptors
unsigned long	start_code	Initial address of executable code
unsigned long	end_code	Final address of executable code
unsigned long	start_data	Initial address of initialized data
unsigned long	end_data	Final address of initialized data
unsigned long	start_brk	Initial address of the heap
unsigned long	brk	Current final address of the heap
unsigned long	start_stack	Initial address of User Mode stack
unsigned long	arg_start	Initial address of command-line arguments
unsigned long	arg_end	Final address of command-line arguments
unsigned long	env_start	Initial address of environment variables
unsigned long	env_end	Final address of environment variables
unsigned long	rss	Number of page frames allocated to the process
unsigned long	total_vm	Size of the process address space (number of pages)
unsigned long	locked_vm	Number of "locked" pages that cannot be swapped out (see Chapter 16)
unsigned long	def_flags	Default access flags of the memory regions
unsigned long	cpu_vm_mask	Bit mask for lazy TLB switches (see Chapter 2)
unsigned long	swap_address	Last scanned linear address for swapping (see Chapter 16)
unsigned int	dumpable	Flag that specifies whether the process can produce a core dump of the memory
mm_context_t	context	Pointer to table for architecture-specific information (e.g., LDT's address in 80x86 platforms)

All memory descriptors are stored in a doubly linked list. Each descriptor stores the address of the adjacent list items in the mmlist field. The first element of the list is the mmlist field of init_mm, the memory descriptor used by process 0 in the initialization phase. The list is protected against concurrent accesses in multiprocessor systems by the mmlist_lock spin lock. The number of memory descriptors in the list is stored in the mmlist_nr variable.
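The list handling just described can be modeled in a few lines of user-space C. This is only a sketch, not the kernel's code: struct list_head is reduced to its two pointers, the id field and the mmlist_add( ), mmlist_del( ), mmlist_count( ), and mm_entry( ) helpers are hypothetical names introduced for illustration, and the mmlist_lock spin lock is omitted because the sketch is single-threaded. We also assume here that mmlist_nr does not count init_mm itself.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Simplified memory descriptor: only what the list needs. */
struct mm_struct {
    int id;                  /* hypothetical tag, for illustration only */
    struct list_head mmlist; /* links to the adjacent memory descriptors */
};

/* The first element of the list is the mmlist field of init_mm;
 * an empty list points back to itself. */
struct mm_struct init_mm = { 0, { &init_mm.mmlist, &init_mm.mmlist } };
int mmlist_nr; /* descriptors in the list (init_mm excluded in this sketch) */

/* Recover the descriptor that embeds a given list element. */
#define mm_entry(ptr) \
    ((struct mm_struct *)((char *)(ptr) - offsetof(struct mm_struct, mmlist)))

/* Insert a descriptor right after init_mm.
 * The kernel would hold mmlist_lock around this; omitted here. */
void mmlist_add(struct mm_struct *mm)
{
    struct list_head *head = &init_mm.mmlist;

    mm->mmlist.next = head->next;
    mm->mmlist.prev = head;
    head->next->prev = &mm->mmlist;
    head->next = &mm->mmlist;
    mmlist_nr++;
}

/* Unlink a descriptor from the list. */
void mmlist_del(struct mm_struct *mm)
{
    mm->mmlist.prev->next = mm->mmlist.next;
    mm->mmlist.next->prev = mm->mmlist.prev;
    mm->mmlist.next = mm->mmlist.prev = &mm->mmlist;
    mmlist_nr--;
}

/* Walk the list and count its elements, skipping the init_mm head. */
int mmlist_count(void)
{
    struct list_head *p;
    int n = 0;

    for (p = init_mm.mmlist.next; p != &init_mm.mmlist; p = p->next)
        n++;
    return n;
}
```

Because the link fields are embedded in the descriptor, walking the list yields pointers to mmlist fields; a macro such as mm_entry( ) is needed to get back to the enclosing mm_struct.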

The mm_users field stores the number of lightweight processes that share the mm_struct data structure (see Section 3.4.1). The mm_count field is the main usage counter of the memory descriptor; all "users" in mm_users count as one unit in mm_count. Every time the mm_count field is decremented, the kernel checks whether it becomes zero; if so, the memory descriptor is deallocated because it is no longer in use.

We'll try to explain the difference between the use of mm_users and mm_count with an example. Consider a memory descriptor shared by two lightweight processes. Normally, its mm_users field stores the value 2, while its mm_count field stores the value 1 (both owner processes count as one).

If the memory descriptor is temporarily lent to a kernel thread (see the next section), the kernel increments the mm_count field. In this way, even if both lightweight processes die and the mm_users field becomes zero, the memory descriptor is not released until the kernel thread finishes using it because the mm_count field remains greater than zero.

If the kernel wants to be sure that the memory descriptor is not released in the middle of a lengthy operation, it might increment the mm_users field instead of mm_count (this is what the swap_out( ) function does; see Section 16.5). The final result is the same because the increment of mm_users ensures that mm_count does not become zero even if all lightweight processes that own the memory descriptor die.

The mm_alloc( ) function is invoked to get a new memory descriptor. Since these descriptors are stored in a slab allocator cache, mm_alloc( ) calls kmem_cache_alloc( ), initializes the new memory descriptor, and sets the mm_count and mm_users fields to 1.

Conversely, the mmput( ) function decrements the mm_users field of a memory descriptor. If that field becomes 0, the function releases the Local Descriptor Table, the memory region descriptors (see later in this chapter), and the Page Tables referenced by the memory descriptor, and then invokes mmdrop( ). The latter function decrements mm_count and, if it becomes zero, releases the mm_struct data structure.
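The interplay of the two counters may be easier to follow in a small user-space model. The sketch below is not the kernel's implementation: plain int counters replace atomic_t, calloc( ) and free( ) stand in for the slab allocator, and the regions_released and mm_freed names are hypothetical markers added so that the effect of each call can be observed.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified memory descriptor: only the two usage counters. */
struct mm_struct {
    int mm_users;         /* lightweight processes sharing the descriptor */
    int mm_count;         /* main counter: all users count as one unit */
    int regions_released; /* hypothetical marker: address space torn down */
};

int mm_freed; /* hypothetical: number of descriptors actually deallocated */

/* Get a new descriptor with both counters set to 1. */
struct mm_struct *mm_alloc(void)
{
    /* calloc() stands in for kmem_cache_alloc() in this sketch. */
    struct mm_struct *mm = calloc(1, sizeof(*mm));

    if (mm != NULL) {
        mm->mm_users = 1;
        mm->mm_count = 1;
    }
    return mm;
}

/* Drop one main reference; free the descriptor when none is left. */
void mmdrop(struct mm_struct *mm)
{
    if (--mm->mm_count == 0) {
        free(mm);
        mm_freed++;
    }
}

/* Drop one user; the last user releases the address space contents
 * (LDT, memory regions, Page Tables) and then calls mmdrop(). */
void mmput(struct mm_struct *mm)
{
    if (--mm->mm_users == 0) {
        mm->regions_released = 1;
        mmdrop(mm);
    }
}
```

With this model, the example of the two lightweight processes reads as follows: after mm_alloc( ) and one extra increment of mm_users, the counters are 2 and 1. If mm_count is incremented on behalf of a kernel thread, two calls to mmput( ) release the address space contents but leave the descriptor allocated; only the kernel thread's final mmdrop( ) frees it.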

The mmap, mm_rb, mmlist, and mmap_cache fields are discussed in the next section.

8.2.1 Memory Descriptor of Kernel Threads

Kernel threads run only in Kernel Mode, so they never access linear addresses below TASK_SIZE (same as PAGE_OFFSET, usually 0xc0000000). Unlike regular processes, kernel threads do not use memory regions; therefore, most of the fields of a memory descriptor are meaningless for them.

Since the Page Table entries that refer to linear addresses above TASK_SIZE should always be identical, it does not really matter what set of Page Tables a kernel thread uses.

To avoid useless TLB and cache flushes, kernel threads use the Page Tables of a regular process in Linux 2.4. To that end, two kinds of memory descriptor pointers are included in every process descriptor: mm and active_mm.

The mm field in the process descriptor points to the memory descriptor owned by the process, while the active_mm field points to the memory descriptor used by the process when it is in execution. For regular processes, the two fields store the same pointer. Kernel threads, however, do not own any memory descriptor, thus their mm field is always NULL. When a kernel thread is selected for execution, its active_mm field is initialized to the value of the active_mm of the previously running process (see Section 11.2.2.3).
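The handling of the two pointers at a process switch can be sketched as follows. This is a simplified user-space model, not the scheduler's code: switch_mm_pointers( ) is a hypothetical helper name, the counter is a plain int, and we assume that the borrowed descriptor is handed back (its mm_count decremented) when the kernel thread is switched out. The real kernel would also switch Page Tables and would free the descriptor if the counter dropped to zero.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified memory descriptor: only the main usage counter. */
struct mm_struct {
    int mm_count;
};

/* Simplified process descriptor: only the two descriptor pointers. */
struct task_struct {
    struct mm_struct *mm;        /* owned descriptor; NULL for kernel threads */
    struct mm_struct *active_mm; /* descriptor in use while running */
};

/* Hypothetical helper: update mm/active_mm when switching prev -> next. */
void switch_mm_pointers(struct task_struct *prev, struct task_struct *next)
{
    struct mm_struct *oldmm = prev->active_mm;

    if (next->mm == NULL) {
        /* Kernel thread: borrow the address space that was just in use
         * and pin it through the main usage counter. */
        next->active_mm = oldmm;
        oldmm->mm_count++;
    } else {
        /* Regular process: it runs on the descriptor it owns. */
        next->active_mm = next->mm;
    }

    if (prev->mm == NULL) {
        /* prev was a kernel thread: give back what it borrowed.
         * (Assumption of this sketch; deallocation is not modeled.) */
        prev->active_mm = NULL;
        oldmm->mm_count--;
    }
}
```

Note that the kernel thread's mm field stays NULL throughout; only active_mm changes, which is what lets the kernel tell a borrowed descriptor from an owned one.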

There is, however, a small complication. Whenever a process in Kernel Mode modifies a Page Table entry for a "high" linear address (above TASK_SIZE), it should also update the corresponding entry in the sets of Page Tables of all processes in the system. In fact, once set by a process in Kernel Mode, the mapping should be effective for all other processes in Kernel Mode as well. Touching the sets of Page Tables of all processes is a costly operation; therefore, Linux adopts a deferred approach.

We already mentioned this deferred approach in Section 7.3: every time a high linear address has to be remapped (typically by vmalloc( ) or vfree( )), the kernel updates a canonical set of Page Tables rooted at the swapper_pg_dir master kernel Page Global Directory (see Section 2.5.5). This Page Global Directory is pointed to by the pgd field of a master memory descriptor, which is stored in the init_mm variable.[1]

[1] We mentioned in Section 3.4.2 that the swapper kernel thread uses init_mm during the initialization phase. However, swapper never uses this memory descriptor once the initialization phase completes.

Later, in Section 8.4.5, we'll describe how the Page Fault handler takes care of spreading the information stored in the canonical Page Tables when effectively needed.
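As a rough illustration of the deferred approach, the following user-space sketch keeps a master Page Global Directory and copies an entry into a process's own directory only when that entry is found missing. It is a model under stated assumptions, not the code of the Page Fault handler: map_high_address( ) and sync_from_master( ) are hypothetical names, pgd_t is reduced to an unsigned long, and the constants assume a 32-bit 80x86 layout without PAE (1,024 entries of 4 MB each, with 0xc0000000 falling at index 768).

```c
#include <assert.h>

#define PTRS_PER_PGD     1024 /* assumed: 32-bit 80x86 without PAE */
#define KERNEL_PGD_START 768  /* 0xc0000000 >> 22 */

typedef unsigned long pgd_t;  /* simplified Page Global Directory entry */

/* Canonical (master) kernel Page Global Directory. */
pgd_t swapper_pg_dir[PTRS_PER_PGD];

/* Remap a high linear address: only the master directory is updated,
 * so no per-process Page Tables are touched at this point. */
void map_high_address(unsigned int idx, pgd_t entry)
{
    swapper_pg_dir[idx] = entry;
}

/* Deferred propagation: called when a process faults on an address
 * whose entry is missing from its own directory. Returns 1 if the
 * entry was copied from the master, 0 if there is nothing to copy. */
int sync_from_master(pgd_t *pgd, unsigned int idx)
{
    if (idx < KERNEL_PGD_START)
        return 0; /* not a "high" address: not handled here */
    if (swapper_pg_dir[idx] == 0)
        return 0; /* the master has no mapping either */
    pgd[idx] = swapper_pg_dir[idx];
    return 1;
}
```

The cost of an update is thus paid once in the master directory, plus one copy per process that actually touches the remapped address.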
