The general approach to implementation of the swap policy algorithms has been discussed above. The following sections focus on the interaction of the swap policy functions and procedures and describe their implementation in detail. Figure 18-11 shows a code flow diagram listing the most important methods and illustrating how they are interlinked.

[Diagram: two entry points, direct page reclaim and the swap daemons (one instance per NUMA node). Functions shown: kswapd, balance_pgdat, shrink_zone, shrink_active_list, shrink_inactive_list, shrink_page_list.]

Figure 18-11: "Big picture" of the page reclaim implementation. Note that the figure is not a proper code flow diagram but just displays the most important functions.

The diagram is another refinement of the overview shown in Figure 18-7. Page reclaim is triggered at two points, as shown in the figure:

1. try_to_free_pages is invoked if the kernel detects an acute shortage of memory during an operation. It checks all pages in the current memory zone and frees those least frequently needed.

2. A background daemon named kswapd checks memory utilization at regular intervals and detects impending memory shortage. It can be used to swap out pages as a preventive measure before the kernel discovers in the course of another operation that it does not have enough memory.

On NUMA machines, which do not share memory uniformly over all processors (see Chapter 3), there is a separate kswapd daemon for each NUMA node. Each daemon is responsible for all memory zones of its node.

On non-NUMA systems, there is just one instance of kswapd, which is responsible for all main memory zones (non-NUMA zones). Recall that, for instance, IA-32 can have up to three zones — ISA-DMA, normal memory, and high memory.
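The division of labor between daemons and zones can be sketched in a few lines of C. This is only a toy model, not kernel code: the types toy_zone and toy_node, the function toy_kswapd_scan_node, and the low_watermark field are hypothetical names introduced for illustration, and the simple "free pages below a threshold" test is an assumption standing in for the kernel's real shortage detection.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ZONES 3   /* e.g. ISA-DMA, normal, and high memory on IA-32 */

/* Hypothetical, heavily simplified stand-in for a memory zone. */
struct toy_zone {
    const char *name;
    unsigned long free_pages;
    unsigned long low_watermark;   /* assumed threshold for "impending shortage" */
};

/* Hypothetical stand-in for a NUMA node: it owns a small set of zones. */
struct toy_node {
    int node_id;
    size_t nr_zones;
    struct toy_zone zones[TOY_MAX_ZONES];
};

/*
 * One pass of the daemon that belongs to `node`: it looks at every zone of
 * that node (and at no zone of any other node) and returns how many of them
 * are running short of free pages.
 */
size_t toy_kswapd_scan_node(const struct toy_node *node)
{
    assert(node != NULL && node->nr_zones <= TOY_MAX_ZONES);

    size_t zones_short = 0;
    for (size_t i = 0; i < node->nr_zones; i++) {
        const struct toy_zone *zone = &node->zones[i];
        if (zone->free_pages < zone->low_watermark)
            zones_short++;
    }
    return zones_short;
}
```

In this model, a non-NUMA system is simply the case of a single toy_node, so one daemon pass covers all zones of main memory.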

The paths of the two versions merge very quickly in the shrink_zone function. The remaining code of the page reclaim subsystem is identical for both options.
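The way both entry points converge on a single worker function can be illustrated with a small sketch. The function names mirror those in the figure, prefixed with toy_ to make clear that they are hypothetical simplifications; the flow_zone type and the page counters are assumptions made for this example and do not reflect the kernel's actual data structures or signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-zone state: just enough to observe which path did what. */
struct flow_zone {
    unsigned long reclaimable_pages;
    unsigned int shrink_calls;       /* how often toy_shrink_zone ran on this zone */
};

/* Shared back end: both reclaim paths end up here. Returns pages freed. */
unsigned long toy_shrink_zone(struct flow_zone *zone, unsigned long nr_wanted)
{
    assert(zone != NULL);

    unsigned long freed = zone->reclaimable_pages < nr_wanted
                              ? zone->reclaimable_pages
                              : nr_wanted;
    zone->reclaimable_pages -= freed;
    zone->shrink_calls++;
    return freed;
}

/* Direct reclaim: an operation has hit an acute shortage in one zone. */
unsigned long toy_try_to_free_pages(struct flow_zone *zone, unsigned long nr_wanted)
{
    return toy_shrink_zone(zone, nr_wanted);
}

/* Background reclaim: one daemon pass over all zones of its node. */
unsigned long toy_balance_pgdat(struct flow_zone *zones, size_t nr_zones,
                                unsigned long nr_wanted_per_zone)
{
    assert(zones != NULL || nr_zones == 0);

    unsigned long freed = 0;
    for (size_t i = 0; i < nr_zones; i++)
        freed += toy_shrink_zone(&zones[i], nr_wanted_per_zone);
    return freed;
}
```

Everything that differs between the two paths sits in the thin wrappers; the code below toy_shrink_zone does not need to know who called it.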

Two mechanisms determine how many pages must be swapped out to provide the system with fresh memory: try_to_free_pages uses algorithms designed to deal with acute memory shortage, and the kswapd daemon regularly checks memory utilization. Once this number is known, the kernel must still decide which specific pages are to be swapped out, and ultimately pass them from the policy part of the code to the kernel routines responsible for writing the pages back to their backing store and adapting the page table entries.

Recall from Chapter 3.2.1 that the kernel tries to categorize pages into two LRU lists: one for active pages, and one for inactive pages. These lists are managed per memory zone.
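To make the per-zone arrangement concrete, here is a minimal sketch of a zone that carries its own pair of lists. It is an illustration under stated assumptions rather than the kernel's definition: lru_zone, lru_page, lru_add, and lru_deactivate_head are hypothetical names, the lists are plain singly linked lists, and the choice of which page to demote (here simply the head of the active list) is a placeholder for the real policy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page descriptor; a real page descriptor is far richer. */
struct lru_page {
    struct lru_page *next;
    int active;                      /* 1 while on the active list, else 0 */
};

/* Hypothetical zone: each zone owns one active and one inactive list. */
struct lru_zone {
    struct lru_page *active_list;
    struct lru_page *inactive_list;
    unsigned long nr_active;
    unsigned long nr_inactive;
};

/* Put a page at the head of one of the zone's two lists. */
void lru_add(struct lru_zone *zone, struct lru_page *page, int active)
{
    assert(zone != NULL && page != NULL);

    struct lru_page **head = active ? &zone->active_list : &zone->inactive_list;
    page->active = active;
    page->next = *head;
    *head = page;
    if (active)
        zone->nr_active++;
    else
        zone->nr_inactive++;
}

/*
 * Move the page at the head of the active list to the inactive list.
 * Returns the demoted page, or NULL if the active list is empty.
 */
struct lru_page *lru_deactivate_head(struct lru_zone *zone)
{
    assert(zone != NULL);

    struct lru_page *page = zone->active_list;
    if (page == NULL)
        return NULL;
    zone->active_list = page->next;
    zone->nr_active--;
    lru_add(zone, page, 0);
    return page;
}
```

Because the lists and their counters live inside the zone, reclaim decisions for one zone never have to touch the bookkeeping of another.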
