Determining Page Activity

The kernel must track not only whether a page is actually used by one or more processes, but also how often it is accessed, in order to assess its importance. As only very few architectures support a direct access counter for memory pages, the kernel must resort to other means and has therefore introduced two page flags named referenced and active. The corresponding bit values are PG_referenced and PG_active, and the usual set of macros discussed in Section 3.2.2 is available to set or query their state. Recall that, for instance, PageReferenced checks the PG_referenced bit, while SetPageActive sets the PG_active bit.
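To make the naming scheme of the flag helpers concrete, here is a minimal user-space sketch. It is not the kernel's implementation: the bit positions are hypothetical, struct page is reduced to a bare flags word, and plain inline functions with non-atomic bit arithmetic stand in for the kernel's macros.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit positions; the real values are defined by the kernel. */
enum { PG_referenced = 0, PG_active = 1 };

/* Reduced stand-in for struct page: only the flags word is modelled. */
struct page {
    unsigned long flags;
};

/* Query helpers: report whether the corresponding bit is set. */
static inline bool PageReferenced(const struct page *page)
{
    return (page->flags >> PG_referenced) & 1UL;
}

static inline bool PageActive(const struct page *page)
{
    return (page->flags >> PG_active) & 1UL;
}

/* Set helpers: switch the corresponding bit on. */
static inline void SetPageReferenced(struct page *page)
{
    page->flags |= 1UL << PG_referenced;
}

static inline void SetPageActive(struct page *page)
{
    page->flags |= 1UL << PG_active;
}

/* Clear helpers: switch the corresponding bit off. */
static inline void ClearPageReferenced(struct page *page)
{
    page->flags &= ~(1UL << PG_referenced);
}

static inline void ClearPageActive(struct page *page)
{
    page->flags &= ~(1UL << PG_active);
}
```

Each flag gets the same trio of helpers (query, set, clear), so code that manipulates page state reads the same way regardless of which flag is involved.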

Why are two flags used for the page state? Suppose that only a single flag were used to determine page activity; PG_active would lend itself to that rather well. When the page is accessed, the flag is set, but when is it going to be removed again? If the kernel never removes it automatically, the page remains in the active state forever, even if it is used very little or not at all anymore. Removing the flag automatically after a specific time-out would require a huge number of kernel timers, because appropriate hardware support is not available on all CPUs supported by Linux. Considering the large number of pages present in a typical system, this approach is also doomed to fail.

9 The page migration code for NUMA systems, which is otherwise not covered in this book, is also a user of the function.

[Figure 18-12 diagram not reproduced. Labels: CPU 0, CPU 1; per-CPU lists lru_add_pvecs and lru_add_active_pvecs; global lists lru_inactive and lru_active; functions __pagevec_lru_add, __pagevec_lru_add_active, activate_page, shrink_active_list; flag operations SetPageLRU, SetPageActive.]

Figure 18-12: Page movements between the per-CPU page lists and the global LRU lists. To simplify matters, only a single zone is used as the basis of the global lists. Only the most important functions that move pages between the active and inactive lists are shown.

Having two flags allows for implementing a more sophisticated approach to determining page activity. The core idea is to use one flag to denote the current activity rating, and another one that signals if the page has been recently referenced. Both bits need to be set in close cooperation. Figure 18-13 illustrates the corresponding algorithm. Essentially the following steps are necessary:

1. If the page is deemed active, the PG_active flag is set; otherwise, not. The flag directly corresponds to the LRU list the page is on, namely, the (zone-specific) inactive or active list.

2. Each time the page is accessed, the flag PG_referenced is set. The function responsible for this is mark_page_accessed, and the kernel must make sure to call it appropriately.

3. The PG_referenced flag and information provided by reverse mapping are used to determine page activity. The crucial point is that the PG_referenced flag is removed each time an activity check is performed. page_referenced is the function that implements this behavior.

4. Enter mark_page_accessed again. When it finds that the PG_referenced bit is already set when it checks the page, this means that no check was performed by page_referenced in the meantime. The calls to mark_page_accessed have thus been more frequent than the calls to page_referenced, which implies that the page is often accessed. If the page is currently on the inactive list, it is moved to the active list. Additionally, the PG_active bit is set, and PG_referenced is removed.

5. A demotion is also possible. If the page is on the active list and receives much attention, then PG_referenced is usually set. Once the page starts to experience less activity, two calls of page_referenced without an intervening call of mark_page_accessed are required before it is put on the inactive list.
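The five steps above can be condensed into a small state machine. The following user-space sketch is a simplified model of the description, not kernel code: the names model_mark_page_accessed and model_activity_check are hypothetical, two booleans stand in for the page flags and the list placement, and the reverse-mapping information that page_referenced also consults is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Two booleans stand in for the page flags. 'active' also encodes the
 * list: true means active list, false means inactive list. */
struct page_state {
    bool active;      /* models PG_active */
    bool referenced;  /* models PG_referenced */
};

/* Models steps 2 and 4: an access either records a reference or, if a
 * reference is already recorded on an inactive page, promotes the page. */
static inline void model_mark_page_accessed(struct page_state *p)
{
    if (!p->active && p->referenced) {
        /* Second access without an activity check in between. */
        p->active = true;
        p->referenced = false;
    } else {
        p->referenced = true;
    }
}

/* Models steps 3 and 5: the check always clears the referenced bit and
 * reports its old value. An active page that was not referenced since
 * the previous check is demoted. */
static inline bool model_activity_check(struct page_state *p)
{
    bool was_referenced = p->referenced;

    p->referenced = false;
    if (p->active && !was_referenced)
        p->active = false;
    return was_referenced;
}
```

Starting from a page with both bits clear, two calls of model_mark_page_accessed promote it, and a third sets the referenced bit again. Two consecutive calls of model_activity_check then bring it back to the inactive state, which is the hysteresis the two flags are meant to provide.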

[Figure 18-13 diagram not reproduced. Labels: page_referenced, shrink_active_list, activate_page.]

Figure 18-13: Overview of possible state transitions of a page with respect to PG_active and PG_referenced, and the corresponding placement of the page on the active and inactive lists.

If a page is steadily accessed, then the calls of mark_page_accessed and page_referenced will essentially average out, so the page remains on its current list.

A page that is not often accessed (and thus inactive) has neither PG_active nor PG_referenced set. This means that two subsequent activity markings with mark_page_accessed (without the interference of page_referenced in between) are required to move it from the inactive to the active list. The same holds in the opposite direction: A highly active page has both PG_active and PG_referenced set, so two activity checks without an access in between are needed before it leaves the active list.

All in all, the solution ensures that pages do not bounce between the active and inactive lists too fast, which would clearly be undesirable for a reliable estimation of the page's activity level. The method is a variation of the "second chance" approach discussed at the beginning of this chapter: Highly active pages get a second chance before they are demoted to inactive pages, and highly inactive pages require a second proof before they become active pages. This is combined with a "least recently used" method (or at least an approximation, because no exact usage count is available for the pages) to realize the page reclaim policy.

Note that while Figure 18-13 illustrates the most important state and list transitions, some more are still possible. This is caused, on the one hand, by code not covered in this book (e.g., the page migration code). On the other hand, some changes are necessary to handle special cases (e.g., the lumpy page reclaim technique). These exceptions are discussed in the course of this chapter.

Several auxiliary functions are provided by the kernel to support moving pages between both LRU lists:

<mm_inline.h>

void add_page_to_active_list(struct zone *zone, struct page *page)
void add_page_to_inactive_list(struct zone *zone, struct page *page)

void del_page_from_active_list(struct zone *zone, struct page *page)
void del_page_from_inactive_list(struct zone *zone, struct page *page)

void del_page_from_lru(struct zone *zone, struct page *page)

The function names say it all, and the implementation is also a matter of simple list manipulation. The only thing to note is that del_page_from_lru must be used if the current LRU list of the page is unknown to the caller.
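To illustrate what "simple list manipulation" can look like, here is a user-space sketch of such helpers. Everything in it is an assumption made for illustration: the model_ prefixed names, the reduced zone and page structures, the per-list counters, and the hand-written circular list that stands in for the kernel's list primitives. The sketch also assumes that the page's active flag tells which list it is on, which is what lets the generic removal function pick the right counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, a stand-in for the kernel's lists. */
struct list_node {
    struct list_node *prev, *next;
};

static inline void list_init(struct list_node *head)
{
    head->prev = head->next = head;
}

static inline void list_add_front(struct list_node *node, struct list_node *head)
{
    node->next = head->next;
    node->prev = head;
    head->next->prev = node;
    head->next = node;
}

static inline void list_remove(struct list_node *node)
{
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->prev = node->next = node;
}

/* Hypothetical, reduced versions of the page and zone structures. */
struct model_page {
    struct list_node lru;
    bool active;               /* models PG_active */
};

struct model_zone {
    struct list_node active_list, inactive_list;
    unsigned long nr_active, nr_inactive;
};

static inline void model_zone_init(struct model_zone *zone)
{
    list_init(&zone->active_list);
    list_init(&zone->inactive_list);
    zone->nr_active = zone->nr_inactive = 0;
}

static inline void model_add_page_to_active_list(struct model_zone *zone, struct model_page *page)
{
    list_add_front(&page->lru, &zone->active_list);
    zone->nr_active++;
}

static inline void model_add_page_to_inactive_list(struct model_zone *zone, struct model_page *page)
{
    list_add_front(&page->lru, &zone->inactive_list);
    zone->nr_inactive++;
}

static inline void model_del_page_from_active_list(struct model_zone *zone, struct model_page *page)
{
    list_remove(&page->lru);
    zone->nr_active--;
}

static inline void model_del_page_from_inactive_list(struct model_zone *zone, struct model_page *page)
{
    list_remove(&page->lru);
    zone->nr_inactive--;
}

/* For callers that do not know the list: the active flag decides which
 * counter to adjust, and the flag is cleared on removal. */
static inline void model_del_page_from_lru(struct model_zone *zone, struct model_page *page)
{
    if (page->active) {
        page->active = false;
        model_del_page_from_active_list(zone, page);
    } else {
        model_del_page_from_inactive_list(zone, page);
    }
}
```

Note that the add helpers in this sketch do not touch the active flag; the caller is expected to keep the flag and the list placement consistent.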

Moving pages between the active and inactive lists does, however, require more than just handling the list entries. activate_page is responsible for promoting an inactive page to the active list. Without locking and statistics accounting, the code looks as follows:
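The kernel listing itself is not reproduced here. As a rough stand-in, the following user-space sketch shows the sequence of steps that such a promotion plausibly involves, based on the helper functions and flags introduced above. The names model_activate_page, model_zone, and model_page are hypothetical, counters replace the actual lists, and the boolean return value is added purely so the effect can be observed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced structures: counters stand in for the LRU lists. */
struct model_zone {
    unsigned long nr_active, nr_inactive;
};

struct model_page {
    bool on_lru;   /* models PG_lru: the page is on one of the LRU lists */
    bool active;   /* models PG_active */
};

/* Promote an inactive page that is on an LRU list. Returns true if the
 * page was moved, false if nothing had to be done. */
static inline bool model_activate_page(struct model_zone *zone, struct model_page *page)
{
    if (!page->on_lru || page->active)
        return false;          /* not on a list, or already active */

    assert(zone->nr_inactive > 0);
    zone->nr_inactive--;       /* stands in for del_page_from_inactive_list */
    page->active = true;       /* stands in for SetPageActive */
    zone->nr_active++;         /* stands in for add_page_to_active_list */
    return true;
}
```

The guard at the top is the part that goes beyond list handling: the flag and the list placement are changed together, and only for pages that are currently on the inactive list.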
