Swapping Pages in
You already know from Chapter 4 that page faults as a result of accessing a swapped-out page are handled by do_swap_page from mm/memory.c. As the associated code flow diagram in Figure 18-19 shows, it is much easier to swap a page in than to swap it out, but it still involves more than just a simple read operation.
The kernel must not only check whether the requested page is still (or already) in the swap cache, but it also uses a simple readahead method to read several pages from the swap area in one chunk, anticipating possible future page faults.
As discussed in Section 18.4.1, the swap area and slot of a swapped-out page are held in the page table entry (the actual representation differs from machine to machine). To obtain general values, the kernel first invokes the familiar pte_to_swp_entry function to produce a swp_entry_t instance with machine-independent values that uniquely identify the page.
On the basis of these data, lookup_swap_cache checks whether the required page is in the swap cache. This applies if either the data have not yet been written or the data are shared and have already been read earlier by another process.
If the page is not in the swap cache, the kernel must not only cause the page to be read, but must also initiate a readahead operation to read a few pages in anticipation:
□ grab_swap_token grabs the swap token as described before.
□ swapin_readahead is responsible for performing the readahead. As a result, read requests are issued not only for the desired page but also for a few pages in the adjacent slots. This requires relatively little effort but speeds things up considerably because processes very often access the data they need sequentially. When this happens, the corresponding pages will already have been read into memory by the readahead mechanism.
□ read_swap_cache_async is called once more for the presently required page. As the function name indicates, the read operation is asynchronous. However, the kernel uses a trick to ensure that the required data have been read in before further work is commenced. read_swap_cache_async locks the page before a read request is submitted to the block layer. When the block layer has finished the data transfer, the page is unlocked. Therefore, it is sufficient to call lock_page in do_swap_page to lock the page — the operation will have to wait until the block layer unlocks the page. Unlocking the page from the block layer's side is, however, a confirmation that the read request has been completed.
I take a look at the implementation of these two actions below.
Once the page has been swapped in (if necessary), the following points must be addressed regardless of whether the page came from the swap cache or had to be read from a block device.
The page is first marked with mark_page_accessed so that the kernel regards it as accessed — recall the state diagram in Figure 18-13 in this context. It is then inserted in the page tables of the process, and the corresponding caches are flushed if necessary. Thereafter, page_add_anon_rmap is invoked to include the page in the reverse mapping mechanism discussed in Chapter 4. The familiar swap_free function then checks whether the slot in the swap area can be freed. This also ensures that the usage counter in the swap data structure is decremented by 1. If the slot is no longer needed, the routine modifies the lowest_bit or highest_bit fields of the swap_info instance provided the swap page is at one of its two ends.
If the page is accessed in Read/Write mode, the kernel must conclude the operation by invoking do_wp_page. This creates a copy of the page, adds it to the page tables of the process that caused the fault, and decrements the usage counter on the original page by 1. These are the same steps performed by the copy-on-write mechanism discussed in Chapter 4.