Copy on Write

Copy on write is handled in do_wp_page, whose code flow diagram is shown in Figure 4-20.

Figure 4-20: Code flow diagram for do_wp_page.

Let's examine a slightly simplified version in which I have omitted potential interference with the swap cache as well as some corner cases, since these would complicate the picture without revealing anything insightful about the mechanism itself.

The kernel first invokes vm_normal_page to find the struct page instance of the page by reference to the page table entry. Essentially, this function builds on pte_pfn and pfn_to_page, which must be defined on all architectures. The former extracts the page frame number from a page table entry, and the latter determines the page instance associated with that page frame number. This is possible because the COW mechanism is invoked only for pages that actually reside in memory (otherwise, they are first automatically loaded by one of the other page fault mechanisms).
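The two-step lookup can be illustrated with a small userspace model. This is only a sketch and not kernel code: the toy_ names, the 4 KiB page size, the flat toy_mem_map array, and the PTE layout (frame number above the low 12 flag bits) are all assumptions made for illustration. Real architectures define their own PTE formats and memory models.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12   /* assumption: 4 KiB pages */
#define TOY_NR_FRAMES  16   /* assumption: a tiny "physical memory" */

/* Hypothetical PTE layout: frame number in the high bits, flags in the low 12. */
typedef uint64_t toy_pte_t;

struct toy_page {
    int count;              /* stand-in for the per-page usage counter */
};

/* One descriptor per page frame, indexed by page frame number. */
static struct toy_page toy_mem_map[TOY_NR_FRAMES];

/* Illustrative helper: build a PTE from a frame number and flag bits. */
static toy_pte_t toy_mk_pte(uint64_t pfn, uint64_t flags)
{
    return (pfn << TOY_PAGE_SHIFT) | (flags & ((1u << TOY_PAGE_SHIFT) - 1));
}

/* Step 1: page table entry -> page frame number. */
static uint64_t toy_pte_pfn(toy_pte_t pte)
{
    return pte >> TOY_PAGE_SHIFT;
}

/* Step 2: page frame number -> page descriptor. */
static struct toy_page *toy_pfn_to_page(uint64_t pfn)
{
    assert(pfn < TOY_NR_FRAMES);
    return &toy_mem_map[pfn];
}

/* Both steps combined, in the spirit of vm_normal_page (special cases omitted). */
static struct toy_page *toy_vm_normal_page(toy_pte_t pte)
{
    return toy_pfn_to_page(toy_pte_pfn(pte));
}
```

Calling toy_vm_normal_page(toy_mk_pte(5, 0x3)) in this model yields the address of toy_mem_map[5]: the flag bits are discarded by the shift, and the remaining frame number serves as the index into the descriptor array.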

Once the kernel has obtained a reference on the page with page_cache_get, anon_vma_prepare prepares the data structures of the reverse mapping mechanism to accept a new anonymous area. Since the fault originates from a page filled with useful data that must be copied to a new page, the kernel invokes alloc_page_vma to allocate a fresh page. cow_user_page then copies the data of the faulted page into the new page, to which the process may subsequently write.

The reverse mapping to the original read-only page is then removed using page_remove_rmap. The new page is added to the page tables, at which point the CPU caches must also be updated.
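The sequence described so far (pin the old page, allocate a new one, copy the data, drop the old mapping, install the new page) can be modelled in plain userspace C. The following is a rough sketch under simplifying assumptions, not the kernel's implementation: struct cow_page, struct cow_pte, and cow_break are hypothetical names, the reverse mapping is reduced to a bare mapcount, and the reference accounting (each mapping holds one reference) is a convention of this toy model.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define COW_PAGE_SIZE 4096           /* assumption: 4 KiB pages */

struct cow_page {
    int count;                       /* references held on the page */
    int mapcount;                    /* page table entries that map the page */
    unsigned char data[COW_PAGE_SIZE];
};

struct cow_pte {
    struct cow_page *page;           /* page this entry points to */
    int writable;                    /* 0 = read-only (shared), 1 = writable */
};

/*
 * Break the sharing for one read-only entry after a write fault.
 * Returns 0 on success, -1 if no memory is available.
 */
static int cow_break(struct cow_pte *pte)
{
    struct cow_page *old_page = pte->page;
    struct cow_page *new_page;

    assert(old_page != NULL && !pte->writable);

    old_page->count++;               /* pin the page (cf. page_cache_get) */

    new_page = calloc(1, sizeof(*new_page));   /* cf. alloc_page_vma */
    if (new_page == NULL) {
        old_page->count--;           /* undo the pin on failure */
        return -1;
    }
    new_page->count = 1;             /* reference held by the new mapping */

    /* Copy the contents of the faulted page (cf. cow_user_page). */
    memcpy(new_page->data, old_page->data, COW_PAGE_SIZE);

    old_page->mapcount--;            /* cf. page_remove_rmap */

    pte->page = new_page;            /* install the copy in the "page table" */
    pte->writable = 1;

    /* Drop the pin and the reference the old mapping held (toy accounting). */
    old_page->count -= 2;
    return 0;
}
```

If two entries share one page with count and mapcount both set to 2, calling cow_break on one of them leaves the other entry pointing at the original page with count and mapcount of 1, while the faulting entry now refers to a private, writable copy with identical contents.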

The final actions involve placing the newly allocated page on the active list of the LRU cache using lru_cache_add_active and inserting it in the reverse mapping data structures by means of page_add_anon_rmap. Thereafter, the userspace process can write to the page to its heart's content.
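From userspace, the effect of the mechanism can be observed with fork. The sketch below assumes a POSIX system that provides MAP_ANONYMOUS; cow_fork_demo is a hypothetical helper name. Note that the program only demonstrates the visible semantics (the child's write does not reach the parent's page). It cannot show by itself that the kernel deferred the copy until the write fault.

```c
#define _DEFAULT_SOURCE              /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns the first byte the parent sees in its page after the child
 * has written to the same virtual address, or -1 on error.
 */
static int cow_fork_demo(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    size_t len = (size_t)page_size;

    /* A private anonymous mapping: shared read-only with the child after fork. */
    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    memset(buf, 'A', len);           /* touch the page so it resides in memory */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(buf, len);
        return -1;
    }

    if (pid == 0) {
        /* Child: this write triggers the fault that gives it a private copy. */
        buf[0] = 'B';
        _exit(buf[0] == 'B' ? 0 : 1);
    }

    /* Parent: wait for the child, then inspect its own, untouched page. */
    int status = 0;
    int seen = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        seen = buf[0];

    munmap(buf, len);
    return seen;
}
```

On success, the function returns 'A': the child wrote 'B' to its own copy, and the parent's page still holds the original contents.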

