The msync( ) system call can be used by a process to flush to disk dirty pages belonging to a shared memory mapping. It receives as parameters the starting address of an interval of linear addresses, the length of the interval, and a set of flags that have the following meanings:
MS_SYNC
Asks the system call to suspend the process until the I/O operation completes. In this way, the calling process can assume that when the system call terminates, all pages of its memory mapping have been flushed to disk.
MS_ASYNC
Asks the system call to return immediately without suspending the calling process.
MS_INVALIDATE
Asks the system call to remove all pages included in the memory mapping from the process address space (not really implemented).
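From the point of view of a User Mode program, the system call can be exercised as in the following minimal sketch. It assumes a POSIX system; the helper name write_and_sync( ) and the way the file is created are illustrative choices, not something taken from the kernel sources. The helper maps a file with MAP_SHARED, dirties the mapped pages, and then calls msync( ) with MS_SYNC so that it returns only after the data has been written out.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: stores msg in a shared file mapping and flushes it
 * with MS_SYNC. Returns 0 on success, -1 on failure. */
int write_and_sync(const char *path, const char *msg)
{
    size_t len = strlen(msg);
    assert(len > 0);                    /* mmap() rejects zero-length mappings */

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* The file must be at least as long as the range we store into. */
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(map, msg, len);              /* dirties the pages of the mapping */

    /* MS_SYNC: block until the dirty pages have been written out.
     * MS_ASYNC here would only request the write and return at once. */
    int rc = msync(map, len, MS_SYNC);

    munmap(map, len);
    close(fd);
    return rc;
}
```

A caller might invoke write_and_sync("/tmp/msync_demo.bin", "hello") and then read the file back with read( ) to check its contents. Note that msync( ) expects a page-aligned starting address, which is guaranteed here because the address comes straight from mmap( ).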
The sys_msync( ) service routine invokes msync_interval( ) on each memory region included in the interval of linear addresses. In turn, the latter function performs the following operations:
1. If the vm_file field of the memory region descriptor is null, or if the VM_SHARED flag is clear, returns 0 (the memory region is not a writable shared memory mapping of a file).
2. Invokes the filemap_sync( ) function, which scans the Page Table entries corresponding to the linear address intervals included in the memory region. For each page found, it invokes flush_tlb_page( ) to flush the corresponding translation lookaside buffers, and marks the page as dirty.
3. If the MS_SYNC flag is not set, returns. Otherwise, continues with the following steps to flush the pages in the memory region to disk, sleeping until all I/O data transfers terminate. Notice that, at least in the last stable version of the kernel at the time of this writing, the function does not take the MS_INVALIDATE flag into consideration.
4. Acquires the i_sem semaphore of the file's inode.
5. Invokes the filemap_fdatasync( ) function, which receives the address of the file's address_space object. For every page belonging to the dirty pages list of the address_space object, the function performs the following substeps:
a. Moves the page from the dirty pages list to the locked pages list.
b. If the PG_dirty flag is not set, continues with the next page in the list (the page is already being flushed by another process).
c. Increments the usage counter of the page and locks it, sleeping if necessary.
d. Clears the PG_dirty flag of the page.
e. Invokes the writepage method of the address_space object on the page (described following this list).
f. Decrements the usage counter of the page.
The writepage method for block device files and almost all disk-based filesystems is just a wrapper for the block_write_full_page( ) function; it is used to pass to block_write_full_page( ) the address of a filesystem-dependent function that translates the block numbers relative to the beginning of the file into logical block numbers relative to positions of the block in the disk partition. (This is the same mechanism that is described earlier in Section 15.1.1 and that is used for the readpage method.) In turn, block_write_full_page( ) is very similar to block_read_full_page( ) described earlier: it allocates asynchronous buffer heads for the page, and invokes the submit_bh( ) function on each of them specifying the write operation.
6. Checks whether the fsync method of the file object is defined; if so, executes it. For regular files, this method usually limits itself to flushing the inode object of the file to disk. For block device files, however, the method invokes sync_buffers( ), which activates the I/O data transfer of all dirty buffers of the device.
7. Executes the filemap_fdatawait( ) function. For each page in the locked pages list of the address_space object, the function waits until the page becomes unlocked, that is, until the ongoing I/O data transfer on the page terminates.
8. Releases the i_sem semaphore of the file's inode.
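The control flow of the eight steps above can be summarized with a small User Mode model. This is only a sketch under stated assumptions, not kernel code: every name prefixed with model_ or MODEL_ is hypothetical, the flag values are arbitrary, the "touched" field stands in for the Page Table scan done by filemap_sync( ), the semaphore is a plain counter, and I/O is assumed to complete as soon as the model waits for it. The fsync step is left as a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MODEL_MS_ASYNC = 1, MODEL_MS_SYNC = 4 };   /* hypothetical flag values */
enum model_list { ON_CLEAN, ON_DIRTY, ON_LOCKED };

struct model_page {
    bool touched;          /* assumption: stands in for the Page Table state */
    bool dirty;            /* stands in for the PG_dirty flag */
    bool locked;           /* stands in for the page lock */
    int count;             /* usage counter */
    int writes;            /* number of simulated writepage calls */
    enum model_list list;  /* which list of the address_space holds the page */
};

struct model_region {
    bool has_file;         /* vm_file is not null */
    bool shared;           /* VM_SHARED is set */
    struct model_page *pages;
    size_t npages;
    int sem_held;          /* stands in for the i_sem semaphore */
};

/* Step 2: mark every page found in the scan as dirty. */
static void model_filemap_sync(struct model_region *r)
{
    for (size_t i = 0; i < r->npages; i++) {
        struct model_page *p = &r->pages[i];
        if (p->touched) {
            p->touched = false;
            p->dirty = true;
            p->list = ON_DIRTY;
        }
    }
}

/* Step 5: start a write for every page on the dirty list. */
static void model_fdatasync(struct model_region *r)
{
    for (size_t i = 0; i < r->npages; i++) {
        struct model_page *p = &r->pages[i];
        if (p->list != ON_DIRTY)
            continue;
        p->list = ON_LOCKED;   /* 5a: dirty list -> locked list */
        if (!p->dirty)
            continue;          /* 5b: someone else is flushing it */
        p->count++;            /* 5c: take a reference and lock */
        p->locked = true;
        p->dirty = false;      /* 5d */
        p->writes++;           /* 5e: writepage; the page stays locked */
        p->count--;            /* 5f */
    }
}

/* Step 7: wait for the pages on the locked list. In this model the
 * transfer is assumed to finish at once, so the page is simply unlocked;
 * list membership is left unchanged. */
static void model_fdatawait(struct model_region *r)
{
    for (size_t i = 0; i < r->npages; i++)
        if (r->pages[i].list == ON_LOCKED)
            r->pages[i].locked = false;
}

int model_msync_interval(struct model_region *r, int flags)
{
    assert(r != NULL);
    if (!r->has_file || !r->shared)
        return 0;                       /* step 1 */
    model_filemap_sync(r);              /* step 2 */
    if (!(flags & MODEL_MS_SYNC))
        return 0;                       /* step 3 */
    r->sem_held++;                      /* step 4 */
    model_fdatasync(r);                 /* step 5 */
    /* step 6: the fsync method would be invoked here (omitted) */
    model_fdatawait(r);                 /* step 7 */
    r->sem_held--;                      /* step 8 */
    return 0;
}
```

Calling model_msync_interval( ) without MODEL_MS_SYNC only marks the touched pages as dirty. Calling it again with MODEL_MS_SYNC writes each of those pages once and leaves them unlocked, which mirrors the difference between steps 1 through 3 and steps 4 through 8.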