The pdflush Mechanism

The pdflush mechanism is implemented in a single file: mm/pdflush.c. This contrasts with the fragmented implementation of the synchronization mechanisms in earlier versions.

pdflush is started with the usual kernel thread mechanisms:

mm/pdflush.c
static void start_one_pdflush_thread(void)
{
	kthread_run(pdflush, NULL, "pdflush");
}
start_one_pdflush_thread starts a single pdflush thread; in general, however, the kernel uses several threads at the same time, as you will see below. It should be noted that a specific pdflush thread is not always responsible for the same block device. Thread allocation may vary over time, simply because the number of threads is not constant but differs according to system load.

In fact, the kernel starts the specific number of threads defined in MIN_PDFLUSH_THREADS when it initializes the pdflush subsystem. Typically, this number is 2, so that in a normally loaded system, two active instances of pdflush appear in the task list displayed by ps:

user@host> ps fax
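Among the output, the two instances show up as kernel threads, which ps displays in square brackets (the PIDs shown here are, of course, hypothetical):

  PID TTY      STAT   TIME COMMAND
  ...
  187 ?        S      0:00 [pdflush]
  188 ?        S      0:00 [pdflush]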

There is a lower and an upper limit on the number of threads. MAX_PDFLUSH_THREADS specifies the maximum number of pdflush instances, typically 8. The current number of threads is held in the global variable nr_pdflush_threads, but no distinction is made as to whether the threads are currently active or sleeping. The current value is visible to userspace in /proc/sys/vm/nr_pdflush_threads.
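This makes it easy to check the thread count from the command line; on a lightly loaded system, the value matches the lower limit:

user@host> cat /proc/sys/vm/nr_pdflush_threads
2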

The policy for when to create and destroy pdflush threads is simple: the kernel creates a new thread if no idle thread has been available for 1 second; conversely, a thread is destroyed if it has been idle for more than 1 second. The upper and lower limits on the number of concurrent pdflush threads defined in MIN_PDFLUSH_THREADS (2) and MAX_PDFLUSH_THREADS (8) are always obeyed.
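The following fragment illustrates both checks. It is a condensed sketch loosely based on the main loop of __pdflush in the 2.6 sources, with locking and freezer handling omitted; pdflush_list is the list of idle threads, and last_empty_jifs records when this list was last found empty:

/* Condensed sketch of the pool policy in __pdflush(); locking omitted.
 * Executed by each pdflush thread after a work item completes. */
struct pdflush_work *pdf;

/* Thread creation: has no idle thread been available for 1 second? */
if (jiffies - last_empty_jifs > 1 * HZ) {
	if (list_empty(&pdflush_list) &&
	    nr_pdflush_threads < MAX_PDFLUSH_THREADS) {
		last_empty_jifs = jiffies;
		nr_pdflush_threads++;
		start_one_pdflush_thread();
	}
}

/* Thread destruction: has the longest-sleeping idle thread been
 * idle for more than 1 second? Then this thread leaves the loop
 * and exits -- but never below the lower limit. */
if (!list_empty(&pdflush_list) &&
    nr_pdflush_threads > MIN_PDFLUSH_THREADS) {
	pdf = list_entry(pdflush_list.prev, struct pdflush_work, list);
	if (jiffies - pdf->when_i_went_to_sleep > 1 * HZ) {
		pdf->when_i_went_to_sleep = jiffies;	/* limit exit rate */
		break;
	}
}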

Why is more than one thread required? Modern systems are typically equipped with more than one block device. If many dirty pages exist in the system, it is the kernel's job to keep these devices as busy as possible with writing back data. The queues of different block devices are independent of each other, so data can be written in parallel. On current hardware, data transfer rates are mainly limited by I/O bandwidth, not by CPU power. The connection between pdflush threads and writeback queues is summarized in Figure 17-2. The figure shows that a dynamically varying number of pdflush threads feeds the block devices with data that must be synchronized with the underlying storage. Notice that a block device may have more than one queue that can transfer data, and that a pdflush thread may either serve all queues or just a specific one.

Former kernel versions employed only a single flushing daemon (then called bdflush), but this led to a performance problem: if one block device queue was congested because too many writeback operations were pending, the queues of other devices could not be fed with new data anymore. They remained idle, which can be a good thing on a summer vacation, but certainly not for block devices with work to do. This problem is solved by the dynamic creation and destruction of pdflush kernel threads, which makes it possible to keep many queues busy in parallel.

Figure 17-2: Overview of the pdflush mechanism.
