Request Queues

Read and write requests to block devices are placed on a queue known as a request queue. The gendisk structure includes a pointer to the device-specific queue, which is represented by the following data type.

struct request_queue {
        /*
         * Together with queue_head for cacheline sharing
         */
        struct list_head        queue_head;
        struct list_head        *last_merge;
        elevator_t              elevator;

        struct request_list     rq;     /* Queue request freelists */

        request_fn_proc         *request_fn;
        make_request_fn         *make_request_fn;
        prep_rq_fn              *prep_rq_fn;
        unplug_fn               *unplug_fn;
        merge_bvec_fn           *merge_bvec_fn;
        prepare_flush_fn        *prepare_flush_fn;
        softirq_done_fn         *softirq_done_fn;

        /*
         * Auto-unplugging state
         */
        struct timer_list       unplug_timer;
        int                     unplug_thresh;  /* After this many requests */
        unsigned long           unplug_delay;   /* After this many jiffies */
        struct work_struct      unplug_work;

        struct backing_dev_info backing_dev_info;

        /* queue needs bounce pages for pages above this limit */
        unsigned long           bounce_pfn;
        int                     bounce_gfp;

        unsigned long           queue_flags;

        /* queue settings */
        unsigned long           nr_requests;    /* Max # of requests */
        unsigned int            nr_congestion_on;
        unsigned int            nr_congestion_off;
        unsigned int            nr_batching;

        unsigned short          max_sectors;
        unsigned short          max_hw_sectors;
        unsigned short          max_phys_segments;
        unsigned short          max_hw_segments;
        unsigned short          hardsect_size;
        unsigned int            max_segment_size;
        ...
};

queue_head is the central list head used to construct a doubly linked list of requests. Each element is of the data type request discussed below and stands for a request to the block device to read or write data. The kernel rearranges the list to achieve better I/O performance (several algorithms are provided to perform I/O scheduler tasks, as described below). As there are various ways of reordering requests, the elevator element [8] groups the required functions together in the form of function pointers. I shall come back to this structure further below.
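The list handling around queue_head can be illustrated in user space. The following is only a minimal sketch under stated assumptions: struct list_head is re-declared here as a stand-in for the kernel type, and toy_request, toy_queue, and all toy_* helpers are hypothetical names that do not exist in the kernel. Keeping the list sorted by sector is a crude placeholder for what an I/O scheduler does, not one of the kernel's algorithms.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal user-space stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical, heavily reduced request: just a start sector. */
struct toy_request {
    struct list_head queuelist;   /* links the request into the queue */
    unsigned long sector;
};

struct toy_queue {
    struct list_head queue_head;  /* sentinel node: head of the request list */
};

static void toy_queue_init(struct toy_queue *q)
{
    /* An empty circular list points back at its own head. */
    q->queue_head.next = &q->queue_head;
    q->queue_head.prev = &q->queue_head;
}

/* Insert rq in front of pos (pos == head appends at the tail). */
static void toy_insert_before(struct toy_request *rq, struct list_head *pos)
{
    rq->queuelist.prev = pos->prev;
    rq->queuelist.next = pos;
    pos->prev->next = &rq->queuelist;
    pos->prev = &rq->queuelist;
}

/* Recover the request from its embedded list node. */
static struct toy_request *toy_entry(struct list_head *node)
{
    return (struct toy_request *)((char *)node -
                                  offsetof(struct toy_request, queuelist));
}

/* Keep the list sorted by ascending sector number. */
static void toy_add_sorted(struct toy_queue *q, struct toy_request *rq)
{
    struct list_head *pos = q->queue_head.next;

    while (pos != &q->queue_head && toy_entry(pos)->sector <= rq->sector)
        pos = pos->next;
    toy_insert_before(rq, pos);
}

/* Remove and return the first request, or NULL if the queue is empty. */
static struct toy_request *toy_pop_front(struct toy_queue *q)
{
    struct list_head *node = q->queue_head.next;

    if (node == &q->queue_head)
        return NULL;
    node->prev->next = node->next;
    node->next->prev = node->prev;
    return toy_entry(node);
}
```

Requests added in the order 300, 100, 200 come back from toy_pop_front in ascending sector order, which shows how reordering can happen at insertion time without touching the requests themselves.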

rq serves as a cache for request instances. struct request_list is used as a data type; in addition to the cache itself, it provides two counters to record the number of available free input and output requests.
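The idea of a request cache with one counter per transfer direction can be sketched as follows. This is an illustrative user-space model only: cached_request, toy_request_list, the fixed pool, and the per-direction budget TOY_PER_DIR are all assumptions made for the example and do not mirror the layout of the kernel's struct request_list.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_READ = 0, TOY_WRITE = 1 };
#define TOY_PER_DIR 4               /* hypothetical budget per direction */

struct cached_request {
    int dir;                        /* TOY_READ or TOY_WRITE */
    struct cached_request *next_free;
};

struct toy_request_list {
    int free_count[2];              /* free requests left per direction */
    struct cached_request *free_head;   /* singly linked freelist */
    struct cached_request pool[2 * TOY_PER_DIR];
};

static void toy_rl_init(struct toy_request_list *rl)
{
    rl->free_head = NULL;
    for (int i = 0; i < 2 * TOY_PER_DIR; i++) {
        rl->pool[i].next_free = rl->free_head;
        rl->free_head = &rl->pool[i];
    }
    rl->free_count[TOY_READ] = TOY_PER_DIR;
    rl->free_count[TOY_WRITE] = TOY_PER_DIR;
}

/* Take a request from the cache; NULL if this direction is exhausted. */
static struct cached_request *toy_get_request(struct toy_request_list *rl,
                                              int dir)
{
    struct cached_request *rq;

    assert(dir == TOY_READ || dir == TOY_WRITE);
    if (rl->free_count[dir] == 0)
        return NULL;
    rq = rl->free_head;
    rl->free_head = rq->next_free;
    rl->free_count[dir]--;
    rq->dir = dir;
    return rq;
}

/* Return a request to the cache and credit its direction again. */
static void toy_put_request(struct toy_request_list *rl,
                            struct cached_request *rq)
{
    rl->free_count[rq->dir]++;
    rq->next_free = rl->free_head;
    rl->free_head = rq;
}
```

Separate counters mean that a burst of writes cannot consume every cached request: reads still find free instances once the write budget is used up.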

The next block in the structure contains a whole series of function pointers and represents the central request handling area. The parameters and return types of these functions are defined by typedefs (struct bio manages the transferred data and is discussed below).

typedef void (request_fn_proc) (struct request_queue *q);
typedef int (make_request_fn) (struct request_queue *q, struct bio *bio);
typedef int (prep_rq_fn) (struct request_queue *, struct request *);
typedef void (unplug_fn) (struct request_queue *);
typedef int (merge_bvec_fn) (struct request_queue *, struct bio *, struct bio_vec *);
typedef void (prepare_flush_fn) (struct request_queue *, struct request *);
typedef void (softirq_done_fn)(struct request *);

The kernel provides standard implementations of these functions that can be used by most device drivers. However, each driver must implement its own request_fn function because this represents the main link between the request queue management and the low-level functionality of each device — it is invoked when the kernel processes the current queue in order to perform pending read and write operations.
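The division of labour between generic defaults and the driver-specific request_fn can be modelled in user space. The sketch below reuses the typedef style shown above, but everything in it is hypothetical: mock_queue, the mock_* functions, and the use of plain sector numbers in place of struct bio and struct request are assumptions made to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

struct mock_queue;                  /* stands in for struct request_queue */

/* Function types in the style of the text, used through pointers. */
typedef void (mock_request_fn)(struct mock_queue *q);
typedef int (mock_make_request_fn)(struct mock_queue *q, int sector);

#define MOCK_MAX 8

struct mock_queue {
    int pending[MOCK_MAX];                  /* queued sector numbers */
    int nr_pending;
    mock_request_fn *request_fn;            /* must come from the driver */
    mock_make_request_fn *make_request_fn;  /* generic default, see below */
    long served;                            /* driver-side bookkeeping */
};

/* Generic default: append to the queue; 0 on success, -1 if full. */
static int mock_default_make_request(struct mock_queue *q, int sector)
{
    if (q->nr_pending == MOCK_MAX)
        return -1;
    q->pending[q->nr_pending++] = sector;
    return 0;
}

/* Generic setup: only request_fn has to be supplied by the caller. */
static void mock_init_queue(struct mock_queue *q, mock_request_fn *rfn)
{
    assert(rfn != NULL);
    q->nr_pending = 0;
    q->served = 0;
    q->request_fn = rfn;
    q->make_request_fn = mock_default_make_request;
}

/* "Driver" strategy routine: process everything that is pending. */
static void mock_driver_request_fn(struct mock_queue *q)
{
    for (int i = 0; i < q->nr_pending; i++)
        q->served += q->pending[i];     /* pretend to transfer the data */
    q->nr_pending = 0;
}
```

A caller would run mock_init_queue(&q, mock_driver_request_fn), submit work through q.make_request_fn, and later invoke q.request_fn(&q) to drain the queue. The generic code never needs to know what the driver does with a request.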

The first four functions are responsible for managing the request queue:

□ request_fn is the standard interface through which the kernel hands the requests on the queue to the driver. The function is automatically called by the kernel when the driver is supposed to perform some work, such as reading data from or writing data to the underlying device. In kernel nomenclature, this function is also referred to as the strategy routine.

□ make_request_fn creates new requests. The standard kernel implementation of this function adds the request to the request list as you will see below. When there are enough requests in the list, the driver-specific request_fn function is invoked to process them together.

The kernel allows device drivers to define their own make_request_fn functions because some devices (RAM disks, for example) do not make use of queues as data can be accessed in any sequence without impairing performance, or they might know better than the kernel how to deal with requests and would not benefit from the standard methods (volume managers, for example). However, this practice is rare.

□ prep_rq_fn is a request preparation function. It is not used by most drivers and is therefore set to NULL. If it is implemented, it generates the hardware commands needed to prepare a request before the actual request is sent. The auxiliary function blk_queue_prep_rq sets prep_rq_fn in a given queue.

[8] This term is slightly confusing because none of the algorithms used by the kernel implements the classic elevator method. Nevertheless, the basic objective is similar to that of elevators.

□ unplug_fn is used to unplug a block device. A plugged device does not execute requests but collects them and sends them when it is unplugged. Used skillfully, this method enhances block layer performance.

The remaining three functions are slightly more specialized:

□ merge_bvec_fn determines whether an existing request may be augmented with more data. Since request queues usually have fixed size limits for their requests, the kernel can use these to answer the question. However, more specialized drivers, especially those for compound devices, may have varying limits and therefore need to provide this function. The kernel provides the auxiliary routine blk_queue_merge_bvec to set merge_bvec_fn for a queue.

□ prepare_flush_fn is called to prepare flushing the queue, that is, before all pending requests are executed in one go. Devices can perform necessary cleanups in this method.

The auxiliary function blk_queue_ordered is available to equip a request queue with a specific method.

□ Completing requests, that is, ending all I/O, can be a time-consuming process for large requests. During the development of 2.6.16, the ability to complete requests asynchronously using SoftIRQs (see Chapter 14 for more details on this mechanism) was added. Asynchronous completion of a request can be demanded by calling blk_complete_request; in this case, softirq_done_fn is used as a callback to notify the driver that the completion is finished.
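The split between demanding a completion and actually performing it can be sketched in user space. This is a simplified model, not the kernel's implementation: sim_request, sim_complete_request, and sim_run_deferred are hypothetical names, a single global list replaces whatever bookkeeping the kernel uses, and sim_run_deferred is called by hand where the kernel would run a SoftIRQ handler.

```c
#include <assert.h>
#include <stddef.h>

struct sim_request;
typedef void (sim_done_fn)(struct sim_request *rq);

struct sim_request {
    int id;
    sim_done_fn *done;          /* plays the role of softirq_done_fn */
    struct sim_request *next;   /* link in the deferred-completion list */
};

static struct sim_request *sim_done_list;   /* completions not yet processed */
static int sim_finished;                    /* number of callbacks that ran */

/* Fast path (think: interrupt handler): only queue the request. */
static void sim_complete_request(struct sim_request *rq)
{
    rq->next = sim_done_list;
    sim_done_list = rq;
}

/* Deferred path (think: SoftIRQ handler): do the expensive part. */
static void sim_run_deferred(void)
{
    struct sim_request *rq = sim_done_list;

    sim_done_list = NULL;
    while (rq != NULL) {
        struct sim_request *next = rq->next;  /* callback may reuse rq */

        rq->done(rq);
        rq = next;
    }
}

/* Example driver callback: just count finished requests. */
static void sim_driver_done(struct sim_request *rq)
{
    (void)rq;
    sim_finished++;
}
```

Calling sim_complete_request is cheap and leaves sim_finished unchanged. The driver callback runs only once sim_run_deferred walks the list, which is the point of deferring the work out of the time-critical path.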

The kernel provides the standard function blk_init_queue_node to generate a standard request queue. The only management function that must be provided by the driver itself in this case is request_fn. Any other management issues are handled by standard functions. Drivers that implement request management this way are required to call blk_init_queue_node and attach the resulting request_queue instance to their gendisk before add_disk is called to activate the disk.

Request queues can be plugged when the system is overloaded. New requests then remain unprocessed until the queue is unplugged (this is called queue plugging). The unplug_ elements are used to implement a timer mechanism that automatically unplugs a queue after a certain period of time. unplug_fn is responsible for actual unplugging.
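How a threshold and a timeout can work together to unplug a queue may be easier to see in a small simulation. The sketch below is a user-space model under several assumptions: plug_queue and the plug_* functions are hypothetical, time is an explicit tick counter instead of jiffies and a timer_list, and the queue is plugged on the first incoming request purely to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>

struct plug_queue {
    bool plugged;
    int queued;                     /* requests collected so far */
    int dispatched;                 /* requests handed to the "driver" */
    int unplug_thresh;              /* unplug after this many requests */
    unsigned long unplug_delay;     /* ... or after this many ticks */
    unsigned long plugged_at;       /* tick at which the queue was plugged */
};

static void plug_init(struct plug_queue *q, int thresh, unsigned long delay)
{
    q->plugged = false;
    q->queued = 0;
    q->dispatched = 0;
    q->unplug_thresh = thresh;
    q->unplug_delay = delay;
    q->plugged_at = 0;
}

/* Unplug: hand everything collected so far to the driver at once. */
static void plug_unplug(struct plug_queue *q)
{
    q->plugged = false;
    q->dispatched += q->queued;
    q->queued = 0;
}

/* Add one request at tick `now`; the first request plugs the queue. */
static void plug_add(struct plug_queue *q, unsigned long now)
{
    if (!q->plugged) {
        q->plugged = true;
        q->plugged_at = now;
    }
    q->queued++;
    if (q->queued >= q->unplug_thresh)
        plug_unplug(q);             /* enough work has piled up */
}

/* Timer check: unplug once the delay has expired. */
static void plug_tick(struct plug_queue *q, unsigned long now)
{
    if (q->plugged && now - q->plugged_at >= q->unplug_delay)
        plug_unplug(q);
}
```

With a threshold of 4 and a delay of 3 ticks, four quick requests are dispatched together as soon as the fourth arrives, while a single late request is dispatched by plug_tick three ticks after it plugged the queue. The timeout bounds how long a lone request can wait.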

queue_flags is used to control the internal state of the queue with the help of flags.
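Storing state as individual bits of an unsigned long can be sketched as follows. The flag names and bit positions are hypothetical and chosen only for this example (the kernel defines its own QUEUE_FLAG_* constants), and the helpers are plain, non-atomic C, which is an assumption that would not hold for concurrent access.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit positions, for illustration only. */
enum { DEMO_FLAG_STOPPED = 0, DEMO_FLAG_PLUGGED = 1 };

/* Set one state bit without disturbing the others. */
static void demo_set_flag(unsigned long *flags, int bit)
{
    *flags |= (1UL << bit);
}

/* Clear one state bit without disturbing the others. */
static void demo_clear_flag(unsigned long *flags, int bit)
{
    *flags &= ~(1UL << bit);
}

/* Report whether a given state bit is currently set. */
static bool demo_test_flag(unsigned long flags, int bit)
{
    return ((flags >> bit) & 1UL) != 0;
}
```

Each state occupies its own bit, so a queue can be, for example, plugged and stopped at the same time, and either state can be changed independently of the other.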

The last part of the request_queue structure contains information that describes the managed block device in more detail and reflects the hardware-specific device settings. This information is always in the form of numeric values; the meaning of the individual elements is given in Table 6-2.

nr_requests indicates the maximum number of requests that may be associated with a queue; we come back to this topic in Chapter 17.
