Request Structure

The kernel provides its own data structure to describe a request to a block device.

struct request {
    struct list_head queuelist;
    struct list_head donelist;

    struct request_queue *q;

    unsigned int cmd_flags;
    enum rq_cmd_type_bits cmd_type;

    sector_t sector;                /* next sector to submit */
    sector_t hard_sector;           /* next sector to complete */
    unsigned long nr_sectors;       /* no. of sectors left to submit */
    unsigned long hard_nr_sectors;  /* no. of sectors left to complete */
    /* no. of sectors left to submit in the current segment */
    unsigned int current_nr_sectors;

    /* no. of sectors left to complete in the current segment */
    unsigned int hard_cur_sectors;

    struct bio *bio;
    struct bio *biotail;

    void *elevator_private;
    void *elevator_private2;

    struct gendisk *rq_disk;
    unsigned long start_time;

    unsigned short nr_phys_segments;
    unsigned short nr_hw_segments;

    unsigned int cmd_len;
    unsigned char cmd[BLK_MAX_CDB];
    ...
};

The very nature of a request is to be kept on a request queue. Such queues are implemented using doubly linked lists, and queuelist provides the required list element [10]. q points back to the request queue to which the request belongs, if any.

Once a request is completed, that is, when all required I/O operations have been performed, it can be queued on a completed list, and donelist provides the necessary list element.
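The way a list element embedded in the request ties it into a queue can be sketched in user space. This is only an illustration, not the kernel's list implementation: struct list_head is re-declared locally, and every demo_ name is hypothetical.

```c
#include <stddef.h>

/* Minimal user-space stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Simplified request: only the list element and a start sector. */
struct demo_request {
    struct list_head queuelist;
    unsigned long sector;
};

/* An empty list is a head that points to itself in both directions. */
static void demo_list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry directly before head, that is, at the tail of the queue. */
static void demo_list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Unlink entry from the list it is currently on. */
static void demo_list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = entry->prev = entry;
}

/* Recover the enclosing request from its embedded list element. */
static struct demo_request *demo_request_of(struct list_head *elem)
{
    return (struct demo_request *)
        ((char *)elem - offsetof(struct demo_request, queuelist));
}

/* Return the first queued request, or NULL if the queue is empty. */
static struct demo_request *demo_queue_first(struct list_head *head)
{
    return head->next == head ? NULL : demo_request_of(head->next);
}
```

Because the list element lives inside the request, queueing needs no extra allocation, and a second element such as donelist lets the same request sit on a different list with exactly the same helpers.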

The structure includes three elements to indicate the exact position of the data to be transferred.

□ sector specifies the start sector at which data transfer begins, that is, the next sector to submit.

□ current_nr_sectors indicates the number of sectors still to be submitted in the current segment.

□ nr_sectors specifies the number of sectors still to be submitted for the request as a whole.

hard_sector, hard_cur_sectors, and hard_nr_sectors have the same meaning but relate to the actual hardware and not to a virtual device. Usually, both variable collections have the same values, but differences may occur when RAID or the Logical Volume Manager is used because these combine several physical devices into a single virtual device.
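The comments in the structure definition suggest one way to read the two sets of fields: the plain ones as a submission cursor and the hard_ ones as a completion cursor. The following toy model illustrates only that reading. It is not kernel code, the demo_ names are hypothetical, and the rule that only already submitted sectors may complete is an assumption of the sketch.

```c
typedef unsigned long long demo_sector_t;   /* stand-in for sector_t */

/* Only the position fields of the request are modeled here. */
struct demo_request {
    demo_sector_t sector;           /* next sector to submit */
    demo_sector_t hard_sector;      /* next sector to complete */
    unsigned long nr_sectors;       /* sectors left to submit */
    unsigned long hard_nr_sectors;  /* sectors left to complete */
};

/* Both cursors start at the same position with the same amount of work. */
static void demo_request_init(struct demo_request *rq,
                              demo_sector_t start, unsigned long count)
{
    rq->sector = rq->hard_sector = start;
    rq->nr_sectors = rq->hard_nr_sectors = count;
}

/* Hand up to n sectors to the device; returns how many were submitted. */
static unsigned long demo_submit(struct demo_request *rq, unsigned long n)
{
    if (n > rq->nr_sectors)
        n = rq->nr_sectors;
    rq->sector += n;
    rq->nr_sectors -= n;
    return n;
}

/*
 * Account for up to n finished sectors. Assumption of this sketch: only
 * sectors that were already submitted can complete. Returns 1 once the
 * whole request is done, 0 otherwise.
 */
static int demo_complete(struct demo_request *rq, unsigned long n)
{
    unsigned long in_flight = rq->hard_nr_sectors - rq->nr_sectors;

    if (n > in_flight)
        n = in_flight;
    rq->hard_sector += n;
    rq->hard_nr_sectors -= n;
    return rq->hard_nr_sectors == 0;
}
```

In this model the two sets of values coincide before anything is submitted and again once everything submitted has completed; in between, the hard_ fields lag behind.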

When scatter-gather operations are used, nr_phys_segments and nr_hw_segments specify, respectively, the number of segments in a request and the number of segments used after possible re-sorting by an I/O MMU.

[10] This is only necessary for asynchronous request completion. Normally the list is not required.

Like most kernel data types, requests are equipped with pointers to private data. In this case, not one but two elements (elevator_private and elevator_private2) are available. They can be set by the I/O scheduler (traditionally called the elevator) that currently processes the request.

BIOs are used to transfer data between the system and a device. Their definition is examined below.

□ bio identifies the current BIO instance whose transfer has not yet been completed.

□ biotail points to the last BIO of the request, since a request may carry a list of BIOs.
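Why a request keeps both a head and a tail pointer can be seen in a small user-space sketch. The structures below are hypothetical stand-ins that keep only the bi_next link and the bi_size field; they are not the kernel definitions.

```c
#include <stddef.h>

/* Stand-in for struct bio with just the fields needed here. */
struct demo_bio {
    struct demo_bio *bi_next;   /* next BIO of the same request */
    unsigned int bi_size;       /* bytes covered by this BIO */
};

/* Stand-in for the two BIO pointers of struct request. */
struct demo_request {
    struct demo_bio *bio;       /* first BIO not yet completed */
    struct demo_bio *biotail;   /* last BIO in the list */
};

/* Append a BIO in constant time: biotail avoids walking the list. */
static void demo_request_add_bio(struct demo_request *rq, struct demo_bio *bio)
{
    bio->bi_next = NULL;
    if (rq->bio == NULL)
        rq->bio = bio;                  /* first BIO of the request */
    else
        rq->biotail->bi_next = bio;     /* link behind the old tail */
    rq->biotail = bio;
}

/* Sum the sizes of all BIOs still attached to the request. */
static unsigned int demo_request_bytes(const struct demo_request *rq)
{
    unsigned int total = 0;

    for (const struct demo_bio *b = rq->bio; b != NULL; b = b->bi_next)
        total += b->bi_size;
    return total;
}
```

Appending at the tail keeps the BIOs in submission order, which matters when several BIOs for adjacent sectors end up in one request.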

A request can also be used to transmit control commands to a device (more formally, it can be used as a packet command carrier). The desired commands are listed in the cmd array. We have omitted several entries related to bookkeeping required in this case.

The flags associated with a request are split into two parts. cmd_flags contains a set of generic flags for the request, and cmd_type denotes the type of request. The following request types are possible:

enum rq_cmd_type_bits {
    REQ_TYPE_FS = 1,        /* fs request */
    ...
    ...,                    /* shutdown request */
    ...,                    /* driver defined type */
    ...,                    /* generic block layer message */
    ...
};

The most common request type is REQ_TYPE_FS: It is used for requests that actually transfer data to and from a block device. The remaining types allow various kinds of commands to be sent to a device, as documented in the source comments.

Besides the type, several additional flags characterize the request:

enum rq_flag_bits {
    __REQ_RW,               /* not set, read. set, write */
    ...,                    /* no low level driver retries */
    ...,                    /* elevator knows about this request */
    ...,                    /* may not be passed by ioscheduler */
    ...,                    /* may not be passed by drive either */
    ...,                    /* drive already may have started this one */
    ...,                    /* elevator private data attached */
    ...,                    /* set for "ide_preempt" requests */
    __REQ_ORDERED_COLOR,    /* is before or after barrier */
    __REQ_RW_SYNC,          /* request is sync (O_DIRECT) */
    __REQ_ALLOCED,          /* request came from our alloc pool */
    ...
};
__REQ_RW is especially important because it indicates the direction of data transfer. If the bit is set, data are written; if not, data are read. The remaining bits are used to send special device-specific commands, to set up barriers [11], or to transfer control codes. Their meaning is concisely described by the kernel commentary, so I need not add anything further.
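The entries of rq_flag_bits are bit numbers, not masks; a mask is obtained by shifting 1 by the bit number before it is tested against cmd_flags. The sketch below shows that pattern for the transfer direction. All DEMO_ and demo_ names are hypothetical stand-ins, not the kernel's identifiers.

```c
/* Hypothetical bit numbers, in the style of rq_flag_bits. */
enum demo_flag_bits {
    DEMO_BIT_RW,        /* not set: read, set: write */
    DEMO_BIT_SYNC,      /* request is synchronous */
    DEMO_BIT_COUNT      /* number of bits in use */
};

/* Masks are derived from the bit numbers. */
#define DEMO_FLAG_RW    (1u << DEMO_BIT_RW)
#define DEMO_FLAG_SYNC  (1u << DEMO_BIT_SYNC)

enum demo_dir { DEMO_READ = 0, DEMO_WRITE = 1 };

/* Only the generic flag word of the request is modeled. */
struct demo_request {
    unsigned int cmd_flags;
};

/* Direction of the transfer, decided by the read/write bit alone. */
static enum demo_dir demo_data_dir(const struct demo_request *rq)
{
    return (rq->cmd_flags & DEMO_FLAG_RW) ? DEMO_WRITE : DEMO_READ;
}

/* Nonzero if the request was marked synchronous. */
static int demo_is_sync(const struct demo_request *rq)
{
    return (rq->cmd_flags & DEMO_FLAG_SYNC) != 0;
}
```

Keeping bit numbers in the enumeration and deriving masks from them means a new flag only needs one new enumerator, and the final enumerator gives the number of bits in use.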

6.5.6 BIOs

Before giving an exact definition of BIOs, it is advisable to discuss their underlying principles as illustrated in Figure 6-15.

Figure 6-15: Structure of BIOs.

The central management structure (bio) is associated with a vector whose individual entries each point to a memory page (caution: an entry does not hold the address of the page in memory but points to the page instance that represents the page). These pages are used to receive data from and send data to the device.

It is explicitly possible to use highmem pages that are not directly mapped in the kernel and cannot therefore be addressed via virtual kernel addresses. This is useful when data are copied directly to userspace applications that are able to access the highmem pages using their page tables.

The memory pages can but need not be organized contiguously; this facilitates the implementation of scatter-gather operations.

BIOs have the following (simplified) structure in the kernel sources: <bio.h>

struct bio {
    sector_t bi_sector;
    struct bio *bi_next;            /* request queue link */
    struct block_device *bi_bdev;

    unsigned short bi_vcnt;         /* how many bio_vec's */
    unsigned short bi_idx;          /* current index into bvl_vec */

    unsigned short bi_phys_segments;
    unsigned short bi_hw_segments;

    unsigned int bi_size;

    struct bio_vec *bi_io_vec;
    bio_end_io_t *bi_end_io;
    void *bi_private;
    bio_destructor_t *bi_destructor;
};

[11] If a device comes across a barrier in a request list, all still pending requests must be fully processed before any other actions can be performed.

□ bi_sector specifies the sector at which transfer starts.

□ bi_next combines several BIOs in a singly linked list associated with a request.

□ bi_bdev is a pointer to the block device data structure of the device to which the request belongs.

□ bi_phys_segments and bi_hw_segments specify the number of segments in a transfer before and after remapping by the I/O MMU.

□ bi_size indicates the total size of the request in bytes.

□ bi_io_vec is a pointer to the I/O vectors, and bi_vcnt specifies the number of entries in the array. bi_idx denotes which array entry is currently being processed.

The structure of the individual array elements is as follows:

struct bio_vec {
    struct page *bv_page;
    unsigned int bv_len;
    unsigned int bv_offset;
};

bv_page points to the page instance of the page used for data transfer. bv_offset indicates the offset within the page; typically this value is 0 because page boundaries are normally used as boundaries for I/O operations.

bv_len specifies the number of bytes used for the data if the whole page is not filled.

□ bi_private is not modified by the generic BIO code and can be used for driver-specific information.

□ bi_destructor points to a destructor function invoked before a bio instance is removed from memory.

□ bi_end_io must be invoked by the device driver when hardware transfer is completed. This gives the block layer the opportunity to do clean-up work or wake sleeping processes that are waiting for the request to end.
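How bi_idx, bi_vcnt, the I/O vector, and the completion callback interact can be sketched with simplified user-space stand-ins. None of this is the kernel's code: the demo_ types are hypothetical, the callback signature is an assumption of the sketch, and the "driver" merely pretends that each segment has been transferred.

```c
#include <stddef.h>

struct demo_page { int id; };           /* stand-in for struct page */

struct demo_bio_vec {
    struct demo_page *bv_page;          /* page instance, not an address */
    unsigned int bv_len;                /* bytes used in this page */
    unsigned int bv_offset;             /* start offset within the page */
};

struct demo_bio;
typedef void (demo_end_io_t)(struct demo_bio *bio, int error);

struct demo_bio {
    unsigned short bi_vcnt;             /* number of vector entries */
    unsigned short bi_idx;              /* entry currently processed */
    unsigned int bi_size;               /* bytes still to transfer */
    struct demo_bio_vec *bi_io_vec;     /* the vector itself */
    demo_end_io_t *bi_end_io;           /* completion callback */
    void *bi_private;                   /* owner-specific data */
};

/* Bytes covered by the entries from bi_idx up to the end of the vector. */
static unsigned int demo_bio_remaining(const struct demo_bio *bio)
{
    unsigned int bytes = 0;

    for (unsigned short i = bio->bi_idx; i < bio->bi_vcnt; i++)
        bytes += bio->bi_io_vec[i].bv_len;
    return bytes;
}

/* Pretend to transfer every remaining segment, then signal completion. */
static void demo_driver_transfer(struct demo_bio *bio)
{
    while (bio->bi_idx < bio->bi_vcnt) {
        bio->bi_size -= bio->bi_io_vec[bio->bi_idx].bv_len;
        bio->bi_idx++;
    }
    if (bio->bi_end_io)
        bio->bi_end_io(bio, 0);         /* 0 means no error in this sketch */
}

/* Example callback: report the outcome through bi_private. */
static void demo_mark_done(struct demo_bio *bio, int error)
{
    int *done = bio->bi_private;

    *done = (error == 0) ? 1 : -1;
}
```

The callback receives everything it needs through the bio itself, which is why bi_private is left untouched by the generic code: whoever set up the BIO decides what it points to.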
