Request descriptors

Each block device request is represented by a request descriptor, which is stored in the request data structure illustrated in Table 13-7. The direction of the data transfer is stored in the cmd field; it is either READ (from block device to RAM) or WRITE (from RAM to block device). The rq_status field specifies the status of the request; for most block devices, it is simply set either to RQ_INACTIVE (for a request descriptor not in use) or to RQ_ACTIVE (for a valid request that is to be serviced or is already being serviced by the low-level driver).

Table 13-7. The fields of a request descriptor

| Type | Field | Description |
| --- | --- | --- |
| struct list_head | queue | Pointers for request queue list |
| int | elevator_sequence | The "age" of the request for the elevator algorithm |
| volatile int | rq_status | Request status |
| kdev_t | rq_dev | Device identifier |
| int | cmd | Requested operation |
| int | errors | Success or failure code |
| unsigned long | sector | First sector number on the (virtual) block device |
| unsigned long | nr_sectors | Number of sectors of the request on the (virtual) block device |
| unsigned long | hard_sector | First sector number of the (real) block device |
| unsigned long | hard_nr_sectors | Number of sectors of the request on the (real) block device |
| unsigned int | nr_segments | Number of segments in the request on the (virtual) block device |
| unsigned int | nr_hw_segments | Number of segments in the request on the (real) block device |
| unsigned long | current_nr_sectors | Number of sectors in the block currently transferred |
| void * | special | Used only by drivers of SCSI devices |
| char * | buffer | Memory area for I/O transfer |
| struct completion * | waiting | Wait queue associated with request |
| struct buffer_head * | bh | First buffer descriptor of the request |
| struct buffer_head * | bhtail | Last buffer descriptor of the request |
| request_queue_t * | q | Pointer to request queue descriptor |

A request may encompass many adjacent blocks on the same device. The rq_dev field identifies the block device, while the sector field specifies the number of the first sector of the first block in the request. The nr_sectors field specifies the number of sectors in the request yet to be transferred. The current_nr_sectors field stores the number of sectors in the first block of the request. As we'll see later in Section 13.4.7, the sector, nr_sectors, and current_nr_sectors fields can be dynamically updated while the request is being serviced.
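The way these three fields move together can be illustrated with a small user-space sketch. It is not the kernel's code: mock_bh and mock_request are hypothetical, stripped-down stand-ins, the b_size field and the 512-byte sector size are assumptions, and complete_first_block( ) is an illustrative helper that models only the bookkeeping after the first block of the request has been transferred.

```c
#include <assert.h>
#include <stddef.h>

#define SECTOR_SIZE 512u   /* assumed sector size for this sketch */

/* Hypothetical stand-in for a buffer head: block size and list link only. */
struct mock_bh {
    size_t b_size;               /* size of the block in bytes (assumed field) */
    struct mock_bh *b_reqnext;   /* next buffer head in the request */
};

/* Hypothetical stand-in for a request descriptor. */
struct mock_request {
    unsigned long sector;              /* first sector still to transfer */
    unsigned long nr_sectors;          /* sectors still to transfer */
    unsigned long current_nr_sectors;  /* sectors in the first block */
    struct mock_bh *bh;                /* first buffer head of the request */
};

/*
 * Account for the completion of the first block of the request.
 * Returns 1 if more blocks remain, 0 if the request is finished.
 */
static int complete_first_block(struct mock_request *rq)
{
    assert(rq != NULL && rq->bh != NULL);

    unsigned long done = rq->current_nr_sectors;

    rq->sector += done;           /* request now starts at the next block */
    rq->nr_sectors -= done;       /* fewer sectors left to transfer */
    rq->bh = rq->bh->b_reqnext;   /* drop the completed buffer head */

    if (rq->bh == NULL) {
        rq->current_nr_sectors = 0;
        return 0;
    }
    rq->current_nr_sectors = rq->bh->b_size / SECTOR_SIZE;
    return 1;
}
```

A driver-like loop would call the helper once per completed block until it returns 0.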

The nr_segments field stores the number of segments in the request. Although all blocks in the request must be adjacent on the block device, their corresponding buffers are not necessarily contiguous in RAM. A segment is a sequence of adjacent blocks in the request whose corresponding buffers are also contiguous in memory. Of course, a low-level device driver could program the DMA controller so as to transfer all blocks in the same segment in a single operation.
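The definition of a segment can be made concrete with a short user-space sketch. It assumes a hypothetical mock_bh structure that keeps only a buffer pointer, a block size, and the b_reqnext link, and it compares plain pointers; count_segments( ) is an illustrative helper, not a kernel function, and it ignores any driver-specific limits on segment size or count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for the kernel's buffer head. */
struct mock_bh {
    char *b_data;                /* start of the block's buffer in memory */
    size_t b_size;               /* block size in bytes (assumed field) */
    struct mock_bh *b_reqnext;   /* next buffer head in the same request */
};

/*
 * Count the segments of a request: a new segment starts whenever a buffer
 * does not begin exactly where the previous buffer ended.
 */
static unsigned int count_segments(const struct mock_bh *bh)
{
    unsigned int segments = 0;
    const char *expected = NULL;   /* where a contiguous buffer would start */

    for (; bh != NULL; bh = bh->b_reqnext) {
        assert(bh->b_size > 0);
        if (segments == 0 || bh->b_data != expected)
            segments++;
        expected = bh->b_data + bh->b_size;
    }
    return segments;
}
```

For three blocks of which only the first two have back-to-back buffers, the helper reports two segments.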

The hard_sector, hard_nr_sectors, and nr_hw_segments fields usually have the same values as the sector, nr_sectors, and nr_segments fields, respectively. They differ, however, when the request refers to a driver that handles several physical block devices at once. A typical example of such a driver is the Logical Volume Manager (LVM), which is able to handle several disks and disk partitions as a single virtual disk partition. In this case, the two series of fields differ because the former refers to the real physical block device, while the latter refers to the virtual device. Another example is software RAID, a driver that duplicates data on several disks to enhance reliability.
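To see why a sector number on a virtual device can differ from the sector number on the real device, consider the following sketch. It assumes the simplest possible layout, in which the real devices are concatenated one after another; this is only an illustration and is not claimed to be the mapping LVM or the RAID driver actually uses. The mock_extent structure and map_sector( ) are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one real device inside a virtual device. */
struct mock_extent {
    int real_dev;               /* identifier of the real block device */
    unsigned long nr_sectors;   /* sectors contributed by that device */
};

/*
 * Translate a sector number on the virtual device into a device and a
 * sector number on a real device, assuming the real devices are simply
 * concatenated. Returns 0 on success, -1 if the sector is out of range.
 */
static int map_sector(const struct mock_extent *ext, size_t n,
                      unsigned long sector,
                      int *real_dev, unsigned long *hard_sector)
{
    assert(ext != NULL && real_dev != NULL && hard_sector != NULL);

    for (size_t i = 0; i < n; i++) {
        if (sector < ext[i].nr_sectors) {
            *real_dev = ext[i].real_dev;
            *hard_sector = sector;
            return 0;
        }
        sector -= ext[i].nr_sectors;   /* skip past this real device */
    }
    return -1;
}
```

With two real devices of 1000 and 500 sectors, virtual sector 1200 falls on the second device at sector 200, so the virtual and real sector numbers differ.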

All buffer heads of the blocks in the request are collected in a singly linked list. The b_reqnext field of each buffer head points to the next element in the list, while the bh and bhtail fields of the request descriptor point, respectively, to the first element and the last element in the list.
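The list handling described above can be sketched in user-space C as follows. The mock_bh and mock_request structures are hypothetical reductions that keep only the fields named in the text (plus an illustrative tag), and request_add_tail( ) and request_length( ) are illustrative helpers rather than kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer head. */
struct mock_bh {
    int tag;                     /* illustrative identifier, not a kernel field */
    struct mock_bh *b_reqnext;   /* next buffer head in the request */
};

/* Hypothetical stand-in for a request descriptor. */
struct mock_request {
    struct mock_bh *bh;       /* first buffer head of the request */
    struct mock_bh *bhtail;   /* last buffer head of the request */
};

/* Append a buffer head at the end of the request's singly linked list. */
static void request_add_tail(struct mock_request *rq, struct mock_bh *bh)
{
    assert(rq != NULL && bh != NULL);

    bh->b_reqnext = NULL;
    if (rq->bhtail != NULL)
        rq->bhtail->b_reqnext = bh;   /* link after the current tail */
    else
        rq->bh = bh;                  /* list was empty: bh is also the head */
    rq->bhtail = bh;
}

/* Walk the list from bh through b_reqnext and count its elements. */
static unsigned int request_length(const struct mock_request *rq)
{
    unsigned int n = 0;

    for (const struct mock_bh *bh = rq->bh; bh != NULL; bh = bh->b_reqnext)
        n++;
    return n;
}
```

Keeping a tail pointer makes appending a block at the end of the request a constant-time operation, which is presumably why the descriptor stores both ends of the list.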

The buffer field of the request descriptor points to the memory area used for the actual data transfer. If the request involves a single block, buffer is just a copy of the b_data field of the buffer head. However, if the request encompasses several blocks whose buffers are not consecutive in memory, the buffers are linked through the b_reqnext fields of their buffer heads as shown in Figure 13-3. On a read, the low-level driver could choose to allocate a large memory area referred to by buffer, read all sectors of the request at once, and then copy the data into the various buffers. Similarly, for a write, the low-level device driver could copy the data from many nonconsecutive buffers into a single memory area referred to by buffer and then perform the whole data transfer at once.
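The write case can be sketched as a simple gather copy. This is a user-space illustration under stated assumptions: mock_bh is a hypothetical stand-in for the buffer head, its b_size field is assumed, and gather_buffers( ) is an illustrative helper, not something a real driver is known to provide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a buffer head. */
struct mock_bh {
    char *b_data;                /* the block's buffer */
    size_t b_size;               /* block size in bytes (assumed field) */
    struct mock_bh *b_reqnext;   /* next buffer head in the request */
};

/*
 * Copy the data of all buffers of a request, in list order, into one
 * contiguous memory area. Returns the number of bytes copied, or 0 if
 * the destination area is too small.
 */
static size_t gather_buffers(const struct mock_bh *bh, char *dst,
                             size_t dst_size)
{
    size_t off = 0;

    assert(dst != NULL);
    for (; bh != NULL; bh = bh->b_reqnext) {
        if (bh->b_size > dst_size - off)
            return 0;                       /* would overflow dst */
        memcpy(dst + off, bh->b_data, bh->b_size);
        off += bh->b_size;
    }
    return off;
}
```

The read case would be the mirror image: transfer into the single area first, then copy each slice back into the buffer of the corresponding buffer head.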

Figure 13-3. A request descriptor and its buffers and sectors

Figure 13-3 illustrates a request descriptor encompassing three blocks. The buffers of two of them are consecutive in RAM, while the third buffer is by itself. The corresponding buffer heads identify the logical blocks on the block device; the blocks must necessarily be adjacent. Each logical block includes two sectors. The sector field of the request descriptor points to the first sector of the first block on disk, and the b_reqnext field of each buffer head points to the next buffer head.

During the initialization phase, each block device driver usually allocates a fixed number of request descriptors to handle its forthcoming I/O requests. The blk_init_queue( ) function sets up two equally sized lists of free request descriptors: one for read operations and another for write operations. The size of each list is set to 64 if the RAM size is greater than 32 MB, or to 32 if the RAM size is less than or equal to 32 MB. The status of all request descriptors is initially set to RQ_INACTIVE.
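The sizing rule and the two free lists can be sketched in user-space C as follows. Everything here is illustrative: mock_request and mock_queue are hypothetical structures, the next_free link and the numeric value chosen for RQ_INACTIVE are assumptions of this sketch, and mock_init_queue( ) only mimics the effect the text attributes to blk_init_queue( ).

```c
#include <assert.h>
#include <stddef.h>

#define RQ_INACTIVE (-1)   /* numeric value is an assumption of this sketch */

enum { MOCK_READ = 0, MOCK_WRITE = 1 };

/* Hypothetical stand-in for a request descriptor. */
struct mock_request {
    int rq_status;
    struct mock_request *next_free;   /* hypothetical free-list link */
};

/* Hypothetical stand-in for the queue's two free lists. */
struct mock_queue {
    struct mock_request *free_list[2];   /* one list per direction */
    unsigned int free_count[2];
};

/* Size of each free list, as described in the text. */
static unsigned int free_list_size(unsigned long ram_mb)
{
    return ram_mb > 32 ? 64 : 32;
}

/*
 * Split a pool of 2 * per_list descriptors into a read list and a
 * write list, marking every descriptor as not in use.
 */
static void mock_init_queue(struct mock_queue *q, struct mock_request *pool,
                            unsigned int per_list)
{
    assert(q != NULL && pool != NULL);

    for (unsigned int rw = MOCK_READ; rw <= MOCK_WRITE; rw++) {
        q->free_list[rw] = NULL;
        q->free_count[rw] = 0;
        for (unsigned int i = 0; i < per_list; i++) {
            struct mock_request *rq = &pool[rw * per_list + i];

            rq->rq_status = RQ_INACTIVE;
            rq->next_free = q->free_list[rw];   /* push on the list head */
            q->free_list[rw] = rq;
            q->free_count[rw]++;
        }
    }
}
```

Keeping separate lists per direction means that a burst of writes cannot consume the descriptors set aside for reads, and vice versa.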

Under very heavy loads and high disk activity, the fixed number of request descriptors may become a bottleneck. A dearth of free descriptors may force processes to wait until an ongoing data transfer terminates. Thus, a wait queue is used to queue processes waiting for a free request element. The get_request_wait( ) function tries to get a free request descriptor and puts the current process to sleep in the wait queue if none is found; the get_request( ) function is similar but simply returns NULL if no free request descriptor is available.

A threshold value known as batch_requests (set to 32 or to 16, depending on the RAM size) is used to cut down kernel overhead; when releasing a request descriptor, processes waiting for free request descriptors are not woken up unless there are at least batch_requests free descriptors. Conversely, when looking for a free request descriptor, get_request_wait( ) relinquishes the CPU if there are fewer than batch_requests free descriptors.
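The effect of the threshold can be sketched with a simple counter model. This is not the kernel's implementation: mock_free_list and the three helpers are hypothetical, no real sleeping or waking takes place, and the functions only report the decisions that the text attributes to the release path, to get_request( ), and to get_request_wait( ).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one free list and its wake-up threshold. */
struct mock_free_list {
    unsigned int count;            /* free descriptors currently in the list */
    unsigned int batch_requests;   /* wake-up threshold (32 or 16 in the text) */
};

/*
 * Return one descriptor to the list. The result tells the caller whether
 * processes sleeping in the wait queue should be woken up.
 */
static bool release_descriptor(struct mock_free_list *fl)
{
    assert(fl != NULL);
    fl->count++;
    return fl->count >= fl->batch_requests;
}

/* Non-blocking allocation: fails instead of waiting, like get_request( ). */
static bool try_get_descriptor(struct mock_free_list *fl)
{
    assert(fl != NULL);
    if (fl->count == 0)
        return false;
    fl->count--;
    return true;
}

/* Decision made by a blocking allocator before taking a descriptor. */
static bool should_yield_cpu(const struct mock_free_list *fl)
{
    assert(fl != NULL);
    return fl->count < fl->batch_requests;
}
```

Waking sleepers only once a whole batch of descriptors is free avoids a wake-up, and the associated context switch, for every single released descriptor.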
