Figure 910 Code flow diagram for ext2getblocks creation of a block

The path array passed as a function argument is structured in accordance with the familiar method because it doesn't make any difference whether a block is available in a file or not. Only the position within the file and the block size of the filesystem must be known in order to set up the path.

The difference as compared with the ext2_get_blocks version above does not become apparent until the ext2_get_branch function is invoked. Whereas previously a null pointer was returned to indicate a successful search, the address of the last indirection block is now returned as the starting point for extending the file if the desired data block is outside the previously valid range.

To understand the situation where a new block is created, it is necessary to take a closer look at how ext2_get_branch works because a new data structure is introduced:

fs/ext2/inode.c typedef struct

_le32

_le32

struct } Indirect;

buffer_head *bh;

While key indicates the block number, p is a pointer to the address in memory where the key resides. A buffer head (bh) is used to keep the block data in memory.

In the case of a fully filled Indirect instance, the information on the block number is stored redundantly in key and in *p. This again becomes clear when we examine the auxiliary function to fill Indirect:

fs/ext2/inode.c static inline void add_chain(Indirect *p, struct buffer_head *bh, u32 *v) {

If, when traversing the block path, ext2_get_branch detects that there is no pointer to the next indirection level (or to a data block if direct allocation is used), an incomplete Indirect instance is returned. Although the p element points to the position where the number of the next indirection or data block should be located in the indirection block, the number itself is 0 because the block has not yet been allocated.

Figure 9-11 illustrates this fact graphically. The fourth data block to be addressed by the simple indirection block is not present but should be used. The returned Indirect instance contains a pointer to the position in the indirection block where the block number must be inserted (namely, 1,003 because the indirection block starts at the address 1,000 and the fourth element is of interest). However, the value of the key is 0 because the associated data block has not yet been allocated.

Direct pointers

0x1234

0x7f3c

0x9831

0x0

0x0

0x0

0x1000

Figure 9-11: Return of ext2_get_branch.

Figure 9-11: Return of ext2_get_branch.

Now that the position in the indirection chain in which no further blocks are allocated is clear, the Second Extended Filesystem must find out where there is space in the partition to add one or more new blocks to the file. This is not a trivial task because ideally the blocks of a file should be contiguous or, if this is not feasible, at least as close together as possible. This ensures that fragmentation is minimized and results not only in better utilization of hard disk capacity but in faster read/write operations because read/write head travel is reduced.

Several steps are involved in searching for a new block. A search is first made for a goal block; from the perspective of the filesystem, this block is the ideal candidate for allocation. The search for a global block is based on general principles only and does not take account of the actual situation in the filesystem. The ext2_find_goal function is invoked to search for the best new block. When searching is performed, it is necessary to distinguish between two situations:

□ When the block to be allocated logically immediately follows the block last allocated in the file (in other words, data are to be written contiguously), the filesystem tries to write to the next physical block on the hard disk. This is obvious — if data are stored sequentially in a file, they should if at all possible be stored contiguously on the hard disk.

□ If the position of the new logical block is not immediately after the last allocated block, the ext2_find_near function is invoked to find the most suitable new block. Depending on the specific situation, it finds a block close to the indirection block or at least in the same cylinder group. I won't bother with the details here.

Once it has these two pieces of information (the position in the indirection chain in which there are no further data blocks, and the desired address of the new block), the kernel sets about reserving a block on the hard disk. Of course, there is no guarantee that the desired address is really free, so the kernel may have to be satisfied with poorer alternatives, which unavoidably entails data fragmentation.

Not only might new data blocks be required — it can also be the case that some blocks are required to hold indirection information. ext2_blks_to_allocate computes the total number of new blocks, that is, the sum of data and (single, double, and triple) indirection blocks. The allocation proper is then done by ext2_alloc_branch. The parameters passed to this function include the desired address of the new block, information on the last incomplete part of the indirection chain, and the number of indirection levels still missing up to the new data block. Expressed differently, the function returns a linked list of indirection and data blocks that can be added to the existing indirection list of the filesystem. Last but not least, ext2_slice_branch adds the resulting hierarchy (or, in the simplest case, the new data block) to the existing network and performs several relatively unimportant updates on Ext2 data structures.

Block Allocation ext2_alloc_branch is responsible to allocate the required blocks for a given new path and set up the chain that connects them. At a first glance, this seems an easy task, as the code flow diagram in Figure 9-12 might suggest.

The function calls ext2_alloc_blocks, which, in turn relies on ext2_new_blocks to reserve the required new blocks. Since the function always allocates consecutive blocks, one single call might not be sufficient to obtain the total number of required blocks. If the filesystem becomes fragmented, it can be that no such consecutive region is available. However, this is no problem: ext2_new_block is called multiple times until at least the number of blocks that is required for the indirection mechanism has been allocated. The surplus blocks can be used as data blocks.

Finally, ext2_alloc_branch need only set up the Indirect instances for the indirection blocks, and it is done.

Obviously, the hard work is hidden in ext2_new_blocks. The code flow diagram in Figure 9-13 proves that this is really the case!

ext2_alloc_branch ext2_alloc_blocks

Set up indirection structures ext2_alloc_blocks

Continue reading here: Figure 912 Code flow diagram for ext2 alloc branch

Was this article helpful?

0 0