Physical Structure

2 Another filesystem of the time that has now fallen into oblivion (and for which kernel support has long been withdrawn) is the Xia filesystem, an enhancement of the Minix filesystem. The author nevertheless still has fond memories of using this filesystem for one of his first Linux installations, a choice that did not prove to be very visionary ...

Various structures, defined as C data types in the kernel, must be created to hold filesystem data: file contents, the representation of the directory hierarchy, and associated administrative data such as access permissions or user and group affiliations, as well as metadata to manage filesystem-internal information. This is necessary so that data can be read from block devices for analysis. Persistent copies of these structures obviously need to reside on the hard disk so that data are not lost between working sessions and are still available the next time the kernel is activated. Because hard disk and RAM requirements differ, there are usually two versions of a data structure: one for persistent storage on disk, the other for working in memory.
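The split between an on-disk and an in-memory version of a structure can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the names `demo_disk_inode`, `demo_mem_inode`, and `demo_inode_from_disk`, the selection of fields, and the little-endian byte layout are assumptions, not the kernel's actual definitions. The on-disk record holds only fixed-width persistent fields, while the in-memory version adds state that exists only at run time.

```c
#include <stdint.h>

/* Hypothetical on-disk record: fixed-width fields stored as raw bytes
 * in little-endian order, so the layout does not depend on the CPU. */
struct demo_disk_inode {
    uint8_t mode[2];   /* access permissions and file type */
    uint8_t uid[2];    /* owner */
    uint8_t size[4];   /* file size */
};

/* Hypothetical in-memory counterpart: native integers plus fields
 * that are only meaningful while the structure is in RAM. */
struct demo_mem_inode {
    uint16_t mode;
    uint16_t uid;
    uint32_t size;
    int      dirty;     /* set when the copy in RAM differs from disk */
    unsigned refcount;  /* number of current users; never persisted */
};

/* Decode a 16-bit little-endian value from raw bytes. */
static uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode a 32-bit little-endian value from raw bytes. */
static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fill the in-memory version from the persistent one. */
static void demo_inode_from_disk(struct demo_mem_inode *m,
                                 const struct demo_disk_inode *d)
{
    m->mode = get_le16(d->mode);
    m->uid = get_le16(d->uid);
    m->size = get_le32(d->size);
    m->dirty = 0;      /* freshly read, so identical to the disk copy */
    m->refcount = 1;   /* the caller holds the first reference */
}
```

Keeping the two versions separate means the persistent layout can stay compact and stable, while the in-memory structure is free to carry whatever bookkeeping the running system needs.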

In the sections below, the frequently used word block has two different meanings:

□ On the one hand, some filesystems reside on block-oriented devices that — as explained in Chapter 6 — do not transfer individual characters but entire data blocks.

□ On the other, the Second Extended Filesystem is a block-based filesystem that divides the hard disk into several blocks, all of the same size, to manage metadata and the actual file contents. This means that the structure of the underlying storage medium is imposed on the structure of the filesystem and this naturally influences the design of the algorithms and data structures used. This chapter takes a closer look at this influence.

One aspect is of particular importance when dividing the hard disk into fixed-sized blocks — files may occupy only integer multiples of the block size. Let us look at the impact of this situation by reference to Figure 9-1, in which, for simplicity's sake, we assume a block size of 5 units. We want to store three files whose sizes are 2, 4, and 11 units.

Figure 9-1: File distribution in block-based filesystems.

The clearly more space-efficient method of dividing the existing storage space is shown in the upper part, where the contents of the individual files are spread as compactly as possible across the available blocks. However, this method is not used in practice because it has a major disadvantage.3 The information needed to manage the file boundaries within the individual blocks would be so voluminous that it would immediately cancel out any advantage gained over the wasteful assignment of blocks in the lower part of the figure. As a result, each file occupies not only the space needed for its data but also the space left over when its size is rounded up to the next integer multiple of the block size.
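The arithmetic behind this trade-off can be checked with a small sketch that uses the numbers from the example (block size 5; files of 2, 4, and 11 units). The helper names `blocks_needed`, `slack`, `blocks_per_file`, and `blocks_packed` are hypothetical and exist only for this illustration. The sketch deliberately ignores the bookkeeping overhead that makes compact packing impractical.

```c
#include <stddef.h>

/* Whole blocks needed for one file of `size` units (rounds up). */
static size_t blocks_needed(size_t size, size_t block_size)
{
    return (size + block_size - 1) / block_size;
}

/* Units left unused in the last block of a file. */
static size_t slack(size_t size, size_t block_size)
{
    return blocks_needed(size, block_size) * block_size - size;
}

/* Total blocks when every file starts on its own block boundary. */
static size_t blocks_per_file(const size_t *sizes, size_t n,
                              size_t block_size)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += blocks_needed(sizes[i], block_size);
    return total;
}

/* Total blocks when all files are packed end to end without gaps. */
static size_t blocks_packed(const size_t *sizes, size_t n,
                            size_t block_size)
{
    size_t units = 0;
    for (size_t i = 0; i < n; i++)
        units += sizes[i];
    return blocks_needed(units, block_size);
}
```

For the three example files, `blocks_per_file` yields 1 + 1 + 3 = 5 blocks (25 units, of which 3 + 1 + 4 = 8 are slack), whereas `blocks_packed` yields 4 blocks for the 17 units of data. The whole-block scheme therefore costs one extra block here, which is the price paid for not having to track file boundaries inside blocks.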
