Because memory management and disk storage are both organized in blocks, they share the familiar problem of fragmentation discussed in Chapter 3. Over time, files are deleted at scattered positions on the disk, and new ones are added. This inevitably fragments the free disk space into chunks of different sizes, as illustrated in Figure 9-5.

Although the situation illustrated may well be exaggerated, it clearly indicates the nature of the problem. There are still 12 blocks free on the hard disk, but the longest contiguous unit is 5 blocks. What happens when a program wants to save data occupying a total of 7 blocks to disk? And what happens when data must be appended to an existing file and the blocks beyond the end of the file are already occupied by other data?
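To make the arithmetic concrete, the following user-space sketch (not kernel code; the bitmap contents are invented to mirror the figure) scans a block allocation bitmap and determines the total number of free blocks as well as the longest contiguous free run. For the sample bitmap it reports 12 free blocks but a longest run of only 5, so a request for 7 contiguous blocks cannot be satisfied.

#include <stdio.h>

int main(void)
{
    /* 1 = occupied, 0 = free; values invented to mirror Figure 9-5 */
    int bitmap[] = { 1, 0, 0, 1, 1, 0, 0, 0, 0, 0,
                     1, 1, 0, 1, 0, 0, 0, 0, 1, 1 };
    int nblocks = sizeof(bitmap) / sizeof(bitmap[0]);
    int free_total = 0, run = 0, longest = 0;

    for (int i = 0; i < nblocks; i++) {
        if (bitmap[i] == 0) {
            free_total++;
            run++;               /* extend the current free run */
            if (run > longest)
                longest = run;
        } else {
            run = 0;             /* an occupied block breaks the run */
        }
    }

    /* prints: free blocks: 12, longest contiguous run: 5 */
    printf("free blocks: %d, longest contiguous run: %d\n",
           free_total, longest);
    return 0;
}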

The answer is obvious: the data are spread over different areas of the disk and become fragmented. It is important that this be done transparently to the user process. Processes accessing a file always see it as a continuous linear structure, regardless of the degree of data fragmentation on the hard disk. This is reminiscent of the way in which a processor presents working memory to processes, with one difference: there is no automatic hardware mechanism that ensures linearization on behalf of the filesystem. The filesystem code itself is responsible for this task.
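The following toy program (a hypothetical illustration, not Ext2 code) shows the principle: the process addresses logical blocks 0, 1, 2, and so on of a file, and the filesystem translates each request into a physical block number by consulting per-file mapping metadata. The physical blocks can be scattered arbitrarily without the process ever noticing.

#include <stdio.h>

/* physical block number for each logical block of one file
 * (invented sample values; real filesystems keep this mapping
 * in on-disk metadata such as the inode) */
static const unsigned long file_map[] = { 17, 4, 90, 91, 23 };

static unsigned long logical_to_physical(unsigned long logical)
{
    return file_map[logical];
}

int main(void)
{
    /* the process sees blocks 0..4 as a continuous linear structure */
    for (unsigned long n = 0; n < 5; n++)
        printf("logical block %lu -> physical block %lu\n",
               n, logical_to_physical(n));
    return 0;
}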

Of course, this presents no fundamental difficulty when direct pointers or single, double, and triple indirection are used to point to the file data blocks. The data block numbers are always uniquely identified by the information in the pointers. From this point of view, it is irrelevant whether the data blocks are sequential or are spread randomly across the entire hard disk.
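The calculation behind this scheme can be sketched as follows. The constants are illustrative (12 direct pointers, 1,024-byte blocks, 4-byte block pointers, hence 256 pointers per indirection block) rather than taken from the kernel sources. Given a logical block number, the function determines which pointer level must be traversed and at which slots; the actual block numbers are then found by reading the indirection blocks from disk.

#include <stdio.h>

#define DIRECT_BLOCKS  12UL
#define PTRS_PER_BLOCK 256UL  /* 1,024-byte block / 4-byte pointer */

static void classify(unsigned long n)
{
    const unsigned long p = PTRS_PER_BLOCK;

    printf("logical block %lu: ", n);
    if (n < DIRECT_BLOCKS) {
        printf("direct pointer %lu\n", n);
        return;
    }
    n -= DIRECT_BLOCKS;
    if (n < p) {
        printf("single indirection, slot %lu\n", n);
        return;
    }
    n -= p;
    if (n < p * p) {
        printf("double indirection, slots %lu/%lu\n", n / p, n % p);
        return;
    }
    n -= p * p;
    printf("triple indirection, slots %lu/%lu/%lu\n",
           n / (p * p), (n / p) % p, n % p);
}

int main(void)
{
    classify(5);                              /* direct pointer     */
    classify(100);                            /* single indirection */
    classify(5000);                           /* double indirection */
    classify(12 + 256 + 256UL * 256 + 42);    /* triple indirection */
    return 0;
}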

However, there is a noticeable difference in access speed. If all file blocks are contiguous on the hard disk (the desirable case), movement of the read/write head when reading data is reduced to a minimum, which boosts the speed of data transfer. If, on the other hand, the file blocks are scattered across the disk, the read/write head is forced to traverse the disk constantly in order to read the data, and this slows access.

Consequently, the Second Extended Filesystem does its best to prevent fragmentation. Where fragmentation cannot be avoided, it attempts at least to keep the individual blocks of a file in the same block group.6 It also helps if the filesystem is not filled to capacity and is operated with an appropriate reserve of free space; more placement options are then available, and this automatically reduces susceptibility to fragmentation.

6 The defrag.ext2 system tool analyzes Ext2 partitions and reorganizes fragmented data into a contiguous structure.
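The allocation policy can be pictured with the following schematic sketch. The data structures and sample values are invented for illustration and do not correspond to the kernel's implementation; the point is only the order of preference: the block group that already holds the file's blocks is tried first, and the remaining groups serve as a fallback.

#include <stdio.h>

#define NR_GROUPS 4

/* free-block counters per block group (invented sample values) */
static int free_in_group[NR_GROUPS] = { 0, 3, 5, 8 };

/* returns the group a new block is taken from, or -1 if the disk is full */
static int alloc_block(int goal_group)
{
    if (free_in_group[goal_group] > 0) {
        free_in_group[goal_group]--;
        return goal_group;               /* preferred: the file's own group */
    }
    for (int g = 0; g < NR_GROUPS; g++)  /* fallback: any group with space  */
        if (free_in_group[g] > 0) {
            free_in_group[g]--;
            return g;
        }
    return -1;                           /* no free blocks at all */
}

int main(void)
{
    /* a file living in group 1 requests four new blocks; the first
     * three stay in group 1, the fourth spills into group 2 */
    for (int i = 0; i < 4; i++)
        printf("allocated from group %d\n", alloc_block(1));
    return 0;
}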

Figure 9-5: Fragmentation in filesystems.
