I have used the term congestion a few times without precisely defining what it means. On an intuitive level it is not difficult to understand: when a kernel block device queue is overloaded with read or write operations, it makes no sense to submit further requests to the block device. It is better to wait until some of the pending requests have been processed and the queue has shortened before submitting new read or write requests.
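The intuitive idea can be captured in a small toy model before turning to the real code. The following is only a sketch under stated assumptions, not the kernel's implementation: the structure `toy_queue`, its functions, and the two thresholds are hypothetical names invented for illustration. It assumes a queue is marked congested once the number of pending requests reaches an upper threshold, and that the mark is cleared only after the count has dropped below a lower threshold, so the state does not flip back and forth around a single limit.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model of a request queue with a congestion flag.
 * All names and thresholds are hypothetical and serve only to
 * illustrate the idea; they are not kernel identifiers.
 */
struct toy_queue {
    size_t pending;     /* requests currently waiting in the queue */
    size_t on_thresh;   /* set the flag once pending reaches this value */
    size_t off_thresh;  /* clear the flag once pending drops below this value */
    bool congested;     /* true while submitters should back off */
};

void toy_queue_init(struct toy_queue *q, size_t on_thresh, size_t off_thresh)
{
    q->pending = 0;
    q->on_thresh = on_thresh;
    q->off_thresh = off_thresh;
    q->congested = false;
}

/*
 * Try to add one request. Returns false without queuing anything
 * if the queue is congested; the caller is expected to wait and retry.
 */
bool toy_queue_submit(struct toy_queue *q)
{
    if (q->congested)
        return false;

    q->pending++;
    if (q->pending >= q->on_thresh)
        q->congested = true;
    return true;
}

/* Account for one processed request and clear the flag if appropriate. */
void toy_queue_complete(struct toy_queue *q)
{
    if (q->pending == 0)
        return;

    q->pending--;
    if (q->congested && q->pending < q->off_thresh)
        q->congested = false;
}
```

With `toy_queue_init(&q, 4, 2)`, for example, the fourth successful submission sets the flag, further submissions are refused, and the flag is cleared only once three requests have completed and a single request remains pending.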

Below I examine how the kernel implements this notion on a technical level.

