Figure 62 The groups of lists associated with dynamic timers

The tv1 structure is of type struct timer_vec_root, which includes an index field and a vec array of 256 list_head elements — that is, lists of dynamic timers. It contains all dynamic timers that will decay within the next 255 ticks.

The index field specifies the currently scanned list; it is initialized to 0 and incremented by 1 (modulo 256) at every tick. The list referenced by index contains all dynamic timers that expired during the current tick; the next list contains all dynamic timers that will expire in the next tick; the (index+k)th list contains all dynamic timers that will expire in exactly k ticks. When index returns to 0, it signifies that all the timers in tv1 have been scanned. In this case, the list pointed to by tv2.vec[tv2.index] is used to replenish tv1.

The tv2, tv3, and tv4 structures of type struct timer_vec contain all dynamic timers that will decay within the next 214-1, 220-1, and 226-1 ticks, respectively.

The tv5 structure is identical to the previous ones, except that the last entry of the vec array includes dynamic timers with extremely large expires fields. It never needs to be replenished from another array.

The timer_vec structure is very similar to timer_vec_root: it contains an index field and a vec array of 64 pointers to dynamic timer lists. The index field specifies the currently scanned list; it is incremented by 1 (modulo 64) every 256i-1 ticks, where i, ranging between 2 and 5, is the tvi group number. As in the case of tv1, when index returns to 0, the list pointed to by tvj.vec[tvj.index] is used to replenish tvi (i ranges between 2 and 4, j is equal to i+1).

Thus, the first element of tv2 holds a list of all timers that expire in the 256 ticks following the tv1 timers; the timers in this list are sufficient to replenish the whole array tv1. The second element of tv2 holds all timers that expire in the following 256 ticks, and so on. Similarly, a single entry of tv3 is sufficient to replenish the whole array tv2.

Figure 6-2 shows how these data structures are connected.

The timer_bh( ) function associated with the timer_bh bottom half invokes the run_timer_list( ) auxiliary function to check for decayed dynamic timers. The function relies on a variable similar to jiffies that is called timer_jiffies. This new variable is needed because a few timer interrupts might occur before the activated timer_bh bottom half has a chance to run; this happens typically when several interrupts of different types are issued in a short interval of time.

The value of timer_jiffies represents the expiration time of the dynamic timer list yet to be checked: if it coincides with the value of jiffies, no backlog of bottom half functions has accumulated; if it is smaller than jiffies, then bottom half functions that refer to previous ticks must be dealt with. The variable is set to 0 at system startup and is incremented only by run_timer_list( ) . Its value can never be greater than jiffies.

The run_timer_list( ) function is essentially equivalent the following C fragment:

struct list_head *head, *curr; struct timer list *timer; void (*fn)(unsigned long); unsigned long data;

spin lock irq(&timerlist lock);

while ((long)(jiffies - timer jiffies) >= 0) { if (!tvi.index) { int n = 1; do {

cascade_timers(tvecs[n]); } while (tvecs[n]->index == 1 && ++n < 5));

for (curr = head->next; curr != head; curr = head->next) { timer = list entry(curr, struct timer list, list); fn = timer->function; data= timer->data;

detach timer(timer);

timer->list.next = timer->list.prev = NULL; running timer = timer; spin unlock irq(&timerlist lock); fn(data);

spin lock irq(&timerlist lock); running timer = NULL;

++timer jiffies;

spin unlock irq(&timerlist lock);

The outermost while loop ends when timer_jiffies becomes greater than the value of jiffies. Since the values of jiffies and timer_jiffies usually coincide, the outermost while cycle is often executed only once. In general, the outermost loop is executed jiffies - timer_jiffies + 1 consecutive times. Moreover, if a timer interrupt occurs while run_timer_list( ) is being executed, dynamic timers that decay at this tick occurrence are also considered, since the jiffies variable is asynchronously incremented by the PIT's interrupt handler (see the earlier section Section 6.2.1.1).

During a single execution of the outermost while cycle, the dynamic timer functions included in the tv1.vec[tv1.index] list are executed. Before executing a dynamic timer function, the loop invokes the detach_timer( ) function to remove the dynamic timer from the list. Once the list is emptied, the value of tv1.index is incremented (modulo 256) and the value of timer_jiffies is incremented.

When tv1.index becomes equal to 0, all the lists of tv1 have been checked; in this case, it is necessary to refill the tv1 structure. This is accomplished by the cascade_timers( ) function, which transfers the dynamic timers included in tv2.vec[tv2.index] into tv1.vec, since they will necessarily decay within the next 256 ticks. If tv2.index is equal to 0, the tv2 array of lists must be refilled with the elements of tv3.vec[tv3.index].

Notice that run_timer_list( ) disables interrupts and acquires the timerlist_lock spin lock just before entering the outermost loop; interrupts are enabled and the spin lock is released right before invoking each dynamic timer function, until its termination. This ensures that the dynamic timer data structures are not corrupted by interleaved kernel control paths.

To sum up, this rather complex algorithm ensures excellent performance. To see why, assume for the sake of simplicity that the timer_bh bottom half is executed right after the corresponding timer interrupt occurs. Then, in 255 timer interrupt occurrences out of 256 (in 99.6% of the cases), the run_timer_list( ) function just runs the functions of the decayed timers, if any. To replenish tv1.vec periodically, it is sufficient 63 times out of 64 to partition the list pointed to by tv2.vec[tv2.index] into the 256 lists pointed to by tv1.vec. The tv2.vec array, in turn, must be replenished in 0.02 percent of the cases (that is, once every 163 seconds). Similarly, tv3 is replenished every 2 hours and 54 minutes, and tv4 is replenished every 7 days and 18 hours. tv5 doesn't need to be replenished.

Was this article helpful?

0 0

Post a comment