Spin Locks

A widely used synchronization technique is locking. When a kernel control path must access a shared data structure or enter a critical region, it needs to acquire a "lock" for it. A resource protected by a locking mechanism is quite similar to a resource confined in a room whose door is locked when someone is inside. If a kernel control path wishes to access the resource, it tries to "open the door" by acquiring the lock. It succeeds only if the resource is free. Then, as long as it wants to use the resource, the door remains locked. When the kernel control path releases the lock, the door is unlocked and another kernel control path may enter the room.

Figure 5-1 illustrates the use of locks. Five kernel control paths (P0, P1, P2, P3, and P4) are trying to access two critical regions (C1 and C2). Kernel control path P0 is inside C1, while P2 and P4 are waiting to enter it. At the same time, P1 is inside C2, while P3 is waiting to enter it. Notice that P0 and P1 could run concurrently. The lock for critical region C3 is open since no kernel control path needs to enter it.

Figure 5-1. Protecting critical regions with several locks

Figure 5-1. Protecting critical regions with several locks

Spin locks are a special kind of lock designed to work in a multiprocessor environment. If the kernel control path finds the spin lock "open," it acquires the lock and continues its execution. Conversely, if the kernel control path finds the lock "closed" by a kernel control path running on another CPU, it "spins" around, repeatedly executing a tight instruction loop, until the lock is released.

Of course, spin locks are useless in a uniprocessor environment, since the waiting kernel control path would keep running, so the kernel control path that is holding the lock would not have any chance to release it.

The instruction loop of spin locks represents a "busy wait." The waiting kernel control path keeps running on the CPU, even if it has nothing to do besides waste time. Nevertheless, spin locks are usually very convenient, since many kernel resources are locked for a fraction of a millisecond only; therefore, it would be far more time-consuming to release the CPU and reacquire it later.

In Linux, each spin lock is represented by a spinlock_t structure consisting of a single lock field; the value 1 corresponds to the unlocked state, while any negative value and zero denote the locked state.

Five functions shown in Table 5-5 are used to initialize, test, and set spin locks. On uniprocessor systems, none of these functions do anything, except for spin_trylock( ), which always returns 1. All these functions are based on atomic operations; this ensures that the spin lock will be properly updated by a process running on a CPU even if other processes running on different CPUs attempt to modify the spin lock at the same time.^^i

[3] Spin locks, ironically enough, are global and therefore must themselves be protected against concurrent accesses.

Table 5-5. Spin lock functions

Function

Description

spin lock init( )

Set the spin lock to 1 (unlocked)

spin lock( )

Cycle until spin lock becomes 1 (unlocked), then set it to 0 (locked)

spin unlock( )

Set the spin lock to 1 (unlocked)

spin unlock wait( )

Wait until the spin lock becomes 1 (unlocked)

spin is locked( )

Return 0 if the spin lock is set to 1 (unlocked); 0 otherwise

spin trylock( )

Set the spin lock to 0 (locked), and return 1 if the lock is obtained; 0 otherwise

Let's discuss in detail the spin_lock macro, which is used to acquire a spin lock. It takes the address slp of the spin lock as its parameter and yields essentially the following assembly language code: 141

[41 The actual implementation of spin_lock is slightly more complicated. The code at label 2, which is executed only if the spin lock is busy, is included in an auxiliary section so that in the most frequent case (free spin lock) the hardware cache is not filled with code that won't be executed. In our discussion, we omit these optimization details.

1: lock; decb slp jns 3f 2: cmpb $0,slp pause jle 2b jmp 1b

The decb assembly language instruction decrements the spin lock value; the instruction is atomic because it is prefixed by the lock byte. A test is then performed on the sign flag. If it is clear, it means that the spin lock was set to 1 (unlocked), so normal execution continues at label 3 (the f suffix denotes the fact that the label is a "forward" one; it appears in a later line of the program). Otherwise, the tight loop at label 2 (the b suffix denotes a "backward" label) is executed until the spin lock assumes a positive value. Then execution restarts from label 1, since it is unsafe to proceed without checking whether another processor has grabbed the lock. The pause assembly language instruction, which was introduced in the

Pentium 4 model, is designed to optimize the execution of spin lock loops. By introducing a small delay, it speeds up the execution of code following the lock and reduces power consumption. The pause instruction is backward compatible with earlier models of Intel processors because it corresponds to the instructions rep,-nop, which essentially do nothing.

The spin_unlock macro releases a previously acquired spin lock; it essentially yields the following code:

Again, the lock byte ensures that the loading instruction is atomic.

Was this article helpful?

0 -1

Post a comment