Nesting software interrupts

Figure 16.40, from <asm-i386/softirq.h>, shows a number of higher-level macros that keep track of how deeply nested we are in software interrupts. 16.5.1.3 Extracting information from the irq_stat_t do { barrier(); local_bh_count(cpu)--; } while (0) #define cpu_bh_disable(cpu) do { local_bh_count(cpu)++; barrier(); } while (0) 13 #define local_bh_disable() cpu_bh_disable(smp_processor_id()) 14 #define __local_bh_enable() __cpu_bh_enable(smp_processor_id()) unsigned int *ptr cmpl 0, -8( 0) jnz 2f 1 n 2...
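The nesting idea can be illustrated with a small user-space sketch. It is not the kernel code: the sketch_ names and the fixed CPU count are assumptions made for illustration, and the compiler barrier used by the real macros is omitted.

```c
#include <stdbool.h>

#define SKETCH_NR_CPUS 4   /* hypothetical CPU count, for the sketch only */

/* Per-CPU nesting depth; stands in for the kernel's local_bh_count(cpu). */
static unsigned int sketch_bh_count[SKETCH_NR_CPUS];

/* Enter a region in which software interrupts must not run on this CPU. */
static void sketch_bh_disable(int cpu)
{
    sketch_bh_count[cpu]++;
}

/* Leave the region; only the outermost call returns the depth to zero. */
static void sketch_bh_enable(int cpu)
{
    sketch_bh_count[cpu]--;
}

/* True while at least one disable on this CPU is still unmatched. */
static bool sketch_in_softirq(int cpu)
{
    return sketch_bh_count[cpu] != 0;
}
```

Because the counter is incremented and decremented rather than set and cleared, nested disable/enable pairs behave correctly: the state only returns to "enabled" when every disable has been matched.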

Timer management

The functions described in this section insert, modify, and delete elements in these timer lists. They are used heavily by all sorts of drivers. In order to indicate when a particular timer_list structure is not inserted into the tree, its pointer fields are set to NULL. The function to do this, from <linux/timer.h>, is shown in Figure 15.29. 46 static inline void init_timer(struct timer_list *timer) 48 timer->list.next = timer->list.prev = NULL; Figure 15.29 Initialising a timer_list...
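The NULL-pointer convention can be sketched in user space as follows. This is a minimal illustration, not the kernel implementation: the sketch_ types and functions are hypothetical, and locking is ignored entirely.

```c
#include <stddef.h>

/* Minimal doubly linked list header, modelled on the kernel's list_head. */
struct sketch_list_head {
    struct sketch_list_head *next, *prev;
};

struct sketch_timer {
    struct sketch_list_head list;
    unsigned long expires;
};

/* Mark the timer as "not on any list" by setting both links to NULL. */
static void sketch_init_timer(struct sketch_timer *t)
{
    t->list.next = t->list.prev = NULL;
}

/* A timer is pending exactly when its links are not NULL. */
static int sketch_timer_pending(const struct sketch_timer *t)
{
    return t->list.next != NULL;
}

/* Insert at the tail of a circular list whose header is 'head'. */
static void sketch_add_timer(struct sketch_list_head *head,
                             struct sketch_timer *t)
{
    t->list.next = head;
    t->list.prev = head->prev;
    head->prev->next = &t->list;
    head->prev = &t->list;
}

/* Unlink the timer, if it is linked, and restore the NULL markers. */
static void sketch_del_timer(struct sketch_timer *t)
{
    if (!sketch_timer_pending(t))
        return;
    t->list.prev->next = t->list.next;
    t->list.next->prev = t->list.prev;
    sketch_init_timer(t);
}
```

Resetting the links on deletion means a second delete of the same timer is harmless, which is the practical benefit of the convention.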

Eliminating exceptions

The first few lines of schedule() are shown in Figure 7.12. This code deals with preliminaries that have little or nothing to do with scheduling. It merely declares some local variables and checks for some exceptions that have to be handled. 549 asmlinkage void schedule(void) 551 struct schedule_data *sched_data; 552 struct task_struct *prev, *next, *p; 559 if (!current->active_mm) BUG(); 562 this_cpu = prev->processor; in interrupt n) Figure 7.12 Checking for errors and software interrupts 557 this...

Updating the time-of-day clock by one tick

The timer interrupt is used, among other things, to update the computer's time-of-day clock. But, owing to small irregularities in the frequency of this interrupt, the clock may run fast or slow over a period of time. Sophisticated algorithms are used in an attempt to offset this. Any attempt to correct a clock relies on access to an external time-source. This source updates kernel variables, which are then read by the functions described in this and the following sections. Before describing...

Conditional interruptible sleep

The significant difference between this section and the previous one is the value in the state field of the process while it is sleeping. As before, there are two macros involved in putting a process to sleep conditionally. The first just checks the condition, whereas the main one actually puts the process to sleep. The macro shown in Figure 4.37, from <linux/sched.h>, puts a process to sleep in the TASK_INTERRUPTIBLE state. It is only a wrapper that tests the condition before ever...

The software interrupt kernel thread

The previous section has described how software interrupts are handled in interrupt context on the return path from hardware interrupt handling, but there is also a kernel thread (in fact, one per CPU) dedicated to handling software interrupts. This thread is woken up when the load of software interrupts becomes too great to handle in interrupt context (it would take too many machine cycles from the current process). 16.1.4.1 Spawning kernel threads to handle software interrupts At boot time,...

Subsidiary functions

There are two worker functions used by dequeue_signal(), which are considered in this section. Although there may be a number of signals pending, they can be handled only one at a time. The function shown in Figure 18.31, from kernel/signal.c, returns the number of the first pending signal that is not blocked. Figure 18.31 Finding a signal to service 51 the first parameter is a pointer to the task_struct of the process; the second points to the blocked bitmap of the process. 56 this points s to...
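The selection rule, "first pending signal that is not blocked", can be sketched on plain 64-bit bitmaps. This is a simplified illustration under assumptions: the function name is hypothetical, signals are numbered from 1 with signal n held in bit n-1, and the kernel's multi-word sigset_t is reduced to a single word.

```c
#include <stdint.h>

/*
 * Return the lowest-numbered signal that is set in 'pending' and clear
 * in 'blocked', or 0 if there is no such signal.
 */
static int sketch_next_signal(uint64_t pending, uint64_t blocked)
{
    uint64_t ready = pending & ~blocked;   /* deliverable signals only */

    for (int sig = 1; sig <= 64; sig++) {
        if (ready & ((uint64_t)1 << (sig - 1)))
            return sig;
    }
    return 0;
}
```

Masking once with ~blocked before scanning keeps the loop simple; a production version would typically replace the loop with a find-first-set instruction.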

Manipulating whole bitmaps

There is another group of functions that clear, set, or do logical operations on whole bitmaps. Some of these are generated by parameterised macros; others are straightforward functions. 17.2.2.1 Macro to generate bit-manipulating functions The macro shown in Figure 17.18, from <linux/signal.h>, generates a function called name, that performs the bitwise (binary) operation specified by op on two input signal masks pointed to by a and b and writes the result to the signal mask pointed to...
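The technique of generating one function per operation from a single parameterised macro can be sketched as below. The sketch_ names, the two-word set size, and the result parameter r are assumptions for illustration; only the roles of name, op, a and b come from the description above.

```c
#include <stdint.h>

#define SKETCH_NSIG_WORDS 2   /* hypothetical size of a signal mask */

typedef struct {
    uint32_t sig[SKETCH_NSIG_WORDS];
} sketch_sigset_t;

/* Generate a function 'name' that applies 'op' word by word to a and b. */
#define SKETCH_SIG_SETBINOP(name, op)                               \
static void name(sketch_sigset_t *r, const sketch_sigset_t *a,      \
                 const sketch_sigset_t *b)                          \
{                                                                   \
    for (int i = 0; i < SKETCH_NSIG_WORDS; i++)                     \
        r->sig[i] = op(a->sig[i], b->sig[i]);                       \
}

/* The binary operations themselves, as function-like macros. */
#define SKETCH_OR(x, y)   ((x) | (y))
#define SKETCH_AND(x, y)  ((x) & (y))
#define SKETCH_NAND(x, y) ((x) & ~(y))

/* Three instantiations, each producing a complete function. */
SKETCH_SIG_SETBINOP(sketch_sigorsets, SKETCH_OR)
SKETCH_SIG_SETBINOP(sketch_sigandsets, SKETCH_AND)
SKETCH_SIG_SETBINOP(sketch_signandsets, SKETCH_NAND)
```

The macro keeps the loop in one place, so adding a new whole-bitmap operation costs one operator definition and one instantiation line.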

Handling a vectored interrupt in vm86 mode

Vectored interrupts, either traps or INTx, will be handled by indexing into the 16-bit interrupt table, or the 32-bit IDT. A special function is provided to check for all sorts of conditions; see Figure 24.29, from arch/i386/kernel/vm86.c. This function is called from handle_vm86_trap() (vm86pus is set) and handle_vm86_fault() (if an INTx). 398 static void do_int(struct kernel_vm86_regs *regs, int i, unsigned char *ssp, unsigned long sp) 400 unsigned long *intr_ptr, segoffs; 402 if (regs->cs...

Reading and writing IO APICs

A number of functions are provided in <asm-i386/io_apic.h> for reading, writing, and modifying specific registers in an IO APIC. These will be considered in this section. 14.4.5.1 Reading from an IO APIC register The function shown in Figure 14.41 reads from a specific IO APIC register. 105 static inline unsigned int io_apic_read(unsigned int apic, 108 return *(IO_APIC_BASE(apic)+4); Figure 14.41 Reading from an IO APIC register 105 the first parameter identifies the APIC; the second...

Generating first-level interrupt handlers

In Section 12.3.1 an array of pointers to first-level handlers for hardware interrupts was initialised. The handler stubs themselves are built using some ugly macros, which create the low-level assembly routines that save register context and call the second-level handler, do_IRQ(). The do_IRQ() function then does all the operations that are needed to keep the hardware interrupt controller happy. 12.3.2.1 Building all the handler stubs Figure 12.12, from arch/i386/kernel/i8259.c, shows the...

Sending signals to a parent

A pair of functions is provided for sending signals to a parent or to a parent's thread group. Although these can send any specified signal, they are most frequently used for sending SIGCHLD. Notifying a child's change of status to its parent When a child process changes its status (e.g. it stops or terminates), it is necessary to let its parent know about this status change. The function shown in Figure 18.17, from kernel/signal.c, fills in a struct siginfo with the relevant information and...

Event timer data structures

Each timer is represented by a struct timer_list, which specifies the function and when it is to be run. Linux then uses what at first sight might seem an unusual data structure to keep track of these structures. It has headers for 512 different lists, sorted on the order in which the timer is to expire. These are divided into five different groups, known as vectors. The first one, the root vector, contains headers for 256 different lists. This is the 'ready-use' vector; it maintains timers with...
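One way to picture the five vectors is as a cascade indexed by successive bit fields of the expiry time. The sketch below assumes that the root vector is indexed by the low 8 bits (256 lists) and that each of the other four vectors is indexed by a further 6 bits (64 lists each, giving 256 + 4 x 64 = 512). The 6-bit width, the sketch_ names, and the neglect of already-expired timers are assumptions, not statements from the text above.

```c
#define SKETCH_TVR_BITS 8   /* root vector: 2^8 = 256 lists            */
#define SKETCH_TVN_BITS 6   /* assumed: other vectors have 2^6 = 64    */

struct sketch_slot {
    int vector;             /* 1 (root) to 5                           */
    unsigned int index;     /* list header within that vector          */
};

/*
 * Choose the vector from how far in the future the timer expires, and
 * the list within the vector from the matching bit field of 'expires'.
 */
static struct sketch_slot sketch_timer_slot(unsigned long expires,
                                            unsigned long now)
{
    unsigned long delta = expires - now;
    unsigned int shift = SKETCH_TVR_BITS;
    struct sketch_slot s;

    if (delta < (1UL << SKETCH_TVR_BITS)) {
        s.vector = 1;
        s.index = expires & ((1UL << SKETCH_TVR_BITS) - 1);
        return s;
    }
    for (s.vector = 2; s.vector < 5; s.vector++) {
        if (delta < (1UL << (shift + SKETCH_TVN_BITS)))
            break;
        shift += SKETCH_TVN_BITS;
    }
    s.index = (expires >> shift) & ((1UL << SKETCH_TVN_BITS) - 1);
    return s;
}
```

Under these assumptions a timer due within 256 ticks lands directly in the root vector, while more distant timers are parked in coarser vectors until they come into range.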

Line 286 flags

This is a bit field, recording various items of interest about the status of the process. Individual bits in this field denote different milestones in the lifecycle of a process. These bits are defined as shown in Figure 2.3, from <linux/sched.h>. They are not mutually exclusive; more than one can be set at the same time. Figure 2.3 Values for the flags field 418 alignment warning messages should be printed. No example of its use has been found...

The hash table for task structures

As all the data structures representing processes (task_struct) are on a doubly linked list, any one can be found by searching the list linearly. This method can be time-consuming, particularly if there are a large number of processes. So, to speed things up, all the structures are also kept on hash lists, hashed on the pid of the process, and linked through the pidhash_next and pidhash_pprev fields of the task_struct. This section examines the hash structure itself, the hash function used, and...
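A hash table of this kind, chained through a next pointer and a pointer-to-previous-pointer, can be sketched as follows. The field names pidhash_next and pidhash_pprev come from the text; the table size, the hash function, and all sketch_ names are assumptions made for illustration.

```c
#include <stddef.h>

#define SKETCH_PIDHASH_SZ 1024   /* hypothetical table size (power of 2) */

struct sketch_task {
    int pid;
    struct sketch_task *pidhash_next;     /* next task on the same chain */
    struct sketch_task **pidhash_pprev;   /* the pointer that points to us */
};

static struct sketch_task *sketch_pidhash[SKETCH_PIDHASH_SZ];

/* Hypothetical hash: fold the higher bits of the pid into the lower ones. */
static unsigned int sketch_pid_hashfn(int pid)
{
    unsigned int x = (unsigned int)pid;
    return ((x >> 8) ^ x) & (SKETCH_PIDHASH_SZ - 1);
}

/* Insert at the head of the chain for this pid. */
static void sketch_hash_pid(struct sketch_task *p)
{
    struct sketch_task **head = &sketch_pidhash[sketch_pid_hashfn(p->pid)];

    p->pidhash_next = *head;
    if (*head)
        (*head)->pidhash_pprev = &p->pidhash_next;
    *head = p;
    p->pidhash_pprev = head;
}

/* Remove without needing to know which chain the task is on. */
static void sketch_unhash_pid(struct sketch_task *p)
{
    if (p->pidhash_next)
        p->pidhash_next->pidhash_pprev = p->pidhash_pprev;
    *p->pidhash_pprev = p->pidhash_next;
}

/* Search only the one chain selected by the hash function. */
static struct sketch_task *sketch_find_task_by_pid(int pid)
{
    struct sketch_task *p = sketch_pidhash[sketch_pid_hashfn(pid)];

    while (p && p->pid != pid)
        p = p->pidhash_next;
    return p;
}
```

The pprev pointer is what makes removal a constant-time operation with no special case for the first element of a chain.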

The debug exception handler

The function that actually handles the debug exception is shown in Figure 11.7, from arch/i386/kernel/traps.c. It checks for several unusual situations, before sending a signal to the current process. 477 asmlinkage void do_debug(struct pt_regs *regs, long error_code) 480 struct task_struct *tsk = current; r (condition)) 486 if (condition & (DR_TRAP0|DR_TRAP1|DR_TRAP2|DR_TRAP3)) 491 if (regs->eflags & VM_MASK) 498 if (condition & DR_STEP) 508 if ((tsk->ptrace & (PT_DTRACE...

Run queued tasks

Linux provides two functions for executing tasks on a task queue. One checks for entries on the queue, the other runs them. 16.4.3.1 Checking for entries on a task queue The function shown in Figure 16.33, from <linux/tqueue.h>, checks if there are any tasks on the specified queue. If there are, it calls __run_task_queue() (see Section 119 static inline void run_task_queue(task_queue *list) Figure 16.33 Checking for an empty queue before running tasks 121 the macro was defined in Section...

Restarting an interrupted system call

The only way control can transfer to the code shown in Figure 18.29 is by breaking out of the infinite loop at line 609 (Figure 18.24) because there were no further signals queued. One final possibility must be considered. When a process calls a system service that blocks, it is put into the TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE state. When a signal is posted to a process in the TASK_INTERRUPTIBLE state it is woken up and moved to the runqueue, even though the system service has not...

Generating load statistics

Another aspect of the processing done by the timer bottom half is to generate load statistics. The discussion here will consider the main function called to do that, calc_load(); a subsidiary function called to determine the number of processes currently active; and some miscellaneous macros used in the calculations. The function shown in Figure 15.21, from kernel/timer.c, records load averages at fixed intervals. It is called every time the timer bottom half runs. The parameter ticks is the...
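A load average of this kind is commonly kept as an exponentially decaying average in fixed-point arithmetic. The sketch below assumes an 11-bit fraction and a decay factor of 1884 for the one-minute average sampled every five seconds; these constants and the sketch_ names are assumptions for illustration and are not taken from the figure.

```c
#define SKETCH_FSHIFT  11                        /* bits of fraction     */
#define SKETCH_FIXED_1 (1UL << SKETCH_FSHIFT)    /* 1.0 in fixed point   */
#define SKETCH_EXP_1   1884UL   /* about FIXED_1 / e^(5/60), assumed     */

/*
 * One update step: new = old * exp + active * (1 - exp), all in fixed
 * point. 'active' is the number of active processes times FIXED_1.
 */
static unsigned long sketch_calc_load(unsigned long load,
                                      unsigned long exp,
                                      unsigned long active)
{
    load *= exp;
    load += active * (SKETCH_FIXED_1 - exp);
    return load >> SKETCH_FSHIFT;
}
```

With one process permanently active the value converges towards SKETCH_FIXED_1 (a load of 1.0), and it stays there once reached; with nothing active it decays by the factor exp at each step.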

Sending a signal to a range of processes

By using zero-valued or negative pids, different ranges of processes can be specified as the target for a signal. The function shown in Figure 18.16, from kernel/signal.c, accepts such pids, and differentiates between them: 0 means the signal is sent to every process in the same group as the sender. A parameter of -1 means send the signal to any process for which it is permissible. If the parameter is less than -1, then the signal is sent to every process in the process group identified by the...
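The dispatch on the sign and value of the pid can be sketched as a small classifier. It assumes the usual kill() convention, in which -1 selects every permitted process and a value below -1 names the process group whose number is the absolute value; the enum, struct, and function names are hypothetical.

```c
enum sketch_target {
    SKETCH_SINGLE_PROCESS,   /* pid > 0: one specific process           */
    SKETCH_SENDER_GROUP,     /* pid == 0: the sender's own group        */
    SKETCH_EVERY_PERMITTED,  /* pid == -1: every permitted process      */
    SKETCH_PROCESS_GROUP     /* pid < -1: the group numbered -pid       */
};

struct sketch_dest {
    enum sketch_target kind;
    int id;                  /* pid or group number; 0 when unused      */
};

/* Decide which set of processes a pid argument designates. */
static struct sketch_dest sketch_classify_pid(int pid)
{
    struct sketch_dest d = { SKETCH_SINGLE_PROCESS, pid };

    if (pid == 0) {
        d.kind = SKETCH_SENDER_GROUP;
        d.id = 0;
    } else if (pid == -1) {
        d.kind = SKETCH_EVERY_PERMITTED;
        d.id = 0;
    } else if (pid < -1) {
        d.kind = SKETCH_PROCESS_GROUP;
        d.id = -pid;
    }
    return d;
}
```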

Manipulating the linked list of task structures

This section examines the sequential list. There are three macros defined in <linux/sched.h> that manipulate the various links in a task_struct. One removes a structure, another inserts a structure, and a third follows the links from start to finish. The macro shown in Figure 3.1, from <linux/sched.h>, removes a descriptor p from the process structure, and from lists of siblings. Note that it does nothing about mutual exclusion. Any functions that use this macro have to guarantee...

Semaphore data structures and macros

This section will examine the data structure representing a semaphore, and the macros provided for declaring and initialising semaphores. 6.1.1.1 The semaphore data structure The data structure used to represent a semaphore to the system is shown in Figure 6.1, from <asm-i386/semaphore.h>. Figure 6.1 The semaphore data structure 45 this is the value of the semaphore. When positive, the resource is free and can be acquired. When zero or negative, the resource protected by the semaphore is...

Writing to a register

The function shown in Figure 22.6, from arch/i386/kernel/ptrace.c, writes to the field in the struct pt_regs on the kernel stack of the traced process, corresponding to a specified hardware register. 73 static int putreg(struct task_struct *child, 74 unsigned long regno, unsigned long value) if (value && (value & 3) != 3) child->thread.fs = value return 0 if (value && (value & 3) != 3) child->thread.gs = value return 0 if (value && (value & 3) != 3) get_stack_long(child,...

Task state segment

The task state segment (TSS) is specific to the i386 architecture. It is Intel's layout for the volatile environment of a process. The TR register in the CPU always points to the TSS of the current process. Intel intended that each process would have its own TSS and that the volatile environment of a process would be saved there when it was context switched out. Linux does not implement things that way, preferring to save most of the volatile environment on the kernel stack of the process and...

Subsidiary functions used when exiting

338 static void exit_notify(void) 340 struct task_struct *p, *t; if ((t->pgrp != current->pgrp) && (t->session == current->session) && current) && has_stopped_jobs(current->pgrp)) if (current->exit_signal != SIGCHLD && (current->parent_exec_id != t->self_exec_id || current->self_exec_id != current->parent_exec_id) && !capable(CAP_KILL)) 397 do_notify_parent(current, current->exit_signal); 398 while (current->p_cptr != NULL) 404 p->p_pptr = p->...

Enabling and disabling tasklets

It is also possible to mark a tasklet as disabled. Although a tasklet can always be scheduled to run, it will not actually be run until it is in the enabled state. This is indicated by its count field having a value of 0. Two functions are provided for disabling tasklets; see Figure 16.20, from <linux/interrupt.h>. Although disabled, a tasklet may still be scheduled to run, using the functions from Section 16.2.3 or Section 16.2.4, but it will not run until enabled again, by one of the...
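The interaction between scheduling and the count field can be sketched in user space as follows. The structure layout, the sketch_ names, and the single scheduled flag are simplifications assumed for illustration; the only rule taken from the text is that a tasklet runs only while its count is 0.

```c
struct sketch_tasklet {
    int count;                       /* 0 means enabled                  */
    int scheduled;                   /* set when a run has been requested */
    void (*func)(unsigned long);
    unsigned long data;
};

static int sketch_runs;              /* how often the demo handler ran   */

/* Demo handler: adds its argument to the run counter. */
static void sketch_handler(unsigned long data)
{
    sketch_runs += (int)data;
}

/* Disabling nests: each disable must be matched by one enable. */
static void sketch_tasklet_disable(struct sketch_tasklet *t) { t->count++; }
static void sketch_tasklet_enable(struct sketch_tasklet *t)  { t->count--; }

/* Scheduling is always allowed, even while disabled. */
static void sketch_tasklet_schedule(struct sketch_tasklet *t)
{
    t->scheduled = 1;
}

/* Run the tasklet if it is scheduled and enabled; return 1 if it ran. */
static int sketch_tasklet_run(struct sketch_tasklet *t)
{
    if (!t->scheduled || t->count != 0)
        return 0;
    t->scheduled = 0;
    t->func(t->data);
    return 1;
}
```

A tasklet scheduled while disabled simply stays pending; the first run attempt after the matching enable executes it.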

Spinlocks with full debugging

This section describes an implementation of spinlocks that not only maintains lock state but also does a significant amount of checking that the locks have been properly initialised and properly used. Warning messages are printed if anomalies are discovered. Figure 5.20 is from <linux/spinlock.h>. 86 #else /* (DEBUG_SPINLOCKS >= 2) */ volatile unsigned long lock; volatile unsigned int babble; const char *module; 93 #define SPIN_LOCK_UNLOCKED (spinlock_t) { 0, 25, __BASE_FILE__ } 97...

Line 391 files

Figure 2.21, from <linux/sched.h>, shows the format of an open file descriptor table for a process. It contains information about the I/O streams that this process has open. Most of the information about an individual open stream is actually contained in another structure, a struct file. The files_struct essentially gathers together the pointers to all the instances of struct file belonging to this process. Figure 2.21 Open file table structure 173...

Registering context for handling a signal

As part of the stack frame built on the user mode stack, the kernel makes a large amount of context information available to the handler. This consists mainly of the values in the CPU registers when the process last switched to kernel mode, along with some extra items of information. All this information is encapsulated in a struct sigcontext; see Figure 19.3, from <asm-i386/sigcontext.h>. Note that the order of the fields is completely different from that of a struct pt_regs. Figure 19.3...

Setting up the idle thread

Figure 3.21, from arch/i386/kernel/process.c, is the function executed by the idle thread. There is no useful work to be done, so it tries to conserve power, halting the processor, waiting for something to happen. 131 void (*idle)(void) = pm_idle; 134 while (!current->need_resched) idle(); schedule(); check_pgt_cache(); 126 this function will be dealt with in Section 1.3.2. It merely initialises some fields specific to the CPU on which this thread is running. 127 this gives the idle thread the...

Nonmaskable interrupt

427 if (!(reason & 0xc0)) unknown_nmi_error(reason, regs); return; Figure 11.8 The handler for the nonmaskable interrupt 421 although passed an error code by the first-level handler, the function never actually uses it. 423 input-output (IO) port 0x61, port B in the PC, has bits indicating the source or reason for an NMI (among other things). Bit 2 is for system board parity error checking: 0 means that it is enabled, 1 means that it is reset but disabled. It is a read-write bit. Bit 3 is for...
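The test against 0xc0 suggests that the two top bits of the reason byte identify the source of the NMI. The sketch below decodes them under the assumption that bit 7 reports a memory parity error and bit 6 an IO check error; that assignment, and all sketch_ names, are assumptions for illustration. No port is actually read; the reason byte is passed in.

```c
#define SKETCH_NMI_MEM_PARITY 0x80   /* assumed: bit 7, memory parity   */
#define SKETCH_NMI_IO_CHECK   0x40   /* assumed: bit 6, IO check        */

enum {
    SKETCH_SRC_UNKNOWN = 0,
    SKETCH_SRC_MEMORY  = 1,
    SKETCH_SRC_IO      = 2
};

/*
 * Return a bitmask of the recognised NMI sources in 'reason'.
 * A result of SKETCH_SRC_UNKNOWN corresponds to !(reason & 0xc0).
 */
static unsigned int sketch_nmi_sources(unsigned char reason)
{
    unsigned int src = SKETCH_SRC_UNKNOWN;

    if (reason & SKETCH_NMI_MEM_PARITY)
        src |= SKETCH_SRC_MEMORY;
    if (reason & SKETCH_NMI_IO_CHECK)
        src |= SKETCH_SRC_IO;
    return src;
}
```

Bits 2 and 3 are control bits rather than source bits, so a reason byte with only those set still decodes as unknown.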

Second-level handler for machine check

Figure 11.17, from arch/i386/kernel/bluesmoke.c, shows the generic handler for the machine check exception. It prints debugging information on the console and, depending on the seriousness of the problem, either shuts down or continues. 17 void intel_machine_check(struct pt_regs *regs, long error_code) rdmsr(MSR_IA32_MCG_STATUS, mcgstl, mcgsth); if (mcgstl & (1<<0)) recover = 0; printk(KERN_EMERG "CPU %d: Machine Check Exception: %08x%08x\n", smp_processor_id(), mcgsth, mcgstl); high) 39...

Stack layout

Whichever of the data structures described in the previous section is actually provided by the user, it always remains in the user address space. However, a slightly expanded version of it is built on the kernel stack. The standard stack layout, as used in 32-bit protected mode, is the struct pt_regs (see Section 10.3.1.1). In vm86 mode, registers are only 16 bits wide. But the stack layout used for saving registers is made to be similar to a struct pt_regs, by the use of padding fields. In...

Parity error on main memory board

If the nonmaskable interrupt was caused by a memory parity error on the main board, the function shown in Figure 11.11, from arch/i386/kernel/traps.c, is called. It prints a warning message and reenables parity error detection. 380 static void mem_parity_error(unsigned char reason, struct 382 printk(Uhhuh. NMI received. Dazed and confused, but 383 printk(You probably have a hardware problem with your RAM 386 reason = (reason & 0xf) | 4; Figure 11.11 Clearing and disabling the memory parity bit...

Posting a signal to the target process

As part of the procedure of sending a signal, as discussed in Section 18.1.1, the signal has to be posted to the target process. This is the subject of the current section. There is one basic function, and a series of subfunctions called by this. The root of all this processing is the function shown in Figure 18.8, from kernel/signal.c. It is called by send_sig_info() every time a signal is to be sent. 493 static int deliver_signal(int sig, struct siginfo *info, int retval = send_signal(sig,...

Divide error

The first-level handler for the divide error exception (number 0) is the assembly language routine shown in Figure 10.25, from entry.S. This exception occurs if the result of a divide instruction is too big to fit into the result operand or if the divisor is 0. The CS and EIP values on the stack point to the instruction that caused the exception. The CPU does not push any error code on the stack corresponding to this exception. 264 pushl $SYMBOL_NAME(do_divide_error) 269 xorl %eax,%eax 270 pushl %ebp 275...

Instantiating first-level handlers for APIC interrupts

Figure 13.29, from arch/i386/kernel/i8259.c, shows the code that instantiates handlers for these APIC interrupts, by calling macros described in Section 13.6.2. 83 BUILD_SMP_INTERRUPT Figure 13.29 Interprocessor and local APIC interrupts 81-85 these are relevant only in an SMP environment. They are not, strictly speaking, hardware interrupts; there is no hardware irq pin equivalent for them; rather, they are IPIs. These three lines generate first-level handler code. The generating macro is...

Enabling an irq line

The enable_irq() function is shown in Figure 12.29, from arch/i386/kernel/irq.c. It undoes the effect of one call to disable_irq(). If this matches the last disable, processing of interrupts on this irq line is reenabled. 531 void enable_irq(unsigned int irq) 533 irq_desc_t *desc = irq_desc + irq; flags) 539 unsigned int status = desc->status & ~IRQ_DISABLED; 541 if ((status & (IRQ_PENDING | IRQ_REPLAY)) == IRQ_PENDING) 542 desc->status = status | IRQ_REPLAY; 552 printk(enable_irq( u) unbalanced from...
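The pairing of disables and enables can be pictured with a depth counter, as in the following sketch. It is a simplification under assumptions: the descriptor has only a depth and a disabled flag, the line starts out enabled at depth 0, and an unbalanced enable is reported through a return value instead of a printed message. The sketch_ names are hypothetical.

```c
struct sketch_irq_desc {
    unsigned int depth;   /* number of unmatched disable calls          */
    int disabled;         /* nonzero while the line is masked           */
};

/* The first disable masks the line; later ones only deepen the nesting. */
static void sketch_disable_irq(struct sketch_irq_desc *d)
{
    if (d->depth++ == 0)
        d->disabled = 1;
}

/*
 * Undo one disable. The line is unmasked only when the last outstanding
 * disable is matched. Returns -1 for an unbalanced enable, 0 otherwise.
 */
static int sketch_enable_irq(struct sketch_irq_desc *d)
{
    if (d->depth == 0)
        return -1;               /* no disable to undo                  */
    if (--d->depth == 0)
        d->disabled = 0;
    return 0;
}
```

Counting in this way lets independent pieces of code disable the same line without one of them accidentally reenabling it for the other.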

Dequeue a signal

The function shown in Figure 18.30, from kernel/signal.c, takes a signal off the queue and returns the information about it to the caller, which is expected to be holding the sigmask_lock. siginfo_t *info) 244 sig = next_signal(current, mask); sig)) 248 if 249 current->sigpending = 0; 250 return 0; printk("%d -> %d\n", signal_pending(current), sig); the parameters are a pointer to the blocked mask of the current process and a struct siginfo into which the extra information about the signal being...

Offsets into the task_struct

Interrupt handlers may need to reference the task_struct of the current process, which is declared as a C struct (see Chapter 2). Assembly language routines cannot access that directly. When they do need to access the task_struct they use byte offsets to identify the fields. This, of course, means that the position and size of the fields should be constant. We have seen in Section 2.1 that the fields accessed that way are gathered together at the beginning of the task_struct. Figure 10.18, from...

The child's return path

The final point to consider is how the child process moves from kernel mode (in which it was created) back to user mode. When the child process is first scheduled onto a CPU the value in its EIP register is the address of the ret_from_fork routine; see Figure 8.19, from arch/i386/kernel/entry.S. This value was set up in its thread structure by the code in Section 8.3.5.1. Figure 8.19 First code executed by a child process 179 at this stage, all the other registers contain values inherited from...

The Intel binary specification entry point

The Intel family binary compatibility specification (iBCS) defines a system interface for application programs that are compiled for various Unix implementations. It was designed to enable binary application compatibility and migration between different operating system environments on the i386 architecture. The assembly language routine shown in Figure 10.19, from entry.S, is the kernel entry point used by the iBCS. This is the target of the call gate set up in Section 10.2.2. The layout of...

The exec_domain structure

The information that identifies any particular execution domain to the process manager is stored in a struct exec_domain; see Figure 21.4, from <linux/personality.h>. The task_struct has an exec_domain field, which points to the descriptor representing the execution domain currently in use (see Section 2.1). typedef void (*handler_t)(int, struct pt_regs *); 75 this is a prototype of the function used to change domain and personality. It takes two parameters: the number of the interrupt used to...

System entries in the interrupt descriptor table

After the code in Section 10.1.2 has been executed, the IDT is set up, with each of the 256 entries pointing to a default handler, ignore_int. At a later stage in the boot procedure, the trap_init() function shown in Figure 10.9, from arch/i386/kernel/traps.c, is called from line 556 of init_kernel(). It overwrites a number of specific entries in the IDT, pointing them to more appropriate handlers. These are the entries associated with exceptions, the subject of this chapter and the next. All...

The user hash structure

To facilitate easy access to any particular user_struct, and to avoid having to search the entire process list each time a new process is created, these structures are also kept on a hash table. The data structures involved in the implementation of this are shown in Figure 8.22, from kernel/user.c. #define UIDHASH_BITS #define UIDHASH_SZ #define UIDHASH_MASK #define __uidhashfn(uid) static kmem_cache_t static struct user_struct static spinlock_t uidhash_lock (((uid >> UIDHASH_BITS) ^ uid) &...

Sanity checks and APIC identification

The first part of the code, as shown in Figure 13.5, from arch/i386/kernel/apic.c, consists of some sanity checks, and setting up of the ID of the APIC. 263 void __init setup_local_APIC(void) 265 unsigned long value, ver, maxlvt; 275 value = apic_read(APIC_LVR); 276 ver = GET_APIC_VERSION(value); 278 if ((SPURIOUS_APIC_VECTOR & 0x0f) != 0x0f) 279 __error_in_apic_c(); 285 if( clustered_apic_mode & & & phys_cpu_present_map)) 301 apic_write_around(APIC_DFR, 0xffffffff); 307 value &...

Maintaining the time of day

The time of day is maintained in seconds and microseconds in the variable xtime (see Section 15.2.2.2). This is known as wall time. This section will first examine the data structure used to maintain time in this format and then the function that actually updates it. Linux maintains time of day to a granularity of a microsecond. It is not stored in the usual year/month/day format but as the number of seconds that have elapsed since 1 January 1970. Figure 15.10, from <linux/time.h>, shows...
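Keeping time as a pair of seconds and microseconds means that every update has to carry overflow from the microsecond field into the seconds field. A minimal sketch of that carry is given below; the structure and function names are hypothetical, and the tick length used in the example (10000 microseconds) is an assumption.

```c
#define SKETCH_USEC_PER_SEC 1000000L

/* Wall time as seconds and microseconds since 1 January 1970. */
struct sketch_timeval {
    long tv_sec;
    long tv_usec;    /* always kept in the range 0..999999 */
};

/* Advance the clock by 'usec' microseconds, carrying into tv_sec. */
static void sketch_advance(struct sketch_timeval *t, long usec)
{
    t->tv_usec += usec;
    while (t->tv_usec >= SKETCH_USEC_PER_SEC) {
        t->tv_usec -= SKETCH_USEC_PER_SEC;
        t->tv_sec++;
    }
}
```

For example, advancing 10 s 995000 us by one 10000 us tick gives 11 s 5000 us.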

Data structures

As usual, the process manager represents external realities, such as the PIC in this case, by means of data structures. 12.7.2.1 Data structure representing a programmable interrupt controller As we have seen in Section 12.1, an interrupt controller is represented in software by a struct hw_interrupt_type. The instance of this specific to an 8259A PIC is shown in Figure 12.32, from arch/i386/kernel/i8259.c. 132 spinlock_t i8259A_lock = SPIN_LOCK_UNLOCKED; 150 static struct hw_interrupt_type i82...

Declaring and initialising wait queue entries

This section will examine the data structure used to represent an individual entry in a wait queue and how such entries are declared and initialised. Creating a new entry for a wait queue is quite a frequent event in the kernel, so there are a number of macros and functions provided for this purpose. One declares new entries; the other fills in fields in an existing entry. To allow more than one process to wait on the same event, a link data structure __wait_queue is used (see Figure 4.1, from...

Conditional uninterruptible sleep

There are three macros involved in this. The first one just checks the condition; the main macro puts the process to sleep; last, there is a worker macro to change the state of the process. The macro shown in Figure 4.34, from <linux/sched.h>, is the principal piece of code called to put processes to sleep uninterruptibly, depending on a condition. However, it is only a wrapper that tests the condition. Note that as this is a macro the condition is actually defined in the code into which...

Semaphore worker functions

The group of functions examined in this section are called only when a high-level semaphore operation is unable to complete. Also, they are all called through the wrapper routines introduced in Section 6.1.3. In all cases they take one parameter, a pointer to the struct semaphore. This has been pushed on the stack by the wrapper routine. When the first-level up() function determines that there are one or more processes waiting on the semaphore, it calls the __up() function; see Figure 6.12, from...

Freeing an interrupt line

The function shown in Figure 12.26, from arch/i386/kernel/irq.c, deallocates an interrupt line. The handler is removed and the interrupt line is not available for use by any driver; it is disabled and shut down. If the irq was shared, then the caller must ensure that the interrupt is disabled on the card that issues this irq before calling this function. This function may be called from interrupt context, but note that attempting to free an irq in a handler for the same irq hangs the machine. 740...

Getting the polarity and trigger type of an irq line

The polarity of an irq line indicates whether it is active high or low. Also, a line can be edge triggered or level triggered. This section will examine the standard definitions for the different buses, and various functions provided for determining the polarity and trigger type of a particular irq. The four different bus types have different values for polarity and trigger type, as defined in Figure 14.10, from arch/i386/kernel/io_apic.c. 299 static int __init EISA_ELCR(unsigned int irq) 302...

Low-level functions to send an interprocessor interrupt

Finally, the low-level functions used for sending IPIs, both in the previous section and elsewhere, will be examined here. There are several functions for sending IPIs between CPUs. The destination can be one, some, or all of the CPUs in the system. It is also possible for a CPU to send an IPI to itself. The destination can be specified either physically or logically. In physical mode, the destination processor is specified by the 4-bit hardware-assigned ID (8-bit for Pentium 4 and Xeon) of the...

Programmable interrupt controller

A hardware interrupt line is an electrical connection between a device and the CPU. The device can put an electrical signal on this line and so get the attention of the CPU. Because devices use these lines to request interrupts, each is commonly referred to as an irq line, or just an irq. The Intel 8080 was designed at a time when the number of transistors that could be integrated onto one chip was quite limited. The designers only had space in the CPU to implement one interrupt line. For...

Initialising hardware interrupts

The root of the whole setup is init_IRQ(), from arch/i386/kernel/i8259.c (see Figure 12.19). This function is called at boot time, from line 560 of init/main.c. 447 #ifndef CONFIG_X86_VISWS_APIC int vector = FIRST_EXTERNAL_VECTOR + i; if (vector != SYSCALL_VECTOR) set_intr_gate(vector, interrupt[i]); interrupt[0] reschedule_interrupt invalidate_interrupt 483 #ifdef CONFIG_X86_LOCAL_APIC apic_timer_interrupt spurious_interrupt error_interrupt 497 outb_p(LATCH & 0xff, 0x40); 498 outb(LATCH >> 8, 0x40); 508 if...

Macros to generate masking and unmasking functions

The macros shown in Figure 14.38, from arch/i386/kernel/io_apic.c, generate generic masking and unmasking functions for an IO APIC. 134 #define DO_ACTION(name, R, ACTION, FINAL) 136 static void name##_IO_APIC_irq(unsigned int irq) 137 __DO_ACTION(R, ACTION, FINAL) 139 DO_ACTION(__mask, 0, |= 0x00010000, io_apic_sync(entry->apic)) 141 DO_ACTION(__unmask, 0, &= 0xfffeffff, ) 143 DO_ACTION(__mask_and_edge, 0, = (reg & 0xffff7fff) | 0x00010000, ) 0, = (reg & 0xfffeffff) | 0x00008000, ) Figure 14.38 Macros to generate masking...

Edge triggered interrupts on an IO APIC

The discussion begins with the struct hw_interrupt_type and then we go on to look at the individual functions. 14.3.1.1 Controller functions for edge triggered interrupts The struct hw_interrupt_type declared and initialised for an edge triggered irq is shown in Figure 14.30, from arch/i386/kernel/io_apic.c. 1334 static struct hw_interrupt_type ioapic_edge_irq_type Figure 14.30 Controller functions for edge triggered interrupts...

Interrupt command register

This section considers the definitions for the 64-bit interrupt command register, as shown in Figure 13.3, from <asm-i386/apicdef.h>. A CPU sends an interprocessor interrupt (IPI) by writing to this register. APIC_INT_LEVELTRIG 0x08000 APIC_INT_ASSERT 0x04000 APIC_ICR_BUSY 0x01000 APIC_DEST_LOGICAL 0x00800 0x00000 0x00100 APIC_DM_NMI APIC_DM_INIT APIC_DM_STARTUP APIC_DM_EXTINT GET_APIC_DEST_FIELD(x) (((x) >> 24) & 0xFF) SET_APIC_DEST_FIELD(x) ((x) << 24) Figure 13.3 The interrupt command register...
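The two destination-field macros pack an 8-bit destination into the top byte of a 32-bit word and extract it again. The sketch below wraps the same shifts in small functions; the sketch_ names are hypothetical, and masking the input to 8 bits before shifting is an added precaution that is not part of the macros as shown.

```c
#include <stdint.h>

/* Same shifts as the macros in the figure, under sketch names. */
#define SKETCH_GET_APIC_DEST_FIELD(x) (((x) >> 24) & 0xFF)
#define SKETCH_SET_APIC_DEST_FIELD(x) ((x) << 24)

/* Build a register word that carries 'dest' in bits 24 to 31. */
static uint32_t sketch_pack_dest(uint32_t dest)
{
    return SKETCH_SET_APIC_DEST_FIELD(dest & 0xFF);
}

/* Recover the destination from such a register word. */
static uint32_t sketch_unpack_dest(uint32_t word)
{
    return SKETCH_GET_APIC_DEST_FIELD(word);
}
```

Packing destination 3 gives 0x03000000, and unpacking any word ignores the low 24 bits.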

The local vector table

Each local APIC has a range of registers known as the local vector table (LVT). The definitions for these registers are shown in Figure 13.4, from <asm-i386/apicdef.h>. 108 #define APIC_BASE (fix_to_virt(FIX_APIC_BASE)) 110 #define MAX_IO_APICS 8 Figure 13.4 Constants for the local vector table 73-94 these registers constitute the LVT, which specifies delivery and status information for local interrupts. There...

Interrupt handling registers

The next block of definitions is shown in Figure 13.2, from <asm-i386/apicdef.h>. These are concerned with incoming interrupts from all sources. APIC_SPIV_FOCUS_DISABLED (1<<9) APIC_ESR_RECV_ACC APIC_ESR_SENDILL APIC_ESR_RECVILL APIC_ESR_ILLREGA Figure 13.2 Interrupt handling registers this is the offset for the logical destination register (LDR). The destination of an interrupt can be specified logically, using an 8-bit destination address. Each local APIC is given a unique logical ID...

Level triggered interrupts on an IO APIC

Level triggered interrupts are special because no IO APIC registers are touched while handling them. The APIC is acknowledged in the end handler, not in the start handler. Protection against re-entrance from the same interrupt is still provided, both by the generic irq layer and by the fact that an unacknowledged local APIC does not accept irqs. 14.3.2.1 Controller functions for level triggered interrupts There is also a struct hw_interrupt_type declared and initialised for an IO APIC with...
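The ordering described above (acknowledge in the end handler rather than at the start) can be sketched with a cut-down controller descriptor. Everything here is hypothetical and for illustration only: struct irq_controller stands in for struct hw_interrupt_type with only two of its function pointers, mock_ack_apic() merely records that an acknowledgement happened, and the edge triggered variant is included as an assumed contrast, not as a transcription of the kernel code.

```c
#include <stddef.h>

/* Events recorded by the mock, in the order they happen. */
enum event { EV_ACK_APIC = 1, EV_HANDLER = 2 };

static enum event trace[4];
static size_t trace_len;

static void record(enum event e)
{
    if (trace_len < sizeof(trace) / sizeof(trace[0]))
        trace[trace_len++] = e;
}

/* Hypothetical, reduced stand-in for struct hw_interrupt_type. */
struct irq_controller {
    void (*ack)(unsigned int irq);  /* called before the handler runs */
    void (*end)(unsigned int irq);  /* called after the handler returns */
};

/* Stand-in for acknowledging the local APIC; no hardware is touched. */
static void mock_ack_apic(void) { record(EV_ACK_APIC); }

/* Level triggered: nothing at the start, acknowledge at the end. */
static void level_ack(unsigned int irq) { (void)irq; }
static void level_end(unsigned int irq) { (void)irq; mock_ack_apic(); }

/* Edge triggered (assumed contrast): acknowledge at the start. */
static void edge_ack(unsigned int irq) { (void)irq; mock_ack_apic(); }
static void edge_end(unsigned int irq) { (void)irq; }

static const struct irq_controller level_type = { level_ack, level_end };
static const struct irq_controller edge_type  = { edge_ack,  edge_end  };

/* Simplified generic layer: ack, run the handler, then end. */
static void dispatch(const struct irq_controller *c, unsigned int irq)
{
    c->ack(irq);
    record(EV_HANDLER);  /* the device handler would run here */
    c->end(irq);
}
```

Calling dispatch(&level_type, 5) records the handler first and the acknowledgement second, so no further irq is accepted by the (mock) local APIC while the handler is still running.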

Error handling on a local APIC

The final block of code, as shown in Figure 13.8, from arch/i386/kernel/apic.c, is relevant only to an integrated local APIC, not an 82489DX. It sets up the ESR (error status register). 393 if (APIC_INTEGRATED(ver) && !esr_disable) 398 printk("ESR value before enabling vector: %08lx\n", value); value 408 printk("ESR value after enabling vector: %08lx\n", value); 417 printk("Leaving ESR disabled.\n"); 419 printk("No ESR for 82489DX.\n"); 422 if (nmi_watchdog == NMI_LOCAL_APIC) Figure 13.8 Error handling on a local...

The local interrupt pins

The next part of the code, as shown in Figure 13.7, from arch/i386/kernel/apic.c, sets up the two local interrupt pins, LINT0 and LINT1. 372 value = apic_read(APIC_LVT0) & APIC_LVT_MASKED; 373 if (!smp_processor_id() && (pic_mode || !value)) smp_processor_id() 377 value = APIC_DM_EXTINT | APIC_LVT_MASKED; 378 printk("masked ExtINT on CPU#%d\n", smp_processor_id()); apic_write_around(APIC_LVT0, value); value = APIC_DM_NMI | APIC_LVT_MASKED; if (!APIC_INTEGRATED(ver)) /* 82489DX */ value |= APIC_LVT_LEVEL_TRIGGER; apic_write_around...

The reschedule interrupt

The function in Figure 13.18, from arch/i386/kernel/smp.c, sends a reschedule IPI to another CPU. 494 void smp_send_reschedule(int cpu) 496 send_IPI_mask(1 << cpu, RESCHEDULE_VECTOR); Figure 13.18 Sending a reschedule interprocessor interrupt to another processor 496 the send_IPI_mask function is discussed in Section 13.5.4.2. It instructs the APIC to send the RESCHEDULE_VECTOR interrupt to cpu. See Section 12.5.3 for a definition of this vector.
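To make the mask argument concrete, here is a small user-space sketch of the same call. The send_IPI_mask() below is only a mock that records its arguments (the real function is the one discussed in Section 13.5.4.2), the value given to RESCHEDULE_VECTOR is a placeholder chosen for the example, and 1UL is used in place of 1 so that the shift is done on an unsigned long.

```c
/* Placeholder value for illustration; see Section 12.5.3 for the real one. */
#define RESCHEDULE_VECTOR 0xfc

/* Arguments seen by the mock, so that the call can be inspected. */
static unsigned long last_mask;
static int last_vector;

/* Mock: the real function programs the APIC; this one only records. */
static void send_IPI_mask(unsigned long mask, int vector)
{
    last_mask = mask;
    last_vector = vector;
}

/* One bit per CPU: bit n set means "deliver to CPU n". */
static void smp_send_reschedule(int cpu)
{
    send_IPI_mask(1UL << cpu, RESCHEDULE_VECTOR);
}
```

A call such as smp_send_reschedule(3) therefore passes the mask 0x8, selecting CPU 3 only.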

Disabling an IO APIC before rebooting

Linux provides a number of functions for clearing one or more entries from the IO APIC registers, as well as for disabling the whole APIC. These are typically used by the reboot code. The function shown in Figure 14.26, from arch/i386/kernel/io_apic.c, clears all registers in all IO APICs before rebooting. 1029 void disable_IO_APIC(void) 1034 this clears all the registers in all IO APICs (see Section 14.2.4.2). 1036 this function reenables the PIC now that the APIC...

Setup virtual wire mode

If there is no IO APIC present, then external devices are connected to a PIC, which in turn is connected to a local APIC. In this case the local APIC is set into virtual wire mode, merely providing a connection to the CPU. All arbitration between interrupts and provision of vectors is done by an 8259A PIC. The code shown in Figure 13.9, from arch/i386/kernel/apic.c, sets up a local APIC in this mode. 224 void __init init_bsp_APIC(void) if (smp_found_config || !cpu_has_apic) return; value = apic_read...

SIMD coprocessor errors

The function that actually handles errors in the SIMD co-processor is shown in Figure 11.25, from arch/i386/kernel/traps.c. 610 void simd_math_error(void *eip) task = current; save_init_fpu(task); task->thread.trap_no = 19; task->thread.error_code = 0; info.si_signo = SIGFPE; info.si_errno = 0; switch (~((mxcsr & 0x1f80) >> 7) & (mxcsr & 0x3f)) case 0x000: default: info.si_code = FPE_FLTINV; break; case 0x002: case 0x010: info.si_code = FPE_FLTUND; break; case 0x004: info.si_code = FPE_FLTDIV; break; case 0x008:...
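The switch expression in this function can be read as "exception flags that are raised and not masked". The sketch below isolates that computation in a helper with a hypothetical name, unmasked_simd_exceptions(); it assumes the standard MXCSR layout, with the six exception flags in bits 0 to 5 and the corresponding mask bits in bits 7 to 12, and it works on a plain integer, not on the real register.

```c
#include <stdint.h>

/*
 * Return the MXCSR exception flags (bits 0-5) whose mask bits
 * (bits 7-12) are clear, i.e. the exceptions that are actually
 * reported to software.
 */
static unsigned int unmasked_simd_exceptions(uint32_t mxcsr)
{
    uint32_t masks = (mxcsr & 0x1f80) >> 7;  /* mask bits, moved to 0-5 */
    uint32_t flags = mxcsr & 0x3f;           /* raised exception flags */

    return (unsigned int)(~masks & flags);
}
```

With every exception masked (0x1f80), the result is 0 whatever flags are set. If the divide-by-zero mask (bit 9) is cleared and the divide-by-zero flag (bit 2) is raised, the result is 0x004, which is the case that selects FPE_FLTDIV in the listing above.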

Manipulating floating point unit registers

Later models of the i386 architecture have more FPU registers than do earlier ones. The streaming SIMD extensions (SSE) were introduced with the Pentium III. They are of use in areas such as image processing and word recognition. SSE uses 128-bit registers, called XMM registers. There is also an MXCSR register, containing control and status bits for operating the XMM registers. H.10.2.1 Initialising floating point unit registers When the current process uses the FPU for the first time, the function...

Local APIC timer interrupt

The second-level handler for this increments the running count of timer ticks and acknowledges the irq. Then it calls a third-level handler, which does the sort of housekeeping work associated with timer interrupts. 13.7.3.1 Second-level timer handling Each local APIC generates a timer interrupt. Figure 13.34, from arch/i386/kernel/apic.c, shows the second-level handler for this interrupt. 1021 unsigned int apic_timer_irqs[NR_CPUS]; 1023 void smp_apic_timer_interrupt(struct pt_regs *regs) 1025 int...
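The three steps named above (count the tick, acknowledge the irq, call the third-level handler) can be sketched in user space as follows. This is an illustration under stated assumptions, not the kernel function: NR_CPUS is given an arbitrary value, the cpu number is passed in as a parameter instead of being looked up, and mock_ack_irq() and mock_third_level() are hypothetical stand-ins that only count how often they are called.

```c
#define NR_CPUS 4  /* arbitrary value for the sketch */

/* Running count of local APIC timer ticks, one slot per CPU. */
static unsigned int apic_timer_irqs[NR_CPUS];

static unsigned int acks;               /* acknowledgements issued */
static unsigned int third_level_calls;  /* housekeeping invocations */

/* Hypothetical stand-ins; no hardware or scheduler is involved. */
static void mock_ack_irq(void)     { acks++; }
static void mock_third_level(void) { third_level_calls++; }

/* Second-level handling, reduced to the three steps in the text. */
static void timer_second_level(int cpu)
{
    apic_timer_irqs[cpu]++;  /* 1: count the tick for this CPU */
    mock_ack_irq();          /* 2: acknowledge the irq */
    mock_third_level();      /* 3: hand over to the third-level handler */
}
```

Keeping one counter per CPU means that each processor updates only its own slot, so the ticks of different local APICs are counted independently.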