Dispatching and Parameter Passing
System calls are uniquely identified by a number assigned by the kernel. This is done for practical reasons that become clear when system calls are activated. All calls are handled by a single central piece of code that uses the number to dispatch a specific function by reference to a static table. The parameters passed are also handled by the central code so that parameter passing is implemented independently of the actual system call.
Switching from user to kernel mode, and therefore dispatching and parameter passing, is implemented in assembly language code to cater for many platform-specific features. Because so many architectures are supported, not every detail can be covered, and our description is therefore restricted to the widespread IA-32 architecture. The implementation approach is much the same on other processors, even though assembler details may differ.
To permit switching between user and kernel mode, the user process must first draw attention to itself by means of a special machine instruction; this requires the assistance of the C standard library. The kernel must also provide a routine that satisfies the switch request and looks after the technical details. This routine cannot be implemented in userspace because it needs instructions that normal applications are not permitted to execute.
Different platforms use different assembler methods to execute system calls.5 System call parameters are passed directly in registers on all platforms — which handler function parameter is held in which register is precisely defined. A further register is needed to define the system call number used during subsequent dispatching to find the matching handler function.
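From the application's point of view, the general library interface mentioned in the footnote is reachable through the C library function syscall(), which takes the system call number followed by the parameters. The following is a minimal sketch, assuming a Linux system with glibc; the helper name getpid_by_number is made up for illustration.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_getpid and the other call numbers */
#include <sys/types.h>     /* pid_t */
#include <unistd.h>        /* syscall(), getpid() */

/* Hypothetical helper: request the process ID by system call number
 * instead of going through the usual getpid() wrapper. The library
 * places the number and any parameters in the registers that the
 * platform prescribes and then executes the switch instruction. */
pid_t getpid_by_number(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

Both routes should report the same process ID, since the number selects the same handler in the kernel.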
The following overview shows the methods used by a few popular architectures to make system calls:
□ On IA-32 systems, the assembly language instruction int $0x80 raises software interrupt 128. This is a call gate to which a specific function is assigned to continue system call processing. The system call number is passed in register eax, while parameters are passed in registers ebx, ecx, edx, esi, and edi.6
On more modern processors of the IA-32 series (Pentium II and higher), two assembly language instructions (sysenter and sysexit) are used to enter and exit kernel mode quickly. The way in which parameters are passed and returned is the same, but switching between privilege levels is faster.
To enable sysenter calls to be made faster without losing downward compatibility with older processors, the kernel maps a memory page into the top end of address space (at 0xffffe000). Depending on processor type, the system call code on this page includes either int $0x80 or sysenter. Calling the code stored there (with call 0xffffe000) allows the standard library to automatically select the method that matches the processor used.
5 The details are easy to find in the sources of the GNU standard library by referring to the file named sysdeps/unix/sysv/linux/arch/syscall.S. The assembly language code required for the particular platform can be found under the syscall label; this code provides a general interface for invoking system calls for the rest of the library.
6 In addition to the 0x80 call gate, the kernel implementation on IA-32 processors features two other ways of entering kernel mode and executing system calls: the lcall7 and lcall27 call gates. These are used to perform binary emulation for BSD and Solaris because these systems make system calls in native mode. They differ only slightly from the standard Linux method and offer little in the way of new insight, which is why they are not discussed here.
□ Alpha processors provide a privileged system mode (PALcode, short for Privileged Architecture Library code) in which various system kernel routines can be stored. The kernel employs this mechanism by including in the PAL code a function that must be activated in order to execute system calls. call_pal PAL_callsys transfers control flow to the desired routine. v0 is used to pass the system call number, and the five possible arguments are held in a0 to a4 (note that register naming is more systematic in recent architectures than in earlier architectures such as IA-32).
□ PowerPC processors feature an elegant assembly language instruction called sc (system call). This is used specifically to implement system calls. Register r0 holds the system call number, while parameters are held in registers r3 to r8 inclusive.
□ The AMD64 architecture also has its own assembly language instruction with the revealing name of syscall to implement system calls. The system call number is held in the rax register, parameters in rdi, rsi, rdx, r10, r8, and r9.
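To make the register conventions in the overview above concrete, here is a rough sketch for the AMD64 case using GCC-style inline assembly. It assumes Linux and a GCC-compatible compiler; the helper name raw_syscall3 is invented for this example, only the first three parameter registers are used, and on any other platform the sketch simply falls back to the C library's syscall() function.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_getpid and the other call numbers */
#include <unistd.h>        /* syscall(), getpid() */

/* Hypothetical helper: issue a system call with up to three parameters. */
long raw_syscall3(long nr, long a1, long a2, long a3)
{
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    /* Number in rax, parameters in rdi, rsi and rdx; the result comes
     * back in rax. The syscall instruction itself overwrites rcx and
     * r11, so both are listed as clobbered. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "0"(nr), "D"(a1), "S"(a2), "d"(a3)
                      : "rcx", "r11", "memory");
    return ret;
#else
    /* Assumption: elsewhere, let the C library choose the instruction. */
    return syscall(nr, a1, a2, a3);
#endif
}
```

A call such as raw_syscall3(SYS_getpid, 0, 0, 0) should yield the same value as getpid(). Note that the raw instruction reports failure as a negative error code in rax, whereas the library function returns -1 and sets errno.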
Once the application program has switched to kernel mode with the help of the standard library, the kernel is faced with the task of finding the matching handler function for the system call and supplying it with the passed parameters. A table named sys_call_table, which holds a set of function pointers to handler routines, is available for this purpose on all (!) platforms. Because the table is generated with assembly language instructions in the data segment of the kernel, its contents differ from platform to platform. The principle, however, is always the same: by reference to the system call number, the kernel finds the appropriate position in the table at which a pointer points to the desired handler function.
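The table lookup can be illustrated in plain C. This is only a userspace sketch of the principle, not kernel code: the kernel builds its table in assembly language, and every name below (demo_call_table, demo_dispatch, the two toy handlers) is made up for the example. Returning -ENOSYS for an unknown number follows the kernel's convention of negative error codes.

```c
#include <errno.h>    /* ENOSYS */
#include <stddef.h>

/* All handlers share one signature so that they fit into one table. */
typedef long (*syscall_fn)(long a1, long a2, long a3);

/* Two toy handlers standing in for real kernel routines. */
long demo_add(long a1, long a2, long a3) { (void)a3; return a1 + a2; }
long demo_max(long a1, long a2, long a3) { (void)a3; return a1 > a2 ? a1 : a2; }

/* Static table of function pointers: the call number is the index. */
static const syscall_fn demo_call_table[] = {
    demo_add,   /* number 0 */
    demo_max,   /* number 1 */
};

#define DEMO_NR_CALLS (sizeof demo_call_table / sizeof demo_call_table[0])

/* Central dispatcher: check the number, then call through the table
 * and hand the parameters on unchanged. */
long demo_dispatch(unsigned long nr, long a1, long a2, long a3)
{
    if (nr >= DEMO_NR_CALLS)
        return -ENOSYS;   /* no handler registered for this number */
    return demo_call_table[nr](a1, a2, a3);
}
```

Because the dispatcher passes the parameters on without interpreting them, adding a new call only requires a new table entry, which is the independence between parameter passing and the individual system call described above.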
System Call Table
Let us take a look at the sys_call_table of a Sparc64 system as defined in arch/sparc64/kernel/systbls.S. (System call tables for other systems can be found in a file often called entry.S in the corresponding directory for the processor type.)