Task state segment
The task state segment (TSS) is specific to the i386 architecture. It is Intel's layout for the volatile environment of a process. The TR register in the CPU always points to the TSS of the current process. Intel intended that each process would have its own TSS and that the volatile environment of a process would be saved there when it was context switched out. Linux does not implement things that way, preferring to save most of the volatile environment on the kernel stack of the process and the remainder in the thread structure. However, the CPU expects to find a valid value in its TR, and a valid TSS as well. So Linux provides one TSS, shared by all processes, to keep the CPU happy. On a multiprocessor system, there is an array of these, one per CPU.
7.2.3.1 Structure of the task state segment
Figure 7.6, from <asm-i386/processor.h>, shows the structure of the TSS.
334 struct tss_struct {
335 	unsigned short	back_link,__blh;
336 	unsigned long	esp0;
337 	unsigned short	ss0,__ss0h;
338 	unsigned long	esp1;
339 	unsigned short	ss1,__ss1h;
340 	unsigned long	esp2;
341 	unsigned short	ss2,__ss2h;
342 	unsigned long	__cr3;
343 	unsigned long	eip;
344 	unsigned long	eflags;
345 	unsigned long	eax,ecx,edx,ebx;
346 	unsigned long	esp;
347 	unsigned long	ebp;
348 	unsigned long	esi;
349 	unsigned long	edi;
350 	unsigned short	es, __esh;
351 	unsigned short	cs, __csh;
352 	unsigned short	ss, __ssh;
353 	unsigned short	ds, __dsh;
354 	unsigned short	fs, __fsh;
355 	unsigned short	gs, __gsh;
356 	unsigned short	ldt, __ldth;
357 	unsigned short	trace, bitmap;
358 	unsigned long	io_bitmap[IO_BITMAP_SIZE+1];
362 	unsigned long	__cacheline_filler[5];
363 };
Figure 7.6 The task state segment - one per central processing unit (CPU)
335 this is a link field that allows more than one TSS to be linked together. As each TSS is a segment in its own right, this is only a 16-bit selector (an index into the segment table), so __blh (for back link high) is a filler. It is not used by Linux.
336-337 the esp0 field is a 32-bit pointer to the top of the register save area on the stack used when operating at protection level 0 (kernel mode in Linux); ss0 is a 16-bit selector for the segment containing the level-0 stack; __ss0h is a filler.
338-341 this is similar information for the level-1 and level-2 stacks. These are not used by Linux.
342 the CR3 register in the CPU holds the physical address of the page directory (the top-level page table) of the process; this field is the save area for it.
343-349 these are standard CPU registers.
350-355 these are the standard segment registers. As these are all 16-bit values (selectors), each is padded out with an unused unsigned short.
356 the value of the LDT register is saved here. This register contains the selector for the local descriptor table of the current process. This is the concern of the i386-specific part of the memory manager and will not be considered further here.
357 the trace field is available to indicate special attributes of the process. The only one used is the T (debug trap) flag, in bit 0. This lets the context switcher know if the debug registers contain valid information or not. The bitmap field contains the offset within this present structure at which the input-output (IO) bitmap can be found.
358 this is the IO bitmap itself. It defines the IO addresses that this process can access. Its size is determined by the constant in <asm-i386/processor.h> as
278 #define IO_BITMAP_SIZE 32
It is really the province of the IO manager.
362 these are five dummy longs, which bring the size of the structure up to a multiple of 32 bytes, the size of a cacheline. With these 20 bytes added, the size of a struct tss_struct is 256 bytes, or 8 cachelines.
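The size arithmetic can be checked in user space. The following is a minimal sketch, not kernel code. It assumes a C11 compiler, it uses uint16_t and uint32_t in place of the i386 unsigned short and unsigned long so that it gives the same answer on a 64-bit host, and the structure name and filler names (tss_model, blh, ss0h and so on) are hypothetical, chosen to avoid identifiers reserved for the implementation.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* fixed-width integer types */

#define IO_BITMAP_SIZE 32   /* same value as in the text */

/* User-space model of the layout described in the text. */
struct tss_model {
    uint16_t back_link, blh;
    uint32_t esp0;
    uint16_t ss0, ss0h;
    uint32_t esp1;
    uint16_t ss1, ss1h;
    uint32_t esp2;
    uint16_t ss2, ss2h;
    uint32_t cr3;
    uint32_t eip;
    uint32_t eflags;
    uint32_t eax, ecx, edx, ebx;
    uint32_t esp;
    uint32_t ebp;
    uint32_t esi;
    uint32_t edi;
    uint16_t es, esh;
    uint16_t cs, csh;
    uint16_t ss, ssh;
    uint16_t ds, dsh;
    uint16_t fs, fsh;
    uint16_t gs, gsh;
    uint16_t ldt, ldth;
    uint16_t trace, bitmap;
    uint32_t io_bitmap[IO_BITMAP_SIZE + 1];
    uint32_t cacheline_filler[5];
};

/* 26 longs of register state come first, so the bitmap starts at byte 104. */
static_assert(offsetof(struct tss_model, io_bitmap) == 104,
              "register save area should be 104 bytes");

/* 104 + 33 * 4 = 236 bytes before the filler is added. */
static_assert(offsetof(struct tss_model, cacheline_filler) == 236,
              "filler should start at byte 236");

/* 236 + 20 = 256 bytes, which is 8 cachelines of 32 bytes. */
static_assert(sizeof(struct tss_model) == 256,
              "whole structure should be 256 bytes");
static_assert(sizeof(struct tss_model) % 32 == 0,
              "size should be a multiple of the cacheline size");
```

If a field were added or removed, compilation would fail at one of the assertions, which is one way of documenting a layout on which the hardware depends.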
7.2.3.2 The task state segment array
There is an array of these tss_struct structures, one per CPU. See Figure 7.7 from arch/i386/kernel/init_task.c. They are statically initialised at compile time.
32 struct tss_struct init_tss[NR_CPUS] __cacheline_aligned = { [0 ... NR_CPUS-1] = INIT_TSS };
Figure 7.7 The initial task state segment array
32 as the TSS size has been kept to a multiple of a cacheline, there is no problem keeping each one cacheline aligned. The initialising macro is described in the next section.
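The alignment argument can be tried out with a small stand-in structure. The sketch below assumes GCC or Clang, because it relies on two extensions: the range designator [first ... last] to give every element the same static initial value, and the aligned attribute, which is used here as a simplified stand-in for __cacheline_aligned. The names tss_demo, demo_tss, DEMO_NR_CPUS and DEMO_INIT_TSS, and the values in the initialiser, are hypothetical.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

#define DEMO_NR_CPUS   4    /* hypothetical CPU count for the demo */
#define DEMO_CACHELINE 32   /* cacheline size quoted in the text */

/* A cut-down stand-in for the TSS, padded to exactly one cacheline. */
struct tss_demo {
    uint32_t esp0;
    uint16_t ss0, ss0h;
    uint32_t filler[6];     /* 4 + 4 + 24 = 32 bytes in total */
};

static_assert(sizeof(struct tss_demo) % DEMO_CACHELINE == 0,
              "element size must be a multiple of the cacheline size");

/* Arbitrary demo values: esp0, ss0, ss0h, filler. */
#define DEMO_INIT_TSS { 0x1000u, 0x18, 0, { 0 } }

/*
 * The attribute aligns only the start of the array. Every later element
 * is aligned as well because the element size is a multiple of the
 * alignment, which is the point made in the text.
 */
struct tss_demo demo_tss[DEMO_NR_CPUS]
    __attribute__((aligned(DEMO_CACHELINE))) =
    { [0 ... DEMO_NR_CPUS - 1] = DEMO_INIT_TSS };
```

Without the padding, the second and later elements would straddle cacheline boundaries even though the array itself started on one.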
7.2.3.3 Initial values for an entry in the task state segment
The macro to initialise a TSS entry is found in <asm-i386/processor.h> (see Figure 7.8).
396 #define INIT_TSS {						\
397 	0,0,				/* back_link, __blh */	\
398 	sizeof(init_stack) + (long) &init_stack,		\
399 	__KERNEL_DS, 0,			/* ss0 */		\
400 	0,0,0,0,0,0,			/* stack1, stack2 */	\
401 	0,				/* cr3 */		\
402 	0,0,				/* eip,eflags */	\
403 	0,0,0,0,			/* eax,ecx,edx,ebx */	\
404 	0,0,0,0,			/* esp,ebp,esi,edi */	\
405 	0,0,0,0,0,0,			/* es,cs,ss */		\
406 	0,0,0,0,0,0,			/* ds,fs,gs */		\
407 	__LDT(0),0,			/* ldt */		\
408 	0, INVALID_IO_BITMAP_OFFSET,				\
409 	{~0, }				/* ioperm */		\
410 }
Figure 7.8 Initialisation values for the TSS
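The expression on line 398 adds the size of the stack area to its start address, which gives the address just past its last byte. As an i386 push decrements the stack pointer before storing, this is the correct value for an empty stack. The following user-space sketch illustrates the arithmetic only. The names demo_stack, demo_stack_top and demo_push are hypothetical, and the 8 KB size is an assumption made for the demonstration.

```c
#include <stdint.h>

#define DEMO_STACK_WORDS 2048   /* 2048 * 4 bytes = 8 KB, assumed size */

uint32_t demo_stack[DEMO_STACK_WORDS];

/* Same arithmetic as line 398: start address plus size. */
uintptr_t demo_stack_top(void)
{
    return sizeof(demo_stack) + (uintptr_t)&demo_stack;
}

/* Model of a 32-bit push: decrement the pointer first, then store. */
uint32_t *demo_push(uint32_t *sp, uint32_t value)
{
    *--sp = value;
    return sp;
}
```

Starting from demo_stack_top(), the first call to demo_push() writes to the last word of the array, demo_stack[DEMO_STACK_WORDS - 1], so nothing is ever stored at the initial address itself.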
397 the link field is 0, as it is unused by Linux.
398 the TSS is initialised for the init process. This is the esp0 field. It takes the address of the beginning of the stack space (&init_stack; see Section 3.3.1) and adds the size of the stack space to it, so giving the top of the stack. So the initial value is pointing to an empty stack.
399 all kernel stacks are in the kernel data segment.
400 level-1 and level-2 stacks are not used by Linux, so these fields are initialised to 0.
401-406 the save areas for all these registers are initialised to 0.
407 the __LDT() macro, part of the memory manager, evaluates to a selector for the local descriptor table.
408 the trace field is set to 0. The value supplied for the bitmap field is from <asm-i386/processor.h>:
280 #define INVALID_IO_BITMAP_OFFSET 0x8000
This puts it outside the TSS, so any attempt by the CPU to reference the bitmap causes a fault (a general protection exception on the i386). The bitmap can then be filled in with valid values.
409 the initial setting is that all IO ports are protected.
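The effect of the bitmap offset can be modelled in a few lines. This is a simplified sketch of the check, not the kernel's code and not a complete description of the hardware: it handles one-byte accesses only, the function name io_port_allowed and its parameters are hypothetical, and tss_limit is taken to be the offset of the last valid byte of the segment. A set bit denies access to the corresponding port and a clear bit allows it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_IO_BITMAP_OFFSET 0x8000   /* same value as in the text */

/*
 * Returns true if a one-byte access to 'port' would be permitted,
 * given the raw bytes of a TSS, the offset of its last valid byte,
 * and the value of its bitmap field.
 */
bool io_port_allowed(const uint8_t *tss, size_t tss_limit,
                     uint16_t bitmap_offset, uint16_t port)
{
    /* One bit per port, eight ports per byte. */
    size_t byte = (size_t)bitmap_offset + port / 8;

    /* A bitmap byte beyond the end of the segment means the access faults. */
    if (byte > tss_limit)
        return false;

    /* Bit clear: allowed. Bit set: protected. */
    return (tss[byte] & (1u << (port % 8))) == 0;
}
```

With bitmap_offset set to INVALID_IO_BITMAP_OFFSET the function returns false for every port, whatever the bitmap contains, because 0x8000 lies far beyond the end of a 256-byte structure.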