Module Representation

Before looking more closely at the implementation of the module-related functions, it is necessary to explain how modules (and their properties) are represented in the kernel. As usual, a set of data structures is defined to do this.

Not surprisingly, the name of the most important structure is module; an instance of this structure is allocated for each module resident in the kernel. It is defined as follows:

struct module {
    enum module_state state;

    /* Member of list of modules */
    struct list_head list;

    /* Unique handle for this module */
    char name[MODULE_NAME_LEN];

    /* Exported symbols */
    const struct kernel_symbol *syms;
    unsigned int num_syms;
    const unsigned long *crcs;

    /* GPL-only exported symbols. */
    const struct kernel_symbol *gpl_syms;
    unsigned int num_gpl_syms;
    const unsigned long *gpl_crcs;

    /* symbols that will be GPL-only in the near future. */
    const struct kernel_symbol *gpl_future_syms;
    unsigned int num_gpl_future_syms;
    const unsigned long *gpl_future_crcs;

    unsigned int num_exentries;
    const struct exception_table_entry *extable;

    /* If this is non-NULL, vfree after init() returns */
    void *module_init;

    /* Here is the actual code + data, vfree'd on unload. */
    void *module_core;

    /* Here are the sizes of the init and core sections */
    unsigned long init_size, core_size;

    /* The size of the executable code in each section. */
    unsigned long init_text_size, core_text_size;

    /* Arch-specific module values */
    struct mod_arch_specific arch;

    unsigned int taints; /* same bits as kernel:tainted */

#ifdef CONFIG_MODULE_UNLOAD
    /* Reference counts */
    struct module_ref ref[NR_CPUS];

    /* What modules depend on me? */
    struct list_head modules_which_use_me;

    /* Who is waiting for us to be unloaded */
    struct task_struct *waiter;

    /* Destruction function. */
    void (*exit)(void);
#endif

#ifdef CONFIG_KALLSYMS
    /* We keep the symbol and string tables for kallsyms. */
    Elf_Sym *symtab;
    unsigned long num_symtab;
    char *strtab;

    /* Section attributes */
    struct module_sect_attrs *sect_attrs;
    struct module_notes_attrs *notes_attrs;
#endif

    /* The command line arguments (may be mangled). People like
       keeping pointers to this stuff */
    char *args;
};

As this source code extract shows, the structure definition depends on the kernel configuration settings:

□ KALLSYMS is a configuration option (selectable only on embedded systems; it is always enabled on regular machines) that keeps in memory a list of all symbols defined in the kernel itself and in the loaded modules (otherwise only the exported functions are stored). This is useful if oops messages (which are issued when the kernel detects a deviation from normal behavior, for example, when a null pointer is dereferenced) are to show not only hexadecimal numbers but also the names of the functions involved.

□ In contrast to kernel versions prior to 2.5, the ability to unload modules must now be configured explicitly. The required additional information is not included in the module data structure unless the configuration option MODULE_UNLOAD is selected.

Other configuration options that occur in conjunction with modules but do not change the definition of struct module are as follows:

□ MODVERSIONS enables version control; this prevents an obsolete module whose interface definitions no longer match those of the current kernel version from being loaded. Section 7.5 deals with this in more detail.

□ MODULE_FORCE_UNLOAD enables modules to be removed from the kernel by force, even if there are still references to the module or the code is being used by other modules. This brute force method is never needed in normal operation but can be useful during development.

□ KMOD enables the kernel to automatically load modules once they are needed. This requires some interaction with userspace, which is described later in this chapter.

The elements of struct module have the following meaning:

□ state indicates the current state of the module and can assume one of the values of module_state:

enum module_state {
    MODULE_STATE_LIVE,
    MODULE_STATE_COMING,
    MODULE_STATE_GOING,
};

During loading, the state is MODULE_STATE_COMING. In normal operation (after completion of all initialization tasks), it is MODULE_STATE_LIVE; and while a module is being removed, it is MODULE_STATE_GOING.

□ list is a standard list element used by the kernel to keep all loaded modules in a doubly linked list. The modules global variable defined in kernel/module.c serves as the list head.

□ name specifies the name of the module. This name must be unique because it is referenced, for example, to select the module to be unloaded. The element usually holds the name of the binary file without the .ko suffix, for example, vfat for the VFAT filesystem.
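How list and name work together can be illustrated with a small userspace sketch. It is not the kernel's code: struct module_sketch, list_to_module, list_init, list_add_front and find_module_sketch are hypothetical stand-ins for the kernel's list helpers, and the value of MODULE_NAME_LEN is chosen only for illustration.

```c
#include <stddef.h>
#include <string.h>

#define MODULE_NAME_LEN 64  /* illustrative value, not taken from the text */

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Cut-down module descriptor with only the two members discussed here. */
struct module_sketch {
    struct list_head list;
    char name[MODULE_NAME_LEN];
};

/* Recover the enclosing structure from a pointer to its list member. */
#define list_to_module(ptr) \
    ((struct module_sketch *)((char *)(ptr) - \
                              offsetof(struct module_sketch, list)))

/* An empty list head points to itself in both directions. */
static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Link a new element directly behind the list head. */
static void list_add_front(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Walk the list and return the module with the given name, or NULL. */
static struct module_sketch *find_module_sketch(struct list_head *modules,
                                                const char *name)
{
    struct list_head *pos;

    for (pos = modules->next; pos != modules; pos = pos->next) {
        struct module_sketch *mod = list_to_module(pos);

        if (strcmp(mod->name, name) == 0)
            return mod;
    }
    return NULL;
}
```

Because every name is unique, the first match is also the only one. A real implementation would additionally have to protect the list against concurrent modification, which the sketch ignores.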

□ syms, num_syms, and crcs are used to manage the symbols exported by the module. syms is an array of num_syms entries of the kernel_symbol type and is responsible for assigning identifiers (name) to memory addresses (value):

struct kernel_symbol {
    unsigned long value;
    const char *name;
};

crcs is also an array with num_syms entries that store checksums for the exported symbols needed to implement version control (see Section 7.5).
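The relationship between the two parallel arrays can be sketched as follows. This is a simplified userspace illustration under stated assumptions, not the kernel's lookup routine: struct symbol_set and lookup_symbol_sketch are hypothetical names, and a plain linear search is used.

```c
#include <stddef.h>
#include <string.h>

struct kernel_symbol {
    unsigned long value;
    const char *name;
};

/* Hypothetical container for one set of exported symbols of a module. */
struct symbol_set {
    const struct kernel_symbol *syms;
    unsigned int num_syms;
    const unsigned long *crcs;  /* may be NULL if no checksums are stored */
};

/*
 * Search the set for a symbol with the given name. Returns the matching
 * entry or NULL. If checksums are present and crc is non-NULL, the
 * checksum at the same array index is stored in *crc.
 */
static const struct kernel_symbol *
lookup_symbol_sketch(const struct symbol_set *set, const char *name,
                     unsigned long *crc)
{
    unsigned int i;

    for (i = 0; i < set->num_syms; i++) {
        if (strcmp(set->syms[i].name, name) == 0) {
            if (crc != NULL && set->crcs != NULL)
                *crc = set->crcs[i];
            return &set->syms[i];
        }
    }
    return NULL;
}
```

The point of the sketch is the shared index: entry i of crcs always belongs to entry i of syms, which is why a single counter, num_syms, suffices for both arrays.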

□ When symbols are exported, the kernel considers not only symbols that may be used by all modules regardless of their license, but also symbols that may be used only by modules with GPL and GPL-compatible licenses. A third category consists of symbols that may at present still be used by modules with any license, but will be made GPL-only in the near future. The gpl_syms, num_gpl_syms and gpl_crcs elements are provided for GPL-only symbols, while gpl_future_syms, num_gpl_future_syms and gpl_future_crcs serve for future GPL-only symbols. They have the same meaning as the entries discussed above but manage symbols that may be used only by GPL-compatible modules, now or in the future.

Two more sets of symbols (which are for brevity's sake omitted from the structure definition above) are described by the structure members unused_gpl_syms and unused_syms, together with the corresponding counter and checksum members. The sets are used to store (GPL-only) symbols that are exported, but unused by in-tree kernel modules. The kernel prints a warning message when an out-of-tree module nevertheless uses a symbol of this type.

□ If a module defines new exceptions (see Chapter 4), their description is held in the extable array. num_exentries specifies the number of entries in the array.

□ init is a pointer to a function called when the module is initialized.

□ The binary data of a module are divided into two parts: the initialization part and the core part. The former contains everything that can be discarded after loading has terminated (e.g., the initialization functions). The latter contains all data needed during the current operation. The start address of the initialization part is held in module_init and comprises init_size bytes, whereas the core area is described by module_core and core_size.
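One practical use of these four members is to decide whether a given address belongs to a module and, if so, to which part. The following is a minimal userspace sketch of that range check; struct module_layout_sketch and the three helper functions are hypothetical names, and the assumption that module_init is reset to NULL once the initialization part has been freed is made for the example only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Only the four layout members discussed in the text. */
struct module_layout_sketch {
    void *module_init;  /* assumed NULL once the init part is freed */
    void *module_core;
    unsigned long init_size, core_size;
};

/* True if addr lies in the half-open range [start, start + size). */
static bool addr_in_region(const void *start, unsigned long size,
                           const void *addr)
{
    uintptr_t s = (uintptr_t)start;
    uintptr_t a = (uintptr_t)addr;

    return start != NULL && a >= s && a - s < size;
}

/* Does addr point into the initialization part of the module? */
static bool within_init_sketch(const struct module_layout_sketch *mod,
                               const void *addr)
{
    return addr_in_region(mod->module_init, mod->init_size, addr);
}

/* Does addr point into the core part of the module? */
static bool within_core_sketch(const struct module_layout_sketch *mod,
                               const void *addr)
{
    return addr_in_region(mod->module_core, mod->core_size, addr);
}
```

Once the initialization part has been discarded, only the core check can still succeed, which mirrors the division described above.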

□ arch is a processor-specific hook that, depending on the particular system, can be filled with various additional data needed to run modules. Most architectures do not require any additional information and therefore define struct mod_arch_specific as an empty structure that is removed by the compiler during optimization.

□ taints is set if a module taints the kernel. Tainting means that the kernel suspects the module of doing something harmful that could prevent correct kernel operation. Should a kernel panic occur (a panic is triggered when a fatal internal error does not allow regular operation to resume), the error diagnosis will also contain information about why the kernel is tainted. This helps developers to distinguish bug reports coming from properly running systems from those where something was already suspicious.

The function add_taint_module is provided to taint a given instance of struct module. A module can taint the kernel for two reasons:

□ TAINT_PROPRIETARY_MODULE is used if a module with a proprietary license, or a license that is not compatible with the GPL, is loaded into the kernel. Since the source code for proprietary modules is most likely not available, kernel developers will not be willing to fix kernel bugs that appear in possibly even completely unrelated kernel areas. The module might have done arbitrary things to the kernel that cannot be tracked, so the bugs might well have been introduced by the module.

Note that the kernel provides the function license_is_gpl_compatible to decide whether a given license is compatible with the GPL.

Contrary to the usual practice, licenses are specified not by constants but by C strings.

□ TAINT_FORCED_MODULE denotes that the module was forcibly loaded. Forced loading can be requested if no version information (also called version magic) is present in the module, or if the module and kernel disagree about the version of some symbol.
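Because licenses are plain strings, the compatibility test boils down to a series of string comparisons. The following userspace sketch illustrates the idea under stated assumptions: the list of accepted strings is only a sample, the bit values of the taint flags are illustrative, and all names ending in _sketch are hypothetical rather than kernel functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative bit values; the real ones are defined by the kernel. */
#define TAINT_PROPRIETARY_MODULE (1u << 0)
#define TAINT_FORCED_MODULE      (1u << 1)

/* Cut-down module descriptor holding only the taint bits. */
struct tainted_module_sketch {
    unsigned int taints;
};

/* Compare the license string against a sample list of accepted strings. */
static bool license_is_gpl_compatible_sketch(const char *license)
{
    static const char *const accepted[] = {
        "GPL", "GPL v2", "Dual BSD/GPL",  /* sample entries (assumption) */
    };
    size_t i;

    for (i = 0; i < sizeof(accepted) / sizeof(accepted[0]); i++) {
        if (strcmp(license, accepted[i]) == 0)
            return true;
    }
    return false;
}

/* Record the taint if the license is missing or not GPL-compatible. */
static void set_license_sketch(struct tainted_module_sketch *mod,
                               const char *license)
{
    if (license == NULL)
        license = "unspecified";

    if (!license_is_gpl_compatible_sketch(license))
        mod->taints |= TAINT_PROPRIETARY_MODULE;
}
```

The comparison is exact, so a string that merely resembles an accepted license does not pass. Taint bits are only ever set in the sketch, never cleared, so a module that has been marked stays marked.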

□ license_gplok is a Boolean variable that specifies whether the module license is GPL-compatible; in other words, whether GPL-exported functions may be used or not. The flag is set when the module is inserted into the kernel. How the kernel judges a license to be compatible with the GPL or not is discussed below.

□ ref is used for reference counting. There is an entry in the array for each CPU of the system; this entry specifies at how many other points in the system the module is used. The data type module_ref used for the individual array elements contains only one entry, which should, however, be aligned to an L1 cache line:

struct module_ref {
    local_t count;
} ____cacheline_aligned;

The kernel provides the try_module_get and module_put functions to increment or decrement the reference counter. It is also possible to use __module_get to increment the reference count if the caller is sure that the module is not being unloaded right now. try_module_get, in contrast, ensures that this is really the case.
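The interplay between the per-CPU counters and the module state can be sketched in userspace as follows. This is a simplified model, not the kernel's implementation: the CPU number is passed explicitly instead of being determined by the kernel, a plain long replaces local_t, no protection against preemption or concurrency is attempted, and all names ending in _sketch or _SKETCH are hypothetical.

```c
#include <stdbool.h>

#define NR_CPUS_SKETCH 4  /* illustrative CPU count */

enum module_state {
    MODULE_STATE_LIVE,
    MODULE_STATE_COMING,
    MODULE_STATE_GOING,
};

/* A plain long stands in for the kernel's local_t. */
struct module_ref_sketch {
    long count;
};

struct refcounted_module_sketch {
    enum module_state state;
    struct module_ref_sketch ref[NR_CPUS_SKETCH];
};

/* Take a reference unless the module is already being removed. */
static bool try_get_sketch(struct refcounted_module_sketch *mod,
                           unsigned int cpu)
{
    if (mod->state == MODULE_STATE_GOING)
        return false;
    mod->ref[cpu].count++;
    return true;
}

/* Drop a reference; this may happen on a different CPU than the get. */
static void put_sketch(struct refcounted_module_sketch *mod,
                       unsigned int cpu)
{
    mod->ref[cpu].count--;
}

/* The usage count of the module is the sum over all per-CPU counters. */
static long refcount_sketch(const struct refcounted_module_sketch *mod)
{
    long total = 0;
    unsigned int cpu;

    for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        total += mod->ref[cpu].count;
    return total;
}
```

Since a reference can be taken on one CPU and dropped on another, an individual counter may become negative in this model; only the sum is meaningful. Keeping one counter per CPU means that taking or dropping a reference touches only the current CPU's own cache line, which is the reason for the alignment mentioned above.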

□ modules_which_use_me is used as a list element in the data structures that describe the intermodule dependencies in the kernel. Section 7.3.2 goes into greater detail.

□ waiter is a pointer to the task structure of the process that caused the module to be unloaded and is now waiting for the action to terminate.

□ exit is the counterpart to init. It is a pointer to a function called to perform module-specific clean-up work (e.g., releasing reserved memory areas) when a module is removed.

□ symtab, num_symtab and strtab are used to record information on all symbols of the module (not only on the explicitly exported symbols).

□ percpu points to per-CPU data that belong to the module. It is initialized when the module is loaded.

□ args is a pointer to the command-line arguments passed to the module during loading.
