Selection According to PID

Let us turn our attention to how the process-specific information is selected by PID.

Creating the Directory Inode

If a PID is passed to proc_pid_lookup instead of "self", the course of the lookup operation is as shown in the code flow diagram in Figure 10-4.

Because filenames are always processed in the form of strings but PIDs are integer numbers, the former must be converted accordingly. The kernel provides the name_to_int auxiliary function to convert strings consisting of digits into an integer.
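The conversion can be illustrated with a small user-space sketch. It is not the kernel's code: the function name sketch_name_to_int, its (name, length) signature, and the choice of ~0U as the failure value are assumptions made for this example.

```c
#include <stddef.h>

/*
 * Hypothetical user-space analogue of name_to_int. Parses a name of known
 * length that must consist of decimal digits only. Returns ~0U if the name
 * is empty, is not a canonical decimal number, or is too large.
 */
unsigned sketch_name_to_int(const char *name, size_t len)
{
    unsigned n = 0;

    if (len == 0)
        return ~0U;
    /* Reject leading zeros so that "007" does not alias PID 7. */
    if (len > 1 && name[0] == '0')
        return ~0U;

    while (len-- > 0) {
        unsigned c = (unsigned)(*name++ - '0');

        if (c > 9)                  /* not a decimal digit */
            return ~0U;
        if (n >= (~0U - 9) / 10)    /* the next step could overflow */
            return ~0U;
        n = n * 10 + c;
    }
    return n;
}
```

With this sketch, the name "1234" yields the integer 1234, whereas names such as "self" or "007" are rejected, so only one spelling of each PID is accepted.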

The information obtained is used to find the task_struct instance of the desired process by means of the find_task_by_pid_ns function described in Chapter 2. However, the kernel cannot assume that the desired process actually exists. Programs do sometimes try to access a nonexistent PID, in which case a corresponding error (-ENOENT) is reported.
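The error path can be sketched in user space as follows. Everything here is a stand-in: struct sketch_task, the fixed task table, and the function names are hypothetical and merely mimic the roles of task_struct and find_task_by_pid_ns.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's task_struct. */
struct sketch_task {
    unsigned pid;
    const char *comm;
};

/* A fixed table plays the role of the kernel's PID lookup structures. */
static struct sketch_task sketch_tasks[] = {
    { 1, "init" },
    { 42, "bash" },
};

/* Analogue of find_task_by_pid_ns: returns NULL if no task has this PID. */
struct sketch_task *sketch_find_task(unsigned pid)
{
    size_t i;

    for (i = 0; i < sizeof(sketch_tasks) / sizeof(sketch_tasks[0]); i++)
        if (sketch_tasks[i].pid == pid)
            return &sketch_tasks[i];
    return NULL;
}

/* Lookup step: returns 0 and sets *out on success, -ENOENT otherwise. */
int sketch_pid_lookup(unsigned pid, struct sketch_task **out)
{
    struct sketch_task *task = sketch_find_task(pid);

    if (!task)
        return -ENOENT;
    *out = task;
    return 0;
}
```

A caller that asks for PID 42 gets the matching entry back, while a request for a PID that is not in the table fails with -ENOENT before any inode is created.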

Once the desired task_struct is found, the kernel delegates the rest of the work mostly to proc_pid_instantiate implemented in fs/proc/base.c, which itself relies on proc_pid_make_inode. First, a new inode is created by the new_inode standard function of VFS; this basically boils down to the same proc-specific proc_alloc_inode routine mentioned above that makes use of its own slab cache.

The routine not only generates a new struct inode instance, but also reserves memory needed by struct proc_inode; the reserved memory holds a normal VFS inode as a "subobject," as noted in Section 10.1.2. The elements of the object generated are then filled with standard values.
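The "subobject" arrangement is easier to see in a reduced user-space sketch. The structures below are hypothetical and contain only a couple of illustrative fields; the names sketch_inode, sketch_proc_inode, pid_nr, and SKETCH_PROC_I are assumptions chosen for this example and are not the kernel's definitions.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced stand-in for the generic VFS inode. */
struct sketch_inode {
    unsigned long i_ino;
    unsigned i_mode;
};

/* The proc-specific object embeds the generic inode as a member. */
struct sketch_proc_inode {
    int pid_nr;                     /* proc-specific data (illustrative) */
    struct sketch_inode vfs_inode;  /* embedded generic inode */
};

/* Recover the enclosing object from a pointer to its embedded inode. */
struct sketch_proc_inode *SKETCH_PROC_I(struct sketch_inode *inode)
{
    return (struct sketch_proc_inode *)
        ((char *)inode - offsetof(struct sketch_proc_inode, vfs_inode));
}

/* Allocate the larger object, but hand out only the generic inode. */
struct sketch_inode *sketch_alloc_inode(int pid_nr)
{
    struct sketch_proc_inode *ei = calloc(1, sizeof(*ei));

    if (!ei)
        return NULL;
    ei->pid_nr = pid_nr;
    return &ei->vfs_inode;
}

/* Free the enclosing object, not just the embedded inode. */
void sketch_destroy_inode(struct sketch_inode *inode)
{
    free(SKETCH_PROC_I(inode));
}
```

Generic code only ever sees the pointer to the embedded inode, while proc-specific code can step back to the enclosing object at any time. A single allocation therefore serves both views.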

After calling proc_pid_make_inode, all the remaining code in proc_pid_instantiate has to do is perform a couple of administrative tasks. Most important, the inode->i_op inode operations are set to the proc_tgid_base_inode_operations static structure whose contents are examined below.

Processing Files

When a file (or directory) in the PID-specific /proc/pid directory is processed, this is done using the inode operations of the directory, as noted in Chapter 8 when discussing the virtual filesystem mechanisms. The kernel uses the statically defined proc_tgid_base_inode_operations structure as the inode operations of PID inodes. This structure is defined as follows:

fs/proc/base.c
static const struct inode_operations proc_tgid_base_inode_operations = {
    .lookup = proc_tgid_base_lookup,
    .getattr = pid_getattr,
    .setattr = proc_setattr,
};

In addition to attribute handling, the directory supports just one more operation: subentry lookup (see footnote 2).

The task of proc_tgid_base_lookup is to return an inode instance with suitable inode operations by reference to a given name (cmdline, maps, etc.). The extended inode operations (proc_inode) must also include a function to output the desired data. Figure 10-5 shows the code flow diagram.

Figure 10-5: Code flow diagram for proc_tgid_base_lookup.

The work is delegated to proc_pident_lookup, which works not only for TGID files, but is a generic method for other ID types. The first step is to find out whether the desired entry exists at all. Because the contents of the PID-specific directory are always the same, a static list of all files together with a few other bits of information is defined in the kernel sources. The list is called tgid_base_stuff and is used to find out easily whether a desired directory entry exists or not. The array contains elements of type pid_entry, which is defined as follows:

fs/proc/base.c
struct pid_entry {
    char *name;
    int len;
    mode_t mode;
    const struct inode_operations *iop;
    const struct file_operations *fop;
    union proc_op op;
};

name and len specify the filename and the string length of the name, while mode denotes the mode bits. Additionally, there are fields for the inode and file operations associated with the entry, and a copy of proc_op. Recall that this contains a pointer to the proc_get_link or proc_read operation, depending on the file type.

Some macros are provided to ease the construction of static pid_entry instances:

fs/proc/base.c
#define DIR(NAME, MODE, OTYPE) \
    NOD(NAME, (S_IFDIR|(MODE)), \
        &proc_##OTYPE##_inode_operations, &proc_##OTYPE##_operations, \
        {} )
#define LNK(NAME, OTYPE) \
    NOD(NAME, (S_IFLNK|S_IRWXUGO), \
        &proc_pid_link_inode_operations, NULL, \
        { .proc_get_link = &proc_##OTYPE##_link } )
#define REG(NAME, MODE, OTYPE) \
    NOD(NAME, (S_IFREG|(MODE)), NULL, \
        &proc_##OTYPE##_operations, {})
#define INF(NAME, MODE, OTYPE) \
    NOD(NAME, (S_IFREG|(MODE)), \
        NULL, &proc_info_file_operations, \
        { .proc_read = &proc_##OTYPE } )

Footnote 2: A special readdir method is also implemented for proc_tgid_base_operations (an instance of struct file_operations) to read a list of all files in the directory. It's not discussed here simply because every PID-specific directory always contains the same files, and therefore the same data would always be returned.

As the names indicate, the macros generate directories, links, and regular files. INF also generates regular files, but in contrast to REG files, they do not need to provide specialized file operations; they need only fill in proc_read from pid_entry->op. Observe how

REG("environ", S_IRUSR, environ)
INF("auxv", S_IRUSR, pid_auxv)

are expanded to see how both types differ:

{
    .len = sizeof("environ") - 1,
    .name = ("environ"),
    .mode = (S_IFREG|(S_IRUSR)),
    .iop = NULL,
    .fop = &proc_environ_operations,
    .op = {},
}

{
    .len = sizeof("auxv") - 1,
    .name = ("auxv"),
    .mode = (S_IFREG|(S_IRUSR)),
    .iop = NULL,
    .fop = &proc_info_file_operations,
    .op = { .proc_read = &proc_pid_auxv },
}

The macros are used to construct the TGID-specific directory entries in tgid_base_stuff:

fs/proc/base.c
static const struct pid_entry tgid_base_stuff[] = {
    DIR("task", S_IRUGO|S_IXUGO, task),
    DIR("fd", S_IRUSR|S_IXUSR, fd),
    DIR("fdinfo", S_IRUSR|S_IXUSR, fdinfo),
    REG("environ", S_IRUSR, environ),
    INF("auxv", S_IRUSR, pid_auxv),
    INF("status", S_IRUGO, pid_status),
    INF("limits", S_IRUSR, pid_limits),
    INF("oom_score", S_IRUGO, oom_score),
    REG("oom_adj", S_IRUGO|S_IWUSR, oom_adjust),
    ...
    REG("loginuid", S_IWUSR|S_IRUGO, loginuid),
    ...
    REG("make-it-fail", S_IRUGO|S_IWUSR, fault_inject),
    ...
#if defined(USE_ELF_CORE_DUMP) && defined(CONFIG_ELF_CORE)
    REG("coredump_filter", S_IRUGO|S_IWUSR, coredump_filter),
#endif
    ...
    INF("io", S_IRUGO, pid_io_accounting),
    ...
};

The structure describes each entry by type, name, and access rights. The latter are defined using the usual VFS constants with which we are familiar from Chapter 8.

To summarize, various types of entry can be distinguished:

□ INF-style files use a separate proc_read function to obtain the desired data. The proc_info_file_operations standard instance is used as the file_operations structure. The methods it defines represent the VFS interface that passes the data returned by proc_read upward.

□ LNK generates symbolic links that point to another VFS file. A type-specific function in proc_get_link specifies the link target, and proc_pid_link_inode_operations forwards the data to the virtual filesystem in suitable form.

□ REG creates regular files that use specialized file operations responsible for gathering data and forwarding them to the VFS layer. This is necessary if the data source does not fit into the framework provided by proc_info_file_operations.
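The division of labor for INF-style files can be sketched in user space: one generic read routine handles offsets and copying, and a per-file callback only produces the text. All names below (sketch_info_read, sketch_pid_status, sketch_task, SKETCH_PAGE_SIZE) and the output format are hypothetical and serve only to illustrate the pattern.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

/* Hypothetical stand-in for the task being described. */
struct sketch_task {
    int pid;
    const char *state;
};

/* Per-file generator: fills page with text and returns its length. */
typedef int (*sketch_proc_read_t)(const struct sketch_task *task, char *page);

/* Example generator; the output format is made up for this sketch. */
int sketch_pid_status(const struct sketch_task *task, char *page)
{
    return snprintf(page, SKETCH_PAGE_SIZE, "Pid:\t%d\nState:\t%s\n",
                    task->pid, task->state);
}

/*
 * Generic read path shared by all INF-style files: let the generator fill
 * a scratch page, then copy the requested window into the caller's buffer.
 */
long sketch_info_read(const struct sketch_task *task, sketch_proc_read_t gen,
                      char *buf, size_t count, size_t *pos)
{
    char page[SKETCH_PAGE_SIZE];
    int len = gen(task, page);

    if (len < 0)
        return len;
    if (len >= SKETCH_PAGE_SIZE)        /* output was truncated */
        len = SKETCH_PAGE_SIZE - 1;
    if (*pos >= (size_t)len)
        return 0;                       /* end of file */
    if (count > (size_t)len - *pos)
        count = (size_t)len - *pos;
    memcpy(buf, page + *pos, count);
    *pos += count;
    return (long)count;
}
```

Adding another file of this kind then only requires a new generator function, because offset handling and copying stay in one place. Data sources that do not fit this pattern are the ones that need the REG approach.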

Let us return to proc_pident_lookup. To check whether the desired name is present, all the kernel does is iterate over the array elements and compare the names stored there with the required name until it finds a match or reaches the end of the array. After it has ensured that the name exists in tgid_base_stuff, the function generates a new inode using proc_pident_instantiate, which, in turn, uses the already known proc_pid_make_inode function.
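The name search amounts to a linear scan over a static table, which the following user-space sketch illustrates. The reduced struct sketch_pid_entry, the SKETCH_ENT macro, the three sample entries, and the use of -ENOENT for a miss are assumptions of this example rather than a copy of the kernel code.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Reduced, hypothetical variant of pid_entry without operation pointers. */
struct sketch_pid_entry {
    const char *name;
    int len;
    unsigned mode;
};

#define SKETCH_ENT(NAME, MODE) { (NAME), sizeof(NAME) - 1, (MODE) }

static const struct sketch_pid_entry sketch_base_stuff[] = {
    SKETCH_ENT("environ", 0400),
    SKETCH_ENT("auxv", 0400),
    SKETCH_ENT("status", 0444),
};

/*
 * Linear scan over the static table. The length is compared first and the
 * bytes second, because the requested name need not be NUL-terminated.
 * Returns 0 and sets *out on a match, -ENOENT otherwise.
 */
int sketch_pident_lookup(const char *name, size_t len,
                         const struct sketch_pid_entry **out)
{
    size_t n = sizeof(sketch_base_stuff) / sizeof(sketch_base_stuff[0]);
    size_t i;

    for (i = 0; i < n; i++) {
        const struct sketch_pid_entry *p = &sketch_base_stuff[i];

        if ((size_t)p->len != len)
            continue;
        if (memcmp(name, p->name, len) == 0) {
            *out = p;
            return 0;
        }
    }
    return -ENOENT;
}
```

A linear scan is adequate here because the table is small and fixed at compile time. The entry that is found carries everything needed to set up the new inode, which is why the real table also stores the operation pointers.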
