The Three Standard Unix Files

Computers have been described as machines that move data around, and that's not a bad way to see it. That said, the best way to get a grip on program input and output via terminal emulators is to understand one of Unix's fundamental design principles: Everything is a file. A file can be a collection of data on disk, as I explained in some detail in Chapter 5; but in more general terms, a file is an endpoint on a path taken by data. When you write to a file, you're sending data along a path to an endpoint. When you read from a file, you are accepting data from an endpoint. The path that the data takes between files may be entirely within a single computer, or it may be between computers along a network of some kind. Data may be processed and changed along the path, or it may simply move from one endpoint to another without modification. No matter. Everything is a file, and all files are treated more or less identically by Unix's internal file machinery.

Figure 6-9: Changing Konsole's character encoding to IBM-850

The "everything is a file" dictum applies to more than collections of data on disk. Your keyboard is a file: it's an endpoint that generates data and sends it somewhere. Your display is a file: it's an endpoint that receives data from somewhere and puts it where you can see it. Unix files do not have to be text files. Binary files (like the executables created by NASM and the linker) are handled the same way.
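The idea that different kinds of endpoints are handled by the same machinery can be sketched in a few lines. This is only an illustration, written in Python rather than assembly so that it can be run without assembling anything. The helper names (send, receive, through_disk_file, through_pipe) are made up for this sketch, and the pipe is an assumption chosen as a convenient non-disk endpoint. The point is that the same two helpers move the data in both cases.

```python
import os
import tempfile


def send(fd, data):
    """Write all of data (bytes) to the endpoint named by descriptor fd."""
    while data:
        written = os.write(fd, data)   # may write fewer bytes than asked
        data = data[written:]


def receive(fd, limit=4096):
    """Read up to limit bytes from the endpoint named by descriptor fd."""
    return os.read(fd, limit)


def through_disk_file(data):
    """Send data to a scratch file on disk, then read it back."""
    fd, path = tempfile.mkstemp()      # scratch file opened for reading and writing
    try:
        send(fd, data)
        os.lseek(fd, 0, os.SEEK_SET)   # rewind to the start before reading
        return receive(fd)
    finally:
        os.close(fd)
        os.remove(path)


def through_pipe(data):
    """Send data into one end of a pipe and read it from the other end."""
    read_end, write_end = os.pipe()    # two descriptors, no disk file involved
    try:
        send(write_end, data)          # small messages fit in the pipe's buffer
        return receive(read_end)
    finally:
        os.close(read_end)
        os.close(write_end)


if __name__ == "__main__":
    message = b"Eat at Joe's!\n"
    print(through_disk_file(message))
    print(through_pipe(message))
```

Neither send nor receive knows or cares what kind of endpoint sits behind the number it is given.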

Table 6-1 lists the three standard files defined by Unix. These files are always open to your programs while the programs are running.

Table 6-1: The Three Standard Unix Files

FILE              C IDENTIFIER   FILE DESCRIPTOR   DEFAULTS TO
Standard Input    stdin          0                 Keyboard
Standard Output   stdout         1                 Display
Standard Error    stderr         2                 Display

At the bottom of it, a file is known to the operating system by its file descriptor, which is just a number. The first three such numbers belong to the three standard files. When you open an existing file or create a new file from within a program, Linux will return a file descriptor value specific to the file you've opened or created. To manipulate the file, you call into the operating system and pass it the file descriptor of the file you want to work with. Table 6-1 also provides the conventional identifiers by which the standard files are known in the C world. When people talk about "stdout," for example, they're talking about file descriptor 1.
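To see that a file descriptor really is just a number, here is a minimal sketch in Python, whose os module hands descriptors to the operating system without wrapping them in anything. The function name open_write_close, the three constant names, and the file name slogan.txt are all invented for the illustration.

```python
import os
import tempfile

# Descriptor numbers of the three standard files, as listed in Table 6-1.
STDIN_FD, STDOUT_FD, STDERR_FD = 0, 1, 2


def open_write_close(path, data):
    """Create a file, write data to it, and return the descriptor that was used."""
    # os.open asks the operating system to create the file and returns an
    # integer. It is normally 3 or higher, because 0, 1, and 2 are already
    # taken by the standard files.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)             # the descriptor tells the OS which file
    finally:
        os.close(fd)                   # give the number back to the OS
    return fd                          # still an integer, but no longer open


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "slogan.txt")   # hypothetical file name
        fd = open_write_close(path, b"Eat at Joe's!\n")
        print("descriptor was", fd, "of type", type(fd).__name__)
        with open(path, "rb") as f:
            print(f.read())
```

Every operation on the file goes through that one integer, which is the same way the standard files are addressed by 0, 1, and 2.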

If you refer back to Listing 5-1, the short example program I presented in Chapter 5 during our walk-through of the assembly language development process, you'll see this line:

mov ebx,1 ; Specify File Descriptor 1: Standard Output

When we sent the little slogan "Eat at Joe's!" to the display, we were in fact writing it to file descriptor 1, standard output. By changing the value to 2, we could have sent the slogan to standard error instead. It wouldn't have been displayed any differently on the screen. As far as the operating system is concerned, data written to standard error is handled exactly like data written to standard output. By custom, programs like NASM send their error messages to standard error, but the text written to standard error isn't marked as an "error message" or displayed in a different color or character set. Standard error and standard output both exist so that we can keep our program's output separate from our program's errors and other messages about what the program is doing and how.
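The effect of changing that one number can be sketched without reassembling anything. The following is an illustration in Python, not the book's listing. The names SLOGAN and send_slogan are made up here, and os.write stands in for the write call that the assembly program makes directly.

```python
import os

SLOGAN = b"Eat at Joe's!\n"


def send_slogan(fd):
    """Write the slogan to the given file descriptor; return the bytes written."""
    return os.write(fd, SLOGAN)


if __name__ == "__main__":
    send_slogan(1)   # file descriptor 1: standard output
    send_slogan(2)   # file descriptor 2: standard error
```

Run in a terminal window, the program shows the slogan twice, and nothing on the screen tells you which copy traveled through which descriptor.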

This will make a lot more sense once you understand one of the most useful basic mechanisms of all Unix-descended systems: I/O redirection.
