Linux Assembly Programming

The Game of Big Bux

I've invented my own board game to continue down the road with this particular metaphor. In the sense that art mirrors life, the Game of Big Bux mirrors life in Silicon Valley, where money seems to be spontaneously created (generally in somebody else's pocket) and the three big Money Black Holes are fast cars, California real estate, and messy divorces. There is luck, there is work, and assets often change hands very quickly. A portion of the Big Bux game board is shown in Figure 1-1. The line...

nasm -f elf -g -F stabs

Figure 5-11: The anatomy of a NASM command line. -f elf specifies that the .o file will be generated in the elf format. -g specifies that debug information is to be included in the .o file. -F stabs specifies that debug information is to be generated in the stabs format. The final item is the name of the source code file to be assembled.

-f elf

There are a fair number of useful object file formats, and each one is generated differently. The NASM assembler is capable of generating most of them, including other formats, such as bin,...

Setting Up a Stack Frame

The stack is extremely important in assembly language work, and this is doubly true in programs that interface with C, because in C (and in truth most other native-code high-level languages, including Pascal) the stack has a central role. The reason for this is simple: Compilers are robots that write assembly language code, and they are not human and clever like you. This means that a compiler has to use what might seem like brute-force methods to create its code, and most of those methods...

AT&T Mnemonic Conventions

When gcc compiles a C source code file to machine code, what it really does is translate the C source code to assembly language source code, using the AT&T mnemonics. Look back to Figure 12-1. The gcc compiler takes as input a .c source code file, and outputs an .s assembly source file, which is then handed to the GNU assembler, gas, for assembly. This is how the GNU tools work on all platforms. In a sense, assembly language is an intermediate language used for the C compiler's benefit. In most...

Step 5: Watch It Run in the Debugger

Assuming that you entered Listing 5-1 correctly (or unpacked it from the listings archive), there are no bugs in eatsyscall.asm. That's an uncommon circumstance for programmers, especially those just starting out. Most of the time you'll need to start bug-hunting almost immediately. The easiest way to do this is to load your executable file into a debugger so that you can single-step it, pausing after the execution of each machine instruction in order to see what effect each instruction has on...

Using Linux Kernel Services Through INT80

Everything else in eatsyscall.asm is leading to the single instruction that performs the program's only real work: displaying a line of text in the Linux console. At the heart of the program is a call into the Linux operating system, performed using the int instruction, with a parameter of 80h. As explained in Chapter 6, an operating system is something like a god and something like a troll, and Linux is no different. It controls all the most important elements of the machine in godlike fashion...

Translating with MOV or XLAT

So how do you use the Upcase table? Very simply:

1. Load the character to be translated into AL.
2. Create a memory reference using AL as the base term and Upcase as the displacement term, and mov the byte at the memory reference into AL, replacing the original value used as the base term.

The mov instruction would have to use AL inside its own memory reference. There's only one problem: NASM won't let you do this. The AL register can't take part in effective address calculations, nor can any of the other 8-bit registers. Enter...

Kate's File Management

Kate makes it easy to load existing files into the editor window, browse your session working directory, create new files, rename files, and move unneeded files to the Trash. The primary mechanism for file management is the sidebar on the left side of the Kate window. Absent other plugins that use it, the management sidebar serves two functions: When in the Document view, the sidebar displays the documents associated with the current session. You can click on one of the document listing lines to...

Binary Files vs. Text Files

If you've worked with Windows or Linux (and before that, DOS) for any length of time, you may have a sense of the differences between files in terms of how you "look at" them. A simple text file is opened and examined in a simple text editor. A word processor file is opened in the species of word processor that created it. A PowerPoint presentation file is opened from inside the PowerPoint application. If you try to load it into Word or Excel, the application...

The Mechanics of Macro Definition

A macro definition looks a little like a procedure definition, framed between a pair of special NASM directives: %macro and %endmacro. Note that the %endmacro directive is on the line after the last line of the macro. Don't make the mistake of treating %endmacro like a label that marks the macro's last line. One minor shortcoming of macros vis-a-vis procedures is that macros can have only one entry point. A macro, after all, is a sequence of code lines that are inserted into your program in the...

Installing the Software

One of the fantastic things about Linux is the boggling array of software available for it, nearly all of which is completely free of charge. If you've used Linux for any length of time you've probably encountered products such as OpenOffice, Kompozer, Gnumeric, and Evolution. Some of these are preinstalled when you install the operating system. The rest are obtained through the use of a package manager. A package manager is a catalog program that lives on your PC and maintains a list of all...

Simple Cursor Control in the Linux Console

As a segue from assembly language procedures into assembly language macros, I'd like to spend a little time on the details of controlling the Linux console display from within your programs. Let's return to our little greasy-spoon advertising display for Joe's diner. Let's goose it up a little, first clearing the Linux console and then centering the ad text on the cleared display. I'm going to present the same program twice, first with several portions expressed as procedures, and later with...
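The clear-and-center task described above can be sketched in C before it is expressed in assembly. This is only an illustration under stated assumptions: the function names `centered_column` and `build_centered_ad` are hypothetical, the console width is supplied by the caller (80 columns is a common choice), and the two control strings used are the standard ANSI/VT100-style sequences ESC[2J (clear the display) and ESC[row;colH (move the cursor), which the Linux console understands.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: the 1-based column at which text of the given
   length must start to be centered on a line of the given width.
   Text that does not fit simply starts at column 1. */
int centered_column(int width, int text_len)
{
    if (text_len >= width)
        return 1;
    return (width - text_len) / 2 + 1;
}

/* Hypothetical helper: build one string that clears the display
   (ESC[2J), moves the cursor to row;col (ESC[row;colH), and then
   carries the text itself. Returns what snprintf returns: the number
   of characters the full string needs, not counting the final NUL. */
int build_centered_ad(char *out, size_t size, int row, int width,
                      const char *text)
{
    int col = centered_column(width, (int)strlen(text));
    return snprintf(out, size, "\x1b[2J\x1b[%d;%dH%s", row, col, text);
}
```

Writing the resulting buffer to standard output, for example with `fputs(buf, stdout)`, should clear the console and place the text on the requested row. Building the string in a buffer first keeps the arithmetic separate from the output call, which is also how the work naturally splits into procedures or macros.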

Translation Tables

A translation table is a special type of table, and it works the following way: you set up a table of values, with one entry for every possible value that must be translated. A number (or a character, treated as a number) is used as an index into the table. At the index position in the table is a value that is used to replace the original value that was used as the index. In short, the original value indexes into the table and finds a new value that replaces the original value, thus translating...
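The lookup just described can be sketched in a few lines of C. This is a model of the idea rather than the book's assembly listing: the names `upcase_table`, `init_upcase_table`, and `translate_buffer` are hypothetical, and the table contents assume ASCII, with every byte mapping to itself except the lowercase letters.

```c
#include <stddef.h>

/* Hypothetical 256-entry translation table: one entry for every
   possible byte value that might need translating. */
static unsigned char upcase_table[256];

/* Fill the table so that each byte translates to itself, except the
   ASCII lowercase letters, which translate to uppercase. */
void init_upcase_table(void)
{
    for (int i = 0; i < 256; i++)
        upcase_table[i] = (unsigned char)i;
    for (int c = 'a'; c <= 'z'; c++)
        upcase_table[c] = (unsigned char)(c - 'a' + 'A');
}

/* Translate a buffer in place. Each original byte is used as the
   index into the table, and the value found at that position
   replaces the original byte. */
void translate_buffer(unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = upcase_table[buf[i]];
}
```

Because the table has an entry for every possible byte, the translation step needs no comparisons or range checks: changing what the translation does means changing the table, not the loop.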

Code and Data

Like most board games (including the Game of Big Bux), the assembly language board game consists of two broad categories of elements: game steps and places to store things. The "game steps" are the steps and tests I've been speaking of all along. The places to store things are just that: cubbyholes into which you can place numbers, with the confidence that those numbers will remain where you put them until you take them out or change them somehow. In programming terms, the game steps are called...

Looking at File Internals with the Bless Editor

Very fortunately, there are utilities that can open, display, and enable you to change characters or binary bytes inside any kind of file. These are called binary editors or hexadecimal editors, and the best of them, in my experience at least for the Linux world, is the Bless Hex Editor. It was designed to operate under graphical user interfaces such as Gnome, and it is very easy to figure out by exploring the menus. Bless is not installed by default under Ubuntu. You can download it free of...

Character Encoding in Konsole

There's not much to configure in a terminal emulator program, at least while taking your first steps in assembly language. One thing that does matter for the example programs in this book is character encoding. A terminal emulator has to put characters into its window, and one of the configurable options is related to what glyphs correspond to which 8-bit character code. Note well that this has nothing directly to do with fonts. A glyph is a specific recognizable symbol, like the letter "A"...

Want to Do That

It was 1985, and I was in a chartered bus in New York City, heading for a press reception with a bunch of other restless media egomaniacs. I was only beginning my media career (as Technical Editor for PC Tech Journal) and my first book was still months in the future. I happened to be sitting next to an established programming writer/guru, with whom I was impressed and to whom I was babbling about one thing or another. I won't name him, as he's done a lot for the field, and may do a fair bit...

Index × Scale + Displacement Addressing

Base + Index addressing is what you'll typically use to scan through a buffer in memory byte by byte, but what if you need to access a data item in a buffer or table where each data item is not a single byte, but a word or a double word? This requires slightly more powerful memory addressing machinery. As a side note here, the word array is the general term for what I've been calling a buffer or a table. Other writers may call a table an array, especially when the context of the discussion is a...
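The arithmetic behind scaled addressing can be modeled in C. The sketch below is an illustration only: `scaled_offset` and `fetch_dword` are hypothetical names, and the offset is treated as a byte distance from the start of a table whose items are 4-byte double words. On x86 the scale factor can be 1, 2, 4, or 8.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of the address arithmetic: (index * scale) + displacement,
   expressed as a byte offset. The scale is the size of one item in
   bytes. */
size_t scaled_offset(size_t index, size_t scale, size_t displacement)
{
    return index * scale + displacement;
}

/* Fetch the index-th double word from a table viewed as raw bytes.
   The index counts items, and the scale of 4 converts it to a byte
   offset. memcpy is used so the read is valid for any alignment. */
uint32_t fetch_dword(const unsigned char *table_bytes, size_t index)
{
    uint32_t value;
    memcpy(&value, table_bytes + scaled_offset(index, 4, 0), sizeof value);
    return value;
}
```

Ordinary C array indexing such as `table[index]` on a `uint32_t` array performs the same multiplication behind the scenes, which is one reason compilers lean on this addressing form so heavily.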

Passing Parameters to printf

The real challenge in working with printf(), assuming you understand how it works logically, is knowing how to pass it all the parameters that it needs to handle any particular string display. Like the Writeln() function in Pascal, printf() has no set number of parameters. It can take as few parameters as one base string, or as many parameters as you need, including additional strings, character values, and numeric values of various sorts. All parameters to C library functions are passed on the...
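Seen from the C side, the variable parameter count looks like the sketch below. The wrapper names `show_plain` and `show_mixed` and the message text are hypothetical; the only library call used is `printf()` itself, which returns the number of characters it wrote.

```c
#include <stdio.h>

/* The minimal case: printf() with one parameter, the base string. */
int show_plain(void)
{
    return printf("Eat at Joe's!\n");
}

/* A fuller case: the base string plus three more parameters, one for
   each formatting code in the base string (%s, %c, and %d). */
int show_mixed(const char *name, char grade, int count)
{
    return printf("%s scored %c on %d tests\n", name, grade, count);
}
```

Each formatting code in the base string consumes one additional parameter, in left-to-right order, so a call from assembly has to supply exactly as many values as the base string asks for.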

The Rotate Instructions

That said, if a bit's destiny is not to be lost in cosmic nothingness, you need to use the rotate instructions rcl, rcr, rol, and ror instead. The rotate instructions are almost identical to the shift instructions, but with a crucial difference: a bit bumped off one end of the operand reappears at the opposite end of the operand. As you rotate an operand by more than one bit, the bits march steadily in one direction, falling off the end and immediately reappearing at the opposite end. The bits...
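C has no rotate operator, but the behavior of rol and ror on an 8-bit operand can be modeled with a pair of shifts, as in the sketch below. The function names `rol8` and `ror8` are hypothetical. The rcl and rcr instructions, which rotate through the carry flag, are not modeled here.

```c
#include <stdint.h>

/* Model of rol on an 8-bit operand: bits shifted out of the high end
   reappear at the low end. The count is reduced modulo 8, since eight
   single-bit rotations bring every bit back where it started. */
uint8_t rol8(uint8_t value, unsigned count)
{
    count &= 7;
    return (uint8_t)((value << count) | (value >> ((8 - count) & 7)));
}

/* Model of ror on an 8-bit operand: bits shifted out of the low end
   reappear at the high end. */
uint8_t ror8(uint8_t value, unsigned count)
{
    count &= 7;
    return (uint8_t)((value >> count) | (value << ((8 - count) & 7)));
}
```

Rotating left and then right by the same count restores the original value, which is the property that distinguishes a rotate from a shift: no bit is ever lost.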

Terminal Control with Escape Sequences

By default, output to a terminal emulator window enters at the left end of the bottom line, and previously displayed lines scroll up with the addition of each new line at the bottom. This is perfectly useful, but it's not pretty and certainly doesn't qualify as a "user interface" in any honest sense. There were plenty of "full screen" applications written for the Unix operating system in ancient times, and they wrote their data entry fields and prompts all over the screen. When color...

Using Kate While Programming

At least for the programs I present in this book, Kate is going to be the "workbench" where we edit, assemble, link, test, and debug our code. In other words, you run Kate from the Applications menu or from a desktop or panel launcher, and then everything else you do you do from inside Kate. "Inside" here has an interesting wrinkle: Kate has its own built-in Linux terminal window, and this terminal window enables us to launch other tools from inside Kate, specifically the Make utility (more...

Reading Text from Files with fgets

When fopen() successfully creates or opens a file for you, it returns a file handle in EAX. Keep that file handle safe somewhere. I recommend either copying it to a memory variable allocated for that purpose or putting it in one of the sacred registers. If you store it in EAX, ECX, or EDX and then make a call to almost any C library function, the file handle in the register will be trashed and you'll lose it. Once a file is opened for reading, you can read text lines from it sequentially with...
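In C terms, the open-then-read sequence looks like the sketch below. The function name `count_lines` and the 256-byte buffer size are hypothetical choices for illustration. The value returned by `fopen()` is held in a local variable for the whole life of the function, which is the C counterpart of keeping the handle somewhere safe.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count the text lines in a file by reading it
   sequentially with fgets(). Returns -1 if the file cannot be opened. */
long count_lines(const char *path)
{
    FILE *fp = fopen(path, "r");   /* keep this value safe until fclose() */
    if (fp == NULL)
        return -1;

    char line[256];
    long count = 0;
    int partial = 0;               /* nonzero while inside an unfinished line */

    /* fgets() returns NULL at end of file. A line longer than the
       buffer arrives in several pieces, so a line is counted only when
       a piece ends in a newline. */
    while (fgets(line, sizeof line, fp) != NULL) {
        size_t len = strlen(line);
        if (len > 0 && line[len - 1] == '\n') {
            count++;
            partial = 0;
        } else {
            partial = 1;
        }
    }
    if (partial)
        count++;                   /* last line had no trailing newline */

    fclose(fp);
    return count;
}
```

Note that `fgets()` takes three parameters (the buffer, the buffer's size, and the file handle), so an assembly caller has to supply all three on every call.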

Creating and Opening Files

By this time you should be pretty comfortable with the general mechanism for making C library calls from assembly. And whether you realize it or not, you're already pretty comfortable with some of the machinery for manipulating text files. You've already used printf() to display formatted text to the screen by way of standard output. The very same mechanism is used to write formatted text to disk-based text files: you're basically substituting a real disk file for standard output, so...
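The substitution of a disk file for standard output can be sketched in C with `fopen()` and `fprintf()`, the file-oriented sibling of `printf()`. The function name `write_report` and the message format are hypothetical, chosen only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: create (or overwrite) a text file and write one
   formatted line to it. Returns the number of characters written, or
   -1 if the file could not be opened or closed cleanly. */
int write_report(const char *path, const char *name, int count)
{
    FILE *fp = fopen(path, "w");   /* "w" creates or truncates the file */
    if (fp == NULL)
        return -1;

    /* Same base string and parameters that printf() would take, with
       the file handle added as the first parameter. */
    int written = fprintf(fp, "%s has %d items\n", name, count);

    if (fclose(fp) != 0)
        return -1;
    return written;
}
```

The formatting rules are the same as for `printf()`; the one extra parameter naming the destination file is the whole difference.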

Formatted Text Output with printf

The puts() library routine may seem pretty useful, but compared to a few of its more sophisticated siblings, it's kid stuff. With puts() we can only send a simple text string to a file (by default, stdout), without any sort of formatting. Worse, puts() always includes an EOL character at the end of its display, whether we include one in the string data or not. This prevents us from using multiple calls to puts() to output several text strings all on the same line. About the best you can say for...

Defining Macros with Parameters

Macros are for the most part a straight text-substitution trick, but text substitution has some interesting and sometimes useful wrinkles. One of these is the ability to pass parameters to a macro when the macro is invoked. For example, in eatmacro there's an invocation of the macro Writectr with three parameters. The literal constant 12 is passed "into" the macro and used to specify the screen row on which the centered text is to be displayed: in this case, line 12 from the top. You could...

Short Near and Far Jumps

One of the oddest assembler errors you may ever encounter can appear in a completely correct program, and if you work with NASM long enough and create programs large enough, you will encounter it. This error occurs when a conditional jump instruction is too far from the label that it references, where "too far" means too many locations away in memory. This only applies to conditional jumps; the unconditional jump instruction jmp is not subject to this error. The problem arises...

Protected Mode Flat Model

Intel's CPUs have implemented a very good protected mode architecture since the 386 appeared in 1986. However, application programs cannot make use of protected mode all by themselves. The operating system must set up and manage a protected mode before application programs can run within it. MS-DOS couldn't do this, and Microsoft Windows couldn't really do it either until Windows NT first appeared in 1993. Linux, having no real-mode "legacy" issues to deal with, has operated in protected mode...