Where Things

I wrote this book in large part because I could not find a beginning text on assembly language that I respected in the least. Nearly all books on assembly start by introducing the concept of an instruction set, and then begin describing machine instructions, one by one. This is moronic, and the authors of such books should be hanged. Even if you've learned every single instruction in an instruction set, you haven't learned assembly language.

You haven't even come close.

The naive objection that a CPU exists to execute machine instructions can be disposed of pretty easily: it executes machine instructions once it has them in its electronic hands. The real job of a CPU, and the real challenge of assembly language, lies in locating the required instructions and data in memory. Any idiot can learn machine instructions. (Many do.) The skill of assembly language consists of a deep comprehension of memory addressing. Everything else is details—and easy details at that.

The Joy of Memory Models

Memory addressing is a difficult business, made much more difficult by the fact that there are a fair number of different ways to address memory in the x86 CPU family. Each of these ways is called a memory model. There are three major memory models that you can use with the more recent members of the x86 CPU family, and a number of minor variations on those three, especially the one in the middle.

In programming for 32-bit Linux, you're pretty much limited to one memory model, and once you understand memory addressing a little better, you'll be very glad of it. However, I'm going to describe all three in some detail here, even though the older two of the trio have become museum pieces. Don't skip over the discussion of those museum pieces. In the same way that studying fossils to learn how various living things evolved over time will give you a better understanding of living things as they exist today, knowing a little about older Intel memory models will give you a more intuitive understanding of the one memory model that you're likely to use.

At the end of this chapter I'll briefly describe the 64-bit memory model that is only just now hitting the street in any numbers. That will be just a heads-up, however. In this book and for the next few years, 32-bit protected mode is where the action is.

The oldest and now ancient memory model is called the real mode flat model. It's thoroughly fossilized, but relatively straightforward. The elderly (and now retired) memory model is called the real mode segmented model. It may be the most hateful thing you ever learn in any kind of programming, assembly or otherwise. DOS programming at its peak used the real mode segmented model, and much Pepto Bismol was sold as a result. The newest memory model is called protected mode flat model, and it's the memory model behind modern operating systems such as Windows 2000/XP/Vista/7 and Linux. Note that protected mode flat model is available only on the 386 and newer CPUs that support the IA-32 architecture. The 8086, 8088, and 80286 do not support it. Windows 9x falls somewhere between models, and I doubt anybody except the people at Microsoft really understands all the kinks in the ways it addresses memory—maybe not even them. Windows 9x crashes all the time, and one main reason in my view is that it has a completely insane memory model. (Dynamic link libraries, or DLLs—a pox on homo computationis—are the other major reason.) Its gonzo memory model isn't the only reason you shouldn't consider writing Win 9x programs in assembly, but it's certainly the best one; and given that Windows 9x is now well on its way to being a fossil in its own right, you'll probably never have to.

I have a strategy in this book, and before we dive in, I'll lay it out: I will begin by explaining how memory addressing works under the real mode flat model, which was available under DOS. It's amazingly easy to learn. I discuss the real mode segmented model because you will keep stubbing your toe on it here and there and need to understand it, even if you never write a single line of code for it. Real work done today and for the near future lies in 32-bit protected mode flat model, for Windows, Linux, or any true 32-bit protected mode operating system. Key to the whole business is this: Real mode flat model is very much like protected mode flat model in miniature.

There is a big flat model and a little flat model. If you grasp real mode flat model, you will have no trouble with protected mode flat model. That monkey in the middle is just the dues you have to pay to consider yourself a real master of memory addressing.

So let's go see how this crazy stuff works.

16 Bits'll Buy You 64K

In 1974, the year I graduated from college, Intel introduced the 8080 CPU and basically invented microcomputing. (Yes, I'm an old guy, but I've been blessed with a sense of history—by virtue of having lived through quite a bit of it.) The 8080 was a white-hot little item at the time. I had one that ran at 1 MHz, and it was a pretty effective word processor, which is mostly what I did with it.

The 8080 was an 8-bit CPU, meaning it processed 8 bits of information at a time. However, it had 16 address lines coming out of it. The "bitness" of a CPU (how many bits wide its general-purpose registers are) is important, but to my view the far more important measure of a CPU's effectiveness is how many address lines it can muster in one operation. In 1974, 16 address lines was aggressive, because memory was extremely expensive, and most machines had 4K or 8K bytes (remember, that means roughly 4,000 or 8,000; exactly 4,096 or 8,192) at most, and some had a lot less.

Sixteen address lines will address 64K bytes. If you count in binary (which computers always do) and limit yourself to 16 binary columns, you can count from 0 to 65,535. (The colloquial "64K" is shorthand for the number 65,536.) This means that every one of 65,536 separate memory locations can have its own unique address, from 0 up to 65,535.
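If you'd like to check that arithmetic yourself, here is a minimal Python sketch. The function name `address_space` is my own invention for illustration; nothing in the 8080 world is called that.

```python
def address_space(address_lines):
    """Return (number of locations, highest address) for a count of address lines."""
    locations = 2 ** address_lines   # every added address line doubles the count
    highest = locations - 1          # addresses start at 0, so the top is one less
    return locations, highest


locations, highest = address_space(16)
print(locations)            # 65536
print(highest)              # 65535
print(f"{highest:04X}H")    # FFFFH
```

Feed it 20 instead of 16 and you get the numbers for the 8086's megabyte, which comes up later in this chapter.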

The 8080 memory-addressing scheme was very simple: you put a 16-bit address out on the address lines, and you got back the 8-bit value that was stored at that address. Note well: there is no necessary relation between the number of address lines in a memory system and the size of the data stored at each location. The 8080 stored 8 bits at each location, but it could have stored 16 or even 32 bits at each location, and still have 16 memory address lines.
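A quick way to convince yourself that address lines and data width are independent is to hold the address lines fixed and vary the width. This is only a sketch: the constant and function names are hypothetical, and the 16-bit and 32-bit cases describe imaginary machines, not the 8080.

```python
ADDRESS_LINES = 16
LOCATIONS = 2 ** ADDRESS_LINES      # 65,536 addressable locations either way


def capacity_in_bytes(bits_per_location):
    """Total storage when every location holds the given number of bits."""
    return LOCATIONS * bits_per_location // 8


for width in (8, 16, 32):
    # The number of addresses never changes; only the size of each cell does.
    print(width, LOCATIONS, capacity_in_bytes(width))
```

The middle column stays at 65,536 in every row. Only the total capacity grows as each location gets wider.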

By far and away, the operating system most used with the 8080 was CP/M-80. CP/M-80 was a little unusual in that it existed at the top of installed memory—sometimes so that it could be contained in ROM, but mostly just to get it out of the way and allow a consistent memory starting point for transient programs, those that (unlike the operating system) were loaded into memory and run only when needed. When CP/M-80 read a program in from disk to run it, it would load the program into low memory, at address 0100H—that is, 256 bytes from the very bottom of memory. The first 256 bytes of memory were called the program segment prefix (PSP) and contained various odd bits of information as well as a general-purpose memory buffer for the program's disk input/output (I/O). The executable code itself did not begin until address 0100H.

I've drawn the 8080 and CP/M-80 memory model in Figure 4-1.

[Diagram labels, from the bottom of memory upward, all reached through a 16-bit memory address: Program Segment Prefix (PSP); Transient Program Code (code execution begins here); Unused Memory; CP/M-80 Operating System; Top of Installed Memory (often 16K, 32K, or 48K); Addresses Without Installed Memory]

Figure 4-1: The 8080 memory model
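The CP/M-80 layout just described can be sketched in a few lines of Python. Treat this as a toy under stated assumptions: `load_transient` is a name I made up, the three program bytes are placeholders, and the sketch does not model the operating system sitting at the top of installed memory.

```python
MEMORY_SIZE = 0x10000     # the full 64K that 16 address lines can reach
PSP_SIZE = 0x0100         # the first 256 bytes, reserved ahead of the program
LOAD_ADDRESS = 0x0100     # transient programs are loaded starting here


def load_transient(memory, program):
    """Copy a program image into low memory and return its entry address."""
    end = LOAD_ADDRESS + len(program)
    if end > len(memory):
        raise ValueError("program does not fit in memory")
    memory[LOAD_ADDRESS:end] = program
    return LOAD_ADDRESS


memory = bytearray(MEMORY_SIZE)
program = bytes([0x3E, 0x2A, 0x76])   # placeholder bytes standing in for real code
entry = load_transient(memory, program)
print(f"{entry:04X}H")                # 0100H
```

Notice that the first 256 bytes are left alone. The program always starts at the same place, whatever the amount of memory installed.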

The 8080's memory model as used with CP/M-80 was simple, and people used it a lot; so when Intel created its first 16-bit CPU, the 8086, it wanted to make it easy for people to translate older CP/M-80 software from the 8080 to the 8086, a process called porting. One way to do this was to make sure that a 16-bit addressing system such as that of the 8080 still worked. So, even though the 8086 could address 16 times as much memory as the 8080 (16 x 64K = 1MB), Intel set up the 8086 so that a program could take some 64K byte segment within that megabyte of memory and run entirely inside it, just as though it were the smaller 8080 memory system.

This was done by the use of segment registers, which are basically memory pointers located in CPU registers that point to a place in memory where things begin, be this data storage, code execution, or anything else. You'll learn a lot more about segment registers very shortly. For now, it's enough to think of them as pointers indicating where, within the 8086's megabyte of memory, a program ported from the 8080 world would begin (see Figure 4-2).

[Diagram labels: 20-Bit Memory Address; Segment Register CS]

Figure 4-2: The 8080 memory model inside an 8086 memory system

When speaking of the 8086 and 8088, there are four segment registers to consider (again, we'll be dealing with them in detail very soon). For the purposes of Figure 4-2, consider the register called CS—which stands for code segment. Again, it's a pointer to a location within the 8086's megabyte of memory. This location acts as the starting point for a 64K region of memory, within which a quickly converted CP/M-80 program could run very happily.
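Here is a rough Python sketch of that idea: a starting point somewhere in the megabyte, plus a 16-bit offset from it. Be warned that this is a simplification of my own. The chapter has not yet explained how the CPU turns the value in a segment register into a 20-bit address, so `segment_start` here is simply a byte address picked for illustration, and the function name is hypothetical.

```python
MEGABYTE = 0x100000       # 1,048,576 bytes of real mode memory
SEGMENT_SIZE = 0x10000    # the 64K region a ported program runs inside


def physical_address(segment_start, offset):
    """Combine a region's starting address with a 16-bit offset into it."""
    if not 0 <= offset < SEGMENT_SIZE:
        raise ValueError("offset must fit in 16 bits")
    address = segment_start + offset
    if not 0 <= address < MEGABYTE:
        # Simplification: this sketch just refuses addresses past the megabyte.
        raise ValueError("address falls outside the megabyte")
    return address


segment_start = 0x30000   # hypothetical place where the 64K region begins
print(f"{physical_address(segment_start, 0x0100):05X}H")   # 30100H
print(f"{physical_address(segment_start, 0xFFFF):05X}H")   # 3FFFFH
```

The ported program only ever deals in the 16-bit offsets, exactly as it did on the 8080. Where the region begins is somebody else's problem.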

This was very wise short-term thinking, and catastrophically bad long-term thinking. Any number of CP/M-80 programs were converted to the 8086 within a couple of years. The problems began big-time when programmers attempted to create new programs from scratch that had never seen the 8080 and had no need for the segmented memory model. Too bad: the segmented model dominated the architecture of the 8086. Programs that needed more than 64K of memory at a time had to use memory in 64K chunks, switching between chunks by switching values into and out of segment registers.

This was a nightmare. There is one good reason to learn it, however: understanding the way real-mode segmented memory addressing works will help you understand how the two x86 flat models work, and in the process you will come to understand the nature of the CPU a lot better.

The Nature of a Megabyte

When running in segmented real mode, the x86 CPUs can use up to one megabyte of directly addressable memory. This memory is also called real mode memory. As discussed briefly in Chapter 3, a megabyte of memory is actually not 1 million bytes of memory, but 1,048,576 bytes. As with the shorthand term "64K," a megabyte doesn't come out even in our base 10 because computers operate on base 2. Those 1,048,576 bytes expressed in base 2 are 100000000000000000000B bytes. That's 2^20, a fact that we'll return to shortly. The printed number 100000000000000000000B is so bulky that it's better to express it in the compatible (and much more compact) base 16, the hexadecimal system described in Chapter 2. The quantity 2^20 is equivalent to 16^5, and may be written in hexadecimal as 100000H. (If the notion of number bases still confounds you, I recommend another trip through Chapter 2, if you haven't been through it already, or perhaps even if you have.)
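If you have a Python prompt handy, you can confirm those equivalences in a few lines. This is only a sanity check of the arithmetic, and the variable name is mine.

```python
megabyte = 2 ** 20

print(megabyte)               # 1048576
print(megabyte == 16 ** 5)    # True
print(f"{megabyte:b}B")       # a 1 followed by 20 zeros
print(f"{megabyte:X}H")       # 100000H
```

One hex digit stands for exactly four binary digits, which is why 20 binary zeros collapse into 5 hex zeros.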

Now, here's a tricky and absolutely critical question: In a bank of memory containing 100000H bytes, what's the address of the very last byte in the memory bank? The answer is not 100000H. The clue is the flip side to that question: What's the address of the first byte in memory? That answer, you might recall, is 0. Computers always begin counting from 0. (People generally begin counting from 1.) This disconnect occurs again and again in computer programming. From a computer programming perspective, the last in a row of four items is item number 3, because the first item in a row of four is item number 0. Count: 0, 1, 2, 3.

The address of a byte in a memory bank is just the number of that byte starting from zero. This means that the last, or highest, address in a memory bank containing one megabyte is 100000H minus one, or 0FFFFFH. (The initial zero, while not mathematically necessary, is there for the convenience of your assembler, and helps keep the assembler program from getting confused. Get in the habit of using an initial zero on any hex number beginning with the hex digits A through F.)

The addresses in a megabyte of memory, then, run from 00000H to 0FFFFFH. In binary notation, that is equivalent to the range of 00000000000000000000B to 11111111111111111111B. That's a lot of bits: 20, to be exact. If you refer back to Figure 3-3 in Chapter 3, you'll see that a megabyte memory bank has 20 address lines. One of those 20 address bits is routed to each of those 20 address lines, so that any address expressed as 20 bits will identify one and only one of the 1,048,576 bytes contained in the memory bank.
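The off-by-one between "how many bytes" and "the highest address" is easy to check in Python. Again, this is just a sketch of the arithmetic with variable names of my own choosing.

```python
size = 0x100000            # number of bytes in the megabyte
first = 0                  # counting starts at zero
last = size - 1            # so the highest address is one less than the size

print(f"{first:05X}H")     # 00000H
print(f"{last:05X}H")      # FFFFFH
print(f"{last:b}B")        # 20 ones in a row
print(last.bit_length())   # 20
```

Python prints the hex value without the leading zero that an assembler wants in front of A through F, so in assembly source you would write that last address as 0FFFFFH.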

That's what a megabyte of memory is: some arrangement of memory chips within the computer, connected by an address bus of 20 lines. A 20-bit address is fed to those 20 address lines to identify 1 byte out of the megabyte.

Backward Compatibility and Virtual 86 Mode

Modern x86 CPUs such as the Pentium can address much more memory than this, and I'll explain how and why shortly. With the 8086 and 8088 CPUs, the 20 address lines and one megabyte of memory were literally all they had. The 386 and later Intel CPUs could address 4 gigabytes of memory without carving it up into smaller segments. When a 32-bit CPU is operating in protected mode flat model, a segment is 4 gigabytes, so one segment is, for the most part, plenty.

However, a huge pile of DOS software written to make use of segments was still everywhere in use and had to be dealt with. So, to maintain backward compatibility with the ancient 8086 and 8088, newer CPUs were given the power to limit themselves to what the older chips could address and execute. When a Pentium-class CPU needs to run software written for the real mode segmented model, it pulls a neat trick that, temporarily, makes it become an 8086. This is called virtual-86 mode, and it provided excellent backward compatibility for DOS software.

When you launch an MS-DOS window or "DOS box" under Windows NT and later versions, you're using virtual-86 mode to create what amounts to a little real mode island inside the Windows protected mode memory system. It was the only good way to keep that backward compatibility, for reasons you will understand fairly soon.

16-Bit Blinders

In real mode segmented model, an x86 CPU can "see" a full megabyte of memory. That is, the CPU chips set themselves up so that they can use 20 of their 32 address pins and can pass a 20-bit address to the memory system. From that perspective, it seems pretty simple and straightforward. However, the bulk of the trouble you might have in understanding real mode segmented model stems from this fact: whereas those CPUs can see a full megabyte of memory, they are constrained to look at that megabyte through 16-bit blinders.

The blinders metaphor is closer to literal than you might think. Look at Figure 4-3. The long rectangle represents the megabyte of memory that the CPU can address in real mode segmented model. The CPU is off to the right. In the middle is a piece of metaphorical cardboard with a slot cut in it. The slot is 1 byte wide and 65,536 bytes long. The CPU can slide that piece of cardboard up and down the full length of its memory system. However, at any one time, it can access only 65,536 bytes.
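The cardboard-with-a-slot picture translates almost directly into Python, where a `memoryview` slice plays the part of the slot. This is a sketch of the metaphor only: `view_through_slot` is a name I made up, and where exactly the slot is allowed to sit is a detail the sketch ignores.

```python
MEGABYTE = 1_048_576      # the long rectangle: all of real mode memory
WINDOW = 65_536           # the slot: 1 byte wide, 65,536 bytes long


def view_through_slot(memory, slot_start):
    """Return the 65,536 bytes visible when the slot begins at slot_start."""
    if not 0 <= slot_start <= len(memory) - WINDOW:
        raise ValueError("slot must lie entirely within memory")
    # A memoryview slice looks at the same bytes without copying them.
    return memoryview(memory)[slot_start:slot_start + WINDOW]


memory = bytearray(MEGABYTE)
memory[0x12345] = 0xAB                  # plant a byte somewhere in the megabyte

slot = view_through_slot(memory, 0x10000)
print(len(slot))                        # 65536
print(hex(slot[0x2345]))                # 0xab
```

The byte at 12345H in the megabyte shows up at position 2345H inside the slot. Slide the slot somewhere else and that same byte either appears at a different position or drops out of sight entirely.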


A full one megabyte (1,048,576 bytes) of memory is at the CPU's disposal. However...
