How C Sees Command Line Arguments

In Chapter 11 I explained how to access command-line arguments from a Linux program as part of a more general discussion of stack frames. One of the odder things about linking and calling functions out of the standard C library in glibc is that the way you access command-line arguments changes, and changes significantly.

The arguments are still on the stack, as is the table of argument addresses. (If you haven't been through that section of Chapter 11 yet, please go to page 427 and study it now or this section won't make much sense.) The problem is that the glibc startup code places other things there as well, and those other things are now in the way.

Fortunately, you have a way past those obstructions, which include important things like the return address that takes execution into the glibc shutdown code and allows your program to make a graceful exit. The way past is your "thumb in the stack," EBP. EBP anchors your access to both your own items (stored down-memory from EBP) and those placed on the stack by glibc, which are up-memory.

Glibc adds a pointer to the stack frame that points to the table of addresses pointing to the arguments themselves. Because we're talking about a pointer to a pointer to a table of pointers to the actual argument strings, the best way to begin is to draw a picture of the lay of the land. Figure 12-4 shows the pointer relationships and stack structures you have to understand to identify and read the command-line arguments.

Figure 12-4: Accessing command-line arguments when the C library is present (diagram showing the stack, the argument pointer table, and the arguments)

Immediately above EBP is the return address for your portion of the program. When your code is done, it executes a ret instruction, which uses this return address to take execution back into the C library's shutdown sequence. You don't need to access this return address directly for anything, and you certainly shouldn't change it.

Immediately above the return address, at offset 8 from EBP (as the literature would say, at EBP+8) is an integer count of the number of arguments. Don't be confused: This is a duplicate copy of the argument count that exists just below the table of argument addresses, as described in Chapter 11.

Immediately above the argument count, at EBP+12, is a pointer to the table of argument addresses. Immediately above that, at EBP+16, is a pointer to the table of environment variable pairs. Reading environment variable pairs is done much the same way as reading command-line arguments, so if you understand one, you won't have much trouble with the other.

Even with all that additional indirection, it takes surprisingly little code to display a list of arguments:

        mov edi,[ebp+8]         ; Load argument count into EDI
        mov ebx,[ebp+12]        ; Load pointer to argument table into EBX
        xor esi,esi             ; Clear ESI to 0
.showit:
        push dword [ebx+esi*4]  ; Push address of an argument on the stack
        push esi                ; Push argument number
        push ArgMsg             ; Push address of display string
        call printf             ; Display the argument # and argument
        add esp,12              ; Stack cleanup: 3 parms x 4 bytes = 12
        inc esi                 ; Bump argument # to next argument
        dec edi                 ; Decrement argument counter by 1
        jnz .showit             ; If argument count is 0, we're done

The argument count and the address of the argument table are at fixed offsets from EBP. The argument count goes into EDI. The address of the argument table goes into EBX. ESI is cleared to 0 and serves as an index into the table of argument addresses; the push instruction scales it by 4, the size of each address. With that accomplished, we go into a loop that pushes the argument pointer, the argument number, and a base string onto the stack and calls printf() to display them. After printing each argument, we increment the argument number in ESI and decrement the argument count in EDI. When EDI goes to 0, we've displayed all the arguments, and we're done.

One important note about this program, which I've said before but must emphasize: If you're calling a C library function in a loop, either you must use the sacred registers to hold the counters that govern the loop or you must push them onto the stack before making a library call. The library trashes the nonsacred registers EAX, ECX, and EDX. If you had tried to store the argument count in ECX, the count would have been destroyed the first time you called printf(). The sacred nature of EBX, ESI, and EDI makes them ideal for this use.

A full program incorporating the preceding code is present in the listings archive for this book, as showargs3.asm.
