Looking at File Internals with the Bless Editor

Very fortunately, there are utilities that can open, display, and enable you to change characters or binary bytes inside any kind of file. These are called binary editors or hexadecimal editors, and the best of them in my experience (at least for the Linux world) is the Bless Hex Editor. It was designed to operate under graphical user interfaces such as Gnome, and it is very easy to figure out by exploring the menus.

Bless is not installed by default under Ubuntu. You can download it free of charge from its home page:

http://home.gna.org/bless/

However, you can very easily install it from the Ubuntu Applications menu. Select Add/Remove and leave the view set to All (the default). Type Bless in the Search field, and the Bless Hex Editor should be the only item to appear. (Give it a few seconds to search; the item won't appear instantaneously.) Check its check box to select it for installation, and then click Apply. Once installed, the Bless Hex Editor will be available in Applications → Programming, or you can create a desktop launcher for it if you prefer.

Demonstrating Bless will also demonstrate why it's necessary for programmers to understand even text files at the byte level. In the listings archive for this book (see the Introduction for the URL) are two files, samwindows.txt and samlinux.txt. Extract them both. Launch Bless, and using the File → Open command, open samlinux.txt. When that file has been opened, use File → Open again to open samwindows.txt. What you'll see should look like Figure 5-1.

Figure 5-1: Displaying a Linux text file with the Bless Hex Editor

I've shortened the display pane vertically just to save space here on the printed page; after all, the file itself is only about 15 bytes long. Each opened file has a tab in the display pane, and you can switch instantly between files by clicking on the tabs.

The display pane is divided into three parts. The left column is the offset column. It contains the offset from the beginning of the file for the first byte displayed on that line in the center column. The offset is given in hexadecimal. If you're at the beginning of the file, the offset column will be 00000000. The center column is the hex display column. It displays a line of data bytes from the file in hexadecimal format. How many bytes are shown depends on how you size the Bless window and what screen resolution you're using. The minimum number of bytes displayed is (somewhat oddly) seventeen. In the center column the display is always in hexadecimal, with each byte separated from adjacent bytes by a space. The right column is the same line of data with any "visible" text characters displayed as text. Nondisplayable binary values are represented by period characters.
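The three-column layout can be sketched in a few lines of Python. This is only an illustration of the layout just described, not how Bless itself is implemented. The function name hex_dump and the bytes_per_line parameter are hypothetical, the default of 17 simply mirrors the minimum width mentioned above, and the sample bytes are an assumption about what samlinux.txt contains.

```python
def hex_dump(data: bytes, bytes_per_line: int = 17) -> list:
    """Format data as offset, hex, and text columns (illustrative sketch)."""
    lines = []
    hex_width = bytes_per_line * 3 - 1  # two digits per byte plus separators
    for offset in range(0, len(data), bytes_per_line):
        chunk = data[offset:offset + bytes_per_line]
        # Center column: each byte as two uppercase hex digits, space-separated.
        hex_col = " ".join(f"{b:02X}" for b in chunk)
        # Right column: printable ASCII as-is, everything else as a period.
        text_col = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
        # Left column: offset of the first byte on this line, in hex.
        lines.append(f"{offset:08X}  {hex_col:<{hex_width}}  {text_col}")
    return lines


if __name__ == "__main__":
    sample = b"Sam\nwas\na\nman.\n"  # assumed contents of samlinux.txt
    for line in hex_dump(sample):
        print(line)
```

Passing a different bytes_per_line changes where the lines wrap, much as resizing the Bless window does, and the offset column changes with it.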

If you click on the samwindows.txt tab, you'll see the same display for the other file, which was created using the Windows Notepad text editor. The samwindows.txt file is a little longer, and you have a second line of data bytes in the center column. In Figure 5-2, where each line holds 17 bytes, the offset for the second line is 00000011. This is the offset in hex of the first of the two bytes in the second line.

Why are the two files different? Bring up a terminal window and use the cat command to display both files. The display in either case will be identical:

Sam
was
a
man.

Figure 5-2 shows the Bless editor displaying samwindows.txt. Look carefully at the two files as Bless displays them (or at Figures 5-1 and 5-2) and try to figure out the difference on your own before continuing.

[Screenshot of the Bless window with two tabs, samlinux.txt and samwindows.txt, showing samwindows.txt. The display pane reads:]

00000000  53 61 6D 0D 0A 77 61 73 0D 0A 61 0D 0A 6D 61 6E 2E  Sam..was..a..man.
00000011  0D 0A                                               ..

[Status bar: Offset: 0x0 / 0x12   Selection: None   INS]

Figure 5-2: Displaying a Windows text file with the Bless editor

At the end of each line of text in both files is a 0AH byte. The Windows version of the file has a little something extra: a 0DH byte preceding each 0AH byte. The Linux file lacks the 0DH bytes. As standardized as "plain" text files are, there can be minor differences depending on the operating system under which the files were created. As a convention, Windows text files (and DOS text files in older times) mark the end of each line with two characters: 0DH followed by 0AH. Linux (and nearly all Unix-descended operating systems) marks the end of each line with a 0AH byte only.
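If you'd rather check a file's line-ending convention without opening a hex editor, counting the two byte patterns is enough. The following is a minimal sketch; the function name count_line_endings is made up for illustration, and the two byte strings are assumed to match the contents of the sample files.

```python
def count_line_endings(data: bytes) -> dict:
    """Count 0DH 0AH pairs and lone 0AH bytes in data."""
    crlf = data.count(b"\r\n")
    # Every 0DH 0AH pair also contains a 0AH, so subtract the pairs
    # from the total number of 0AH bytes to get the lone ones.
    lone_lf = data.count(b"\n") - crlf
    return {"crlf": crlf, "lf": lone_lf}


if __name__ == "__main__":
    windows_text = b"Sam\r\nwas\r\na\r\nman.\r\n"  # assumed samwindows.txt
    linux_text = b"Sam\nwas\na\nman.\n"            # assumed samlinux.txt
    print(count_line_endings(windows_text))
    print(count_line_endings(linux_text))
```

To apply it to a real file, read the file in binary mode (open(path, "rb")) so that Python does not translate the line endings before you count them.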

As you've seen in using cat on the two files, Linux displays both versions identically and accurately. However, if you were to take the Linux version of the file and load it into the Windows Notepad text editor, you'd see something a little different, as shown in Figure 5-3.

[Screenshot of a Notepad window titled "samlinux.txt - Notepad". The whole file appears on a single line, with a thin rectangle in place of each 0AH byte: Sam□was□a□man.□]

Figure 5-3: A Linux text file displayed under Windows

Notepad expects to see both the 0DH and the 0AH at the end of each text line, and doesn't understand a lonely 0AH value as an end-of-line (EOL) marker. Instead, it inserts a thin rectangle everywhere it sees a 0AH, as it would for any single character that it didn't know how to display or interpret. Not all Windows software is that fussy. Many or most other Windows utilities understand that 0AH is a perfectly good EOL marker.

The 0DH bytes at the end of each line are another example of a "fossil" character. Decades ago, in the Teletype era, there were two separate electrical commands built into Teletype machines to handle the end of a text line when printing a document. One command indexed the paper upward to the next line, and the other returned the print head to the left margin. These were called line feed and carriage return, respectively. Carriage return was encoded as 0DH and line feed as 0AH. Most computer systems and software now ignore the carriage return code, though a few (like Notepad) still require it for proper display of text files.

This small difference in text file standards won't be a big issue for you, and if you're importing files from Windows into Linux, you can easily remove the extra carriage return characters manually, or—what a notion!—write a small program in assembly to do it for you. What's important for now is that you understand how to load a file into the Bless Hex Editor (or whatever hex editor you prefer; there are many) and inspect the file at the individual byte level.
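The byte-level logic of such a converter is small. The sketch below uses Python rather than assembly, purely to show the idea; the function names crlf_to_lf and convert_file are hypothetical, and the approach (read the whole file, replace, write) assumes the file fits comfortably in memory.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace each 0DH 0AH pair with a single 0AH; leave other bytes alone."""
    return data.replace(b"\r\n", b"\n")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path in binary mode and write the converted bytes to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(crlf_to_lf(data))


if __name__ == "__main__":
    windows_text = b"Sam\r\nwas\r\na\r\nman.\r\n"  # assumed samwindows.txt
    print(crlf_to_lf(windows_text))
```

Only 0DH bytes that are immediately followed by 0AH are removed, so a stray carriage return elsewhere in the file is left for you to inspect in the hex editor.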

You can do more with Bless than just look. Editing of a loaded file can be done in either the center (binary) column or the right (text) column. You can bounce the edit cursor between the two columns by pressing the Tab key. Within either column, the cursor can be moved from byte to byte by using the standalone arrow keys. Bless respects the state of the Insert key, and you can either type over or insert bytes as appropriate.
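The difference between typing over a byte and inserting one can be modeled with a Python bytearray. This is a sketch of the general overtype/insert distinction, not of how Bless handles edits internally; both function names are invented for the example.

```python
def overwrite_byte(data: bytearray, offset: int, value: int) -> None:
    """Overtype mode: replace the byte at offset; the length stays the same."""
    data[offset] = value


def insert_byte(data: bytearray, offset: int, value: int) -> None:
    """Insert mode: add a byte at offset; later bytes move up by one."""
    data.insert(offset, value)


if __name__ == "__main__":
    buf = bytearray(b"Sam\n")
    overwrite_byte(buf, 0, 0x52)  # 53H 'S' becomes 52H 'R'
    print(buf, len(buf))
    insert_byte(buf, 3, 0x0D)     # put a 0DH in front of the 0AH
    print(buf, len(buf))
```

Inserting changes the offset of every byte after the insertion point, which matters if anything else depends on where data sits in the file.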

I shouldn't have to say that once you've made changes to a file, you need to save it back to disk by clicking the Save button.
