The sgrep command (structured grep) extracts sections from structured files, rather as grepmail does for mailboxes, but for far more general types of files. You can use it to pull particular sections out of XML or HTML files (based on the content and the markup surrounding them), program source files, mailboxes, or any file with a known and defined structure. It is mentioned here to alert you to its existence: for files with a known and clearly defined structure, it may be by far the quickest way to extract information, and it can save you from having to write complex scripts.

Here is a very simple example:

user@bible:~ > cat index.html

<title>Web Page Title</title>

user@bible:~ > sgrep '"<title>"__"</title>"' index.html

Web Page Title

Here you are searching for text enclosed by the opening and closing HTML title tags, and the command outputs the relevant string.
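
The same extraction can be tried on a throwaway file, as in the minimal sketch below. The sample file contents and the variable names are hypothetical. The sketch assumes that the sgrep on your path is the structured grep tool described here; if sgrep is not installed, it falls back to a sed expression that approximates the result only when both tags sit on the same line.

```shell
# Create a small sample file (hypothetical content, for illustration only).
sample=$(mktemp)
printf '%s\n' '<html>' '<head><title>Web Page Title</title></head>' '</html>' > "$sample"

if command -v sgrep >/dev/null 2>&1; then
    # Region between the end of "<title>" and the start of "</title>".
    title=$(sgrep '"<title>" __ "</title>"' "$sample")
else
    # Fallback (assumption: both tags are on one line): keep only the
    # text captured between the opening and closing title tags.
    title=$(sed -n 's:.*<title>\(.*\)</title>.*:\1:p' "$sample")
fi

echo "$title"
rm -f "$sample"
```

The sed fallback works line by line, so it cannot follow a region that spans several lines; handling that kind of structure is what sgrep is for.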

The sgrep package is not installed by default; you may have to install it from the installation media before trying it out.

