Changing Web Page Locations

As a general rule, Apache supports two types of static web pages: a site's web pages and individual users' web pages. A site's web pages are maintained by the system administrator or a designated webmaster; most ordinary users can't modify these pages. Multiuser systems sometimes provide users with web space. These pages are typically served from subdirectories of the users' home directories. Naturally, Apache provides tools for changing the locations of both site web pages and individuals' web pages.

URI Components

To understand how Apache returns web pages, it's helpful to look at how an HTTP request is structured. The usual form of this request, as typed by a person in the Address or Location field of a web browser, is a Uniform Resource Identifier (URI), which looks like this:

http://www.threeroomco.com/products/biggie.html#orig

Note The URI acronym is the official replacement for another acronym, Uniform Resource Locator (URL). Although URL is still in common use, it's officially an "informal" term.

This URI consists of four components:

The Protocol The first few characters of a URI specify the protocol, which is http in this case. The protocol is terminated by a colon (:), and in many cases (including URIs for HTTP transfers) two slashes follow it. Other common protocols in URIs include https (secure HTTP), ftp, and mailto (for email addresses).

The Hostname The hostname follows the protocol name in HTTP URIs, as well as some other types of URIs, such as FTP URIs. In this example, the hostname is www.threeroomco.com.

The Filename After the hostname in HTTP URIs comes the filename that's to be retrieved—/products/biggie.html in this example. The filename can be a single file or a complete path to a file, as in this example. Normally, the filename is specified relative to the server's document root, as described next, in "Changing the Site's Web Page." If a tilde (~) leads the filename, though, it's relative to a specified user's web storage area, as described in the upcoming section, "Enabling User Web Pages."

Additional Information Some URIs include additional information. The preceding example specifies #orig after the filename, meaning that the browser should locate an anchor called orig within the page and display the text at that point. Dynamic content uses this part of the URI, typically in the form of a query string introduced by a question mark (?), to enable browsers to pass data to the web server for processing.

Many of these components can be omitted or abbreviated. For instance, most web browsers assume an HTTP transfer if you start the URI with the hostname. If you omit the filename, the web server assumes a default filename. In Apache, you can set this default with the DirectoryIndex directive. If you provide more than one value for this directive, Apache tries each one in the order listed and returns the first file it finds. Most installations create a default that searches for one or more of index.htm, index.html, or index.shtml. If you're moving an existing set of web pages to Apache and that set includes a different default index filename, you may want to change the default.
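As a minimal sketch, a DirectoryIndex line that lists several candidate filenames might look like the following. The first three names are the common defaults mentioned above; default.htm is a hypothetical name standing in for an index file inherited from another web server.

```apache
# Candidate index files, tried left to right; the first one that exists is served.
# default.htm is a hypothetical name carried over from a previous web server.
DirectoryIndex index.html index.htm index.shtml default.htm
```

With a line like this, a request for http://www.threeroomco.com/products/ would return index.html from the products directory if that file exists, and fall back to the later names only if it doesn't.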

Changing the Site's Web Page

One of the earliest directives in the Apache configuration file is probably a DocumentRoot directive, which tells Apache where to look for the web pages it delivers. You'll find the default web pages, such as the one displayed in Figure 23.1, in this location. In order to use Apache to deliver your own site's pages, you can do one of two things:

• Change the DocumentRoot directive to point to another directory in which you've stored your website's pages.

• Replace the files in the default DocumentRoot directory with ones you create.

The first option is slightly preferable because it reduces the odds that your web pages will be accidentally overwritten when you upgrade your web server installation. When you create a new directory to house your website, you should be sure that it's readable to the user under whose name Apache runs. This username is often specified with the User directive in the main Apache configuration file. The group may also be important; that's set via the Group directive. Because public websites seldom contain sensitive data, it's not uncommon to make the directories and the files within them readable to the world.
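The first option might be sketched as follows. The directory path and the apache user and group names are assumptions chosen for illustration (distributions differ on both), and the Require line assumes Apache 2.4; older versions express the same permission with Order and Allow directives.

```apache
# Hypothetical location for the site's pages; substitute your own directory.
DocumentRoot "/srv/www/threeroomco"

# Account and group under which Apache serves requests (names vary by distribution).
User apache
Group apache

# Allow all clients to retrieve files from the document root (Apache 2.4 syntax).
<Directory "/srv/www/threeroomco">
    Require all granted
</Directory>
```

Whatever directory you choose, the account named by the User directive needs read permission on the files and both read and execute permission on the directories that contain them.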

Typically, a user known as the webmaster is responsible for maintaining the website. The webmaster may also be the system administrator, but this isn't always the case. The webmaster normally has full write access to the site's web page directory, and the webmaster may in fact be the owner of this directory tree and all the files within it. The default document root directory isn't normally the webmaster's home directory, though; configuring the system in this way would enable anybody to download files such as the webmaster's .bashrc file.

Enabling User Web Pages

In addition to a site's main web pages, Apache can deliver web pages belonging to individual users. In order to activate this feature, you must set the UserDir directive, which takes the name of a directory within a user's home directory as an argument. For instance, you might use the following definition:

UserDir public_html

Once this directive is set, users can create subdirectories called public_html and store their personal web pages in that directory. For instance, suppose a remote user enters http://www.threeroomco.com/~charlotte/apage.html as a URL. If the server is configured with UserDir set to public_html and if the user charlotte has a home directory of /home/charlotte, then Apache will attempt to return /home/charlotte/public_html/apage.html to the client.
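A slightly fuller sketch of this setup appears below. It assumes that home directories live under /home and that the server runs Apache 2.4; the Directory block is an addition that many configurations need before Apache will actually serve the files.

```apache
# Map http://www.threeroomco.com/~charlotte/apage.html
# to /home/charlotte/public_html/apage.html
UserDir public_html

# Permit access to every user's public_html directory (Apache 2.4 syntax).
# Assumes home directories live under /home.
<Directory "/home/*/public_html">
    Require all granted
</Directory>
```

Apache must also be able to traverse each user's home directory, so the home directory needs execute permission for the account under which Apache runs.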

Tip Delivery of user web pages relies on the userdir_module module. If your site shouldn't deliver user web pages, you may want to remove the LoadModule directive that loads this module. If you remove this directive, an attempt to use the UserDir directive will cause Apache to fail at startup, unless it's surrounded by an <IfModule mod_userdir.c> directive to test for the module's presence.

Warning Be sure when you set up the UserDir directive and the root user's home directory (typically /root) that outsiders can't retrieve files from root's home directory. Such a configuration is a potential security threat.
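One way to combine the module test with protection for root's home directory is sketched below. The path to the module file is an assumption that varies by distribution; the disabled keyword is part of the UserDir directive's own syntax.

```apache
# Load the module (the path to the .so file varies by distribution).
LoadModule userdir_module modules/mod_userdir.so

# The enclosed directives are read only if the module is present,
# so Apache still starts if the LoadModule line is removed.
<IfModule mod_userdir.c>
    # Never serve pages from root's home directory.
    UserDir disabled root
    UserDir public_html
</IfModule>
```

With this arrangement, commenting out the LoadModule line turns off user web pages without requiring any other change to the configuration file.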
