Using the wget Command for File Transfers

The wget command transfers files using HTTP, HTTPS, and FTP, and it can also retrieve files through an HTTP proxy. Unlike interactive FTP clients, however, wget only supports noninteractive transfers. This is actually a feature, because it means wget can download files as a background process and recursively replicate remote file directories. The command can also resume partially completed downloads, which can save a lot of time during periods of intermittent connectivity or broken connections.
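For instance, the -b option sends a download to the background (with progress logged to a wget-log file) and -c continues a partial download. A quick sketch, using a hypothetical URL:

$ wget -b http://www.example.com/big-file.iso    # fetch in the background; progress goes to wget-log
$ wget -c http://www.example.com/big-file.iso    # resume the same download if it was interrupted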

For example, here is a simple invocation showing FTP retrieval from a remote computer using wget and an FTP URL:

$ wget ftp://phudson:mypasswd@stinky/mp3/*
--13:13:28--  ftp://phudson:*password*@stinky/mp3/*
Connecting to stinky[192.168.2.33]:21... connected.
Logging in as phudson ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD /home/paul/mp3 ... done.
==> PORT ... done.    ==> LIST ... done.

Removed '.listing'.
--13:13:28--  ftp://phudson:*password*@stinky/mp3/C31821-01A.mp3
==> PORT ... done.    ==> RETR C31821-01A.mp3 ... done.
Length: 5,172,089

60% [=====================>    ]  3,123,680   264.80K/s   ETA 00:07

In this example, the user retrieves all files in a directory named mp3 (under /home/paul) on the remote host named stinky. The wget command first retrieves a directory listing and then downloads the specified files (matched by the * wildcard in this example). Note that you can specify a username and password (mypasswd in the example) on the command line. This generally is not a good idea. A better, but still not really secure, approach is to save the password in a file named .wgetrc in your home directory. See the wget man page for more information, or check the online documentation at http://www.gnu.org/software/wget/manual/wget-1.8.1/html_mono/wget.html.
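A minimal sketch of such a file is shown here, reusing the credentials from the example above; the command names match recent wget releases (older versions, such as the 1.8.x series, spelled some of them differently, so check the manual for your version):

$ cat ~/.wgetrc
user = phudson
password = mypasswd
$ chmod 600 ~/.wgetrc    # keep other users from reading the password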

Another popular use for wget is downloading complete copies of websites for offline reading, although it is not very friendly toward website owners who have to pay for all the bandwidth!

To download an entire site, you need to specify the --mirror, --convert-links, and -p parameters, followed by the URL of the site to download. The first parameter tells wget to download all the pages and pictures from the site, following links as it goes. The second tells it to rewrite the HTML so that it works when browsed locally. The last parameter, -p, tells wget to download all the files referenced by the HTML, such as sounds, CSS files, and other related documents. You might also want to specify the -w parameter, which makes wget wait a given number of seconds between individual requests; this stops your download from overloading the web server.

So, the complete command to download a website would be wget --mirror --convert-links -p -w 2 http://www.example.com/.
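If you prefer a more self-documenting command, the short options have long equivalents: -p is --page-requisites and -w is --wait. The same download can therefore be written as follows (example.com stands in for a real site, as above):

$ wget --mirror --convert-links --page-requisites --wait=2 http://www.example.com/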
