wget

Overview

The wget command is a non-interactive network downloader that retrieves files from web servers using HTTP, HTTPS, and FTP protocols. It’s designed for robust downloading with retry capabilities.

Syntax

wget [options] [URL...]

Common Options

Option             Description
-O file            Output to file
-c                 Continue partial download
-r                 Recursive download
-np                No parent directories
-k                 Convert links for local viewing
-p                 Download page requisites
-m                 Mirror website
-q                 Quiet mode
-v                 Verbose output
-t n               Retry n times
-T n               Timeout in seconds
--limit-rate=rate  Limit download speed
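These options are commonly combined in scripts. A minimal sketch of a wrapper function (the retry count, timeout, and rate cap are illustrative values, not wget defaults):

```shell
# Hedged sketch: resume-capable download with retries, a timeout, and a
# bandwidth cap. All numeric values here are illustrative.
robust_fetch() {
  # $1: URL, $2: output filename
  wget -c -t 5 -T 30 -q --limit-rate=500k -O "$2" "$1"
}
```

Usage: robust_fetch https://example.com/file.zip file.zip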

Download Types

Type         Description
Single file  Download one file
Recursive    Download directory structure
Mirror       Complete website copy
Resume       Continue interrupted download
Batch        Multiple URLs from file

Key Use Cases

  1. Download files from web
  2. Mirror websites
  3. Automated downloads
  4. Backup web content
  5. Batch file retrieval

Examples with Explanations

Example 1: Basic Download

wget https://example.com/file.zip

Downloads file to current directory

Example 2: Save with Different Name

wget -O myfile.zip https://example.com/file.zip

Downloads and saves with specified name

Example 3: Resume Download

wget -c https://example.com/largefile.iso

Continues interrupted download
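On flaky connections, -c can be wrapped in a retry loop that keeps resuming from where the transfer broke off. A sketch (the retry budget and back-off delay are illustrative):

```shell
# Hedged sketch: keep resuming with -c until the download succeeds or
# a retry budget is exhausted. Values are illustrative.
fetch_until_done() {
  # $1: URL
  local tries=0
  until wget -c -q "$1"; do
    tries=$((tries + 1))
    [ "$tries" -ge 10 ] && return 1   # give up after 10 attempts
    sleep 5                           # back off between attempts
  done
}
```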

Recursive Downloads

  1. Download website:

    wget -r -np -k https://example.com/
  2. Mirror with limits:

    wget -m -l 2 https://example.com/
  3. Download directory:

    wget -r -np https://example.com/files/

Advanced Options

Option                  Description
--user-agent=agent      Set user agent
--referer=url           Set referer
--header=header         Add HTTP header
--post-data=data        Send POST request
--no-cookies            Disable cookies
--no-check-certificate  Skip SSL verification
--spider                Check existence without downloading
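--spider is handy in scripts as a lightweight existence check, since the exit status reports reachability without downloading the body. A sketch:

```shell
# Hedged sketch: probe a URL with --spider (no body is downloaded);
# the exit status signals whether the resource is reachable.
url_reachable() {
  wget -q --spider "$1"
}
```

Usage: url_reachable https://example.com/file.zip && echo "found"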

Common Usage Patterns

  1. Download with rate limit:

    wget --limit-rate=200k https://example.com/file.zip
  2. Background download:

    wget -b https://example.com/largefile.iso
  3. Download from file list:

    wget -i urls.txt

Authentication

  1. Basic auth:

    wget --user=username --password=password URL
  2. Certificate auth:

    wget --certificate=cert.pem --private-key=key.pem URL
  3. Cookie authentication:

    wget --load-cookies=cookies.txt URL
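Passing --password on the command line exposes it in ps output and shell history; wget also reads credentials from ~/.netrc automatically. A sketch (the hostname and credentials are placeholders):

```shell
# Hedged sketch: store credentials in ~/.netrc (read automatically by wget)
# instead of passing --password on the command line. Values are placeholders.
touch ~/.netrc
chmod 600 ~/.netrc              # keep the file private to the user
cat >> ~/.netrc <<'EOF'
machine example.com login alice password s3cret
EOF
```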

Performance Analysis

  • Efficient for large files
  • Good retry mechanisms
  • Bandwidth limiting available
  • No built-in parallel mode (run multiple instances for parallel downloads)
  • Resume capability reduces waste
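wget uses one connection per invocation, so parallelism comes from running several instances at once, e.g. via xargs -P. A sketch (the URL-list filename and process count are illustrative):

```shell
# Hedged sketch: up to 4 concurrent wget processes over a URL list
# (one URL per line). wget has no built-in parallel mode; xargs supplies it.
parallel_fetch() {
  # $1: file containing one URL per line
  xargs -n 1 -P 4 wget -q -c < "$1"
}
```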

Best Practices

  1. Use appropriate retry settings
  2. Respect robots.txt
  3. Limit download rate for courtesy
  4. Use resume for large files
  5. Verify downloaded files
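Point 5 can be automated when a checksum is published alongside the file. A sketch using coreutils sha256sum (the file name and digest source are placeholders):

```shell
# Hedged sketch: compare a downloaded file against a known SHA-256 digest.
verify_sha256() {
  # $1: file path, $2: expected hex digest
  echo "$2  $1" | sha256sum -c --status
}
```

Usage: verify_sha256 file.zip "$expected_digest" || echo "checksum mismatch"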

Website Mirroring

  1. Complete mirror:

    wget -m -p -E -k -K -np https://example.com/
  2. Limited depth:

    wget -r -l 3 -k -p https://example.com/
  3. Specific file types:

    wget -r -A "*.pdf,*.doc" https://example.com/

Security Considerations

  1. Verify SSL certificates
  2. Be cautious with --no-check-certificate
  3. Validate downloaded content
  4. Use secure protocols when possible
  5. Check file integrity

Troubleshooting

  1. SSL certificate errors
  2. Connection timeouts
  3. Server blocking requests
  4. Disk space issues
  5. Permission problems
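When diagnosing these failures in scripts, wget's documented exit codes distinguish the cases (0 success, 4 network failure, 5 SSL verification failure, 6 authentication failure, 8 server error response). A sketch that maps them to messages:

```shell
# Hedged sketch: turn wget's documented exit codes into readable diagnostics.
explain_wget_exit() {
  case "$1" in
    0) echo "success" ;;
    4) echo "network failure" ;;
    5) echo "SSL verification failure" ;;
    6) echo "authentication failure" ;;
    8) echo "server issued an error response (e.g. 404)" ;;
    *) echo "other error (code $1)" ;;
  esac
}
```

Usage: wget -q "$url"; explain_wget_exit $?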

Integration Examples

  1. With cron for scheduled downloads:

    0 2 * * * wget -q -O /backup/file.zip https://example.com/file.zip
  2. With find for cleanup:

    wget https://example.com/file.zip && find . -name "*.tmp" -delete
  3. Batch processing:

    while IFS= read -r url; do wget "$url"; done < urls.txt