Originally published March 1, 2017 @ 1:12 am
This is a follow-up to my previous wget notes (1, 2, 3, 4). From time to time I find myself googling wget syntax, even though I think I've used every option of this excellent utility over the years. Perhaps my memory is not what it used to be, but I'm probably the most frequent visitor to my own Web site… Anyway, here's the grand list of the more useful wget snippets.
Download tar.gz and uncompress with a single command:
wget -q "${url}/archive.tar.gz" -O - | tar xz
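To unpack somewhere other than the current directory, the same pipe works with tar's -C option (the target directory below is just a placeholder):
wget -q "${url}/archive.tar.gz" -O - | tar xz -C /opt/unpacked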
Download tar.bz2 and uncompress with a single command:
wget -q "${url}/archive.tar.bz2" -O - | tar xj
Download in the background, limit bandwidth to 200 KB/s, do not ascend to the parent URL, download only newer files, do not create new directories, download only htm, html, php, and pdf files, and wait 5 seconds between requests:
wget -b --limit-rate=200k -np -N -m -nd --accept=htm,html,php,pdf --wait=5 "${url}"
Download recursively, span multiple hosts, convert links to local, limit recursion level to 4, fake “mozilla” user agent, ignore “robots” directives:
wget -r -H --convert-links --level=4 --user-agent=Mozilla "${url}" -e robots=off
Generate a list of broken links:
wget --spider -o broken_links.log --wait 2 -r -p "${url}" -e robots=off
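The resulting log is verbose; to list only the requests that failed, a grep along these lines should work (it matches the status line wget writes for each request; adjust the pattern for other error codes):
grep -B 2 '404 Not Found' broken_links.log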
Download new PDFs from a list of URLs:
wget -r --level=1 -H --timeout=2 -nd -N -np --accept=pdf -e robots=off -i urls.txt
Save and use authentication cookie:
wget -O ~/.domain_cookie_tmp "https://domain.com/login.cgi?login=${username}&password=${password}"
grep "^cookie" ~/.domain_cookie_tmp | awk -F'=' '{print $2}' > ~/.domain_cookie wget -c --no-cookies --header="Cookie: enc=`cat ~/.domain_cookie`" -i "${url_file}" -nc
Use wget with anonymous proxy:
export http_proxy=proxy_server:port
wget -Y -O /tmp/yahoo.htm "http://www.yahoo.com"
Use wget with authorized proxy:
export http_proxy=proxy_server:port
wget -Y --proxy-user="${username}" --proxy-password="${password}" \
  -O /tmp/yahoo.htm "http://www.yahoo.com"
Make a local mirror of a Web site, including FTP links; limit the download rate to 50 KB/s; wait 5 seconds between requests, with some randomization; ignore the robots directive:
wget -U Mozilla -m -k -D ${domain} --follow-ftp \
  --limit-rate=50k --wait=5 --random-wait -np "${url}" -e robots=off
Download images from a Web site:
wget -r -l 0 -U Mozilla -t 1 -nd -D ${domain} \
  -A jpg,jpeg,gif,png "${url}" -e robots=off
Extract a list of HTTP(S) and FTP(S) links from a single URL:
wget -qO- "${url}" | grep -oE "(https?|ftps?)://[^\<\>\"\' ]+" | sort -u
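To feed that list straight back into wget, save it to a file (links.txt here is an arbitrary name) and use -i, as in the snippet further down:
wget -qO- "${url}" | grep -oE "(https?|ftps?)://[^\<\>\"\' ]+" | sort -u > links.txt
wget -i links.txt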
Mirror a subfolder of a site:
wget -mk -w 20 -np "${url}"
Update only changed files:
wget -mk -w 20 -N "${url}"
Mirror site with random delay between requests:
wget -w 20 --random-wait -mk "${url}"
Download a list of URLs from a file:
wget -i "${url_file}"
Resume interrupted file download:
wget -c "${file_url}"
Download files in the background:
wget -b "${url}"
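With -b, wget detaches and writes its progress to wget-log in the current directory (unless -o points it elsewhere), so the download can be monitored with:
tail -f wget-log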
Download the first two levels of pages from a site:
wget -r -l2 "${url}"
Make a static copy of a dynamic Web site two levels deep:
wget -P /var/www/html/ -mpck -l2 --user-agent="Mozilla" -e robots=off -E "${url}"