Originally published December 14, 2019 @ 11:52 pm
I have a Squid proxy server that uses a long list of authenticated cache peers in a round-robin configuration. The process looks something like this:
The key to getting this setup working well is to weed out unresponsive cache peers. In my case the proxies used as cache peers are ‘premium’ – I pay for the service. The vendor provides me with a regularly updated list of working proxies. Or so they claim. I don’t know how they’re checking this list, but it almost always requires some cleaning.
I wrote a little script that can go through a list of about three thousand proxies in a couple of minutes and select a few hundred that are sufficiently responsive. The script makes sure the proxy’s response meets these four conditions:
1. The proxy responds to a request.
2. The response arrives within the specified time window.
3. The response contains an IP address.
4. This IP address is not your own.
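Conditions 1 and 2 are handled by `curl` and `timeout`, but conditions 3 and 4 can be checked on the response body alone. Here is a minimal sketch of that part, assuming a hypothetical helper called `response_ok` (it is not part of the script further down) and documentation-range addresses as sample data:

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if the body contains an IPv4-looking
# string (condition 3) that differs from our own address (condition 4).
response_ok() {
    local body="$1"
    local realip="$2"
    local seen_ip
    # Take the first dotted-quad pattern found in the body, if any.
    seen_ip="$(printf '%s\n' "${body}" | grep -oE -m1 '([0-9]{1,3}\.){3}[0-9]{1,3}' | head -n1)"
    [ -n "${seen_ip}" ] && [ "${seen_ip}" != "${realip}" ]
}

# Sample data only: 203.0.113.7 plays the proxy, 198.51.100.20 plays "my" IP.
response_ok "203.0.113.7" "198.51.100.20" && echo "usable"
response_ok "198.51.100.20" "198.51.100.20" || echo "leaks real IP"
response_ok "<html>502 Bad Gateway</html>" "198.51.100.20" || echo "no IP in response"
```

Comparing the extracted address for exact equality, rather than as a substring, avoids rejecting a proxy whose address merely contains your own as a fragment.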
The last line of my `/etc/squid/squid.conf` contains this directive:
    include /etc/squid/peers.conf
The `peers.conf` file looks something like this:
    cache_peer ${proxy_ip} parent ${proxy_port} 0 proxy-only round-robin login=${user}:${password}
    cache_peer_access ${proxy_ip} allow all
    ...
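Each working proxy therefore contributes two lines to `peers.conf`. As a rough sketch of how one `ip:port` entry could be turned into that pair, assuming a hypothetical helper named `peer_entry` and placeholder credentials:

```shell
#!/bin/bash
# Hypothetical helper: print the two peers.conf lines for one "ip:port" entry.
peer_entry() {
    local proxy="$1"
    local creds="$2"
    local ip="${proxy%%:*}"     # text before the first colon
    local port="${proxy##*:}"   # text after the last colon
    echo "cache_peer ${ip} parent ${port} 0 proxy-only round-robin login=${creds}"
    echo "cache_peer_access ${ip} allow all"
}

# Sample values only; 192.0.2.10 is a documentation-range address.
peer_entry "192.0.2.10:3128" "user:pass"
```

Parameter expansion is used here instead of `awk` purely to avoid spawning extra processes per proxy; either approach produces the same two lines.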
The script will go through the list of proxies and run about 200 `curl` instances at a time. Depending on your system’s resources, you may adjust this value (`maxthreads`). I am expecting the proxy to provide output within 5 seconds (`timeout_01`), but you may want to adjust this as well, depending on how picky you can afford to be.
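The batching idea can be tried in isolation before pointing it at real proxies. The sketch below is only an illustration: `check_one` is a stand-in for the real `curl`-based check, the item list is made up, and `maxthreads` is set to 3 so the batches are easy to follow.

```shell
#!/bin/bash
# Sketch of batch-limited parallelism: start jobs in the background and
# pause with `wait` whenever a full batch is in flight.
maxthreads=3                       # small value for the demo
results="$(mktemp)"

check_one() {                      # stand-in for the real proxy check
    echo "checked $1" >> "${results}"
}

i=0
for item in a b c d e f g; do
    check_one "${item}" &
    i=$((i + 1))
    if [ "${i}" -ge "${maxthreads}" ]; then
        wait                       # block until every job in this batch exits
        i=0
    fi
done
wait                               # collect the final, partial batch

sort "${results}"
count="$(grep -c '^checked ' "${results}")"
rm -f "${results}"
```

Using `wait` instead of a fixed `sleep` means a batch of fast proxies finishes early, while the per-request timeout still bounds how long a slow batch can take.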
The very last step would be to reload `squid`, making it re-read the configuration. Should something go wrong, the original `peers.conf` file will be preserved with the current date extension. Give this script a shot and, should everything work out well, add it to `cron` to run at least daily.
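One case worth guarding against is a run in which no proxy passes, which would leave Squid without the file it includes. The following is a sketch only, using a temporary directory and made-up contents in place of the real `/etc/squid` paths, of how the dated backup could be copied back in that situation:

```shell
#!/bin/bash
# Sketch: keep a dated backup of peers.conf and fall back to it if the
# freshly generated file turns out to be empty. Paths are placeholders.
workdir="$(mktemp -d)"
squidpeers="${workdir}/peers.conf"
backup="${squidpeers}_$(date +'%Y-%m-%d')"

# Pretend this is the peer list from the previous run.
echo "cache_peer 192.0.2.10 parent 3128 0 proxy-only round-robin" > "${squidpeers}"

# Step 1: move the current file aside before testing proxies.
mv "${squidpeers}" "${backup}"

# Step 2: the proxy checks would append to the new file here.
# In this demo none pass, so the file is created empty.
: > "${squidpeers}"

# Step 3: if nothing passed, restore the previous list.
if [ ! -s "${squidpeers}" ]; then
    cp "${backup}" "${squidpeers}"
    echo "no working proxies found, restored previous peers.conf"
fi

cat "${squidpeers}"
```

Running `squid -k parse` before the reload is another cheap safeguard, since it reports configuration errors without touching the running service. A daily cron entry could then be as simple as `0 3 * * * /path/to/script`, where both the schedule and the path are placeholders.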
And here is the script (also available in my GitHub repo):
    #!/bin/bash
    #
    #                                      |
    #                                  ___/"\___
    #                          __________/ o \__________
    #                            (I) (G) \___/ (O) (R)
    #                                   Igor Os
    #                           igor@comradegeneral.com
    #                                 2019-12-14
    # ----------------------------------------------------------------------------
    # Script description
    # Documentation URL: https://
    # Validate a list of HTTPS proxies and generate Squid peer configuration
    #
    # CHANGE CONTROL
    # ----------------------------------------------------------------------------
    # 2019-12-14  igor  wrote this script
    # ----------------------------------------------------------------------------

    configure() {
      basedir="/var/adm/bin/squid"
      infile="${basedir}/proxyips.txt"      # one ip:port entry per line
      squiddir="/etc/squid"
      squidpeers="${squiddir}/peers.conf"
      creds="user:pass"
      proto="https"
      testurl="${proto}://ipecho.net/plain"
      maxthreads=200                        # proxies tested per batch
      timeout_01=5                          # seconds allowed per proxy
      # realip="your actual external IP"
      realip="$(curl -q -s0 -k "${testurl}")"
      if [ -z "${realip}" ]
      then
        echo "Unable to determine your actual external IP. Exiting..."
        exit 1
      fi
    }

    # Keep the previous peer list under a dated name
    backup_do() {
      /bin/mv "${squidpeers}" "${squidpeers}_$(date +'%Y-%m-%d')" 2>/dev/null
    }

    # Test the proxy in ${line}; if it passes, append it to peers.conf
    proxy_check() {
      my_ip="$(/usr/bin/timeout "${timeout_01}" /usr/bin/curl --silent --proxy "${line}" --proxy-user "${creds}" "${testurl}" | grep -oE -m1 "([0-9]{1,3}\.){3}([0-9]{1,3})")"
      if [ -n "${my_ip}" ] && [ "$(echo "${my_ip}" | grep -Fc "${realip}")" -eq 0 ] && [ "$(echo "${my_ip}" | grep -cE "([0-9]{1,3}\.){3}([0-9]{1,3})")" -eq 1 ]
      then
        echo "${line}"
        ip=$(echo "${line}" | awk -F: '{print $1}')
        port=$(echo "${line}" | awk -F: '{print $2}')
        echo "cache_peer ${ip} parent ${port} 0 proxy-only round-robin login=${creds}" >> "${squidpeers}"
        echo "cache_peer_access ${ip} allow all" >> "${squidpeers}"
      fi
    }
    export -f proxy_check

    # RUNTIME
    configure
    backup_do
    i=0
    while read -r line
    do
      proxy_check &
      (( i = i + 1 ))
      if [ "${i}" -ge "${maxthreads}" ]
      then
        # A full batch is running: wait for it before starting the next one
        wait
        i=0
      fi
    done < <(sort -Vu "${infile}" | shuf)
    # Wait for the last, partial batch
    wait
    /sbin/service squid reload