Originally published October 15, 2020 @ 10:10 pm
When “once in a while” answer is insufficient and you need to know exactly how often a particular event occurs on the system, there are a few easy things you can try from command line.
In the examples below, we’re looking at the vsftpd.log
to determine the number of successful logins on the hourly, daily, and monthly basis. First, hourly distribution of logins:
[root@vsftp01~]# p="\] OK LOGIN" ; log="/var/log/vsftpd.log" ; zgrep -E "${p}" "${log}"* | tr -s ' ' | cut -d: -f2- | cut -d' ' -f2-4 | cut -d: -f1 | sort -k1M -k2n -k3n -u | while read line ; do c=$(zgrep -E "${p}" "${log}"* | tr -s ' ' | cut -d: -f2- | grep -c "${line}") ; d=$(date --date="${line}" +'%Y-%m-%d %H %p') ; echo -e "${d}:\t${c}" ; done 2019-03-22 03 AM: 5 2019-03-22 04 AM: 7 2019-03-22 05 AM: 9 2019-03-22 06 AM: 12 2019-03-22 07 AM: 7 2019-03-22 08 AM: 19 2019-03-22 09 AM: 20 2019-03-22 10 AM: 21 2019-03-22 11 AM: 19 2019-03-22 12 PM: 19 2019-03-22 13 PM: 19 2019-03-22 14 PM: 19 2019-03-22 15 PM: 19
Now daily:
[root@vsftp01 ~]# p="\] OK LOGIN" ; log="/var/log/vsftpd.log" ; zgrep -E "${p}" "${log}"* | tr -s ' ' | cut -d: -f2- | cut -d' ' -f2-3 | sort -k1M -k2n -u | while read line ; do c=$(zgrep -E "${p}" "${log}"* | tr -s ' ' | cut -d: -f2- | grep -c "${line}") ; d=$(date --date="${line}" +'%Y-%m-%d') ; echo -e "${d}:\t${c}" ; done 2019-03-22: 304 2019-03-23: 3149 2019-03-24: 9188 2019-03-25: 3209 2019-03-26: 1248 2019-03-27: 1883 2019-03-28: 339 2019-03-29: 338 2019-03-30: 1421
Here’s a fancier version that’s useful for identifying daily patterns. As you can see, I had to resort to Bash shell’s more sophisticated counterpart – the Korn shell – still unsurpassed in performance and functionality since the last version came out in 1993:
[root@vsftp01 ~]# /bin/ksh # p="\] OK LOGIN" ; log="/var/log/vsftpd.log" ; i=0 ; zgrep -E "${p}" "${log}"* | tr -s ' ' | cut -d: -f2- | cut -d' ' -f2-3 | sort -k1M -k2n -u | while read line ; do c=$(zgrep -E "${p}" "${log}"* | tr -s ' ' | cut -d: -f2- | grep -c "${line}") ; d=$(date --date="${line}" +'%Y-%m-%d %a') ; a1[$i]="${d}" ; a2[$i]="${c}" ; (( i = i + 1 )) ; done ; w=80 ; max=$(printf '%s\n' "${a2[@]}" | sort -nr | head -n1) ; x=$(echo "scale=4;${w}/${max}"|bc -l) ; for i in $(seq 0 `echo "scale=0;${#a1[*]}-1"|bc -l`) ; do echo -ne "${a1[$i]}\t${a2[$i]}\t" ; for j in $(seq 0 `echo "scale=0;$(echo ${a2[$i]})*${x}/1"|bc -l`) ; do echo -n "." ; done ; echo "" ; done 2019-03-22 Sun 304 ... 2019-03-23 Mon 3149 ............................ 2019-03-24 Tue 9188 ................................................................................ 2019-03-25 Wed 3209 ............................ 2019-03-26 Thu 1248 ........... 2019-03-27 Fri 1883 ................. 2019-03-28 Sat 339 ... 2019-03-29 Sun 338 ... 2019-03-30 Mon 1421 ............. 2019-03-31 Tue 1131 .......... 2019-04-01 Wed 1211 ........... 2019-04-02 Thu 260 ...
And monthly:
[root@vsftp01 ~]# p="\] OK LOGIN" ; log="/var/log/vsftpd.log" ; zgrep -E "${p}" "${log}"* | tr -s ' ' | cut -d: -f2- | cut -d' ' -f2 | sort -k1M -u | while read line ; do c=$(zgrep -E "${p}" "${log}"* | tr -s ' ' | cut -d: -f2- | grep -c "${line}") ; d=$(date --date="${line} 1" +'%Y-%b') ; echo -e "${d}:\t${c}" ; done 2019-Mar: 22210 2019-Apr: 1420
If you feel like some serious scripting, here’s an example that will show you the daily distribution of vsftpd
successful and failed logins, as well as successful and failed downloads and uploads. Metrics like this can be useful in catching problems early on, before too many users get upset:
[root@vsftp01 ~]# log="/var/log/vsftpd.log" ; k=0 ; array=( "OK LOGIN" "FAIL LOGIN" "OK DOWNLOAD" "FAIL DOWNLOAD" "OK UPLOAD" "FAIL UPLOAD" ) ; for d in {7..0}; do date -d "`date +'%Y-%m-%d'` - $d days" +'%b %-d'; done | while read d ; do if [ ${k} -eq 0 ] ; then echo -ne "PERIOD," ; printf "%s," "${array[@]}" | sed 's/,$//g' ; echo "" ; k=1 ; fi; j=1 ; for i in "${array[@]}" ; do p="\] ${i}"; eval "$(echo c${j})"=$(zgrep -E "${p}" "${log}"* | tr -s ' ' | cut -d: -f2- | grep -c "${d} ") ; (( j = j + 1 )) ; done; echo -ne "`date -d "${d}" +'%Y-%m-%d'`," ; for j in $(seq 1 `echo "${#array[@]}"`) ; do eval echo -ne $(echo $`eval echo "c${j},"`); done | sed 's/,$//g' ; echo "" ; done PERIOD,OK LOGIN,FAIL LOGIN,OK DOWNLOAD,FAIL DOWNLOAD,OK UPLOAD,FAIL UPLOAD 2019-03-26,1248,72,3635,2,512,0 2019-03-27,1883,70,2524,2,410,0 2019-03-28,339,2,46,2,10,0 2019-03-29,338,4,11,3,13,0 2019-03-30,1421,49,950,3,549,0 2019-03-31,1131,71,1684,2,401,0 2019-04-01,0,0,0,0,0,0 2019-04-02,0,0,0,0,0,0
This approach can be relatively easily adopted to most standard system and application logs and used to track distribution over time of most uniquely-identifiable events. As long as you can grep for something predictable and there is a time stamp, you can use this method.
Experienced Unix/Linux System Administrator with 20-year background in Systems Analysis, Problem Resolution and Engineering Application Support in a large distributed Unix and Windows server environment. Strong problem determination skills. Good knowledge of networking, remote diagnostic techniques, firewalls and network security. Extensive experience with engineering application and database servers, high-availability systems, high-performance computing clusters, and process automation.