Originally published June 26, 2020 @ 12:24 am
I frequently use MySQL to store data collected or processed by various shell scripts. The database comes in handy when massaging data with good old awk
and sed
gets too cumbersome.
This is really very basic stuff: just loading some data into MySQL and retrieving it. The usual database configuration section I include in my scripts looks something like this:
db_host=hostname.local db_port=3306 db_user="mysqluser" db_pass="mypass" db_name="schema_name" tbl_name="table_name" MYSQL="/usr/bin/mysql --batch --skip-column-names --max_allowed_packet=100M -h${db_host} -u${db_user} -p${db_pass} ${db_name} -e"
Here’s a way to retrieve some data from the table:
$MYSQL "select count(*) from $tbl_name"
Now, let’s say your database table has four columns:
id | timestamp | hostname | message |
(primary index, auto-increment) | datetime | varchar | long text |
Let’s also imagine you have a comma-separated file $datafile
that looks like this:
2020-06-25 23:23:53,host01,Linux ncc1711 2.6.32-754.30.2.el6.x86_64 #1 SMP
You would like to load this file into the database. Here you go:
${MYSQL} "LOAD DATA LOCAL INFILE '${datafile}' IGNORE INTO TABLE ${tbl_name} \ FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' \ (timestamp, hostname, message);"
Here’s an example of comparing records from yesterday with those older than two weeks to find missing lines:
diff --new-line-format="" --unchanged-line-format="" \ <($MYSQL "SELECT DISTINCT message FROM ${tbl_name} WHERE hostname LIKE 'host01' AND timestamp BETWEEN CURDATE() - INTERVAL 1 DAY AND CURDATE() - INTERVAL 1 SECOND ORDER BY message ASC") \ <($MYSQL "SELECT DISTINCT message FROM ${tbl_name} WHERE hostname LIKE 'host01' AND timestamp < '$(date -d'2 weeks ago' +'%Y-%m-%d %H:%M:%S')' ORDER BY message ASC")
Here’s another example where we extract dhcpd
messages from /var/log/messages
, convert it to CSV format, and load it into the database table:
# Source file format: Jun 21 04:20:42 ncc1711 dhcpd: DHCPREQUEST for 192.168.122.159 from 6c:72:20:c5:0b:83 (One) via eth1 Jun 21 04:20:42 ncc1711 dhcpd: DHCPACK on 192.168.122.159 to 6c:72:20:c5:0b:83 (One) via eth1 Jun 21 04:20:43 ncc1711 dhcpd: DHCPREQUEST for 192.168.122.146 from e0:37:bf:8e:a2:f1 (audiocast) via eth1 Jun 21 04:20:43 ncc1711 dhcpd: DHCPACK on 192.168.122.146 to e0:37:bf:8e:a2:f1 (audiocast) via eth1 # Script: tmpfile=$(mktemp) grep dhcpd /var/log/messages | while read line; do timestamp="$(date -d"$(echo $line | awk '{print $1,$2,$3}')" +'%F %T')" host_name="$(echo $line | awk '{print $4}')" message="$(echo $line | \ awk '{ s = ""; for (i = 5; i <= NF; i++) s = s $i " "; print s }' | \ sed 's/,//g')" echo "${timestamp},${host_name},${message}" >> $tmpfile done ${MYSQL} "LOAD DATA LOCAL INFILE '${tmpfile}' IGNORE INTO TABLE ${tbl_name} \ FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' \ (timestamp, hostname, message);" /bin/rm $tmpfile
This is pretty much it. The rest is just your SQL foo.
Experienced Unix/Linux System Administrator with 20-year background in Systems Analysis, Problem Resolution and Engineering Application Support in a large distributed Unix and Windows server environment. Strong problem determination skills. Good knowledge of networking, remote diagnostic techniques, firewalls and network security. Extensive experience with engineering application and database servers, high-availability systems, high-performance computing clusters, and process automation.