Originally published December 26, 2017 @ 10:44 pm
In this scenario, a user emailed some other user something that probably should not have been emailed. You don’t know who the users are or exactly what they sent. What you have is a bunch of PST files and a list of keywords. And this shell script.
Here’s how the script works. You dump the PST files into $indir. I suggest using a local filesystem rather than a network-mounted one, for both performance and security reasons. Considering the potentially sensitive nature of this data, you may want to set up an encrypted filesystem that does not auto-mount on startup.
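If you are not sure what kind of filesystem sits under $indir, a quick check along these lines may help. This is only a sketch: the helper names is_network_fs and fs_type_of are made up for illustration, the list of network filesystem types is my assumption and is not exhaustive, and the stat options are the GNU coreutils ones.

```shell
#!/bin/bash
# Hypothetical helper: decide whether a filesystem type name looks
# network-backed. The list of names is an assumption, not exhaustive.
is_network_fs() {
    case "$1" in
        nfs*|cifs|smb*|afs|ceph) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print the type of the filesystem holding a path.
# Relies on GNU stat (-f = filesystem info, %T = human-readable type).
fs_type_of() {
    stat -f -c %T -- "$1"
}

# Same default location as the script's $indir
indir="${1:-/downloads/input}"

if [ -d "${indir}" ]; then
    fstype="$(fs_type_of "${indir}")"
    if is_network_fs "${fstype}"; then
        echo "WARNING: ${indir} is on ${fstype}; consider a local disk" >&2
    else
        echo "${indir} is on ${fstype}"
    fi
fi
```

This only warns; it does not stop you from pointing the script at a network share.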
You then create the $keyword_list file containing one keyword per line and encrypt it with gpg using a passphrase. The script will prompt you for that passphrase when you launch it. Depending on the volume of PSTs, the conversion process may take some time.
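Before encrypting, it may be worth tidying the keyword list. The script joins the keywords with "|" to build one search pattern, so a blank line would end up as an empty alternative that matches every email. The sketch below is my own addition, not part of the script, and clean_keywords is a hypothetical name. It assumes GNU sed for the \r escape.

```shell
#!/bin/bash
# Hypothetical helper: normalize a keyword list read from stdin.
#  - strips Windows carriage returns and surrounding whitespace
#  - drops blank lines (a blank keyword would match every email)
#  - removes duplicates
clean_keywords() {
    sed -e 's/\r$//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
        | grep -v '^$' \
        | sort -u
}

# Demo on inline sample data
printf 'keyword_02\r\n\n  keyword_01  \nkeyword_02\n' | clean_keywords
# prints:
# keyword_01
# keyword_02
```

You would run your draft list through this, write the result to the $keyword_list path, and then encrypt it with the gpg command given in the script's comments.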
As the script digs through the emails, you may start seeing something along these lines:
PST: /downloads/input/username@domain.com
---------------------------------------
FILE: /downloads/input/username@domain.com/Inbox/104.eml
KEYS: keyword_01,keyword_07
DATE: Mon, 9 Jun 2010 20 08 16 -0400
FROM: username2@domain.com
TO: username@domain.com
SUBJ: Email subject

FILE: /downloads/input/username@domain.com/Inbox/269.eml
KEYS: keyword_03
DATE: Mon, 7 Jul 2010 12 58 29 -0400
FROM: username3@domain.com
TO: username@domain.com
SUBJ: Another email subject
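If you save the report to a file, pulling out just the matching .eml paths makes it easier to open them one by one. A minimal sketch, assuming the report has one field per line with each path on a line starting with "FILE:", as the script's heredoc emits; report_files is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print the path from every "FILE:" line of a report
# read from stdin, ignoring all other lines.
report_files() {
    sed -n 's/^FILE:[[:space:]]*//p'
}

# Demo with two report entries in the script's output format
report_files << 'EOF'
PST: /downloads/input/username@domain.com
---------------------------------------
FILE: /downloads/input/username@domain.com/Inbox/104.eml
KEYS: keyword_01,keyword_07

FILE: /downloads/input/username@domain.com/Inbox/269.eml
KEYS: keyword_03
EOF
# prints:
# /downloads/input/username@domain.com/Inbox/104.eml
# /downloads/input/username@domain.com/Inbox/269.eml
```

The resulting list can be fed to a pager or copied to a review folder.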
You can then view the listed email files for more information. I am sure there is a more civilized tool for this task, but all I had was bash. The script is below and you can also get it here.
#!/bin/bash
#                                    |
#                                ___/"\___
#                        __________/ o \__________
#                          (I) (G) \___/ (O) (R)
#                               2017-12-15
# ----------------------------------------------------------------------------
# Convert *.pst mailbox files to text and scan for keywords
# ----------------------------------------------------------------------------
#
readpass() {
    # Read your GPG password
    echo -n "Password: "
    read -s p
    if [ -z "${p}" ]; then
        exit 1
    fi
}

configure() {
    # Install readpst, if not there already
    if [ ! -x /usr/bin/readpst ]; then
        yum -y install libpst.x86_64 || exit 1
    fi

    # Install the Silver Searcher, if not there already
    if [ ! -x /usr/bin/ag ]; then
        yum -y install the_silver_searcher || exit 1
    fi

    # Install GPG, if not there already
    if [ ! -x /usr/bin/gpg ]; then
        yum -y install gpg || exit 1
    fi

    # Put your *.pst files in here
    indir="/downloads/input"

    # Put your keywords in here, one per line
    # and encrypt it like so:
    # gpg --batch --symmetric --passphrase "${p}" "${keyword_list}" 2>/dev/null
    # chmod 600 "${keyword_list}.gpg"
    # /bin/rm -f "${keyword_list}"
    keyword_list="/tmp/keywords.txt"
    if [ ! -r "${keyword_list}.gpg" ]; then
        exit 1
    fi

    # Just in case you forgot
    chmod 400 "${keyword_list}.gpg"
}

extractpst() {
    # Find and convert *.pst files to text
    find "${indir}" -maxdepth 1 -mindepth 1 -type f -name "*\.pst" | while read pst; do
        cd "${indir}" && readpst -j $(grep -c processor /proc/cpuinfo) -b -e "${pst}"
    done
}

extractkeywords() {
    # Read keyword list into an array
    IFS=$'\n'; a=($(gpg --batch --decrypt --passphrase "${p}" "${keyword_list}.gpg" 2>/dev/null)); unset IFS

    # Join the keywords into a single "kw1|kw2|..." search pattern
    s=$(for ((i = 0; i < ${#a[@]}; i++)) ; do echo -n "${a[$i]}|" ; done | sed 's/|$//g')
}

findkeywords() {
    # Message-IDs already reported for this PST folder
    c=()
    # Files under the PST folder that contain at least one keyword
    IFS=$'\n'; b=($(ag -c "${s}" "${pst_folder}" | awk -F: '{print $1}' | sort -u)); unset IFS
    echo "PST: ${pst_folder}"
    echo "---------------------------------------"
    for ((i = 0; i < ${#b[@]}; i++)) ; do echo "${b[$i]}" ; done | while read line; do
        message_id="$(grep -oP -m1 "(?<=Message-ID: <).*(?=>$)" "${line}")"
        # Report each Message-ID only once
        if [ "$(for ((i = 0; i < ${#c[@]}; i++)) ; do echo "${c[$i]}"; done | grep -c "${message_id}")" -eq 0 ]; then
cat << EOF
FILE: $(echo "${line}")
KEYS: $(grep -oP "${s}" "${line}" | sort -u | tr '\n' ', ' | sed -r 's/,$//g')
DATE: $(grep -P -m1 "^Date:" "${line}" | awk -F: '{$1=""; print $0}' | sed 's/^ //g')
FROM: $(grep -P -m1 "^From:" "${line}" | grep -Po '(?i)\b[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,6}\b' | sort -u | tr '\n' ', ' | sed -r 's/,$//g')
TO: $(grep -P -m1 "^To:" "${line}" | grep -Po '(?i)\b[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,6}\b' | sort -u | tr '\n' ', ' | sed -r 's/,$//g')
SUBJ: $(grep -P -m1 "^Subject:" "${line}" | awk -F: '{$1=""; print $0}' | sed 's/^ //g')
EOF
            echo
            c+=("${message_id}")
        fi
    done
}

find_do() {
    # Search and parse
    SAVEIFS=$IFS
    IFS=$(echo -en "\n\b")
    for pst_folder in $(find "${indir}" -maxdepth 1 -mindepth 1 -type d); do
        findkeywords
    done
    IFS=$SAVEIFS
}

# RUNTIME
readpass
configure
extractpst
extractkeywords
find_do
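One thing to keep in mind: the script passes the joined keywords to ag and grep -P as a regular expression, so a keyword such as "acme.com" would also match "acme-com". If your keywords are meant to be literal strings (that is my assumption here, not something the script requires), a variation like the one below escapes the regex metacharacters before joining. The function name keywords_to_pattern is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: turn one-keyword-per-line input into a single
# "kw1|kw2|..." pattern, escaping regex metacharacters so that each
# keyword is matched literally.
keywords_to_pattern() {
    sed 's/[][\.|$(){}?+*^]/\\&/g' | paste -sd'|' -
}

# Demo on inline sample data
printf 'keyword_01\nkeyword.07\n' | keywords_to_pattern
# prints: keyword_01|keyword\.07
```

In the script this would take the place of the loop that builds $s, reading from the decrypted gpg output instead of printf.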