Originally published May 11, 2021 @ 5:51 pm
Linux command-line tools provide access to a wealth of dictionaries, encyclopedias, thesauri, directories, and other reference sources. Learning to use these tools is a worthwhile endeavor even if solving crossword puzzles is not your favorite pastime.
Let’s start with installing the various tools we’ll be discussing in this article:
apt install sdcv stardict dict wamerican ddgr dictd dict-gcide wordnet
cd ~
mkdir -p /usr/share/stardict/dic/
for i in stardict-dictd-web1913-2.4.2.tar.bz2 \
stardict-dictd_www.dict.org_gcide-2.4.2.tar.bz2 \
stardict-dictd_www.dict.org_wn-2.4.2.tar.bz2 \
stardict-dictd_www.dict.org_gazetteer-2.4.2.tar.bz2 \
stardict-dictd-gazetteer2k-places-2.4.2.tar.bz2 \
stardict-BritannicaConcise-2.4.2.tar.bz2 \
stardict-merrianwebster-2.4.2.tar.bz2
do
  wget "http://download.huzheng.org/dict.org/${i}"
  tar -xvjf "${i}" -C /usr/share/stardict/dic
done
Word Lists
Many word lists in various languages are available for Linux. An example would be /usr/share/dict/american-english. Let’s say you’re looking for all the words that start and end with the letter “t”:
egrep '^[Tt].*[Tt]$' /usr/share/dict/american-english
# Alternatively, you can exclude any words containing
# anything but letters (i.e. apostrophes or hyphens)
egrep -i '^t[[:alpha:]]{1,}t$' /usr/share/dict/american-english
I’m sure you can see how this may be useful for solving crossword puzzles. If you know just a couple of letters, you can substantially narrow down your list of options.
In the example below, we’re looking for a 7-letter word where the third and fourth letters are “o” and “f”, respectively:
egrep '^..of...$' /usr/share/dict/american-english
In this case, you end up with only 19 words. The puzzle clue is “Reindeer landing area”. It doesn’t take much to figure out that the word you’re seeking is “rooftop”.
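If you would rather type the pattern the way it looks on the puzzle grid, a small wrapper can build the regular expression for you. This is only a sketch: the function name `match_pattern` and the use of `?` for an unknown letter are my own conventions for illustration, and the default word list path assumes the `wamerican` package is installed.

```shell
# Hypothetical helper: list words matching a crossword pattern in which
# "?" stands for one unknown letter, e.g. "??of???".
match_pattern() (
    pattern="$1"
    wordlist="${2:-/usr/share/dict/american-english}"
    # Turn every "?" into "[[:alpha:]]" so that words with apostrophes
    # or hyphens are excluded, then anchor the regex at both ends.
    regex="^$(printf '%s' "$pattern" | sed 's/?/[[:alpha:]]/g')\$"
    grep -Ei -- "$regex" "$wordlist"
)

# Usage: match_pattern '??of???'
```

The function body runs in a subshell (note the parentheses), so its variables do not leak into your interactive session.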
WordNet
WordNet is a lexical database of semantic relations between words, and wn is its command-line interface. Here we’re looking up the list of synonyms/hypernyms for the word “rooftop” ordered by estimated use frequency:
wn rooftop -hypen
DICT
DICT is a dictionary network protocol, and dict is its command-line interface, allowing you to query multiple dictionary servers. You can specify the servers, although the default selection is usually entirely sufficient. You can even set up your own server.
Here’s an example of looking up the definition of the word “rooftop” using a specific server (you can find the list of DICT servers here) offering one of the most comprehensive dictionary collections:
dict -h dict.uk.dyslexicfish.net rooftop
StarDict
StarDict is a cross-platform dictionary application, and sdcv is its command-line interface. Multiple dictionaries in many languages are available for StarDict, but they need to be downloaded and installed separately (as shown above). It is also possible to convert DICT dictionaries for use with StarDict.
Here we’re looking up the word “rooftop” using command-line flags for non-interactive search with exact word matching (the syntax you would use in a script):
sdcv -n -e rooftop
ddgr
The ddgr tool is the command-line interface for the DuckDuckGo search engine. Crossword puzzles may contain words – first names, movie titles, etc. – that you may not be able to find in a dictionary, or at least not easily. In such a case, a quick online search may be the shortest path to victory.
Here we are looking up the puzzle clue “reindeer landing area”. The command below will reformat the output of ddgr to show one word per line. We will extract unique lowercase words and then further narrow the list down to 7-letter words:
ddgr --np --num 25 "reindeer landing area" | \
xargs -n1 | tr '[:upper:]' '[:lower:]' | sort -u | \
egrep '^[[:alpha:]]{7}$'
With only 9 choices, you’ll quickly realize that the word “rooftop” is the best match for our puzzle clue.
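The word-splitting part of that pipeline can be packaged as a reusable filter. The sketch below is an assumption-laden variation, not the exact command above: the name `words_of_length` is made up, and it splits on anything that is not a letter instead of relying on `xargs -n1`, so a word followed by a comma or period still counts.

```shell
# Hypothetical filter: read free-form text on stdin and print the unique
# lowercase words that are exactly N letters long.
words_of_length() (
    n="$1"
    tr -cs '[:alpha:]' '\n' |          # every run of non-letters becomes a newline
        tr '[:upper:]' '[:lower:]' |   # fold to lowercase
        sort -u |                      # keep each word once
        grep -E "^[[:alpha:]]{$n}\$"   # keep only words of the requested length
)

# Usage: ddgr --np --num 25 "reindeer landing area" | words_of_length 7
```

Splitting on non-letters also sidesteps `xargs` complaining about unmatched quotes in the search results.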
solver
The solver is a bash script I quickly threw together to illustrate practical use scenarios for the aforementioned CLI tools. When you run the script, it will ask you for the puzzle clue and the word pattern.
The script will use the word lists, WordNet, StarDict, and DICT to try to match the words in the clue (and their synonyms discovered using wn) to the word pattern you specified.
The script will start the process by extracting every word matching the pattern from the word file. There may be thousands of them. Then the script will use sdcv to narrow down the search. It will then use dict to refine the results.
The most likely matches and their dictionary definitions will then be presented to you for further analysis. This may take a while, your CPU will be working overtime, and often this won’t work because the puzzle clue was too broad, used a slang expression, referenced some cultural trivia, or relied on tribal knowledge.
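To make the filter-then-refine idea more concrete, here is a much-simplified sketch. It is not the solver script itself: the function names `define_word` and `rank_candidates` are hypothetical, it uses a single lookup step where the script uses several, and it simply keeps every candidate whose definition text mentions one of the clue words.

```shell
# Hypothetical lookup step. This assumes the wordnet package is installed;
# swapping in `sdcv -n -e "$1"` or `dict "$1"` is an equally valid choice.
define_word() { wn "$1" -synsn 2>/dev/null; }

# Print every word-list entry that fits the pattern ("?" = unknown letter)
# and whose definition mentions at least one word from the clue.
rank_candidates() (
    clue="$1"
    pattern="$2"
    wordlist="${3:-/usr/share/dict/american-english}"
    set -f   # no globbing while the clue is split into words below
    regex="^$(printf '%s' "$pattern" | sed 's/?/./g')\$"
    grep -Ei -- "$regex" "$wordlist" | while IFS= read -r word; do
        definition=$(define_word "$word")
        for clue_word in $clue; do
            if printf '%s\n' "$definition" | grep -qiw -- "$clue_word"; then
                printf '%s\n' "$word"
                break
            fi
        done
    done
)

# Usage: rank_candidates 'companion' 'es????'
```

One lookup per candidate is the reason this kind of search is slow: a pattern that matches two thousand words means two thousand dictionary queries.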
There will be some examples later on, but right now let’s get to solving an actual crossword puzzle.
Solving the Puzzle
I never liked crossword puzzles; consequently, I am terrible at solving them. I find most modern computer-generated crossword puzzles utterly boring. They are less a test of one’s vocabulary and more of an exercise in interpreting the puzzle creator’s far-fetched and contrived clues. But let’s give this a shot…
I am looking at the May 9, 2021 crossword puzzle from the Washington Post. For whatever reason, 13 down “Reindeer landing area” caught my eye. I used ddgr as shown above to find the possible solution: rooftop. I have to admit that halfway through typing the command, I already knew the answer. But the point is that the ddgr results confirmed my guess.
In many crossword puzzles, the low-hanging fruit is the names of people, places, works of art, etc. However, you are unlikely to find this information using dictionary tools. The next piece of the puzzle on my list was 14 down “Little House series author Laura __ Wilder”. A quick search with ddgr was all it took:
ddgr --np --num 25 "little house laura" | \
grep -oP "(?<=Laura )[[:alpha:]]{1,}(?= Wilder)" | \
sort -u
#> Ingalls
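The same fill-in-the-blank trick works for any clue where you know the words on either side of the gap. The sketch below generalizes it under a few assumptions: the function name `fill_blank` is made up, it avoids `grep -P` in favor of plain extended regular expressions, and it tallies the hits so the most frequent answer comes first.

```shell
# Hypothetical helper: read text on stdin and tally the single words that
# appear between the two known parts, e.g. "Laura" and "Wilder".
fill_blank() (
    before="$1"
    after="$2"
    grep -oE -- "$before [[:alpha:]]+ $after" |
        sed -e "s/^$before //" -e "s/ $after\$//" |   # keep only the middle word
        sort | uniq -c | sort -rn                     # most frequent first
)

# Usage: ddgr --np --num 25 "little house laura" | fill_blank Laura Wilder
```

Counting the hits instead of de-duplicating them helps when the search results contain a stray misspelling or an unrelated name.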
Now, the next word I picked was 24 across “Night vis-à-vis Nacht, e.g.” I had no idea what this was other than a 7-character word with “o” as the second letter. There are over two thousand possible matches in the dictionary file. So I decided to use my solver script.
I didn’t really know what to put in as the “clue”, so I improvised: “foreign language equivalent word”. This was my best free-form description of the original puzzle clue. After listening to my laptop spin all the cooling fans at top speed for a few minutes, I finally got my list of choices:
[root:~] # solver
Enter the clue: foreign language equivalent word
Enter the pattern: ?og????
Step 1 of 2: Matching clues to pattern .og.... using sdcv. This may take a while...
Step 2 of 2: Matching clues to pattern .og.... using dict. This may take a while...
Cognate: blood relation, blood relative, cognate, sib, relative, relation, cognate, cognate word, word, connate, cognate, related (vs. unrelated), related to, cognate, related (vs. unrelated), related to, akin(predicate), blood-related, cognate, consanguine, consanguineous, consanguineal, kin(predicate), related (vs. unrelated)
Dogmata: dogma, tenet, religious doctrine, church doctrine, gospel, creed, dogma, doctrine, philosophy, philosophical system, school of thought, ism
Logbook: logbook, record, record book, book
A quick look through the presented options led to the obvious choice: cognate.
The 12 down “Highlight” was to be a 6-letter word with “c” as its third letter. There are 462 possible matches in the dictionary, so I resorted to using my solver script again. This time around it came back with the answer much quicker because of fewer options and a single-word clue:
Enter the clue: highlight
Enter the pattern: ??c???
Step 1 of 2: Matching clues to pattern ..c... using sdcv. This may take a while...
Step 2 of 2: Matching clues to pattern ..c... using dict. This may take a while...
Accent: accent, speech pattern, pronunciation, emphasis, accent, importance, grandness, dialect, idiom, accent, non-standard speech, stress, emphasis, accent, prosody, inflection, accent, accent mark, diacritical mark, diacritic, stress, emphasize, emphasise, punctuate, accent, accentuate, express, show, evince, stress, accent, accentuate, pronounce, articulate, enounce, sound out, enunciate, say
Anchor: anchor, ground tackle, hook, claw, anchor, mainstay, keystone, backbone, linchpin, lynchpin, support, anchor, anchorman, anchorperson, television reporter, television newscaster, TV reporter, TV newsman, anchor, ground, fasten, fix, secure, anchor, cast anchor, drop anchor, fasten, fix, secure
...
So now I was getting somewhere. 47 across was “psst” – that one I did not need a computer to figure out. The 30 across “1983 Streisand title role” was an easy find with ddgr:
ddgr -C --np --num 25 "1983 Streisand title role" | dos2unix | awk 'BEGIN{RS=" "} 1' | sort -u | egrep -o '\b[A-Z][a-z]ntl\b' | uniq -c
      1 Jentl
      5 Yentl
I got one hit on “Jentl” and five on “Yentl”, so I went with the latter.
It was pretty much the same approach for 27 down “Survivorman creator Stroud” – a three-letter name:
ddgr -C --np --num 25 "survivorman creator Stroud" | egrep -o '\b[A-Z][a-z]{1,} Stroud\b' | sort | uniq -c
      3 Creator Stroud
      1 Jamison Stroud
     10 Les Stroud
      2 Survivorman Stroud
At this point 26 across “Using a lifestyle magazine to cool off?” started to look like easy pickings. There can’t be that many 11-letter words containing “lefa”. And there weren’t: the only match in the dictionary was “malefactors”, but it made no sense.
So this likely was one of those dreaded pop culture references. And this is exactly why I dislike modern crossword puzzles. So not being a teenage girl, I had to skip this one. (The answer to 26 across was “ellefanning” – just as stupid as I thought it would be).
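For the record, a search for words of a given length that contain a known run of letters can look like the sketch below. The function name `words_containing` and its argument order are my own invention for illustration; the default word list path assumes the `wamerican` package.

```shell
# Hypothetical helper: list words of exactly LENGTH letters that contain
# FRAGMENT anywhere inside them.
words_containing() (
    fragment="$1"
    length="$2"
    wordlist="${3:-/usr/share/dict/american-english}"
    # First keep the letters-only words of the right length,
    # then keep those containing the fragment.
    grep -Ei -- "^[[:alpha:]]{$length}\$" "$wordlist" | grep -i -- "$fragment"
)

# Usage: words_containing lefa 11
```

An empty or nonsensical result, as in this case, is a good hint that the answer is a name or a phrase rather than a dictionary word.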
I moved to 39 across “Furry wrap”. The only word in the dictionary matching the ‘stol?’ pattern was “stole”, but I had no idea what it meant other than the past tense of “steal”. It had to be a noun, so a quick search using wn taught me that a stole is some sort of scarf. It probably qualifies as a “furry wrap”.
[root:~] # grep '^stol.$' /usr/share/dict/american-english
stole
[root:~] # wn stole -synsn
Synonyms/Hypernyms (Ordered by Estimated Frequency) of noun stole
1 sense of stole
Sense 1
stole
       => scarf
The 21 across “Like Hershey’s Kisses” proved rather tedious. I had to go through a couple of dozen matched dictionary words until I finally saw the most probable solution: conical.
egrep '^con[[:alpha:]]{4}$' /usr/share/dict/american-english
# or using wn (-grepa searches adjectives; the trailing $ limits matches to 7 letters)
wn con -grepa | grep '^con....$'
The 12 across “Resulting (from)” I let my solver script deal with, as I was still recovering from my “conical” search:
[root:~] # solver
Enter the clue: outcome result
Enter the pattern: ari????
Step 1 of 2: Matching clues to pattern ari.... using sdcv. This may take a while...
Step 2 of 2: Matching clues to pattern ari.... using dict. This may take a while...
Arising: originate, arise, rise, develop, uprise, spring up, grow, become, arise, come up, bob up, become, arise, rise, uprise, get up, stand up, change posture, arise, come up, happen, hap, go on, pass off, occur, pass, fall out, come about, take place, rise, lift, arise, move up, go up, come up, uprise, travel, go, move, locomote, rebel, arise, rise, rise up, protest, resist, dissent
16 down “Defeatist’s statement” was easy with just one letter left: “icant”. And this is another example of why I don’t like modern crossword puzzles: they can’t spell words.
The 18 down “Two-time Oscar winner Jackson” matching “glen??” was obviously Glenda Jackson.
The 31 across “Uproar” was easy enough to find among the synonyms for “uproar” (“to-do” minus the hyphen):
wn uproar -synsn
The last word I still had energy left for was 40 down “Social companions”. I let my script deal with it, but there is an important nuance here: the clue suggests a plural form of a word, while the dictionaries use a singular form. And so I dropped the last character in the pattern:
[root:~] # solver
Enter the clue: companion
Enter the pattern: es????
Step 1 of 2: Matching clues to pattern es.... using sdcv. This may take a while...
Step 2 of 2: Matching clues to pattern es.... using dict. This may take a while...
Escort: bodyguard, escort, defender, guardian, protector, shielder, escort, accompaniment, protection, escort, attendant, attender, tender, date, escort, companion, comrade, fellow, familiar, associate, escort, accompany, see, escort, accompany
Essays: essay, writing, written material, piece of writing, essay, attempt, effort, endeavor, endeavour, try, try, seek, attempt, essay, assay, act, move, test, prove, try, try out, examine, essay, evaluate, pass judgment, judge
Why you would want to subject yourself to this kind of torture is beyond me, but now you know how to. Enjoy?