The Azed Archive WordStats

WordStats is a special feature of the Archive that looks at all the words that make up the clues, and their length and frequency. By breaking down clues into their component words it’s possible to see and compare them in a new way. Azed is of course the sole arbiter of quality in the clue-writing competition, but WordStats is a great source when it comes to quantity.

How to find WordStats in the Archive

WordStats are collected for all clues in the Azed archive from 1972 onwards.

J. R. Tozer’s clues. Image generated by wordle.net

Clue lengths

WordStats shows us that there is a remarkable consistency of clue lengths, even though clues vary in length from 4 to 230 letters. In each of the past 48 competition years, the average ‘normal’ clue has contained between 9 and 10 words and between 40 and 48 letters. The average clue length over the whole archive (more than 15,200 clues of all types) is 9.7 words and 46 letters. There has been no trend to longer or shorter clues over the life of the Azed series.

Clue length is most highly correlated with clue type, for the obvious reason that types such as ‘Right and Left’ and DLM (Definition and Letter Mixtures) demand longer clues.

In normal clues there’s a strong relationship between the number of letters in the word clued and the number of letters in the clue. In clues for words from 4 to 12 letters the average clue length increases by about one letter for each extra letter in the word clued. 4-letter words have an average clue length of 40 letters, rising to an average of 47 letters for 12-letter words (competitions 1 to 2547).

Word length   Words in clue   Letters in clue   No of comps
4             8.4             40                12
5             8.6             42                32
6             9.0             43                40
7             9.2             44                66
8             9.4             45                86
9             9.6             46                64
10            9.7             47                60
11            9.7             46                39
12            9.8             47                63
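
As a rough check on the "about one letter for each extra letter" figure, the sketch below fits an ordinary least-squares line to the nine rows of the table above. It is an unweighted fit to the rounded averages shown here, not to the underlying clues, so it is illustrative only. The names `avg_letters` and `ols_slope` are made up for this example.

```python
# Rounded averages from the table above: word length -> letters in clue.
avg_letters = {4: 40, 5: 42, 6: 43, 7: 44, 8: 45, 9: 46, 10: 47, 11: 46, 12: 47}

def ols_slope(points):
    """Return the least-squares slope of y on x for a dict {x: y}."""
    xs = list(points)
    ys = [points[x] for x in xs]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

slope = ols_slope(avg_letters)
print(f"{slope:.2f} extra clue letters per extra letter in the word")
```

The fitted slope comes out a little under one, which is consistent with the rise from 40 to 47 letters across the eight steps from 4-letter to 12-letter words.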
 

Shortest clues

Azed competitors can be extremely brief when the opportunity arises. Here are the normal clues – 3 of them Cup-winners – that are shorter than the words they clue:

Comp   Word            Clue           Letters
576    TOP-NOTCH       A1 V1?         4
709    GINGER          Go pop         5
508    POSTURE-MAKER   Proteus?       7
701    BALUSTRADE      Bears cope?    9
221    PADDY-WHACK     Ire-lander?    9
1372   MASTERSTROKE    A major coup   10

and some others that weigh in at no more than ten letters:

Comp   Word         Clue               Letters
788    CROW         B-r-ag?            4
2495   BUNG         Cork tip?          7
1021   BEARD        Face down          8
2096   TRICK        Fob watch?         8
88     BLOOMERY     Hothouse?          8
709    GINGER       Pop group?         8
891    CAT          Dandy lion?        9
1711   MINUS        Dash in sum?       9
464    SIMKIN       I’m bottled        9
735    MALIGN       Knock Fell         9
1797   HEART        Meat balls         9
709    GINGER       Beer bottle        10
709    GINGER       Buck Rogers?       10
1026   LET-OFF      Fire escape        10
788    CROW         Jimmy or Jim?      10
2279   CLEMENTINE   Papal cross?       10
203    BOGY         Poker fiend        10
401    GO-AHEAD     Smart try-on!      10
1181   SLADE        ‘Noddy’ leads ——   10

Longest clues

Competitors’ brevity is at least equalled by their prolixity. The longest ‘clue’ in the archive is this 230-letter Jingle for the Three Kings by C. H. Hudson from competition 143:

You were three chiefs of Middle-East,
You the three Kings of Orient were –
The modern Sheik’s not in the least
Concerned with frankincense and myrrh.
You journeyed through the winter’s cold
To mark with gifts the angels’ news –
Your counterpart, agog for gold,
Just grabs his oily revenues

Definition and Letter Mixtures also discourage terseness. The only 3-word DLM competition (1810 HARE / POSEUR / SERINETTE) produced these 165 letters from Rev Canon C. M. Broun:

Be a better preacher, a sensible one – move quickly to conclusion, then shut up – or sermons will seem a show-off of theology and congregations lose interest, glad to substitute hymns and organ for teaching.

The longest cryptic clue is a ‘Right and Left’ of 30 words and 149 letters by P. F. Henderson for URTICARIA / APOGRAPHS in competition 447:

Rash primaries in USA – really, those idiots Carter and Reagan (ignoring Anderson) – you can’t choose between them! Perhaps hostages will be released – then you could say: ‘He’s set these free’

As a normal clue, C. M. Edmunds’ work from 1985 (696 ANTIMNEMONIC) of 122 letters and 22 words (10 of which are the definition) has never been surpassed in length:

One leader of morris men experiencing volte-face over performing in grotesque pageant – I’m opposed to knotted handkerchiefs, I shan’t ring any bells!

R. J. Heald came close with his tribute to the Azed 1750 lunch (1750 PLOUGHMAN), at 120 letters and 19 words:

Characters foremost among puzzlers love Oxford University get-togethers with regular doses of champagne (my lunches are far less extravagant!)

You might expect cup-winning normal clues to be somewhat briefer, and they are (8.9 words and 42 letters on average), though in competition 2486 R. J. Heald (again) got through 82 letters and 14 words clueing the 7 letters of BRUHAHA:

Baron Hardup merrily gives away daughter, Prince becoming husband by end of Cinderella pantomime

The Printer’s Devilry clue with most letters is from I. Carr (2040 EASTER), at 12 words and 80 letters:

Global warming and oceanic pollution are what? Many scientists’ se/minal planetary afflictions

Word frequencies

There are 147,687 words altogether in the clues of the Archive, and 22,312 distinct words. Just as in everyday language, a few words occur very frequently, a large number very rarely, and the rest somewhere in between. But this distribution makes the clues much more diverse than everyday language or literature, as this rather selective table shows:

Source                    Total Words   Distinct Words   Diversity
King James Bible          788,258       14,565           1.8%
Shakespeare’s Sonnets     60,431        4,169            6.9%
Hamlet                    39,476        4,686            11.9%
Azed Slip Archive Clues   147,687       22,312           15.1%
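
The table does not say how the Diversity column is calculated, but the figures are reproduced by taking distinct words as a percentage of total words. The sketch below works through that arithmetic on the numbers above. The formula is an inference from the figures, and the names `sources` and `diversity` are made up for this example.

```python
# Figures copied from the table above: (total words, distinct words).
sources = {
    "King James Bible": (788_258, 14_565),
    "Shakespeare's Sonnets": (60_431, 4_169),
    "Hamlet": (39_476, 4_686),
    "Azed Slip Archive Clues": (147_687, 22_312),
}

def diversity(total, distinct):
    """Distinct words as a percentage of total words (assumed formula)."""
    return 100 * distinct / total

for name, (total, distinct) in sources.items():
    print(f"{name}: {diversity(total, distinct):.1f}%")
```

A measure like this tends to fall as a text gets longer, because common words keep repeating, so comparisons between sources of very different sizes are only indicative.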
 

Popular words

The most popular words in the Archive clues are unsurprisingly also some of the most frequently occurring words in English (competitions 1 to 2547).

Word   Freq
in     4175
of     4164
a      4114
to     2964
the    2733
with   2291
for    1804
and    1598
one    1570

Behind them we find some cryptic staples

Word        Freq
end         325
old         315
see         312
bit         274
time        256
round       247
possibly    209
new         199
left        192
head        189
good        166
short       164
half        153
man         148
English     140
cut         139
Men         137
find        135
love        134
initially   134
heart       132
top         127
back        126

and a little further on, some thematic favourites

Word        Freq
work        118
hard        102
Christmas   98
run         97
Party       90
playing     88
play        87
French      80
character   79
endless     78
bar         78
King        76
lost        74
big         74
power       72
red         70
girl        65
energy      62
air         60

which come just ahead of the judge and setter in several guises

Word     Freq
Azed     69
Azed’s   45
AZ       4
AZ’s     1

and, of course, references to the art of clue-writing

Word            Freq
Clue            60
clues           20
clued           11
cluing          7
cluers          5
Clue’s          5
cluer           3
Clue-writer’s   2

Unique words

Over half of the distinct words in clues (11,734 out of 22,312) occur only once in the entire Archive. Every competition and most competitors have contributed some unique words. Each competitor is credited with their unique contributions in the competitor WordStats pages.
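
Finding the words that occur only once is a simple counting exercise. The sketch below shows one way to do it and is not the Archive’s actual code. The word pattern and the decision to ignore case are assumptions, and `unique_words` is a name made up for this example.

```python
import re
from collections import Counter

# Assumed word pattern: runs of letters, digits, hyphens or apostrophes.
WORD_RE = re.compile(r"(?:[^\W_]|['’-])+")

def unique_words(clues):
    """Return the words that occur exactly once across all the clues."""
    # Assumption: words are compared case-insensitively.
    counts = Counter(
        word.lower() for clue in clues for word in WORD_RE.findall(clue)
    )
    return sorted(word for word, n in counts.items() if n == 1)

sample = [
    "Before the heart ensnares one, one likes to go on a binge",
    "Go pop",
    "Pop group?",
]
print(unique_words(sample))
```

In this small sample ‘one’, ‘go’ and ‘pop’ each appear twice and so drop out, while ‘ensnares’ and the other single-use words remain.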

The very first clue in the Azed Archive, S. L. Paton’s

Before the heart ensnares one, one likes to go on a binge

contains the Archive’s only instance of ‘ensnares’, and Competition 1’s clues include ‘Doughboy’, ‘unhealthy’, ‘groats’, ‘lunch-time’, ‘Borgia’, ‘fascinate’, ‘mysteries’, ‘strychnine’s’, ‘corgi’, ‘tigs’, ‘Tollesbury’, ‘promiscuously’ and ‘Bacchae’, none of which has been repeated in the ensuing 48 years of competitions.

New additions to the word list in the latest competition in the WordStats, 2547, include ‘bespoke’, ‘woof’, ‘laity’, ‘Yarn-spinner’, ‘Priti’s’, ‘swathed’, ‘Heald’, ‘Clotho’, ‘profitably’, ‘Buttler’, ‘admirable’, ‘albs’, ‘throstler’s’ and ‘factory’.

And which clue has contributed the most unique words? The answer lies in that unique ‘Jingle’ competition 143, and the following verse from W. Jackson:

Caspar rex et Melchior
Balthazarque lumine
lucent Alphabetici
claro iam aenigmatos.
Vos, observatores Stellae,
Stellae nunc Observatoris,
die Salvatoris nostri
Salutamus hodie.

What’s in a word?

In these WordStats a ‘letter’ is any character from A to Z (including accented letters) or digit from 0 to 9.

A ‘word’ is any contiguous sequence of letters, digits, hyphens or apostrophes terminated by any other characters (spaces, punctuation) or either end of the clue. Every form of a word, including differently hyphenated or apostrophised forms, is considered distinct (e.g. ITS and IT’S are each counted as distinct single words).
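
The two definitions above can be sketched in a few lines of Python. This is one possible reading and not the Archive’s own implementation. It assumes that both straight and curly apostrophes count as apostrophes, and it uses `str.isalnum()`, which accepts any Unicode letter or digit and is therefore somewhat broader than ‘A to Z plus accents’. The names `count_letters` and `split_words` are made up for this example.

```python
import re

# One "word": a run of letters, digits, hyphens or apostrophes.
# [^\W_] matches any Unicode letter or digit, so accented letters count.
WORD_RE = re.compile(r"(?:[^\W_]|['’-])+")

def count_letters(clue):
    """Count letters (including accented ones) and digits in a clue."""
    return sum(1 for ch in clue if ch.isalnum())

def split_words(clue):
    """Split a clue into words; anything outside the pattern ends a word."""
    return WORD_RE.findall(clue)

for clue in ["A1 V1?", "Ire-lander?", "I’m bottled"]:
    print(clue, len(split_words(clue)), count_letters(clue))
```

Under these assumptions ‘A1 V1?’ counts as 2 words and 4 letters, and ‘Ire-lander?’ as 1 word and 9 letters, matching the figures in the shortest-clues list. How the Archive treats quotation marks around a word is not stated, so clues containing them may split differently.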