The Azed Archive WordStats

WordStats is a special feature of the Archive that looks at all the words that make up the clues, and their length and frequency. By breaking down clues into their component words it’s possible to see and compare them in a new way. Azed is of course the sole arbiter of quality in the clue-writing competition, but WordStats is a great source when it comes to quantity.

How to find WordStats in the Archive

WordStats are collected for all clues in the Azed archive from 1972 onwards.

J. R. Tozer’s clues. Image generated by wordle.net

Clue lengths

WordStats shows us that there is a remarkable consistency of clue lengths, even though clues vary in length from 4 to 230 letters. In each of the past 49 competition years, the average ‘normal’ clue has contained between 9 and 10 words and between 40 and 48 letters. The average clue length over the whole archive (more than 15,500 clues of all types) is 9.7 words and 46 letters. There has been no trend to longer or shorter clues over the life of the Azed series.

Clue length is most highly correlated with clue type, for the obvious reason that clue types such as ‘Right and Left’ and DLM (Definition and Letter Mixtures) demand longer clues.

In normal clues there’s a strong relationship between the number of letters in the word clued and the number of letters in the clue. In clues for words from 4 to 12 letters the average clue length increases by a little under one letter (roughly three-quarters of a letter) for each extra letter in the word clued. 4-letter words have an average clue length of 41 letters, rising to an average of 47 letters for 12-letter words (competitions 1 to 2603).

Word length   Words in clue   Letters in clue   No of comps
4             8.4             41                13
5             8.7             42                33
6             8.9             43                41
7             9.2             44                68
8             9.4             45                87
9             9.6             46                64
10            9.7             47                64
11            9.7             46                39
12            9.8             47                63
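
The trend in the table above can be checked with a quick least-squares fit. The sketch below is only an illustration: it uses the rounded averages from the table, gives every word length equal weight (ignoring the number of competitions behind each row), and the function name is made up for this example.

```python
# Average letters per clue by length of the word clued (rounded figures from the table).
word_lengths = [4, 5, 6, 7, 8, 9, 10, 11, 12]
avg_clue_letters = [41, 42, 43, 44, 45, 46, 47, 46, 47]


def least_squares_slope(xs, ys):
    """Ordinary (unweighted) least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


slope = least_squares_slope(word_lengths, avg_clue_letters)
print(f"{slope:.2f} extra clue letters per extra letter in the word clued")
```

On these rounded figures the fitted slope is about three-quarters of a letter per extra letter in the word clued, in line with the rise from 41 to 47 letters across the table.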
 

Shortest clues

Azed competitors can be extremely brief when the opportunity arises. Here are the normal clues – 3 of them Cup-winners – that are shorter than the words they clue:

Comp   Word            Clue           Letters
576    TOP-NOTCH       A1 V1?         4
709    GINGER          Go pop         5
508    POSTURE-MAKER   Proteus?       7
701    BALUSTRADE      Bears cope?    9
221    PADDY-WHACK     Ire-lander?    9
1372   MASTERSTROKE    A major coup   10

and some others that weigh in at no more than ten letters:

Comp   Word         Clue               Letters
788    CROW         B-r-ag?            4
2495   BUNG         Cork tip?          7
1021   BEARD        Face down          8
2096   TRICK        Fob watch?         8
88     BLOOMERY     Hothouse?          8
709    GINGER       Pop group?         8
891    CAT          Dandy lion?        9
1711   MINUS        Dash in sum?       9
464    SIMKIN       I’m bottled        9
735    MALIGN       Knock Fell         9
1797   HEART        Meat balls         9
709    GINGER       Beer bottle        10
709    GINGER       Buck Rogers?       10
1026   LET-OFF      Fire escape        10
788    CROW         Jimmy or Jim?      10
2279   CLEMENTINE   Papal cross?       10
203    BOGY         Poker fiend        10
401    GO-AHEAD     Smart try-on!      10
1181   SLADE        ‘Noddy’ leads ——   10

Longest clues

Competitors’ brevity is at least equalled by their prolixity. The longest ‘clue’ in the archive is this 230-letter Jingle for the Three Kings by C. H. Hudson from competition 143:

You were three chiefs of Middle-East,
You the three Kings of Orient were –
The modern Sheik’s not in the least
Concerned with frankincense and myrrh.
You journeyed through the winter’s cold
To mark with gifts the angels’ news –
Your counterpart, agog for gold,
Just grabs his oily revenues

Definition and Letter Mixtures also discourage terseness. The only 3-word DLM competition (1810 HARE / POSEUR / SERINETTE) produced these 165 letters from Rev Canon C. M. Broun:

Be a better preacher, a sensible one – move quickly to conclusion, then shut up – or sermons will seem a show-off of theology and congregations lose interest, glad to substitute hymns and organ for teaching.

The longest cryptic clue is a ‘Right and Left’ of 30 words and 149 letters by P. F. Henderson for URTICARIA / APOGRAPHS in competition 447:

Rash primaries in USA – really, those idiots Carter and Reagan (ignoring Anderson) – you can’t choose between them! Perhaps hostages will be released – then you could say: ‘He’s set these free’

As a normal clue, C. M. Edmunds’ work from 1985 (696 ANTIMNEMONIC) of 122 letters and 22 words (10 of which are the definition) has never been surpassed in length:

One leader of morris men experiencing volte-face over performing in grotesque pageant – I’m opposed to knotted handkerchiefs, I shan’t ring any bells!

R. J. Heald came close with his tribute to the Azed 1750 lunch (1750 PLOUGHMAN), at 120 letters and 19 words:

Characters foremost among puzzlers love Oxford University get-togethers with regular doses of champagne (my lunches are far less extravagant!)

You might expect cup-winning normal clues to be somewhat briefer, and they are (9 words and 42 letters on average), though in competition 2486 R. J. Heald (again) got through 82 letters and 14 words clueing the 7 letters of BRUHAHA:

Baron Hardup merrily gives away daughter, Prince becoming husband by end of Cinderella pantomime

The Printer’s Devilry clue with most letters is from I. Carr (2040 EASTER), at 12 words and 80 letters:

Global warming and oceanic pollution are what? Many scientists’ se/minal planetary afflictions

Word frequencies

There are 150,833 words altogether in the clues of the Archive, and 22,619 distinct words. Just as in everyday language, a few words occur very frequently, a large number very rarely, and the rest somewhere in between. The clues are nevertheless much more diverse than everyday language or literature, as this rather selective table shows (Diversity here is distinct words as a percentage of total words):

Source                    Total Words   Distinct Words   Diversity
King James Bible          788,258       14,565           1.8%
Shakespeare’s Sonnets     60,431        4,169            6.9%
Hamlet                    39,476        4,686            11.9%
Azed Slip Archive Clues   150,833       22,619           15%
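
The Diversity column appears to be the number of distinct words divided by the total number of words. That definition is inferred from the figures rather than stated, so the short sketch below treats it as an assumption and simply recomputes the percentages from the table’s own totals.

```python
# (total words, distinct words) as given in the table above.
sources = {
    "King James Bible": (788_258, 14_565),
    "Shakespeare's Sonnets": (60_431, 4_169),
    "Hamlet": (39_476, 4_686),
    "Azed Slip Archive Clues": (150_833, 22_619),
}


def diversity(total_words, distinct_words):
    """Distinct words as a fraction of total words (assumed definition)."""
    return distinct_words / total_words


for name, (total, distinct) in sources.items():
    print(f"{name}: {diversity(total, distinct):.1%}")
```

Note that a ratio of this kind tends to fall as a text gets longer, because common words keep repeating, so comparisons between sources of very different sizes should be read with some care.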
 

Popular words

The most popular words in the Archive clues are unsurprisingly also some of the most frequently occurring words in English (competitions 1 to 2603).

Word   Freq
of     4265
in     4257
a      4182
to     3035
the    2760
with   2342
for    1851
and    1628
one    1593

Behind them we find some cryptic staples

Word        Freq
end         331
see         319
old         318
bit         282
time        259
round       249
possibly    221
new         199
head        196
left        195
good        170
short       166
half        157
man         149
initially   144
English     143
cut         139
men         137
love        137
find        137
heart       132
back        129
top         129

and a little further on, some thematic favourites

Word        Freq
work        119
hard        103
Christmas   100
run         98
party       92
playing     91
play        89
French      82
character   82
bar         79
endless     78
King        77
lost        74
big         74
power       72
red         71
girl        67
energy      63
air         63

which come just ahead of the judge and setter in several guises

Word     Freq
Azed     71
Azed’s   49
AZ       4
AZ’s     1

and, of course, references to the art of clue-writing

Word            Freq
clue            60
clues           20
clued           12
cluing          7
cluers          5
clue’s          5
cluer           3
clue-writer’s   2

Unique words

Over half of the distinct words in clues (11,871 out of 22,619) occur only once in the entire Archive. Every competition and most competitors have contributed some unique words. Each competitor is credited with their unique contributions in the competitor WordStats pages.
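
Finding the words that occur only once is a straightforward counting exercise. The following is a minimal sketch rather than the Archive’s own code: the function name is invented for this example, the word pattern is a plain reading of the WordStats definition of a word, and folding upper and lower case together is an assumption, since WordStats does not say how it treats capital letters.

```python
import re
from collections import Counter

# A word is a run of letters, digits, hyphens or apostrophes.
# Assumption: both the straight (') and curly (’) apostrophe count.
WORD_RE = re.compile(r"(?:[^\W_]|[-'’])+")


def unique_words(clues):
    """Return, in sorted order, the words that occur exactly once across the clues."""
    counts = Counter(
        word.casefold()  # assumption: 'Pop' and 'pop' are the same word
        for clue in clues
        for word in WORD_RE.findall(clue)
    )
    return sorted(word for word, n in counts.items() if n == 1)


# Four short clues quoted in the Archive.
sample = ["Go pop", "Pop group?", "Beer bottle", "I’m bottled"]
print(unique_words(sample))
```

In this small sample ‘pop’ occurs twice and so drops out, while ‘bottle’ and ‘bottled’ are different forms and each counts as unique.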

The very first clue in the Azed Archive, S. L. Paton’s

Before the heart ensnares one, one likes to go on a binge

contains the Archive’s only instance of ‘ensnares’, and Competition 1’s clues include ‘Doughboy’, ‘unhealthy’, ‘groats’, ‘lunch-time’, ‘Borgia’, ‘fascinate’, ‘mysteries’, ‘strychnine’s’, ‘corgi’, ‘tigs’, ‘Tollesbury’, ‘promiscuously’ and ‘Bacchae’, none of which has been repeated in the ensuing 49 years of competitions.

New additions to the word list in the latest competition in the WordStats, 2603, include ‘Starmer’s’, ‘Sunak’, ‘festers’, ‘Kim’s’, ‘monstrosity’, ‘twerking’, ‘fundament’, ‘uterus’, ‘Outrage’, ‘supermarkets’, ‘dread’s’, ‘tanks’, ‘assuredly’, ‘absentia’ and ‘prodigy’.

And which clue has contributed the most unique words? The answer lies in that unique ‘Jingle’ competition 143, and the following verse from W. Jackson:

Caspar rex et Melchior
Balthazarque lumine
lucent Alphabetici
claro iam aenigmatos.
Vos, observatores Stellae,
Stellae nunc Observatoris,
die Salvatoris nostri
Salutamus hodie.

What’s in a word?

In these WordStats a ‘letter’ is any character from A to Z (including accented letters) or digit from 0 to 9.

A ‘word’ is any contiguous sequence of letters, digits, hyphens or apostrophes terminated by any other character (a space or punctuation mark) or by either end of the clue. Every form of a word, including differently hyphenated or apostrophised forms, is considered distinct (e.g. ITS and IT’S are each counted as distinct single words).
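
These two definitions can be sketched in a few lines of code. This is an illustration under stated assumptions, not the Archive’s implementation: the function names are invented, both straight and curly apostrophes are accepted, and Python’s `str.isalnum()` stands in for ‘A to Z, accented letters and digits’, although it also accepts letters and digits from other scripts.

```python
import re

# A word is a run of letters, digits, hyphens or apostrophes; anything else ends it.
# Assumption: both the straight (') and curly (’) apostrophe count.
WORD_RE = re.compile(r"(?:[^\W_]|[-'’])+")


def words(clue):
    """Split a clue into words according to the definition above."""
    return WORD_RE.findall(clue)


def letter_count(clue):
    """Count letters and digits, ignoring spaces, hyphens and punctuation."""
    # Assumption: isalnum() approximates 'A to Z (including accented) or 0 to 9'.
    return sum(ch.isalnum() for ch in clue)


for clue in ["A1 V1?", "Ire-lander?", "I’m bottled"]:
    print(clue, len(words(clue)), letter_count(clue))
```

Applied to C. M. Edmunds’ clue for ANTIMNEMONIC, this sketch gives 22 words and 122 letters, matching the figures quoted for it. One side effect of accepting the curly apostrophe is that a closing single quotation mark, which is the same character, stays attached to the word before it.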