THE EXTRAORDINARY INVESTIGATIONS UNIT

Unravelling conundrums since 1910

Our investigations will often involve decoding secret messages – cracking codes or breaking cyphers. (There is a difference between the two – technically a ‘code’ replaces whole words with symbols, whilst a ‘cypher’ replaces individual letters with other letters or numbers. What most people call codes are actually cyphers.)

Our investigators deploy a variety of methods to unlock the contents of secret messages. A few of these methods are outlined below, but first we'll take a look at some simple cyphers…

Shift or Rotation Cypher

This cypher is one of the earliest known encryption methods. Allegedly invented by Julius Caesar, it is sometimes referred to as the Caesar Cypher. It is a simple substitution cypher where each letter of the alphabet is ‘shifted’ by a certain amount. So if the required shift is 2 then the letter A would become C, B becomes D, C is E etc. And towards the end of the alphabet, the cypher ‘wraps around’ – so with the 2-shift example, the letter Y becomes A, and Z is B.
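The shift described above can be sketched in a few lines of Python. This is only an illustration: the function name shift_encypher is our own invention, and we assume the message uses the 26-letter English alphabet, with everything else (spaces, punctuation) left alone.

```python
import string

def shift_encypher(text: str, shift: int) -> str:
    """Shift each letter A-Z forward by `shift` places, wrapping around after Z."""
    result = []
    for ch in text.upper():
        if ch in string.ascii_uppercase:
            # Map A..Z to 0..25, add the shift, and wrap around with modulo 26.
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            # Spaces and punctuation pass through unchanged (an assumption).
            result.append(ch)
    return "".join(result)

print(shift_encypher("ABC XYZ", 2))  # prints CDE ZAB
```

Passing a negative shift undoes the encyphering, so shift_encypher("CDE ZAB", -2) gives back the original letters.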

There will often be a clue somewhere to give a hint as to how to crack the cypher – references to movement or rotation etc., or perhaps to Caesar himself. Another way to identify this cypher is to take a short section of the coded message and try shifting the letters up or down a few different alphabet places – if recognisable words suddenly appear then you have a Shift Cypher on your hands.
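Since there are only 25 possible shifts, trying them all is quick. The sketch below is one way to do it, not the only one: the names shift_letters and all_shifts are hypothetical, and a human still has to spot which line contains real words.

```python
def shift_letters(text: str, shift: int) -> str:
    """Shift letters A-Z by `shift` places (negative shifts move backwards)."""
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)  # leave spaces and punctuation alone
    return "".join(out)

def all_shifts(cyphertext: str) -> dict:
    """Return every candidate plaintext, keyed by the shift that was undone."""
    return {s: shift_letters(cyphertext, -s) for s in range(1, 26)}

# Print each candidate and look for the line that reads as English.
for s, candidate in all_shifts("VJG EQFG").items():
    print(s, candidate)
```

For this sample message the recognisable words appear on the line for shift 2.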

Number Cypher

A Number Cypher is often a variation of a Shift Cypher. It looks more complicated, but it's the same in essence. You give each letter of the alphabet a number (A is 1, B is 2 and so on up to Z, which is 26), and then perform the shift. So in the 2-shift example, the letter A would become 3, B is 4 etc.
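As a rough sketch of the numbering-plus-shift idea in Python: we assume A is 1 and Z is 26, and that the numbers wrap around in the same way the Shift Cypher does (so with a 2-shift, Y becomes 1). The function names are our own, and spaces and punctuation are simply dropped.

```python
def number_encypher(text: str, shift: int) -> list:
    """Turn each letter into its shifted alphabet position (1 to 26)."""
    numbers = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            # 0-based position, shifted and wrapped, then back to 1-based.
            numbers.append((ord(ch) - ord("A") + shift) % 26 + 1)
    return numbers

def number_decypher(numbers: list, shift: int) -> str:
    """Reverse the process: undo the shift and turn numbers back into letters."""
    return "".join(chr((n - 1 - shift) % 26 + ord("A")) for n in numbers)

print(number_encypher("ABYZ", 2))        # prints [3, 4, 1, 2]
print(number_decypher([3, 4, 1, 2], 2))  # prints ABYZ
```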

Mirror Cypher (aka the Atbash Cypher)

The Mirror Cypher is another ancient cypher, a simple letter substitution, originally used to encode messages in Hebrew. It involves taking the alphabet, splitting it in half, and ‘mirroring’ it, so that each letter swaps with the one the same distance from the opposite end. So the letter A becomes Z, and Z is A, whilst the letter B becomes Y, and Y is B.
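In Python, the mirroring amounts to pairing the alphabet with itself written backwards. The following is a minimal sketch for the English alphabet only (the original was used with Hebrew); the name mirror_cypher is hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase
# Pair each letter with its mirror image: A<->Z, B<->Y, C<->X and so on.
MIRROR_TABLE = str.maketrans(ALPHABET, ALPHABET[::-1])

def mirror_cypher(text: str) -> str:
    """Encypher or decypher: the same operation works in both directions."""
    return text.upper().translate(MIRROR_TABLE)

print(mirror_cypher("HELLO"))  # prints SVOOL
print(mirror_cypher("SVOOL"))  # prints HELLO
```

Because the mirror is symmetrical, applying it twice returns the original message, so no separate decoding function is needed.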

Substitution Cypher

A Substitution Cypher is more complex than just shifting the alphabet, or mirroring its halves. Here, letters will be reassigned at random, so the letter A could be H, and B could be R, and C could be X etc. There isn’t an obvious shift of alphabet, and the recipient of the message needs the key (the regular alphabet alongside its encyphered equivalent) to read the text.
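To illustrate what a key looks like in practice, here is a short Python sketch. It shuffles the alphabet to make a key, then uses the key in both directions. The names make_key, encypher and decypher are our own, and the optional seed is only there so that the same key can be produced again for a demonstration.

```python
import random
import string

def make_key(seed=None) -> dict:
    """Pair the regular alphabet with a randomly shuffled copy of itself."""
    rng = random.Random(seed)
    shuffled = rng.sample(list(string.ascii_uppercase), 26)
    return dict(zip(string.ascii_uppercase, shuffled))

def encypher(text: str, key: dict) -> str:
    """Replace each letter using the key; other characters pass through."""
    return "".join(key.get(ch, ch) for ch in text.upper())

def decypher(text: str, key: dict) -> str:
    """Read the key backwards to recover the original letters."""
    reverse_key = {cypher: plain for plain, cypher in key.items()}
    return "".join(reverse_key.get(ch, ch) for ch in text.upper())

key = make_key(seed=1910)
secret = encypher("MEET AT NOON", key)
print(secret)
print(decypher(secret, key))  # prints MEET AT NOON
```

A random shuffle may occasionally leave a letter standing for itself, which is harmless for a demonstration.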

Messages encyphered with letter substitution can look complex and intimidating to the would-be decoder. But there are a variety of tactics to crack this kind of message, even if you don’t have access to the key...

Cracking a Substitution Cypher

Take advantage of the hidden structure in language to break a cypher. The most basic tool in the art of decyphering is ‘frequency analysis’. Some letters appear much more frequently in regular text than others. In English, the letter E is the most frequently occurring letter, followed by T, and then A. The least frequent letters are X, Q, J, and Z.

If you have access to a decent length of encyphered text, you can count the number of times particular letters appear and then make an educated guess about the identities of the most commonly occurring. The ten most frequently used letters in English are: E, T, A, O, I, N, S, R, H, D. Identifying some of them will provide you with a crowbar to crack the rest of the message open.
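Counting letters by hand is tedious, so here is a small Python sketch that does the tallying. The function name letter_frequencies is hypothetical, and the short sample string is only for demonstration; a real count needs a much longer message to be reliable.

```python
from collections import Counter

def letter_frequencies(cyphertext: str) -> list:
    """Count each letter A-Z and list them from most to least common."""
    letters = [ch for ch in cyphertext.upper() if "A" <= ch <= "Z"]
    return Counter(letters).most_common()

# The letter at the top of the list is a candidate for E, the next for T, etc.
for letter, count in letter_frequencies("XXXY yz"):
    print(letter, count)
```

The output is only a starting point for guesses: in a short message the most common letter may well not be E.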

Once you have a rough idea of what some of the more common letters in the message might represent, you can use another decyphering tool – ‘pattern recognition’...

Are there frequent patterns of 3 letters in the message? The most common 3-letter words in English are ‘the’ and ‘and’. So any frequent 3-letter sequence ending in what might be E is probably ‘the’. This clue can confirm your identification of E, and also give you the identity of T and H. Once you’ve spotted ‘the’, the next-most common 3-letter pattern is probably ‘and’, revealing the identity of a further 3 letters.
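The hunt for ‘the’ and ‘and’ can be sketched in Python as below. We assume the encyphered message has kept its word spaces, which is not always the case; the function name three_letter_words is our own.

```python
import re
from collections import Counter

def three_letter_words(cyphertext: str) -> list:
    """List the 3-letter words in the message, most frequent first."""
    words = re.findall(r"\b[A-Z]{3}\b", cyphertext.upper())
    return Counter(words).most_common()

# Sample message made with a shift of 3, used here purely for demonstration.
for word, count in three_letter_words("WKH FDW DQG WKH GRJ"):
    print(word, count)
```

Here WKH appears most often and makes a good candidate for ‘the’, which would mean W is T, K is H and H is E.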

Are there double-letters in your encyphered text? They can be an excellent clue for identifying letters. The most common double-letters in English are LL, SS, and EE. They occur almost twice as often as the next set of ‘doubles’ which are OO and TT.
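Spotting doubles is also easy to automate. The Python sketch below, with the hypothetical name double_letters, simply compares each letter with its neighbour and tallies the matching pairs.

```python
from collections import Counter

def double_letters(cyphertext: str) -> list:
    """Find pairs of identical adjacent letters, most frequent first."""
    text = cyphertext.upper()
    # Walk through the text comparing each character with the next one.
    pairs = [a + b for a, b in zip(text, text[1:]) if a == b and a.isalpha()]
    return Counter(pairs).most_common()

for pair, count in double_letters("SVOOL XLLP"):
    print(pair, count)
```

Each double found in the message is a candidate for LL, SS or EE, to be weighed against the letter counts you already have.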

Lastly, if you know anything about the message’s context, you may have some clues to help break the cypher. Are people’s names, or place names, likely to show up in the text? If you can identify any of those then you’ve found yourself a skeleton key for the remainder of the message.

Once you’ve started to identify a decent number of the letters with these tools, the message will begin to reveal itself. Good luck with your decoding.