There are two kinds of simple substitution ciphers: algorithmic and mapping.
Algorithmic substitution cipher
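In an algorithmic substitution cipher, the "key" to the cipher is the algorithm. For example, an algorithmic cipher that has the key "Add 3" would replace each letter with the letter three positions later in the alphabet, so that A becomes D, B becomes E, and so on.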
When the above cipher is used, the input message HELLO BOB (called the message text) is translated into the output message KHOOR ERE (called the cipher text).
Unfortunately, algorithmic ciphers are pretty easy to decode even without the key, because the English language has very common letter patterns. Further, the "rules" of English spelling (such as the necessity of having a vowel in every word) and certain common single-character words make the letter relationships fairly apparent.
To obscure the obvious relationship between the letters in the output message and the letters in the input message, the output message was often reformatted in a fixed pattern of character groups. Thus the output KHOOR ERE would be reformatted using four-letter groups to be KHOO RERE. The reformatting makes it less obvious where the words begin and end, but it means that the clerk decoding the message has to read the decoded text to figure out where to reinsert the spaces.
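
The "Add 3" translation and the four-letter regrouping described above can be sketched in a few lines of Python. This is only an illustration: the function names `shift_encrypt` and `regroup` are made up for this example, and the sketch assumes an uppercase A-Z alphabet in which the shift wraps around from Z back to A.

```python
import string

ALPHABET = string.ascii_uppercase  # assumption: messages use only A-Z and spaces


def shift_encrypt(message: str, shift: int = 3) -> str:
    """Replace each letter with the one `shift` places later, wrapping past Z."""
    out = []
    for ch in message:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)  # spaces and other characters pass through unchanged
    return "".join(out)


def regroup(cipher_text: str, size: int = 4) -> str:
    """Drop the original spaces and re-split the letters into fixed-size groups."""
    letters = cipher_text.replace(" ", "")
    return " ".join(letters[i:i + size] for i in range(0, len(letters), size))


cipher_text = shift_encrypt("HELLO BOB")
print(cipher_text)           # KHOOR ERE
print(regroup(cipher_text))  # KHOO RERE
```

Note that `regroup` throws away the word boundaries for good, which is why the person decoding has to work out where the spaces belong.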
Another problem with algorithmic ciphers lies in their construction: the key is just one algorithm, so once that algorithm is known for one letter, it's known for all letters. An adept cryptanalyst can exploit this. She first counts all of the letters in the cipher text and notes the most commonly occurring one. Knowing that E is the most commonly occurring letter in the English language, she then deduces the relationship between the most common character in the cipher text and the letter E. In short order, the algorithm is known and the cipher is broken. The message is then easily decoded.
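
The counting attack can be sketched as follows. This is a minimal illustration under stated assumptions, not a general-purpose code breaker: the helper names `shift_text` and `guess_shift` are hypothetical, the cipher is assumed to be an "Add n" shift over A-Z, and the sample message was picked so that E really is its most frequent letter. In a short message E may well not be the most common letter, in which case the guess is wrong.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase


def shift_text(text: str, shift: int) -> str:
    """Shift each letter A-Z by `shift` places, wrapping around the alphabet."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + shift) % 26] if ch in ALPHABET else ch
        for ch in text
    )


def guess_shift(cipher_text: str) -> int:
    """Guess the shift by assuming the most common cipher letter stands for E."""
    counts = Counter(ch for ch in cipher_text if ch in ALPHABET)
    most_common_letter = counts.most_common(1)[0][0]
    return (ALPHABET.index(most_common_letter) - ALPHABET.index("E")) % 26


cipher_text = shift_text("MEET ME HERE", 3)
print(cipher_text)                      # PHHW PH KHUH
shift = guess_shift(cipher_text)
print(shift)                            # 3
print(shift_text(cipher_text, -shift))  # MEET ME HERE
```

One letter's relationship is enough here: once H is matched to E, the same offset decodes every other letter.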
Mapping substitution cipher
This brings me to the other type of substitution cipher, the mapping cipher. In a mapping cipher, a fixed relationship exists between input characters and output characters, but that relationship is held in a translation table rather than expressed as an algorithm. For example, let us assume your alphabet consists of exactly seven letters -- A, B, C, D, E, F, and G -- and you set up a translation table as shown below.
| Input character | A | B | C | D | E | F | G |
|---|---|---|---|---|---|---|---|
| Output character | B | G | E | D | F | C | A |

Table 1: Character substitution map
If you were to use the message text CAFE BABE and substitute the letters from the table, you would get the cipher text EBCF GBGF. This doesn't look all that different from the algorithmic cipher -- until the cryptanalyst tries to break the code. The advantage of using the table is that while the analyst could find the relationship between the message text letter E and the cipher text letter F using the same statistical method that I described above, that information wouldn't tell her the relationship between the message text letter B and the cipher text letter G. This property, the independence of each letter's mapping from every other letter's, makes the mapping cipher stronger than the algorithmic cipher, in the sense that it's harder to break.
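
The table lookup can be sketched in Python with a dictionary built from Table 1. The names `TABLE_1`, `INVERSE_TABLE_1`, and `substitute` are illustrative only. The last few lines compute how far each input letter is moved within the seven-letter alphabet, which shows that knowing the offset for E says nothing about the offset for B.

```python
LETTERS = "ABCDEFG"

# Table 1: each input character maps to one output character.
TABLE_1 = dict(zip(LETTERS, "BGEDFCA"))
# Decoding uses the same table read in the other direction.
INVERSE_TABLE_1 = {out: inp for inp, out in TABLE_1.items()}


def substitute(text: str, table: dict) -> str:
    """Look up each character in the table; characters not in it pass through."""
    return "".join(table.get(ch, ch) for ch in text)


cipher_text = substitute("CAFE BABE", TABLE_1)
print(cipher_text)                               # EBCF GBGF
print(substitute(cipher_text, INVERSE_TABLE_1))  # CAFE BABE

# Offset of each mapping within the seven-letter alphabet.
offsets = {
    inp: (LETTERS.index(out) - LETTERS.index(inp)) % len(LETTERS)
    for inp, out in TABLE_1.items()
}
print(offsets["E"], offsets["B"])  # 1 5
```

With an "Add n" cipher every entry in `offsets` would be the same number; here they differ, so each letter has to be recovered separately.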