Title | The Code Book: The Secret History of Codes and Code-breaking |
---|---|
Author | Simon Singh |
Genre | Other educational literature |
Series | |
Publisher | Other educational literature |
Year of publication | 0 |
ISBN | 9780007378302 |
He soon gained a reputation within London society as a cryptanalyst prepared to tackle any encrypted message, and strangers would approach him with all sorts of problems. For example, Babbage helped a desperate biographer attempting to decipher the shorthand notes of John Flamsteed, England’s first Astronomer Royal. He also came to the rescue of a historian, solving a cipher of Henrietta Maria, wife of Charles I. In 1854, he collaborated with a barrister and used cryptanalysis to reveal crucial evidence in a legal case. Over the years, he accumulated a thick file of encrypted messages, which he planned to use as the basis for an authoritative book on cryptanalysis, entitled The Philosophy of Decyphering. The book would contain two examples of every kind of cipher, one that would be broken as a demonstration and one that would be left as an exercise for the reader. Unfortunately, as with many other of his grand plans, the book was never completed.
While most cryptanalysts had given up all hope of ever breaking the Vigenère cipher, Babbage was inspired to attempt a decipherment by an exchange of letters with John Hall Brock Thwaites, a dentist from Bristol with a rather innocent view of ciphers. In 1854, Thwaites claimed to have invented a new cipher, which, in fact, was equivalent to the Vigenère cipher. He wrote to the Journal of the Society of Arts with the intention of patenting his idea, apparently unaware that he was several centuries too late. Babbage wrote to the Society, pointing out that ‘the cypher … is a very old one, and to be found in most books’. Thwaites was unapologetic and challenged Babbage to break his cipher. Whether or not it was breakable was irrelevant to whether or not it was new, but Babbage’s curiosity was sufficiently aroused for him to embark on a search for a weakness in the Vigenère cipher.
Cracking a difficult cipher is akin to climbing a sheer cliff face. The cryptanalyst is seeking any nook or cranny which could provide the slightest purchase. In a monoalphabetic cipher the cryptanalyst will latch on to the frequency of the letters, because the commonest letters, such as e, t and a, will stand out no matter how they have been disguised. In the polyalphabetic Vigenère cipher the frequencies are much more balanced, because the keyword is used to switch between cipher alphabets. Hence, at first sight, the rock face seems perfectly smooth.
Remember, the great strength of the Vigenère cipher is that the same letter will be enciphered in different ways. For example, if the keyword is KING, then every letter in the plaintext can potentially be enciphered in four different ways, because the keyword contains four letters. Each letter of the keyword defines a different cipher alphabet in the Vigenère square, as shown in Table 7. The e column of the square has been highlighted to show how it is enciphered differently, depending on which letter of the keyword is defining the encipherment:
If the K of KING is used to encipher e, then the resulting ciphertext letter is O.
If the I of KING is used to encipher e, then the resulting ciphertext letter is M.
If the N of KING is used to encipher e, then the resulting ciphertext letter is R.
If the G of KING is used to encipher e, then the resulting ciphertext letter is K.
Table 7 A Vigenère square used in combination with the keyword KING. The keyword defines four separate cipher alphabets, so that the letter e may be encrypted as O, M, R or K.
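The four encipherments of e can be checked with a few lines of Python. This is only a minimal sketch: it treats each row of the Vigenère square as a shift of the alphabet by the position of the keyword letter, and the helper name `encipher_letter` is an illustrative choice, not something taken from the book.

```python
import string

ALPHABET = string.ascii_uppercase  # A..Z, the top row of the Vigenère square

def encipher_letter(plain: str, key_letter: str) -> str:
    """Encipher one letter with the row of the square that begins with key_letter."""
    shift = ALPHABET.index(key_letter.upper())
    return ALPHABET[(ALPHABET.index(plain.upper()) + shift) % 26]

# Encipher the plaintext letter e under each letter of the keyword KING.
results = {k: encipher_letter("e", k) for k in "KING"}
print(results)  # {'K': 'O', 'I': 'M', 'N': 'R', 'G': 'K'}
```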
Similarly, whole words will be enciphered in different ways: the word the, for example, could be enciphered as DPR, BUK, GNO or ZRM, depending on its position relative to the keyword. Although this makes cryptanalysis difficult, it is not impossible. The important point to note is that if there are only four ways to encipher the word the, and the original message contains several instances of the word the, then it is highly likely that some of the four possible encipherments will be repeated in the ciphertext. This is demonstrated in the following example, in which the line The Sun and the Man in the Moon has been enciphered using the Vigenère cipher and the keyword KING.
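Keyword | K | I | N | G | K | I | N | G | K | I | N | G | K | I | N | G | K | I | N | G | K | I | N | G |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Plaintext | t | h | e | s | u | n | a | n | d | t | h | e | m | a | n | i | n | t | h | e | m | o | o | n |
Ciphertext | D | P | R | Y | E | V | N | T | N | B | U | K | W | I | A | O | X | B | U | K | W | W | B | T |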
The word the is enciphered as DPR in the first instance, and then as BUK on the second and third occasions. The reason for the repetition of BUK is that the second the is displaced by eight letters with respect to the third the, and eight is a multiple of the length of the keyword, which is four letters long. In other words, the second the was enciphered according to its relationship to the keyword (the is directly below ING), and by the time we reach the third the, the keyword has cycled round exactly twice, to repeat the relationship, and hence repeat the encipherment.
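The example can be reproduced in Python. The sketch below assumes the usual arithmetic reading of the Vigenère square (add the keyword letter to the plaintext letter, modulo 26) and ignores spaces; the function name `vigenere_encipher` is an illustrative choice. It enciphers the whole line, locates both occurrences of BUK, and then lists the four possible encipherments of the by rotating the keyword.

```python
from itertools import cycle

def vigenere_encipher(plaintext: str, keyword: str) -> str:
    """Encipher the letters of plaintext with a repeating keyword; non-letters are dropped."""
    letters = [c for c in plaintext.upper() if c.isalpha()]
    out = []
    for p, k in zip(letters, cycle(keyword.upper())):
        out.append(chr((ord(p) - 65 + ord(k) - 65) % 26 + 65))
    return "".join(out)

ciphertext = vigenere_encipher("The Sun and the Man in the Moon", "KING")
print(ciphertext)  # DPRYEVNTNBUKWIAOXBUKWWBT

# Zero-based positions at which BUK begins: they differ by 8, twice the keyword length.
positions = [i for i in range(len(ciphertext) - 2) if ciphertext[i:i + 3] == "BUK"]
print(positions)  # [9, 17]

# Rotating the keyword gives the four alignments of "the" against KING.
variants = [vigenere_encipher("the", "KING"[i:] + "KING"[:i]) for i in range(4)]
print(variants)  # ['DPR', 'BUK', 'GNO', 'ZRM']
```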
Babbage realised that this sort of repetition provided him with exactly the foothold he needed in order to conquer the Vigenère cipher. He was able to define a series of relatively simple steps which could be followed by any cryptanalyst to crack the hitherto chiffre indéchiffrable. To demonstrate his brilliant technique, let us imagine that we have intercepted the ciphertext shown in Figure 13. We know that it was enciphered using the Vigenère cipher, but we know nothing about the original message, and the keyword is a mystery.
The first stage in Babbage’s cryptanalysis is to look for sequences of letters that appear more than once in the ciphertext. There are two ways that such repetitions could arise. The most likely is that the same sequence of letters in the plaintext has been enciphered using the same part of the key. Alternatively, there is a slight possibility that two different sequences of letters in the plaintext have been enciphered using different parts of the key, coincidentally leading to the identical sequence in the ciphertext. If we restrict ourselves to long sequences, then we largely discount the second possibility, and, in this case, we shall consider repeated sequences only if they are of four letters or more. Table 8 is a log of such repetitions, along with the spacing between the repetitions. For example, the sequence E-F-I-Q appears in the first line of the ciphertext and then in the fifth line, shifted forward by 95 letters.
Figure 13 The ciphertext, enciphered using the Vigenère cipher.
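The search for repeated sequences is mechanical enough to sketch in code. The Python function below is an illustration rather than Babbage's own procedure: it records where every run of `min_len` letters begins and reports the spacing between consecutive occurrences of any run that appears more than once. The name `repeated_sequences` and its parameters are assumptions made for this example, and for brevity it is applied to the short ciphertext of The Sun and the Man in the Moon.

```python
from collections import defaultdict

def repeated_sequences(ciphertext: str, min_len: int = 4) -> dict[str, list[int]]:
    """Map each repeated run of min_len letters to the spacings between its occurrences."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - min_len + 1):
        positions[ciphertext[i:i + min_len]].append(i)
    return {
        seq: [b - a for a, b in zip(pos, pos[1:])]
        for seq, pos in positions.items()
        if len(pos) > 1
    }

sample = "DPRYEVNTNBUKWIAOXBUKWWBT"
print(repeated_sequences(sample, min_len=4))  # {'BUKW': [8]}
```

Here the four-letter run BUKW (the plaintext them, from the Man and the Moon) recurs after eight letters, a multiple of the four-letter keyword.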
As well as being used to encipher the plaintext into ciphertext, the keyword is also used by the receiver to decipher the ciphertext back into plaintext. Hence, if we could identify the keyword, deciphering the text would be easy. At this stage we do not have enough information to work out the keyword, but Table 8 does provide some very good clues as to its length. Having listed which sequences repeat themselves and the spacing between these repetitions, the rest of the table is given over to identifying the factors of the spacing – the numbers that will divide into the spacing. For example, the sequence W-C-X-Y-M repeats itself after 20 letters, and the numbers 1, 2, 4, 5, 10 and 20 are factors, because they divide perfectly into 20 without leaving a remainder. These factors suggest six possibilities:
(1) The key is 1 letter long and is recycled 20 times between encryptions.
(2) The key is 2 letters long and is recycled 10 times between encryptions.
(3) The key is 4 letters long and is recycled 5 times between encryptions.
(4) The key is 5 letters long and is recycled 4 times between encryptions.
(5) The key is 10 letters long and is recycled 2 times between encryptions.
(6) The key is 20 letters long and is recycled 1 time between encryptions.
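The list of possibilities is simply the list of factors of the spacing, which a short Python sketch can generate. The helper names `factors` and `common_factors` are illustrative. Intersecting the factors of several spacings is an assumption about how such a table would typically be used, applied here to the two spacings quoted in the text (20 and 95).

```python
def factors(n: int) -> list[int]:
    """Return every positive integer that divides n without remainder."""
    return [d for d in range(1, n + 1) if n % d == 0]

def common_factors(spacings: list[int]) -> list[int]:
    """Return the factors shared by every spacing in the list."""
    shared = set(factors(spacings[0]))
    for s in spacings[1:]:
        shared &= set(factors(s))
    return sorted(shared)

# Each factor of 20 is a candidate key length; 20 // length is how often the key recycles.
for length in factors(20):
    print(f"key of {length} letter(s), recycled {20 // length} time(s)")

print(common_factors([20, 95]))  # [1, 5]
```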
The first possibility can be excluded, because a key that is only 1 letter long gives rise to a monoalphabetic cipher – only one row of the Vigenère square would be used for the entire encryption, and the cipher alphabet would remain unchanged; it is unlikely that a cryptographer would do this. To indicate each of the other possibilities, a