The Vigenère cipher is a method of encrypting alphabetic text using a series of different Caesar ciphers based on the letters of a keyword. Each letter in the plaintext is shifted by a different amount, determined by the corresponding letter in the keyword.

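For example, our key is the word CRYPTO and our plaintext is THISISAGREATDAY. The key is repeated to the length of the plaintext (CRYPTOCRYPTOCRY), and each plaintext letter is shifted forward by the value of its key letter (A = 0, B = 1, ..., Z = 25), modulo 26. This gives the ciphertext VYGHBGCXPTTHFRW. Decryption shifts each ciphertext letter back by the same amounts.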
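The CRYPTO example can be reproduced with a short sketch. It assumes the text and key contain only the uppercase letters A to Z; the function name `vigenere` is chosen here for illustration.

```python
from itertools import cycle

A = ord("A")

def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Shift each letter of `text` by the matching letter of the repeating `key`."""
    sign = -1 if decrypt else 1
    out = []
    for ch, k in zip(text, cycle(key)):
        shift = sign * (ord(k) - A)          # key letter A=0, B=1, ..., Z=25
        out.append(chr((ord(ch) - A + shift) % 26 + A))
    return "".join(out)

ciphertext = vigenere("THISISAGREATDAY", "CRYPTO")
print(ciphertext)                                    # VYGHBGCXPTTHFRW
print(vigenere(ciphertext, "CRYPTO", decrypt=True))  # THISISAGREATDAY
```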

Frequency analysis

Every language has a characteristic frequency at which its letters occur. When every letter is shifted by the same amount, as in a single Caesar cipher, this distribution is preserved in the ciphertext, which makes it easy to deduce the mapping. In English, the most frequent letter is ‘E’.
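As a small illustration, the sketch below counts letter frequencies and shows that a single Caesar shift only relabels the most frequent letter. The sample sentence and the helper names `letter_frequencies` and `caesar` are made up for this example.

```python
from collections import Counter

def letter_frequencies(text: str) -> list[tuple[str, float]]:
    """Return (letter, relative frequency) pairs, most frequent first."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    counts = Counter(letters)
    return [(letter, n / len(letters)) for letter, n in counts.most_common()]

def caesar(text: str, shift: int) -> str:
    """Shift every letter A-Z by the same amount; leave other characters alone."""
    return "".join(
        chr((ord(c) - 65 + shift) % 26 + 65) if "A" <= c <= "Z" else c
        for c in text.upper()
    )

sample = "MEET ME NEAR THE GREEN TREE"
print(letter_frequencies(sample)[0][0])             # E
print(letter_frequencies(caesar(sample, 3))[0][0])  # H, i.e. E shifted by 3
```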

Breaking the Vigenère cipher

Knowing the key length

If you know the key length, you can split the ciphertext into groups where every letter in a group was encrypted using the same character of the key. For example, if the key length is 3, the first, fourth, seventh, etc., characters were all encrypted by the first key character, the second, fifth, eighth, etc., by the second key character, and so on.
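The grouping can be sketched with slicing. The function name `split_by_key_position` is an illustrative choice, and the input is assumed to contain letters only (no spaces or punctuation).

```python
def split_by_key_position(ciphertext: str, key_length: int) -> list[str]:
    """Group i holds every letter encrypted with key character i."""
    return [ciphertext[i::key_length] for i in range(key_length)]

# With key length 3: positions 1, 4, 7 form the first group, and so on.
print(split_by_key_position("ABCDEFGHI", 3))  # ['ADG', 'BEH', 'CFI']
```

Each group is then effectively a Caesar cipher with a single unknown shift.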

Repetition of encryption

The attack relies on the key repeating. Since each key character repeats every few positions (based on the key length), plaintext letters at regular intervals are encrypted by the same key character. This creates a pattern that can be exploited.

Frequency analysis

In English, the most common letter is “E”. By examining the distribution of letters in one group of ciphertext (with enough letters to be statistically meaningful), you can guess which ciphertext letter corresponds to “E” in that group. For example, if the letter “G” occurs the most in a group, you might guess that it corresponds to “E” in the plaintext. Modular arithmetic then recovers the key character for that group: G - E = 6 - 4 = 2 (mod 26), which is the letter C. If the guess produces implausible plaintext, try the next most frequent letters instead, and repeat the process for every group.
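The modular-arithmetic step can be sketched as follows. It assumes the most frequent letter in the group stands for “E”, which is only a guess and can fail on short groups; `guess_key_letter` and the sample group are made up for illustration.

```python
from collections import Counter

def guess_key_letter(group: str, assumed_plain: str = "E") -> str:
    """Guess the key letter for one group, assuming its most frequent
    ciphertext letter is the encryption of `assumed_plain`."""
    most_common, _ = Counter(group).most_common(1)[0]
    shift = (ord(most_common) - ord(assumed_plain)) % 26
    return chr(shift + ord("A"))

# "G" is the most frequent letter here, so the shift is G - E = 2, i.e. "C".
print(guess_key_letter("GXGQGAG"))  # C
```

Calling the function again with a different `assumed_plain` (for example "T" or "A") tries the next candidate when the first guess does not yield readable text.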

Not knowing the key length

Kasiski examination

Kasiski examination is a method used to break the Vigenère cipher by determining the key length.

Consider the example below, where the key is KEY and the plaintext is THE BELOW IS AN EXAMPLE FOR KASISKI TEST. THE TEXT IS ENCRYPTED UTILISING THE VIGENERE CIPHER.

  1. Identify the repeating sequence

    • In the Vigenère cipher, the same plaintext sequence can be encrypted by the same key characters more than once, especially if the key is short relative to the plaintext. Natural language has many repetitive patterns (e.g., the word “the” appears very often in English), so with a short key there will be occasions where the same plaintext is encrypted by the same part of the key, producing identical ciphertext.
    • In the example, the repeated plaintext sequence THE appears each time in the ciphertext as DLC (T, H and E shifted by K, E and Y).
  2. Measure distance between repeats

    • The next step is to identify the distances between the repeating sequences in the ciphertext. Each distance is measured in characters, from the start of one occurrence to the start of the next.
    • In the example, the distance between the 1st and the 2nd repetition is 33 letters; the distance between the 2nd and the 3rd repetition is 27.
  3. Common divisors

    • The assumption is that the repeating sequences (THE) were encrypted with the same characters of the key (KEY), so the distance between the repeated sequences is likely a multiple of the key length. By calculating the Greatest Common Divisor (GCD) of these distances, you can often determine the key length.

    • In the example,

      • The distance between the 1st and the 2nd repetition is 33. Therefore, the key size is likely one of 1, 3, 11 and 33, which are the factors of 33.
      • The distance between the 2nd and the 3rd repetition is 27. Therefore, the key size is likely one of 1, 3, 9 and 27, which are the factors of 27.
    • The common factors are 1 and 3 (the GCD of 33 and 27 is 3). We can ignore 1, since a key of length 1 is just a Caesar cipher and very weak. So the likely key size is 3.

    • Once you know the key size, you can conduct frequency analysis as described above.
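The three steps can be checked against the example with a minimal sketch. It assumes the plaintext is written without spaces or punctuation and uses only A to Z; the helper names `encrypt` and `repeated_sequences` are illustrative.

```python
from collections import defaultdict
from functools import reduce
from itertools import cycle
from math import gcd

def encrypt(plaintext: str, key: str) -> str:
    """Vigenère-encrypt uppercase A-Z text with a repeating key."""
    return "".join(
        chr((ord(p) - 65 + ord(k) - 65) % 26 + 65)
        for p, k in zip(plaintext, cycle(key))
    )

def repeated_sequences(ciphertext: str, length: int = 3) -> dict:
    """Map each sequence of `length` letters that occurs more than once
    to the list of positions where it starts."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    return {seq: pos for seq, pos in positions.items() if len(pos) > 1}

plaintext = ("THEBELOWISANEXAMPLEFORKASISKITEST"
             "THETEXTISENCRYPTEDUTILISINGTHEVIGENERECIPHER")
ciphertext = encrypt(plaintext, "KEY")

# Step 1: the repeated sequence DLC (the encryption of THE).
positions = repeated_sequences(ciphertext)["DLC"]
# Step 2: distances between consecutive occurrences.
distances = [b - a for a, b in zip(positions, positions[1:])]
# Step 3: the GCD of the distances suggests the key length.
print(positions)               # [0, 33, 60]
print(distances)               # [33, 27]
print(reduce(gcd, distances))  # 3
```

In a real attack the repeated sequence is not known in advance, so every entry returned by `repeated_sequences` is a candidate, and repeats that occur by coincidence may need to be discarded before taking the GCD.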

