Computer scienceAlgorithms and Data StructuresAlgorithmsString coding algorithmsCryptographic coding

Vigenère cipher

9 minutes read

Imagine the following situation: your friend and you are spies and have to exchange some secret messages. Obviously, you don't want other people to read these messages. Therefore, you decide to use a cipher, that may help you to protect the information. Previously you've already used the Caesar cipher, but it was broken, so now you are looking for something new. Why don't you use the Vigenère cipher for this purpose?

Algorithm

Basically, in this algorithm, each letter of the message is encoded with the Caesar cipher with a different shift. But how is the shift determined?

This is the most interesting part of the algorithm: in order to encode the message, a key is used. You can use any word or even a phrase as a key – whatever you and the other people, using the cipher, agree on. The steps of the algorithm are the following:

  1. (Optional) Get rid of the spaces and punctuation in the initial message.

  2. Rewrite the key as many times, as it is needed to reach the length of the message.

  3. Now the shift for each letter of the message can be found as the position of the corresponding letter of the key in the alphabet. Note that counting starts from zero. For example, the letter to encode is "B" and the corresponding letter of the key is "D". In this case shift equals 33 (0A,1B,2C,3D0 - A, 1 - B, 2 -C, 3-D).

  4. After figuring out the shift, you encode the letter using the Caesar cipher: you move the position of the letter forward shiftshift times. Returning to the example, "B" will be encoded into the letter "E".

  5. (Optional) Put the spaces and the punctuation back.

Now let's move on to decoding the messages. After you've learned how to encode the messages, it is quite easy: the only difference from encoding is that in step 4 of the algorithm, you decode the letter and not encode it. This means, that you need to move the position of the letter backward and not forward. For example, if the letter of the encoded message is "A" and the corresponding letter of the key is "E", you move the position of "A" backward 4 times and get the letter "W".

That's it! Let's get down to the example now, in order to make sure that the algorithm is absolutely clear.

Example

So, let's encode the text "Computer Science", using the key "Hyperskill". First of all, you need to repeat the key, so that its length reaches one of the messages:

A table with the initial phrase and the repeated key

Now you can start encoding. The first letter is "C" with position 22 and the corresponding letter of the key is "H", which is the seventh letter in the alphabet. Therefore, the shift equals 77. The position of the encoded letter is the following:

new_pos=(pos(C)+pos(H)) mod n=(2+7) mod 26=9new\_pos = (pos(C) + pos(H)) \ mod \ n = (2 + 7)\ mod\ 26 = 9

nn in the formula stands for the length of the alphabet and pos(letter)pos(letter) determines the position of the letter in the alphabet. The letter in the ninth position is "J", which is why "C" gets encoded into "J".

The next letter is "O". Of course, you could repeat all of the steps that we took with the previous letter. However, it's much easier to encode the message with the help of a special table.

A table that is used to encode and decode the letters

It's used in the following way: you look for a letter that stands in a position with a letter from the message in a row and a letter from the key in a column. Back to our example, you take a row with the letter "O" in it and a column with "Y". As you can see, the letter "O" is encoded into "M".

You can make the process even easier by using the code in order to solve the task. Here is how the solution could look in pseudocode:

class VigenereCipher is 
  method encodeMessage(message: String, key: String, n: Integer) is // n is the length of the alphabet
    encoded = "" // a string variable for storing the encoded message 
    keyRepeated = key 

    while (length(keyRepeated) < length(message)) do 
      keyRepeated += key 

    for i in length(message) do 
      newPosition = position(message[i]) + position(keyRepeated[i]) // a position of a letter after it was encoded 
      encoded += char(newPosition % n) // adding an encoded letter to the resulting variable 
    
    return encoded 

You can choose either way to encode the rest of the text. After you do it, you will get the following result. The number above each letter is its position in the alphabet.

A table with the initial message, the repeated key and the encoded message with the positions above each letter

It is no less important to practice decoding the message. Let's decode the text "Cgviewbm" with the key "Hyperskill". The position of the letter "H" is 77, therefore we move the position of the letter "C" (which is 2) backward 7 times. You can also use the formula:

decoded_pos=(n+pos(C)pos(H)) mod n=(26+27) mod 26=21decoded\_pos = (n+pos(C)-pos(H))\ mod\ n = (26+2-7)\ mod\ 26=21

nn in the formula stands for the length of the alphabet and pos(letter)pos(letter) determines the position of the letter in the alphabet. The letter with position 2121 is "V", therefore "C" gets decoded into "V".

In order to try out all of the methods, let's decode the second letter using the table. To do it you need to find a column with the letter "Y" and look for a cell with "G" in it. The letter that stands in the row – "I" – is the one that should be put into the decoded message.

However, it may take a lot of time to decode the message even with the use of the table. Let's look at the pseudocode that may help us make the process easier and much faster:

class VigenereCipher is 
  method decodeMessage(message: String, key: String, n: Integer) is // n is the length of the alphabet 
    decoded = ""  // string variable for storing the decoded message 
    keyRepeated = key 

    while (length(keyRepeated) < length(message)) do 
      keyRepeated += key 

    for i in length(message) do 
      oldPosition = n + position(message[i]) - position(keyRepeated[i])  // the position of a letter before it was encoded 
      decoded += char(oldPosition % n)  // adding a decoded letter to the resulting variable 
    
    return decoded 

After you decode the message you will get the following result. The number above each letter is its position in the alphabet.

A table with the encoded message, the repeated key and the decoded message with the positions above each letter

Conclusion

To sum up, let's go through the main points about the algorithm once again:

  • It is based on the Caesar algorithm.

  • It is harder to break that the Caesar algorithm.

  • To encode and decode the information, you need to know the key, which may be a word or even a phrase.

  • For each letter of the initial text, you get the corresponding letter of the key, which position is used as a shift for encoding or decoding the letter.

  • A special table may be used for encoding and decoding the messages.

6 learners liked this piece of theory. 0 didn't like it. What about you?
Report a typo