VIC cipher

A flow diagram of the VIC Cipher

The VIC cipher was a pencil and paper cipher used by the Soviet spy Reino Häyhänen, codenamed "VICTOR".

If the cipher were to be given a modern technical name, it would be known as a "straddling bipartite monoalphabetic substitution superenciphered by modified double transposition."[1] However, by general classification it is part of the Nihilist family of ciphers.

It was arguably the most complex hand-operated cipher ever seen, when it was first discovered. The initial analysis done by the American National Security Agency (NSA) in 1953 did not absolutely conclude that it was a hand cipher, but its placement in a hollowed out 5¢ coin (later known as the Hollow Nickel Case) implied it could be decoded using pencil and paper. The VIC cipher remained unbroken until more information about its structure was available.

Although certainly not as complex or secure as modern computer operated stream ciphers or block ciphers, in practice messages protected by it resisted all attempts at cryptanalysis by at least the NSA from its discovery in 1953 until Häyhänen's defection in 1957.

A revolutionary leap

The VIC cipher can be regarded as the evolutionary pinnacle of the Nihilist cipher family.

The VIC cipher has several important integrated components, including mod 10 chain addition, a lagged Fibonacci generator (a recursive formula used to generate a sequence of pseudorandom digits), a straddling checkerboard, and a disrupted double transposition.

Until the discovery of VIC, it was generally thought that a double transposition alone was the most complex cipher an agent, as a practical matter, could use as a field cipher.

History

During World War II, several Soviet spy rings communicated to Moscow Centre using two ciphers which are essentially evolutionary improvements on the basic Nihilist cipher. A very strong version was used by Max Clausen in Richard Sorge's network in Japan, and by Alexander Foote in the Lucy spy ring in Switzerland.[2] A slightly weaker version was used by the Rote Kapelle network.[3]

In both versions, the plaintext was first converted to digits by use of a straddling checkerboard rather than a Polybius square. This has the advantage of slightly compressing the plaintext, thus raising its unicity distance and also allowing radio operators to complete their transmissions quicker and shut down sooner. Shutting down sooner reduces the risk of the operator being found by enemy radio direction finders. Increasing the unicity distance increases strength against statistical attacks.

Clausen and Foote both wrote their plaintext in English, and memorized the 8 most frequent letters of English (to fill the top row of the checkerboard) through the mnemonic phrase "a sin to err" (dropping the second "r"). The standard English straddling checkerboard has 28 character slots and in this cipher the extra two became "full stop" and "numbers shift". Numbers were sent by a numbers shift, followed by the actual plaintext digits in repeated pairs, followed by another shift. Then, similarly to the basic Nihilist, a digital additive was added in, which was called "closing". However a different additive was used each time, so finally a concealed "indicator group" had to be inserted to indicate what additive was used.

Unlike basic Nihilist, the additive was added by non-carrying addition (digit-wise addition modulo 10), thus producing a more uniform output which doesn't leak as much information. More importantly, the additive was generated not through a keyword, but by selecting lines at random from almanacs of industrial statistics. Such books were deemed dull enough to not arouse suspicion if an agent was searched (particularly as the agents' cover stories were as businessmen), and to have such high entropy density as to provide a very secure additive. Of course the figures from such a book are not actually uniformly distributed (there is an excess of "0" and "1" (see Benford's Law), and sequential numbers are likely to be somewhat similar), but nevertheless they have much higher entropy density than passphrases and the like; at any rate, in practice they seem never to have been successfully cryptanalysed.

The weaker version generated the additive from the text of a novel or similar book (at least one Rote Kapelle member actually used The Good Soldier Schweik). This text was converted to a digital additive using a technique similar to a straddling checkerboard.

The ultimate development along these lines was the VIC cipher, used in the 1950s by Reino Häyhänen. By this time, most Soviet agents were instead using one-time pads. However, despite the theoretical perfection of the one-time pad, in practice they were broken, while VIC was not. The one-time cipher could however only be broken when cipher pages were re-used, due to logistic problems, and therefore was no longer truly one-time.[4]

Mechanics overview

The secret key for the encryption is the following:

  • A short Phrase (e.g. the first line of a song, at least 20 letters long)
  • A Date (written numerically, with no leading zeroes)
  • A Personal Number (unique to agent, a 1 or 2 digit number)

The encryption was also aided by the adversary not knowing a 5-digit Keygroup which was unique to each message. The Keygroup was not strictly a 'secret', (as it was embedded in-clear in the ciphertext), but it was at a location in the ciphertext that was not known to an adversary.

The cipher broadly worked as follows:

  1. Use the secrets above (Phrase, Date, Keygroup and Personal Number) create a 50 digit block of pseudo random-numbers
  2. Use this block to create the message keys for:
    1. A Straddling Checkerboard
    2. Two Columnar transpositions
  3. Encrypt the Plaintext message via the straddling checkerboard
  4. Apply two transpositions to the resultant (intermediary) ciphertext through two columnar
    1. A 'Standard' Columnar Transposition
    2. A Diagonal Columnar Transposition
  5. Insertion of the Keygroup into the ciphertext - as determined by the sixth digit of the Date

Detailed mechanics

Note: this section tracks the calculations by referring to [Line-X] or similar. This is to align with the notation stated in the CIA archive description.[1]

Pseudorandom block derivation

  • [Line-A]: Generate a random 5-digit Keygroup
  • [Line-B]: Write the first 5 digits of the secret Date
  • [Line-C]: Subtract [Line-B] from [Line-A] by modular arithmetic (digit-by-digit, not 'borrowing' any tens from a neighboring column)
  • [Line-D]: Write out the first 20 letters from the secret Phrase
  • [Line-E.1&2]: Sequence (see below) the first and second ten characters separately (to get [Line-E.1] & [Line-E.2] respectively)
  • [Line-F.1]: Write out the 5-Digits from [Line-C], then apply Chain Addition (see below) applied to create five more digits
  • [Line-F.2]: The digit sequence '1234567890' is written out (under [Line-E.2]) as an aide for encoding when creating [Line-H]
  • [Line-G]: Addition of [Line-E.1] to [Line-F.1] - this is digit-by-digit by mod-10 arithmetic, i.e. no 'carrying' over tens to the next column
  • [Line-H]: Encoding (see below) of the digits in [Line-G] under [Line-E.2] as the key
  • [Line-I]: No [Line-I] used, presumably to avoid confusion (as 'I' may be misread as a '1' or 'J')
  • [Line-J]: The Sequencing of [Line-H]
  • [Lines-K,L,M,N,P]: These are five 10-digit lines created by chain addition of [Line-H]. The last two non-equal digits are added to the agent's personal number to determine the key length of the 2 transpositions. (Lines K-to-P are in-effect a key-driven pseudo-random block used for the next stage of encryption)
  • [Line-O]: No [Line-O] used, presumably to avoid confusion (as 'O' may be misread as a zero or 'Q')

Message key derivation

  • [Line-Q]: The first 'a' digits (where 'a' is the first value resulting from the addition of the last non-equal digits in [Line-P] to the Personal Number) extracted from [Lines-K,L,M,N,P] when transposed via [Line-J]. This transposition is done by writing out the columns of numbers underneath the digits in [Line-J], beginning with the column of numbers underneath the digit ‘1', then the column beneath the digit ‘2’, and so on in numerical order, until 'a' digits have been written out (in the example below, the value of 'a' would be 13, so this transposition would carry on for 13 digits). These digits are used as the key to the Columnar Transposition.
  • [Line-R]: The next 'b' digits (where 'b' is the second value resulting from the addition of the last non-equal digits in [Line-P] to the Personal Number) extracted (after the 'a' digits have been extracted) from [Lines-K,L,M,N,P] when transposed via [Line-J]. This transposition is done in the same manner as [Line-Q], but is carried on from the previous transposition, meaning it may start mid-column under the digit of [Line-J] where the last transposition left off (in the example below, [Line-R] starts mid-way through the column underneath the digit '3' in [Line-J]). These resulting digits are used to key the Diagonal Transposition.
  • [Line-S]: The Sequencing of [Line-P], this is used as the key to the Straddling Checkerboard.

Example of key generation

Personal Number: 6
Date: 13 Sept 1959                          // Moon Landing - 13 Sept 1959 ('139195' - truncated to 6 digits) 
Phrase: 'Twas the night before Christmas'   // from 'A visit from St. Nicholas' - poem
Keygroup: 72401                             // randomly generated

[Line-A]: 72401                             // Keygroup
[Line-B]: 13919                             // Date - truncated to 5 digits     
[Line-C]: 69592                             // subtract [Line-B] from [Line-A]
[Line-D]: TWASTHENIG HTBEFORECH             // Phrase - truncated to 20 characters
[Line-E]: 8017942653 6013589427             // via Sequencing
[Line-F]: 6959254417 1234567890             // from [Line-C] and chain addition, then '1234567890'
[Line-G]: 4966196060                        // add [Line-E.1] to [Line-F.1]
[Line-H]: 3288628787                        // encode [Line-G] with [Line-E.2], [Line-F.2] helps 
[Line-J]: 3178429506                        // The Sequencing of [Line-H]
[Line-K]: 5064805552                        // BLOCK: Chain addition of [Line-H] for 50 digits
[Line-L]: 5602850077
[Line-M]: 1620350748
[Line-N]: 7823857125
[Line-P]: 5051328370

Last two non-equal digits are '7' and '0', added to Personal Number (6) means that the permutation keys are 13 and 6 digits long.
 
[Line-Q]: 0668005552551     // first 13 digits from block
[Line-R]: 758838            // next 6 digits from block
[Line-S]: 5961328470        // Sequencing of [Line-P]

Message encryption

Straddling checkerboard

Once the key has been generated, the first stage of actually encrypting the Message is to convert it to a series of digits via a straddling checkerboard. The key (header row) for the checkerboard is based on [Line-S]. Then a pre-agreed series of common letters used on the second row. The example below uses the English mnemonic 'AT ONE SIR'; however, the Cyrillic mnemonic used by Hayhanen was 'snegopad', the Russian word for snowfall.

The remaining cells are filled in, with the rest of the alphabet filled in in order, followed by . (fullstop) and / (number-shift symbol, indicating that the following symbols until the next number-shift are numbers; see example below).

  5 9 6 1 3 2 8 4 7 0
  A T O N E S I R
6 B C D F G H J K L M
8 P Q U V W X Y Z . /

An example encoding is below, where numbers :

MESSAGE: 'Attack at dawn. By dawn I mean 0500. Not 0915 like you did last time.'

Encoding of individual symbols:
 A  T  T  A  C  K  A  T  D  A  W  N  (...)      0   9   1   5
 5  9  9  5 69 64  5  9 66  5 83  3        80 000 999 111 555 80

Final code:
 59956 96459 66583 38765 88665 83376 02538 00005 
 55000 00080 87319 80000 99911 15558 06776 42881
 86667 66675 49976 0287-

Transpositions: columnar transposition

The message is transposed via standard columnar transposition keyed by [Line-Q] above. (Note: if the message encoded length is not a multiple of 5 at this stage, an additional digit is added)

The message is then transposed via Diagonal Transposition keyed by [Line-R] above. The meaning of 'diagonal' in this case is that triangular disruption zones were employed to make the transposition irregular.

Keygroup insertion

The (unencrypted) Keygroup is inserted into the ciphertext 'P' groups from the end; where 'P' is the unused sixth digit of the Date.

Modular addition/subtraction

Modular addition or subtraction, also known as 'false adding/subtraction', in this context (and many pen and paper ciphers) is digit-by-digit addition and subtraction without 'carrying' or 'borrowing'. For example:

  • 1234 + 6789 = 7913
  • 1234 - 6789 = 5555
Sequencing

Sequencing in this context is ordering the elements of an input from 1-10 (where '0' represents 10). This occurs either to letters (whereby alphabetical order is used), or numbers (where numerical value is used). In the event of equal values, then the leftmost value is sequenced first. For example:

  • LETTERS: The word 'Octopus' is sequenced as '2163475' - (i.e. C=1, first 'O'=2, second 'O'=3, ...)
  • NUMBERS: The number '90210' is sequenced as '34215' - (by numerical order. Zero is valued at '10' in terms of ordering)
Chain addition

Chain addition is akin to a linear-feedback shift register, whereby a stream of number is generated as an output (and fed back in as an input) to a seed number. Within the VIC Cipher chain addition works by (1) taking the original (seed) number, (2) false-adding the first two digits, (3) putting this new number at the end of the chain. This continues, however the digits being added are incremented by one. For example, if the seed was '90210', the first 5 iterations are shown below:

90210           // Initial seed value
90210 9         // 9 = 9+0 (first two digits)
90210 92        // 2 = 0+2 (next two...)
90210 923       // 3 = 2+1
90210 9231      // 1 = 1+0
90210 92319     // 9 = 0+9; note how the first '9' generated is being fed back in
Digit encoding

The encoding step replaces each digit in a number (i.e. [Line-G] in the cipher) with one from a key sequence (i.e. [Line-E.2]) that represents its position in the 1-10 ordering. It should be seen that by writing out the series '1234567890' (shown as [Line-F.2]) underneath [Line.E.2] each value from 0-9 has another above it. Simply replace every digit in the number to be encoded with the one above it in the key sequence.

Key (Line E.2) 6 0 1 3 5 8 9 4 2 7
Aide (Line F.2) 1 2 3 4 5 6 7 8 9 0

For example, the number '90210' would have encodings as follows; .

So the output would be: '27067'.

Decryption

Decryption of the VIC Cipher is as follows:

  • Extract the Keygroup - By knowledge of the sixth digit of the secret Date, remove the 5 digits of the Keygroup from the ciphertext
  • Generate the Message Keys - By using the knowledge of the various secrets (Phrase, Date, Personal Number, Keygroup) generate the keys in the same manner as the encryption process
  • Decrypt the Ciphertext - By using knowledge of the Message Keys for the transpositions and straddling checkerboard decrypt them

Cryptanalysis

The cipher is one of the strongest pen and paper ciphers actually used in the real world, and was not broken (in terms of determining the underlying algorithm) by the NSA at the time.[1] However, with the advent of modern computing, and public disclosure of the algorithm this would not be considered a strong cipher. It can be observed that the majority of the entropy in the secret key converges to a 10-digit number [Line-H]. This 10-digit number is approximately 34 bits of entropy, combined with the last digit of the date (needed to identify where the KeyGroup is) would make about 38 bits of entropy in terms of Message Key strength. 38 bits is subject to a Brute-force attack within less than a day on modern computers.

See also

References

  1. ^ a b c David Kahn. "Number One From Moscow". 1993.
  2. ^ Kahn, David (1996). The Codebreakers. Scribner. p. 650.
  3. ^ Kahn, David (1996). The Codebreakers. Scribner. p. 652.
  4. ^ "VENONA: An Overview" (PDF). www.nsa.gov. Archived from the original (PDF) on 4 March 2016. Retrieved 12 January 2022.