Enigma and a way to its decryption

On Enigma and a Method for its Decryption

Abstract:

The present paper investigates and describes the German mechanical cipher machine called Enigma, and presents a possible way to its decryption. During World War II, the Germans used a typewriter-like machine named Enigma to encrypt military messages. We make an attempt at a historical introduction, and non-technical background information is provided. This paper includes a C program that simulates Enigma, and a method to break its code. The latter is achieved by a simulation of a simplified Bombe, a mechanical device that was actually used during WWII at Bletcheley Park, a site in England where a major deal of the work on breaking Enigma's code was done. Much of the credit for breaking the code must be given to a group of accomplished Polish mathematicians, and the famous, and in the computer community well known mathematician Alan Turing.

Keywords: Enigma, Bombe, Cryptography

Some History:

The principle behind Enigma is rather old. Mechanical ciphering devices based on rings and cylinders have been described as early as in the 4th century B.C. The Roman Aeneas Tacitus talked about a cipher disk and its usage [6]. Enigma was patented by Arthur Scherbius in 1918, an inventor who thought of it as a ciphering device for businesses that needed to communicate confidential documents. The German military caught interest, and commercial production was stopped in 1923 [6]. The initial idea was pretty much based on the well known Caesar cipher [7], a mono-alphabetic substitution. Each letter is substituted for example by its n-th neighbor in the alphabet.

ABCDEF...
CDEFGH...

DEAF becomes FGCH in this example where each letter is replaced by its n+2 neighbor (wrap around after Z!). The code however is easy to break. Scherbius came up with a better idea, which led to the use of a poly-alphabetic scheme. Each time a letter is encoded, also the number n changes. For example,

ABCDEF... first letter to encode
CDEFGH...

ABCDEF... second letter to encode
DEFGH I...

ABCDEF... third letter to encode
EFGH I J...

ABCDEF... fourth letter to encode
FGH I J K...

Thus, DEAF becomes FHEK. Although both encodings are similarly far from the original, the latter technique enhances the security of the code since it does alter the encoding scheme.

The following image shows a sketch of the interior design of Enigma [1].

The main units consist of a keyboard, the scrambler unit and the lamp board. The encoding is done in the scrambler unit. It holds a number of rotors with 26 contacts (A-Z) left and right of the rotor. Each left contact is connected to a contact on the right by an internal wiring scheme. Rotors are connected with each other by sliding contacts. A reflecting rotor mirrors the connection backwards. Depending on the relative position of the rotors, a current flows on a certain path from right to left through all the rotors, is reflected and passed back to the right side. The entry point corresponds to a letter in plain text, the exit point accordingly to a letter in cipher text. By using the reflecting rotor, the handling of the machine is simplified. If A is encoded to say E, then the reverse is also true. Therefore the same machine can be used for encoding and decoding without any rewiring necessary [1].
The message is typed in like on a typewriter. Each time a key is hit, current flows through the contacts in the scrambler unit and illuminates a lamp with the enciphered letter showing. After a key was pressed, the rightmost rotor changes its position by a 1/26 rotation. Like in an automobile's odometer, after a full rotation of the rightmost wheel, also the middle wheel changes by a 1/26 rotation, and accordingly the leftmost after the middle wheel has made one full rotation. Little notches on the rotor's sides take care of that. To decrypt a message, one needs not only an Enigma machine, but also the knowledge of the starting state, i.e. at which positions the wheels were when the text was typed in. To decrypt the message, the machine must be set to the same starting state, and the cipher text is entered. Output is the plain text.
Given the current and original Enigma setup, we yield 26x26x26 = 17,576 possible starting states. To enhance the security of Enigma, five rotors with different internal wirings were in use. Any three of these could be used in the machine in any order. We arrive at 17,576x60 = 1,054,560 possibilities. The rotors had letters on their outer rim. This rim was moveable, giving the ring positions. Basically, the rings allow shifting the relative position of letters shown to internal core wiring of the rotor. Exactly the same position of the wheel can now show a different letter. A further gain of security is added by implementing the so called Steckerboard. Before the letters get to the scrambler unit, they may be swapped by cross-connecting them. Connecting A with say E by a plug swaps A with E and vice versa. Using a number of plugs around 5 or 6 (maximun 13) raises the possible states by a factor of 2-3 billion. With the methods given at these days, i.e. trying keys manually, Scherbius calculated that 1000 operators would have to sit several million years to try all starting states. This led the German officials to the claim that their codes were unbreakable and safe [6].
To summarize, Enigma's security relies on the rotor settings, the ring settings, and the Steckerboard connections. For completeness, the following image shows a real Enigma.

Usage:

Slightly different Enigmas were in use. The German Navy used a somewhat more sophisticated model then the Army. A model with four rotors was developed, features to enhance security were added during the life of Enigma, and the way how to use it was altered frequently [1]. Some basic guidlines and features stayed mostly the same though.
The basic settings were written down in a code book, and changed daily. Basic settings are the rotors in use and their arrangement, the ring settings, and the Stecker connections. The user picked a random starting position for the wheels. It was transmitted as part of the message. Messages had a header consisting of sender, time, date and a code for recipient. Possible recipients were Army, Navy etc. Using codes made clear if a message needed to be decoded by the interceptor at all, i.e. if it was of any concern to him. The header was transmitted en clair. The sender then picked another random starting position for the wheels, encoded it, and included it twice after the message header. Therefore the two triples of letters after the header gave the recipient, when decoded according to the code book, the actual position of the wheels to which Enigma needed to be set for decoding the text. After that, the message body was appended. Messages were supposed to be shorter than 200 characters. The longer a message, the more likely it was for it to be broken. If needed, messages were split into pieces. Each piece was then encoded with a different key. There were certain conventions how to separate parts, words, make paragraphs and so on. The letter combination CH is very frequent in German, similar to TH in English, so it was abbreviated as Q. Words were separated with X. Numbers had to be spelled out and so forth [6].
Decoding of a message made necessary the possession of an Enigma machine, the starting state, and the knowledge of how to use the machine.

Breaking the Code:

With all its perfection, especially for the time, there were also flaws in Enigma. The human factor, and not to underestimate the belief in its being unbreakable eventually led to the code being cracked. Untrained and lazy radio staff used bad keys like keyboard diagonals and abbreviated swear words [6]. The whole concept of transmission with the doubly encoded starting position of the rotors had weak points. Knowing this gave the code breakers an entry point into Enigma's big secret. Many messages started in a similar way, say 'ANXGENERAL...'. 'AN' is German for 'TO', and 'X' is a word separator. Knowing parts of a message in plain text and cipher text was called a crib. Cribs are starting points for the attack on breaking the codes.
First and successful attempts at Enigma were being done by the Poles. After WWI, Poland kept an eye and ear on Germany by intercepting radio messages from the German Navy which was active around Polish shores. In 1926, when suddenly those messages became no longer understandable, it was clear that Germany had started to employ some ciphering scheme [6]. There is some myth aroud how Poland found out about what was behind these codes. It was likely a combination of espionage and simply luck how they found out about the electro-mechanical ciphering machine Enigma. It is said that some factory workers and even money needing German officials sold their knowledge. Another incident might be when Germany sent an Enigma accidentally by regular mail, and the Polish secret service had time over a weekend to inspect the machine. It is however without a doubt majorly the achievement of a number of talented young Polish mathematicians to aquire the deeper knowledge on how to use Enigma [5]. The commercial Enigma was not of too much help, but gave some clues on how the machine supposedly worked.
Poland may have bought some messages in plain text and cipher text, but they knew little about the rotor wiring at first, and without starting keys, no message could be deciphered. The mathematician Rejevski came up with a method to find the rotor wiring using linear algebra. All these little clues together gave them enough knowledge to build models of Enigma, and finally a replica. Furthermore, knowing that the first six letters were two equal triples of letters when decrypted helped him develop a scheme, that when analyzing many messages of the same day allowed coming up with the initial settings. It involved the usage of socalled Zygalski sheets. Perforated paper sheets were arranged on an illuminated table and revealed possible starting states. These starting states had then to be tried on an Enigma replica [1][6].
Over the years, the rotor wiring and the usage of Enigma was altered. This made some of the Polish developments obsolete, and they basically had to start all over again. The dual encoded message key at the beginning of a message was changed to singly transmitted key at some point shortly before the war.
A more sophisticated technique for code breaking was given by an electro-mechanical device named Bomby. Having a crib, i.e. a part of a message in plain text an code as given by the frequently similar message starts we can make certain deductions. There will only be a limited number of starting states that allows a crib. Too many for a human to try manually, but feasible for a machine. The machine steps through all possible starting states and stops when a match for the given crib is found. For three rotors, six Bombys were needed, corresponding to the six permutations [6].
In 1938, two more rotors were added. Three of which were in use at a time. The resulting 60 permutations would have required 60 Bombys. Poland didn't have the monetary ressources at that time to produce that many of their machines. When Poland was attacked by Germany, the heads of the code cracker team fled the country, and all their knowledge went to France and England. Some of the Polish mathematicians cooperated with especially the English in refining the strategies that finally led to cracking the code of Enigma.

Bletchley Park and the Bombe:

After the Polish had done that much quality work, England could take a head start on solving the riddle. The Government Communication Headquarters were located in Bletchley Park about 40 miles north of London, and employed about 10,000 people at the end of the war. The most famous among them were Alan Turing and Gordon Welchman, two mathematicians.
Benefiting from the knowledge aquired by the Polish, England's work concentrated on refining the Bomby into the Bombe. It is said that the name stems from the clicking sound that the Bombe made when working on a crib, much like a timer bomb fuse. The Bombe in principle also iterates through the possible starting states, giving possible solutions for a crib. They achieved however a more efficient solution that allowed leaving out some states, thus speeding up the process.
The Bombe is described in some detail in [3]. The first Bombes relied on circular loops in the cribs, as in the following example where in position 1, E is encoded to X, then in position 4 X to W, in position 8 W to V, and in position 10 V back to E.
E..X...W.V...
X..W...V. E...
A collection of scrambler units worked in parallel on the parts of the crib. We would have four units for our particular crib. The scrambler units are set to appropriate starting states. Say the first to AAA, the second to AAD, the third to AAH, and the fourth to AAJ, corresponding to the position of the letter pairs in the crib (1, 4, 8, 10). The scramblers work on the crib in parallel, and move to the next starting states (AAB, AAE, AAI, AAK) if the present state is not good, i.e. does not allow the crib. Cascading the units speeds up the decoding process. A test register is responsible for revealing possible Stecker settings. We might set the reference wire on the test register to X. If we get an output X for a valid setup, then there was no Stecker substitution for X. If the output however was another letter Y, then there was a possible substitution X with Y and vice versa. Detecting a valid setup was known as a drop. The bombe stopped, an operator noted the possible setup, and moved the machine on. Not every drop however revealed a valid setup. Some drops were always false solutions.
The problem here is to find cribs of sufficient length that included loops. By adding a feature called diagonal board, this problem could be solved by creating artificial loops by an additional scrambler unit. Also some of the false drops could be removed by the diagonal board. This feature was introduced by Gordon Welchman.

Do it yourself:

The earlier described Bomby is basically implemented here. Code for an Enigma simulation is incorporated into a program of two parts. The first part is simply an Enigma that allows the encryption of a text. Starting states may be chosen freely. Five wheels are available, and a Steckerboard may be used. The second part needs as input an encoded text and a number of characters known to be the plain text, i.e. a crib. The program then iterates through all possible starting states, and announces possible setups. Stecker substitutions up to two letter pairs are also detected by forming all possible strings of length four consisting of nonredundant pairs. However, the program runs a long time when this feature is used. It finds the starting state reliably and in an acceptable amount of time when no Steckerboard is used, or the alphabet for Steckering is limited to say only six characters, A-F. Depending on the length of the crib, between one exact and many possible solutions are found. It is therefore vital to know that the encoded message actually makes some sense to some kind of humans. Knowing that the first two letters of e.g. FWXTU decode to HE, and finding the possible solutions HEMTP, HEXMA and HELLO gives us three starting states. Either we can assume the third with HELLO is the right one, or we have many messages encoded in the same key available on which we can test the three possible starting states, thus finding the right one.
It became clear during the tests, that the ring position is not cruxial when trying to find the starting state. Moving the rings relative to the core wiring of the rotors gives actually 26x26x26 equivalent states, and therefore 26x26x26 equivalent solutions. Inserting a rotor with Z up is the same as calling Z an A and inserting it with A up. It ends up in the same position! To decrypt a message we only need to know the relative positions. The objective is to find a solution which we can use for the decryption of a message, i.e. given an Enigma knowing how to set it up correctly. It is therefore sufficient to find the relative position of the rotors, no matter how these are called by changing the rings. We set the rings to AAA by default in the program therefore.

Click here for the complete code for the Enigma simulation and the simple Bomby (enigma.c). Compile this program with a suitable C-compiler. At the head of the file you find two #define statements. One defines MSGLEN, currently set to 80 for the maximum length of a message to encode. The other one is called TO, and is currently set to 'E'. This means Stecker substitutions are checked only within an alphabet consisting of 'A'..'E'. The purpose of this is to save running time ('A'..'Z' runs days!!), and serves demo purposes of the program. Change both values freely as desired.
Run the program by typing the file name of the executable. Without parameters, you will be prompted to enter the starting values. You may then encode a message. E.g. type HELLO, and yield FWXTU. To decode this , type the program name followed by the cipher code and a number of known characters in plain text. E.g. 'enigma FWXTU HE'. The program iterates through all possible starting states and outputs possible solutions as appropriate, i.e. the machine settings and what the decoded message would be for these settings.
IMPORTANT ADDITION: It has been noted that the decoding part of "enigma.c" works not with all Enigma simulators. You should not attempt to decode messages from arbitrary simulators. This is likely due to different rotor wiring as used in different simulators. My Enigma (first part of the program) was taken from the web, and used to write the Bombe. Encode a message with the first part, decode it with the second. Attempting to decode messages from different simulators is likely to fail!

Discussion:

As stated earlier, Enigma had its flaws. Yet it was a complex machine, and its intrinsics were broken starting with only very little knowledge. It is especially for the time and the possibilities given an accomplishment to come up with a mechanical machine that is fast enough to compute possible codes. The Bombe reminds in many ways of an early parallel processor. Limited in its capabilities, and with a hardwired program, but efficient and doing what it was supposed to do by only using wheels and some wires.
The present program tries to implement some of the Bombe's features in a more modern software solution. Codes are broken in acceptable time. To further refine the program, an efficient implementation of how to cope with the Steckerboard substitutions would be necessary.

Acknowledgments:

Credit must be given to the people who have websites about Enigma and its decryption. See references for detail. The images in this text were taken from some of them. Fauzan Mirza wrote a program to simulate Enigma, and a part of the code is used here.

References:

[1] http://www.attlabs.att.co.uk/andyc/enigma/about_enigma.html, Andy Carlson's home page about general Enigma features

[2] http://www.cranfield.ac.uk/ccc/bpark/morebpark.htm, Computer Centre at the Cranfield University home page about Bletchley Park,

[3] http://www.geocities.com/CapeCanaveral/Hangar/4040/bombe.html, Nik Shaylor's home page about general Enigma features and the bombe

[4] http://www.wadham.ox.ac.uk/~ahodges/Turing.html, Andrew Hodges' home page about Alan Turing

[5] http://www.gl.umbc.edu/~lmazia1/Enigma/enigma.html, Dr. Wladyslaw Kozaczuk writes about the Polish work on deciphering Enigma codes

[6] http://home.earthlink.net/~nbrass1/enigma.htm, Nautical Brass home page on history and usage of Enigma

[7] Network Security, Textbook for class

go to my home page