Caesar cipher

The action of a Caesar cipher is to replace each plaintext letter with a different one a fixed number of places down the alphabet. The cipher illustrated here uses a left shift of 3, so that (for example) each occurrence of E in the plaintext becomes B in the ciphertext.

In cryptography, a Caesar cipher, also known as the shift cipher, Caesar's code, or Caesar shift, is one of the simplest and most widely known encryption techniques. It is a type of substitution cipher in which each letter in the plaintext is replaced by a letter some fixed number of positions down the alphabet. For example, with a left shift of 3, D would be replaced by A, E would become B, and so on. The method is named after Julius Caesar, who used it in his private correspondence. [1]


The encryption step performed by a Caesar cipher is often incorporated as part of more complex schemes, such as the Vigenère cipher, and still has modern application in the ROT13 system. As with all single-alphabet substitution ciphers, the Caesar cipher is easily broken and in modern practice offers essentially no communications security.

Example

The transformation can be represented by aligning two alphabets; the cipher alphabet is the plain alphabet rotated left or right by some number of positions. For instance, here is a Caesar cipher using a left rotation of three places, equivalent to a right shift of 23 (the shift parameter is used as the key):

Plain   ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher  XYZABCDEFGHIJKLMNOPQRSTUVW

When encrypting, a person looks up each letter of the message in the "plain" line and writes down the corresponding letter in the "cipher" line.

Plaintext:  jamaica b. padida
Ciphertext: gxjxfzx y. mxafax

Deciphering is done in reverse, with a right shift of 3.
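The aligned-alphabet lookup described above can be sketched in Python. This is a minimal illustration, not a reference implementation; the names `PLAIN`, `CIPHER`, `ENCRYPT_TABLE`, and `DECRYPT_TABLE` are chosen here for the example, and the sample message reuses the "attack at once" plaintext that appears later in this article.

```python
import string

# Plain alphabet and the cipher alphabet from the table above:
# the plain alphabet rotated so that A maps to X (a left shift of 3).
PLAIN = string.ascii_uppercase
CIPHER = PLAIN[-3:] + PLAIN[:-3]

# str.maketrans builds a per-character lookup table. Characters that are
# not in the table (spaces, punctuation) pass through unchanged.
ENCRYPT_TABLE = str.maketrans(PLAIN, CIPHER)
DECRYPT_TABLE = str.maketrans(CIPHER, PLAIN)

ciphertext = "ATTACK AT ONCE".translate(ENCRYPT_TABLE)
recovered = ciphertext.translate(DECRYPT_TABLE)

print(CIPHER)      # XYZABCDEFGHIJKLMNOPQRSTUVW
print(ciphertext)  # XQQXZH XQ LKZB
print(recovered)   # ATTACK AT ONCE
```

Building the decryption table by swapping the two alphabets mirrors the "done in reverse" step: the same lookup is simply read from the cipher line back to the plain line.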

The encryption can also be represented using modular arithmetic by first transforming the letters into numbers according to the scheme A → 0, B → 1, ..., Z → 25. [2] Encryption of a letter x by a shift n can be described mathematically as [3]

$$E_n(x) = (x + n) \bmod 26$$

Decryption is performed similarly,

$$D_n(x) = (x - n) \bmod 26$$

(Here, "mod" refers to the modulo operation. The value x is in the range 0 to 25, but if x + n or x − n is not in this range, then 26 should be added or subtracted.)
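The modular-arithmetic description can be sketched in Python as follows. The function names `encrypt` and `decrypt` are illustrative, and the sketch assumes a right shift of n over the 26-letter Latin alphabet, leaving every other character untouched.

```python
def encrypt(text: str, n: int) -> str:
    """Shift each ASCII letter n places to the right, wrapping modulo 26."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            base = ord("a")
        elif "A" <= ch <= "Z":
            base = ord("A")
        else:
            out.append(ch)  # spaces, digits, punctuation are left as-is
            continue
        x = ord(ch) - base                    # A -> 0, B -> 1, ..., Z -> 25
        out.append(chr((x + n) % 26 + base))  # (x + n) mod 26
    return "".join(out)


def decrypt(text: str, n: int) -> str:
    """Undo encrypt(text, n) by shifting n places to the left."""
    return encrypt(text, -n)


print(encrypt("XYZ", 3))                 # ABC
print(encrypt("jamaica b. padida", 23))  # gxjxfzx y. mxafax
```

Python's `%` operator already returns a value from 0 to 25 for a negative left operand, so the "add or subtract 26" adjustment is handled by the operator itself. A right shift of 23 reproduces the left-shift-of-3 example given earlier.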

The replacement remains the same throughout the message, so the cipher is classed as a type of monoalphabetic substitution, as opposed to polyalphabetic substitution.

History and usage

The Caesar cipher is named for Julius Caesar, who used an alphabet where decrypting would shift three letters to the left.

The Caesar cipher is named after Julius Caesar, who, according to Suetonius, used it with a shift of three (A becoming D when encrypting, and D becoming A when decrypting) to protect messages of military significance. While Caesar's was the first recorded use of this scheme, other substitution ciphers are known to have been used earlier. [4] [5]

"If he had anything confidential to say, he wrote it in cipher, that is, by so changing the order of the letters of the alphabet, that not a word could be made out. If anyone wishes to decipher these, and get at their meaning, he must substitute the fourth letter of the alphabet, namely D, for A, and so with the others."

His grandnephew and adopted heir, Augustus, also used the cipher, but with a right shift of one, and it did not wrap around to the beginning of the alphabet:

"Whenever he wrote in cipher, he wrote B for A, C for B, and the rest of the letters on the same principle, using AA for Z."

Evidence exists that Julius Caesar also used more complicated systems, [6] and one writer, Aulus Gellius, refers to a (now lost) treatise on his ciphers:

"There is even a rather ingeniously written treatise by the grammarian Probus concerning the secret meaning of letters in the composition of Caesar's epistles."

Aulus Gellius, Attic Nights 17.9.1–5

It is unknown how effective the Caesar cipher was at the time; there is no record from that period of any technique for solving simple substitution ciphers. The earliest surviving records of such techniques date to the 9th-century works of Al-Kindi in the Arab world, with the discovery of frequency analysis. [7]

A piece of text encrypted in a Hebrew version of the Caesar cipher is sometimes found on the back of Jewish mezuzah scrolls. When each letter is replaced with the letter before it in the Hebrew alphabet, the text translates as "YHWH, our God, YHWH", a quotation from the main part of the scroll. [8] [9]

In the 19th century, the personal advertisements section in newspapers would sometimes be used to exchange messages encrypted using simple cipher schemes. David Kahn (1967) describes instances of lovers engaging in secret communications enciphered using the Caesar cipher in The Times. [10] Even as late as 1915, the Caesar cipher was in use: the Russian army employed it as a replacement for more complicated ciphers which had proved to be too difficult for their troops to master; German and Austrian cryptanalysts had little difficulty in decrypting their messages. [11]

A Caesar cipher implemented as a disk: the outer and inner plates carry alphabets running in the same direction, not reversed as in CipherDisk2000.

Caesar ciphers can be found today in children's toys such as secret decoder rings. A Caesar shift of thirteen is also performed in the ROT13 algorithm, a simple method of obfuscating text widely found on Usenet and used to obscure text (such as joke punchlines and story spoilers), but not seriously used as a method of encryption. [12]
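As a small illustration of the ROT13 case, Python's standard library ships a `rot_13` text codec that can be applied with `codecs.encode`. The message below is just the sample plaintext used elsewhere in this article; the variable names are illustrative.

```python
import codecs

message = "attack at once"

# ROT13 is a Caesar shift of 13. Because 13 + 13 = 26, applying it
# twice returns the original text, so one operation both hides and reveals.
obscured = codecs.encode(message, "rot_13")
restored = codecs.encode(obscured, "rot_13")

print(obscured)  # nggnpx ng bapr
print(restored)  # attack at once
```

The self-inverse property is what makes ROT13 convenient for hiding spoilers: no separate decryption routine or key is needed.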

The Vigenère cipher uses a Caesar cipher with a different shift at each position in the text; the value of the shift is defined using a repeating keyword. [13] If the keyword is as long as the message, is chosen at random, never becomes known to anyone else, and is never reused, this is the one-time pad cipher, proven unbreakable. However, the problems involved in using a random key as long as the message make the one-time pad difficult to use in practice. Keywords shorter than the message (e.g., "Complete Victory", used by the Confederacy during the American Civil War) introduce a cyclic pattern that might be detected with a statistically advanced version of frequency analysis. [14]
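The relationship between the two ciphers can be sketched as a Caesar shift whose amount changes with position. The function name `vigenere_encrypt`, the convention that keyword letter A means a shift of 0, and the sample keyword "LEMON" are assumptions of this sketch rather than anything specified in the text above.

```python
def vigenere_encrypt(plaintext: str, keyword: str) -> str:
    """Encrypt with a repeating keyword: each letter gets its own Caesar shift."""
    # Each keyword letter defines one shift: A -> 0, B -> 1, ..., Z -> 25.
    shifts = [ord(k) - ord("A") for k in keyword.upper() if "A" <= k <= "Z"]
    if not shifts:
        raise ValueError("keyword must contain at least one letter")

    out = []
    position = 0  # counts letters only, so spaces do not consume key letters
    for ch in plaintext.upper():
        if "A" <= ch <= "Z":
            n = shifts[position % len(shifts)]  # keyword repeats cyclically
            out.append(chr((ord(ch) - ord("A") + n) % 26 + ord("A")))
            position += 1
        else:
            out.append(ch)
    return "".join(out)


print(vigenere_encrypt("ATTACKATDAWN", "LEMON"))  # LXFOPVEFRNHR
print(vigenere_encrypt("ABC", "D"))               # DEF
```

With a one-letter keyword every position receives the same shift, so the scheme collapses back to an ordinary Caesar cipher, as the second call shows.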

In April 2006, fugitive Mafia boss Bernardo Provenzano was captured in Sicily partly because some of his messages, clumsily written in a variation of the Caesar cipher, were broken. Provenzano's cipher used numbers, so that "A" would be written as "4", "B" as "5", and so on. [15]

In 2011, Rajib Karim was convicted in the United Kingdom of "terrorism offences" after using the Caesar cipher to communicate with Bangladeshi Islamic activists discussing plots to blow up British Airways planes or disrupt their IT networks. Although the parties had access to far better encryption techniques (Karim himself used PGP for data storage on computer disks), they chose to use their own scheme (implemented in Microsoft Excel), rejecting a more sophisticated code program called Mujahedeen Secrets "because 'kaffirs', or non-believers, know about it, so it must be less secure". [16]

Breaking the cipher

Decryption shift   Candidate plaintext
0                  exxegoexsrgi
1                  dwwdfndwrqfh
2                  cvvcemcvqpeg
3                  buubdlbupodf
4                  attackatonce
5                  zsszbjzsnmbd
6                  yrryaiyrmlac
...
23                 haahjrhavujl
24                 gzzgiqgzutik
25                 fyyfhpfytshj

The Caesar cipher can be easily broken even in a ciphertext-only scenario. Since there are only a limited number of possible shifts (25 in English), an attacker can mount a brute force attack by deciphering the message, or part of it, using each possible shift. The correct decryption will be the one which makes sense as English text. [17] An example is shown in the table above for the ciphertext "exxegoexsrgi"; the candidate plaintext for shift four, "attackatonce", is the only one which makes sense as English text. Another type of brute force attack is to write out the alphabet beneath each letter of the ciphertext, starting at that letter. Again, the correct decryption is the one which makes sense as English text. This technique is sometimes known as "completing the plain component". [18] [19]
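The brute force attack can be sketched in a few lines of Python. The helper names `decrypt` and `brute_force` are illustrative; the sketch simply lists every candidate and leaves the judgment of which one "makes sense as English text" to the reader.

```python
def decrypt(text: str, n: int) -> str:
    """Shift each ASCII letter n places to the left, wrapping modulo 26."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") - n) % 26 + ord("a")))
        elif "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") - n) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


def brute_force(ciphertext: str) -> list[tuple[int, str]]:
    """Return (shift, candidate plaintext) for every possible shift."""
    return [(n, decrypt(ciphertext, n)) for n in range(26)]


for n, candidate in brute_force("exxegoexsrgi"):
    print(f"{n:2d}  {candidate}")
```

Running it on "exxegoexsrgi" prints 26 rows, of which only the row for shift 4 ("attackatonce") is readable.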

The distribution of letters in a typical sample of English language text has a distinctive and predictable shape. A Caesar shift "rotates" this distribution, and it is possible to determine the shift by examining the resultant frequency graph.

Another approach is to match up the frequency distribution of the letters. By graphing the frequencies of letters in the ciphertext, and by knowing the expected distribution of those letters in the original language of the plaintext, a human can easily spot the value of the shift by looking at the displacement of particular features of the graph. This is known as frequency analysis. For example, in the English language the plaintext frequencies of the letters E and T (usually most frequent) and Q and Z (typically least frequent) are particularly distinctive. [20] Computers can automate this process by assessing the similarity between the observed frequency distribution and the expected distribution. This can be achieved, for instance, using the chi-squared statistic [21] or by minimizing the sum of squared errors between the observed and known language distributions. [22]
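An automated version of this comparison might look like the following sketch, which tries every shift and keeps the one whose candidate plaintext has the lowest chi-squared score against English. The letter percentages in `ENGLISH_FREQ` are approximate values assumed for this example (published tables differ slightly), and the names `chi_squared` and `guess_shift` are illustrative.

```python
from collections import Counter

# Approximate letter frequencies in English text, in percent.
# Assumption of this sketch: exact figures vary between published tables.
ENGLISH_FREQ = {
    "a": 8.17, "b": 1.49, "c": 2.78, "d": 4.25, "e": 12.70, "f": 2.23,
    "g": 2.02, "h": 6.09, "i": 6.97, "j": 0.15, "k": 0.77, "l": 4.03,
    "m": 2.41, "n": 6.75, "o": 7.51, "p": 1.93, "q": 0.10, "r": 5.99,
    "s": 6.33, "t": 9.06, "u": 2.76, "v": 0.98, "w": 2.36, "x": 0.15,
    "y": 1.97, "z": 0.07,
}


def decrypt(text: str, n: int) -> str:
    """Lower-case the text and shift each letter n places to the left."""
    return "".join(
        chr((ord(ch) - ord("a") - n) % 26 + ord("a")) if "a" <= ch <= "z" else ch
        for ch in text.lower()
    )


def chi_squared(text: str) -> float:
    """Sum of (observed - expected)^2 / expected over the 26 letters."""
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    if not letters:
        return 0.0
    counts = Counter(letters)
    score = 0.0
    for letter, percent in ENGLISH_FREQ.items():
        expected = len(letters) * percent / 100
        score += (counts[letter] - expected) ** 2 / expected
    return score


def guess_shift(ciphertext: str) -> int:
    """Return the shift whose decryption looks most like English."""
    return min(range(26), key=lambda n: chi_squared(decrypt(ciphertext, n)))
```

For example, encrypting a sentence of a few dozen letters with some shift and passing the result to `guess_shift` would be expected to return that shift. Very short ciphertexts give the statistic too little to work with, so the guess becomes unreliable.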

The unicity distance for the Caesar cipher is about 2, meaning that on average at least two characters of ciphertext are required to determine the key. [23] In rare cases more text may be needed. For example, the words "river" and "arena" can be converted to each other with a Caesar shift, which means they can produce the same ciphertext with different shifts. However, in practice the key can almost certainly be found with at least 6 characters of ciphertext. [24]
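The "river"/"arena" ambiguity can be checked directly. In the sketch below, `shift` is an illustrative helper that applies a right shift to lower-case text, and the keys 5 and 22 are arbitrary values picked for the demonstration.

```python
def shift(text: str, n: int) -> str:
    """Shift each lower-case letter n places to the right, modulo 26."""
    return "".join(
        chr((ord(ch) - ord("a") + n) % 26 + ord("a")) if "a" <= ch <= "z" else ch
        for ch in text
    )


# "river" shifted right by 9 is "arena".
print(shift("river", 9))   # arena

# So the same ciphertext arises from two plaintexts under two keys:
print(shift("river", 5))   # wnajw
print(shift("arena", 22))  # wnajw
```

An attacker who sees only "wnajw" cannot tell which of the two English words was sent, which is why a little more ciphertext is sometimes needed.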

With the Caesar cipher, encrypting a text multiple times provides no additional security. This is because two encryptions of, say, shift A and shift B, will be equivalent to a single encryption with shift A + B. In mathematical terms, the set of encryption operations under each possible key forms a group under composition. [25]
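The composition property is easy to verify exhaustively. The sketch below uses an illustrative `encrypt` helper (right shift over upper-case letters) and checks every pair of shifts A and B against the single shift A + B.

```python
def encrypt(text: str, n: int) -> str:
    """Shift each upper-case letter n places to the right, modulo 26."""
    return "".join(
        chr((ord(ch) - ord("A") + n) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in text
    )


message = "ATTACK AT ONCE"

# Encrypting with shift a and then shift b equals one encryption
# with shift (a + b) mod 26, for every pair of keys.
for a in range(26):
    for b in range(26):
        assert encrypt(encrypt(message, a), b) == encrypt(message, (a + b) % 26)

# Group structure: shift 0 is the identity, and 26 - a undoes shift a.
assert encrypt(message, 0) == message
assert encrypt(encrypt(message, 7), 26 - 7) == message
```

Because the combined operation is itself one of the 26 possible shifts, a double encryption falls to the same brute force search as a single one.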


Notes

  1. Suetonius, Vita Divi Julii 56.6
  2. Luciano, Dennis; Gordon Prichett (January 1987). "Cryptology: From Caesar Ciphers to Public-Key Cryptosystems". The College Mathematics Journal. 18 (1): 2–17. CiteSeerX 10.1.1.110.6123. doi:10.2307/2686311. JSTOR 2686311.
  3. Wobst, Reinhard (2001). Cryptology Unlocked. Wiley. p. 19. ISBN 978-0-470-06064-3.
  4. "Cracking the Code". Central Intelligence Agency. Archived from the original on 26 December 2020. Retrieved 21 February 2017.
  5. Singh, Simon (2000). The Code Book. Anchor. pp. 289–290. ISBN 0-385-49532-3.
  6. Reinke, Edgar C. (December 1962). "Classical Cryptography". The Classical Journal. 58 (3): 114.
  7. Singh, Simon (2000). The Code Book. Anchor. pp. 14–20. ISBN 0-385-49532-3.
  8. Eisenberg, Ronald L. (2004). Jewish Traditions (1st ed.). Philadelphia: Jewish Publication Society. p. 582. ISBN 9780827610392.
  9. Sameth, Mark (2020). The Name: a history of the dual-gendered Hebrew name for God. Eugene, Oregon: Wipf & Stock. pp. 5–6. ISBN 9781532693830.
  10. Kahn, David (1967). The Codebreakers. pp. 775–6. ISBN 978-0-684-83130-5.
  11. Kahn, David (1967). The Codebreakers. pp. 631–2. ISBN 978-0-684-83130-5.
  12. Wobst, Reinhard (2001). Cryptology Unlocked. Wiley. p. 20. ISBN 978-0-470-06064-3.
  13. Kahn, David (1967). The Codebreakers. pp. 148–149. ISBN 978-0-684-83130-5.
  14. Kahn, David (1967). The Codebreakers. pp. 398–400. ISBN 978-0-684-83130-5.
  15. Leyden, John (2006-04-19). "Mafia boss undone by clumsy crypto". The Register. Retrieved 2008-06-13.
  16. "BA jihadist relied on Jesus-era encryption". The Register. 2011-03-22. Retrieved 2011-04-01.
  17. Beutelspacher, Albrecht (1994). Cryptology. Mathematical Association of America. pp. 8–9. ISBN 0-88385-504-6.
  18. Leighton, Albert C. (April 1969). "Secret Communication among the Greeks and Romans". Technology and Culture. 10 (2): 139–154. doi:10.2307/3101474. JSTOR 3101474.
  19. Sinkov, Abraham; Paul L. Irwin (1966). Elementary Cryptanalysis: A Mathematical Approach. Mathematical Association of America. pp. 13–15. ISBN 0-88385-622-0.
  20. Singh, Simon (2000). The Code Book. Anchor. pp. 72–77. ISBN 0-385-49532-3.
  21. Savarese, Chris; Brian Hart (2002-07-15). "The Caesar Cipher". Trinity College. Retrieved 2008-07-16.
  22. Eisele, Robert (2007-05-18). "Caesar Cipher Decryption". Retrieved 2024-04-02.
  23. Lubbe, Jan C. A. (12 March 1998). Basic Methods of Cryptography. Cambridge University Press. pp. 47–8. ISBN 9780521555593.
  24. Pardo, José Luis Gómez (19 December 2012). Introduction to Cryptography with Maple. Springer Berlin Heidelberg. p. 5. ISBN 9783642321665.
  25. Wobst, Reinhard (2001). Cryptology Unlocked. Wiley. p. 31. ISBN 978-0-470-06064-3.
