Cryptographic attacks: an explanation for confused minds
At the word "cryptography", some remember their WiFi password, a green lock next to the address of their favorite site and how difficult it is to get into someone else's mail. Others recall a series of vulnerabilities of recent years with speaking abbreviations (DROWN, FREAK, POODLE ...), stylish logos and a warning urgently update the browser.
Cryptography covers all of this, but the essence is different. The bottom line is between simple and complex. Some things are easy to do, but hard to get back: for example, breaking an egg. Other things are easy to do, but hard to get back when a small, crucial crucial part is missing: for example, to open a locked door when the “critical part” is the key. Cryptography studies these situations and how to use them in practice.
Over the past years, the collection of cryptographic attacks has turned into a zoo of flashy logos full of scientific article formulas and has created a general gloomy feeling that everything is broken. But in fact, many of the attacks are based on several general principles, and the endless pages of formulas often come down to easy-to-understand ideas. In this series of articles, we will look at various types of cryptographic attacks, with an emphasis on basic principles. In general terms and not quite in that order, but we will tell the following:
Basic strategies: brute force, frequency analysis, interpolation, downgrade and cross-protocols.
Third-party attacks and their close relatives, failure analysis methods.
Attacks on public-key cryptography: cubic root, Broadcast, related message, Coppersmith attack, Polyg - Hellman algorithm, numerical sieve, Wiener attack, Bleichenbacher attack.
This particular article covers the above material up to Kelsey's attack.
Basic strategies
The following attacks are simple in the sense that they can be almost completely explained without special technical details. We explain each type of attack in the simplest terms, without delving into complex examples or advanced use cases.
Some of these attacks have mostly lost relevance and have not been used for many years. Others are old-timers, they still regularly sneak up on unsuspecting cryptosystem developers in the 21st century. We can assume that the era of modern cryptography began with the advent of IBM DES - the first cipher that withstood all the attacks on this list.
Simple brute force
The encryption scheme consists of two parts: 1) the encryption function, which receives a message (plaintext) in combination with a key, and then creates an encrypted message - ciphertext; 2) a decryption function that takes a ciphertext and a key and creates plaintext. Both encryption and decryption should be easy to compute with the key - and hard without it.
Suppose we see ciphertext and try to decrypt it without any additional information (this is called a ciphertext only attack). If we somehow magically find the right key, we can easily verify that it is indeed correct if the result is a reasonable message.
Note that there are two implicit assumptions. Firstly, that we know how to perform decryption, that is, how the cryptosystem works. This is a standard assumption when discussing cryptography. Hiding the details of the cipher implementation from intruders may seem like an additional security measure, but as soon as the intruder finds out these details, this extra security is invisibly and irreversibly lost. This is the principle of Kerchhoffs : falling into the hands of the enemy should not cause inconvenience.
Secondly, we assume that the correct key is the only key that will lead to a reasonable decryption. This is also a reasonable assumption; it is executed if the ciphertext is much longer than the key and is well read. As a rule, this happens in the real world, with the exception of huge impractical keys or other frauds, which are better left aside (if you do not like that we have dismissed the explanations, please see Theorem 3.8 here ).
Given the above, a strategy arises: check every possible key. This is called brute force, and such an attack is guaranteed to work against all practical ciphers - in the end. For example, brute force is enough to crack Caesar 's cipher , an ancient cipher where the key is one letter from the alphabet, which implies a little more than 20 possible keys.
Unfortunately for cryptanalysts, increasing the key size protects against brute force. As the key size grows, the number of possible keys increases exponentially. With modern key sizes, a simple brute force is not at all practical. To understand what we mean, take the fastest known supercomputer in mid-2019: IBM's Summit , with peak performance on the order of 10 ^{17} operations per second. Today, a typical key length is 128 bits, which means ^{2,128} possible combinations. Enumerating all the keys requires Summit supercomputer to take about 7800 times the age of the universe.
Should brute force be considered a historical curiosity? Not at all: it is an essential ingredient in the cryptanalysis cookbook. Rarely so weak ciphers are found that they can be cracked only with a smart attack, without the use of force to one degree or another. Many successful hacks use the algorithmic method first to weaken the target cipher, and then run brute force.
Frequency analysis
Most texts are not gibberish. For example, in English texts there are many letters 'e' and articles 'the'; in binary files - a lot of null bytes as a placeholder between pieces of information. Frequency analysis is any attack that exploits this fact.
A canonical example of a cipher vulnerable to this attack is a simple substitution cipher. In this cipher, the key is a table with the replacement of all letters. For example, 'g' is replaced by 'h', 'o' by j, Therefore, the word 'go' turns into 'hj'. This cipher is difficult to simple brute force, since there are so many possible lookup tables. If you are interested in math, the effective key length is about 88 bits: this $$log2(26!) . But frequency analysis usually does the job quickly.
Consider the following ciphertext processed with a simple substitution cipher:
XDYLY ALY UGLY XDWNKE WN DYAJYN ANF YALXD DGLAXWG XDAN ALY FLYAUX GR WN OGQL ZDWBGEGZDO
Since Y is common, including at the end of many words, we can tentatively assume that this is the letter e :
XDeLe ALe UGLe XDWNKE WN DeAJeN ANF eALXD DGLAXWG XDAN ALe FLeAUX GR WN OGQL ZDWBGEGZDO
The XD pair is repeated at the beginning of a few words. In particular, the combination of XDeLe explicitly suggests the word these or there , so we continue:
theLe ALe UGLe thWNKE WN heAJeN ANF eALth DGLAtWG thAN ALe FLeAUt GR WN OGQL ZDWBGEGZDO
Further, suppose that L corresponds to r , A - a and so on. You will probably have to make several attempts, but compared to full brute force, this attack restores the source text in the shortest possible time:
there are more things in heaven and earth horatio than are dreamt of in your philosophy
For some, the solution to such “cryptograms” is a fascinating hobby.
The idea of frequency analysis is more fundamental than it seems at first glance. And it applies to much more complex ciphers. Throughout history, various cipher designs have attempted to withstand such an attack using "polyalphabetic substitution." Here, in the encryption process, the letter replacement table changes in complex, but predictable, key-dependent ways. All these ciphers were once considered difficult to crack; and yet a modest frequency analysis ended up defeating them all.
The most ambitious polyalphabetic cipher in history and, probably, the most famous, was the Enigma cipher in World War II. It was relatively complex compared to its predecessors, but as a result of long and hard work, British cryptanalysts cracked it using frequency analysis. Of course, they could not develop an elegant attack, as shown above; they had to compare well-known pairs of open and encrypted texts (the so-called “plaintext attack”) and even provoking Enigma users to encrypt certain messages with analysis of the result (“attack based on selected plaintext”). But this did not facilitate the fate of the defeated armies of enemies and sunk submarines.
After this triumph, frequency analysis disappeared from the history of cryptanalysis. The ciphers of the modern digital era are designed to work with bits, not letters. More importantly, these ciphers are designed with a gloomy understanding of what later became known as Schneier’s law : anyone can create an encryption algorithm that he himself can’t crack. It is not enough that the encryption system seems complicated: in order to prove its worth, it must go through a ruthless security review from many cryptanalysts who will do everything possible to crack the cipher.
Preliminary calculations
Take the hypothetical city of Precom Heights with a population of 200,000. In each house of the city there are valuables on average of $ 30,000, but not more than $ 50,000. The security market in Precom was monopolized by ACME Industries, which produces the legendary Coyote (tm) class door locks. According to expert analysis, only a very complex hypothetical machine can break a Coyote-class lock, the creation of which requires about five years and $ 50,000 investment. Is the city safe?
Most likely no. In the end, a rather ambitious criminal will appear. He will reason like this: “Yes, I will incur large upfront costs. Five years of patient expectation, and $ 50,000. But after finishing work I will have access to all the wealth of this city . If I play my cards correctly, this investment will pay off many times. ”
Similarly, in cryptography. Attacks against a specific cipher are subjected to a ruthless analysis of costs and benefits. If the ratio is favorable, the attack will not occur. But the attacks that immediately attack many potential victims almost always pay off, and in this case, the best design practice is to assume that they started from day one. We have essentially a cryptographic version of Murphy’s law: “Everything that can really break the system will break the system.”
The simplest example of a cryptosystem vulnerable to an attack with preliminary calculations is a cipher with a constant algorithm without using a key. This was the case with the Caesar cipher , which simply shifts each letter of the alphabet three letters forward (the table is looped, so the last letter in the alphabet is encrypted with the third). Here again, the Kerchhoffs principle is manifested: as soon as the system is hacked, it is hacked forever.
The concept is simple. Even a novice cryptosystem developer is likely to be aware of the threat and prepare accordingly. If you look at the evolution of cryptography, such attacks were inappropriate for most ciphers, starting with the first improved versions of Caesar's cipher, right up to the decline of polyalphabetic ciphers. Such attacks returned only with the advent of the modern era of cryptography.
This return is due to two factors. Firstly, finally, quite complex cryptosystems appeared, where the possibility of exploitation after hacking was not obvious. Secondly, cryptography is so widespread that millions of lay people make decisions every day on where and which parts of cryptography to reuse. Some time passed before the experts realized the risks and raised the alarm.
Remember the precomputed attack: at the end of the article we will consider two cryptographic examples from real life, where it played an important role.
Interpolation
Here is the famous detective Sherlock Holmes, performing an attack with interpolation on the unlucky Dr. Watson:
I immediately guessed that you came from Afghanistan ... My thoughts were as follows: “This person is a doctor by type, but he has a military bearing. So a military doctor. He had just arrived from the tropics - his face is swarthy, but this is not a natural shade of his skin, since his wrists are much whiter. The emaciated face, obviously, suffered a lot and suffered the disease. He was wounded in his left hand - holds it motionless and a little unnatural. Where, under the tropics, could a British medical doctor suffer hardships and be injured? Of course, in Afghanistan. ” The whole train of thought did not take even a second. And so I said that you came from Afghanistan, and you were surprised.
From each evidence individually, Holmes could extract very little information. He could come to his conclusion only by examining them all together. The interpolation attack works in a similar way, exploring the known pairs of open and encrypted texts obtained as a result of applying the same key. Separate observations are extracted from each pair, which allow us to draw a general conclusion about the key. All these conclusions are vague and useless until they suddenly reach a critical mass and lead to the only possible conclusion: no matter how incredible it may be, it must be true. After that, either the key is opened, or the decryption process becomes so mature that it can be duplicated.
We illustrate with a simple example how interpolation works. Suppose we want to read the personal diary of our enemy, Bob. He encrypts every number in his diary with the help of a simple cryptosystem, which he learned about from an advertisement in the magazine “Mockery of cryptography”. The system works as follows: Bob selects two numbers that he likes: $$M and $$N . From now on, to encrypt any number $$x he calculates $$Mx+N . For example, if Bob chose $$M=3 and $$N=4 then the figure $$2 encrypted as $$3∗2+4=10 .
Suppose we noticed on December 28th that Bob was scratching something in his diary. When he finishes, we quietly take it and see the last entry:
Date: 235/520
Dear Diary,
Today was a good day. After 64 days, I have a date with Alice, who lives in apartment 843 . I really think she can be 26 !
Since we very seriously intend to follow Bob on his date (in this scenario, we are 15 years old), it is crucial to find out the date, as well as Alice's address. Fortunately, we notice that Bob's cryptosystem is vulnerable to an interpolation attack. We may not know $$M and $$N , but we know today's date, so we have two pairs of “plaintext - ciphertext”. Namely, we know that $$12 encrypted in $$235 , but $$27 - at $$520 . What we write:
$$M∗12+N=235
$$M∗27+N=520
Since we are 15 years old, we already know about the system of two equations with two unknowns, which in this situation is enough to find $$M and $$N without any problems. Each “plaintext-ciphertext” pair imposes a constraint on Bob's key, and two constraints together are enough to completely restore the key. In our example, the answer $$M=19 and $$N=7 (at $$19x+7=26$$x=1 , so 26 in the diary corresponds to the word 'the one', that is, “the same” - approx. per.) .
Interpolation attacks, of course, are not limited to such simple examples. Each cryptosystem, which comes down to a well-understood mathematical object and a list of parameters, is at risk of an interpolation attack - the more understandable the object, the higher the risk.
Beginners often complain that cryptography is "the art of designing as ugly things as possible." Interpolation attacks are probably largely to blame. Bob can either use an elegant mathematical design, or keep the confidentiality of a date with Alice - but alas, usually you can not get both. This will become very clear when we finally move on to the topic of public key cryptography.
Cross Protocol / Downgrade
In the film “The Illusion of Deception” (2013), a group of illusionists try to trick the entire state of the corrupt insurance magnate Arthur Tressler. To gain access to Arthur's bank account, illusionists must either provide his username and password, or make him personally appear in the bank and take part in the scheme.
Both options are very difficult; the guys are used to performing on stage, and not to participate in intelligence operations. Therefore, they choose the third option: their accomplice calls the bank and pretends to be Arthur. The bank asks several questions for verification of identity, such as the name of the uncle and the name of the first pet; our heroes in advance easily get this information from Arthur with the help of clever social engineering . From now on, excellent password security no longer matters.
(According to a city legend, which we personally checked and confirmed, cryptographer Eli Behem once came across a bank teller who insisted on setting a secret question. When the cashier asked her mother’s grandmother’s name, Beekham began to dictate: “Capital X, little y, three ... ").
It is the same in cryptography if two cryptographic protocols are used in parallel to protect the same asset, while one is much weaker than the other. The resulting system becomes vulnerable to a cross-protocol attack when a weaker protocol is attacked to get to the prize without touching a stronger one.
In some complex cases, it is not enough just to contact the server using a weaker protocol, and the involuntary participation of a legitimate client is required. This can be organized using a so-called downgrade attack. To understand this attack, suppose our illusionists have a more difficult task than in the film. Suppose that a bank employee (cashier) and Arthur had some unforeseen circumstances, as a result of which there was such a dialogue:
Cracker: Hello? This is Arthur Tressler. I would like to reset my password.
Cashier: Great. Please take a look at your personal secret code book, page 28, word 3. All of the following messages will be encrypted with this particular word as a key. PQJGH. LOTJNAM PGGY MXVRL ZZLQ SRIU HHNMLPPPV ...
Cracker: Hey hey, wait a minute. Is it really necessary? Can't we just talk like normal people?
Cashier: I do not advise doing this.
Cracker: I just ... listen, I had a lousy day, okay? I am a VIP client and not in the mood to delve into these stupid code books.
Cashier: Good. If you insist, Mr. Tressler. What do you want?
Cracker: Please, I would like to transfer all my money to the Arthur Tressler National Victims Fund.
(Pause).
Cashier: Okay. Please provide your PIN code for large transactions.
Cracker: My what?
Cashier: At your personal request, transactions of this size require a PIN for large transactions. This code was given to you when opening an account.
Cracker: ... I lost him. Is it really necessary? Can't you just approve the deal?
Cashier: None. I'm sorry, Mr. Tressler. Again, this is the security measure you requested. If you want, we can send a new PIN code to the mailbox.
Our heroes postpone the operation. They are listening to several major Tressler transactions, hoping to hear a pin; but each time the conversation turns into an encrypted gibberish before something interesting sounds there. Finally, one day the plan is put into effect. They patiently wait for the moment when Tressler must complete a major transaction by phone, he connects to the line, and then ...
Thressler: Hello. I would like to issue a remote transaction, please.
Cashier: Great. Please take a look at your personal secret code book, page ...
(The cracker presses a button; the cashier’s voice turns into an inaudible noise).
Cashier: - # @ $ # @ $ # * @ $$ @ # * will be encrypted with this word as a key. AAAYRR PLRQRZ MMNJK LOJBAN ...
Thressler: Sorry, I did not quite understand. Again? On what page? What is the word?
Cashier: This is the page @ # $ @ # * $) # * # @ () # @ $ (# @ * $ (# @ *.
Thressler: What?
Cashier: Word number twenty @ $ # @ $ #% # $.
Thressler: Seriously! Enough is enough! You and your security protocol are some kind of circus. I know that you can just talk normally with me.
Cashier: I do not advise ...
Thressler: And I do not advise you to waste my time. I don’t want to hear more about this until I fix the problems with my telephone line. Can we arrange this deal or not?
Cashier: ... yes. Good. What do you want?
Thressler: I would like to transfer $ 20,000 to Lord Business Investments, account number ...
Cashier: Just a moment, please. This is a big deal. Please provide your PIN code for large transactions.
Thressler: What? Oh, for sure. 1234.
Here is the fall attack. The weaker protocol, “just speak plainly,” was intended as an option in extreme cases. And yet we are here.
You can ask a question who, in their right mind, will design a real system of the type “safe until asked otherwise”, which is described above. But just as a fictitious bank takes risks to save customers who do not like cryptography, so systems in general often tend to requirements that are indifferent or even openly hostile to security.
This is exactly the story that happened with SSLv2 in 1995. The US government has long begun to consider cryptography as a weapon that is best kept away from external and internal enemies. The code fragments were individually approved for export from the United States, often subject to an intentional weakening of the algorithm. Netscape, the developer of the most popular browser, Netscape Navigator, was given SSLv2 permission only with an initially vulnerable RSA key of 512 bits (and 40 bits for RC4).
By the end of the millennium, the rules had softened and access to modern encryption became widely available. However, clients and servers have for many years supported weakened “export” cryptography due to the same inertia that maintains support for any legacy system. Clients thought they might encounter a server that does not support anything else. Servers did the same. Of course, the SSL protocol dictates that clients and servers should never use a weak protocol when the best is available. But the same premise was valid for Tressler and his bank.
This theory found application in two high-profile attacks that shook SSL protocol security one after another in 2015, both discovered by Microsoft and INRIA researchers. First, in February, the details of the FREAK attack were disclosed, and after three months, another similar attack called Logjam, which we will discuss in more detail when we move on to public key cryptography attacks.
The FREAK (also known as "Smack TLS") vulnerability appeared when researchers analyzed the implementation of the client / server TLS and found a curious error. In these implementations, if the client does not even ask to use weak export cryptography, but the server still responds with such keys, the client says “Okay” and switches to a weak set of ciphers.
At that time, everyone considered export cryptography to be outdated and forbidden to use, so the attack was a real shock and affected many important domains, including the sites of the White House, the US tax administration and the NSA. Worse, it turned out that many vulnerable servers optimized performance by reusing the same keys rather than creating new ones for each session. This allowed, after lowering the protocol, to carry out an attack with precomputation: hacking one key remained relatively expensive ($ 100 and 12 hours at the time of publication), but the practical cost of attacking the connection was significantly reduced. It is enough to pick up the server key once and crack the ciphers for all subsequent connections from now on.
And before moving on, we need to mention one advanced attack ...
Oracle attack
Moxie Marlinspike is best known as the father of the cross-platform Signal crypto messenger;but personally, we like one of his lesser-known innovations - the principle of cryptographic doom (Cryptographic Doom Principle). To paraphrase slightly, we can say this: "If the protocol performs any cryptographic operation on a message from a potentially malicious source and behaves differently depending on the result, it is doomed." Or in a sharper form: “Do not take information from the enemy for processing, and if you had to, at least do not show the result.”
Leave aside buffer overflows, injection commands, and the like; they are beyond the scope of this discussion. Violation of the "doom principle" leads to serious hacking of cryptography due to the fact that the protocol behaves exactly as it should.
For example, take a fictitious design with a vulnerable substitution cipher, and then demonstrate a possible attack. Although we have already seen an attack on a substitution cipher using frequency analysis, this is not just “another way to break the same cipher”. Conversely, oracle attacks are a much more modern invention, applicable to many situations where frequency analysis fails, and we will see a demonstration of this in the next section. Here, a simple cipher is chosen only to make the example more clear.
So, Alice and Bob communicate using a simple substitution cipher, using a key known only to them. They are very strict about the length of messages: their length is exactly 20 characters. Therefore, they agreed that if someone wants to send a shorter message, they should add some dummy text at the end of the message so that it is exactly 20 characters. After some discussion, they decided that they would only accept the following dummy text: a, bb, ccc, dddd, etc. Thus, the known dummy text of any desired length...
When Alice or Bob receives the message, they first check that the message has the correct length (20 characters) and the suffix is the correct dummy text. If this is not the case, then they respond with the corresponding error message. If the length of the text and the fictitious text are in order, the recipient reads the message itself and sends an encrypted response.
During the attack, the attacker pretends to be Bob and sends fake messages to Alice. Messages - complete nonsense - the attacker does not have a key, and therefore he can not fake a meaningful message. But since the protocol violates the principle of doom, the attacker can still trap Alice, so that she will reveal information about the key, as shown below.
:PREWF ZHJKL MMMN. LA
: .
:PREWF ZHJKL MMMN. LB
: .
:PREWF ZHJKL MMMN. LC
:ILCT? TLCT RUWO PUT KCAW CPS OWPOW!
, , , Ca , .
:REWF ZHJKL MMMN. LAA
: .
:REWF ZHJKL MMMN. LBB
: .
…
:REWF ZHJKL MMMN. LGG
: .
:REWF ZHJKL MMMN. LHH
:TLQO JWCRO FQAW SUY LCR C OWQXYJW. IW PWWR TU TCFA CHUYT TLQO JWFCTQUPOLQZ.
Again, the cracker has no idea what Alice just said, but notes that H must match with b because Alice accepted the dummy text.
And so on, until the attacker finds out the meaning of each character.
At first glance, the method resembles an attack based on selected plaintext. In the end, the attacker picks up ciphertexts, and the server obediently processes them. The main difference that makes these attacks viable in the real world is that the attacker does not need access to the actual decryption - the server’s response is enough, even as harmless as “Wrong fictitious text”.
Although this particular attack is instructive, one should not get too focused on the specifics of the “dummy text” scheme, the particular cryptosystem used, or the exact sequence of messages sent by the attacker. The main idea is how Alice reacts differently based on the properties of the plaintext, and does this without verifying that the corresponding encrypted text is actually received from a trusted party. Thus, Alice allows an attacker to extract secret information from her responses.
There is much to change in this scenario. The characters that Alice reacts to, or the very difference in her behavior, or even the cryptosystem used. But the principle will remain the same, and the attack as a whole will remain viable in one form or another. The basic implementation of this attack helped detect several security errors that we will address shortly; but first some theoretical lessons should be learned. How to use this fictional “Alice script” in an attack that can work on a real modern cipher? Is this possible at all, even in theory?
In 1998, Swiss cryptographer Daniel Bleichenbacher answered this question in the affirmative. He demonstrated an oracle attack in a widely used RSA public key cryptosystem using a specific message scheme. In some RSA implementations, the server responds with different error messages, depending on whether the plaintext matches the scheme or not; that was enough to carry out an attack.
Four years later, in 2002, French cryptographer Serge Vaudenay demonstrated an oracle attack, almost identical to that described in Alice's script above - except that instead of a fictitious cipher, he cracked a whole respectable class of modern ciphers that people really use. In particular, Wodene’s attack targets ciphers with a fixed input size (“block ciphers”) when they are used in the so-called “CBC encryption mode” and with a certain popular padding scheme, basically equivalent to the one in Alice’s script.
Also in 2002, American cryptographer John Kelsey, co-author of Twofish- proposed various oracle attacks on systems that compress messages and then encrypt them. The most noticeable among them was the attack, which used the fact that it is often possible to derive the original length of plaintext from the length of the ciphertext. In theory, this allows an oracle attack to restore parts of the original plaintext.
Next, we provide a more detailed description of the Woden and Kelsey attacks (we will give a more detailed description of the Bleichenbacher attack when we turn to attacks on public key cryptography). Despite all our efforts, the text becomes somewhat technical; therefore, if the above is enough for you, skip the next two sections.
Attack of Woden
To understand the Woden attack, you first need to talk a bit more about block ciphers and encryption modes. A “block cipher” is, as already mentioned, a cipher that accepts a key and input a certain fixed length (“block length”) and produces an encrypted block of the same length. Block ciphers are widely used and are considered relatively secure. The now retired DES, considered the first modern cipher, was blocky. As mentioned above, the same is true for AES, widely used today.
Unfortunately, block ciphers have one glaring weakness. A typical block size is 128 bits, or 16 characters. Obviously, modern cryptography requires working with larger input data, and this is where encryption modes appear. The encryption mode is essentially a hack: it is a way to somehow apply a block cipher, which accepts input data of only a certain size, to input data of an arbitrary length.
The Woden attack is focused on the popular CBC mode of operation (Cipher Block Chaining, ciphertext block coupling mode). The attack considers the basic block cipher as a magical impregnable black box and completely bypasses its security.
Here is a diagram that shows how CBC mode works:
The circled plus signifies the XOR operation (exclusive OR). For example, the second ciphertext block is received:
Performing an XOR operation on the second block of plaintext with the first block of ciphertext.
Encrypting the received block using a block cipher using a key.
Since CBC uses the XOR binary operation so intensively, let's take a moment to recall some of its properties:
Idempotency: $$A ⊕ 0 = A
Commutativity: $$A ⊕ B = B ⊕ A
Associativity: $$A ⊕ ( B ⊕ C ) = ( A ⊕ B ) ⊕ C
Self-reversibility: $$A ⊕ A = 0
Byte-byte: Byte n of $$( A ⊕ B ) = (byte n of$$A ) $$⊕ (byte n of$$B )
Typically, these properties imply that if we have an equation that includes XOR operations and one unknown, it can be solved. For example, if we know that$$A ⊕ X = B with unknown$$X and famous$$A and $$B , then we can rely on the above properties above to solve the equation for$$X . Applying XOR on both sides of the equation with $$A we get$$X = A ⊕ B .In a moment, all this will become very relevant.
There are two minor differences and one major difference between our Alice script and the Woden attack. Two minor:
In the scenario Alice expect that the open end in the text a, bb, cccand so on. In a Woden attack, the victim instead expects the plaintext to end N times with byte N (i.e., hexadecimal 01 or 02 02, or 03 03 03 and so on). This is a purely cosmetic difference.
In Alice’s script, it was easy to tell if Alice had received the message, in response to “Wrong fictitious text.” In a Woden attack, more analysis is needed and accurate implementation on the victim's side is important; but for brevity we take for granted that this analysis is still possible.
The main difference:
, ( ), , . .
This is the main difference - the last piece of the puzzle to understand Woden's attack, so let's think for a moment about why and how you can organize an oracle attack on CBC.
Suppose a CBC ciphertext of 247 blocks is given, and we want to decrypt it. We can send fake messages to the server, as before we could send fake messages to Alice. The server will decrypt the messages for us, but will not show the decryption - instead, again, as in the case of Alice, the server will report only one bit of information: the plaintext has a valid filling or not.
Note that in Alice’s scenario, we had the following relationships:
We call this the Alice equation. We controlled the ciphertext; the server (Alice) merged vague information about the received plaintext; and this allowed us to derive information about the last factor - the key. By analogy, if we can find such a connection for the CBC script, we could extract some secret information there too.
Fortunately, there really are relationships that we can use. Consider the output of the block cipher decryption final call and designate this data as$$W . Also denote plaintext blocks $$P 1 , P 2 , . . . and ciphertext blocks$$C 1 , C 2 , . . . . Take another look at the CBC chart and notice that it turns out:
$$C 246 ⊕ W = P 247
We call this the "CBC equation."
In Alice’s script, by monitoring the ciphertext and observing the leak of information about the corresponding plaintext, we were able to organize an attack that restored the third member of the equation — the key. In the CBC scenario, we also control the ciphertext and observe information leakage in the corresponding plaintext. If an analogy holds, we can get information about$$W .
Suppose we really restored $$W , then what? Well, then we can immediately print the entire last block of plaintext ($$P 247 ) by simply typing$$C 246 (which we have) andreceived $$W in the CBC equation. So, we are optimistic about the general plan of attack, and it's time to work out the details. We pay attention to exactly how the server leaks plaintext information. In Alice’s script, a leak occurred because Alice answered with the correct message only if
$$SIMPLE_SUBSTITUTION ( ciphertext , key ) ended with a line a(or bb, and so on, but the chances of these conditions being accidentally triggered were very small). Similarly with CBC, the server accepts padding if and only if$$C 246 ⊕ W ends in hexadecimal. So, let's try the same trick: sending fake ciphertexts with our own fake values01$$C 246 until the server accepts the fill. When the server accepts padding for one of our fake messages, it means that:
$$C 246 ⊕ W = data with hex 01 at the end
Now use the XOR byte property:
$$( last byte of C 246 ) ⊕ ( last byte of W ) = hex 01
We know the first and third member. And we have already seen that this allows you to restore the remaining member - the last byte from$$W :
$$( last byte of W ) = ( last byte of C 246 ) ⊕ ( hex 01 )
It also gives us the last byte of the final block of plaintext through the CBC equation and the byte property.
We could end up with this and be satisfied that we attacked a theoretically robust cipher. But in fact, we can do much more: we can really restore the entire text. This requires a certain trick, which was not in the original script of Alice and it is not included in the prerequisites for the attack of the oracle, but the method is still worth exploring.
To understand it, first note that as a result of deriving the correct value of the last byte$$W we have a new ability. Now, when faking ciphertexts, we can control the last byte of the corresponding plaintext. Again, this is due to the CBC equation and the byte property:
$$( last byte of C 246 ) ⊕ ( last byte of W ) = Last byte of P 247
Since we now know the second term, we can use our control of the first to control the third. We just calculate:
$$( Last byte of fake C 246 ) = ( Desired last byte of P 247 ) ⊕ ( Last byte of W )
We couldn’t do this before, because we didn’t yet have the last byte $$W .
How will this help us? Suppose now that we will create all ciphertexts so that in the corresponding plaintexts the last byte is equal 02. Now the server accepts padding only if the plaintext ends with 02 02. Since we corrected the last byte, this will happen only if the penultimate byte of plaintext is also 02. We continue to send fake ciphertext blocks, changing the penultimate byte, until the server accepts padding for one of them. At this moment we get:
$$( Penultimate byte of fake C 246 ) ⊕ ( penultimate byte of W ) = hex 02
And we restore the penultimate byte $$W exactly the same as restored last. We continue in the same vein: we fix the last two bytes of plaintext on, repeat this attack for the third byte from the end of the byte, and so on, ultimately fully recovering03 03$$W .
What about the rest of the text? Please note that the value$$W is actually$$BLOCK_DECRYPT ( key , C 247 ) . We can put any other block instead $$C 247 , and the attack will still be successful. In fact, we can ask the server to do$$BLOCK_DECRYPT for any data. At this point, the game is over - we can decrypt any ciphertext (once again, look at the CBC decryption diagram to see this; and note that vector IV is publicly available).
This particular method plays a decisive role in the attack of the oracle, which we will encounter later.
Attack kelsey
A close-minded John Kelsey set forth the principles that underlie many possible attacks, and not just the detailed details of a particular attack on a particular cipher. His 2002 article is a study of possible attacks on encrypted compressed data. You thought that to conduct an attack, information alone was not enough, that the data was compressed before encryption? It turns out enough.
This amazing result is due to two principles. Firstly, there is a strong correlation between the length of the plaintext and the length of the ciphertext; for many ciphers, exact equality. Secondly, when compression is performed, there is also a strong correlation between the length of the compressed message and the degree of “noise” of the plaintext, that is, the proportion of non-repeating characters (the technical term is “large entropy”).
To see the principle in action, consider two plaintexts:
Clear text 1: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Clear text 2:ATVXCAGTRSVPTVVULSJQHGEYCMQPCRQBGCYIXCFJGJ
Suppose both plaintexts are compressed and then encrypted. You get two resulting ciphertexts and must guess which ciphertext matches which plaintext:
The answer is clear. Among plaintexts, only plaintext 1 could be compressed to the meager length of the second ciphertext. We found out without knowing anything about the compression algorithm, the encryption key, or even the cipher itself. Compared to the hierarchy of possible cryptographic attacks, this is a kind of madness.
Kelsey further indicates that under certain unusual circumstances this principle can also be used to conduct an oracle attack. In particular, he describes how an attacker can recover secret plaintext if he can force the server to encrypt form data (plaintext, followed by$$X while he controls$$X and can somehow check the length of the encrypted result). Again, as in other oracle attacks, we have the ratio:
$$Encryption ( Compression ( Clear text followed by X ) ) = Ciphertext
Again, we control one member ( $$X ), we see a small leak of information about another member (ciphertext) and try to restore the latter (plaintext). Despite the analogy, this is a somewhat unusual situation compared to the other oracle attacks we saw.To illustrate how such an attack can work, we use the fictional compression scheme we just came up with: TOYZIP. He searches for lines of text that have already appeared earlier in the text, and replaces them with three placeholder bytes that indicate where to find an earlier instance of the line and how many times it appears there. For example, a stringcan be compressed with alength of 13 bytes compared to the original 15 bytes.Suppose an attacker tries to recover the plaintext of a form
helloworldhellohelloworld[00][00][05]
password=...where the password itself is unknown. According to Kelsey’s attack model, an attacker may ask the server to compress and then encrypt form messages (plaintext, followed by$$X ) where$$X is free text. When the server is finished, it reports the length of the result. The attack goes as follows:
Cracker: Please compress and encrypt the plaintext without any fillings.
Server: Length of the result 14.
Cracker: Please compress and encrypt the plaintext to which you added password=a.
Server: Result Length 18.
The cracker notes: [original 14] + [three bytes that replaced password=] +a
Cracker: Please compress and encrypt the plaintext to which you have added password=b.
Server: Length of the result 18.
Cracker: Please compress and encrypt the plaintext to which you have added password=.
Server: Length of result 17.
The cracker notes: [original 14] + [three bytes that replaced password=c]. This assumes that the original plaintext contains a string password=c. That is, the password begins with a letterc
Cracker: Please compress and encrypt the plaintext to which you have added password=a.
Server: Result Length 18.
The cracker notes: [original 14] + [three bytes that replaced password=] +a
Cracker: Please compress and encrypt the plaintext to which you have added password=b.
Server: Result Length 18.
(… a little while later…)
Cracker: Please compress and encrypt the plaintext to which you have added password=.
Server: Length of result 17.
The cracker notes: [original 14] + [three bytes that replaced password=co]. By the same logic, the cracker concludes that the password begins with the lettersco
And so on until the entire password is restored.
The reader is forgiven for thinking that this is a purely academic exercise and such an attack scenario will never occur in the real world. Alas, as we will soon see, in cryptography it is better not to promise.
Brand Vulnerabilities: CRIME, POODLE, DROWN
Finally, after a detailed study of the theory, we can see how these methods are used in real cryptographic attacks.
CRIME (Compression Ratio Infoleak Made Easy, ). , 2012 (Juliano Rizzo) (Thai Duong). , , . Google, , . :
$$-=(())
Here, the cracker controls the request and has access to the traffic sniffer, including packet size. Kelsey's fictional scenario came true.
Understanding the theory, CRIME authors created an exploit that could steal session cookies for a wide range of sites, including Gmail, Twitter, Dropbox and Github. The vulnerability affected most modern web browsers, as a result of which patches were released that silently buried the compression function in SSL so that it was not used at all. The only one protected from the vulnerability was the venerable Internet Explorer, which never used SSL compression at all.
POODLE
In October 2014, the Google security team made a stir in the security community. They were able to exploit the vulnerability in the SSL protocol, fixed more than ten years ago.
It turned out that although the new TLSv1.2 is running on the servers, many left support for outdated SSLv3 for backward compatibility with Internet Explorer 6. We already talked about downgrade attacks, so you can imagine what is happening. Well-organized sabotage of the handshake protocol - and the servers are ready to return to the good old SSLv3, essentially canceling the last 15 years of security research.