Introduction to Crpytograhy
The following page describes the very basics of modern encryption systems. Though more focus is put on the theory of encryption, a practical example is provided below. We hope this article can serve you as a basis in understanding all our future writings.
The goal of encryption is three-fold:
- To protect the privacy of the information (a third-party should not be able to read the contents)
- To protect the integrity of the information (a third-party should not be able to modify the contents)
- To guarantee the data's origin (a third-party should not be able to impersonate or send on behalf of one of the participants)
With this in mind, we can define a cryptographic scheme/system with the following information:
A domain - the unencrypted (plain text message) - m
A codomain - the encrypted (cipher text message) - CT
An encryption key - ke
A decryption key - kd
An encryption function - E ke(m) = CT
A decryption function - D kd(CT) = m
Based on the keys we distinguish between two types of cryptographic schemes.
Symmetric Encryption
Symmetric Encryption describes a cryptographic scheme where the encryption and decryption key are the same. This encryption process is perhaps the most commonly imagined one.
To understand how symmetric encryption works (which will apply for all symmetric encryptions), let us suppose Bob wants to send a love letter to Allice. However, Bob is scared that his very jealous ex-wife, Eve, will see the letter when he sends it. Thus, he decides to put the letter in a box and locks it with a padlock (which is the encryption format used, be it AES, Caesar, etc). The key in this analogy is the cryptographic key or secret passcode in the cipher. Then Bob sends the box to Allice.
While in transit, Eve can see that Bob is sending something to Alice, but cannot see the contents of it, without a key. Ideally, Eve should also not be able to modify the letter sent by Bob (so as to not impede his confession).
When Alice receives the box, she can simply open it with her key (the shared passcode). She is now able to read Bob's love letter without Eve interfering.
Formally, a symmetric encryption comprises an Encryption and Decryption function (the locking and unlocking of the padlock), a key, and a domain (the original message) and a codomain (the encrypted message - the locked box). The message before encryption is called a plaintext (PT) and after - ciphertext (CT).
Mathematically, encryption is:
Ek : PT -> CT
And decryption:
Dk : CT -> PT
It is possible that the message is English but the encrypted message is in Russian (so not even in the same alphabet). As in, the encryption can translate the messge from one set of symbols to another.
The Limitations of Symmetric Encryption
While this approach works perfectly well in practice, it has a fundamental problem - Bob and Alice need to both have a copy of the key. This would require them to have met prior to the message exchange process in order to obtain the same key. However, Eve is REALLY JEALOUS and would definitely try to interfere during such a meeting. Of course, Bob can't just send the key to Alice, as Eve can make a copy for herself.
A possible solution where no exchange of keys is necessary is as follows:
Bob puts the message in a box and locks it with his lock. He sends the box to Alice. She puts her own padlock on the box and resends it to Bob (it now has 2 locks that Eve will need to break). Bob unlocks his lock and send it back again. The box now has only Alice's lock. She now removes her lock and reads the love letter.
While this solution works in our analogy, it is difficult to apply in practice. It requires that the two encryption functions are commutable. As in:
DkB(EkB (m)) = m ----> Bob encrypting and then decrypting a message with his key needs to give the original message
and:
EkA(EkB (m)) = EkB(EkA (m)) ----> Whether Bob or Alice locks the box first, the end result needs to be the same.
This, unfortunately is not a property inherit to most systems. Hence a different approach is needed when Bob and Alice cannot meet to exchange keys.
Asymmetric Encryption
Asymmetric encryption schemes, or more commonly referred to as public-key encryption schemes, were a revolutionary discovery in the 1970s. As the name would suggest, in such a situation the key for encryption are decryption are different (and one cannot be derived from the other). Thus there is no need to meet to exchange the same key prior to the conversation.
Let us return to our analogy of locks and boxes. In order to describe a public-key system, we need to assume that if one has the lock, making the key for said lock is nearly impossible (which is not always the case in the real world, unfortunately). Thus, in order for Bob and Alice to communicate in a secure manner, Alice needs to send her lock before Bob sends her the letter. While in transit, Eve can take the lock and copy it. This is not a problem. Bob receives Alice's package. He then puts his secret letter in the box, which he locks with the padlock received. He sends it to Alice. Now, Eve can see a locked box coming through the postal system. She may have copied the padlock, but has no means of opening it. When Alice receives the package, she opens it with her own key and reads Bob's letter. And what happens next with them is history.
It is important to recognise in this system that the flow of secure communication is currently one-way, i.e. only Bob can send secure messages to Alice. Alice can lock her box with her padlock, but Bob won't be able to open it without the key. Thus Bob needs to send Alice his own padlock as well.
The system can thus be described formally:
EPubKey (m) = CT
And decryption:
DPrivKey (CT) = m
Also:
DPubKey (CT) =/= m -----> Decrypting with the public key should not give you the original message.
Here PubKey means Public Key (the padlock) and PrivKey - private key (the secret key to open the padlock with).
What happens in the cyber space, is that a user would generate a unique pair of keys PubKey and PrivKey, such that:
DPrivKey (EPubKey (m)) = m for all messages m
The user would then publish their public key somewhere. Thus essentially anyone can use the metaphorical padlock to message them. Any time they receive an encrypted message, they simply decrypt it with their private key. No one else can see the contents of that message (they can't decrypt it with the public key). However, the private key is never used for encryption, since any message can be decrypted with the public key (in most cryptosystems). Thus it also holds that:
DPubKey (EPrivKey (m)) = m for all messages m in most public-key cryptosystems
A Concrete Example
What was said in the previous section sounds wonderful and maybe almost magical. But how can something like this work?
Let us take a look at a concrete public-key encryption scheme (and the most popular one) - RSA.
Here I will spare you the math, which proves the security of this system. All you need to remember is that:
aPhi(n)+1 mod n = a
Where:
n = p * q
p and q are two very large prime numbers
Phi(n) = (p - 1) * (q - 1)
mod n is the modulo operation (remainder when dividing by n)
a is any whole number less than n
aPhi(n) mean a to the power of Phi(n)
The RSA system works as follows:
- Bob finds two very large prime numbers (ideally they would each be 1024 bits, however here for this example we will use much smaller ones) - 23 and 53.
- Bob calculates n = 23 * 53 = 1219
- Bob calculates phi(n) = (23 - 1) * (53 - 1) = 22 * 52 = 1144
- Bob chooses a public key, such that the greatest common divisor (gcd) of phi(n) and PubKey is 1 (they don't have common divisors). For this example we will choose 3.
- Bob finds a private key, such that PubKey * PrivKey mod phi(n) = 1. For our example this gives us PrivKey of 763
- Bob publishes n and PubKey.
- Alice encrypts a message (m = 13 in this example) computing CT = mPubKey mod n = 133 mod 1219 = 2197 mod 1219 = 978
- Alice sends 978 to Bob
- Bob decrypts 978 by computing m = CTPrivKey mod n = 978763 mod 1219 = 13
As you can see, the system can only encrypt whole numbers and only those less than the value of n. Thus some encoding is necessary, with very limited expression. You can play around with the cryptosystem below:
Comparison
An unfamiliar reader might think to themselves, after reading the above sections, "Why ever use symmetric cryptosystems? Asymmetric ones are clearly superior!" And they might have some grounds to claim that. Indeed, asymmetric schemes remove the need for Bob and Alice exchange a secret key when channels may be unsecure (as internet protocols by themselves are). This makes public key cryptosystems essential in our modern communication. However, they suffer from two major issues.
First, as some of you might have seen the calculations above (978763 mod 1219), can be very difficult to solve even by a computer with specialised algorithms and hardware. Imagine what a hassle it would be to calculate if we used 2048-bit keys. And when one considers the fact that a message is rarely less than 2048-bits (as these systems require), this would mean that it would need to be split into chunks and for each of them the computations repeated.
Second, the security of the algorithms is not "proven". They rely on the assumption that some mathematical problems are difficult to compute even by machines (for example given n, it is hard to compute its factors p and q). However, a concrete bound on how long a brute-force attack would take cannot be set. It is believed that this problem requires exponential computations, but this claim cannot be proven. Symmetric key encryptions, on the other hand, have much stronger guarantees about their security.
Thus in light of this, the modern world relies on hybrid systems. Initially Alice and Bob use the public-key encryption schemes to agree on some key and establish their identities. Then they use the agreed upon key to communicate in a secure manner via symmetric encryption systems.