Previously, I wrote an article on SSL protocol, which introduces the fundamentals of cryptography and gives an overall overview of the SSL protocol. For space reasons, the detailed flow of the SSL handshake was not covered in depth. In this paper, we will disassemble the handshake process and introduce it in detail at the message level. If you don’t have the basic concepts of cryptography or don’t know the basic information of SSL protocol, it is recommended to read the previous article first. In addition, this article mainly focuses on TLSv1.2 and GMVPN, and does not cover TLSv1.3 for now.
Overview
This article will be divided into the following sections, with a focus on key exchange and authentication.
We all know that the main purpose of SSL protocol is to negotiate the security parameters of both sides of the communication, and then to communicate securely based on the negotiated security parameters. Therefore, the first problem to be solved is how to exchange keys. Considering the significant performance difference between symmetric and asymmetric ciphers, in practice, symmetric ciphers are often used to encrypt data, while asymmetric ciphers are used to perform the important key exchange and authentication.
Basic handshake flow
The diagram below shows a flowchart of the SSL handshake as we normally see it. The SSL protocol is divided into two phases: the handshake phase and the application phase. The handshake phase is primarily responsible for negotiating security parameters, while the application phase communicates data based on the previously negotiated security parameters.
Each message is not described in detail here for now, but will be covered later in the contents. The ChangeCipherSpec message is a separate class of message that indicates that the handshake protocol has been completed and that the subsequent communication will be performed using the negotiated encryption parameters. The main reason for its existence is that the preceding message CertificateRequest is optional, so it needs to explicitly tell the client that the server side has finished sending the message, otherwise the client has no way to know whether to wait for the CertificateRequest message.
How to perform key exchange
The methods of key exchange can be divided into two categories: one based on encryption and one based on DH. the former has RSA algorithm, GM’s ECC algorithm; the latter has ECDHE, GM’s ECDHE.
Although it has been covered in the previous article, here is a little bit with the basic principles of these two to facilitate the comparison in the actual handshake process later.
Principle of key exchange
The basic principle of public key encryption is as follows.
A public key cipher has two keys, one of which is the public key, which can be disseminated freely, and the other is the private key, which needs to be kept tightly by yourself. For example, if Bob wants to send a message to Alice, Bob encrypts the message with Alice’s public key and sends it to Alice, who then decrypts it with her own private key.
The principle of DH class algorithm can be explained graphically by the following diagram.
First both parties negotiate an identical base color (algorithm parameter), then each generates its own private color (equivalent to the private key) and obtains the corresponding public color (equivalent to the public key) by mixing. Then both parties exchange their public colors and mix them with their private key colors to finally negotiate an identical color (i.e., the exchanged key). The eavesdropper cannot generate the same key even if he gets this information exchanged between the two parties, the difficulty of solving the discrete object problem ensures the security of the DH algorithm.
TLS RSA key exchange
The typical algorithm suite for this case is AES256-SHA. (The relevant messages have been marked in blue, and the solid arrows attached to the message block indicate that the message carries the corresponding content.)
This case is relatively simple, first the client generates a random number to bring in the ClientHello message, and the server also generates a random number to bring in the ServerHello message. Then the server sends its certificate to the client in the Certificate message, which contains its public key. After the client receives it, it encrypts a randomly generated premaster key with the server’s public key and sends it to the server via ClientKeyExchange message, and the server decrypts it with its own private key to get the premaster key. Here is the application of the principle of public key encryption mentioned earlier.
After the key exchange, both sides get the pre-master key, plus the two random numbers exchanged earlier, and these three materials are used together for key derivation to get the master key. The master key is then combined with the two random numbers to derive the final session key.
Let’s throw out a few questions here for your consideration. Why not use the pre-master key directly, but derive the master key with two random numbers? Why not use the master key directly, but derive the session key with two more random numbers? I’ll leave the answer for now and break it down later.
TLS ECDHE key exchange
The typical algorithm suites for this case are ECDHE-RSA-AES256-GCM-SHA384, ECDHE-ECDSA-AES256-GCM-SHA384, where EC means elliptic curve, DH means based on DH algorithm, and the last E means using temporary key for key exchange instead of certificate related non-temporary key.
Let’s compare the difference with the previous case, first of all, ClientHello and ServerHello messages are the same, both with a random number. The difference is that the server side sends a temporary DH parameter to the other side through the ServerKeyExchange message (that is, the public color of the DH principle), and similarly the client also sends its temporary DH parameter to the server side through the ClientKeyExchange message. Thus both sides exchange each other’s temporary DH public keys, and then they each use their temporary private keys and each other’s temporary public keys to calculate the pre-master key. The major difference with the former case is that the client encrypts the pre-master key and sends it to the server, while the latter case is that both sides exchange the temporary DH public key and then each calculates the pre-master key.
After getting the pre-master key, the subsequent process is the same. Combine two random numbers to derive the master key, and then derive the session key.
GM ECC Key Exchange
Next we look at the GM ECC key exchange case, the typical algorithm suite for this case is ECC-SM4-SM3. One of the major differences of the GMVPN protocol is that it introduces a dual certificate system, where each party has two certificates (corresponding to two private keys), one encryption certificate for key exchange and one signing certificate for authentication.
The main difference with TLS RSA key exchange is that the certificate message in this case contains a dual certificate, and the client uses the public key from the server-side encryption certificate when encrypting the pre-master key. Correspondingly, the server uses its own encrypted private key when decrypting. The subsequent key derivation process is the same, so we won’t repeat it here.
The reason why GM introduces dual certificates is that your encrypted private key is kept in the CA and it can decrypt your messages if necessary.
GM ECDHE Key Exchange
The typical algorithm suite for this case is ECDHE-SM4-SM3, also with a dual certificate, the ServerKeyExchange message with the server’s temporary DH parameters, and the client sending the dual certificate and its temporary DH parameters to the server. Notice that when calculating the pre-master key, four materials are involved in the operation, the public key in the other party’s encryption certificate and the temporary public key, its own encryption private key and the temporary private key, and the pre-master key is calculated from these four materials together. As a comparison, the ECDHE of ordinary TLS only has its own temporary private key and the temporary public key of the other party involved in the calculation.
The biggest difference of GM ECDHE lies in this, so the algorithm suite of GM ECDHE must be bi-directional authentication, because the cryptographic certificate of the client is also required to participate in the key exchange. From here, we can also see that the GM protocol is not strict in design, which algorithm suite only allows two-way authentication. Moreover, since GM ECDHE also has a temporary key involved in calculating the pre-master key, it cannot be decrypted even if the CA has an encrypted private key, which is contrary to the original purpose of the dual certificate.
TLS1.2 key derivation process
The process of key derivation has actually been drawn earlier, based on encryption or DH exchange between the two sides to get the pre-master key, the pre-master key combined with two random numbers to derive the master key, the derivation in TLS1.2 is done through PRF, where secret is the pre-master key, label is a specific string, and seed is the two random numbers in front.
After the master key is obtained, the session key is derived by combining two random numbers, also using PRF, but then secret is the master key, label string is different, and seed is still two random numbers.
The IV is drawn as a dashed line in the figure, because it is not generally needed in TLS1.2. Because since TLS1.1 changed to explicit IV (in order to prevent CBC explicit selection attack), IV is brought explicitly in the record, only when using AEAD algorithm, implicit nonce is needed.
As for the two questions thrown out earlier, why not just use the premaster key? On the one hand, it is to unify the length, because the encryption-based key exchange premaster key is 48 bytes, while the DH-based key exchange premaster key length depends on the specific algorithm. A more important point is that the random numbers of both sides are added to the calculation, which can prevent replay attacks and also increase the entropy source of random numbers to increase security.
So why not use the master key directly and derive the session key again? This is mainly a life cycle consideration, the two have different life cycles, the life cycle of the master key is longer, the session key is shorter, only in the current session valid. The difference can be seen later on when we talk about session reuse.
How to perform authentication
We discussed how to do key exchange earlier, but key exchange only negotiates out the key, and does not consider who the other party is? How do you confirm the identity of the other party and prevent attacks by man-in-the-middle? That’s why authentication is introduced.
Authentication needs to solve two problems.
- confirming that the other party has the private key corresponding to the public key
- Confirming the identity of the other party
The first problem can be solved by using a simple digital signature. By verifying the other party’s digital signature, it can be confirmed that the other party has the corresponding private key.
The second problem, on the other hand, requires the introduction of digital certificates and PKI. Digital certificate, to be frank, is to bind a public key with identity information, and then a third-party trusted authority (CA) will give you a certificate (signed by CA). Then who will prove the identity of the CA itself, and then through a higher level of CA for proof. The ultimate root CA can only be authenticated by other means, otherwise it’s an infinite loop.
Principle of digital signature
The basic principle of digital signature is also very simple, as opposed to the previous public key encryption, here the private key is used to sign the data, and the other party uses the public key to check the signature data.
RSA authentication for TLS RSA key exchange
The typical algorithm suite is still AES256-SHA.
The server-side authentication involves only Certificate messages. The server sends the certificate and CA certificate chain to the client, and the client only needs to verify the certificate and certificate chain. The final Root CA needs to be locally trusted, otherwise there is a security risk and may be subject to man-in-the-middle attack.
If two-way authentication is required, i.e. the server side also needs to authenticate the client, it also involves the 3 messages in orange in the figure. The server side will send a CertificateRequest message to tell the client the type of certificate supported, the signature hash algorithm, and the DN entry of the CA. The client selects a suitable client certificate based on this information, then sends the certificate and CA certificate chain to the server in the Certificate message, and in addition needs to make a signature with its own private key to prove that it has the private key corresponding to the certificate. Here the content of the signature is all the previous handshake messages (from ClientHello to all messages before the current message). Then the server verifies the client’s certificate chain as well as the signature.
Authentication during TLS ECDHE key exchange
Typical algorithm suites such as ECDHE-RSA-AES256-GCM-SHA384, ECDHE-ECDSA-AES256-GCM-SHA384.
Compared with the previous case, the server side has one more signature. Because the key exchange in this case is done by temporary DH parameters, there is no private key corresponding to the server-side certificate involved, so you need to make an extra signature with the private key corresponding to the certificate to prove that you really have the private key corresponding to the certificate. The content of the signature here is two hello random numbers and the temporary DH parameters of the server side, which are actually the three materials involved in the calculation of the pre-master key.
The client needs to verify this signature in addition to the certificate chain.
The part of client-side authentication is exactly the same as the previous case and will not be repeated.
GM ECC authentication during key exchange
The typical algorithm suite is still ECC-SM4-SM3.
In the previous case of TLS RSA, the server does not need to make an additional signature. Because both key exchange and authentication are carried out through the same certificate, the server can complete the key exchange (decrypt the pre-master key) already means it has the corresponding private key.
This is not the case for GM ECC, because the key exchange and identity authentication are carried out by different certificates due to the dual certificate relationship. So the server still needs to make a signature to prove its private key possession. The signature is done by signing the private key, and the content of the signature is changed, which is two hello random numbers and the encryption certificate of the server. In addition to verifying the certificate chain, the client also needs to verify the signature with the server-side signature certificate.
The process of client-side authentication is still similar, except that the double certificate is also sent, and then the signature private key is also used in the signature. In addition to verifying the certificate chain, the server side also needs to verify the signature with the client’s signing certificate.
GM ECDHE authentication during key exchange
The typical algorithm suite is still ECDHE-SM4-SM3.
Compared with the previous cases, the main difference is that the content of the signature on the server side is a little bit different, which is two hello random numbers + temporary DH parameters on the server side. We notice that in the previous cases, the content of the server-side signature is the material involved in the calculation of the pre-master key. From this logic, there are some problems in the protocol design on this side, because the server-side encryption certificate actually also participates in the key exchange, which should also be included in the signature content according to the protocol design consistency.
The other parts are exactly the same as the previous case, so we won’t go over them again.
How to negotiate the protocol algorithm
As we discussed earlier, we can see that different protocol versions and different algorithm suites handle the handshake process differently, so how do the two sides agree on this?
So it’s time to introduce a Hello interaction. This is the main reason why a complete SSL handshake requires two interactions. This Hello interaction is used to agree on the protocol version and algorithm suite to be used. Also thanks to the extensions introduced by the SSL protocol, not only the protocol version and algorithm suite, but also many other things can be negotiated, even user-defined extensions.
But the most basic thing is the protocol version, algorithm suite and so on. The client sends a list of supported versions, algorithm suites, and in the case of elliptic curves, a list of supported curves, signature algorithms, etc. Based on the information from the client, the server selects the final protocol version, algorithm suite, EC point format, etc.
How to improve performance
A full SSL handshake requires two RTTs and also time-consuming asymmetric operations. The protocol has been designed with performance improvement in mind. In terms of handshake process optimization, the main focus is to simplify the handshake process by session reuse. (For TLSv1.3, there are PSK, 1-RTT and 0-RTT, but this paper will not cover TLS1.3.) Here is the main introduction to session reuse.
The basic flow of session reuse
The basic flow of session reuse is shown above. First is a complete handshake, then the client if it wants to reuse the previous session, in ClientHello to carry out the corresponding instructions to tell the server, the server if agreed to reply in ServerHello, and then it is a straightforward simplified handshake, without the need for key exchange and identity authentication. Not only does it save one interaction, but it also eliminates time-consuming asymmetric operations.
Session ID
There are two ways of session reuse, first look at the Session ID process.
In the previous handshake, the server will tell the client the Session ID in the ServerHello message, and after the handshake is completed normally, the server will save the corresponding Session, which contains the master key.
Later, if the client wants to reuse the previous Session, it can bring the previous Session ID in the ClientHello, and the server will look for it according to the Session ID after receiving it, if it is found and not expired, then the session will be reused and the server will send the Session ID to the client again as it is, and enter the simplified handshake process; otherwise, the Otherwise, the server still randomly generates a new Session ID and sends it to the client, reverting to the full handshake process.
When session reuse, the session key is derived using the master key of the previous Session. Here you can see the difference in the lifecycle. The same master key is used for session reuse, but the session key is regenerated each time. And notice that two random numbers are also involved in key derivation, again for replay prevention reasons.
Session Ticket
In the former case, the server needs to save the session cache, which consumes memory resources and causes cache synchronization problems if it is a cluster. The appearance of session ticket is to solve these problems.
Let’s see how it works, first of all the client brings the session_ticket extension in the ClientHello extension to indicate that it wants to use the session_ticket function, and if the server agrees, it replies to the session_ticket extension in the ServerHello. The server does not save the session, but sends it to the client via NewSessionTicket message after encryption. Here the session ticket key is actually a symmetric key, which only the server side knows.
Later, if the client wants to reuse the session, the previous session_ticket is inserted in the ClientHello extension, and the session is reused after the server side successfully decrypts it and passes the verification, otherwise it reverts to the full handshake. Then the session key is derived again based on the master key as well.
This way the server does not have the burden of saving the session, but there is no free lunch, and the session ticket will bring some damage to the forward security. If the session ticket key is leaked, then the previous handshake based on session reuse can be broken.
Therefore, in practice, the session ticket key should be changed frequently to reduce the risk of forward security.
How security is ensured
A brief review of how security is ensured in the handshake.
The first anti-replay is achieved by two hello random numbers. The prevention of man-in-the-middle attacks is accomplished by authenticating the server side. Anti-tampering is done by the final Finished message, which contains a verify_data, which is also calculated by the PRF, where the seed is the summary value of all previous handshake messages (starting with ClientHello and ending with the current Finished message).
Forward security is also a point that needs special attention. In fact, TLS 1.3 cut all algorithm suites without forward security, leaving only DHE or ECDHE algorithm suites. In addition, as mentioned earlier, the session ticket will also have some damage to the forward security.
Finally, I would like to emphasize the importance of random numbers. The security of a good encryption algorithm is entirely based on the security of the key. If the random number itself is not of good quality, for example, it can be predicted, then all the previous work is in vain. Random numbers are arguably the cornerstone of all of this.
Finally, the barrel theory applies in the field of security, where the security of a communication depends on its weakest link. Whether it is the protocol design, the code implementation or in the user’s use, any single mistake can lead to huge security problems.