With the popularity of HTTPS, the term TLS also appears more and more often. So what is TLS, and how does it make HTTP transmission secure?
TLS (Transport Layer Security), the successor to SSL (Secure Sockets Layer), sits between TCP and the application layer. Compared to HTTP, HTTPS does not change the protocol itself; it adds a TLS layer between TCP and HTTP that encrypts the traffic to keep the information secure.
Information transmitted in clear text faces several risks:
- Eavesdropping
- Tampering
- Forgery
TLS addresses all three of these issues through multiple measures.
Symmetric and Asymmetric Encryption
Before describing TLS, it is worth mentioning two encryption methods. Experience has taught us that relying on a secret encryption method is unreliable: once the method becomes public it is completely broken, and once a method is widely used it is difficult to keep secret. Modern cryptography therefore makes the method public and relies on a secret called a key for encryption and decryption, so that even if one key is compromised (stolen rather than cracked), messages encrypted with other keys remain safe.
Two techniques emerged from this idea: symmetric and asymmetric encryption. With symmetric encryption, the same key is used for encryption and decryption, and the biggest challenge is how to transmit the key securely. Asymmetric encryption instead generates a pair of keys, called the public key and the private key; the pair is easy to generate, but the private key is difficult to derive from the public key. Information encrypted with the public key can only be decrypted with the private key, so the public key can be sent over the network freely while the private key never needs to be transmitted at all. This solves the key distribution problem of symmetric encryption, but at the cost of performance.
We will see later how TLS uses both to ensure security without degrading performance.
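To make the difference concrete, here is a toy sketch in Python using only the standard library. The XOR keystream cipher and the tiny textbook RSA numbers (p = 61, q = 53) are illustrative assumptions; they are not secure and are not what TLS uses in practice.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key + counter. Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def symmetric_crypt(key: bytes, data: bytes) -> bytes:
    """XOR with the keystream; the same call both encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

# Textbook RSA with tiny primes (insecure, illustration only).
p, q = 61, 53
n = p * q                            # public modulus
e = 17                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (needs Python 3.8+)

def rsa_encrypt(m: int) -> int:
    """Anyone holding the public key (e, n) can do this."""
    return pow(m, e, n)

def rsa_decrypt(c: int) -> int:
    """Only the holder of the private exponent d can do this."""
    return pow(c, d, n)

# Symmetric: both sides must already share the same key.
shared_key = b"same key on both sides"
ciphertext = symmetric_crypt(shared_key, b"hello")
assert symmetric_crypt(shared_key, ciphertext) == b"hello"

# Asymmetric: only (e, n) travels over the network; d never leaves its owner.
secret_number = 65
assert rsa_decrypt(rsa_encrypt(secret_number)) == secret_number
```

The symmetric half needs only one key but leaves open how that key reaches the other side; the asymmetric half avoids that problem at the price of much more expensive arithmetic on real key sizes.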
Handshake Purpose
The TLS handshake is primarily designed to accomplish the following:
- Encryption component negotiation
- Server authentication (optionally client authentication is possible)
- Session key information exchange
This process exchanges all the information needed to make the subsequent encrypted communication possible.
Handshake Process
The TLS handshake can be divided into the following steps:
- The client initiates the connection and sends a Client Hello message to the server with its generated random number and the cipher suites it supports.
- The server receives the request and returns a Server Hello message with its own random number and the cipher suite it has chosen. The server then sends its certificate. At this point the server may also ask the client for a certificate. When it is done, it sends the Server Hello Done message.
- The client verifies that the server certificate is reliable to decide whether to continue, and closes the connection if it is not.
- If the certificate is considered reliable, the client generates a new random number, called the pre-master key, which is later used to generate the session key, and sends it to the server encrypted with the public key from the certificate.
- The client then sends a Change Cipher Spec message, indicating that subsequent messages will be encrypted and hashed with the new session keys, followed by a Client Finished message to end its part of the handshake.
- The server decrypts the received data to get the pre-master key, calculates the session key, and then sends its own Change Cipher Spec and Server Finished messages to the client.
At this point, both parties have a session key for symmetric encryption, which is used for subsequent communications.
The above is the TLS 1.2 handshake; TLS 1.3 simplifies the process.
The image above is the result of a handshake packet capture when visiting this blog.
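The outcome of a handshake can also be inspected from code instead of a packet capture. The following is a minimal sketch using Python's standard `ssl` module; the function name `describe_handshake` is made up for this example, and any host name passed to it is the caller's choice.

```python
import socket
import ssl

def describe_handshake(host: str, port: int = 443) -> dict:
    """Connect to host, let the ssl module run the handshake, report the result."""
    # The default context verifies the certificate chain and the host name.
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        # wrap_socket performs the handshake; server_hostname is used for
        # certificate verification and is also sent to the server as the SNI.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cipher_name, _protocol, secret_bits = tls.cipher()
            return {
                "version": tls.version(),
                "cipher_suite": cipher_name,
                "secret_bits": secret_bits,
                "subject": tls.getpeercert().get("subject"),
            }
```

Calling `describe_handshake` with a reachable host name returns the negotiated protocol version and cipher suite, or raises `ssl.SSLCertVerificationError` if the certificate cannot be verified. A modern server will often negotiate TLS 1.3 rather than the TLS 1.2 flow described above.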
The steps above involve two random numbers and the pre-master key whose purpose has not yet been explained; together they make up key exchange based on the RSA algorithm. Both parties feed all three values into the same algorithm to compute the key used for the final symmetric encryption. Since the public key in the certificate is static, random numbers are needed so that a different key is negotiated for every session. In addition, neither side can trust the other's randomness (the other host may always use the same number, or its numbers may follow a pattern and be guessable), so three random numbers, contributed by both sides, are combined to generate the key.
Another key exchange algorithm is Diffie-Hellman, where the client and the server each generate a public/private key pair and send the public key to the other side. A digital signature ensures that the public key has not been tampered with in transit, and each party then combines the other's public key with its own private key to compute the same shared key.
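The arithmetic behind this exchange fits in a few lines. This is a toy sketch with deliberately tiny numbers (p = 23, g = 5, and the private values 6 and 15 are all illustrative assumptions); real deployments use far larger parameters or elliptic curves, and the signature over the public values is omitted here.

```python
# Public parameters, agreed in the clear (tiny values for illustration only).
p = 23   # prime modulus
g = 5    # generator

def dh_public(private: int) -> int:
    """Public value that is sent to the other side."""
    return pow(g, private, p)

def dh_shared(their_public: int, my_private: int) -> int:
    """Shared secret computed from the peer's public value and my private value."""
    return pow(their_public, my_private, p)

client_private, server_private = 6, 15        # never transmitted
client_public = dh_public(client_private)     # sent to the server
server_public = dh_public(server_private)     # sent to the client

client_secret = dh_shared(server_public, client_private)
server_secret = dh_shared(client_public, server_private)
assert client_secret == server_secret         # both sides hold the same value
```

An eavesdropper sees p, g, and both public values, but with realistic parameter sizes cannot feasibly recover either private value or the shared secret.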
A key derivation function (KDF) is also used here to derive multiple keys from secret information such as the master key, which improves the randomness of the keys and ensures their security.
In short, TLS uses symmetric encryption for the actual communication, and the purpose of the handshake is to let both parties agree on the key for it. That key is produced by a function whose inputs are the values negotiated by both parties; the secret inputs are protected in transit by asymmetric encryption, and asymmetric cryptography also provides authentication. This gives both performance and security.
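For TLS 1.2, the derivation function is the PRF defined in RFC 5246. The sketch below follows that definition with HMAC-SHA256, assuming a cipher suite whose PRF hash is SHA-256; the random inputs are generated locally for illustration, and the 96-byte split into four keys assumes a suite with 32-byte MAC keys and 16-byte encryption keys.

```python
import hashlib
import hmac
import os

def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash expansion from RFC 5246, section 5, using HMAC-SHA256."""
    output = b""
    a = seed  # A(0) = seed
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()            # A(i)
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]

def prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """PRF(secret, label, seed) = P_hash(secret, label + seed)."""
    return p_sha256(secret, label + seed, length)

def derive_master_secret(pre_master: bytes, client_random: bytes,
                         server_random: bytes) -> bytes:
    """Combine the three random values into the 48-byte master secret."""
    return prf(pre_master, b"master secret", client_random + server_random, 48)

def derive_key_block(master: bytes, client_random: bytes,
                     server_random: bytes, length: int) -> bytes:
    """Expand the master secret; note the randoms are in the opposite order."""
    return prf(master, b"key expansion", server_random + client_random, length)

# Illustrative inputs; in a real handshake these come from the two peers.
client_random = os.urandom(32)
server_random = os.urandom(32)
pre_master = os.urandom(48)

master = derive_master_secret(pre_master, client_random, server_random)
key_block = derive_key_block(master, client_random, server_random, 96)

# Slice the block into per-direction keys (assumed sizes, see lead-in).
client_mac_key, server_mac_key = key_block[:32], key_block[32:64]
client_enc_key, server_enc_key = key_block[64:80], key_block[80:96]
```

Because both sides know the same three inputs, they arrive at identical keys without ever sending those keys over the wire.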
Certificate Authentication
During the handshake, the client needs to verify the reliability of the certificate after receiving it. Certificates also rely on asymmetric cryptography, but in the opposite direction. As mentioned before, for encryption the public key encrypts and only the private key can decrypt. For a certificate, the issuer transforms the data with the key it keeps to itself, and anyone can reverse the transformation with the key that is published, so only the issuer can produce the data while everyone else can only check it. The certificate carries data produced this way, and if others can recover the correct information from it, the certificate must come from the issuer and not from a third party.
In RSA, the two numbers obtained when generating a key pair play mathematically symmetric roles: what one of them encrypts, the other can decrypt, and which one becomes the public key depends on which one is made public. For certificates, the key used for checking is still called the public key because it is published, and the key used for producing the data is still called the private key because it is kept secret. Producing the data this way is called signing, and checking it is called verification.
But this raises another question: how can I be sure that the certificate comes from the right party? Anyone can generate a key pair and use someone else's information to create a fake certificate, so how can a forged certificate be detected? Here TLS relies on CAs (Certificate Authorities), trusted organizations whose job is to issue certificates. What a TLS server presents is actually a certificate chain.
The picture above shows the certificate information of this blog. You can see that it is issued by Let's Encrypt, with DST Root CA one layer above it; that layer is the root CA. Trusted root CA certificates are distributed by default with Windows, Chrome, and other software. When the client verifies the certificate, it checks the chain layer by layer up to the root and compares the root with the trusted certificates in the system. If the root belongs to a trusted CA, the certificate chain is considered trusted.
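The layer-by-layer walk can be modeled in a few lines. This is a deliberately simplified sketch: the `Cert` type, the function name, and all certificate names are hypothetical, and it only follows issuer names up to a trust store. A real verifier also checks every signature, validity period, and several other constraints.

```python
from dataclasses import dataclass
from typing import Sequence, Set

@dataclass(frozen=True)
class Cert:
    subject: str   # who the certificate is for
    issuer: str    # who issued (signed) it

def chain_is_trusted(chain: Sequence[Cert], trusted_roots: Set[str]) -> bool:
    """Walk from the leaf upwards; each certificate must be issued by the
    next one in the chain, and the last issuer must be a trusted root.
    Signature and validity checks are intentionally omitted."""
    if not chain:
        return False
    for cert, parent in zip(chain, chain[1:]):
        if cert.issuer != parent.subject:
            return False
    return chain[-1].issuer in trusted_roots

# Hypothetical names, for illustration only.
trust_store = {"Example Root CA"}
good_chain = [
    Cert("blog.example", "Example Intermediate CA"),
    Cert("Example Intermediate CA", "Example Root CA"),
]
forged_chain = [
    Cert("blog.example", "Rogue Intermediate CA"),
    Cert("Rogue Intermediate CA", "Rogue Root CA"),
]

assert chain_is_trusted(good_chain, trust_store)
assert not chain_is_trusted(forged_chain, trust_store)
# Installing the rogue root changes the outcome for the very same chain.
assert chain_is_trusted(forged_chain, trust_store | {"Rogue Root CA"})
```

The last assertion shows why the contents of the system trust store matter so much: the chain itself did not change, only the set of roots the client is willing to trust.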
CA-issued certificates are considered trustworthy because the CA verifies the applicant's identity during issuance, ranging from a simple check of site ownership to more stringent written documentation from the applicant, which ensures that the certificate goes to whoever actually owns the site. Moreover, the CA's private key is usually strictly protected and the issuance process is strictly supervised. If certificates are issued abusively or the private key leaks, the root certificate is revoked and every certificate chain based on it is considered unreliable. This strong deterrent keeps the possibility of forged certificates low.
A forged certificate fails because it cannot be traced back to a trusted root CA. However, if a forger's root certificate is installed in the system, every website certificate the forger issues will be trusted by the system, letting the attacker forge information. For this reason, never casually add a root certificate of unknown origin to the system.
The picture above shows the certificate obtained when visiting this blog. It carries basic information such as the domain name and validity period. The most critical field is the signature, which is the hash of the certificate encrypted with the issuer's private key. The client computes the hash with the same algorithm and compares it with the value obtained by decrypting the signature with the issuer's public key. If the two match, the certificate has not been tampered with, and together with the certificate chain mechanism this makes the certificate credible.
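The hash-then-sign check can be sketched with textbook RSA. Everything here is an assumption made for illustration: the key is built from two small Mersenne primes, the digest is reduced modulo n instead of being padded as real signature schemes require, and the certificate body is a made-up byte string.

```python
import hashlib

# Toy issuer key pair (textbook RSA, far too small to be secure).
p, q = 2**61 - 1, 2**89 - 1          # two Mersenne primes
n = p * q                            # public modulus
e = 65537                            # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (needs Python 3.8+)

def digest_as_int(data: bytes) -> int:
    """SHA-256 digest reduced modulo n so it fits the toy key.
    Real schemes use a padding scheme instead of this reduction."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(data: bytes) -> int:
    """Issuer side: transform the hash with the private exponent."""
    return pow(digest_as_int(data), d, n)

def verify(data: bytes, signature: int) -> bool:
    """Client side: undo the transform with the public exponent and
    compare with a freshly computed hash."""
    return pow(signature, e, n) == digest_as_int(data)

# Hypothetical certificate body, for illustration only.
cert_body = b"subject=blog.example; valid=2024-01-01..2024-03-31"
signature = sign(cert_body)

assert verify(cert_body, signature)
assert not verify(b"subject=evil.example; valid=2024-01-01..2024-03-31", signature)
```

Changing even one byte of the certificate body changes its hash, so the value recovered from the signature no longer matches and verification fails.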
Summary
Now let’s go back to the three problems that TLS solves: eavesdropping, tampering, and forgery.
- Eavesdropping: The information exchanged during the handshake is in plaintext, but it covers only the cipher suites and the certificate, which are useless to a third party. The critical piece, the pre-master key, is transmitted under asymmetric encryption, and after the handshake all traffic uses symmetric encryption. An eavesdropper cannot obtain the key and therefore cannot read the traffic.
- Tampering: During the handshake, the certificate is protected by its signed hash and cannot be altered undetected; if any other handshake information is altered, the two sides will be unable to communicate afterwards. After the handshake, the attacker cannot obtain the encryption key, so although packet contents can be modified, they cannot be turned into what the attacker intends.
- Forgery: The pre-master key in key negotiation is encrypted with the certificate's public key. The certificate has been verified as trusted and only its owner can decrypt the pre-master key, so the session key is known only to the two communicating parties and messages cannot be forged.
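The tampering point can be illustrated with a keyed hash. This is a simplified model rather than the actual TLS record format: it assumes a 32-byte MAC key that both peers derived during the handshake, uses HMAC-SHA256, and leaves out encryption entirely so that only the integrity check is visible. The function names are made up for this example.

```python
import hashlib
import hmac
import os

# Stand-in for a MAC key known only to the two peers after the handshake.
session_mac_key = os.urandom(32)
TAG_LENGTH = 32  # size of an HMAC-SHA256 tag in bytes

def protect(message: bytes) -> bytes:
    """Sender: append a keyed hash of the message."""
    tag = hmac.new(session_mac_key, message, hashlib.sha256).digest()
    return message + tag

def open_record(record: bytes) -> bytes:
    """Receiver: recompute the keyed hash and reject any mismatch."""
    message, tag = record[:-TAG_LENGTH], record[-TAG_LENGTH:]
    expected = hmac.new(session_mac_key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("record was modified in transit")
    return message

record = protect(b"transfer 10 coins")
assert open_record(record) == b"transfer 10 coins"

# An attacker without the key flips one bit of the message.
tampered = bytes([record[0] ^ 1]) + record[1:]
try:
    open_record(tampered)
except ValueError:
    pass  # the modification is detected and the record is discarded
```

Without the session key the attacker cannot compute a matching tag, so a modified record is rejected instead of being accepted as something the attacker chose.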
One thing worth mentioning: being able to modify packet contents may still seem dangerous, but it only disrupts the communication. After all, anyone who can modify a packet can also simply refuse to forward it, which is beyond the scope of TLS. It does not matter as long as the modified content cannot become something the attacker has deliberately chosen.
It can be said that TLS solves the problem of communication security and ensures that data transmitted by upper-layer applications cannot be obtained by third parties, but some minor problems remain.
The image above shows a packet capture of the Client Hello message, in which you can see the host name being accessed. Information exchanged during the handshake is in plaintext, and that includes the SNI (Server Name Indication). This information would normally live in the HTTP request header, but the server must know during the handshake which site the request is for, so the client has to provide it. Nothing is available yet to encrypt it with, so the SNI can only be sent in clear text, and a third party can still learn which site is being accessed.
The good news is that TLS is still improving and solutions to many of these problems have been proposed, so I believe a more secure Internet will arrive soon.