
X.509 trust model – Digital Certificates and Certification Authorities

10.2.6 X.509 trust model

Generally speaking, there are three steps in verifying a certificate:

  • Verify that the certificate has not been revoked
  • Verify that the certificate is valid (for instance, verify the certificate’s extensions and its validity period)
  • Verify the signature of the CA over the certificate

As discussed earlier, the first step consists of either consulting a CRL or an OCSP server. But to verify the signature over a certificate (or a CRL or an OCSP response), we need an authentic copy of the CA’s public key – in other words, we need another certificate, in which the original CA is the subject. In X.509, these CA certificates are issued by a higher-order CA, which in turn has a certificate from an even higher-order CA, so that a hierarchical tree structure emerges with a Root CA at the top, which certifies itself (see Figure 10.4).

Figure 10.4: X.509 hierarchical trust model. An arrow from X to Y means that X certifies Y. EE stands for End Entity

The Path Length Constraint within the Basic Constraints extension field limits the number of CA certificates that may follow a CA certificate down the tree until the EE certificate under consideration is reached. For CA0 in Figure 10.4, for example, the path length constraint could be set to 0, whereas for CA1 the path length constraint should be at least 1, lest verification of the end entity certificates for C, D, and E fail.
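To make this concrete, the following Python sketch checks path length constraints along a chain shaped like the one in Figure 10.4. The dictionaries and field names are illustrative stand-ins for real X.509 structures, not an actual certificate API:

```python
def check_path_len(chain):
    """Check Basic Constraints path length limits along a chain.

    chain is ordered root-first, end entity last. For each CA
    certificate, the number of CA certificates that follow it in
    the chain (the end entity does not count) must not exceed its
    path length constraint, if one is set.
    """
    cas = chain[:-1]                     # all but the end-entity certificate
    for i, ca in enumerate(cas):
        constraint = ca.get("path_len")  # None means "no constraint"
        followers = len(cas) - i - 1     # CA certificates below this one
        if constraint is not None and followers > constraint:
            return False
    return True

# Mirrors Figure 10.4: Root certifies CA1, CA1 certifies CA0,
# and CA0 issues the end-entity certificate for C.
root = {"name": "Root", "path_len": None}
ca1  = {"name": "CA1", "path_len": 1}   # must allow one more CA below it
ca0  = {"name": "CA0", "path_len": 0}   # issues end-entity certificates only
ee   = {"name": "C"}

print(check_path_len([root, ca1, ca0, ee]))   # True
# Setting CA1's constraint to 0 makes verification of C's certificate fail:
print(check_path_len([root, {"name": "CA1", "path_len": 0}, ca0, ee]))  # False
```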

In practice, a CA is rarely certified by another, independent CA. Rather, there are instances within the same company that act as root CAs for their own intermediate CA instances. From a security perspective, this is equivalent to a CA self-signing its own public key. This means that, in most cases, the relying parties cannot transfer the question of whether to trust a particular CA to some independent higher-order instance, but must base their trust decisions entirely on the publicly available information about a CA, for example, the CPS. Although modern browsers contain lists of trusted CAs, it is not entirely clear how these lists come about, and they should be no substitute for making your own decisions.


10.5.6 The Certificate message

The Certificate message contains the endpoint’s – server Alice’s or client Bob’s – certificate chain.

Server Alice sends the Certificate message to client Bob whenever the key exchange Alice and Bob agreed upon uses certificate-based authentication. This is the case for all TLS 1.3 key exchange methods except TLS handshake based on a pre-shared key.

Client Bob sends a Certificate message if and only if server Alice has requested him to authenticate himself using the CertificateRequest message. If Alice requested that Bob authenticate himself using a certificate, but Bob has no suitable certificate at hand, he sends a Certificate message containing no certificates. More precisely, Bob sends a Certificate message with the certificate_list field having a length of zero. The structure of the Certificate message is shown in Listing 10.5.

Listing 10.5: Structure of the Certificate message

enum {
   X509(0),
   RawPublicKey(2),
   (255)
} CertificateType;
struct {
   select (certificate_type) {
       case RawPublicKey:
           /* From RFC 7250 ASN.1_subjectPublicKeyInfo */
           opaque ASN1_subjectPublicKeyInfo<1..2^24-1>;
       case X509:
           opaque cert_data<1..2^24-1>;
         };
   Extension extensions<0..2^16-1>;
} CertificateEntry;
struct {
   opaque certificate_request_context<0..2^8-1>;
   CertificateEntry certificate_list<0..2^24-1>;
} Certificate;

If the Certificate message is received in response to a CertificateRequest message, the certificate_request_context field stores an identifier for the certificate request. When server Alice requests that client Bob authenticate himself using a certificate, Alice can supply an additional context, which client Bob would return to bind his certificate to Alice’s request. If client Bob does not authenticate himself using a certificate (which is the most common case on the internet), certificate_request_context has a length of zero.

The certificate_list field contains a chain of CertificateEntry data structures. Each CertificateEntry stores a single digital certificate together with its corresponding set of extensions.

The extensions field consists of certain extension values for a CertificateEntry, including OCSP status for server certificates and the SignedCertificateTimestamp extensions. The extensions in the Certificate message from server Alice must match those found in client Bob’s ClientHello message. Similarly, extensions in the Certificate message from client Bob have to match those in server Alice’s CertificateRequest message.

Server Alice’s certificate_list must never be empty. In contrast, client Bob may send an empty certificate_list if he has no appropriate certificate to respond to server Alice’s authentication request with.
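For illustration, here is a minimal Python sketch of the wire encoding of such an empty Certificate message: handshake type certificate(11), a 3-byte length, the length-prefixed certificate_request_context, and the length-prefixed (empty) certificate_list. It assumes an empty context, as in a handshake without a server-supplied context:

```python
def empty_certificate_message(context: bytes = b"") -> bytes:
    # certificate_request_context<0..2^8-1>: 1-byte length prefix + content
    body = bytes([len(context)]) + context
    # certificate_list<0..2^24-1>: 3-byte length prefix, zero entries
    body += (0).to_bytes(3, "big")
    # Handshake header: msg_type certificate(11) + 3-byte body length
    return bytes([11]) + len(body).to_bytes(3, "big") + body

print(empty_certificate_message().hex())   # '0b00000400000000'
```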


10.3 Main components of a public-key infrastructure

A Public-Key Infrastructure (PKI) is a system that is able to issue, distribute, and validate certificates. While a CA is an important part of a PKI, the two terms are not the same. In order to limit the potential damage in case of a compromise, it is customary that the various operational tasks of a PKI are taken over by logically separate functional entities within the PKI, which have their own private keys. One of these is the CA. We will now take a closer look at all these entities:

  • Certification Authority (CA): Within a PKI, the CA is responsible for creating, signing, and issuing the certificates. Moreover, all certificates issued by the CA should be archived in a secure manner.

When looking at Figure 10.4, it quickly becomes clear that a CA is a single point of failure within a PKI. It is therefore mandatory to run the CA within a specially secured environment with strictly enforced access control rules. Nevertheless, there have been incidents in the past where CAs have been compromised by an attacker, and we will discuss one such incident in the next section.

  • Registration Authority (RA): The RA is the instance to which Alice sends her certificate signing request (CSR) in the course of enrollment. The RA checks the data provided by Alice in the CSR. Depending on the security level of the requested certificate, this check can happen online or offline (for example, by Alice having to appear in person at the RA and present her passport). If the check is successful, the data is forwarded to the CA, which generates a corresponding certificate.
  • Directory Server (DIR): The directory server stores the certificates issued by the CA and makes them publicly available over a searchable directory.
  • Validation Authority (VA): The VA is responsible for issuing CRLs and/or running the OCSP server. As such, they form another critical component of the PKI and should be secured accordingly.
  • Time Stamping Server (TSS): A TSS signs documents along with the current time and date. Therefore, it can be used to prove that a document existed at a certain time. If Alice, for example, loses her private key and revokes her certificate, it is important to know which of the documents Alice has signed previously already existed before she revoked her certificate. Otherwise, Alice could claim that basically all documents she has ever signed were actually not signed by herself but by an attacker who stole her private key.

Another application is the so-called Signed Certificate Timestamp (SCT). By way of an SCT, the CA records the exact time a certificate was issued. SCTs can be delivered to a relying party either by directly embedding them into the certificate’s extension (see Figure 10.5), by sending them in a TLS extension during the TLS handshake, or as part of a response to an OCSP stapling request [51].

Figure 10.5: SCTs embedded in a certificate

What cryptographic guarantees does encryption provide? – Hash Functions and Message Authentication Codes

11.2 What cryptographic guarantees does encryption provide?

On a more fundamental level, the attacks described in the above examples work because Alice and Bob, as illustrated in Figure 11.1, can only use the encryption function ek.

Intuitively, it might seem as if encryption protects Alice’s and Bob’s messages against manipulation by Mallory because the ciphertext hides the plaintext message and Mallory cannot know how to manipulate the encrypted message in a meaningful way. But this is completely wrong! Encryption provides no guarantees for message integrity or authenticity.

We can convince ourselves that this is indeed the case by taking a closer look at the one-time pad encryption scheme from Chapter 4, Encryption and Decryption.

Recall that the one-time pad encrypts a message m under the key k as:

c = m ⊕ k

where ⊕ denotes a bit-wise exclusive OR (XOR) operation. If you take two bits b0,b1 and apply the XOR operation to them, b0 ⊕ b1 will yield zero whenever both bits have the same value (that is, both are zero or both are one) and one whenever the bits have a different value.

To decrypt a message encrypted under the one-time pad scheme, the receiver computes:

m = c ⊕ k

In the case a one-time pad is used, it is very easy for Mallory to manipulate the ciphertext c because every bit flip in the ciphertext leads to the same bit being flipped in the decrypted message.

In other words, if Mallory has a ciphertext c that encrypts Bob’s message m, she can easily generate a manipulated ciphertext c′ that encrypts the same message as m, but with one or more bits of Mallory’s choice flipped.
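A few lines of Python make this malleability concrete; the key and message are toy values chosen for illustration:

```python
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key = bytes([0x5A, 0x13, 0xC4, 0x77, 0x21])   # one-time pad key
m   = b"PAY 1"
c   = xor(m, key)                             # Bob's genuine ciphertext

# Mallory flips bit 3 of the last ciphertext byte without touching the key:
mask = bytes([0x00, 0x00, 0x00, 0x00, 0x08])
c_forged = xor(c, mask)

# The same bit flips in the decrypted plaintext: '1' (0x31) becomes '9' (0x39).
print(xor(c_forged, key))                     # b'PAY 9'
```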

Thus, even a perfectly secret – that is, information-theoretically secure – encryption scheme does not provide message integrity. The same is true for stream ciphers because, as we have seen in Chapter 4, Encryption and Decryption, their encryption operation is identical to encryption using the one-time pad: the plaintext is simply XORed with the key stream.

Intuitively, one might think that block ciphers – a class of much stronger encryption algorithms we will learn about in Chapter 14, Block Ciphers and Their Modes of Operation – are resilient against the above manipulations and would offer some degree of integrity. After all, modern block ciphers are pseudorandom permutations where a one-bit change in the plaintext results in the change of roughly half of the bits in the ciphertext.

As a result, if Mallory changes only a single bit in ciphertext c to obtain a manipulated ciphertext c′, the decryption result p′ = dk(c′) will be totally different from the genuine decryption result p = dk(c). It turns out, however, that even this property does not help with message integrity and authenticity!

As an example, if a block cipher is used in the so-called electronic codebook (ECB) mode (more on this in Chapter 14, Block Ciphers and their Modes of Operation) and Mallory flips a bit in the i-th block of the ciphertext, only the i-th block of the plaintext will change when Bob decrypts the manipulated ciphertext.

Alternatively, Mallory can manipulate the order of blocks in the original ciphertext c to obtain a manipulated ciphertext c′. When Bob decrypts c′, he will obtain a manipulated plaintext p′ where the individual blocks are identical to those of the genuine plaintext p, but their order is manipulated (their order is the same as in c′).
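The following Python sketch illustrates the block-reordering attack. The four-byte "block cipher" is merely a toy stand-in (XOR with a fixed key), not a secure cipher, but it shares the property that matters here: in ECB mode, each block is encrypted deterministically and independently of the others:

```python
BLOCK = 4
KEY = bytes([0xA5, 0x5A, 0x0F, 0xF0])

def toy_ecb_encrypt(m: bytes) -> bytes:
    # Encrypt each 4-byte block independently, as ECB mode does.
    return b"".join(bytes(x ^ k for x, k in zip(m[i:i + BLOCK], KEY))
                    for i in range(0, len(m), BLOCK))

toy_ecb_decrypt = toy_ecb_encrypt      # XOR with the key is its own inverse

m = b"AAAABBBBCCCCDDDD"                # four plaintext blocks
c = toy_ecb_encrypt(m)

# Mallory swaps the second and third ciphertext blocks:
blocks = [c[i:i + BLOCK] for i in range(0, len(c), BLOCK)]
c_forged = blocks[0] + blocks[2] + blocks[1] + blocks[3]

# Bob decrypts valid-looking plaintext with the blocks reordered:
print(toy_ecb_decrypt(c_forged))       # b'AAAACCCCBBBBDDDD'
```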

If only encryption is used, Bob is not able to detect these manipulations. Moreover, this is independent of the type of symmetric key encryption algorithm used by Bob. We can therefore conclude that encryption alone provides only confidentiality, but not message integrity or authenticity.


11.3.2 Candidate one-way functions

With the current knowledge in complexity theory, mathematicians do not know how to unconditionally prove the existence of one-way functions. As a result, their existence can only be assumed. There are, however, good reasons to make this assumption: there exists a number of very natural computational problems that were the subject of intense mathematical research for decades, sometimes even centuries, yet no one was able to come up with a polynomial-time algorithm that can solve these problems.

According to the fundamental theorem of arithmetic, every positive integer can be expressed as a product of prime numbers, the numbers being referred to as prime factors of the original number [74]. One of the best known computational problems that is believed to be a one-way function is prime factorization: given a large integer, find its prime factors.

The first table of prime factors of integers was created by the Greek mathematician Eratosthenes more than 2,500 years ago. The last factor table for numbers up to 10,017,000 was published by the American mathematician Derrick Norman Lehmer in 1909 [178].

Trial division was first described in 1202 by Leonardo of Pisa, also known as Fibonacci, in his manuscript on arithmetic Liber Abaci. French mathematician Pierre de Fermat proposed a method based on a difference of squares, today known as Fermat’s factorization method, in the 17th century. A similar integer factorization method was described by the 14th century Indian mathematician Narayana Pandita. The 18th century Swiss mathematician Leonhard Euler proposed a method for factoring integers by writing them as a sum of two square numbers in two different ways.

Starting from the 1970s, a number of so-called algebraic-group factorization algorithms were introduced that work in an algebraic group. As an example, the British mathematician John Pollard introduced two new factoring algorithms, called the p − 1 algorithm and the rho algorithm, in 1974 and 1975, respectively. The Canadian mathematician Hugh Williams proposed the Williams’ p + 1 algorithm in 1982.

In 1985, the Dutch mathematician Hendrik Lenstra published the elliptic curve method, currently the best known integer factorization method among the algorithms whose complexity depends on the size of the factor rather than the size of the number to be factored [204].

Moreover, a number of so-called general-purpose integer factorization methods whose running time depends on the size of the number to be factored have been proposed over time. Examples of such algorithms are the Quadratic Sieve introduced in 1981 by the American mathematician Carl Pomerance, which was the most effective general-purpose algorithm in the 1980s and early 1990s, and the General Number Field Sieve method, the fastest currently known algorithm for factoring large integers [39].

Yet, except for the quantum computer algorithm proposed in 1994 by the American mathematician Peter Shor, none of the above algorithms can factor integers in polynomial time. As a result, mathematicians assume that prime factorization is indeed a one-way function, at least for classical computers. Shor’s algorithm, on the other hand, requires a sufficiently large quantum computer with advanced error correction capabilities to keep the qubits stable during the entire computation, a technical challenge for which no solutions are known as of today.

Another example of a function believed to be one-way is a family of permutations based on the discrete logarithm problem. Recall that the discrete logarithm problem involves determining the integer exponent x given a number of the form g^x, where g is a generator of a cyclic group. The problem is believed to be computationally intractable, that is, no algorithms are known that can solve this problem in polynomial time.

The conjectured one-way family of permutations consists of the following:

  • An algorithm to generate an n-bit prime p and an element g ∈ {2,…,p − 1}
  • An algorithm to generate a random integer x in the range {1,…,p−1}
  • The function fp,g(x) = g^x (mod p)

Function fp,g(x) is easy to compute, for example, using the square-and-multiply algorithm. Inverting fp,g(x), on the other hand, is believed to be computationally hard because inverting modular exponentiation is equivalent to solving the discrete logarithm problem for which no polynomial-time algorithms are known to date, as discussed in Chapter 7, Public-Key Cryptography.
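In Python, the easy direction is the built-in three-argument pow, which implements modular exponentiation via square-and-multiply. The sketch below uses deliberately tiny, illustrative parameters so that the hard direction can be brute-forced; for cryptographic sizes of p, no comparable shortcut is known:

```python
p = 2**13 - 1          # 8191, a (tiny) prime; real p would have 2048+ bits
g = 17
x = 4242               # the secret exponent

y = pow(g, x, p)       # easy direction: square-and-multiply, fast even for huge p

# "Inverting" by brute force -- feasible only because p is tiny:
recovered = next(e for e in range(p - 1) if pow(g, e, p) == y)
print(pow(g, recovered, p) == y)   # True
```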


11.4 Hash functions

A hash function is some function hash that maps an arbitrarily long input string onto an output string of fixed length n. More formally, we have hash : {0,1}* → {0,1}^n.

A simplistic example of a hash function would be a function that always outputs the last n bits of an arbitrary input string m. Or, if n = 1, one could use the bitwise XOR of all input bits as the hash value.

However, these simple hash functions do not possess any of the properties required from a cryptographically secure hash function. We will now first discuss these properties, and afterward look at how secure hash functions are actually constructed.
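Both toy functions can be written in a few lines of Python, and collisions for them are trivial to produce, which is exactly why they fail the requirements discussed next:

```python
def last_n_bits(m: bytes, n: int = 8) -> int:
    # "Hash" = the last n bits of the input string.
    return int.from_bytes(m, "big") & ((1 << n) - 1)

def xor_parity(m: bytes) -> int:
    # n = 1 case: XOR of all input bits.
    bit = 0
    for byte in m:
        bit ^= bin(byte).count("1") & 1
    return bit

# Any two inputs that agree in their last byte collide under last_n_bits:
print(last_n_bits(b"attack at dawn!X") == last_n_bits(b"retreat now!...X"))  # True
# Flipping any two bits preserves the XOR parity -- another instant collision:
print(xor_parity(b"\x03") == xor_parity(b"\x00"))  # True
```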

11.4.1 Collision resistance

Cryptographic hash functions are hard to construct because they have to fulfill stringent requirements, which are motivated by their use within Message Authentication Codes (MACs) (see Section 11.5, Message Authentication Codes) and Digital Signatures (see Chapter 9, Digital Signatures).

Recall, for example, that in the RSA cryptosystem, Alice computes a digital signature over some message m as

sigAlice(m) = hash(m)^d (mod n)

where d is Alice’s private key and n is the public modulus. Alice then sends the pair (m,sigAlice(m)) to Bob.

If Eve observes this pair, she can compute hash(m) using Alice’s public key PKAlice = (e,n) via

hash(m) = (sigAlice(m))^e (mod n)

Eve now knows hash(m) and the corresponding preimage m. If she manages to find another message m′ with the same hash value (a so-called second preimage), m and m′ will have the same signature. Effectively, Eve has signed m′ in Alice’s name without knowing her private key. This is the most severe kind of attack on a cryptographic hash function.

Therefore, given m and hash(m), it must be computationally hard for Eve to find a second preimage m′ such that hash(m′) = hash(m). This property of a cryptographic hash function is called second preimage resistance or weak collision resistance.

Note that when trying out different input messages for the hash function, collisions must occur at some point, because hash maps longer messages onto shorter ones. In particular, if the given hash value hash(m) is n bits long, a second preimage m′ should be found after about 2^n trials. Therefore, a second preimage attack is considered successful only if it has a significantly smaller complexity than O(2^n).

A weaker form of attack occurs if Eve manages to find any collision, that is, any two messages m1, m2 with

hash(m1) = hash(m2)

without reference to some given hash value. If it is computationally hard for Eve to construct any collisions, the hash is called strongly collision resistant.

Again, when trying out many different candidate messages, collisions will naturally occur at some point, this time after about 2^(n/2) trials. This smaller number is a consequence of a phenomenon commonly known as the Birthday Paradox, which we will discuss in detail in Section 19.7, Attacks on hash functions in Chapter 19, Attacks on Cryptography.

Consequently, an attack on strong collision resistance is considered successful only if it has a significantly smaller complexity than O(2^(n/2)). This also shows that in general, that is, assuming there are no cryptographic weaknesses, hash functions with longer hash values can be considered to be more secure than hash functions with shorter hash values.
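The birthday effect is easy to observe by deliberately truncating a real hash function. The Python sketch below keeps only n = 16 bits of SHA-256; a collision then appears after roughly 2^(n/2) = 256 inputs, far fewer than the about 2^16 trials needed to hit one given value:

```python
import hashlib

def h16(m: bytes) -> bytes:
    # SHA-256 truncated to n = 16 bits -- weak on purpose.
    return hashlib.sha256(m).digest()[:2]

seen = {}
collision = None
for i in range(70000):          # more than 2^16 inputs guarantees a hit
    msg = i.to_bytes(4, "big")
    digest = h16(msg)
    if digest in seen:
        collision = (seen[digest], msg)
        break
    seen[digest] = msg

m1, m2 = collision
print(i + 1, m1 != m2 and h16(m1) == h16(m2))
```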

Note that strong collision resistance of a hash function implies weak collision resistance. Hash functions that are both strongly and weakly collision resistant are called collision resistant hash functions (CRHF).


11.5.1 How to compute a MAC

Basically, there are two options to form a MAC. The first option closely follows the approach we adopted to compute digital signatures in Chapter 9, Digital Signatures. Back then, we hashed the message m first and encrypted the hash value with the signer’s private key:

sig(m) = ed(hash(m))

Analogously, using their shared secret k, Alice and Bob could compute

t = ek(hash(m))

as MAC. Here, encryption is done via some symmetric encryption function, for example, a block cipher (see Chapter 14, Block Ciphers and Their Modes of Operation). Note that if Alice sends m||t to Bob and Eve manages to find another message m′ so that hash(m′) = hash(m), then Eve can replace m with m′ without being noticed. This motivates the collision resistance requirement on hash functions described in Section 11.4, Hash functions.

However, even if we are using a collision-resistant hash function, in a symmetric setting where Alice and Bob both use the same key k, one might ask whether it is really necessary to agree on and deploy two different kinds of algorithms for computing a MAC. Moreover, hash functions are built for speed and generally have a much better performance than block ciphers.

The second option for computing a MAC therefore only uses hash functions as building blocks. Here, the secret k is used to modify the message m in a certain way and the hash function is applied to the result:

t = hash(m,k)

This option is called a key-dependent hash value. In which way k should influence the message m depends on how the hash function is constructed. In any case, if Eve is able to reconstruct the input data from the output value hash(m,k), she might be able to get part of or even the complete secret key k. This motivates the one-way property requirement on hash functions described in Section 11.4, Hash functions. A well-proven way to construct a key-dependent hash called HMAC is defined in [103].

11.5.2 HMAC construction

The HMAC construction is a generic template for constructing a MAC via a key-dependent hash function. In this construction, the underlying hash function hash is treated as a black box that can be easily replaced by some other hash function if necessary. This construction also makes it easy to use existing implementations of hash functions. It is used within TLS as part of the key derivation function HKDF (see Section 12.3, Key derivation functions in TLS within Chapter 12, Key Exchange).

When looking at the way hash functions are built, using either the Merkle–Damgård or the Sponge Construction, it quickly becomes clear that input bits from the first message blocks are well diffused over the final output hash value. Input bits in the last message blocks, on the other hand, are only processed at the very end, and the compression or the round function, respectively, is only applied a few times on these bits. It is therefore a good idea to always append the message to the key in key-dependent hash functions. The simple construction

MACk(m) = hash(k||m)

however, suffers from so-called Hash Length Extension Attacks, if the hash function is constructed according to the Merkle–Damgård scheme. Here, an attacker knowing a valid pair (m,MACk(m)) can append another message mA to the original message m and compute the corresponding MAC without knowing the secret key k. This is because

MACk(m||mA) = hash(k||m||mA) = comp(hash(k||m), mA) = comp(MACk(m), mA)

where comp is the compression function used for building the hash function.

In the HMAC construction, the input message m is therefore appended twice to the keying material, but the second time in a hashed form that cannot be forged by an attacker. More specifically, for an input message m and a symmetric key k, we have

HMACk(m) = hash((k′ ⊕ opad) || hash((k′ ⊕ ipad) || m))

where:

  • hash : {0,1}* → {0,1}^n is some collision-resistant OWHF, which processes its input in blocks of size r.
  • k is the symmetric key. It is recommended that the key size should be ≥ n. If k has more than r bits, one should use hash(k) instead of k.
  • k′ is the key padded with zeros so that the result has r bits.
  • opad and ipad are fixed bitstrings of length r: opad = 01011100 repeated r∕8 times, and ipad = 00110110 repeated r∕8 times. Both opad and ipad, when added via ⊕, flip half of the key bits.

In this construction, the hash length extension attack will not work because, in order to forge MACk(m||mA), an attacker would need to construct hash(k′ ⊕ ipad || m || mA). This is impossible, however, as the attacker does not know hash(k′ ⊕ ipad || m).
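The construction is short enough to write out directly. The following Python sketch instantiates HMAC with SHA-256 (block size r = 64 bytes) using hashlib and checks the result against Python’s standard hmac module:

```python
import hashlib
import hmac

def my_hmac_sha256(k: bytes, m: bytes) -> bytes:
    r = 64                                   # SHA-256 block size in bytes
    if len(k) > r:
        k = hashlib.sha256(k).digest()       # hash over-long keys first
    k_prime = k.ljust(r, b"\x00")            # k': key padded with zeros to r bytes
    ipad = bytes(b ^ 0x36 for b in k_prime)  # k' XOR ipad
    opad = bytes(b ^ 0x5C for b in k_prime)  # k' XOR opad
    inner = hashlib.sha256(ipad + m).digest()
    return hashlib.sha256(opad + inner).digest()

k, m = b"secret key", b"attack at dawn"
print(my_hmac_sha256(k, m) == hmac.new(k, m, hashlib.sha256).digest())   # True
```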

More generally, the HMAC construction does not rely on the collision-resistance of the underlying hash function, because a collision in the hash function does not imply the construction of a colliding HMAC.


11.6 MAC versus CRC

Can we construct a MAC without a cryptographic hash function and without a secret key? Let’s take a look at the Cyclic Redundancy Check (CRC), which is a popular error-detecting code used in communication systems to detect accidental errors in messages sent over a noisy or unreliable communication channel.

The working principle of an error-detecting code is for the sender to encode their plaintext message in a redundant way. The redundancy, in turn, allows the receiver to detect a certain number of errors – that is, accidental bit flips – in the message they receive. The theory of channel coding, pioneered in the 1940s by the American mathematician Richard Hamming, aims to find codes that have minimal overhead (that is, the least redundancy) but, at the same time, have a large number of valid code words and can correct or detect many errors.

CRC is a so-called cyclic code, that is, a block code in which a circular shift of every code word yields another valid code word. The use of cyclic codes for error detection in communication systems was first proposed by the American mathematician and computer scientist Wesley Peterson in 1961.

A cyclic code encodes the plaintext message by attaching to it a fixed-length check value based on the remainder of a polynomial division of the message’s content. The receiving party repeats that calculation and checks whether the received check value is equal to the computed check value.

The algebraic properties of cyclic codes make them suitable for efficient error detection and correction. Cyclic codes are simple to implement and well suited to detecting so-called burst errors. Burst errors are contiguous sequences of erroneous bits in communication messages and are common in many real-world communication channels.

A CRC code is defined using a generator polynomial g(x) with binary coefficients 0 and 1. The plaintext message, encoded as another polynomial m(x), is divided by the generator polynomial. The CRC is then computed by discarding the resulting quotient polynomial and taking the remainder polynomial r(x) as the CRC, which is subsequently appended to the plaintext as a checksum. The whole arithmetic is done within the finite field 𝔽2; therefore, the coefficients of the remainder polynomial are also 0 and 1.

As an example, consider the generator polynomial g(x) = x^2 + x + 1. To encode a message, we represent it as a polynomial, divide it by x^2 + x + 1, and take the remainder of this division – here at most two bits, since the remainder’s degree is less than that of g(x) – as the CRC check value to be appended to the plaintext message.
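Polynomial division modulo 2 reduces to shifting and XORing, so the whole computation fits in a few lines of Python. The sketch below encodes the example message 1101 (that is, x^3 + x^2 + 1) with the generator 111 (x^2 + x + 1):

```python
def poly_mod(val: int, gen: int) -> int:
    # Remainder of dividing the polynomial val by gen over GF(2):
    # repeatedly XOR a shifted copy of gen into val (subtraction = XOR).
    gen_deg = gen.bit_length() - 1
    while val.bit_length() - 1 >= gen_deg:
        val ^= gen << (val.bit_length() - 1 - gen_deg)
    return val

def crc(msg: int, gen: int) -> int:
    # Shift the message left by deg(g) bits, then take the remainder.
    return poly_mod(msg << (gen.bit_length() - 1), gen)

check = crc(0b1101, 0b111)
print(bin(check))                                   # 0b10
# Receiver side: the message with the CRC appended divides evenly.
print(poly_mod((0b1101 << 2) | check, 0b111))       # 0
```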


Setting the initial hash value

Before the actual hash computation can begin, the initial hash value H(0) must be set based on the specific hash algorithm used. For SHA-256, H(0) is composed of the following eight 32-bit words – denoted H0(0) to H7(0) – which are the first 32 bits of the fractional parts of the square roots of the first 8 prime numbers:

H0(0) = 6a09e667

H1(0) = bb67ae85

H2(0) = 3c6ef372

H3(0) = a54ff53a

H4(0) = 510e527f

H5(0) = 9b05688c

H6(0) = 1f83d9ab

H7(0) = 5be0cd19

For SHA-384, H(0) is composed of eight 64-bit words denoted H0(0) to H7(0), the words being the first 64 bits of the fractional parts of the square roots of the ninth through sixteenth prime numbers:

H0(0) = cbbb9d5dc1059ed8

H1(0) = 629a292a367cd507

H2(0) = 9159015a3070dd17

H3(0) = 152fecd8f70e5939

H4(0) = 67332667ffc00b31

H5(0) = 8eb44a8768581511

H6(0) = db0c2e0d64f98fa7

H7(0) = 47b5481dbefa4fa4

For SHA-512, H(0) is composed of eight 64-bit words – denoted H0(0) to H7(0) – which are the first 64 bits of the fractional parts of the square roots of the first 8 prime numbers:

H0(0) = 6a09e667f3bcc908

H1(0) = bb67ae8584caa73b

H2(0) = 3c6ef372fe94f82b

H3(0) = a54ff53a5f1d36f1

H4(0) = 510e527fade682d1

H5(0) = 9b05688c2b3e6c1f

H6(0) = 1f83d9abfb41bd6b

H7(0) = 5be0cd19137e2179

The way the constants for the initial hash value H(0) were chosen for the SHA-2 family hash algorithms – namely, by taking the first 16 prime numbers, computing a square root of these numbers, and taking the first 32 or 64 bits of the fractional part of these square roots – is yet another example of nothing-up-my-sleeve numbers.

Because prime numbers are the atoms of number theory and the square root is a simple, well-known operation, it is very unlikely that these constants were chosen for any specific reason.

The choice of the constants is natural and their values are limited because only the first 16 prime numbers are used. As a result, it is very unlikely that someone could design a cryptographic hash function containing a backdoor based on these constants.
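This derivation is easy to reproduce. The Python sketch below computes the first 32 fractional bits of the square roots of the first eight primes exactly (math.isqrt(p << 64) equals floor(2^32 · sqrt(p)), and reducing modulo 2^32 drops the integer part) and recovers the SHA-256 initial hash value word for word:

```python
import math

primes = [2, 3, 5, 7, 11, 13, 17, 19]

# floor(2^32 * sqrt(p)) mod 2^32 = first 32 bits of the fractional part of sqrt(p)
iv = [math.isqrt(p << 64) & 0xFFFFFFFF for p in primes]

print([f"{word:08x}" for word in iv])
# ['6a09e667', 'bb67ae85', '3c6ef372', 'a54ff53a',
#  '510e527f', '9b05688c', '1f83d9ab', '5be0cd19']
```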

Message digest computation

Recall that for SHA-256, the message is first padded to have a length that is a multiple of 512 bits. To compute the SHA-256 message digest, the padded message is parsed into N 512-bit blocks M(1), M(2), …, M(N), which are processed as shown in Algorithm 1.

The term Mt(i) denotes specific 32 bits of the 512-bit block M(i). As an example, M0(i) denotes the first 32 bits of block M(i), M1(i) denotes the next 32 bits of block M(i), and so on, up to M15(i). Moreover, the SHA-256 algorithm uses a so-called message schedule consisting of 64 32-bit words W0 to W63, 8 32-bit working variables a to h, and 2 temporary variables T1,T2. The algorithm outputs a 256-bit hash value composed of 8 32-bit words.

Computation of the SHA-512 message digest, shown in Algorithm 2, is identical to that of SHA-256, except that the message schedule consists of 80 64-bit words W0 to W79 and the algorithm uses 8 64-bit working variables a to h and outputs a 512-bit message digest composed of 8 64-bit words.

Moreover, the term Mt(i) now denotes specific 64 bits of the 1,024-bit block M(i). That is, M0(i) denotes the first 64 bits of block M(i), M1(i) denotes the next 64 bits of block M(i), and so on, up to M15(i).

Finally, the SHA-384 hash algorithm is computed exactly like SHA-512, except the following:

  • The initial hash value H(0) for SHA-384 is used
  • The final hash value H(N) is truncated to H0(N)|| H1(N)|| H2(N)|| H3(N)|| H4(N)|| H5(N) to produce a 384-bit message digest

Algorithm 1: Computation of the SHA-256 message digest.

Algorithm 2: Computation of the SHA-512 message digest


11.7 Hash functions in TLS 1.3

We’ll now take a look at how hash functions are negotiated within the TLS handshake and how they are subsequently used in the handshake.

11.7.1 Hash functions in ClientHello

Recall that Alice and Bob use the TLS handshake protocol to negotiate the security parameters for their connection. They do it using the TLS handshake messages shown in Listing 11.3. Once assembled by the TLS endpoint – that is, server Alice or client Bob – these messages are passed to the TLS record layer, where they are embedded into one or more TLSPlaintext or TLSCiphertext data structures. The data structures are then transmitted according to the current state of the TLS connection.

Listing 11.3: TLS 1.3 handshake messages

enum {
   client_hello(1),
   server_hello(2),
   new_session_ticket(4),
   end_of_early_data(5),
   encrypted_extensions(8),
   certificate(11),
   certificate_request(13),
   certificate_verify(15),
   finished(20),
   key_update(24),
   message_hash(254),
   (255)
} HandshakeType;

One of the most important TLS handshake messages is ClientHello since this message starts a TLS session between client Bob and server Alice. The structure of the ClientHello message is shown in Listing 11.4. The cipher_suites field in ClientHello carries a list of symmetric key algorithms supported by client Bob, specifically the encryption algorithm protecting the TLS record layer and the hash function used with the HMAC-based key derivation function HKDF.

Listing 11.4: TLS 1.3 ClientHello message

struct {
   ProtocolVersion legacy_version = 0x0303;    /* TLS v1.2 */
   Random random;
   opaque legacy_session_id<0..32>;
   CipherSuite cipher_suites<2..2^16-2>;
   opaque legacy_compression_methods<1..2^8-1>;
   Extension extensions<8..2^16-1>;
} ClientHello;
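Each suite in the cipher_suites vector is a two-byte code point; RFC 8446 defines TLS_AES_128_GCM_SHA256 as 0x1301, TLS_AES_256_GCM_SHA384 as 0x1302, and TLS_CHACHA20_POLY1305_SHA256 as 0x1303. The Python sketch below decodes such a vector from its wire format; the sample bytes are hand-crafted for illustration:

```python
TLS13_SUITES = {
    0x1301: "TLS_AES_128_GCM_SHA256",        # AES-128-GCM + HKDF with SHA-256
    0x1302: "TLS_AES_256_GCM_SHA384",        # AES-256-GCM + HKDF with SHA-384
    0x1303: "TLS_CHACHA20_POLY1305_SHA256",  # ChaCha20-Poly1305 + HKDF with SHA-256
}

def parse_cipher_suites(data: bytes) -> list:
    # CipherSuite cipher_suites<2..2^16-2>: 2-byte length, then 2-byte entries
    length = int.from_bytes(data[:2], "big")
    entries = data[2:2 + length]
    return [TLS13_SUITES.get(int.from_bytes(entries[i:i + 2], "big"), "unknown")
            for i in range(0, length, 2)]

wire = bytes.fromhex("0004" "1301" "1303")   # length 4: two cipher suites
print(parse_cipher_suites(wire))
# ['TLS_AES_128_GCM_SHA256', 'TLS_CHACHA20_POLY1305_SHA256']
```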

11.7.2 Hash Functions in TLS 1.3 signature schemes

Recall that server Alice and client Bob also agree upon the signature scheme they will use during the TLS handshake. The SignatureScheme field indicates the signature algorithm with the corresponding hash function. The following code shows digital signature schemes supported in TLS 1.3:


enum {
    /* RSASSA-PKCS1-v1_5 algorithms */
    rsa_pkcs1_sha256(0x0401),
    rsa_pkcs1_sha384(0x0501),
    rsa_pkcs1_sha512(0x0601),
    /* ECDSA algorithms */
    ecdsa_secp256r1_sha256(0x0403),
    ecdsa_secp384r1_sha384(0x0503),
    ecdsa_secp521r1_sha512(0x0603),
    /* RSASSA-PSS algorithms with public key OID rsaEncryption */
    rsa_pss_rsae_sha256(0x0804),
    rsa_pss_rsae_sha384(0x0805),
    rsa_pss_rsae_sha512(0x0806),
    /* EdDSA algorithms */
    ed25519(0x0807),
    ed448(0x0808),
    /* RSASSA-PSS algorithms with public key OID RSASSA-PSS */
    rsa_pss_pss_sha256(0x0809),
    rsa_pss_pss_sha384(0x080a),
    rsa_pss_pss_sha512(0x080b),
    — snip —
} SignatureScheme;

We’ll now discuss the SHA family of hash functions in detail.

SHA-1

SHA-1 is a hash algorithm that was in use from 1995 as part of the FIPS standard 180-1, but has been deprecated by NIST, BSI, and other agencies due to severe security issues with regard to its collision resistance. In 2005, a team of Chinese researchers published the first cryptanalytic attacks against the SHA-1 algorithm. These theoretical attacks allowed the researchers to find collisions with much less work than with a brute-force attack. Following further improvements in these attacks, NIST deprecated SHA-1 in 2011 and disallowed using it for digital signatures in 2013.

In 2017, a team of researchers from the CWI Institute in Amsterdam and Google published SHAttered, the first practical attack on SHA-1, by crafting two different PDF files having an identical SHA-1 hash value. You can test the attack yourself at https://shattered.io/.

Finally, in 2020, two French researchers published the first practical chosen-prefix collision attack against SHA-1. Using the attack, Mallory can build colliding messages with two arbitrary prefixes. This is much more threatening for cryptographic protocols, and the researchers have demonstrated their work by mounting a PGP/GnuPG impersonation attack. Moreover, the cost of computing such chosen-prefix collisions has been significantly reduced over time and is now considered to be within the reach of attackers with computing resources similar to those of academic researchers [64].

While SHA-1 must not be used as a secure cryptographic hash function, it may still be used in other cryptographic applications [64]. As an example, based on what is known today, SHA-1 can be used for HMAC because the HMAC construction does not require collision resistance. Nevertheless, authorities recommend replacing SHA-1 with a hash function from the SHA-2 or SHA-3 family as an additional security measure [64].