10.2.6 X.509 trust model

Generally speaking, there are three steps in verifying a certificate:

  • Verify that the certificate has not been revoked
  • Verify that the certificate is valid (for instance, verify the certificate’s extensions and its validity period)
  • Verify the signature of the CA over the certificate

As discussed earlier, the first step consists of either consulting a CRL or an OCSP server. But to verify the signature over a certificate (or a CRL or an OCSP response), we need an authentic copy of the CA’s public key – in other words, we need another certificate, in which the original CA is the subject. In X.509, these CA certificates are issued by a higher-order CA, which in turn has a certificate from an even higher-order CA, so that a hierarchical tree structure emerges with a Root CA at the top, which certifies itself (see Figure 10.4).

Figure 10.4: X.509 hierarchical trust model. An arrow from X to Y means that X certifies Y. EE stands for End Entity.
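
To make the third step (and part of the second) concrete, here is a minimal sketch using Python's cryptography package. It assumes an RSA-signed certificate with PKCS#1 v1.5 padding and uses the hypothetical file names server.pem and ca.pem; ECDSA- or EdDSA-signed certificates need the corresponding verify() variants, and a complete validation additionally covers revocation and the certificate's extensions.

from datetime import datetime

from cryptography import x509
from cryptography.hazmat.primitives.asymmetric import padding

# Hypothetical file names: the end-entity certificate and its issuer's certificate
cert = x509.load_pem_x509_certificate(open("server.pem", "rb").read())
ca = x509.load_pem_x509_certificate(open("ca.pem", "rb").read())

# Step 2 (in part): check the validity period
now = datetime.utcnow()
assert cert.not_valid_before <= now <= cert.not_valid_after

# Step 3: verify the CA's signature over the to-be-signed part of the
# certificate; raises InvalidSignature on failure (RSA PKCS#1 v1.5 assumed)
ca.public_key().verify(
    cert.signature,
    cert.tbs_certificate_bytes,
    padding.PKCS1v15(),
    cert.signature_hash_algorithm,
)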

The Path Length Constraint within the Basic Constraints extension field limits the number of CA certificates that may follow a CA certificate down the tree until the EE certificate under consideration is reached. For CA0 in Figure 10.4, for example, the path length constraint could be set to 0, whereas for CA1, the path length constraint should be at least 1, lest verification of the end entity certificates for C, D, and E fails.

In practice, a CA is rarely certified by another, independent CA. Rather, there are instances within the same company that act as root CAs for their own intermediate CA instances. From a security perspective, this is equivalent to a CA self-signing its own public key. This means that, in most cases, the relying parties cannot transfer the question of whether to trust a particular CA to some independent higher-order instance, but must base their trust decisions entirely on the publicly available information about a CA, for example, the CPS. Although modern browsers contain lists of trusted CAs, it is not entirely clear how these lists come about, and they should be no substitute for making your own decisions.

10.5.4 Signature algorithms in TLS certificates

In Chapter 9, Digital Signatures, we learned that TLS 1.3 has two extensions for client Bob to specify which signature algorithms shall be used in the TLS session with server Alice:

  • The signature_algorithms extension specifies algorithms to be used for signatures in CertificateVerify messages
  • The signature_algorithms_cert extension, if used, specifies algorithms to be used for signatures in certificates

The signature_algorithms extension was introduced in TLS version 1.2 so that the client can indicate to the server which signature algorithms and hash algorithms can be used in digital signatures of that particular TLS session.

The signature_algorithms_cert extension was added in TLS version 1.3 so that TLS endpoints that support different sets of algorithms for certificates and for the TLS protocol itself can clearly communicate their capabilities. RFC 8446 specifies that TLS 1.2 implementations should also process this extension.

In these extensions, the extension_data field contains a list of SignatureScheme values in descending order of client Bob's preference. Each such value represents a single signature algorithm that Bob is willing to verify.
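
As an illustration, the following Python sketch builds such an extension on the wire. The extension type code 13 (signature_algorithms) and the SignatureScheme code points are taken from RFC 8446; the helper name build_signature_algorithms is our own, and a real TLS stack would emit this as part of a complete ClientHello.

import struct

SIGNATURE_ALGORITHMS = 13  # extension type code from RFC 8446

# Some SignatureScheme code points from RFC 8446, in descending
# order of (hypothetical) preference
SCHEMES = [
    0x0804,  # rsa_pss_rsae_sha256
    0x0403,  # ecdsa_secp256r1_sha256
    0x0807,  # ed25519
    0x0401,  # rsa_pkcs1_sha256
]

def build_signature_algorithms(schemes):
    # extension_data: 2-byte list length, then 2-byte SignatureScheme values
    extension_data = struct.pack(f">H{len(schemes)}H", 2 * len(schemes), *schemes)
    # extension header: 2-byte extension type, 2-byte extension_data length
    return struct.pack(">HH", SIGNATURE_ALGORITHMS, len(extension_data)) + extension_data

print(build_signature_algorithms(SCHEMES).hex())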

Bob may use the signature_algorithms_cert extension to tell Alice which certificate-specific signature algorithms he wants to use to validate X.509 certificates. If signature_algorithms_cert is omitted, the algorithms specified in signature_algorithms are also used for calculating and verifying digital signatures in certificates.

As a result, using the signature_algorithms_cert extension, TLS clients that support different sets of algorithms for certificates and for the TLS protocol itself can signal this to a TLS server in an unambiguous way.

If client Bob wants server Alice to authenticate herself using a certificate, he must send at least the signature_algorithms extension (optionally, Bob may also send Alice the signature_algorithms_cert extension).

If server Alice can only authenticate herself using a certificate but client Bob sends no signature_algorithms extension, then Alice immediately aborts the TLS handshake and sends a missing_extension alert.

10.5.8 Server certificate selection

When server Alice sends her certificate to client Bob, the certificate must have the following properties:

  • The certificate must be an X.509v3 certificate (unless Alice and Bob negotiate a different certificate type).
  • The public key of the server must be compatible with the selected authentication algorithm from client Bob's signature_algorithms extension (possible signature algorithms are RSA, ECDSA, and EdDSA).
  • The Certificate Key Usage field (discussed earlier in this chapter, in the X.509v3 extension fields section) must include the digitalSignature value. The signature algorithm must match a signature scheme specified in Bob's signature_algorithms and signature_algorithms_cert extensions.
  • The server_name and certificate_authorities extensions are used to select the certificate.

All certificates sent by server Alice must be signed using a digital signature algorithm specified by client Bob. Self-signed certificates are not validated and, therefore, can be signed with any digital signature algorithm.

If server Alice is not able to provide a certificate chain where all certificates are signed using the signature algorithms specified by client Bob, she continues the TLS handshake by sending Bob a certificate chain of her choice that might use signature algorithms not supported by client Bob.

If Bob is not able to construct a valid certificate chain using the certificates provided by Alice and decides to abort the TLS handshake, he sends a corresponding certificate-related alert. The default alert is unsupported_certificate.
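
Using Python's cryptography package, client Bob could check some of these properties on a received certificate as follows. This is a minimal sketch: the file name server.pem is a hypothetical placeholder, and a real TLS stack performs these checks on the DER-encoded certificates from the Certificate message.

from cryptography import x509

cert = x509.load_pem_x509_certificate(open("server.pem", "rb").read())

# The certificate must be an X.509v3 certificate
assert cert.version == x509.Version.v3

# If present, the Key Usage extension must include digitalSignature
try:
    key_usage = cert.extensions.get_extension_for_class(x509.KeyUsage).value
    assert key_usage.digital_signature
except x509.ExtensionNotFound:
    pass  # no Key Usage extension to check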

10.5.9 Client certificate selection

When client Bob sends his certificate to server Alice, Bob’s certificate must have the following properties:

  • The certificate must be an X.509v3 certificate (unless Alice and Bob negotiate a different certificate type).
  • If the certificate_authorities extension was present in the CertificateRequest message, at least one of the certificates in Bob's certificate chain should be issued by one of the specified certification authorities.
  • The certificate must be signed using one of the digital signature algorithms specified in the signature_algorithms extension of the CertificateRequest message.
  • If the CertificateRequest message from server Alice has a non-empty oid_filters extension, the client certificate must contain all extension OIDs recognized by client Bob. This extension is covered in detail in the next subsection, OID filters.

This concludes the list of requirements on client certificates in TLS. These requirements become relevant only if server Alice sends a CertificateRequest message to client Bob.

10.5.12 The certificate_authorities extension

Finally, we turn to certificate authorities within TLS. TLS in itself places hardly any requirements on CAs, except that they should be able to issue X.509 certificates. However, Alice and Bob can use the certificate_authorities extension to specify which certification authorities they support and which should be used by the receiving TLS party to select the appropriate certificates.

Client Bob sends the certificate_authorities extension in his ClientHello message. Server Alice sends the certificate_authorities extension in her CertificateRequest message. The certificate_authorities extension contains the CertificateAuthoritiesExtension data structure, as shown in Listing 10.8.

Listing 10.8: CertificateAuthoritiesExtension data structure

opaque DistinguishedName<1..2^16-1>;

struct {
   DistinguishedName authorities<3..2^16-1>;
} CertificateAuthoritiesExtension;

In CertificateAuthoritiesExtension, the authorities field holds a list of the distinguished names of suitable CAs, represented in DER-encoded format. The authorities list may contain the names of trust anchors as well as subordinate CAs. As a result, server Alice and client Bob can use the certificate_authorities extension to specify the known trust anchors as well as the preferred authorization space.
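
As an illustration, the sketch below serializes this structure, using Python's cryptography package for the DER encoding of the distinguished name; the example CA name and the helper function name are our own.

import struct

from cryptography import x509
from cryptography.x509.oid import NameOID

# Hypothetical distinguished name of an acceptable CA
ca_name = x509.Name([
    x509.NameAttribute(NameOID.COUNTRY_NAME, "DE"),
    x509.NameAttribute(NameOID.ORGANIZATION_NAME, "Example Trust Services"),
    x509.NameAttribute(NameOID.COMMON_NAME, "Example Root CA"),
])

def certificate_authorities_body(names):
    # Each DistinguishedName is DER-encoded and prefixed with a 2-byte length;
    # the whole authorities list then gets its own 2-byte length prefix
    authorities = b"".join(
        struct.pack(">H", len(der)) + der
        for der in (name.public_bytes() for name in names)
    )
    return struct.pack(">H", len(authorities)) + authorities

print(certificate_authorities_body([ca_name]).hex())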

10.6 Summary

In this chapter, we have discussed digital certificates as a means to provide authenticity for public keys, and the bodies that issue certificates, Certification Authorities (CAs). In particular, we looked at the minimum set of data that needs to be presented within a certificate, and the optional parts of a certificate.

Regarding CAs, we discussed their tasks and the processes for obtaining and validating certificates. We have also seen how CAs fit into the larger structure needed to manage public keys, the Public-Key Infrastructure (PKI).

After these more general considerations, we looked in detail at how digital certificates are handled within the TLS 1.3 handshake protocol.

The next chapter will be more technical again, as it discusses hash functions and message authentication codes. Apart from digital signatures (which also use hash functions), they are the main cryptographic mechanisms for providing authenticity to handshake messages.

11.3 One-way functions

In Chapter 4, Encryption and Decryption, we learned that the notion of computational security is built on the concept of pseudorandomness, the idea that bit strings can look completely random even though they are not. In fact, pseudorandom generators, functions, and permutations form the basis of modern symmetric key cryptography. As being one-way is also one of the defining properties of a cryptographic hash function, we chose to include a more formal discussion of this property in this section, even though it is fundamental for the whole of cryptography.

This is because mathematicians have proved that pseudorandom generators, functions, and permutations can be constructed from one-way functions.

As a result, the existence of one-way functions is equivalent to the existence of any non-trivial symmetric-key cryptography [97]. This means that if we can find functions that we can prove to be one-way, we can use them to construct symmetric-key cryptographic schemes, for example, symmetric-key encryption algorithms or keyed hash functions, that are provably computationally secure.

The good news is that there are a number of functions that mathematicians have studied for decades and that exhibit one-way properties. We will cover some of the most prominent examples of such candidate one-way functions in a minute.

The bad news is that the currently known candidate one-way functions are much less efficient than the constructions actually used in practice, say, a modern block cipher. Bridging this gap between theory and practice is one of the most important open research problems in modern cryptography, as closing it would allow us to build provably secure pseudorandom generators, functions, and permutations.

11.3.1 Mathematical properties

A function f : X → Y is called a one-way function if f(x) is easy to compute for all x ∈ X, but for essentially all elements y ∈ Y it is computationally infeasible to find any x ∈ X such that f(x) = y [117]. Such an x is called a preimage of y.

Here, easy to compute simply means that for any given x ∈ X, Alice and Bob can compute f(x) in polynomial time.

The requirement that it is computationally infeasible for Eve to find any x ∈ X such that f(x) = y means that a one-way function must be hard to invert. Since modern cryptography is about specifying cryptographic algorithms that are secure against a probabilistic polynomial-time attacker – to put it more precisely, that can be broken by a probabilistic polynomial-time attacker with only negligible probability – a function f is considered hard to invert if no probabilistic polynomial-time algorithm is capable of finding a preimage x ∈ X for any given element y ∈ Y , except with negligible probability.

The definition of what it means for a function to be hard to invert might seem complicated at first. But it actually only excludes two extremes:

  • It is always possible to guess a preimage x. Since we can choose the size of the range of f, that is, the set {y = f(x) ∈ Y | x ∈ X}, when designing a cryptographic algorithm, we can make the likelihood of a successful guess arbitrarily small, but never zero. As a result, we need to account for the fact that Eve could guess a preimage x for a given y with negligible probability.
  • It is always possible to find a preimage of y by brute-force searching the domain of f, that is, by simply trying all values of x until one produces the correct y (see the sketch after this list). Such a brute-force search requires exponential time. As a result, we exclude this extreme by requiring f to be computationally infeasible to invert only for probabilistic polynomial-time algorithms.
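
To illustrate the second extreme, the sketch below performs a brute-force preimage search. It uses SHA-256 truncated to two bytes so that the loop terminates quickly; for the full 256-bit output, the same loop would need on the order of 2^256 steps.

import hashlib
import itertools

def brute_force_preimage(target: bytes, max_len: int = 3):
    """Try all byte strings up to max_len bytes; O(256^max_len) work."""
    for length in range(max_len + 1):
        for tup in itertools.product(range(256), repeat=length):
            m = bytes(tup)
            if hashlib.sha256(m).digest()[:len(target)] == target:
                return m
    return None

# A 16-bit target is found after roughly 2^16 trials on average
target = hashlib.sha256(b"key").digest()[:2]
print(brute_force_preimage(target))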

Summing up, a function y = f(x) is said to be one-way, if it is hard to invert for all values of y. If there is a probabilistic polynomial-time algorithm that can invert f for some values of y with a non-negligible probability—even if this probability is very small—then f is not a one-way function.

Because the brute-force search runs in exponential time and always succeeds, the existence of one-way functions is an assumption about computational complexity and computational hardness [97]. In other words, it is an assumption about the existence of mathematical problems that can be solved in principle, but cannot be solved efficiently.

11.3.2 Candidate one-way functions

With the current knowledge in complexity theory, mathematicians do not know how to unconditionally prove the existence of one-way functions. As a result, their existence can only be assumed. There are, however, good reasons to make this assumption: there exist a number of very natural computational problems that have been the subject of intense mathematical research for decades, sometimes even centuries, yet no one has been able to come up with a polynomial-time algorithm that solves them.

According to the fundamental theorem of arithmetic, every positive integer can be expressed as a product of prime numbers, which are referred to as the prime factors of the original number [74]. One of the best-known computational problems that is believed to yield a one-way function is prime factorization: given a large integer, find its prime factors.
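
The asymmetry is easy to demonstrate: multiplying two primes is a single machine operation, whereas the naive inverse, trial division, needs on the order of sqrt(n) steps, which is exponential in the bit length of n. A minimal sketch (the two primes are small, illustrative values far below cryptographic sizes):

def trial_division(n):
    """Naive factorization: try every candidate divisor up to sqrt(n)."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains is prime
    return factors

p, q = 1_000_003, 1_000_033   # two small primes
n = p * q                     # the easy direction: one multiplication
print(trial_division(n))      # the hard direction: ~10^6 loop iterations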

The first table of prime factors of integers was created by the Greek mathematician Eratosthenes more than 2,200 years ago. The last factor table, for numbers up to 10,017,000, was published by the American mathematician Derrick Norman Lehmer in 1909 [178].

Trial division was first described in 1202 by Leonardo of Pisa, also known as Fibonacci, in his manuscript on arithmetic Liber Abaci. French mathematician Pierre de Fermat proposed a method based on a difference of squares, today known as Fermat’s factorization method, in the 17th century. A similar integer factorization method was described by the 14th century Indian mathematician Narayana Pandita. The 18th century Swiss mathematician Leonhard Euler proposed a method for factoring integers by writing them as a sum of two square numbers in two different ways.

Starting from the 1970s, a number of so-called algebraic-group factorization algorithms were introduced that work in an algebraic group. As an example, the British mathematician John Pollard introduced two new factoring algorithms called the p − 1 algorithm and the rho algorithm in 1974 and 1975, respectively. The Canadian mathematician Hugh Williams proposed the Williams' p + 1 algorithm in 1982.

In 1985, the Dutch mathematician Hendrik Lenstra published the elliptic curve method, currently the best known integer factorization method among the algorithms whose complexity depends on the size of the factor rather than the size of the number to be factored [204].

Moreover, a number of so-called general-purpose integer factorization methods whose running time depends on the size of the number to be factored have been proposed over time. Examples of such algorithms are the Quadratic Sieve introduced in 1981 by the American mathematician Carl Pomerance, which was the most effective general-purpose algorithm in the 1980s and early 1990s, and the General Number Field Sieve method, the fastest currently known algorithm for factoring large integers [39].

Yet, except for the quantum computer algorithm proposed in 1994 by the American mathematician Peter Shor, none of the above algorithms can factor integers in polynomial time. As a result, mathematicians assume that prime factorization is indeed a one-way function, at least for classical computers. Shor’s algorithm, on the other hand, requires a sufficiently large quantum computer with advanced error correction capabilities to keep the qubits stable during the entire computation, a technical challenge for which no solutions are known as of today.

Another example of a function believed to be one-way is a family of permutations based on the discrete logarithm problem. Recall that the discrete logarithm problem involves determining the integer exponent x given a number of the form g^x, where g is a generator of a cyclic group. The problem is believed to be computationally intractable, that is, no algorithms are known that can solve this problem in polynomial time.

The conjectured one-way family of permutations consists of the following:

  • An algorithm to generate an n-bit prime p and an element g ∈ {2, …, p − 1}
  • An algorithm to generate a random integer x in the range {1, …, p − 1}
  • The function f_{p,g}(x) = g^x mod p

The function f_{p,g}(x) is easy to compute, for example, using the square-and-multiply algorithm. Inverting f_{p,g}(x), on the other hand, is believed to be computationally hard because inverting modular exponentiation is equivalent to solving the discrete logarithm problem, for which no polynomial-time algorithms are known to date, as discussed in Chapter 7, Public-Key Cryptography.
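
A minimal sketch of the square-and-multiply algorithm (Python's built-in pow(g, x, p) implements the same idea more efficiently):

def square_and_multiply(g, x, p):
    """Compute g^x mod p using O(log x) modular multiplications."""
    result = 1
    g %= p
    while x > 0:
        if x & 1:                      # current exponent bit is set
            result = (result * g) % p
        g = (g * g) % p                # square for the next bit
        x >>= 1
    return result

assert square_and_multiply(5, 117, 19) == pow(5, 117, 19)

Computing g^x mod p this way is fast even for exponents thousands of bits long, while recovering x from g^x mod p is precisely the discrete logarithm problem.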

11.4.2 One-way property

In Chapter 5, Entity Authentication, we showed how passwords can be stored in a secure way on a server using hash functions. More specifically, each password is hashed together with some random value (the salt) and the hash value is stored together with the corresponding user ID. This system can only be secure if it is computationally difficult to invert the hash function, that is, to find a matching input for a known output. The same requirement emerges if the hash function is used in a key-dependent way in order to form a MAC (see Section 11.5, Message authentication codes).
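
A minimal sketch of such salted password storage follows (a real system should use a deliberately slow, memory-hard password hashing function such as scrypt or Argon2 instead of a single SHA-256 call):

import hashlib
import hmac
import os

def hash_password(password: bytes):
    salt = os.urandom(16)  # random per-user salt
    return salt, hashlib.sha256(salt + password).digest()

def verify_password(password: bytes, salt: bytes, stored: bytes) -> bool:
    candidate = hashlib.sha256(salt + password).digest()
    return hmac.compare_digest(candidate, stored)  # constant-time comparison

salt, digest = hash_password(b"correct horse battery staple")
assert verify_password(b"correct horse battery staple", salt, digest)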

In order to put this requirement in a more precise way, we only need to apply our earlier definition of a one-way function from Section 11.3, One-way functions, to hash functions:

A hash function hash is said to be one-way or preimage resistant, if it is computationally infeasible to find an input m for a given output y so that y = hash(m).

As is the case for second preimages, preimages for a given n-bit output will occur automatically after O(2^n) trial inputs. Hash functions that are preimage resistant and second preimage resistant are called one-way hash functions (OWHF).

11.4.3 Merkle-Damgård construction

Our previous discussion of requirements on a secure hash function shows that in order to achieve collision resistance, it is important that all input bits have an influence on the hash value. Otherwise, it would be very easy to construct collisions by varying the input bits that do not influence the outcome.

How can we accommodate this requirement when dealing with inputs m of indeterminate length? We divide m into pieces (or blocks) of a fixed size and then deal with the blocks one after the other. In one construction option, each block is compressed, that is, mapped onto a smaller bit string, which is then processed together with the next block.

The Merkle-Damgård scheme has been the main construction principle for cryptographic hash functions in the past. Most importantly, the MD5 (128-bit hash), SHA-1 (160-bit hash), and SHA-2 (224- to 512-bit hashes) hash functions are built according to this scheme. Later in this chapter, in Section 11.7, Hash functions in TLS, we will look at the SHA family of hash functions in detail, as these functions play an important role within TLS.

For now, we'll concentrate on the details of the Merkle-Damgård scheme. In order to compute the hash value of an input message m of arbitrary length, we proceed according to the following steps:

  • Separate message m into k blocks m_1, …, m_k of length r, using padding if necessary. In the SHA-1 hash function, input messages are always padded with a 1-bit followed by the necessary number of 0-bits. The block length of SHA-1 is r = 512.
  • Concatenate the first block m_1 with an initialization vector IV of length n.
  • Apply a compression function comp : {0,1}^(n+r) → {0,1}^n to the result to get h_1 = comp(IV || m_1).
  • Process the remaining blocks by computing h_i = comp(h_{i-1} || m_i) for i = 2, …, k.

Note that each h_i has length n.

  • Set hash(m) = h_k.

Note that finding a collision in comp implies a collision in hash. More precisely, if we can find two different bit strings y_1, y_2 of length r so that comp(x||y_1) = comp(x||y_2) for some given n-bit string x, then we can construct two different messages m and m′ with the same hash value: if x is reachable as a chaining value, say x = h_{i-1} for some message prefix m_1 … m_{i-1} (in the simplest case, x = IV and the prefix is empty), then the messages m = m_1 … m_{i-1} y_1 and m′ = m_1 … m_{i-1} y_2 collide under hash.
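
The following sketch implements these steps in Python, using SHA-256 as a stand-in for the compression function comp (so n = 256 and r = 512 bits). Note that the toy padding omits the length encoding that real Merkle-Damgård designs append, so this sketch is for illustration only.

import hashlib

N = 32  # chaining value length n in bytes (n = 256 bits)
R = 64  # block length r in bytes (r = 512 bits)

def comp(x: bytes) -> bytes:
    """Stand-in compression function comp: {0,1}^(n+r) -> {0,1}^n."""
    assert len(x) == N + R
    return hashlib.sha256(x).digest()

def md_hash(m: bytes, iv: bytes = b"\x00" * N) -> bytes:
    # Step 1: pad with a 1-bit (0x80) and 0-bits to a multiple of r
    m = m + b"\x80" + b"\x00" * (-(len(m) + 1) % R)
    # Steps 2-4: chain comp over the blocks m_1, ..., m_k
    h = iv
    for i in range(0, len(m), R):
        h = comp(h + m[i:i + R])
    return h  # hash(m) = h_k

print(md_hash(b"Hello, world!").hex())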

The article [8] lists a number of generic attacks on hash functions based on the Merkle-Damgård scheme. Although in most cases these attacks are far from being practical, they are still a reason for concern about the general security of the scheme.

11.6 MAC versus CRC

Can we construct a MAC without a cryptographic hash function and without a secret key? Let's take a look at the Cyclic Redundancy Check (CRC), a popular error-detecting code used in communication systems to detect accidental errors in messages sent over a noisy or unreliable communication channel.

The working principle of an error-detecting code is for the sender to encode their plaintext message in a redundant way. The redundancy, in turn, allows the receiver to detect a certain number of errors – that is, accidental bit flips – in the message they receive. The theory of channel coding, pioneered in the 1940s by the American mathematician Richard Hamming, aims to find codes that have minimal overhead (that is, the least redundancy) but, at the same time, have a large number of valid code words and can correct or detect many errors.

CRC is a so-called cyclic code, that is, a block code where a circular shift of every code word yields another valid code word. The use of cyclic codes for error detection in communication systems was first proposed by the American mathematician and computer scientist Wesley Peterson in 1961.

A cyclic code encodes the plaintext message by attaching to it a fixed-length check value based on the remainder of a polynomial division of the message's content. The receiving party repeats that calculation and checks whether the received check value is equal to the computed check value.

The algebraic properties of cyclic codes make them suitable for efficient error detection and correction. Cyclic codes are simple to implement and well suited to detect so-called burst errors. Burst errors are contiguous sequences of erroneous bits in communication messages and are common in many real-world communication channels.

A CRC code is defined using a generator polynomial g(x) with binary coefficients 0 and 1. The plaintext message, encoded as another polynomial m(x), is divided by the generator polynomial. The CRC is then computed by discarding the resulting quotient polynomial and taking the remainder polynomial r(x) as the CRC, which is subsequently appended to the plaintext as a checksum. The whole arithmetic is done within the finite field 𝔽2; therefore, the coefficients of the remainder polynomial are also 0 and 1.

As a simple example, consider the generator polynomial g(x) = x^2 + x + 1. To encode a message, we interpret it as a polynomial, divide it by g(x), and take the remainder of this division as the CRC check value to be appended to the plaintext message.

So, to encode the two-byte message 0x0102, Bob would interpret it as the polynomial m(x) = x^8 + x, divide it by x^2 + x + 1 using polynomial division, and get the remainder polynomial r(x) = 1. In hexadecimal notation, the remainder has the value 0x01. He would then append the remainder value as the CRC check value and transmit the message 0x010201 to Alice.

Upon receiving the message, Alice would perform the same computation and check whether the received CRC value 0x01 is equal to the computed CRC value. Let's assume there were errors during transmission – accidental bit flips – so that Alice received the message 0x010101. In that case, the CRC value computed by Alice would be 0x02, and Alice would detect the transmission error.
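
You can verify this polynomial arithmetic with a few lines of Python. The sketch below represents polynomials over GF(2) as integers (bit i set means the term x^i is present) and reduces by repeatedly XORing a shifted copy of the generator; the helper name poly_mod is our own.

def poly_mod(m: int, g: int) -> int:
    """Remainder of m(x) divided by g(x), with coefficients in GF(2)."""
    while m.bit_length() >= g.bit_length():
        m ^= g << (m.bit_length() - g.bit_length())  # cancel the leading term
    return m

assert poly_mod(0x0102, 0b111) == 0x01  # m(x) = x^8 + x, g(x) = x^2 + x + 1
assert poly_mod(0x0101, 0b111) == 0x02  # the corrupted message x^8 + 1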

At first glance, this looks very similar to a MAC and, especially in systems that already support CRCs, it might be tempting to use CRC as a replacement for a MAC. Don’t! Recall that MACs are built on top of cryptographic hash functions, and cryptographic hash functions are collision-resistant. CRCs, on the other hand, are not collision resistant.

As an example, Listing 11.1 shows the Python code for computing CRC-8. This CRC uses the generator polynomial x^8 + x^2 + x + 1 (the parameter poly = 0x07 encodes its low-order terms, and the leading x^8 term is added inside the function) and outputs an 8-bit CRC value.

Listing 11.1: Python code for computing CRC-8 using generator polynomial x^8 + x^2 + x + 1

def crc8(data, n, poly, crc=0):
    g = 1 << n | poly  # generator polynomial, including the leading x^n term
    for d in data:
        crc ^= d << (n - 8)     # XOR the next message byte into the register
        for _ in range(8):      # process the byte bit by bit
            crc <<= 1
            if crc & (1 << n):  # the register overflowed degree n...
                crc ^= g        # ...so reduce it modulo g(x)
    return crc

Now, if you compute CRC-8 checksum values for different 2-byte messages using the code shown in Listing 11.2, you can quickly verify for yourself that the messages 0x020B, 0x030C, 0x0419, and many others have the same CRC value of 0x1B.

Listing 11.2: Python code to compute CRC-8 for different 2-byte messages

for i in range(0, 256):
    for j in range(0, 256):
        if crc8([i, j], 8, 0x07) == 0x1b:
            print(f"Message {hex(i)}, {hex(j)} has CRC 0x1b")

Consequently, if Alice and Bob were to use CRCs to protect their message integrity against a malicious attacker Mallory rather than accidental transmission errors, it would be very easy for Mallory to find messages that have an identical CRC check value. That, in turn, would allow Mallory to replace a message that Bob sent to Alice without her noticing it (and vice versa). This is exactly why a MAC needs to be collision-resistant.

Moreover, and maybe even more importantly, even if Mallory cannot be bothered to find collisions for the CRC value already in place, he can simply compute the matching CRC value for a message of his choice and replace both the message and the CRC. This is possible because no secret information goes into the CRC. To summarize, a CRC will only protect you against accidental, random transmission errors, but not against an intelligent attacker.


SHA-256, SHA-384, and SHA-512 hash functions

SHA-256, SHA-384, and SHA-512 are hash algorithms from the Secure Hash Algorithm-2 (SHA-2) family. The algorithms are defined in FIPS 180-4, Secure Hash Standard (SHS) [129], the standard specifying NIST-approved hash algorithms for generating message digests, and are based on the Merkle-Damgård construction.

The suffix of the SHA-2 algorithms denotes the length of the message digest in bits [140]. As an example, the message digest of SHA-256 has a length of 256 bits. Table 11.1 summarizes the message size, block size, and digest size of all SHA-2 hash family algorithms.

Algorithm      Message size (bits)   Block size (bits)   Digest size (bits)
SHA-1          < 2^64                512                 160
SHA-224        < 2^64                512                 224
SHA-256        < 2^64                512                 256
SHA-384        < 2^128               1024                384
SHA-512        < 2^128               1024                512
SHA-512/224    < 2^128               1024                224
SHA-512/256    < 2^128               1024                256

Table 11.1: Basic properties of SHA-2 hash family algorithms

All SHA-2 hash algorithms use a set of similar basic functions, only with different lengths of input and output. Every SHA-2 algorithm uses the following functions, where x, y, and z are either 32-bit or 64-bit values, ⊕ denotes exclusive-OR, ∧ denotes bitwise AND, and ¬ denotes bitwise complement (see the sketch after this list):

  • Ch(x,y,z) = (x ∧ y) ⊕ (¬x ∧ z)
  • Maj(x,y,z) = (x ∧ y) ⊕ (x ∧ z) ⊕ (y ∧ z)
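
Both functions can be transcribed directly into Python (a sketch for 32-bit words; the mask handles Python's arbitrary-precision integers):

MASK32 = 0xFFFFFFFF

def Ch(x: int, y: int, z: int) -> int:
    """Choose: each bit of x selects the corresponding bit of y (1) or z (0)."""
    return ((x & y) ^ (~x & z)) & MASK32

def Maj(x: int, y: int, z: int) -> int:
    """Majority: each output bit is the majority of the three input bits."""
    return (x & y) ^ (x & z) ^ (y & z)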

SHA-256 functions

In addition to the preceding functions, SHA-256 uses four logical functions. Each function is applied to a 32-bit value x and outputs a 32-bit result:

Σ_0^256(x) = ROTR^2(x) ⊕ ROTR^13(x) ⊕ ROTR^22(x)

Σ_1^256(x) = ROTR^6(x) ⊕ ROTR^11(x) ⊕ ROTR^25(x)

σ_0^256(x) = ROTR^7(x) ⊕ ROTR^18(x) ⊕ SHR^3(x)

σ_1^256(x) = ROTR^17(x) ⊕ ROTR^19(x) ⊕ SHR^10(x)

In the preceding functions, ROTR^n(x) denotes a circular right-shift operation applied to a w-bit word x, using an integer 0 ≤ n < w, defined as (x ≫ n) ∨ (x ≪ (w − n)); SHR^n(x) denotes a right-shift operation applied to a w-bit word x, using an integer 0 ≤ n < w, defined as x ≫ n.
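
These definitions translate directly into Python for w = 32 (a sketch; the function names mirror the Σ/σ notation above):

W = 32
MASK = (1 << W) - 1

def rotr(x: int, n: int) -> int:
    """Circular right shift of a W-bit word by n positions."""
    return ((x >> n) | (x << (W - n))) & MASK

def shr(x: int, n: int) -> int:
    """Plain right shift of a W-bit word by n positions."""
    return x >> n

def Sigma0(x): return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)  # Sigma_0^256
def Sigma1(x): return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)  # Sigma_1^256
def sigma0(x): return rotr(x, 7) ^ rotr(x, 18) ^ shr(x, 3)    # sigma_0^256
def sigma1(x): return rotr(x, 17) ^ rotr(x, 19) ^ shr(x, 10)  # sigma_1^256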

SHA-512 functions

Similar to SHA-256, SHA-384 and SHA-512 also use four logical functions. However, the functions are applied to a 64-bit value x and output a 64-bit result:

Σ_0^512(x) = ROTR^28(x) ⊕ ROTR^34(x) ⊕ ROTR^39(x)

Σ_1^512(x) = ROTR^14(x) ⊕ ROTR^18(x) ⊕ ROTR^41(x)

σ_0^512(x) = ROTR^1(x) ⊕ ROTR^8(x) ⊕ SHR^7(x)

σ_1^512(x) = ROTR^19(x) ⊕ ROTR^61(x) ⊕ SHR^6(x)

SHA-256 constants

SHA-256 uses 64 32-bit constants K_0^256, K_1^256, …, K_63^256, which are the first 32 bits of the fractional parts of the cube roots of the first 64 prime numbers.

SHA-384 and SHA-512 constants

SHA-384 and SHA-512 use 80 64-bit constants K_0^512, K_1^512, …, K_79^512, which are the first 64 bits of the fractional parts of the cube roots of the first 80 prime numbers.
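
Both sets of constants can be reproduced with exact integer arithmetic: frac(cbrt(p)) · 2^32 equals cbrt(p · 2^96) reduced modulo 2^32 (and analogously with 2^64 and 2^192 for the 64-bit constants), so an integer cube root is all we need. A sketch:

def icbrt(n: int) -> int:
    """Integer cube root (floor) via Newton's method."""
    x = 1 << ((n.bit_length() + 2) // 3)  # initial guess >= cbrt(n)
    while True:
        y = (2 * x + n // (x * x)) // 3
        if y >= x:
            return x
        x = y

def first_primes(count: int):
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes

K256 = [icbrt(p << 96) & 0xFFFFFFFF for p in first_primes(64)]
K512 = [icbrt(p << 192) & 0xFFFFFFFFFFFFFFFF for p in first_primes(80)]

assert K256[0] == 0x428A2F98            # matches FIPS 180-4
assert K512[0] == 0x428A2F98D728AE22    # matches FIPS 180-4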

Preprocessing the message

All hash functions in the SHA-2 family preprocess the message before performing the actual computation. The preprocessing consists of three steps:

  1. Padding the plaintext message to obtain a padded message that is a multiple of 512 bits for SHA-256 and a multiple of 1,024 bits for SHA-384 and SHA-512.
  2. Parsing the message into blocks.
  3. Setting the initial hash value H(0).

For SHA-256, the padded message is parsed into N 512-bit blocks M^(1), M^(2), …, M^(N). Because every 512-bit input block can be divided into 16 32-bit words, input block i can be expressed as M_0^(i), M_1^(i), …, M_15^(i), where every M_j^(i) has a length of 32 bits.

Similarly, for SHA-384 and SHA-512, the padded message is parsed into N 1,024-bit blocks M^(1), M^(2), …, M^(N). Because a 1,024-bit input block can be divided into 16 64-bit words, input block i can be expressed as M_0^(i), M_1^(i), …, M_15^(i), where every M_j^(i) has a length of 64 bits.
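
For SHA-256, the first two preprocessing steps can be sketched as follows. The padding appends a single 1-bit (the byte 0x80), then 0-bits, and finally the original message length as a 64-bit big-endian integer, which is what makes the padded message a multiple of 512 bits:

import struct

def sha256_pad(message: bytes) -> bytes:
    """Pad to a multiple of 64 bytes: 0x80, zero bytes, 64-bit bit length."""
    bit_length = 8 * len(message)
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    return padded + struct.pack(">Q", bit_length)

def parse_blocks(padded: bytes):
    """Split into 512-bit blocks, each a tuple of 16 32-bit words M_0..M_15."""
    return [struct.unpack(">16I", padded[i:i + 64])
            for i in range(0, len(padded), 64)]

blocks = parse_blocks(sha256_pad(b"abc"))
assert len(blocks) == 1 and len(blocks[0]) == 16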