Testing constant-timeness using Valgrind: case of the NSS library

Cryptographic code needs to be constant-time to not leak secrets via timing. Being constant-time is usually defined as:

No branching on secret-dependent values.
No memory access based on secret-dependent values.
No secret-dependent values given to some variable time functions.
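To make the first rule concrete, here is a minimal sketch contrasting an early-exit comparison with a branch-free one. The function names are made up for illustration, and a compiler is in principle free to turn the second version back into branching code, which is one reason the compiled binary has to be tested as well.

```c
#include <assert.h>
#include <stddef.h>

/* Variable-time: returns as soon as a byte differs, so the running time
 * reveals the length of the matching prefix. */
int leaky_equal(const unsigned char *a, const unsigned char *b, size_t n) {
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (a[i] != b[i]) /* branch on secret-dependent data */
            return 0;
    }
    return 1;
}

/* Branch-free: always reads all n bytes and never branches on them. */
int ct_equal(const unsigned char *a, const unsigned char *b, size_t n) {
    assert(a != NULL && b != NULL);
    unsigned char diff = 0;
    for (size_t i = 0; i < n; ++i)
        diff |= (unsigned char)(a[i] ^ b[i]);
    /* (diff - 1) >> 8 has its low bit set only when diff == 0. */
    return (int)((((unsigned int)diff - 1u) >> 8) & 1u);
}
```

Both functions return 1 for equal buffers and 0 otherwise; only the first one does so in time that depends on where the buffers differ.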

There are a few ways of testing or verifying that code is constant-time, for example using the tools I described in a previous post. In this post I look at using Valgrind’s MemCheck tool to test the constant-timeness of primitives in the NSS cryptographic library.

Testing constant-timeness using Valgrind#

As far as I know, the original idea of using Valgrind’s MemCheck for testing constant-timeness goes back to Adam Langley’s ctgrind, introduced in a blog post and on GitHub back in 2010. Older versions of Valgrind did not expose the necessary interface, so a patch was needed; current versions of Valgrind need no patches. A modern presentation of the idea is Timecop.

The idea is to use Valgrind MemCheck’s memory definedness tracking as a sort of dynamic data dependency tracker that reports issues when data derived from undefined data is branched upon or is used for memory access.

First you write a test-wrapper for the function you want to test, which in this case is the compute function below. Then in the wrapper you mark all of the secrets as undefined using Valgrind’s client request mechanism. The macros VALGRIND_MAKE_MEM_UNDEFINED(addr, len) and VALGRIND_MAKE_MEM_DEFINED(addr, len) are available from <valgrind/memcheck.h>.

#include <valgrind/memcheck.h>

int compute(unsigned char secret[32]) {
    if (secret[0] == 0) {
        return 0;
    } else {
        return 1;
    }
}

int main(void) {
    unsigned char buf[32];
    for (int i = 0; i < 32; ++i)
        buf[i] = 0;
    VALGRIND_MAKE_MEM_UNDEFINED(buf, 32);
    compute(buf);
    return 0;
}

Then when you run valgrind on the produced binary you get:

Conditional jump or move depends on uninitialised value(s)
        at 0x1093BC: compute (code.c:4)
        by 0x109464: main (code.c:16)

This shows that Valgrind’s MemCheck correctly detected the branch on a secret value in the compute function.

Properties & Limitations#

This method of testing constant-timeness has some limitations. It is a runtime technique, so only code that gets executed gets tested. Getting good code coverage is thus important. Valgrind MemCheck’s checking machinery is also complex, and there is no guarantee that it is itself correctly implemented or that it is free of false negatives and false positives. This is, however, true of all software: when doing any sort of tool-assisted analysis, one has to include the tool in the trusted computing base and assume that it works.

Testing the Mozilla NSS library#

I started this work with the goal of trying out this technique of testing constant-timeness on a real-world library and also with the goal of upstreaming the changes to the library’s CI.

Working with NSS created quite a bit of hassle. It uses Mercurial for version control. I had never used Mercurial before and some of its concepts looked completely backwards to my git-familiar brain. Also, setting up and understanding all of the bazillion Mozilla services necessary to submit patches to NSS was quite involved.

Integrating annotations#

I decided to focus on testing constant-timeness of the cryptographic primitives in NSS first. Timing attacks often focus on these primitives, but some focus on higher level constructions in TLS (like Lucky13) so larger parts of the TLS stack should be tested. To use Valgrind to test constant-timeness of crypto primitives one needs to write test-cases which execute the primitives on secret inputs marked undefined. Luckily, NSS already has test-cases for many of its primitives in the bltest and fbectest binaries, so I added the Valgrind annotations to those and pushed the revision.

I only annotated private keys and random nonces as secret values. I decided to not annotate inputs to hash functions and messages in encryption functions, even though these are also often meant to be secret (e.g. when hashing a secret to produce a key, or when encrypting a secret message, duh).

Testing constant-timeness#

To actually run the test-cases in CI a test suite was necessary. I decided to copy over the cipher test suite, which runs the bltest binary as usual, and create a new test suite, ct, which does the same but under Valgrind’s MemCheck (revision). I also added tests using Valgrind on the fbectest binary, which performs ECDH.

Results#

I collected data while running on a machine with AMD Ryzen 7 PRO 4750U, using Valgrind 3.17.0 and gcc 11.1.0 on a Debug build of NSS targeting x86_64 based on revision ea6fb7d0d0fc. The test process reported AES-NI, PCLMUL, SHA, AVX, AVX2, SSSE3, SSE4.1, SSE4.2 all supported.

AES-GCM decryption#

The first report from Valgrind was for AES-GCM decryption:

Conditional jump or move depends on uninitialised value(s)
    at 0x5115E5C: intel_AES_GCM_DecryptUpdate (intel-gcm-wrap.c:319)
    by 0x5087D43: AES_Decrypt (rijndael.c:1206)
    by 0x11D496: AES_Decrypt (loader.c:509)
    by 0x10F46F: aes_Decrypt (blapitest.c:1159)
    by 0x113D4D: cipherDoOp (blapitest.c:2506)
    by 0x1170E1: blapi_selftest (blapitest.c:3358)
    by 0x118C62: main (blapitest.c:3912)

It points at line 319 in the intel_AES_GCM_DecryptUpdate function. If we look at that line we see:

if (NSS_SecureMemcmp(T, intag, tagBytes) != 0) {     // Line 319
    memset(outbuf, 0, inlen);
    *outlen = 0;
    /* force a CKR_ENCRYPTED_DATA_INVALID error at in softoken */
    PORT_SetError(SEC_ERROR_BAD_DATA);
    return SECFailure;
}

This is benign leakage, as what leaks is only whether the GCM tag is valid or not. As Valgrind does not report further leaks inside of the secure compare function, we can assume that only the result of the function leaks via the branch on line 319. I have to give bonus points to NSS here for clearing the output buffer when an AEAD tag verification fails and not passing the unauthenticated decrypted data to the caller.
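One way to keep a benign report like this out of the output is to mark only the comparison result as defined before branching on it, so that anything leaking inside the comparison itself would still be reported. The sketch below illustrates the pattern with made-up names (ct_memcmp_sketch, check_tag_sketch); it is not how NSS implements the check, and the fallback macro is only there so the snippet builds without the Valgrind headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#  endif
#endif
#ifndef VALGRIND_MAKE_MEM_DEFINED
/* Fallback so the sketch still builds without the Valgrind headers. */
#  define VALGRIND_MAKE_MEM_DEFINED(addr, len) ((void)(addr), (void)(len))
#endif

/* Returns 0 when the buffers are equal, nonzero otherwise; no early exit. */
static int ct_memcmp_sketch(const unsigned char *a, const unsigned char *b, size_t n) {
    unsigned char diff = 0;
    for (size_t i = 0; i < n; ++i)
        diff |= (unsigned char)(a[i] ^ b[i]);
    return (int)diff;
}

/* Hypothetical tag check: returns 0 on success, -1 (and a wiped output) on failure. */
int check_tag_sketch(const unsigned char *computed, const unsigned char *received,
                     size_t tag_len, unsigned char *out, size_t out_len) {
    assert(computed != NULL && received != NULL && out != NULL);
    int r = ct_memcmp_sketch(computed, received, tag_len);
    /* Declassify only the verdict: tag validity is revealed to the caller anyway. */
    VALGRIND_MAKE_MEM_DEFINED(&r, sizeof r);
    if (r != 0) {
        memset(out, 0, out_len); /* never hand out unauthenticated plaintext */
        return -1;
    }
    return 0;
}
```

The design choice here is to declassify as late and as narrowly as possible: the tag bytes stay undefined throughout the comparison, and only the single int that the branch needs is marked as defined.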

DSA signing#

During the DSA_SignDigestWithSeed call in DSA signing, the raw private key data is converted to an mpi via mp_read_unsigned_octets. This seems to leak some amount of information on the private key. This leakage is similar to the leakage in ¹ and ², where a non-constant-time base64 decoding operation performed on private key data was targeted. However, as there is no base64 decoding here, only a transformation of raw data into an mpi, I expect the leakage to be rather small. With that said, in DSA even leaking partial information about the random nonce (mainly its most or least significant bits) can lead to key recovery via lattice attacks like Minerva ³. I can’t tell whether this function is vulnerable, so more analysis is necessary.

Conditional jump or move depends on uninitialised value(s)
    at 0x50753D8: mp_cmp_z (mpi.c:1577)
    by 0x5079552: mp_read_unsigned_octets (mpi.c:4772)
    by 0x5031F07: dsa_SignDigest (dsa.c:384)
    by 0x50326F4: DSA_SignDigestWithSeed (dsa.c:547)
    by 0x11C971: DSA_SignDigestWithSeed (loader.c:184)
    by 0x10FA27: dsa_signDigest (blapitest.c:1326)
    by 0x114799: cipherDoOp (blapitest.c:2602)
    by 0x116F09: blapi_selftest (blapitest.c:3321)
    by 0x118C62: main (blapitest.c:3912)

Then Valgrind proceeds to report leakage all over the DSA signing process, in various mp_mul and mp_mulmod calls mostly, like the report below:

Conditional jump or move depends on uninitialised value(s)
    at 0x507712D: s_mp_clamp (mpi.c:2929)
    by 0x50740C8: mp_mul (mpi.c:888)
    by 0x5032288: dsa_SignDigest (dsa.c:439)
    by 0x50326F4: DSA_SignDigestWithSeed (dsa.c:547)
    by 0x11C971: DSA_SignDigestWithSeed (loader.c:184)
    by 0x10FA27: dsa_signDigest (blapitest.c:1326)
    by 0x114799: cipherDoOp (blapitest.c:2602)
    by 0x116F09: blapi_selftest (blapitest.c:3321)
    by 0x118C62: main (blapitest.c:3912)
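To illustrate why clamp-style code shows up in reports like the ones above, here is a toy bignum whose clamp loop strips leading zero digits. The toy_mp type and both functions are hypothetical and much simpler than the mpi code in NSS; the point is only that the loop's trip count and its branch depend on the value that was read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy bignum: little-endian array of 32-bit digits (hypothetical type). */
typedef struct {
    uint32_t digit[8];
    size_t used; /* number of significant digits, at least 1 */
} toy_mp;

/* Read up to 32 big-endian octets into the toy bignum. */
void toy_read_octets(toy_mp *mp, const unsigned char *buf, size_t len) {
    assert(mp != NULL && buf != NULL && len <= sizeof mp->digit);
    for (size_t i = 0; i < 8; ++i)
        mp->digit[i] = 0;
    for (size_t i = 0; i < len; ++i) {
        size_t pos = len - 1 - i; /* significance of byte i, in bytes */
        mp->digit[pos / 4] |= (uint32_t)buf[i] << (8 * (pos % 4));
    }
    mp->used = 8;
}

/* Strip leading zero digits: the loop branches on the (secret) digit values. */
void toy_clamp(toy_mp *mp) {
    assert(mp != NULL && mp->used >= 1 && mp->used <= 8);
    while (mp->used > 1 && mp->digit[mp->used - 1] == 0)
        --mp->used;
}
```

When the octets are marked as undefined, MemCheck would flag the comparison against zero inside the loop. What an attacker could learn from it is roughly how many leading zero digits the secret has.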

However, the DSA signing code uses a blinding side-channel countermeasure in which the nonce \(k\) is blinded and used as \(t = k + q f\), where \(f\) is a random integer with its most-significant bit set and \(q\) is the order of the generator \(g\). This blinding is applied to the modular exponentiation, whose result it does not change: because \(g\) has order \(q\) modulo the DSA prime \(p\), \(g^t \bmod p = g^k \bmod p\). For later steps, namely the modular inversion of the nonce \(k\) and work with the private key, further blinding is done with random values from \(\mathbb{Z}_q\). This blinding seems to be enough to stop the leakage of the nonce or the private key, but more detailed analysis would be required to be sure.
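The arithmetic behind the exponent blinding can be checked on toy numbers. The sketch below assumes a tiny group chosen purely for illustration: the prime modulus 23 and the generator 2, which has order 11 modulo 23 since \(2^{11} = 2048 = 89 \cdot 23 + 1\). The function names are made up, and the exponentiation routine is a plain square-and-multiply that is not constant-time itself.

```c
#include <assert.h>
#include <stdint.h>

/* Plain square-and-multiply; fine for checking the identity, NOT constant-time. */
uint64_t toy_modpow(uint64_t base, uint64_t exp, uint64_t mod) {
    assert(mod > 1 && mod < (1ull << 32)); /* keeps products within 64 bits */
    uint64_t result = 1;
    base %= mod;
    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % mod;
        base = (base * base) % mod;
        exp >>= 1;
    }
    return result;
}

/* Blind the nonce k as t = k + q*f; f would be random, here the caller picks it. */
uint64_t toy_blind_nonce(uint64_t k, uint64_t q, uint64_t f) {
    return k + q * f;
}
```

For every nonce k between 1 and 10 and every 4-bit f with its top bit set (8 to 15), toy_modpow(2, toy_blind_nonce(k, 11, f), 23) equals toy_modpow(2, k, 23), while the exponent that the exponentiation actually processes changes with f.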

RSA decryption, RSA-OAEP decryption and RSA-PSS signing#

RSA decryption, RSA-OAEP decryption and RSA-PSS signing all use the same underlying rsa_PrivateKeyOpCRTNoCheck function, so they can be described together.

Similarly to DSA signing, the function first loads the contents of the raw RSA-CRT private key into an mpi: \[p,\; q,\; d\pmod{p - 1},\; d\pmod{q - 1},\; q^{-1}\pmod{p}\] Here, the leakage might be more serious than in the DSA case, as more secret values are loaded and reconstructing the RSA private key given some information on those values might be possible (see for example ⁴, ⁵ or ⁶).

Furthermore, the code then proceeds to pass the private values directly to several mp_mod and mp_exptmod calls, which Valgrind flags as leaking. I expect the mp_exptmod function to leak a lot of information about the exponent, as it is just a general-purpose modular exponentiation function and not an RSA-specific one created with constant-timeness in mind.
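As a rough illustration of what "created with constant-timeness in mind" means for exponentiation, the sketch below always performs a square and a multiply for a fixed number of exponent bits and picks the result with a mask instead of a branch. The names are hypothetical and the code only removes the exponent-dependent control flow: the % operator may itself be variable-time on real hardware, and production code would typically use something like Montgomery arithmetic with fixed-size operands instead.

```c
#include <assert.h>
#include <stdint.h>

/* Branch-free select: returns a when bit == 1 and b when bit == 0. */
static uint64_t ct_select(uint64_t bit, uint64_t a, uint64_t b) {
    uint64_t mask = (uint64_t)0 - bit; /* all ones or all zeros */
    return (a & mask) | (b & ~mask);
}

/* Square-and-multiply-always over a fixed number of exponent bits. */
uint64_t toy_modpow_fixed(uint64_t base, uint64_t exp, unsigned bits, uint64_t mod) {
    assert(mod > 1 && mod < (1ull << 32)); /* keeps products within 64 bits */
    assert(bits >= 1 && bits <= 64);
    uint64_t result = 1;
    base %= mod;
    for (unsigned i = bits; i-- > 0;) {
        uint64_t bit = (exp >> i) & 1;
        result = (result * result) % mod;            /* always square */
        uint64_t multiplied = (result * base) % mod; /* always multiply */
        result = ct_select(bit, multiplied, result); /* keep one, no branch */
    }
    return result;
}
```

The loop count depends only on the public bits parameter, so neither the length nor the bit pattern of the exponent influences which instructions are executed.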

ECDH key derivation#

The ECDH_Derive function uses the same logic to read the raw private key into an mpi as do DSA and RSA. This leaks some amount of information but like in the DSA case I expect it to be very small and practically unusable.

P-256#

The next reported group of issues concerns the NIST P-256 implementation in NSS. ECDH on this curve is implemented using the ec_GFp_nistp256_points_mul_vartime function. Now this may look like a red flag, but it is a benign one. Luckily, the variable-time function is a two-scalar multiplication function which short-circuits to the constant-time ec_GFp_nistp256_point_mul if only one scalar input is provided, which is the case for ECDH. Why does Valgrind report issues then? Well, it only reports issues inside the from_montgomery function, which is supposed to transform the resulting point coordinates back from Montgomery form after the scalar multiplication. It is hard to classify how serious this leakage is. It reminds me of a few papers (⁷ and ⁸) which exploited similar side-channel leakage of the projective representation of the result of a scalar multiplication to extract partial information about the scalar used. However, that was done for ECDSA, where partial information about the scalar can lead to key recovery. For ECDH, this leakage might reveal some bits of the scalar. Perhaps in the case of static ECDH an attacker could adaptively query the scalar multiplication while measuring this leaking conversion and mount something like the Raccoon attack (⁹)?

The final report by Valgrind for P-256 is about the ec_point_at_infinity function, which is used to test whether the derived point in ECDH is not the point at infinity and if it is, to abort the operation. I think this is benign leakage as it is leaked anyway when the operation fails.

P-384 and P-521#

NSS’s implementation of the P-384 and P-521 curves is from ECCKiila which itself uses fiat-crypto. Its use in ECDH still has the reading of the private key leak and the final comparison to point at infinity but doesn’t have the from_montgomery leaks, because of a different scalar multiplication implementation.

Curve25519#

Valgrind only reports one issue in the Curve25519 case, and that is a branch on a comparison of the output point to the point at infinity, which is again benign.

ECDSA signing#

During this implementation work I stumbled upon some broken tests of ECDSA in the bltest binary and reported them here.

The ECDSA signing code in NSS is very similar to DSA signing and Valgrind reports roughly the same issues. The ECDSA code also uses blinding, but only on the modular arithmetic level: the random nonce is not blinded when it is used in the scalar multiplication. Due to the broken nature of the tests I couldn’t really look at how different curves are handled in ECDSA, but a cursory run showed no leakage from the from_montgomery function like the one seen in ECDH on P-256.

As in DSA, the random nonce is also read into an mpi in ECDSA using the mp_read_unsigned_octets function, and Valgrind reports some leakage there. Similarly to the DSA case, this leakage might be an issue because it reveals information about the random nonce, which can open the door to lattice attacks.

Conclusions#

Doing this analysis was harder but also more insightful than I expected. Here are some of my observations:

Getting this testing of constant-timeness to work, even with a simple tool like Valgrind, on a library as a new contributor is definitely harder than I thought. Maybe this is just NSS, but I expect other popular open-source crypto libraries to be similar.
Testing constant-timeness of cryptographic code does not end with running Valgrind on annotated test-cases or even with including constant-timeness test runs in CI. The results presented by Valgrind require actual human eyes to look at them and evaluate whether the leaks are serious or benign. Even I am unsure about some of the leaks presented, and I have experience with timing attacks and the related cryptosystems. The noise from benign leaks in Valgrind’s output needs to be dealt with somehow before these patches can land and run in CI, otherwise the failing tests can mask the introduction of actual exploitable leaks. Valgrind offers a way to silence MemCheck warnings from given codepaths, which could be used to achieve this.
I believe that some of the presented leaks should be fixed, in particular the RSA ones, which seem to be quite serious.
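The mechanism for silencing MemCheck warnings from given codepaths is the suppression file. As a rough sketch, an entry for the benign AES-GCM tag check could look like the following; the entry name and the file name ct.supp are made up, and the entry has not been tested against NSS.

```
{
   nss_gcm_tag_check_benign
   Memcheck:Cond
   fun:intel_AES_GCM_DecryptUpdate
}
```

The file would be passed to Valgrind with --suppressions=ct.supp, and running with --gen-suppressions=all makes Valgrind print a ready-made entry for each report. Note that an entry like this hides every conditional-jump report whose innermost frame is that function, so it trades some sensitivity for a quieter CI run.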

Just to add, this work was done and this post written before the post of Google Project Zero’s Tavis Ormandy on an unrelated vulnerability in NSS. I certainly don’t want this post to sound like I’m criticizing the folks behind NSS and want this post to help them instead.

¹ Florian Sieck, Sebastian Berndt, Jan Wichelmann, Thomas Eisenbarth: Util::Lookup: Exploiting Key Decoding in Cryptographic Libraries
² Daniel Moghimi, Moritz Lipp, Berk Sunar, Michael Schwarz: Medusa: Microarchitectural Data Leakage via Automated Attack Synthesis
³ Jan Jancar, Vladimir Sedlacek, Petr Svenda, Marek Sys: Minerva: The curse of ECDSA nonces
⁴ CryptoHackers: Recovering a full PEM private key when half of it is redacted
⁵ Nadia Heninger, Hovav Shacham: Reconstructing RSA Private Keys from Random Key Bits
⁶ Gabrielle de Micheli, Nadia Heninger: Recovering cryptographic keys from partial information, by example
⁷ David Naccache, Nigel P. Smart, Jacques Stern: Projective Coordinates Leak
⁸ Alejandro Cabrera Aldaya, Cesar Pereida García, Billy Bob Brumley: From A to Z: Projective coordinates leakage in the wild
⁹ Robert Merget, Marcus Brinkmann, Nimrod Aviram, Juraj Somorovsky, Johannes Mittmann, Jörg Schwenk: Raccoon attack