Testing constant-timeness using Valgrind: case of the NSS library

Cryptographic code needs to be constant-time to not leak secrets via timing. Being constant-time is usually defined as:

No branching on secret-dependent values.
No memory access based on secret-dependent values.
No secret-dependent values given to some variable time functions.
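To make the first rule concrete, here is a minimal sketch contrasting a comparison that branches on secret data with one that does not. The function names `leaky_equal` and `ct_equal` are made up for illustration and are not taken from any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Variable-time: returns at the first mismatch, so the running time
 * reveals how long the matching prefix is. */
int leaky_equal(const uint8_t *a, const uint8_t *b, size_t len) {
    for (size_t i = 0; i < len; i++) {
        if (a[i] != b[i])
            return 0;
    }
    return 1;
}

/* Constant-time sketch: always touches every byte and accumulates
 * the differences without branching on them. */
int ct_equal(const uint8_t *a, const uint8_t *b, size_t len) {
    uint8_t diff = 0;
    for (size_t i = 0; i < len; i++)
        diff |= (uint8_t)(a[i] ^ b[i]);
    /* Maps diff == 0 to 1 and any nonzero diff to 0, arithmetically. */
    return (int)(1u & (((uint32_t)diff - 1u) >> 8));
}
```

Both functions return the same result; only the second one avoids a secret-dependent early exit. A compiler is still free to transform either of them, so the source alone is not a guarantee.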

There are a few ways of testing or verifying that code is constant-time, for example using the tools I described in a previous post. In this post I look at using Valgrind’s memcheck tool to test the constant-timeness of primitives in the NSS cryptographic library.

Testing constant-timeness using Valgrind#

As far as I know, the original idea of using Valgrind’s MemCheck for testing constant-timeness goes back to Adam Langley’s ctgrind, introduced in a blog post and on github back in 2010. Older versions of Valgrind did not expose the necessary interface, so a patch was needed; current versions of Valgrind need no patches. A modern presentation of the idea is Timecop.

The idea is to use Valgrind MemCheck’s memory definedness tracking as a sort of dynamic data dependency tracker that reports issues when data derived from undefined data is branched upon or is used for memory access.

First you write a test-wrapper for the function you want to test, which in this case is the compute function below. Then in the wrapper you mark all of the secrets as undefined using Valgrind’s client request mechanism. The macros VALGRIND_MAKE_MEM_UNDEFINED(addr, len) and VALGRIND_MAKE_MEM_DEFINED(addr, len) are available from <valgrind/memcheck.h>.

#include <valgrind/memcheck.h>

int compute(unsigned char secret[32]) {
    if (secret[0] == 0) {
        return 0;
    } else {
        return 1;
    }
}

int main(void) {
    unsigned char buf[32];
    for (int i = 0; i < 32; ++i)
        buf[i] = 0;
    VALGRIND_MAKE_MEM_UNDEFINED(buf, 32);
    compute(buf);
    return 0;
}

Then when you run valgrind on the produced binary you get:

Conditional jump or move depends on uninitialised value(s)
        at 0x1093BC: compute (code.c:4)
        by 0x109464: main (code.c:16)

Which shows that Valgrind’s MemCheck correctly detected the branching on secret values present in the compute function.
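For contrast, a branch-free variant of the same toy function could look like the sketch below. The name `compute_ct` is hypothetical. I would expect MemCheck to stay quiet on it when it is called from the same wrapper, but that depends on the compiler not reintroducing a branch, so the compiled binary still needs to be checked.

```c
/* Hypothetical branch-free counterpart of compute(): returns 0 when
 * secret[0] == 0 and 1 otherwise, without a secret-dependent branch. */
int compute_ct(const unsigned char secret[32]) {
    unsigned int b = secret[0];   /* value in 0..255 */
    /* b == 0 gives 0xFF >> 8 == 0; b >= 1 gives at least 0x100 >> 8 == 1. */
    return (int)((b + 0xFFu) >> 8);
}
```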

Properties & Limitations#

This method of testing constant-timeness has some limitations. It is a runtime technique so only code that gets executed gets tested. Getting good code coverage is thus important. While Valgrind MemCheck’s checking machinery is complex there is no guarantee that it in itself is correctly implemented or that it does not have false negatives or false positives. This is however the case with all software, and by doing any sort of tool-assisted analysis one has to include the tool in the trusted computing base and assume that it works.

Testing the Mozilla NSS library#

I started this work with the goal of trying out this technique of testing constant-timeness on a real-world library and also with the goal of upstreaming the changes to the library CI.

Working with NSS created quite a bit of hassle. It uses Mercurial for version control. I had never used Mercurial before and some of its concepts looked completely backwards to my git-familiar brain. Also, setting up and understanding the bazillion Mozilla services necessary to submit patches to NSS was quite involved.

Integrating annotations#

I decided to focus on testing constant-timeness of the cryptographic primitives in NSS first. Timing attacks often focus on these primitives, but some focus on higher level constructions in TLS (like Lucky13) so larger parts of the TLS stack should be tested. To use Valgrind to test constant-timeness of crypto primitives one needs to write test-cases which execute the primitives on secret inputs marked undefined. Luckily, NSS already has test-cases for many of its primitives in the bltest and fbectest binaries, so I added the Valgrind annotations to those and pushed the revision.

I only annotated private keys and random nonces as secret values. I decided to not annotate inputs to hash functions and messages in encryption functions, even though these are also often meant to be secret (e.g. when hashing a secret to produce a key, or when encrypting a secret message, duh).

Testing constant-timeness#

To actually run the test-cases in CI a test suite was necessary. I decided to copy over the cipher test suite, which runs the bltest binary in the usual way, and create a new test suite ct which does the same but under Valgrind’s memcheck (revision). I also added tests using Valgrind on the fbectest binary which performs ECDH.

Results#

I collected data while running on a machine with AMD Ryzen 7 PRO 4750U, using Valgrind 3.17.0 and gcc 11.1.0 on a Debug build of NSS targeting x86_64 based on revision ea6fb7d0d0fc. The test process reported AES-NI, PCLMUL, SHA, AVX, AVX2, SSSE3, SSE4.1, SSE4.2 all supported.

AES-GCM decryption#

The first report from Valgrind was for AES-GCM decryption:

Conditional jump or move depends on uninitialised value(s)
    at 0x5115E5C: intel_AES_GCM_DecryptUpdate (intel-gcm-wrap.c:319)
    by 0x5087D43: AES_Decrypt (rijndael.c:1206)
    by 0x11D496: AES_Decrypt (loader.c:509)
    by 0x10F46F: aes_Decrypt (blapitest.c:1159)
    by 0x113D4D: cipherDoOp (blapitest.c:2506)
    by 0x1170E1: blapi_selftest (blapitest.c:3358)
    by 0x118C62: main (blapitest.c:3912)

It points at line 319 in the intel_AES_GCM_DecryptUpdate function. If we look at that line we see:

if (NSS_SecureMemcmp(T, intag, tagBytes) != 0) {     // Line 319
    memset(outbuf, 0, inlen);
    *outlen = 0;
    /* force a CKR_ENCRYPTED_DATA_INVALID error at in softoken */
    PORT_SetError(SEC_ERROR_BAD_DATA);
    return SECFailure;
}

which is benign leakage, as what leaks is whether the GCM tag is valid or not. As Valgrind does not report further leaks inside of the secure compare function we can assume that only the result of the function leaks via the branch on line 319. I have to give bonus points to NSS here for clearing the output buffer when an AEAD tag verification fails and not passing the unauthenticated decrypted data to the caller.
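One way to tell MemCheck that such a verdict is public by design is to mark only the comparison result as defined before branching on it. The sketch below is an illustration under assumptions, not what NSS does: `tag_differs` and `verify_tag` are hypothetical names, and the fallback macro exists only so the snippet builds without the Valgrind headers installed.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#  endif
#endif
#ifndef VALGRIND_MAKE_MEM_DEFINED
/* Fallback (assumption): a no-op when the Valgrind headers are absent. */
#  define VALGRIND_MAKE_MEM_DEFINED(addr, len) ((void)(addr), (void)(len))
#endif

/* Hypothetical constant-time comparison: returns 0 on match, 1 otherwise. */
static int tag_differs(const uint8_t *a, const uint8_t *b, size_t len) {
    uint8_t diff = 0;
    for (size_t i = 0; i < len; i++)
        diff |= (uint8_t)(a[i] ^ b[i]);
    /* 0 stays 0, anything in 1..255 becomes 1, without a branch. */
    return (int)(((uint32_t)diff + 0xFFu) >> 8);
}

/* Returns 0 if the tag verifies and -1 otherwise. Only the one-bit
 * verdict is declassified; the tag bytes stay "secret" for MemCheck. */
int verify_tag(const uint8_t *computed, const uint8_t *received, size_t len) {
    int differs = tag_differs(computed, received, len);
    (void)VALGRIND_MAKE_MEM_DEFINED(&differs, sizeof differs);
    if (differs)   /* branch on a value that is public by design */
        return -1;
    return 0;
}
```

The design choice here is to declassify as late and as narrowly as possible, so that a leak inside the comparison itself would still be reported.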

DSA signing#

During the DSA_SignDigestWithSeed call in DSA signing, the raw private key data is converted to an mpi via mp_read_unsigned_octets. This seems to leak some amount of information on the private key. This leakage is similar to the leakage in ¹ and ², where a non-constant-time base64 decoding operation performed on private key data was targeted. However, as there is no base64 decoding here, only a transformation of raw data into an mpi, I expect the leakage to be rather small. With that said, in DSA even leaking partial information about the random nonce (mainly most or least significant bits) can lead to key recovery via lattice attacks like Minerva ³. I can’t tell whether this function is vulnerable, so more analysis is necessary.

Conditional jump or move depends on uninitialised value(s)
    at 0x50753D8: mp_cmp_z (mpi.c:1577)
    by 0x5079552: mp_read_unsigned_octets (mpi.c:4772)
    by 0x5031F07: dsa_SignDigest (dsa.c:384)
    by 0x50326F4: DSA_SignDigestWithSeed (dsa.c:547)
    by 0x11C971: DSA_SignDigestWithSeed (loader.c:184)
    by 0x10FA27: dsa_signDigest (blapitest.c:1326)
    by 0x114799: cipherDoOp (blapitest.c:2602)
    by 0x116F09: blapi_selftest (blapitest.c:3321)
    by 0x118C62: main (blapitest.c:3912)

Then Valgrind proceeds to report leakage all over the DSA signing process, in various mp_mul and mp_mulmod calls mostly, like the report below:

Conditional jump or move depends on uninitialised value(s)
    at 0x507712D: s_mp_clamp (mpi.c:2929)
    by 0x50740C8: mp_mul (mpi.c:888)
    by 0x5032288: dsa_SignDigest (dsa.c:439)
    by 0x50326F4: DSA_SignDigestWithSeed (dsa.c:547)
    by 0x11C971: DSA_SignDigestWithSeed (loader.c:184)
    by 0x10FA27: dsa_signDigest (blapitest.c:1326)
    by 0x114799: cipherDoOp (blapitest.c:2602)
    by 0x116F09: blapi_selftest (blapitest.c:3321)
    by 0x118C62: main (blapitest.c:3912)

However, the DSA signing code uses a blinding side-channel countermeasure in which the nonce $k$ is blinded and used as $t = k + q f$, where $f$ is a random integer with its most-significant bit set and $q$ is the order of the generator $g$. This blinding is done for the modular exponentiation: since $g$ has order $q$, $g^t \bmod p = g^k \bmod p$ for the DSA prime modulus $p$. For later steps, namely the modular inversion of the nonce $k$ and work with the private key, further blinding is done with random values from $\mathbb{Z}_q$. This blinding seems to be enough to stop the leakage of the nonce or the private key, but more detailed analysis would be required to be sure.
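The exponent blinding identity can be checked on toy numbers. The sketch below assumes the standard DSA setting, in which $g$ has order $q$ modulo a prime $p$; the values $p = 23$, $q = 11$, $g = 2$ and the function names are my own illustration, not NSS code.

```c
#include <stdint.h>

/* Square-and-multiply modular exponentiation on toy-sized numbers.
 * Deliberately not constant-time; it only checks the arithmetic. */
uint64_t toy_modexp(uint64_t base, uint64_t exp, uint64_t mod) {
    uint64_t result = 1 % mod;
    base %= mod;
    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % mod;
        base = (base * base) % mod;
        exp >>= 1;
    }
    return result;
}

/* Returns 1 when g^(k + q*f) == g^k (mod p) for the given toy values. */
int blinding_preserves_result(uint64_t g, uint64_t k, uint64_t f,
                              uint64_t q, uint64_t p) {
    uint64_t t = k + q * f;   /* blinded nonce */
    return toy_modexp(g, t, p) == toy_modexp(g, k, p);
}
```

With $g = 2$, $q = 11$, $p = 23$, calling `blinding_preserves_result(2, 7, 5, 11, 23)` compares $2^{62}$ and $2^{7}$ modulo 23. The exponent that the exponentiation routine sees changes with every random $f$, while the result does not.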

RSA decryption, RSA-OAEP decryption and RSA-PSS signing#

All of the RSA, RSA-OAEP decryption and RSA-PSS signature algorithms use the same underlying rsa_PrivateKeyOpCRTNoCheck function, so they can be described together.

Similarly to DSA signing, the function first loads the contents of the raw RSA-CRT private key into an mpi: \[p,\; q,\; d\pmod{p - 1},\; d\pmod{q - 1},\; q^{-1}\pmod{p}\] Here, the leakage might be more serious than in the DSA case, as more secret values are loaded and reconstructing the RSA private key given some information on those values might be possible (see for example ⁴, ⁵ or ⁶).

Furthermore, the code then proceeds to directly give the private values to several mp_mod and mp_exptmod calls which Valgrind flags as leaking. I expect the mp_exptmod function to leak a lot of information about the exponent, as it is just a general-purpose modular exponentiation function and not an RSA-specific one created with constant-timeness in mind.

ECDH key derivation#

The ECDH_Derive function uses the same logic to read the raw private key into an mpi as do DSA and RSA. This leaks some amount of information but like in the DSA case I expect it to be very small and practically unusable.

P-256#

The next reported group of issues concerns the NIST P-256 implementation in NSS. ECDH on this curve is implemented using the ec_GFp_nistp256_points_mul_vartime function. Now this may look like a red flag, but it is a benign one. Luckily, the variable time function is a two-scalar multiplication function which short-circuits to the constant-time ec_GFp_nistp256_point_mul if only one scalar input is provided, which is the case for ECDH. Why does Valgrind report issues then? Well, it only reports issues inside the from_montgomery function, which is supposed to transform the resulting point coordinates back from Montgomery form after the scalar multiplication. It is hard to classify how serious this leakage is. It reminds me of a few papers (⁷ and ⁸) which exploited similar side-channel leakage of the projective representation of a result of scalar multiplication to extract partial information about the scalar used. However, that was done for ECDSA, where partial information about the scalar can lead to key recovery. For ECDH, this leakage might reveal some bits of the scalar. Perhaps in case of static ECDH an attacker could adaptively query the scalar multiplication while measuring this leaking conversion and mount something like the Raccoon attack (⁹)?

The final report by Valgrind for P-256 is about the ec_point_at_infinity function, which is used to test whether the derived point in ECDH is not the point at infinity and if it is, to abort the operation. I think this is benign leakage as it is leaked anyway when the operation fails.

P-384 and P-521#

NSS’s implementation of the P-384 and P-521 curves is from ECCKiila which itself uses fiat-crypto. Its use in ECDH still has the reading of the private key leak and the final comparison to point at infinity but doesn’t have the from_montgomery leaks, because of a different scalar multiplication implementation.

Curve25519#

Valgrind only reports one issue in the Curve25519 case and that is a branch on a comparison of the output point to the point at infinity, which is again benign.

ECDSA signing#

During this implementation work I stumbled upon some broken tests of ECDSA in the bltest binary and reported them here.

The ECDSA signing code in NSS is very similar to DSA signing and Valgrind reports roughly the same issues. The ECDSA code also uses blinding, but only on the modular arithmetic level; the random nonce is not blinded when it is used in the scalar multiplier. Due to the broken nature of the tests I couldn’t really look at how different curves are handled in ECDSA, but a cursory run saw no leakage from the from_montgomery function as in ECDH on P-256.

As in DSA, the random nonce is also read into an mpi in ECDSA using the mp_read_unsigned_octets function and Valgrind reports some leakage present. Similarly to the DSA case, this leakage might be an issue because it presents information on the random nonce which can lead to vulnerability to lattice attacks.

Conclusions#

Doing this analysis was harder but also more insightful than I expected. Here are some of my observations:

Getting this testing of constant-timeness to work, even with a simple tool like Valgrind, on a library as a new contributor is definitely harder than I thought. Maybe this is just NSS, but I expect other popular open-source crypto libraries to be similar.
Testing constant-timeness of cryptographic code does not end with running Valgrind on annotated test-cases or even after including constant-timeness test runs in CI. The results presented by Valgrind require actual human eyes to look at them and evaluate whether the leaks are serious or benign. Even I am unsure of some of the leaks presented and I have experience with timing attacks and the related cryptosystems. The noise from benign leaks presented in Valgrind output needs to be solved somehow before these patches can land and run in CI, otherwise these failing tests can mask the introduction of actual exploitable leaks. Valgrind offers a way to silence MemCheck warnings from given codepaths (suppressions), which could be used to achieve this.
I believe that some of the presented leaks should be fixed, namely the RSA ones seem to be quite serious.

Just to add, this work was done and this post written before the post of Google Project Zero’s Tavis Ormandy on an unrelated vulnerability in NSS. I certainly don’t want this post to sound like I’m criticizing the folks behind NSS and want this post to help them instead.

¹ Florian Sieck, Sebastian Berndt, Jan Wichelmann, Thomas Eisenbarth: Util::Lookup: Exploiting Key Decoding in Cryptographic Libraries
² Daniel Moghimi, Moritz Lipp, Berk Sunar, Michael Schwarz: Medusa: Microarchitectural Data Leakage via Automated Attack Synthesis
³ Jan Jancar, Vladimir Sedlacek, Petr Svenda, Marek Sys: Minerva: The curse of ECDSA nonces
⁴ CryptoHackers: Recovering a full PEM private key when half of it is redacted
⁵ Nadia Heninger, Hovav Shacham: Reconstructing RSA Private Keys from Random Key Bits
⁶ Gabrielle de Micheli, Nadia Heninger: Recovering cryptographic keys from partial information, by example
⁷ David Naccache, Nigel P. Smart, Jacques Stern: Projective Coordinates Leak
⁸ Alejandro Cabrera Aldaya, Cesar Pereida García, Billy Bob Brumley: From A to Z: Projective coordinates leakage in the wild
⁹ Robert Merget, Marcus Brinkmann, Nimrod Aviram, Juraj Somorovsky, Johannes Mittmann, Jörg Schwenk: Raccoon attack

The state of tooling for verifying constant-timeness of cryptographic implementations

This post explores the current state of tools for verification of constant-time properties in cryptographic implementations, both static and dynamic. These tools are mostly unused in the development of open-source cryptographic libraries and remain only as results of academic work. I know of only four open-source cryptographic libraries that utilize these tools in an automated manner, similar to how unit tests, test-vectors, or even fuzzing are commonplace. Below is a list of what popular open-source cryptographic libraries run in their Continuous Integration (CI) setups, collected on a best-effort basis.

UPDATE: For an updated list of tools see the following Github page: https://crocs-muni.github.io/ct-tools/

OpenSSL: Builds, tests, fuzzing (OSS-Fuzz)
LibreSSL: Builds, tests, fuzzing (OSS-Fuzz)
BoringSSL: Builds, tests, fuzzing (Custom buildbots + OSS-Fuzz), constant-time verification using a ctgrind-like approach
BearSSL: No public CI, fuzzing (OSS-Fuzz) (constant-time documentation)
Botan: Builds, tests, fuzzing (Travis + OSS-Fuzz), constant-time verification using ctgrind
Crypto++: Builds, tests
wolfSSL: No public CI, fuzzing (OSS-Fuzz)
mbedTLS: Builds, tests, fuzzing (OSS-Fuzz), constant-time verification using ctgrind and MemSan
libtomcrypt: Builds, tests
libgcrypt: No public CI?
libsodium: Builds, tests, fuzzing (OSS-Fuzz)
MatrixSSL: No public CI?
Amazon s2n: Builds, tests, fuzzing, constant-time verification using ct-verif and SideTrail
GnuTLS: Builds, tests, fuzzing (OSS-Fuzz)
NSS: Builds, tests, fuzzing (OSS-Fuzz)

Of particular note is the cryptofuzz project, which fuzzes the above (and more) cryptographic libraries as part of OSS-Fuzz.

Evaluation#

The tools presented here are evaluated and categorized based on several characteristics.

The first and most significant criterion is the general approach the tool takes and whether it is dynamic or static, i.e., whether it runs the target or not. Dynamic tools usually instrument the target in some way, observe its runs with varying inputs and then evaluate whether differences in runs leak via an observable timing side-channel. Static tools, on the other hand, usually use formal techniques from static analysis and verification of programs to analyze the target and conclude whether it is constant time with regard to some leakage model.

A second significant criterion that differentiates many of the tools is the input level at which they work. Several tools work on the source code level, most with the C language, some with a custom domain-specific language. Other tools choose to work on a lower level, often in LLVM Intermediate Representation (IR), an assembly-like language in Static Single Assignment (SSA) form used in the LLVM toolchain but also in many open-source program analysis tools. As all of the mentioned input levels are above the assembly/binary level, they are exposed to a level of risk that a compiler will somehow introduce timing leakage, which will not be caught. This risk is entirely realistic, as general-purpose compilers do not offer side-channel resistance and have no obligation to keep the code leakage free. Finally, some tools work directly with compiled binaries and thus provide the highest guarantees that a compiler will not introduce leakage.

The leakage model used and the possibility to configure it forms another essential property of the tools. I consider the leakage model to be what the tool considers to be sources of timing leakage. There are three common leakage models that are often combined in the tools. The branching leakage model considers all branching instructions (program counter changes) conditional on secret values to be leaking. The memory-access leakage model considers all secret dependent memory accesses to be leaking. Lastly, the operand leakage model considers all use of particular instructions with secret dependent operands to be leaking the operands. This leakage model is specific to some processor architectures and instructions which take a variable time to execute, see, for example, the variable time multiplier on ARM (ARM7TDMI Technical Reference Manual). The three mentioned leakage models are usually used together. Both the branching and memory-access leakage models have a version that modifies them such that the existence of processor cache is properly modeled. For example, this allows for code that accesses memory based on a secret value, but only in the space of one cache line (sometimes as a result of applying a cache preloading countermeasure), thus a cache attacker gains no secret information. The above models, especially the operand one, have drawbacks as they require assumptions on hardware behavior or modeling of hardware, and are hardware-specific. Dynamic tools might consider a completely different leakage model, for example, the attacker might only learn the number of instructions that were executed during a run of a function, or sometimes its runtime. This model is applicable for remote or local - but not cache - timing attackers.
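As an illustration of the memory-access model, the sketch below contrasts a table lookup whose address depends on a secret index with one that reads every entry and selects the wanted one with a mask. Both function names are hypothetical, and the scanning variant is only one common way of avoiding secret-dependent addresses.

```c
#include <stddef.h>
#include <stdint.h>

/* Leaky under the memory-access model: the address depends on idx. */
uint8_t leaky_lookup(const uint8_t *table, size_t len, size_t idx) {
    (void)len;   /* unused; kept so both functions share a signature */
    return table[idx];
}

/* Sketch of a lookup that reads every entry and selects with a mask,
 * so the sequence of accessed addresses does not depend on idx. */
uint8_t ct_lookup(const uint8_t *table, size_t len, size_t idx) {
    uint8_t out = 0;
    for (size_t i = 0; i < len; i++) {
        size_t x = i ^ idx;   /* 0 only when i == idx */
        /* (x | -x) has its top bit set exactly when x != 0. */
        size_t nonzero = (x | (0 - x)) >> (sizeof(size_t) * 8 - 1);
        uint8_t mask = (uint8_t)(nonzero - 1);   /* 0xFF if equal, else 0x00 */
        out |= (uint8_t)(table[i] & mask);
    }
    return out;
}
```

Under a cache-aware variant of the model, the first function might be acceptable when the whole table fits into a single cache line, which is the kind of distinction the configurable models try to capture.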

For static tools, the properties of soundness and completeness are achievable and essential. A sound tool only deems secure programs secure, thus has no false negatives, while a complete one only deems insecure programs insecure, thus has no false positives. Here I adopt the notion that the tool aims to detect the presence of timing leakage (positive result) and thus derive the false positive/negative notions as above. In the case of dynamic tools, soundness and completeness are often unachievable, and one can only argue about the false-negative rate and the false-positive rate, or more generally about the classes of leakage the tool can find and the classes of program constructs the tool will flag falsely.

Connected to the notion of errors in the tool’s output is its flexibility regarding which values it considers secret and whose leakage it detects. Some tools give the developer the ability to declassify a secret value, thereby exempting it from analysis. This is clearly a double-edged sword. While necessary for some program constructions, its abuse would lead to false negatives. One common application of declassification in cryptographic implementations is in rejection sampling, i.e., when a secret is repeatedly sampled and thrown away until a condition is satisfied. Rejection sampling leads to variable-time code, however with some slight assumptions on the random number generator and the condition on the secret, such leakage is benign and should be ignored. Another source of benign leakage exists in the form of publicly observable outputs. They arise, for example, in the decryption function of an authenticated encryption cipher, as the verification result (which is clearly secret dependent and which is returned) also affects whether the ciphertext gets decrypted. The tool support for such outputs forms another criterion which expands its possibilities of use.
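A rejection sampling loop with declassification might be sketched as follows, using Valgrind's client requests as the annotation mechanism. Everything here is an assumption made for illustration: `toy_rng` is an insecure stand-in for a real random number generator, `sample_below` is a hypothetical name, and the fallback macros only exist so the snippet builds without the Valgrind headers.

```c
#include <stdint.h>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#  endif
#endif
#ifndef VALGRIND_MAKE_MEM_UNDEFINED
/* Fallbacks (assumption): no-ops when the Valgrind headers are absent. */
#  define VALGRIND_MAKE_MEM_UNDEFINED(addr, len) ((void)(addr), (void)(len))
#  define VALGRIND_MAKE_MEM_DEFINED(addr, len) ((void)(addr), (void)(len))
#endif

/* Insecure stand-in for a CSPRNG, only to keep the sketch self-contained. */
typedef struct { uint32_t state; } toy_rng;

static uint32_t toy_rng_next(toy_rng *rng) {
    rng->state = rng->state * 1664525u + 1013904223u;   /* LCG step */
    return rng->state >> 24;                             /* one byte */
}

/* Samples a value in [0, bound) for 0 < bound <= 256 by rejection. */
uint32_t sample_below(toy_rng *rng, uint32_t bound) {
    for (;;) {
        uint32_t candidate = toy_rng_next(rng);
        /* Treat every candidate as secret. */
        (void)VALGRIND_MAKE_MEM_UNDEFINED(&candidate, sizeof candidate);
        /* 1 if candidate < bound, computed without branching. */
        uint32_t accept = (candidate - bound) >> 31;
        /* Declassify only the accept/reject bit, not the candidate. */
        (void)VALGRIND_MAKE_MEM_DEFINED(&accept, sizeof accept);
        if (accept)
            return candidate;
    }
}
```

The number of loop iterations still varies, but it only reveals how many candidates were rejected, which says nothing about the accepted value as long as the candidates are independent.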

Cryptographic implementations pose a unique challenge for (static) program verification tools. With cryptographic sizes of inputs and outputs, the state space (which program verification tools often work with) can be too large for the tools, even with their many analysis tricks. Thus, performance or even whether the tool is practically usable on real-world cryptographic codebases is an important criterion.

Last but not least, there is a concern for usability. It is the ease of use of the tool that, in the end, drives most of the adoption. Usability is hard to characterize, as it contains elements of all criteria discussed thus far. For example, if a domain-specific language (DSL) is used as the input level to the tool, existing projects will likely not adopt it, as it will require a rewrite of a part of their codebase. As these tools are products of research papers, the usual saying about the quality of research code, tooling and packaging applies as well. Furthermore, with the absence of proper packaging, the tools and their dependencies are often left outdated and no longer work on current versions of their dependencies. All of this points to usability being a significant concern for the adoption of tools for verification of constant-time properties.

The platform support of the tools is unclear as they usually only explicitly target and evaluate on x86. However, as their input level is often the LLVM IR or source code, they could work on other platforms.

Tools#

The tools below are discussed in this report, ordered chronologically. Some properties of the tools are unclear and are marked with a ?. This list will hopefully grow as I find time to look at more tools and add them.

ctgrind#

(2010) github(agl) github(dfaranha)

ctgrind is a patch available for the Valgrind (memcheck) tool, which adds functionality to mark areas of memory as uninitialized. This is to be used on secrets. At runtime, the memcheck tool then checks that the secret (uninitialized) memory is not used in branches or for memory access. As Valgrind’s memcheck supports the VALGRIND_MAKE_MEM_UNDEFINED and VALGRIND_MAKE_MEM_DEFINED client requests, it is now possible to implement a ctgrind-like approach without patches to Valgrind.

Approach: Dynamic
Input level: Source code required to embed annotations, then binary for analysis.
Leakage model: Branching model, memory-access model
Soundness: No, Completeness: No
Declassification: Yes, Publicly observable outputs: No
Performance: Ok
Usability: Good. Developers are often experienced with Valgrind and similar tools. However, the need for a custom patch, and thus a recompile of Valgrind, hinders usability, as the patch is not maintained and might get out of date or no longer be supported by upstream Valgrind.

ct-verif#

(2015) github(imdea) github(michael-emmi) paper

The ct-verif tool is a static analysis tool verifying constant-time properties of code, working on the level of LLVM IR, with source code annotations. It uses the SMACK modular software verification toolchain, Bam-Bam-Boogieman for Boogie source transformation, Boogie intermediate verification language as well as the Corral and Z3 solvers.

The tool is actively deployed in the CI of Amazon’s s2n library at link. However, even there, it is only used to verify two functions that together have less than 100 lines of code.

Approach: Static
Input level: Source code required to embed annotations, then LLVM IR for analysis.
Leakage model: Branching model, memory-access model, or even the operand model is possible.
Soundness: Yes, Completeness: Yes
Declassification: Yes, Publicly observable outputs: Yes
Performance: ?
Usability: Bad. The tool relies on a whole host of other tools and has been broken by updates at times. The latest update to the repository is in 2018.

dudect#

(2016) github ePrint

dudect is a dynamic tool that uses leakage assessment techniques from physical (power and EM) side-channel analysis, namely test-vector leakage assessment (TVLA). It first runs the target using two classes of secret input data with varying public input data and measures the duration of execution for each run. It then applies a test to the two distributions of the duration of execution for the two classes (either Welch’s t-test for equality of means or Kolmogorov-Smirnov test for equality of distributions), and if the distributions differ, leakage is reported. This is analogous to how leakage assessment is used in power side-channel attacks, in that instead of comparing distributions of power consumption at points during the execution of the target, the runtime distributions are compared.
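The statistical core of this approach can be sketched in a few lines. The code below is not dudect's implementation: it computes the squared Welch statistic $t^2 = (\bar{a} - \bar{b})^2 / (s_a^2/n_a + s_b^2/n_b)$ on two arrays of timing measurements, squared only to avoid a square root. The function names are hypothetical and the threshold passed to `timing_differs` is left to the caller (4.5 is a cut-off often quoted for TVLA, taken here as an assumption).

```c
#include <stddef.h>

/* Sample mean and unbiased sample variance of n >= 2 measurements. */
static void mean_var(const double *x, size_t n, double *mean, double *var) {
    double sum = 0.0, sq = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    *mean = sum / (double)n;
    for (size_t i = 0; i < n; i++) {
        double d = x[i] - *mean;
        sq += d * d;
    }
    *var = sq / (double)(n - 1);
}

/* Squared Welch t statistic of two timing classes. */
double welch_t_squared(const double *a, size_t na,
                       const double *b, size_t nb) {
    double ma, va, mb, vb;
    mean_var(a, na, &ma, &va);
    mean_var(b, nb, &mb, &vb);
    double denom = va / (double)na + vb / (double)nb;
    double diff = ma - mb;
    if (denom == 0.0)   /* degenerate case: no noise in either class */
        return diff == 0.0 ? 0.0 : 1e300;
    return (diff * diff) / denom;
}

/* Flags leakage when |t| exceeds the caller-supplied threshold. */
int timing_differs(const double *a, size_t na,
                   const double *b, size_t nb, double t_threshold) {
    return welch_t_squared(a, na, b, nb) > t_threshold * t_threshold;
}
```

In practice the two arrays would hold many measured durations, one array per input class, and real tools also pre-process the measurements before testing.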

Approach: Dynamic
Input level: Binary
Leakage model: Instruction counter or the runtime of a function call
Soundness: No, Completeness: No
Declassification: No, Publicly observable outputs: No
Performance: Good
Usability: Ok

FlowTracker#

(2016) page paper code

The FlowTracker tool is a static tool that works by analyzing the Program Dependence Graph (PDG) of the target in LLVM IR form.

Approach: Static
Input level: LLVM IR
Leakage model: Branching model, memory-access model.
Soundness: Yes, Completeness: No
Declassification: No, Publicly observable outputs: No
Performance: Good
Usability: Bad, uses a very old version of the LLVM compiler stack.

SideTrail#

(2018) paper preprint

SideTrail (at one point called SideWinder) is a tool for verifying time-balanced implementations. The notion of time-balance is a weakening of the constant-time notion that allows for the presence of leakage that is provably under some bound $\delta$ (execution time is negligibly influenced by secrets). For $\delta = 0$ this notion fits well with the notion of constant-time. The tool uses a cross-product technique similar to that of ct-verif. However, instead of asserting the equality of memory accesses and program counter, it asserts the equality of an instruction counter. Its leakage model and technique are well suited against remote (non-cache) attackers.

The tool is deployed in the CI of Amazon’s s2n library at link, where it is used to verify the time-balancedness of several parts of the codebase, handling the CBC decryption, HMAC padding, and AEAD decryption.

Approach: Static
Input level: Source code for annotations, then LLVM IR
Leakage model: Duration as measured by an instruction counter and a model of instruction runtime
Soundness: ?, Completeness: ?
Declassification: No, Publicly observable outputs: No
Performance: Ok.
Usability: Ok. It requires code annotations, as well as providing manual assumptions and loop invariants to ease the verifier’s work.

MicroWalk#

(2018) github arXiv

The MicroWalk framework is a dynamic tool that uses Dynamic Binary Instrumentation (DBI) and Mutual Information Analysis (MIA). As a dynamic tool, it runs the target with random inputs and uses dynamic binary instrumentation to log events such as memory allocations, branches, calls, returns, memory reads/writes as well as stack operations into an execution trace. It then processes these traces by applying the chosen leakage model; e.g., in the branching model, it only keeps the control flow events in the execution traces. After collection of traces, it offers several analysis options, either directly comparing the traces or using mutual information analysis either on the whole trace or a specific offset in the execution traces (a specific instruction).

Approach: Dynamic
Input level: Binary
Leakage model: Branching model, memory-access model, with the possibility of extending that allows the operand model as well.
Soundness: No, Completeness: No
Declassification: No, Publicly observable outputs: No
Performance: Good
Usability: Good. It was Windows-only until recently, but now it also supports Linux.

DATA#

(2018,2020) github paper(2018) paper(2020)

DATA (Differential Address Trace Analysis) is a tool quite similar to the MicroWalk framework in that it is a dynamic tool that records memory accesses of the target into address traces as it processes random secret inputs. The traces are then aligned and analyzed using generic and specific leakage tests. The tool reports the location of leakage and even offers a graphical user interface for analysis.

Approach: Dynamic
Input level: Binary
Leakage model: Branching model, memory-access model.
Soundness: No, Completeness: No
Declassification: No, Publicly observable outputs: No
Performance: Ok
Usability: Good

FaCT#

(2019) github paper

The FaCT tool is less of a tool for analysis of implementations for timing leakage and more of a domain-specific language for writing constant-time implementations that automatically removes leakage during compilation. The language is C-like, compiles into LLVM IR, and offers the secret keyword, which is used to mark certain variables as secret, which then triggers the compiler to generate constant-time code with regards to their values.

Approach: Static
Input level: A domain-specific language called FaCT
Leakage model: Branching model, memory-access model, and operand model.
Soundness: Yes, Completeness: No
Declassification: Yes, Publicly observable outputs: No
Performance: Ok
Usability: Bad for projects with existing codebases as it requires the use of a compiled DSL. Good for new implementations, as it was user-tested and found to improve the developer’s ability to write constant-time code.

ct-fuzz#

(2019) github paper arXiv

ct-fuzz takes inspiration from ct-verif in its method but diverges significantly. It first constructs a product program using self-composition of the target with itself, where it asserts that at each point the memory address accessed by the two programs, whether through control flow or indexing, is the same. It then uses a fuzzer against this product program, which splits its fuzzing input equally into the secret inputs for the two instances of the original program in the product program. If the fuzzer detects a failed assert, leakage is detected, as it found two runs through the target which differ only in secret inputs yet access different offsets in memory.
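The self-composition idea can be illustrated by hand on a toy target. In the sketch below, which is not ct-fuzz code, the instrumentation that a real tool inserts automatically is replaced by explicit `record` calls, and `traces_match` plays the role of the assertion in the product program. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define TRACE_MAX 64

/* Records the table indices a run touches; a stand-in for the
 * instrumentation a real tool would insert automatically. */
typedef struct {
    size_t idx[TRACE_MAX];
    size_t len;
} trace;

static void record(trace *t, size_t index) {
    if (t->len < TRACE_MAX)
        t->idx[t->len++] = index;
}

/* Toy target with a secret-indexed lookup (leaky). */
uint8_t target_leaky(const uint8_t table[16], uint8_t secret, trace *t) {
    size_t i = secret & 15u;
    record(t, i);
    return table[i];
}

/* Toy target that scans the whole table, so its accesses are fixed. */
uint8_t target_scan(const uint8_t table[16], uint8_t secret, trace *t) {
    uint8_t out = 0;
    for (size_t i = 0; i < 16; i++) {
        record(t, i);
        uint32_t x = (uint32_t)(i ^ (secret & 15u));   /* 0 iff match */
        uint8_t mask = (uint8_t)((x - 1u) >> 8);       /* 0xFF iff x == 0 */
        out |= (uint8_t)(table[i] & mask);
    }
    return out;
}

typedef uint8_t (*target_fn)(const uint8_t table[16], uint8_t secret,
                             trace *t);

/* Product-program check: same public input, two secrets.
 * Returns 1 if both runs touched the same indices in the same order. */
int traces_match(target_fn f, const uint8_t table[16],
                 uint8_t s1, uint8_t s2) {
    trace t1 = { {0}, 0 }, t2 = { {0}, 0 };
    (void)f(table, s1, &t1);
    (void)f(table, s2, &t2);
    if (t1.len != t2.len)
        return 0;
    for (size_t i = 0; i < t1.len; i++)
        if (t1.idx[i] != t2.idx[i])
            return 0;
    return 1;
}
```

A fuzzer would then search for a pair of secrets `s1`, `s2` for which `traces_match` returns 0. For `target_leaky` almost any pair works, while for `target_scan` no such pair exists.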

Approach: Dynamic, using fuzzing, with static analysis used to construct the fuzzed program.
Input level: LLVM IR, implemented as an LLVM IR transformation in afl-fuzz but requires source code to embed annotations.
Leakage model: The branching model and the memory-access model are used together, with the configurable option of extending them to be cache-aware.
Soundness: No, Completeness: Yes
Declassification: Yes, Publicly observable outputs: No
Performance: Good
Usability: Ok, uses afl-fuzz, which is a well-known fuzzing tool. However, it uses outdated versions of several dependencies and thus cannot be built without downgrading or fixing them.

TIMECOP#

(2020) page page(SUPERCOP)

TIMECOP uses Valgrind’s memcheck client requests VALGRIND_MAKE_MEM_{UN}DEFINED to essentially implement a method like ctgrind. It is a part of the SUPERCOP toolkit (System for Unified Performance Evaluation Related to Cryptographic Operations and Primitives) and is used to evaluate the constant-time properties of implementations in SUPERCOP.
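The two client requests also give a simple way to declassify a value: mark it as defined again once it is considered public. The sketch below is my own illustration, not code from TIMECOP or SUPERCOP. The functions `verify_tag` and `checked_verify` are hypothetical, and the no-op fallback macros are only an assumption so that the example still compiles where the Valgrind headers are not installed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Use the real client requests when available, otherwise no-op stand-ins. */
#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define HAVE_MEMCHECK 1
#  endif
#endif
#ifndef HAVE_MEMCHECK
#  define VALGRIND_MAKE_MEM_UNDEFINED(addr, len) ((void)(addr), (void)(len))
#  define VALGRIND_MAKE_MEM_DEFINED(addr, len)   ((void)(addr), (void)(len))
#endif

/* Fixed-length tag comparison; the result is derived from secret data. */
static int verify_tag(const uint8_t *expected, const uint8_t *received, size_t len) {
    uint8_t acc = 0;
    for (size_t i = 0; i < len; ++i)
        acc |= (uint8_t)(expected[i] ^ received[i]);
    return acc == 0;
}

/* Test wrapper: poison the secret, run the target, declassify the verdict. */
static int checked_verify(uint8_t *expected, const uint8_t *received, size_t len) {
    assert(expected != NULL && received != NULL);
    (void)VALGRIND_MAKE_MEM_UNDEFINED(expected, len);  /* expected tag is secret */
    int ok = verify_tag(expected, received, len);
    /* Accept/reject is public, so mark it defined before branching on it. */
    (void)VALGRIND_MAKE_MEM_DEFINED(&ok, sizeof ok);
    (void)VALGRIND_MAKE_MEM_DEFINED(expected, len);    /* undo the poisoning */
    if (ok) return 1;
    return 0;
}
```

Without the first VALGRIND_MAKE_MEM_DEFINED call, the final `if (ok)` would be reported when run under Valgrind, since `ok` is derived from data marked as undefined.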

Approach: Dynamic
Input level: Source code required to embed annotations, then binary for analysis.
Leakage model: Branching model, memory-access model
Soundness: No, Completeness: No
Declassification: Yes, Publicly observable outputs: No
Performance: Good
Usability: Ok

Binsec/Rel#

(2020) paper preprint

Binsec/Rel is a static analysis tool that works on the binary level, thereby overcoming the issue of compilers inserting non-constant-time code or turning constant-time code into non-constant-time code.

Approach: Static
Input level: Binary
Leakage model: Branching model, memory-access model.
Soundness: Yes, Completeness: Yes
Declassification: No, Publicly observable outputs: No
Performance: Good
Usability: ?

Miscellaneous#

Several other tools and languages exist that are related to the analysis of implementations for verifying constant-time properties. I chose to omit a detailed analysis of these tools for now, as the analyzed selection presents a good sample of what the landscape has to offer. Several of these tools also enable one to remove leaks from leaking implementations or create constant-time implementations. I list the tools below in no particular order.

MemSan page
pitchfork-angr github
Haybale-pitchfork github
tis-ct archived page
VirtualCert paper ePrint
CacheD paper
CacheAudit paper
ABPV13 paper preprint
KMO12 paper ePrint
CT-WASM paper
Jasmin paper
Low* paper
SC-Eliminator paper preprint
Vale paper
Blazy et al. 2017 paper preprint
Blazer paper preprint
Themis paper preprint
COCO-CHANNEL paper
CACHEFIX preprint
ENCoVer preprint
Constantine paper

Conclusions#

There is an abundance of tools for verifying constant-time properties of cryptographic implementations, yet none seem to be actually used in an automated way outside of the papers that introduced them. This is troubling, as their impact is then limited to the select few implementations that the authors chose to verify in a given work, while real-world cryptographic libraries change daily and have no automated verification.

Of the analyzed static tools, ct-verif and SideTrail stand out, as they are actively deployed; also of note is the Binsec/Rel tool, for working at the binary level.

The usability of dynamic tools is usually much better than that of static tools, with the ctgrind/TIMECOP approach being almost zero cost in terms of integration, as ordinary tests or fuzzing together with Valgrind suffice. The dudect tool could also be used in continuous integration, provided some test harnesses are created for it. The Microwalk/DATA tools are quite similar and might be more suited to interactive testing of implementations.