Analysis of the Covid19 ZostanZdravy app - Contact-tracing

Covid19 ZostanZdravy

This post analyzes the Slovak contact-tracing app Covid19 ZostanZdravy from a security and privacy perspective. The app is being developed by volunteers from Sygic, but officially runs under the control of NCZI, the National Health Information Center, with the data owned by UVZ, the Public Health Authority of Slovakia (see the privacy policy). This analysis was performed from publicly available sources, which was possible as both the app and backend are open-source (the analyzed commits were 400aa52, 2710f09 and f9b9d2c). The text below describes the issues I see in the current workings of the contact-tracing part of the app and provides an outlook on fixing them and moving forward. The analysis is a best-effort one done in a day. It might contain errors, or I might have misrepresented something, so I am open to comments.

Privacy

The app does not use an established contact-tracing protocol, such as DP-3T, PEPP-PT NTK or ROBERT, but instead uses a custom designed protocol to perform contact-tracing. This is because the app predates those protocols by a few weeks. The contact-tracing protocol is a BLE-based contact-tracing protocol with static IDs that roughly works as follows:
  1. The user installs the app, which generates a deviceID (a random UUID for the device), enrolls the device with the server and receives back a profileID, an unsigned integer assigned by the server in an increasing sequence.
  2. The app then broadcasts the profileID of the device over BLE and listens for the profileIDs broadcast by other devices.
  3. The app periodically uploads a list of seen profileIDs to the server. This upload, like all of the app's interaction with the server, is authenticated by the deviceID, which is sent to the server in every request and otherwise kept on the device. The uploaded list of contacts used to contain the time and duration of the contacts, but this was abandoned and now only the day of contact is uploaded.
  4. When the user becomes infected, the actions of the protocol become unclear. The open-source backend is just an HTTP API, and the administration of the whole system is done through an admin app that interacts with the backend but is not open-source. However, something can be deduced from the API offered by the backend, as it offers one administrative call to query the seen profileIDs by a given device (identified by both the deviceID and profileID). This call is likely used by the admin app to query the contacts of a newly infected user and send alerts/quarantine recommendations to them. It is important to note that this call reports one-sided contacts as reported by the users.
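
The first three steps can be modelled in a few lines of Python. This is only a toy, in-memory sketch of the data flow described above, not the backend's actual code: the class and method names (`ToyServer`, `register`, `upload_contacts`, `contact_graph`) are hypothetical, and the admin query from step 4 is deliberately left out because its exact behavior is not known.

```python
import itertools
import uuid
from collections import defaultdict


class ToyServer:
    """Hypothetical in-memory stand-in for the backend described above."""

    def __init__(self):
        self._next_profile_id = itertools.count(1)  # sequential profileIDs
        self._device_of = {}                        # profileID -> deviceID
        self._seen = defaultdict(set)               # reporter -> {(day, seen profileID)}

    def register(self, device_id):
        profile_id = next(self._next_profile_id)
        self._device_of[profile_id] = device_id
        return profile_id

    def upload_contacts(self, device_id, profile_id, day, seen_ids):
        # Requests are authenticated only by the secret deviceID.
        if self._device_of.get(profile_id) != device_id:
            raise PermissionError("deviceID does not match profileID")
        self._seen[profile_id].update((day, other) for other in seen_ids)

    def contact_graph(self):
        # The server holds every upload, infected or not, so it can
        # rebuild the pseudonymous contact graph at any time.
        return {
            reporter: {other for _, other in pairs}
            for reporter, pairs in self._seen.items()
        }


server = ToyServer()
alice_dev, bob_dev = str(uuid.uuid4()), str(uuid.uuid4())
alice = server.register(alice_dev)
bob = server.register(bob_dev)
server.upload_contacts(alice_dev, alice, "2020-04-20", [bob])
print(server.contact_graph())  # {1: {2}}
```

Note that nothing in this flow depends on anyone being infected: the graph accumulates on the server from ordinary daily uploads.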

This approach clearly provides the whole contact graph of a user's device to the server, whether the user is infected or not. Such a contact graph, while it is pseudonymous, leaks significant private information about the users to the server (see this document, section 4).

Contact reporting

As described above, it is likely that the reporting of contacts of an infected user uses only one-sided contacts as submitted by users' devices, i.e. when querying the contacts of a user X, the contacts of all users are queried for X's profileID (see the code). This might make sense if one accounts for the possibility of some devices going offline and not uploading their contacts. If contact reports from both parties were necessary to report a contact, this might pose problems. However, this implicit trust in users' reported contacts, together with the way profileIDs are assigned (unsigned increasing integer sequence, see the code here and here), creates an attack on the system in which an attacker can learn the infection status of all of the users.

  1. Attacker first creates a new profile, and receives back their profileID. As profileIDs are generated incrementally, the attacker can now enumerate all previously registered profileIDs.
  2. The attacker creates a new profile for each of the existing users' profiles.
  3. The attacker then reports a contact from each of their profiles with exactly one of the legitimate users' profiles, i.e. attacker's profile #1 reports contact with user profile #1, attacker's profile #2 reports contact with user profile #2, and so on.
  4. When any of the users registered before the attacker's profile registration is confirmed infected, the query for their contacts will always include exactly one of the attacker's profiles, and the attacker will get a notification of having been in contact with an infected user. As each attacker profile reported only a single profileID, the notification tells the attacker which profileID is infected.
  5. There is also the possibility of extending this attack to complete deanonymization of an infected user, by placing BLE listening devices in particular public places, together with a camera capturing the area, and then correlating the captured broadcasts with the camera view of the area (see here). This data collection can be performed even before the attack itself or before the user's infection.
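
Steps 1 to 4 can be simulated with a toy model. The sketch below assumes the one-sided query behaves as described above (everyone who claims to have seen the infected profile gets notified); the function names are hypothetical and the real backend is not involved.

```python
import itertools
from collections import defaultdict

next_id = itertools.count(1)   # sequential profileIDs, as assigned by the server
reports = defaultdict(set)     # reporter profileID -> profileIDs it claims to have seen


def register():
    return next(next_id)


def report(reporter, seen):
    reports[reporter].add(seen)


def notified_when_infected(infected):
    """One-sided query: every profile that claims to have seen `infected`."""
    return {r for r, seen in reports.items() if infected in seen}


victims = [register() for _ in range(5)]   # legitimate users get IDs 1..5

probe = register()                         # step 1: reveals the current counter
shadow_of = {}                             # attacker profile -> profile it watches
for target in range(1, probe):             # steps 2 and 3: one shadow per earlier ID
    shadow = register()
    report(shadow, target)
    shadow_of[shadow] = target

alerted = notified_when_infected(3)        # step 4: profile 3 is confirmed infected
leaked = {shadow_of[s] for s in alerted if s in shadow_of}
print(leaked)  # {3}
```

The cost to the attacker is one registration and one fabricated contact report per existing user.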

If, however, the implicit trust were one-sided the other way, i.e. querying the contacts of a user X trusted the contacts X reported, a different attack would be possible, one that would mark all users as having been in contact with an infected person.

The attack would work as follows:
  1. The attacker registers a profile with the server and receives back their profileID. As profileIDs are generated incrementally, the attacker can now enumerate all previously registered profileIDs. They cannot, however, spoof messages to the API as the users with those profileIDs, since deviceIDs are required for that, and those are random UUIDs that contain enough entropy.
  2. The attacker can, however, report any and all profileIDs in use to the server as contacts, possibly daily for some period of time.
  3. The attacker can now give the account details (or the device holding them) to a cooperating person who is likely infected and who will get tested and obtain a confirmation of infection from a health authority. The person then confirms their infection with the attacker's account details, which immediately marks all of the users in the system as exposed to an infected person.
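
A toy simulation of this second variant, under the assumption that the server trusts whatever the infected profile itself reported. All names are hypothetical.

```python
import itertools
from collections import defaultdict

next_id = itertools.count(1)   # sequential profileIDs
reports = defaultdict(set)     # reporter profileID -> profileIDs it claims to have seen


def register():
    return next(next_id)


def exposed_when_infected(infected):
    """Reverse one-sided query: trust the infected profile's own reports."""
    return set(reports[infected])


users = {register() for _ in range(1000)}      # legitimate users, IDs 1..1000

attacker = register()                          # step 1: reveals the highest ID so far
reports[attacker].update(range(1, attacker))   # step 2: claim contact with everyone
exposed = exposed_when_infected(attacker)      # step 3: attacker profile confirmed infected
print(exposed == users)  # True
```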

Modifying the system to rely on both sides of an encounter to report it might seem like an easy fix. However, that brings back the aforementioned issue of false negatives created by devices going offline, by devices with different Bluetooth signal strength (where only one device saw enough broadcasts of the other device to report a contact), and so on. The current system with predictable and static user IDs will likely always suffer from similar attacks.

Using a custom contact-tracing protocol, as the system does, is a security risk even if the above attack is fixed, as proper specification and security analysis is necessary to get it right. One can get both of those by using an established protocol such as DP-3T. As the cryptography community mantra rightfully states, Don't roll your own crypto!

Build reproducibility and deployment

The three components of the system, the Android app, the iOS app and the backend server are all open-source, which is quite nice from an analysis perspective and also the bare-minimum a contact-tracing system should be.

There is however no transparency over the build and deployment process, e.g. what versions of code actually run on the server, or are provided in the respective app stores. The Android app does not contain the full configuration and it is thus not possible to build it reproducibly such that the built APK matches the app store APK perfectly.

Build reproducibility is important for a privacy-sensitive app: it ensures that the code can be analyzed and that conclusions from this code analysis apply to the deployed app. It also makes decompilation and analysis of deployed apps unnecessary, apart from a comparison of the app's hash.
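
The hash comparison mentioned above can be as simple as the following sketch. The helper names are hypothetical, and it assumes both files are available locally. In practice, signed APKs also embed the publisher's signature, so a real comparison usually has to account for that before hashing.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large APKs do not need to fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_build(local_apk, store_apk):
    """True if the locally built file is byte-identical to the published one."""
    return sha256_of(local_apk) == sha256_of(store_apk)
```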

Specification and documentation

The system lacks any proper specification of the contact-tracing protocol, the backend API or really any component. Without a detailed specification of all of the system's components and their responsibilities and behavior, proper analysis is resource-intensive if not impossible. This can be seen from my statements about the attacks above, where an unavailable component of the system, the admin app, makes decisions that influence whether and how an attack would work. Without this specification, which should have been created before implementation took place, more vulnerabilities in the system cannot be ruled out, and they will remain harder to find and fix.

The components also lack documentation, apart from a README here and there. Having properly documented components would make security analysis of the system easier, as well as help new contributors to contribute to the project.

Tests

The Android app contains no tests at all, and the iOS app contains a test directory that contains no tests. The server is the only component with any tests: it contains a few tests for the push-notification service and the SMS messaging service, and a few unit tests for the core repository. This absence of tests is a serious issue for a privacy-sensitive app, as the likelihood of errors in code with absolutely no tests is high.

Calibration and real-world testing

The contact-tracing capabilities of the app have not been properly tested in the real world, to the best of my knowledge. Such testing is necessary for proper calibration of what an epidemiologically significant encounter is and how it manifests in the BLE broadcasts. Modern devices have strong capabilities to both send and receive broadcasts. If any sequence of correctly received broadcasts longer than 5 minutes is counted as an encounter (as currently done in the app), the number of false positives will likely be quite high.
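
To make the criticized rule concrete, here is a minimal sketch of duration-only encounter detection. It is not the app's code: the 5-minute threshold is the one mentioned above, while the maximum gap between broadcasts (`MAX_GAP`) is an assumed value chosen for illustration.

```python
from datetime import datetime, timedelta

MIN_DURATION = timedelta(minutes=5)   # threshold mentioned in the text
MAX_GAP = timedelta(seconds=30)       # assumed: longest silence within one run


def encounters(sightings):
    """Group sighting timestamps into runs and keep runs longer than MIN_DURATION."""
    runs = []
    start = prev = None
    for ts in sorted(sightings):
        if prev is None or ts - prev > MAX_GAP:
            # A new run starts here; close the previous one if it was long enough.
            if start is not None and prev - start > MIN_DURATION:
                runs.append((start, prev))
            start = ts
        prev = ts
    if start is not None and prev - start > MIN_DURATION:
        runs.append((start, prev))
    return runs


t0 = datetime(2020, 4, 20, 12, 0)
long_run = [t0 + timedelta(seconds=10 * i) for i in range(40)]            # 6.5 minutes
short_run = [t0 + timedelta(hours=1, seconds=10 * i) for i in range(6)]   # 50 seconds
result = encounters(long_run + short_run)
print(len(result))  # 1
```

The rule never looks at signal strength, so a device received weakly from the far side of a wall for six minutes counts exactly like a close conversation. That is the calibration gap described above.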

Calibration and real-world testing is currently being performed by the DP-3T team, using an app built using their decentralized contact-tracing protocol, even before the deployment of the app in Switzerland (see here and here).

Other solutions

In comparison with the current contact-tracing efforts and plans of different countries, the app is clearly the least privacy-preserving, due to the aforementioned privacy issues (full contact graph on the server, attacks possible, static and predictable IDs used).

The DP-3T project presents a decentralized privacy-preserving approach to contact-tracing, with strong guarantees, a detailed specification, published SDKs and extensive security analysis. It is also backed by a large group of researchers from the security & privacy area. This approach will be deployed in Switzerland (app). There has also been extensive work on interoperability of contact-tracing protocols, focusing on DP-3T (here).

The situation in the UK seems worse than in Switzerland: the NHSX/NCSC recently released a specification for a custom centralized contact-tracing system, which does not have privacy-preserving properties (see here for an analysis by Martin Albrecht and here for an analysis by Kenny Paterson).

There have been several statements from hundreds of scientists and researchers mainly in the fields of security & privacy that called for a responsible, privacy-preserving by design, approach to contact-tracing. See here and here. These statements endorse the decentralized privacy-preserving approach taken by DP-3T and clearly advise against the centralized approach taken by the Covid19 ZostanZdravy app (obviously without directly mentioning it).

Conclusions and recommendations

I believe the app, as it is now, presents a significant risk from a privacy perspective. The following list summarizes the issues presented:

  • The app reveals the full contact graph of all of its users to the server.
  • The app uses static and predictable user IDs.
  • The app allows for an attack in which an attacker gains the infection status of all users.
  • The app is not built reproducibly, and thus correspondence between the deployed apps and the sources cannot be easily confirmed.
  • The app has no specification and documentation.
  • The app has almost no tests.
  • There was no public security analysis of the contact-tracing protocol or the apps.
  • There was no calibration and real-world testing of the app and system.

When comparing the app to the principles outlined in the Joint Statement on Contact Tracing, the app fails all but one.

  • "Contact tracing Apps must only be used to support public health measures for the containment of COVID-19. The system must not be capable of collecting, processing, or transmitting any more data than what is necessary to achieve this purpose." The app collects the full contact graph of all users, which is unnecessary.
  • "Any considered solution must be fully transparent. The protocols and their implementations, including any sub-components provided by companies, must be available for public analysis. The processed data and if, how, where, and for how long they are stored must be documented unambiguously. Such data collected should be minimal for the given purpose." The data collected by the app is not minimal.
  • "When multiple possible options to implement a certain component or functionality of the app exist, then the most privacy-preserving option must be chosen. Deviations from this principle are only permissible if this is necessary to achieve the purpose of the app more effectively, and must be clearly justified with sunset provisions." The contact-tracing protocol implemented is clearly not the most privacy-preserving, but likely the simplest.
  • "The use of contact tracing Apps and the systems that support them must be voluntary, used with the explicit consent of the user and the systems must be designed to be able to be switched off, and all data deleted, when the current crisis is over." The app is currently voluntary.

I want to stress that an analysis like this one should have been performed long before the app achieved current levels of deployment. A way to fix some of the issues above would be to move the app to the DP-3T contact-tracing protocol, which has SDKs available for both Android and iOS, and has passed significant security and privacy analysis. This would fix the privacy and security issues inherent in the protocol used, but also help with other issues, as the need for a full specification would be lower, the code to document would be simpler and there would be less code to test. Calibration and testing issues would also be addressed by the currently ongoing testing by the DP-3T team.

One practical issue that I did not mention, as it does not pertain to security or privacy, is that of Bluetooth broadcast issues on iOS. This would be resolved by using DP-3T as well, since the iOS SDK of DP-3T plans to utilize the Apple provided contact-tracing APIs, when they become available.


GSoC 2017 - Final work submission

As the GSoC 2017 final evaluation period just ended, my final work product is finally submitted. This post is a summary of my final work product.

Mailman-pgp

  • repository@gitlab
  • docs@rtd
  • Plugin for Mailman Core.
  • Enables creating a PGP mailing list, which has a list key, can receive and serve messages encrypted, can sign and receive signed messages from subscribers.
  • Creates the key email command, which is used for per-address user key management.
  • Subscription to a PGP enabled mailing list requires the subscribing address to send and confirm an address public key, which the moderator must verify.
  • Somewhat confirms the user has possession of the appropriate private key to the one sent on subscription.
  • Has per-list settings for encryption/signatures/what to do with non-encrypted or non-signed messages, etc.
  • Optionally exposes a REST API for list configuration.
  • Has local archivers which can store the messages encrypted by the list key.
  • Stores list and address keys in configurable key directories.
  • Requires (some not merged) MRs in Mailman Core:
  • Additional MR (not required):
  • Required branches are merged and maintained at J08nY/mailman/plugin.
  • To install, run pip install mailman-pgp. Warning: it will pull in a development version of Mailman Core and PGPy.

django-pgpmailman

mailman-rest-events

  • repository@gitlab
  • A plugin for Mailman Core that turned out to be unnecessary for the working of django-pgpmailman, but implemented a similar feature as this MR.
  • This plugin sends the events (and some information about them) from Mailman Core to a list of configurable endpoints using JSON in HTTP POST requests.
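
As a rough illustration of the mechanism in the last bullet, the sketch below builds one JSON POST request per configured endpoint using only the standard library. The payload shape, the event name and the endpoint URL are hypothetical and not taken from the plugin.

```python
import json
import urllib.request


def build_event_requests(event_name, info, endpoints):
    """Build one JSON POST request per endpoint (payload shape is an assumption)."""
    body = json.dumps({"event": event_name, "info": info}).encode("utf-8")
    return [
        urllib.request.Request(
            url,
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        for url in endpoints
    ]


event_requests = build_event_requests(
    "subscription",
    {"list_id": "test.example.com"},
    ["http://localhost:9000/events"],
)
# Actually sending one would be: urllib.request.urlopen(event_requests[0], timeout=5)
print(event_requests[0].get_method())  # POST
```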

Other contributions

Overall

I think I met almost all goals that the project idea required and my original proposal stated, with the noteworthy exception of remote archiving to HyperKitty, which I just couldn’t find a way to integrate.


GSoC 2017 - Web UI progress

django-pgpmailman progress

Successfully created the mailing list views. They are inspired heavily by Postorius, in both templates and views, to get the same look. There is a list index view, which lists only PGP enabled lists and their key fingerprints. This also allows one to download the list key, as it’s linked from the list key fingerprint. The list name link leads to a list settings/info view. The info tab is available to any logged in user, while the settings are list owner only. All the per-list PGP settings are configurable there.

django-mailman3 template chunks

In order to make plugging the django-mailman3 based apps together easier and to deduplicate some of their code, as well as to integrate the django-pgpmailman app into any Postorius + HyperKitty project, I refactored the direct references of Postorius to HyperKitty and vice versa.

This is done in the template chunk MR. It introduces a new template tag in django-mailman3, which is intended to be used by all django-mailman3 based apps to let other installed apps add their entries to the navbar and user menu. These are the two main ways Postorius and HyperKitty reference each other.


GSoC 2017 - WebUI integration

This post is about my current plans on how to implement the web ui part of PGP enabled Mailman. It strives to integrate into the Mailman Suite and use its features to the maximum possible degree.

General idea: Refactor general stuff to django-mailman3, to allow apps to hook up together in Mailman Suite easily, and then use that to hook up django-pgpmailman.

Features

Show PGP enabled public lists, with their key fingerprints, with the option to download their public keys, also show some of their configuration (so that subscribers can see that for example if they send a cleartext message to a list that requires encrypted messages, it will be bounced).

Enable list owner to configure the PGP related per-list configuration options.

Enable list owner to set/see the list key (private part). This is quite questionable and will have a site-level option to be turned off (the REST API will then not serve the list private key).

The same level of user key management as the key command allows, with similar steps during key change/revocation.

Implementation

Another django app, django-pgpmailman, is installed in the same project as Postorius + HyperKitty. This app provides a list of PGP enabled mailing lists and their PGP related management in a similar way to Postorius, as well as user key management.

There are a few places where Postorius refers to HyperKitty and vice versa, for adding the appropriate links/icons to the navbar as well as for the user menu entries. These references will be refactored to some mechanism in django-mailman3, which will allow any installed django app to add its entry to the navbar or the user menu. This will allow django-pgpmailman to hook up rather easily, without any direct references to it from Postorius/HyperKitty/django-mailman3.
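
The idea of such a mechanism can be sketched in plain Python, independent of Django. This is not the django-mailman3 API: `ChunkRegistry`, the slot names and the URLs are all hypothetical, and the sketch only shows how apps could contribute entries without referencing each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MenuEntry:
    label: str
    url: str


class ChunkRegistry:
    """Hypothetical registry: each installed app adds entries, a shared template renders them."""

    def __init__(self):
        self._entries = {"navbar": [], "user_menu": []}

    def register(self, slot, label, url):
        self._entries[slot].append(MenuEntry(label, url))

    def entries(self, slot):
        # Return a copy so callers cannot mutate the registry by accident.
        return list(self._entries[slot])


registry = ChunkRegistry()
registry.register("navbar", "Lists", "/postorius/lists/")   # would come from Postorius
registry.register("navbar", "Archives", "/hyperkitty/")     # would come from HyperKitty
registry.register("navbar", "PGP lists", "/pgp/")           # would come from django-pgpmailman
print([e.label for e in registry.entries("navbar")])  # ['Lists', 'Archives', 'PGP lists']
```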

Archiving

The archiving web UI is a tougher nut to crack. I either have to develop a custom PGP mail archive frontend and integrate it similar to the PGP list management app, or integrate with HyperKitty transparently, so that archives are received encrypted, stored encrypted, and yet served to subscribers in clear. Developing a custom app is quite unrealistic and it would lack most HyperKitty functions.

However, hooking up an encrypted message store to HyperKitty is also non-trivial, as HyperKitty is strongly tied to storing messages in its database and using a django Model to represent a message.

I currently have no realistic ideas. One that comes to mind is to create a custom django database backend that somehow stores everything encrypted, but that’s a very unwieldy solution that likely won’t work well.

Other progress

Fixed many little issues with the PGP plugin and PGPy, and got it to work quite nicely. Below you can see a message received by a subscriber of a PGP enabled discussion list, encrypted to his key, as shown by Thunderbird with the EnigMail plugin:

message_encrypted

Also finally merged the finished key revoke command to mailman-pgp/master.


GSoC 2017 - Progress

This week was tough but productive. Temperatures spiking to 34°C in my hometown have a really bad effect on my daily productivity.

Setup instance with PGP plugin

Finally got a complete Mailman instance set up and running with J08nY/mailman/plugin + J08nY/mailman-pgp/master and J08nY/Postorius/plugin + J08nY/mailmanclient/plugin + mailman/HyperKitty/master + mailman/django-mailman3/master. The plugin branches merge MR branches that introduce the plugin infrastructure for that particular Mailman component. For Mailman Core, the plugin branch merges the pluggable-components, pluggable-workflows and list-style-descriptions branches.

The pluggable-components one introduces the concept of a plugin to Mailman Core and replaces the (pre|post)_hooks. It is essential for letting site admins easily add plugins to Mailman Core, by simply installing them into the same environment as Mailman Core and enabling them with some simple configuration. pluggable-workflows splits the monolithic subscription/unsubscription workflows into composable workflows that are also pluggable by a plugin and set per-list. list-style-descriptions exposes list style descriptions via the REST API, and Postorius uses them for displaying the list style selection.

I even successfully created a PGP enabled discussion list through Postorius. Subscribed to it by sending the subscription request, confirming it, replying to the key set <token> challenge with key attached, replying to the key confirm <token> with the challenge body signed by the key being set. This would of course be followed by the moderator verifying the supplied key in any real application of PGP enabled lists, which is also supported.

The instance runs on a Raspberry Pi with 512MB RAM along with my web-server, mail-server and several other services, so don’t expect lightning fast performance, or it being up anyway, reserving the right for any extended downtime ;).

Key revocation

Working on proper key revocation behavior for the PGP plugin took much of my week, as getting this right is pretty hard and the OpenPGP revocation mechanism is quite complex. The usual workflow for just an ordinary key change was already presented in one of my previous posts. However, if the user needs to revoke a key with a revocation signature, we cannot use the old key to perform the key change challenge. Also, the key revocation can be only partial, as in a subkey being revoked. If the key can still be used for encryption and signing, it remains usable for the PGP plugin and nothing needs to be done. This gets more complex because we allow a user to change their key without moderator approval, using only the challenge (which makes the user sign, with the old key, a statement signifying they are changing their key to the new one). If the user then revokes their former key using a reason for revocation that invalidates all signatures by that key (even former ones), we cannot trust the user’s current key, as the old one could have been compromised and used to set the new one.

For now, a mechanism for users to provide a revocation certificate, which is verified and merged with the key, is implemented. If the revocation certificate revokes the key, or a subkey/uid whose loss makes the key unusable by the PGP plugin (the key can no longer be encrypted to or can no longer sign), then the user’s key is reset and they have to send and confirm a new one, with moderator approval necessary. That is almost completely implemented, as it’s almost the same as the subscription challenge.
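
The decision described above can be summarized as a small function. This is a sketch of the logic only, not the plugin's code: the names are hypothetical, and rejecting a certificate that does not verify is my assumption about the obvious failure case.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    KEEP_KEY = "key still usable, nothing to do"
    RESET_KEY = "key reset, new key needs moderator approval"
    REJECT = "revocation certificate did not verify"  # assumed failure case


@dataclass
class KeyState:
    """What the key can still do after the revocation certificate is merged."""
    can_encrypt: bool
    can_sign: bool


def handle_revocation(cert_verifies, state_after_merge):
    if not cert_verifies:
        return Outcome.REJECT
    if state_after_merge.can_encrypt and state_after_merge.can_sign:
        # Partial revocation (e.g. an unused subkey): the plugin can keep using the key.
        return Outcome.KEEP_KEY
    return Outcome.RESET_KEY


print(handle_revocation(True, KeyState(can_encrypt=True, can_sign=False)).name)  # RESET_KEY
```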

More PGPy work

More work on PGPy was necessary to make it usable. For example, not having support for partial length headers would break handling of most messages encrypted with GPG, as it likes to create plenty of packets with partial lengths. However, I now think that my development branch of PGPy is feature-complete enough to support an instance of Mailman running the PGP plugin.

Trello board

Set up a Trello board to better track the issues that came up and keep my head sane:

mailman-pgp trello board

Next up

Web UI integration

The original proposal suggested adding support for PGP enabled lists to Postorius and HyperKitty directly. Now that mailman-pgp is dynamically enabled in Mailman Core, a similar approach needs to be taken with the Postorius and HyperKitty integration.

Archiving

Thinking of doing local archiving very similar to the prototype archiver, encrypted by the list local-archive key. The remote archiving capability is a much tougher nut to crack and depends a lot on how the HyperKitty integration ends up looking.


GSoC 2017 - Second evaluation

The second evaluation period came quite fast after the first one. Nonetheless, the project advanced much further, so a quick recap of its current state is in order.

Since first evaluation

SMTPS + STARTTLS

MR 286

Finally with working tests after upstream fixes in aiosmtpd.

Pluggable components (plugins)

MR 288

Rebased after the Click CLI processing branch was merged and got it to work nicely in the end. There was an issue with config processing, as plugins can now supply their own CLI commands and plugins are configured/specified in the Mailman Core config, which can be supplied by a CLI option. That created a bit of an argument processing/lazy-loading issue, but after looking some more at how Click works I was able to quickly resolve it. The config option is now an eager one and initializes Mailman before commands need to be listed/used.

Pluggable workflows

MR 299

Also rebased, added tests to get diffcov to 100% and fixed one of the migrations in that branch, as it was broken before.

Key management (plugin)

Implemented email commands for managing per-address PGP keys in the plugin. It uses pluggable workflows to plug into the subscription process of a PGP enabled mailing list and requests the user key, and also does mailback confirmation of it. The key is then available to the list moderator during subscription moderation, and more generally to the plugin for verifying signatures and encrypting. Post-subscription key management is also implemented: a user can change their set key, provided they can sign a challenge provided by the plugin with the old key. Key revocation handling is also necessary, but not yet done.

Outgoing processing (plugin)

Implemented custom Bulk and Individual delivery classes for PGP enabled lists in the plugin. These deliveries optionally encrypt the mail to the subscribers’ keys, and/or sign it with the list key. The bulk one retains anonymity of subscribers, as the key IDs are zeroed out of the PKESK (Public Key Encrypted Session Key) packets, which OpenPGP implementations should handle as a wildcard key ID and try decrypting with all usable private keys. It is also almost as efficient as it can be, as it only encrypts the message with one session key per chunk and then encrypts said session key to the recipients in the chunk. The signing is also configurable.

Signature hash tracking (plugin)

Implemented store of signature hashes from successful postings to a PGP enabled list which are then (optionally) used to stop replay of the same signature by Mailman in a future posting.

PyPI package (plugin)

mailman-pgp @ PyPI

Created the PyPI package for the PGP plugin.

Overall

I would like to be a bit further in the project at this point, however I am very optimistic about the next days and weeks. After resolving some issues and TODOs I have for the current plugin implementation and setting up the live Mailman instance with the PGP plugin, which should be up by the end of the week, I believe I can start work on the other side of the REST API, somehow hooking PGP enabled archives/lists to Postorius and HyperKitty.


GSoC 2017 - Post title goes here

Signature hash tracking

It would be relatively easy for a user to replay a signed message to a mailing list, as no kind of challenge-response is done on posting.

While signature replay checking is usually done on the end user’s side, against their keyring, the messages they have received so far and their context, I think PGP enabled Mailman is kind of expected to do this as well, as it relays the messages.

On a successful posting to a PGP enabled mailing list (AcceptedEvent to be precise), the message is searched for PGP signatures and their digests and key fingerprints are added into the sighash table.

The signature rule then checks the digests from a posting against those in the sighash table. If it finds it has previously accepted a message with any of those hashes it takes the duplicate_sig_action which is per-list configurable.

This of course means that if the duplicate_sig_action is set to anything other than Action.defer, i.e. the message gets rejected or dropped, a signature that was sent to a list cannot be sent again. Of course, if a user wants to send a signed message again, they can just re-sign it and send it again, and the hashes won’t match. Sometimes, though, it might be useful to post the message as originally signed, for example to prove something. Still, I think it is worth keeping this as a configurable option, maybe with another option disabling the collection of signature hashes for performance reasons.
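
The behavior of the sighash table and the signature rule can be sketched as follows. This is a simplified stand-in, not the plugin's implementation: the `Action` enum here is a toy with only three members, `SigHashStore` is a hypothetical name, and hashing the raw signature bytes with SHA-256 is an assumption about how a digest could be derived.

```python
import hashlib
from enum import Enum


class Action(Enum):
    """Toy stand-in for the actions a rule can take."""
    defer = "defer"
    reject = "reject"
    discard = "discard"


class SigHashStore:
    def __init__(self, duplicate_sig_action=Action.reject):
        self.duplicate_sig_action = duplicate_sig_action  # per-list setting
        self._seen = set()                                # (digest, key fingerprint)

    @staticmethod
    def _digest(signature_bytes):
        return hashlib.sha256(signature_bytes).hexdigest()

    def check(self, signatures):
        """Return duplicate_sig_action if any signature was seen before, else defer."""
        for sig, fingerprint in signatures:
            if (self._digest(sig), fingerprint) in self._seen:
                return self.duplicate_sig_action
        return Action.defer

    def record_accepted(self, signatures):
        """Called once a posting is accepted, to remember its signatures."""
        for sig, fingerprint in signatures:
            self._seen.add((self._digest(sig), fingerprint))


store = SigHashStore()
posting = [(b"signature-packet-bytes", "0123456789ABCDEF")]
first = store.check(posting)     # nothing recorded yet
store.record_accepted(posting)
replay = store.check(posting)    # same signature again
print(first.name, replay.name)  # defer reject
```

A re-signed message produces different signature bytes, so it passes the check even though the body is the same, as described above.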

Outgoing processing

Since the plugin needs to process the messages for outgoing encryption on a per-recipient or per-chunk basis, it couldn’t be implemented as a Handler. I thought about implementing it as a custom OutgoingRunner, but that didn’t work out either. However, the IMailTransportAgentDelivery interface is great for what the plugin needs to do. So there are two custom PGP enabled delivery classes that get selected by a custom [mta] outgoing callable, one for bulk delivery and the other for individual delivery.

Documentation

I got the pgpmailman-proposal repository almost completely up to date on changes to the plugin and my current proposed Core/Client/Postorius/HyperKitty changes and MRs.

Next up

More testing

Once I remembered to exclude the tests themselves from the coverage measurement, coverage dropped to 88%. I think it is now in order to add more comprehensive tests for the implemented features, covering the edge cases which currently remain untested.

Setup live development instance

I believe I got pretty far into the development without having a proper live development instance of Mailman Core + plugins and Postorius + HyperKitty + Client, so I’m going to set that up now and test manually. With time I might set this up as a public site/mailing list server, to demonstrate the features of the PGP plugin.

Archivers

Currently thinking about implementing two encrypted archivers, one local, one remote. The local one would be similar to the prototype archiver but store messages encrypted, TBD how. The remote one will send the messages encrypted to a receiving archiver, the django-pgpmailman instance running next to HyperKitty.

django-pgpmailman

With most of the essential stuff in the core plugin done, the web ui part of development can begin.

