The goals of artifact evaluation
I have participated in artifact evaluation committees since 2021: five years of CHES, one year of EUROCRYPT, and two years of NDSS. Over these years, I have seen good artifacts, even great ones, and a fair share of “just OK” ones. There have been no truly terrible ones, though some required a bit of polish.
The approach to artifact evaluation also differs between venues, and reviewing an artifact for NDSS this year reminded me of those differences. In the following text, I offer some opinions on those differences and on the aims of artifact evaluation at large. I also looked at how other conferences do artifact evaluation, even those I was not reviewing for, such as USENIX Sec, CCS, CRYPTO, and ACSAC. I summarize how their approach has changed over the years, what badges they use, what they focus on, and whether they require artifact appendices in a strict format. I also look at link rot and the availability of artifacts over the years.
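Checking artifact availability over the years can be largely automated. A minimal sketch of such a link-rot check is below; it is not the tooling used for this post, and the status categories and helper names are my own assumptions. It issues a `HEAD` request per URL and tallies the outcomes:

```python
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError

def check_url(url: str, timeout: float = 10.0) -> str:
    """Classify a single artifact URL as 'alive', 'dead', or 'unreachable'."""
    req = Request(url, method="HEAD", headers={"User-Agent": "artifact-check"})
    try:
        with urlopen(req, timeout=timeout) as resp:
            return "alive" if resp.status < 400 else "dead"
    except HTTPError as e:
        # Some hosts reject HEAD outright; treat 405 as alive rather than dead.
        return "alive" if e.code == 405 else "dead"
    except URLError:
        return "unreachable"

def summarize(results: dict[str, str]) -> dict[str, int]:
    """Count how many URLs fall into each status category."""
    counts = {"alive": 0, "dead": 0, "unreachable": 0}
    for status in results.values():
        counts[status] += 1
    return counts
```

A real survey would also want to follow redirects deliberately and retry with `GET`, since a `HEAD`-only pass undercounts artifacts hosted behind aggressive CDNs.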
I want to highlight the work done by the community at secartifacts.github.io. Having one place that survives in the event of the conference pages disappearing is quite nice.
Focus on reproducibility
NDSS, like a few other conferences (e.g., USENIX Sec), puts a heavy focus on reproducibility. These venues require that artifacts come with a detailed artifact appendix (see the USENIX Sec 2026 example). The appendix describes the artifact, its requirements, and its setup, but most importantly it lists the major claims of the paper, explains how the artifact supports them, and tells reviewers how to run experiments that verify them. The appendix is thus usually tightly focused on the details of the paper and on reproducing them exactly, which certainly helps in reproducing the experiments.