CC #6: A New Privacy-Preserving Aggregate Measurement Hope

Welcome to the sixth issue of Ciphertext Compression (now abbreviated “CC”), meant to be a quick digest of what happened on ePrint last week. If you are wondering why I am doing this, I am wondering that too, see this post.

Range considered. new ePrints posted between Sun, 07 Nov 2021 12:01AM UTC and Sun, 14 Nov 2021 12:01AM UTC!

Meta. CC is on hiatus for the next two weeks for thanksgiving (I also have a couple of class projects due soon that I have been procrastinating on.) Have a nice thanksgiving y’all!

Private Measurement

Aggregate Measurement via Oblivious Shuffling — Aggregate measurement is super useful, if you are making an app, it is useful to know what features the users use (in aggregate) and what features they do not touch (in aggregate) so you can build a better experience for them with your limited budget. Further, in theory, aggregate measurement usually does not hurt user privacy. But, how many of y’all have turned off app telemetry because it is too creepy? 🙋 The creepiness, in part, comes from the fact that the app gets to see true data (which can be super privacy invasive) not just the aggregates. But, in the past few years, we have made a lot of progress, like there is now a draft IETF standard on privacy-preserving aggregate measurement. This paper proposes a new scheme for privacy-preserving aggregate measurement that achieves better performance than prior work in situations where the attribute we are measuring can have a large number of potential values; for instance, if you wanted to measure a histogram over the referer domain.

Cryptographic Error Correction

Interactive Error Correcting Codes Over Binary Erasure Channels Resilient to >1/2 Adversarial Corruption — Recall that an error correcting code has distance $d$ if the the Hamming distance between any two codewords is at least $d$. A classic result in error correction is that a code with distance $d$ can correct up to $d/2$ bitflips. For erasures, where rather than being flipped bits are changed to some distinguished “erased” state, we can do slightly better—an error correcting code with distance $d$ can correct $d$ erasures. In relative terms, an error correcting code with relative distance $1/2$ can correct at most $1/2$ relative erasures. This is cool and all but the traditional error correction model is a bit simplistic, Alice sends a coded message to Bob who then recovers it. In practice, Bob can speak back to Alice, and indeed real-world protocols (like TCP) use interaction. This paper looks at this interactive erasure setting with adversarial corruption where a meddler in the middle can erase up to some fraction of the total communication between Alice and Bob, and shows that in this relaxed model, we can evade the $1/2$ bound by giving an explicit scheme that can handle up to $3/5$ relative erasure.

PAKEs

SoK: Password-Authenticated Key Exchange – Theory, Practice, Standardization and Real-World Lessons — I think the title mostly covers it. I’ll just add that it has beautiful figures. 🙂

Hardware AES

VASA: Vector AES Instructions for Security Applications — I love Intel’s AES-NI instructions, they afford super-fast AES implementations. A couple of years ago, Intel vectorized these instructions. Before, you could use AES-NI instructions like AESENC to apply a round of AES encryption to a 128-bit state and 128-bit roundKey. The newer vector instructions like VAESNI let you do the same but now on 4-vectors, they take a 512-bit vector (state0,state1,state2,state3) and a 512-bit vector (roundKey0,roundKey1,roundKey2,roundKey3) and apply a round of AES encryption to each state in parallel. Off the bat, this primitive can be used to significantly speedup AES modes like AES-CTR and AES-GCM. This paper looks at MPC protocols which have data dependencies preventing immediate application of this hardware acceleration and adapts them to take advantage of it.

Distributed Storage

Succinct Erasure Coding Proof Systems — Nowadays I don’t worry about my hard drive failing, I have cloud backups whose durability rating is so high that if you put it in a 32-bit float it says 100. The reason we can have such nice things is erasure coding which lets you split your data into $n$ shards such that if you have at least $m$ shards you can reconstruct the original data. If, like me, you were curious how these work under the hood, they typically use Reed-Solomon codes and variants. When recombining these shards, we need a way to know that the provided shard was correctly generated. You cannot store a SHA256 hash of the shard with the shard, an attacker then can just update the shard and the SHA256 hash (since it is unkeyed). You could add a MAC with some secret key that only you know but we ideally wanna avoid that. We can store the SHA256 hash of every shard with each shard so you can cross compare (and indeed a prior system does this.) This paper studies this verification problem in detail and proposes schemes that can be plugged into existing erasure coding schemes to get better performance.

Closing Words

Thanks for sticking till the end. 💜

You can subscribe to this hooey via Buttondown (using email or rss) and find it on Twitter. Oh, and I almost forgot, you can now get to this hooey via the fancy url ciphertext.blog! 🥳

Obvious Disclaimer. Views are mine and not serious. All mistakes are mine. If you’d like to point them out, shoot me an email. Maybe I can embed secret messages here and no one will notice. 75EC65F50299E615E1AB6DEF495597EB